Tony Finch - Venti, Foundation, Bloom filters, and Erlang

dotatfanf wrote
on 11th June 2008 at 23:29
Previous Entry Add to Memories Share Next Entry

Venti, Foundation, Bloom filters, and Erlang

I've just read a paper about "Foundation", which is an updated version of Plan 9's Venti archiving system. Foundation is designed to back up a computer nightly to a USB hard disk and an FTP server. Like Venti, it uses content-addressed storage to eliminate duplicate blocks; the main difference is that Foundation has been optimized to reduce seeks on a single disk, where Venti assumed a fast RAID could hide latency. Foundation uses a Bloom filter to speed up index lookups when eliminating duplicate blocks, and it writes data blocks in an append-only log, so it hits two things that I have been thinking about recently.

Towards the end of the paper are a few wonderful paragraphs that address the problem of reclaiming backup space wasted by bulky temporary data - the paper uses the example of on-disk copies of DVDs to save physical space when travelling. This motivates the desire to support deleting a nightly snapshot. They say:

Conceptually, deleting a snapshot resembles garbage collection in programming languages or log cleaning in LFS. First, the CAS layer writes a new version of the system catalog that no longer points to the snapshot. Then, the system reclaims the space used by blocks that are no longer reachable from any other catalog entry. A more recent snapshot, for example, may still point to some block in the deleted snapshot.

Interestingly, the structure of Foundation's log makes identifying unreferenced blocks particularly efficient: as a natural result of the log being append-only, all pointers within the log point "backwards". Garbage collection can thus proceed in a single, sequential pass through the log using an algorithm developed by Armstrong and Virding for garbage collecting immutable data structures in the Erlang programming language.

The algorithm works as follows. Starting at the most recent log entry and scanning backwards, it maintains a list of "live" blocks initialized from the pointers in the system catalog. Each time it encounters a live block, it deletes that block from its list. If the block is a metadata block that contains pointers to other blocks, it adds these pointers to its list. If the algorithm encounters a block that is not in its list, then there are no live pointers to that block later in the log, and since all pointers point backwards, the algorithm can reclaim the block's space immediately. The system can also use this algorithm incrementally: starting from the end of the log, it can scan backward until "enough" space has been reclaimed, and then stop.

The expensive part of the Erlang algorithm is maintaining the list of live blocks. If references to many blocks occur much later in the log than the blocks themselves, this list could grow too large to fit in memory. We note, however, that a conservative version of the collector could use a Bloom filter to store the list of live blocks. Although false positives in the filter would prevent the algorithm from reclaiming some legitimate garbage, its memory usage would be fixed at the size of the Bloom filter.

This reminds me of a recent post on Phil Wadler's blog where he highlights some comments on the proposal to add functional programming to the ACM curriculum. The paragraphs I quoted above are an excellent example of FP knowledge being useful in all parts of computer science.


(Leave a comment)
From:gerald_duck
Date:2008-06-11 22:21 (UTC)
(Link)
That feels like a design it would be interesting to marry with my thoughts on filesystem design — although intended for backup, it might well do a good job of providing storage for live data, too.
(Reply) (Thread)
From:pozorvlak
Date:2008-06-11 22:51 (UTC)
(Link)
Cool!

I've been aware of Bloom filters for a while, ever since Maciej Ceglowski's excellent perl.com article on the subject a few years ago, but I've been noticing a lot of Bloom filter-related stuff on the intertubes in the last couple of weeks. Maybe this algorithm is what kicked off the interest?
(Reply) (Thread)
From:shae
Date:2008-06-12 12:31 (UTC)

Ah cute!

(Link)
I hadn't thought about using bloom filters for GC, I like it!

Do you know of any particularly good papers on Bloom filters in routing? I'd be especially interested in routing in mesh networks.
(Reply) (Thread)
From:shae
Date:2008-06-12 12:52 (UTC)

Doh - Broder & Mitzenmacher

(Link)
I just remembered you pointed to the Broder & Mitzenmacher survey of bloom filters earlier. Upon investigation, it's focussed on network applications, like routing. I think I can guess how you might have answered my previous comment. I blame lack of coffee.
(Reply) (Thread)

(Leave a comment)

Powered by LiveJournal.com