AmoebaFS
Similar to Ceph, which is built on top of a distributed object system called RADOS, AmoebaFS is implemented on top of the distributed object system provided by Aspen. What sets AmoebaFS apart is that Aspen provides much more flexible transaction and object models, which AmoebaFS can leverage in a number of ways. Whereas RADOS is designed for scalable support of homogeneous storage media and access patterns, Aspen is designed for flexibility and for taking advantage of heterogeneous storage media and access patterns.
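To make the contrast concrete, the sketch below models the kind of multi-object, all-or-nothing update that a flexible transaction model permits, such as atomically moving a file between two directory objects. It is a self-contained toy using optimistic revision checks; the class and method names are invented for illustration and are not Aspen's actual API.

```python
# Toy model of a multi-object transaction, the kind of primitive
# AmoebaFS relies on Aspen to provide. All names are illustrative.

class Obj:
    def __init__(self, data):
        self.revision = 0
        self.data = data

class Transaction:
    """All-or-nothing update across multiple objects, guarded by the
    revision each object had when it was read (optimistic concurrency)."""
    def __init__(self):
        self._updates = {}  # id(obj) -> (obj, expected_revision, new_data)

    def update(self, obj, new_data):
        self._updates[id(obj)] = (obj, obj.revision, new_data)

    def commit(self):
        # Abort if any object changed since it was read.
        for obj, expected, _ in self._updates.values():
            if obj.revision != expected:
                raise RuntimeError("conflict: object modified concurrently")
        # Apply every update together (single-threaded sketch).
        for obj, _, new_data in self._updates.values():
            obj.data = new_data
            obj.revision += 1

# Example: atomically move "report.txt" between two directory objects.
src = Obj({"report.txt": "inode-42"})
dst = Obj({})
tx = Transaction()
tx.update(src, {k: v for k, v in src.data.items() if k != "report.txt"})
tx.update(dst, {**dst.data, "report.txt": "inode-42"})
tx.commit()
assert "report.txt" in dst.data and "report.txt" not in src.data
```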
Aspen also includes a basic distributed computing facility that can be used to ensure that tasks are eventually completed even in the presence of full system crashes. This can be used for background tasks such as deduplication or migrating file content from one set of stores to another. Such migrations might be useful for AmoebaFS deployments where all new files are written to stores backed by SSDs to ensure maximum write speed. Then, in the background, those files could be migrated in bulk to cheaper spinning disks in a sequential manner that minimizes head seeks and allows the writes to occur at full disk bandwidth. With this approach, the SSDs would essentially play the role of a write cache in a traditional system, but without the additional complexity of architecting a separate caching layer and integrating it with the application. The writes to the SSDs are normal Aspen object writes that just happen to be backed by SSDs; there is no extra complexity, no architectural impedance mismatch, and no additional points of failure to manage.
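The sketch below illustrates the crash-tolerance pattern this relies on: a migration task is recorded durably before it runs and is removed from the log only after it finishes, so a restart simply re-executes whatever remains pending. The file names, task shape, and helper functions are assumptions made for the example, not parts of Aspen's interface.

```python
# Sketch of a crash-tolerant background migration task. A task leaves
# the durable log only after it completes, so a full system crash at
# any point just means the task runs again on restart. The log format
# and paths are hypothetical.

import json, os, shutil

TASK_LOG = "pending_tasks.json"

def load_tasks():
    if not os.path.exists(TASK_LOG):
        return []
    with open(TASK_LOG) as f:
        return json.load(f)

def save_tasks(tasks):
    # Write-then-rename so the log is never left half-written.
    tmp = TASK_LOG + ".tmp"
    with open(tmp, "w") as f:
        json.dump(tasks, f)
    os.replace(tmp, TASK_LOG)

def enqueue(task):
    save_tasks(load_tasks() + [task])

def migrate(task):
    # Idempotent: copy the SSD-resident object to its HDD-backed
    # location, then drop the SSD copy.
    if os.path.exists(task["src"]):
        shutil.copy(task["src"], task["dst"])
        os.remove(task["src"])

def run_pending():
    # Called at startup and periodically by a background worker.
    for task in load_tasks():
        migrate(task)
        save_tasks([t for t in load_tasks() if t != task])

# Usage: enqueue({"src": "/ssd/obj-17", "dst": "/hdd/obj-17"}), then
# run_pending() eventually completes it, even across crashes.
```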
One obvious advantage AmoebaFS can leverage is the ability to place inode objects and directory trees on low-latency media like SSD or NVMe. This minimizes the time required to fetch those frequently accessed objects and improves overall performance. Hybrid layouts for file content might also make sense for some use cases, such as serving video files, where time-to-first-byte is an important metric. In such a case, the first few megabytes of a file could be backed by SSD to minimize head-of-file access times, while the rest of the file could be kept on spinning disks, where network buffering will mask the higher-latency reads. Whereas many traditional distributed file systems must be provisioned as either all SSD or all HDD, Aspen allows the backing media to be tuned to the specific workload and budget. AmoebaFS can also change its internal layout and backing media on the fly as runtime needs evolve.
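As a rough illustration of such a placement policy, the following sketch routes metadata objects and the head of each file to an SSD-backed pool and everything else to an HDD-backed pool. The pool names and the 4 MiB head size are arbitrary choices for the example, not values taken from AmoebaFS.

```python
# Hypothetical hybrid placement policy: metadata and the first few
# megabytes of file content go to low-latency stores, bulk data to
# spinning disks.

HEAD_BYTES = 4 * 1024 * 1024  # SSD-resident head of each file (assumed)

def choose_pool(kind, offset=0):
    """Pick a storage pool for an object.

    kind   -- "inode", "directory", or "content"
    offset -- byte offset within the file, for content objects
    """
    if kind in ("inode", "directory"):
        return "ssd-pool"  # frequently accessed metadata
    if kind == "content" and offset < HEAD_BYTES:
        return "ssd-pool"  # fast time-to-first-byte
    return "hdd-pool"      # bulk data on cheap spinning disks

# A video file: its head chunk lands on SSD, the rest on HDD.
assert choose_pool("content", offset=0) == "ssd-pool"
assert choose_pool("content", offset=50 * 1024 * 1024) == "hdd-pool"
assert choose_pool("inode") == "ssd-pool"
```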
Another feature of Aspen that AmoebaFS can take advantage of is the ability to migrate data stores between hosts. For initial prototyping and experimentation, the filesystem could be placed on stores all located on the same host. To protect against data loss even on a single system like this, the stores could be spread across a number of different disks. Later, the stores could be migrated to separate hosts in the data center to protect against node failures, and later still, some of those stores could be migrated to different sites to protect against regional outages. The ability to transfer stores between hosts likewise allows for the easy addition and removal of storage servers: to add a new server, simply create some stores on it or transfer existing ones to it; to remove one, simply migrate all of its stores off.
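The sketch below shows what decommissioning a storage server could look like under this model: every store on the departing host is reassigned to the least-loaded remaining host, where each reassignment stands in for an Aspen store transfer. The data structures and function names are hypothetical.

```python
# Hypothetical decommissioning helper: drain all stores off a host by
# reassigning each one to the least-loaded surviving host.

from collections import defaultdict

def decommission(placement, host):
    """Reassign every store on `host`; placement maps store id -> host."""
    loads = defaultdict(int)
    for h in placement.values():
        loads[h] += 1
    loads.pop(host, None)  # the departing host is no longer a target
    for store, h in sorted(placement.items()):
        if h == host:
            target = min(loads, key=loads.get)
            placement[store] = target  # stands in for a store transfer
            loads[target] += 1
    return placement

placement = {"s1": "hostA", "s2": "hostA", "s3": "hostB", "s4": "hostC"}
decommission(placement, "hostA")
assert all(h != "hostA" for h in placement.values())
```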
Currently, AmoebaFS is in the proof-of-concept stage and is aimed at demonstrating Aspen's viability as a distributed data platform. Most of the basic operations have been implemented, and the system works well enough for simple manual tinkering, but little effort has yet been put into hardening it or testing it comprehensively. The current implementation is also fairly simplistic and has yet to be enhanced to support the advanced use cases described above. The project seems promising, though, and would likely evolve rapidly if others find it interesting enough to join in its development.