Swift is a distributed object storage system designed to scale from a single machine to thousands of servers. It is optimized for multi-tenancy and high concurrency, making it ideal for backups, web and mobile content, and any other unstructured data that can grow without bound.

Swift was originally developed as the basis for Rackspace's Cloud Files and was open-sourced in 2010 as part of the OpenStack project. It has since grown to include contributions from many companies and has spawned a thriving ecosystem of third-party tools.

1 Swift vs Ceph

Swift has been around since the dawn of OpenStack time, which is barely five years ago. It is one of OpenStack's core projects and has been tested and found stable and useful time and again.

Trouble is, Swift’s design comes up short in both transfer speed and latency. A major reason is that all traffic to and from the Swift cluster flows through its proxy servers, which sit between clients and the storage nodes.
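To make the bottleneck concrete, here is a minimal sketch using python-swiftclient (the endpoint, credentials, container, and object names are placeholders): the client only ever talks to the proxy’s URL, so every byte uploaded or downloaded passes through a proxy server on its way to or from the storage nodes.

    from swiftclient import client

    # Hypothetical proxy endpoint and demo credentials; swap in real values.
    conn = client.Connection(
        authurl='http://swift-proxy.example.com:8080/auth/v1.0',
        user='test:tester',
        key='testing',
    )

    conn.put_container('backups')
    # Upload path: client -> proxy server -> object servers.
    conn.put_object('backups', 'db.tar.gz', contents=b'...')
    # Download path: object servers -> proxy server -> client.
    headers, body = conn.get_object('backups', 'db.tar.gz')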

Another reason many people consider Ceph the better alternative is that Swift offers only object storage, while Ceph provides block and file storage as well.

Finally, latency rears its ugly head in replication: object replicas aren’t necessarily updated at the same time, so a requester can receive an old version of an object shortly after the new version is first written. This behavior is known as eventual consistency.
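The effect is easy to demonstrate with another hedged python-swiftclient sketch (endpoint and names are again placeholders): a read issued immediately after an overwrite may be served by a replica that has not yet caught up.

    from swiftclient import client

    conn = client.Connection(
        authurl='http://swift-proxy.example.com:8080/auth/v1.0',  # placeholder
        user='test:tester',
        key='testing',
    )

    # Overwrite an existing object with a new version.
    conn.put_object('site-assets', 'index.html', contents=b'version 2')

    # This immediate read is not guaranteed to see the new version: the GET
    # may be served by a replica that still holds the previous contents.
    headers, body = conn.get_object('site-assets', 'index.html')
    print(body)  # usually b'version 2', but possibly the old version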

Ceph, on the other hand, has its own set of issues, especially in a cloud context. Its multi-region support, while often cited as an advantage, follows a master-slave model: replication runs only from master to slave, which leaves the load unevenly distributed in an infrastructure that covers more than two regions.

Ceph’s two-region design is also impractical: writes are supported only on the master, yet there is no provision to block writes on the slave. In a worst-case scenario, such a configuration can corrupt the cluster.

Another drawback to Ceph is security. RADOS clients on cloud compute nodes communicate directly with the RADOS servers over the same network Ceph uses for unencrypted replication traffic. If a Ceph client node gets compromised, an attacker could observe traffic on the storage network.

In light of Ceph’s drawbacks, you might ask why we don’t just build a Ceph cluster that spans two regions. One reason is that Ceph writes only synchronously and requires a quorum of writes to return successfully.

With those issues in mind, let’s imagine a cluster with two regions, separated by a thousand miles, 100ms of latency, and a fairly slow network connection. Let’s further imagine we are writing two copies into the local region and two more to the remote region. The quorum of our four copies is three (a majority), which means the write request will not return before at least one remote copy is written. It also means that even a small write is delayed by at least 0.2 seconds, a full round trip to the remote region, and larger writes are seriously hampered by the throughput restriction.
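The arithmetic is simple enough to spell out. A few lines of Python make the quorum size and the minimum added delay explicit (the constants mirror the scenario above):

    # Back-of-the-envelope numbers for the two-region example.
    replicas = 4                  # two local copies plus two remote copies
    quorum = replicas // 2 + 1    # majority quorum: 3 of 4
    one_way_latency_s = 0.100     # 100ms between the regions

    # With only two local copies, the third acknowledgment must come from
    # the remote region: one request out plus one acknowledgment back.
    min_added_delay_s = 2 * one_way_latency_s

    print(quorum)                 # 3
    print(min_added_delay_s)      # 0.2 seconds added to every write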

Swift, on the other hand, in the same two-region architecture can write locally first and then replicate to the remote region over time, thanks to its eventual consistency design. Swift also requires a write quorum, but the write_affinity setting can configure the cluster to satisfy that quorum entirely within the local region, so the write returns a success status as soon as the local copies finish.
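As a rough sketch of how this looks in practice, the relevant knobs live in the proxy server’s configuration; the region name r1 below is a placeholder for your local region:

    [app:proxy-server]
    use = egg:swift#proxy
    # Prefer nodes in the local region (r1, a placeholder name) for reads
    # and satisfy the write quorum there as well.
    sorting_method = affinity
    read_affinity = r1=100
    write_affinity = r1
    # How many local nodes to try before falling back to other regions.
    write_affinity_node_count = 2 * replicas

With write_affinity set, even the replicas destined for the remote region are first written to local handoff nodes, and the replicators then move them across the WAN asynchronously.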

2 How to choose

In a single-region deployment with no plans for multi-region expansion, Ceph can be the obvious choice. Mirantis OpenStack offers it as a backend for both Glance and Cinder; however, once larger scale comes into play, Swift becomes more attractive as a backend for Glance, because its multi-region capabilities may trump Ceph’s speed and stronger consistency model.