A comparative review of Swift vs Ceph

Swift and Ceph are popular cloud storage systems. Swift provides object storage, while Ceph provides object, block and file storage.

Swift is an open source object storage system that runs on standard server hardware. It offers a RESTful HTTP API, replicates data across failure zones, and scales out without downtime. Objects are written to multiple drives, and data is replicated across the cluster. If a node fails, Swift replicates its contents from the remaining active nodes to other nodes in the cluster. Swift scales horizontally by adding more nodes. The following figure shows a client accessing Swift storage. The client reaches Swift through a proxy server.
[Figure: a client accessing Swift storage through a proxy server]
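As a rough illustration of the REST access described above, the sketch below builds (but does not send) an object upload request. The URL layout `/v1/{account}/{container}/{object}` and the `X-Auth-Token` header follow the Swift object storage API. The proxy address, account, container, object name and token are made-up placeholders, and `build_swift_put` is a hypothetical helper, not part of any Swift client library.

```python
from urllib.parse import quote
from urllib.request import Request


def build_swift_put(proxy_url, account, container, obj, token, data):
    """Build (but do not send) a PUT request that uploads one object.

    All argument values used below are illustrative placeholders.
    """
    url = f"{proxy_url.rstrip('/')}/v1/{account}/{container}/{quote(obj)}"
    # Swift authenticates each request with a token in the X-Auth-Token header.
    return Request(url, data=data, method="PUT",
                   headers={"X-Auth-Token": token})


req = build_swift_put("http://proxy.example.com:8080", "AUTH_demo",
                      "photos", "cat.jpg", "example-token", b"image bytes")
print(req.get_method(), req.full_url)
# prints: PUT http://proxy.example.com:8080/v1/AUTH_demo/photos/cat.jpg
```

Note that the request is addressed to the proxy server, not to a storage node. The proxy decides which storage nodes hold the object.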

Ceph is a freely available storage platform that stores data on a single distributed cluster and provides interfaces for object-, block- and file-level storage to clients.
Ceph is distributed, scalable and has no single point of failure. It has more features than Swift: it supports block and file-level storage, which Swift does not. Ceph supports the XFS, Btrfs and ext4 file systems. The CRUSH algorithm determines data placement in Ceph. It calculates the location of a stored object whenever needed, so clients can communicate directly with the storage nodes.
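The idea of computing an object's location instead of looking it up can be sketched in a few lines. The example below is not CRUSH itself. It uses highest-random-weight (rendezvous) hashing as a much simpler stand-in, and it ignores the weights, failure domains and placement groups that real CRUSH handles. The function name `place` and the OSD names are assumptions made for illustration.

```python
import hashlib


def place(object_name, osds, replicas=3):
    """Pick `replicas` distinct OSDs for an object by hashing alone.

    Simplified stand-in for CRUSH: every client that knows the same list
    of OSDs computes the same answer, so no central lookup is needed.
    """
    def score(osd):
        digest = hashlib.sha256(f"{object_name}:{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # The OSDs with the highest scores for this object hold its replicas.
    return sorted(osds, key=score, reverse=True)[:replicas]


cluster_map = ["osd.0", "osd.1", "osd.2", "osd.3", "osd.4"]
print(place("photos/cat.jpg", cluster_map))
```

Because the result depends only on the object name and the list of OSDs, any client holding the same cluster map can find an object without asking an intermediary.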

The following diagram shows a client communicating with Ceph. The client accesses the Ceph object gateway through an API, and the gateway then provides access to the Ceph cluster.
[Figure: a client accessing the Ceph cluster through the Ceph object gateway]

Ceph performs better in terms of transfer speed and latency. Ceph clients contact the storage nodes directly to store and retrieve data. With Swift, all traffic to and from the cluster goes through the proxy servers, so the proxy tier can become a bottleneck.

When proxies are used, a load balancer must distribute work among the proxy nodes. Ceph instead has monitor nodes, which give cluster maps to the clients and storage nodes. Clients can thus contact the storage nodes directly to access data. This procedure is faster and adds less overhead than Swift's proxy path.

Another advantage of Ceph is that it provides block and file storage, whereas Swift provides only object storage.

Swift is an eventually consistent system. Object replicas may not be updated at the same time: an update may be applied on one node and take some time to reach the other nodes, so a client may receive an older version of the data. This is called eventual consistency. Swift is therefore not suitable for every situation, for example when a read must always return the latest write. Ceph, by contrast, is strongly consistent.
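A toy model makes the stale read concrete. The class below is a deliberately simplified sketch, not how Swift replicates data: it writes to one replica immediately and queues the update for the others until `sync()` is called. All names here are invented for illustration.

```python
class EventuallyConsistentStore:
    """Toy store: one replica is updated at once, the rest catch up later."""

    def __init__(self, replica_count=3):
        self.replicas = [dict() for _ in range(replica_count)]
        self.pending = []  # queued (replica index, key, value) updates

    def write(self, key, value):
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, key, replica):
        return self.replicas[replica].get(key)

    def sync(self):
        """Apply queued updates in order, as background replication would."""
        for index, key, value in self.pending:
            self.replicas[index][key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("report", "v1")
store.sync()
store.write("report", "v2")
print(store.read("report", replica=0))  # prints: v2
print(store.read("report", replica=2))  # prints: v1 (stale until sync)
store.sync()
print(store.read("report", replica=2))  # prints: v2
```

Between the second write and the second `sync()`, a client that happens to read from the lagging replica sees the older version.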

Swift storage is accessed through its HTTP REST interface. There is no other access point. Ceph, however, can be accessed in several ways. It provides a scalable, consistent object store and a number of interfaces to access it, including native access, an HTTP REST API, block devices and a filesystem-type interface.

For read operations, Ceph performs better than Swift. Ceph handles a higher number of read operations than Swift when the objects are small. As the object size grows, the number of read operations the two systems can perform becomes approximately the same, but each system reaches its peak performance with a different number of threads.
Ceph reaches better performance with more parallel workers than Swift, and it copes better with an increasing number of parallel requests.
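A comparison like this is usually made by repeating the same reads with different numbers of worker threads. The following sketch shows one way such a measurement could be set up. It is not the benchmark behind the figures discussed here: `fake_read` is a placeholder that only sleeps, and a real test would replace it with a call to the storage system's client.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_read_throughput(read_fn, keys, workers):
    """Return reads per second when `keys` are read by `workers` threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(read_fn, keys))
    elapsed = time.perf_counter() - start
    return len(results) / elapsed


def fake_read(key):
    """Placeholder for a real object read; sleeps to imitate latency."""
    time.sleep(0.001)
    return key


keys = [f"obj-{i}" for i in range(200)]
for workers in (1, 4, 16):
    rate = measure_read_throughput(fake_read, keys, workers)
    print(f"{workers:>2} workers: {rate:.0f} reads/s")
```

Running the same loop against each system, with small and large objects, shows at which thread count each one peaks.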

For write operations, Ceph performs better when the objects are small. Ceph's I/O performance scales better than Swift's because Ceph clients connect to the OSDs directly. Swift's I/O performance is limited by the proxy servers, which can become a bottleneck. Ceph is therefore faster and has less overhead. Lookups in Ceph are also fast because of the CRUSH algorithm. Ceph's response time is excellent for larger objects.

Ceph performs better in multi-user environments, as its performance degrades less as the number of clients increases. Ceph also gives better bandwidth at lower concurrency.

The following table shows a summary of what we have discussed.

[Table: summary comparison of Ceph and Swift]

Thus we can see that Ceph has more advantages than Swift.

Posted January 6, 2016 by John Mathew
