Scaling Costs with DynamoDB

Jeff Hwang

As a company grows, it accumulates more and more data. The exciting and terrifying part about working with email is that users come in with a large data set from the get-go. Early on, we knew we needed a database that would allow us to scale fast.

There are generally two ways companies go about this.

Scaling Vertically

Just upgrade to a bigger instance with more CPU, more RAM, and more storage. All you have to do is pay more. 💸 💸 Contrary to popular belief, this is a viable short-term approach, since it requires little to no engineering investment.

Scaling Horizontally

Alternatively, you can split your database across multiple smaller, cheaper instances (or machines). The rise of open source and NoSQL technologies over the past decade has made this much more feasible.

Breaking it down further, there are a few ways you can go about scaling horizontally, or sharding.

You can put different tables on different instances. 

This is the easiest option. It also has the added benefit of the bulkhead pattern: you can isolate your critical tables, such as the users table, from less critical ones, such as the notifications table. As long as you never have to do joins across instances, you're solid! But this won't work for us, because we're trying to shard a single table: threads.
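As a minimal sketch (the instance addresses and table names here are hypothetical), the routing logic can be as simple as a static map:

```python
# Hypothetical table-to-instance map: each table lives on its own database
# instance, so an overload on notifications can't take down users.
INSTANCE_FOR_TABLE = {
    "users": "db-critical.internal:5432",
    "threads": "db-threads.internal:5432",
    "notifications": "db-background.internal:5432",
}

def instance_for(table_name: str) -> str:
    return INSTANCE_FOR_TABLE[table_name]
```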

You can do range-based sharding.

In practice this is difficult and messy to do. You need to pick suitable sizes for your ranges, assign keys carefully so that no instance is overloaded, and reallocate keys when items are deleted.
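As a minimal sketch (the range boundaries here are made up for illustration), routing a key to a range is simple; choosing and maintaining the boundaries is where the pain lives:

```python
import bisect

# Hypothetical range boundaries: shard 0 holds keys starting before "g",
# shard 1 before "n", and so on. Picking these boundaries, and rebalancing
# them as data grows or gets deleted, is the hard part.
RANGE_UPPER_BOUNDS = ["g", "n", "t", "~"]

def shard_for_key(key: str) -> int:
    return bisect.bisect_left(RANGE_UPPER_BOUNDS, key[:1])

print(shard_for_key("alice"))  # -> 0
print(shard_for_key("zack"))   # -> 3
```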

You can do hash-based sharding.

This is a bit easier than range-based sharding, because the hash function handles balancing for you, as long as you pick a suitable key that distributes your dataset uniformly. For example, a user ID might be a uniform key if each of your users has approximately the same amount of data.
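A minimal sketch of the idea, assuming a fixed shard count and user IDs as the hash key:

```python
import hashlib

NUM_SHARDS = 8

# Hash the user ID and take the result modulo the shard count. If users hold
# roughly equal amounts of data, load spreads evenly across shards.
def shard_for_user(user_id: str) -> int:
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

print(shard_for_user("user-42"))  # deterministic shard in [0, 7]
```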

Scaling the number of instances

For the first two approaches, it's easy: if you have more tables or ranges, just put them on a new machine.

For hash-based sharding, you have to rebalance the cluster when you spin up a new instance, because keys that originally mapped to one instance may map to a different one once new instances are introduced. Of course, we're engineers. We're smart. We can use consistent hashing to reduce this rebalancing work.
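A minimal consistent-hashing sketch (instance names are placeholders): each instance is hashed onto a ring, and a key belongs to the first instance at or after its own position. Adding an instance only moves the keys between the new instance and its predecessor, instead of reshuffling everything:

```python
import bisect
import hashlib

def ring_position(value: str) -> int:
    # Map a string to a position on the hash ring.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, instances):
        self.ring = sorted((ring_position(name), name) for name in instances)

    def instance_for(self, key: str) -> str:
        positions = [pos for pos, _ in self.ring]
        # Wrap around to the first instance if the key hashes past the end.
        index = bisect.bisect(positions, ring_position(key)) % len(self.ring)
        return self.ring[index][1]

ring = ConsistentHashRing(["db-1", "db-2", "db-3"])
print(ring.instance_for("user-42"))
```

Real implementations also place several virtual nodes per instance on the ring to smooth out the balancing.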

Other Factors

Scalability is only one aspect of a database, and it doesn't stop there. What about availability? What about redundancy? What about security? These are real questions infrastructure engineers have to think about.

Using Managed Services like a Startup

Scalability was our prime focus. But we also wanted a simple solution, so we could focus on building a great product and not have to worry about infrastructure issues. We turned our eyes to AWS DynamoDB, which is based on the original Dynamo paper. It is Amazon's fully managed, NoSQL, key-value store. From an engineering standpoint, it's a beautiful product.

  • Flexible schema.
  • Low latency.
  • Hash-based sharding.
  • Scalability at the push of a button.
  • Redundancy across availability zones.
  • Role-based permissions and two-factor authentication through IAM.
  • Monitoring and metrics through AWS CloudWatch. (not the best, but it works...)
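As a rough sketch of what that provisioning looks like with boto3 (the table name and key are made up for illustration):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Create a table with a hypothetical "ThreadId" hash key and provision
# read/write capacity up front. Scaling later is one UpdateTable call
# (or a slider in the console).
dynamodb.create_table(
    TableName="Threads",
    AttributeDefinitions=[{"AttributeName": "ThreadId", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "ThreadId", "KeyType": "HASH"}],
    ProvisionedThroughput={"ReadCapacityUnits": 1000, "WriteCapacityUnits": 500},
)
```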

Partitioning Pains with DynamoDB

The number of partitions in your cluster is controlled by three factors: dataset size, read capacity, and write capacity. You can adjust the read and write capacity through their console UI, and this translates into a number of partitions. Their best practices article explains how to calculate the estimated number of partitions. Basically, it boils down to a simple formula.
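Here is a rough sketch of the throughput side of that calculation, using the per-partition limits AWS documented at the time (about 3,000 read capacity units and 1,000 write capacity units per partition; the exact constants may have changed since):

```python
import math

# Rough estimate of partitions needed for provisioned throughput alone,
# assuming ~3,000 RCUs and ~1,000 WCUs per partition (per AWS docs at the time).
def partitions_for_throughput(read_capacity_units: int, write_capacity_units: int) -> int:
    return math.ceil(read_capacity_units / 3000 + write_capacity_units / 1000)

print(partitions_for_throughput(5000, 1000))  # -> 3
```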

If you want more throughput, just crank up the capacity and the number of partitions will follow to meet your requests. Simple. However, there is actually another factor you need to consider. What is the read and write capacity PER partition?

AWS chooses to split your provisioned read and write capacity uniformly across all partitions.
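For example, a table provisioned with 5,000 read capacity units that has been split into 10 partitions gets only 500 read capacity units per partition, regardless of which partitions your traffic actually hits.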

Hey wait a minute...

🤔

More partitions means less capacity per partition

So does that really mean that if I have more instances, I have less throughput per instance? Technically, yes. AWS is relying on the fact that if your partition key (or hash key) distributes your data perfectly evenly, then the capacity demands of each partition will also split evenly.

However, this is hardly the case in production.

Chances are you will have to scale again. Rinse and repeat. Uniformity is the goal, not the norm. How many times have you flipped a coin 10 times in a row and gotten an equal number of heads and tails? The probability of heads is 1/2, so the outcome should be even, right? In practice, expectations are only realized in the long run, when you have a large enough sample.

Additionally, how uniform is your dataset to begin with? Is there a uniform key to pick? For Polymail, we have power users who send hundreds of messages a day, and users who only send 10-20 messages per day. Splitting by folder would cause issues as well, since most users' Archive folders hold significantly more messages than their Inbox folders.

Issues with large data sets

DynamoDB has another formula for the number of partitions.
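The size side of the estimate looked roughly like this, assuming the ~10 GB per-partition limit documented at the time (again, this constant may have changed). The actual partition count then works out to roughly the larger of the two estimates:

```python
import math

# Rough estimate of partitions needed for storage alone, assuming the
# ~10 GB per-partition limit from AWS docs at the time.
def partitions_for_size(table_size_gb: float) -> int:
    return math.ceil(table_size_gb / 10)

# A 500 GB table needs at least 50 partitions, no matter how little
# throughput you actually provision.
print(partitions_for_size(500))  # -> 50
```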

If you have a lot of data, as in our case, this will split your partitions, and you will never be able to meet the read and write capacity demands of each partition without over-provisioning.

Eventually, you can end up with something like this:

Over-provisioned to handle spikes, with a mix of hot and under-utilized partitions.

Partitions are split forever

If you want to handle spikes in traffic, the obvious course of action is to increase capacity to handle the surge, then bring it back down to save money. You can scale the read and write capacity freely, but once your partitions split, they are split forever.
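A minimal sketch of that scale-up/scale-down cycle with boto3 (again, the table name is illustrative):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Crank capacity up to absorb a traffic spike...
dynamodb.update_table(
    TableName="Threads",
    ProvisionedThroughput={"ReadCapacityUnits": 10000, "WriteCapacityUnits": 5000},
)

# ...then bring it back down to save money. The provisioned capacity shrinks,
# but any partitions created during the spike never merge back together, so
# the lower capacity is now spread across more partitions.
dynamodb.update_table(
    TableName="Threads",
    ProvisionedThroughput={"ReadCapacityUnits": 1000, "WriteCapacityUnits": 500},
)
```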

This is one of the biggest caveats, which is actually mentioned in their docs.

Whether it's a technical limitation or a profitable business strategy, it's definitely a cash sink for companies using this service. 😒

Conclusion

DynamoDB is still a good product. It works reliably. Our main gripe with the service is the cost, due to our access patterns, AWS’s pricing model, and the inability to rebalance our cluster. 

If you don't have intense performance requirements, then DynamoDB may still be a viable option. We still use it in production today for some smaller, less frequently accessed tables, where the cost is insignificant. Or, if your company can afford to over-provision, then none of the issues above really pertain to you.

There are also cool features you don't get with many other databases, such as DynamoDB Streams, which lets you consume CRUD operations as event streams.
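A minimal sketch of reading a stream with boto3, assuming streams are enabled on the table (the ARN below is a placeholder):

```python
import boto3

streams = boto3.client("dynamodbstreams")

# Placeholder ARN; use the StreamArn reported for your table.
stream_arn = "arn:aws:dynamodb:us-east-1:123456789012:table/Threads/stream/..."

shards = streams.describe_stream(StreamArn=stream_arn)["StreamDescription"]["Shards"]
for shard in shards:
    iterator = streams.get_shard_iterator(
        StreamArn=stream_arn,
        ShardId=shard["ShardId"],
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]
    # Each record describes an INSERT, MODIFY, or REMOVE on the table.
    for record in streams.get_records(ShardIterator=iterator)["Records"]:
        print(record["eventName"], record["dynamodb"]["Keys"])
```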

The main takeaway is to pay attention to, and plan for, when your partitions split.

A future with Spanner

After more than half a year with DynamoDB, we turned our eyes to another product: Google Cloud Spanner, Google's horizontally scalable, strongly consistent, relational database service. It is the public version of the internal database Google uses to run many of its products globally, and it was released earlier this year (2017).

It has many of the attributes we initially liked about DynamoDB: 

  • Fully managed.
  • Hash-based sharding, and scalability at the push of a button.
  • Role-based permissions through IAM.
  • Monitoring and metrics through their console and GCP Stackdriver (also not the best, but it works...)

Row Level Sharding 

Scalability is handled a bit differently with Spanner. Rather than specifying a partition key to determine which instance your data lives on, Spanner shards on a row-level basis, hashing on the primary key.

This helps with data uniformity and load distribution: rows are more or less similar in size, and you have lots of them, which addresses the expectation issue from earlier.

Additionally, GCP splits based on access patterns to minimize hotspots. Their schema design docs and their optimization white paper go into depth on how this works.

Data Locality

The other major difference between Dynamo and Spanner is how they handle data locality. In Spanner, keys that share a prefix are stored close together, either on the same shard or on the same node.

This has very interesting characteristics. For example, you can prefix your keys with user IDs to make sure all of a user's data is grouped together. Altogether, this allows for very flexible scalability with minimal tradeoffs in latency.
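A minimal sketch of what that looks like as a schema, using the google-cloud-spanner Python client (the instance, database, table, and column names here are hypothetical):

```python
from google.cloud import spanner

client = spanner.Client()
database = client.instance("polymail-instance").database("polymail-db")  # placeholder names

# Leading the primary key with UserId keeps each user's messages in one
# contiguous key range, so they tend to land on the same split.
operation = database.update_ddl([
    """
    CREATE TABLE Messages (
        UserId    STRING(64) NOT NULL,
        MessageId STRING(64) NOT NULL,
        Subject   STRING(MAX),
    ) PRIMARY KEY (UserId, MessageId)
    """
])
operation.result()  # wait for the schema change to complete
```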

However, for the sake of completeness: this may not be acceptable for applications with strict latency requirements, where you might want guarantees that certain data lives on the same node. For us, slightly higher latency is a fair trade for more flexibility.

Next Steps

In part 2 of this series, we'll discuss how we migrated from DynamoDB to Spanner. And in part 3, we'll go into our production experience with Spanner.

Stay tuned!! 💌
