Edit 12/14/2018: This document is out of date! One of the great things about DynamoDB is that AWS is constantly improving it. AWS has since posted documentation on how to do row versioning in Using Sort Keys for Version Control. I’d also recommend watching Advanced Design Patterns for DynamoDB from re:Invent 2018. The speaker recommends using the new TransactWriteItems operation to get rid of the complexity of ordering requests.
Your architecture should be built around your application’s use-cases. The layout below can be used if you don’t expect your users to access older versions frequently.
We can maintain two separate tables for the resources that need versioning.
The resource table contains only the latest version of each item.
The primary key of the table is just its partition key.
In the table below, that key is hash, which is a String, and version is a Number.
The remaining attributes define your resource.
Because hash is the only attribute in the primary key, any new entry with the same hash will overwrite the existing row.
We assume that any operation that modifies a record, whether it is a new update or a rollback, always increments the version number of the item.
Resource History Table
The resource-history table contains every revision of each item.
It will use more storage, but can be provisioned with a lower read capacity if you do not expect users to retrieve older entries frequently.
The main difference is that the primary key is a composite key (partitionKey: hash, sortKey: version), so every new version for the same hash will have its own row.
| hash | version | remaining attributes |
|---|---|---|
| 1c5815b2 | 1 | some old values |
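The two key layouts described above can be written down as a minimal sketch in the shape that boto3's `create_table` accepts. The table names follow this article. Billing and throughput settings are deliberately left out, so these dictionaries are only the key-related part of a full definition.

```python
# Key-related parts of the two table definitions (a sketch, not a complete
# create_table request: billing/throughput settings are omitted).
RESOURCE_TABLE = {
    "TableName": "resource",
    "AttributeDefinitions": [
        {"AttributeName": "hash", "AttributeType": "S"},
    ],
    # Partition key only: a new write with the same hash replaces the row.
    "KeySchema": [
        {"AttributeName": "hash", "KeyType": "HASH"},
    ],
}

RESOURCE_HISTORY_TABLE = {
    "TableName": "resource-history",
    "AttributeDefinitions": [
        {"AttributeName": "hash", "AttributeType": "S"},
        {"AttributeName": "version", "AttributeType": "N"},
    ],
    # Composite key: every (hash, version) pair gets its own row.
    "KeySchema": [
        {"AttributeName": "hash", "KeyType": "HASH"},
        {"AttributeName": "version", "KeyType": "RANGE"},
    ],
}
```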
Creating a new item involves first writing the item to the resource-history table, and then writing the same entry to the resource table.
If the first step fails, then nothing has been written to the tables and the user can safely issue another request.
If the second step fails, then there will be an extra record in the resource-history table that no user will ever access.
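As a rough illustration of that write order, here is a minimal sketch. It assumes `resource_table` and `history_table` are boto3 DynamoDB `Table` resources (or anything else exposing `put_item(Item=...)`). The function name `create_item` and the `attributes` argument are hypothetical.

```python
def create_item(resource_table, history_table, item_hash, attributes):
    """Write version 1 of an item: history table first, then resource table."""
    item = {**attributes, "hash": item_hash, "version": 1}

    # Step 1: if this raises, nothing has been written to either table.
    history_table.put_item(Item=item)

    # Step 2: if this raises, only an extra history row exists, and no
    # reader of the resource table will ever see it.
    resource_table.put_item(Item=item)
    return item
```

Writing to the history table first means a failure can leave an unused history row behind, but it never leaves a latest version without a matching history entry.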
Retrieving the latest item only requires us to fetch the record with that hash from the resource table. We are guaranteed to have either one or no record for a given hash.
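A read of the latest version could look like the sketch below, again assuming `resource_table` behaves like a boto3 `Table` resource, whose `get_item` response omits the `"Item"` key when no row matches. The helper name `get_latest` is hypothetical.

```python
def get_latest(resource_table, item_hash):
    """Return the latest version of an item, or None if the hash is unknown."""
    response = resource_table.get_item(Key={"hash": item_hash})
    # The response has no "Item" entry when nothing matches the key.
    return response.get("Item")
```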
Updating an existing item requires us to first fetch the item’s latest version from the resource table, increment its version, and then write the new entry to both tables just like in CREATE.
The failure scenarios are similar to the CREATE operation. If the new entry is added only to the resource-history table, then when the user retries the same update operation, the previously created entry with the key (hash, v2) in resource-history will be replaced.
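The update flow, and a rollback expressed as an update, can be sketched as follows. This assumes both tables behave like boto3 `Table` resources (`get_item(Key=...)`, `put_item(Item=...)`). The names `update_item` and `rollback_item` are hypothetical, and treating a rollback as "copy an old revision into a new version" is one reading of the rule that rollbacks also increment the version.

```python
def update_item(resource_table, history_table, item_hash, attributes):
    """Write `attributes` as the next version of an existing item."""
    current = resource_table.get_item(Key={"hash": item_hash}).get("Item")
    if current is None:
        raise KeyError(f"no item with hash {item_hash!r}")

    new_item = {**attributes, "hash": item_hash, "version": current["version"] + 1}

    # Same order as CREATE. If the second write fails, the resource table
    # still holds the old version, so a retry computes the same version
    # number and replaces the leftover (hash, version) history row.
    history_table.put_item(Item=new_item)
    resource_table.put_item(Item=new_item)
    return new_item


def rollback_item(resource_table, history_table, item_hash, target_version):
    """Re-publish an old revision as a new, higher version."""
    old = history_table.get_item(
        Key={"hash": item_hash, "version": target_version}
    ).get("Item")
    if old is None:
        raise KeyError(f"no version {target_version} for hash {item_hash!r}")

    attributes = {k: v for k, v in old.items() if k not in ("hash", "version")}
    return update_item(resource_table, history_table, item_hash, attributes)
```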
Deleting an item requires us to only delete it from the main resource table.
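A delete under this layout is a single call. The sketch below assumes `resource_table` behaves like a boto3 `Table` resource, and the helper name `delete_item` is hypothetical. It leaves the resource-history table untouched.

```python
def delete_item(resource_table, item_hash):
    """Remove the latest version; the history table is not modified."""
    resource_table.delete_item(Key={"hash": item_hash})
```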
If you work with a single table and try to have immutable records, then the UPDATE operation is going to have a user experience trade-off.
For example, if you decide to implement the Stack Overflow answer listed in the introduction, you will have two writes to the same table. Depending on the order you’ve chosen, if the second write operation fails, then the user will either see their item deleted or you’ll lose a historical record.
Client side row versioning is not perfect. Our solution also has drawbacks as explained above, but the customer experience is better than actually losing data.