Data manipulation

Amazon’s DynamoDB offers the ability to both update and insert data with a single save() method that is mostly exposed by Dynamodb-mapper.

Saving

As Dynamodb-mapper directly exposes item properties as Python properties, manipulating data is as easy as manipulating any Python object. Once done, just call save() on your model instance.

Conflict detection

save() has an optional parameter raise_on_conflict. When set to True, save will ensure that:

  • saving a new object will not overwrite a pre-existing one at the same keys
  • the DB object has not changed since it was read from the DB.

If the first scenario occurs, OverwriteError is raised. In all other cases, it is ConflictError.

Please note that OverwriteError inherits from ConflictError. If you make a distinction between both cases, OverwriteError must be the first except block; otherwise the ConflictError block would catch it too.
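The ordering rule can be tried without touching DynamoDB. The sketch below declares two stand-in exception classes locally (they are not imported from Dynamodb-mapper) and assumes OverwriteError is the more specific of the two, which is what makes the "OverwriteError first" ordering matter. classify is a hypothetical helper written for this illustration.

```python
# Stand-in exceptions for illustration only; real code would import the
# library's own classes. Assumption: OverwriteError is the more specific one.
class ConflictError(Exception):
    pass


class OverwriteError(ConflictError):
    pass


def classify(error):
    """Hypothetical helper: report which except block handles `error`."""
    try:
        raise error
    except OverwriteError:
        # Most specific block first, otherwise it would never be reached.
        return "overwrite"
    except ConflictError:
        return "conflict"


print(classify(OverwriteError()))
print(classify(ConflictError()))
```

Swapping the two except blocks would make both calls report "conflict", which is why the order matters when the two cases need different handling.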

Use case: Virtual coins

When a player purchases a virtual good in a game, virtual money is withdrawn from their internal account. After the operation, the balance must be >= 0. If multiple orders are being processed at the same time, we must prevent the lost update scenario:

  • initial balance = 200
  • purchase P1 150
  • purchase P2 100
  • read balance P1 -> 200
  • read balance P2 -> 200
  • update balance P1 -> 50
  • update balance P2 -> 100

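Here the last write wins: the final balance is 100 and the 150 withdrawal of P1 has been lost. When saving, you expect that the balance has not changed since you read it. This is what raise_on_conflict is for.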
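The timeline above can be replayed with a plain dictionary standing in for the table. This is only an in-memory sketch of the idea behind a conditional write, not the mapper's implementation; StaleWrite, blind_write, conditional_write and run are hypothetical names.

```python
# In-memory stand-in for a table; no DynamoDB call is made here.
class StaleWrite(Exception):
    """Hypothetical error: the stored value changed since it was read."""


def blind_write(table, key, expected, new_value):
    # Unconditional write: `expected` is ignored, the last writer wins.
    table[key] = new_value


def conditional_write(table, key, expected, new_value):
    # Write only if the stored value is still the one that was read.
    if table[key] != expected:
        raise StaleWrite(key)
    table[key] = new_value


def run(write):
    """Replay the timeline: both purchases read first, then both write."""
    table = {"waldo": 200}
    read_p1 = table["waldo"]
    read_p2 = table["waldo"]
    write(table, "waldo", read_p1, read_p1 - 150)
    try:
        write(table, "waldo", read_p2, read_p2 - 100)
    except StaleWrite:
        return table["waldo"], "P2 rejected"
    return table["waldo"], "P2 accepted"


print(run(blind_write))        # (100, 'P2 accepted')
print(run(conditional_write))  # (50, 'P2 rejected')
```

With the unconditional write, P1's withdrawal silently disappears. With the conditional write, P2 is rejected and can decide what to do from a fresh read.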

from dynamodb_mapper.model import DynamoDBModel, ConflictError

class NotEnoughCreditException(Exception):
    pass

class User(DynamoDBModel):
    __table__ = u"game-dev-users"
    __hash_key__ = u"login"
    __schema__ = {
        u"login": unicode,
        u"firstname": unicode,
        u"lastname": unicode,
        u"email": unicode,
        u"connections": int,
        #...
        u"balance": int,
    }

user = User.get("waldo")
if user.balance - 150 < 0:
    raise NotEnoughCreditException
user.balance -= 150

try:
    user.save(raise_on_conflict=True)
except ConflictError:
    print "Ooops: Lost update syndrome caught!"

Note: In a real-world application, this would most probably be wrapped in Transactions, which transparently rely on the same mechanism and provide a way to persist states.
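Outside of Transactions, a common way to react to a conflict is to re-read the item and re-apply the change. The sketch below shows that loop against an in-memory FakeUser stand-in whose get() and save(raise_on_conflict=True) only imitate the behaviour described in this section. FakeUser, withdraw and max_attempts are hypothetical names, not part of Dynamodb-mapper.

```python
class ConflictError(Exception):
    """Stand-in for the mapper's exception of the same name."""


class NotEnoughCreditException(Exception):
    pass


class FakeUser(object):
    """In-memory stand-in imitating get() and save(raise_on_conflict=True)."""
    _db = {}  # login -> stored balance

    def __init__(self, login, balance):
        self.login = login
        self.balance = balance
        self._read_balance = balance  # value seen at read time

    @classmethod
    def get(cls, login):
        return cls(login, cls._db[login])

    def save(self, raise_on_conflict=False):
        if raise_on_conflict and self._db[self.login] != self._read_balance:
            raise ConflictError(self.login)
        self._db[self.login] = self.balance
        self._read_balance = self.balance


def withdraw(login, amount, max_attempts=3):
    """Re-read and retry when another writer got there first."""
    for _ in range(max_attempts):
        user = FakeUser.get(login)
        if user.balance - amount < 0:
            raise NotEnoughCreditException(login)
        user.balance -= amount
        try:
            user.save(raise_on_conflict=True)
            return user.balance
        except ConflictError:
            continue  # stale read: start over from a fresh copy
    raise ConflictError(login)


FakeUser._db["waldo"] = 200
print(withdraw("waldo", 150))
```

The credit check is repeated on every attempt because the balance read on a retry may no longer cover the purchase.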

Deleting

Just like save(), delete() features the raise_on_conflict option. When True, it will ensure that:

  • deleting a new object does nothing. In other words, you are not accidentally deleting a random Item
  • the DB object has not changed since it was read from the DB.

In all other cases, the delete operation will proceed as usual.

Note: Eventually consistent read operations might still be able to get the Item for a short while after the deletion, usually under 1s.

Use case: single operation user deletion

An item may be deleted in a single operation as long as the keys are known. The trick is to create an object with only these keys and to call delete on it. Of course, it will not work if raise_on_conflict=True.

from dynamodb_mapper.model import DynamoDBModel
from boto.dynamodb.exceptions import DynamoDBKeyNotFoundError

class User(DynamoDBModel):
    __table__ = u"game-dev-users"
    __hash_key__ = u"login"
    __schema__ = {
        u"login": unicode,
        u"firstname": unicode,
        u"lastname": unicode,
        u"email": unicode,
        u"connections": int,
        #...
        u"balance": int,
    }

try:
    user = User(login=u"waldo")
    user.delete()
except DynamoDBKeyNotFoundError:
    print "Ooops: user 'waldo' did not exist. Can't delete it!"

Autoincrement technical background

When saving an Item with an autoincrement_int hash_key, the save() method will automatically add checks to prevent accidental overwrite of the “magic item”. The magic item holds the last allocated ID and is saved at hash_key=-1. If hash_key is None, then a new ID is automatically and atomically allocated, meaning that no collision can occur even if the database connection is lost. Additionally, a check is performed to make sure no Item was manually inserted at this location. If applicable, a maximum of MAX_RETRIES=100 attempts to allocate a new ID will be performed before raising MaxRetriesExceededError. In all other cases, the Item will be saved exactly where requested.

In short, Items with an autoincrement_int hash_key involve 2 write requests on first save. It is important to keep this in mind when sizing an insert-intensive application.
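The allocation scheme can be pictured with a plain dictionary standing in for the table. This is only a sketch of the idea described above, not the mapper's code: MAGIC_KEY, allocate_id and save_new are hypothetical names, and the dictionary updates stand in for what would be conditional writes against DynamoDB.

```python
MAGIC_KEY = -1      # location of the "magic item", as described above
MAX_RETRIES = 100   # same bound as quoted above


class MaxRetriesExceededError(Exception):
    """Stand-in for the mapper's exception of the same name."""


def allocate_id(table):
    """Write #1: bump the counter held by the magic item."""
    for _ in range(MAX_RETRIES):
        last_id = table.get(MAGIC_KEY, {"last_id": 0})["last_id"]
        candidate = last_id + 1
        table[MAGIC_KEY] = {"last_id": candidate}
        if candidate not in table:
            return candidate
        # An Item was manually inserted at this ID: skip it and try again.
    raise MaxRetriesExceededError()


def save_new(table, item):
    """Allocate an ID, then store the Item at that ID (write #2)."""
    new_id = allocate_id(table)
    table[new_id] = dict(item, id=new_id)
    return new_id


table = {}
print(save_new(table, {"name": "first"}))
```

In the common case the counter update and the Item write are the two write requests mentioned above; every skipped ID costs one more counter update.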

Know when to use it, when *not* to use it.

Example:

>>> model = MyModel() # model with an autoincrement_int 'id' hash_key
>>> model.do_stuff()
>>> model.save()
>>> print model.id # An id field is automatically generated
7

About editing hash_key and/or range_key values

Key fields specify the Item position. Amazon’s DynamoDB has no support for “moving” an Item. This means that any edit of the hash_key and/or range_key values will preserve the original Item and insert a new one at the specified location. To prevent accidental key value changes, set raise_on_conflict=True when calling save().
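The effect of editing a key is easy to picture with a dictionary standing in for the table, since a save is simply a write at whatever key the object currently carries. This is an in-memory illustration with hypothetical names (table, save), not a call into the mapper.

```python
table = {24: {"id": 24, "name": "waldo"}}


def save(table, item):
    # A save writes at the position given by the item's current key.
    table[item["id"]] = dict(item)


item = dict(table[24])   # read the Item stored at key 24
item["id"] = 42          # edit the key...
save(table, item)        # ...and save: a second Item appears

print(sorted(table))     # [24, 42]
```

The Item at key 24 is untouched, so the data now exists twice.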

If you indeed meant to move the Item:

  • delete the item
  • save it to the new location

Example:

>>> model = MyModel.get(24)
>>> model.delete() # Delete *first*
>>> model.id = 42  # Then change the key(s)
>>> model.save()   # Finally, save it

Logically group data manipulations

Some data manipulations require a whole context to stay consistent, or need their status to be saved along the way. If your application requires any of these features, please go to the transactions section of this guide.

Limitations

Some limitations compared to Amazon’s DynamoDB currently apply to this mapper. save() has no support for:

  • returning data after a transaction
  • atomic increments

Please let us know if this is a blocker for you!