Monday, June 20, 2011

Amazon SimpleDB in a Nutshell for Those Who Know RDBMS Systems

Note: As of this writing, the latest available Amazon SimpleDB release is in beta, with API version “2009-04-15”.

Overview of SimpleDB
Amazon SimpleDB is a highly available, flexible, and scalable non-relational data store that offloads the work of database administration. Developers simply store and query data items via web services requests.[1]

Representation of Data with SimpleDB:

Domains: Similar to tables, domains are collections of similar data.

You can execute queries against a domain, but cannot execute queries across different domains.

Attributes: Similar to columns in an RDBMS, attributes represent categories of data that can be assigned to items.

Items: Similar to rows, items represent individual objects that contain one or more attribute name-value pairs.

Values: Similar to column values, values represent instances of attributes for items. An attribute can have multiple values. Data typing is not supported for attributes, and all data is treated as text during query execution.
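To make the mapping concrete, here is a minimal in-memory sketch of the data model described above. It does not call SimpleDB at all; the `domain` structure, the `put_attribute` helper, and the item and attribute names are hypothetical and only illustrate that items can have different attributes, that an attribute can hold several values, and that everything is stored as text.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for ONE SimpleDB domain (not the real API):
# item name -> attribute name -> set of text values.
domain = defaultdict(lambda: defaultdict(set))

def put_attribute(item_name, attr_name, value):
    """Add one value to an attribute of an item; values are always kept as text."""
    domain[item_name][attr_name].add(str(value))

# Two items in the same domain; they need not share the same attributes.
put_attribute("item_001", "Category", "Clothes")
put_attribute("item_001", "Color", "Blue")
put_attribute("item_001", "Color", "Red")    # second value for the same attribute
put_attribute("item_002", "Category", "Car Parts")
put_attribute("item_002", "Year", 2011)      # stored as the text "2011"
```

In an RDBMS the second `Color` value would need either another row or a separate table; here it is simply one more name-value pair on the same item.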

However, Amazon SimpleDB is not a relational database and does not offer some features that certain applications need, e.g. complex transactions or joins (i.e. queries across different domains).[1] Where you would otherwise use a join, you need to rely on duplicating the data instead.[2]
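A rough sketch of what "duplicating the data" can look like in practice: instead of joining a posts domain to a users domain at query time, the author's name is copied onto each post when it is written. The plain dictionaries and the `add_post` and `posts_with_author` helpers below are hypothetical stand-ins for two domains, not SimpleDB calls.

```python
# Hypothetical stand-ins for two separate domains (plain dicts, not the real API).
users = {"user_1": {"name": "Alice"}}
posts = {}

def add_post(post_id, user_id, title):
    """Write a post and copy the author's name onto it (denormalization)."""
    posts[post_id] = {
        "title": title,
        "user_id": user_id,
        # Duplicated from the users domain so that reads need no join.
        "author_name": users[user_id]["name"],
    }

def posts_with_author():
    """Answer 'title plus author name' from the posts domain alone."""
    return [(post["title"], post["author_name"]) for post in posts.values()]

add_post("post_1", "user_1", "SimpleDB in a nutshell")
```

The trade-off is the usual one for denormalized data: reads touch a single domain, but if the author's name changes, every copy has to be updated by the application.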

Benefits of using SimpleDB

Highly Available
Amazon SimpleDB creates and manages multiple geographically distributed replicas of your data automatically to enable high availability and data durability.
Flexible
You can change your data model on the fly, and data is automatically indexed for you.
Scalable
You can access additional machine resources by spreading your data set and requests across multiple domains.
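One possible way to spread items across multiple domains is to derive the domain name from a stable hash of the item name. The sketch below assumes a hypothetical base name (`orders`) and a fixed shard count of 4; neither comes from SimpleDB itself, and because queries cannot span domains, the application would have to query each shard and merge the results.

```python
import hashlib

def domain_for(item_name, base="orders", shards=4):
    """Map an item name to one of `shards` domain names, e.g. 'orders_0'.

    The base name and shard count are illustrative assumptions. MD5 is used
    only as a stable hash (not for security), so the same item name maps to
    the same domain on every run and on every machine.
    """
    digest = hashlib.md5(item_name.encode("utf-8")).hexdigest()
    return "%s_%d" % (base, int(digest, 16) % shards)

def all_domains(base="orders", shards=4):
    """List every shard domain, e.g. to run the same query against each one."""
    return ["%s_%d" % (base, i) for i in range(shards)]
```

Python's built-in `hash()` is avoided here because its result for strings can differ between interpreter runs, which would send the same item to different domains.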

Support for Reporting
SQL Server Reporting Services (SSRS) is a popular platform for building and accessing reports. An SSRS report is built on the dataset defined for that report, and it is possible to build such a dataset from SimpleDB.[3]

Limitations of SimpleDB

Domains (similar to tables): 250 active domains per account. More can be requested by filling out a form. Note that each attribute (similar to a column) can hold multiple values.[4]

Attribute (similar to column) name-value pairs per item: 256[4]

Maximum response size for Select: 1 MB (large data such as images can be stored separately as files in the cloud, with an attribute storing the URL of the resource).[4]

Maximum items per select: 2500[4]

Attribute value length: 1024 bytes[4]

No data typing: text only. Integers and reals must be padded with leading zeros to a fixed width to ensure proper query comparisons.[5]
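The last two limits are easy to trip over in application code, so here is a small sketch of how a client might handle them before writing data. The `encode_int` and `check_value` helpers and the width of 10 digits are hypothetical choices, not part of SimpleDB; the sketch covers non-negative integers only and leaves negative numbers and reals out.

```python
MAX_VALUE_BYTES = 1024  # attribute value length limit listed above

def encode_int(n, width=10):
    """Zero-pad a non-negative integer so that text order matches numeric order.

    The width of 10 is an illustrative assumption; it must be large enough
    for the biggest value the application will ever store.
    """
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    text = str(n).zfill(width)
    if len(text) > width:
        raise ValueError("value does not fit in the chosen width")
    return text

def check_value(value):
    """Return True if a text value fits within the attribute value length limit."""
    return len(value.encode("utf-8")) <= MAX_VALUE_BYTES

# Compared as plain text, "9" sorts after "10", which is wrong numerically.
unpadded_is_wrong = "9" > "10"
# With padding, text comparison agrees with numeric comparison.
padded_is_right = encode_int(9) < encode_int(10)
```

The same padded form has to be used both when storing values and when writing comparisons in queries; otherwise the text comparison falls back to the incorrect ordering.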

References:

[1] Amazon SimpleDB (beta)
http://aws.amazon.com/simpledb/

[2] How And Why Glue Is Using Amazon SimpleDB Instead Of A Relational Database
http://blog.getglue.com/?p=1145

[3] Accessing SimpleDB from SSRS
http://www.chrisumbel.com/article/simpledb_ssrs.aspx

[4] Amazon SimpleDB Limits, Amazon SimpleDB Developer Guide (API Latest version)
http://docs.amazonwebservices.com/AmazonSimpleDB/latest/DeveloperGuide/

[5] M/DB - A Free Open Source "plug-compatible" alternative to Amazon's SimpleDB database
http://gradvs1.mgateway.com/main/index.html?path=mdb