Bigtable
A Distributed Storage System for Structured Data
Some features
- fast and extremely large-scale DBMS
- a sparse, distributed multi-dimensional sorted map, sharing characteristics of both row-oriented and column-oriented databases.
- designed to scale into the petabyte range
- it works across hundreds or thousands of machines
- machines can be added to the system and are put to use automatically, without any reconfiguration
- each table has multiple dimensions (one of which is a field for time, allowing versioning)
- tables are optimized for GFS (Google File System) by being split into multiple tablets: segments of the table, split along a row chosen such that each tablet is ~200 megabytes in size.
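The tablet idea in the last bullet can be illustrated with a small sketch. It is not Bigtable's actual splitting logic: it assumes the row sizes are known up front and cuts the sorted rows greedily, whereas a real system splits tablets as they grow. The names `split_into_tablets` and `find_tablet` are hypothetical.

```python
import bisect


def split_into_tablets(row_sizes, target_bytes):
    """Cut sorted rows into contiguous tablets of roughly target_bytes each.

    row_sizes is a list of (row_key, size_in_bytes) pairs. Splits happen
    only on row boundaries, so a row never spans two tablets.
    """
    tablets, current, current_size = [], [], 0
    for key, size in sorted(row_sizes):
        # Close the current tablet if adding this row would exceed the target.
        if current and current_size + size > target_bytes:
            tablets.append(current)
            current, current_size = [], 0
        current.append(key)
        current_size += size
    if current:
        tablets.append(current)
    return tablets


def find_tablet(tablets, row_key):
    """Return the index of the tablet whose row range covers row_key."""
    # Binary search over each tablet's first row key.
    starts = [tablet[0] for tablet in tablets]
    return max(bisect.bisect_right(starts, row_key) - 1, 0)


rows = [("a", 100), ("b", 100), ("c", 100), ("d", 100), ("e", 50)]
tablets = split_into_tablets(rows, target_bytes=200)
```

Because rows are kept in sorted order, each tablet covers a contiguous range of row keys, and a lookup needs only the first key of every tablet to find the right one.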
Architecture
Bigtable is not a relational database. It supports neither joins nor rich SQL-like queries. Each table is a multidimensional sparse map. Tables consist of rows and columns, and each cell has a timestamp. There can be multiple versions of a cell with different timestamps. The timestamp allows for operations such as “select ‘n’ versions of this Web page” or “delete cells that are older than a specific date/time.”
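A minimal in-memory sketch of this data model may make the two timestamp operations concrete. It is a toy illustration, not Bigtable's API: the class `ToyTable`, its method names, the integer timestamps, and the sample row key are all assumptions made for the example.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of a sparse map: (row, column, timestamp) -> value."""

    def __init__(self):
        # Only cells that were written are stored, so the map is sparse.
        # Each (row, column) key holds a list of (timestamp, value) pairs.
        self._cells = defaultdict(list)

    def put(self, row, column, timestamp, value):
        versions = self._cells[(row, column)]
        versions.append((timestamp, value))
        # Keep the newest version first.
        versions.sort(key=lambda version: version[0], reverse=True)

    def get_versions(self, row, column, n):
        """Return up to n most recent versions of a cell, newest first."""
        return self._cells.get((row, column), [])[:n]

    def delete_older_than(self, cutoff):
        """Drop every version whose timestamp is earlier than cutoff."""
        for key in list(self._cells):
            kept = [v for v in self._cells[key] if v[0] >= cutoff]
            if kept:
                self._cells[key] = kept
            else:
                del self._cells[key]

    def rows(self):
        """Return the row keys that hold at least one cell, in sorted order."""
        return sorted({row for row, _column in self._cells})


table = ToyTable()
table.put("com.example/index", "contents", 1, "<html>v1</html>")
table.put("com.example/index", "contents", 3, "<html>v3</html>")
table.put("com.example/index", "contents", 2, "<html>v2</html>")

latest_two = table.get_versions("com.example/index", "contents", 2)
table.delete_older_than(3)
remaining = table.get_versions("com.example/index", "contents", 5)
```

Here `get_versions` plays the role of “select ‘n’ versions of this Web page” and `delete_older_than` the role of “delete cells that are older than a specific date/time.” There is no join or query language; every access goes through the row and column keys.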