数据存储基本上是用于存储数据集合的地方,例如数据库、文件系统或目录。在数据库系统中,它们可以通过两种方式存储。这些如下:
- 面向行的数据存储
- 面向列的数据存储
面向行的数据存储和面向列的数据存储的比较如下:
Row oriented data stores | Column oriented data stores |
---|---|
Data is stored and retrieved one row at a time and hence could read unnecessary data if some of the data in a row are required. | In this type of data stores, data are stored and retrieve in columns and hence it can only able to read only the relevant data if required. |
Records in Row Oriented Data stores are easy to read and write. | In this type of data stores, read and write operations are slower as compared to row-oriented. |
Row-oriented data stores are best suited for online transaction system. | Column-oriented stores are best suited for online analytical processing. |
These are not efficient in performing operations applicable to the entire datasets and hence aggregation in row-oriented is an expensive job or operations. | These are efficient in performing operations applicable to the entire dataset and hence enables aggregation over many rows and columns. |
Typical compression mechanisms which provide less efficient result than what we achieve from column-oriented data stores. | These type of data stores basically permits high compression rates due to little distinct or unique values in columns. |
面向行的数据存储的最佳示例是关系数据库,它是一种结构化数据存储,也是一个复杂的查询引擎。随着数据大小的增加,提高性能会带来很大的损失。
面向列的数据存储的最佳示例是HBase 数据库,它基本上是从头开始设计的,旨在提供可扩展性和分区,以实现高效的数据结构序列化、存储和检索。
关系型数据库和 HBase 的特点如下:
Relational Database | HBase |
---|---|
It is basically based on a Fixed Schema. | It is totally Schema-less. |
It is an example of row-oriented datastore. | It is an example of column-oriented datastores. |
It is basically designed to store normalized data. | It is basically designed to store de-normalized data. |
It basically contains thin tables. | It basically contains wide and sparsely oriented populated tables. |
It has no built-in support for partitioning. | It basically supports Automatic Partitioning. |