📜  Operational Database vs. Data Warehouse (Data Warehouse Tutorial)

📅  Last modified: 2020-12-30 00:31:05             🧑  Author: Mango

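Difference between Operational Database and Data Warehouse

An operational database is the information source for the data warehouse. It contains the detailed information used to run the day-to-day operations of the business. Its data changes frequently as updates are made and reflects the current value of the most recent transactions.

Operational database management systems, also called OLTP (Online Transaction Processing) databases, are used to manage dynamic data in real time.

Data warehouse systems serve users or knowledge workers for the purpose of data analysis and decision making. Such systems can organize and present information in specific formats to accommodate the diverse needs of different users. These systems are called Online Analytical Processing (OLAP) systems.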

Data warehouses and OLTP databases are typically both relational databases. However, the two have different goals.

| Operational Database | Data Warehouse |
|---|---|
| Operational systems are designed to support high-volume transaction processing. | Data warehousing systems are typically designed to support high-volume analytical processing (i.e., OLAP). |
| Operational systems are usually concerned with current data. | Data warehousing systems are usually concerned with historical data. |
| Data within operational systems is updated regularly, as needed. | Data is non-volatile: new data may be added regularly, but once added it is rarely changed. |
| It is designed for real-time business dealings and processes. | It is designed for analysis of business measures by subject area, category, and attribute. |
| It is optimized for a simple set of transactions, generally adding or retrieving a single row at a time per table. | It is optimized for bulk loads and for large, complex, unpredictable queries that access many rows per table. |
| It is optimized for validating incoming information during transactions and uses validation data tables. | It is loaded with consistent, valid information and requires no real-time validation. |
| It supports thousands of concurrent clients. | It supports few concurrent clients relative to OLTP. |
| Operational systems are largely process-oriented. | Data warehousing systems are largely subject-oriented. |
| Operational systems are usually optimized to perform fast inserts and updates of relatively small volumes of data. | Data warehousing systems are usually optimized to perform fast retrievals of relatively large volumes of data. |
| Data in. | Data out. |
| A small number of records is accessed. | A large number of records is accessed. |
| Relational databases are created for Online Transaction Processing (OLTP). | Data warehouses are designed for Online Analytical Processing (OLAP). |
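The "data in" versus "data out" contrast can be made concrete with a small sketch. It assumes two in-memory SQLite databases standing in for the operational store and the warehouse; the table names `orders` and `daily_sales` and the function `load_daily_sales` are hypothetical and chosen only for illustration. The operational side receives many small single-row writes, while the warehouse is refreshed by a batch job that loads already-summarized rows.

```python
import sqlite3

# Hypothetical stand-ins: one operational (OLTP) store, one warehouse store.
op = sqlite3.connect(":memory:")
dw = sqlite3.connect(":memory:")

op.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount INTEGER)"
)
dw.execute(
    "CREATE TABLE daily_sales ("
    "order_date TEXT PRIMARY KEY, order_count INTEGER, revenue INTEGER)"
)

# Operational side: each order is its own short, single-row transaction.
for order_date, amount in [("2020-12-29", 40), ("2020-12-29", 60), ("2020-12-30", 25)]:
    with op:  # commits on success, rolls back on error
        op.execute(
            "INSERT INTO orders (order_date, amount) VALUES (?, ?)",
            (order_date, amount),
        )


def load_daily_sales(op, dw):
    """Batch job: summarize operational rows and load them into the warehouse."""
    rows = op.execute(
        "SELECT order_date, COUNT(*), SUM(amount) "
        "FROM orders GROUP BY order_date ORDER BY order_date"
    ).fetchall()
    with dw:
        # INSERT OR REPLACE keeps a rerun of the same batch idempotent.
        dw.executemany("INSERT OR REPLACE INTO daily_sales VALUES (?, ?, ?)", rows)
    return len(rows)


loaded = load_daily_sales(op, dw)
summary = dw.execute("SELECT * FROM daily_sales ORDER BY order_date").fetchall()
```

In a real deployment the two stores would be separate systems and the batch job would run on a schedule; the sketch only shows the direction of data flow.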

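Difference between OLTP and OLAP

OLTP System

OLTP systems handle operational data. Operational data is the data involved in the operation of a particular system, for example ATM transactions and bank transactions.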
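As an illustration of such an operational transaction, here is a minimal sketch of an ATM-style withdrawal using Python's built-in `sqlite3` module. The `accounts` and `ledger` tables and the `withdraw` function are assumptions made for this example, not part of any particular banking system. The point is that the transaction is short, touches a single account row, and either completes fully or not at all.

```python
import sqlite3


def withdraw(conn, account_id, amount):
    """Withdraw `amount` from one account as a single short transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (account_id,)
        ).fetchone()
        if row is None:
            raise ValueError("unknown account")
        if row[0] < amount:
            raise ValueError("insufficient funds")
        # Both writes belong to the same transaction: update plus audit entry.
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, account_id),
        )
        conn.execute(
            "INSERT INTO ledger (account_id, delta) VALUES (?, ?)",
            (account_id, -amount),
        )
    return row[0] - amount


# Hypothetical schema and starting data for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("CREATE TABLE ledger (account_id INTEGER, delta INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.commit()

new_balance = withdraw(conn, 1, 30)
```

A production OLTP system would additionally need concurrency control across many simultaneous clients, which a single in-memory connection does not exercise.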

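OLAP System

OLAP systems handle historical or archival data. Historical data is data accumulated over a long period. For example, if we collect the last 10 years of information about flight reservations, the data can give us much meaningful information, such as reservation trends. It may reveal useful facts such as peak travel times and what kinds of people travel in each class (Economy/Business).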
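The flight-reservation example can be sketched as a single analytical query. The `bookings` table, its columns, and the handful of sample rows below are invented for illustration; a real warehouse would hold years of data. The query scans every row and summarizes passengers by calendar month and travel class, which is the kind of result used to spot peak travel periods.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookings (booking_date TEXT, travel_class TEXT, passengers INTEGER)"
)
# Made-up sample rows standing in for years of reservation history.
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?, ?)",
    [
        ("2019-07-14", "Economy", 2),
        ("2019-07-20", "Business", 1),
        ("2019-12-22", "Economy", 4),
        ("2020-07-03", "Economy", 3),
        ("2020-12-24", "Business", 2),
    ],
)

# One analytical query reads all rows and aggregates by month and class.
trend = conn.execute(
    """
    SELECT strftime('%m', booking_date) AS month,
           travel_class,
           SUM(passengers) AS total_passengers
    FROM bookings
    GROUP BY month, travel_class
    ORDER BY month, travel_class
    """
).fetchall()

for month, travel_class, total in trend:
    print(month, travel_class, total)
```

Unlike an OLTP transaction, this query never modifies data and its cost grows with the size of the history it scans.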

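The major difference between an OLTP system and an OLAP system is the amount of data analyzed in a single transaction. An OLTP system manages many concurrent clients and queries, each touching only a single record or a limited group of records at a time, whereas an OLAP system must be able to operate on millions of records to answer a single query.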

| Feature | OLTP | OLAP |
|---|---|---|
| Characteristic | A system used to manage operational data. | A system used to manage informational data. |
| Users | Clerks, clients, and information technology professionals. | Knowledge workers, including managers, executives, and analysts. |
| System orientation | An OLTP system is customer-oriented; transaction and query processing are done by clerks, clients, and information technology professionals. | An OLAP system is market-oriented; data analysis is done by knowledge workers, including managers, executives, and analysts. |
| Data contents | An OLTP system manages current data that is typically too detailed to be easily used for decision making. | An OLAP system manages large amounts of historical data, provides facilities for summarization and aggregation, and stores and manages data at different levels of granularity. These features make the data easier to use in informed decision making. |
| Database size | 100 MB to GB | 100 GB to TB |
| Database design | An OLTP system usually uses an entity-relationship (ER) data model and an application-oriented database design. | An OLAP system typically uses either a star or a snowflake model and a subject-oriented database design. |
| View | An OLTP system focuses primarily on the current data within an enterprise or department, without referring to historical information or data in different organizations. | An OLAP system often spans multiple versions of a database schema, due to the evolutionary process of an organization. OLAP systems also deal with data that originates from various organizations, integrating information from many data stores. |
| Volume of data | Not very large. | Because of their large volume, OLAP data are stored on multiple storage media. |
| Access patterns | The access patterns of an OLTP system consist mainly of short, atomic transactions. Such a system requires concurrency control and recovery techniques. | Accesses to OLAP systems are mostly read-only operations, because data warehouses store historical data. |
| Access mode | Read/write | Mostly read |
| Inserts and updates | Short, fast inserts and updates initiated by end users. | Periodic, long-running batch jobs refresh the data. |
| Number of records accessed | Tens | Millions |
| Normalization | Fully normalized | Partially normalized |
| Processing speed | Very fast | Depends on the amount of data involved; batch data refreshes and complex queries may take many hours. Query speed can be improved by creating indexes. |
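To illustrate the star model and the indexing remark from the table, the following sketch builds a tiny star schema in SQLite: one fact table joined to two dimension tables. All table names, column names, and values (`fact_sales`, `dim_date`, `dim_product`, and so on) are hypothetical. It is a minimal sketch of the layout, not a recommendation for how any particular warehouse product should be configured.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables describe the "who/what/when" of each fact.
    CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);

    -- The fact table holds the measures and points at the dimensions.
    CREATE TABLE fact_sales (
        date_id    INTEGER REFERENCES dim_date(date_id),
        product_id INTEGER REFERENCES dim_product(product_id),
        revenue    INTEGER
    );

    INSERT INTO dim_date    VALUES (1, 2019, 4), (2, 2020, 1);
    INSERT INTO dim_product VALUES (10, 'Books'), (20, 'Games');
    INSERT INTO fact_sales  VALUES (1, 10, 50), (1, 20, 70), (2, 10, 30), (2, 10, 20);

    -- An index on a fact-table join key, one way to speed up such queries.
    CREATE INDEX idx_fact_sales_date ON fact_sales(date_id);
    """
)

# A subject-oriented question: revenue by year and product category.
revenue_by_year_category = conn.execute(
    """
    SELECT d.year, p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_date    AS d ON d.date_id = f.date_id
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY d.year, p.category
    ORDER BY d.year, p.category
    """
).fetchall()
```

With only four fact rows the index makes no measurable difference; its benefit appears only at warehouse scale, and whether a given query uses it depends on the query planner.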