A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data, but it can include data from other sources. It separates analysis workload from transaction workload and enables an organization to consolidate data from several sources.
In addition to a relational database, a data warehouse
environment includes an extraction, transportation, transformation, and loading
(ETL) solution, an online analytical processing (OLAP) engine, client analysis
tools, and other applications that manage the process of gathering data and
delivering it to business users.
In computing, a data warehouse or enterprise data
warehouse (DW, DWH, or EDW) is a database used for reporting and data analysis.
It is a central repository of data which is created by integrating data from
one or more disparate sources. Data warehouses store current as well as
historical data and are used for creating trending reports for senior
management reporting such as annual and quarterly comparisons.
The data stored in the warehouse are uploaded from
the operational systems (such as marketing, sales etc., shown in the figure to
the right). The data may pass through an operational data store for additional
operations before they are used in the DW for reporting.
The typical ETL-based data warehouse uses staging,
data integration, and access layers to house its key functions. The staging
layer or staging database stores raw data extracted from each of the disparate
source data systems. The integration layer integrates the disparate data sets
by transforming the data from the staging layer often storing this transformed
data in an operational data store (ODS) database. The integrated data are then
moved to yet another database, often called the data warehouse database, where
the data is arranged into hierarchical groups often called dimensions and into
facts and aggregate facts. The combination of facts and dimensions is sometimes
called a star schema. The access layer helps users retrieve data
A data warehouse constructed from an integrated data
source systems does not require ETL, staging databases, or operational data
store databases. The integrated data source systems may be considered to be a
part of a distributed operational data store layer. Data federation methods or
data virtualization methods may be used to access the distributed integrated
source data systems to consolidate and aggregate data directly into the data
warehouse database tables. Unlike the ETL-based data warehouse, the integrated
source data systems and the data warehouse are all integrated since there is no
transformation of dimensional or reference data. This integrated data warehouse
architecture supports the drill down from the aggregate data of the data
warehouse to the transactional data of the integrated source data systems.
This is nice article for who is interested in learning ETL testing.Thanks for sharing ETL testing Online Training information
ReplyDeletewith us.