Let us first understand what Data Warehousing Architecture means?
The architecture of a Data Warehouse typically comprises of the descriptions and functionalities of components and services. It also tells us how various components interact.
By analyzing the architecture we will be understand the scope for expansion on the existing system.
Below is an image that gives us a sample of a Data Warehouse Architecture
Some important terminologies to remember
Data Mart
Before we jump to the different architecture approaches for building a Data Warehouse, let us understand what Data Marts are?
From the DW, granular level data flows to various departments for their custom needs. Data which is periodically loaded from the DW to a local database, then the local database is referred to as a Data Mart
Data Sources in a DW environment
Data sources in a DW environment can be anything from a flat file to a RDBMS and also a Mainframe application. They can broadly be classified into:
- Legacy – Mainframes
- Operational Data – transaction processing systems
- External Data source – Authorization data for Visa, master cards
Metadata
At all levels of the data warehouse, information is required to support the maintenance and use of a Data Warehouse
Metadata is data about the data.
Operational Data Source (ODS)
Operational data source is similar to a Data Warehouse but with a few differences.
Here are some features
- It is subject oriented
- Integrated
- Current – history is not kept
- Volatile – Since changes are current
- Detailed – it is usually more detailed than a DW as summary data is not stored
So ODS is typically something which integrates data from multiple business operations to address operational issues that span one or more business functions of an organization.
Benefits of Operational Data Source (ODS)
- Supports operational reporting needs of an organization
- Provides a complete view of customer relationships, the data for which might be stored in several databases
- Provides more current data than a DW and is more integrated than a OLTP system
- Reduces the additional burden on other operational or data warehouse systems by providing the required additional data
- Sends data to other operational systems also
- Data is updated frequently and so this can be used for drill down reporting from the summary reported by the Data Warehouse
Data Warehousing Architectures
Data warehouses can be built on the following architectures
- Centralized
- Federated
- Tiered
Centralized
In a centralized architecture, there exists only one data warehouse which stores all the necessary data for analysis.
Pros:
It helps organizations answer questions that were ‘unanswerable’ before the data was loaded to DW
It is best suited for integrated business where organizations across enterprise work together.
Cons:
The disadvantage here is loss of performance when compared to a distributed approach.
%nbsp;
Federated
In this architecture, data is logically consolidated but stored on separate physical databases either at the same of different physical location.
The local Data Marts store only relevant information for that particular department.
When compared to a Centralized architecture, the amount of data is reduced.
Tiered
A tired architecture is a like distributed data approach. This process cannot be done at one go as there are many sources that need to be integrated to a DW.
The first step would be the collected data at the lowest level and then keep consolidating until all the data is available in the DW
Ex: Data from all branches in a region is collected first, then data from all regions is integrated to form the DW
Pros:
- It has a faster response time because data is located near the client applications
- Data retrieval can be done on a reduced set of data thereby making it faster
The best data warehouse services companies give the expertise in the world of a large quantity of data.
ReplyDeleteVery well explained thanks for your information. Keep posting.
ReplyDeleteonline etl testing training
etl testing online training