The Storage Of Data Warehouse Essay

Every organization has its own independent data warehouse, and as the number of transactions increases, the size of the data grows as well. A data warehouse is the central repository of information for an organization. Multiple data sources, such as OLTP systems and Excel, CSV, TXT and XML files, are generated by various systems and loaded into the data warehouse by ETL (extract, transform, load), so the data warehouse stores summarized, integrated business data in one place. The data warehouse is used for analytical applications (OLAP, On-Line Analytical Processing), decision making, data mining and user applications.

ETL plays an important role in building the data warehouse. In traditional ETL, all analysis, activities and operations are stopped while the data warehouse is refreshed. Because this refresh is done in off-peak hours, the data warehouse does not hold the latest operational transactions, so its data is not fresh. This problem is called data latency. Near-real-time data warehousing is a solution to this problem: it updates the data warehouse in near real time, immediately after changed data is detected in a data source. Data latency can thus be minimized.
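To make the idea of updating the warehouse right after a change is detected more concrete, here is a minimal sketch of one common approach: incremental loading with a timestamp watermark. It is an illustration under stated assumptions, not the method this essay goes on to propose. The table names `source_orders` and `dw_orders`, the `updated_at` column and the helper `load_changes` are all hypothetical, and an in-memory SQLite database stands in for both the operational source and the warehouse.

```python
import sqlite3

# Hypothetical schema: one operational table and one warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source_orders (id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER)"
)
conn.execute(
    "CREATE TABLE dw_orders (id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER)"
)


def load_changes(conn, watermark):
    """Copy rows changed after `watermark` into the warehouse.

    Returns the new watermark (the largest updated_at value seen so far).
    """
    rows = conn.execute(
        "SELECT id, amount, updated_at FROM source_orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    # Upsert: new ids are inserted, existing ids are overwritten.
    conn.executemany(
        "INSERT OR REPLACE INTO dw_orders (id, amount, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return max((r[2] for r in rows), default=watermark)


# First small batch: one new order arrives at logical time 100.
conn.execute("INSERT INTO source_orders VALUES (1, 10.0, 100)")
watermark = load_changes(conn, 0)

# Later: order 1 is corrected and order 2 arrives.
conn.execute("UPDATE source_orders SET amount = 12.5, updated_at = 105 WHERE id = 1")
conn.execute("INSERT INTO source_orders VALUES (2, 7.0, 110)")
watermark = load_changes(conn, watermark)

warehouse = conn.execute("SELECT id, amount FROM dw_orders ORDER BY id").fetchall()
```

Running `load_changes` frequently on small batches, rather than once in an off-peak window, is what shrinks the latency in this sketch. A timestamp watermark is only one way to detect changes; it relies on the source reliably maintaining `updated_at` and it does not capture deleted rows.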

Developing a near-real-time data warehouse raises problems that did not arise in the traditional ETL process. The objective of this dissertation is to find solutions for the problems at each stage of
