Wednesday, December 22, 2010

Continual Aggregate Hub and the Data Warehouse

Developments in this area have made this article obsolete.

The CAH will contain aggregates represented as XML documents. These aggregates are tuned for the usage patterns relevant to the process being executed, that is, the primary products. Many other usage patterns, less relevant but not unimportant, also exist (for the more secondary products). These are often not time-critical for the main process and can wait. We see the data warehouse (DWH), or a more dedicated operational data store (ODS), as having the role of serving the secondary products. (The CAH is itself an ODS to some extent.) The CAH emits every new aggregate it produces, which enables the DWH to be more operational. The DWH can query and collect data at whatever interval it is capable of. The aggregates are also clearly defined and identified, which makes the ETL process simpler. Furthermore, all details remain available for querying in the CAH, so the DWH does not need to keep them. These capabilities will lessen the burden on the DWH.
Although the CAH stores data as XML, the DWH may store it in whatever form suits it best.
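As a rough illustration of that last point, the sketch below shows how a DWH-side load step might flatten emitted XML aggregates into a summary table of its own. It is only a sketch under assumptions: the aggregate layout (the `id`, `type` and `version` attributes, the `total` and `lines` elements), the `order_summary` table and the function names are all invented here, and an in-memory SQLite database stands in for the DWH.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical aggregates as the CAH might emit them. The element and
# attribute names are invented for this sketch, not taken from the CAH.
EMITTED = [
    '<aggregate id="a-1" type="order" version="1"><total>120.50</total><lines>3</lines></aggregate>',
    '<aggregate id="a-2" type="order" version="1"><total>80.00</total><lines>1</lines></aggregate>',
    '<aggregate id="a-1" type="order" version="2"><total>99.50</total><lines>2</lines></aggregate>',
]


def to_row(xml_text):
    """Flatten one XML aggregate into a relational summary row."""
    root = ET.fromstring(xml_text)
    return (
        root.get("id"),
        root.get("type"),
        int(root.get("version")),
        float(root.findtext("total")),
        int(root.findtext("lines")),
    )


def load(conn, documents):
    """Load a batch of emitted aggregates, keeping the newest version of each."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS order_summary ("
        "aggregate_id TEXT PRIMARY KEY, aggregate_type TEXT, "
        "version INTEGER, total REAL, line_count INTEGER)"
    )
    for doc in documents:
        row = to_row(doc)
        current = conn.execute(
            "SELECT version FROM order_summary WHERE aggregate_id = ?",
            (row[0],),
        ).fetchone()
        # The aggregate id makes the merge trivial: insert if unseen,
        # replace only if the incoming version is newer.
        if current is None or row[2] > current[0]:
            conn.execute(
                "INSERT OR REPLACE INTO order_summary VALUES (?, ?, ?, ?, ?)",
                row,
            )
    conn.commit()
```

The load can run at whatever interval the DWH manages, and only the summary columns are kept; the full detail stays queryable in the CAH.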
Continual Aggregate Hub and the Data Warehouse by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
