
Data Warehouse and Data Mining

 

            
             According to Hwang and Xu (2005), a data warehouse provides organizations with a unique opportunity to increase sales and productivity, and this can be seen in large organizations across many industries. A data warehouse is a type of database system that brings together data from various sources throughout an enterprise or industry. This data is often disparate in format and origin, yet related by the general subject area to which it belongs. For instance, information from a manufacturing facility and information from a maintenance facility will usually reside on two different systems, yet valuable knowledge could be gained by bringing the two together. In practice, data sets can be much more heterogeneous than this example.
             The first step in building a data warehouse is to consolidate the data and place it into domains that make sense. Data arrives from many different sources, many of which do not store it in the same way. This requires a process that homogenizes the data so that each piece can be stored in its appropriate domain.
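
             The consolidation step can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two source layouts, the field names, the date formats, and the target schema are hypothetical, loosely following the manufacturing and maintenance example above.

```python
from datetime import date, datetime

# Hypothetical records from two source systems that describe the same
# machine but use different field names, date formats, and units.
manufacturing_rows = [
    {"machine": "M-100", "run_date": "2005-03-14", "hours": 7.5},
]
maintenance_rows = [
    {"asset_id": "m-100", "serviced": "03/15/2005", "downtime_min": 90},
]


def from_manufacturing(row):
    """Map a manufacturing record onto the assumed common schema."""
    return {
        "machine_id": row["machine"].upper(),
        "event_date": date.fromisoformat(row["run_date"]),
        "event_type": "production",
        "duration_hours": float(row["hours"]),
    }


def from_maintenance(row):
    """Map a maintenance record onto the same schema (minutes to hours)."""
    return {
        "machine_id": row["asset_id"].upper(),
        "event_date": datetime.strptime(row["serviced"], "%m/%d/%Y").date(),
        "event_type": "maintenance",
        "duration_hours": row["downtime_min"] / 60,
    }


def consolidate(manufacturing, maintenance):
    """Homogenize both sources and return one list ordered by machine and date."""
    rows = [from_manufacturing(r) for r in manufacturing]
    rows += [from_maintenance(r) for r in maintenance]
    return sorted(rows, key=lambda r: (r["machine_id"], r["event_date"]))


events = consolidate(manufacturing_rows, maintenance_rows)
```

             Once both sources share one set of field names, units, and identifiers, the records for a given machine can be stored and queried together, which is the point of the homogenization step.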
             A domain in data warehousing is a significantly flattened data model compared to the normalized relational databases used in day-to-day business. User-facing databases treat data integrity and lack of redundancy as two of the most critical aspects of the system. To that end, the relational model encourages high levels of normalization, which often results in a large number of tables representing different entities and concepts within the business. However, the more normalized a database becomes, the more tables must be joined to answer a query, and the harder the database must work to retrieve information.
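
             The trade-off between a normalized and a flattened model can be sketched with Python's built-in sqlite3 module. The table names, columns, and rows below are hypothetical and chosen only to show that the same question needs two joins against the normalized tables and none against the flattened one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized layout: each entity lives in its own table.
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE sale (
        sale_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),
        item_id INTEGER REFERENCES item(item_id),
        quantity INTEGER
    );
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO item VALUES (10, 'Widget', 2.5), (11, 'Gadget', 4.0);
    INSERT INTO sale VALUES (100, 1, 10, 4), (101, 2, 11, 1), (102, 1, 11, 2);

    -- Flattened layout: names and prices are repeated on every sale row.
    CREATE TABLE sale_flat AS
    SELECT s.sale_id, c.name AS customer, i.name AS item,
           i.price, s.quantity
    FROM sale s
    JOIN customer c ON c.customer_id = s.customer_id
    JOIN item i ON i.item_id = s.item_id;
""")

# Revenue per customer from the normalized tables: two joins.
normalized_totals = conn.execute("""
    SELECT c.name, SUM(i.price * s.quantity)
    FROM sale s
    JOIN customer c ON c.customer_id = s.customer_id
    JOIN item i ON i.item_id = s.item_id
    GROUP BY c.name ORDER BY c.name
""").fetchall()

# The same question against the flattened table: no joins.
flat_totals = conn.execute("""
    SELECT customer, SUM(price * quantity)
    FROM sale_flat
    GROUP BY customer ORDER BY customer
""").fetchall()
```

             Both queries return the same totals. The flattened table pays for its simpler reads with redundancy, since each customer name and item price is stored once per sale rather than once overall.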
             Domains can be categorized into two basic types: dimensions and facts. A fact domain may contain sales data, while the most common dimension domain is the Time domain. Most data about a sale, including salesperson, item sold, price, and customer, might all be in a single table, the fact table. Important dimensions, like time frame, exist in their own tables and are referenced just like any table in a relational database: with a foreign key.
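
             A fact table that references a Time dimension through a foreign key might look like the following sketch, again using Python's sqlite3 module. The names fact_sales and dim_time, their columns, and the sample rows are assumptions made for illustration, not a prescribed design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension table: one row per calendar day, with roll-up attributes.
    CREATE TABLE dim_time (
        time_id INTEGER PRIMARY KEY,
        day TEXT, month INTEGER, quarter INTEGER, year INTEGER
    );
    -- Fact table: most sale attributes sit in one wide row; the time
    -- dimension is referenced through the time_id foreign key.
    CREATE TABLE fact_sales (
        sale_id INTEGER PRIMARY KEY,
        time_id INTEGER NOT NULL REFERENCES dim_time(time_id),
        salesperson TEXT, item TEXT, customer TEXT, price REAL
    );
    INSERT INTO dim_time VALUES
        (1, '2005-01-15', 1, 1, 2005),
        (2, '2005-02-03', 2, 1, 2005),
        (3, '2005-04-20', 4, 2, 2005);
    INSERT INTO fact_sales VALUES
        (1, 1, 'Lee', 'Widget', 'Acme', 10.0),
        (2, 2, 'Kim', 'Gadget', 'Globex', 25.0),
        (3, 3, 'Lee', 'Widget', 'Initech', 12.5);
""")

# Roll sales up by quarter with a single join to the time dimension.
revenue_by_quarter = conn.execute("""
    SELECT t.year, t.quarter, SUM(f.price)
    FROM fact_sales f
    JOIN dim_time t ON t.time_id = f.time_id
    GROUP BY t.year, t.quarter
    ORDER BY t.year, t.quarter
""").fetchall()
```

             Keeping time in its own table means attributes such as month, quarter, and year are defined once and can be used to group any fact that carries a time_id.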

