In today's business world, information about the customer is a necessity for any business trying to maximize its profits. A new and important tool for gaining this knowledge is Data Mining. Data Mining is a set of automated procedures used to find previously unknown patterns and relationships in data. These patterns and relationships, once extracted, can be used to make valid predictions about the behavior of the customer.
Data Mining is generally used for four main tasks: (1) to improve the process of acquiring new customers and retaining existing ones; (2) to reduce fraud; (3) to identify and eliminate internal inefficiencies in operations; and (4) to chart unexplored areas of the internet (Cavoukian). The fulfillment of these tasks can be enhanced if appropriate data has been collected and if that data is stored in a data warehouse. According to Stanford University, "A Data Warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated. This makes it much easier and more efficient to run queries over data that originally came from different sources." When data about an organization's practices is easier to access, it becomes more economical to mine. "Without the pool of validated and scrubbed data that a data warehouse provides, the data mining process requires considerable additional effort to pre-process the data" (SAS Institute).
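The idea of integrating heterogeneous sources into one queryable repository can be illustrated with a minimal sketch. It assumes two made-up source systems (a web order feed and a store sales feed) and uses an in-memory SQLite database as a stand-in for a real data warehouse; the table name, column names, and the function spend_by_customer are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical records from two separate source systems (made-up data).
# Each source uses its own layout: tuples for web orders, dicts for store sales.
web_orders = [("alice", 30.0), ("bob", 12.5)]
store_sales = [
    {"customer": "alice", "total": 20.0},
    {"customer": "carol", "total": 8.0},
]

# An in-memory SQLite database stands in for the data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT, amount REAL, source TEXT)")

# Load both sources into one integrated table, tagging each row with its origin.
conn.executemany("INSERT INTO purchases VALUES (?, ?, 'web')", web_orders)
conn.executemany(
    "INSERT INTO purchases VALUES (:customer, :total, 'store')", store_sales
)


def spend_by_customer(connection):
    """Return total spend per customer across all sources, sorted by name."""
    rows = connection.execute(
        "SELECT customer, SUM(amount) FROM purchases "
        "GROUP BY customer ORDER BY customer"
    )
    return rows.fetchall()


print(spend_by_customer(conn))
```

Once the records sit in a single table, one query covers data that originally came from different sources, which is the convenience the Stanford definition describes.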
There are several different types of models and algorithms used to "mine" the data. These include, but are not limited to, neural networks, decision trees, rule induction, boosting, and genetic algorithms.
Neural networks are physical cellular systems which can acquire, store, and utilize experiential knowledge (Zurada). Neural networks offer a way to efficiently model large and complex problems. Decision trees are branching diagrams, in which each branch represents a choice and each endpoint an outcome, used for making decisions in business or computer programming.
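To make the notion of a decision tree concrete, here is a minimal sketch of a hand-built tree for a made-up customer retention question. The feature names (months_as_customer, support_calls), the thresholds, and the labels are all hypothetical; in actual data mining the tree would be learned from data rather than written by hand.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    """Endpoint of the tree: holds the final decision."""
    label: str


@dataclass
class Node:
    """Internal test: compare one feature of a record against a threshold."""
    feature: str
    threshold: float
    below: Union["Node", Leaf]        # branch taken when value < threshold
    at_or_above: Union["Node", Leaf]  # branch taken when value >= threshold


def classify(tree, record):
    """Walk from the root to a leaf, following the branch each test selects."""
    while isinstance(tree, Node):
        if record[tree.feature] < tree.threshold:
            tree = tree.below
        else:
            tree = tree.at_or_above
    return tree.label


# Hypothetical tree: newer customers who call support often are flagged.
tree = Node(
    feature="months_as_customer",
    threshold=12,
    below=Node(
        feature="support_calls",
        threshold=3,
        below=Leaf("likely to stay"),
        at_or_above=Leaf("likely to leave"),
    ),
    at_or_above=Leaf("likely to stay"),
)

print(classify(tree, {"months_as_customer": 4, "support_calls": 5}))
```

Each path from the root to a leaf reads as a plain rule, for example "a customer of under 12 months with 3 or more support calls is likely to leave", which is one reason decision trees are easy to interpret.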