Data Mining

    What is Data Mining?

Data mining is a popular term in business today, especially in marketing and sales. It usually refers to discovering patterns and knowledge in large amounts of data, not to extracting the data itself, as the name might suggest. The term is also loosely applied to any form of large-scale information processing and to machine learning in general.

The data mining task itself is the automatic or semi-automatic analysis of large quantities of data to extract previously hidden patterns that could affect a business, positively or negatively. Such patterns include groups of similar records (cluster analysis), unusual records (anomaly detection), and dependencies between variables. The patterns can then be used as input to further work, such as predictive analytics.

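Data mining is sometimes also referred to as data dredging, data fishing, or data snooping, although these terms more often describe its misuse: searching through data until a seemingly significant pattern turns up.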

    How is Data Mining Used?

Before data mining can be used, a target data set must be chosen. Because data mining can only expose patterns that are already present in the data, the target set has to be large enough to contain them; more data generally gives better results. Commonly the data comes from a data warehouse, which holds a massive amount of data to pull from. Once the parameters for the analysis are established, a variety of tasks may be employed, depending on what the researchers are searching for. These include:

  1. Anomaly detection: Identifies unusual records, which may be either interesting in their own right or data errors that need further investigation.
  2. Dependency modeling: Searches for relationships between variables.
  3. Clustering: Groups similar records together without relying on any previously known structure in the data.
  4. Classification: Generalizes known structure so that it can be applied to new data, assigning each record to one of a set of predefined categories.
  5. Regression: Attempts to find a function that models the data with the least error.
  6. Summarization: Provides a compact, easy-to-understand representation of the data set, including visualization and report generation.
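
Two of the tasks listed above, anomaly detection and regression, are small enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the function names (find_anomalies, fit_line), the z-score threshold of 2.0, and all of the figures are hypothetical and chosen for the example; real data mining tools use more robust methods.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold.

    The z-score rule is an assumption made for this sketch; it is one
    simple way to flag unusual records, not the only one.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical daily sales figures; 95 is a planted outlier.
daily_sales = [10, 12, 11, 13, 12, 11, 95, 12, 10, 13]
anomalies = find_anomalies(daily_sales)

# Hypothetical advertising spend and revenue, constructed so that
# revenue = 2 * spend + 1 exactly.
ad_spend = [1, 2, 3, 4, 5]
revenue = [3, 5, 7, 9, 11]
slope, intercept = fit_line(ad_spend, revenue)

print("anomalies:", anomalies)
print("slope:", slope, "intercept:", intercept)
```

Whether a flagged value such as 95 is an interesting event or a data entry error is not something the code can decide; that judgment is left to the researcher.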

For instance, an attorney referral service can use data mining to identify which attorney is the best fit for a client and their claim.

It should also be noted that data mining can be unintentionally misused, producing results that appear significant but are false or misleading. This typically happens when the search parameters are not stringent enough, or when so much data is examined that coincidental groupings are mistaken for real patterns. The risk can normally be reduced by running the system on a set of data in which the patterns are already known and checking whether the same patterns appear.
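
One common way to run such a check is to keep part of the labelled data aside and test each discovered pattern against it. The sketch below assumes a pattern can be written as a rule that predicts a known outcome. The names (accuracy, validate, max_drop), the customer records, and both rules are hypothetical and exist only to show the idea.

```python
def accuracy(rule, records):
    """Fraction of records for which the rule predicts the known label."""
    hits = sum(1 for features, label in records if rule(features) == label)
    return hits / len(records)


def validate(rule, train, holdout, max_drop=0.1):
    """Accept a pattern only if it holds up on data it was not found in.

    max_drop is an assumed tolerance: the accuracy on held-out data may
    fall at most this far below the accuracy on the training data.
    """
    train_acc = accuracy(rule, train)
    holdout_acc = accuracy(rule, holdout)
    return holdout_acc >= train_acc - max_drop, train_acc, holdout_acc


# Hypothetical customer records: (features, made_a_purchase).
train = [
    ({"age": 45, "day": "Tue"}, True),
    ({"age": 52, "day": "Tue"}, True),
    ({"age": 23, "day": "Mon"}, False),
    ({"age": 19, "day": "Fri"}, False),
]
holdout = [
    ({"age": 41, "day": "Mon"}, True),
    ({"age": 22, "day": "Tue"}, False),
    ({"age": 38, "day": "Fri"}, True),
    ({"age": 25, "day": "Tue"}, False),
]


def age_rule(customer):
    """Pattern: customers over 30 make a purchase."""
    return customer["age"] > 30


def day_rule(customer):
    """Pattern: customers who signed up on a Tuesday make a purchase."""
    return customer["day"] == "Tue"


# Both rules fit the training records perfectly, but only one of them
# survives the check against the held-out records.
age_result = validate(age_rule, train, holdout)
day_result = validate(day_rule, train, holdout)

print("age rule:", age_result)
print("day rule:", day_result)
```

In this constructed data the Tuesday pattern is a coincidence of the training records, which is exactly the kind of result that looks significant until it is tested on data it was not derived from.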