Frequent pattern discovery explained

Frequent pattern discovery (or FP discovery, FP mining, or frequent itemset mining) is part of knowledge discovery in databases, Massive Online Analysis, and data mining; it describes the task of finding the most frequent and relevant patterns in large datasets.[1] The concept was first introduced for mining transaction databases. Frequent patterns are defined as subsets (itemsets, subsequences, or substructures) that appear in a data set with a frequency no less than a user-specified or auto-determined threshold.[2][3]
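The threshold-based definition above can be illustrated with a minimal Python sketch. It is a brute-force enumeration intended only to make the definition concrete, not one of the mining algorithms used in practice; the function name `frequent_itemsets_bruteforce`, the absolute-count threshold `min_support`, and the example baskets are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets_bruteforce(transactions, min_support):
    """Return {itemset: count} for every itemset that occurs in at least
    `min_support` transactions (hypothetical helper, absolute count)."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    result = {}
    # Enumerate every non-empty subset of the item universe. This is
    # exponential in the number of items, so it only suits tiny examples.
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            # Support count: number of transactions containing the candidate.
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_support:
                result[candidate] = count
    return result


# Made-up transaction database with four baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
frequent = frequent_itemsets_bruteforce(baskets, min_support=2)
```

With a threshold of 2, every single item and every pair is frequent in this example, while the triple {bread, butter, milk} appears in only one basket and is excluded.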

Techniques

Techniques for FP mining include:

For the most part, FP discovery can be done using association rule learning, in particular with the Eclat, FP-growth, and Apriori algorithms.
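As a rough illustration of the level-wise idea behind Apriori, the following Python sketch grows candidate itemsets one item at a time and discards any candidate that has an infrequent subset. It is a simplified teaching version under stated assumptions, not a reference implementation: the function name `apriori`, the absolute-count parameter `min_support`, and the sample data are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Simplified Apriori sketch: return {itemset: count} for all itemsets
    found in at least `min_support` transactions (absolute count)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        candidates = set()
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count the surviving candidates against the database.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Made-up transaction database with four baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
apriori_result = apriori(baskets, min_support=2)
```

The pruning relies on the fact that no superset of an infrequent itemset can be frequent, so each level only needs to count candidates built from the previous level's survivors. Eclat and FP-growth reach the same result with different data structures and are not shown here.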

Other strategies include:

and respective specific techniques.

Implementations exist for various machine learning systems and modules, such as MLlib for Apache Spark.[5]

Notes and References

  1. Han, Jiawei; Cheng, Hong; Xin, Dong; Yan, Xifeng (2007). "Frequent pattern mining: current status and future directions". Data Mining and Knowledge Discovery. 15: 55–86. doi:10.1007/s10618-006-0059-1. S2CID 8085527. Retrieved 2019-01-31.
  2. "Frequent Pattern Mining". SIGKDD. 1980-01-01. Retrieved 2019-01-31.
  3. "Frequent pattern Mining, Closed frequent itemset, max frequent itemset in data mining". T4Tutorials. 2018-12-09. Retrieved 2019-01-31.
  4. Agrawal, Rakesh; Imieliński, Tomasz; Swami, Arun (1993-06-01). "Mining association rules between sets of items in large databases". ACM SIGMOD Record. 22 (2): 207–216. doi:10.1145/170036.170072. ISSN 0163-5808. CiteSeerX 10.1.1.217.4132.
  5. "Frequent Pattern Mining". Spark 2.4.0 Documentation. Retrieved 2019-01-31.