The MEME suite is a collection of tools for the discovery and analysis of sequence motifs.
Multiple Expectation Maximization for Motif Elicitation (MEME) is a tool for discovering motifs in a group of related DNA or protein sequences.[1] MEME takes as input a group of DNA or protein sequences and outputs as many motifs as requested, up to a user-specified statistical confidence threshold. MEME uses statistical modeling techniques to automatically choose the best width, number of occurrences, and description for each motif.[2]
Gapped Local Alignment of Motifs (GLAM2) is a tool for discovering gapped motifs in a group of DNA or protein sequences. Unlike MEME, GLAM2 does not try to find several different motifs in a single run. Instead, it performs replicates: it tries to find the best possible motif multiple times.[3]
Discriminative Regular Expression Motif Elicitation (DREME) is a tool for discovering motifs in large collections of sequences. DREME is computationally efficient and is therefore suitable for motif search on large data sets derived from ChIP-seq (chromatin immunoprecipitation followed by sequencing) experiments. In the interest of computational efficiency, DREME finds only motifs that can be expressed in the IUPAC alphabet, which contains the standard DNA alphabet as well as eleven 'wildcard' characters (for example, R indicates either A or G).
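To illustrate what it means for a motif to be expressed in the IUPAC alphabet, the following minimal Python sketch expands each IUPAC character into the set of bases it stands for and counts how many sequences contain a match. The function names are hypothetical, only the forward strand is searched, and this is not DREME's own search procedure, which also compares against control sequences.

```python
import re

# IUPAC nucleotide codes: the four bases plus eleven wildcard characters.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def iupac_to_regex(motif):
    """Compile an IUPAC motif into a regular expression (hypothetical helper)."""
    return re.compile("".join(f"[{IUPAC[ch]}]" for ch in motif.upper()))

def count_matching_sequences(motif, sequences):
    """Count the sequences that contain at least one forward-strand match."""
    pattern = iupac_to_regex(motif)
    return sum(1 for seq in sequences if pattern.search(seq.upper()))
```

For example, the motif GATAR matches both GATAA and GATAG but not GATAC.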
MEME-ChIP is a tool for discovering motifs in data sets derived from ChIP-seq (Chromatin immunoprecipitation followed by sequencing) experiments.[4]
Find Individual Motif Occurrences (FIMO) is a tool for finding instances of motifs in a sequence database. FIMO searches the database for the provided motifs, and reports a q-value for each match.[5]
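A q-value is a p-value adjusted for the number of matches tested, so that it can be read as an estimated false discovery rate. As a rough sketch of the idea, the Python function below converts a list of p-values into q-values with the Benjamini-Hochberg procedure. The choice of Benjamini-Hochberg is an assumption made here for illustration; FIMO's own q-value estimation may differ in detail, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as p_values."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is monotone in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values
```

Matches whose q-value falls below a chosen threshold would then be reported as significant.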
GLAM2SCAN is a tool for finding occurrences of a GLAM2 motif in a sequence database.[6]
Motif Alignment & Search Tool (MAST) is a tool for searching biological sequence databases for sequences that contain an occurrence of each motif in a given set of motifs. MAST scores the matches and reports p-values for four types of events.
Spaced Motif Analysis Tool (SpaMo) is a tool for inferring interactions between transcription factors. SpaMo takes a set of sequences (typically sequences surrounding ChIP-seq peaks), a motif represented in these sequences, and a database of known motifs. SpaMo searches for database motifs whose occurrences are enriched at sites neighboring the given motif. These enrichments suggest physical interaction between the factors that bind each motif.[7]
Central Motif Enrichment Analysis (CentriMo) is a tool for inferring direct DNA binding from ChIP-seq data. CentriMo is based on the observation that the positional distribution of binding sites matching the direct-binding motif tends to be unimodal, well centered, and maximal at the precise center of the ChIP-seq peak regions. CentriMo takes a set of sequences and plots the occurrence of motifs relative to the ChIP-seq peak. Motifs that occur exclusively at the peak provide good evidence of direct binding, while motifs that do not occur in a consistent position relative to the peak may not bind directly.[8]
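The notion of central enrichment can be sketched as a simple test: if motif sites were placed uniformly along each peak region, only a small fraction would land in a narrow central window. The Python sketch below counts the sites that fall in such a window and computes a binomial tail probability. The function names, the single fixed window, and the uniform null model (which ignores the motif width) are simplifying assumptions for illustration and do not reproduce CentriMo's exact procedure.

```python
from math import comb

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def central_enrichment(site_offsets, seq_len, window):
    """Test whether best motif sites cluster near the sequence centers.

    site_offsets: one signed distance per sequence, measured from the
    center of the sequence to its best motif site (hypothetical input).
    """
    n = len(site_offsets)
    # Number of sequences whose best site lies inside the central window.
    k = sum(1 for d in site_offsets if abs(d) <= window / 2)
    # Chance of landing in the window if sites were uniformly distributed.
    p_uniform = window / seq_len
    return k, binomial_tail(k, n, p_uniform)
```

A small tail probability indicates that more sites fall near the center than uniform placement would explain.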
Motif Cluster Alignment and Search Tool (MCAST) is a tool for searching a sequence database for statistically significant clusters of non-overlapping occurrences of a set of motifs. Such clusters may represent regulatory modules.
Tomtom is a tool for comparing a DNA motif to a database of known motifs. Tomtom searches for motifs that are statistically significantly similar to the query motif. Tomtom is useful for determining whether a discovered motif is novel or is a variation of a known motif.
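The core of a motif-to-motif comparison can be sketched as sliding the query motif along a target motif and scoring the overlapping columns. The Python sketch below represents each motif as a list of columns of base probabilities (A, C, G, T) and uses the mean Euclidean distance between columns as the score. The function names, the choice of Euclidean distance, and the minimum overlap are assumptions for illustration; the step that turns such scores into statistical significance is omitted.

```python
import math

def column_distance(a, b):
    """Euclidean distance between two columns of base probabilities."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_alignment(query, target, min_overlap=2):
    """Return (mean column distance, offset) of the best ungapped alignment.

    offset is the target column aligned with the first query column.
    Returns None if no alignment has at least min_overlap columns.
    """
    best = None
    for offset in range(min_overlap - len(query), len(target) - min_overlap + 1):
        # Pair up the query and target columns that overlap at this offset.
        pairs = [
            (query[i], target[i + offset])
            for i in range(len(query))
            if 0 <= i + offset < len(target)
        ]
        if len(pairs) < min_overlap:
            continue
        score = sum(column_distance(a, b) for a, b in pairs) / len(pairs)
        if best is None or score < best[0]:
            best = (score, offset)
    return best
```

Running this against every motif in a database and ranking by score gives a crude list of candidate matches for a newly discovered motif.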
Gene Ontology for MOtifs (GOMO) is a tool for identifying possible roles for DNA binding motifs. It does so by comparing the genes whose upstream regions contain the motif against a Gene Ontology database. If the motif occurs significantly more often than expected upstream of genes related to a particular function (for example, lactose digestion), it suggests that the transcription factor that binds the motif may regulate that function (for example, by promoting transcription of proteins that digest lactose).
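The kind of association described above can be illustrated with a generic over-representation test. The Python sketch below asks how surprising the overlap is between the genes that have the motif upstream and the genes annotated with one Gene Ontology term, using a hypergeometric tail probability. This is a stand-in chosen for illustration and should not be read as GOMO's actual scoring method; the function names and the set-based inputs are hypothetical.

```python
from math import comb

def hypergeom_tail(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K carry the term."""
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return total / comb(N, n)

def term_enrichment(motif_genes, term_genes, all_genes):
    """Return (overlap size, p-value) for one Gene Ontology term.

    All three arguments are sets of gene identifiers; genes outside
    all_genes are ignored.
    """
    motif_genes = motif_genes & all_genes
    term_genes = term_genes & all_genes
    k = len(motif_genes & term_genes)
    return k, hypergeom_tail(k, len(all_genes), len(term_genes), len(motif_genes))
```

Because many terms are tested at once, the resulting p-values would need a multiple-testing correction before being interpreted.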