Enhancer-FACS-seq (eFS),[1] [2] developed by the Bulyk lab[3] at Brigham and Women’s Hospital and Harvard Medical School,[4] is a highly parallel enhancer assay for identifying active, tissue-specific transcriptional enhancers in the context of whole Drosophila melanogaster embryos. The technology replaces microscopy-based screening for tissue-specific enhancers with fluorescence-activated cell sorting (FACS) of dissociated cells from whole embryos, combined with identification by high-throughput Illumina sequencing.
In metazoans, a cell needs a specific and coordinated gene expression program, involving the highly regulated transcription of thousands of genes, in order to respond to environmental stress, differentiate properly, and progress normally through the cell cycle.[5] This gene regulation is controlled in large part, and in a tissue-specific manner, by the binding of transcription factors to noncoding genomic regions referred to as cis-regulatory modules (CRMs).[6] This binding activates or represses gene expression by modulating chromatin structure, thereby having a positive or negative effect on transcription. CRMs that activate gene expression are often referred to as transcriptional enhancers, whereas those that repress gene expression are referred to as transcriptional silencers.
Although Drosophila melanogaster is a powerful model organism for biology and for the study of transcriptional enhancers, the tissue-specific activity of fewer than 5% of its estimated 50,000 transcriptional enhancers has been discovered.[7] [8] Over the past decade, the main method for detecting tissue- or cell-type-specific enhancer activity in Drosophila melanogaster has been to test candidate enhancers with traditional reporter assays,[9] [10] which are low-throughput and costly. Although enhancer discovery has improved over the past few years and other parallel reporter assays have been developed,[11] [12] [13] [14] [15] [16] none has so far allowed the direct identification of enhancer activity in a genomic context in cell types of interest in a whole embryo.
In eFS, each candidate CRM (cCRM) is cloned upstream of a reporter gene. Compared to traditional reporter assays, the main innovation is the use of fluorescence-activated cell sorting (FACS) of dissociated cells, instead of microscopy, to screen for tissue-specific enhancers. The approach uses a two-marker system: in each embryo, one marker (here, the rat CD2 cell surface protein) labels the cells of a specific tissue so that they can be sorted by FACS, and the other marker (here, green fluorescent protein, GFP) serves as a reporter of CRM activity.
Cells are sorted first according to their tissue type and then by GFP fluorescence. The cCRMs are recovered by PCR both from the double-positive sorted cells and from total input cells. High-throughput sequencing of both populations then allows the relative abundance of each cCRM in the input and sorted populations to be measured; the enrichment or depletion of each cCRM in double-positive cells versus input serves as a measure of its activity in the CD2-positive cell type being tested.
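The enrichment comparison described above can be illustrated with a minimal sketch. It assumes that sequencing reads have already been assigned to cCRMs and summarized as per-cCRM read counts. The function name, the pseudocount, and the use of a log2 ratio of relative abundances are illustrative assumptions, not the statistical procedure used in the published analysis.

```python
import math

def log2_enrichment(sorted_counts, input_counts, pseudocount=1.0):
    """Return a log2 enrichment score per cCRM (sorted vs. input).

    sorted_counts / input_counts: dicts mapping a cCRM identifier to its
    read count in the double-positive sorted cells and in the total input
    cells, respectively. The pseudocount (an assumption of this sketch)
    avoids division by zero for cCRMs absent from one population.
    """
    ids = set(sorted_counts) | set(input_counts)
    s = {i: sorted_counts.get(i, 0) + pseudocount for i in ids}
    n = {i: input_counts.get(i, 0) + pseudocount for i in ids}
    s_total = sum(s.values())
    n_total = sum(n.values())
    # Relative abundance in each population, then the ratio between them.
    return {
        i: math.log2((s[i] / s_total) / (n[i] / n_total))
        for i in ids
    }

# Hypothetical toy counts: cCRM "a" is over-represented among sorted cells,
# cCRM "b" is depleted.
scores = log2_enrichment({"a": 199, "b": 0}, {"a": 99, "b": 99})
for ccrm_id in sorted(scores):
    print(ccrm_id, round(scores[ccrm_id], 2))
```

A positive score indicates enrichment in double-positive cells relative to input, and a negative score indicates depletion. A real analysis would also need replicates and a significance test before calling a cCRM active.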
In the initial report on this method,[17] a library of ~500 cCRMs was selected from a variety of genomic data sources (e.g., TF-bound regions, coactivator-bound regions, DNase I hypersensitive sites, and predictions from the Bulyk lab’s PhylCRM algorithm[18]), amplified by PCR from genomic DNA, and then screened for activity in embryonic mesoderm and in specific mesodermal cell types. The results were validated by traditional reporter gene assays in Drosophila melanogaster embryos for 68 of the cCRMs tested by eFS. The specificity of eFS was excellent among significantly enriched cCRMs, while sensitivity was good in cases where the majority of the CD2-positive cells express GFP. The known enhancer-associated chromatin marks H3K27ac and H3K4me1, as well as Pol II occupancy, were found to be significantly enriched among the enhancers active in mesoderm.
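As an illustration of how such a validation can be summarized, the sketch below computes sensitivity and specificity in the standard confusion-matrix sense, treating the traditional reporter assay as the reference. The function name and the toy labels are hypothetical, and the exact definitions used in the original report may differ.

```python
def sensitivity_specificity(efs_calls, reporter_truth):
    """Compare eFS calls against traditional reporter assay results.

    efs_calls: dict mapping cCRM id -> True if eFS called it active.
    reporter_truth: dict mapping cCRM id -> True if the traditional
    reporter assay showed activity in the tissue of interest.
    Only cCRMs present in both dicts are compared.
    """
    tp = fp = tn = fn = 0
    for ccrm_id, active in reporter_truth.items():
        if ccrm_id not in efs_calls:
            continue
        called = efs_calls[ccrm_id]
        if active and called:
            tp += 1
        elif active and not called:
            fn += 1
        elif not active and called:
            fp += 1
        else:
            tn += 1
    # Guard against empty classes so the sketch never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Hypothetical toy labels, not data from the study.
truth = {"c1": True, "c2": True, "c3": False, "c4": False}
calls = {"c1": True, "c2": False, "c3": False, "c4": False}
sens, spec = sensitivity_specificity(calls, truth)
print(sens, spec)
```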
The eFS assay could be used to analyze other cell or tissue types. By assessing enrichment in GFP-expressing CD2-negative cells as well as CD2-positive cells, and by crossing a common pool of reporter transformant male flies to females expressing CD2 in different cell types, it is possible to assay specificity as well as activity. Accelerating the annotation of the regulatory genome in Drosophila should in principle generate the kind of large-scale regulatory interaction data needed to explore the network properties of transcriptional regulation.