Promoter-enhancer interactions identified from Hi-C data using probabilistic models and hierarchical topological domains

Nat Commun. 2017 Dec 21;8(1):2237. doi: 10.1038/s41467-017-02386-3.

Abstract

Proximity-ligation methods such as Hi-C allow us to map physical DNA-DNA interactions along the genome and reveal its organization into topologically associating domains (TADs). As Hi-C data have accumulated, computational methods have been developed for identifying domain borders in multiple cell types and organisms. Here, we present PSYCHIC, a computational approach for analyzing Hi-C data and identifying promoter-enhancer interactions. We use a unified probabilistic model to segment the genome into domains, which we then merge hierarchically and fit using a local background model, allowing us to identify over-represented DNA-DNA interactions across the genome. By analyzing published Hi-C data sets in human and mouse, we identify hundreds of thousands of putative enhancers and their target genes, and compile an extensive genome-wide catalog of gene regulation in both species. As we show, our predictions are highly enriched for ChIP-seq and DNA accessibility data, evolutionary conservation, eQTLs and other DNA-DNA interaction data.
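
The idea of fitting a local background model within a domain and flagging over-represented contacts can be illustrated with a minimal sketch. This is not the PSYCHIC implementation: the abstract does not specify the form of the background model, so the power-law decay of contact frequency with genomic distance, the per-distance median, the function names (`fit_power_law_background`, `enriched_pairs`), and the `min_ratio` threshold are all assumptions made for illustration. The input is assumed to be a symmetric, binned contact matrix together with the bin range of a single domain.

```python
import numpy as np

def fit_power_law_background(contacts, start, end):
    """Fit log(count) = a + b * log(distance) for bins [start, end).

    Hypothetical helper: assumes a power-law distance decay inside one domain.
    """
    sub = contacts[start:end, start:end]
    dists, levels = [], []
    for d in range(1, end - start):
        # Median over each diagonal is robust to a few strong contacts.
        level = np.median(np.diagonal(sub, offset=d))
        if level > 0:
            dists.append(d)
            levels.append(level)
    # Linear least-squares fit in log-log space; returns (intercept, slope).
    b, a = np.polyfit(np.log(dists), np.log(levels), 1)
    return a, b

def enriched_pairs(contacts, start, end, min_ratio=2.0):
    """Return (i, j, observed/expected) for pairs above min_ratio (assumed cutoff)."""
    a, b = fit_power_law_background(contacts, start, end)
    hits = []
    for i in range(start, end):
        for j in range(i + 1, end):
            expected = np.exp(a + b * np.log(j - i))
            ratio = float(contacts[i, j] / expected)
            if ratio >= min_ratio:
                hits.append((i, j, ratio))
    return hits
```

In this sketch, a bin pair whose observed count exceeds the distance-matched expectation by the chosen ratio is reported as a candidate interaction. A real analysis would replace the fixed ratio with a statistical test and multiple-testing correction, and would fit the background separately for each domain in the hierarchy.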

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Chromatin
  • Computational Biology
  • DNA / genetics*
  • Enhancer Elements, Genetic / genetics*
  • Epigenesis, Genetic / genetics*
  • Gene Expression Regulation
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Mice
  • Models, Statistical
  • Nucleic Acid Conformation
  • Promoter Regions, Genetic / genetics*

Substances

  • Chromatin
  • DNA