HiChIP and PLAC-Seq are emerging technologies for studying genome-wide long-range chromatin interactions mediated by a protein of interest, enabling more sensitive and cost-efficient interrogation of protein-centric chromatin conformation. However, because protein immunoprecipitation introduces an unbalanced read distribution, existing reproducibility measures developed for Hi-C data are not appropriate for HiChIP and PLAC-Seq data. Here, we present HPRep, a stratified and weighted correlation metric derived from normalized contact counts, to quantify reproducibility in HiChIP and PLAC-Seq data. We applied HPRep to multiple real datasets and demonstrated that it outperforms existing reproducibility measures developed for Hi-C data. Specifically, we applied HPRep to H3K4me3 PLAC-Seq data from mouse embryonic stem cells and mouse brain tissues, as well as H3K27ac HiChIP data from the human lymphoblastoid cell line GM12878 and the leukemia cell line K562, and found that HPRep more clearly distinguishes among pseudo-replicates, real replicates, and non-replicates. Furthermore, in an H3K4me3 PLAC-Seq dataset consisting of 11 samples from four human brain cell types, HPRep recovered the expected clustering of samples, which existing methods developed for Hi-C data could not achieve, highlighting the need for a reproducibility metric tailored to HiChIP and PLAC-Seq data.
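
To make the phrase "stratified and weighted correlation metric" concrete, the following is a minimal Python sketch of the general idea, not the HPRep implementation. It assumes that bin pairs are stratified by genomic distance, that a Pearson correlation is computed within each stratum, and that strata are combined with a weight equal to the number of bin pairs times the product of the two standard deviations. The function name, the `min_pairs` parameter, and the weighting scheme are all illustrative assumptions.

```python
import numpy as np


def stratified_weighted_correlation(counts_a, counts_b, distances, min_pairs=3):
    """Weighted average of per-stratum Pearson correlations (illustrative only).

    counts_a, counts_b: normalized contact counts for the same bin pairs
        in two samples (assumed to be normalized already).
    distances: genomic distance of each bin pair; each distinct value is
        treated as one stratum (an assumption of this sketch).
    min_pairs: hypothetical threshold; smaller strata are skipped.
    """
    counts_a = np.asarray(counts_a, dtype=float)
    counts_b = np.asarray(counts_b, dtype=float)
    distances = np.asarray(distances)

    weighted_sum = 0.0
    total_weight = 0.0
    for d in np.unique(distances):
        mask = distances == d
        a, b = counts_a[mask], counts_b[mask]
        # Skip strata that are too small or have no variation,
        # since the correlation is undefined or unstable there.
        if a.size < min_pairs or a.std() == 0 or b.std() == 0:
            continue
        r = np.corrcoef(a, b)[0, 1]
        # Assumed weight: stratum size times the product of standard deviations.
        weight = a.size * a.std() * b.std()
        weighted_sum += weight * r
        total_weight += weight

    if total_weight == 0:
        return float("nan")
    return weighted_sum / total_weight
```

Under these assumptions, comparing a sample with itself yields a score of 1, while lower scores indicate weaker agreement between the two contact profiles once the effect of genomic distance has been accounted for.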

Author