Coronary artery disease (CAD) arises from the interaction of genetic and environmental factors. Although genome-wide association studies (GWAS) have identified multiple risk loci and single nucleotide polymorphisms (SNPs) associated with CAD risk, these variants lie predominantly in non-coding or intergenic regions, and their mechanisms of action are largely unknown. Accordingly, our objective was to develop a data-driven informatics pipeline for dissecting complex CAD risk loci, and to apply it to a poorly understood cluster of SNPs in the vicinity of ZEB2.
We developed a unique informatics pipeline leveraging a multi-tissue CAD genetics-of-gene-expression dataset, GWAS datasets, and other resources. The pipeline first dissected SNP locations and their linkage disequilibrium relationships, then analyzed tissue-specific expression quantitative trait loci, and then examined gene-gene, gene-phenotype, and SNP-phenotype relationships. It concluded by exploring CAD-relevant gene regulatory networks (GRNs).
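As an illustration of the first pipeline step only, the following minimal sketch shows one common way to judge whether neighbouring SNPs represent independent signals: approximate linkage disequilibrium as the squared Pearson correlation (r²) between genotype dosages, then group SNPs that exceed a threshold. The abstract does not state how linkage disequilibrium was computed, so the function names, the 0.2 threshold, the placeholder SNP labels, and the toy genotypes are all assumptions made for illustration, not the authors' method or data.

```python
def ld_r2(geno_a, geno_b):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2).

    This is a widely used approximation of LD r^2 for unphased genotypes.
    """
    if len(geno_a) != len(geno_b) or len(geno_a) < 2:
        raise ValueError("genotype vectors must have equal length >= 2")
    n = len(geno_a)
    mean_a = sum(geno_a) / n
    mean_b = sum(geno_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(geno_a, geno_b))
    var_a = sum((a - mean_a) ** 2 for a in geno_a)
    var_b = sum((b - mean_b) ** 2 for b in geno_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("monomorphic SNP: r^2 is undefined")
    return cov * cov / (var_a * var_b)


def independent_signals(genotypes, r2_threshold=0.2):
    """Greedily group SNPs into LD blocks.

    A SNP joins the first group whose lead (first) SNP it is correlated with
    at r^2 >= r2_threshold; otherwise it starts a new group.
    The threshold value is a hypothetical choice.
    """
    groups = []
    for snp, dosages in genotypes.items():
        for group in groups:
            if ld_r2(genotypes[group[0]], dosages) >= r2_threshold:
                group.append(snp)
                break
        else:
            groups.append([snp])
    return groups


# Toy dosages for eight hypothetical individuals (made-up data, placeholder names).
toy_genotypes = {
    "snp_a": [0, 1, 2, 1, 0, 2, 1, 0],
    "snp_b": [0, 1, 2, 1, 0, 2, 1, 0],  # identical to snp_a: perfect LD
    "snp_c": [1, 1, 0, 2, 0, 1, 0, 2],  # weakly correlated with snp_a
}

signal_groups = independent_signals(toy_genotypes)
print(signal_groups)
```

In this toy input, the two perfectly correlated SNPs collapse into one signal and the third stands alone, which mirrors how a pair of SNPs in strong linkage disequilibrium can be reported as a single independent signal.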
We identified three independent CAD risk SNPs in close proximity to the ZEB2 coding region (rs6740731, rs17678683 and rs2252641/rs1830321). Our pipeline determined that these SNPs likely act in concert via the atherosclerotic arterial wall and adipose tissues, by governing metabolic and lipid functions. In addition, ZEB2 is the top key driver of a liver-specific GRN that is related to lipid levels, metabolic and anthropometric measures, and CAD severity.
Using a novel informatics pipeline, we uncovered the multi-faceted mechanisms of action of the ZEB2-associated CAD risk SNPs. This pipeline can serve as a roadmap for dissecting complex SNP-gene-tissue-phenotype relationships and for revealing targets for tissue- and gene-specific therapeutic interventions.

Copyright © 2020 Elsevier B.V. All rights reserved.