Multi-view data, that is, matched sets of measurements on the same subjects, have become increasingly common with advances in multi-omics technology. Often, it is of interest to find associations between the views that are related to the intrinsic class memberships. Existing association methods cannot directly incorporate class information, while existing classification methods do not take into account between-view associations. In this work, we propose a framework for Joint Association and Classification Analysis of multi-view data (JACA). Our goal is not merely to improve the misclassification rates, but to provide a latent representation of high-dimensional data that is both relevant for subtype discrimination and coherent across the views. We motivate the methodology by establishing a connection between canonical correlation analysis and discriminant analysis. We also establish the estimation consistency of JACA in high-dimensional settings. A distinct advantage of JACA is that it can be applied to multi-view data with block-missing structure, that is, to cases where a subset of views or class labels is missing for some subjects. The application of JACA to quantify the associations between RNAseq and miRNA views with respect to consensus molecular subtypes in colorectal cancer data from The Cancer Genome Atlas project leads to improved misclassification rates and stronger estimated associations compared to existing methods.