Biometrics. 2021 May 19. doi: 10.1111/biom.13486. Online ahead of print.
ABSTRACT
In the form of multi-dimensional arrays, tensor data have become increasingly prevalent in modern scientific studies and biomedical applications such as computational biology, brain imaging analysis, and process monitoring systems. These data are intrinsically heterogeneous, with complex dependencies and structure. Therefore, ad hoc dimension reduction methods applied to tensor data may lack statistical efficiency and can obscure essential findings. Model-based clustering is a cornerstone of multivariate statistics and unsupervised learning; however, existing methods and algorithms are not designed for tensor-variate samples. In this article, we propose a Tensor Envelope Mixture Model (TEMM) for simultaneous clustering and multiway dimension reduction of tensor data. TEMM incorporates tensor-structure-preserving dimension reduction into mixture modeling and drastically reduces the number of free parameters and the estimative variability. An EM-type algorithm is developed to obtain likelihood-based estimators of the cluster means and covariances, which are jointly parameterized and constrained to a series of lower-dimensional subspaces known as the tensor envelopes. We demonstrate the encouraging empirical performance of the proposed method in extensive simulation studies and a real data application, in comparison with existing vector and tensor clustering methods.
PMID:34010459 | DOI:10.1111/biom.13486
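
ILLUSTRATION
To give a rough sense of why preserving tensor structure can reduce the number of free parameters in a Gaussian mixture, the sketch below compares two parameter counts for K clusters of tensors with dimensions p_1 x ... x p_M: a mixture fitted to the vectorized data with unstructured covariances, and a mixture whose covariance is assumed to be separable (a Kronecker product of one covariance matrix per mode). The separable covariance is an assumption made here for illustration only; the abstract does not state TEMM's exact parameterization, and the envelope constraints it describes are not modeled. The function names are hypothetical.

```python
from math import prod


def unstructured_param_count(dims, n_clusters):
    """Free parameters of a Gaussian mixture fitted to vectorized tensors."""
    p = prod(dims)                      # length of the vectorized tensor
    mean = p                            # one mean entry per tensor cell
    cov = p * (p + 1) // 2              # symmetric p x p covariance
    weights = n_clusters - 1            # mixing proportions sum to one
    return n_clusters * (mean + cov) + weights


def separable_param_count(dims, n_clusters):
    """Same count when each covariance is assumed to be a Kronecker product
    of one symmetric matrix per tensor mode (illustrative assumption)."""
    p = prod(dims)
    mean = p
    # One symmetric d x d matrix per mode; subtract (M - 1) because the
    # overall scale can be shifted freely between the Kronecker factors.
    cov = sum(d * (d + 1) // 2 for d in dims) - (len(dims) - 1)
    weights = n_clusters - 1
    return n_clusters * (mean + cov) + weights


if __name__ == "__main__":
    dims, k = (10, 10, 10), 2
    print("unstructured:", unstructured_param_count(dims, k))
    print("separable:   ", separable_param_count(dims, k))
```

For 10 x 10 x 10 tensors and two clusters, the covariance terms dominate the unstructured count, whereas the means dominate the separable one. This suggests that further savings would need to come from constraining the means as well, which is consistent with the joint parameterization of means and covariances mentioned in the abstract.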