JOURNAL ARTICLE

Full-model estimation for non-parametric multivariate finite mixture models.

  • Published In: Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2024, v. 86, n. 4, p. 896

  • Database: Business Source Ultimate

  • Authored By: Chaumaray, Marie Du Roy de; Marbac, Matthieu

Abstract

This article presents a novel method for full-model selection in non-parametric finite mixture models, simultaneously estimating the number of mixture components and identifying the subset of variables relevant for clustering. The approach discretizes each continuous variable into bins whose number increases with the sample size, which turns the problem into a latent class model with multinomial components, and then applies a penalized log-likelihood criterion (e.g., BIC) that yields consistent estimation under mild assumptions. Theoretical results establish the consistency of the estimator even when the upper bound on the number of components grows with the sample size. Numerical experiments and real-data applications demonstrate the method's effectiveness in variable selection and in estimating the number of components, particularly when parametric assumptions are violated. Based on these results, the authors recommend discretizing at empirical quantiles, with the number of bins growing roughly as n^(1/6), where n is the sample size. The method can be extended to mixed-type data and to grouped variables, and its practical implementation relies on an EM algorithm for estimating the parameters and the variable subset.
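
The discretization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `n_bins` and `discretize`, the rounding rule, the floor of two bins, and the choice of the "inclusive" quantile method are all assumptions made here for illustration. It uses only the Python standard library.

```python
import bisect
import statistics


def n_bins(sample_size, exponent=1 / 6):
    """Hypothetical rule: number of bins grows roughly as n**(1/6).

    Rounding and the minimum of 2 bins are assumptions of this sketch,
    not details taken from the article.
    """
    return max(2, round(sample_size ** exponent))


def discretize(values, bins):
    """Map each value to a bin index in 0..bins-1 using empirical quantiles.

    statistics.quantiles returns the bins-1 interior cut points;
    bisect_right then locates the bin that each value falls into.
    """
    cuts = statistics.quantiles(values, n=bins, method="inclusive")
    return [bisect.bisect_right(cuts, v) for v in values]


if __name__ == "__main__":
    data = [float(i) for i in range(1, 101)]  # toy sample, n = 100
    b = n_bins(len(data))
    labels = discretize(data, b)
    print(b, sorted(set(labels)))
```

Once every continuous variable has been replaced by its bin index, the data are categorical, which is the form a latent class model with multinomial components expects.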

Additional Information

  • Source: Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2024/09, Vol. 86, Issue 4, p896
  • Document Type: Article
  • Subject Area: Business and Management
  • Publication Date: 2024
  • ISSN: 1369-7412
  • DOI: 10.1093/jrsssb/qkae002
  • Accession Number: 179665025
  • Copyright Statement: Copyright of Journal of the Royal Statistical Society: Series B (Statistical Methodology) is the property of Oxford University Press / USA and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
