Un modèle Bayésien de co-clustering de données mixtes

Abstract : We propose a MAP (maximum a posteriori) Bayesian approach to perform and evaluate a co-clustering of mixed-type data tables. The proposed model infers an optimal segmentation of all variables and then performs a co-clustering by minimizing a Bayesian model selection cost function. One advantage of this approach is that it requires no user parameters. Another main advantage is the proposed criterion, which gives an exact measure of model quality, expressed as the probability that the model fits the data. Continuous optimization of this criterion yields progressively better models while avoiding over-fitting the data. Experiments conducted on real data demonstrate the value of this co-clustering approach for exploratory analysis of large data sets.

https://hal.archives-ouvertes.fr/hal-02007805
Contributor : Fabrice Rossi
Submitted on : Tuesday, February 5, 2019 - 2:02:56 PM
Last modification on : Thursday, February 7, 2019 - 5:58:15 PM
Long-term archiving on : Monday, May 6, 2019 - 3:18:39 PM

Files

boucharebboulleetal2018modele-...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02007805, version 1
  • ARXIV : 1902.02056

Citation

Aichetou Bouchareb, Marc Boullé, Fabrice Rossi, Fabrice Clérot. Un modèle Bayésien de co-clustering de données mixtes. Extraction et gestion des connaissances 2018, Jan 2018, Paris, France. pp.275-280. ⟨hal-02007805⟩
