Lazic, I., Bara, C., Iovino, M., Stramaglia, S., Kasas-Lazetic, K., Jakovljevic, N., et al. (2026). Information-theoretic quantification of high-order feature effects in classification problems. CHAOS, SOLITONS AND FRACTALS, 204 [10.1016/j.chaos.2025.117724].

Information-theoretic quantification of high-order feature effects in classification problems

Lazic I.; Iovino M.; Faes L.
2026-01-01

Abstract

Understanding the contribution of individual features in predictive models remains a central goal in interpretable machine learning. Many model-agnostic methods exist to estimate feature importance, but they often fall short in capturing high-order interactions and in disentangling overlapping contributions. In this work, we present an information-theoretic extension of the High-order interactions for Feature importance (Hi-Fi) method, leveraging Conditional Mutual Information (CMI) estimated via a k-Nearest Neighbor (kNN) approach that handles mixed discrete and continuous random variables. Our framework decomposes feature contributions into unique, synergistic, and redundant components, offering a richer, model-free understanding of their predictive roles. We validate the method on synthetic datasets with known Gaussian structures, for which ground-truth interaction patterns are derived numerically, and further test it on non-Gaussian data and on real-world gene expression data from TCGA-BRCA. Results indicate that the proposed estimator accurately recovers the theoretical and expected findings, suggesting its use in feature selection algorithms or in model development guided by interaction analysis.
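To make the abstract's core ingredient concrete, the following is a minimal sketch (not the paper's code) of kNN-based mutual information estimation between continuous features and a discrete class label, using the mixed discrete/continuous scheme of Ross (2014) that also underlies scikit-learn's `mutual_info_classif`. The paper works with conditional MI; here, plain MI is used for a simpler illustration of how redundancy between two features can be flagged by comparing individual and joint information. The variable names and the toy data are illustrative assumptions.

```python
# Sketch of a kNN mutual-information estimator I(X; Y) for continuous
# features X and a discrete class Y (Ross, 2014). Not the authors' code.
import numpy as np
from scipy.special import digamma
from sklearn.neighbors import KDTree, NearestNeighbors


def mi_mixed(X, y, k=3):
    """kNN estimate (in nats) of I(X; Y); X continuous (n, d), y discrete (n,).

    Assumes every class has more than k samples."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    y = np.asarray(y)
    n = len(y)
    radius = np.empty(n)
    label_counts = np.empty(n)
    for label in np.unique(y):
        mask = y == label
        label_counts[mask] = mask.sum()
        # distance to the k-th nearest neighbour within the same class
        nn = NearestNeighbors(n_neighbors=k, metric="chebyshev").fit(X[mask])
        r = nn.kneighbors()[0][:, -1]      # query point excluded by default
        radius[mask] = np.nextafter(r, 0)  # shrink so the count is strict
    # m_i: points of ANY class inside that radius (the point itself included)
    tree = KDTree(X, metric="chebyshev")
    m = tree.query_radius(X, radius, count_only=True)
    mi = digamma(n) + digamma(k) - np.mean(digamma(label_counts)) - np.mean(digamma(m))
    return max(0.0, mi)


# Toy illustration of redundancy: x2 is almost a copy of x1, so the joint
# MI barely exceeds the individual ones and the interaction term is positive.
rng = np.random.default_rng(0)
n = 2000
y = rng.integers(0, 2, size=n)
x1 = y + rng.normal(0, 1, size=n)      # informative feature
x2 = x1 + rng.normal(0, 0.1, size=n)   # nearly redundant copy of x1
noise = rng.normal(0, 1, size=n)       # uninformative feature

i1, i2 = mi_mixed(x1, y), mi_mixed(x2, y)
i12 = mi_mixed(np.column_stack([x1, x2]), y)
redundancy = i1 + i2 - i12             # > 0 when redundancy dominates
```

The sign of `i1 + i2 - i12` (the negative interaction information) distinguishes the two regimes the abstract mentions: positive values indicate redundant features, negative values indicate synergy, i.e. features that are jointly more informative about the class than the sum of their individual contributions. The Hi-Fi decomposition described in the paper refines this idea using conditional MI terms.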
Files for this item:
186-Lazic-ChosSolFract-2026.pdf — open access, published (editorial) version, Adobe PDF, 1.25 MB

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/704242
Citations
  • PMC: ND
  • Scopus: 0
  • Web of Science: 0