Prototype-based Neural Networks for Tabular Biomedical Data

Abstract

Tabular biomedical data poses challenges in machine learning because it is often high-dimensional and typically has a low sample size. Previous research has attempted to address these challenges via feature selection approaches, which can lead to unstable performance on real-world data. This suggests that current methods lack appropriate inductive biases that capture patterns common to different samples. In this paper, we propose ProtoGate, a prototype-based neural model that introduces an inductive bias by attending to both homogeneity and heterogeneity across samples. ProtoGate selects features in a global-to-local manner and uses them to produce explainable predictions via an interpretable prototype-based model. We conduct comprehensive experiments to evaluate the performance of ProtoGate on synthetic and real-world datasets. Our results show that exploiting the homogeneous and heterogeneous patterns in the data can improve prediction accuracy, while the prototypes make the predictions interpretable.
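As a rough illustration of the two ideas named in the abstract, global-to-local feature selection followed by prototype-based prediction, the following is a minimal NumPy sketch. It is not the ProtoGate architecture: the abstract does not say how the gates are computed or trained, so the function names (`select_features`, `prototype_predict`), the per-sample scores passed in as `local_scores`, the top-k rule, and the majority vote over nearest prototypes are all assumptions made for illustration.

```python
import numpy as np


def select_features(X, global_mask, local_scores, k_local):
    """Hypothetical global-to-local selection.

    global_mask  : (n_features,) 0/1 mask shared by all samples (homogeneity).
    local_scores : (n_samples, n_features) per-sample importance scores
                   (heterogeneity); how they are produced is left open here.
    k_local      : number of features kept per sample.
    """
    kept_globally = global_mask > 0
    # Features dropped globally can never win the per-sample top-k.
    scores = np.where(kept_globally, local_scores, -np.inf)
    top = np.argsort(-scores, axis=1)[:, :k_local]
    local_mask = np.zeros_like(X, dtype=float)
    np.put_along_axis(local_mask, top, 1.0, axis=1)
    # Guard against k_local exceeding the number of globally kept features.
    local_mask = local_mask * kept_globally
    return X * local_mask, local_mask


def prototype_predict(Z_query, Z_protos, y_protos, k=3):
    """Majority vote among the k nearest prototypes in the masked space.

    y_protos must hold non-negative integer class labels.
    """
    dists = np.linalg.norm(Z_query[:, None, :] - Z_protos[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = y_protos[nearest]
    return np.array([np.bincount(v).argmax() for v in votes])
```

In this sketch the prototypes that cast the votes can be inspected directly, together with the per-sample mask, which is one simple way a prototype-based predictor can expose the basis for a prediction.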

Publication
In ICML 2023 Workshop “Interpretable Machine Learning in Healthcare (IMLH)”
Xiangjian Jiang
PhD Student in Computer Science

My research interests include explainable AI and data mining, with a particular focus on data-centric tabular foundation models.