Data-driven learning optimal K values for K-nearest neighbour matching in causal inference - Macro Agriculture Research Institute


Data-driven learning optimal K values for K-nearest neighbour matching in causal inference

Posted by: Lu Xuran (芦旭然) | Published: 2025-05-28

Abstract

In causal inference, a pivotal task is estimating causal effects from observational data in the presence of confounding variables. The K-Nearest Neighbour Matching (K-NNM) method is widely used to handle confounding bias, but it is typically applied with a single K value for all samples, which can lead to suboptimal results in practice. To overcome this limitation, this paper introduces a novel method for causal effect estimation called Dynamic K-Nearest Neighbour Matching (DK-NNM). DK-NNM employs a data-driven learning strategy to determine the optimal value of K for each sample. Specifically, DK-NNM reconstructs a sparse coefficient matrix for all samples using sparse learning, while simultaneously learning a graph matrix to preserve local information and sample similarity. This approach helps identify the most suitable K value for each sample. In addition, DK-NNM uses joint propensity and prognostic scores to mitigate the confounding bias arising from high-dimensional covariates during the K-NNM process. Experiments on synthetic, semi-synthetic, and real-world datasets demonstrate that DK-NNM surpasses baseline models in estimating causal effects from observational data and improves significantly on traditional methods.
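To make the idea of a per-sample K concrete, the following is a minimal sketch of nearest-neighbour matching in which each treated unit may use its own number of matched controls. It is not the authors' DK-NNM: the per-unit K values are simply passed in, whereas the paper learns them through sparse learning and a graph matrix. The function name `knn_matching_att`, the Euclidean distance, and the toy data are all assumptions made for illustration. In practice the columns of `X` could be raw covariates or low-dimensional scores such as propensity and prognostic scores.

```python
import numpy as np

def knn_matching_att(X, t, y, k_per_unit):
    """Estimate the average treatment effect on the treated (ATT) by
    matching each treated unit to its k nearest control units.

    k_per_unit may be a single integer (classic K-NNM with a uniform K)
    or an array with one K per sample (hypothetical, user-supplied values).
    """
    X = np.asarray(X, dtype=float)
    t = np.asarray(t).astype(bool)
    y = np.asarray(y, dtype=float)
    k_per_unit = np.broadcast_to(np.asarray(k_per_unit, dtype=int), t.shape)

    X_control, y_control = X[~t], y[~t]
    effects = []
    for i in np.flatnonzero(t):
        k = int(k_per_unit[i])
        if not 1 <= k <= len(y_control):
            raise ValueError(f"K for unit {i} must be in [1, {len(y_control)}]")
        # Euclidean distance from treated unit i to every control unit.
        dist = np.linalg.norm(X_control - X[i], axis=1)
        nearest = np.argsort(dist, kind="stable")[:k]
        # Impute the missing control outcome as the mean over the k matches.
        effects.append(y[i] - y_control[nearest].mean())
    return float(np.mean(effects))

# Toy data: two treated units followed by three control units.
X = [[0.0], [1.0], [0.1], [0.9], [1.1]]
t = [1, 1, 0, 0, 0]
y = [3.0, 5.0, 1.0, 2.0, 4.0]

att_uniform = knn_matching_att(X, t, y, 3)                 # same K for everyone
att_dynamic = knn_matching_att(X, t, y, [1, 2, 1, 1, 1])   # K chosen per unit
print(att_uniform, att_dynamic)
```

In this toy example the first treated unit has only one close control, so K = 1 avoids averaging in distant controls, while the second treated unit has two equally close controls and uses K = 2. With a uniform K = 3 both treated units are matched to all three controls, which pulls the estimate away from the value obtained with locally chosen K.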
