Title: How good are inducing points for dataset distillation?
Author: Das, Shrutimoy
Type: Conference Paper
Date issued: 2026-01-20
Date accessioned/available: 2026-03-25
DOI: 10.1609/aaai.v40i48.42205
URI: https://repository.iitgn.ac.in/handle/IITG2025/34909
Language: en-US

Abstract: Dataset distillation methods learn a representative summary of the full dataset such that training on the distilled data is more efficient in terms of time and space. The current state-of-the-art methods exploit the correspondence between infinitely wide neural networks (NNs) and kernel ridge regression to design distillation methods that yield high-quality summaries of the data. In this work, we leverage the correspondence between infinitely wide networks and Gaussian processes (GPs) to learn a distilled dataset. We investigate the feasibility of using the inducing points method for Gaussian processes as a dataset distillation method. While most existing dataset distillation methods are based on loss or gradient matching, our method instead approximates the model in function space, which the NN-GP correspondence makes possible. Additionally, using recent theoretical results on GP regression and neural tangent kernels (NTKs), we provide an upper bound on the size of the distilled data. We empirically demonstrate the utility of inducing points as distilled data on a set of datasets.
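The inducing-points idea the abstract builds on can be sketched as follows. This is a minimal subset-of-regressors (SoR) sparse GP regression example in NumPy, not the paper's method: the RBF kernel, lengthscale, noise level, and fixed grid of inducing inputs are illustrative assumptions (the paper works with NN-GP kernels and treats the inducing points as the learned distilled data). It shows how M inducing points summarize all N training points in the GP predictive mean.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    # Squared-exponential kernel between the rows of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))               # full training set (N = 200)
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)

M, noise = 10, 0.1**2                               # M inducing points, noise variance
Z = np.linspace(-3, 3, M)[:, None]                  # inducing inputs (fixed grid here; learned in the paper)

Kmm = rbf(Z, Z) + 1e-8 * np.eye(M)                  # jitter for numerical stability
Kmn = rbf(Z, X)

# SoR predictive mean: the N training points enter only through the
# M-dimensional statistics Kmn @ Kmn.T and Kmn @ y, so the inducing
# points act as a compressed summary of the dataset.
A = noise * Kmm + Kmn @ Kmn.T
Xs = np.linspace(-3, 3, 100)[:, None]               # test inputs
mean = rbf(Xs, Z) @ np.linalg.solve(A, Kmn @ y)

err = np.abs(mean - np.sin(Xs[:, 0])).mean()        # error vs. the noiseless target
```

With only 10 inducing points the predictive mean tracks the underlying sine function closely, illustrating why a small set of inducing points can stand in for the full dataset.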