Towards Explaining Distribution Shifts

Abstract

A distribution shift can have fundamental consequences such as signaling a change in the operating environment or significantly reducing the accuracy of downstream models. Thus, understanding distribution shifts is critical for examining and hopefully mitigating the effect of such a shift. Most prior work focuses on merely detecting whether a shift has occurred and assumes any detected shift can be understood and handled appropriately by a human operator. We hope to aid in these manual mitigation tasks by explaining the distribution shift using interpretable transportation maps from the original distribution to the shifted one. We derive our interpretable mappings from a relaxation of the optimal transport problem, where the candidate mappings are restricted to a set of interpretable mappings. We then use a wide array of quintessential examples of distribution shift in real-world tabular, text, and image cases to showcase how our explanatory mappings provide a better balance between detail and interpretability than baseline explanations, as measured by both visual inspection and our PercentExplained metric.
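To make the core idea concrete, here is a minimal, illustrative sketch (not the paper's implementation): it restricts candidate transport maps to a hypothetical "interpretable" family of k-sparse mean shifts, fits one to map source samples toward shifted samples, and scores it with a PercentExplained-style ratio. The helpers `k_sparse_shift`, `percent_explained`, and the sliced-Wasserstein proxy `sliced_w2` are assumptions made for this example only.

```python
# Illustrative sketch only -- not the paper's method. The "interpretable" map
# class here is a k-sparse mean shift T(x) = x + delta, and transport cost is
# approximated with a sliced squared-Wasserstein proxy (both are assumptions).
import numpy as np

def sliced_w2(X, Y, n_proj=200, seed=0):
    """Monte-Carlo sliced squared-Wasserstein-2 proxy between two samples."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    grid = np.linspace(0.0, 1.0, 200)
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)
        # compare 1D quantile functions along a random direction
        xq = np.quantile(X @ theta, grid)
        yq = np.quantile(Y @ theta, grid)
        total += np.mean((xq - yq) ** 2)
    return total / n_proj

def k_sparse_shift(X, Y, k=2):
    """Fit a mean-shift map T(x) = x + delta with at most k nonzero entries."""
    delta = Y.mean(axis=0) - X.mean(axis=0)
    keep = np.argsort(np.abs(delta))[-k:]      # keep the k largest shifts
    sparse = np.zeros_like(delta)
    sparse[keep] = delta[keep]
    return sparse

def percent_explained(X, Y, T):
    """1 - cost(T(X), Y) / cost(X, Y): fraction of the shift the map removes."""
    return 1.0 - sliced_w2(T(X), Y) / sliced_w2(X, Y)

# Toy example: the shift only moves two of five features.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
Y = rng.normal(size=(500, 5)) + np.array([2.0, 0.0, -1.5, 0.0, 0.0])

delta = k_sparse_shift(X, Y, k=2)
T = lambda Z: Z + delta
print("shift explanation (per-feature offsets):", np.round(delta, 2))
print("PercentExplained:", round(percent_explained(X, Y, T), 3))
```

In this toy setting, the fitted map reads directly as an explanation ("features 1 and 3 shifted by roughly +2 and -1.5"), and the PercentExplained-style score quantifies how much of the observed shift that simple explanation accounts for.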

Publication
International Conference on Machine Learning

Our paper can be found by clicking the PDF icon at the top! Please reach out if you have any questions. Cheers :)

Sean Kulinski
GenAI Research Scientist

My research interests lie in making generative AI models more reliable. In practice, this means I tend to work on robust ML, capabilities-focused pretraining/finetuning, model failure analysis, and everything that comes with those problems.