Authors

Haoyang Liu, Maheep Chaudhary, and Haohan Wang

School of Information Sciences, University of Illinois Urbana-Champaign

Abstract

The trustworthiness of machine learning has emerged as a critical topic in the field, encompassing various applications and research areas such as robustness, security, interpretability, and fairness. Over the past decade, dedicated efforts have been made to address these issues, resulting in a proliferation of methods tailored to each specific challenge. In this survey paper, we provide a systematic overview of the technical advancements in trustworthy machine learning, focusing on robustness, adversarial robustness, interpretability, and fairness from a data-centric perspective, as we believe that achieving trustworthiness in machine learning often involves overcoming challenges posed by the data structures that traditional empirical risk minimization (ERM) training cannot resolve.

Interestingly, we observe a convergence of methods introduced from this perspective, despite their development as independent solutions across various subfields of trustworthy machine learning. Furthermore, we find that Pearl’s hierarchy of causality serves as a unifying framework for categorizing these techniques. Consequently, this survey first presents the background of trustworthy machine learning development using a unified set of concepts, connects this unified language to Pearl’s hierarchy of causality, and finally discusses methods explicitly inspired by causality literature. By doing so, we establish a unified language with mathematical vocabulary as a principled connection between these methods across robustness, adversarial robustness, interpretability, and fairness under a data-centric perspective, fostering a more cohesive understanding of the field.

Further, we extend our study to the trustworthiness of large pretrained models. We first present a brief summary of the dominant techniques in these models, such as fine-tuning, parameter-efficient fine-tuning, prompting, and reinforcement learning with human feedback. We then connect these techniques with standard ERM, upon which previous trustworthy machine learning solutions were built. This connection allows us to immediately build upon the principled understanding of the trustworthy methods established in previous sections, applying it to these new techniques in large pretrained models, opening up possibilities for many new methods. We also survey the existing methods under this perspective.

Finally, we offer a brief summary of the applications of these methods and discuss some future directions relating to our survey.

Download the Paper

Get Involved

Suggest Additional Papers

Help us improve the survey by suggesting additional papers.

Join Our Slack

Join our community on Slack to discuss ideas, ask questions, and collaborate with new friends.

Sign Up for Reading Groups

Sign up for our reading groups where we discuss papers and key concepts (currently paused).

More Resources

Causality and Vision

The GitHub page of Maheep's notes on causality and computer vision papers.

Trustworthy ML Initiative

Join the community at the Trustworthy Machine Learning Initiative.

DREAM

Visit our new lab at UIUC, which drafted this survey paper. (Site still under construction.)

Citation Guide

If you find our work useful, please consider citing us:

@article{liu2023trustworthy,
  title={Towards Trustworthy and Aligned Machine Learning: A Data-centric Survey with Causality Perspectives},
  author={Liu, Haoyang and Chaudhary, Maheep and Wang, Haohan},
  year={2023},
  publisher={School of Information Sciences, University of Illinois Urbana-Champaign}
}