Keywords:
Medicine
Biology
Acoustics
Analytical chemistry
Artificial intelligence
Chemistry
Computer science
Medical imaging
Medical personnel
Optics
Abstract:
Machine learning models are being deployed in biological and clinical settings, including here at Stanford, for example to analyze ultrasounds automatically or to map ancestries from genomic data. However, these models suffer from reliability problems: even models with good test performance often fail in unpredictable ways when deployed in real-world settings. In this thesis, I present two frameworks for more reliable machine learning: one for supervised learning and one for unsupervised learning. For supervised learning, I present Gradio (***), an open-source Python framework for interactively testing models on real-world data. Gradio is being used to run the first real-time clinical trial of a machine learning model in the Stanford Department of Dermatology, and has been used to validate models at Google, Siemens, Amazon, Mercy Hospital, and Harvard. I describe the core questions that led to the development of Gradio and showcase applications that demonstrate the usefulness of the framework. I further describe a novel explanation method that we developed for debugging faulty models with Gradio. On the unsupervised side, I introduce the framework of contrastive datasets, which provides a more reliable way to find patterns in unlabeled data. This framework is quite general and has been adopted for purposes such as denoising images and mapping ancestries of admixed populations. Together, these frameworks provide a way to do more reliable supervised and unsupervised machine learning.