Snorkel MeTaL: Weak supervision for multi-task learning

Alex Ratner, Braden Hancock, Jared Dunnmon, Roger Goldman, Christopher Ré

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Scopus citations

Abstract

Many real-world machine learning problems are challenging to tackle for two reasons: (i) they involve multiple sub-tasks at different levels of granularity; and (ii) they require large volumes of labeled training data. We propose Snorkel MeTaL, an end-to-end system for multi-task learning that leverages weak supervision provided at multiple levels of granularity by domain expert users. In MeTaL, a user specifies a problem consisting of multiple, hierarchically related sub-tasks (for example, classifying a document at multiple levels of granularity) and then provides labeling functions for each sub-task as weak supervision. MeTaL learns a re-weighted model of these labeling functions and uses the combined signal to train a hierarchical multi-task network, which is automatically compiled from the structure of the sub-tasks. Using MeTaL on a radiology report triage task and a fine-grained news classification task, we achieve average gains of 11.2 accuracy points over a baseline supervised approach and 9.5 accuracy points over the predictions of the user-provided labeling functions.
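
The idea of per-sub-task labeling functions combined by a re-weighted vote can be illustrated with a minimal sketch. It is not the Snorkel MeTaL API: the labeling functions, the task names, and the fixed weights are all hypothetical, and the weights are hard-coded here, whereas the abstract describes them as learned. Training the hierarchical multi-task network is omitted entirely.

```python
from collections import defaultdict

ABSTAIN = None  # a labeling function returns this when it has no opinion

# Hypothetical labeling functions for a two-level news classification task.
def lf_coarse_sports(doc):
    return "sports" if "match" in doc or "league" in doc else ABSTAIN

def lf_coarse_finance(doc):
    return "finance" if "shares" in doc or "market" in doc else ABSTAIN

def lf_fine_soccer(doc):
    return "soccer" if "goal" in doc else ABSTAIN

def lf_fine_tennis(doc):
    return "tennis" if "serve" in doc else ABSTAIN

# Sub-task name -> list of (labeling function, weight).
# The weights are assumed values standing in for learned ones.
LABELING_FUNCTIONS = {
    "coarse": [(lf_coarse_sports, 0.9), (lf_coarse_finance, 0.8)],
    "fine": [(lf_fine_soccer, 0.7), (lf_fine_tennis, 0.6)],
}

def weighted_vote(doc, lfs):
    """Sum the weights of the non-abstaining votes and return the top label."""
    scores = defaultdict(float)
    for lf, weight in lfs:
        label = lf(doc)
        if label is not ABSTAIN:
            scores[label] += weight
    if not scores:
        return ABSTAIN
    return max(scores, key=scores.get)

def weak_labels(doc):
    """Produce one weak label per sub-task for a single document."""
    return {task: weighted_vote(doc, lfs)
            for task, lfs in LABELING_FUNCTIONS.items()}
```

Calling `weak_labels` on a document yields one label per level of granularity, with `None` where every labeling function for that sub-task abstained. In a full pipeline, such labels would serve as training targets for the downstream multi-task model.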

Original language: English (US)
Title of host publication: Proceedings of the 2nd Workshop on Data Management for End-To-End Machine Learning, DEEM 2018 - In conjunction with the 2018 ACM SIGMOD/PODS Conference
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9781450358286
State: Published - Jun 15 2018
Externally published: Yes
Event: 2nd Workshop on Data Management for End-To-End Machine Learning, DEEM 2018 - In conjunction with the 2018 ACM SIGMOD/PODS Conference - Houston, United States
Duration: Jun 15 2018 → …

Conference

Conference: 2nd Workshop on Data Management for End-To-End Machine Learning, DEEM 2018 - In conjunction with the 2018 ACM SIGMOD/PODS Conference
Country: United States
City: Houston
Period: 6/15/18 → …

ASJC Scopus subject areas

  • Sociology and Political Science
  • Hardware and Architecture
  • Human-Computer Interaction
