Many machine learning (ML) models in enterprises large and small are trained on noisy, crowdsourced, or user-generated data. Because the annotators are usually not experts in the application domain or in data labeling, ML specialists have to take this into account when training and operating the model.
This talk is intended for ML engineers and researchers and will show them how to account for the specifics of crowdsourced annotations when building or improving their own ML systems. We will look at three important issues:
• How to properly account for noisily labeled data when training a model
• How to take into account the subjective responses of annotators
• How to track distribution shift using model monitoring
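On the monitoring issue, one common way to quantify the shift between the training-time and production label distributions is the Population Stability Index (PSI). The talk does not prescribe a specific metric, so the following is only a minimal illustrative sketch in plain Python (the function name and thresholds in the comment are conventional, not taken from the talk):

```python
import math

def psi(expected, actual, eps=1e-6):
    """Population Stability Index between two discrete distributions.

    expected/actual: dicts mapping a category (e.g. a predicted label)
    to its probability. PSI = sum((a - e) * ln(a / e)).
    A common rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift,
    > 0.25 significant shift.
    """
    total = 0.0
    for cat in set(expected) | set(actual):
        # clamp to eps so categories absent from one side don't blow up the log
        e = max(expected.get(cat, 0.0), eps)
        a = max(actual.get(cat, 0.0), eps)
        total += (a - e) * math.log(a / e)
    return total
```

In practice the "actual" distribution would come from labels that crowd annotators assign to a fresh sample of production traffic, compared against the distribution seen at training time.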
We will also present ideas for further research and development on building ML solutions powered by crowdsourced data.
Attendees will learn:
• How to handle noisy training data with ML methods such as CrowdLayer and CoNAL
• How to reliably gather subjective human opinions on complex cases using pairwise comparisons
• How crowdsourcing can help quickly detect distribution shift in an already deployed model
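Pairwise comparisons are typically turned into a single ranking with a model such as Bradley-Terry. The talk does not name a specific aggregation method, so this is just a sketch of the standard Bradley-Terry minorization-maximization updates in plain Python (all names here are illustrative):

```python
def bradley_terry(items, comparisons, iters=100):
    """Estimate Bradley-Terry scores from pairwise comparisons.

    items: iterable of item ids.
    comparisons: list of (winner, loser) pairs from annotators.
    Returns a dict item -> score; higher means preferred more often.
    """
    p = {i: 1.0 for i in items}          # current score estimates
    wins = {i: 0 for i in items}         # W_i: total wins per item
    n = {}                               # n_ij: matchup counts per unordered pair
    for w, l in comparisons:
        wins[w] += 1
        key = tuple(sorted((w, l)))
        n[key] = n.get(key, 0) + 1
    for _ in range(iters):
        new_p = {}
        for i in items:
            # MM update: p_i <- W_i / sum_j n_ij / (p_i + p_j)
            denom = 0.0
            for (a, b), cnt in n.items():
                if i == a:
                    j = b
                elif i == b:
                    j = a
                else:
                    continue
                denom += cnt / (p[i] + p[j])
            new_p[i] = wins[i] / denom if denom > 0 else p[i]
        # renormalize so the overall scale stays fixed across iterations
        s = sum(new_p.values())
        p = {i: v * len(p) / s for i, v in new_p.items()}
    return p
```

For subjective tasks (e.g. "which translation reads better?"), asking annotators to compare two candidates is usually far more reliable than asking for an absolute rating, and the fitted scores give a ranking over all candidates.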