"I used an open-source library, cleanlab, to remove low-quality labels from an image dataset. The [ResNet] model trained on the dataset without the low-quality data gained 4 percentage points of accuracy compared to the baseline model (trained on all data)."
"A while back, I made a toxic language classifier. However, I was unsatisfied with the training data. I split the text into sentences while retaining the original label for each sentence, hoping I'd be able to quickly clean it up, but that didn't work well. I took the sentence-labeled training data and threw it at cleanlab to see how well confident learning could identify the incorrect labels. The results look amazing to me. If nothing else, this can help identify training data to TOSS if you don't want to automate correction."
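
To make the workflow in these quotes concrete, here is a minimal sketch of the idea behind confident learning, written in plain Python. It is not cleanlab's implementation or API. The function name `find_suspect_labels`, the toy labels, and the predicted probabilities are all hypothetical, and the rule shown (a per-class threshold equal to the average confidence in that class among examples carrying that label) is a simplified version of the approach. In practice the probabilities would come from a model evaluated out-of-sample, for example via cross-validation.

```python
from statistics import mean


def find_suspect_labels(labels, pred_probs):
    """Return indices of examples whose given label looks wrong.

    labels: list of int class ids (the given, possibly noisy labels).
    pred_probs: list of per-class probability lists, one per example.
    Hypothetical helper for illustration; not part of cleanlab.
    """
    n_classes = len(pred_probs[0])

    # Per-class threshold: average predicted probability of class k
    # among the examples that are labeled k.
    thresholds = []
    for k in range(n_classes):
        probs_k = [p[k] for p, y in zip(pred_probs, labels) if y == k]
        thresholds.append(mean(probs_k) if probs_k else 1.0)

    suspects = []
    for i, (p, y) in enumerate(zip(pred_probs, labels)):
        # Classes whose predicted probability clears that class's threshold.
        confident = [k for k in range(n_classes) if p[k] >= thresholds[k]]
        if confident:
            likely = max(confident, key=lambda k: p[k])
            # Flag the example when the confidently predicted class
            # disagrees with the given label.
            if likely != y:
                suspects.append(i)
    return suspects


# Toy data (made up): 0 = non-toxic, 1 = toxic.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.20, 0.80],  # labeled 0, but the model leans strongly toward 1
    [0.10, 0.90],
    [0.30, 0.70],
    [0.85, 0.15],  # labeled 1, but the model leans strongly toward 0
]

suspects = find_suspect_labels(labels, pred_probs)
# "Toss" the suspect examples instead of trying to correct them.
kept = [i for i in range(len(labels)) if i not in suspects]
print("suspect indices:", suspects)
print("kept indices:", kept)
```

Dropping the flagged indices and retraining on the rest corresponds to the "toss" option mentioned in the second quote. Relabeling the flagged examples by hand is the alternative when the data is too scarce to discard.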