Marios Kadriu
Data Scientist | Software Engineer | DeepMind Scholar
Jul 27, 2023
Anderson Chaves
Lead Data Scientist at EssilorLuxottica
Another cool thing for the day! 💡 Thanks, Wojtek Kuberski and NannyML! #python #machinelearning #datascience
Nov 10, 2022
Pascal Biese
AI/ML Engineer | LLMs | NLP
Does the pattern below look familiar to you? If not, you can consider yourself incredibly lucky! For everyone else, check out NannyML: NannyML is an open-source Python library that allows you to estimate post-deployment model performance (without access to targets), detect data drift, and intelligently link data drift alerts back to changes in model performance. Built for data scientists, NannyML has an easy-to-use interface and interactive visualizations, is completely model-agnostic, and currently supports all tabular binary classification use cases. (Source: https://lnkd.in/e2RjQsQg) #DataScience #MachineLearning #AI #DeepLearning
May 16, 2022
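Pascal's post describes NannyML's core loop: fit an estimator on labeled reference data, then estimate performance on unlabeled production data. Below is a minimal sketch of that loop, assuming a recent nannyml release and its bundled synthetic binary-classification dataset; the column names (y_pred_proba, work_home_actual, ...) belong to that dataset, and exact function and argument names have shifted between versions.

```python
import nannyml as nml

# Reference data (labels available) and analysis data (production, no labels).
reference_df, analysis_df, _ = nml.load_synthetic_binary_classification_dataset()

# Confidence-Based Performance Estimation: estimates ROC AUC on the
# analysis period without needing the true labels.
estimator = nml.CBPE(
    y_pred_proba='y_pred_proba',
    y_pred='y_pred',
    y_true='work_home_actual',
    timestamp_column_name='timestamp',
    metrics=['roc_auc'],
    problem_type='classification_binary',
    chunk_size=5000,
)
estimator.fit(reference_df)           # calibrate on the labeled reference period
est_results = estimator.estimate(analysis_df)
est_results.plot().show()             # plotting API differs slightly across versions
```

CBPE leans on calibrated prediction probabilities, which is why it must be fitted on a labeled reference period before estimating on production chunks.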
Kishan Savant
Software Engineer @VCollab | Open Source Enthusiast
Thank you Hakim Elakhrass, Niels Nuyttens and the NannyML team for sending this #contributions #swag all the way from Belgium. Really appreciate the note, Hakim. Looking forward to more #opensource #contributions to the wonderful NannyML #modelmonitoring tool.
Apr 8, 2023
🚀 Mikkel Jensen
Data Scientist | Developer | ML & AI
Great resources and libraries I have come across recently:

NannyML: An open-source tool for monitoring models post-deployment. I find the ability to estimate performance without labels especially helpful, as we have a one-year lag on our labels. It is also possible to detect data/classifier drift.

OptBinning: The go-to data binning library. Super useful for discovering interesting intervals in variables, and for binning and preprocessing before applying a logistic regression on top.

TabDDPM: A new method for modelling tabular data. While I have only briefly browsed the paper, it seems promising for generating tabular data, outperforming both SMOTE and GANs in most of the tested cases! The paper was published only a week ago, but I'm probably already late to the party talking about this one.

What are your favorite new tools? Links in the comments👇

Talking data science, credit risk, cryptocurrency, software development - Connect & start the conversation! #machinelearning #python #datascience #opensource
Oct 6, 2022
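For the OptBinning item in Mikkel's list, here is a minimal sketch of the binning-then-logistic-regression flow he describes, using OptBinning's documented OptimalBinning API; the feature data below is synthetic and purely illustrative.

```python
import numpy as np
from optbinning import OptimalBinning

# Toy data: one numeric feature and a binary target correlated with it.
rng = np.random.default_rng(42)
x = rng.normal(size=1_000)
y = (x + rng.normal(scale=0.5, size=1_000) > 0).astype(int)

# Find optimal intervals for the feature with respect to the binary target.
optb = OptimalBinning(name="feature", dtype="numerical", solver="cp")
optb.fit(x, y)

# Inspect the discovered intervals, event rates, and WoE per bin.
print(optb.binning_table.build())

# Weight-of-evidence transform: ready to feed into a logistic regression.
x_woe = optb.transform(x, metric="woe")
```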
Olivier Binette
Data Science Research, ML Evaluation & Entity Resolution // PhD Candidate at Duke
🙅 Stop monitoring data drift. Here's what to do instead.

After deploying machine learning models, it's essential to ensure that they keep functioning as intended. One common way to do this is by monitoring data drift, i.e. verifying that the data used for predictions is similar to the data the model was trained on. If there is a significant difference, retraining the model on updated data can help. However, monitoring data drift does not indicate whether the drift is affecting the model's ability to solve the task it was trained for.

An alternative and more focused approach is to continuously monitor the model's generalization performance. This can be done without any labeled data through clever statistical techniques, such as confidence-based performance estimation and direct loss estimation.

🚀 NannyML implements these methods, allowing you to estimate the model's performance on drifted data and focus on what is most important for your task.

Does this mean you should really stop monitoring data drift? No. But keep in mind that data drift does not always equate to performance drift, and it's usually more beneficial to focus on the latter.

Code below is from https://lnkd.in/e95kJv-H

#machinelearning #ml #ai #drift #performance #evaluation #statistics #datascience #nannyml #datadrift #MLOps
Feb 2, 2023
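Olivier's post points to a code snippet via the lnkd.in link above; as a hedged reconstruction of that kind of workflow rather than the exact code, the sketch below puts univariate drift detection next to label-free performance estimation, assuming a recent nannyml release (the UnivariateDriftCalculator name and its arguments have changed across versions, and the column names come from NannyML's synthetic dataset).

```python
import nannyml as nml

reference_df, analysis_df, _ = nml.load_synthetic_binary_classification_dataset()

# Step 1: univariate drift detection flags features whose distribution shifted.
drift_calc = nml.UnivariateDriftCalculator(
    column_names=['distance_from_office', 'salary_range'],
    timestamp_column_name='timestamp',
    continuous_methods=['kolmogorov_smirnov'],
    categorical_methods=['chi2'],
)
drift_calc.fit(reference_df)
drift_results = drift_calc.calculate(analysis_df)

# Step 2: confidence-based performance estimation tells you whether the
# drift actually hurts the model (no labels required).
estimator = nml.CBPE(
    y_pred_proba='y_pred_proba',
    y_pred='y_pred',
    y_true='work_home_actual',
    timestamp_column_name='timestamp',
    metrics=['roc_auc'],
    problem_type='classification_binary',
)
estimator.fit(reference_df)
perf_results = estimator.estimate(analysis_df)
```

A drift alert alongside a flat estimated ROC AUC is exactly the false alarm the post warns about: drift that never translates into performance drift.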
Roger Kamena, M.Sc.
Senior Data Scientist, AI-Powered Analytics at UKG
Even though sheer experience has taught me that data drift alone is not enough to monitor models in production, and that regularly monitoring performance through randomized control trials on out-of-sample data is a needed practice, it’s nice to read a paper that confirms it with empirical evidence. Moreover, NannyML is a nice discovery for me. Can’t wait to test it!
Feb 8, 2023
Louis Owen
AI & Data Science | Yellow.ai
[Estimating Accuracy Without Ground Truth]

Getting your ML model to production is not an easy task. Monitoring your deployed ML model is even harder. If you have a business metric that directly correlates with your ML model's performance, good for you. However, what if multiple factors influence your business metric and your ML model is just one of them? What if you want to know exactly how your ML model is performing in production, measured by technical ML metrics (accuracy, precision, recall, etc.)?

✨ Introducing CBPE (Confidence-Based Performance Estimation), developed by the amazing team behind NannyML. With CBPE you can estimate the performance of any ML model, without any ground truth! How is that even possible? It works by relying on the model's prediction confidence scores, under the assumption that the model is well-calibrated and that there is no concept drift.

Furthermore, you can also estimate the performance of a regression ML model with a similar algorithm developed by the NannyML team, called DLE (Direct Loss Estimation)!

Curious to learn more? You can refer to the following articles for more information!
📌 https://lnkd.in/gJuPBQcb
📌 https://lnkd.in/gsbG3HgE
📌 https://lnkd.in/gh3A6iUJ

#artificialintelligence #machinelearning #datascience #sharingiscaring
Feb 7, 2023
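Louis's post mentions DLE for regression models; here is a minimal sketch of that estimator, assuming a recent nannyml release and its synthetic car-price regression dataset (the feature column names below come from that dataset and may differ across versions).

```python
import nannyml as nml

reference_df, analysis_df, _ = nml.load_synthetic_car_price_dataset()

# Direct Loss Estimation: an internal model learns to predict the loss of
# the monitored model, giving estimated RMSE/MAE without ground truth.
estimator = nml.DLE(
    feature_column_names=[
        'car_age', 'km_driven', 'price_new', 'accident_count',
        'door_count', 'fuel', 'transmission',
    ],
    y_pred='y_pred',
    y_true='y_true',
    timestamp_column_name='timestamp',
    metrics=['rmse', 'mae'],
    chunk_size=6000,
)
estimator.fit(reference_df)                 # fit on the labeled reference period
results = estimator.estimate(analysis_df)   # estimated RMSE/MAE, no targets needed
```

As with CBPE, the estimate degrades if concept drift sets in, since the internal loss model was trained under the old input-output relationship.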