Gagandeep Singh
Staff Data Scientist @ Walmart
🚀 Exciting News for Data Practitioners! 🚀 Are you tired of the hassle and expense of annotating production data to monitor your machine learning models? Allow me to introduce NannyML, an open-source post-deployment model monitoring framework in Python that's about to make your life a whole lot easier. NannyML comes packed with some seriously clever features. The best part? It doesn't rely on labeled data. Yes, you heard that right – all the magic happens with the features you're already capturing while your model is in production. Let's dive into the essence of NannyML:
🔍 Feature Drift Detection: NannyML identifies univariate drift by comparing feature distributions across different chunks of data. Think of these 'chunks' as snapshots of your data – they can be based on time, size, or number of observations.
📊 Advanced Analysis: For more complex scenarios involving multivariate drift, NannyML tracks the Principal Component Analysis (PCA) data reconstruction error across chunks. In simple terms, it checks that the key components of your data stay stable over time, so you can spot deviations.
But wait, there's more:
📈 Model Performance Estimation: NannyML doesn't stop at drift detection; it also estimates your model's performance, helping you maintain top-notch results.
💰 Business Value Estimation: It also helps you assess the business value of your models, ensuring they continue to deliver the desired outcomes.
🔍 Data Quality Monitoring: Keeping an eye on data quality? NannyML has your back, ensuring your data remains reliable and consistent.
While NannyML offers a plethora of functionality, let's focus on feature drift detection, which doesn't require labeled data. It won't catch concept drift, but that's a small trade-off for the convenience it provides.
📊 Univariate Drift: Check whether individual feature distributions have shifted between the reference and analysis periods – a game-changer for maintaining model accuracy.
📉 Output Drift: Track shifts in the distribution of predicted classes over time, ensuring your model's predictions remain on point.
📊 Multivariate Drift: NannyML goes a step further by looking at the overall shift in the joint feature distribution, validating the univariate results.
In conclusion, NannyML is a powerful tool that simplifies drift detection for production models. With its intuitive interface and open-source nature, there's no excuse not to use it. Don't let your production models lose their business value due to neglect. Embrace NannyML and keep your models on the right track! 🚀
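The univariate check described above boils down to comparing a feature's distribution on a reference chunk against a production chunk. A minimal pure-Python sketch of the underlying idea (this is a conceptual illustration, not NannyML's actual API) using the Jensen-Shannon distance between two binned distributions:

```python
import math

def js_distance(p, q):
    """Jensen-Shannon distance between two discrete distributions (binned histograms)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        # Kullback-Leibler divergence in bits; zero-probability bins contribute nothing.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return math.sqrt((kl(p, m) + kl(q, m)) / 2)

reference = [0.25, 0.25, 0.25, 0.25]   # feature histogram on the reference period
analysis  = [0.10, 0.20, 0.30, 0.40]   # same feature on a production chunk
score = js_distance(reference, analysis)  # ≈ 0.2
drifted = score > 0.1                     # illustrative alert threshold
```

The distance is 0 for identical distributions and bounded by 1, which makes a fixed alert threshold workable across features.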
Oct 29, 2023
Marios Kadriu
Data Scientist | Software Engineer | DeepMind Scholar
91% is an incredible percentage. It just goes to show how crucial it is to keep an eye on model performance in production. Furthermore, there may be factors beyond data drift that cause degradation. I am very happy to learn that NannyML, an open-source Python module, exists to help us start a performance monitoring project. Have you used NannyML or a different model performance monitoring tool? #datascience #ai #machinelearning
Jul 27, 2023
Anderson Chaves
Lead Data Scientist at EssilorLuxottica
Another cool thing for the day! 💡 Thanks, Wojtek Kuberski and NannyML! #python #machinelearning #datascience
Nov 10, 2022
Pascal Biese
AI/ML Engineer | LLMs | NLP
Does the pattern below look familiar to you? If not, you can consider yourself incredibly lucky! For everyone else, check out NannyML: NannyML is an open-source Python library that allows you to estimate post-deployment model performance (without access to targets), detect data drift, and intelligently link data drift alerts back to changes in model performance. Built for data scientists, NannyML has an easy-to-use interface and interactive visualizations, is completely model-agnostic, and currently supports all tabular binary classification use cases. (Source: lnkd.in/e2RjQsQg) #DataScience #MachineLearning #AI #DeepLearning
May 16, 2022
Kishan Savant
Software Engineer @VCollab | Open Source Enthusiast
Thank you Hakim Elakhrass, Niels Nuyttens and NannyML team for sending this #contributions #swag all the way from Belgium. Really appreciate the note, Hakim. Looking forward to more #opensource #contributions to the wonderful NannyML #modelmonitoring tool.
Apr 8, 2023
🚀 Mikkel Jensen
Data Scientist | Developer | ML & AI
Great resources and libraries I have come across recently: NannyML: An open-source tool for monitoring models post deployment. I find the ability to estimate performance without labels especially helpful, as we have a one-year lag on our labels. It is also possible to detect data/classifier drift. 𝐎𝐩𝐭𝐁𝐢𝐧𝐧𝐢𝐧𝐠: The go-to data binning library. Super useful for discovering interesting intervals in variables, and for binning and preprocessing before applying a logistic regression on top. 𝐓𝐚𝐛𝐃𝐃𝐏𝐌: A new method for modelling tabular data. While I have only briefly browsed the paper, it seems promising for generating tabular data, outperforming both SMOTE and GANs in most of the tested cases! The paper was published only a week ago, but I'm probably already late to the party talking about this one. What are your favorite new tools? Links in the comments👇 ------------------------------------------------------------------ Talking data science, credit risk, cryptocurrency, software development - Connect & start the conversation! #machinelearning #python #datascience #opensource
Oct 6, 2022
Olivier Binette
Data Science Research, ML Evaluation & Entity Resolution // PhD Candidate at Duke
🙅 Stop monitoring data drift. Here's what to do instead. After deploying machine learning models, it's essential to ensure that they keep functioning as intended. One common way to do this is by monitoring data drift, or verifying that the data used for predictions is similar to the data the model was trained on. If there is a significant difference, retraining the model on updated data can help. However, monitoring data drift does not indicate if the drift is affecting the model's ability to solve the task it was trained for. An alternative and more focused approach is to continuously monitor the model's generalization performance. This can be done without using any labeled data through clever statistical techniques, such as confidence-based performance estimation and direct loss estimation. 🚀 NannyML implements these methods, allowing you to estimate the model's performance on drifted data and focus on what is most important for your task. Does it mean you should really stop monitoring data drift? No. But keep in mind that data drift does not always equate to performance drift, and it's usually more beneficial to focus on the latter. Code below is from lnkd.in/e95kJv-H #machinelearning #ml #ai #drift #performance #evaluation #statistics #datascience #nannyml #datadrift #MLOps
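The confidence-based performance estimation mentioned above has a simple core: if the model's probability scores are well calibrated, a hard prediction at threshold 0.5 is correct with probability max(p, 1 − p), so averaging that quantity over a chunk estimates accuracy without a single label. A toy sketch of this idea (a conceptual illustration, not NannyML's actual CBPE implementation, which also handles metrics like ROC AUC):

```python
def estimated_accuracy(probs):
    """CBPE-style expected accuracy from calibrated positive-class scores.

    Assumes scores are calibrated P(y=1) and there is no concept drift:
    each 0.5-threshold prediction is correct with probability max(p, 1 - p).
    """
    return sum(max(p, 1 - p) for p in probs) / len(probs)

confident = [0.95, 0.05, 0.90, 0.10]   # model is sure of itself
uncertain = [0.55, 0.45, 0.60, 0.40]   # drifted inputs push scores toward 0.5
estimated_accuracy(confident)  # 0.925
estimated_accuracy(uncertain)  # 0.575
```

This is also why drift that pushes scores toward 0.5 shows up directly as an estimated performance drop, which is exactly the signal the post argues you should watch.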
Feb 2, 2023
Roger Kamena, M.Sc.
Senior Data Scientist, AI-Powered Analytics at UKG
Even though sheer experience taught me that data drift alone is not enough to monitor models in production, and that regularly monitoring performance through randomized control trials on out-of-sample data is a needed practice, it’s nice to read a paper that confirms it with empirical evidence. Moreover, NannyML is a nice discovery for me. Can’t wait to test it!
Feb 8, 2023
Louis Owen
AI & Data Science | Yellow.ai
[Estimating Accuracy Without Ground Truth] Getting your ML model to production is not an easy task. Monitoring your deployed ML model is even harder. If you have a business metric that directly correlates with your ML model's performance, good for you. But what if multiple factors influence your business metric and your ML model is just one of them? What if you want to know exactly how your ML model is performing in production, measured by technical ML metrics (accuracy, precision, recall, etc.)? ✨ Introducing CBPE (Confidence-Based Performance Estimation), developed by the amazing team behind NannyML. With CBPE you can estimate the performance of any classification model without any ground truth! How is that even possible? It relies on the model's prediction confidence scores, under the assumptions that the model is well-calibrated and that there is no concept drift. Furthermore, we can also estimate the performance of a regression model with a similar algorithm developed by the NannyML team, called DLE (Direct Loss Estimation)! Curious to learn more? You can refer to the following articles: 📌 lnkd.in/gJuPBQcb 📌 https://lnkd.in/gsbG3HgE 📌 https://lnkd.in/gh3A6iUJ #artificialintelligence #machinelearning #datascience #sharingiscaring
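For the regression case, the DLE idea is to train a second "nanny" model on reference data (where targets are known) to predict the main model's per-row loss from the features; averaging its predictions on unlabeled production rows then estimates the metric. A deliberately tiny sketch with a 1-nearest-neighbor stand-in for the nanny model (conceptual only; the data and the nearest-neighbor choice are illustrative, not NannyML's implementation):

```python
def fit_loss_model(features, losses):
    """Toy nanny model: predicted loss = loss of the nearest reference row."""
    def predict(x):
        nearest = min(range(len(features)), key=lambda i: abs(features[i] - x))
        return losses[nearest]
    return predict

# Reference period: targets available, so per-row squared errors are computable.
ref_x = [1.0, 2.0, 3.0, 4.0]
ref_loss = [0.1, 0.2, 0.4, 0.8]           # (y - y_hat)^2 for each reference row
nanny = fit_loss_model(ref_x, ref_loss)

# Analysis period: no targets, but we can still estimate the main model's MSE.
analysis_x = [1.1, 3.9, 2.2]
est_mse = sum(nanny(x) for x in analysis_x) / len(analysis_x)  # ≈ 0.367
```

In practice the nanny model is a full gradient-boosted regressor over all features, but the estimate is formed the same way: average the predicted losses over the chunk.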
Feb 7, 2023
Smriti Mishra
Data Science & Engineering at Adage
How can you know if your ML models did not fail silently after deployment? NannyML was trending on GitHub (it was also #3 on Product Hunt). It's a fantastic open-source Python library for detecting silent ML model failure! Key aspects include: 🔹Estimate the performance of a deployed ML model in the absence of target data 🔹Detect multivariate and univariate data drift robustly 🔹Link drops in performance to drift in specific features 🔹Compatible with all classification models 🔹Built-in performance and data drift visualisations. pip install nannyml. Check out the open-source project and give it a star to keep up with future updates like forthcoming regression support! lnkd.in/ej_VtyeP #technology #artificialintelligence #python #data #programming #machinelearning
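The multivariate detection mentioned above rests on PCA reconstruction error: fit principal components on the reference period, then measure how poorly new chunks are reconstructed from those components. When the correlation structure breaks, the error rises even if each feature's marginal distribution looks fine. A pure-Python 2-D sketch of the concept (illustrative only, not NannyML's implementation, which uses a full PCA over all features):

```python
import math

def leading_axis(xs, ys):
    """Unit vector along the first principal component of 2-D data (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) ** 2 for x in xs) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    c = sum((y - my) ** 2 for y in ys) / n
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)  # top eigenvalue
    vx, vy = b, lam - a                                         # its eigenvector
    norm = math.hypot(vx, vy) or 1.0
    return mx, my, vx / norm, vy / norm

def reconstruction_error(reference, batch):
    """Mean distance of batch points from the principal axis fitted on reference."""
    mx, my, ux, uy = leading_axis(*zip(*reference))
    errs = []
    for x, y in batch:
        t = (x - mx) * ux + (y - my) * uy      # project onto the reference axis
        rx, ry = mx + t * ux, my + t * uy      # reconstruct from the projection
        errs.append(math.hypot(x - rx, y - ry))
    return sum(errs) / len(errs)

reference = [(float(i), float(i)) for i in range(10)]  # perfectly correlated pair
aligned = [(2.0, 2.1), (5.0, 4.9)]   # same structure: small reconstruction error
broken  = [(2.0, 8.0), (7.0, 1.0)]   # correlation broken: large error, drift alert
```

Here `reconstruction_error(reference, broken)` is far larger than `reconstruction_error(reference, aligned)`, even though both batches lie inside the reference range of each individual feature.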
Jun 20, 2022
👋 Simon Stiebellehner
Building ML Platforms | Engineering Manager | 👨‍🏫 University Lecturer | Advisor
Deployed a #MachineLearning model to production? Now you're having sleepless nights due to 𝗶𝗺𝗺𝗶𝗻𝗲𝗻𝘁 𝗽𝗲𝗿𝗳𝗼𝗿𝗺𝗮𝗻𝗰𝗲 𝗱𝗲𝗴𝗿𝗮𝗱𝗮𝘁𝗶𝗼𝗻? 😰 𝗚𝗲𝘁 𝗮 𝗡𝗮𝗻𝗻𝘆 𝗳𝗼𝗿 𝘆𝗼𝘂𝗿 𝗺𝗼𝗱𝗲𝗹! 𝗡𝗮𝗻𝗻𝘆𝗠𝗟 (by NannyML) is an 𝗢𝗽𝗲𝗻 𝗦𝗼𝘂𝗿𝗰𝗲 𝗣𝘆𝘁𝗵𝗼𝗻 𝗽𝗮𝗰𝗸𝗮𝗴𝗲 that helps you estimate model performance in production without access to targets! 𝚙𝚒𝚙  𝚒𝚗𝚜𝚝𝚊𝚕𝚕  𝚗𝚊𝚗𝚗𝚢𝚖𝚕 ➡️ 𝗗𝗲𝘁𝗲𝗰𝘁 𝗱𝗮𝘁𝗮 𝗱𝗿𝗶𝗳𝘁 of deployed models w/o targets(!) ➡️ Configure 𝗮𝗹𝗲𝗿𝘁𝘀 ➡️ Link data drift back to 𝗺𝗼𝗱𝗲𝗹 𝗰𝗵𝗮𝗻𝗴𝗲𝘀 ➡️ 𝗠𝗼𝗱𝗲𝗹-𝗮𝗴𝗻𝗼𝘀𝘁𝗶𝗰 ➡️ Comes with a neat set of 𝘃𝗶𝘀𝘂𝗮𝗹𝗶𝘇𝗮𝘁𝗶𝗼𝗻𝘀 Check it out and star the repo to stay up-to-date! ⭐ 𝗚𝗶𝘁𝗛𝘂𝗯: lnkd.in/e-_YSDsh --- Follow me for curated, high-quality content on productionizing #MachineLearning and #MLOps . Let’s take #DataScience from Notebook to Production!
May 14, 2022
João Maia
Data Scientist | Machine Learning | Python | Keras
Thanks Wojtek, I recently implemented your solution in my personal project, and it's really, really useful for identifying bad features, even after going through other variable selection methods, and it helps me prevent some failures.
Jul 27, 2023
NLP Logix
3,527 followers
Great read alert: this article from NannyML discusses a recent study by MIT, Harvard, The University of Monterrey, and other top institutions on how models degrade over time. lnkd.in/g49zhe5a #ml #ai #mlmodels #datascienceisateamsport
Apr 18, 2023