Sopa 🇰🇪
➡️ We've seen quite a buzz around the impact of new image-generating models on creative jobs lately. But slipping under the radar are the tools built on Large Language Models (LLMs) that are changing how scientific research gets done 👨‍🔬👩‍🔬.
This quick read from Nature showcases a few exciting use cases: one researcher uses such a tool to rephrase tricky paragraphs, another to generate catchy titles, and yet another for some good ol' "rubber-ducking" (the old trick of talking through a problem out loud to an inanimate object, yes, like a rubber duck, to clarify your thinking).
This is a testament to the seismic shifts happening in the science world, especially around the crucial tasks of literature review and writing scientific papers ✍️.
➡️ One solid practice that data scientists often overlook when building a machine learning model is error analysis 🕵️. After a few iterations, once the evaluation metrics have improved, it's essential to check whether the remaining errors are clustered within certain data subsets or randomly distributed. If it's the former, the model has blind spots, which can be tackled by either weighting errors on these subsets more heavily during training or gathering more representative data.
Sliceline is a tool designed to isolate these subsets. Based on recent research conducted at Graz University of Technology in Austria 🇦🇹, it was implemented and open-sourced a few weeks ago by the data science team at DataDome, a company specialized in protecting web traffic against bots 🤖 and denial-of-service attacks.
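To make the idea of "clustered errors" concrete, here's a minimal sketch of a manual error analysis in plain Python. It is not Sliceline's API or algorithm, just a brute-force look at single-feature slices, and the function name `error_rate_by_slice`, the `min_size` parameter, and the toy data are all made up for illustration.

```python
from collections import defaultdict


def error_rate_by_slice(rows, errors, min_size=2):
    """Return (feature, value, size, error_rate) per single-feature slice, worst first."""
    # (feature, value) -> [number of rows, number of misclassified rows]
    counts = defaultdict(lambda: [0, 0])
    for row, is_error in zip(rows, errors):
        for feature, value in row.items():
            stats = counts[(feature, value)]
            stats[0] += 1
            stats[1] += int(is_error)

    slices = [
        (feature, value, n, n_err / n)
        for (feature, value), (n, n_err) in counts.items()
        if n >= min_size  # skip slices too small to say anything about
    ]
    return sorted(slices, key=lambda s: s[3], reverse=True)


# Hypothetical validation rows, and whether the model got each one wrong.
rows = [
    {"country": "KE", "device": "mobile"},
    {"country": "KE", "device": "desktop"},
    {"country": "FR", "device": "mobile"},
    {"country": "FR", "device": "desktop"},
    {"country": "KE", "device": "mobile"},
    {"country": "FR", "device": "mobile"},
]
errors = [True, True, False, False, True, False]

overall = sum(errors) / len(errors)
for feature, value, size, rate in error_rate_by_slice(rows, errors):
    print(f"{feature}={value}: n={size}, error rate={rate:.2f} (overall {overall:.2f})")
```

In this toy data, the `country=KE` slice stands out with an error rate well above the overall one, which is exactly the kind of blind spot worth investigating. A slice-finding tool goes further by also searching combinations of features, which quickly becomes too expensive to do by hand.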
➡️ Lastly, if you're free on December 15, I strongly recommend checking out the NormConf event. It's a free online conference whose tagline is "the tech conference about all the stuff that matters in data and machine learning, but doesn't get the spotlight".
So it covers the less trendy subjects that data professionals deal with on a daily basis, such as: when is it smarter to build a model, and when should you stick to basic rules? How can you make your code configurable for better reproducibility? How can you efficiently resolve your Python 🐍 dependency issues?
The conference is organized by the likes of Vicki Boykis, Vincent Warmerdam (referenced in ⚡️Trendbreak #5 and #19⚡️), and Caitlin Hudon (mentioned in ⚡️Trendbreak #8⚡️), and the program is top-notch, with all presentations available to replay!
Have a great week and see you next time! 👋