Ia ora na 🇫🇷!
Hold on to your hats: ⚡️Trendbreak⚡️ has just hit edition 40 (😱) and I'm going big to celebrate the milestone! 🎂
➡️ Last week, AI conquered yet another game previously thought too complex for it. Say hello to CICERO, the latest wonder from Meta (you know, Facebook in its past life), which has been let loose on the game of Diplomacy 🗺️. Born in the 50s, this blend of Risk and poker was allegedly a favorite of Kennedy and Kissinger. Its rules? Players attempt to conquer Europe 🇪🇺, alternating between negotiations and troop movements 🪖. The winners are those who master the art of winning allies and, well, betraying them at the opportune moment...
Until now, Diplomacy was considered beyond the reach of AI, since it demands a nuanced understanding of the other players' motives. But here comes CICERO, marrying strategic reasoning with natural conversation and smashing its way into the top 10% of players 🥇: over a grueling 40-game session, it achieved a score twice the average of the 82 human players.
I'd urge you to check out a few minutes of a match between an expert player and six CICERO agents – the naturalness of the written interaction is truly jaw-dropping.
There's some technical wizardry at work here: a language model (BART, fine-tuned on player dialogues from 40k games) coupled with a reinforcement learning model. You can find more juicy details in this blog post and this Science article (though it's behind a paywall, alas).
➡️ A recent literature review provides an overview of scientific papers featuring clinical prediction models using machine learning techniques 🩺🤖.
The authors examined a sample of over 150 studies published in 2018 and 2019 and indexed on PubMed, the primary bibliographic search engine in biology and medicine, each describing the development or validation of multivariable models (more than two predictors) for diagnosis or prognosis.
After careful reading, they extracted a wealth of information: design characteristics and data sources, how missing data was handled, how models were trained (optimization procedures, data split, validation, model type, evaluation metrics...), availability of code and model, etc. They then computed summary statistics to identify key trends.
The review reveals several interesting or surprising findings. For example, a majority (59%) of the models built are quite traditional: SVM, Random Forests, neural networks... On the data front, few studies (20%) use EMRs (Electronic Medical Records), the median number of predictors is 24 (mainly age, sex, patient clinical history, and blood and urine analysis parameters), and the patient count ranges from a few hundred to a few thousand. Reproducibility is where it hurts: only a minority of articles (12.5%) validate their model externally, and only a third of the studies report their hyperparameter choices. Data preparation steps, missing value handling, feature selection, and validation or calibration methods are poorly described or simply not carried out, and data and code are rarely accessible (12% of studies).
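To make the external validation point concrete, here is a minimal Python sketch. Everything in it is invented for illustration: the `risk_score` coefficients, the two tiny cohorts, and the choice of AUC as the metric are assumptions, not anything taken from the review. It simply shows how the same frozen model can look excellent on patients from the site where it was developed and noticeably worse on patients from somewhere else.

```python
# Toy illustration only: coefficients and both cohorts are made up.

def risk_score(age, crp):
    """Hypothetical frozen model: a linear score with invented coefficients."""
    return 0.03 * age + 0.1 * crp


def auc(scores, labels):
    """Concordance (c-statistic): chance that a random positive case gets a
    higher score than a random negative case (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Rows are (age, CRP, outcome); outcome 1 means the event occurred.
development_site = [
    (45, 2, 0), (50, 3, 0), (55, 4, 0),
    (62, 8, 1), (66, 9, 1), (70, 12, 1),
]
# Older population at the external site, so age separates cases less well.
external_site = [
    (72, 3, 0), (80, 4, 0), (75, 5, 0),
    (68, 10, 1), (58, 6, 1), (64, 11, 1),
]


def evaluate(cohort):
    """Score every patient with the frozen model and return the AUC."""
    scores = [risk_score(age, crp) for age, crp, _ in cohort]
    labels = [outcome for _, _, outcome in cohort]
    return auc(scores, labels)


auc_internal = evaluate(development_site)
auc_external = evaluate(external_site)
print(f"AUC on development-site patients: {auc_internal:.2f}")
print(f"AUC on external-site patients:    {auc_external:.2f}")
```

The toy numbers are chosen so that the external AUC comes out lower than the internal one. With real data the gap could be smaller, larger, or absent, which is exactly why it needs to be measured.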
The authors conclude with a rather harsh critique 😤: there seems to be little attention paid to the ultimate goal (improving patient care), the actual clinical management process, or relevant performance metrics for the clinic (as opposed to simple model error rates).
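What could a clinic-oriented metric look like? One option, which is my own pick for illustration and not something the newsletter's sources prescribe, is net benefit from decision curve analysis: true positives per patient, minus false positives per patient weighted by the odds of the risk threshold at which a clinician would act. The sketch below uses invented risks and outcomes, and the 20% threshold is an assumption too.

```python
# Toy illustration only: all risks, outcomes and the threshold are made up.

def accuracy(risks, outcomes, threshold):
    """Share of patients classified correctly when treating risk >= threshold."""
    correct = sum(1 for r, y in zip(risks, outcomes)
                  if (r >= threshold) == (y == 1))
    return correct / len(outcomes)


def net_benefit(risks, outcomes, threshold):
    """Net benefit = TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(outcomes)
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return tp / n - (fp / n) * (threshold / (1 - threshold))


outcomes = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # 2 events among 10 patients

# Model A calls everyone low risk; model B flags both events plus 2 false alarms.
model_a = [0.05] * 10
model_b = [0.6, 0.3, 0.3, 0.25, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]

THRESHOLD = 0.2  # hypothetical: act when predicted risk is at least 20%

for name, risks in [("A", model_a), ("B", model_b)]:
    acc = accuracy(risks, outcomes, THRESHOLD)
    nb = net_benefit(risks, outcomes, THRESHOLD)
    print(f"Model {name}: accuracy={acc:.2f}, net benefit={nb:.2f}")
```

Both toy models get 8 patients out of 10 right, yet model A never catches a single event and so has zero net benefit, while model B's is clearly positive. An error rate alone would not tell them apart.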
This study ties in with the Guiding Principles for Good Machine Learning Practice in medical device development, jointly published by the FDA 🇺🇸, Health Canada 🇨🇦, and the UK's health product regulatory agency, the MHRA 🇬🇧, in October 2021. Principle 1, for instance, encourages understanding how the model will fit into a clinical workflow once deployed. Principles 6 and 8 echo this, emphasizing clinical benefits over mere model evaluation metrics. Principle 4 insists on the use of independent external validation datasets, and Principle 9 calls for greater data and model transparency.
➡️ We end this edition with a powerful but lesser-known tool: Dataset Search 💿🔍, a Google search engine specialized in datasets. This brief blog post accompanying its official release in early 2020 explains how to find your way among over 25 million indexed datasets.
Thanks for your attention, and see you soon! 🤓