Ola todos! 🇪🇸
Here's a blog post that presents good practices for writing data dictionaries, which are a type of documentation suited for data.
While advancements in computer hardware performance (predicted by Moore's Law) allow for a regular reduction in computation time, sometimes the software also contributes. This article describes the quest for a more efficient way to multiply matrices, one of the most common computing operations in machine learning.
We conclude with one of my favorite scientific articles, which shows that simple descriptive statistics (mean, standard deviation...) as well as graphical representations that summarize the data (box-and-whisker plots, or boxplots in English) can sometimes hide big differences between several data sets. This is particularly visible on this spectacular animated graph, where a set of continuously modified points will keep the same descriptive statistics to two decimal places!
Have a great weekend everyone!