Conference on Foundations and Advances of Machine Learning in Official Statistics, 3rd to 5th April, 2024

Session 4.2 Applied ML II

The Journey of Machine Learning at the Italian National Institute of Statistics

Mauro Bruno* 1, Marco Di Zio1, Fabrizio De Fausti1

Abstract

The paper provides a comprehensive overview of the evolution and application of machine learning (ML) at the Italian National Institute of Statistics (Istat) from the early '90s to the present, highlighting key projects, challenges, and advancements in using ML for official statistics. It outlines the initial exploration of ML techniques in editing and imputation tasks, followed by significant international collaborations that compared various ML methods with traditional statistical techniques. The paper highlights the pivotal role of big data in the introduction of ML in official statistics. Furthermore, the paper presents specific applications of ML at Istat, including remote sensing for urban green statistics, the use of vessel Automatic Identification System (AIS) data for maritime statistics, sentiment analysis, and web scraping for enterprise classification. It also delves into the use of ML for imputation with multisource data, particularly in the context of integrating survey and administrative data for the Italian census.

The paper also discusses the opportunities related to the use of generative AI in official statistics. The ability of Large Language Models (LLMs) to generate human-like text can make tasks that require human intervention more efficient, e.g., code translation and explanation (SAS to R), support in updating classification definitions, or drafting methodological notes. However, the deployment of LLMs also introduces risks, including ethical dilemmas, legal issues, and the potential for generating misleading information. Ensuring responsible use involves human oversight, adherence to privacy principles, and ongoing education on LLM capabilities and limitations.

*: Speaker

1: Istat - Italy