Project background
Robust data on short-term economic development in Germany are essential, especially in times of crisis. In such situations, economic conditions can change quickly and unexpectedly, creating significant challenges for both enterprises and policy makers. Up-to-date information is key to reacting swiftly to economic changes, taking targeted measures and improving risk assessment. Such information forms the basis for the informed decisions needed to mitigate negative effects and promote economic stability. As digitalisation advances, new possibilities emerge for optimising processes and enhancing methodological approaches, which further increases the importance and availability of current data.
Recognising these developments, the Federal Statistical Office launched a project in the summer of 2021 designed to speed up the provision of estimates for short-term economic indicators. The publication period varies depending on the economic sector. We publish retail trade turnover results 30 days after the end of the reference month (t+30), the turnover in manufacturing is released after 45 days (t+45), the results of the monthly construction survey after 55 days (t+55) and the data of the quarterly survey of building completion work after approximately 60 days (t+60). The objective of the project was to provide early indicators of short-term economic development within 15 to 20 days after the end of the reference period, meaning that these results would be available much earlier than the official data.
The project focused on the manufacturing sector and identified potential for faster data collection and improvements in statistical methods. With regard to the methodology, extrapolation, imputation, econometric estimation procedures (ARIMA and dynamic factor models) and machine learning methods were examined and enhanced. We also explored whether external data that are available early are suitable for improving the data basis and closing any data gaps. The intention was to develop a procedure that would provide sound results and could be applied not only at the federal level but also across as many Länder as possible.
The underlying data were based on the monthly report on local units in manufacturing, mining and quarrying (only in German). The report covers all producing local units located in Germany with 50 or more persons employed. The tabulating units at the lowest level are "local kind-of-activity units".
A local kind-of-activity unit includes all activities of a local unit which belong to the same group in the classification of economic activities.
Access to the microdata made it possible to enhance the methodology, and a successful imputation method was ultimately developed on this basis. As some of the data have not yet been edited in the statistical offices of the Länder at this early stage of the estimation process, an outlier identification procedure was developed specifically for this purpose.
Here, the month-on-month change rate of turnover of a local kind-of-activity unit was put in relation to the result of its sector (4-digit item of the classification of economic activities). The resulting range made it possible to use data for the estimation that had not yet been edited.
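The outlier check described above can be illustrated with a minimal Python sketch. It is not the procedure used in the project: the source does not specify how the unit's change rate is related to the sector result or how wide the accepted range is. The sketch assumes an absolute difference between the two rates and a fixed tolerance; the function names, the tolerance and all figures are hypothetical.

```python
def change_rate(current, previous):
    """Month-on-month change rate as a fraction (0.05 means +5%)."""
    return (current - previous) / previous


def flag_outliers(units, sector_rate, tolerance):
    """Return the ids of units whose change rate deviates from the
    sector's change rate by more than `tolerance` (absolute difference).

    Assumption: the comparison rule and the tolerance are illustrative;
    the source does not state them.
    """
    flagged = []
    for unit_id, (previous, current) in units.items():
        if previous <= 0:
            # No meaningful rate can be formed; treat as suspicious.
            flagged.append(unit_id)
            continue
        if abs(change_rate(current, previous) - sector_rate) > tolerance:
            flagged.append(unit_id)
    return flagged


# Hypothetical turnover (previous month, current month) of three local
# kind-of-activity units in one 4-digit sector.
units = {"A": (100.0, 104.0), "B": (200.0, 190.0), "C": (50.0, 150.0)}
sector_rate = 0.02  # assumed sector-level change of +2%
tolerance = 0.30    # assumed accepted deviation of 30 percentage points

print(flag_outliers(units, sector_rate, tolerance))  # ['C']
```

Unit C triples its turnover while the sector grows by 2%, so it falls outside the assumed range and would be excluded or reviewed before estimation.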
One project component was networking and cooperation within the scientific community. In this context, we established an international network to facilitate the sharing of knowledge and methods. Furthermore, we presented the project to the Committee of Experts for Industry, the Federation-Länder Committee on Statistics of the economics ministries, at the Statistical Week and to the Eurostat Business Statistics Directors Group. The exchange with other European countries underscored the strong international interest in data that are available early, as evidenced by the large number of comparable projects in other countries.
State of play and methodology
The microdata were collected and processed at time t+15 and time t+20 over a period of more than two years. It was found that at time t+20, the volume of edited data available had increased by up to 20% and data quality had improved significantly, eliminating the need for internal outlier identification. In the end, work therefore focused on calculating and providing unadjusted turnover data for the manufacturing sector at time t+20 – and therefore 25 days ahead of the previous release date.
This early indicator is of particular interest to both external user groups and official statistics. In official statistics, this indicator is mainly used in national accounts, serving as an initial gauge of GDP performance - for both the output approach (based on gross value added) and the expenditure approach (based on capital formation).
For the quarterly calculation of gross domestic product, an early estimate is already provided on the 30th day of the second month. As reliable information is not available for the third month of the quarter, other indicators must be used to forecast the developments of that month. Reliable estimates of turnover at time t+20 would be very helpful for these calculations.
An article to be published in issue 3/2025 of the scientific journal Wirtschaft und Statistik (WISTA) deals with the methodology currently in use, the results, and more advanced methodological approaches and evaluations. The integration of this early indicator into official statistics could be considered provided that the level of quality is maintained and more detailed estimates can be supplied for economic sectors at the 2-digit level.
A number of different imputation approaches were tested, with regression-based imputation registering the smallest deviations on average and therefore providing the most stable estimation quality over time. The imputation is implemented with the R package "mice" (Multivariate Imputation by Chained Equations, version 3.15.0) and uses linear regression. The term "imputation" refers to procedures used to replace missing data in statistical surveys. Missing data are imputed to produce a matrix which consists of existing and imputed microdata, with no gaps remaining at the end of the process.
For the current reference month, all values which are not available in edited form at time t+20 are imputed in this manner. The data basis of the regressors begins in 2014 and grows each month. Missing older values are imputed beforehand so that a complete predictor matrix is available at the time of estimation. The imputation is carried out at the level of local kind-of-activity units (approximately 32,000 per month) and uses a separate model for each 2-digit item of the classification of economic activities, so that each 2-digit item represents a separate target variable. The results are then aggregated across all economic sectors. One difficulty is that the values already reported for the current reference month do not represent an adequate sample of the universe. Larger and heterogeneous local units, in particular, tend to report their data later, which means that the data available are not suitable as regressors for estimating the missing values of the respective month as accurately as possible.
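The principle of regression-based imputation can be sketched as follows. This is a simplified stand-in, not the project's implementation: the source uses the R package "mice", whereas this sketch uses Python with an ordinary least-squares fit from NumPy and produces a single deterministic imputation. The function name, the two regressors and all figures are hypothetical.

```python
import numpy as np


def impute_by_regression(X, y):
    """Fill NaN entries of y with predictions from an OLS fit on the
    rows where y is observed. X must be complete (no NaN)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # add an intercept column
    observed = ~np.isnan(y)
    # Least-squares coefficients estimated from the reporting units only.
    coef, *_ = np.linalg.lstsq(design[observed], y[observed], rcond=None)
    completed = y.copy()
    completed[~observed] = design[~observed] @ coef
    return completed


# Hypothetical units of one 2-digit sector.
# Columns: turnover in the previous month, turnover in the same month
# of the previous year (two of the regressors named in the text).
X = [[100, 95], [200, 210], [150, 140], [300, 310], [250, 240], [400, 390]]
# Current-month turnover; NaN marks units without edited data at t+20.
y = [102, 205, 151, np.nan, 248, np.nan]

completed = impute_by_regression(X, y)
sector_total = completed.sum()  # reported and imputed values combined
print(completed.round(1), round(float(sector_total), 1))
```

Because the regressors come from earlier periods, the predictor matrix is complete even for units that have not yet reported. Fitting one model per 2-digit sector and summing the completed vectors would correspond to the aggregation step described above.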
Regressors of the model
The following table shows the regressors used in the model to explain the dependent variables. They were selected on the basis of theoretical considerations and empirical findings to provide the most precise and comprehensive insight possible into the underlying relationships. Each regressor represents a specific influencing factor whose impact on the dependent variable is quantified in the model.
Regressor | Description |
---|---|
Quantiles | Moving averages of turnover (12 months) on the basis of local kind-of-activity units (KAUs) are subdivided into quartiles, based on a cumulative distribution function of the two-digit items (economic sectors) of the German Classification of Economic Activities (WZ) |
Target variable - value for the previous month | Based on local KAUs |
Target variable - value for the previous year | Based on local KAUs |
Moving averages of sales | 3-month period, scaled between 0 and 1 based on local KAUs, geared to target variable (Germany/abroad) |
Moving averages of new orders | 6-month period, scaled between 0 and 1 based on local KAUs, geared to target variable (Germany/abroad) |
Domestic and foreign prices - value for the previous month | Based on 4-digit items of the WZ (local KAUs), geared to target variable (Germany/abroad) |
Calendar factors | Monthly, based on 4-digit items of the WZ (focus), geared to target variable (Germany/abroad) |
Type of unit | Single-unit/multi-unit enterprise |
Land | Categorical variable |
Persons employed - value for the previous month | Scaled between 0 and 1 |
Time fixed effects | Reference year and reference month |
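Three of the transformations listed in the table (moving averages, scaling between 0 and 1, and quartile classes) can be illustrated with a short Python sketch. It makes assumptions the source does not confirm: scaling is taken to be min-max scaling, and quartile classes are derived from the empirical cumulative distribution within one sector. All function names and figures are hypothetical.

```python
def moving_average(values, window):
    """Mean of the last `window` values (fewer if the series is shorter)."""
    tail = values[-window:]
    return sum(tail) / len(tail)


def min_max_scale(values):
    """Scale values linearly to the interval [0, 1] (assumed method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant input: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def quartile_classes(values):
    """Assign each value a class 1..4 from its empirical cumulative
    distribution, i.e. the share of values less than or equal to it."""
    n = len(values)
    classes = []
    for v in values:
        rank = sum(1 for other in values if other <= v)
        classes.append((4 * rank + n - 1) // n)  # integer ceil(4 * rank / n)
    return classes


# Hypothetical 12-month moving averages of turnover for eight local
# KAUs belonging to the same 2-digit sector.
averages = [10, 20, 30, 40, 50, 60, 70, 80]

print(moving_average([1, 2, 3, 4, 5, 6], 3))  # 5.0
print(min_max_scale([10, 20, 30]))            # [0.0, 0.5, 1.0]
print(quartile_classes(averages))             # [1, 1, 2, 2, 3, 3, 4, 4]
```

In a full pipeline these steps would be applied per unit and per sector before the regressors enter the imputation model.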
Outlook
Via the "Economic Dashboard", we will share the progress of the "t+20" project with all interested parties from now on by updating the data on a regular basis. Alongside cumulative results since the project’s launch in July 2022, the most recent estimates will also be provided.
Another central aspect of the presentation is the evaluation of the current data structure and of quality criteria, which enables a well-founded assessment of the project results. These evaluations will also be updated monthly on the Economic Dashboard. The underlying parameters provide important information for assessing the accuracy of the forecasts made in the "t+20" project.
Regular data updates ensure that the information is always up to date, thereby providing a sound basis for informed assessments of short-term economic developments. The "t+20" project contributes to a comprehensive analysis by enabling the faster provision of relevant data.