Improve financial reporting with the help of Data Science

This blog post is based on a customer case from winter 2019/2020. The project was implemented as an Analytics Jumpstart and this post will guide the reader through the problem, solution and result.

What is group consolidation?

Large companies are often divided into complex corporate structures where different units are responsible for different parts of the operation. This division is often based on the units’ geographical location or area of responsibility. Irrespective of the basis for the division, the group needs to sum up all revenues, costs, investments and deductions from the various units in order to keep the income statement and balance sheet up to date, so that management always has a current picture of how the company is performing. As part of this process, entries and transactions between the group’s various subsidiaries also need to be eliminated so that they do not affect the external result.

This process of summing up items and figures for all the various units is known as group consolidation, and it is usually carried out in connection with the presentation of the final accounts. Just how often the final accounts are presented depends to some extent on the individual company’s corporate structure and the size of the group, but the general practice is to produce monthly accounts and to consolidate the group’s financial metrics once a month.

The company we helped this past winter works with monthly accounts. At the end of every month, the company’s various units collate a number of key metrics and report them to the group’s parent company. In many cases, this reporting is done manually via Excel files.

Group consolidation is often a stressful process with high demands on speed and accuracy

Even if certain logical checks are already built into the reporting systems, errors still occasionally occur in reporting: an extra zero, a unit forgetting to report an important metric, or a unit confusing the various metrics in its report. The finance department in the parent company is responsible for identifying these errors and contacting the reporting units for corrections before the final accounts are presented.

In reality the finance department operates under considerable pressure as the presentation of the final accounts draws near. Often it has just a few days between receiving the metrics from the various units and presenting the consolidated balance sheet and income statement to the executive management. To put this into perspective, the company receives tens of thousands of reported metrics every single month. Manually identifying and correcting discrepancies is therefore stressful, and there is a risk that some discrepancies are not caught in time, which in turn may render the entire income statement or balance sheet inaccurate. The company therefore looked for a smarter and more resource-efficient way of examining the reported metrics each month, identifying discrepancies and ensuring the high quality of the consolidated final accounts.

A three-part plan

While the company struggled with its monthly final accounts, it was in fact sitting on an untapped gold mine: it had saved many years of financial reporting history. We therefore decided jointly to examine whether this historical data could be used to solve the problem and give the company greater assurance and higher quality in its monthly reports.

Our joint project plan comprised three parts, separated from each other by go/no-go decisions: 

  1. Forecast time series based on historical data (can we extrapolate anything about a unit’s future reporting based on its historic reporting?)
  2. Identify and prioritise discrepancies (can we identify actual errors by checking those time series where the reported values differ significantly from the forecast values?)
  3. Visualise discrepancies in a suitable tool (can we make the information available and accessible so that it can easily be used in the finance function’s final accounts process?)


Part 1: forecast financial time series

In the first step we collated the historical reporting data for each unit and metric in something known as time series. A time series is a collection of reported data points with time-stamps, and the example below shows how the accumulated tax debt for unit X varied over time.

Example of time series for reporting data
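As a minimal, purely illustrative sketch (the figures below are invented, not the customer’s data), such a series can be represented simply as time-stamped data points:

```python
from datetime import date

# A time series: one reported data point per month for a single
# unit and metric (figures are invented for illustration).
accumulated_tax_debt = [
    (date(2019, 1, 31), 120_000),
    (date(2019, 2, 28), 134_500),
    (date(2019, 3, 31), 141_200),
    (date(2019, 4, 30), 139_800),
]

for timestamp, value in accumulated_tax_debt:
    print(f"{timestamp}: {value}")
```

With one such series per unit and metric, the whole reporting history becomes a large collection of small monthly time series.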

After that we examined whether mathematical models could forecast what metrics the various units would report in the future. Since Machine Learning and Data Science work is an iterative process of creating and verifying hypotheses, we tested several different algorithms and frameworks. The advantage of applying Machine Learning at this stage was that a separate mathematical model could automatically be fitted to forecast each and every one of the time series. We soon achieved relatively good results, with forecasts that on average deviated by only 5 to 15 per cent from the actual values.
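The project tested several algorithms; as one illustrative baseline (not the model actually chosen), a sketch of simple exponential smoothing with a backtest of the average percentage deviation could look like this. The function names and figures are ours, invented for illustration:

```python
def forecast_next(series, alpha=0.3):
    """Forecast the next value of a monthly series with simple
    exponential smoothing: each observation updates a running level."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

def mean_abs_pct_error(series, alpha=0.3):
    """Backtest: forecast each point using only the history before it
    and measure the average percentage deviation from the actual value."""
    errors = []
    for i in range(2, len(series)):
        predicted = forecast_next(series[:i], alpha)
        errors.append(abs(predicted - series[i]) / abs(series[i]))
    return sum(errors) / len(errors)

reported = [120_000, 134_500, 141_200, 139_800, 152_300, 149_100]
print(f"next-month forecast: {forecast_next(reported):,.0f}")
print(f"average deviation:   {mean_abs_pct_error(reported):.1%}")
```

Fitting one such lightweight model per series is cheap, which is what makes it feasible to forecast tens of thousands of metrics automatically every month.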


Part 2: identify and prioritise discrepancies

Once we had established that it was possible to forecast the financial reports, the next step was to use these forecasts to identify and prioritise the biggest discrepancies in a real monthly final accounts round at the company. We compared the figures reported by the various units with our own forecasts and identified the time series with the greatest discrepancies between forecast and reported values. One example of this is shown in the figure below.

Illustration of forecast and discrepancy
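The prioritisation step can be sketched as a simple ranking by relative deviation between reported and forecast values. This is a hypothetical illustration of the idea, with invented unit names and figures:

```python
def rank_discrepancies(reports, forecasts, top_n=5):
    """Rank (unit, metric) series by the relative deviation between
    the reported value and the forecast value, largest first."""
    ranked = []
    for key, reported in reports.items():
        forecast = forecasts[key]
        # Guard against division by zero for near-zero forecasts.
        deviation = abs(reported - forecast) / max(abs(forecast), 1)
        ranked.append((key, reported, forecast, deviation))
    ranked.sort(key=lambda row: row[3], reverse=True)
    return ranked[:top_n]

reports   = {("Unit A", "tax debt"): 1_500_000,   # an extra zero?
             ("Unit B", "revenue"):  2_480_000,   # looks plausible
             ("Unit C", "costs"):            0}   # forgotten metric?
forecasts = {("Unit A", "tax debt"):   152_000,
             ("Unit B", "revenue"):  2_450_000,
             ("Unit C", "costs"):      830_000}

for key, rep, fc, dev in rank_discrepancies(reports, forecasts):
    print(f"{key}: reported {rep:,}, forecast {fc:,.0f}, deviation {dev:.0%}")
```

Sorting by relative rather than absolute deviation means a large error in a small metric is not drowned out by normal variation in the group’s biggest figures.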

Part 3: visualisation of the results

Finally we visualised the discrepancies we had identified by showing the historical values, forecast values and actual reported values in a dashboard in Microsoft Power BI. The user could click through a list to check the various discrepancies that the model had identified and decide whether to contact the unit that had submitted the report for an adjustment, if necessary.

Result & conclusions

The computer’s ability to swiftly process large quantities of information makes identifying discrepancies in financial reporting data highly suitable for automation with the help of Machine Learning. Within the framework of a small-scale pilot project, we succeeded in creating working models with clear visualisation that could be used by the company’s finance department in direct conjunction with preparation of the final accounts. The fact that the models also identified several genuine reporting errors during the pilot can be seen as an added bonus!

There is considerable scope for developing the project further. A first step is to integrate the models into the company’s IT infrastructure and the monthly final accounts process. In the longer term, it would also be possible to track which units often report incorrect figures and to offer training and other measures that proactively address the problem.

Do not hesitate to get in touch if you have any questions! Just send us an email and we can discuss the issue further.

David Genelöv, Advectas

David Genelöv
My name is David and I work as a Data Scientist and Management Consultant at Advectas. After reading Industrial Economics at Chalmers, I worked on evaluating the business potential of various research projects at the PARC research center in California and its parent company, Xerox. At Advectas, I have worked on Machine Learning projects in finance, healthcare and logistics, among others.
