Machine learning in management accounting: an approach for final cost and cost curve estimation of active projects based on project data found in enterprise resource planning systems
Purpose – The purpose of this thesis is to analyze the applicability of a novel machine learning (ML) approach for estimating the final cost and the cost curve of active projects.
Design/methodology/approach – Projects across several industries tend to exceed their planned costs. In this thesis, I develop an approach that allows learning from the past projects of any given company and storing the derived knowledge in a model. This model can estimate the actual production cost curve of a project at that company at any point in time while the project is active. To evaluate the approach’s ability to identify project cost overruns, I apply it to a dataset of project cost reports from a German engineering company.
Findings – The approach is suitable for estimating project cost curves, including final costs. All tested regression model types increase their estimation power towards the end of the project. Gradient boosting regression is the best-performing model type. Its final cost estimate is always better than the initial human estimate, and its estimation quality increases significantly after 30% of the project time.
Research limitations/implications – The project data used comes from a single company. I recommend further evaluation of the approach on additional project data. The thesis focuses on the approach as a whole, consisting of data analysis, data transformation, model creation, and evaluation. Further research is necessary, especially in the field of model selection and creation.
Practical implications – The findings suggest that machine learning models trained on past project information can estimate project cost curves and can be useful to practitioners.
Originality/value – Several promising approaches to project cost estimation at the beginning of a project exist in the literature. This thesis introduces a new approach that allows estimating project cost curves at any point in time while a project is active and is particularly suited to enterprise resource planning (ERP) data structures.
Paper type – Master’s thesis