An Experimental Comparison of Alternative Techniques for Event-Log Augmentation
Abstract
Process mining analyzes and improves processes by examining transactional data stored in event logs, which record sequences of events with timestamps. However, the effectiveness of process mining, especially when combined with machine or deep learning, depends on having large event logs. Event log augmentation addresses this limitation by generating additional traces that simulate realistic process executions while considering various perspectives like time, control-flow, workflow, resources, and domain-specific attributes. Although prior research has explored event-log augmentation techniques, there has been no comprehensive comparison of their effectiveness. This paper reports on an evaluation of seven state-of-the-art augmentation techniques across eight event logs. The results are also compared with those obtained by a baseline technique based on a stochastic transition system. The comparison has been carried on analyzing four different aspects: similarity, preservation of predictive information, information loss/enhancement, and computational times required. Results show that, considering the different criteria, a technique based on a stochastic transition system combined with resource queue modeling would provide higher quality synthetic event logs. Event-log augmentation techniques are also compared with traditional data-augmentation techniques, showing that the former provide significant benefits, whereas the latter fail to consider process constraints.