Process Data Quality
This website provides, among other things, a repository of published work dealing with process-data quality. The site is also intended as a place that researchers interested in process-data quality can visit to check on the latest advances in the area.
The temporal and case-based nature of process-related data imparts different quality considerations from those relevant to databases and data mining.
A distinct characteristic of an event log is that there exists temporal constraints among events. Each row in an event log (i.e. an event) has temporal relationships with other events. This is in contrast to other types of log typically used in traditional data analytics (such as data mining or even basic spreadsheet analysis) whereby the concept of case or resource temporal constraints does not exist. In process mining, the data is structured as multiple-record cases, whereas in data mining, the data is structured as single-record cases. (Suriadi et al., 2017)
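To make the multiple-record structure concrete, here is a minimal sketch in Python. The log contents and the field names (`case_id`, `activity`, `timestamp`) are assumptions chosen for illustration and are not taken from any particular system or from the cited paper. Each row is one event, several rows belong to the same case, and the meaning of a case only emerges once its events are put in temporal order.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: one row per event, several rows per case.
EVENT_LOG = [
    {"case_id": "c1", "activity": "Register", "timestamp": "2017-01-05T09:00:00"},
    {"case_id": "c1", "activity": "Triage", "timestamp": "2017-01-05T09:20:00"},
    {"case_id": "c2", "activity": "Register", "timestamp": "2017-01-05T09:10:00"},
    {"case_id": "c1", "activity": "Discharge", "timestamp": "2017-01-05T11:45:00"},
    {"case_id": "c2", "activity": "Triage", "timestamp": "2017-01-05T09:40:00"},
]


def build_traces(event_log):
    """Group events by case and order each case's events by timestamp."""
    cases = defaultdict(list)
    for event in event_log:
        cases[event["case_id"]].append(event)

    traces = {}
    for case_id, events in cases.items():
        # The temporal relationship between rows is what turns a set of
        # records into a trace (an ordered sequence of activities).
        ordered = sorted(events, key=lambda e: datetime.fromisoformat(e["timestamp"]))
        traces[case_id] = [e["activity"] for e in ordered]
    return traces


TRACES = build_traces(EVENT_LOG)
```

A single wrong or missing timestamp in this structure changes the order of activities within a case, which is why quality problems in event logs behave differently from quality problems in single-record data.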
Process mining is, by now, a mature data analysis discipline. As with other forms of data analysis, the quality and reliability of insights derived through analysis are directly related to the quality of the input (garbage in, garbage out). In the case of process mining, the input is an event log composed of event data captured (in information systems) during the execution of the process. It is crucial, then, that the event log be treated as a first-class citizen. While data quality is an easily understood concept, little effort has so far been directed towards (i) defining exactly what is understood as "process data quality", (ii) systematically detecting data quality issues in event logs, (iii) quantifying event log quality, (iv) repairing quality issues in event logs (while minimising information loss and not compromising analysis aims), or (v) providing methodological guidance in event log preparation from source process-related data.
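As a rough illustration of points (ii) and (iii), the following sketch flags a few simple problems in an event log and reports the fraction of affected events. It is not a method from the published work listed on this site: the three checks, the field names, and the sample log are all assumptions made for the example, and real event logs call for far richer detection.

```python
from datetime import datetime

# Hypothetical log containing three deliberately planted problems.
SAMPLE_LOG = [
    {"case_id": "c1", "activity": "Register", "timestamp": "2017-01-05T09:00:00"},
    {"case_id": "c1", "activity": "Triage", "timestamp": None},
    {"case_id": "c1", "activity": "Register", "timestamp": "2017-01-05T09:00:00"},
    {"case_id": "c2", "activity": "Register", "timestamp": "05/01/2017"},
]


def find_issues(event_log):
    """Return (row index, issue label) pairs for a few illustrative checks."""
    issues = []
    seen = set()
    for index, event in enumerate(event_log):
        timestamp = event.get("timestamp")
        if not timestamp:
            issues.append((index, "missing timestamp"))
        else:
            try:
                datetime.fromisoformat(timestamp)
            except ValueError:
                issues.append((index, "unparseable timestamp"))

        # An exact repeat of case, activity, and timestamp is treated as a duplicate.
        key = (event.get("case_id"), event.get("activity"), timestamp)
        if key in seen:
            issues.append((index, "duplicate event"))
        seen.add(key)
    return issues


def affected_fraction(event_log, issues):
    """Share of events with at least one flagged issue (a crude quality score)."""
    if not event_log:
        return 0.0
    return len({index for index, _ in issues}) / len(event_log)


ISSUES = find_issues(SAMPLE_LOG)
SCORE = affected_fraction(SAMPLE_LOG, ISSUES)
```

Even a crude score like this makes it possible to compare logs before analysis; deciding how to repair the flagged events without losing information is a separate and harder question.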
Thankfully, this is changing, and this website provides resources on the state of the art.
News & Events
Our Work in the Area of Process Data Quality
This section includes our publications in the area of process data quality. The articles cover such topics as "what is process data quality?", "how do I measure data quality?", "what kinds of quality issues affect process data?", "how do I find quality issues in process data?", and "how do I fix quality issues in process data?".