Introduction
The world in which we live is subject to constant and dynamic change. We now operate in a highly competitive market, where customers' needs and expectations are continuously evolving. Industry requirements are also changing, and many mergers and acquisitions are taking place. This has created many new challenges for organizations. To gain a competitive advantage, organizations must revise, change and improve their strategic business processes quickly and efficiently to avoid losing market share.
Most organizations use information systems to support the execution of their business processes [15]. Examples of information systems supporting operational processes are Workflow Management Systems (WMS) [46], Customer Relationship Management (CRM) systems, Enterprise Resource Planning (ERP) systems, and so on. These information systems typically provide logging capabilities that record what has been executed in the organization. The resulting logs usually contain data about the cases (i.e., process instances) that have been executed in the organization, the times at which the tasks were executed, the persons or systems that performed these tasks, and other kinds of data.
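As an illustration (the field names and values below are invented for this introduction and are not taken from any particular system), a few entries of such a log could be represented in Python as follows:

    event_log = [
        # each event records the case it belongs to, the executed task,
        # the time of execution and the performer
        {"case_id": "order-1", "activity": "register request",
         "timestamp": "2012-12-30 11:02", "resource": "Pete"},
        {"case_id": "order-1", "activity": "check ticket",
         "timestamp": "2012-12-30 12:15", "resource": "Mike"},
        {"case_id": "order-2", "activity": "register request",
         "timestamp": "2012-12-30 14:32", "resource": "system"},
    ]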
These logs are the starting point for process mining and are usually called event logs. The type of data in an event log determines which process mining perspectives can be discovered. If the log (i) records the tasks that are executed in the process and (ii) makes it possible to infer their order of execution and to link these tasks to individual cases (or process instances), then the control-flow perspective can be mined. For many applications, the most useful next step after obtaining the event log is to filter it. Filtering is an iterative process: coarse-grained scoping is done when extracting the data into an event log, while filtering corresponds to fine-grained scoping based on initial analysis results. For example, for process discovery one can decide to focus on the 10 most frequent activities to keep the model manageable [41].
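As a purely illustrative sketch of such fine-grained filtering (the file and column names are assumptions, not taken from a specific log), the following Python/pandas snippet keeps only the events that belong to the 10 most frequent activities:

    import pandas as pd

    # hypothetical event log with one row per event
    log = pd.read_csv("event_log.csv")
    top_activities = log["activity"].value_counts().nlargest(10).index
    filtered_log = log[log["activity"].isin(top_activities)]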
Based on the filtered log, the different types of process
mining can be applied: discovery and conformance. The primary objective of
process mining is to discover process models
based on available event log data. In discovery, there is no a priori model: based on an event log, models can be discovered and constructed from the low-level events. Many techniques exist to automatically construct process models (e.g., in terms of Petri nets) from an event log. In this thesis we focus on three algorithms: the Alpha, Heuristic and Genetic algorithms. In conformance, there is an a priori model. This model is compared with the event log, and discrepancies between the log and the model are analyzed.
Many free and commercial software frameworks for the use and implementation of process mining algorithms have been developed. In this thesis we use an open-source process mining toolkit, the ProM framework.
As mentioned before, there are many process mining algorithms with different theoretical foundations and aims, which raises the question of how to choose the best one for a particular situation. Most of these algorithms perform well on structured processes with few disturbances. In reality, however, it is difficult to determine the scope of a process, and there are typically all kinds of disturbances. As a result, process mining techniques produce spaghetti-like models that are difficult to read and that attempt to merge unrelated cases. There is a need for methods to objectively compare process mining algorithms against known characteristics of business process models and logs.
An approach to overcome this is to cluster process instances (a process instance is manifested as a trace, and an event log corresponds to a multiset of traces) such that each of the resulting clusters corresponds to a coherent set of process instances that can be adequately represented by a process model. To this end, we use the clustering algorithm and the profile concept proposed by Song et al. [32], and we propose a new approach to trace clustering based on logical operators. In our approach we define a different distance measure between traces and cluster centers: we use the XOR operator to calculate the distance between traces and the cluster centers, and we use the AND operator to compute the new cluster centers.
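To make this idea concrete, the following sketch assumes that each trace is encoded as a binary activity profile (1 if the activity occurs in the trace, 0 otherwise); under that assumption the XOR distance is simply the number of positions in which two profiles differ, and the AND operator keeps only the activities shared by all traces of a cluster. The function names and the k-means-style loop below are illustrative and do not reproduce the exact procedure detailed later in the thesis.

    def xor_distance(profile, center):
        # number of positions where the two binary vectors differ (XOR)
        return sum(p ^ c for p, c in zip(profile, center))

    def and_center(profiles):
        # new cluster center: bitwise AND over the profiles of the cluster
        return [int(all(p[i] for p in profiles)) for i in range(len(profiles[0]))]

    def cluster_traces(profiles, centers, iterations=10):
        # simple iterative assignment/update loop using the two operators
        for _ in range(iterations):
            clusters = [[] for _ in centers]
            for profile in profiles:
                nearest = min(range(len(centers)),
                              key=lambda k: xor_distance(profile, centers[k]))
                clusters[nearest].append(profile)
            centers = [and_center(c) if c else centers[k]
                       for k, c in enumerate(clusters)]
        return clusters, centers

    # toy usage: activity universe {a, b, c, d}, three traces, two initial centers
    profiles = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1]]
    clusters, centers = cluster_traces(profiles, [[1, 1, 0, 0], [0, 0, 1, 1]])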
The rest of this thesis is organized as follows:
In the first chapter we give an overview of business processes, starting with a short definition, their management, their life cycle, and some business process modeling languages. We then present the main process mining concepts, such as event logs, log filtering, process mining perspectives and control-flow discovery. These two sections are followed by a section on the evaluation of process mining, in which different evaluation metrics are presented.
The second chapter is devoted to the presentation of the ProM framework, which is a powerful process mining tool. In this chapter we mainly present some mining plug-ins.