Responsible Handling of Data – Process Mining Rule 2 of 4

This is article no. 2 of the four-part article series Privacy, Security and Ethics in Process Mining.

As with any other data analysis technique, you must be careful with the data once you have obtained it. In many projects, nobody thinks about data handling until the security department brings it up. Be the person who thinks about the appropriate level of protection and has a clear plan before the data is collected.

Do:

  • Have external parties sign a Non-Disclosure Agreement (NDA) to ensure the confidentiality of the data. This applies, for example, to consultants you have hired to perform the process mining analysis for you, or to researchers who are participating in your project. Contact your legal department for this. They will have standard NDAs that you can use.
  • Make sure that the hard drive of your laptop, external hard drives, and USB sticks that you use to transfer the data and your analysis results are encrypted.

Don’t:

  • Give the data set to your co-workers before you have checked what is actually in the data. The data set could contain more information than you requested, or sensitive data that you did not think about. For example, the names of doctors and nurses might be mentioned in a free-text medical notes attribute. Make sure you remove or anonymize (see guideline No. 3) all sensitive data before you pass it on.
  • Upload your data to a cloud-based process mining tool without checking that your organization allows you to upload this kind of data. Instead, use a desktop-based process mining tool (like Disco [3] or ProM [4]) to analyze your data locally, or get the cloud-based process mining vendor to set up an on-premises version of their software within your organization. The same is true for cloud-based storage services like Dropbox: Don’t just store data or analysis results in the cloud, even if it is convenient.
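
The check described in the first "Don't" point can be partly automated. The following Python sketch shows one possible way to do a first pass, assuming the extracted data is available as a CSV file. The column names (`case_id`, `activity`, `timestamp`, `notes`), the set of requested columns, the sample values, and the two search patterns are all hypothetical and would need to be adapted to your own data. A pattern scan like this can only flag obvious candidates; it does not replace a manual review.

```python
import csv
import io
import re

# Hypothetical patterns for values that may identify a person; extend for your data.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "title_and_name": re.compile(r"\b(?:Dr|Prof|Mr|Mrs|Ms)\.?\s+[A-Z][a-z]+"),
}


def audit_event_log(rows, requested_columns):
    """Return (unexpected_columns, findings) for an iterable of dict rows.

    findings maps a column name to the set of pattern names that matched
    at least one value in that column.
    """
    unexpected = set()
    findings = {}
    for row in rows:
        for column, value in row.items():
            if not isinstance(column, str) or not isinstance(value, str):
                continue  # skip malformed rows (missing or surplus fields)
            if column not in requested_columns:
                unexpected.add(column)
            for name, pattern in SENSITIVE_PATTERNS.items():
                if pattern.search(value):
                    findings.setdefault(column, set()).add(name)
    return sorted(unexpected), findings


# Small in-memory sample standing in for an extracted data set (invented values).
SAMPLE_CSV = """case_id,activity,timestamp,notes
1,Admission,2017-01-03 09:00,Patient seen by Dr. Jansen
2,Discharge,2017-01-04 11:30,Follow-up via nurse@example.org
"""

if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
    unexpected, findings = audit_event_log(reader, {"case_id", "activity", "timestamp"})
    print("Columns that were not requested:", unexpected)
    for column, kinds in findings.items():
        print(f"Possible sensitive values in '{column}':", sorted(kinds))
```

In this sample, the `notes` column is reported both as a column that was not requested and as a column with possibly sensitive values, which is the kind of free-text attribute mentioned above.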

Five Illusions about Big Data you can’t help but believe in

Big Data is a smorgasbord of data. Even the marketing world has acknowledged the gravity of Big Data. But alas! Despite having such resplendent data power at our side, we are no closer to making smart marketing decisions than we were before the concept was well known.

So something is definitely not right. Not all of the information circulating about this industry is accurate, and to address this I have highlighted five common misconceptions about Big Data. Know them, work on them, and gain from them.

 

Misconception 1: Human touch surpasses automation

Entrepreneurs are the ones who pull their weight. The human effort they put in yields success for the firm only if it is backed by meaningful data.
“One of the most common misconceptions is that people believe they will always outperform computers in their decision-making process. That may have been the case in the past, but with the complexity of today’s markets and the advancement of technology, this assumption no longer holds true,” says Victor Rosenman, CEO of Feedvisor, the pioneer of Algo-Commerce. He added, “All business owners are constantly required to make critical decisions, and the most effective decisions are not based on gut feelings, but on facts and data.”

Misconception 2: Data leads to more costs

Money makes a business, and the reverse is also true. Small business owners benefit the most from artificial intelligence: AI saves both time and money, and thus helps to raise revenue. Big data would not enjoy its current hot-seat status if it were that expensive to implement. Costs are low now and getting lower. Moreover, besides being inexpensive, big data also helps to curb other costs that the company would otherwise have to bear.

Misconception 3: Data takes the lead in big changes

“The view of cognitive systems as brains that automatically solve any problem is a popular misconception,” IBM’s Brandon Buckner said recently. By this, he meant that technologies support your business instead of taking the lead. Integrated tools are mostly implemented to do things like gauge human expertise and enhance human intelligence. With data, business owners enjoy better decision-making capabilities, which bodes well for future business endeavours.

Misconception 4: Little data is too little to make any impact

Though big data arrests the glowing eyes, little data seizes the mind. Little data is a small set of data. We know that people always look for information in bulk, but at times quality is not what they seek. Sometimes little data can do a job that bulk data fails to do. The information in little data is more restrained, clean and unprecedented.

Misconception 5: Big data for big businesses

You no longer need to shell out ludicrous amounts of money to acquire big data technologies. Non-Fortune 500 companies are also introducing big data into their systems. And the best part is that it is no longer confined to a single sector; it is present in almost every industry.

In 2011, a McKinsey Global Institute report called “Big data: The next frontier for innovation, competition, and productivity” stated: “The use of big data will become a key basis of competition and growth for individual firms.” Now it is 2017, so just think how much Big Data must have grown in size and scope over the past 6 years.

Clarify Goal of the Analysis – Process Mining Rule 1 of 4

This is article no. 1 of the four-part article series Privacy, Security and Ethics in Process Mining.

Clarify Goal of the Analysis

The good news is that in most situations Process Mining does not need to evaluate personal information, because it usually focuses on the internal organizational processes rather than, for example, on customer profiles. Furthermore, you are investigating the overall process patterns. For example, a process miner is typically looking for ways to organize the process in a smarter way to avoid unnecessary idle times rather than trying to make people work faster.

However, as soon as you would like to better understand the performance of a particular process, you often need to know more about other case attributes that could explain variations in process behaviours or performance. And people might become worried about where this will lead them.

Therefore, at the very beginning of the process mining project, you should think about the goal of the analysis. Be clear about how the results will be used. Think about what problem you are trying to solve and what data you need to solve it.

Do:

  • Check whether there are legal restrictions regarding the data. For example, in Germany employee-related data cannot be used and typically would simply not be extracted in the first place. If your project involves analyzing customer data, make sure you understand the restrictions and consider anonymization options (see guideline No. 3).
  • Consider establishing an ethical charter that states the goal of the project, including what will and what will not be done based on the analysis. For example, you can clearly state that the goal is not to evaluate the performance of the employees. Communicate to the people who are responsible for extracting the data what these goals are and ask for their assistance to prepare the data accordingly.

Don’t:

  • Start out with a fuzzy idea and simply extract all the data you can get. Instead, think about what problem you are trying to solve and what data you actually need to solve it. Your project should focus on business goals that can get the support of the process managers you work with (see guideline No. 4).
  • Make your first project too big. Instead, focus on one process with a clear goal. If you make the scope of your project too big, people might block it or work against you before they even understand what process mining can do.
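
One way to put the first "Don't" point into practice is to write down the attributes you need for the stated goal and drop everything else before the data leaves the source system. The following Python sketch illustrates the idea for a CSV export. The column names, including the `employee` column in the sample, and the allow-list itself are hypothetical and only serve as an illustration.

```python
import csv
import io

# Hypothetical allow-list: the attributes needed for the stated analysis goal.
NEEDED_COLUMNS = ["case_id", "activity", "timestamp"]


def keep_needed_columns(source, target, needed=NEEDED_COLUMNS):
    """Copy a CSV event log from source to target, keeping only the needed columns."""
    reader = csv.DictReader(source)
    missing = [c for c in needed if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    # extrasaction="ignore" silently drops every column that is not on the allow-list.
    writer = csv.DictWriter(
        target, fieldnames=needed, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


if __name__ == "__main__":
    # Invented sample export that contains one column nobody asked for.
    raw = (
        "case_id,activity,timestamp,employee\n"
        "1,Check invoice,2017-02-01 10:15,j.smith\n"
    )
    reduced = io.StringIO()
    keep_needed_columns(io.StringIO(raw), reduced)
    print(reduced.getvalue())
```

Working from an allow-list rather than a list of columns to exclude means that attributes added to the source system later are left out by default.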

Privacy, Security and Ethics in Process Mining – Article Series

When I moved to the Netherlands 12 years ago and started grocery shopping at one of the local supermarket chains, Albert Heijn, I initially resisted getting their Bonus card (a loyalty card for discounts), because I did not want the company to track my purchases. I felt that using this information would help them to manipulate me by arranging or advertising products in a way that would make me buy more than I wanted to. It simply felt wrong.

The truth is that no data analysis technique is intrinsically good or bad. It is always in the hands of the people using the technology to make it productive and constructive. For example, while supermarkets could use the information tracked through the loyalty cards of their customers to make sure that we have to take the longest route through the store to get our typical items (passing by as many other products as possible), they can also use this information to make the shopping experience more pleasant, and to offer more products that we like.

Most companies have started to use data analysis techniques to analyze their data in one way or the other. These data analyses can bring enormous opportunities for the companies and for their customers, but with the increased use of data science the question of ethics and responsible use also grows more dominant. Initiatives like the Responsible Data Science seminar series [1] take on this topic by raising awareness and encouraging researchers to develop algorithms that have concepts like fairness, accuracy, confidentiality, and transparency built in (see Wil van der Aalst’s presentation on Responsible Data Science at Process Mining Camp 2016).

Process Mining can provide you with amazing insights about your processes, and fuel your improvement initiatives with inspiration and enthusiasm, if you approach it in the right way. But how can you ensure that you use process mining responsibly? What should you pay attention to when you introduce process mining in your own organization?

In this article series, we provide you with four guidelines that you can follow to prepare your process mining analysis in a responsible way:

Part 1 of 4: Clarify the Goal of the Analysis

Part 2 of 4: Responsible Handling of Data

Part 3 of 4: Consider Anonymization

Part 4 of 4: Establish a Collaborative Culture

Acknowledgements

We would like to thank Frank van Geffen and Léonard Studer, who initiated the first discussions in the workgroup around responsible process mining in 2015. Furthermore, we would like to thank Moe Wynn, Felix Mannhardt and Wil van der Aalst for their feedback on earlier versions of this article.