The Camunda Process Engine maintains the state of running process instances inside a database. This includes saving the state of a process instance to the database as it reaches a wait state and reading the state as process execution continues. This database is called the runtime database. In addition to maintaining the runtime state, the process engine creates an audit log providing audit information about executed process instances; this is called the event stream or the history event stream.
Camunda keeps track of all the data that flows through it, as well as the metadata. So if you’ve got a process that is seeking approval for paying an invoice, for example, Camunda tracks:
- All the details of the invoice itself
- Who worked on it last
- How long it sat there before they started working on it
- How long it took them to finish the task
- How long the next person took, etc.
In this Podcast episode, Stuart and Max explore the 5 things that you need to know about Camunda Process History when using the open-source Camunda Platform, an Enterprise Platform for Workflow and Decision Automation.
This tracked data is really useful for process improvement and optimization, but there are drawbacks:
- There’s a limit of 4000 characters for string variables. So, if your data happens to be really long, that’s going to break things. You can get around that by using SPIN (a lightweight wrapper library that provides an easy-to-use API when working with text-based data formats such as XML and JSON), but that has its own pros and cons.
- All the information loaded into Camunda’s head is stored. So, if you’ve got sensitive information as part of your process payload, that could be a security concern.
- Those history tables load up fast! If you’re not actively purging them, they will, 100%, crash your server.
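The length limit in the first point can be caught before variables ever reach the engine. The following is a minimal sketch in plain Python, not Camunda code: `MAX_STRING_LENGTH` mirrors the 4000-character limit mentioned above, and `find_oversized_strings` and the sample variable names are hypothetical.

```python
MAX_STRING_LENGTH = 4000  # limit for string variables described above


def find_oversized_strings(variables, limit=MAX_STRING_LENGTH):
    """Return the names of string variables whose value exceeds the limit."""
    return sorted(
        name
        for name, value in variables.items()
        if isinstance(value, str) and len(value) > limit
    )


# Hypothetical payload for the invoice example.
variables = {
    "invoiceId": "INV-1001",
    "approved": False,
    "invoiceNotes": "x" * 5000,  # too long to store as a plain string variable
}
oversized = find_oversized_strings(variables)
print(oversized)
```

A check like this gives you one place to decide what happens to long values, for example moving them into your own database instead of the process payload.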
How do you work around these drawbacks?
You need to distinguish between your process data and your application data.
- Process data is the information that Camunda needs in order to do its job:
  - Which way to go in a gateway
  - The input to a DMN table
  - Who’s doing the task
  - How long they have to do it
  - The business key
  - RESTful APIs that need to be called
- Application data is the actual payload that the business cares about. The details of the Invoice, who approved it, the amount, etc. This is important business data, but it’s not strictly necessary in order for the Camunda engine to drive a process definition. This is the data you want to store in your internal application database.
Then apply your internal company policies and best practices for application data, while still letting Camunda do what it does best. The business key will act as a tether, connecting your business instance data to the Camunda instance that’s shepherding it.
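As a rough illustration of that split, here is a minimal sketch in plain Python. It does not call Camunda: `PROCESS_VARIABLE_NAMES`, `application_db`, and `split_payload` are hypothetical names, and the dictionary only stands in for a real application database.

```python
import uuid

# Hypothetical list of variable names the process model actually reads.
PROCESS_VARIABLE_NAMES = {"approver", "amountOverLimit", "dueDate"}

# Stand-in for the internal application database (assumption: a real system
# would use a proper data store with its own access and retention rules).
application_db = {}


def split_payload(payload, business_key=None):
    """Store application data under a business key; return only process data."""
    business_key = business_key or str(uuid.uuid4())
    process_variables = {
        k: v for k, v in payload.items() if k in PROCESS_VARIABLE_NAMES
    }
    application_data = {
        k: v for k, v in payload.items() if k not in PROCESS_VARIABLE_NAMES
    }
    application_db[business_key] = application_data
    return business_key, process_variables


payload = {
    "approver": "mary",
    "amountOverLimit": True,
    "invoiceNumber": "INV-1001",
    "amount": 12500.00,
    "supplierBankAccount": "sensitive value",
}
business_key, process_variables = split_payload(payload, business_key="INV-1001")
print(business_key, process_variables)
```

Only `business_key` and `process_variables` would be handed to the engine when the process instance starts. The sensitive and bulky fields stay in the application store, where they can be looked up again by the same business key.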
Keeping process data and application data separated minimizes the size of the process data, but you still need to clean the history tables.
Yes, you still have to purge the history tables, or they break your server. To do that, you need to enable the history cleanup feature. You can trigger it manually, but most people set it to execute automatically. To set that up, you have to:
- Set the Time-To-Live, TTL, for your process definitions, which can be done right in the model. Most people set this to six months.
- Choose a history level. This can be done via the bpm-platform.xml or processes.xml file. Your options are:
  - NONE
  - ACTIVITY
  - AUDIT
  - FULL
  - AUTO
- Configure a clean-up schedule that aligns with your throughput and company policies.
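To make the Time-To-Live idea concrete, the sketch below works out which finished instances a cleanup run could remove. It is a simplified illustration in plain Python, not the engine's actual cleanup algorithm: the function names and the instance records are hypothetical, and 180 days stands in for the six-month TTL mentioned above.

```python
from datetime import datetime, timedelta


def removal_time(end_time, ttl_days):
    """Earliest moment a finished instance becomes eligible for cleanup."""
    return end_time + timedelta(days=ttl_days)


def eligible_for_cleanup(instances, ttl_days, now):
    """Return ids of finished instances whose TTL has expired."""
    return [
        inst["id"]
        for inst in instances
        if inst["end_time"] is not None
        and removal_time(inst["end_time"], ttl_days) <= now
    ]


# Hypothetical history records.
instances = [
    {"id": "a", "end_time": datetime(2023, 1, 10)},
    {"id": "b", "end_time": datetime(2023, 6, 1)},
    {"id": "c", "end_time": None},  # still running, so never eligible
]
expired = eligible_for_cleanup(instances, ttl_days=180, now=datetime(2023, 9, 1))
print(expired)
```

The point of the sketch is that the TTL clock starts when an instance finishes, so long-running instances do not lose history while they are still active.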
What are the top five things you need to know about Camunda Process History?
This podcast discusses 5 things you need to be aware of as “Best Practices” when dealing with Camunda Process History. Stuart and Max discuss the top ways to manage process-related data, including the differences between Camunda Process Data, Task Data, and Process History, the limitations of Camunda process-related data storage, how best to work with Camunda Process History and Process Application data, and how to keep all process-related data cleaned up.