One of the pitfalls of multithreaded programming is deadlock, a situation where each thread exclusively holds one resource while waiting for another's resource. Every non-trivial multithreaded program must contend with deadlocks. One strategy is to detect a deadlock at runtime and take some action to remove it (e.g. send a quit signal to one of the threads). Another approach is to design the program carefully to avoid deadlocks. In practice, this can be a difficult task. The Petri net framework presented in this article supports a hybrid approach, combining careful design with runtime checks to create a deadlock-free process.
Like UML activity diagrams, Petri nets are graphical representations of processes providing a state-oriented view and an activity-oriented view. In other words, Petri nets simultaneously represent the state of a system and what the system is doing. What makes Petri nets powerful is their semantics; formal mathematical analysis techniques can be used to determine characteristics of a given net. For example, it can be proven that a particular net with the right initial conditions will not reach a deadlocked state.
This article briefly discusses the properties of Petri nets and presents a demonstration of a (intentionally) poorly designed application using a C++ framework with Win32 synchronization objects. The framework supports rapid implementation of a process that has been described with a Petri net and is capable of runtime or post-build testing for deadlocks.
Petri Nets
The Petri Net is named after C.A. Petri, who developed the concept as part of his doctoral thesis in 1962. In mathematical terms, a Petri net is a directed bipartite graph. The two types of vertices are called places and transitions. The directed edges, known as arcs, connect places to transitions, and transitions to places.
A place is a container that holds zero or more tokens. The set of places and the number of tokens in each represents the state of the system. This set is called a marking. A transition represents an activity of the system. The arcs that point towards a transition are input arcs, and those that point towards a place are output arcs. Each arc has an associated arc expression, which indicates how many tokens will be added to or removed from a place when the transition executes. The arc expression is usually 1, in which case it is omitted from drawings.
A transition is considered enabled when enough tokens exist in the places connected to its input arcs for all input arc expressions to be satisfied. Only enabled transitions can execute. After an enabled transition has completed, tokens are added to the places connected to its output arcs according to the output arc expressions. The net now has a new marking, and a new set of enabled transitions exists. A dead marking, or deadlocked state, is one where the set of enabled transitions is empty.
Figure 1 is a Petri net representation of a theoretical, and flawed, file printing application with its initial marking displayed. The application has a hold-and-wait problem and will deadlock. Places P0, P3, P6, and P7 contain one token each. The tokens at P0 and P3 represent the number of threads which can execute in the left chain of execution and the right chain, respectively. The tokens at P6 and P7 represent a lock on a file or printer resource, respectively. A single token in each resource indicates that the locks are exclusive.
Concurrency and Conflict
The initial marking M0 of the example Petri net can also be described as the set of enabled transitions; M0 = { T0, T3 } . If T0 fires first, T3 is still enabled, and vice versa. T0 and T3 are enabled concurrently. By systematically tracing execution of the Petri net and the evolution of its markings we build what is called the occurrence graph, the graph of all markings reachable from the initial marking. Firing T0 gives M1 = { T1, T3 } . If T1 fires first, T3 is disabled, and vice versa. This situation is called a conflict. T1 and T3 are enabled, but not concurrently.
The most efficient multithreaded applications would maximize concurrent states and minimize conflicted states, and Petri net analysis can help in their design.
Continuing our systematic execution, T1 gives M2 = { T2 } , and T2 gives M0 again. No problems yet. Let's return to M1 , but this time T3 fires first, giving M3 = { } . Deadlock. If through some intervention we were to give T0 and T1 priority over T3 , we will have created a live-lock situation. The chain on the right ( T3 , T4 , T5 ) would never execute.
The Framework
The Petri net framework is anchored by the CPetriNet class, which aggregates Place , Transition , Arc , and Thread . Listing 1 shows how a net is constructed and operated in a console application main() function. Both Place and Transition inherit from Node , which keeps track of arc connections. For details on these classes see the full source archive.
int _tmain(int argc, _TCHAR* argv[]) { CPetriNet net; // places 0-5 are generic for(int k = 0; k < 6; ++k) { net.AddPlace(new CPlace); } // places 6, 7 are specific net.AddPlace(new CPlaceFile); net.AddPlace(new CPlacePrinter); // transitions 0-5 net.AddTransition(new CTransitionGetFile); net.AddTransition( new CTransitionGetPrinter); net.AddTransition(new CTransitionPrint); net.AddTransition( new CTransitionGetPrinter); net.AddTransition(new CTransitionGetFile); net.AddTransition(new CTransitionPrint); // string describing all arc connections string strArcs = "p0t0t0p1p1t1t1p2p2t2t2p0p3t3t3p4p4t4" \ "t4p5p5t5t5p3p6t0p6t4t2p6t5p6p7t1p7t3" \ "t2p7t5p7;"; net.MakeConnections(strArcs); // set initial marking net.GetPlace(0)->AddTokens(1); net.GetPlace(3)->AddTokens(1); net.GetPlace(6)->AddTokens(1); net.GetPlace(7)->AddTokens(1); // create two threads net.AddThread(new CThread); net.AddThread(new CThread); if(!net.Test()) { cout << "ERROR: deadlock state exists" << endl; return 1; } net.Start(); // run demonstration for 30 seconds Sleep(30000); net.Stop(); return 0; }
Listing 1
Places 0-5 in the example are generic Petri net places, which provide valuable state information but do not represent resources. Resource classes inherit from Place and implement an additional interface to the resource.
Transition is an abstract class. Users implement the Execute() method in subclasses. Each transition is executed by the next available thread. It's important not to think of the execution chains in Figure 1 as cycles for a single thread. Resource interfaces in classes inheriting from Place must not use per-thread synchronization or locking mechanisms. A properly constructed net provides the necessary locking.
Execute() methods use the Visitor pattern (Gamma et al., 1995) to access Place resources. Classes derived from Place define an Accept() method for each client class derived from Transition. Execute() methods call Accept() on each connected place in some defined order, as shown below.
void DerivedTransition::Execute() { arcInput_[1]-> GetPlace()->Accept(*this); // visit other places. . . }
One consequence of Visitor is that the base class Place must declare Accept() methods for all derived Transition types. To preserve the framework-like design of Place , a PlaceNodeVisitor interface class defines empty Accept() methods. The Place class uses multiple inheritance to expose a modifiable PlaceNodeVisitor interface without requiring changes to Place 's definition.
Another consequence of Visitor is that Transition classes become stateful, essentially duplicating data from a Place to be used by Accept() calls to subsequent Place s. An alternative design might use runtime type information (RTTI) to access connected derived Place s with dynamic_cast :
void DerivedTransition::Execute() { DerivedPlace* p = dynamic_cast<DerivedPlace*>( arcInput_[1]->GetPlace() ); assert(p != 0); // use DerivedPlace's methods. . . }
The RTTI design does away with PlaceNodeVisitor and allows purely behavioural derived Transition classes to use multiple derived Place s within a single function body.
Both designs tie the identity of a specific resource to the order in which it is connected by the MakeConnections() function, determined by the ordering of the string describing the list of connections. In the RTTI design, asserting that the cast pointer is not null is a good debug check.
The largest number of concurrently enabled transitions in any marking in the occurrence graph determines the maximum number of threads that can process the net. Presently this would be set during design, but a function could feasibly be written to create the appropriate number of threads at runtime.
Net Processing
Figure 2 is a state chart of Thread processing of the net. Processing is started by a call to PetriNet::Start() . This calculates the set of enabled transitions from the initial token placement and creates a Win32 semaphore with an initial count of 1 and a maximum count equal to the number of threads. A Win32 semaphore is a synchronization object that decrements its count for each thread it allows to pass and blocks threads while its count is zero. When the thread releases the semaphore its count is incremented.
Operations that change the internal state of a Petri net (e.g. adding or removing tokens) must be performed atomically. PetriNet contains a Win32 mutex for this purpose. A mutex allows a single thread to pass and blocks all other threads until it is released. Mutex is short for mutual exclusion. A thread can gain exclusive access to the net by creating a ScopedLock object. ScopedLock 's constructor accepts a PetriNet reference and is a friend class to PetriNet , so it can acquire a lock on PetriNet 's mutex. When the created lock object goes out of scope its destructor releases PetriNet 's mutex.
A new marking is calculated twice per loop, the first time after removing tokens from input places. This marking may be empty due to a conflict or because only one transition was enabled in the previous marking, but this does not produce deadlock. If the new marking calculated after removing tokens still has enabled transitions, the semaphore is released, enabling any waiting thread to process a remaining transition concurrently.
After executing the transition and adding tokens to output places, a new marking is calculated again. To prevent a live-lock, the order of the calculated marking is shuffled randomly so that the same transition in a conflicted pair is not picked first every time. If the marking is empty the semaphore is not released and the system is deadlocked.
Post-build Deadlock Testing
The PetriNet::Test() method builds the occurrence graph by calculating every marking reachable from the initial marking (without the random shuffle). Intermediate (post remove, pre add) markings are not considered here. If an empty marking is found the test fails and the function returns false. The test algorithm in pseudo-code looks like this:
Calculate set of enabled transitions from initial token placement. If set is empty declare failure. Name the initial set, count it as unvisited and add it to a list. Call the initial set the current set. While there are unvisited sets: Take the first unvisited transition in the current set. Push the transition and the name of the current set onto a stack. Remove tokens from the places connected to the transition's inputs. Add tokens to the places connected to the transition's outputs. Mark this transition in this set visited. Calculate the new set. If set is empty declare failure. Else if set not in list: Name new set. Add it to the list. Mark it unvisited. Make it the current set. End If all transitions in the current set have been visited: Declare the set visited. Undo the transition token move at the top of the stack. Make the set at the top of the stack the current set. End End
In the example application Test() is called prior to Start() to prevent a deadlocked net from running. Test() works without executing any of the net's transitions. A practical application could be executed with a command line switch that causes main() to call Test() and return an exit code. This could be performed as a post-build step in development. This feature would be helpful if the structure of the net or the number of resources were undergoing design.
Fixing the Demo Application
Suppose instead of a printer resource we constructed a print queue resource that took file submissions in an atomic operation. With this change to the resource we would obtain not an exclusive lock on a printer but a lock on a queue location. Adding a second token to the printer resource in the initial marking and building the occurrence graph proves that a two-position printer queue would prevent deadlock. The markings are as follows:
M0 = { T0, T3 } M1 = { T1, T3 } M2 = { T2, T3 } M3 = { T2 } M4 = { T0, T4 } (conflict) M5 = { T1 } M6 = { T5 }
Figure 3 shows the output of the demo application with two initial printer tokens. The results of each marking calculation are printed as well, with the token placement following the set of enabled transitions.
start { 0 3 } [ 1 0 0 1 0 0 1 2 ] { 0 } [ 1 0 0 0 0 0 1 1 ] thread 3548 gets printer { } [ 0 0 0 0 0 0 0 1 ] { } [ 0 0 0 0 1 0 0 1 ] thread 3648 gets file { 1 } [ 0 1 0 0 1 0 0 1 ] { } [ 0 0 0 0 1 0 0 0 ] thread 3548 gets printer { 2 } [ 0 0 1 0 1 0 0 0 ] { } [ 0 0 0 0 1 0 0 0 ] thread 3648 PRINTING: Hello World! { 0 4 } [ 1 0 0 0 1 0 1 1 ] { } [ 1 0 0 0 0 0 0 1 ] thread 3548 gets file { 5 } [ 1 0 0 0 0 1 0 1 ] { } [ 1 0 0 0 0 0 0 1 ] thread 3648 PRINTING: Hello World! { 3 0 } [ 1 0 0 1 0 0 1 2 ] { 3 } [ 0 0 0 1 0 0 0 2 ] thread 3548 gets file { } [ 0 0 0 0 0 0 0 1 ] thread 3648 gets printer { 1 } [ 0 1 0 0 0 0 0 1 ] { 1 } [ 0 1 0 0 1 0 0 1 ] { } [ 0 0 0 0 1 0 0 0 ] thread 3548 gets printer { 2 } [ 0 0 1 0 1 0 0 0 ] { } [ 0 0 0 0 1 0 0 0 ] thread 3648 PRINTING: Hello World! { 4 0 } [ 1 0 0 0 1 0 1 1 ] { } [ 0 0 0 0 1 0 0 1 ] thread 3548 gets file { 1 } [ 0 1 0 0 1 0 0 1 ] { } [ 0 0 0 0 1 0 0 0 ] thread 3648 gets printer { 2 } [ 0 0 1 0 1 0 0 0 ] { } [ 0 0 0 0 1 0 0 0 ] thread 3548 PRINTING: Hello World! { 0 4 } [ 1 0 0 0 1 0 1 1 ] { } [ 1 0 0 0 0 0 0 1 ] thread 3648 gets file { 5 } [ 1 0 0 0 0 1 0 1 ] { } [ 1 0 0 0 0 0 0 1 ] thread 3548 PRINTING: Hello World! { 3 0 } [ 1 0 0 1 0 0 1 2 ] . . .continues. . .
Figure 3
Future Work
There is a lot of room for future improvement of the framework. A feature of Petri nets not implemented yet is the inhibitor arc, in which the presence of tokens in the connected place inhibits the connected transition. Concepts from higherlevel Petri nets would add powerful functionality. For example, colored Petri nets allow tokens to have a type (or color) property. This enables the use of complex arc expressions involving type and quantity and makes possible decision branches as part of the net structure.
Conclusion
A multithreaded process designed using Petri net analysis might be deployed rapidly enough using this framework to justify the added runtime costs. For more information on Petri nets two references are listed below [ Marsan-et-al , Jensen ].
References
[Marsan-et-al] Marsan, M. Ajmone et al. 1995. Modelling with Generalized Stochastic Petri Nets . Chichester: John Wiley & Sons.
[Jensen] Jensen, Kurt. 1996. Colored Petri Nets, vol. 1 . 2nd ed. Berlin: Springer-Verlag.
[Gamma-et-al] Gamma, Erich et al. 1995. Design Patterns . Reading: Addison-Wesley.