CAV 2014 Measurements

↓ To the results ↓

On this page we provide all of our measurements comparing some state-of-the-art model checkers to our algorithm. The tools and algorithms included in the measurements are the following:

NuSMV2: traditional BDD-based fixed point computations,
NuXMV (NuSMV3): the newest generation of SAT-based model checking algorithms (so-called IC3), various abstraction and reduction techniques,
ITS-LTL: the newest variant of saturation based model checking algorithms based on different automaton representations and optimizations,
Hybrid MC: the prototype of our algorithm presented in the submitted paper currently implemented in C#.

The models are traditional asynchronous benchmark models [1,2]. In addition we used an industrial case-study of a safety logic in a nuclear power plant from [3]:

Counter models a simple binary N-bit counter.
The Round Robin model (RR-N) is a resource sharing protocol and Slotted Ring (SR-N) is the model of a communication protocol with N participants.
DPhil is the well-known dining philosophers model.
The Flexible Manufacturing System (FMS-N) and Kanban models a production system. The parameter N refers to the complexity of the model checking problem.
PRISE example is a small, but important safety-critical industrial system: the safety function initiating an emergency procedure in a nuclear power plant. As verification requirement we used a complex fairness property stating that the initialization can be infinitely often activated under dangerous situations.

The following table shows the different models and expressions we evaluated in each tool, with the runtimes of the algorithms. To reproduce our results, one may download the following archives, in which we give our models, expressions and scripts, in some cases together with the tool itself. To run measurements in PetriDotNet, please refer to the README file in the Application folder. The other tools are run using the "run-timeout.bat" or "run_timeout.sh" script files. Be aware that NuSMV and NuXMV may require installation and modification of environment variables.

WARNING: The following packages contain executables in order to provide a way to exactly reproduce results, use them only for this purpose. If you need any of the tools for other use, the newest versions are available from their official websites with the proper licensing (all of them are free for personal and academic use).

All measurements were done on a laptop with Intel^® Core™ i3-350M CPU (3M Cache, 2.26 GHz), 8 GB 1333 MHz DDR3 RAM, and a Samsung 840 EVO SSD. ITS and Hybrid MC were run on Microsoft Windows 8, while runtimes of NuSMV and NuXMV were measured on Ubuntu 13.10.

Memory consumption is not measured, as Hybrid MC is run in a managed environment (.Net), and the garbage collection mechanism could make the measurements not reproducible (memory consumption highly depends on the schedule of the GC). Experiments showed that Hybrid MC uses only a small amount of memory (despite it is run in a managed environment), and ITS-LTL is also efficient in terms of memory usage. NuSMV and NuXMV both has problems with memory, since they need to transform the input model to a low-level formalism, while Hybrid-MC and ITS both work on high-level models. Comparison to these tools is justified by the fact that they are well-known tools with stable algorithms, and highly poopular on the market.

Due to the limited precision of measurements, runtimes less than 0.01 are just signed "<0.01". Timeout was set to 1200s, cases where algorithms did not finish within this time limit are denoted by ">1200.00". When an algorithm ran out of memory, the table shows "memory" instead of the runtime. The sign "---" marks cases where the conversion of benchmark models to the tool's format was not done due to technical reasons (such as format incompatibilities, the size of the resulting model, or simply the fact that model checking of smaller models with the corresponding tool already failed to complete).

SCC detection statistics show some information about the performance of our hybrid approach. FP denotes the number of incremental symbolic fixed point computations actually started. R shows how many such computations were prevented by observing recurrent states, while A shows the same for explicit runs on abstractions.

Model	Expression	Result	Runtime (s)				SCC detection statistics⁴
Model	Expression	Result	NuSMV2¹	NuXMV¹	ITS-LTL²	Hybrid MC³	FP	R	A
Counter-10	!G(bit9=0)	true	0.01 (0.02)	0.01 (0.02)	---	<0.01 (<0.01)	0	2	23
Counter-15	!G(bit14=0)	true	0.21 (0.23)	0.02 (0.05)	---	<0.01 (<0.01)	0	2	38
Counter-20	!G(bit19=0)	true	15.30 (15.36)	0.02 (0.07)	---	<0.01 (0.01)	0	2	53
Counter-50	!G(bit49=0)	true	>1200.00	0.04 (0.70)	---	0.01 (0.02)	0	2	143
DPhil-50	F(HasLeft1=1&HasRight1=1)	false	>1200.00	2.15 (108.58)	0.20	0.01 (0.03)	1	2	5
	F(HasLeft50=1&HasRight50=1)	false	>1200.00	2.31 (108.24)	0.20	<0.01 (0.02)	1	1	3
	GF(HasRight1=1) -> GF(HasLeft1=1&HasRight1=1)	false	>1200.00	176.72 (281.71)	0.20	0.01 (0.03)	8	14	4
	GF(HasRight50=1) -> GF(HasLeft50=1&HasRight50=1)	false	>1200.00	178.87 (284.28)	0.22	0.13 (0.15)	7	12	6
DPhil-100	F(HasLeft1=1&HasRight1=1)	false	>1200.00	>1200.00	0.64	0.01 (0.08)	1	2	5
	F(HasLeft100=1&HasRight100=1)	false	>1200.00	>1200.00	0.63	0.01 (0.07)	1	1	3
	GF(HasRight1=1) -> GF(HasLeft1=1&HasRight1=1)	false	>1200.00	>1200.00	0.64	0.01 (0.08)	8	14	4
	GF(HasRight100=1) -> GF(HasLeft100=1&HasRight100=1)	false	>1200.00	>1200.00	0.64	0.51 (0.58)	7	12	6
DPhil-200	F(HasLeft1=1&HasRight1=1)	false	memory	memory	2.30	0.02 (0.31)	1	2	5
	F(HasLeft200=1&HasRight200=1)	false	memory	memory	2.16	0.01 (0.29)	1	1	3
	GF(HasRight1=1) -> GF(HasLeft1=1&HasRight1=1)	false	memory	memory	2.45	0.04 (0.32)	8	14	4
	GF(HasRight200=1) -> GF(HasLeft200=1&HasRight200=1)	false	memory	memory	2.19	2.18 (2.44)	7	12	6
DPhil-300	F(HasLeft1=1&HasRight1=1)	false	---	---	5.19	0.04 (0.67)	1	2	5
	F(HasLeft300=1&HasRight300=1)	false	---	---	5.03	0.02 (0.64)	1	1	3
	GF(HasRight1=1) -> GF(HasLeft1=1&HasRight1=1)	false	---	---	5.14	0.06 (0.69)	8	14	4
	GF(HasRight300=1) -> GF(HasLeft300=1&HasRight300=1)	false	---	---	5.13	5.25 (5.89)	7	12	6
FMS-5	F(P1s=P2s=P3s=5)	false	>1200.00	0.03 (0.64)	0.13	<0.01 (0.01)	1	28	17
FMS-5	!G(P1wP2>0&P2=P3=5)	true	>1200.00	0.02 (0.63)	0.02	<0.01 (<0.01)	0	0	0
FMS-10	F(P1s=P2s=P3s=10)	false	>1200.00	0.02 (1.38)	0.53	0.01 (0.01)	1	53	37
FMS-10	!G(P1wP2>0&P2=P3=10)	true	>1200.00	0.02 (1.36)	0.03	<0.01 (<0.01)	0	0	0
FMS-20	F(P1s=P2s=P3s=20)	false	>1200.00	0.03 (1.54)	3.70	0.02 (0.02)	1	103	77
FMS-20	!G(P1wP2>0&P2=P3=20)	true	>1200.00	0.03 (1.56)	0.03	<0.01 (<0.01)	0	0	0
Kaban-10	!G(Pout1>0\|Pout2>0\|Pout3>0\|Pout4>0)	true	0.02 (0.47)	0.02 (0.50)	0.02	<0.01 (<0.01)	0	0	0
Kaban-10	!G(Pback2=9\|Pback3=9)	true	0.07 (0.52)	0.02 (0.49)	0.02	<0.01 (<0.01)	0	0	0
Kanban-20	!G(Pout1>0\|Pout2>0\|Pout3>0\|Pout4>0)	true	0.71 (1.26)	0.02 (0.60)	0.02	<0.01 (<0.01)	0	0	0
Kanban-20	!G(Pback2=19\|Pback3=19)	true	9.51 (10.07)	0.02 (0.62)	0.03	<0.01 (<0.01)	0	0	0
Kanban-30	!G(Pout1>0\|Pout2>0\|Pout3>0\|Pout4>0)	true	0.76 (1.34)	0.03 (0.64)	0.03	<0.01 (<0.01)	0	0	0
Kanban-30	!G(Pback2=29\|Pback3=29)	true	13.64 (14.22)	0.03 (0.65)	0.02	<0.01 (<0.01)	0	0	0
RR-10	!G(True)	false	>1200.00	0.08 (1.68)	0.08	0.03 (0.03)	1	110	186
RR-50	!G(True)	false	>1200.00	4.47 (813.12)	0.64	1.28 (1.30)	1	2550	2986
RR-100	!G(True)	false	memory	memory	2.13	8.86 (8.96)	1	10100	10986
SR-5	!G(C1=0)	true	32.03 (32.36)	0.02 (0.38)	0.03	<0.01 (<0.01)	0	0	0
	!G(C5 = 0)	true	14.45 (14.77)	0.04 (0.40)	0.03	<0.01 (<0.01)	0	0	0
	!G(H1=0\|G1=0);	false	39.43 (39.77)	0.03 (0.40)	0.06	0.02 (0.02)	1	4	95
	GF(H1=1) -> GF(H1=1&G1=1)	false	>1200.00	>1200.00	0.12	0.06 (0.06)	1	66	333
SR-10	!G(C1=0)	true	>1200.00	0.10 (2.01)	0.03	<0.01 (<0.01)	0	0	0
	!G(C10 = 0)	true	>1200.00	0.10 (1.98)	0.03	<0.01 (<0.01)	0	0	0
	!G(H1=0\|G1=0);	false	>1200.00	0.11 (1.99)	0.22	0.07 (0.07)	1	7	433
	GF(H1=1) -> GF(H1=1&G1=1)	false	>1200.00	>1200.00	1.08	0.37 (0.37)	1	111	3444
SR-50	!G(C1=0)	true	>1200.00	6.11 (595.26)	0.19	<0.01 (0.03)	0	0	0
	!G(C50 = 0)	true	>1200.00	5.87 (595.61)	0.19	<0.01 (0.04)	0	0	0
	!G(H1=0\|G1=0);	false	>1200.00	6.10 (597.61)	18.33	2.68 (2.71)	1	27	19423
	GF(H1=1) -> GF(H1=1&G1=1)	false	>1200.00	>1200.00	807.58	32.97 (33.00)	1	351	368021
SR-100	!G(C1=0)	true	memory	memory	0.58	<0.01 (0.14)	0	0	0
	!G(C100 = 0)	true	memory	memory	0.58	<0.01 (0.14)	0	0	0
	!G(H1=0\|G1=0);	false	memory	memory	168.73	16.25 (16.39)	1	52	119473
	GF(H1=1) -> GF(H1=1&G1=1)	false	memory	memory	>1200.00	251.91 (252.06)	1	651	2812346
PRISE-S	G(P15OUTPUT_2!=true U (P15OUTPUT_2!=true& P18outPUT_1=true))	false	>1200.00	>1200.00	---	4.98 (5.13)	1	69435	13180
	!GF(P15OUTPUT_2=true\|P15OUTPUT_2=false)	false	>1200.00	>1200.00	---	4.19 (4.21)	1	64925	0
	!GF(P15OUTPUT_2=true) -> GF(P18outPUT_1=true)	false	>1200.00	>1200.00	---	5.05 (5.08)	1	74436	13283
	GF(P15OUTPUT_2=true) -> GF(P18outPUT_1=true)	false	>1200.00	>1200.00	---	7.55 (7.58)	1	152486	18
PRISE-M	G(P15OUTPUT_2!=true U (P15OUTPUT_2!=true& P18outPUT_1=true))	false	---	---	---	12.51 (12.73)	1	179680	32968
	!GF(P15OUTPUT_2=true\|P15OUTPUT_2=false)	false	---	---	---	10.64 (10.67)	1	163437	0
	!GF(P15OUTPUT_2=true) -> GF(P18outPUT_1=true)	false	---	---	---	12.64 (12.67)	1	192437	33347
	GF(P15OUTPUT_2=true) -> GF(P18outPUT_1=true)	false	---	---	---	20.12 (20.14)	1	389210	36

¹ Runtime of the algorithm, total runtime with loading and conversion time shown in parentheses.

² Runtime results provided by the tool itself.

³ Runtime of state space exploration and model checking, total runtime with initialization and the building of the property automaton is shown in parentheses.

⁴ Order of applying these techniques is the following: (non-empty state and transition set) → (presence of accepting states) → presence of recurrent states → explicit search in the abstraction → incremental symbolic fixed-point computation. This way, the shown metrics mark cases where the technique was actually applied and broke the chain.

[1] Model Checking Contest: http://mcc.lip6.fr/models.php

[2] G. Ciardo, R. Marmorstein, and R. Siminiceanu. Saturation unbound. In Proc. Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2003), LNCS 2619, pages 379-393. Warsaw, Poland, Apr. 2003. Springer-Verlag.

[3] Tamás Bartha, András Vörös, Attila Jámbor, Dániel Darvas: Verification of an Industrial Safety Function Using Coloured Petri Nets and Model Checking. In: 14th International Conference on Modern Information Technology in the Innovation Processes of the Industrial Entreprises (MITIP 2012), pp. 472-485, Budapest, Hungary, 10/2012.

Elsődleges linkek

Primary Links

CAV 2014 Measurements

Nyelvek

Belépés

Leírások

Navigáció