File: README_MSMProp2.txt

package info (click to toggle)
valgrind 1%3A3.10.0-4~bpo7%2B1
  • links: PTS, VCS
  • area: main
  • in suites: wheezy-backports
  • size: 97,940 kB
  • sloc: ansic: 589,429; xml: 21,096; exp: 8,751; cpp: 7,366; asm: 6,526; perl: 5,656; sh: 5,334; makefile: 4,946; haskell: 195
file content (156 lines) | stat: -rw-r--r-- 5,539 bytes parent folder | download | duplicates (9)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156

MSMProp2, a simplified but functionally equivalent version of MSMProp1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Julian Seward, OpenWorks Ltd, 19 August 2008
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Note that this file does NOT describe the state machine used in the
svn://svn.valgrind.org/branches/YARD version of Helgrind.  That state
machine is different again from any previously described machine.

See the file README_YARD.txt for more details on YARD.

                     ----------------------

In early 2008 Konstantin Serebryany proposed "MSMProp1", a memory
state machine for data race detection.  It is described at
http://code.google.com/p/data-race-test/wiki/MSMProp1

Implementation experiences show MSMProp1 is useful, but difficult to
implement efficiently.  In particular keeping the memory usage under
control is complex and difficult.

This note points out a key simplification of MSMProp1, which makes it
easier to implement without changing the functionality.


The idea
~~~~~~~~

The core of the idea pertains to the "Condition" entry for MSMProp1
state machine rules E5 and E6(r).  These are, respectively:

    HB(SS, currS)  and its negation
    ! HB(SS, currS).

Here, SS is a set of segments, and currS is a single segment.  Each
segment contains a vector timestamp.  The expression "HB(SS, currS)"
is intended to denote

   for each segment S in SS  .  happens_before(S,currS)

where happens_before(S,T) means that S's vector timestamp is ordered
before-or-equal to T's vector timestamp.

In words, the expression

   for each segment S in SS  .  happens_before(S,currS)

is equivalent to saying that currS has a timestamp which is
greater-than-equal to the timestamps of all the segments in SS.

The key observation is that this is equivalent to

   happens_before( JOIN(SS), currS )

where JOIN is the lattice-theoretic "max" or "least upper bound"
operation on vector clocks.  Given the definition of HB,
happens_before and (binary) JOIN, this is easy to prove.


The consequences
~~~~~~~~~~~~~~~~

With that observation in place, it is a short step to observe that
storing segment sets in MSMProp1 is unnecessary.  Instead of
storing a segment set in each shadow value, just store and
update a single vector timestamp.  The following two equivalences
hold:

   MSMProp1                        MSMProp2

   adding a segment S              join-ing S's vector timestamp
   to the segment-set              to the current vector timestamp

   HB(SS,currS)                    happens_before(
                                      currS's timestamp,
                                      current vector timestamp )

Once it is no longer necessary to represent segment sets, it then
also becomes unnecessary to represent segments.  This constitutes
a significant simplication to the implementation.


The resulting state machine, MSMProp2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

MSMProp2 is isomorphic to MSMProp1, with the following changes:

States are    New,   Read(VTS,LS),   Write(VTS,LS)

where LS is a lockset (as before) and VTS is a vector timestamp.

For a thread T with current lockset 'currLS' and current VTS 'currVTS'
making a memory access, the new rules are

Name  Old-State         Op  Guard         New-State              Race-If

E1  New                 rd  True          Read(currVTS,currLS)   False

E2  New                 wr  True          Write(currVTS,currLS)  False

E3  Read(oldVTS,oldLS)  rd  True          Read(newVTS,newLS)     False

E4  Read(oldVTS,oldLS)  wr  True          Write(newVTS,newLS)    #newLS == 0 
                                                                 && !hb(oldVTS,currVTS)

E5  Write(oldVTS,oldLS) rd  hb(oldVTS,    Read(currVTS,currLS)   False
                               currVTS)

E6r Write(oldVTS,oldLS) rd  !hb(oldVTS,   Write(newVTS,newLS)    #newLS == 0 
                                currVTS)                         && !hb(oldVTS,currVTS)

E6w Write(oldVTS,oldLS) wr  True          Write(newVTS,newLS)    #newLS == 0 
                                                                 && !hb(oldVTS,currVTS)

   where newVTS = join2(oldVTS,currVTS)

         newLS  = if   hb(oldVTS,currVTS)
                  then currLS
                  else intersect(oldLS,currLS)

         hb(vts1, vts2) =  vts1 happens before or is equal to vts2


Interpretation of the states
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I always found the state names in MSMProp1 confusing.  Both MSMProp1
and MSMProp2 are easier to understand if the states Read and Write are
renamed, like this:

   old name           new name

   Read               WriteConstraint
   Write              AllConstraint

The effect of a state Read(VTS,LS) is to constrain all later-observed
writes so that either (1) the writing thread holds at least one lock
in common with LS, or (2) those writes must happen-after VTS.  If
neither of those two conditions hold, a race is reported.

Hence a Read state places a constraint on writes.

The effect of a state Write(VTS,LS) is similar, but it applies to all
later-observed accesses: either (1) the accessing thread holds at
least one lock in common with LS, or (2) those accesses must
happen-after VTS.  If neither of those two conditions hold, a race is
reported.

Hence a Write state places a constraint on all accesses.

If we ignore the LS component of these states, the intuitive
interpretation of the VTS component is that it states the earliest
vector-time that the next write / access may safely happen.