OpenCores
URL https://opencores.org/ocsvn/openrisc/openrisc/trunk

Subversion Repositories openrisc

[/] [openrisc/] [trunk/] [gnu-dev/] [or1k-gcc/] [libitm/] [libitm.texi] - Blame information for rev 744

Go to most recent revision | Details | Compare with Previous | View Log

Line No. Rev Author Line
1 737 jeremybenn
\input texinfo @c -*-texinfo-*-
2
 
3
@c %**start of header
4
@setfilename libitm.info
5
@settitle GNU libitm
6
@c %**end of header
7
 
8
 
9
@copying
10
Copyright @copyright{} 2011 Free Software Foundation, Inc.
11
 
12
Permission is granted to copy, distribute and/or modify this document
13
under the terms of the GNU Free Documentation License, Version 1.2 or
14
any later version published by the Free Software Foundation; with no
15
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
16
A copy of the license is included in the section entitled ``GNU
17
Free Documentation License''.
18
@end copying
19
 
20
@ifinfo
21
@dircategory GNU Libraries
22
@direntry
23
* libitm: (libitm).                    GNU Transactional Memory Library
24
@end direntry
25
 
26
This manual documents the GNU Transactional Memory Library.
27
 
28
@insertcopying
29
@end ifinfo
30
 
31
 
32
@setchapternewpage odd
33
 
34
@titlepage
35
@title The GNU Transactional Memory Library
36
@page
37
@vskip 0pt plus 1filll
38
@comment For the @value{version-GCC} Version*
39
@sp 1
40
@insertcopying
41
@end titlepage
42
 
43
@summarycontents
44
@contents
45
@page
46
 
47
 
48
@node Top
49
@top Introduction
50
@cindex Introduction
51
 
52
This manual documents the usage and internals of libitm, the GNU Transactional
53
Memory Library. It provides transaction support for accesses to a process'
54
memory, enabling easy-to-use synchronization of accesses to shared memory by
55
several threads.
56
 
57
 
58
@comment
59
@comment  When you add a new menu item, please keep the right hand
60
@comment  aligned to the same column.  Do not use tabs.  This provides
61
@comment  better formatting.
62
@comment
63
@menu
64
* Enabling libitm::            How to enable libitm for your applications.
65
* C/C++ Language Constructs for TM::
66
                               Notes on the language-level interface supported
67
                               by gcc.
68
* The libitm ABI::             Notes on the external ABI provided by libitm.
69
* Internals::                  Notes on libitm's internal synchronization.
70
* GNU Free Documentation License::
71
                               How you can copy and share this manual.
72
* Index::                      Index of this documentation.
73
@end menu
74
 
75
 
76
@c ---------------------------------------------------------------------
77
@c Enabling libitm
78
@c ---------------------------------------------------------------------
79
 
80
@node Enabling libitm
81
@chapter Enabling libitm
82
 
83
To activate support for TM in C/C++, the compile-time flag @option{-fgnu-tm}
84
must be specified. This enables TM language-level constructs such as
85
transaction statements (@code{__transaction}, @pxref{C/C++ Language
86
Constructs for TM} for details).
87
 
88
@c ---------------------------------------------------------------------
89
@c C/C++ Language Constructs for TM
90
@c ---------------------------------------------------------------------
91
 
92
@node C/C++ Language Constructs for TM
93
@chapter C/C++ Language Constructs for TM
94
 
95
TODO: link to the C++ TM spec. a few examples. how gcc's support differs.
96
 
97
@c ---------------------------------------------------------------------
98
@c The libitm ABI
99
@c ---------------------------------------------------------------------
100
 
101
@node The libitm ABI
102
@chapter The libitm ABI
103
 
104
The ABI provided by libitm is basically equal to the Linux variant of Intel's
105
current TM ABI specification document (Revision 1.1, May 6 2009) but with the
106
differences listed in this chapter. It would be good if these changes would
107
eventually be merged into a future version of this specification. To ease
108
look-up, the following subsections mirror the structure of this specification.
109
 
110
@section [No changes] Objectives
111
@section [No changes] Non-objectives
112
 
113
@section Library design principles
114
@subsection [No changes] Calling conventions
115
@subsection [No changes] TM library algorithms
116
@subsection [No changes] Optimized load and store routines
117
@subsection [No changes] Aligned load and store routines
118
 
119
@subsection Data logging functions
120
 
121
The memory locations accessed with transactional loads and stores and the
122
memory locations whose values are logged must not overlap. This required
123
separation only extends to the scope of the execution of one transaction
124
including all the executions of all nested transactions.
125
 
126
The compiler must be consistent (within the scope of a single transaction)
127
about which memory locations are shared and which are not shared with other
128
threads (i.e., data must be accessed either transactionally or
129
nontransactionally). Otherwise, non-write-through TM algorithms would not work.
130
 
131
@subsection [No changes] Scatter/gather calls
132
@subsection [No changes] Serial and irrevocable mode
133
@subsection [No changes] Transaction descriptor
134
@subsection Store allocation
135
 
136
There is no @code{getTransaction} function.
137
 
138
@subsection [No changes] Naming conventions
139
 
140
@subsection Function pointer encryption
141
 
142
Currently, this is not implemented.
143
 
144
 
145
@section Types and macros list
146
 
147
@code{_ITM_codeProperties} has changed, @pxref{txn-code-properties,,Starting a
148
transaction}.
149
@code{_ITM_srcLocation} is not used.
150
 
151
 
152
@section Function list
153
 
154
@subsection Initialization and finalization functions
155
These functions are not part of the ABI.
156
 
157
@subsection [No changes] Version checking
158
@subsection [No changes] Error reporting
159
@subsection [No changes] inTransaction call
160
 
161
@subsection State manipulation functions
162
There is no @code{getTransaction} function. Transaction identifiers for
163
nested transactions will be ordered but not necessarily sequential (i.e., for
164
a nested transaction's identifier @var{IN} and its enclosing transaction's
165
identifier @var{IE}, it is guaranteed that @math{IN >= IE}).
166
 
167
@subsection [No changes] Source locations
168
 
169
@subsection Starting a transaction
170
 
171
@subsubsection Transaction code properties
172
 
173
@anchor{txn-code-properties}
174
The bit @code{hasNoXMMUpdate} is instead called @code{hasNoVectorUpdate}.
175
Iff it is set, vector register save/restore is not necessary for any target
176
machine.
177
 
178
The @code{hasNoFloatUpdate} bit (@code{0x0010}) is new. Iff it is set, floating
179
point register save/restore is not necessary for any target machine.
180
 
181
@code{undoLogCode} is not supported and a fatal runtime error will be raised
182
if this bit is set. It is not properly defined in the ABI why barriers
183
other than undo logging are not present; Are they not necessary (e.g., a
184
transaction operating purely on thread-local data) or have they been omitted by
185
the compiler because it thinks that some kind of global synchronization
186
(e.g., serial mode) might perform better? The specification suggests that the
187
latter might be the case, but the former seems to be more useful.
188
 
189
The @code{readOnly} bit (@code{0x4000}) is new. @strong{TODO} Lexical or dynamic
190
scope?
191
 
192
@code{hasNoRetry} is not supported. If this bit is not set, but
193
@code{hasNoAbort} is set, the library can assume that transaction
194
rollback will not be requested.
195
 
196
It would be useful if the absence of externally-triggered rollbacks would be
197
reported for the dynamic scope as well, not just for the lexical scope
198
(@code{hasNoAbort}). Without this, a library cannot exploit this together
199
with flat nesting.
200
 
201
@code{exceptionBlock} is not supported because exception blocks are not used.
202
 
203
@subsubsection [No changes] Windows exception state
204
@subsubsection [No changes] Other machine state
205
 
206
@subsubsection [No changes] Results from beginTransaction
207
 
208
@subsection Aborting a transaction
209
 
210
@code{_ITM_rollbackTransaction} is not supported. @code{_ITM_abortTransaction}
211
is supported but the abort reasons @code{exceptionBlockAbort},
212
@code{TMConflict}, and @code{userRetry} are not supported. There are no
213
exception blocks in general, so the related cases also do not have to be
214
considered. To encode @code{__transaction_cancel [[outer]]}, compilers must
215
set the new @code{outerAbort} bit (@code{0x10}) additionally to the
216
@code{userAbort} bit in the abort reason.
217
 
218
@subsection Committing a transaction
219
 
220
The exception handling (EH) scheme is different. The Intel ABI requires the
221
@code{_ITM_tryCommitTransaction} function that will return even when the
222
commit failed and will have to be matched with calls to either
223
@code{_ITM_abortTransaction} or @code{_ITM_commitTransaction}. In contrast,
224
gcc relies on transactional wrappers for the functions of the Exception
225
Handling ABI and on one additional commit function (shown below). This allows
226
the TM to keep track of EH internally and thus it does not have to embed the
227
cleanup of EH state into the existing EH code in the program.
228
@code{_ITM_tryCommitTransaction} is not supported.
229
@code{_ITM_commitTransactionToId} is also not supported because the
230
propagation of thrown exceptions will not bypass commits of nested
231
transactions.
232
 
233
@example
234
void _ITM_commitTransactionEH(void *exc_ptr) ITM_REGPARM;
235
void *_ITM_cxa_allocate_exception (size_t);
236
void _ITM_cxa_throw (void *obj, void *tinfo, void *dest);
237
void *_ITM_cxa_begin_catch (void *exc_ptr);
238
void _ITM_cxa_end_catch (void);
239
@end example
240
 
241
@code{_ITM_commitTransactionEH} must be called to commit a transaction if an
242
exception could be in flight at this position in the code. @code{exc_ptr} is
243
the current exception or zero if there is no current exception.
244
The @code{_ITM_cxa...} functions are transactional wrappers for the respective
245
@code{__cxa...} functions and must be called instead of these in transactional
246
code.
247
 
248
To support this EH scheme, libstdc++ needs to provide one additional function
249
(@code{_cxa_tm_cleanup}), which is used by the TM to clean up the exception
250
handling state while rolling back a transaction:
251
 
252
@example
253
void __cxa_tm_cleanup (void *unthrown_obj, void *cleanup_exc,
254
                       unsigned int caught_count);
255
@end example
256
 
257
@code{unthrown_obj} is non-null if the program called
258
@code{__cxa_allocate_exception} for this exception but did not yet called
259
@code{__cxa_throw} for it. @code{cleanup_exc} is non-null if the program is
260
currently processing a cleanup along an exception path but has not caught this
261
exception yet. @code{caught_count} is the nesting depth of
262
@code{__cxa_begin_catch} within the transaction (which can be counted by the TM
263
using @code{_ITM_cxa_begin_catch} and @code{_ITM_cxa_end_catch});
264
@code{__cxa_tm_cleanup} then performs rollback by essentially performing
265
@code{__cxa_end_catch} that many times.
266
 
267
 
268
 
269
@subsection Exception handling support
270
 
271
Currently, there is no support for functionality like
272
@code{__transaction_cancel throw} as described in the C++ TM specification.
273
Supporting this should be possible with the EH scheme explained previously
274
because via the transactional wrappers for the EH ABI, the TM is able to
275
observe and intercept EH.
276
 
277
 
278
@subsection [No changes] Transition to serial--irrevocable mode
279
@subsection [No changes] Data transfer functions
280
@subsection [No changes] Transactional memory copies
281
 
282
@subsection Transactional versions of memmove
283
 
284
If either the source or destination memory region is to be accessed
285
nontransactionally, then source and destination regions must not be
286
overlapping. The respective @code{_ITM_memmove} functions are still
287
available but a fatal runtime error will be raised if such regions do overlap.
288
To support this functionality, the ABI would have to specify how the
289
intersection of the regions has to be accessed (i.e., transactionally or
290
nontransactionally).
291
 
292
@subsection [No changes] Transactional versions of memset
293
@subsection [No changes] Logging functions
294
 
295
@subsection User-registered commit and undo actions
296
 
297
Commit actions will get executed in the same order in which the respective
298
calls to @code{_ITM_addUserCommitAction} happened. Only
299
@code{_ITM_noTransactionId} is allowed as value for the
300
@code{resumingTransactionId} argument. Commit actions get executed after
301
privatization safety has been ensured.
302
 
303
Undo actions will get executed in reverse order compared to the order in which
304
the respective calls to @code{_ITM_addUserUndoAction} happened. The ordering of
305
undo actions w.r.t. the roll-back of other actions (e.g., data transfers or
306
memory allocations) is undefined.
307
 
308
@code{_ITM_getThreadnum} is not supported currently because its only purpose
309
is to provide a thread ID that matches some assumed performance tuning output,
310
but this output is not part of the ABI nor further defined by it.
311
 
312
@code{_ITM_dropReferences} is not supported currently because its semantics and
313
the intention behind it is not entirely clear. The
314
specification suggests that this function is necessary because of certain
315
orderings of data transfer undos and the releasing of memory regions (i.e.,
316
privatization). However, this ordering is never defined, nor is the ordering of
317
dropping references w.r.t. other events.
318
 
319
@subsection [New] Transactional indirect calls
320
 
321
Indirect calls (i.e., calls through a function pointer) within transactions
322
should execute the transactional clone of the original function (i.e., a clone
323
of the original that has been fully instrumented to use the TM runtime), if
324
such a clone is available. The runtime provides two functions to
325
register/deregister clone tables:
326
 
327
@example
328
struct clone_entry
329
@{
330
  void *orig, *clone;
331
@};
332
 
333
void _ITM_registerTMCloneTable (clone_entry *table, size_t entries);
334
void _ITM_deregisterTMCloneTable (clone_entry *table);
335
@end example
336
 
337
Registered tables must be writable by the TM runtime, and must be live
338
throughout the life-time of the TM runtime.
339
 
340
@strong{TODO} The intention was always to drop the registration functions
341
entirely, and create a new ELF Phdr describing the linker-sorted table.  Much
342
like what currently happens for @code{PT_GNU_EH_FRAME}.
343
This work kept getting bogged down in how to represent the @var{N} different
344
code generation variants.  We clearly needed at least two---SW and HW
345
transactional clones---but there was always a suggestion of more variants for
346
different TM assumptions/invariants.
347
 
348
The compiler can then use two TM runtime functions to perform indirect calls in
349
transactions:
350
@example
351
void *_ITM_getTMCloneOrIrrevocable (void *function) ITM_REGPARM;
352
void *_ITM_getTMCloneSafe (void *function) ITM_REGPARM;
353
@end example
354
 
355
If there is a registered clone for supplied function, both will return a
356
pointer to the clone. If not, the first runtime function will attempt to switch
357
to serial--irrevocable mode and return the original pointer, whereas the second
358
will raise a fatal runtime error.
359
 
360
@subsection [New] Transactional dynamic memory management
361
 
362
@example
363
void *_ITM_malloc (size_t)
364
       __attribute__((__malloc__)) ITM_PURE;
365
void *_ITM_calloc (size_t, size_t)
366
       __attribute__((__malloc__)) ITM_PURE;
367
void _ITM_free (void *) ITM_PURE;
368
@end example
369
 
370
These functions are essentially transactional wrappers for @code{malloc},
371
@code{calloc}, and @code{free}. Within transactions, the compiler should
372
replace calls to the original functions with calls to the wrapper functions.
373
 
374
 
375
@section [No changes] Future Enhancements to the ABI
376
 
377
@section Sample code
378
 
379
The code examples might not be correct w.r.t. the current version of the ABI,
380
especially everything related to exception handling.
381
 
382
 
383
@section [New] Memory model
384
 
385
The ABI should define a memory model and the ordering that is guaranteed for
386
data transfers and commit/undo actions, or at least refer to another memory
387
model that needs to be preserved. Without that, the compiler cannot ensure the
388
memory model specified on the level of the programming language (e.g., by the
389
C++ TM specification).
390
 
391
For example, if a transactional load is ordered before another load/store, then
392
the TM runtime must also ensure this ordering when accessing shared state. If
393
not, this might break the kind of publication safety used in the C++ TM
394
specification. Likewise, the TM runtime must ensure privatization safety.
395
 
396
 
397
 
398
@c ---------------------------------------------------------------------
399
@c Internals
400
@c ---------------------------------------------------------------------
401
 
402
@node Internals
403
@chapter Internals
404
 
405
@section TM methods and method groups
406
 
407
libitm supports several ways of synchronizing transactions with each other.
408
These TM methods (or TM algorithms) are implemented in the form of
409
subclasses of @code{abi_dispatch}, which provide methods for
410
transactional loads and stores as well as callbacks for rollback and commit.
411
All methods that are compatible with each other (i.e., that let concurrently
412
running transactions still synchronize correctly even if different methods
413
are used) belong to the same TM method group. Pointers to TM methods can be
414
obtained using the factory methods prefixed with @code{dispatch_} in
415
@file{libitm_i.h}. There are two special methods, @code{dispatch_serial} and
416
@code{dispatch_serialirr}, that are compatible with all methods because they
417
run transactions completely in serial mode.
418
 
419
@subsection TM method life cycle
420
 
421
The state of TM methods does not change after construction, but they do alter
422
the state of transactions that use this method. However, because
423
per-transaction data gets used by several methods, @code{gtm_thread} is
424
responsible for setting an initial state that is useful for all methods.
425
After that, methods are responsible for resetting/clearing this state on each
426
rollback or commit (of outermost transactions), so that the transaction
427
executed next is not affected by the previous transaction.
428
 
429
There is also global state associated with each method group, which is
430
initialized and shut down (@code{method_group::init()} and @code{fini()})
431
when switching between method groups (see @file{retry.cc}).
432
 
433
@subsection Selecting the default method
434
 
435
The default method that libitm uses for freshly started transactions (but
436
not necessarily for restarted transactions) can be set via an environment
437
variable (@env{ITM_DEFAULT_METHOD}), whose value should be equal to the name
438
of one of the factory methods returning abi_dispatch subclasses but without
439
the "dispatch_" prefix (e.g., "serialirr" instead of
440
@code{GTM::dispatch_serialirr()}).
441
 
442
Note that this environment variable is only a hint for libitm and might not
443
be supported in the future.
444
 
445
 
446
@section Nesting: flat vs. closed
447
 
448
We support two different kinds of nesting of transactions. In the case of
449
@emph{flat nesting}, the nesting structure is flattened and all nested
450
transactions are subsumed by the enclosing transaction. In contrast,
451
with @emph{closed nesting}, nested transactions that have not yet committed
452
can be rolled back separately from the enclosing transactions; when they
453
commit, they are subsumed by the enclosing transaction, and their effects
454
will be finally committed when the outermost transaction commits.
455
@emph{Open nesting} (where nested transactions can commit independently of the
456
enclosing transactions) are not supported.
457
 
458
Flat nesting is the default nesting mode, but closed nesting is supported and
459
used when transactions contain user-controlled aborts
460
(@code{__transaction_cancel} statements). We assume that user-controlled
461
aborts are rare in typical code and used mostly in exceptional situations.
462
Thus, it makes more sense to use flat nesting by default to avoid the
463
performance overhead of the additional checkpoints required for closed
464
nesting. User-controlled aborts will correctly abort the innermost enclosing
465
transaction, whereas the whole (i.e., outermost) transaction will be restarted
466
otherwise (e.g., when a transaction encounters data conflicts during
467
optimistic execution).
468
 
469
 
470
@section Locking conventions
471
 
472
This section documents the locking scheme and rules for all uses of locking
473
in libitm. We have to support serial(-irrevocable) mode, which is implemented
474
using a global lock as explained next (called the @emph{serial lock}). To
475
simplify the overall design, we use the same lock as catch-all locking
476
mechanism for other infrequent tasks such as (de)registering clone tables or
477
threads. Besides the serial lock, there are @emph{per-method-group locks} that
478
are managed by specific method groups (i.e., groups of similar TM concurrency
479
control algorithms), and lock-like constructs for quiescence-based operations
480
such as ensuring privatization safety.
481
 
482
Thus, the actions that participate in the libitm-internal locking are either
483
@emph{active transactions} that do not run in serial mode, @emph{serial
484
transactions} (which (are about to) run in serial mode), and management tasks
485
that do not execute within a transaction but have acquired the serial mode
486
like a serial transaction would do (e.g., to be able to register threads with
487
libitm). Transactions become active as soon as they have successfully used the
488
serial lock to announce this globally (@pxref{serial-lock-impl,,Serial lock
489
implementation}). Likewise, transactions become serial transactions as soon as
490
they have acquired the exclusive rights provided by the serial lock (i.e.,
491
serial mode, which also means that there are no other concurrent active or
492
serial transactions). Note that active transactions can become serial
493
transactions when they enter serial mode during the runtime of the
494
transaction.
495
 
496
@subsection State-to-lock mapping
497
 
498
Application data is protected by the serial lock if there is a serial
499
transaction and no concurrently running active transaction (i.e., non-serial).
500
Otherwise, application data is protected by the currently selected method
501
group, which might use per-method-group locks or other mechanisms. Also note
502
that application data that is about to be privatized might not be allowed to be
503
accessed by nontransactional code until privatization safety has been ensured;
504
the details of this are handled by the current method group.
505
 
506
libitm-internal state is either protected by the serial lock or accessed
507
through custom concurrent code. The latter applies to the public/shared part
508
of a transaction object and most typical method-group-specific state.
509
 
510
The former category (protected by the serial lock) includes:
511
@itemize @bullet
512
@item The list of active threads that have used transactions.
513
@item The tables that map functions to their transactional clones.
514
@item The current selection of which method group to use.
515
@item Some method-group-specific data, or invariants of this data. For example,
516
resetting a method group to its initial state is handled by switching to the
517
same method group, so the serial lock protects such resetting as well.
518
@end itemize
519
In general, such state is immutable whenever there exists an active
520
(non-serial) transaction. If there is no active transaction, a serial
521
transaction (or a thread that is not currently executing a transaction but has
522
acquired the serial lock) is allowed to modify this state (but must of course
523
be careful to not surprise the current method group's implementation with such
524
modifications).
525
 
526
@subsection Lock acquisition order
527
 
528
To prevent deadlocks, locks acquisition must happen in a globally agreed-upon
529
order. Note that this applies to other forms of blocking too, but does not
530
necessarily apply to lock acquisitions that do not block (e.g., trylock()
531
calls that do not get retried forever). Note that serial transactions are
532
never return back to active transactions until the transaction has committed.
533
Likewise, active transactions stay active until they have committed.
534
Per-method-group locks are typically also not released before commit.
535
 
536
Lock acquisition / blocking rules:
537
@itemize @bullet
538
 
539
@item Transactions must become active or serial before they are allowed to
540
use method-group-specific locks or blocking (i.e., the serial lock must be
541
acquired before those other locks, either in serial or nonserial mode).
542
 
543
@item Any number of threads that do not currently run active transactions can
544
block while trying to get the serial lock in exclusive mode. Note that active
545
transactions must not block when trying to upgrade to serial mode unless there
546
is no other transaction that is trying that (the latter is ensured by the
547
serial lock implementation.
548
 
549
@item Method groups must prevent deadlocks on their locks. In particular, they
550
must also be prepared for another active transaction that has acquired
551
method-group-specific locks but is blocked during an attempt to upgrade to
552
being a serial transaction. See below for details.
553
 
554
@item Serial transactions can acquire method-group-specific locks because there
555
will be no other active nor serial transaction.
556
 
557
@end itemize
558
 
559
There is no single rule for per-method-group blocking because this depends on
560
when a TM method might acquire locks. If no active transaction can upgrade to
561
being a serial transaction after it has acquired per-method-group locks (e.g.,
562
when those locks are only acquired during an attempt to commit), then the TM
563
method does not need to consider a potential deadlock due to serial mode.
564
 
565
If there can be upgrades to serial mode after the acquisition of
566
per-method-group locks, then TM methods need to avoid those deadlocks:
567
@itemize @bullet
568
@item When upgrading to a serial transaction, after acquiring exclusive rights
569
to the serial lock but before waiting for concurrent active transactions to
570
finish (@pxref{serial-lock-impl,,Serial lock implementation} for details),
571
we have to wake up all active transactions waiting on the upgrader's
572
per-method-group locks.
573
@item Active transactions blocking on per-method-group locks need to check the
574
serial lock and abort if there is a pending serial transaction.
575
@item Lost wake-ups have to be prevented (e.g., by changing a bit in each
576
per-method-group lock before doing the wake-up, and only blocking on this lock
577
using a futex if this bit is not group).
578
@end itemize
579
 
580
@strong{TODO}: Can reuse serial lock for gl-*? And if we can, does it make
581
sense to introduce further complexity in the serial lock? For gl-*, we can
582
really only avoid an abort if we do -wb and -vbv.
583
 
584
 
585
@subsection Serial lock implementation
586
@anchor{serial-lock-impl}
587
 
588
The serial lock implementation is optimized towards assuming that serial
589
transactions are infrequent and not the common case. However, the performance
590
of entering serial mode can matter because when only few transactions are run
591
concurrently or if there are few threads, then it can be efficient to run
592
transactions serially.
593
 
594
The serial lock is similar to a multi-reader-single-writer lock in that there
595
can be several active transactions but only one serial transaction. However,
596
we do want to avoid contention (in the lock implementation) between active
597
transactions, so we split up the reader side of the lock into per-transaction
598
flags that are true iff the transaction is active. The exclusive writer side
599
remains a shared single flag, which is acquired using a CAS, for example.
600
On the fast-path, the serial lock then works similar to Dekker's algorithm but
601
with several reader flags that a serial transaction would have to check.
602
A serial transaction thus requires a list of all threads with potentially
603
active transactions; we can use the serial lock itself to protect this list
604
(i.e., only threads that have acquired the serial lock can modify this list).
605
 
606
We want starvation-freedom for the serial lock to allow for using it to ensure
607
progress for potentially starved transactions (@pxref{progress-guarantees,,
608
Progress Guarantees} for details). However, this is currently not enforced by
609
the implementation of the serial lock.
610
 
611
Here is pseudo-code for the read/write fast paths of acquiring the serial
612
lock (read-to-write upgrade is similar to write_lock:
613
@example
614
// read_lock:
615
tx->shared_state |= active;
616
__sync_synchronize(); // or STLD membar, or C++0x seq-cst fence
617
while (!serial_lock.exclusive)
618
  if (spinning_for_too_long) goto slowpath;
619
 
620
// write_lock:
621
if (CAS(&serial_lock.exclusive, 0, this) != 0)
622
  goto slowpath; // writer-writer contention
623
// need a membar here, but CAS already has full membar semantics
624
bool need_blocking = false;
625
for (t: all txns)
626
  @{
627
    for (;t->shared_state & active;)
628
      if (spinning_for_too_long) @{ need_blocking = true; break; @}
629
  @}
630
if (need_blocking) goto slowpath;
631
@end example
632
 
633
Releasing a lock in this spin-lock version then just consists of resetting
634
@code{tx->shared_state} to inactive or clearing @code{serial_lock.exclusive}.
635
 
636
However, we can't rely on a pure spinlock because we need to get the OS
637
involved at some time (e.g., when there are more threads than CPUs to run on).
638
Therefore, the real implementation falls back to a blocking slow path, either
639
based on pthread mutexes or Linux futexes.
640
 
641
 
642
@subsection Reentrancy
643
 
644
libitm has to consider the following cases of reentrancy:
645
@itemize @bullet
646
 
647
@item Transaction calls unsafe code that starts a new transaction: The outer
648
transaction will become a serial transaction before executing unsafe code.
649
Therefore, nesting within serial transactions must work, even if the nested
650
transaction is called from within uninstrumented code.
651
 
652
@item Transaction calls either a transactional wrapper or safe code, which in
653
turn starts a new transaction: It is not yet defined in the specification
654
whether this is allowed. Thus, it is undefined whether libitm supports this.
655
 
656
@item Code that starts new transactions might be called from within any part
657
of libitm: This kind of reentrancy would likely be rather complex and can
658
probably be avoided. Therefore, it is not supported.
659
 
660
@end itemize
661
 
662
@subsection Privatization safety
663
 
664
Privatization safety is ensured by libitm using a quiescence-based approach.
665
Basically, a privatizing transaction waits until all concurrent active
666
transactions will either have finished (are not active anymore) or operate on
667
a sufficiently recent snapshot to not access the privatized data anymore. This
668
happens after the privatizing transaction has stopped being an active
669
transaction, so waiting for quiescence does not contribute to deadlocks.
670
 
671
In method groups that need to ensure publication safety explicitly, active
672
transactions maintain a flag or timestamp in the public/shared part of the
673
transaction descriptor. Before blocking, privatizers need to let the other
674
transactions know that they should wake up the privatizer.
675
 
676
@strong{TODO} Ho to implement the waiters? Should those flags be
677
per-transaction or at a central place? We want to avoid one wake/wait call
678
per active transactions, so we might want to use either a tree or combining
679
to reduce the syscall overhead, or rather spin for a long amount of time
680
instead of doing blocking. Also, it would be good if only the last transaction
681
that the privatizer waits for would do the wake-up.
682
 
683
@subsection Progress guarantees
684
@anchor{progress-guarantees}
685
 
686
Transactions that do not make progress when using the current TM method will
687
eventually try to execute in serial mode. Thus, the serial lock's progress
688
guarantees determine the progress guarantees of the whole TM. Obviously, we at
689
least need deadlock-freedom for the serial lock, but it would also be good to
690
provide starvation-freedom (informally, all threads will finish executing a
691
transaction eventually iff they get enough cycles).
692
 
693
However, the scheduling of transactions (e.g., thread scheduling by the OS)
694
also affects the handling of progress guarantees by the TM. First, the TM
695
can only guarantee deadlock-freedom if threads do not get stopped. Likewise,
696
low-priority threads can starve if they do not get scheduled when other
697
high-priority threads get those cycles instead.
698
 
699
If all threads get scheduled eventually, correct lock implementations will
700
provide deadlock-freedom, but might not provide starvation-freedom. We can
701
either enforce the latter in the TM's lock implementation, or assume that
702
the scheduling is sufficiently random to yield a probabilistic guarantee that
703
no thread will starve (because eventually, a transaction will encounter a
704
scheduling that will allow it to run). This can indeed work well in practice
705
but is not necessarily guaranteed to work (e.g., simple spin locks can be
706
pretty efficient).
707
 
708
Because enforcing stronger progress guarantees in the TM has a higher runtime
709
overhead, we focus on deadlock-freedom right now and assume that the threads
710
will get scheduled eventually by the OS (but don't consider threads with
711
different priorities). We should support starvation-freedom for serial
712
transactions in the future. Everything beyond that is highly related to proper
713
contention management across all of the TM (including with TM method to
714
choose), and is future work.
715
 
716
@strong{TODO} Handling thread priorities: We want to avoid priority inversion
717
but it's unclear how often that actually matters in practice. Workloads that
718
have threads with different priorities will likely also require lower latency
719
or higher throughput for high-priority threads. Therefore, it probably makes
720
not that much sense (except for eventual progress guarantees) to use
721
priority inheritance until the TM has priority-aware contention management.
722
 
723
 
724
@c ---------------------------------------------------------------------
725
@c GNU Free Documentation License
726
@c ---------------------------------------------------------------------
727
 
728
@include fdl.texi
729
 
730
@c ---------------------------------------------------------------------
731
@c Index
732
@c ---------------------------------------------------------------------
733
 
734
@node Index
735
@unnumbered Index
736
 
737
@printindex cp
738
 
739
@bye

powered by: WebSVN 2.1.0

© copyright 1999-2024 OpenCores.org, equivalent to Oliscience, all rights reserved. OpenCores®, registered trademark.