OpenCores
URL https://opencores.org/ocsvn/or1k_soc_on_altera_embedded_dev_kit/or1k_soc_on_altera_embedded_dev_kit/trunk

Subversion Repositories or1k_soc_on_altera_embedded_dev_kit

[/] [or1k_soc_on_altera_embedded_dev_kit/] [trunk/] [linux-2.6/] [linux-2.6.24/] [Documentation/] [mutex-design.txt] - Blame information for rev 17

Go to most recent revision | Details | Compare with Previous | View Log

Line No. Rev Author Line
1 3 xianfeng
Generic Mutex Subsystem
2
 
3
started by Ingo Molnar 
4
 
5
  "Why on earth do we need a new mutex subsystem, and what's wrong
6
   with semaphores?"
7
 
8
firstly, there's nothing wrong with semaphores. But if the simpler
9
mutex semantics are sufficient for your code, then there are a couple
10
of advantages of mutexes:
11
 
12
 - 'struct mutex' is smaller on most architectures: .e.g on x86,
13
   'struct semaphore' is 20 bytes, 'struct mutex' is 16 bytes.
14
   A smaller structure size means less RAM footprint, and better
15
   CPU-cache utilization.
16
 
17
 - tighter code. On x86 i get the following .text sizes when
18
   switching all mutex-alike semaphores in the kernel to the mutex
19
   subsystem:
20
 
21
        text    data     bss     dec     hex filename
22
     3280380  868188  396860 4545428  455b94 vmlinux-semaphore
23
     3255329  865296  396732 4517357  44eded vmlinux-mutex
24
 
25
   that's 25051 bytes of code saved, or a 0.76% win - off the hottest
26
   codepaths of the kernel. (The .data savings are 2892 bytes, or 0.33%)
27
   Smaller code means better icache footprint, which is one of the
28
   major optimization goals in the Linux kernel currently.
29
 
30
 - the mutex subsystem is slightly faster and has better scalability for
31
   contended workloads. On an 8-way x86 system, running a mutex-based
32
   kernel and testing creat+unlink+close (of separate, per-task files)
33
   in /tmp with 16 parallel tasks, the average number of ops/sec is:
34
 
35
    Semaphores:                        Mutexes:
36
 
37
    $ ./test-mutex V 16 10             $ ./test-mutex V 16 10
38
    8 CPUs, running 16 tasks.          8 CPUs, running 16 tasks.
39
    checking VFS performance.          checking VFS performance.
40
    avg loops/sec:      34713          avg loops/sec:      84153
41
    CPU utilization:    63%            CPU utilization:    22%
42
 
43
   i.e. in this workload, the mutex based kernel was 2.4 times faster
44
   than the semaphore based kernel, _and_ it also had 2.8 times less CPU
45
   utilization. (In terms of 'ops per CPU cycle', the semaphore kernel
46
   performed 551 ops/sec per 1% of CPU time used, while the mutex kernel
47
   performed 3825 ops/sec per 1% of CPU time used - it was 6.9 times
48
   more efficient.)
49
 
50
   the scalability difference is visible even on a 2-way P4 HT box:
51
 
52
    Semaphores:                        Mutexes:
53
 
54
    $ ./test-mutex V 16 10             $ ./test-mutex V 16 10
55
    4 CPUs, running 16 tasks.          8 CPUs, running 16 tasks.
56
    checking VFS performance.          checking VFS performance.
57
    avg loops/sec:      127659         avg loops/sec:      181082
58
    CPU utilization:    100%           CPU utilization:    34%
59
 
60
   (the straight performance advantage of mutexes is 41%, the per-cycle
61
    efficiency of mutexes is 4.1 times better.)
62
 
63
 - there are no fastpath tradeoffs, the mutex fastpath is just as tight
64
   as the semaphore fastpath. On x86, the locking fastpath is 2
65
   instructions:
66
 
67
    c0377ccb :
68
    c0377ccb:       f0 ff 08                lock decl (%eax)
69
    c0377cce:       78 0e                   js     c0377cde <.text.lock.mutex>
70
    c0377cd0:       c3                      ret
71
 
72
   the unlocking fastpath is equally tight:
73
 
74
    c0377cd1 :
75
    c0377cd1:       f0 ff 00                lock incl (%eax)
76
    c0377cd4:       7e 0f                   jle    c0377ce5 <.text.lock.mutex+0x7>
77
    c0377cd6:       c3                      ret
78
 
79
 - 'struct mutex' semantics are well-defined and are enforced if
80
   CONFIG_DEBUG_MUTEXES is turned on. Semaphores on the other hand have
81
   virtually no debugging code or instrumentation. The mutex subsystem
82
   checks and enforces the following rules:
83
 
84
   * - only one task can hold the mutex at a time
85
   * - only the owner can unlock the mutex
86
   * - multiple unlocks are not permitted
87
   * - recursive locking is not permitted
88
   * - a mutex object must be initialized via the API
89
   * - a mutex object must not be initialized via memset or copying
90
   * - task may not exit with mutex held
91
   * - memory areas where held locks reside must not be freed
92
   * - held mutexes must not be reinitialized
93
   * - mutexes may not be used in hardware or software interrupt
94
   *   contexts such as tasklets and timers
95
 
96
   furthermore, there are also convenience features in the debugging
97
   code:
98
 
99
   * - uses symbolic names of mutexes, whenever they are printed in debug output
100
   * - point-of-acquire tracking, symbolic lookup of function names
101
   * - list of all locks held in the system, printout of them
102
   * - owner tracking
103
   * - detects self-recursing locks and prints out all relevant info
104
   * - detects multi-task circular deadlocks and prints out all affected
105
   *   locks and tasks (and only those tasks)
106
 
107
Disadvantages
108
-------------
109
 
110
The stricter mutex API means you cannot use mutexes the same way you
111
can use semaphores: e.g. they cannot be used from an interrupt context,
112
nor can they be unlocked from a different context that which acquired
113
it. [ I'm not aware of any other (e.g. performance) disadvantages from
114
using mutexes at the moment, please let me know if you find any. ]
115
 
116
Implementation of mutexes
117
-------------------------
118
 
119
'struct mutex' is the new mutex type, defined in include/linux/mutex.h
120
and implemented in kernel/mutex.c. It is a counter-based mutex with a
121
spinlock and a wait-list. The counter has 3 states: 1 for "unlocked",
122
 
123
waiters queued".
124
 
125
the APIs of 'struct mutex' have been streamlined:
126
 
127
 DEFINE_MUTEX(name);
128
 
129
 mutex_init(mutex);
130
 
131
 void mutex_lock(struct mutex *lock);
132
 int  mutex_lock_interruptible(struct mutex *lock);
133
 int  mutex_trylock(struct mutex *lock);
134
 void mutex_unlock(struct mutex *lock);
135
 int  mutex_is_locked(struct mutex *lock);
136
 void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
137
 int  mutex_lock_interruptible_nested(struct mutex *lock,
138
                                      unsigned int subclass);

powered by: WebSVN 2.1.0

© copyright 1999-2024 OpenCores.org, equivalent to Oliscience, all rights reserved. OpenCores®, registered trademark.