<HTML>
<HEAD>
<TITLE>LinuxThreads Frequently Asked Questions</TITLE>
</HEAD>
<BODY>
<H1 ALIGN=center>LinuxThreads Frequently Asked Questions <BR>
(with answers)</H1>
<H2 ALIGN=center>[For LinuxThreads version 0.8]</H2>

<HR><P>

<A HREF="#A">A. The big picture</A><BR>
<A HREF="#B">B. Getting more information</A><BR>
<A HREF="#C">C. Issues related to the C library</A><BR>
<A HREF="#D">D. Problems, weird behaviors, potential bugs</A><BR>
<A HREF="#E">E. Missing functions, wrong types, etc</A><BR>
<A HREF="#F">F. C++ issues</A><BR>
<A HREF="#G">G. Debugging LinuxThreads programs</A><BR>
<A HREF="#H">H. Compiling multithreaded code; errno madness</A><BR>
<A HREF="#I">I. X-Windows and other libraries</A><BR>
<A HREF="#J">J. Signals and threads</A><BR>
<A HREF="#K">K. Internals of LinuxThreads</A><P>

<HR>
<P>

<H2><A NAME="A">A. The big picture</A></H2>

<H4><A NAME="A.1">A.1: What is LinuxThreads?</A></H4>

LinuxThreads is a Linux library for multi-threaded programming.
It implements the POSIX 1003.1c API (Application Programming
Interface) for threads. It runs on any Linux system with kernel 2.0.0
or more recent, and a suitable C library (see section <A HREF="#C">C</A>).
<P>

<H4><A NAME="A.2">A.2: What are threads?</A></H4>

A thread is a sequential flow of control through a program.
Multi-threaded programming is, thus, a form of parallel programming
where several threads of control execute concurrently in the
program. All threads execute in the same memory space, and can
therefore work concurrently on shared data.<P>

Multi-threaded programming differs from Unix-style multi-processing in
that all threads share the same memory space (and a few other system
resources, such as file descriptors), instead of running in their own
memory space as is the case with Unix processes.<P>

Threads are useful for two reasons. First, they allow a program to
exploit multi-processor machines: the threads can run in parallel on
several processors, allowing a single program to divide its work
between several processors, thus running faster than a single-threaded
program, which runs on only one processor at a time. Second, some
programs are best expressed as several threads of control that
communicate together, rather than as one big monolithic sequential
program. Examples include server programs, overlapping asynchronous
I/O, and graphical user interfaces.<P>
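As a minimal sketch of the shared-memory model described above, the following C fragment lets several threads update one variable under a mutex. The names <code>add_to_counter</code>, <code>run_shared_counter</code>, <code>NUM_THREADS</code> and <code>INCREMENTS</code> are made up for illustration and are not part of LinuxThreads; only the <code>pthread_*</code> calls come from the POSIX API.

```c
#include <pthread.h>

#define NUM_THREADS 4       /* illustrative values, not recommendations */
#define INCREMENTS  10000

static long counter = 0;    /* one variable, visible to every thread */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread adds INCREMENTS to the shared counter. */
static void *add_to_counter(void *arg)
{
    int i;
    (void) arg;
    for (i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts NUM_THREADS threads, waits for them, returns the final counter. */
long run_shared_counter(void)
{
    pthread_t threads[NUM_THREADS];
    int i, created = 0;

    counter = 0;
    for (i = 0; i < NUM_THREADS; i++)
        if (pthread_create(&threads[created], NULL, add_to_counter, NULL) == 0)
            created++;
    for (i = 0; i < created; i++)
        pthread_join(threads[i], NULL);
    return counter;
}
```

With separate Unix processes, each process would have incremented its own private copy of <code>counter</code>; here all threads see the same one, which is why the mutex is needed.<P>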

<H4><A NAME="A.3">A.3: What is POSIX 1003.1c?</A></H4>

It's an API for multi-threaded programming standardized by IEEE as
part of the POSIX standards. Most Unix vendors have endorsed the
POSIX 1003.1c standard. Implementations of the 1003.1c API are
already available under Sun Solaris 2.5, Digital Unix 4.0, and
Silicon Graphics IRIX 6, and should soon be available from other
vendors such as IBM and HP. More generally, the 1003.1c API is
quickly replacing the proprietary thread libraries that were
developed previously under Unix, such as Mach cthreads, Solaris
threads, and IRIX sprocs. Thus, multithreaded programs using the
1003.1c API are likely to run unchanged on a wide variety of Unix
platforms.<P>

<H4><A NAME="A.4">A.4: What is the status of LinuxThreads?</A></H4>

LinuxThreads implements almost all of POSIX 1003.1c, as well as a few
extensions. The only part of LinuxThreads that does not conform yet
to POSIX is signal handling (see section <A HREF="#J">J</A>). Apart
from the signal stuff, all the POSIX 1003.1c base functionality,
as well as a number of optional extensions, are provided and conform
to the standard (to the best of my knowledge).
The signal stuff is hard to get right, at least without special kernel
support, and while I'm definitely looking at ways to implement the
POSIX behavior for signals, it might take a long time to
complete.<P>

<H4><A NAME="A.5">A.5: How stable is LinuxThreads?</A></H4>

The basic functionality (thread creation and termination, mutexes,
conditions, semaphores) is very stable. Several industrial-strength
programs, such as the AOL multithreaded Web server, use LinuxThreads
and seem quite happy about it. There used to be some rough edges in
the LinuxThreads / C library interface with libc 5, but glibc 2
fixes all of those problems and is now the standard C library on major
Linux distributions (see section <A HREF="#C">C</A>). <P>

<HR>
<P>

<H2><A NAME="B">B. Getting more information</A></H2>

<H4><A NAME="B.1">B.1: What are good books and other sources of
information on POSIX threads?</A></H4>

The FAQ for comp.programming.threads lists several books:
<A HREF="http://www.serpentine.com/~bos/threads-faq/">http://www.serpentine.com/~bos/threads-faq/</A>.<P>

There are also some online tutorials. Follow the links from the
LinuxThreads web page:
<A HREF="http://pauillac.inria.fr/~xleroy/linuxthreads">http://pauillac.inria.fr/~xleroy/linuxthreads</A>.<P>

<H4><A NAME="B.2">B.2: I'd like to be informed of future developments on
LinuxThreads. Is there a mailing list for this purpose?</A></H4>

I post LinuxThreads-related announcements on the newsgroup
<A HREF="news:comp.os.linux.announce">comp.os.linux.announce</A>,
and also on the mailing list
<code>linux-threads@magenet.com</code>.
You can subscribe to the latter by writing to
<A HREF="mailto:majordomo@magenet.com">majordomo@magenet.com</A>.<P>

<H4><A NAME="B.3">B.3: What are good places for discussing
LinuxThreads?</A></H4>

For questions about programming with POSIX threads in general, use
the newsgroup
<A HREF="news:comp.programming.threads">comp.programming.threads</A>.
Be sure you read the
<A HREF="http://www.serpentine.com/~bos/threads-faq/">FAQ</A>
for this group before you post.<P>

For Linux-specific questions, use
<A
HREF="news:comp.os.linux.development.apps">comp.os.linux.development.apps</A>
and <A
HREF="news:comp.os.linux.development.kernel">comp.os.linux.development.kernel</A>.
The latter is especially appropriate for questions related to the
interface between the kernel and LinuxThreads.<P>

<H4><A NAME="B.4">B.4: How should I report a possible bug in
LinuxThreads?</A></H4>

If you're using glibc 2, the best way by far is to use the
<code>glibcbug</code> script to mail a bug report to the glibc
maintainers. <P>

If you're using an older libc, or don't have the <code>glibcbug</code>
script on your machine, then e-mail me directly
(<code>Xavier.Leroy@inria.fr</code>). <P>

In both cases, before sending the bug report, make sure that it is not
addressed already in this FAQ. Also, try to send a short program that
reproduces the weird behavior you observed. <P>

<H4><A NAME="B.5">B.5: I'd like to read the POSIX 1003.1c standard. Is
it available online?</A></H4>

Unfortunately, no. POSIX standards are copyrighted by IEEE, and
IEEE does not distribute them freely. You can buy paper copies from
IEEE, but the price is fairly high ($120 or so). If you disagree with
this policy and you're an IEEE member, be sure to let them know.<P>

On the other hand, you probably don't want to read the standard. It's
very hard to read, written in standard-ese, and targeted to
implementors who already know threads inside-out. A good book on
POSIX threads provides the same information in a much more readable form.
I can personally recommend Dave Butenhof's book, <CITE>Programming
with POSIX threads</CITE> (Addison-Wesley). Butenhof was part of the
POSIX committee and also designed the Digital Unix implementations of
POSIX threads, and it shows.<P>

Another good source of information is the X/Open Group Single Unix
specification, which is available both
<A HREF="http://www.rdg.opengroup.org/onlinepubs/7908799/index.html">on-line</A>
and as a
<A HREF="http://www.UNIX-systems.org/gosolo2/">book and CD/ROM</A>.
That specification includes pretty much all the POSIX standards,
including 1003.1c, with some extensions and clarifications.<P>

<HR>
<P>

<H2><A NAME="C">C. Issues related to the C library</A></H2>

<H4><A NAME="C.1">C.1: Which version of the C library should I use
with LinuxThreads?</A></H4>

The best choice by far is glibc 2, a.k.a. libc 6. It offers very good
support for multi-threading, and LinuxThreads has been closely
integrated with glibc 2. The glibc 2 distribution contains the
sources of a specially adapted version of LinuxThreads.<P>

glibc 2 comes preinstalled as the default C library on several Linux
distributions, such as RedHat 5 and up, and Debian 2.
Those distributions include the version of LinuxThreads matching
glibc 2.<P>

<H4><A NAME="C.2">C.2: My system has libc 5 preinstalled, not glibc
2. Can I still use LinuxThreads?</A></H4>

Yes, but you're likely to run into some problems, as libc 5 only
offers minimal support for threads and contains some bugs that affect
multithreaded programs. <P>

The versions of libc 5 that work best with LinuxThreads are
libc 5.2.18 on the one hand, and libc 5.4.12 or later on the other hand.
Avoid 5.3.12 and 5.4.7: these have problems with the per-thread errno
variable. <P>

<H4><A NAME="C.3">C.3: So, should I switch to glibc 2, or stay with a
recent libc 5?</A></H4>

I'd recommend you switch to glibc 2. Even for single-threaded
programs, glibc 2 is more solid and more standard-conformant than libc
5. And the shortcomings of libc 5 almost preclude any serious
multi-threaded programming.<P>

Switching an already installed
system from libc 5 to glibc 2 is not completely straightforward.
See the <A HREF="http://sunsite.unc.edu/LDP/HOWTO/Glibc2-HOWTO.html">Glibc2
HOWTO</A> for more information. Much easier is (re-)installing a
Linux distribution based on glibc 2, such as RedHat 6.<P>

<H4><A NAME="C.4">C.4: Where can I find glibc 2 and the version of
LinuxThreads that goes with it?</A></H4>

On <code>prep.ai.mit.edu</code> and its many, many mirrors around the world.
See <A
HREF="http://www.gnu.org/order/ftp.html">http://www.gnu.org/order/ftp.html</A>
for a list of mirrors.<P>

<H4><A NAME="C.5">C.5: Where can I find libc 5 and the version of
LinuxThreads that goes with it?</A></H4>

For libc 5, see <A HREF="ftp://sunsite.unc.edu/pub/Linux/devel/GCC/"><code>ftp://sunsite.unc.edu/pub/Linux/devel/GCC/</code></A>.<P>

For the libc 5 version of LinuxThreads, see
<A HREF="ftp://ftp.inria.fr/INRIA/Projects/cristal/Xavier.Leroy/linuxthreads/">ftp://ftp.inria.fr/INRIA/Projects/cristal/Xavier.Leroy/linuxthreads/</A>.<P>

<H4><A NAME="C.6">C.6: How can I recompile the glibc 2 version of the
LinuxThreads sources?</A></H4>

You must transfer the whole glibc sources, then drop the LinuxThreads
sources in the <code>linuxthreads/</code> subdirectory, then recompile
glibc as a whole. There are now too many inter-dependencies between
LinuxThreads and glibc 2 to allow separate re-compilation of LinuxThreads.
<P>

<H4><A NAME="C.7">C.7: What is the correspondence between LinuxThreads
version numbers, libc version numbers, and RedHat version
numbers?</A></H4>

Here is a summary. (Information on Linux distributions other than
RedHat is welcome.)<P>

<TABLE>
<TR><TD>LinuxThreads </TD> <TD>C library</TD> <TD>RedHat</TD></TR>
<TR><TD>0.7, 0.71 (for libc 5)</TD> <TD>libc 5.x</TD> <TD>RH 4.2</TD></TR>
<TR><TD>0.7, 0.71 (for glibc 2)</TD> <TD>glibc 2.0.x</TD> <TD>RH 5.x</TD></TR>
<TR><TD>0.8</TD> <TD>glibc 2.1.1</TD> <TD>RH 6.0</TD></TR>
<TR><TD>0.8</TD> <TD>glibc 2.1.2</TD> <TD>not yet released</TD></TR>
</TABLE>
<P>

<HR>
<P>

<H2><A NAME="D">D. Problems, weird behaviors, potential bugs</A></H2>

<H4><A NAME="D.1">D.1: When I compile LinuxThreads, I run into problems in
file <code>libc_r/dirent.c</code></A></H4>

You probably mean:
<PRE>
        libc_r/dirent.c:94: structure has no member named `dd_lock'
</PRE>
I haven't actually seen this problem, but several users reported it.
My understanding is that something is wrong in the include files of
your Linux installation (<code>/usr/include/*</code>). Make sure
you're using a supported version of the libc 5 library. (See question <A
HREF="#C.2">C.2</A>).<P>

<H4><A NAME="D.2">D.2: When I compile LinuxThreads, I run into problems with
<CODE>/usr/include/sched.h</CODE>: there are several occurrences of
<CODE>_p</CODE> that the C compiler does not understand</A></H4>

Yes, <CODE>/usr/include/sched.h</CODE> that comes with libc 5.3.12 is broken.
Replace it with the <code>sched.h</code> file contained in the
LinuxThreads distribution. But really you should not be using libc
5.3.12 with LinuxThreads! (See question <A HREF="#C.2">C.2</A>.)<P>

<H4><A NAME="D.3">D.3: My program does <CODE>fdopen()</CODE> on a file
descriptor opened on a pipe. When I link it with LinuxThreads,
<CODE>fdopen()</CODE> always returns NULL!</A></H4>

You're using one of the buggy versions of libc (5.3.12, 5.4.7, etc).
See question <A HREF="#C.2">C.2</A> above.<P>

<H4><A NAME="D.4">D.4: My program creates a lot of threads, and after
a while <CODE>pthread_create()</CODE> no longer returns!</A></H4>

This is a known bug in the version of LinuxThreads that comes with glibc
2.1.1. An upgrade to glibc 2.1.2 is recommended. <P>

<H4><A NAME="D.5">D.5: When I'm running a program that creates N
threads, <code>top</code> or <code>ps</code>
display N+2 processes that are running my program. What do all these
processes correspond to?</A></H4>

Due to the general "one process per thread" model, there's one process
for the initial thread and N processes for the threads it created
using <CODE>pthread_create</CODE>. That leaves one process
unaccounted for. That extra process corresponds to the "thread
manager" thread, a thread created internally by LinuxThreads to handle
thread creation and thread termination. This extra thread is asleep
most of the time.<P>

<H4><A NAME="D.6">D.6: Scheduling seems to be very unfair when there
is strong contention on a mutex: instead of giving the mutex to each
thread in turn, it seems that it's almost always the same thread that
gets the mutex. Isn't this completely broken behavior?</A></H4>

That behavior has mostly disappeared in recent releases of
LinuxThreads (version 0.8 and up). It was fairly common in older
releases, though.<P>

What happens in LinuxThreads 0.7 and before is the following: when a
thread unlocks a mutex, all other threads that were waiting on the
mutex are sent a signal which makes them runnable. However, the
kernel scheduler may or may not restart them immediately. If the
thread that unlocked the mutex tries to lock it again immediately
afterwards, it is likely that it will succeed, because the threads
haven't yet restarted. This results in apparently very unfair
behavior, where the same thread repeatedly locks and unlocks the mutex,
while other threads can't lock the mutex.<P>

In LinuxThreads 0.8 and up, <code>pthread_mutex_unlock</code> restarts only
one waiting thread, and pre-assigns the mutex to that thread. Hence,
if the thread that unlocked the mutex tries to lock it again
immediately, it will block until other waiting threads have had a
chance to lock and unlock the mutex. This results in much fairer
scheduling.<P>

Notice however that even the old "unfair" behavior is perfectly
acceptable with respect to the POSIX standard: for the default
scheduling policy, POSIX makes no guarantees of fairness, such as "the
thread waiting for the mutex for the longest time always acquires it
first". Properly written multithreaded code avoids that kind of heavy
contention on mutexes, and does not run into fairness problems. If
you need scheduling guarantees, you should consider using the
real-time scheduling policies <code>SCHED_RR</code> and
<code>SCHED_FIFO</code>, which have precisely defined scheduling
behaviors. <P>
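For readers who want to try the real-time policies mentioned above, here is a hedged sketch of how a thread attribute object could be prepared for <code>SCHED_RR</code>. The helper name <code>init_round_robin_attr</code> is hypothetical, and choosing the lowest real-time priority is only an assumption made for the example. On typical Linux systems, actually creating a thread with such attributes requires appropriate privileges.

```c
#include <pthread.h>
#include <sched.h>
#include <string.h>

/* Prepares *attr so that a thread created with it requests SCHED_RR at the
   lowest real-time priority.  Returns 0 on success, an error code otherwise. */
int init_round_robin_attr(pthread_attr_t *attr)
{
    struct sched_param param;
    int err;

    if ((err = pthread_attr_init(attr)) != 0)
        return err;
    /* Without this, the new thread inherits its creator's scheduling. */
    if ((err = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED)) != 0)
        return err;
    if ((err = pthread_attr_setschedpolicy(attr, SCHED_RR)) != 0)
        return err;

    memset(&param, 0, sizeof param);
    param.sched_priority = sched_get_priority_min(SCHED_RR);
    return pthread_attr_setschedparam(attr, &param);
}
```

The resulting attribute object would then be passed by pointer as the second argument of <code>pthread_create()</code>, and released with <code>pthread_attr_destroy()</code> afterwards.<P>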

<H4><A NAME="D.7">D.7: I have a simple test program with two threads
that do nothing but <CODE>printf()</CODE> in tight loops, and from the
printout it seems that only one thread is running, the other doesn't
print anything!</A></H4>

Again, this behavior is characteristic of old releases of LinuxThreads
(0.7 and before); more recent versions (0.8 and up) should not exhibit
this behavior.<P>

The reason for this behavior is explained in
question <A HREF="#D.6">D.6</A> above: <CODE>printf()</CODE> performs
locking on <CODE>stdout</CODE>, and thus your two threads contend very
heavily for the mutex associated with <CODE>stdout</CODE>. But if you
do some real work between two calls to <CODE>printf()</CODE>, you'll
see that scheduling becomes much smoother.<P>

<H4><A NAME="D.8">D.8: I've looked at <code>&lt;pthread.h&gt;</code>
and there seems to be a gross error in the <code>pthread_cleanup_push</code>
macro: it opens a block with <code>{</code> but does not close it!
Surely you forgot a <code>}</code> at the end of the macro, right?
</A></H4>

Nope. That's the way it should be. The closing brace is provided by
the <code>pthread_cleanup_pop</code> macro. The POSIX standard
requires <code>pthread_cleanup_push</code> and
<code>pthread_cleanup_pop</code> to be used in matching pairs, at the
same level of brace nesting. This allows
<code>pthread_cleanup_push</code> to open a block in order to
stack-allocate some data structure, and
<code>pthread_cleanup_pop</code> to close that block. It's ugly, but
it's the standard way of implementing cleanup handlers.<P>
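The pairing rule can be illustrated with a short sketch. The handler <code>release_buffer</code>, the thread function <code>worker</code>, and the counter <code>cleanup_calls</code> are invented for this example; only <code>pthread_cleanup_push</code> and <code>pthread_cleanup_pop</code> are the macros discussed above.

```c
#include <pthread.h>
#include <stdlib.h>

static int cleanup_calls = 0;   /* counts handler runs, for illustration only */

/* Cleanup handler: releases the buffer passed as its argument. */
static void release_buffer(void *buf)
{
    free(buf);
    cleanup_calls++;
}

static void *worker(void *arg)
{
    char *buf = malloc(128);
    (void) arg;
    if (buf == NULL)
        return NULL;

    pthread_cleanup_push(release_buffer, buf);  /* may open a block here */
    buf[0] = '\0';             /* ... work that could be cancelled ... */
    pthread_cleanup_pop(1);    /* closes that block; 1 = run the handler now */

    return NULL;
}

/* Runs the worker once and returns how many times the handler ran. */
int run_cleanup_demo(void)
{
    pthread_t t;

    cleanup_calls = 0;
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    pthread_join(t, NULL);
    return cleanup_calls;
}
```

Removing the <code>pthread_cleanup_pop</code> line, or moving it into a nested block, would typically stop the function from compiling, which is exactly the consequence of the unbalanced brace.<P>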

<H4><A NAME="D.9">D.9: I tried to use real-time threads and my program
loops like crazy and freezes the whole machine!</A></H4>

Versions of LinuxThreads prior to 0.8 are susceptible to "livelocks"
(one thread loops, consuming 100% of the CPU time) in conjunction with
real-time scheduling. Since real-time threads and processes have
higher priority than normal Linux processes, all other processes on
the machine, including the shell, the X server, etc, cannot run and
the machine appears frozen.<P>

The problem is fixed in LinuxThreads 0.8.<P>

<H4><A NAME="D.10">D.10: My application needs to create thousands of
threads, or maybe even more. Can I do this with
LinuxThreads?</A></H4>

No. You're going to run into several hard limits:
<UL>
<LI>Each thread, from the kernel's standpoint, is one process. Stock
Linux kernels are limited to at most 512 processes for the super-user,
and half this number for regular users. This can be changed by
changing <code>NR_TASKS</code> in <code>include/linux/tasks.h</code>
and recompiling the kernel. On the x86 processors at least,
architectural constraints seem to limit <code>NR_TASKS</code> to 4090
at most.
<LI>LinuxThreads contains a table of all active threads. This table
has room for 1024 threads at most. To increase this limit, you must
change <code>PTHREAD_THREADS_MAX</code> in the LinuxThreads sources
and recompile.
<LI>By default, each thread reserves 2M of virtual memory space for
its stack. This space is just reserved; actual memory is allocated
for the stack on demand. But still, on a 32-bit processor, the total
virtual memory space available for the stacks is on the order of 1G,
meaning that more than 500 threads will have a hard time fitting in.
You can overcome this limitation by moving to a 64-bit platform, or by
allocating smaller stacks yourself using the <code>setstackaddr</code>
attribute.
<LI>Finally, the Linux kernel contains many algorithms that run in
time proportional to the number of process table entries. Increasing
this number drastically will slow down the kernel operations
noticeably.
</UL>
(Other POSIX threads libraries have similar limitations, by the way.)
For all those reasons, you'd better restructure your application so
that it doesn't need more than, say, 100 threads. For instance,
in the case of a multithreaded server, instead of creating a new
thread for each connection, maintain a fixed-size pool of worker
threads that pick incoming connection requests from a queue.<P>
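The worker-pool structure suggested above might look roughly like the following sketch. Everything here is an assumption made for illustration: the names <code>pool_worker</code> and <code>run_pool</code>, the sizes <code>POOL_SIZE</code> and <code>QUEUE_SIZE</code>, and the use of plain integers in place of real connection requests.

```c
#include <pthread.h>

#define POOL_SIZE  4     /* fixed number of worker threads (illustrative) */
#define QUEUE_SIZE 64    /* capacity of the simple, non-circular queue */

static int queue[QUEUE_SIZE];     /* pending "connections" (plain ints here) */
static int head = 0, tail = 0;    /* consume at head, produce at tail */
static int closed = 0;            /* set once no more work will arrive */
static long handled_sum = 0;      /* stand-in for real request handling */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;

static void *pool_worker(void *arg)
{
    (void) arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (head == tail && !closed)
            pthread_cond_wait(&not_empty, &lock);
        if (head == tail) {               /* queue drained and closed */
            pthread_mutex_unlock(&lock);
            return NULL;
        }
        /* A real server would unlock before doing lengthy work. */
        handled_sum += queue[head++];
        pthread_mutex_unlock(&lock);
    }
}

/* Queues jobs 1..njobs (at most QUEUE_SIZE of them), lets the pool drain
   the queue, and returns the sum of all handled jobs. */
long run_pool(int njobs)
{
    pthread_t workers[POOL_SIZE];
    int i, started = 0;

    head = tail = closed = 0;
    handled_sum = 0;
    for (i = 0; i < POOL_SIZE; i++)
        if (pthread_create(&workers[started], NULL, pool_worker, NULL) == 0)
            started++;

    pthread_mutex_lock(&lock);
    for (i = 1; i <= njobs && tail < QUEUE_SIZE; i++)
        queue[tail++] = i;
    closed = 1;
    pthread_cond_broadcast(&not_empty);
    pthread_mutex_unlock(&lock);

    for (i = 0; i < started; i++)
        pthread_join(workers[i], NULL);
    return handled_sum;
}
```

The number of threads stays at <code>POOL_SIZE</code> no matter how many requests arrive, which keeps the program well clear of the limits listed above.<P>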

<HR>
<P>

<H2><A NAME="E">E. Missing functions, wrong types, etc</A></H2>

<H4><A NAME="E.1">E.1: Where is <CODE>pthread_yield()</CODE> ? How
come LinuxThreads does not implement it?</A></H4>

Because it's not part of the (final) POSIX 1003.1c standard.
Several drafts of the standard contained <CODE>pthread_yield()</CODE>,
but then the POSIX guys discovered it was redundant with
<CODE>sched_yield()</CODE> and dropped it. So, just use
<CODE>sched_yield()</CODE> instead.
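<P>
When porting code written against one of the drafts, a small wrapper can stand in for the old call. The wrapper name <code>my_thread_yield</code> below is hypothetical; <code>sched_yield()</code> itself is declared in <code>&lt;sched.h&gt;</code>.

```c
#include <sched.h>

/* Hypothetical porting helper: gives other runnable threads a chance to
   run.  Returns 0 on success and -1 on error, like sched_yield(). */
int my_thread_yield(void)
{
    return sched_yield();
}
```

Call sites that used the draft-era function can then be switched to the wrapper, or directly to <code>sched_yield()</code>.<P>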

<H4><A NAME="E.2">E.2: I've found some type errors in
<code>&lt;pthread.h&gt;</code>.
For instance, the second argument to <CODE>pthread_create()</CODE>
should be a <CODE>pthread_attr_t</CODE>, not a
<CODE>pthread_attr_t *</CODE>. Also, didn't you forget to declare
<CODE>pthread_attr_default</CODE>?</A></H4>

No, I didn't. What you're describing is draft 4 of the POSIX
standard, which is used in OSF DCE threads. LinuxThreads conforms to the
final standard. Even though the functions have the same names as in
draft 4 and DCE, their calling conventions are slightly different. In
particular, attributes are passed by reference, not by value, and
default attributes are denoted by the NULL pointer. Since draft 4/DCE
will eventually disappear, you'd better port your program to use the
standard interface.<P>
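The two conventions of the final standard can be seen side by side in this sketch. The functions <code>double_it</code> and <code>create_both_ways</code> are invented for the example.

```c
#include <pthread.h>

/* Thread body: doubles the int that arg points to. */
static void *double_it(void *arg)
{
    int *value = arg;
    *value *= 2;
    return arg;
}

/* Starts one thread with default attributes and one with an explicit
   attribute object.  Returns 0 on success, an error code otherwise. */
int create_both_ways(int *a, int *b)
{
    pthread_t t1, t2;
    pthread_attr_t attr;
    int err;

    /* Final standard: NULL stands for "default attributes". */
    if ((err = pthread_create(&t1, NULL, double_it, a)) != 0)
        return err;

    /* Explicit attributes are initialized, then passed by reference. */
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
    err = pthread_create(&t2, &attr, double_it, b);
    pthread_attr_destroy(&attr);

    pthread_join(t1, NULL);
    if (err == 0)
        pthread_join(t2, NULL);
    return err;
}
```

Code written for draft 4 would have passed the attribute object itself, or <code>pthread_attr_default</code>, where the pointers appear here.<P>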

<H4><A NAME="E.3">E.3: I'm porting an application from Solaris and I
have to rename all thread functions from <code>thr_blah</code> to
<CODE>pthread_blah</CODE>. This is very annoying. Why did you change
all the function names?</A></H4>

POSIX did it. The <code>thr_*</code> functions correspond to Solaris
threads, an older thread interface that you'll find only under
Solaris. The <CODE>pthread_*</CODE> functions correspond to POSIX
threads, an international standard available for many, many platforms.
Even Solaris 2.5 and later support the POSIX threads interface. So,
do yourself a favor and rewrite your code to use POSIX threads: this
way, it will run unchanged under Linux, Solaris, and quite a lot of
other platforms.<P>

<H4><A NAME="E.4">E.4: How can I suspend and resume a thread from
another thread? Solaris has the <CODE>thr_suspend()</CODE> and
<CODE>thr_resume()</CODE> functions to do that; why don't you?</A></H4>

The POSIX standard provides <B>no</B> mechanism by which a thread A can
suspend the execution of another thread B, without cooperation from B.
The only way to implement a suspend/restart mechanism is to have B
periodically check some global variable for a suspend request
and then suspend itself on a condition variable, which another thread
can signal later to restart B.<P>
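A minimal sketch of that cooperative scheme follows. The function names <code>suspension_point</code>, <code>request_suspend</code> and <code>request_resume</code>, as well as the flag <code>suspend_requested</code>, are made up for this example and are not provided by LinuxThreads.

```c
#include <pthread.h>

static pthread_mutex_t susp_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  susp_cond = PTHREAD_COND_INITIALIZER;
static int suspend_requested = 0;   /* the "global variable" B checks */

/* Called by thread B at points where it is safe to stop
   (no mutexes held, data structures consistent). */
void suspension_point(void)
{
    pthread_mutex_lock(&susp_lock);
    while (suspend_requested)
        pthread_cond_wait(&susp_cond, &susp_lock);
    pthread_mutex_unlock(&susp_lock);
}

/* Called by thread A: B will block at its next suspension point. */
void request_suspend(void)
{
    pthread_mutex_lock(&susp_lock);
    suspend_requested = 1;
    pthread_mutex_unlock(&susp_lock);
}

/* Called by thread A: lets B continue if it is suspended. */
void request_resume(void)
{
    pthread_mutex_lock(&susp_lock);
    suspend_requested = 0;
    pthread_cond_broadcast(&susp_cond);
    pthread_mutex_unlock(&susp_lock);
}
```

Thread B would call <code>suspension_point()</code> once per iteration of its main loop, so it only ever stops at a place it has chosen itself.<P>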

Notice that <CODE>thr_suspend()</CODE> is inherently dangerous and
prone to race conditions. For one thing, there is no control on where
the target thread stops: it can very well be stopped in the middle of
a critical section, while holding mutexes. Also, there is no
guarantee on when the target thread will actually stop. For these
reasons, you'd be much better off using mutexes and conditions
instead. The only situations that really require the ability to
suspend a thread are debuggers and some kinds of garbage collectors.<P>

If you really must suspend a thread in LinuxThreads, you can send it a
<CODE>SIGSTOP</CODE> signal with <CODE>pthread_kill</CODE>. Send
<CODE>SIGCONT</CODE> for restarting it.
Beware, this is specific to LinuxThreads and entirely non-portable.
Indeed, a truly conforming POSIX threads implementation will stop all
threads when one thread receives the <CODE>SIGSTOP</CODE> signal!
One day, LinuxThreads will implement that behavior, and the
non-portable hack with <CODE>SIGSTOP</CODE> won't work anymore.<P>

<H4><A NAME="E.5">E.5: Does LinuxThreads implement
<CODE>pthread_attr_setstacksize()</CODE> and
<CODE>pthread_attr_setstackaddr()</CODE>?</A></H4>

These optional functions are provided in recent versions of
LinuxThreads (0.8 and up). Earlier releases did not provide these
optional components of the POSIX standard.<P>

Even if <CODE>pthread_attr_setstacksize()</CODE> and
<CODE>pthread_attr_setstackaddr()</CODE> are now provided, we still
recommend that you do not use them unless you really have strong
reasons for doing so. The default stack allocation strategy for
LinuxThreads is nearly optimal: stacks start small (4k) and
automatically grow on demand to a fairly large limit (2M).
Moreover, there is no portable way to estimate the stack requirements
of a thread, so setting the stack size yourself makes your program
less reliable and non-portable.<P>
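If you do have such a strong reason, the call sequence could look like the sketch below. The 256 KB value of <code>SMALL_STACK</code> is an arbitrary assumption for the example, not a recommendation, and the names <code>tiny_task</code> and <code>run_with_small_stack</code> are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

#define SMALL_STACK ((size_t) 256 * 1024)   /* illustrative size only */

static void *tiny_task(void *arg)
{
    *(int *) arg = 1;       /* record that the thread ran */
    return NULL;
}

/* Runs tiny_task on a SMALL_STACK-byte stack and stores the size recorded
   in the attribute object in *actual.  Returns 0 on success. */
int run_with_small_stack(int *ran, size_t *actual)
{
    pthread_attr_t attr;
    pthread_t t;
    int err;

    if ((err = pthread_attr_init(&attr)) != 0)
        return err;
    err = pthread_attr_setstacksize(&attr, SMALL_STACK);
    if (err == 0)
        err = pthread_attr_getstacksize(&attr, actual);
    if (err == 0)
        err = pthread_create(&t, &attr, tiny_task, ran);
    pthread_attr_destroy(&attr);
    if (err == 0)
        err = pthread_join(t, NULL);
    return err;
}
```

A thread that needs more stack than the size chosen here will crash rather than grow its stack, which is the reliability risk mentioned above.<P>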

<H4><A NAME="E.6">E.6: LinuxThreads does not support the
<CODE>PTHREAD_SCOPE_PROCESS</CODE> value of the "contentionscope"
attribute. Why? </A></H4>

With a "one-to-one" model, as in LinuxThreads (one kernel execution
context per thread), there is only one scheduler for all processes and
all threads on the system. So, there is no way to obtain the behavior of
<CODE>PTHREAD_SCOPE_PROCESS</CODE>.<P>

<H4><A NAME="E.7">E.7: LinuxThreads does not implement process-shared
mutexes, conditions, and semaphores. Why?</A></H4>

This is another optional component of the POSIX standard. Portable
applications should test <CODE>_POSIX_THREAD_PROCESS_SHARED</CODE>
before using this facility.
<P>
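Such a test might be written as in the following sketch, which combines the compile-time macro with a run-time <code>sysconf()</code> query. The function name <code>have_process_shared</code> is hypothetical.

```c
#include <unistd.h>

/* Returns 1 if process-shared synchronization objects appear to be
   supported, 0 otherwise. */
int have_process_shared(void)
{
#if defined(_POSIX_THREAD_PROCESS_SHARED) && _POSIX_THREAD_PROCESS_SHARED > 0
    return 1;                  /* the headers promise support */
#elif defined(_SC_THREAD_PROCESS_SHARED)
    /* Undecided at compile time: ask the running system. */
    return sysconf(_SC_THREAD_PROCESS_SHARED) > 0;
#else
    return 0;
#endif
}
```

A portable program would fall back to traditional interprocess communication when this returns 0.<P>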
|
543 |
|
|
The goal of this extension is to allow different processes (with
|
544 |
|
|
different address spaces) to synchronize through mutexes, conditions
|
545 |
|
|
or semaphores allocated in shared memory (either SVR4 shared memory
|
546 |
|
|
segments or <CODE>mmap()</CODE>ed files).
|
547 |
|
|
<P>
|
548 |
|
|
The reason why this does not work in LinuxThreads is that mutexes,
|
549 |
|
|
conditions, and semaphores are not self-contained: their waiting
|
550 |
|
|
queues contain pointers to linked lists of thread descriptors, and
|
551 |
|
|
these pointers are meaningful only in one address space.
|
552 |
|
|
<P>
|
553 |
|
|
Matt Messier and I spent a significant amount of time trying to design a
|
554 |
|
|
suitable mechanism for sharing waiting queues between processes. We
|
555 |
|
|
came up with several solutions that combined two of the following
|
556 |
|
|
three desirable features, but none that combines all three:
|
557 |
|
|
<UL>
|
558 |
|
|
<LI>allow sharing between processes having different UIDs
|
559 |
|
|
<LI>supports cancellation
|
560 |
|
|
<LI>supports <CODE>pthread_cond_timedwait</CODE>
|
561 |
|
|
</UL>
|
562 |
|
|
We concluded that kernel support is required to share mutexes,
|
563 |
|
|
conditions and semaphores between processes. That's one place where
|
564 |
|
|
Linus Torvalds's intuition that "all we need in the kernel is
|
565 |
|
|
<CODE>clone()</CODE>" fails.
|
566 |
|
|
<P>
|
567 |
|
|
Until suitable kernel support is available, you'd better use
|
568 |
|
|
traditional interprocess communications to synchronize different
|
569 |
|
|
processes: System V semaphores and message queues, or pipes, or sockets.
|
570 |
|
|
<P>
|
571 |
|
|
|
572 |
|
|
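As an illustration of the pipe-based alternative mentioned above, here is
a minimal sketch in which a parent process blocks until its child reports
an event by writing one byte to a pipe. The function name
<CODE>wait_for_child_event()</CODE> and the token value are hypothetical,
chosen only for this example.<P>

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical example: the parent waits for the child to signal an
   event through a pipe.  Returns 0 on success, -1 on failure. */
int wait_for_child_event(void)
{
    int fds[2];
    char token = 0;
    ssize_t n;
    int status;
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: do some work, then post the event by writing one byte. */
        close(fds[0]);
        token = 'x';
        n = write(fds[1], &token, 1);
        _exit(n == 1 ? 0 : 1);
    }

    /* Parent: read() blocks until the child has written its byte. */
    close(fds[1]);
    do {
        n = read(fds[0], &token, 1);
    } while (n == -1 && errno == EINTR);
    close(fds[0]);

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (n != 1 || token != 'x')
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The same pattern works between unrelated processes if the pipe is
replaced by a named pipe or a socket.<P>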
<HR>
<P>

<H2><A NAME="F">F. C++ issues</A></H2>

<H4><A NAME="F.1">F.1: Are there C++ wrappers for LinuxThreads?</A></H4>

Douglas Schmidt's ACE library contains, among a lot of other
things, C++ wrappers for LinuxThreads and quite a number of other
thread libraries. Check out
<A HREF="http://www.cs.wustl.edu/~schmidt/ACE.html">http://www.cs.wustl.edu/~schmidt/ACE.html</A><P>

<H4><A NAME="F.2">F.2: I'm trying to use LinuxThreads from a C++
program, and the compiler complains about the third argument to
<CODE>pthread_create()</CODE> !</A></H4>

You're probably trying to pass a non-static class member function or some
other C++ thing as third argument to <CODE>pthread_create()</CODE>.
Recall that <CODE>pthread_create()</CODE> is a C function, and it must
be passed a C function as third argument, that is, a plain function
taking a <CODE>void *</CODE> and returning a <CODE>void *</CODE>.<P>

<H4><A NAME="F.3">F.3: I'm trying to use LinuxThreads in conjunction
with libg++, and I'm having all sorts of trouble.</A></H4>

From what I understand, thread support in libg++ is completely broken,
especially with respect to locking of iostreams. H.J.Lu wrote:
<BLOCKQUOTE>
If you want to use thread, I can only suggest egcs and glibc. You
can find egcs at
<A HREF="http://www.cygnus.com/egcs">http://www.cygnus.com/egcs</A>.
egcs has libstdc++, which is MT safe under glibc 2. If you really
want to use the libg++, I have a libg++ add-on for egcs.
</BLOCKQUOTE>
<HR>
<P>

<H2><A NAME="G">G. Debugging LinuxThreads programs</A></H2>

<H4><A NAME="G.1">G.1: Can I debug LinuxThreads programs using gdb?</A></H4>

Yes, but not with the stock gdb 4.17. You need a specially patched
version of gdb 4.17 developed by Eric Paire and colleagues at The Open
Group, Grenoble. The patches against gdb 4.17 are available at
<A HREF="http://www.gr.opengroup.org/java/jdk/linux/debug.htm"><code>http://www.gr.opengroup.org/java/jdk/linux/debug.htm</code></A>.
Precompiled binaries of the patched gdb are available in RedHat's RPM
format at <A
HREF="http://odin.appliedtheory.com/"><code>http://odin.appliedtheory.com/</code></A>.<P>

Some Linux distributions provide an already-patched version of gdb;
others don't. For instance, the gdb in RedHat 5.2 is thread-aware,
but apparently not the one in RedHat 6.0. Just ask (politely) the
makers of your Linux distribution to please make sure that they apply
the correct patches to gdb.<P>

<H4><A NAME="G.2">G.2: Does it work with post-mortem debugging?</A></H4>

Not very well. Generally, the core file does not correspond to the
thread that crashed. The reason is that the kernel will not dump core
for a process that shares its memory with other processes, such as the
other threads of your program. So, the thread that crashes silently
disappears without generating a core file. Then, all other threads of
your program die on the same signal that killed the crashing thread.
(This is required behavior according to the POSIX standard.) The last
one that dies is no longer sharing its memory with anyone else, so the
kernel generates a core file for that thread. Unfortunately, that's
not the thread you are interested in.

<H4><A NAME="G.3">G.3: Any other ways to debug multithreaded programs, then?</A></H4>

Assertions and <CODE>printf()</CODE> are your best friends. Try to debug
sequential parts in a single-threaded program first. Then, put
<CODE>printf()</CODE> statements all over the place to get execution traces.
Also, check invariants often with the <CODE>assert()</CODE> macro. In truth,
there is no other effective way (save for a full formal proof of your
program) to track down concurrency bugs. Debuggers are not really
effective for subtle concurrency problems, because they disrupt
program execution too much.<P>

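As a rough sketch of this style of debugging, the fragment below tags each
trace line with the calling thread and checks an invariant on a shared
counter while the mutex is held. The <CODE>TRACE</CODE> macro, the worker
function and the constants are hypothetical; printing
<CODE>pthread_self()</CODE> as an integer assumes that
<CODE>pthread_t</CODE> is an integer type, as it is on Linux.<P>

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define NITER 1000

/* Hypothetical trace macro: one line per event, tagged with the thread. */
#define TRACE(msg) \
    fprintf(stderr, "[thread %lu] %s:%d: %s\n", \
            (unsigned long) pthread_self(), __FILE__, __LINE__, (msg))

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;

static void *worker(void *arg)
{
    int i;

    (void) arg;
    TRACE("worker starting");
    for (i = 0; i < NITER; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;
        /* Invariant checked while the lock is held. */
        assert(counter > 0 && counter <= (long) NTHREADS * NITER);
        pthread_mutex_unlock(&counter_lock);
    }
    TRACE("worker done");
    return NULL;
}

/* Runs the workers and returns the final value of the counter. */
long run_traced_workers(void)
{
    pthread_t tids[NTHREADS];
    int i, rc;

    counter = 0;
    for (i = 0; i < NTHREADS; i++) {
        rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (i = 0; i < NTHREADS; i++) {
        rc = pthread_join(tids[i], NULL);
        assert(rc == 0);
    }
    assert(counter == (long) NTHREADS * NITER);
    return counter;
}
```

Writing the traces to <CODE>stderr</CODE>, which is unbuffered by default,
keeps them from being lost if the program aborts on a failed assertion.<P>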
<HR>
<P>

<H2><A NAME="H">H. Compiling multithreaded code; errno madness</A></H2>

<H4><A NAME="H.1">H.1: You say all multithreaded code must be compiled
with <CODE>_REENTRANT</CODE> defined. What difference does it make?</A></H4>

It affects include files in three ways:
<UL>
<LI> The include files define prototypes for the reentrant variants of
some of the standard library functions,
e.g. <CODE>gethostbyname_r()</CODE> as a reentrant equivalent to
<CODE>gethostbyname()</CODE>.<P>

<LI> If <CODE>_REENTRANT</CODE> is defined, some
<code>&lt;stdio.h&gt;</code> functions are no longer defined as macros,
e.g. <CODE>getc()</CODE> and <CODE>putc()</CODE>. In a multithreaded
program, stdio functions require additional locking, which the macros
don't perform, so we must call functions instead.<P>

<LI> More importantly, <code>&lt;errno.h&gt;</code> redefines errno when
<CODE>_REENTRANT</CODE> is
defined, so that errno refers to the thread-specific errno location
rather than the global errno variable. This is achieved by the
following <code>#define</code> in <code>&lt;errno.h&gt;</code>:
<PRE>
        #define errno (*(__errno_location()))
</PRE>
which causes each reference to errno to call the
<CODE>__errno_location()</CODE> function for obtaining the location
where error codes are stored. libc provides a default definition of
<CODE>__errno_location()</CODE> that always returns
<code>&amp;errno</code> (the address of the global errno variable). Thus,
for programs not linked with LinuxThreads, defining
<CODE>_REENTRANT</CODE> makes no difference w.r.t. errno processing.
But LinuxThreads redefines <CODE>__errno_location()</CODE> to return a
location in the thread descriptor reserved for holding the current
value of errno for the calling thread. Thus, each thread operates on
a different errno location.
</UL>
<P>

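The per-thread errno location can be observed directly. The sketch below,
meant to be compiled with <CODE>-D_REENTRANT</CODE> and linked with the
thread library, records the address of errno in a second thread and
compares it with the address seen by the caller. The function names are
hypothetical.<P>

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Thread body: report where this thread's errno lives. */
static void *record_errno_address(void *arg)
{
    int **slot = arg;

    *slot = &errno;
    return NULL;
}

/* Hypothetical check: returns 1 if a newly created thread uses a
   different errno location than the caller, 0 if both share one
   location, and -1 if the thread could not be run. */
int errno_is_per_thread(void)
{
    pthread_t tid;
    int *other = NULL;

    if (pthread_create(&tid, NULL, record_errno_address, &other) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;

    /* Only the addresses are compared; the other thread's location is
       never dereferenced after that thread has terminated. */
    return (other != NULL && other != &errno) ? 1 : 0;
}
```
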
<H4><A NAME="H.2">H.2: Why is it so important that each thread has its
own errno variable? </A></H4>

If all threads were to store error codes in the same, global errno
variable, then the value of errno after a system call or library
function returns would be unpredictable: between the time a system
call stores its error code in the global errno and your code inspects
errno to see which error occurred, another thread might have stored
another error code in the same errno location. <P>

<H4><A NAME="H.3">H.3: What happens if I link LinuxThreads with code
not compiled with <CODE>-D_REENTRANT</CODE>?</A></H4>

Lots of trouble. If the code uses <CODE>getc()</CODE> or
<CODE>putc()</CODE>, it will perform I/O without proper interlocking
of the stdio buffers; this can cause lost output, duplicate output, or
just crash other stdio functions. If the code consults errno, it will
get back the wrong error code. The following code fragment is a
typical example:
<PRE>
        do {
          r = read(fd, buf, n);
          if (r == -1) {
            if (errno == EINTR)   /* an error we can handle */
              continue;
            else {                /* other errors are fatal */
              perror("read failed");
              exit(100);
            }
          }
        } while (...);
</PRE>
Assume this code is not compiled with <CODE>-D_REENTRANT</CODE>, and
linked with LinuxThreads. At run-time, <CODE>read()</CODE> is
interrupted. Since the C library was compiled with
<CODE>-D_REENTRANT</CODE>, <CODE>read()</CODE> stores its error code
in the location pointed to by <CODE>__errno_location()</CODE>, which
is the thread-local errno variable. Then, the code above sees that
<CODE>read()</CODE> returns -1 and looks up errno. Since
<CODE>_REENTRANT</CODE> is not defined, the reference to errno
accesses the global errno variable, which is most likely 0. Hence the
code concludes that it cannot handle the error and stops.<P>

<H4><A NAME="H.4">H.4: With LinuxThreads, I can no longer use the signals
<code>SIGUSR1</code> and <code>SIGUSR2</code> in my programs! Why? </A></H4>

The short answer is: because the Linux kernel you're using does not
support realtime signals. <P>

LinuxThreads needs two signals for its internal operation.
One is used to suspend and restart threads blocked on mutex, condition
or semaphore operations. The other is used for thread
cancellation.<P>

On "old" kernels (2.0 and early 2.1 kernels), there are only 32
signals available and the kernel reserves all of them but two:
<code>SIGUSR1</code> and <code>SIGUSR2</code>. So, LinuxThreads has
no choice but to use those two signals.<P>

On recent kernels (2.2 and up), more than 32 signals are provided in
the form of realtime signals. When run on one of those kernels,
LinuxThreads uses two reserved realtime signals for its internal
operation, thus leaving <code>SIGUSR1</code> and <code>SIGUSR2</code>
free for user code. (This works only with glibc, not with libc 5.) <P>

<H4><A NAME="H.5">H.5: Is the stack of one thread visible from the
other threads? Can I pass a pointer into my stack to other threads?
</A></H4>

Yes, you can -- if you're very careful. The stacks are indeed visible
from all threads in the system. Some non-POSIX thread libraries seem
to map the stacks for all threads at the same virtual addresses and
change the memory mapping when they switch from one thread to
another. But this is not the case for LinuxThreads, as it would make
context switching between threads more expensive, and at any rate
might not conform to the POSIX standard.<P>

So, you can take the address of an "auto" variable and pass it to
other threads via shared data structures. However, you need to make
absolutely sure that the function doing this will not return as long
as other threads need to access this address. It's the usual mistake
of returning the address of an "auto" variable, only made much worse
because of concurrency. It's much, much safer to systematically
heap-allocate all shared data structures. <P>

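A minimal sketch of the heap-allocation approach: the data shared with
the new thread lives in a <CODE>malloc()</CODE>ed block whose lifetime
does not depend on any stack frame, and the block is freed only after
<CODE>pthread_join()</CODE> has returned. The <CODE>struct job</CODE>
type and the function names are hypothetical.<P>

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical unit of work shared between two threads. */
struct job {
    int input;
    int result;
};

static void *square_job(void *arg)
{
    struct job *j = arg;

    j->result = j->input * j->input;
    return j;               /* hand the block back through pthread_join() */
}

/* Computes input * input in another thread.  Returns 0 on success. */
int square_in_thread(int input, int *result)
{
    pthread_t tid;
    void *ret;
    struct job *j = malloc(sizeof *j);

    if (j == NULL)
        return -1;
    j->input = input;
    j->result = 0;

    if (pthread_create(&tid, NULL, square_job, j) != 0) {
        free(j);
        return -1;
    }
    if (pthread_join(tid, &ret) != 0)
        return -1;          /* do not free: the thread may still use j */

    *result = ((struct job *) ret)->result;
    free(ret);              /* safe: the thread has terminated */
    return 0;
}
```
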
<HR>
<P>

<H2><A NAME="I">I. X-Windows and other libraries</A></H2>

<H4><A NAME="I.1">I.1: My program uses both Xlib and LinuxThreads.
It stops very early with an "Xlib: unknown 0 error" message. What
does this mean? </A></H4>

That's a prime example of the errno problem described in question <A
HREF="#H.2">H.2</A>. The binaries for Xlib you're using have not been
compiled with <CODE>-D_REENTRANT</CODE>. It happens Xlib contains a
piece of code very much like the one in question <A
HREF="#H.3">H.3</A>. So, your Xlib fetches the error code from the
wrong errno location and concludes that an error it cannot handle
occurred.<P>

<H4><A NAME="I.2">I.2: So, what can I do to build a multithreaded X
Windows client? </A></H4>

The best solution is to use X libraries that have been compiled with
multithreading options set. Linux distributions that come with glibc
2 as the main C library generally provide thread-safe X libraries.
At least, that seems to be the case for RedHat 5 and later.<P>

You can try to recompile the X libraries yourself with multithreading
options set. They contain optional support for multithreading; it's
just that the binaries provided by your Linux distribution were built
without this support. See the file <code>README.Xfree3.3</code> in
the LinuxThreads distribution for patches and info on how to compile
thread-safe X libraries from the Xfree3.3 distribution. The Xfree3.3
sources are readily available in most Linux distributions, e.g. as a
source RPM for RedHat. Be warned, however, that X Windows is a huge
system, and recompiling even just the libraries takes a lot of time
and disk space.<P>

Another, less involved solution is to call X functions only from the
main thread of your program. Even if all threads have their own errno
location, the main thread uses the global errno variable for its errno
location. Thus, code not compiled with <code>-D_REENTRANT</code>
still "sees" the right error values if it executes in the main thread
only. <P>

<H4><A NAME="I.2">This is a lot of work. Don't you have precompiled
thread-safe X libraries that you could distribute?</A></H4>

No, I don't. Sorry. But consider installing a Linux distribution
that comes with thread-safe X libraries, such as RedHat 6.<P>

<H4><A NAME="I.3">I.3: Can I use library FOO in a multithreaded
program?</A></H4>

Most libraries cannot be used "as is" in a multithreaded program.
For one thing, they are not necessarily thread-safe: calling
two functions of the library simultaneously from two threads might not
work, due to internal use of global variables and the like. Second,
the libraries must have been compiled with <CODE>-D_REENTRANT</CODE> to avoid
the errno problems explained in question <A HREF="#H.2">H.2</A>.
<P>

<H4><A NAME="I.4">I.4: What if I make sure that only one thread calls
functions in these libraries?</A></H4>

This avoids problems with the library not being thread-safe. But
you're still vulnerable to errno problems. At the very least, a
recompile of the library with <CODE>-D_REENTRANT</CODE> is needed.
<P>

<H4><A NAME="I.5">I.5: What if I make sure that only the main thread
calls functions in these libraries?</A></H4>

That might actually work. As explained in question <A HREF="#I.2">I.2</A>,
the main thread uses the global errno variable, and can therefore
execute code not compiled with <CODE>-D_REENTRANT</CODE>.<P>

<H4><A NAME="I.6">I.6: SVGAlib doesn't work with LinuxThreads. Why?
</A></H4>

Because both LinuxThreads and SVGAlib use the signals
<code>SIGUSR1</code> and <code>SIGUSR2</code>. See question <A
HREF="#H.4">H.4</A>.
<P>

<HR>
<P>

<H2><A NAME="J">J. Signals and threads</A></H2>

<H4><A NAME="J.1">J.1: When it comes to signals, what is shared
between threads and what isn't?</A></H4>

Signal handlers are shared between all threads: when a thread calls
<CODE>sigaction()</CODE>, it sets how the signal is handled not only
for itself, but for all other threads in the program as well.<P>

On the other hand, signal masks are per-thread: each thread chooses
which signals it blocks independently of others. At thread creation
time, the newly created thread inherits the signal mask of the thread
calling <CODE>pthread_create()</CODE>. But afterwards, the new thread
can modify its signal mask independently of its creator thread.<P>

<H4><A NAME="J.2">J.2: When I send a <CODE>SIGKILL</CODE> to a
particular thread using <CODE>pthread_kill</CODE>, all my threads are
killed!</A></H4>

That's how it should be. The POSIX standard mandates that all threads
should terminate when the process (i.e. the collection of all threads
running the program) receives a signal whose effect is to
terminate the process (such as <CODE>SIGKILL</CODE> or <CODE>SIGINT</CODE>
when no handler is installed on that signal). This behavior makes a
lot of sense: when you type "ctrl-C" at the keyboard, or when a thread
crashes on a division by zero or a segmentation fault, you really want
all threads to stop immediately, not just the one that caused the
segmentation violation or that got the <CODE>SIGINT</CODE> signal.
(This assumes default behavior for those signals; see question
<A HREF="#J.3">J.3</A> if you install handlers for those signals.)<P>

If you're trying to terminate a thread without bringing the whole
process down, use <code>pthread_cancel()</code>.<P>

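A minimal sketch of terminating a single thread with
<CODE>pthread_cancel()</CODE>, assuming the target thread uses the
default (deferred) cancellation type and regularly reaches a cancellation
point such as <CODE>sleep()</CODE>. The function names are
hypothetical.<P>

```c
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* A thread that never terminates on its own. */
static void *idle_worker(void *arg)
{
    (void) arg;
    for (;;)
        sleep(1);           /* sleep() is a cancellation point */
    return NULL;            /* not reached */
}

/* Hypothetical example: start a thread, cancel it, and check that it
   really terminated through cancellation.  Returns 0 on success. */
int cancel_one_thread(void)
{
    pthread_t tid;
    void *ret;

    if (pthread_create(&tid, NULL, idle_worker, NULL) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)
        return -1;
    if (pthread_join(tid, &ret) != 0)
        return -1;

    /* A cancelled thread reports PTHREAD_CANCELED as its exit value. */
    return (ret == PTHREAD_CANCELED) ? 0 : -1;
}
```

The other threads of the program keep running; only the cancelled thread
terminates.<P>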
<H4><A NAME="J.3">J.3: I've installed a handler on a signal. Which
thread executes the handler when the signal is received?</A></H4>

If the signal is generated by a thread during its execution (e.g. a
thread executes a division by zero and thus generates a
<CODE>SIGFPE</CODE> signal), then the handler is executed by that
thread. This also applies to signals generated by
<CODE>raise()</CODE>.<P>

If the signal is sent to a particular thread using
<CODE>pthread_kill()</CODE>, then that thread executes the handler.<P>

If the signal is sent via <CODE>kill()</CODE> or the tty interface
(e.g. by pressing ctrl-C), then the POSIX specs say that the handler
is executed by any thread in the process that does not currently block
the signal. In other terms, POSIX considers that the signal is sent
to the process (the collection of all threads) as a whole, and any
thread that is not blocking this signal can then handle it.<P>

The latter case is where LinuxThreads departs from the POSIX specs.
In LinuxThreads, there is no real notion of "the process as a whole":
in the kernel, each thread is really a distinct process with a
distinct PID, and signals sent to the PID of a thread can only be
handled by that thread. As long as no thread is blocking the signal,
the behavior conforms to the standard: one (unspecified) thread of the
program handles the signal. But if the thread to whose PID the signal
is sent blocks the signal, and some other thread does not block the
signal, then LinuxThreads will simply queue the signal in
that thread and execute the handler only when that thread unblocks
the signal, instead of executing the handler immediately in the other
thread that does not block the signal.<P>

This is to be viewed as a LinuxThreads bug, but I currently don't see
any way to implement the POSIX behavior without kernel support.<P>

<H4><A NAME="J.3">J.3: How shall I go about mixing signals and threads
in my program? </A></H4>

The less you mix them, the better. Notice that none of the
<CODE>pthread_*</CODE> functions is async-signal safe, meaning
that you should not call them from signal handlers. This
recommendation is not to be taken lightly: your program can deadlock
if you call a <CODE>pthread_*</CODE> function from a signal handler!
<P>

The only sensible things you can do from a signal handler are to set a
global flag, or call <CODE>sem_post</CODE> on a semaphore, to record
the delivery of the signal. The remainder of the program can then
either poll the global flag, or use <CODE>sem_wait()</CODE> and
<CODE>sem_trywait()</CODE> on the semaphore.<P>

Another option is to do nothing in the signal handler, and dedicate
one thread (preferably the initial thread) to wait synchronously for
signals, using <CODE>sigwait()</CODE>, and send messages to the other
threads accordingly.

<H4><A NAME="J.4">J.4: When one thread is blocked in
<CODE>sigwait()</CODE>, other threads no longer receive the signals
<CODE>sigwait()</CODE> is waiting for! What happens? </A></H4>

It's an unfortunate consequence of how LinuxThreads implements
<CODE>sigwait()</CODE>. Basically, it installs signal handlers on all
signals waited for, in order to record which signal was received.
Since signal handlers are shared with the other threads, this
temporarily deactivates any signal handlers you might have previously
installed on these signals.<P>

Though surprising, this behavior actually seems to conform to the
POSIX standard. According to POSIX, <CODE>sigwait()</CODE> is
guaranteed to work as expected only if all other threads in the
program block the signals waited for (otherwise, the signals could be
delivered to other threads than the one doing <CODE>sigwait()</CODE>,
which would make <CODE>sigwait()</CODE> useless). In this particular
case, the problem described in this question does not appear.<P>

One day, <CODE>sigwait()</CODE> will be implemented in the kernel,
along with other POSIX 1003.1b extensions, and <CODE>sigwait()</CODE>
will have a more natural behavior (as well as better performance).<P>

<HR>
<P>

<H2><A NAME="K">K. Internals of LinuxThreads</A></H2>

<H4><A NAME="K.1">K.1: What is the implementation model for
LinuxThreads?</A></H4>

LinuxThreads follows the so-called "one-to-one" model: each thread is
actually a separate process in the kernel. The kernel scheduler takes
care of scheduling the threads, just like it schedules regular
processes. The threads are created with the Linux
<code>clone()</code> system call, which is a generalization of
<code>fork()</code> allowing the new process to share the memory
space, file descriptors, and signal handlers of the parent.<P>

Advantages of the "one-to-one" model include:
<UL>
<LI> minimal overhead on CPU-intensive multiprocessing (with
about one thread per processor);
<LI> minimal overhead on I/O operations;
<LI> a simple and robust implementation (the kernel scheduler does
most of the hard work for us).
</UL>
The main disadvantage is more expensive context switches on mutex and
condition operations, which must go through the kernel. This is
mitigated by the fact that context switches in the Linux kernel are
pretty efficient.<P>

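To make the role of <CODE>clone()</CODE> concrete, here is a rough sketch
that uses the glibc <CODE>clone()</CODE> wrapper directly to start a
process sharing the parent's memory, file descriptors and signal
handlers. It is an illustration only and not the code LinuxThreads uses:
the flag set, the stack size and the function names are assumptions made
for this example, and it assumes an architecture where the stack grows
downwards.<P>

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE         /* needed for clone() in <sched.h> */
#endif
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)    /* arbitrary size for the sketch */

/* Runs in the new process, on its own stack, in the parent's memory. */
static int child_main(void *arg)
{
    int *shared = arg;

    *shared = 42;           /* visible to the parent because of CLONE_VM */
    return 0;
}

/* Hypothetical example: returns the value written by the child (42),
   or -1 on error. */
int clone_shares_memory(void)
{
    int shared = 0;
    int status;
    pid_t pid;
    char *stack = malloc(CHILD_STACK_SIZE);

    if (stack == NULL)
        return -1;

    /* The stack grows downwards, so pass the top of the block.
       SIGCHLD makes the child waitable with plain waitpid(). */
    pid = clone(child_main, stack + CHILD_STACK_SIZE,
                CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | SIGCHLD,
                &shared);
    if (pid == -1) {
        free(stack);
        return -1;
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;          /* child may still be running: keep its stack */

    free(stack);
    return shared;
}
```

With <CODE>fork()</CODE> instead of <CODE>clone()</CODE>, the child would
write to its own copy of the variable and the parent would still see 0.<P>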
<H4><A NAME="K.2">K.2: Have you considered other implementation
models?</A></H4>

There are basically two other models. The "many-to-one" model
relies on a user-level scheduler that context-switches between the
threads entirely in user code; viewed from the kernel, there is only
one process running. This model is completely out of the question for
me, since it does not take advantage of multiprocessors, and requires
unholy magic to handle blocking I/O operations properly. There are
several user-level thread libraries available for Linux, but I found
all of them deficient in functionality, performance, and/or robustness.
<P>

The "many-to-many" model combines both kernel-level and user-level
scheduling: several kernel-level threads run concurrently, each
executing a user-level scheduler that selects between user threads.
Most commercial Unix systems (Solaris, Digital Unix, IRIX) implement
POSIX threads this way. This model combines the advantages of both
the "many-to-one" and the "one-to-one" models, and is attractive
because it avoids the worst-case behaviors of both models --
especially on kernels where context switches are expensive, such as
Digital Unix. Unfortunately, it is pretty complex to implement, and
requires kernel support which Linux does not provide. Linus Torvalds
and other Linux kernel developers have always been pushing the
"one-to-one" model in the name of overall simplicity, and are doing a
pretty good job of making kernel-level context switches between
threads efficient. LinuxThreads is just following the general
direction they set.<P>

<HR>
<ADDRESS>Xavier.Leroy@inria.fr</ADDRESS>
</BODY>
</HTML>