| 1 |
721 |
jeremybenn |
Copyright (c) 1988, 1989 Hans-J. Boehm, Alan J. Demers
|
| 2 |
|
|
Copyright (c) 1991-1996 by Xerox Corporation. All rights reserved.
|
| 3 |
|
|
Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved.
|
| 4 |
|
|
Copyright (c) 1999-2004 Hewlett-Packard Development Company, L.P.
|
| 5 |
|
|
|
| 6 |
|
|
The file linux_threads.c is also
|
| 7 |
|
|
Copyright (c) 1998 by Fergus Henderson. All rights reserved.
|
| 8 |
|
|
|
| 9 |
|
|
The files Makefile.am, and configure.in are
|
| 10 |
|
|
Copyright (c) 2001 by Red Hat Inc. All rights reserved.
|
| 11 |
|
|
|
| 12 |
|
|
Several files supporting GNU-style builds are copyrighted by the Free
|
| 13 |
|
|
Software Foundation, and carry a different license from that given
|
| 14 |
|
|
below.
|
| 15 |
|
|
|
| 16 |
|
|
THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED
|
| 17 |
|
|
OR IMPLIED. ANY USE IS AT YOUR OWN RISK.
|
| 18 |
|
|
|
| 19 |
|
|
Permission is hereby granted to use or copy this program
|
| 20 |
|
|
for any purpose, provided the above notices are retained on all copies.
|
| 21 |
|
|
Permission to modify the code and to distribute modified code is granted,
|
| 22 |
|
|
provided the above notices are retained, and a notice that the code was
|
| 23 |
|
|
modified is included with the above copyright notice.
|
| 24 |
|
|
|
| 25 |
|
|
A few of the files needed to use the GNU-style build procedure come with
|
| 26 |
|
|
slightly different licenses, though they are all similar in spirit. A few
|
| 27 |
|
|
are GPL'ed, but with an exception that should cover all uses in the
|
| 28 |
|
|
collector. (If you are concerned about such things, I recommend you look
|
| 29 |
|
|
at the notice in config.guess or ltmain.sh.)
|
| 30 |
|
|
|
| 31 |
|
|
This is version 6.6 of a conservative garbage collector for C and C++.
|
| 32 |
|
|
|
| 33 |
|
|
You might find a more recent version of this at
|
| 34 |
|
|
|
| 35 |
|
|
http://www.hpl.hp.com/personal/Hans_Boehm/gc
|
| 36 |
|
|
|
| 37 |
|
|
OVERVIEW
|
| 38 |
|
|
|
| 39 |
|
|
This is intended to be a general purpose, garbage collecting storage
|
| 40 |
|
|
allocator. The algorithms used are described in:
|
| 41 |
|
|
|
| 42 |
|
|
Boehm, H., and M. Weiser, "Garbage Collection in an Uncooperative Environment",
|
| 43 |
|
|
Software Practice & Experience, September 1988, pp. 807-820.
|
| 44 |
|
|
|
| 45 |
|
|
Boehm, H., A. Demers, and S. Shenker, "Mostly Parallel Garbage Collection",
|
| 46 |
|
|
Proceedings of the ACM SIGPLAN '91 Conference on Programming Language Design
|
| 47 |
|
|
and Implementation, SIGPLAN Notices 26, 6 (June 1991), pp. 157-164.
|
| 48 |
|
|
|
| 49 |
|
|
Boehm, H., "Space Efficient Conservative Garbage Collection", Proceedings
|
| 50 |
|
|
of the ACM SIGPLAN '91 Conference on Programming Language Design and
|
| 51 |
|
|
Implementation, SIGPLAN Notices 28, 6 (June 1993), pp. 197-206.
|
| 52 |
|
|
|
| 53 |
|
|
Boehm H., "Reducing Garbage Collector Cache Misses", Proceedings of the
|
| 54 |
|
|
2000 International Symposium on Memory Management.
|
| 55 |
|
|
|
| 56 |
|
|
Possible interactions between the collector and optimizing compilers are
|
| 57 |
|
|
discussed in
|
| 58 |
|
|
|
| 59 |
|
|
Boehm, H., and D. Chase, "A Proposal for GC-safe C Compilation",
|
| 60 |
|
|
The Journal of C Language Translation 4, 2 (December 1992).
|
| 61 |
|
|
|
| 62 |
|
|
and
|
| 63 |
|
|
|
| 64 |
|
|
Boehm H., "Simple GC-safe Compilation", Proceedings
|
| 65 |
|
|
of the ACM SIGPLAN '96 Conference on Programming Language Design and
|
| 66 |
|
|
Implementation.
|
| 67 |
|
|
|
| 68 |
|
|
(Some of these are also available from
|
| 69 |
|
|
http://www.hpl.hp.com/personal/Hans_Boehm/papers/, among other places.)
|
| 70 |
|
|
|
| 71 |
|
|
Unlike the collector described in the second reference, this collector
|
| 72 |
|
|
operates either with the mutator stopped during the entire collection
|
| 73 |
|
|
(default) or incrementally during allocations. (The latter is supported
|
| 74 |
|
|
on only a few machines.) On the most common platforms, it can be built
|
| 75 |
|
|
with or without thread support. On a few platforms, it can take advantage
|
| 76 |
|
|
of a multiprocessor to speed up garbage collection.
|
| 77 |
|
|
|
| 78 |
|
|
Many of the ideas underlying the collector have previously been explored
|
| 79 |
|
|
by others. Notably, some of the run-time systems developed at Xerox PARC
|
| 80 |
|
|
in the early 1980s conservatively scanned thread stacks to locate possible
|
| 81 |
|
|
pointers (cf. Paul Rovner, "On Adding Garbage Collection and Runtime Types
|
| 82 |
|
|
to a Strongly-Typed Statically Checked, Concurrent Language" Xerox PARC
|
| 83 |
|
|
CSL 84-7). Doug McIlroy wrote a simpler fully conservative collector that
|
| 84 |
|
|
was part of version 8 UNIX (tm), but appears to not have received
|
| 85 |
|
|
widespread use.
|
| 86 |
|
|
|
| 87 |
|
|
Rudimentary tools for use of the collector as a leak detector are included
|
| 88 |
|
|
(see http://www.hpl.hp.com/personal/Hans_Boehm/gc/leak.html),
|
| 89 |
|
|
as is a fairly sophisticated string package "cord" that makes use of the
|
| 90 |
|
|
collector. (See doc/README.cords and H.-J. Boehm, R. Atkinson, and M. Plass,
|
| 91 |
|
|
"Ropes: An Alternative to Strings", Software Practice and Experience 25, 12
|
| 92 |
|
|
(December 1995), pp. 1315-1330. This is very similar to the "rope" package
|
| 93 |
|
|
in Xerox Cedar, or the "rope" package in the SGI STL or the g++ distribution.)
|
| 94 |
|
|
|
| 95 |
|
|
Further collector documantation can be found at
|
| 96 |
|
|
|
| 97 |
|
|
http://www.hpl.hp.com/personal/Hans_Boehm/gc
|
| 98 |
|
|
|
| 99 |
|
|
|
| 100 |
|
|
GENERAL DESCRIPTION
|
| 101 |
|
|
|
| 102 |
|
|
This is a garbage collecting storage allocator that is intended to be
|
| 103 |
|
|
used as a plug-in replacement for C's malloc.
|
| 104 |
|
|
|
| 105 |
|
|
Since the collector does not require pointers to be tagged, it does not
|
| 106 |
|
|
attempt to ensure that all inaccessible storage is reclaimed. However,
|
| 107 |
|
|
in our experience, it is typically more successful at reclaiming unused
|
| 108 |
|
|
memory than most C programs using explicit deallocation. Unlike manually
|
| 109 |
|
|
introduced leaks, the amount of unreclaimed memory typically stays
|
| 110 |
|
|
bounded.
|
| 111 |
|
|
|
| 112 |
|
|
In the following, an "object" is defined to be a region of memory allocated
|
| 113 |
|
|
by the routines described below.
|
| 114 |
|
|
|
| 115 |
|
|
Any objects not intended to be collected must be pointed to either
|
| 116 |
|
|
from other such accessible objects, or from the registers,
|
| 117 |
|
|
stack, data, or statically allocated bss segments. Pointers from
|
| 118 |
|
|
the stack or registers may point to anywhere inside an object.
|
| 119 |
|
|
The same is true for heap pointers if the collector is compiled with
|
| 120 |
|
|
ALL_INTERIOR_POINTERS defined, as is now the default.
|
| 121 |
|
|
|
| 122 |
|
|
Compiling without ALL_INTERIOR_POINTERS may reduce accidental retention
|
| 123 |
|
|
of garbage objects, by requiring pointers from the heap to to the beginning
|
| 124 |
|
|
of an object. But this no longer appears to be a significant
|
| 125 |
|
|
issue for most programs.
|
| 126 |
|
|
|
| 127 |
|
|
There are a number of routines which modify the pointer recognition
|
| 128 |
|
|
algorithm. GC_register_displacement allows certain interior pointers
|
| 129 |
|
|
to be recognized even if ALL_INTERIOR_POINTERS is nor defined.
|
| 130 |
|
|
GC_malloc_ignore_off_page allows some pointers into the middle of large objects
|
| 131 |
|
|
to be disregarded, greatly reducing the probablility of accidental
|
| 132 |
|
|
retention of large objects. For most purposes it seems best to compile
|
| 133 |
|
|
with ALL_INTERIOR_POINTERS and to use GC_malloc_ignore_off_page if
|
| 134 |
|
|
you get collector warnings from allocations of very large objects.
|
| 135 |
|
|
See README.debugging for details.
|
| 136 |
|
|
|
| 137 |
|
|
WARNING: pointers inside memory allocated by the standard "malloc" are not
|
| 138 |
|
|
seen by the garbage collector. Thus objects pointed to only from such a
|
| 139 |
|
|
region may be prematurely deallocated. It is thus suggested that the
|
| 140 |
|
|
standard "malloc" be used only for memory regions, such as I/O buffers, that
|
| 141 |
|
|
are guaranteed not to contain pointers to garbage collectable memory.
|
| 142 |
|
|
Pointers in C language automatic, static, or register variables,
|
| 143 |
|
|
are correctly recognized. (Note that GC_malloc_uncollectable has semantics
|
| 144 |
|
|
similar to standard malloc, but allocates objects that are traced by the
|
| 145 |
|
|
collector.)
|
| 146 |
|
|
|
| 147 |
|
|
WARNING: the collector does not always know how to find pointers in data
|
| 148 |
|
|
areas that are associated with dynamic libraries. This is easy to
|
| 149 |
|
|
remedy IF you know how to find those data areas on your operating
|
| 150 |
|
|
system (see GC_add_roots). Code for doing this under SunOS, IRIX 5.X and 6.X,
|
| 151 |
|
|
HP/UX, Alpha OSF/1, Linux, and win32 is included and used by default. (See
|
| 152 |
|
|
README.win32 for win32 details.) On other systems pointers from dynamic
|
| 153 |
|
|
library data areas may not be considered by the collector.
|
| 154 |
|
|
If you're writing a program that depends on the collector scanning
|
| 155 |
|
|
dynamic library data areas, it may be a good idea to include at least
|
| 156 |
|
|
one call to GC_is_visible() to ensure that those areas are visible
|
| 157 |
|
|
to the collector.
|
| 158 |
|
|
|
| 159 |
|
|
Note that the garbage collector does not need to be informed of shared
|
| 160 |
|
|
read-only data. However if the shared library mechanism can introduce
|
| 161 |
|
|
discontiguous data areas that may contain pointers, then the collector does
|
| 162 |
|
|
need to be informed.
|
| 163 |
|
|
|
| 164 |
|
|
Signal processing for most signals may be deferred during collection,
|
| 165 |
|
|
and during uninterruptible parts of the allocation process.
|
| 166 |
|
|
Like standard ANSI C mallocs, by default it is unsafe to invoke
|
| 167 |
|
|
malloc (and other GC routines) from a signal handler while another
|
| 168 |
|
|
malloc call may be in progress. Removing -DNO_SIGNALS from Makefile
|
| 169 |
|
|
attempts to remedy that. But that may not be reliable with a compiler that
|
| 170 |
|
|
substantially reorders memory operations inside GC_malloc.
|
| 171 |
|
|
|
| 172 |
|
|
The allocator/collector can also be configured for thread-safe operation.
|
| 173 |
|
|
(Full signal safety can also be achieved, but only at the cost of two system
|
| 174 |
|
|
calls per malloc, which is usually unacceptable.)
|
| 175 |
|
|
WARNING: the collector does not guarantee to scan thread-local storage
|
| 176 |
|
|
(e.g. of the kind accessed with pthread_getspecific()). The collector
|
| 177 |
|
|
does scan thread stacks, though, so generally the best solution is to
|
| 178 |
|
|
ensure that any pointers stored in thread-local storage are also
|
| 179 |
|
|
stored on the thread's stack for the duration of their lifetime.
|
| 180 |
|
|
(This is arguably a longstanding bug, but it hasn't been fixed yet.)
|
| 181 |
|
|
|
| 182 |
|
|
INSTALLATION AND PORTABILITY
|
| 183 |
|
|
|
| 184 |
|
|
As distributed, the macro SILENT is defined in Makefile.
|
| 185 |
|
|
In the event of problems, this can be removed to obtain a moderate
|
| 186 |
|
|
amount of descriptive output for each collection.
|
| 187 |
|
|
(The given statistics exhibit a few peculiarities.
|
| 188 |
|
|
Things don't appear to add up for a variety of reasons, most notably
|
| 189 |
|
|
fragmentation losses. These are probably much more significant for the
|
| 190 |
|
|
contrived program "test.c" than for your application.)
|
| 191 |
|
|
|
| 192 |
|
|
Note that typing "make test" will automatically build the collector
|
| 193 |
|
|
and then run setjmp_test and gctest. Setjmp_test will give you information
|
| 194 |
|
|
about configuring the collector, which is useful primarily if you have
|
| 195 |
|
|
a machine that's not already supported. Gctest is a somewhat superficial
|
| 196 |
|
|
test of collector functionality. Failure is indicated by a core dump or
|
| 197 |
|
|
a message to the effect that the collector is broken. Gctest takes about
|
| 198 |
|
|
35 seconds to run on a SPARCstation 2. It may use up to 8 MB of memory. (The
|
| 199 |
|
|
multi-threaded version will use more. 64-bit versions may use more.)
|
| 200 |
|
|
"Make test" will also, as its last step, attempt to build and test the
|
| 201 |
|
|
"cord" string library. This will fail without an ANSI C compiler, but
|
| 202 |
|
|
the garbage collector itself should still be usable.
|
| 203 |
|
|
|
| 204 |
|
|
The Makefile will generate a library gc.a which you should link against.
|
| 205 |
|
|
Typing "make cords" will add the cord library to gc.a.
|
| 206 |
|
|
Note that this requires an ANSI C compiler.
|
| 207 |
|
|
|
| 208 |
|
|
It is suggested that if you need to replace a piece of the collector
|
| 209 |
|
|
(e.g. GC_mark_rts.c) you simply list your version ahead of gc.a on the
|
| 210 |
|
|
ld command line, rather than replacing the one in gc.a. (This will
|
| 211 |
|
|
generate numerous warnings under some versions of AIX, but it still
|
| 212 |
|
|
works.)
|
| 213 |
|
|
|
| 214 |
|
|
All include files that need to be used by clients will be put in the
|
| 215 |
|
|
include subdirectory. (Normally this is just gc.h. "Make cords" adds
|
| 216 |
|
|
"cord.h" and "ec.h".)
|
| 217 |
|
|
|
| 218 |
|
|
The collector currently is designed to run essentially unmodified on
|
| 219 |
|
|
machines that use a flat 32-bit or 64-bit address space.
|
| 220 |
|
|
That includes the vast majority of Workstations and X86 (X >= 3) PCs.
|
| 221 |
|
|
(The list here was deleted because it was getting too long and constantly
|
| 222 |
|
|
out of date.)
|
| 223 |
|
|
It does NOT run under plain 16-bit DOS or Windows 3.X. There are however
|
| 224 |
|
|
various packages (e.g. win32s, djgpp) that allow flat 32-bit address
|
| 225 |
|
|
applications to run under those systemsif the have at least an 80386 processor,
|
| 226 |
|
|
and several of those are compatible with the collector.
|
| 227 |
|
|
|
| 228 |
|
|
In a few cases (Amiga, OS/2, Win32, MacOS) a separate makefile
|
| 229 |
|
|
or equivalent is supplied. Many of these have separate README.system
|
| 230 |
|
|
files.
|
| 231 |
|
|
|
| 232 |
|
|
Dynamic libraries are completely supported only under SunOS/Solaris,
|
| 233 |
|
|
(and even that support is not functional on the last Sun 3 release),
|
| 234 |
|
|
Linux, FreeBSD, NetBSD, IRIX 5&6, HP/UX, Win32 (not Win32S) and OSF/1
|
| 235 |
|
|
on DEC AXP machines plus perhaps a few others listed near the top
|
| 236 |
|
|
of dyn_load.c. On other machines we recommend that you do one of
|
| 237 |
|
|
the following:
|
| 238 |
|
|
|
| 239 |
|
|
1) Add dynamic library support (and send us the code).
|
| 240 |
|
|
2) Use static versions of the libraries.
|
| 241 |
|
|
3) Arrange for dynamic libraries to use the standard malloc.
|
| 242 |
|
|
This is still dangerous if the library stores a pointer to a
|
| 243 |
|
|
garbage collected object. But nearly all standard interfaces
|
| 244 |
|
|
prohibit this, because they deal correctly with pointers
|
| 245 |
|
|
to stack allocated objects. (Strtok is an exception. Don't
|
| 246 |
|
|
use it.)
|
| 247 |
|
|
|
| 248 |
|
|
In all cases we assume that pointer alignment is consistent with that
|
| 249 |
|
|
enforced by the standard C compilers. If you use a nonstandard compiler
|
| 250 |
|
|
you may have to adjust the alignment parameters defined in gc_priv.h.
|
| 251 |
|
|
Note that this may also be an issue with packed records/structs, if those
|
| 252 |
|
|
enforce less alignment for pointers.
|
| 253 |
|
|
|
| 254 |
|
|
A port to a machine that is not byte addressed, or does not use 32 bit
|
| 255 |
|
|
or 64 bit addresses will require a major effort. A port to plain MSDOS
|
| 256 |
|
|
or win16 is hard.
|
| 257 |
|
|
|
| 258 |
|
|
For machines not already mentioned, or for nonstandard compilers, the
|
| 259 |
|
|
following are likely to require change:
|
| 260 |
|
|
|
| 261 |
|
|
1. The parameters in gcconfig.h.
|
| 262 |
|
|
The parameters that will usually require adjustment are
|
| 263 |
|
|
STACKBOTTOM, ALIGNMENT and DATASTART. Setjmp_test
|
| 264 |
|
|
prints its guesses of the first two.
|
| 265 |
|
|
DATASTART should be an expression for computing the
|
| 266 |
|
|
address of the beginning of the data segment. This can often be
|
| 267 |
|
|
&etext. But some memory management units require that there be
|
| 268 |
|
|
some unmapped space between the text and the data segment. Thus
|
| 269 |
|
|
it may be more complicated. On UNIX systems, this is rarely
|
| 270 |
|
|
documented. But the adb "$m" command may be helpful. (Note
|
| 271 |
|
|
that DATASTART will usually be a function of &etext. Thus a
|
| 272 |
|
|
single experiment is usually insufficient.)
|
| 273 |
|
|
STACKBOTTOM is used to initialize GC_stackbottom, which
|
| 274 |
|
|
should be a sufficient approximation to the coldest stack address.
|
| 275 |
|
|
On some machines, it is difficult to obtain such a value that is
|
| 276 |
|
|
valid across a variety of MMUs, OS releases, etc. A number of
|
| 277 |
|
|
alternatives exist for using the collector in spite of this. See the
|
| 278 |
|
|
discussion in gcconfig.h immediately preceding the various
|
| 279 |
|
|
definitions of STACKBOTTOM.
|
| 280 |
|
|
|
| 281 |
|
|
2. mach_dep.c.
|
| 282 |
|
|
The most important routine here is one to mark from registers.
|
| 283 |
|
|
The distributed file includes a generic hack (based on setjmp) that
|
| 284 |
|
|
happens to work on many machines, and may work on yours. Try
|
| 285 |
|
|
compiling and running setjmp_t.c to see whether it has a chance of
|
| 286 |
|
|
working. (This is not correct C, so don't blame your compiler if it
|
| 287 |
|
|
doesn't work. Based on limited experience, register window machines
|
| 288 |
|
|
are likely to cause trouble. If your version of setjmp claims that
|
| 289 |
|
|
all accessible variables, including registers, have the value they
|
| 290 |
|
|
had at the time of the longjmp, it also will not work. Vanilla 4.2 BSD
|
| 291 |
|
|
on Vaxen makes such a claim. SunOS does not.)
|
| 292 |
|
|
If your compiler does not allow in-line assembly code, or if you prefer
|
| 293 |
|
|
not to use such a facility, mach_dep.c may be replaced by a .s file
|
| 294 |
|
|
(as we did for the MIPS machine and the PC/RT).
|
| 295 |
|
|
At this point enough architectures are supported by mach_dep.c
|
| 296 |
|
|
that you will rarely need to do more than adjust for assembler
|
| 297 |
|
|
syntax.
|
| 298 |
|
|
|
| 299 |
|
|
3. os_dep.c (and gc_priv.h).
|
| 300 |
|
|
Several kinds of operating system dependent routines reside here.
|
| 301 |
|
|
Many are optional. Several are invoked only through corresponding
|
| 302 |
|
|
macros in gc_priv.h, which may also be redefined as appropriate.
|
| 303 |
|
|
The routine GC_register_data_segments is crucial. It registers static
|
| 304 |
|
|
data areas that must be traversed by the collector. (User calls to
|
| 305 |
|
|
GC_add_roots may sometimes be used for similar effect.)
|
| 306 |
|
|
Routines to obtain memory from the OS also reside here.
|
| 307 |
|
|
Alternatively this can be done entirely by the macro GET_MEM
|
| 308 |
|
|
defined in gc_priv.h. Routines to disable and reenable signals
|
| 309 |
|
|
also reside here if they are need by the macros DISABLE_SIGNALS
|
| 310 |
|
|
and ENABLE_SIGNALS defined in gc_priv.h.
|
| 311 |
|
|
In a multithreaded environment, the macros LOCK and UNLOCK
|
| 312 |
|
|
in gc_priv.h will need to be suitably redefined.
|
| 313 |
|
|
The incremental collector requires page dirty information, which
|
| 314 |
|
|
is acquired through routines defined in os_dep.c. Unless directed
|
| 315 |
|
|
otherwise by gcconfig.h, these are implemented as stubs that simply
|
| 316 |
|
|
treat all pages as dirty. (This of course makes the incremental
|
| 317 |
|
|
collector much less useful.)
|
| 318 |
|
|
|
| 319 |
|
|
4. dyn_load.c
|
| 320 |
|
|
This provides a routine that allows the collector to scan data
|
| 321 |
|
|
segments associated with dynamic libraries. Often it is not
|
| 322 |
|
|
necessary to provide this routine unless user-written dynamic
|
| 323 |
|
|
libraries are used.
|
| 324 |
|
|
|
| 325 |
|
|
For a different version of UN*X or different machines using the
|
| 326 |
|
|
Motorola 68000, Vax, SPARC, 80386, NS 32000, PC/RT, or MIPS architecture,
|
| 327 |
|
|
it should frequently suffice to change definitions in gcconfig.h.
|
| 328 |
|
|
|
| 329 |
|
|
|
| 330 |
|
|
THE C INTERFACE TO THE ALLOCATOR
|
| 331 |
|
|
|
| 332 |
|
|
The following routines are intended to be directly called by the user.
|
| 333 |
|
|
Note that usually only GC_malloc is necessary. GC_clear_roots and GC_add_roots
|
| 334 |
|
|
calls may be required if the collector has to trace from nonstandard places
|
| 335 |
|
|
(e.g. from dynamic library data areas on a machine on which the
|
| 336 |
|
|
collector doesn't already understand them.) On some machines, it may
|
| 337 |
|
|
be desirable to set GC_stacktop to a good approximation of the stack base.
|
| 338 |
|
|
(This enhances code portability on HP PA machines, since there is no
|
| 339 |
|
|
good way for the collector to compute this value.) Client code may include
|
| 340 |
|
|
"gc.h", which defines all of the following, plus many others.
|
| 341 |
|
|
|
| 342 |
|
|
1) GC_malloc(nbytes)
|
| 343 |
|
|
- allocate an object of size nbytes. Unlike malloc, the object is
|
| 344 |
|
|
cleared before being returned to the user. Gc_malloc will
|
| 345 |
|
|
invoke the garbage collector when it determines this to be appropriate.
|
| 346 |
|
|
GC_malloc may return 0 if it is unable to acquire sufficient
|
| 347 |
|
|
space from the operating system. This is the most probable
|
| 348 |
|
|
consequence of running out of space. Other possible consequences
|
| 349 |
|
|
are that a function call will fail due to lack of stack space,
|
| 350 |
|
|
or that the collector will fail in other ways because it cannot
|
| 351 |
|
|
maintain its internal data structures, or that a crucial system
|
| 352 |
|
|
process will fail and take down the machine. Most of these
|
| 353 |
|
|
possibilities are independent of the malloc implementation.
|
| 354 |
|
|
|
| 355 |
|
|
2) GC_malloc_atomic(nbytes)
|
| 356 |
|
|
- allocate an object of size nbytes that is guaranteed not to contain any
|
| 357 |
|
|
pointers. The returned object is not guaranteed to be cleared.
|
| 358 |
|
|
(Can always be replaced by GC_malloc, but results in faster collection
|
| 359 |
|
|
times. The collector will probably run faster if large character
|
| 360 |
|
|
arrays, etc. are allocated with GC_malloc_atomic than if they are
|
| 361 |
|
|
statically allocated.)
|
| 362 |
|
|
|
| 363 |
|
|
3) GC_realloc(object, new_size)
|
| 364 |
|
|
- change the size of object to be new_size. Returns a pointer to the
|
| 365 |
|
|
new object, which may, or may not, be the same as the pointer to
|
| 366 |
|
|
the old object. The new object is taken to be atomic iff the old one
|
| 367 |
|
|
was. If the new object is composite and larger than the original object,
|
| 368 |
|
|
then the newly added bytes are cleared (we hope). This is very likely
|
| 369 |
|
|
to allocate a new object, unless MERGE_SIZES is defined in gc_priv.h.
|
| 370 |
|
|
Even then, it is likely to recycle the old object only if the object
|
| 371 |
|
|
is grown in small additive increments (which, we claim, is generally bad
|
| 372 |
|
|
coding practice.)
|
| 373 |
|
|
|
| 374 |
|
|
4) GC_free(object)
|
| 375 |
|
|
- explicitly deallocate an object returned by GC_malloc or
|
| 376 |
|
|
GC_malloc_atomic. Not necessary, but can be used to minimize
|
| 377 |
|
|
collections if performance is critical. Probably a performance
|
| 378 |
|
|
loss for very small objects (<= 8 bytes).
|
| 379 |
|
|
|
| 380 |
|
|
5) GC_expand_hp(bytes)
|
| 381 |
|
|
- Explicitly increase the heap size. (This is normally done automatically
|
| 382 |
|
|
if a garbage collection failed to GC_reclaim enough memory. Explicit
|
| 383 |
|
|
calls to GC_expand_hp may prevent unnecessarily frequent collections at
|
| 384 |
|
|
program startup.)
|
| 385 |
|
|
|
| 386 |
|
|
6) GC_malloc_ignore_off_page(bytes)
|
| 387 |
|
|
- identical to GC_malloc, but the client promises to keep a pointer to
|
| 388 |
|
|
the somewhere within the first 256 bytes of the object while it is
|
| 389 |
|
|
live. (This pointer should nortmally be declared volatile to prevent
|
| 390 |
|
|
interference from compiler optimizations.) This is the recommended
|
| 391 |
|
|
way to allocate anything that is likely to be larger than 100Kbytes
|
| 392 |
|
|
or so. (GC_malloc may result in failure to reclaim such objects.)
|
| 393 |
|
|
|
| 394 |
|
|
7) GC_set_warn_proc(proc)
|
| 395 |
|
|
- Can be used to redirect warnings from the collector. Such warnings
|
| 396 |
|
|
should be rare, and should not be ignored during code development.
|
| 397 |
|
|
|
| 398 |
|
|
8) GC_enable_incremental()
|
| 399 |
|
|
- Enables generational and incremental collection. Useful for large
|
| 400 |
|
|
heaps on machines that provide access to page dirty information.
|
| 401 |
|
|
Some dirty bit implementations may interfere with debugging
|
| 402 |
|
|
(by catching address faults) and place restrictions on heap arguments
|
| 403 |
|
|
to system calls (since write faults inside a system call may not be
|
| 404 |
|
|
handled well).
|
| 405 |
|
|
|
| 406 |
|
|
9) Several routines to allow for registration of finalization code.
|
| 407 |
|
|
User supplied finalization code may be invoked when an object becomes
|
| 408 |
|
|
unreachable. To call (*f)(obj, x) when obj becomes inaccessible, use
|
| 409 |
|
|
GC_register_finalizer(obj, f, x, 0, 0);
|
| 410 |
|
|
For more sophisticated uses, and for finalization ordering issues,
|
| 411 |
|
|
see gc.h.
|
| 412 |
|
|
|
| 413 |
|
|
The global variable GC_free_space_divisor may be adjusted up from its
|
| 414 |
|
|
default value of 4 to use less space and more collection time, or down for
|
| 415 |
|
|
the opposite effect. Setting it to 1 or 0 will effectively disable collections
|
| 416 |
|
|
and cause all allocations to simply grow the heap.
|
| 417 |
|
|
|
| 418 |
|
|
The variable GC_non_gc_bytes, which is normally 0, may be changed to reflect
|
| 419 |
|
|
the amount of memory allocated by the above routines that should not be
|
| 420 |
|
|
considered as a candidate for collection. Careless use may, of course, result
|
| 421 |
|
|
in excessive memory consumption.
|
| 422 |
|
|
|
| 423 |
|
|
Some additional tuning is possible through the parameters defined
|
| 424 |
|
|
near the top of gc_priv.h.
|
| 425 |
|
|
|
| 426 |
|
|
If only GC_malloc is intended to be used, it might be appropriate to define:
|
| 427 |
|
|
|
| 428 |
|
|
#define malloc(n) GC_malloc(n)
|
| 429 |
|
|
#define calloc(m,n) GC_malloc((m)*(n))
|
| 430 |
|
|
|
| 431 |
|
|
For small pieces of VERY allocation intensive code, gc_inl.h
|
| 432 |
|
|
includes some allocation macros that may be used in place of GC_malloc
|
| 433 |
|
|
and friends.
|
| 434 |
|
|
|
| 435 |
|
|
All externally visible names in the garbage collector start with "GC_".
|
| 436 |
|
|
To avoid name conflicts, client code should avoid this prefix, except when
|
| 437 |
|
|
accessing garbage collector routines or variables.
|
| 438 |
|
|
|
| 439 |
|
|
There are provisions for allocation with explicit type information.
|
| 440 |
|
|
This is rarely necessary. Details can be found in gc_typed.h.
|
| 441 |
|
|
|
| 442 |
|
|
THE C++ INTERFACE TO THE ALLOCATOR:
|
| 443 |
|
|
|
| 444 |
|
|
The Ellis-Hull C++ interface to the collector is included in
|
| 445 |
|
|
the collector distribution. If you intend to use this, type
|
| 446 |
|
|
"make c++" after the initial build of the collector is complete.
|
| 447 |
|
|
See gc_cpp.h for the definition of the interface. This interface
|
| 448 |
|
|
tries to approximate the Ellis-Detlefs C++ garbage collection
|
| 449 |
|
|
proposal without compiler changes.
|
| 450 |
|
|
|
| 451 |
|
|
Cautions:
|
| 452 |
|
|
1. Arrays allocated without new placement syntax are
|
| 453 |
|
|
allocated as uncollectable objects. They are traced by the
|
| 454 |
|
|
collector, but will not be reclaimed.
|
| 455 |
|
|
|
| 456 |
|
|
2. Failure to use "make c++" in combination with (1) will
|
| 457 |
|
|
result in arrays allocated using the default new operator.
|
| 458 |
|
|
This is likely to result in disaster without linker warnings.
|
| 459 |
|
|
|
| 460 |
|
|
3. If your compiler supports an overloaded new[] operator,
|
| 461 |
|
|
then gc_cpp.cc and gc_cpp.h should be suitably modified.
|
| 462 |
|
|
|
| 463 |
|
|
4. Many current C++ compilers have deficiencies that
|
| 464 |
|
|
break some of the functionality. See the comments in gc_cpp.h
|
| 465 |
|
|
for suggested workarounds.
|
| 466 |
|
|
|
| 467 |
|
|
USE AS LEAK DETECTOR:
|
| 468 |
|
|
|
| 469 |
|
|
The collector may be used to track down leaks in C programs that are
|
| 470 |
|
|
intended to run with malloc/free (e.g. code with extreme real-time or
|
| 471 |
|
|
portability constraints). To do so define FIND_LEAK in Makefile
|
| 472 |
|
|
This will cause the collector to invoke the report_leak
|
| 473 |
|
|
routine defined near the top of reclaim.c whenever an inaccessible
|
| 474 |
|
|
object is found that has not been explicitly freed. Such objects will
|
| 475 |
|
|
also be automatically reclaimed.
|
| 476 |
|
|
Productive use of this facility normally involves redefining report_leak
|
| 477 |
|
|
to do something more intelligent. This typically requires annotating
|
| 478 |
|
|
objects with additional information (e.g. creation time stack trace) that
|
| 479 |
|
|
identifies their origin. Such code is typically not very portable, and is
|
| 480 |
|
|
not included here, except on SPARC machines.
|
| 481 |
|
|
If all objects are allocated with GC_DEBUG_MALLOC (see next section),
|
| 482 |
|
|
then the default version of report_leak will report the source file
|
| 483 |
|
|
and line number at which the leaked object was allocated. This may
|
| 484 |
|
|
sometimes be sufficient. (On SPARC/SUNOS4 machines, it will also report
|
| 485 |
|
|
a cryptic stack trace. This can often be turned into a sympolic stack
|
| 486 |
|
|
trace by invoking program "foo" with "callprocs foo". Callprocs is
|
| 487 |
|
|
a short shell script that invokes adb to expand program counter values
|
| 488 |
|
|
to symbolic addresses. It was largely supplied by Scott Schwartz.)
|
| 489 |
|
|
Note that the debugging facilities described in the next section can
|
| 490 |
|
|
sometimes be slightly LESS effective in leak finding mode, since in
|
| 491 |
|
|
leak finding mode, GC_debug_free actually results in reuse of the object.
|
| 492 |
|
|
(Otherwise the object is simply marked invalid.) Also note that the test
|
| 493 |
|
|
program is not designed to run meaningfully in FIND_LEAK mode.
|
| 494 |
|
|
Use "make gc.a" to build the collector.
|
| 495 |
|
|
|
| 496 |
|
|
DEBUGGING FACILITIES:
|
| 497 |
|
|
|
| 498 |
|
|
The routines GC_debug_malloc, GC_debug_malloc_atomic, GC_debug_realloc,
|
| 499 |
|
|
and GC_debug_free provide an alternate interface to the collector, which
|
| 500 |
|
|
provides some help with memory overwrite errors, and the like.
|
| 501 |
|
|
Objects allocated in this way are annotated with additional
|
| 502 |
|
|
information. Some of this information is checked during garbage
|
| 503 |
|
|
collections, and detected inconsistencies are reported to stderr.
|
| 504 |
|
|
|
| 505 |
|
|
Simple cases of writing past the end of an allocated object should
|
| 506 |
|
|
be caught if the object is explicitly deallocated, or if the
|
| 507 |
|
|
collector is invoked while the object is live. The first deallocation
|
| 508 |
|
|
of an object will clear the debugging info associated with an
|
| 509 |
|
|
object, so accidentally repeated calls to GC_debug_free will report the
|
| 510 |
|
|
deallocation of an object without debugging information. Out of
|
| 511 |
|
|
memory errors will be reported to stderr, in addition to returning
|
| 512 |
|
|
NIL.
|
| 513 |
|
|
|
| 514 |
|
|
GC_debug_malloc checking during garbage collection is enabled
|
| 515 |
|
|
with the first call to GC_debug_malloc. This will result in some
|
| 516 |
|
|
slowdown during collections. If frequent heap checks are desired,
|
| 517 |
|
|
this can be achieved by explicitly invoking GC_gcollect, e.g. from
|
| 518 |
|
|
the debugger.
|
| 519 |
|
|
|
| 520 |
|
|
GC_debug_malloc allocated objects should not be passed to GC_realloc
|
| 521 |
|
|
or GC_free, and conversely. It is however acceptable to allocate only
|
| 522 |
|
|
some objects with GC_debug_malloc, and to use GC_malloc for other objects,
|
| 523 |
|
|
provided the two pools are kept distinct. In this case, there is a very
|
| 524 |
|
|
low probablility that GC_malloc allocated objects may be misidentified as
|
| 525 |
|
|
having been overwritten. This should happen with probability at most
|
| 526 |
|
|
one in 2**32. This probability is zero if GC_debug_malloc is never called.
|
| 527 |
|
|
|
| 528 |
|
|
GC_debug_malloc, GC_malloc_atomic, and GC_debug_realloc take two
|
| 529 |
|
|
additional trailing arguments, a string and an integer. These are not
|
| 530 |
|
|
interpreted by the allocator. They are stored in the object (the string is
|
| 531 |
|
|
not copied). If an error involving the object is detected, they are printed.
|
| 532 |
|
|
|
| 533 |
|
|
The macros GC_MALLOC, GC_MALLOC_ATOMIC, GC_REALLOC, GC_FREE, and
|
| 534 |
|
|
GC_REGISTER_FINALIZER are also provided. These require the same arguments
|
| 535 |
|
|
as the corresponding (nondebugging) routines. If gc.h is included
|
| 536 |
|
|
with GC_DEBUG defined, they call the debugging versions of these
|
| 537 |
|
|
functions, passing the current file name and line number as the two
|
| 538 |
|
|
extra arguments, where appropriate. If gc.h is included without GC_DEBUG
|
| 539 |
|
|
defined, then all these macros will instead be defined to their nondebugging
|
| 540 |
|
|
equivalents. (GC_REGISTER_FINALIZER is necessary, since pointers to
|
| 541 |
|
|
objects with debugging information are really pointers to a displacement
|
| 542 |
|
|
of 16 bytes form the object beginning, and some translation is necessary
|
| 543 |
|
|
when finalization routines are invoked. For details, about what's stored
|
| 544 |
|
|
in the header, see the definition of the type oh in debug_malloc.c)
|
| 545 |
|
|
|
| 546 |
|
|
INCREMENTAL/GENERATIONAL COLLECTION:
|
| 547 |
|
|
|
| 548 |
|
|
The collector normally interrupts client code for the duration of
|
| 549 |
|
|
a garbage collection mark phase. This may be unacceptable if interactive
|
| 550 |
|
|
response is needed for programs with large heaps. The collector
|
| 551 |
|
|
can also run in a "generational" mode, in which it usually attempts to
|
| 552 |
|
|
collect only objects allocated since the last garbage collection.
|
| 553 |
|
|
Furthermore, in this mode, garbage collections run mostly incrementally,
|
| 554 |
|
|
with a small amount of work performed in response to each of a large number of
|
| 555 |
|
|
GC_malloc requests.
|
| 556 |
|
|
|
| 557 |
|
|
This mode is enabled by a call to GC_enable_incremental().
|
| 558 |
|
|
|
| 559 |
|
|
Incremental and generational collection is effective in reducing
|
| 560 |
|
|
pause times only if the collector has some way to tell which objects
|
| 561 |
|
|
or pages have been recently modified. The collector uses two sources
|
| 562 |
|
|
of information:
|
| 563 |
|
|
|
| 564 |
|
|
1. Information provided by the VM system. This may be provided in
|
| 565 |
|
|
one of several forms. Under Solaris 2.X (and potentially under other
|
| 566 |
|
|
similar systems) information on dirty pages can be read from the
|
| 567 |
|
|
/proc file system. Under other systems (currently SunOS4.X) it is
|
| 568 |
|
|
possible to write-protect the heap, and catch the resulting faults.
|
| 569 |
|
|
On these systems we require that system calls writing to the heap
|
| 570 |
|
|
(other than read) be handled specially by client code.
|
| 571 |
|
|
See os_dep.c for details.
|
| 572 |
|
|
|
| 573 |
|
|
2. Information supplied by the programmer. We define "stubborn"
|
| 574 |
|
|
objects to be objects that are rarely changed. Such an object
|
| 575 |
|
|
can be allocated (and enabled for writing) with GC_malloc_stubborn.
|
| 576 |
|
|
Once it has been initialized, the collector should be informed with
|
| 577 |
|
|
a call to GC_end_stubborn_change. Subsequent writes that store
|
| 578 |
|
|
pointers into the object must be preceded by a call to
|
| 579 |
|
|
GC_change_stubborn.
|
| 580 |
|
|
|
| 581 |
|
|
This mechanism performs best for objects that are written only for
|
| 582 |
|
|
initialization, and such that only one stubborn object is writable
|
| 583 |
|
|
at once. It is typically not worth using for short-lived
|
| 584 |
|
|
objects. Stubborn objects are treated less efficiently than pointerfree
|
| 585 |
|
|
(atomic) objects.
|
| 586 |
|
|
|
| 587 |
|
|
A rough rule of thumb is that, in the absence of VM information, garbage
|
| 588 |
|
|
collection pauses are proportional to the amount of pointerful storage
|
| 589 |
|
|
plus the amount of modified "stubborn" storage that is reachable during
|
| 590 |
|
|
the collection.
|
| 591 |
|
|
|
| 592 |
|
|
Initial allocation of stubborn objects takes longer than allocation
|
| 593 |
|
|
of other objects, since other data structures need to be maintained.
|
| 594 |
|
|
|
| 595 |
|
|
We recommend against random use of stubborn objects in client
|
| 596 |
|
|
code, since bugs caused by inappropriate writes to stubborn objects
|
| 597 |
|
|
are likely to be very infrequently observed and hard to trace.
|
| 598 |
|
|
However, their use may be appropriate in a few carefully written
|
| 599 |
|
|
library routines that do not make the objects themselves available
|
| 600 |
|
|
for writing by client code.
|
| 601 |
|
|
|
| 602 |
|
|
|
| 603 |
|
|
BUGS:
|
| 604 |
|
|
|
| 605 |
|
|
Any memory that does not have a recognizable pointer to it will be
|
| 606 |
|
|
reclaimed. Exclusive-or'ing forward and backward links in a list
|
| 607 |
|
|
doesn't cut it.
|
| 608 |
|
|
Some C optimizers may lose the last undisguised pointer to a memory
|
| 609 |
|
|
object as a consequence of clever optimizations. This has almost
|
| 610 |
|
|
never been observed in practice. Send mail to boehm@acm.org
|
| 611 |
|
|
for suggestions on how to fix your compiler.
|
| 612 |
|
|
This is not a real-time collector. In the standard configuration,
|
| 613 |
|
|
percentage of time required for collection should be constant across
|
| 614 |
|
|
heap sizes. But collection pauses will increase for larger heaps.
|
| 615 |
|
|
(On SPARCstation 2s collection times will be on the order of 300 msecs
|
| 616 |
|
|
per MB of accessible memory that needs to be scanned. Your mileage
|
| 617 |
|
|
may vary.) The incremental/generational collection facility helps,
|
| 618 |
|
|
but is portable only if "stubborn" allocation is used.
|
| 619 |
|
|
Please address bug reports to boehm@acm.org. If you are
|
| 620 |
|
|
contemplating a major addition, you might also send mail to ask whether
|
| 621 |
|
|
it's already been done (or whether we tried and discarded it).
|
| 622 |
|
|
|