Annotation of OpenXM_contrib2/asir2000/gc/doc/tree.html, Revision 1.2
1.1 noro 1: <HTML>
2: <HEAD>
3: <TITLE> Two-Level Tree Structure for Fast Pointer Lookup</TITLE>
1.2 ! noro 4: <AUTHOR> Hans-J. Boehm, Silicon Graphics (now at HP)</author>
1.1 noro 5: </HEAD>
6: <BODY>
7: <H1>Two-Level Tree Structure for Fast Pointer Lookup</h1>
8: <P>
9: The conservative garbage collector described
1.2 ! noro 10: <A HREF="http://www.hpl.hp.com/personal/Hans_Boehm/gc/">here</a>
! 11: uses a 2-level tree
1.1 noro 12: data structure to aid in fast pointer identification.
13: This data structure is described in a bit more detail here, since
14: <OL>
15: <LI> Variations of the data structure are more generally useful.
16: <LI> It appears to be hard to understand by reading the code.
17: <LI> Some other collectors appear to use inferior data structures to
18: solve the same problem.
19: <LI> It is central to fast collector operation.
20: </ol>
21: A candidate pointer is divided into three sections, the <I>high</i>,
22: <I>middle</i>, and <I>low</i> bits. The exact division between these
23: three groups of bits is dependent on the detailed collector configuration.
24: <P>
25: The high and middle bits are used to look up an entry in the table described
26: here. The resulting table entry consists of either a block descriptor
27: (<TT>struct hblkhdr *</tt> or <TT>hdr *</tt>)
28: identifying the layout of objects in the block, or an indication that this
29: address range corresponds to the middle of a large block, together with a
30: hint for locating the actual block descriptor. Such a hint consist
31: of a displacement that can be subtracted from the middle bits of the candidate
32: pointer without leaving the object.
33: <P>
34: In either case, the block descriptor (<TT>struct hblkhdr</tt>)
35: refers to a table of object starting addresses (the <TT>hb_map</tt> field).
36: The starting address table is indexed by the low bits if the candidate pointer.
37: The resulting entry contains a displacement to the beginning of the object,
38: or an indication that this cannot be a valid object pointer.
39: (If all interior pointer are recognized, pointers into large objects
40: are handled specially, as appropriate.)
41:
42: <H2>The Tree</h2>
43: <P>
44: The rest of this discussion focuses on the two level data structure
45: used to map the high and middle bits to the block descriptor.
46: <P>
47: The high bits are used as an index into the <TT>GC_top_index</tt> (really
48: <TT>GC_arrays._top_index</tt>) array. Each entry points to a
49: <TT>bottom_index</tt> data structure. This structure in turn consists
50: mostly of an array <TT>index</tt> indexed by the middle bits of
51: the candidate pointer. The <TT>index</tt> array contains the actual
52: <TT>hdr</tt> pointers.
53: <P>
54: Thus a pointer lookup consists primarily of a handful of memory references,
55: and can be quite fast:
56: <OL>
57: <LI> The appropriate <TT>bottom_index</tt> pointer is looked up in
58: <TT>GC_top_index</tt>, based on the high bits of the candidate pointer.
59: <LI> The appropriate <TT>hdr</tt> pointer is looked up in the
60: <TT>bottom_index</tt> structure, based on the middle bits.
61: <LI> The block layout map pointer is retrieved from the <TT>hdr</tt>
62: structure. (This memory reference is necessary since we try to share
63: block layout maps.)
64: <LI> The displacement to the beginning of the object is retrieved from the
65: above map.
66: </ol>
67: <P>
68: In order to conserve space, not all <TT>GC_top_index</tt> entries in fact
69: point to distinct <TT>bottom_index</tt> structures. If no address with
70: the corresponding high bits is part of the heap, then the entry points
71: to <TT>GC_all_nils</tt>, a single <TT>bottom_index</tt> structure consisting
72: only of NULL <TT>hdr</tt> pointers.
73: <P>
74: <TT>Bottom_index</tt> structures contain slightly more information than
75: just <TT>hdr</tt> pointers. The <TT>asc_link</tt> field is used to link
76: all <TT>bottom_index</tt> structures in ascending order for fast traversal.
77: This list is pointed to be <TT>GC_all_bottom_indices</tt>.
78: It is maintained with the aid of <TT>key</tt> field that contains the
79: high bits corresponding to the <TT>bottom_index</tt>.
80:
81: <H2>64 bit addresses</h2>
82: <P>
83: In the case of 64 bit addresses, this picture is complicated slightly
84: by the fact that one of the index structures would have to be huge to
85: cover the entire address space with a two level tree. We deal with this
86: by turning <TT>GC_top_index</tt> into a chained hash table, instead of
87: a simple array. This adds a <TT>hash_link</tt> field to the
88: <TT>bottom_index</tt> structure.
89: <P>
90: The "hash function" consists of dropping the high bits. This is cheap to
91: compute, and guarantees that there will be no collisions if the heap
92: is contiguous and not excessively large.
93:
94: <H2>A picture</h2>
95: <P>
96: The following is an ASCII diagram of the data structure.
97: This was contributed by Dave Barrett several years ago.
98: <PRE>
99:
100: Data Structure used by GC_base in gc3.7:
101: 21-Apr-94
102:
103:
104:
105:
106: 63 LOG_TOP_SZ[11] LOG_BOTTOM_SZ[10] LOG_HBLKSIZE[13]
107: +------------------+----------------+------------------+------------------+
108: p:| | TL_HASH(hi) | | HBLKDISPL(p) |
109: +------------------+----------------+------------------+------------------+
110: \-----------------------HBLKPTR(p)-------------------/
111: \------------hi-------------------/
112: \______ ________/ \________ _______/ \________ _______/
113: V V V
114: | | |
115: GC_top_index[] | | |
116: --- +--------------+ | | |
117: ^ | | | | |
118: | | | | | |
119: TOP +--------------+<--+ | |
120: _SZ +-<| [] | * | |
121: (items)| +--------------+ if 0 < bi< HBLKSIZE | |
122: | | | | then large object | |
123: | | | | starts at the bi'th | |
124: v | | | HBLK before p. | i |
125: --- | +--------------+ | (word- |
126: v | aligned) |
127: bi= |GET_BI(p){->hash_link}->key==hi | |
128: v | |
129: | (bottom_index) \ scratch_alloc'd | |
130: | ( struct bi ) / by get_index() | |
131: --- +->+--------------+ | |
132: ^ | | | |
133: ^ | | | |
134: BOTTOM | | ha=GET_HDR_ADDR(p) | |
135: _SZ(items)+--------------+<----------------------+ +-------+
136: | +--<| index[] | |
137: | | +--------------+ GC_obj_map: v
138: | | | | from / +-+-+-----+-+-+-+-+ ---
139: v | | | GC_add < 0| | | | | | | | ^
140: --- | +--------------+ _map_entry \ +-+-+-----+-+-+-+-+ |
141: | | asc_link | +-+-+-----+-+-+-+-+ MAXOBJSZ
142: | +--------------+ +-->| | | j | | | | | +1
143: | | key | | +-+-+-----+-+-+-+-+ |
144: | +--------------+ | +-+-+-----+-+-+-+-+ |
145: | | hash_link | | | | | | | | | | v
146: | +--------------+ | +-+-+-----+-+-+-+-+ ---
147: | | |<--MAX_OFFSET--->|
148: | | (bytes)
149: HDR(p)| GC_find_header(p) | |<--MAP_ENTRIES-->|
150: | \ from | =HBLKSIZE/WORDSZ
151: | (hdr) (struct hblkhdr) / alloc_hdr() | (1024 on Alpha)
152: +-->+----------------------+ | (8/16 bits each)
153: GET_HDR(p)| word hb_sz (words) | |
154: +----------------------+ |
155: | struct hblk *hb_next | |
156: +----------------------+ |
157: |mark_proc hb_mark_proc| |
158: +----------------------+ |
159: | char * hb_map |>-------------+
160: +----------------------+
161: | ushort hb_obj_kind |
162: +----------------------+
163: | hb_last_reclaimed |
164: --- +----------------------+
165: ^ | |
166: MARK_BITS| hb_marks[] | *if hdr is free, hb_sz + DISCARD_WORDS
167: _SZ(words)| | is the size of a heap chunk (struct hblk)
168: v | | of at least MININCR*HBLKSIZE bytes (below),
169: --- +----------------------+ otherwise, size of each object in chunk.
170:
171: Dynamic data structures above are interleaved throughout the heap in blocks of
172: size MININCR * HBLKSIZE bytes as done by gc_scratch_alloc which cannot be
173: freed; free lists are used (e.g. alloc_hdr). HBLK's below are collected.
174:
175: (struct hblk)
176: --- +----------------------+ < HBLKSIZE --- --- DISCARD_
177: ^ |garbage[DISCARD_WORDS]| aligned ^ ^ HDR_BYTES WORDS
178: | | | | v (bytes) (words)
179: | +-----hb_body----------+ < WORDSZ | --- ---
180: | | | aligned | ^ ^
181: | | Object 0 | | hb_sz |
182: | | | i |(word- (words)|
183: | | | (bytes)|aligned) v |
184: | + - - - - - - - - - - -+ --- | --- |
185: | | | ^ | ^ |
186: n * | | j (words) | hb_sz BODY_SZ
187: HBLKSIZE | Object 1 | v v | (words)
188: (bytes) | |--------------- v MAX_OFFSET
189: | + - - - - - - - - - - -+ --- (bytes)
190: | | | !All_INTERIOR_PTRS ^ |
191: | | | sets j only for hb_sz |
192: | | Object N | valid object offsets. | |
193: v | | All objects WORDSZ v v
194: --- +----------------------+ aligned. --- ---
195:
196: DISCARD_WORDS is normally zero. Indeed the collector has not been tested
197: with another value in ages.
198: </pre>
199: </body>
FreeBSD-CVSweb <freebsd-cvsweb@FreeBSD.org>