Annotation of OpenXM_contrib2/asir2000/gc/doc/tree.html, Revision 1.1
1.1 ! noro 1: <HTML>
! 2: <HEAD>
! 3: <TITLE> Two-Level Tree Structure for Fast Pointer Lookup</TITLE>
! 4: <AUTHOR> Hans-J. Boehm, Silicon Graphics</author>
! 5: </HEAD>
! 6: <BODY>
! 7: <H1>Two-Level Tree Structure for Fast Pointer Lookup</h1>
! 8: <P>
! 9: The conservative garbage collector described
! 10: <A HREF="gc.html">here</a> uses a 2-level tree
! 11: data structure to aid in fast pointer identification.
! 12: This data structure is described in a bit more detail here, since
! 13: <OL>
! 14: <LI> Variations of the data structure are more generally useful.
! 15: <LI> It appears to be hard to understand by reading the code.
! 16: <LI> Some other collectors appear to use inferior data structures to
! 17: solve the same problem.
! 18: <LI> It is central to fast collector operation.
! 19: </ol>
! 20: A candidate pointer is divided into three sections, the <I>high</i>,
! 21: <I>middle</i>, and <I>low</i> bits. The exact division between these
! 22: three groups of bits is dependent on the detailed collector configuration.
! 23: <P>
! 24: The high and middle bits are used to look up an entry in the table described
! 25: here. The resulting table entry consists of either a block descriptor
! 26: (<TT>struct hblkhdr *</tt> or <TT>hdr *</tt>)
! 27: identifying the layout of objects in the block, or an indication that this
! 28: address range corresponds to the middle of a large block, together with a
! 29: hint for locating the actual block descriptor. Such a hint consist
! 30: of a displacement that can be subtracted from the middle bits of the candidate
! 31: pointer without leaving the object.
! 32: <P>
! 33: In either case, the block descriptor (<TT>struct hblkhdr</tt>)
! 34: refers to a table of object starting addresses (the <TT>hb_map</tt> field).
! 35: The starting address table is indexed by the low bits if the candidate pointer.
! 36: The resulting entry contains a displacement to the beginning of the object,
! 37: or an indication that this cannot be a valid object pointer.
! 38: (If all interior pointer are recognized, pointers into large objects
! 39: are handled specially, as appropriate.)
! 40:
! 41: <H2>The Tree</h2>
! 42: <P>
! 43: The rest of this discussion focuses on the two level data structure
! 44: used to map the high and middle bits to the block descriptor.
! 45: <P>
! 46: The high bits are used as an index into the <TT>GC_top_index</tt> (really
! 47: <TT>GC_arrays._top_index</tt>) array. Each entry points to a
! 48: <TT>bottom_index</tt> data structure. This structure in turn consists
! 49: mostly of an array <TT>index</tt> indexed by the middle bits of
! 50: the candidate pointer. The <TT>index</tt> array contains the actual
! 51: <TT>hdr</tt> pointers.
! 52: <P>
! 53: Thus a pointer lookup consists primarily of a handful of memory references,
! 54: and can be quite fast:
! 55: <OL>
! 56: <LI> The appropriate <TT>bottom_index</tt> pointer is looked up in
! 57: <TT>GC_top_index</tt>, based on the high bits of the candidate pointer.
! 58: <LI> The appropriate <TT>hdr</tt> pointer is looked up in the
! 59: <TT>bottom_index</tt> structure, based on the middle bits.
! 60: <LI> The block layout map pointer is retrieved from the <TT>hdr</tt>
! 61: structure. (This memory reference is necessary since we try to share
! 62: block layout maps.)
! 63: <LI> The displacement to the beginning of the object is retrieved from the
! 64: above map.
! 65: </ol>
! 66: <P>
! 67: In order to conserve space, not all <TT>GC_top_index</tt> entries in fact
! 68: point to distinct <TT>bottom_index</tt> structures. If no address with
! 69: the corresponding high bits is part of the heap, then the entry points
! 70: to <TT>GC_all_nils</tt>, a single <TT>bottom_index</tt> structure consisting
! 71: only of NULL <TT>hdr</tt> pointers.
! 72: <P>
! 73: <TT>Bottom_index</tt> structures contain slightly more information than
! 74: just <TT>hdr</tt> pointers. The <TT>asc_link</tt> field is used to link
! 75: all <TT>bottom_index</tt> structures in ascending order for fast traversal.
! 76: This list is pointed to be <TT>GC_all_bottom_indices</tt>.
! 77: It is maintained with the aid of <TT>key</tt> field that contains the
! 78: high bits corresponding to the <TT>bottom_index</tt>.
! 79:
! 80: <H2>64 bit addresses</h2>
! 81: <P>
! 82: In the case of 64 bit addresses, this picture is complicated slightly
! 83: by the fact that one of the index structures would have to be huge to
! 84: cover the entire address space with a two level tree. We deal with this
! 85: by turning <TT>GC_top_index</tt> into a chained hash table, instead of
! 86: a simple array. This adds a <TT>hash_link</tt> field to the
! 87: <TT>bottom_index</tt> structure.
! 88: <P>
! 89: The "hash function" consists of dropping the high bits. This is cheap to
! 90: compute, and guarantees that there will be no collisions if the heap
! 91: is contiguous and not excessively large.
! 92:
! 93: <H2>A picture</h2>
! 94: <P>
! 95: The following is an ASCII diagram of the data structure.
! 96: This was contributed by Dave Barrett several years ago.
! 97: <PRE>
! 98:
! 99: Data Structure used by GC_base in gc3.7:
! 100: 21-Apr-94
! 101:
! 102:
! 103:
! 104:
! 105: 63 LOG_TOP_SZ[11] LOG_BOTTOM_SZ[10] LOG_HBLKSIZE[13]
! 106: +------------------+----------------+------------------+------------------+
! 107: p:| | TL_HASH(hi) | | HBLKDISPL(p) |
! 108: +------------------+----------------+------------------+------------------+
! 109: \-----------------------HBLKPTR(p)-------------------/
! 110: \------------hi-------------------/
! 111: \______ ________/ \________ _______/ \________ _______/
! 112: V V V
! 113: | | |
! 114: GC_top_index[] | | |
! 115: --- +--------------+ | | |
! 116: ^ | | | | |
! 117: | | | | | |
! 118: TOP +--------------+<--+ | |
! 119: _SZ +-<| [] | * | |
! 120: (items)| +--------------+ if 0 < bi< HBLKSIZE | |
! 121: | | | | then large object | |
! 122: | | | | starts at the bi'th | |
! 123: v | | | HBLK before p. | i |
! 124: --- | +--------------+ | (word- |
! 125: v | aligned) |
! 126: bi= |GET_BI(p){->hash_link}->key==hi | |
! 127: v | |
! 128: | (bottom_index) \ scratch_alloc'd | |
! 129: | ( struct bi ) / by get_index() | |
! 130: --- +->+--------------+ | |
! 131: ^ | | | |
! 132: ^ | | | |
! 133: BOTTOM | | ha=GET_HDR_ADDR(p) | |
! 134: _SZ(items)+--------------+<----------------------+ +-------+
! 135: | +--<| index[] | |
! 136: | | +--------------+ GC_obj_map: v
! 137: | | | | from / +-+-+-----+-+-+-+-+ ---
! 138: v | | | GC_add < 0| | | | | | | | ^
! 139: --- | +--------------+ _map_entry \ +-+-+-----+-+-+-+-+ |
! 140: | | asc_link | +-+-+-----+-+-+-+-+ MAXOBJSZ
! 141: | +--------------+ +-->| | | j | | | | | +1
! 142: | | key | | +-+-+-----+-+-+-+-+ |
! 143: | +--------------+ | +-+-+-----+-+-+-+-+ |
! 144: | | hash_link | | | | | | | | | | v
! 145: | +--------------+ | +-+-+-----+-+-+-+-+ ---
! 146: | | |<--MAX_OFFSET--->|
! 147: | | (bytes)
! 148: HDR(p)| GC_find_header(p) | |<--MAP_ENTRIES-->|
! 149: | \ from | =HBLKSIZE/WORDSZ
! 150: | (hdr) (struct hblkhdr) / alloc_hdr() | (1024 on Alpha)
! 151: +-->+----------------------+ | (8/16 bits each)
! 152: GET_HDR(p)| word hb_sz (words) | |
! 153: +----------------------+ |
! 154: | struct hblk *hb_next | |
! 155: +----------------------+ |
! 156: |mark_proc hb_mark_proc| |
! 157: +----------------------+ |
! 158: | char * hb_map |>-------------+
! 159: +----------------------+
! 160: | ushort hb_obj_kind |
! 161: +----------------------+
! 162: | hb_last_reclaimed |
! 163: --- +----------------------+
! 164: ^ | |
! 165: MARK_BITS| hb_marks[] | *if hdr is free, hb_sz + DISCARD_WORDS
! 166: _SZ(words)| | is the size of a heap chunk (struct hblk)
! 167: v | | of at least MININCR*HBLKSIZE bytes (below),
! 168: --- +----------------------+ otherwise, size of each object in chunk.
! 169:
! 170: Dynamic data structures above are interleaved throughout the heap in blocks of
! 171: size MININCR * HBLKSIZE bytes as done by gc_scratch_alloc which cannot be
! 172: freed; free lists are used (e.g. alloc_hdr). HBLK's below are collected.
! 173:
! 174: (struct hblk)
! 175: --- +----------------------+ < HBLKSIZE --- --- DISCARD_
! 176: ^ |garbage[DISCARD_WORDS]| aligned ^ ^ HDR_BYTES WORDS
! 177: | | | | v (bytes) (words)
! 178: | +-----hb_body----------+ < WORDSZ | --- ---
! 179: | | | aligned | ^ ^
! 180: | | Object 0 | | hb_sz |
! 181: | | | i |(word- (words)|
! 182: | | | (bytes)|aligned) v |
! 183: | + - - - - - - - - - - -+ --- | --- |
! 184: | | | ^ | ^ |
! 185: n * | | j (words) | hb_sz BODY_SZ
! 186: HBLKSIZE | Object 1 | v v | (words)
! 187: (bytes) | |--------------- v MAX_OFFSET
! 188: | + - - - - - - - - - - -+ --- (bytes)
! 189: | | | !All_INTERIOR_PTRS ^ |
! 190: | | | sets j only for hb_sz |
! 191: | | Object N | valid object offsets. | |
! 192: v | | All objects WORDSZ v v
! 193: --- +----------------------+ aligned. --- ---
! 194:
! 195: DISCARD_WORDS is normally zero. Indeed the collector has not been tested
! 196: with another value in ages.
! 197: </pre>
! 198: </body>
FreeBSD-CVSweb <freebsd-cvsweb@FreeBSD.org>