[BACK]Return to tree.html CVS log [TXT][DIR] Up to [local] / OpenXM_contrib2 / asir2000 / gc / doc

Annotation of OpenXM_contrib2/asir2000/gc/doc/tree.html, Revision 1.2

1.1       noro        1: <HTML>
                      2: <HEAD>
                      3:     <TITLE>  Two-Level Tree Structure for Fast Pointer Lookup</TITLE>
1.2     ! noro        4:     <AUTHOR> Hans-J. Boehm, Silicon Graphics (now at HP)</author>
1.1       noro        5: </HEAD>
                      6: <BODY>
                      7: <H1>Two-Level Tree Structure for Fast Pointer Lookup</h1>
                      8: <P>
                      9: The conservative garbage collector described
1.2     ! noro       10: <A HREF="http://www.hpl.hp.com/personal/Hans_Boehm/gc/">here</a>
        !            11: uses a 2-level tree
1.1       noro       12: data structure to aid in fast pointer identification.
                     13: This data structure is described in a bit more detail here, since
                     14: <OL>
                     15: <LI> Variations of the data structure are more generally useful.
                     16: <LI> It appears to be hard to understand by reading the code.
                     17: <LI> Some other collectors appear to use inferior data structures to
                     18: solve the same problem.
                     19: <LI> It is central to fast collector operation.
                     20: </ol>
                     21: A candidate pointer is divided into three sections, the <I>high</i>,
                     22: <I>middle</i>, and <I>low</i> bits.  The exact division between these
                     23: three groups of bits is dependent on the detailed collector configuration.
                     24: <P>
                     25: The high and middle bits are used to look up an entry in the table described
                     26: here.  The resulting table entry consists of either a block descriptor
                     27: (<TT>struct hblkhdr *</tt> or <TT>hdr *</tt>)
                     28: identifying the layout of objects in the block, or an indication that this
                     29: address range corresponds to the middle of a large block, together with a
                     30: hint for locating the actual block descriptor.  Such a hint consist
                     31: of a displacement that can be subtracted from the middle bits of the candidate
                     32: pointer without leaving the object.
                     33: <P>
                     34: In either case, the block descriptor (<TT>struct hblkhdr</tt>)
                     35: refers to a table of object starting addresses (the <TT>hb_map</tt> field).
                     36: The starting address table is indexed by the low bits if the candidate pointer.
                     37: The resulting entry contains a displacement to the beginning of the object,
                     38: or an indication that this cannot be a valid object pointer.
                     39: (If all interior pointer are recognized, pointers into large objects
                     40: are handled specially, as appropriate.)
                     41:
                     42: <H2>The Tree</h2>
                     43: <P>
                     44: The rest of this discussion focuses on the two level data structure
                     45: used to map the high and middle bits to the block descriptor.
                     46: <P>
                     47: The high bits are used as an index into the <TT>GC_top_index</tt> (really
                     48: <TT>GC_arrays._top_index</tt>) array.  Each entry points to a
                     49: <TT>bottom_index</tt> data structure.  This structure in turn consists
                     50: mostly of an array <TT>index</tt> indexed by the middle bits of
                     51: the candidate pointer.  The <TT>index</tt> array contains the actual
                     52: <TT>hdr</tt> pointers.
                     53: <P>
                     54: Thus a pointer lookup consists primarily of a handful of memory references,
                     55: and can be quite fast:
                     56: <OL>
                     57: <LI> The appropriate <TT>bottom_index</tt> pointer is looked up in
                     58: <TT>GC_top_index</tt>, based on the high bits of the candidate pointer.
                     59: <LI> The appropriate <TT>hdr</tt> pointer is looked up in the
                     60: <TT>bottom_index</tt> structure, based on the middle bits.
                     61: <LI> The block layout map pointer is retrieved from the <TT>hdr</tt>
                     62: structure.  (This memory reference is necessary since we try to share
                     63: block layout maps.)
                     64: <LI> The displacement to the beginning of the object is retrieved from the
                     65: above map.
                     66: </ol>
                     67: <P>
                     68: In order to conserve space, not all <TT>GC_top_index</tt> entries in fact
                     69: point to distinct <TT>bottom_index</tt> structures.  If no address with
                     70: the corresponding high bits is part of the heap, then the entry points
                     71: to <TT>GC_all_nils</tt>, a single <TT>bottom_index</tt> structure consisting
                     72: only of NULL <TT>hdr</tt> pointers.
                     73: <P>
                     74: <TT>Bottom_index</tt> structures contain slightly more information than
                     75: just <TT>hdr</tt> pointers.  The <TT>asc_link</tt> field is used to link
                     76: all <TT>bottom_index</tt> structures in ascending order for fast traversal.
                     77: This list is pointed to be <TT>GC_all_bottom_indices</tt>.
                     78: It is maintained with the aid of <TT>key</tt> field that contains the
                     79: high bits corresponding to the <TT>bottom_index</tt>.
                     80:
                     81: <H2>64 bit addresses</h2>
                     82: <P>
                     83: In the case of 64 bit addresses, this picture is complicated slightly
                     84: by the fact that one of the index structures would have to be huge to
                     85: cover the entire address space with a two level tree.  We deal with this
                     86: by turning <TT>GC_top_index</tt> into a chained hash table, instead of
                     87: a simple array.  This adds a <TT>hash_link</tt> field to the
                     88: <TT>bottom_index</tt> structure.
                     89: <P>
                     90: The "hash function" consists of dropping the high bits.  This is cheap to
                     91: compute, and guarantees that there will be no collisions if the heap
                     92: is contiguous and not excessively large.
                     93:
                     94: <H2>A picture</h2>
                     95: <P>
                     96: The following is an ASCII diagram of the data structure.
                     97: This was contributed by Dave Barrett several years ago.
                     98: <PRE>
                     99:
                    100:                Data Structure used by GC_base in gc3.7:
                    101:                              21-Apr-94
                    102:
                    103:
                    104:
                    105:
                    106:     63                  LOG_TOP_SZ[11]  LOG_BOTTOM_SZ[10]   LOG_HBLKSIZE[13]
                    107:    +------------------+----------------+------------------+------------------+
                    108:  p:|                  |   TL_HASH(hi)  |                  |   HBLKDISPL(p)   |
                    109:    +------------------+----------------+------------------+------------------+
                    110:     \-----------------------HBLKPTR(p)-------------------/
                    111:     \------------hi-------------------/
                    112:                       \______ ________/ \________ _______/ \________ _______/
                    113:                              V                   V                  V
                    114:                              |                   |                  |
                    115:            GC_top_index[]    |                   |                  |
                    116:  ---      +--------------+   |                   |                  |
                    117:   ^       |              |   |                   |                  |
                    118:   |       |              |   |                   |                  |
                    119:  TOP      +--------------+<--+                   |                  |
                    120:  _SZ   +-<|      []      | *                     |                  |
                    121: (items)|  +--------------+  if 0 < bi< HBLKSIZE  |                  |
                    122:   |    |  |              | then large object     |                  |
                    123:   |    |  |              | starts at the bi'th   |                  |
                    124:   v    |  |              | HBLK before p.        |             i    |
                    125:  ---   |  +--------------+                       |          (word-  |
                    126:        v                                         |         aligned) |
                    127:    bi= |GET_BI(p){->hash_link}->key==hi          |                  |
                    128:        v                                         |                  |
                    129:        |   (bottom_index)  \ scratch_alloc'd     |                  |
                    130:        |   ( struct  bi )  / by get_index()      |                  |
                    131:  ---   +->+--------------+                       |                  |
                    132:   ^       |              |                       |                  |
                    133:   ^       |              |                       |                  |
                    134:  BOTTOM   |              |   ha=GET_HDR_ADDR(p)  |                  |
                    135: _SZ(items)+--------------+<----------------------+          +-------+
                    136:   |   +--<|   index[]    |                                  |
                    137:   |   |   +--------------+                      GC_obj_map: v
                    138:   |   |   |              |              from      / +-+-+-----+-+-+-+-+  ---
                    139:   v   |   |              |              GC_add   < 0| | |     | | | | |   ^
                    140:  ---  |   +--------------+             _map_entry \ +-+-+-----+-+-+-+-+   |
                    141:       |   |   asc_link   |                          +-+-+-----+-+-+-+-+ MAXOBJSZ
                    142:       |   +--------------+                      +-->| | |  j  | | | | |  +1
                    143:       |   |     key      |                      |   +-+-+-----+-+-+-+-+   |
                    144:       |   +--------------+                      |   +-+-+-----+-+-+-+-+   |
                    145:       |   |  hash_link   |                      |   | | |     | | | | |   v
                    146:       |   +--------------+                      |   +-+-+-----+-+-+-+-+  ---
                    147:       |                                         |   |<--MAX_OFFSET--->|
                    148:       |                                         |         (bytes)
                    149: HDR(p)| GC_find_header(p)                       |   |<--MAP_ENTRIES-->|
                    150:       |                           \ from        |    =HBLKSIZE/WORDSZ
                    151:       |    (hdr) (struct hblkhdr) / alloc_hdr() |    (1024 on Alpha)
                    152:       +-->+----------------------+              |    (8/16 bits each)
                    153: GET_HDR(p)| word   hb_sz (words) |              |
                    154:           +----------------------+              |
                    155:           | struct hblk *hb_next |              |
                    156:           +----------------------+              |
                    157:           |mark_proc hb_mark_proc|              |
                    158:           +----------------------+              |
                    159:           | char * hb_map        |>-------------+
                    160:           +----------------------+
                    161:           | ushort hb_obj_kind   |
                    162:           +----------------------+
                    163:           |   hb_last_reclaimed  |
                    164:  ---      +----------------------+
                    165:   ^       |                      |
                    166:  MARK_BITS|       hb_marks[]     | *if hdr is free, hb_sz + DISCARD_WORDS
                    167: _SZ(words)|                      |  is the size of a heap chunk (struct hblk)
                    168:   v       |                      |  of at least MININCR*HBLKSIZE bytes (below),
                    169:  ---      +----------------------+  otherwise, size of each object in chunk.
                    170:
                    171: Dynamic data structures above are interleaved throughout the heap in blocks of
                    172: size MININCR * HBLKSIZE bytes as done by gc_scratch_alloc which cannot be
                    173: freed; free lists are used (e.g. alloc_hdr).  HBLK's below are collected.
                    174:
                    175:              (struct hblk)
                    176:  ---      +----------------------+ < HBLKSIZE ---         ---          DISCARD_
                    177:   ^       |garbage[DISCARD_WORDS]|   aligned   ^           ^ HDR_BYTES WORDS
                    178:   |       |                      |             |           v (bytes)   (words)
                    179:   |       +-----hb_body----------+ < WORDSZ    |          ---   ---
                    180:   |       |                      |   aligned   |           ^     ^
                    181:   |       |      Object 0        |             |           hb_sz |
                    182:   |       |                      |           i |(word-    (words)|
                    183:   |       |                      |      (bytes)|aligned)   v     |
                    184:   |       + - - - - - - - - - - -+ ---         |          ---    |
                    185:   |       |                      |  ^          |           ^     |
                    186:   n *     |                      |  j (words)  |          hb_sz BODY_SZ
                    187:  HBLKSIZE |      Object 1        |  v          v           |   (words)
                    188:  (bytes)  |                      |---------------          v   MAX_OFFSET
                    189:   |       + - - - - - - - - - - -+                        ---  (bytes)
                    190:   |       |                      | !All_INTERIOR_PTRS      ^     |
                    191:   |       |                      | sets j only for       hb_sz   |
                    192:   |       |      Object N        | valid object offsets.   |     |
                    193:   v       |                      | All objects WORDSZ      v     v
                    194:  ---      +----------------------+ aligned.               ---   ---
                    195:
                    196: DISCARD_WORDS is normally zero.  Indeed the collector has not been tested
                    197: with another value in ages.
                    198: </pre>
                    199: </body>

FreeBSD-CVSweb <freebsd-cvsweb@FreeBSD.org>