Þ   briarpig  » thorn  » demos  » map


demos are explained here; a menu at top column right indexes actual topic demos. Here we demo map.

problem

     Wil uses many map classes associating keys with values; all have similar apis even when code is different. Most maps in þ are hashmaps because Wil prefers constant time — aka O(1) time — solutions almost always when optimization is involved (and usually Wil is optimizing something).

     Many folks familiar with std::map in STL don't know it's a tree instead of a hashmap, with O(log(n)) behavior instead of O(1) constant time. In small maps this hardly matters. But in very large maps, more memory is touched to find map entries, resulting in slower speed. Of course, trees also sort map entries — which is great when you need sorted, but meaningless when order does not matter. Hashmaps are famously orderless.

     The main purpose of this page is to show hashmaps used in a writer to pretty print data in a toy language under mu — more about these writer hashmaps shortly. First consider a few common features of hashmaps (so the topic of maps is treated with some small degree of generality before it's hijacked by a single special case example).

see also «

     Other pages show more examples of hashmaps:

  • demo hash has a test closed/probing map (cf »)
  • mu's symbol has open/chaining intern table (cf »)
  • demo page has open/chaining map ypm (cf »)
  • demo book has open/chaining map yBqm (cf »)
  • demo pile has several open/chaining maps (cf »)

maps «

     Maps typically store (key, value) pairs in some representation, with unique keys linked to associated values. But in some special cases only a key matters, so an associated value is meaningless; thus in rare cases, a map might hold only keys.

     Pairs in hashmaps are stored in some order deriving from a hash of each key — the slot to which a key maps is usually a function of the key's hash and the size of a map's underlying table of buckets, where a bucket is where you start looking for a matching key once you start there given a key's hash. The purpose of hashing a key is to skip nearly all map elements, heading straight for the location of a desired key. When a map has as many buckets as members, and when the hash used is collision-resistant, the expected number of keys examined to find a desired pair is one on average, and rarely much more.

     However, some keys will map to the same bucket in a table even when a hash is collision-resistant, so a hashmap must use some way to resolve bucket collisions so keys are stored even when the same starting bucket is hit. Classically, hashmaps use two different major styles of resolving bucket collisions: chaining in a list within a bucket (called open hashing because a bucket is open to multiple additions) and probing in successor buckets (called closed hashing because a bucket is closed to extra additions). Wil uses both kinds of hashmaps and chooses which style to use depending on expected population size and speed requirements. Neither style is best all the time.

     When speed is paramount and space is no object when speed must be optimized, Wil usually uses a closed/probing hashmap because he can arrange very few cache lines are touched on average, which is faster than touching more cache lines. In this sense, an open/chaining hashmap has inferior speed because lists touch more places in memory when scanning a list. But you can't fill a closed/probing hashmap more than around 45% full if you use tombstones because you'd be risking a map completely full of either used or tombstoned buckets once you go over the 50% mark. However, if (key, val) pairs are small, closed/probing hashmaps are still competitive in size with open/chaining hashmaps, even when using less than half available space, because lists have both bucket and entry overhead.

     Wil uses open/chaining hashmaps when speed is less important than flexibility in hashmap table size because expected population size is hard to predict, and it's too expensive to err on the side of allocating far too much space. Buckets that are lists of pairs hitting a bucket are flexible since a hashmap doesn't become full once more members are present than number of buckets — lists just get longer. (But when lists on average get too long, a new larger table is needed to preserve constant time access.) In some situations, Wil assumes a fix-sized hashmap will always be big enough when it has at least a thousand buckets in a page or sixteen thousand in a book. When pretty printing Lisp expressions, for example, it's reasonable to assume a tree of thousands of boxes would be unusually large in size, and that slightly slower printing speed should be okay when mapping very big graphs into buckets with longer lists.

shared structure maps «

     The name of the hashmap class on this page is ypp2dm, where the extra 2d in the name means two dimensional because it supports a two-layer view of content inside: a long-lived zero layer and a transient nonzero layer used only when 'depth' increases above zero. When current depth is above zero, the zero layer is immutable and all changes occur in the nonzero layer. Later when depth decreases to zero again, all changes above the zero layer are discarded in bulk, and the zero layer becomes mutable again.

     The topmost layer is always mutable. When a transient layer is added above the zero layer, it starts with a copy of each zero layer map, created by a single 4K memcopy of one page, taking a snapshot of old buckets so they can be updated in a new top layer without altering the layer underneath. Each layer allocates space in a different set of pages using the pile heap api. This permits all pages associated with one layer to be freed in bulk using arena style allocation.

     The purpose of this two layer system is to support completely reversible trial map updates when a REPL writer in the toy language performs mock printing runs of expressions to measure expected length of output. Trial runs occur in a nonzero layer created when RAII guard ypp2dm::Mlayer is first declared on the stack, increasing depth from zero to one. Any subsequent nested declarations of guard ypp2dm::Mlayer, when they occur, are counted but don't change an existing nonzero transient layer. (In recursive printing code, it can be awkward to know whether your caller already entered a transient mock printing state, so this nested usage of ypp2dm::Mlayer permits local code not to know or care whether a current layer is zero or not.)

     Internally ypp2dm actually contains two different hashmaps, both with a zero and nonzero layer, grouped together because printing code needs both. The first is a set of pointers just like ypm in the page demo (cf »). The second hashmap is just like yppm in the same demo (cf »). Member names in ypp2dm contain a numeral 1 when only one p pointer is in a ypm style map involved; and member names contain numeral 2 when two pp pointers are needed for both key and value in a yppm style map. Call these maps by names map 1 and map 2.

     Map 1 is used to remember box addresses already seen. Map 2 is used to associate a box address with an integer name, so the first time a box is printed it can be labeled with that name — like this: #3=box — and subsequent appearances can be replaced completely by a backward reference like this: #3#.

     Because numerals 1 and 2 are used to chunk members associated with the two maps, Wil avoided using numeral 0 in names associated with the zero layer even though natural, because it would be confusing. Instead, Wil added base as a shorter synonym for origin to member names devoted specifically to the zero layer. For example, both maps have arrays of bucket pointers and one member points to the current bucket table, while another member with suffix _base always points to the zero layer table.

class api «

     "Hey Wil," Stu said. "I've been reading ypp2dm source code and it looks like only layer zero puts freed elems in a free list. Why is that? Why leak freed nonzero layer elems?"

     "A couple reasons," Wil replied. "First, the nonzero layer is temporary on a tiny time horizon. Second, pages allocated by a nonzero layer are freed all together anyway."

     "What makes the time horizon so small?" Stu asked. "Is this specific to your planned use in a pretty printer?"

     "Yes, it is," Wil nodded. "A nozero layer might only last as long as it takes to finish formatting enough content on the current line to require a line break."

     "Wow," Stu considered. "If you're going to get pages from the heap and return them that rapidly for a transient layer, maybe you should pool free pages locally."

     Wil sighed. "You want me to make this code even more complex?" he asked. "The pile api is a local page pool."

     "It was just an idea," Stu shrugged.

class ypp2dm { // two layered combination of ypm and yppm « private: static const u32 kprime1k = e1i_1021_1k; // prime < 1K « static const u32 kslots = e1i_1021_1k; // prime filling 4K «

     Constant kslots is a prime number of pointers fitting in one page; it assumes a yP page is 4K in size, and that pointers are 32 bits in size.

yPBH* m_H; // source of all pages « void* m_nilval; // ptr value representing nil value « u32 m_d; // count: _layer() subtract _unlayer() calls «

     All memory is allocated from pile m_H, which is passed to constructors for each yPq member — they actually allocate each yP page used in maps.

     Pointer m_nilval, defaulting to zero, is returned when a search fails to find a key in a map. Depth m_d is the current layer, starting at zero and only increasing within the scope of Mlayer instances constructed.

u32 m_tag; // must equal e_pp2dm magic signature: « enum { e_pp2dm = 0x7032646d, e_dead_pp2dm = 0x32444d7a };

     Magic signature m_tag must equal constant e_pp2dm while the ypp2dm instance is still alive.

struct Mpe { // map elem in hashmap bucket « Mpe* e_n; // next elem in hashmap bucket « void* e_p; // pointer key stored in this hashmap elem « Mpe() : e_n(0), e_p(0) { } Mpe(void* p) : e_n(0), e_p(p) { } Mpe(Mpe* n, void* p) : e_n(n), e_p(p) { } }; // Mpe: elem in list of pointers in one m_1epp bucket

     Map 1 uses instances of Mpe for each pointer elem in map 1, using e_p as key and e_n as next elem in the same bucket.

yPq m_1Pq; // pages in layers ABOVE ZERO (never zero) « yPq m_1Pq_base; // pages allocated in layer zero « // when m_1vH is exhausted, we alloc page for Mpe elems: yv m_1vH; // space left in current page to alloc elems « yv m_1vH_base; // space left in current layer zero page «

     List m_1Pq_base holds handles to each yP page allocated in layer zero for map 1, while m_1Pq holds the refs to all pages allocated above layer zero. Byte run m_1vH contains free bytes remaining in the last page of elems allocated in the current layer; when depth goes above zero and later returns to zero, m_1vH_base saves a copy and then later restores the value of m_1vH. (When layer zero is active, m_1vH_base is empty.)

n32 m_1n; // count of hashmap entries « n32 m_1n_base; // count of layer zero hashmap entries « // m_1epp comes from 1st page alloc'd in m_Pq by _m1init(): Mpe** m_1epp; // hashmap array: m_1epp[kslots] in a page « Mpe** m_1epp_base; // hashmap buckets in layer zero « // free list needs no "base" when used only in layer zero: Mpe* m_1efree; // free list: removals (layer zero only) «

     The current number of elems in map 1 is counted in m_1n; when depth goes above zero and later returns to zero, m_1n_base saves a copy and then later restores the value of m_1n.

     Array of elem pointers m_1epp is lazily allocated as a single page on first use, initialized to all zeroes when this occurs first in layer zero (when m_1epp_base is still nil). When depth goes above zero, m_1epp is moved to m_1epp_base and then zeroed so it can be lazily allocated once again in a new page; this time when m_1epp_base is non-nil, a new m_1epp copies of m_1epp_base so transient layer nonzero starts with a copy of layer zero.

     Free list m_1efree holds elems removed from map 1, but only when this occurs in layer zero; there is no free list above layer zero: recovery occurs when m_1Pq frees all pages.

bool _m1init(); // true if succeeds 1st init when m_1epp=nil Mpe* _m1new(void* p); // alloc elem from m_vH or new page yo& _m1dump_epp(yo& o) const; // dump ptr hashmap

// note: free list is ONLY used when 0==m_layer in zero layer void _m1push(Mpe* e) { « if (!m_d) { e->e_n = m_1efree; m_1efree = e; } }

     Free map 1 elems are pushed by _m1push(), but this is only done when depth is zero.

Mpe* _m1pop() { « yassert("_m1pop" && 0 == m_d); // only pop in layer zero Mpe* e = m_1efree; if (e) m_1efree = e->e_n; return e; }

     Free map 1 elems are popped by _m1pop(), but this must only occur when depth is zero.

struct Mkve { // map key/val elem in hashmap bucket « Mkve* e_n; // next elem in hashmap bucket « void* e_key; // pointer key stored in this map elem « void* e_val; // the value associated with key « Mkve() : e_n(0), e_key(0), e_val(0) { } Mkve(Mkve* n, void* k, void* v) : e_n(n), e_key(k), e_val(v) { } Mkve(void* k, void* v) : e_n(0), e_key(k), e_val(v) { } }; // Mkve: elem list of ptr pairs in one m_2epp map bucket

     Map 2 uses instances of Mkve for each key/val elem in map 2, using e_key and e_val (obviously) as the key/val pair, and e_n as the next elem in the same bucket.

yPq m_2Pq; // pages in layers ABOVE ZERO (never zero) « yPq m_2Pq_base; // pages allocated in layer zero « // whenever m_2vH is exhausted, we alloc page for Mkve elems: yv m_2vH; // space left in current page for elems « yv m_2vH_base; // space left in current layer zero page «

     List m_2Pq_base holds handles to each yP page allocated in layer zero for map 2, while m_2Pq holds refs to all pages allocated above layer zero. Byte run m_2vH contains free bytes remaining in the last page of elems allocated in the current layer for map 2; when depth goes above zero and later returns to zero, m_2vH_base saves a copy and then later restores the value of m_2vH. (When layer zero is active, m_2vH_base is empty.)

n32 m_2n; // count of hashmap entries « n32 m_2n_base; // count of hashmap entries in layer zero « // m_2epp comes from 1st page alloc'd in m_Pq by _m2init(): Mkve** m_2epp; // hashmap array: m_2epp[kslots] in a page « Mkve** m_2epp_base; // hashmap buckets in layer zero « // free list needs no "base" when used only in layer zero: Mkve* m_2efree; // free list: removals (layer zero only) «

     The current number of elems in map 2 is counted in m_2n; when depth goes above zero and later returns to zero, m_2n_base saves a copy and then later restores the value of m_2n.

     Array of key/val elems m_2epp is lazily allocated as a single page on first use, initialized to all zeroes when this occurs first in layer zero (when m_2epp_base is still nil). When depth goes above zero, m_2epp is moved to m_2epp_base and then zeroed so it can be lazily allocated once again in a new page; this time when m_2epp_base is non-nil, a new m_2epp copies of m_2epp_base so transient layer nonzero starts with a copy of layer zero.

     Free list m_2efree holds elems removed from map 2, but only when this occurs in layer zero; there is no free list above layer zero: recovery occurs when m_2Pq frees all pages.

bool _m2init(); // true if succeeds 1st init when m_2epp=nil Mkve* _m2new(void* k, void* v); // alloc in m_vH or new page yo& _m2dump_epp(yo& o) const; // dump ptr pair hashmap

// note: free list is ONLY used when 0==m_layer in zero layer void _m2push(Mkve* e) { « if (!m_d) { e->e_n = m_2efree; m_2efree = e; } }

     Free map 2 elems are pushed by _m2push(), but this is only done when depth is zero.

Mkve* _m2pop() { « yassert("_m2pop" && 0 == m_d); // only pop in layer zero Mkve* e = m_2efree; if (e) m_2efree = e->e_n; return e; }

     Free map 2 elems are popped by _m2pop(), but this must only occur when depth is zero.

// add new layer, but change only on zero to one transition void _layer(); // add a new temp layer on BOTH old maps // cut old layer, but change only on one to zero transition void _unlayer(); // remove temp layer, show BOTH old maps

     Non-public methods _layer() and _unlayer() increase and decrease depth by one, respectively. Edge triggered changes occur only on transition to and from layer zero. The only public way to invoke these methods is by instantiating guard class Mlayer shown below.

public: class Mlayer { // calls _layer() in ctor, _unlayer() in dtor « private: ypp2dm* l_m; // the map getting a layer public: // only ONE layer expected (more layers might warn) Mlayer(ypp2dm& m) : l_m(&m) { l_m->_layer(); } // add layer ~Mlayer() { l_m->_unlayer(); l_m = 0; } // remove layer }; // Mlayer

     Class Mlayer is a RAII guard class used to increase and decrease layer depth as a side effect of creating and destroying a guard instance. Since destructors are exception safe, this means depth increased by constructing Mlayer on the stack will later be undone, even if an exception is thrown. (But Wil does everything in his power to avoid using exceptions.)

u32 mdepth() const { return m_d; } // Mlayer # for ypp2dm «

     Depth of the current layer is returned by mdepth(), not that you have any reason to ever care.

// api for 1ST of two maps: hold key ptr p members only: void* m1remove(void* p); // cut map key (return p or nilval)

     Method m1remove() deletes key p from the map if currently a member; if found, p is returned, otherwise m_nilval.

     NOTE: this does not respect immutability of layer zero, because a removed elem above layer zero is not returned when layer zero goes back into effect. However, the pretty printer in the toy writer doesn't need to remove elems in transient nonzero layers, so this issue doesn't arise. (And Wil didn't see any reason to solve academic problems ahead of need.)

void m1add(void* p); // add p to map if not present void msee(const void* p) { this->m1add((void*)p); } «

     If not already a member of map 1, m1add() adds pointer p as a new member key in the map. Method msee() is a synonym emphasizing a role of map 1 as means used to track objects seen before when pretty printing.

// return fast answer:: Is address p inside in this map? const void* m1find(const void* p) const; // => p or nilval

     Method m1find() returns p if it's a member of map 1, and otherwise returns m_nilval. Method msaw() below is a synonym for the role of map 1 as means to recall boxes seen.

const void* msaw(const void* p) const { return m1find(p); } « void m1clear(); void munsee_all() { this->m1clear(); } « n32 m1size() const { return m_1n; } // # ptrs in hashmap « n32 mviews() const { return m_1n; } // # ptrs in hashmap «

     Method m1clear() removes all elems from map 1 in both zero and nonzero layers. The immutability of layer zero when depth is nonzero is not respected — but the pretty printer in the toy writer only calls m1clear() in layer zero anyway.

     Clearing map 1 must be independent from map 2 because pretty printing makes two passes over an box tree, and must clear one map between passes.

// api for 2ND of 2 maps: hold (key,val) pairs as members: void* m2remove(void* key); // cut map key (return old val)

     Method m2remove() deletes key from the map if currently a member; if found the associated value is returned, otherwise m_nilval. Note m2remove() does not respect immutability of layer zero when depth is nonzero, any more than m1remove() does, and for similar reasons.

// add (key,val) to map (return old val): void* m2add(void* key, void* val); // returns old val void* mbind(void* key, void* val) { // returns old val « return this->m2add(key, val); }

     If not already a member of map 2, m2add() adds key as a new member pair in the map, associatd with key. If key was previously already a member, the old value before val is returned, otherwise m_nilval.

     Method mbind() is a synonym emphasizing a role of map 2 as a means to bind an address or integer as name for a box address value, so the box can be found by name later.

typedef ptrdiff_t p_t; // seeing void* ptrs as integers «

     Integer type p_t is a pointer sized integer used to overload m2add() below.

void* m2add(p_t key, void* val) { « return this->m2add((void*)key, val); }

// return value known by key, or nil (value is never nil) void* m2find(const void* key) const; // val if key is in map

     Method m2find() returns the value associated with key if it's a member of map 2, and otherwise returns m_nilval. Method mbound() below is a synonym for the role of map 2 as means to bind names to box addresses.

void* mbound(const void* key) const { return m2find(key); } « void m2clear(); void munbind_all() { this->m2clear(); } « n32 m2size() const { return m_2n; } // # ptrs in hashmap « n32 mties() const { return m_2n; } // # ptrs in hashmap «

     Method m2clear() removes all elems from map 2 in both zero and nonzero layers. The immutability of layer zero when depth is nonzero is not respected — but the pretty printer in the toy writer only calls m2clear() in layer zero anyway.

// nilval is value returned on no key found (default zero) explicit ypp2dm(yPBH* h, void* nilval=0); // heap for pages ~ypp2dm(); // releases all books and pages back to the heap

     To create a new ypp2dm instance, you must supply a heap — a pile of pages — used to allocate all space used. If zero does not make an acceptable unique key never used for real elems added, in both maps, then pass in some unique address (say of some dummy global) you know is never the address of a key.

     Destroying ypp2dm releases every page allocated, which return to the source pile passed to the constructor.

struct Mq { // quote wrapper struct « ypp2dm const& q_m; Mq(ypp2dm const& m): q_m(m) { } }; Mq quote() const { return Mq(*this); } // request dump « void mprint() const; // dump to stdout via yout void mdump(yo& o) const; void mcite(yo& o) const; void m2dump(yo& o) const; void m2cite(yo& o) const; void m1dump(yo& o) const; void m1cite(yo& o) const; };

     "Wait," Stu interrupted before Wil could start his usual spiel about debug printing boilerplate. "Zé gave me a card I was supposed to read when you reached this part of the demo."

     Wil sighed. "As long as there's no poetry," he said.

     Looking at the card Stu said, "Zé says this part of the api broadcasts messages to rats with custom brain implants so they can drive tiny bumper cars in a demolition derby."

     Wil pressed fingers into his eyes. "And I have Zé to thank for that?" he asked without heat.

     "Sorry," Stu opened his hands to show innocence. "I thought you two had a running zombie joke and, uh, maybe rats ..."

     "Never mind," Wil held up a hand. "Thanks for that moment of levity. No, actually, this boilerplate resembles printing api in other C++ classes. Do you recall relevant demos?"

     "Yeah," Stu nodded, "The out and quote demos have bits about printing and use of operators like ones below."

inline yo& operator<<(yo& o, ypp2dm const& x) { « x.mdump(o); return o; } inline yo& operator<<(yo& o, ypp2dm::Mq const& x) { « x.q_m.mdump(o); return o; } inline yo& operator<<(yo& o, yct<ypp2dm> const& x) { « x.c_t.mcite(o); return o; }

     "Let's look at sample code using this api below," Wil suggested. "I can truncate an old unit test."

main «

     "Looks like you begin with the usual pile instance as the heap used to allocate space," Stu noted.

main.cpp «

yH& vanillaHeap = yH::g_1H; // malloc() and free() yPBH pile(&vanillaHeap); // allocs pages and books yPBH* heap = &pile; ypp2dm map(heap); yout << "# empty map:" << yendl << map.quote() << yendl;

     "Yes, and after making a ypp2dm instance," Wil said, "I print the initial empty state, without output on stdout as shown below. Note nothing is allocated yet."

# empty map: <ypp2dm me=bffff9e4 H=bffff8fc d=0 1base=0 2base=0> <1epp=0 1n=0 Pn=0 vn=0 1fr=0/> <2epp=0 2n=0 Pn=0 vn=0 2fr=0/> </ypp2dm>

     "Next I add a few odd keys and values," Wil narrated. "The idea is to put only odd integers in the zero layer."

map.m2add((void*)0x10001, (void*)0x1); map.m1add((void*)0x10001); map.m2add((void*)0x30003, (void*)0x3); for (u32 i = 5; i < 10; i += 2) { map.m1add((void*) (i | (i << 16))); map.m2add((void*)(i | (i << 16)), (void*)i); } yout << "# odds:" << yendl << map.quote() << yendl;

     "And there's the output below," Stu anticipated. "Good thing you changed the loop from 10 from 3000. Output would be huge with thousands of elems."

# odds: <ypp2dm me=bffff9e4 H=bffff8fc d=0 1base=0 2base=0> <pm 1epp=2804000 1n=4 b1n=0 Pn=0 bPn=2 vn=4064 bVn=0 1free=0> <face><yPq me=bffff9f4 H=bffff8fc len=0/></face> <base><yPq me=bffffa04 H=bffff8fc len=2> <yh0 a=1 k=S n=28013f8 g=33><yPn me=28013f8 ru=1:1 g=33¬ rwi=w+ tyl=Pu id=5031 P=2804000 crc=e25b9a95/></yh0> <yh0 a=2 k=S n=2801420 g=4e><yPn me=2801420 ru=1:1 g=4e¬ rwi=w+ tyl=Pu id=7031 P=2805000 crc=d5b54f62/></yh0> </yPq></base> <1map 1epp=2804000 b1epp=0 slots=1021 1n=4> <e p=0x10001/> <!-- 1 i=149 --> <e p=0x70007/> <!-- 2 i=196 --> <e p=0x90009/> <!-- 3 i=602 --> <e p=0x50005/> <!-- 4 i=689 --> </map> </pm> <ppm 2epp=2802000 2n=5 b2n=0 Pn=0 bPn=2 vn=4036 bVn=0 2free=0> <face><yPq me=bffffa38 H=bffff8fc len=0/></face> <base><yPq me=bffffa48 H=bffff8fc len=2> <yh0 a=1 k=S n=28013a8 g=6f><yPn me=28013a8 ru=1:1 g=6f¬ rwi=w+ tyl=Pu id=5032 P=2802000 crc=efb99cb3/></yh0> <yh0 a=2 k=S n=28013d0 g=db><yPn me=28013d0 ru=1:1 g=db¬ rwi=w+ tyl=Pu id=7032 P=2803000 crc=f54820cb/></yh0> </yPq></base> <2map 2epp=2802000 b2epp=0 slots=1021 2n=5> <e key=10001 val=1/> <!-- 1 i=149 --> <e key=70007 val=7/> <!-- 2 i=196 --> <e key=30003 val=3/> <!-- 3 i=592 --> <e key=90009 val=9/> <!-- 4 i=602 --> <e key=50005 val=5/> <!-- 5 i=689 --> </2map> </ppm> </ypp2dm>

     "Note I edited the output slightly," Wil pointed. "Extra line breaks are a ¬ glyph and elisions show as a glyph."

     "Next I see you declare a new layer," Stu read. "That makes the zero layer immutable, and the new nonzero layer gets all new additions? New values are even integers?"

     "Yes to all that," Wil agreed. "The new nonzero layer adds elems that are all even. This makes it easier to see they all vanish again when the layer pops."

if (heap) { ypp2dm::Mlayer guard(map); map.m1add((void*)0xa0000a); map.m2add((void*)0x400004, (void*)0x4); for (u32 j = 6; j < 10; j += 2) { map.m1add((void*) (j | (j << 20))); map.m2add((void*)(j | (j << 20)), (void*)j); } yout << "# evens:" << yendl << map.quote() << yendl; }

     "I see output below reveals the number of pages involved," Stu said. "But hashmap content mixes all odd and even elems together. Because they're all part of one map?"

     "Yes," Wil nodded. "Because bucket linked lists simply cross from the top nonzero layer to the earlier zero layer underneath. If I still had thousands of both odd elems in the zero layer and thousands of even elems in the nonzero layer, you'd see buckets with evens at the top and odds underneath."

     "At the end of that block above," Stu noticed, "the layer guard goes out of scope. Does that mean all the even elems are popped at the end of that scope?"

     "Yes, and we'll show that next," Wil said.

# evens: <ypp2dm me=bffff9e4 H=bffff8fc d=1 1base=2804000 2base=2802000> <pm 1epp=2806000 1n=7 b1n=4 Pn=2 bPn=2 vn=4072 bVn=4064¬ 1free=0> <face><yPq me=bffff9f4 H=bffff8fc len=2> <yh0 a=1 k=S n=2801448 g=91><yPn me=2801448 ru=1:1 g=91¬ rwi=w+ tyl=Pu id=3150 P=2806000 /></yh0> <yh0 a=2 k=S n=2801470 g=ac><yPn me=2801470 ru=1:1 g=ac¬ rwi=w+ tyl=Pu id=3170 P=2030000 /></yh0> </yPq></face> <base><yPq me=bffffa04 H=bffff8fc len=2> <yh0 a=1 k=S n=28013f8 g=33><yPn me=28013f8 ru=1:1 g=33¬ rwi=w+ tyl=Pu id=5031 P=2804000 /></yh0> <yh0 a=2 k=S n=2801420 g=4e><yPn me=2801420 ru=1:1 g=4e¬ rwi=w+ tyl=Pu id=7031 P=2805000 /></yh0> </yPq></base> <1map 1epp=2806000 b1epp=2804000 slots=1021 1n=7> <e p=0x800008/> <!-- 1 i=91 --> <e p=0x10001/> <!-- 2 i=149 --> <e p=0x70007/> <!-- 3 i=196 --> <e p=0x600006/> <!-- 4 i=199 --> <e p=0xa0000a/> <!-- 5 i=353 --> <e p=0x90009/> <!-- 6 i=602 --> <e p=0x50005/> <!-- 7 i=689 --> </map> </pm> <ppm 2epp=2031000 2n=8 b2n=5 Pn=2 bPn=2 vn=4060 bVn=4036¬ 2free=0> <face><yPq me=bffffa38 H=bffff8fc len=2> <yh0 a=1 k=S n=2801498 g=43><yPn me=2801498 ru=1:1 g=43¬ rwi=w+ tyl=Pu id=3250 P=2031000 /></yh0> <yh0 a=2 k=S n=28014c0 g=4a><yPn me=28014c0 ru=1:1 g=4a¬ rwi=w+ tyl=Pu id=3270 P=2032000 /></yh0> </yPq></face> <base><yPq me=bffffa48 H=bffff8fc len=2> <yh0 a=1 k=S n=28013a8 g=6f><yPn me=28013a8 ru=1:1 g=6f¬ rwi=w+ tyl=Pu id=5032 P=2802000 /></yh0> <yh0 a=2 k=S n=28013d0 g=db><yPn me=28013d0 ru=1:1 g=db¬ rwi=w+ tyl=Pu id=7032 P=2803000 /></yh0> </yPq></base> <2map 2epp=2031000 b2epp=2802000 slots=1021 2n=8> <e key=800008 val=8/> <!-- 1 i=91 --> <e key=10001 val=1/> <!-- 2 i=149 --> <e key=70007 val=7/> <!-- 3 i=196 --> <e key=600006 val=6/> <!-- 4 i=199 --> <e key=30003 val=3/> <!-- 5 i=592 --> <e key=90009 val=9/> <!-- 6 i=602 --> <e key=50005 val=5/> <!-- 7 i=689 --> <e key=400004 val=4/> <!-- 8 i=974 --> </2map> </ppm> </ypp2dm>

     "Okay, now here's where we just print the map again," Wil said. "This time the nonzero layer is gone, and all the even integers added in the layer have been removed."

yout << "# odds only:"; yout << yendl << map.quote() << yendl << ynow;

     "Cool," Stu said. "I suppose map contents have completely reverted in stdout output shown below."

# odds only: <ypp2dm me=bffff9e4 H=bffff8fc d=0 1base=0 2base=0> <pm 1epp=2804000 1n=4 b1n=0 Pn=0 bPn=2 vn=4064 bVn=0 1free=0> <face><yPq me=bffff9f4 H=bffff8fc len=0/></face> <base><yPq me=bffffa04 H=bffff8fc len=2> <yh0 a=1 k=S n=28013f8 g=33><yPn me=28013f8 ru=1:1 g=33¬ rwi=w+ tyl=Pu id=5031 P=2804000 crc=e25b9a95/></yh0> <yh0 a=2 k=S n=2801420 g=4e><yPn me=2801420 ru=1:1 g=4e¬ rwi=w+ tyl=Pu id=7031 P=2805000 crc=d5b54f62/></yh0> </yPq></base> <1map 1epp=2804000 b1epp=0 slots=1021 1n=4> <e p=0x10001/> <!-- 1 i=149 --> <e p=0x70007/> <!-- 2 i=196 --> <e p=0x90009/> <!-- 3 i=602 --> <e p=0x50005/> <!-- 4 i=689 --> </map> </pm> <ppm 2epp=2802000 2n=5 b2n=0 Pn=0 bPn=2 vn=4036 bVn=0 2free=0> <face><yPq me=bffffa38 H=bffff8fc len=0/></face> <base><yPq me=bffffa48 H=bffff8fc len=2> <yh0 a=1 k=S n=28013a8 g=6f><yPn me=28013a8 ru=1:1 g=6f¬ rwi=w+ tyl=Pu id=5032 P=2802000 crc=efb99cb3/></yh0> <yh0 a=2 k=S n=28013d0 g=db><yPn me=28013d0 ru=1:1 g=db¬ rwi=w+ tyl=Pu id=7032 P=2803000 crc=f54820cb/></yh0> </yPq></base> <2map 2epp=2802000 b2epp=0 slots=1021 2n=5> <e key=10001 val=1/> <!-- 1 i=149 --> <e key=70007 val=7/> <!-- 2 i=196 --> <e key=30003 val=3/> <!-- 3 i=592 --> <e key=90009 val=9/> <!-- 4 i=602 --> <e key=50005 val=5/> <!-- 5 i=689 --> </2map> </ppm> </ypp2dm>
A submenu for demos appears below, letting you go to the page on a topic written as a demo (as the demos page defines it).

menu

     thorn: todo, names, fd, iovec, assert, log, run, hex, crc, buf, in, out, quote, escape, compare, file, deck, cow, arc, blob, tree, slice, rand, time, stat, hash, heap, node, primes, page, book, pile, stack, atomic, lock, mutex, thread, map « Þ, meter, list, iter, ctype

     (mu: toy, peg, imm, tag, box, symbol, token, number, bigint, class, method, reader, writer, eval, env, vm, gc, world, pcode, compiler, asm, lathe, lisp, smalltalk, design, weight, jar, card, harp, debug, profile)

     Some demos are stubs: todo is a demo guide. See toy for mu updates on language pages; names introduces naming schemes.

copying «

     "I've been thinking, Wil," Stu started politely. "One of your reasons to use a hashmap instead of a tree was to reduce memory touched, right? To minimize cache line pressure?"

     "Yes," Wil replied with a worried expression, like he guessed Stu's next question. "That's the general hashmap plan."

     "Then why do you copy the entire bucket table page for each new layer nonzero created?" Stu asked. "Isn't that expensive? I mean, doesn't a printer do that at least once per printed line?"

     "You're asking for my seat-of-the-pants reaction?" Wil asked. "Without any measurement or profiling?"

     "Yes, yes," Stu granted. "Add all the provisos you want that you're guessing without any evidence."

     "My gut sense is: yes it's slightly expensive," Wil said. "You wouldn't want to do a large number of pretty prints at the same time in a server under load. Lately I see memory to memory copies run about one microsecond per kilobyte."

     "But you're going to say this is for a human's consumption, aren't you?" Stu guessed. "And latency for human UI can be much more than for high request frequencies needed in servers."

     "You're on the right track," Wil nodded. "The hashmap layering for precise pretty printing with cycle detection and display seems more expensive than you'd want to do in a bottleneck. But if it's just one user waiting for a REPL to print, it will be plenty fast."

     "What if you did want to do a lot of pretty print layout requests at the same time?" Stu hypothesized. "What then?"

     "Well that's an odd assumption," Wil quibbled. "But I'd measure latencies and try alternatives until I had what was needed in some context. I see you hate this answer, Stu."

     "It sounds evasive," Stu objected. "But I suppose that's what avoiding big up-front design is all about: don't do it."

     "The api for ypp2dm doesn't specify how it gets anything done," Wil observed. "These docs go into private detail. I can change internals without affecting any client of the api."

     "Okay, so you could avoid a 4K copy per layer, per line printed," Stu nodded. "I guess it would get more complex, though?"

     "I'm sure it would," Wil agreed. "Current code is simple."

     "Code is easy," Stu granted. "But explanation is complex."

source «

     "If you don't mind," Wil pleaded. "I'd like to get through the source code without a lot of discussion."

     "Hey, I'm easy," Stu shrugged. "Blitz on through."

ypp2dm::ypp2dm(yPBH* pile, void* nilval) « : m_H(pile), m_nilval(nilval), m_d(0), m_tag(e_pp2dm) , m_1Pq(pile), m_1Pq_base(pile), m_1vH(0,0), m_1vH_base(0,0) , m_1n(0), m_1n_base(0), m_1epp(0), m_1epp_base(0), m_1efree(0) , m_2Pq(pile), m_2Pq_base(pile), m_2vH(0,0), m_2vH_base(0,0) , m_2n(0), m_2n_base(0), m_2epp(0), m_2epp_base(0), m_2efree(0) { // wait til 1st _m1new()/_m2new() before _m1init()/_m2init() }

     The constructor does almost nothing besides zero most fields, construct sub-objects, and put the right magic signature in m_tag. The pile heap is passed into each yPq list of pages member.

ypp2dm::~ypp2dm() { « this->m1clear(); this->m2clear(); m_H = 0; m_tag = e_dead_pp2dm; }

     The destructor frees all pages allocated by clearing both map 1 and map 2. Then the magic signature is whacked.

void ypp2dm::m1clear() { « if (yunlikely(!m_H)) { yH::hnil("ypp2dm::m_H"); return; } yassert("_m1init" && e_pp2dm == m_tag); m_1n = m_1n_base = 0; // no members m_1efree = 0; // ... or free elems m_1vH.vclear(); m_1vH_base.vclear(); m_1epp = m_1epp_base = 0; // re-allocated later: _m1init() m_1Pq.qclear(); // free pages AFTER any space in those pages m_1Pq_base.qclear(); // free pages AFTER any space inside if (yunlikely(0 != m_d)) ylog(0, "m1clear() m_d=%d nonzero?", (int) m_d); }

     Clearing map 1 zeroes every member referring to space in map 1, then frees all pages used in both the zero and nonzero layers, returning map 1 to a state just like post construction.

void ypp2dm::m2clear() { « if (yunlikely(!m_H)) { yH::hnil("ypp2dm::m_H"); return; } yassert("m2clear" && e_pp2dm == m_tag); m_2n = m_2n_base = 0; // no members m_2efree = 0; // ... or free elems m_2vH.vclear(); m_2vH_base.vclear(); m_2epp = m_2epp_base = 0; // allocated later by _m2init() m_2Pq.qclear(); // free pages AFTER any space inside them m_2Pq_base.qclear(); // free pages AFTER any space inside if (yunlikely(0 != m_d)) ylog(0, "m2clear() m_d=%d nonzero?", (int) m_d); }

     Clearing map 2 zeroes every member referring to space in map 2, then frees all pages used in both the zero and nonzero layers, returning map 2 to a state just like post construction.

void ypp2dm::_layer() { // add new temp layer on BOTH old maps « yassert("_layer" && e_pp2dm == m_tag); if (++m_d == 1) { // was zero and now one? 1st layer > zero? m_1vH_base = m_1vH; // save old 1vH m_1vH.vclear(); // ... and clear to empty m_1epp_base = m_1epp; // save old 1epp m_1epp = 0; // ... and clear to empty m_1n_base = m_1n; // save old 1n, but make no change m_1Pq.qclear(); // start with no pages in layers > zero m_2vH_base = m_2vH; // save old 2vH m_2vH.vclear(); // ... and clear to empty m_2epp_base = m_2epp; // save old 2epp m_2epp = 0; // ... and clear to empty m_2n_base = m_2n; // save old 1n, but make no change m_2Pq.qclear(); // start with no pages in layers > zero } }

     Adding a layer just increments depth; if this makes depth one, then it was previously zero and this is a transition from the zero layer to the nonzero layer. All members ending in _base capture copies of layer zero values from map 1 and map 2, both. Then "current" bucket tables — m_1epp and m_2epp — in both maps are zeroed so they will be lazily allocated on first use in layer nonzero.

void ypp2dm::_unlayer() { // zap temp layer, show BOTH old maps « u32 d = m_d; yassert("_unlayer" && e_pp2dm == m_tag); if (d) { // was more than zero so unlayer makes sense? m_d = --d; // one layer less now if (0 == d) { // was 1 and now 0? back to zero layer? m_1vH = m_1vH_base; // restore original 1vH m_1vH_base.vclear(); m_1epp = m_1epp_base; // restore other slots too... m_1epp_base = 0; m_1n = m_1n_base; m_1n_base = 0; m_1Pq.qclear(); // discard pages above zero layer m_2vH = m_2vH_base; // restore original 2vH m_2vH_base.vclear(); m_2epp = m_2epp_base; // restore other slots too... m_2epp_base = 0; m_2n = m_2n_base; m_2n_base = 0; m_2Pq.qclear(); // discard pages above zero layer } } else { yellf(__LINE__,__FILE__,"_unlayer() 0==m_d: underflow"); } }

     Removing a layer just decrements depth; if this makes depth zero, then it was previously one and this is a transition from the nonzero layer to the zero layer: every change done in _layer() is reversed, restoring _base values captured earlier from layer zero, so layer zero resumes where it was before.

map one «

     Methods to initialize, allocate elems, and add members in map 1 look very similar to similar methods for map 2 further below.

bool ypp2dm::_m1init() { « // true if succeeds 1st init when m_1epp starts at nil yassert("_m1init" && e_pp2dm == m_tag); u32 d = m_d; // depth zero initially, positive > layer zero if (!m_1epp) { // need to alloc buckets? yP* P = (d)? m_1Pq.qnewP(/*1P*/0x3150) : m_1Pq_base.qnewP(/*P1*/0x5031); if (yunlikely(!P)) { yH::hnil("m_1epp"); return false; } if (d && m_1epp_base) // must copy original base page? P->Pcopy((yP*)(void*)m_1epp_base); // identical to base else P->Pzero(); // zero page for nil bucket slots m_1epp = (Mpe**)(void*) P; // page as hashmap bucket array } return true; // buckets ready to use }

     Map 1 is lazily initialized in _m1init() the first time an element is added by a call to _m1add(). If depth is zero, a new layer zero page is allocated to hold all zero pointers in m_1epp. Otherwise if depth is nonzero and m_1epp_base holds the layer zero array saved in _layer(), the new page allocated copies m_1epp_base so layer nonzero starts as a copy of layer zero (a shallow copy such that every bucket still contains every list in layer zero.)

ypp2dm::Mpe* ypp2dm::_m1new(void* x) { « // alloc new elem from m_vH, or new page u32 d = m_d; // depth zero initially, positive > layer zero if (yunlikely(!m_H || (!m_1epp && !this->_m1init()))) return 0; // nothing can be done Mpe* e = (0==d)? this->_m1pop() : 0; // pop only layer zero if (e) { e->e_p = x; return e; } // re-use old popped elem void* mem = m_1vH.vtake(sizeof(Mpe)); // alloc from old page if (!mem) { // must alloc new page broken into elems? yP* P = (d)? m_1Pq.qnewP(/*1p*/0x3170) : m_1Pq_base.qnewP(/*p1*/0x7031); if (yunlikely(!P)) { yH::hnil("_m1new"); return 0; } m_1vH.vinit(P, yP::kPsz); mem = m_1vH.vtake(sizeof(Mpe)); } if (mem) { // raw memory allocated big enough to construct? ++m_1n; // we'll add one more member to hashmap return new(mem) Mpe(x); // use placement operator new } return 0; }

     Method _m1new() allocates a new Mpe elem for use in map 1 to hold a new key added to the map. A list of free elems is popped first; if that fails, sizeof(Mpe) bytes are allocated from m_1vH: a run heap containing whatever bytes remain in the last page allocated in the current layer specfically for use as map 1 elems. If m_1vH has too few bytes, a new page is allocated in the appropriate layer.

void ypp2dm::m1add(void* p) { // add p to map if not present « yassert("m1add" && e_pp2dm == m_tag); if (yunlikely(!m_H || (!m_1epp && !this->_m1init()))) return; // nothing can be done Mpe** pp = // bucket for p m_1epp + (yh32::hashp(p) % ypp2dm::kslots); for (Mpe* e = *pp; e; e = e->e_n) { // bucket not exhausted? if (p == e->e_p) return; // found p inside a list elem, return (no-op) } Mpe* e = this->_m1new(p); if (e) { e->e_n = *pp; *pp = e; } // push in bucket }

     Method m1add() hashes key p, then searches the list in the associated bucket for an existing elem with an identical key. If found, p is already a member; otherwise a new elem is allocated and constructed by _m1new(), then pushed on the top of the target bucket without altering any existing list elem.

map two «

     Methods to init, alloc elems, and add members in map 2 are described with nearly the same comments as map 1 above.

bool ypp2dm::_m2init() { « // true if succeeds 1st init when m_2epp starts at nil yassert("_m2init" && e_pp2dm == m_tag); u32 d = m_d; // depth zero initially, positive > layer zero if (!m_2epp) { // still need to alloc buckets? yP* P = (d)? m_2Pq.qnewP(/*2P*/0x3250) : m_2Pq_base.qnewP(/*P2*/0x5032); if (!P) { yH::hnil("m_2epp"); return false; } if (d && m_2epp_base) // must copy original base page? P->Pcopy((yP*)(void*)m_2epp_base); // identical to base else P->Pzero(); // zero page for nil bucket slots m_2epp = (Mkve**)(void*) P; // page as hashmap bucket array } return true; // buckets ready to use }

     Map 2 is lazily initialized in _m2init() the first time an element is added by a call to _m2add(). If depth is zero, a new layer zero page is allocated to hold all zero pointers in m_2epp. Otherwise if depth is nonzero and m_2epp_base holds the layer zero array saved in _layer(), the new page allocated copies m_2epp_base so layer nonzero starts as a copy of layer zero (a shallow copy such that every bucket still contains every list in layer zero.)

ypp2dm::Mkve* ypp2dm::_m2new(void* k, void* v) { « // new in m_2vH or new page u32 d = m_d; // depth zero initially, positive > layer zero if (yunlikely(!m_H || (!m_2epp && !this->_m2init()))) return 0; // nothing can be done Mkve* e = (0==d)? this->_m2pop() : 0; // pop only layer zero if (e) { // re-use existing elem? e->e_key = k, e->e_val = v; return e; } void* mem = m_2vH.vtake(sizeof(Mkve)); // alloc in old page if (!mem) { // must alloc new page broken into elems? yP* P = (d)? m_2Pq.qnewP(/*2p*/0x3270) : m_2Pq_base.qnewP(/*p2*/0x7032); if (yunlikely(!P)) { yH::hnil("_mnew"); return 0; } m_2vH.vinit(P, yP::kPsz); mem = m_2vH.vtake(sizeof(Mkve)); } if (mem) { // raw memory allocated big enough to construct? ++m_2n; // we'll add one more member to hashmap return new(mem) Mkve(k, v); // use placement operator new } return 0; }

     Method _m2new() allocates a new Mkve ley/val elem for use in map 2 to hold a new pair added to the map. A list of free elems is popped first; if that fails, sizeof(Mkve) bytes are allocated from m_2vH: a run heap containing whatever bytes remain in the last page allocated in the current layer specfically for use as map 2 elems. If m_2vH has too few bytes, a new page is allocated in the appropriate layer.

void* ypp2dm::m2add(void* k, void* v) { « // add (key,val) & (return old val) yassert("m2add" && e_pp2dm == m_tag); if (yunlikely(!m_H || (!m_2epp && !this->_m2init()))) return m_nilval; // nothing doable n32 i = (yh32::hashp(k) % ypp2dm::kslots); // table index Mkve** pp = m_2epp + i; // bucket for k for (Mkve* e = *pp; e; e = e->e_n) { // more bucket? if (k == e->e_key) { // found p in list elem, return old yassert(0 == m_d); // update only safe in layer zero void* oldval = e->e_val; e->e_val = v; return oldval; } } Mkve* e = _m2new(k, v); if (e) { e->e_n = *pp; *pp = e; } // push bucket return m_nilval; // no old value to return }

     Method m2add() hashes key, then searches the list in the associated bucket for an existing elem with an identical key. If found, key is already a member, and gets updated with the new value; otherwise a new elem is allocated and constructed by _m1new(), then pushed on the bucket top without altering any earlier list elems.

      Note updating an existing map 2 element is wrong if that elem was added in layer zero and the current layer is nonzero, violating immutability of layer zero in layer nonzero. But the existing pretty printer doesn't update map 2 in the nonzero layer, so the assert added catches any new usage. If Wil wanted to take time to bulletproof this code for other later uses, he would always push a new elem instead of updating an old one.

const void* ypp2dm::m1find(const void* p) const { « // return p if found else nil yassert("m1find" && e_pp2dm == m_tag); Mpe** epp = m_1epp; // maybe nil in layers > zero w/ no adds if (!epp) epp = m_1epp_base; // might be non-nil here if (yunlikely(!m_H || !epp)) return 0; // bad, destroyed, or empty: not in map Mpe** pp = epp + (yh32::hashp(p) % ypp2dm::kslots); // bucket for (Mpe* e = *pp; e; e = e->e_n) { // more bucket? if (p == e->e_p) return p; // found p inside a list elem: success } return m_nilval; // not found in map list where p must be }

     Method m1find() just searches the map 1 bucket used by m1add() when inserting key p. When m_1epp is zero, it's possbile _layer() was just called, in which case m_1epp_base must be searched.

void* ypp2dm::m2find(const void* p) const { « // return val if key is in map yassert("m2find" && e_pp2dm == m_tag); Mkve** epp = m_2epp; // maybe nil in layers > zero w/ no adds if (!epp) epp = m_2epp_base; // might be non-nil here if (yunlikely(!m_H || !epp)) return 0; // bad, destroyed, or empty: not in map n32 i = (yh32::hashp(p) % ypp2dm::kslots); // table index Mkve** pp = epp + i; // bucket for p for (Mkve* e = *pp; e; e = e->e_n) { // more bucket? if (p == e->e_key) return e->e_val; // found p in list elem, return val } return m_nilval; // not found in hashmap list where p must be }

     Method m2find() just searches the map 2 bucket used by m2add() when inserting key p. When m_2epp is zero, it's possbile _layer() was just called, in which case m_2epp_base must be searched.

void* ypp2dm::m1remove(void* p) { « // remove p if found (return p or nilval) yassert("m1remove" && e_pp2dm == m_tag); if (yunlikely(!m_H || !m_1epp)) return 0; // bad, destroyed, or empty: not in map Mpe** pp = m_1epp + (yh32::hashp(p) % ypp2dm::kslots); for (Mpe* e = *pp; e; pp = &e->e_n, e = *pp) { // more bucket? if (p == e->e_p) { // found p in list elem? unlink elem? *pp = e->e_n; // unlink elem; try to decr member count: if (m_1n) --m_1n; else yellf(__LINE__,__FILE__,"m1remove() underflow"); yassert("m1remove" && 0 == m_d); // cut only in layer zero this->_m1push(e); return p; // tell caller removal actually happened } } return m_nilval; // not found in map list where p would be }

     Method m1remove() is only called in layer zero. It searches the same map 1 bucket used by m1add() when inserting key p, but keeps track of the earlier reference to each elem examined, so it can be updated. If key p is found, that elem is unlinked from the list by having the predecessor slot (the bucket itself if at the top) point to the elem's successor. Any removed elem is pushed in a free list.

void* ypp2dm::m2remove(void* p) { « // remove p if found (return p or nilval) yassert("m2remove" && e_pp2dm == m_tag); if (yunlikely(!m_H || !m_2epp)) return 0; // bad, destroyed, or empty: not in map n32 i = (yh32::hashp(p) % ypp2dm::kslots); // index in table Mkve** pp = m_2epp + i; // bucket for p for (Mkve* e = *pp; e; pp = &e->e_n, e = *pp) { // more bucket? if (p == e->e_key) { // found p in list? unlink elem? *pp = e->e_n; // unlink elem; try to decr member count: if (m_2n) --m_2n; else yellf(__LINE__,__FILE__,"m2remove() underflow"); yassert("m2remove" && 0 == m_d); // cut only in layer zero this->_m2push(e); return e->e_val; // tell caller removal happened } } return m_nilval; // not found in map list where p would be }

     Method m2remove() is only called in layer zero. It searches the same map 2 bucket used by m2add() when inserting key p, but keeps track of the earlier reference to each elem examined, so it can be updated. If key p is found, that elem is unlinked from the list by having the predecessor slot (the bucket itself if at the top) point to the elem's successor. Any removed elem is pushed in a free list.

debug printing «

     This debug printing api describes ypp2dm on an out stream in pseudo XML format (as convenient notation only).

void ypp2dm::mprint() const { « yout << yendl; this->mdump(yout); yout << yendl << ynow; }

void ypp2dm::mdump(yo& o) const { « o.oft("<ypp2dm me=%lx H=%lx d=%d 1base=%lx 2base=%lx>", (long) this, (long) m_H, (long) m_d, (long) m_1epp_base, (long) m_2epp_base); o << yendl; this->m1dump(o); o << yendl; this->m2dump(o); o.ounend("ypp2dm"); }

     Dump format uses multiple lines to report a verbose summary of all internal state, as opposed to one line cite format below.

void ypp2dm::mcite(yo& o) const { « o.of("<ypp2dm me=%lx H=%lx depth=%d>", (long) this, (long) m_H, (long) m_d); this->m1cite(o); this->m2cite(o); o << "</ypp2dm>"; }

void ypp2dm::m1cite(yo& o) const { « o.of("<1epp=%lx 1n=%u Pn=%u vn=%u 1fr=%lx/>", (long) m_1epp, (int) m_1n, (int) m_1Pq.qsize(), (int) m_1vH.v_n, (long) m_1efree); }

void ypp2dm::m2cite(yo& o) const { « o.of("<2epp=%lx 2n=%u Pn=%u vn=%u 2fr=%lx/>", (long) m_2epp, (int) m_2n, (int) m_2Pq.qsize(), (int) m_2vH.v_n, (long) m_2efree); }

     Cite format generally aims for a single line of output.

void ypp2dm::m1dump(yo& o) const { « if (!m_1n) { this->m1cite(o); return; } // empty? o.oft("<pm 1epp=%lx 1n=%u b1n=%u Pn=%u " "bPn=%u vn=%u bVn=%u 1free=%lx>", (long) m_1epp, (int) m_1n, (int) m_1n_base, (int) m_1Pq.qsize(), (int) m_1Pq_base.qsize(), (int) m_1vH.v_n, (int) m_1vH_base.v_n, (long) m_1efree); o << yendl << "<face>" << m_1Pq.quote() << "</face>"; o << yendl << "<base>" << m_1Pq_base.quote() << "</base>"; o << yendl; this->_m1dump_epp(o); o.ounend("pm"); }

     Dumping map 1 reports all the map 1 members, and then iterates over all hashmap element members below.

yo& ypp2dm::_m1dump_epp(yo& o) const { int sz = 0; « o.oft("<1map 1epp=%lx b1epp=%lx slots=%d 1n=%d> " "<!-- sizeof(Mpe)=%d -->", (long) m_1epp, (long) m_1epp_base, (int) kslots, (int) m_1n, sizeof(Mpe)); for (unsigned i = 0; m_1epp && i < kslots; ++i) { Mpe* p = m_1epp[i]; Mpe* b = 0; int layer = 0; // default to base if (m_1epp_base && m_1epp_base != m_1epp) { // not base? layer = 1; b = m_1epp_base[i]; } if (p) { o.on(); // anything in this bucket? if (p->e_n) { // more than one in list? // print N nodes inside a list tag? o.oft("<slot i=%d>", (int) i); // slot index i in tag n32 len = 0; for ( ; p; p = p->e_n) { if (p == b) layer = 0; // base if (++len > 1) o << yendl; ++sz; const char* fmt = (layer)? "<e+ p=%#lx/>" : "<e p=%#lx/>"; o.of(fmt, (long) p->e_p); } o.ou(); // trailing count: o.of("</slot> <!-- %d -->", (int) sz); } else { // otherwise print node by itself if (!b || p == b) // base? o.of("<e p=%#lx/> <!-- %d i=%d -->", (long) p->e_p, ++sz, (int) i); else o.of("<e+ p=%#lx/> <!-- %d i=%d -->", (long) p->e_p, ++sz, (int) i); } } } o.ounend("map"); return o; }

     Dumping map 2 looks dumping map 1:

void ypp2dm::m2dump(yo& o) const { « yassert("m2dump" && e_pp2dm == m_tag); if (!m_2n) { this->m2cite(o); return; } // empty? o.oft("<ppm 2epp=%lx 2n=%u b2n=%u Pn=%u " "bPn=%u vn=%u bVn=%u 2free=%lx>", (long) m_2epp, (int) m_2n, (int) m_2n_base, (int) m_2Pq.qsize(), (int) m_2Pq_base.qsize(), (int) m_2vH.v_n, (int) m_2vH_base.v_n, (long) m_2efree); o << yendl << "<face>" << m_2Pq.quote() << "</face>"; o << yendl << "<base>" << m_2Pq_base.quote() << "</base>"; o << yendl; this->_m2dump_epp(o); o.ounend("ppm"); }

     Dumping map 2 reports all the map 2 members, and then iterates over all hashmap element members below.

yo& ypp2dm::_m2dump_epp(yo& o) const { int sz = 0; « o.oft("<2map 2epp=%lx b2epp=%lx slots=%d 2n=%d> " "<!-- sizeof(Mkve)=%d -->", (long) m_2epp, (long) m_2epp_base, (int) kslots, (int) m_2n, sizeof(Mkve)); for (unsigned i = 0; m_2epp && i < kslots; ++i) { Mkve* p = m_2epp[i]; Mkve* b = 0; int layer = 0; // default to base if (m_2epp_base && m_2epp_base != m_2epp) { // not base? layer = 1; b = m_2epp_base[i]; } if (p) { o.on(); // anything in this bucket? if (p->e_n) { // more than one in list? // print N nodes inside a list tag? o.oft("<slot i=%d>", (int) i); // slot index i in tag n32 len = 0; for ( ; p; p = p->e_n) { if (p == b) layer = 0; // base if (++len > 1) o << yendl; ++sz; const char* fmt = (layer)? "<e+ k=%lx v=%lx/>" : "<e k=%lx v=%lx/>"; o.of(fmt, (long) p->e_key, (long) p->e_val); } o.ou(); // trailing count: o.of("</slot> <!-- %d -->", (int) sz); } else { // otherwise print node by itself if (!b || p == b) // base? o.of("<e key=%lx val=%lx/> <!-- %d i=%d -->", (long) p->e_key, (long) p->e_val, ++sz, (int) i); else o.of("<e+ key=%lx val=%lx/> <!-- %d i=%d -->", (long) p->e_key, (long) p->e_val, ++sz, (int) i); } } } o.ounend("2map"); return o; }

coupling «

     "Hey Wil," Dex sauntered in to make a couple comments. "Why does ypp2dm have two hashmaps inside?"

     "Because my pretty printer needs both," Wil replied. "You want to know why they're coupled together like this?"

     "Yes, exactly," Dex laced his fingers, apparently pleased that stipulating this much was so easy. "Isn't it bad architecture? What if you need to use them again?"

     "So far I have not used them again," Wil countered. "And joining them was faster and more convenient the first and only time I used hashmaps with two layers like this."

     "But what about other folks?" Dex asked. "What about those poor folks who only need one, and not both?"

     "You mean lazy copy-and-paste coders who want to use code without understanding it?" Wil qualified. "What about them? Maybe I think the exercise is good for them."

     Dex chewed his lip. "That's not a code re-use spirit, Wil."

     "Yes it is, Dex," Wil patronized right back. "If you take re-use to mean rewrite, as I do. Go forth and multiply."

     "Two demerits for pompous phrasing," Dex warned.

     "Oh, get off my lawn," Wil said without heat.

license «

     All this code is available only under the BriarPig mu-babel license described fully on the rights page. You do not have permission to reprint this page in any way. Neither feeds nor repackaging is allowed. You can link this page if you want folks to read it.