Þ   briarpig  » thorn  » demos  » list


demos are explained here; a menu at top column right indexes actual topic demos. Here we demo list.

problem

     Although Wil usually avoids lists in contexts where scaling and speed matter, sometimes they can't be avoided. Also, sometimes a list is the simplist solution in a non-scaling context. Wil prefers a circular doubly-linked list with simple edge conditions: hard to get wrong and easy to review by eye.

     This column focuses on code for very basic doubly-linked lists. General discussion of context, design, and performance appear column right, following definitions of more list classes.

organization

     Code for all iterators appears in the iter demo. Iterators all have names ending with p for pointer because þ iterators act like pointers. But a class is not an iterator when p appears in the middle of a class name — this usually means just pointer in the sense of void*. (But a few exceptions append a further qualification at the end of iterator names.)

     Extremely verbose template classes appear separately at the bottom of the right column, and in the iter demo. You can skip them if templates look horrific to you (which seems reasonable). Wil provides them mainly as raw material to derive more specific replacements for template code needing a refactor. Templates are only marginally tested since Wil uses them rarely.

names

     Most list class names contain either q for queue or l for list. Class names ending in e for element are members that go inside a list. Class names ending in p for pointer are iterators, but p in the middle usually just means an address pointer is part of element content. A class name containing a literal lowercase t for template is usually a template type.

  • q - queue: primitive list
  • l - list: higher level list
  • e - element: list member
  • p - pointer: iterator or address pointer
  • r - reverse: a back-to-front iterator
  • k - const: immutable; a const iterator
  • 32 - 32-bit: has 32-bit length inside
  • t - template: template version

     So yqe is the base queue element: a node with next and previous pointers to successor and predecessor elements. Base queue yq contains one yqe element as the head sentinel element, which is not content. In general, any class name of the form yqTe is a subclass of yqe containing T as content of the list element. For example, yqpe contains a void* pointer. You might be tempted to confuse this with yqp which is yq's forward iterator. The reverse iterator is named yrqp.

link element

     In day job rewrites of yqe using friendly names, Wil uses link or ln most often as part of the class name. State for next and prev pointers to successor and predecessor links is held in members e_n and e_p. You should memorize the names of those members.

struct yqe { // element in yq « yqe* e_n; // next « yqe* e_p; // prev «

yqe() : e_n(this), e_p(this) { } // self reference cycling « yqe(yqe* n, yqe* p) : e_n(n), e_p(p) { }

// "placement" operator new void* operator new(size_t sz, void* p) { return p; } « void* operator new(size_t sz, yH& h) { return h.hmalloc(sz); }

void ecycle() { e_n = e_p = this; } // self reference «

void ezero() { e_n = e_p = 0; } // nil slots « void epush_before(yqe* old) { // place before old in queue « (old->e_p->e_n = this)->e_p = old->e_p; (this->e_n = old)->e_p = this; } void epush_after(yqe* old) { // place after old in queue « (old->e_n->e_p = this)->e_n = old->e_n; (this->e_p = old)->e_n = this; }

void epop() { (e_p->e_n = e_n)->e_p = e_p; } // remove « void eerase() { (e_p->e_n = e_n)->e_p = e_p; } // remove « }; // struct yqe

     Wil most often uses epop() and ezero() to remove a link element from its containing yq list and set the members to zero. In general, Wil doesn't zero elem links on queue removal, so you need to do it explicitly when you want to an invariant like "non-member means zero pointers, and vice versa." It's safer to zero when no longer used, but not required.

     When an operation resembles one in STL interfaces, Wil tries to use the same name, at least in a synonym method. But the idea is to borrow familiarity with STL names to reduce documentation; it's not to borrow STL definitions. Instead, you are expected to memorize the code in methods above, because they are defined to do exactly what appears in the inlines, which is one of the reasons all methods are inlines here.

     In queue elem class yqe the code is the documentation.

     The self referential empty constructor must do the same thing as ecycle() because yq's empty constructor requires it (cf »).

     The following yqe subclasses are typical uses appearing in other demos. For example, yqve is used by some heap subclasses to manage lists of memory blocks in run format.

struct yqve : public yqe { // yq elem is memory fragment in yv « yv e_v; // for classes needing help tracking blocks alloc'd yqve() : yqe(), e_v(0, 0) { } yqve(yv const& v) : yqe(), e_v(v) { } « yqve(yqe* n, yqe* p, yv const& v) : yqe(n, p), e_v(v) { } ~yqve() { } }; // yqve: generic list element for memory fragments

struct yqpe : public yqe { // elem in ptr list « void* e_p; // ptr key stored in map elem yqpe() : yqe(), e_p(0) { } « yqpe(void* p) : yqe(), e_p(p) { } ~yqpe() { } // nothing }; // yqpe: pointer list elem; O(1) removal anywhere in list

struct yqppe : public yqe { // yq is list of void* pairs « void* e_key; // ptr key stored in this hashmap elem void* e_val; // ptr (or int) val stored in map elem yqppe() : yqe(), e_key(0), e_val(0) { } « yqppe(void* k, void* v) : yqe(), e_key(k), e_val(v) { } ~yqppe() { } // nothing }; // yqppe: ptr pair list elem; O(1) removal anywhere in list

     A basic idea behind yqe subclasses is that declaring new ones is trivial and sufficiently clear that little explanation is needed. Template subclass yqte is quite similar (used mainly by template lists).

template <typename T> struct yqte : public yqe { // yqt elem « T e_t; yqte() : yqe(), e_t() { } // inherit yqe's operator new(heap) yqte(const T& t) : yqe(), e_t(t) { } ~yqte() { } // defines place where ~T() occurs when it does };

     Template queue yqt using yqte elements is shown bottom right (cf »); associated iterators appear in the iter demo.

primitive queue

     Wil usually puts name deque for double ended queue in class names for yq when using longer code names. Letter q for queue implies a more primitve api, while l for list sometimes distinguishes a higher level api. (List yl32 automatically counts length.)

class yq { // primitive CIRCULAR double ended linked list « private: yqe q_e; // the "head" sentinel element yq(const yq&); yq& operator=(const yq&); // no copies!

     The only yq state is sentinel element q_e whose members point at the first and last list members. When q_e points at itself, the queue is empty. Construction self-cycles q_e become empty. List linkage is always terminated by the address of q_e.

public: // make static void yq::qnil(const char* where) { // log on nil « yellf(__LINE__,__FILE__,"yq::qnil(where=%s)", where); } const yqe* qhead() const { return &q_e; } // for yqt<T> iters «

     The empty constructor makes the sentinel q_e point at itself. Zero pointers in q_e is never valid.

yq(): q_e() { /*q_e.ecycle();*/ } // self-cycle done by yqe « void qclear() { q_e.ecycle(); } // to empty (NO ELEMS FREED) «

     Method qclear() makes the list empty again. But note this frees no space used by former members of the list. You need to reclaim space for elements however your usage requires.

yqp qbegin() const { return yqp(q_e.e_n); } // 1st: > q_e « yrqp qrbegin() const { return yrqp(q_e.e_p); } // last: < q_e « const yqe* qend() const { return &q_e; } // sentinel elem « const yqe* qrend() const { return &q_e; } // sentinel elem «

     The STL style interation api shown above appears in the iter demo. (It's very straight forward.) You aren't required to use this iteration api; you can simply call methods below directly. An STL style api is only useful when updating old code using STL, when you want to avoid significant logic change.

     The following methods have no direct STL analog, except for qsize() returning list length. The implementation of each (further below) gives more explanation. The qpop_front() and qpop_back() are similar to STL methods; but they combine getting with popping in one method, when STL doesn't.

     Note how the return value of type yqe* implies a downcast will be required in normal use with these methods, to convert from base class yqe to the actual subclass you used.

yqe* qpop_front(); yqe* qpop() { return this->qpop_front(); } « yqe* qpop_back(); yqe* qat(size_t index) const; // zero-based, linear time yqe* operator[](size_t i) const { return this->qat(i); } // \return zero-based index of memb in list, or -1 if missing int qfind(const yqe* memb) const; // negative means miss size_t qlength() const; // linear time length calculation size_t qsize() const { return this->qlength(); } « // the following method is more efficient for long lists: int qlencmp(size_t minLen) const; // -1: len < minLen, 0: len == minLen, 1: len > minLen

     Method qlencmp() is more efficient than qsize() when you want to compare actual size to a specific value: qlencmp() only iterates over as many members as it takes to answer the question. For example, if a list contains ten thousand members, and you want to know if length is over two, qlencmp() need only look at three members to answer definitively. So when minLen is a small constant, qlencmp() is always constant time.

     When the queue is empty, the first member is the sentinel element, and qempty() returns true:

bool qempty() const { return (q_e.e_n == (yqe*) &q_e); } « void qswap(yq& x); // exchange with x by swapping elems yqe* qafter(const yqe* o) const // successor of old o « { return ((o->e_n != &q_e)? o->e_n : (yqe*) 0); } yqe* qbefore(const yqe* o) const // predecessor of old o « { return ((o->e_p != &q_e)? o->e_p : (yqe*) 0); }

     To iterate forwards over yq, just use qfront() and qafter() like this bit of sample code:

for (yqe* e = queue.qfront(); e ; e = queue.qafter(e)) { /*e is next element in list*/; }

     To iterate backwards over yq, just do the same thing using qback() and qbefore(). Nil is returned when the queue is empty, or when no more elements follow the current one.

yqe* qfirst() const { return ((q_e.e_n!=&q_e)? q_e.e_n :0); } « yqe* qfront() const { return ((q_e.e_n!=&q_e)? q_e.e_n :0); } « yqe* qlast() const { return ((q_e.e_p!=&q_e)? q_e.e_p :0); } « yqe* qback() const { return ((q_e.e_p!=&q_e)? q_e.e_p :0); } «

     Pushing a new element on the queue front is shown next. The same code works whether or not the queue is empty. The ascii diagram aims to help you review the code for correctness. (Letter h for head in the figure refers to the q_e sentinel.)

/*From IronDoc documentation for AddFirst: +--------+ +--------+ +--------+ +--------+ +--------+ | h.next |-->| b.next | | h.next |-->| a.next |-->| b.next | +--------+ +--------+ ==> +--------+ +--------+ +--------+ | h.prev |<--| b.prev | | h.prev |<--| a.prev |<--| b.prev | +--------+ +--------+ +--------+ +--------+ +--------+ */ void qprepend(yqe* e) { // note qpush_front() is identical « (q_e.e_n->e_p = e)->e_n = q_e.e_n; (e->e_p = &q_e)->e_n = e; } void qpush_front(yqe* e) { // synonym for qprepend() « (q_e.e_n->e_p = e)->e_n = q_e.e_n; (e->e_p = &q_e)->e_n = e; } void qmove_to_front(yqe* old) { // old must already be in q « old->epop(); this->qpush_front(old); }

     Method qmove_to_front() assumes argument old is in the queue, removes it, then pushes it back on the front.

     The following methods are just like those above, but targeting the back of the queue instead of the front.

/* From IronDoc documentation for AddLast: +--------+ +--------+ +--------+ +--------+ +--------+ | y.next |-->| h.next | | y.next |-->| z.next |-->| h.next | +--------+ +--------+ ==> +--------+ +--------+ +--------+ | y.prev |<--| h.prev | | y.prev |<--| z.prev |<--| h.prev | +--------+ +--------+ +--------+ +--------+ +--------+ */ void qappend(yqe* e) { // note qpush_back() is identical « (q_e.e_p->e_n = e)->e_p = q_e.e_p; (e->e_n = &q_e)->e_p = e; } void qpush_back(yqe* e) { // synonym for qappend() « (q_e.e_p->e_n = e)->e_p = q_e.e_p; (e->e_n = &q_e)->e_p = e; } void qmove_to_back(yqe* old) { // old must already be in q « old->epop(); this->qpush_back(old); } }; // class yq

     Now let's look at non-inline methods defined in yq.cpp:

yqe* yq::qpop_front() { // remove first elem « yqe* e = q_e.e_n; if (e != &q_e) { (q_e.e_n = e->e_n)->e_p = &q_e; return e; } return (yqe*) 0; }

     Note neither qpop_front() nor qpop_back() zeroes pointers in removed elems. Use yqe::ezero() if you need them zero.

yqe* yq::qpop_back() { // remove last elem « yqe* e = q_e.e_p; if (e != &q_e) { (q_e.e_p = e->e_p)->e_n = &q_e; return e; } return (yqe*) 0; }

     Method qat(i) returns the member at index i, else nil:

yqe* yq::qat(size_t index) const { // zero-based « register size_t count = 0; register yqe* e = this->qfront(); for ( ; e; e = this->qafter(e)) { if ( ++count == index ) break; } return e; }

     Method qfind(e) returns the zero-based index of element e, else negative when not found:

int yq::qfind(const yqe* memb) const { « // zero-based index; negative means miss register int count = 0; register const yqe* e = this->qfront(); for ( ; e; e = this->qafter(e)) { if ( memb == e ) return (int) count; ++count; } return -1; }

     Queue size is calculated by walking all list elements:

size_t yq::qlength() const { // count of list items « register size_t count = 0; register yqe* e = this->qfront(); for ( ; e; e = this->qafter(e)) ++count; return count; }

     Comparing size to input length n requires seeing no more than n+1 elements:

int yq::qlencmp(size_t minLen) const { « // -1: len < minLen, 0: len == minLen, 1: len > minLen // the following method is more efficient for long lists: register size_t n = 0; register const yqe* e = this->qfront(); for ( ; e; e = this->qafter(e)) { if ( ++n > minLen ) return 1; // greater than specified minLen } return (n == minLen)? 0 : -1; // equal or less than minLen }

     Method qswap() exchanges all content in two queues. This sort of action is sometimes quite helpful when element ownership in two queues can be exchanged without any net change in, for example, refcounting of elements. (This code is untested.)

/* yq::qswap (in imitation of IronDoc queue illustrations) +------+ +-------+ +------+ +-------+ +---------+ +-------+ |p->e_n|->|q_e.e_n|->|n->e_n| |xp->e_n|->|x.q_e.e_n|->|xn->e_n| +------+ +-------+ +------+ +-------+ +---------+ +-------+ |p->e_p|<-|q_e.e_p|<-|n->e_p| |xp->e_p|<-|x.q_e.e_p|<-|xn->e_p| +------+ +-------+ +------+ +-------+ +---------+ +-------+ */ void yq::qswap(yq& x) { // exchange contents by swapping elems « bool none = this->qempty(); bool xnone = x.qempty(); // no elems in either? bool bothEmpty = none && xnone; // neither contain any elems if (this == &x || bothEmpty) return; // noop: nothing to do yqe& e = q_e; yqe& xe = x.q_e; yqe* n = e.e_n; // next: 1st in this (used only if not empty) yqe* p = e.e_p; // prev: last in this yqe* xn = xe.e_n; // next: 1st in x (used only if x not empty) yqe* xp = xe.e_p; // prev: last in x bool neitherEmpty = !none && !xnone; // both contain elems if (neitherEmpty) { // both lists must update next and prev? n->e_p = p->e_n = &x.q_e; // use x's sentinel xe.e_n = n; xe.e_p = p; // x has my old list xn->e_p = xp->e_n = &q_e; // use my sentinel e.e_n = xn; e.e_p = xp; // I have x's old list } else if (none) { // this is empty, but x is not? xn->e_p = xp->e_n = &q_e; // use my sentinel e.e_n = xn; e.e_p = xp; // I have x's old list x.qclear(); // make x empty } else if (xnone) { // x is empty, but this is not? n->e_p = p->e_n = &x.q_e; // use x's sentinel xe.e_n = n; xe.e_p = p; // x has my old list this->qclear(); // make this empty } }

lisp lists

     Wil's concern about list speed is mostly constrained to C++ contexts. Wil doesn't worry about the same things when coding in Lisp. But this is mainly because Wil plans to do performance intensive code in C++; a little extra cost in Lisp or Smalltalk matters not if gigantic data structures in high frequency use are in C and C++.

license

     All this code is available only under the BriarPig mu-babel license described fully on the rights page. You do not have permission to reprint this page in any way. No feeds or repackaging is allowed. You can link this page if you want folks to read it.

A submenu for demos appears below, letting you go to the page on a topic written as a demo (as the demos page defines it).

menu

     thorn: todo, names, fd, iovec, assert, log, run, hex, crc, buf, in, out, quote, escape, compare, file, deck, cow, arc, blob, tree, slice, rand, time, stat, hash, heap, node, primes, page, book, pile, stack, atomic, lock, mutex, thread, map, meter, list « Þ, iter, ctype

     (mu: toy, peg, imm, tag, box, symbol, token, number, bigint, class, method, reader, writer, eval, env, vm, gc, world, pcode, compiler, asm, lathe, lisp, smalltalk, design, weight, jar, card, harp, debug, profile)

     Some demos are stubs: todo is a demo guide. See toy for mu updates on language pages; names introduces naming schemes.

extensions

     Lists below extend yq by adding a list length, but yq32 does so only by adding a 32-bit q_n length member, without giving you any help maintaining the length beyond providing increment and decrement operators. Sometimes this might be just what you need.

/// yq32 exists only to support common association of yq and n32 class yq32 : public yq { // simple yq plus n32 length « public: n32 q_n; // q size MANUALLY maintained by user of yq inside yq32 yq32() : yq(), q_n(0) { } // empty (circular) list w/ zero elems n32 operator++() { return ++q_n; } // prefix count elems added n32 operator--() { // centralized checking for length underflow if (q_n) --q_n; // only subtract one if positive else { yellf(__LINE__,__FILE__,"yq32::operator--() h_n=0 underflow"); q_n = this->qlength(); // recompute } return q_n; } }; // yq32

     In contrast, list yl32 maintains the new q_n length member for you, as a side effect of using other methods. To ensure you cannot manipulate the raw yq directly, yl32 has-a private yq as a member variable, instead of inheriting publically from yq the way yq32 does.

     In more complex þ classes using lists, Wil uses a plain yq when length is seldom useful, but prefers yl32 instead when he either needs a length or wants one for stats or printed representations.

     All yl32 methods are inlines calling yq's api to change list members, adding only the maintenance of length q_n. Any yq method re-exposed by yl32 has a name replacing the leading q with l.

/// \class yl32 is a "list" version of yq32 automating elem count class yl32 { // simple yq + n32 length (+ replacement q methods) « private: yq l_q; // "has-a" yq instead of "is-a" yq as yq32 does n32 l_n; // length of queue AUTOMATICALLY maintained by yl32

     Length q_n is only lowered by inline _ldown() below. Its main purpose is complaining on underflow. Since _ldown() is only called when an element is removed, the length must be nonzero. But if zero, the length must be garbled, so it's recomputed by walking the list.

n32 _ldown() { // centralized checking for length underflow « if (l_n) --l_n; // only subtract one if positive else { yellf(__LINE__,__FILE__,"yl32::_ldown() l_n=0 (underflow)"); l_n = l_q.qlength(); // recompute } return l_n; }

     When the list hits empty — as evidenced by popping nil instead of an element — _lnone() checks for a nonzero length and then forces length to zero for good measure.

void _lnone() { // check for nonzero len when nil was popped « if (l_n) yellf(__LINE__,__FILE__,"_lnone() l_n=%d to zero", (int) l_n); l_n = 0; }

     All methods popping a member element call _less() which either calls _ldown() or _lnone() since one must be suitable. (Wil assumes lightly nested inline methods are just as efficient as no use of inlines at all, so the purpose of _less() is maximum clarity in pop methods.)

yqe* _less(yqe* e) { // err check on elem removal « if (e) _ldown(); else this->_lnone(); return e; }

     The public api closely follows that of yq while maintaining the new q_n length field. Note clearing a list just makes the list empty; it doesn't free any of the elements, or even unlink them from one another. So presumably you have those bases covered another way. Just remember yl32 doesn't handle space ownership — you do.

public: void lclear() { l_q.qclear(); l_n = 0; } // empty (NONE FREED) « void lerase(yqe* e) { e->eerase(); this->_ldown(); } «

     When you remove an element from the list using erase, you must call lerase() rather than calling yqe::eerase() on the element directly — as you would with plain yq lists — or otherwise length q_n won't be maintained. (This is a hole in the yl32 api created by yqe.)

yqe* lpop() { return this->_less(l_q.qpop_front()); } « yqe* lpop_front() { return this->_less(l_q.qpop_front()); } « yqe* lpop_back() { return this->_less(l_q.qpop_back()); } «

     All pop methods above obviously have identical effect to yq methods, but _less() peeks at the nil or non-nil value in passing to either decrement length or force length to zero.

void lpush_back(yqe* e) { // append « if (e) { ++l_n; l_q.qpush_back(e); } else yq::qnil("lpush_back"); } void lpush(yqe* e) { // prepend « if (e) { ++l_n; l_q.qpush_front(e); } else yq::qnil("lpush"); } void lpush_front(yqe* e) { // prepend « if (e) { ++l_n; l_q.qpush_front(e); } else yq::qnil("lpush_front"); }

     The push methods are too trivial for comment as you can see. However, note these are the last methods changing queue length — methods below have no net change on length.

     Construction makes an empty list; destruction does nothing — if you need element resource cleanup, do it yourself (presumably by emptying the list early in some larger object's destructor).

yl32() : l_q(), l_n(0) { } // empty (circular) list: zero elems « ~yl32() { }

     The begin and end methods below provide an STL style interation api that's described in the iter demo instead of here.

yqp lbegin() const { return l_q.qbegin(); } « yrqp lrbegin() const { return l_q.qrbegin(); } « const yqe* lend() const { return l_q.qend(); } « const yqe* lrend() const { return l_q.qrend(); } «

     All other methods below are just thin inline shells atop yq methods because length doesn't change.

yqe* lat(size_t index) const { return l_q.qat(index); } // linear « yqe* operator[](size_t i) const { return l_q.qat(i); } // linear // \return zero-based index of memb in list, or -1 if missing int lfind(const yqe* memb) const { return l_q.qfind(memb); } «

     Since length is maintained in q_n, we can get length direct without walking the list. However, if you did want to call the yq methods, those are exposed too. (For example, maybe you want to assert lsize() is equal to lqsize() — who knows.)

size_t llength() const { return l_n; } « size_t lqlength() const { return l_q.qlength(); } // linear « size_t lsize() const { return l_n; } « size_t lqsize() const { return l_q.qsize(); } // linear « /// neg: len < minLen, 0: len == minLen, pos: len > minLen int llencmp(size_t minLen) const { « return ((int) l_n) - ((int) minLen); }

bool lempty() const { return 0 == l_n; } // no members « void lswap(yl32& x) { // exchange contents by swapping elems « n32 tn = l_n; l_n = x.l_n; x.l_n = tn; // exchange lengths l_q.qswap(x.l_q); // exchange yq elems }

     Though lswap() doesn't change net combined length of both lists, it does exchange the length of each with the other in addition to swapping queues. (This one's not quite as thin.)

yqe* lafter(const yqe* e) const { return l_q.qafter(e); } « yqe* lbefore(const yqe* e) const { return l_q.qbefore(e); } « yqe* lfirst() const { return l_q.qfirst(); } « yqe* lfront() const { return l_q.qfront(); } « yqe* llast() const { return l_q.qlast(); } « yqe* lback() const { return l_q.qback(); } «

void lmove_to_front(yqe* old) { l_q.qmove_to_front(old); } « void lmove_to_back(yqe* old) { l_q.qmove_to_back(old); } « }; // yl32

     "Do you use yl32 or it's equivalent in day jobs?" Stu asked.

     "No," Wil replied. "Usually just clones of yqe and yq. Less novelty is usually better when folks are already queasy over not using STL."

     "Why do they let you use your own code?" Stu wondered.

     "Because where I do, the fastest thing with least overhead is needed," Wil explained. "Usually lists with intrinsic links are called for, without any allocation from a heap to add elements."

     Wil wasn't finished. "I also use multiple inheritance of yqe in objects I need to add to multiple lists. Let me show you how that works."

multiple inheritance

     Occasionally objects need to be member elements in multiple queues, so Wil inherits yqe multiple times via intermediary classes inheriting yqe once apiece only. Before more explanation, an example might help. Let's look at inheriting twice only, though three times is often common in caches of some sorts, like file i/o page caches.

     Suppose you have a Node type that sometimes must be seen as a Foo in a list of Foos, and sometimes must be seen as Bar in a list of Bars. Wil handles that case like this:

class Foo: public yqe { // Foo related state « // Foo stuff }; class Bar: public yqe { // Bar related state « // Bar stuff }; class Node: public Foo, public Bar { // « // an object needing to be in both Foo lists and Bar lists };

     Pushing a Node into a yq list of Bar elements looks like this:

void pushBar(yq& q, Node* n) { // push as Bar « q.qpush_back((Bar*) n); // using yqe in Bar }

     Popping a Node from a yq list of Bar elements looks like this:

Node* popBar(yq& q) { // pop as Bar « return (Bar*) q.qpop_front(); // using yqe in Bar }

     This works because casting to and from Bar* each time says which path in the inheritance graph to use. And that's the reason to use intermediary subclasses when inheriting from yqe more than once: to name each path in the graph with a different type.

     Of course the pushFoo() and popFoo() methods look exactly the same but with Foo substituted for Bar everywhere.

incremental

     In Wil's suite of list classes, one ugly part is incremental addition of features: a little at a time in small feature upgrades. For example, the base list doesn't count members as superfluous cost to any user who doesn't need a list size. When it comes to primitive data structures, Wil prefers more classes with carefully separated feature accumulation, because this minimizes coupling of what you want and what's forced on you by packaging. The downside of this? Wil has too many class names.

cache lines

     Doubly-linked circular lists have become less useful in recent years as the penalty for cache line misses has gone up. A list operation touching both next and previous list members — and the target member — ends up touching at least three different cache lines in memory, on average, for a very large number of list members with no expected locality of adjacent list members. For example, this is what's expected with LRU (least recently used) lists of objects, which can approximate random member order in a list. In a server under memory pressure, touching many cache lines is expensive compared to any solution touching fewer.

     In some cases, such as LRU lists, this is unavoidable. If a list usage often pops a member at any position then re-pushes at one end or another, then using a doubly-linked circular list doesn't have a better alternative touching fewer cache lines. You'll just have to to it; so there's no use in worrying over how many cache lines you touch.

     But other uses of doubly-linked circular lists might not have this requirement. You might get by with a singly linked list and still have all constant time operations. In this case you pay a penalty in extra cache lines touched when using lists on this page. In a performance bottleneck you might not be able to afford this choice. However, lists on this page are rather simple, so it's quite easy to use them until you have a reason not to do so anymore. That's how Wil plays it.

     Wil used to employ doubly-linked circular lists in hashmap buckets to chain collisions in open hash style (meaning a bucket it open to further additions). But that costs too much in cache lines touched now. Closed hashing with probing to resolve collisions now rules the day, typically touching only one cache line per table probe. (See the tombstones page for a look at good closed hashing style.)

     However, in subclasses of heap apis managing pages and books, Wil still uses linked lists in hashmap buckets, primarily because this is just simpler than alternatives and because page and book allocation is not a main cause of memory pressure for cache lines. (At least not so far.) And until toy language implementations have nothing better to do than improve marginal performance issues, Wil pursues simplicity.

re-inventing wheels

     Lists are easy. They're so easy, many folks think the most important thing is to use a standard list: rolling your own and re-inventing the wheel is a huge hit on style points. Wil agrees using a standard tool is fine when results are not critical.

     But when you're on the hook to make a system scale without risks, Wil thinks you're crazy to use any code you haven't personally vetted and isolated so it's under your control. If you put lists in central components affecting success of your system, it's madness not to have them under your thumb: you could not rule out hidden factors causing odd system behavior. (Axiomatically, all complex systems have odd behavior which you must diagnose.)

     So in some cases you must write your own lists, or adopt someone else's which you review as carefully as writing them from scratch. This is how you create an option to ensure you know exactly what's happening, all the time. A common api allowing fast swapping between your lists and standard replacements is great; substitution is good experimental technique.

     A motto "trust no one" works when it's your skin at stake. Trust standard replacements when you can test whether they act identical to your own tools. Standard tools typically aim for universal use, but "one-size-fits-all" is rarely a perfect fit. Generic memory use can expose you to risks, or deny efficient options; critical infrastructure often needs static allocation.

     Wil's been using the variant of doubly-linked circular lists shown earlier for almost twenty years. It's been rewritten in C and C++ several different times; however, nothing besides the names have changed. The model for code roughly imitates how VAX queue instructions do circular lists: using a sentinel node as a list head, where the sentinel points at itself fore and aft when the list is empty. This manner of list avoids special cases in adding and removing list members, avoiding branches and taking fewer instructions — efficient but for touching many cache lines.

templates

     The majority of code associated with template yqt<T> is actually in the iterators, which appear in the future iter demo. The following code hasn't been tested. This is just a draft Wil wrote for thinking purposes.

     All template methods are inlines, as normal with templates.

     Unlike all other list classes on this page, yqt<T> does manage ownership of space for elements added as members, to more closely resemble the STL api of std::list<T>. The allocator is a heap using an api described in the (future) heap demo.

template <typename T> // queue of T elems (like std::list) class yqt { public: « typedef size_t sz_t; // size type « typedef sz_t size_type; typedef yqtp<T> iterator; typedef ykqtp<T> const_iterator; typedef yrqtp<T> reverse_iterator; typedef ykrqtp<T> const_reverse_iterator;

private: typedef yqte<T> e_t; // type of elems in this queue « yH* q_h; // the heap allocator for elements yq q_q; // the yq containing all the yqte members sz_t q_n; // number of yqte<T> member elements in list q_q mutable T q_0; // dummy used for out of bounds access

void _n() { // "down" decrements size with underflow check « if (q_n) --q_n; else yellf(__LINE__,__FILE__,"yqt::_n()"); }

void _qx(e_t* e) { if (e) { e->~yqte<T>(); q_h->hfree(e); } } «

e_t* _qe() { return new(*q_h) e_t; } // new « e_t* _qe(T const& t) { return new(*q_h) e_t(t); } // new

T& _qf() const { // front's T « e_t* e = (e_t*) q_q.qfront(); return (e)? e->e_t: q_0; } T& _qb() const { // back's T « e_t* e = (e_t*) q_q.qback(); return (e)? e->e_t: q_0; }

public: void qswap(yqt& x) { // exchange all fields with another yqt « yH* th = q_h; q_h = x.q_h; x.q_h = th; // exchange heaps sz_t tn = q_n; q_n = x.q_n; x.q_n = tn; // exchange sizes q_q.qswap(x.q_q); // exchange elements }

yqtp<T> qbegin() { return yqtp<T>(q_q.qhead()->e_n); } « ykqtp<T> qbegin() const { return ykqtp<T>(q_q.qhead()->e_n); } yrqtp<T> qrbegin() { return yrqtp<T>(q_q.qhead()->e_p); } « ykrqtp<T> qrbegin() const { return ykrqtp<T>(q_q.qhead()->e_p); } const yqe* qend() const { return q_q.qhead(); } « const yqe* qrend() const { return q_q.qhead(); } «

yqt() : q_h(&yH::g_1H), q_q(), q_n(0) { } // malloc() based yqt(yH* h) : q_h(h), q_q(), q_n(0) { } // explicit heap

void qclear() { // delete all elems; reduce to zero size « e_t* e = 0; while ((e = (e_t*) q_q.qpop_front()) != 0) // popped an elem? this->_qx(e); // destroy and free each member q_n = 0; // no members; now empty }; ~yqt() { this->qclear(); q_h = 0; }; // delete elems; lose heap sz_t qmax_size() { return (0x7FFFFFF/sizeof(e_t)); } // plausible? «

sz_t qsize() const { return q_n; } // elems in list «

T& qback() { return this->_qb(); } « const T& qback() const { return this->_qb(); } « void qpop_back() { e_t* e = (e_t*) q_q.qpop_back(); « if (e) { this->_n(); _qx(e); } // checked decr then free elem } void qpush_back(T const& t) { « e_t* e = this->_qe(t); // alloc new elem if (e) { q_q.qpush_back(e); ++q_n; } }

T& qfront() { return this->_qf(); } « const T& qfront() const { return this->_qf(); } « void qpop_front() { e_t* e = (e_t*) q_q.qpop_front(); « if (e) { this->_n(); _qx(e); } // checked decr then free elem } void qpush_front(T const& t) { « e_t* e = this->_qe(t); // alloc new elem if (e) { q_q.qpush_front(e); ++q_n; } } }; // yqt