|
problem
A typical Lisp environment, whether Scheme or some other Lisp variant, usually has an interactive session of some sort called a repl, standing for read/eval/print loop. The first input phase is done by a reader; the last output phase is done by a writer — often a pretty printer showing tree-shaped structures with nesting revealed by indentation. Lisp trees of lists can be reasonably formatted for human consumption (or display) using a pretty printer tuned for some maximum target line length. Any single list that's too long to fit on a single line can be split over multiple lines, with children suitably indented to show nesting. To print Lisp syntax code and data in a nice format, Wil currently uses code shown on this page, deriving from a Ygg pretty printer Wil wrote using C++ in the early 90's. The writer shown on this page is something of an incomplete prototype still under development. When it's complete, it will serve as a guide for writing new pretty printers in the toy language itself, with greater runtime flexibility. The role of this writer in C++ is more of a bootstrapping scaffold. The pretty printer shown here can handle cycle structures without looping infinitely. Cycles and shared structure appear using a Lisp syntax convention defined later.
see also
«
This writer uses thorn C++ classes of near generic nature — ypp2dm in the map demo (cf ») plus yno and yoo in the meter demo (cf » and ») — as part of internal state to manage maps and measure content length and line lengths. At the time of writing, the reader page is still pending, so code references to constants in class yi2br (in-stream to box reader) are not yet linked to definitions. The yM virtual machine api (and associated internal ykMgbv managing addresses of well known constant symbols and objects) won't appear until even later.
trees
«
Because Lisp syntax permits nesting of lists, it's more accurate to say tree processing instead of list processing when speaking about Lisp. And when you allow mutation or forward references to make cycles, you can actualy make arbitrary graphs, and not just trees which cannot have cycles. By convention, Lisp code and data is shown in tree format, with child nodes indented according to one scheme or another. For example, you can indent to align with another earlier child in a list, or you might instead indent a distance proportional to depth. Either works fine if you're consistent. Wil first learned Lisp in college when his boss Mark suggested Wil consider it, since Wil was interested in dataflow systems, and under Lisp, both code and data have the same form. Wil said, "But what does this syntax mean?" Mark said, "It's a tree. If you break a line inside a list, the next thing shown on a following line is indented. By convention you align siblings on successive lines to help show they are children of a less indented parent above." "That's it?" Wil asked, and Mark nodded. "Cool. The only syntax convention is indented children? Neat." So after two minutes, Wil learned all there was to know about basic structure in Lisp syntax. Everything else concerns format of individual tokens for atoms. (The meaning of any list depends on members and their locations.)
format conventions
«
Consider a desired format this writer targets — we need to pick one of several options. (You can also support several, then select one, but the current api aims for fewer options beyond target max line length.) But first let's establish terms used to describe a planned output format. The following code fragment is valid Scheme syntax, but it's written in odd style — almost like C in the way closing delimitors are stacked below opening parens. (Note: paren is short for parenthesis — get used to it.) (define (member thing lis)
(if (null? lis)
#f
(if (equal? (car lis) thing)
lis
(member thing (cdr lis))
)
)
)
Even though verbose, Wil often wrote Ygg code like above in the 90's because because it fit the same style he used when writing C++: it was easier to cut and paste with block shifts using a dumb text editor. Just like in C++, whitespace has no meaning in Lisp beyond breaking tokens. So code above is considered identical to the next expression below: (define (member thing lis)
(if (null? lis)
#f
(if (equal? (car lis) thing)
lis
(member thing (cdr lis)))))
The code fragment above is the original, using conventional Scheme style: closing parens appear on the same line as the last member of a preceding list. If you spread the last three closing parens over three lines, under opening parens, you get exactly what was shown earlier. Same thing. What else do you see in the code above? If a long expression breaks over multiple lines, by convention the first two children appear on one line if the second child will also fit. Subsequent children align under the second child. Compare this to a near alternative, where a third child simply indents: (define (member thing lis)
(if (null? lis)
#f
(if (equal? (car lis) thing)
lis
(member thing (cdr lis)))))
The variation above is only sightly different: it aligns to depth — two spaces per indent — instead of aligning to a previous sibling. Let's name these two different styles: The second depth-aligned example above was printd by the writer shown on this page. It would be quite easy to support both formats — sib-aligned and depth-aligned — but the code only shows depth-aligned formatting. Wil prefers depth-aligned because it uses less horizontal space. Let's look at one more example so you can compare the two styles again. Below is another sample of Scheme code Wil found; it shows the original (partly) sib-aligned style: (define (rep-loop evaluator)
(display "repl>")
(let ((expr (read)))
(cond ((eq? expr 'exit)
(display "exiting read-eval-print loop")
(newline))
(#t
(write (evaluator expr))
(rep-loop evaluator)))))
When the expression above is reformatted in depth-aligned format by the writer using a target max line length of 50, you get the output shown below: (define (rep-loop evaluator)
(display "repl>")
(let ((expr (read)))
(cond
((eq? expr 'exit)
(display "exiting read-eval-print loop")
(newline))
(#t (write (evaluator expr)) (rep-loop evaluator)))))
The pretty printer in writer class yb2ow shown on this page aims to write in depth-aligned format as shown above. It still has a couple quirks someplace because Wil wrote all the logic by trial and error, making small changes when an unexpected result appeared in output. Current code still has a few line length related odd effects.
enhancement
«
The max line length targeted by current writer code causes something ugly to occur when nested tree depth pushes minimum indent out close to max line length: line breaks occur constantly after every short expression prints. Output in this case looks like blades of grass mashed over at the tips by a heavy weight represented by the limit. What would look better? Well, if a line is deeply indented due to depth, maybe max line length should be locally increased by some amount. This would make max line length relative to depth, in part, with more consistent results: limits would shift rightward as depth increased. If Wil was reworking code now instead of writing this page, he might have a min line length as well as a max, so depth plus min would be the effect max line length everywhere when it exceeded global max length.
recursion
«
Current code counts levels of recursion, to limit maximum nesting handled by a writer, so there exists a maximum depth of structure that can be printed. Limiting stack use is the main purpose; the current limit was arbitrarily chosen and is subject to later reconsideration when the topic of bounding resource usage is addressed again. The stack depth limit used now is a stopgap measure picked in the absence of more specific requirements. However, it seems reasonable that a writer of limited purpose might handle no more than hundreds of nesting levels. Two mutually recursive internal methods named _wmap() and _wpeel() visit the graph to be printed prior to all writing to find shared structure and cycles. An arbitrary max nesting depth of 512 is imposed by _wmap(). Obviously any other constant can be used instead.
writer
«
The current writer class is named yb2ow, which stands for thorn box to out-stream writer. This name is harder to remember than ywriter, but does explain behavior slightly more. The name is actually a slight misnomer because it actually prints any peg and not just any box, so it also prints immediate values as well. (Perhaps it should be named yp2ow, but p for peg or pointer already occurs often as it is.) This writer must output a format understood by the futue reader (coming soon), but this leaves complete freedom in choice of whitespace used in layout because whitespace is not significant. So a main feature of yb2ow is support for pretty printing output to match Lisp convention — indenting nested expessions after line breaks — while coping with cycles in the graph, so cyclic graphs can be printed and read correctly. The reader understands the same cycle notation used by this writer. Some Lisp implementations support notation for cyclic structures; this writer uses this notation too. (By coincidence, features in the current writer were finished almost exactly one year ago today. So writing some of this requires figuring out intentions from a year ago.) If you use method wshow() to write a graph, it first looks for cycles in the graph and records box addresses in a map of cycles when the same box address is seen more than one time. This actually detects shared structure generally, of which graph cycles are just a subcase. Each box seen more than once is assigned a monotonically increasing integer name that's unique in the graph being printed by one wshow() call. For example, suppose the peg you print with wshow() points to a pair which refers to itself — let's say both car and cdr slots both point at the pair itself. In this case, the integer name of this pair is 1 and printing the pair should display this output: #1=(#1# . #1#)
What does this notation mean? The first time a cycled box (pair in this case) is encountered, it is preceded by syntax #1=, meaning the next object has integer name 1, and any subsequent re-appearance of that box is printed as #1# instead of being printed in its entirety again. In addition to handling cyclic references like this, the pretty printer also has to work even in lines that contain syntax like the above. For example, the pretty printing code metering output size must figure out the expression shown above will write fourteen bytes (in this case) and come up with exactly the same answer when metering the size as it does when actually printing real output. (define (rep-loop evaluator)
(display "repl>")
(let ((expr (read)))
(cond
((eq? expr 'exit)
(display "exiting read-eval-print loop")
(newline))
(#t (write (evaluator expr) #1=(#1# . #1#))
(rep-loop evaluator)))))
For example, #1=(#1# . #1#) was inserted in the expression shown above. The reader parses this correctly and the writer shown on this page writes it out again, so it round-trips without change. And metering code correctly figures out the list beginning with #t fits on that line, without reckoning size is "infinite," or some indeterminate number.
entry points
«
The primary public api for writer yb2ow has three different ways to print a value, and one way to measure length, so you can choose whether or not to get pretty printing, and whether cycles and shared structure should be detected and handled. Service varies from maximal to minimal as follows:
Each of those methods is really just an entry point prepping initial state before using a recursive internal api common to all the methods. The first three printing methods set two boolean values, w_show and w_flat, to indicate desired format. Internal settings work like this:
In effect, w_flat means "no pretty printing," and w_show means "detect cycles." Internally, boolean flag w_flat causes private _wflat() to print without line breaks; otherwise private _wpretty() writes a pretty print format. But this division of labor overlaps when _wpretty() decides something fits all on one line and uses _wflat() to print it; thus pretty printing devolves to flat printing in leaf cases.
public api
«
The yb2ow.h header file defines the following api. class yb2ow « An instance of writer class yb2ow can be instantiated on the stack temporarily, just long enough to print a graph reachable from a particular virtual machine — this usage should be efficient. But you can also instantiate yb2ow inside another object since it won't really use resources except when a print call is in progress (because hashmaps are cleared when printing is done, which releases all pages allocated from the pile referenced by member w_H). class yb2ow { // box to out writer «
private:
yM& w_M; // the virtual machine «
yPBH& w_H; // pile supplied by constructor «
const ykMgbv& w_knowns; // w_M.Mknowns() «
yyb* w_quote; // symbol quote (same as w_knowns.v_quote) «
yyb* w_unquote; // 'unquote (same as w_knowns.v_unquote) «
yyb* w_unquote_splicing; // unquote-splicing «
yyb* w_quasiquote; // quasiquote (w_knowns.v_quasiquote) «
yo& w_o; // the out stream finally written in the end «
yno w_no; // yo measuring size of data written to it «
yoo w_oo; // writes to w_o measuring current line len «
x32 w_max; // max line len before print needs LF+indent «
bool w_flat; // true: no linebreaks or indents permitted «
bool w_show; // true: wshow() is pretty printing «
n32 w_depth; // counts recursion: used to limit nesting «
ypp2dm w_map; // m1: pairs seen; m2: repeated structures «
n32 w_ncycles; // counts cyclic objects: w_map.m2size() «
const void* w_ungloss; // 1 time suppression of _wgloss() «
u8 w_hex[256]; // chars needing hex in printed strings «
class Wpush { public: // bump depth 1 during lifetime only «
yb2ow& p_w; // the writer whose depth is tweaked
operator n32() const { return p_w.w_depth; } // depth
Wpush(yb2ow& w) : p_w(w) { ++p_w.w_depth; } // deeper
~Wpush() { if (p_w.w_depth) --p_w.w_depth; } // undo
};
void _werr(u32 line, yp::Perr e, yo& o); // imm err+lineno
void _werr(yp const& peg); // write peg to stderr (debug)
void _wchar(c32 c, int bucky, yo& o); // immediate char
void _wbox(const void* b, yo& o); // box ptr to w_no or w_o
void _wpeg(yp const& peg, yo& o); // any peg: box or imm
// void _wr128(r128 val, yo& o);
void _wr64(r64 val, yo& o);
void _wr32(double val, yo& o);
void _wpretty(ypairb const& pair, yo& o); // pretty list
void _wflat(ypairb const& pair, yo& o); // p2 as flat list
void _wsymbol(yyb const& sym, yo& o);
void _wstring(ysb const& str, yo& o);
void _wdecimal(char* buf, yo& o); // force real decimal pt
bool _wcadr(ypairb const& pair, yp* cadr, yo& o);
// gloss if in map2: print if not in map1 & add to map1;
// param dot is true when dot is needed before a #n# gloss;
// return count of times seen (0: never, 1: #n=, 2: #n#)
int _wgloss(ypairb const& p, yo& o, bool dot); //!0=>glossed
/// \brief traverse data: populate map2 with all cyclic refs
/// \support for wshow() by finding cycles before _wpeg()
/// \start by clearing w_ncycles and m_map to zero/empty
/// \end with map1 empty again after unwind, but map2 kept
/// \return false if pass fails (eg) from excess recursion
bool _wmaptop(yp const& peg); // find cyclic refs in object
bool _wpeel(yp const& peg); // _wmaptop()'s recurse for peg
void _waddcycle(const void* key); // add cycle if new key
bool _wmap(ypairb const& p); // _wmaptop() recurse for pair
/// \brief measure len relative to max x for pretty printing
/// \param x: max over which length is too big, if it's more
/// \given N=peg-length & c is some nonzero constant, then:
/// \return (N > x)? x + c : N; // exact length or some x+c
p32 _wgauge(ypairb const& pair, x32 x); // length or x + c
// add whole list len to ruler, or til limit < (p32) ruler
/// \return ruler.nolen()
p32 _wgauge(ypairb const& pair, x32 limit, yno& ruler);
// add whole peg length to ruler, or til limit < (p32) ruler
/// \return ruler.nolen()
p32 _wmeasure(yp const& peg, x32 limit, yno& ruler);
// return true if peg length does not bump col much past max
bool _wfits(yp const& peg); // peg fits on current line?
/// \brief _wput() is wwrite() without top level map reset
void _wput(yp const& peg, yo& o); // minimum internal wwrite
/// \brief pretty print with cycles seen and cycle labeling
/// \return cycle count found (map2 size post _wmaptop())
n32 wshow(yp peg); // like wshape() but show cyclic objects
void wwrite(yp peg); // write 1 line no linebreak or indent
n32 wlen(yp peg); // peg len written by wwrite() (less LF)
// name of char c if known & desired
static const char* wcname(c32 c);
// maxLine is targeted max line length for pretty printing
yb2ow(yPBH& H, yo& o, x32 maxLine, yM& vm);
~yb2ow();
struct Wq {
yb2ow const& q_w; Wq(yb2ow const& w):q_w(w) {} }; «
Wq quote() const { return Wq(*this); } «
void wprint() const; // dump to stdout via yout
void wdump(yo& o) const; void wcite(yo& o) const;
}; // yb2ow
inline yb2ow& operator<<(yb2ow& w, yp::Pq const& q) { «
w.wshow(q.q_p); return w; }
inline yb2ow& operator<<(yb2ow& w, yp const& p) { «
w.wshape(p); return w; }
|
menu
mu, toy, peg, imm, tag, box, symbol, token, number, bigint, class, method, reader, writer « Þ, eval, env, vm, gc, world, pcode, compiler, asm, lathe, lisp, smalltalk, design, weight, jar, card, harp, debug, profile (thorn, todo, names, fd, iovec, assert, log, run, hex, crc, buf, in, out, quote, escape, compare, file, deck, cow, arc, blob, tree, slice, rand, time, stat, hash, heap, node, primes, page, book, pile, stack, atomic, lock, mutex, thread, map, meter, list, iter, ctype)
immutable graphs
«
This writer's implementation assumes every object graph printed is immutable, and refrains from changing any bits anywhere in a graph printed, in case other code concurrently uses (parts of) that graph at the same time. Wil's original Ygg pretty printer from the early 90's used bits in each box named cycle and visited to perform the task of marking a graph prior to printing, in order to identify cycles in the graph. This approach needed exclusive access to a graph being printed, making Ygg inherently single threaded when printing. The current cycle detection scheme uses a separate pair of maps in ypp2dm — part of the writer's member state — so printing does not alter any part of a graph being written. This supports concurrent use of objects in the graph during a print operation, as long as no one else alters the graph.
garbage collection
«
Because gc moves boxes when garbage collection occurs, and because writer yb2ow holds pointers to boxes in maps (without making these visible as gc roots that can be updated) current writer code assumes gc will not occur in any space overlapping the graph being printed. So using yb2ow to pretty print a graph means gc must be disabled during a single call to print a graph. (This is just one of many quirks making the current implementation just a toy. A pretty printer written in the toy language itself would allow gc to occur at the same time.)
sources
«
yb2ow.cpp « The following non-inline writer methods implement the class api. The constructor takes a yM vm (virtual machine) instance, which provides standard symbol address for quote, quasiquote, unquote, and unquote-splicing, since these might not be the same in every virtual machine. Each writer also takes a yPBH pile for heap allocation (in case it's not the one used by the the vm), and a out stream subclass of yo to receive all content written — which is implicitly assumed single threaded and dedicated to this purpose (without being written by anyone else). yb2ow::yb2ow(yPBH& H, yo& o, x32 x, yM& vm) «
: w_M(vm), w_H(H), w_knowns(vm.Mknowns())
, w_quote(w_knowns.v_quote), w_unquote(w_knowns.v_unquote)
, w_unquote_splicing(w_knowns.v_unquote_splicing)
, w_quasiquote(w_knowns.v_quasiquote) , w_o(o), w_no()
, w_oo(&o), w_max(x), w_flat(true), w_show(false)
, w_depth(0), w_map(&H), w_ncycles(0), w_ungloss(0)
{
register u8* map = w_hex;
::memset(map, 0, 128); // ascii does not need
::memset(map + 128, 'x', 128); // all non-ascii needs hex
// (escape not in (eg) java) map['a'] = e_bel; // alarm 0x7
::memset(map, 'x', 0x20); // .. but control chars need hex
map[ 0x7F ] = 'x'; // including e_del
// need letter code escapes instead of hex (non 'x' values):
map[yi2br::e_bs] = 'b'; // backspace 0x8
map[yi2br::e_cr] = 'r'; // '\r'; // 0xD CR carriage return
map[yi2br::e_lf] = 'n'; // '\n'; // 0xA newline
map[yi2br::e_ht] = 't'; // '\t'; // 0x9 HT horizontal tab
map[yi2br::e_ff] = 'f'; // '\f'; // 0xC FF (page)
map[yi2br::e_vt] = 'v'; // 0xB VT vertical tab
map['\\'] = '\\'; // backslash maps to self (post backslash)
map['"'] = '"'; // quote maps to self (after 1st backslash)
}
void yb2ow::wdump(yo& o) const { «
o.oft("<yb2ow me=%lx o=%lx H=%lx>",
(long) this, (long) &w_o, (long) &w_H);
if (&w_o) o << yendl << ycite(w_o);
if (&w_H) o << yendl << ycite(w_H);
o.ounend("yb2ow");
}
void yb2ow::wcite(yo& o) const { «
o.of("<yb2ow me=%lx o=%lx H=%lx/>",
(long) this, (long) &w_o, (long) &w_H);
}
void yb2ow::_wdecimal(char* buf, yo& o) { «
// ensure floating pt format includes decimal point
register const char* s = buf;
register int c = *s;
if ( c == '-' )
++s; // skip a leading sign in case it's negative
while ((c = *s++) && ::isdigit(c))
/* empty */;
if (!c) { // saw end of string without seeing non-digit?
::strcat(buf, ".0");
}
o << buf;
}
//void yb2ow::_wr128(r128 val, yo& o) { «
// char buf[128]; ::sprintf(buf, "%1.18Lg", (long double) val);
// this->_wdecimal(buf, o);
//}
void yb2ow::_wr64(r64 val, yo& o) { «
char buf[96]; ::sprintf(buf, "%1.16lg", (double) val);
this->_wdecimal(buf, o);
}
void yb2ow::_wr32(double val, yo& o) { «
char buf[96]; ::sprintf(buf, "%1.9g", (double) val);
this->_wdecimal(buf, o);
}
int g_lambdas_dummy = 0;
void yb2ow::_wsymbol(yyb const& sym, yo& o) { «
yv vsym = sym.asv(); // same as operator yv()
if (sym.bescaped()) {
o.oc('|'); o << vsym; o.oc('|');
}
else
o << vsym;
if (&sym == w_knowns.v_lambda) {
++g_lambdas_dummy;
}
//u32 hash = sym.yhash();
//o.of(" #|0x%lx|#", (long) hash);
}
void yb2ow::_wstring(ysb const& str, yo& o) { «
register u8* escape = w_hex;
yv s = str.asv(); // same as ysb::operator yv()
const u8* p = s.v_p;
if (p && s.v_n) { // nonempty?
o.o2c('"');
const u8* start = p; // last byte not yet written
const u8* end = p + s.v_n; // 1 beyond last byte to write
for (/*prep preincr*/ --p; ++p < end; ) {
register int c = *p;
if (escape[c]) { // need to escape this byte?
if (p > start) { // need to write earlier bytes first?
yv before(start, p - start);
o << before; // bytes before p
}
o.o2c('\\'); // escape seq always starts w/ backslash
if ('x' == escape[c]) // write as hex?
o.of("x%02x", (int) c); // 'x' plus two hex digits
else // write as the value stored in escape[c]
o.o2c(escape[c]);
start = p + 1; // next byte to write is after current p
} // if (escape[c])
} // for
if (p > start) { // at least final byte was not escaped?
yv last(start, p - start);
o << last; // trailing bytes before end
}
o.o2c('"');
} // if (p && s.v_n)
}
void yb2ow::_wflat(ypairb const& pair, yo& o) { «
// write pair as flat list
if (w_show && w_ncycles) { // need to see pairs?
if (this->_wgloss(pair, o, /*dot*/ false))
return; // all done when this is second time printed
}
yp* cadr = pair.blen2cadr(); // cadr if length is exactly two
if (cadr && _wcadr(pair, cadr, o)) // len==2? readmacro to o?
return;
bool paren = !pair.bescaped(); // whether to delimit by paren
o.oct((paren)? '(' : '['); // start delimiter
w_map.msee(&pair); // new map member
this->_wpeg(pair.b_car, o); // first list elem
const ypairb* p2 = &pair;
while (p2->b_cdr != yp::i_nil) { // not end of list?
o.oc(' '); // space between list elems
ypairb* cdr = ypairb::mayb(p2->b_cdr); // non-nil if pair
if (cdr) { // another pair?
if (w_show && w_ncycles) {
if (this->_wgloss(*cdr, o, /*dot*/ true))
break; // END WHILE
w_map.msee(cdr); // new map member
}
p2 = cdr;
this->_wpeg(p2->b_car, o); // continue while loop
}
else { // last non-nil cdr contains non-pair?
// dotted list format for last non-nil, non-pair cdr
o.o2c('.', ' '); this->_wpeg(p2->b_cdr, o);
break; // END WHILE
}
}
o.ocu((paren)? ')' : ']'); // end delimiter
}
bool yb2ow::_wcadr(ypairb const& pair, yp* cadr, yo& o) { «
if (cadr) { // length is exactly two?
if (pair.b_car == (ybox*) w_quote) { // quote readmacro?
w_map.msee(&pair); // new map member
o.oc('\''); this->_wpeg(*cadr, o); return true; // 'cadr
}
else if (pair.b_car == (ybox*) w_unquote) {
w_map.msee(&pair); // new map member
o.oc(','); this->_wpeg(*cadr, o); return true; // ,cadr
}
else if (pair.b_car == (ybox*) w_unquote_splicing) {
w_map.msee(&pair); // new map member
o.o2c(',', '@'); this->_wpeg(*cadr, o);
return true; // ,@cadr
}
else if (pair.b_car == (ybox*) w_quasiquote) {
w_map.msee(&pair); // new map member
o.oc('`'); this->_wpeg(*cadr, o); return true; // `cadr
}
}
return false;
}
void yb2ow::_wpretty(ypairb const& pair, yo& o) { «
// print as pretty list
if (w_show && w_ncycles) { // need to see pair?
if (this->_wgloss(pair, o, false)) // glossed?
return; // all done when this is second time printed
}
unsigned col = w_oo.oocol(); // column (line len) before 1st
if (col < (w_max - 8)) {
// not already starting beyond official max?
p32 len = (w_max - col) + 8; // permit a few beyond max
if (this->_wgauge(pair, len) <= len) { // flat good enough?
this->_wflat(pair, o); return; // done
}
}
col += 2; // indent by two for children
yp* cadr = pair.blen2cadr(); // cadr if length is exactly 2
if (cadr && _wcadr(pair, cadr, o)) // len 2? did readmacro?
return;
n32 lines = w_oo.oolines(); // number of newlines see before
bool paren = !pair.bescaped(); // whether to delimit w/paren
o.oct((paren)? '(' : '['); // start delimiter
w_map.msee(&pair); // new map member
this->_wpeg(pair.b_car, o); // first list elem
unsigned n = 1; // length: number of members written so far
const ypairb* p2 = &pair;
while (p2->b_cdr != yp::i_nil) { // not end of list?
ypairb* cdr = ypairb::mayb(p2->b_cdr); // non-nil if pair
if (cdr) { // another pair?
p2 = cdr;
if (++n == 2 && _wfits(p2->b_car)) { // 2nd fits on line?
o.o2c(' ');
if (w_show && w_ncycles) {
if (this->_wgloss(*cdr, o, /*dot*/ true))
break; // END WHILE
w_map.msee(cdr); // new map member
}
this->_wput(p2->b_car, o); // flat is good enough
}
else {
o.oncol(col); // LF then space indent to column
if (w_show && w_ncycles) {
if (this->_wgloss(*cdr, o, /*dot*/ true))
break; // END WHILE
w_map.msee(cdr); // new map member
}
this->_wpeg(p2->b_car, o); // continue while loop
}
}
else {
// used dotted list format for end non-nil, non-pair cdr
if (w_oo.oolines() == lines) // same line so far?
o.o3c(' ', '.', ' ');
else
o.oncol(col); // LF then space indent to column
this->_wpeg(p2->b_cdr, o);
break; // END WHILE
}
}
o.ocu((paren)? ')' : ']'); // end delimiter
}
bool yb2ow::_wfits(yp const& peg) { // peg fits on curr line? «
// return true if peg length does not bump col much past max
ypp2dm::Mlayer guard(w_map); // temp local layer map changes
unsigned col = w_oo.oocol(); // column (line len) before 1st
if (col < (w_max - 4)) { // not to close to max already?
p32 limit = (w_max - col); // desired limit
p32 more = 24; // but measure a few beyond max
w_no << yreset; // zero stats
p32 actual = this->_wmeasure(peg, limit + more, w_no);
p32 after = col + actual;
if (after > w_max) { // goes beyond max?
p32 over = after - w_max; // number of bytes too many
if (after > w_max + 8 || (over * 8) > limit) // too much?
return false; // weighted too far over max limit
}
return true; // peg looks like it works okay on same line
}
return false; // do not try to fit peg on same line
}
void yb2ow::_wbox(const void* b, yo& o) { «
// box pointer to w_no or w_o
yt tag = ybox::btag(b);
long l = (long) b;
switch (tag.kind()) {
case yt::k_none: { o.of("#<none p='^%lx'>", l); return; }
case yt::k_p1: { o.of("#<1-tuple p='^%lx'>", l); return; }
case yt::k_p2: {
if (w_flat) this->_wflat( *((ypairb*) b), o);
else this->_wpretty( *((ypairb*) b), o);
return;
}
case yt::k_p3: { o.of("#<triple p='^%lx'>", l); return; }
case yt::k_p4: { o.of("#<quad p='^%lx'>", l); return; }
case yt::k_hash: { o.of("#<hash p='^%lx'>", l); return; }
//case yt::k_r128: { o << "#li";
// _wr128(((yr128b*) b)->b_r128, o); return; }
case yt::k_i128: { o.of("#<r128 p='^%lx'>", l); return; }
case yt::k_i16: { o.of("#<i16 p='^%lx'>", l); return; }
case yt::k_u16: { o.of("#<u16 p='^%lx'>", l); return; }
case yt::k_i32: { o.of("#<i32 p='^%lx'>", l); return; }
case yt::k_u32: { o.of("#<u32 p='^%lx'>", l); return; }
case yt::k_i64: { o.of("#<i64 p='^%lx'>", l); return; }
case yt::k_u64: { o.of("#<u64 p='^%lx'>", l); return; }
case yt::k_r32: { o << "#si";
_wr32(((yr32b*) b)->b_r32, o); return; }
case yt::k_r64: { o << "#i";
_wr64(((yr64b*) b)->b_r64, o); return; }
case yt::k_vec: { o.of("#<vector p='^%lx'>", l); return; }
case yt::k_obj: { o.of("#<object p='^%lx'>", l); return; }
case yt::k_class: { o.of("#<class p='^%lx'>", l); return; }
case yt::k_int: { o.of("#<bigint p='^%lx'>", l); return; }
case yt::k_tree: { o.of("#<tree p='^%lx'>", l); return; }
case yt::k_cord: { _wstring( *((ysb*) b), o); return; }
case yt::k_sym: { _wsymbol( *((yyb*) b), o); return; }
case yt::k_far: { o.of("#<far p='^%lx'>", l); return; }
case yt::k_map: { o.of("#<map p='^%lx'>", l); return; }
case yt::k_row: { o.of("#<row p='^%lx'>", l); return; }
case yt::k_rig: { o.of("#<rig p='^%lx'>", l); return; }
// return this->trigname();
case yt::k_toc: { o.of("#<toc p='^%lx'>", l); return; }
case yt::k_iter: { o.of("#<iter p='^%lx'>", l); return; }
case yt::k_table: { o.of("#<table p='^%lx'>", l); return; }
case yt::k_fore: { o.of("#<fore p='^%lx'>", l); return; }
case yt::k_back: { o.of("#<back p='^%lx'>", l); return; }
} // switch
o.of("#<BOX p='^%lx' kind='#%lx'>", l, (long) tag.kind());
}
n32 yb2ow::wshow(yp peg) { «
// like wshape() but also showing cyclic objects;
// pretty print with cycle detection and cycle labeling;
// return count of total cycles found
// (size of w_cyclem after _wmaptop())
bool saveFlat = w_flat; w_flat = false;
bool saveShow = w_show; w_show = true;
if (this->_wmaptop(peg)) {
// need clear again to see 1st time printed...
w_map.munsee_all();
this->_wpeg(peg, w_oo);
yassert("yb2ow::wshow" && 0 == w_map.mdepth());
}
else
w_oo << "#<yb2ow::wshow() recursed too deeply>";
w_oo << yendl << ynow;
w_flat = saveFlat; w_show = saveShow;
w_map.munbind_all(); w_map.munsee_all(); // cycles and seen
return w_ncycles; // cleared and recalculated by _wmaptop()
}
void yb2ow::wshape(yp peg) { «
bool save = w_flat; w_flat = false;
bool saveShow = w_show; w_show = false;
this->_wpeg(peg, w_oo); w_oo << yendl << ynow;
w_flat = save; w_show = saveShow;
}
void yb2ow::wwrite(yp peg) { «
bool save = w_flat; w_flat = true;
bool saveShow = w_show; w_show = false;
this->_wpeg(peg, w_oo); w_oo << ynow;
w_flat = save; w_show = saveShow;
}
void yb2ow::_wput(yp const& peg, yo& o) { «
// minimal internal wwrite
// _wput() is wwrite() without any top level map resetting
bool save = w_flat; w_flat = true;
this->_wpeg(peg, o); w_oo << ynow;
w_flat = save;
}
n32 yb2ow::wlen(yp peg) { «
// length of peg written w/ wwrite() (less LF)
bool save = w_flat; w_flat = true;
yno ruler; this->_wpeg(peg, ruler);
ruler << ynow; // get written length
w_flat = save;
return ruler.nolen(); // number of bytes written
}
p32 yb2ow::_wgauge(ypairb const& pair, x32 x) { «
// len or x + c for constant c
yno ruler; this->_wgauge(pair, x, ruler);
return ruler.nolen();
}
p32 yb2ow::_wgauge(ypairb const& pair, x32 limit, yno& ruler) { «
// add entire list len to ruler, or until limit < (p32) ruler
ypp2dm::Mlayer guard(w_map);
// guard: temp local layer changes to w_map only
yo& o = ruler;
if (w_show && w_ncycles) { // need to see pair?
if (this->_wgloss(pair, o, /*dot*/ false)) // glossed?
return ruler.nolen(); // all done when 2nd time printed
}
yp* cadr = pair.blen2cadr(); // cadr if len is exactly two
if (cadr && _wcadr(pair, cadr, ruler))
// length == two? wrote readmacro?
return ruler.nolen();
bool paren = !pair.bescaped(); // whether to delimit w/ paren
o.oct((paren)? '(' : '['); // start delimiter
w_map.msee(&pair); // new map member
this->_wmeasure(pair.b_car, limit, ruler); // first list elem
const ypairb* p2 = &pair;
while (p2->b_cdr != yp::i_nil) { // not end of list?
p32 rlen = ruler.nolen();
if (rlen > limit) // already exceeded limit of interest?
return rlen; // done measuring size
o.oc(' '); // space between list elems
ypairb* cdr = ypairb::mayb(p2->b_cdr);
if (cdr) { // another pair?
if (w_show && w_ncycles) {
if (this->_wgloss(*cdr, o, /*dot*/ false))
break; // END WHILE
w_map.msee(cdr); // new map member
}
p2 = cdr; // continue loop
this->_wmeasure(p2->b_car, limit, ruler);
}
else {
// dotted list format for trailing non-nil, non-pair cdr
o.o2c('.', ' ');
this->_wmeasure(p2->b_cdr, limit, ruler);
break; // END WHILE
}
}
o.ocu((paren)? ')' : ']'); // end delimiter
return ruler.nolen(); // final size written so far
}
p32 yb2ow::_wmeasure(yp const& peg, x32 limit, yno& ruler) { «
// add entire peg len to ruler, or until limit < (p32) ruler
u32 lid = peg.plid();
if ((lid & 3) == 0) { // box?
ybox* b = peg.pbox();
yt tag = ybox::btag(b);
if (yt::k_p2 == tag.kind()) // box is a pair?
this->_wgauge( *((ypairb*) b), limit, ruler);
else
this->_wbox(b, ruler); // gauge some other box type
}
else
this->_wpeg(peg, ruler); // gauge ordinary immediate
return ruler.nolen();
}
int yb2ow::_wgloss(ypairb const& key, yo& o, bool dot) { «
// _wgloss() is true if repeat; gloss if in map2: print if
// not in map1 & add to map1; return count of times
// seen (zero: never, one: #n=, two: #n#)
const void* val = 0;
if (w_show && w_ncycles && (val = w_map.mbound(&key)) != 0) {
// dup?
const u8* ref = (const u8*) &key.b_cdr;
// ref: INSIDE pair (not at start)
const void* saw = w_map.msaw(ref);
if (saw && !w_map.msaw(&key)) { // now defining #N=?
return 0; // did not gloss
}
if (dot) // after the first elem in a list?
o.o2c('.', ' ');
if (saw) { // already printed once before? replace w/ #n#?
o.of("#%lu#", (long) val);
return 2; // replaced with a #n# gloss ref
}
else { // define with #n= prefix
w_map.msee(ref); // new map member
o.of("#%lu=", (long) val);
ybox* box = (ybox*) (void*) &key;
yp peg(box);
this->_wpeg(peg, o);
return 1; // replaced with a #n= gloss name
}
}
return 0; // need to print this pair this time
}
void yb2ow::_waddcycle(const void* key) { «
// add cycle if not already present
if (!w_map.mbound(key)) { // not already present?
n32 n = ++w_ncycles; // number of cycles known so far
void* value = (void*)(ptrdiff_t) n; // count as non-nil ptr
w_map.mbind((void*) key, value); // new map member
}
}
bool yb2ow::_wmap(ypairb const& pair) { «
// recursive _wmaptop() part for pair
if (w_map.msaw(&pair)) { // this pair is in a cycle?
this->_waddcycle(&pair); // ensure pair listed as cycle
return true; // did not recurse too far
}
Wpush deeper(*this); // incr w_depth for rest of this scope
if (w_depth > 512) // recursed too deeply?
return false;
w_map.msee(&pair); // add myself (counted by deeper)
if (!this->_wpeel(pair.b_car)) // first list elem
return false; // recursed too deeply
const ypairb* p2 = &pair;
while (p2->b_cdr != yp::i_nil) { // not end of list?
ypairb* cdr = ypairb::mayb(p2->b_cdr); // non-nil if a pair
if (cdr) { // another pair?
if (w_map.msaw(cdr)) { // cdr is part of a cycle?
this->_waddcycle(cdr); // ensure cdr listed as cycle
return true; // stop (true: did not recurse too far)
}
p2 = cdr;
w_map.msee(cdr); // add another sublist we're now inside
if (!this->_wpeel(p2->b_car)) // recursed too deeply?
return false;
}
else {
// dotted list format for trailing non-nil, non-pair cdr
if (!this->_wpeel(p2->b_cdr)) // recursed too deeply?
return false;
break; // END WHILE
}
}
return true; // got here without recursing too deeply
}
bool yb2ow::_wpeel(yp const& peg) { «
// recursive part of _wmaptop() for pegs
u32 lid = peg.plid();
if ((lid & 3) == 0) { // box?
ybox* b = peg.pbox();
yt tag = ybox::btag(b);
if (yt::k_p2 == tag.kind()) // box is a pair?
return this->_wmap( *((ypairb*) b) ); // recursive
//else
// this->_wmapbox(b); // gauge some other box type
}
return true; // got here without recursing too deeply
}
bool yb2ow::_wmaptop(yp const& peg) { «
// find cyclic references in object
w_map.munsee_all(); w_map.munbind_all(); // start empty
w_ncycles = 0; w_depth = 0; // start at zero
return this->_wpeel(peg);
}
void yb2ow::_wpeg(yp const& peg, yo& o) { «
// any peg: box or immediate
u32 lid = peg.plid();
switch (lid & 3) { // low two 'ilk' bits:
case yp::i_box: this->_wbox(peg.pbox(), o); break;
case yp::i_int: o.of("%d", (int) peg.pi4()); break;
case yp::i_nil: o.o2c('(', ')'); break;
case yp::i_cue: { // need to look at Pcue bits
switch ((lid >> 2) & 3) { // Pcue enum bits
case yp::c_ply: { // need to look at Pply bits
switch ((lid >> 4) & 0x0f) { // Pply enum bits
case yp::p_false: o.o2c('#', 'f'); break;
case yp::p_true: o.o2c('#', 't'); break;
case yp::p_void: o << "#void"; break;
case yp::p_none: o << "#none"; break;
case yp::p_one:
o.of("#<p_one lid=0x%lx>", (long) lid); break;
case yp::p_two:
o.of("#<p_two lid=0x%lx>", (long) lid); break;
case yp::p_three:
o.of("#<p_three lid=0x%lx>", (long) lid); break;
case yp::p_nobox: {
yt tag(peg.pnobox());
o.of("#<nobox %.48s>", tag.tname()); break;
}
case yp::p_next: o << "#next"; break;
case yp::p_rest: o << "#rest"; break;
case yp::p_key: o << "#key"; break;
case yp::p_val: o << "#val"; break;
case yp::p_eof: o << "#eof"; break;
case yp::p_setter:
o.of("#<setter i=%u>", peg.psetter()); break;
case yp::p_getter:
o.of("#<getter i=%u>", peg.pgetter()); break;
case yp::p_token: { int tok = peg.ptoken();
o.of("#<token cd=%d nm=%.48s>", tok,
asName((yi2br::etoken) tok));
break;
}
default: // should be unreachable
o.of("#<c_ply lid=0x%lx>", (long) lid); break;
} // switch ((lid >> 4) & 0x0f) { // Pply enum bits
break;
} // case yp::c_ply
case yp::c_one:
o.of("#<c_one lid=0x%lx>", (long) lid); break;
case yp::c_err:
this->_werr(peg.perrline(), peg.perr(), o); break;
case yp::c_chr:
this->_wchar(peg.pc21(), peg.pbucky(), o); break;
default: // should be unreachable
o.of("#<i_cue lid=0x%lx>", (long) lid); break;
} // switch ((lid >> 2) & 3) { // Pcue enum bits
break;
} // case yp::i_cue
default: // should be unreachable
o.of("#<_wpeg lid=0x%lx>", (long) lid); break;
} // switch (lid & 3) for ilk bits
}
void yb2ow::_werr(yp const& peg) { // write peg to stderr «
bool save = w_flat; w_flat = true;
this->_wpeg(peg, yerr);
yerr << yendl << ynow;
w_flat = save;
}
void yb2ow::_werr(u32 line, yp::Perr e, yo& o) { «
switch (e) {
long ln = (long) line;
case yp::r_cease: // hit eof before end of some expression
o.of("#<unexpected-eof line=%ld>", ln); break;
case yp::r_delim: // hit ) or ] mixup for delimit end
o.of("#<wrong-delimitor line=%ld>", ln); break;
case yp::r_overdot: // too many expressions after dot
o.of("#<extra-expr-post-dot line=%ld>", ln); break;
case yp::r_underdot: // missing expression after dot
o.of("#<missing-expr-post-dot line=%ld>", ln); break;
case yp::r_multidot: // too many dots in one list
o.of("#<multiple-list-dots line=%ld>", ln); break;
case yp::r_badref: // bad #n= or #n# ref: n is unknown
o.of("#<badref n=%d unknown='#n= or #n#'>", ln); break;
default:
o.of("#<err kind=0x%x line=%ld>", (int) e, ln); break;
}
}
void yb2ow::_wchar(c32 c, int bucky, yo& o) { «
const char* cname = yb2ow::wcname(c);
if (bucky) // any bucket bits are set?
o.of("#<char #\\u+%x bucky=0x%x>", (int) c, bucky); // stub
else if (cname) {
o.o2c('#', '\\'); o << cname;
}
else if (c <= 0x7F) // no more than 7-bits?
o.o3c('#', '\\', (u8) c);
else // unicode codepoint
o.of("#\\u+%x", (int) c);
}
static const char* yb2ow_cname[ ] = { // names for 0x00 to 0x20 «
"nul", // = 0x0, // nul ^@
"soh", // = 0x1, // start of heading ^A
"stx", // = 0x2, // start of text ^B
"etx", // = 0x3, // end of text ^C
"eot", // = 0x4, // end of transmission ^D
"enq", // = 0x5, // enquiry ^E
"ack", // = 0x6, // acknowledge ^F
"bel", // = 0x7, // bell ^G
"backspace", // AKA "bs", // = 0x8, // backspace ^H
"tab", // AKA "ht", // = 0x9, // horizontal tab ^I
"newline", // AKA "lf", // = 0xA, // LF, newline ^J
"vt", // = 0xB, // vertical tab ^K
"page", // AKA "ff", // = 0xC, // form feed ^L
"return", // AKA "cr", // = 0xD, // carriage return ^M
"so", // = 0xE, // shift out ^N
"si", // = 0xF, // shift in ^O
"dle", // = 0x10, // data link esc ^P (raw data, not ctl's)
"dc1", // = 0x11, // device ctl 1 ^Q [xon] (resume after ^S)
// "xon", // = "dc1,
"dc2", // = 0x12, // device ctl 2 ^R (reprint current line)
"dc3", // = 0x13, // device ctl 3 ^S [xoff] (pause until ^Q)
// "xoff", // = "dc3,
"dc4", // = 0x14, // device control 4 ^T (status)
"nak", // = 0x15, // negative acknowledge ^U (delete line)
"syn", // = 0x16, // synchronous idle
"etb", // = 0x17, // end of transmission block
"can", // = 0x18, // cancel ^X
"em", // = 0x19, // end of medium ^Y
"sub", // = 0x1A, // substitute ^Z (suspend)
"esc", // = 0x1B, // escape ^[
"fs", // = 0x1C, // file sep ^\ (largest granularity sep)
"gs", // = 0x1D, // group sep ^]
"rs", // = 0x1E, // record sep ^^
"us", // = 0x1F, // unit sep ^_ (smallest granularity sep)
"space" // = 0x20
};
const char* yb2ow::wcname(c32 c) { «
// name of c if known & desired
if (c <= 0x20)
return yb2ow_cname[c];
if (c == 0x7F)
return "del"; // rubout
return 0; // nil means no name,
}
|