demos are
explained here;
a menu at top column right indexes actual topic demos.
Here we demo buf.
problem
When writing content of unknown size to a buffer space of fixed capacity, Wil uses a run subclass of yv named yb that adds a new member specifying max size. So a yb buffer is a yv run, but in addition has a max physical capacity b_x apart from logical length v_n. Most of the initial api below initializes inherited yv state (see run) and b_x from pointers, other yv runs, and iovecs (see iovec). The default copy constructor and assignment operator are fine to use too. With all public members, yb is a pod type (plain old data) of simple nature.
struct yb : public yv { // octet buffer: yv plus max capacity «
x32 b_x; // max physical capacity (v_n is logical length) «
yb() : yv(0, 0), b_x(0) { } «
yb(const void* p, n32 len, x32 x) : yv(p, len), b_x(x) { }
yb(yv const& v, x32 max) : yv(v), b_x(max) { }
yb(iovec const& v, x32 max) : yv(v), b_x(max) { }
yb(yv const& v) : yv(v.v_p, 0), b_x(v.v_n) { }
yb(iovec const& v) : yv(v.iov_base, 0), b_x(v.iov_len) { }
void binit(const void* p, n32 n, x32 x) { // re-construct «
v_p = (u8*)p; v_n = n; b_x = x; }
void bclear() { v_p = 0; v_n = 0; b_x = 0; } // zero all «
// ... continued
}; // struct yb
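As a rough illustration of the layout described above, here is a minimal standalone sketch of a run plus a capacity. The names Run, Buf, room(), and demo_room() are hypothetical stand-ins invented for this example, not the thorn types; only the idea of a logical length kept separate from a physical capacity comes from the text.

```cpp
#include <assert.h>
#include <stdint.h>

// Hypothetical stand-in for yv: a pointer plus a logical length.
struct Run {
    uint8_t* p;   // start of bytes (plays the role of v_p)
    uint32_t n;   // logical length in bytes (plays the role of v_n)
    Run(void* ptr, uint32_t len) : p(static_cast<uint8_t*>(ptr)), n(len) { }
};

// Hypothetical stand-in for yb: a run plus a max physical capacity.
struct Buf : public Run {
    uint32_t x;   // max physical capacity (plays the role of b_x)
    Buf(void* ptr, uint32_t len, uint32_t max) : Run(ptr, len), x(max) {
        assert(n <= x); // used space never exceeds capacity
    }
    // Bytes still available after the current content.
    uint32_t room() const { return (x > n) ? (x - n) : 0; }
};

// Sample use: 5 bytes used out of 16 leaves 11 bytes of room.
inline uint32_t demo_room() {
    static uint8_t space[16];
    Buf b(space, 5, 16);
    return b.room();
}
```

The point of the sketch is only that every write decision can be made from the pair (n, x) without knowing anything else about the memory.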
The main purpose of yb is distinguishing total available space from currently used space in a contiguous memory-only format, which is quite narrow and straightforward. So a demo of yb alone is short; to make it longer, the right column shows a subclass of yo anticipating the out demo writing to a fixed sized yb buffer with a stream api. By its nature yb aims to support writing to a yv run of contiguous octets with accuracy and safety from buffer overruns. The simplest method of writing a buf writes all the bytes with one value: However, before proceeding with more interesting ways to write yb, first let's dispense with slicing yb subsets because a buf slice is only slightly more complex than a run slice. Like other slice types, ybz is a subclass of yz (see slice) beginning with a z_p position and z_n length, where z_p is an offset from the start (negative: from the end).
struct ybz : public yz { // slice of yb «
yb& z_b;
ybz(zp32 p, zn32 n, yb const& b): yz(p, n), z_b(*(yb*)&b) { }
ybz(yz const& z, yb const& b) : yz(z), z_b(*(yb*)&b) { }
struct Zq { ybz const& q_z; Zq(ybz const& z): q_z(z) { } };
Zq quote() const { return Zq(*this); } // to request dump «
void zprint() const; // zdump() to stdout for use under gdb
void zdump(yo& o) const; void zcite(yo& o) const;
// zv() explicitly converts yb slice to yv of range in buf:
inline yv zv() const; // defined immediately after yb's defn
};
inline yo& operator<<(yo& o, ybz::Zq const& x) {
x.q_z.zdump(o); return o; }
inline yo& operator<<(yo& o, yct<ybz> const& x) {
x.c_t.zcite(o); return o; }
All this api except for the final zv() is devoted to debug printing the slice, or capturing initial state from the following yb inlines: // ... yb continued
ybz bz(zp32 p, zn32 n) const { return ybz(p, n, *this); } «
ybz operator()(zp32 p, zn32 n) const { return ybz(p,n,*this); }
}; // struct yb
inline yv ybz::zv() const { // see run (cf «) for yv::vz()
yv v(z_b.v_p, z_b.b_x); return v.vz(*this); } «
And zv() shown immediately above is an inline following yb which simply generates the resulting yv corresponding to the slice of all the space in the buffer. (The idea is that most objects overriding a method to take ybz will explicitly call ybz::zv() when they want to resolve a buf slice as a yv run, because they might have something else they want to do instead. Thus ybz purposely does not define an operator yv() because this would fire too easily on methods accepting a yv input.) We're done with slices; now back to writing buffers. (The debug print methods of ybz are enough like those of yb that we'll just skip them. Look at yb print methods and use imagination.)
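To make the slice rule concrete (an offset from the start, or from the end when negative), here is a small standalone sketch of resolving a position and length against a total size. The names Span and resolve_slice() are hypothetical, and the clamping of out-of-range requests is an assumption of this sketch; the actual behavior of yv::vz() is defined in the run demo, not here. Note that zv() resolves against the buffer capacity b_x, so total below corresponds to capacity rather than current length.

```cpp
#include <assert.h>
#include <stdint.h>

// Hypothetical result type: where the slice starts and how many bytes it has.
struct Span { uint32_t start; uint32_t count; };

// Resolve (pos, len) against a space of 'total' bytes.
// Negative pos counts back from the end; out-of-range requests are clamped
// (the clamping policy is an assumption made for this sketch).
inline Span resolve_slice(int32_t pos, uint32_t len, uint32_t total) {
    uint32_t start;
    if (pos < 0) {
        uint32_t back = static_cast<uint32_t>(-static_cast<int64_t>(pos));
        start = (back > total) ? 0 : (total - back);
    } else {
        uint32_t fwd = static_cast<uint32_t>(pos);
        start = (fwd > total) ? total : fwd;
    }
    uint32_t left = total - start;          // bytes from start to end
    Span s = { start, (len < left) ? len : left };
    assert(s.start + s.count <= total);     // never reaches past the end
    return s;
}
```

With a 26 byte space, resolve_slice(-5, 5, 26) describes the last five bytes, and resolve_slice(24, 5, 26) is clipped to the two bytes that remain.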
printing
Printing yb resembles printing yv (cf «) and several examples of usage printing yb appear in the iovec demo (cf «). Here's one more example where initial content is copied from another yv: static const char* lo = "abcdefghijklmnopqrstuvwxyz"; «
yv vlo(lo); // yv::yv(const char* cstr);
u8 tmp[1024]; yb b1k(tmp, 0, 1024);
b1k << vlo; // bappend(yv const& v);
yout << b1k.quote() << yendl << ynow; // bdump()
The yv source vlo is written to this 1K buf using bappend() shown in the next section, then the final line generates a temp yb::Bq wrapper from quote() to invoke bdump() shown next. The result on stdout appears below — it's just like dumping yv except for x=1024: <yb p=0xbffff6f8 n=26 x=1024 crc='#4c2750bd:26'>
00000: 61 62 63 64 65 66 67 68 69 6a 6b 6c ; abcdefghijkl
0000c: 6d 6e 6f 70 71 72 73 74 75 76 77 78 ; mnopqrstuvwx
00018: 79 7a ; yz
</yb>
The yb api for printing resembles boilerplate for printing most þ objects (see the quote demo for quote() and ycite()): // ... yb continued
struct Bq { yb const& q_b; Bq(yb const& b): q_b(b) { } };
Bq quote() const { return Bq(*this); } // to request dump «
void bprint() const; // bdump() to stdout for use under gdb
void bdump(yo& o) const; void bcite(yo& o) const;
}; // struct yb
inline yo& operator<<(yo& o, yb::Bq const& x) {
x.q_b.bdump(o); return o; }
inline yo& operator<<(yo& o, yct<yb> const& x) {
x.c_t.bcite(o); return o; }
Code called here appears in demos: crc shows crc32 using yh32, and hex shows hexdump code using yv::vhexmax() (cf «). And of course out shows the yo out stream api. void yb::bprint() const { // to stdout for gdb «
yout << yendl; this->bdump(yout); yout << yendl << ynow;
}
void yb::bdump(yo& o) const { // multi line print «
if (v_n) {
yh32 h; h << *this;
o.oftn("<yb p=%#lx n=%ld x=%ld crc='#%lx:%lu'>",
(long) v_p, (long) v_n,
(long) b_x, (long) h.hcrc(), (long) h.hlen());
this->vhexmax(o, (16*1024)); // yv::vhexmax() (cf «)
o.ouend("yb"); // out untab end (tag)
}
else
this->bcite(o);
}
void yb::bcite(yo& o) const { // single line only «
yh32 h; h << *this;
o.of("<yb p=%#lx n=%ld x=%ld crc='#%lx:%lu'/>", (long) v_p,
(long) v_n, (long) b_x, (long) h.hcrc(), (long) h.hlen());
}
Because dumping buffers in hex is so simple and direct, yb makes a convenient demo mechanism to expose what's written by other i/o objects, even though in normal development there's little reason to use yb for that scaffolding purpose.
formatting
You can write formatted content to yb using printf() style format markup because under the covers bf() uses vsnprintf() with varargs to do hard parts; yb just provides space and records length.
yb& yb::bf(const char* fmt, ...) { // like printf to this buf «
v_n = 0; // default to empty result
if (!fmt) { ylog(1, "bf(fmt=nil)"); return *this; }
if (!v_p) { ylog(1, "nil yb::v_p"); return *this; }
n32 max = b_x;
if (!max) { return *this; } // empty; no space for nul
if (1 == max) { *v_p = 0; v_n = 0; return *this; } // empty
va_list args;
va_start(args,fmt);
vsnprintf((char*)v_p, max-1, fmt, args); // save 1 u8 for nul
va_end(args);
v_p[max-1] = 0; // whether or not vsnprintf() also wrote nul
v_n = (n32) ::strlen((const char*) v_p); // bytes before nul
return *this;
}
yb& yb::operator()(const char* fmt, ...) { // fmt to buf
v_n = 0; // default to empty result
if (!fmt) { ylog(1, "operator()(fmt=nil)"); return *this; }
if (!v_p) { ylog(1, "nil yb::v_p"); return *this; }
n32 max = b_x;
if (!max) { return *this; } // empty; no space for nul
if (1 == max) { *v_p = 0; v_n = 0; return *this; } // empty
va_list args;
va_start(args,fmt);
vsnprintf((char*) v_p, max-1, fmt, args); // save 1 for nul
va_end(args);
v_p[max-1] = 0; // whether or not vsnprintf() also wrote nul
v_n = (n32) ::strlen((const char*)v_p); // bytes before nul
return *this;
}
Clearly operator()() does the same thing as bf(): they both format printf() style input in available space in the buffer, while reserving one trailing octet to hold a zero byte for C string nul termination. A bit of preliminary error checking occurs so casual usage won't have minor errors punished without explanation. Some versions of vsnprintf() return number of bytes before the end nul, making the call to strlen() here unnecessary. But this way always works without change. The bf() method and operator()() aren't necessary, but as shortcuts they're convenient when otherwise one might jump through a few more hoops. And because both return yb ref, which is a subclass of yv, you can write it directly to any yo out stream, like this: u8 tmp[1024]; yb b1k(tmp, 0, 1024); // operator()(): «
yout << b1k("tmpsz=%d", (int) sizeof(tmp)) << yendl << ynow;
... calling operator()() which writes this line on stdout:
tmpsz=1024
Both bf() and operator()() write output at offset zero in the buffer. But why not let bf() target any position inside the buffer? That way you could overlay several writes to get what you want. But vsnprintf() makes this problematic since it always writes an end nul after formatted output. So to avoid nul clobber something complex must be done; yb simply defers that sort of complexity to ybo (shown column right), which allows you to seek any yb point for formatted writes. By using two yb instances, then appending one to another, you can effect gradual accumulation in one buffer from multiple writes, without resorting to ybo in the right column. See bappend() below (cf »).
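The two-buffer accumulation idea can be sketched without thorn at all: format into a scratch area first, then append only the formatted bytes (not the nul) to the destination. Everything named here (Buf, buf_append(), buf_appendf(), demo_accumulate(), and the 256 byte scratch size) is hypothetical and chosen for illustration; the sketch also relies on the C99 return value of vsnprintf(), which older libraries may not provide.

```cpp
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <string>

// Hypothetical buffer: pointer, used length, max capacity.
struct Buf { char* p; uint32_t n; uint32_t x; };

// Append up to len bytes at the current length; true when all of src fit.
inline bool buf_append(Buf& b, const char* src, uint32_t len) {
    uint32_t room = (b.x > b.n) ? (b.x - b.n) : 0;
    uint32_t take = (len < room) ? len : room;
    if (take) { memcpy(b.p + b.n, src, take); b.n += take; }
    return take == len;
}

// Format into a scratch area, then append the result without its nul.
// Returns true only when nothing was truncated in either step.
inline bool buf_appendf(Buf& b, const char* fmt, ...) {
    char scratch[256];                       // arbitrary scratch size
    va_list args;
    va_start(args, fmt);
    int want = vsnprintf(scratch, sizeof(scratch), fmt, args); // C99 semantics
    va_end(args);
    if (want < 0) return false;              // formatting error
    uint32_t made = static_cast<uint32_t>(want);
    bool whole = made < sizeof(scratch);     // did scratch hold all of it?
    uint32_t got = whole ? made : static_cast<uint32_t>(sizeof(scratch) - 1);
    bool fit = buf_append(b, scratch, got);
    return whole && fit;
}

// Two formatted writes accumulate in one destination buffer.
inline std::string demo_accumulate() {
    char space[32];
    Buf b = { space, 0, 32 };
    buf_appendf(b, "x=%d", 7);
    buf_appendf(b, ",y=%d", 42);
    return std::string(b.p, b.n);
}
```

Because the nul lands only in the scratch area, earlier content in the destination is never clobbered, which is the property the paragraph above is after.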
seek
As a convenience when using bappend() in the next section, the yb api includes means to seek the position where append next occurs. This is actually the length of content in a buffer because you only append at eof or end-of-buffer, which is the current length.
// ... yb continued
void bseek(p32 n) { v_n = (n < b_x)? n: b_x; } // set len «
yb& operator=(p32 n) { v_n = (n<b_x)? n: b_x; return *this; }
}; // struct yb
You can either call bseek() to set buffer's end, or assign an integer value to yb using operator=() for the same effect. This use of integer assignment is semantically marginal, but Wil thinks usage is clear, especially when a right hand side of assignment is usually zero (in Wil's code) to make a buffer empty before beginning anew. You might dislike assigning integers to yb to set current length, but Wil's going to do it anyway. The bappend() examples shown next include setting yb length just before appending new content.
append
Each overloaded bappend() method adds a copy of source content to the buffer starting at current length v_n, continuing until either buf space or the source runs out. After N bytes have been added, the new length becomes v_n+N, never exceeding max capacity b_x. Operator << is also overloaded for each version of bappend() taking a single argument, so sources can be appended using buf << src syntax. Declarations and inlines are simple: // ... yb continued
bool bappend(yv const& v); // append v_n bytes from v.v_p «
bool bappend(const char* s); // null-terminated C string
bool bappend(iovec const& x) { yv v(x); return bappend(v); }
bool bappend(ybz const& z) { return this->bappend(z.zv()); }
bool bappend(ydvp const& src); // true if fits
bool bappend(ydvp const& src, u32 limit); // true if fits
bool bappend(ydvpz const& src); // true if fits
}; // struct yb
inline yb& operator<<(yb& b, yv const& x) {
b.bappend(x); return b; }
inline yb& operator<<(yb& b, const char* s) {
b.bappend(s); return b; }
inline yb& operator<<(yb& b, iovec const& x) {
b.bappend(x); return b; }
inline yb& operator<<(yb& b, ybz const& z) {
b.bappend(z); return b; }
inline yb& operator<<(yb& b, ydvp const& x) {
b.bappend(x); return b; }
inline yb& operator<<(yb& b, ydvpz const& x) {
b.bappend(x); return b; }
Appending one contiguous yv run of src.v_n bytes is simplest, adding at most the min() of source size and remaining space: bool yb::bappend(yv const& src) { «
u32 more = (b_x > v_n)? (b_x - v_n): 0;
u32 quantum = (more < src.v_n)? more: src.v_n;
if (v_p && src.v_p && quantum) {
::memcpy(v_p+v_n, src.v_p, quantum);
v_n += quantum;
}
return quantum == src.v_n; // added all of src
}
bool yb::bappend(const char* s) {
if (s) { // can convert to yv?
yv v(s); // yv::yv(const char* cstr)
return this->bappend(v); // append strlen(s) bytes
}
return true; // all of nothing added
}
This next example appends several yv slices generated by yv::vz() (cf «) followed by use of yb::bseek() via operator=() from the last section to shorten length before appending a last yv run. static const char* lo = "abcdefghijklmnopqrstuvwxyz";
yv vlo(lo); // yv::yv(const char* cstr);
u8 tmp[1024]; yb b1k(tmp, 0, 1024);
b1k << vlo(0, 5); // first five "abcde" «
yout << b1k.quote() << yendl << ynow;
b1k << vlo(10, 5); // third five "klmno"
yout << b1k.quote() << yendl << ynow;
b1k = 7; // reduce length to only seven:
yout << b1k.quote() << yendl << ynow;
b1k << vlo(-5, 5); // last five "vwxyz"
yout << b1k.quote() << yendl << ynow;
Output from that appears on stdout as shown below. The first dump shows "abcde" appended to an empty buf, followed by "klmno" appended to that, before shortening length to seven and appending "vwxyz": <yb p=0xbffff6d4 n=5 x=1024 crc='#8587d865:5'>
00000: 61 62 63 64 65 ; abcde
</yb>
<yb p=0xbffff6d4 n=10 x=1024 crc='#2ff09329:10'>
00000: 61 62 63 64 65 6b 6c 6d 6e 6f ; abcdeklmno
</yb>
<yb p=0xbffff6d4 n=7 x=1024 crc='#1356cd63:7'>
00000: 61 62 63 64 65 6b 6c ; abcdekl
</yb>
<yb p=0xbffff6d4 n=12 x=1024 crc='#1fc31f4:12'>
00000: 61 62 63 64 65 6b 6c 76 77 78 79 7a ; abcdeklvwxyz
</yb>
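The append and seek steps traced above can be reproduced with a few lines of standalone code. This is only a sketch: Buf, append(), seek(), and demo_append() are hypothetical names, and plain pointer arithmetic stands in for the yv slices used in the original example.

```cpp
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <string>

// Hypothetical buffer: pointer, used length, max capacity.
struct Buf { char* p; uint32_t n; uint32_t x; };

// Copy at most min(len, remaining space) bytes; true when all of src fit.
inline bool append(Buf& b, const char* src, uint32_t len) {
    uint32_t more = (b.x > b.n) ? (b.x - b.n) : 0;
    uint32_t quantum = (more < len) ? more : len;
    if (b.p && src && quantum) {
        memcpy(b.p + b.n, src, quantum);
        b.n += quantum;
    }
    return quantum == len;
}

// Set the length (the next append position), clamped to capacity.
inline void seek(Buf& b, uint32_t n) { b.n = (n < b.x) ? n : b.x; }

// Same sequence as the example: append, append, shorten to seven, append.
inline std::string demo_append() {
    const char* lo = "abcdefghijklmnopqrstuvwxyz";
    char space[1024];
    Buf b = { space, 0, 1024 };
    append(b, lo + 0, 5);    // "abcde"
    append(b, lo + 10, 5);   // "klmno"
    seek(b, 7);              // keep only "abcdekl"
    append(b, lo + 21, 5);   // "vwxyz"
    assert(b.n <= b.x);
    return std::string(b.p, b.n);
}
```

The final content matches the last dump above, twelve bytes reading abcdeklvwxyz.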
Appending an iovec vector is only slightly more complex, involving the same calculation as above, but N times instead of once. You can see an example call to bappend() using operator<<() in the iovec demo (cf «) focusing on the api of ydvp, which is the source here.
bool yb::bappend(ydvp const& src) { «
bool entire = true;
if ( v_n > b_x ) { // length over capacity?
ylog(1, "logical v_n=%d > physical b_x=%d",
(int) v_n, (int) b_x);
v_n = b_x; // force rational values
}
u8* bp = v_p + v_n; // buf write cursor
n32 sz = b_x - v_n; // available space
ydvp ivp(src, 0); // mutable copy at index zero
for ( ; ivp; ++ivp) { // more to do?
const iovec& v = *ivp; // iter's next iovec
n32 n = v.iov_len;
if (n > sz) { // exceeds space left?
n = sz; // only what can fit
entire = false; // insufficient room
if (0 == sz) // no room left? all done?
break;
}
if (n && v.iov_base) { // act on this iovec?
::memcpy(bp, v.iov_base, n);
bp += n; // skip written bytes
sz -= n; // less buf space
}
}
v_n = (n32) (bp - v_p); // used buf bytes
return entire; // true if all of request fit
}
The following bappend() variant limits the max number of bytes to be accepted from source ydvp: bool yb::bappend(ydvp const& src, u32 limit) { «
bool entire = true;
if ( v_n > b_x ) { // length over capacity?
ylog(1, "logical v_n=%d > physical b_x=%d",
(int) v_n, (int) b_x);
v_n = b_x; // force rational values
}
u8* bp = v_p + v_n; // buf write cursor
n32 sz = b_x - v_n; // available space
ydvp ivp(src, 0); // mutable copy at index zero
for ( ; limit && ivp; ++ivp) { // more to do?
const iovec& v = *ivp; // iter's next iovec
n32 n = v.iov_len;
if (n > limit) // iovec exceeds limit left?
n = limit; // max is requested limit
if (n > sz) { // exceeds space left?
n = sz; // only what can fit
entire = false; // insufficient room
if (0 == sz) // no room left? all done?
break;
}
if (n && v.iov_base) { // act on this iovec?
::memcpy(bp, v.iov_base, n);
bp += n; // skip written bytes
sz -= n; // less buf space
limit -= n; // closer to limit
}
}
v_n = (n32) (bp - v_p); // used buf bytes
return entire; // true if all of request fit
}
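For readers without the ydvp iterator at hand, the same loop can be sketched over a plain POSIX iovec array. The names Buf, append_iovecs(), and demo_iov() are hypothetical; struct iovec from <sys/uio.h> is the only real api used, and the return convention (true when nothing was cut short by lack of space) mirrors the listings above.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <string>
#include <sys/uio.h>

// Hypothetical buffer: pointer, used length, max capacity.
struct Buf { char* p; uint32_t n; uint32_t x; };

// Append bytes from each iovec in turn, stopping at 'limit' bytes or when
// space runs out. Returns false only when space was insufficient.
inline bool append_iovecs(Buf& b, const iovec* vec, size_t count, uint32_t limit) {
    bool entire = true;
    if (b.n > b.x) b.n = b.x;                 // force rational values
    uint32_t room = b.x - b.n;                // available space
    for (size_t i = 0; i < count && limit; ++i) {
        uint32_t n = static_cast<uint32_t>(vec[i].iov_len);
        if (n > limit) n = limit;             // respect requested limit
        if (n > room) {                       // exceeds space left?
            n = room;
            entire = false;
            if (!room) break;
        }
        if (n && vec[i].iov_base) {
            memcpy(b.p + b.n, vec[i].iov_base, n);
            b.n += n; room -= n; limit -= n;
        }
    }
    return entire;
}

// Three pieces "abc", "defg", "hi" appended into 'cap' bytes (cap <= 64).
inline std::string demo_iov(uint32_t cap, uint32_t limit, bool* entire) {
    char a[] = "abc", d[] = "defg", h[] = "hi";
    iovec v[3];
    v[0].iov_base = a; v[0].iov_len = 3;
    v[1].iov_base = d; v[1].iov_len = 4;
    v[2].iov_base = h; v[2].iov_len = 2;
    char space[64];
    assert(cap <= sizeof(space));
    Buf b = { space, 0, cap };
    *entire = append_iovecs(b, v, 3, limit);
    return std::string(b.p, b.n);
}
```

With eight bytes of capacity and a generous limit, the nine source bytes are cut to eight and the call reports false; with a limit of five, only five bytes are taken and the call reports true because space was never the constraint.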
The most complex bappend() (shown below) takes ydvpz as a source — a slice of a ydvp iovec vector. This requires generating a new, temp, stack-based ydvp instance from the slice, using the same sort of approach shown in the iovec demo (cf «), involving several helper methods to choose an iovec vector size (cf «), and involving several helper ydvp subclasses (cf «) with statically defined lengths of iovec vector. bool yb::bappend(ydvpz const& pz) { // true if fits «
n32 n = pz.z_dvp.psize(); // max iovecs needed for pcut()
if (n <= 256) // at most 256 iovecs needed?
return b256_ydvpz(*this, pz);
else if (n <= 1024) // at most 1K iovecs needed?
return b1K_ydvpz(*this, pz);
else if (n <= 4096) // at most 4K iovecs needed?
return b4K_ydvpz(*this, pz);
else
return b16K_ydvpz(*this, pz); // 16K iovecs is max
}
The top level bappend() for ydvpz checks the starting iovec vector length (because the final subset can't be longer than this) then calls one of the helper methods below to perform the actual append after making a new stack-based ydvp subclass whose constructor converts the input ydvpz slice into a simple ydvp of the desired size. static bool b256_ydvpz(yb& b, ydvpz const& pz) { «
yd256vp iter(pz); // does iter.pcut();
return b.bappend(iter); // append at most 256 iovecs
}
static bool b1K_ydvpz(yb& b, ydvpz const& pz) {
yd1024vp iter(pz); // does iter.pcut();
return b.bappend(iter); // append at most 1024 iovecs
}
static bool b4K_ydvpz(yb& b, ydvpz const& pz) {
yd4096vp iter(pz); // does iter.pcut();
return b.bappend(iter); // append at most 4096 iovecs
}
static bool b16K_ydvpz(yb& b, ydvpz const& pz) {
long psz = pz.z_dvp.psize();
if (psz > 16 * 1024) // over capacity? (maybe assert?)
ylog(0, "b16K_ydvpz() psize=%ld over 16K max", psz);
yd16Kvp iter(pz); // does iter.pcut();
return b.bappend(iter);
}
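The tiering above (pick the smallest statically sized stack vector that can hold the request) can also be expressed with a template parameter for the capacity. The sketch below is not how thorn does it; sum_tier(), sum_tiered(), and demo_tier() are hypothetical, and summing integers merely stands in for whatever work needs the temporary space. Clipping an oversized request to the largest tier is an assumption of this sketch.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <vector>

// Work done with N slots of stack space; returns the tier actually used.
template <size_t N>
size_t sum_tier(const uint32_t* src, size_t count, uint32_t* total) {
    assert(count <= N);
    uint32_t scratch[N];                       // stack space sized by tier
    if (count) memcpy(scratch, src, count * sizeof(uint32_t));
    uint32_t sum = 0;
    for (size_t i = 0; i < count; ++i) sum += scratch[i];
    *total = sum;
    return N;
}

// Choose the smallest tier that holds 'count' items; 16K is the max.
inline size_t sum_tiered(const uint32_t* src, size_t count, uint32_t* total) {
    if (count <= 256)  return sum_tier<256>(src, count, total);
    if (count <= 1024) return sum_tier<1024>(src, count, total);
    if (count <= 4096) return sum_tier<4096>(src, count, total);
    if (count > 16384) count = 16384;          // clip to the largest tier
    return sum_tier<16384>(src, count, total);
}

// Builds 'count' ones and reports which tier handled them.
inline size_t demo_tier(size_t count, uint32_t* total) {
    std::vector<uint32_t> ones(count, 1);
    return sum_tiered(ones.empty() ? 0 : &ones[0], count, total);
}
```

Small requests thus pay for only a small stack frame, while the largest tier is reserved for the rare big request.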
Let's pick one of those ydvp subclasses above not shown previously and show its definition in ydv.h. (This would be part of the iovec demo, but now it clarifies this buf demo.)
class yd1024vp : public ydvp { // ydvp w/ built-in 1K iovecs «
public: // sample ydvp with in-built space for use in tests
enum { pe_max = 1024 }; // arbitrary built-in iovecs
iovec p_iovec[ pe_max ]; // space for p_v in ydvp
yd1024vp() : ydvp(p_iovec, /*len*/ 0, /*max*/ pe_max) { }
yd1024vp(yo& o, ydvpz const& pz, bool quo)
: ydvp(p_iovec, 0, pe_max) { this->prender(o, pz, quo); }
yd1024vp(ydvpz const& pz) // used by b1K_ydvpz() «
: ydvp(p_iovec, 0, pe_max) { this->pcut(pz); }
~yd1024vp() { }
ydvp pbegin() const { return ydvp(p_v, p_n); } // new iter
y0x* pend() const { return (y0x*) 0; }
}; // class yd1024vp
The constructor used by b1K_ydvpz() is shown isolated by white space, calling ydvp::pcut() shown in the iovec demo as pcut().
compare
Method bdiff() below is used to find the difference between two contiguous yv runs. Wil really only uses bdiff() in unit tests expecting two yv runs to be identical: when this expectation is false Wil wants to find and debug print only what differs so the cause can be analyzed. So the definition of bdiff() is driven by this purpose: it's a binary diff finding first and last differing bytes in two yv runs expected equal under yv::veq() (cf «). It also counts total differences.
As output, yb becomes a description of the subset of arg one that covers the difference of both runs. So if yb started out referring to other space before bdiff(), it doesn't any more afterward. So this is one of those methods which does not operate on space described by current yb state. Instead bdiff() is really a kind of re-initialization of yb to describe an intersection of two partial differences.
Unequal length also makes yv::veq() fail to return equal; so even if no differing bytes are seen inside, bdiff() counts unequal length as a difference, using b_x to show the position in one where the shorter run was exhausted. (Thus bdiff() is a yb not yv method.) In some unit tests that fail, the only difference might be length.
n32 yb::bdiff(const yv& one, const yv& two) { «
// Returns ndiff count of bytes actually different.
// Buf span covers first to last actual differences.
n32 ndiff = 0; // total count of different bytes
u8* first = 0; // where first byte varied in one
u8* last = 0; // where last byte varied in one
u8* xp = 0; // end in one of shorter of one and two
u8* p = (u8*) one.v_p;
n32 n = one.v_n;
n32 ntwo = two.v_n;
u8* ptwo = (u8*) two.v_p;
if ( n && ntwo && p && ptwo ) { // neither is empty?
register int a; register int b; // byte from either
++n; ++ntwo; // prepare both for predecrement
while (--n) { // more remains in one?
if (!--ntwo) { // two done first? one is greater?
xp = p; // spot in one where two ran out first
break;
}
b = (u8) *ptwo++; // two's next octet
a = (u8) *p++; // one's next octet
if ( a != b ) { // another difference?
++ndiff; // count each actual diff
last = p - 1; // where last diff was seen
if (!first) // first seen difference?
first = last;
}
}
if (ntwo != 1) { // one and two did NOT end together?
xp = p; // location in one where one ran out first
}
// (else if ntwo == 1, two and one were same length)
}
else if (n || ntwo) { // either one or two is nonempty?
xp = p; // different at start of one
}
if (first) { // at least one byte was different?
v_p = first; // first byte in one that differed
v_n = (last+1) - first; // length to cover last diff
b_x = (xp)? (xp - v_p): v_n;
}
else if (xp) { // lengths differ?
ndiff = 1; // only difference was in length
v_p = xp; // zero length diff at end of shorter
v_n = 0; b_x = 0; // b_x = xp - v_p;
}
else {
v_p = 0; v_n = 0; b_x = 0;
}
return ndiff; // actual diff count (or 1 when v_n == 0)
}
For a real use case applying bdiff(), see the compare demo. In practice when any difference is seen, Wil also slices the second run in the same way and prints that too to support a full manual comparison. But that isn't illustrated below. The next short example just forces a difference in the first run, after first generating a zero sized difference (since that's a correct result from unit tests expecting perfect equality). static const char* lo = "abcdefghijklmnopqrstuvwxyz";
yv vlo(lo); // yv::yv(const char* cstr);
u8 t1[1024]; yb b1k(t1, 0, 1024); // 1st buf
b1k << vlo; // copy of lowercase vlo
u8 t2[2048]; yb b2k(t2, 0, 2048); // 2nd buf
b2k << vlo; // identical vlo copy
yb bcmp; // buf to receive compare diff
n32 ndiffs = bcmp.bdiff(b1k, b2k); «
yout.of("# ndiffs=%d (should be zero)", (int) ndiffs);
yout << yendl << bcmp.quote() << yendl << ynow;
t1[4] = toupper(t1[4]);
t1[9] = toupper(t1[9]);
ndiffs = bcmp.bdiff(b1k, b2k);
yout.of("# ndiffs=%d (nonempty)", (int) ndiffs);
yout << yendl << bcmp.quote() << yendl << ynow;
The code above writes the following on stdout. As expected, the count of differing bytes is two — since only two were changed. # ndiffs=0 (should be zero)
<yb p=0 n=0 x=0 crc='#0:0'/>
# ndiffs=2 (nonempty)
<yb p=0xbffff6dc n=6 x=6 crc='#3a2695d3:6'>
00000: 45 66 67 68 69 4a ; EfghiJ
</yb>
Debug printing the result of bdiff() shows the byte range in the first yv run argument that covers the range where differing bytes were found. In this case, since we made changes in the first buffer, we see the span just covers both uppercased bytes and all between.
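The core of the comparison can be restated with indexes instead of pointers, which some readers may find easier to follow. This is a simplified sketch, not bdiff() itself: Diff, byte_diff(), and demo_diff() are hypothetical names, and the result reports offsets into the first run rather than re-initializing a buffer.

```cpp
#include <assert.h>
#include <stddef.h>

// Hypothetical result: how many bytes differ and the span covering them.
struct Diff {
    size_t count;         // differing bytes (or 1 when only length differs)
    size_t first;         // offset of first differing byte in 'a'
    size_t last;          // offset of last differing byte in 'a'
    bool length_differs;  // true when the runs have unequal length
};

// Compare the common prefix byte by byte, then account for length.
inline Diff byte_diff(const char* a, size_t an, const char* b, size_t bn) {
    Diff d = { 0, 0, 0, an != bn };
    size_t common = (an < bn) ? an : bn;
    for (size_t i = 0; i < common; ++i) {
        if (a[i] != b[i]) {
            if (!d.count) d.first = i;   // remember first difference
            d.last = i;                  // and the most recent one
            ++d.count;
        }
    }
    if (!d.count && d.length_differs)
        d.count = 1;                     // length alone counts as one diff
    return d;
}

// Same scenario as the example: uppercase bytes 4 and 9 of the first run.
inline Diff demo_diff() {
    char one[] = "abcdefghijklmnopqrstuvwxyz";
    char two[] = "abcdefghijklmnopqrstuvwxyz";
    one[4] = 'E';
    one[9] = 'J';
    return byte_diff(one, 26, two, 26);
}
```

For the uppercased example this yields two differences spanning offsets 4 through 9, a span of six bytes, which agrees with the n=6 dump above.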
remarks
"Gee, that's a lot to take in at one time," sighed Stu.
"It was the give and take methods in the left column, wasn't it?" prompted Wil. "Did I get carried away?"
"And the stream class too," observed Stu. "Why not just stick to buffers alone here?"
"Well," Wil scratched his nose, "the out demo would have been too long with ybo there too. It has to go somewhere; at least here it clarifies yb a little."
"Do you actually like writing this stuff?" asked Stu.
"Actually, no," admitted Wil. "I'd rather not write any of these demos. But as I explained elsewhere, if I don't, my code won't make sense to anyone. Wouldn't you agree?"
"Mmmm, yeah I suppose," agreed Stu. "Now you just need to lighten up the license. Why not BSD? Stingy."
ybo
This right column is devoted to out stream ybo subclassing yo from the near future out demo, writing into a fixed sized space described by a yb buffer instance. (Actually ybo also writes into yv and/or iovec just as easily by synthesizing an internal yb instance to describe them.) You might better read the out demo first, coming after this one. But material below is presented as if the base yo api doesn't matter. So the way in which ybo resembles other yo subclasses won't be emphasized. However, parts of yo and yb are semantically very similar because they both focus on managing writes to a single current buffer at a time. So as a result, the ybo class relating the two ought to introduce the idea of yo out streams without much explanation of yo itself. But let's start with yo state. Subclasses of yo all manage member vars of yo any way they like; the members shown below are declared in yo only to make them standard. In particular, the first triple o_0, o_p, and o_x are used in yo inlines, so they must be present in all yo subclasses. As shown in a diagram below, this triple corresponds very closely to the triple v_p, v_n, and b_x in yb (except the yo triple is all pointers). class yo : public yo0 { // buffered out/sink stream «
protected:
u8* o_0; // out origin; first usable byte in buf
u8* o_p; // out ptr; this must be true: o_0 <= o_p <= o_x
u8* o_x; // out max; one byte beyond end of buf
mutable int o_e; // zero or some error status
yb o_take; // copy of buf returned by last take
u16 o_tab; // current indent depth
u16 o_pad; // not yet used
public:
int oerr() const { return o_e; }
void ofail(int e) { o_e = e; }
yo(): o_0(0), o_p(0), o_x(0), o_e(0), o_take(0,0,0), o_tab(0) { }
virtual ~yo(); // defined in yo.cpp
// ... continued
};
yo::~yo() { o_0 = o_p = o_x = 0; } // destructor in yo.cpp «
(Note pure virtual base class yo0 only declares pure virtual methods, so it won't otherwise be noted on this page; ignore it.) To understand ybo code below, you only need to know all this state is zeroed by yo::yo(), and yo::~yo() zeroes the main pointer triple. It's up to subclasses to give these member vars other meaningful values. Triple (o_0, o_p, o_x) is safe defaulting to all zero because this means "empty buffer" with nothing inside. (Note o_p is never dereferenced when o_p >= o_x; and when o_x is zero, o_p cannot be less: passive but effective magic.) Subclass ybo makes the following relation true: |<------------- b_x ----------->|
|<------- v_n ----->|
v_p ->|
+---+---+---+---+---+---+---+---+
| a b c d e f g h | buffer of eight bytes
+---+---+---+---+---+---+---+---+
| | |
^o_0 ^o_p ^o_x
This diagram shows a buffer of only eight bytes holding specific values so you can imagine this as concretely as possible in a simple case. So in fact, you can make the diagram above manifest in code by doing the following. u8 nine[9]; yv v9(nine, 9); // all of nine «
yb b9(nine, 0, 9); // buf for all of nine
b9 << "abcdefghi"; // copy into 9 byte buf
yb b8(nine, 0, 8); // buf for first eight
b8.v_n = 5; // change length to five
ybo o8(&b8); // out stream writing to b8
o8.oseek(b8.v_n); // seek current buf len
yout << "# vec v9:" << yendl << v9.quote() << yendl;
yout << "# buf b8:" << yendl << b8.quote() << yendl;
yout << "# out o8:" << yendl << o8.quote() << yendl;
yout << ynow; // flush
Though this is easy to do, we'll return to the original simple diagram again later since it summarizes with less detail. (This code sample maximizes detail in a concrete version so you can ground your model, but it's not clear.) The code just above writes this on stdout: # vec v9:
<yv p=0xbffffaec n=9 crc='0x8da988af:9'>
00000: 61 62 63 64 65 66 67 68 69 ; abcdefghi
</yv>
# buf b8:
<yb p=0xbffffaec n=5 x=8 crc='#8587d865:5'>
00000: 61 62 63 64 65 ; abcde
</yb>
# out o8:
<yo o0=bffffaec op=bffffaf1 ox=bffffaf4 oe=0 tab=0>
<!-- p-0=5 x-p=3 -->
<0:p p=0xbffffaec n=5 crc='0x8587d865:5'>
00000: 61 62 63 64 65 ; abcde
</0:p>
</yo>
This output aims to show you the earlier ascii diagram was achieved, so that 1) the entire v9 run still contains all nine bytes of "abcdefghi", but 2) buf b8 currently sees only five as content of eight at most, and 3) out stream o8 sees the same memory, position, and capacity as buf b8. As long as we've gone this far, instead of starting right away with constructing ybo to reach this state, let's see what happens when we write a few more bytes to o8 as shown above, since its purpose is clearly to prevent writes beyond the end of space in b8. This next fragment writes six more bytes (three too many) then flushes to bring length in b8 up-to-date, before debug printing all those objects one more time. Method boplus() returns a count of extra bytes written that couldn't be stored due to limited space.
yout << "# writing 'zyxwvu' to out" << yendl;
o8 << "zyxwvu";
o8.oflush(); // bring b8.v_n up to date
yout << "# vec v9:" << yendl << v9.quote() << yendl;
yout << "# buf b8:" << yendl << b8.quote() << yendl;
yout << "# out o8:" << yendl << o8.quote() << yendl;
n32 ignored = o8.boplus(); // did not fit
yout.ofn("# bytes written but ignored=%d", ignored);
yout << ynow; // flush
Without the flush, length v_n in b8 would not be brought up-to-date with state in the o8 out stream. The code above writes this on stdout: # writing 'zyxwvu' to out
# vec v9:
<yv p=0xbffffaec n=9 crc='0xcb1cfca3:9'>
00000: 61 62 63 64 65 7a 79 78 69 ; abcdezyxi
</yv>
# buf b8:
<yb p=0xbffffaec n=8 x=8 crc='#72255fff:8'>
00000: 61 62 63 64 65 7a 79 78 ; abcdezyx
</yb>
# out o8:
<yo o0=bffffaec op=bffffaf4 ox=bffffaf4 oe=0 tab=0>
<!-- p-0=8 x-p=0 -->
<0:p p=0xbffffaec n=8 crc='0x72255fff:8'>
00000: 61 62 63 64 65 7a 79 78 ; abcdezyx
</0:p>
</yo>
# bytes written but ignored=3
Here you see run v9 and buf b8 both see three new bytes "zyx" starting after the first five bytes. And v9 shows the first byte after buf b8 still contains the original 'i' put there at the start. So o8 successfully stops an overwrite past the end of b8. (This was the only reason v9 has nine bytes.)
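The overrun protection just demonstrated comes down to clipping each write at the end pointer and counting what was dropped. The sketch below shows that idea in isolation; BoundedOut and demo_bounded() are hypothetical names, the member names merely echo the o_0, o_p, o_x triple, and none of the virtual yo machinery is modeled.

```cpp
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <string>

// Hypothetical bounded writer over a fixed region of memory.
struct BoundedOut {
    char* o_0;      // origin: first usable byte
    char* o_p;      // current write position, o_0 <= o_p <= o_x
    char* o_x;      // one byte beyond the end of usable space
    uint32_t plus;  // bytes written that did not fit

    BoundedOut(char* base, uint32_t pos, uint32_t cap)
        : o_0(base), o_p(base + pos), o_x(base + cap), plus(0) {
        assert(pos <= cap);
    }
    // Store what fits, count the rest as excess.
    void write(const void* src, uint32_t n) {
        uint32_t room = static_cast<uint32_t>(o_x - o_p);
        uint32_t fit = (n < room) ? n : room;
        if (fit) { memcpy(o_p, src, fit); o_p += fit; }
        plus += n - fit;
    }
    uint32_t pos() const { return static_cast<uint32_t>(o_p - o_0); }
};

// Nine bytes of memory, writer limited to the first eight, starting at five.
inline std::string demo_bounded(uint32_t* plus) {
    char nine[9];
    memcpy(nine, "abcdefghi", 9);
    BoundedOut out(nine, 5, 8);
    out.write("zyxwvu", 6);        // only "zyx" fits
    assert(out.pos() == 8);
    *plus = out.plus;
    return std::string(nine, 9);
}
```

The ninth byte stays 'i' and the excess count is three, the same outcome as the ybo run above.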
ybo api
Before code for ybo, let's show the declaration first: class ybo : public yo { // out stream writing to yb «
//protected:
// u8* o_0; // out origin; first usable byte in buf
// u8* o_p; // out ptr; this must be true: o_0 <= o_p <= o_x
// u8* o_x; // out max; one byte beyond end of buf
// mutable int o_e; // zero or some error status
// yb o_take; // copy of buf returned by last take
// u16 o_tab; // current indent depth
// u16 o_pad; // not yet used
protected:
yb bo_b0; // in case one is needed for yv or iovec outputs
yb* bo_b; // the yb actually written
n32 bo_plus; // bytes in excess of bo_b.b_x written
void bo_init(const yv& v); // init inherited o_0, o_p, o_x
public:
ybo();
void boclear(); // clear: put ybo back to state post construct
ybo(const iovec& iov); // bo_b0(base, 0, len); bo_b=&bo_b0
ybo(const yv& v); // bo_b0(v.v_p, 0, v.v_n); bo_b=&bo_b0
ybo(yb& b); // copy of buf b: (bo_b0=b; bo_b=&bo_b0)
ybo(yb* b); // write into THIS yb instance b: (bo_b=b)
n32 bolen() const { return bo_b->v_n; } // bytes in buffer
n32 boplus() const { return bo_plus; } // too many bytes written
// NOTE: must oflush() before bo_b->v_n is up-to-date
const yv& bov() const { return *bo_b; } // must oflush() 1st
yb& bob() { return *bo_b; } // oflush() before bo_b->v_n okay
const yb& bob() const { return *bo_b; } // must oflush() 1st
protected: // only seen by subclasses; public api uses oc() only
virtual void _oc(int c); // abstract fallback for yo::oc()
public: // required virtual methods for yo subclasses
virtual ~ybo();
virtual p32 opos() const; // current byte position
virtual p32 olen(); // size in bytes
virtual int owrite(const void* src, n32 n); // neg on err
virtual int oflush(); // neg on err
virtual int oseek(p32 p); // might not be supported in subclass
virtual int opwrite(const void* src, n32 n, p32 pos); // seeks
public: // bulk i/o
virtual int otake(yb& dest); // reserve dest.b_x buffer bytes
virtual int ogive(yb const& src);
}; // class ybo
Additional state members are minimal: bo_b points to the buf updated by ybo, and bo_plus counts written bytes that didn't fit. Buf bo_b0 is only used when a pointer to yb is not passed to the constructor; in that case bo_b0 is the yb buffer updated as writes and flushes occur. This supports writing to yv and iovec state, or a copy of an existing immutable yb. Note that inlines bov() and bob() return the actual run or buf involved, whether ybo refers to or contains the target yb. Note also the absence of debug printing methods: ybo inherits the printing behavior of yo shown in the out demo (so look there).
ybo init
Let's show the diagram one more time:
      |<------------- b_x ----------->|
      |<------- v_n ----->|
v_p ->|
      +---+---+---+---+---+---+---+---+
      | a   b   c   d   e   f   g   h | buffer of eight bytes
      +---+---+---+---+---+---+---+---+
      |                   |           |
      ^o_0                ^o_p        ^o_x
Constructor code below prepares state similar to the diagram shown above. But o_p starts at origin o_0, so the diagram above only occurs after construction, once the current write pointer o_p has been moved by writing or seeking. The starting v_n length of an input buffer has an important effect that's hard to see in the source code, because the effect is actually one that's absent: the v_n length is never set to a smaller value than it currently has. So ybo will never make a buffer have less content in it than it had at the start; it can only make the content longer. This allows you to use ybo to update existing buffer content without changing current size if you wish.
ybo::ybo(yb* b) // modifies this yb on oflush() «
: yo(), bo_b0(b->v_p, 0, b->b_x), bo_b(b), bo_plus(0) {
yv v(b->v_p, b->b_x); this->bo_init(v);
}
This is the only ybo constructor that aliases (refers to) a yb outside itself instead of having bo_b point at internal bo_b0. By passing a pointer to this constructor, you're saying you want ybo to write to this particular yb buffer, so flushing updates length v_n of that buffer. Though bo_b never points at it in this case, bo_b0 is initialized as a copy of the input yb (excepting length v_n) to no purpose except consistency with other constructors. Since bo_b doesn't point at it, bo_b0 doesn't matter.
void ybo::bo_init(const yv& v) { «
o_take.bclear(); o_tab = 0; o_e = 0;
if (v.v_p && v.v_n) {
o_0 = o_p = v.v_p;
o_x = v.v_p + v.v_n;
}
else { o_0 = o_p = o_x = 0; }
}
Except the empty constructor, all ybo constructors call bo_init(). The main reason to do this, even when unnecessary, is to show they all have a similar effect: given space to write, o_0 points at the first byte and o_x points one beyond the last buffer byte. In every case o_p is set equal to o_0; if a buffer is already partially full, that's ignored by ybo under a theory you might intend starting anew. You can always call oseek() afterward to set a write position different from the buffer's origin.
ybo::ybo() : yo(), bo_b0(0,0,0), bo_b(&bo_b0), bo_plus(0) { «
o_0 = o_p = o_x = 0; o_take.bclear(); o_tab = 0;
}
ybo::ybo(const iovec& iov) «
: yo(), bo_b0(iov.iov_base,0,iov.iov_len)
, bo_b(&bo_b0), bo_plus(0) {
yv v(iov.iov_base, iov.iov_len); this->bo_init(v);
}
ybo::ybo(const yv& v) : yo(), bo_b0(v.v_p, 0, v.v_n) «
, bo_b(&bo_b0), bo_plus(0) {
this->bo_init(v);
}
ybo::ybo(yb& b) : yo(), bo_b0(b.v_p, 0, b.b_x) «
, bo_b(&bo_b0) , bo_plus(0) {
yv v(b.v_p, b.b_x); this->bo_init(v);
}
The empty constructor isn't currently useful. (A later rev might re-add methods to re-init after construction.) The other three constructors take a reference to writable space, make a description of it in buffer bo_b0, and track future writing progress there. You must use bob() to get the end result. None of these constructors modify immutable input args. Nor does ybo ever change input args later. (Only the constructor taking yb* updates its input.) However, note the space pointed at by the input args is not immutable, even though the input objects are. As ybo writes, it modifies any memory given. So you're obliged to pass ybo constructors only actual circumscribed space you intend ybo to write.
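To see the difference between the copying and aliasing constructors in isolation, here's a minimal sketch. It does not use the real classes: DemoBuf and DemoOut are hypothetical stand-ins invented for this illustration, keeping only a pointer, a length, and a capacity, and assuming flush only ever raises the length as described for ybo.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for yb: pointer, logical length, max capacity.
struct DemoBuf {
    unsigned char* p;   // like v_p
    std::size_t    n;   // like v_n (logical length)
    std::size_t    x;   // like b_x (physical capacity)
};

// Hypothetical stand-in for ybo, reduced to what the constructors set up.
class DemoOut {
    DemoBuf  own_;       // like bo_b0: private description of the space
    DemoBuf* target_;    // like bo_b: the buf whose length flush updates
    unsigned char* o_0;  // origin
    unsigned char* o_p;  // write pointer
    unsigned char* o_x;  // one past the end
    void init(unsigned char* p, std::size_t cap) {
        o_0 = o_p = p;
        o_x = p ? p + cap : p;
    }
public:
    DemoOut(const DemoOut&) = delete;  // target_ may point into own_
    // Like ybo(yb& b): same memory, but length is tracked privately.
    explicit DemoOut(const DemoBuf& b) : own_{b.p, 0, b.x}, target_(&own_) {
        init(b.p, b.x);
    }
    // Like ybo(yb* b): alias the caller's buf, so flush updates *b.
    explicit DemoOut(DemoBuf* b) : own_{b->p, 0, b->x}, target_(b) {
        init(b->p, b->x);
    }
    std::size_t write(const void* src, std::size_t len) {
        std::size_t room = static_cast<std::size_t>(o_x - o_p);
        if (len > room) len = room;               // cut to fit
        if (src && len) { std::memcpy(o_p, src, len); o_p += len; }
        return len;
    }
    void flush() {                                // only ever raises length
        std::size_t top = static_cast<std::size_t>(o_p - o_0);
        if (target_->n < top) target_->n = top;
    }
    const DemoBuf& result() const { return *target_; }  // like bob()
};

// Returns true when both constructors behave as the text describes.
bool demo_alias_vs_copy() {
    unsigned char space[8] = {0};
    DemoBuf caller = {space, 0, sizeof space};

    DemoOut copy(caller);            // private description of caller's space
    copy.write("abc", 3);
    copy.flush();
    bool copy_ok = caller.n == 0 && copy.result().n == 3;

    DemoOut alias(&caller);          // writes into THIS buf
    alias.write("xyzzy", 5);
    alias.flush();
    bool alias_ok = caller.n == 5 && &alias.result() == &caller;

    // Both streams wrote the same memory, starting from the origin.
    return copy_ok && alias_ok && std::memcmp(space, "xyzzy", 5) == 0;
}
```

Calling demo_alias_vs_copy() from a test or main() should return true. Note the memory is written either way; only where the resulting length lands differs.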
clear
Method boclear() reverts to the state right after construction, except length v_n of the buffer is not modified when bo_b points at an external yb (not at bo_b0). You must decide yourself how much previous content to keep if an external yb instance is being written.
void ybo::boclear() { // reset: original post-construct state «
if ( bo_b ) { // not destroyed?
o_take.bclear(); o_tab = 0; o_e = 0;
u8* p = bo_b->v_p;
bo_b0.v_n = 0; // always okay to zero length of bo_b0
// constructors ignore bo_b->v_n length, so we do too:
// bo_b->v_n = 0; // do NOT change buffer length
if ( p ) {
o_0 = o_p = p;
o_x = p + bo_b->b_x;
}
else { o_0 = o_p = o_x = 0; }
}
}
The destructor is a more permanent kind of clear which zeroes most state including the bo_b address, making ybo unusable.
/*virtual*/ ybo::~ybo() { «
yv v(0,0); this->bo_init(v); bo_b0.bclear(); bo_b = 0;
}
flush
Flushing ybo updates the v_n length of the buffer being written, but only if pointer o_p now sits farther above origin o_0 than any length previously known. Flush only increases a length highwater mark.
/*virtual*/ int ybo::oflush() { // neg on err «
if (bo_b) { // not destroyed?
n32 top = o_p - o_0; // distance moved from origin
if (bo_b->v_n < top) // have not gotten this yet?
bo_b->v_n = top; // new high water mark
// backward seeks can make o_p-o_0 less than old highwater,
// so DO NOT change bo_b->v_n unless to make it larger.
}
return 0; // no error
}
It should be okay to seek a new position lower than the current highwater mark for length without interfering with memory of largest length seen. Note getting current length of ybo with olen() also does a flush since the highwater length is only known as a result of flushing. Knowing this, other ybo methods might use olen() as a synonym for oflush().
/*virtual*/ p32 ybo::olen() { // flush and then return length «
if (bo_b) { // not destroyed? do inline flush first?
p32 top = o_p - o_0; // distance moved from origin
if (bo_b->v_n < top) // have not gotten this yet?
bo_b->v_n = top; // new high water mark
return bo_b->v_n; // max(top, oldLen)
}
return 0;
}
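The only accurate internal calculation of length is the one performed by olen(), so opwrite() further below uses olen() to find eof. Note oseek() as listed does not flush first, so call oflush() or olen() before seeking backward if bytes written since the last flush should count toward length.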
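The highwater rule is easy to state and easy to get backwards, so here's a tiny sketch of it with plain integers. HighWater is a hypothetical struct made up for this example (not part of ybo); its three fields stand for o_p - o_0, bo_b->v_n, and o_x - o_0.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical miniature of the flush rule: a cursor plus a length that
// only ever grows. Field names are illustrative, not the ybo members.
struct HighWater {
    std::size_t cursor;  // like o_p - o_0
    std::size_t length;  // like bo_b->v_n
    std::size_t cap;     // like o_x - o_0

    void advance(std::size_t k) {             // pretend k bytes were written
        cursor = (k > cap - cursor) ? cap : cursor + k;
    }
    std::size_t flush() {                     // like olen(): flush, then report
        if (length < cursor) length = cursor; // raise, never lower
        return length;
    }
};

// Traces a buffer that arrives with three bytes of existing content.
std::size_t demo_high_water() {
    HighWater h = {0, 3, 8};
    h.advance(2);               // cursor 2, still below existing length
    std::size_t a = h.flush();  // stays 3: content never shrinks
    h.advance(4);               // cursor 6, now past the old length
    std::size_t b = h.flush();  // rises to 6
    assert(a == 3 && b == 6);
    return b;
}
```

The first flush shows why overwriting the front of a partially full buffer leaves its size alone; the second shows length growing once the cursor passes the old mark.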
seek
Not all yo out streams can support seeking. (How would you try to seek a socket?) But ybo is an out stream that seeks new write positions:
/*virtual*/ int ybo::oseek(p32 u) { «
o_p = o_0 + u;
if (o_p > o_x) // past end of buffer?
o_p = o_x; // keep it legal
return o_p - o_0; // actual new write position
}
/*virtual*/ p32 ybo::opos() const { // i.e. out ftell() «
return o_p - o_0; // get current write position
}
Note we permit seeks to any position inside the buffer, up to max capacity at end of buffer, even if it passes a previous high length. Seeking a point above previous length just promotes the new bytes found in the buffer to actual content.
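Here's a small sketch of seek and flush interacting, under the assumption that seek and flush behave exactly as listed above. SeekDemo is a hypothetical stand-in holding just the three pointers and a length; it is not the real ybo. One deliberate difference: the sketch clamps the offset before forming the pointer, so it never computes an address beyond o_x.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the three ybo pointers over a fixed array.
struct SeekDemo {
    unsigned char* o_0;   // origin
    unsigned char* o_p;   // write pointer
    unsigned char* o_x;   // one past the end
    std::size_t    len;   // like bo_b->v_n

    SeekDemo(unsigned char* p, std::size_t cap)
        : o_0(p), o_p(p), o_x(p + cap), len(0) {}

    bool put(unsigned char c) {               // one byte, if there is room
        if (o_p < o_x) { *o_p++ = c; return true; }
        return false;
    }
    std::size_t seek(std::size_t pos) {       // clamp to capacity
        std::size_t cap = static_cast<std::size_t>(o_x - o_0);
        o_p = o_0 + (pos > cap ? cap : pos);
        return static_cast<std::size_t>(o_p - o_0);
    }
    std::size_t flush() {                     // raise highwater mark only
        std::size_t top = static_cast<std::size_t>(o_p - o_0);
        if (len < top) len = top;
        return len;
    }
};

// Returns true when each step matches the behavior described in the text.
bool demo_seek() {
    unsigned char space[8] = {'a','b','c','d','e','f','g','h'};
    SeekDemo s(space, sizeof space);

    s.put('A'); s.put('B'); s.put('C');     // cursor at 3, nothing flushed
    s.seek(1);                              // back up BEFORE flushing
    bool forgot   = s.flush() == 1;         // the 3-byte extent was never seen

    bool clamped  = s.seek(100) == 8;       // past the end: pinned at capacity
    bool promoted = s.flush() == 8;         // old bytes now count as content
    bool back     = s.seek(2) == 2;         // backward seek is legal
    bool kept     = s.flush() == 8;         // length remembers the high mark
    return forgot && clamped && promoted && back && kept;
}
```

The first three lines of the demo show the one pitfall in this scheme: a backward seek without a prior flush loses track of bytes just written, so flush first when that matters.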
writes
The main purpose of out streams is writing, so we finally reached interesting methods. But since ybo supports only fixed sized space, there's nothing to do when space is exhausted except count bytes that did not fit inside.
/*virtual*/ void ybo::_oc(int c) { // one char on buf overflow «
if ( o_p < o_x ) // room for more?
*o_p++ = (u8) c; // add it
else
++bo_plus; // count excess
}
Protected method ybo::_oc() is only called by public yo::oc() for writing a buffered byte, and only when the buffer is exhausted, which is true when the condition tested here fails: o_p >= o_x. So when _oc() is called it will always increment bo_plus. But Wil always writes _oc() methods to recheck the test condition, since it documents meaning succinctly.
/*virtual*/ int ybo::owrite(const void* src, n32 n) { // neg on err «
p32 room = o_x - o_p; // remaining space
if (src && n && room) { // anything to write?
if (n > room) { // more than remaining space?
bo_plus += n - room; n = room; // cut n to fit; plus<-excess
}
::memcpy(o_p, src, n); o_p += n; return (int) n; // bytes copied
}
else if (bo_b) bo_plus += n;
return 0;
}
The owrite() above is a main workhorse write method; almost all yo methods eventually funnel content to this one. All it does is copy bytes that fit to the current write pointer o_p, then advance o_p by bytes written. Any bytes that could not fit due to insufficient space are counted in bo_plus. The following write to a given offset is slightly trickier because it doesn't update o_p, and o_p is the only means other methods have of tracking the amount of content in a buffer before flushing. So opwrite() itself must perform the highwater content check done by oflush() since no one else can.
/*virtual*/ int ybo::opwrite(const void* src, n32 n, p32 at) { «
p32 eof = this->olen(); // inline pseudo-seek without real seek:
if (at > eof) // pos beyond eof?
at = eof;
u8* pat = o_0 + at; // pseudo o_p without changing o_p
p32 room = o_x - pat; // remaining space
if (src && n && room) { // anything to write?
if (n > room) { // more than remaining space?
bo_plus += n - room; n = room; // cut n to fit; plus<-excess
}
::memcpy(pat, src, n); pat += n; // advance cursor
p32 top = pat - o_0; // potentially new highwater mark
if (bo_b && top > bo_b->v_n) // past old highwater mark?
bo_b->v_n = top; // new highwater mark, but same o_p
return (int) n; // bytes copied
}
else if (bo_b)
bo_plus += n; // just sum attempted write size
return 0;
}
The base version of owritev() defined by yo (in the out demo) calls the first owrite() above, so there's nothing to see here for that.
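For a compact picture of both write paths, here's a hedged sketch of a fixed-capacity writer that truncates, counts the excess, and supports a positional write that leaves the cursor alone. FixedOut and its method names are hypothetical stand-ins chosen for this example; the logic follows the listings above but this is not the real ybo.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-capacity writer; names are illustrative only.
class FixedOut {
    unsigned char* o_0;   // origin
    unsigned char* o_p;   // sequential write pointer
    unsigned char* o_x;   // one past the end
    std::size_t len_;     // highwater length, like bo_b->v_n
    std::size_t plus_;    // bytes that did not fit, like bo_plus
public:
    FixedOut(unsigned char* p, std::size_t cap)
        : o_0(p), o_p(p), o_x(p + cap), len_(0), plus_(0) {}

    std::size_t write(const void* src, std::size_t n) {   // sequential
        if (!src) { plus_ += n; return 0; }     // nothing usable: count it all
        std::size_t room = static_cast<std::size_t>(o_x - o_p);
        if (n > room) { plus_ += n - room; n = room; }    // cut n to fit
        if (n) { std::memcpy(o_p, src, n); o_p += n; }
        return n;                                          // bytes copied
    }
    std::size_t length() {                      // flush, then report
        std::size_t top = static_cast<std::size_t>(o_p - o_0);
        if (len_ < top) len_ = top;
        return len_;
    }
    std::size_t pwrite(const void* src, std::size_t n, std::size_t at) {
        if (!src) { plus_ += n; return 0; }
        std::size_t eof = length();
        if (at > eof) at = eof;                 // no holes past current content
        unsigned char* pat = o_0 + at;          // pseudo cursor; o_p untouched
        std::size_t room = static_cast<std::size_t>(o_x - pat);
        if (n > room) { plus_ += n - room; n = room; }
        if (n) {
            std::memcpy(pat, src, n);
            if (len_ < at + n) len_ = at + n;   // own highwater check
        }
        return n;
    }
    std::size_t pos() const { return static_cast<std::size_t>(o_p - o_0); }
    std::size_t plus() const { return plus_; }
};

// Returns true when truncation, counting, and positional writes line up.
bool demo_fixed_out() {
    unsigned char space[8];
    FixedOut out(space, sizeof space);
    std::size_t a = out.write("abcdef", 6);   // all six bytes fit
    std::size_t b = out.write("ghij", 4);     // two fit, two are counted
    std::size_t c = out.pwrite("XY", 2, 3);   // overwrite at offset 3
    return a == 6 && b == 2 && c == 2
        && out.plus() == 2 && out.pos() == 8 && out.length() == 8
        && std::memcmp(space, "abcXYfgh", 8) == 0;
}
```

After the positional write the cursor still reads 8, which is the whole point: pwrite() changes content and possibly length, but never the sequential position.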
give and take «
This last section shows ybo versions of otake() and ogive() methods used by yo to optimize stream to stream copies. The code here was quickly put together for this demo, and actually hasn't been tested. So sample code really should be shown, demonstrating this version does approximately the right thing. Another reason this code should provide samples is to illustrate what these two methods mean, since Wil hasn't seen a similar api in other stream classes before. (A similar api might exist but Wil hasn't looked for one since the one he invented works fine for his purposes.) But it's late and Wil's tired.
This is part of Wil's zero copy support in yo. It aims to avoid an extra copy when reading from an in-stream and then writing to an out-stream. This seldom occurs; but when it does, efficiency becomes a concern. A naive approach to the problem would read from a source in-stream into a temp buffer, and then write that to a destination out-stream. This is inefficient when the out-stream itself is buffered, since one can instead write directly into the out-stream's buffer space. This is what otake() and ogive() are for.
int ybo::otake(yb& dest) { // get available buf space «
dest.bclear();
this->ybo::oflush();
if (!o_p) { if (!o_e) { this->ofail(ENOSR); } return -1; }
if (o_take.v_p) { ylog(1, "ybo::otake() IN USE"); return -1; }
o_take.binit(o_p, 0, o_x - o_p);
dest = o_take; // o_take copies return to check in ogive()
return dest.b_x;
}
int ybo::ogive(const yb& src) { // balance earlier otake() «
yb take = o_take; o_take.bclear(); // copy & clear regardless
if (src.v_p != take.v_p || src.b_x != take.b_x) {
ylog(1, "ybo::ogive() src doesn't match saved otake() buf");
return -1;
}
n32 n = src.v_n;
if (n > src.b_x) { // length over capacity??
ylog(1, "ybo::ogive() src.v_n=%d > b_x=%d",
(int) n, (int) src.b_x);
n = src.b_x; // safety
}
o_p += n; // make pointer show the new addition from src
if (o_p > o_x) { // beyond end? (how can this happen?)
int extra = o_p - o_x;
ylog(1, "ybo::ogive() o_p-o_x=%d", (int) extra);
bo_plus += extra; // amount written that was too much
o_p = o_x; // make o_p legal again
n -= extra;
}
return (int) n;
}
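The take/give pair is a small lending protocol, so here's a self-contained sketch of the idea before the details. Span, LendOut, and fill() are hypothetical names invented for this example, not the real yb or ybo. The sketch assumes the stricter rule that give fails if the write pointer moved during the loan, and it refuses to lend when no space is left.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical description of lent space, standing in for yb.
struct Span {
    unsigned char* p;  // start of lent space
    std::size_t    n;  // bytes the borrower actually filled
    std::size_t    x;  // bytes available
};

// Hypothetical out stream that lends its free space; not the real ybo.
class LendOut {
    unsigned char* o_0;
    unsigned char* o_p;
    unsigned char* o_x;
    Span lent_;        // like o_take: remembers the outstanding loan
public:
    LendOut(unsigned char* p, std::size_t cap)
        : o_0(p), o_p(p), o_x(p + cap), lent_{nullptr, 0, 0} {}

    // Lend all remaining space; false if busy or nothing is left.
    bool take(Span& dest) {
        dest = Span{nullptr, 0, 0};
        if (lent_.p || o_p == o_x) return false;
        lent_ = Span{o_p, 0, static_cast<std::size_t>(o_x - o_p)};
        dest = lent_;
        return true;
    }
    // End the loan; src.n says how much was filled.
    // Returns bytes accepted, or -1 if src is not the outstanding loan.
    long give(const Span& src) {
        Span was = lent_;
        lent_ = Span{nullptr, 0, 0};           // loan ends either way
        if (!was.p || src.p != was.p || src.x != was.x || o_p != was.p)
            return -1;
        std::size_t n = src.n > src.x ? src.x : src.n;  // never over capacity
        o_p += n;                               // acknowledge consumed space
        return static_cast<long>(n);
    }
    std::size_t pos() const { return static_cast<std::size_t>(o_p - o_0); }
};

// Borrower side: append as much of src as fits in the lent span.
std::size_t fill(Span& dest, const char* src, std::size_t len) {
    std::size_t room = dest.x - dest.n;
    if (len > room) len = room;
    if (len) { std::memcpy(dest.p + dest.n, src, len); dest.n += len; }
    return len;
}

// Returns true when the transaction rules hold step by step.
bool demo_take_give() {
    unsigned char space[8];
    LendOut out(space, sizeof space);
    Span a, b;
    bool ok = out.take(a);                    // lends all 8 bytes
    ok = ok && !out.take(b);                  // refused: loan outstanding
    fill(a, "hello", 5);
    ok = ok && out.give(a) == 5 && out.pos() == 5;
    ok = ok && out.take(a) && a.x == 3;       // 3 bytes left to lend
    std::size_t put = fill(a, "world", 5);    // truncated by the borrower
    ok = ok && put == 3 && out.give(a) == 3 && out.pos() == 8;
    ok = ok && !out.take(a);                  // full: nothing left to lend
    return ok && std::memcmp(space, "hellowor", 8) == 0;
}
```

Note the last fill() drops two bytes on the borrower's side, so the stream itself never learns about them; any excess counter kept by the stream would stay unchanged.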
Method otake() describes all remaining space in an out-stream's current output buffer. It saves a copy of this description in member variable o_take so error checking can be performed in ogive(). The caller can then use this buffer space and read directly into it from an in-stream. The actual number of bytes read should be put into the v_n of the buffer's description. Then the updated buffer, with only v_n changed, should be returned in a call to ogive().
This is a kind of transaction where only one take/give operation can be outstanding at one time. So it's an error if o_take looks like it already contains a copy of a buffer returned from otake(). Once otake() is called, you can't call it again until ogive() concludes the transaction by saying how many bytes were actually used in the last buffer.
Method ogive() must establish nothing has changed in ybo since the earlier otake() call. (You can't write directly to the buffer and indirectly through other method calls at the same time, since writes would clobber each other that way.) Actually, ogive() should assert the current o_p value is exactly the same as it was earlier in otake(). Besides error checking, ogive() should just advance o_p by the value of v_n in the buffer argument, in order to simply acknowledge space was consumed as the caller says. The simplest version of ogive() would do that after asserting o_p had not changed, and after asserting the buffer arg matched the saved o_take in all but v_n length. This version tries to make sense of poor usage.
Okay, finally here's a demo of otake() using ydvp in a way resembling an example in the iovec demo (cf «).
static const char* lo = "abcdefghijklmnopqrstuvwxyz"; // «
static const char* hi = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
yv vlo(lo); yv vhi(hi); // both C strings as yv runs
yassert(vlo.v_n == 26 && vhi.v_n == 26); // must be so
u8 threeDozen[36]; yb b36(threeDozen, 0, 36);
b36 << vlo(0, 18); // fill first half of b36
ybo o36(&b36); // out stream writing to b36
o36.oseek(b36.v_n); // seek current buf len
yd64vp pmix; // iter to hold mix of lo and hi slices
pmix << vhi(0, 5) << vlo(5, 5) << vhi(10, 5);
pmix << vlo(15, 5) << vhi(20, 6) << vlo(0, 5);
yout << pmix.quote() << yendl << ynow;
Thus far we prepared a buffer b36 which is half full, an out stream o36 with write position at current buf length, and an iovec iter pmix with at least a dozen more bytes inside than space remaining in o36. So now let's iterate over iovecs and use otake() to write each iovec into the buffer representing space left in o36:
ydvp iter = pmix.pbegin();
for ( ; iter != pmix.pend(); ++iter) {
yb dest; // piece of o36 buffer space
if (o36.otake(dest) > 0) { // space left in o36
iovec iov = *iter;
yout << "# appending '" << iov << "' at ";
yout.ofn("offset=%#x", (int) o36.opos());
dest << iov; // append as much of iov as fits
o36.ogive(dest);
}
else
break;
}
o36.oflush(); // make b36 see all bytes written
yout.ofn("# (o36 plus=%d) buf b36: ", (int) o36.boplus());
yout << b36.quote() << yendl << ynow; // dump & flush
The code above writes output below on stdout. The extra comment printed each time an iovec is appended shows you how many times it happened, and what granularity of content was involved, which you can verify by checking offsets in the final hex dump.
# appending 'ABCDE' at offset=0x12
# appending 'fghij' at offset=0x17
# appending 'KLMNO' at offset=0x1c
# appending 'pqrst' at offset=0x21
# (o36 plus=0) buf b36:
<yb p=0xbffffa64 n=36 x=36 crc='#fb32f903:36'>
00000: 61 62 63 64 65 66 67 68 69 6a 6b 6c ; abcdefghijkl
0000c: 6d 6e 6f 70 71 72 41 42 43 44 45 66 ; mnopqrABCDEf
00018: 67 68 69 6a 4b 4c 4d 4e 4f 70 71 72 ; ghijKLMNOpqr
</yb>
Note how o36.boplus() returned zero even though the last iovec written had length five and only three bytes of space remained (offset 0x21 is byte 33 of 36). Why don't we see plus=2 in the output? Because the two bytes were truncated in the append to buf dest, not when written to o36, which knows nothing about it. |