The idea of demos is explained here; a menu at the top of the right column indexes the topic-specific demos. Here we demo iovec.
problem
Suppose you want to write code manipulating iovec vectors of the sort passed to readv() and writev() system calls. If you try man readv you might see something like this:
ssize_t readv(int d, const struct iovec *iov, int n);
On my Mac PowerBook, the sys/uio.h system header defines iovec as below, which also matches what I see on Linux. (But my Mac's man page for readv() mistakenly says iov_base is char*.)
struct iovec { /* sys/uio.h « */
void* iov_base; /* Base address. */
size_t iov_len; /* Length. */
};
Wil can't really avoid using this iovec type even though he has a nearly identical yv type defined with lots of convenient methods lacking in iovec. In practice, APIs for interacting with the outside world prefer using arrays of iovec to express discontiguous scatter-gather buffers for i/o. In many cases, trying to use yv isn't practical. So an ordered pair (iovec* iov, int len) often acts like a de facto standard type meaning vector of iovecs, and direct support for this is useful. So þ defines ydvp to represent this iovec vector pair most conveniently. Type y+d+v+p means thorn deck (iovec) vector pointer, which acts like an iterator over iovecs while preserving direct access to the original (iovec* iov, int len) pair. (Note all þ iterators end in p because they act semantically like pointers. If an STL collection class named foo has an iterator named foo::iterator, in þ the iterator's name is foop with no :: inside because it's not a nested class. Appending v to the end of iovec means a vector of iovecs, and appending p means an iterator over that vector. But ydvp is really just a wrapper for (iovec* iov, int len) plus methods, to work with other types editing iovec format.)
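To make the (iovec* iov, int len) pair concrete, here is a minimal sketch that hands such a pair to the POSIX writev() call and reads the gathered bytes back through a pipe. The helper name gather_through_pipe is hypothetical and not part of þ; only pipe(), writev(), read(), and close() are real system calls.

```cpp
#include <sys/uio.h>   // struct iovec, writev()
#include <unistd.h>    // pipe(), read(), close()
#include <cassert>
#include <string>

// Hypothetical helper (not þ api): gather-write all n iovecs into a pipe
// with one writev() call, then read the bytes back as a single string.
static std::string gather_through_pipe(const iovec* iov, int n) {
    int fds[2];
    if (pipe(fds) != 0)
        return std::string();           // could not create the pipe
    ssize_t wrote = writev(fds[1], iov, n);
    close(fds[1]);                      // writer side done
    std::string out;
    if (wrote > 0) {
        out.resize((size_t) wrote);
        size_t got = 0;
        while (got < out.size()) {      // read() may return short counts
            ssize_t r = read(fds[0], &out[got], out.size() - got);
            if (r <= 0)
                break;
            got += (size_t) r;
        }
        out.resize(got);
    }
    close(fds[0]);
    return out;
}
```

The two source buffers never need to be copied next to each other before the write; the pair alone tells the kernel where each piece sits. The sketch assumes the total is small enough to fit in the pipe without blocking.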
ydvp
Here's the ydvp class with interspersed comments. class ydvp { // iterator over iovec vector «
private:
iovec* p_v; // first of p_n iovec's
unsigned p_n; // length of p_v vector
unsigned p_i; // index into p_v
unsigned p_x; // max value for p_n
iovec p_nil; // used with bad index
Members p_v and p_n preserve constructor args; p_i is current index; any index >=p_n returns zero length p_nil. public: // note: bitwise copy and assign are okay «
ydvp(ydvp const& src, unsigned i) // new p_i == i
: p_v(src.p_v) , p_n(src.p_n), p_i(i), p_x(src.p_x) {
p_nil.iov_base = 0; p_nil.iov_len = 0;
}
ydvp(iovec* v, unsigned n) // n == max
: p_v(v), p_n(n), p_i(0), p_x(n) {
p_nil.iov_base = 0; p_nil.iov_len = 0;
}
ydvp(iovec* v, unsigned n, unsigned x) // n != max
: p_v(v), p_n(n), p_i(0), p_x(x) {
p_nil.iov_base = 0; p_nil.iov_len = 0;
}
~ydvp() { p_v = 0; p_n = 0; } // crash now on use
Construction is minimal; assignment and copy construction are encouraged (using the unspecified default definitions which copy memberwise, or using any bitwise copy you like). Destruction is a don't-care, except use after destruction should be brought to your attention. iovec* pvec() const { return p_v; } // iovecs «
unsigned psize() const { return p_n; }
You can recover constructor args; methods notably begin with a p prefix. (They might have been named pv() and pn().) size_t psum() const { // sum of all p_n iovecs «
size_t sum = 0;
for (unsigned i = 0; i < p_n; i++)
sum += p_v[i].iov_len;
return sum;
}
Total length in an iovec vector is usually defined as sum of all lengths. If we cached the result, we'd deny options to change iovecs later. (Why should we forbid all changes to iovecs? It's not our call.) const iovec& operator*() const { // current iovec «
return (p_i < p_n)? p_v[p_i] : p_nil;
}
iovec operator[](size_t i) {
return (i < p_n)? p_v[i] : p_nil;
}
You can copy a current iovec or fetch one at any index. bool pseek(unsigned i) { p_i = i; return i < p_n; } // «
bool operator+=(unsigned n) { return (p_i += n) < p_n; }
You can seek the "current" index position by name, addition, or subtraction (operator-=() omitted for brevity). All indexes at or above p_n are treated the same. (Why not?)
bool operator++() { return ++p_i < p_n; } // PREFIX «
// prefix is better; we act just like prefix here (beware)
bool operator++(int){ return ++p_i < p_n; } // POSTFIX
You can increment or decrement (operator--() not shown) the current position. Postfix is deprecated as inefficient, but not punished more than acting like prefix. (Note these operators return the same result as operator bool(), so we could use while loops instead of for loops, but that would look less like STL usage.) operator bool() const { return p_i < p_n; } // more? «
Conversion to scalar boolean means a test that upward iteration is still underway. (The original version only iterated upward, but no harm is done by adding subtraction operators). // compare to nominal "end" value not actually used: «
bool operator!=(y0x*) const { return p_i < p_n; }
bool operator==(y0x*) const { return p_i >= p_n; }
}; // ydvp
If you want to mimic STL style of testing for iteration end, just return any y0x* pointer from a collection's end() method. Both comparison operators assume a nominal end value argument because otherwise comparison isn't very interesting. (These are here mainly for show, not because usage is expected.) The y0x class is an opaque (never defined) type understood everywhere in þ to mean end: the return value from a collection method named end(). (A combination of zero 0 for origin and x for max extent intend to say "original or primordial end" to mean an end value canonically denoting end of any collection. In non-þ naming schemes Wil names it End.)
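The end-sentinel idea can be sketched without any þ headers. Below, End is an opaque type that is declared but never defined, standing in for y0x, and IovIter is a hypothetical cut-down stand-in for ydvp that keeps only what the comparison needs. The names IovIter and sum_stl_style are assumptions for illustration.

```cpp
#include <sys/uio.h>   // struct iovec
#include <cassert>
#include <cstddef>

struct End;            // opaque: declared, never defined (plays the role of y0x)

// Hypothetical cut-down iterator over an (iovec*, unsigned) pair.
class IovIter {
    const iovec* v_;   // first of n_ iovecs
    unsigned n_;       // vector length
    unsigned i_;       // current index
public:
    IovIter(const iovec* v, unsigned n) : v_(v), n_(n), i_(0) {}
    const iovec& operator*() const { return v_[i_]; }   // caller checks end first
    IovIter& operator++() { ++i_; return *this; }
    // The argument is ignored; only the type says "compare against end".
    bool operator!=(End*) const { return i_ < n_; }
    bool operator==(End*) const { return i_ >= n_; }
    static End* end() { return nullptr; }               // any End* will do
};

// STL-looking loop: sums iov_len over all iovecs.
static size_t sum_stl_style(const iovec* v, unsigned n) {
    size_t sum = 0;
    for (IovIter it(v, n); it != IovIter::end(); ++it)
        sum += (*it).iov_len;
    return sum;
}
```

Because the sentinel carries no state, the loop costs the same as testing i < n directly; the opaque pointer type exists only to select the overload.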
slices
The most complex parts of the ydvp api involve slice subsets (see the slice demo) and comparison to other sequences (see the compare demo). Slices for ydvp are shown below, but (due to detail) comparing iovecs appears only in the compare demo. Anticipating content of the slice demo, here is the main base yz slice api: struct yz { // slice (starting pos and length in a sequence) «
zp32 z_p; // signed position (neg is relative to eof)
zn32 z_n; // signed length (neg is relative to eof)
yz(zp32 p, zn32 n) : z_p(p), z_n(n) { }
yz() : z_p(0), z_n(0) { }
void zabsolute(n32 N); // normalize to N if z_p or z_n is neg
}; // struct yz
The yz api is longer, but this shows parts of yz affecting subclass ydvpz representing a slice of one ydvp instance. Both z_p and z_n are signed; negative means relative to total sequence length, equal to psum() in the case of ydvp. When either is negative (and thus relative) zabsolute() can convert to positive and absolute; but you might rarely (if ever) use negative values. (See the slice demo.) The meaning of yz is trivial: z_p is a position in the sequence (an offset from the first element, with zero as the origin), and z_n is simply length starting at position z_p. You subclass yz like this:
struct ydvpz : public yz { // slice of ydvp «
ydvp& z_dvp;
ydvpz(zp32 p, zn32 n, ydvp const& x)
: yz(p, n), z_dvp(*(ydvp*)&x) { }
ydvpz(yz const& z, ydvp const& x) : yz(z), z_dvp(*(ydvp*)&x) { }
// ... (continued below »)
}; // struct ydvpz
So the state of ydvpz is just yz plus a pointer to ydvp (but using reference syntax). The idea is to write expressions like this: ydvp& p = ...; // ydvp ref from someplace «
ydvpz pz = p(2, 4); // subset at offset 2, length 4
... using these ydvp methods not yet shown. The expression just above uses operator()() to return an ydvpz instance: class ydvp { // ... continued «
public: // slicing
ydvpz pz(zp32 p, zn32 n) const { return ydvpz(p, n, *this); }
ydvpz operator()(zp32 p, zn32 n) const {
return ydvpz(p, n, *this);
}
}; // ydvp
The ydvpz returned merely copies the input values, so all it does is capture the offset, length, and collection pointer together in a bundle of type ydvpz which means "I'm a slice of ydvp" that can be processed by other methods overloaded to use ydvpz. For example, the rest of the ydvpz adds support for writing and debug printing ydvpz in a manner resembling the same additions to ydvp (shown column right). struct ydvpz : public yz { // slice of ydvp «
// ... (continued)
struct Zq { ydvpz const& q_z; Zq(ydvpz const& z): q_z(z) { } };
Zq quote() const { return Zq(*this); } // to request dump
void zprint() const; // calls zdump(yout) (for gdb)
void zdump(yo& o) const; void zcite(yo& o) const;
void zout(yo& o) const; // write the content described to o
// 256, 1K, 4K, 16K, 64K stack arrays of iovec for ydvp::prender():
void z256(yo& o, bool quo) const; void z1K(yo& o, bool quo) const;
void z4K(yo& o, bool quo) const; void z16K(yo& o, bool quo) const;
void z64K(yo& o, bool quo) const; // largest tier, used by zdump() and zout()
}; // struct ydvpz
The following several pieces of sample code show building, debug printing, and writing ydvpz slices to std yout (out »), and to a local temp yb buffer (see buf). Here two literal C strings provide a source of iovec data used: static const char* lo = "abcdefghijklmnopqrstuvwxyz";
static const char* hi = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
yv vlo(lo); yv vhi(hi); // both C strings as yv runs «
yassert(vlo.v_n == 26 && vhi.v_n == 26); // must be so
Next we build an instance of ydvp as described column right, using yd64vp with a built-in vector of 64 iovecs (cf »). yd64vp pmix; // iter to hold mix of lo and hi slices «
pmix << vhi(0, 5) << vlo(5, 5) << vhi(10, 5);
pmix << vlo(15, 5) << vhi(20, 6) << vlo(0, 5);
yout << pmix.quote() << yendl << ynow;
The middle two lines above append iovecs for small slices from the yv instances; the last line writes this on stdout: <ydvp v=0xbffff860 n=6 i=0 x=64 sum=31 crc='0xdff31f7f:31'>
<v0 p=0xa49dc n=5 crc='0x72d31ad5:5'>
00000: 41 42 43 44 45 ; ABCDE
</v0>
<v1 p=0xa49fd n=5 crc='0x293d787a:5'>
00005: 66 67 68 69 6a ; fghij
</v1>
<v2 p=0xa49e6 n=5 crc='0xc81859d9:5'>
0000a: 4b 4c 4d 4e 4f ; KLMNO
</v2>
<v3 p=0xa4a07 n=5 crc='0xe87cf305:5'>
0000f: 70 71 72 73 74 ; pqrst
</v3>
<v4 p=0xa49f0 n=6 crc='0x8e601468:6'>
00014: 55 56 57 58 59 5a ; UVWXYZ
</v4>
<v5 p=0xa49f8 n=5 crc='0x8587d865:5'>
0001a: 61 62 63 64 65 ; abcde
</v5></ydvp>
Next we make a one kilobyte buffer and write this content into it with yb::bappend() (buf ») before debug printing the buf. u8 tmp1[1024 + 4]; yb b1(tmp1, 0, 1024); // «
b1 << pmix;
yout << b1.quote() << yendl << ynow;
The last line calls yb::bdump() which writes this output. (The trailing ynow flushes the buffer in out stream yout.) Note crc checksum 0xdff31f7f is identical to that in our ydvp.
<yb p=0xbffff444 n=31 x=1024 crc='#dff31f7f:31'>
00000: 41 42 43 44 45 66 67 68 69 6a 4b 4c ; ABCDEfghijKL
0000c: 4d 4e 4f 70 71 72 73 74 55 56 57 58 ; MNOpqrstUVWX
00018: 59 5a 61 62 63 64 65 ; YZabcde
</yb>
Now we finally reached the slice example, which is stored in variable pz for clarity. Then pz is written to a new buffer instance. (We could have re-used the old buffer, but this way we show less new api.) ydvpz pz = pmix(6, 18); // slice at pos 6 len 18 «
u8 tmp2[2048 + 4]; yb b2(tmp2, 0, 2048);
b2 << pz; // yb::bappend(ydvpz const& src)
yout << b2.quote() << yendl << ynow; // yb::bdump()
The final line above writes the next output, differing from the last dump only by containing less content, matching the slice described: <yb p=0xbfffec40 n=18 x=2048 crc='#371787ad:18'>
00000: 67 68 69 6a 4b 4c 4d 4e 4f 70 71 72 ; ghijKLMNOpqr
0000c: 73 74 55 56 57 58 ; stUVWX
</yb>
But that's how content looks when dumped from a buffer. When we cite or quote the slice below, we can see internal ydvpz structure — minimal in the case of cite but fully detailed for dump. yout << ycite(pz) << yendl << ynow; // one line «
yout << pz.quote() << yendl << ynow; // multi line
The output of those two lines is shown below. But where did the ydvp come from inside the dump? Well ydvpz::zdump() actually generates the ydvp described by the slice on the fly, as we'll see soon. <ydvpz z=6:18 v=0xbffff860 n=6 i=0 x=64>
<ydvp v=0xbfffe380 n=4 i=0 x=256 sum=18 crc='0x371787ad:18'>
<v0 p=0xa49fe n=4 crc='0xd829d296:4'>
00000: 67 68 69 6a ; ghij
</v0>
<v1 p=0xa49e6 n=5 crc='0xc81859d9:5'>
00004: 4b 4c 4d 4e 4f ; KLMNO
</v1>
<v2 p=0xa4a07 n=5 crc='0xe87cf305:5'>
00009: 70 71 72 73 74 ; pqrst
</v2>
<v3 p=0xa49f0 n=4 crc='0x25ec60db:4'>
0000e: 55 56 57 58 ; UVWX
</v3></ydvp></ydvpz>
However, before we dive into source code for dumping slices and generating new ydvp instances on the fly, first let's look at what happens when you write that ydvpz to an instance of yfdw for stdout: yfdw wout(yfdw::we_stdout); // STDOUT_FILENO=1 «
wout << "#" << pz << "#\n";
Wil wrote almost all of the fd demo just for that last line of code whose operator<<() calls yfdw::wput() using system call writev() to do the work (fd «). Unsurprisingly, this line appears on stdout: #ghijKLMNOpqrstUVWX#
Let's drill down into code writing that line, so you can see how ydvpz slices actually work in terms of generating new temp ydvp instances on the stack (using subclasses like yd64vp, column right: cf »). This is the part of the fd demo reserved for this page. (The rest of that demo sets the stage for code below.) This yfdw.h inline operator: inline yfdw& operator<<(yfdw& w, ydvpz const& x) {
w.wput(x); return w; }
... calls this yfdw::wput() entry point: ssize_t yfdw::wput(ydvpz const& pz) const { // «
n32 n = pz.z_dvp.psize(); // max iovecs needed for pcut()
if (n <= 1024) // at most 1K iovecs needed?
return yfdw_1K_ydvpz(*this, pz);
else
return yfdw_32K_ydvpz(*this, pz);
}
Which calls one of the two helper methods below, depending on size of stack allocation needed. These versions explicitly declare arrays of iovec instead of using ydvp subclasses yd1024vp and yd32Kvp: static ssize_t yfdw_1K_ydvpz(yfdw const& w, ydvpz const& pz) { // «
iovec v1K[ 1024 ]; // space for 1K iovecs
ydvp iter(v1K, 0, 1024); // max 1K iovecs
iter.pcut(pz); // generate slice result
return w.wput(iter.pvec(), iter.psize());
}
static ssize_t yfdw_32K_ydvpz(yfdw const& w, ydvpz const& pz) {
long psz = pz.z_dvp.psize();
if (psz > 32 * 1024) // over capacity? (maybe assert?)
ylog(0, "ydvpz::z32K() psize=%ld over 32K max", psz);
iovec v32K[ 32 * 1024 ]; // space for 32K iovecs
ydvp iter(v32K, 0, 32 * 1024); // max 32K iovecs
iter.pcut(pz); // generate slice result
return w.wput(iter.pvec(), iter.psize());
}
In both these helper methods the interesting bit looks like: ydvp iter(v1K, 0, 1024); // max 1K iovecs
iter.pcut(pz); // generate slice result
This creates a new empty ydvp on the stack and uses pcut() to render the ydvpz slice argument as a fresh temp iovec vector. In fact, debug print code for ydvpz uses method prender() for i/o: void ydvp::prender(yo& o, ydvpz const& pz, bool quo) { // «
// this ydvp is assumed on the stack as a temp
this->pcut(pz); // generate slice result
if (quo) // 'quote' content in dump format?
this->pdump(o); // print description
else // this is the same as calling pout():
o.owritev(p_v, p_n); // yo0::owritev()
}
ydvp subclasses with built-in iovec vectors have a constructor that calls prender() above directly. So just instantiation can do i/o. Mildly gnarly pcut() generates a new subset from a slice: (Code for yz::zabsolute() appears in the slice demo: cf ».) bool ydvp::pcut(const ydvpz& pz) { // «
if (pz.zrelative()) { // z_p or z_n is negative?
yz zabs(pz); // new copy we can make absolute
++g_ydvpz_rel2abs_count; // count these events
zabs.zabsolute(pz.z_dvp.psum()); // relative to N
return this->pcut(pz.z_dvp, zabs.z_p, zabs.z_n);
}
return this->pcut(pz.z_dvp, pz.z_p, pz.z_n);
}
But that just converts a relative slice (with negative values) to an absolute form before calling the real pcut() below, which is the main support for ydvpz slices, all of which route here: bool ydvp::pcut(const ydvp& iter, p32 pos, n32 len) {
p_n = 0; // locally empty (p_x is physical size) «
if (0 == len) // zero sized input? all done?
return true; // all of request was appended
iovec* v = iter.p_v; // incoming vector
iovec* x = v + iter.p_n; // one past end of v
iovec iov; // temp iovec for each append
u8* p = 0; // incoming data in iovec in v
n32 n = 0; // length of data at p in iovec
for ( ; v < x && pos; ++v) { // more to skip?
n = v->iov_len; // length of next iovec
if (n > pos) { // need some of this iovec?
p = (u8*) v->iov_base; // u8 arithmetic
iov.iov_base = p + pos; // skip pos in v
n -= pos; // remainder after pos
if (n > len) // more than request?
iov.iov_len = len; // only the request
else
iov.iov_len = n; // all of first piece
this->padd(iov); // 1st iovec in new vector
pos = 0; // original offset now all skipped
len -= iov.iov_len; // part of request done
}
else // skip all of this iov (skip all of n)
pos -= n;
}
if (pos) // original offset more than iter.psum()
return true; // no data exists after iter's eof
for ( ; len && v < x; ++v) { // need & have more?
p = (u8*) v->iov_base;
n = v->iov_len; // bytes of content at p
if (!p) { // nil address in input iovec vector?
ylog(0, "ydvp::pcut() nil == v[%d].iov_base",
(int) (v - iter.pvec())); // report bad index (cast matches %d)
yassert(0 != p); // die (rub in caller's nose)
}
iov.iov_base = p; // from start of iovec v
if (n > len) // more than request?
iov.iov_len = len; // only the request
else
iov.iov_len = n; // all of next iovec
if (this->padd(iov)) // append succeeded?
len -= iov.iov_len; // less request left
else
return false; // unable to append all
}
return (0 == len); // appended all described content
}
Since this page already runs long, little detail in pcut() is explained. First length p_n is set to zero (in case it wasn't already) so new iovecs can be appended with padd() whose code appears column right (cf »). Then pos bytes are skipped to find the starting offset of content in the slice. This might exhaust all the iovecs if pos exceeds old psum(). The second loop simply counts out the desired number of bytes, appending iovecs until the new psum() is established. Many overloaded pcut() methods route to the main one:
bool ydvp::pcut(yv const& v, p32 pos, n32 len) { // «
iovec iov = v; // convert to iovec and cut:
ydvp iter(&iov, 1); // single iovec iter
return this->pcut(iter, pos, len);
}
bool ydvp::pcut(iovec const& src, p32 pos, n32 len) {
iovec iov = src; // mutable copy
ydvp iter(&iov, 1); // single iovec iter
return this->pcut(iter, pos, len);
}
int g_ydvpz_rel2abs_count = 0; // rel to abs slices
bool ydvp::pcut(yv const& src, const yz& zrel) {
iovec iov = src; // convert to iovec and cut:
ydvp iter(&iov, 1); // single iovec iter
if (zrel.zrelative()) { // z_p or z_n is negative?
yz zabs(zrel); // new copy we can make absolute
++g_ydvpz_rel2abs_count; // count these events
zabs.zabsolute(src.v_n); // relative to N=v_n
return this->pcut(iter, zabs.z_p, zabs.z_n);
}
return this->pcut(iter, zrel.z_p, zrel.z_n);
}
bool ydvp::pcut(iovec const& src, const yz& zrel) {
iovec iov = src; // mutable copy
ydvp iter(&iov, 1); // single iovec iter
if (zrel.zrelative()) { // z_p or z_n is negative?
yz zabs(zrel); // new copy we can make absolute
++g_ydvpz_rel2abs_count; // count these events
zabs.zabsolute(iov.iov_len); // relative to N
return this->pcut(iter, zabs.z_p, zabs.z_n);
}
return this->pcut(iter, zrel.z_p, zrel.z_n);
}
bool ydvp::pcut(const ydvp& iter, const yz& zrel) {
if (zrel.zrelative()) { // z_p or z_n is negative?
yz zabs(zrel); // new copy we can make absolute
++g_ydvpz_rel2abs_count; // count these events
zabs.zabsolute(iter.psum()); // relative to N
return this->pcut(iter, zabs.z_p, zabs.z_n);
}
return this->pcut(iter, zrel.z_p, zrel.z_n);
}
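The skip-then-count logic of the main pcut() can be tried outside þ with a small standalone sketch. The version below is an assumption-laden rewrite, not the author's method: it is a free function, it collects results in a std::vector instead of appending with padd(), and it folds both loops into one. The names cut_iovecs and flatten are hypothetical.

```cpp
#include <sys/uio.h>   // struct iovec
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical free-function cut: returns iovecs describing bytes
// [pos, pos+len) of the logical byte sequence held in src[0..n).
static std::vector<iovec> cut_iovecs(const iovec* src, size_t n,
                                     size_t pos, size_t len) {
    std::vector<iovec> out;
    for (size_t i = 0; i < n && len > 0; ++i) {
        size_t have = src[i].iov_len;
        if (pos >= have) {               // whole iovec lies before the slice
            pos -= have;
            continue;
        }
        iovec piece;
        piece.iov_base = static_cast<char*>(src[i].iov_base) + pos;
        size_t avail = have - pos;       // bytes left in this iovec after pos
        piece.iov_len = (avail > len) ? len : avail;
        out.push_back(piece);
        len -= piece.iov_len;            // part of the request is done
        pos = 0;                         // offset is consumed after first hit
    }
    return out;                          // empty if pos is past the end
}

// Hypothetical helper: copy the described bytes into one string for checking.
static std::string flatten(const std::vector<iovec>& v) {
    std::string s;
    for (size_t i = 0; i < v.size(); ++i)
        s.append(static_cast<const char*>(v[i].iov_base), v[i].iov_len);
    return s;
}
```

Fed the same six pieces as pmix with position 6 and length 18, this sketch yields four iovecs of lengths 4, 5, 5, and 4, the same shape as the generated ydvp in the slice dump earlier on this page. No bytes are copied by the cut itself; only flatten() copies, and only for inspection.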
More ydvp api appears in other demos, when detail is specific to those topics. For example, the rand demo will feature pseudo random permutation of iovecs, along with other techniques for stream randomization.
ydvpz
Source code for ydvpz hasn't been shown yet. It appears below, and isn't long since the complex parts are done by ydvp::pcut() in the last section above. Most slice code prints or writes content; in each case this involves generating a new ydvp first in a temp stack instance, once again using ydvp::pcut(). However, this time pcut() gets called indirectly via prender() (cf «), or via constructors for ydvp subclasses that call prender(). Unlike the fd demo, which only handled two different sizes of iovec vectors (1K and 32K), the slice code below uses more gradations in size for temp stack iovec vectors. This makes the code longer (and that's why fewer cases were shown in the fd demo).
void ydvpz::zcite(yo& o) const { // single line «
//yh32 h; h << *this;
// o.of("crc='%#lx:%lu'", h.hcrc(), (long) h.hlen());
ydvp& p = z_dvp;
o.of("<ydvpz z=%d:%d v=%#lx n=%u i=%u x=%u/>",
(int) z_p, (int) z_n, (long) p.p_v,
(int) p.p_n, (int) p.p_i, (int) p.p_x);
}
void ydvpz::zprint() const {
yout << yendl; this->zdump(yout); yout << yendl << ynow;
}
One line is written by zcite(), and zprint() is just a shallow wrapper around zdump() that curries the standard yout argument, so it's easily used from gdb.
void ydvpz::zdump(yo& o) const { // multi line «
ydvp& p = z_dvp;
o.oftn("<ydvpz z=%d:%d v=%#lx n=%u i=%u x=%u>",
(int) z_p, (int) z_n, (long) p.p_v,
(int) p.p_n, (int) p.p_i, (int) p.p_x);
n32 n = z_dvp.psize(); // max iovecs needed for pcut()
if (n <= 256) // at most 256 iovecs needed?
this->z256(o, /*quo*/ true);
else if (n <= 1024) // at most 1K iovecs needed?
this->z1K(o, /*quo*/ true);
else if (n <= 4096) // at most 4K iovecs needed?
this->z4K(o, /*quo*/ true);
else if (n <= (16 * 1024)) // at most 16K iovecs needed?
this->z16K(o, /*quo*/ true);
else
this->z64K(o, /*quo*/ true); // 64K iovecs is max
o.ouend("ydvpz");
}
Helper functions are called to process content in iovecs. Different functions are called depending on the size of temp stack array needed. There are other ways to do this, but this one is simple and works. Most of the time, under normal circumstances, the first z256() helper method will be called, because 256 is large for iovec vectors. void ydvpz::z256(yo& o, bool quo) const { // shortest «
yd256vp iter(o, *this, quo); // pdump() or pout()
}
void ydvpz::z1K(yo& o, bool quo) const {
yd1024vp iter(o, *this, quo); // pdump() or pout()
}
void ydvpz::z4K(yo& o, bool quo) const {
yd4096vp iter(o, *this, quo); // pdump() or pout()
}
void ydvpz::z16K(yo& o, bool quo) const {
yd16Kvp iter(o, *this, quo); // pdump() or pout()
}
void ydvpz::z64K(yo& o, bool quo) const {
long psz = z_dvp.psize();
if (psz > 64 * 1024) // over capacity? (maybe assert?)
ylog(0, "ydvpz::z64K() psize=%ld over 64K max", psz);
iovec v64K[ 64 * 1024 ]; // space for 64K iovecs
ydvp iter(v64K, 0, 64 * 1024); // max 64K iovecs
iter.prender(o, *this, quo); // pdump() or pout()
}
The final 64K helper is handled differently for contrast. In each case the quo arg means quoted (i.e. for debug print) to request a dump of iovec content instead of writing iovec content, so zdump() can share these helpers with zout() shown next.
void ydvpz::zout(yo& o) const { // write content described to o «
n32 n = z_dvp.psize(); // max iovecs needed for pcut()
if (n <= 256) // at most 256 iovecs needed?
this->z256(o, /*quo*/ false);
else if (n <= 1024) // at most 1K iovecs needed?
this->z1K(o, /*quo*/ false);
else if (n <= 4096) // at most 4K iovecs needed?
this->z4K(o, /*quo*/ false);
else if (n <= (16 * 1024)) // at most 16K iovecs needed?
this->z16K(o, /*quo*/ false);
else
this->z64K(o, /*quo*/ false); // 64K iovecs is max
}
Code for zout() mirrors zdump() without the enclosing ydvpz tag output, and passes false to each helper method, causing content of iovecs to be written without interpretation. As before, z256() is the normal case.
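One of the "other ways" to choose a temp stack size is a small lookup table in place of the if/else chain. The sketch below only picks the tier and is not how þ does it; the names k_tiers and pick_tier are hypothetical, and returning 0 for an over-capacity request is an assumption made so the caller can decide whether to log or assert.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical table of temp stack capacities, matching the tiers used above.
static const std::size_t k_tiers[] = { 256, 1024, 4096, 16 * 1024, 64 * 1024 };

// Returns the smallest tier able to hold n iovecs, or 0 if n exceeds 64K.
static std::size_t pick_tier(std::size_t n) {
    const std::size_t count = sizeof(k_tiers) / sizeof(k_tiers[0]);
    for (std::size_t i = 0; i < count; ++i)
        if (n <= k_tiers[i])
            return k_tiers[i];
    return 0;   // over capacity: left to the caller to report
}
```

A caller would still need one helper per tier, since each stack array has a compile-time size; the table only removes the repeated comparisons from zdump() and zout().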
license
The license for all C++ code on this page is the same as that for all C++ on this site. Please read the license page carefully before you use code — the BriarPig mu-babel license resembles no other open source license. You only have permission to use the code if you agree with all the license terms. And as always: // Copyright (c) 1993-2008 BriarPig (mu-babel license)
banter
(This conversation with Dex clarifies nothing, and is likely a complete waste of your time. It wasted Wil's time — such conversations with Dex rarely have any value beyond bits of humor, explaining why Wil doesn't like chatting about software to no purpose. A shorter and slightly more constructive dialog follows this one.) "So what did you think?" Wil asked Dex at demo's end. "Well," Dex sniffed, "I'm a little offended you didn't ask what I thought earlier, when I could have helped." "Help do what?" asked Wil, genuinely puzzled. "Um, you know, help suggest a better api," Dex shrugged. "To clean up the ungainly parts, I suppose," Wil suggested. "No," Dex laughed, "when it comes to iovecs, I like the traditional approach, which is painfully awkward, as you know." "Okay, you mean like, 'Grr, I love to write in C, so gimme those naked C iovecs without any objects — I can take it like a man.' Something like that?" offered Wil. "Heh," Dex grinned, "yeah, basically." "Why am I not surprised?" Wil monotoned. "You just don't get it," Dex explained. "The idea is to show how smart you are by doing everything the hard way. When you use objects you must overdo it; and in low level C stuff you must not use objects at all. But you're using objects the right amount — which is wrong if you want to grandstand like I always do." "So what's your point?" Wil prompted. "My point is slices are creepy," Dex replied. "That's a change in subject," Wil objected. "Yes," Dex agreed. "But I didn't think you'd call me on it. It's very rude of you to notice when I evade questions with random topic swerves like that. It's better when you just go defensive like I wanted." "Okay," Wil joked, "I'll try to remember." "I hope you don't expect me to use your iovec slices for stack buffer i/o to avoid buffer overflow problems," Dex said. "I wouldn't dream of it," Wil pretended. "I know how much you enjoy those buffer overflow fire drills." "I'm just kidding," Dex joshed. 
"Humor me, but don't overdo it since my dumb remarks reverberate a little." A short silence descended as Wil waited to see if Dex had anything else to say, while Dex looked around to see whether anyone was around. Maybe he looked for Ira. Nope, no police around. "I was thinking," Dex proposed slowly, "maybe you should rework all this as a large and exhaustively thorough i/o framework because that would be cool and gnarly and establish your chops." "Not interested," Wil shrugged. "Not interested in chops?!" Dex asked incredulously. "No, I have chops," Wil countered. "I don't care about establishing them. Bor-r-ring. Also, I don't need a framework." Dex snorted. "Yeah, right," he disagreed. "We all need frameworks. You're naked without one. What would people say?" "Oh, who cares?" Wil dismissed. "Other stuff in þ might look more like a framework. Seems like a disadvantage, though." "You overdo heresy a little," Dex complained. "Useful stuff should look like part of a language," Wil continued. "Not like a suit of armor shielding you from being involved." "You even mention English long bow versus knight's armor again, and I'm outa here," Dex warned. "Tempting," Wil noted, "but too easy. Framework sounds too rigid and I'm into really lightweight stuff. I only want enough form to avoid errors, but not so much attention is channeled without flexible options remaining visible. I don't have a key metaphor." Dex raised his hands and made scare quotes with his fingers. "'The medium is the message,'" Dex suggested archly. "Or 'design patterns über alles' slogans," Wil said. "Blah, blah, blah," Dex agreed against form. Wil winced. "Don't say that," Wil pleaded. "I hate that phrase." "I feel a cliché phrase binge coming on," Dex warned. Wil stuck his fingers in his ears and waited, peeking. Dex looked up and mimed whistling until Wil stopped. "So what do you call your method?" Dex asked finally. "No name yet," Wil shrugged. "I'll think about it."
summary
"Is this the longest demo you plan to write?" asked Stu. "Yes," confirmed Wil. "I'd expected a bit less." "Long from showing so many iovec classes?" prompted Stu. "Yes, and I could have skipped some," noted Wil. "But by including all of them I introduced themes priming many other demos. So this one creates a horizontal slice of good context motivating several other classes. Though a long intro, it might unify what follows." "Ever use yz's slice api in the workplace before?" asked Stu. "No, but I usually port ydvp and ydvo to most work environments," replied Wil. "At work it's best to introduce the smallest amount of code that gets a thing done. So I usually add the effect of slicing but I don't include the yz class." "Because it's abstraction?" Stu guessed. "Yes, abstraction is shunned most places," Wil agreed. "And because operator overloading gets little favor either — and slices are less useful without more overloading." "Operator overloading sucks," Stu agreed. "Yes, I think that's what Dex found creepy," mused Wil. "Then why use it?" Stu pursued. "Operator overloading clarifies when used consistently and sparingly," explained Wil. "I mainly use it for stream processing, to simplify sequence editing and copying." "Using operator()() looks disturbingly like a function call," Stu wrinkled his nose. "It's ugly to C coders." "Well it makes slice notation concise," Wil defended. "I'd use operator[]() instead, but that only takes one argument, and I need both position and length." "I agree it's concise," Stu granted. "But then there's operator<<() — you use it everwhere. Could it be overkill?" "My task is usually 'copy source to destination' and << expresses that fine," Wil explained. "The only thing that changes is type of source and destination, and whether you mean form or content." "I'd rather see the actual method names," Stu said. "You can unify generally or itemize specifically, but not both," Wil shrugged. "I'd rather reduce code with generality."
license
All this code is available only under the BriarPig mu-babel license described fully on the rights page. You do not have permission to reprint this page in any way. Neither feeds nor repackaging is allowed. You can link this page if you want folks to read it.
A submenu for demos appears below, letting you go to the page on a topic written as a demo (as the demos page defines it).
menu
thorn: todo, names, fd, iovec « Þ, assert, log, run, hex, crc, buf, in, out, quote, escape, compare, file, deck, cow, arc, blob, tree, slice, rand, time, stat, hash, heap, node, primes, page, book, pile, stack, atomic, lock, mutex, thread, map, meter, list, iter, ctype (mu: toy, peg, imm, tag, box, symbol, token, number, bigint, class, method, reader, writer, eval, env, vm, gc, world, pcode, compiler, asm, lathe, lisp, smalltalk, design, weight, jar, card, harp, debug, profile) Some demos are stubs: todo is a demo guide. See toy for mu updates on language pages; names introduces naming schemes.
name
Here Wil first tried friendlier but longer name yiovecvp, but later gave up and changed the name back to ydvp when appending letters (eg ydvpz for a slice) got even harder to read. Note ydvp is the original name Wil used a few years ago when writing yd and related classes (described in deck), when it meant thorn deck (iovec) vector pointer for the iterator over an iovec vector inside yd. In other words, the word iovec doesn't even appear in the name despite being central. If it's easier to memorize, you might try a neologism data for the d inside ydvp, as if referring to one iovec. But that's not really it. Just tell yourself: it's an iovec vector iter used by yd decks, which just happens to be something completely generic and useful apart from yd.
model
The most interesting thing about ydvp is a model it shares with many other þ types: it represents an ordered sequence of some N bytes that can be read or written, which can be described in terms of subsets (called slices) expressed as offsets into the original plus a length. In other words, iovec vectors in themselves are a bit dull; how they resemble other byte sequences, and how content moves back and forth between them, is of more interest. The purpose of iovec vectors in scatter-gather i/o is skipping any staging to make N bytes contiguous in memory just to invoke an api that reads or writes those N bytes. Location in memory wherever it sits now should be just fine; it might already be staged just the way you want it for some other purpose. Demos here focus on treatment of a byte sequence in ydvp as a single thing, despite representation as a vector of N parts. Other than eof or end of stream, no position in ydvp is distinguished as a cursor for reading and writing. Also in common with other byte sequence types is general separation of memory ownership from use and manipulation. You might feel tempted to build memory allocation semantics into classes like ydvp because it seems missing to you (because you never want to deal with it). Owning space is a role for some other class. This kind of scatter-gather api should be able to work on many kinds of physical representation, interoperating between them, as long as memory involved doesn't move while iovecs are using it.
iteration
Expected ydvp iteration usage looks like this:
void foo_by_iter(iovec* v, unsigned n) { // usage «
ydvp i(v, n);
for ( ; i; ++i)
do_one_iovec(*i);
}
And this has essentially the same effect as this:
void foo_plain(iovec* v, unsigned n) {
for (unsigned i = 0; i < n; ++i)
do_one_iovec(v[i]);
}
The latter plain version is arguably simpler. So what's a practical reason to use ydvp? The packaging is better, and you can subclass ydvp to do something interesting (like represent subsets of an iovec). The advantage of packaging is subtle: As a single value instead of pair (iovec* iov, int len), now you can manipulate that single value very conveniently. For example, if you pass it to other methods, now you can ensure they stay together, paired with each other instead of other values. More conveniently, ydvp can be used as the left hand or right hand side of expressions, and you can overload as many binary operators as you like to take ydvp as a source or destination operand. Since þ uses operator<<() most commonly in this way, all the examples below will use this operator. For sake of argument, assume þ has this operator defined:
yo& operator<<(yo&, const iovec&); // iovec to yo «
On top of this, we can build the following operator:
yo& operator<<(yo&, const ydvp&);
Whose implementation could look like this:
yo& operator<<(yo& o, const ydvp& src) { // write src «
ydvp p(src, 0); // mutable copy we can modify
for ( ; p; ++p)
o << *p; // calls yo::owrite(const iovec&);
return o;
}
You can see more api for out stream yo on out when it's done. But just like this operator, we can define many other similar operators for any consumer of octet sequences. (Some might be able to process more than one iovec at once, in which case it's best to express intent this way with operator<<() instead of forcing a loop and explicitly sync i/o.)
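As a self-contained illustration of the packaging argument, the sketch below pairs a minimal iterator with operator<<() overloads. IovIter is a hypothetical stand-in for ydvp, and std::string stands in for the yo out stream; neither is þ api.

```cpp
#include <sys/uio.h>   // struct iovec (POSIX)
#include <cassert>
#include <string>

class IovIter {                    // hypothetical stand-in for ydvp
    const iovec* v_;               // first of n_ iovecs
    unsigned n_;                   // length of the vector
    unsigned i_;                   // current position
public:
    IovIter(const iovec* v, unsigned n) : v_(v), n_(n), i_(0) {}
    explicit operator bool() const { return i_ < n_; }   // more left?
    IovIter& operator++() { ++i_; return *this; }
    const iovec& operator*() const { assert(i_ < n_); return v_[i_]; }
};

// Append the bytes of one fragment to a string sink.
std::string& operator<<(std::string& o, const iovec& v) {
    o.append(static_cast<const char*>(v.iov_base), v.iov_len);
    return o;
}

// Append every fragment in turn. The iterator is taken by value, so the
// caller's copy is never advanced.
std::string& operator<<(std::string& o, IovIter p) {
    for ( ; p; ++p)
        o << *p;
    return o;
}
```

A caller can then write `sink << IovIter(v, n);` and the pointer and length travel together as one operand.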
building
Wil added several new ydvp methods to the api just for this demo, so usage examples can be shown with little code. This section shows appending new iovecs to an existing vector, which required adding new member p_x to distinguish max physical capacity from current p_n logical length. (It's the same distinction as that between physical size b_x in a buf and logical length v_n in a run.) The following sample builds a ydvp instance on top of a stack array, in the steps marked [1] through [5] in the comments:
int ydvp_build_test() { // test sample «
iovec temp[ 64 ]; // stack array of uninitialized iovecs [1]
ydvp iovp(temp, /*len*/ 0, /*max*/ 64); // empty so far [2]
// iovp now has capacity to hold up to 64 iovec instances
yv abcd("abcd"); // v_p="abcd", v_n=4: 4 octet iovec [3]
yv efghi("EFGHI"); // v_p="EFGHI", v_n=5: 5 octet iovec
iovec two; // an iovec to hold converted state of efghi above
two.iov_base = efghi.v_p; two.iov_len = efghi.v_n; // convert
iovp << abcd; // append abcd via yv::operator iovec() [4]
iovp << two; // append copy of two (already converted)
yout << iovp.quote(); // display iovp form and content [5]
yout << yendl << ynow; // write newline and flush yout
return 0;
}
The output of [5] appears below, and is discussed in the following section on quoting and debug printing. This output is untouched other than being run through yfdw::whtml() shown in the last demo (fd «). Proper indentation is handled by all object dump methods for better readability. Hex dumping is shown in the hex demo, and checksum generation is shown in the crc demo. (The topmost crc value shown here checksums all content regardless of form.)
<ydvp v=0xbfffeb2c n=2 i=0 x=64 sum=9 crc='0x7afd4a1f:9'>
<v0 p=0xa5b1c n=4 crc='0xed82cd11:4'>
00000: 61 62 63 64 ; abcd
</v0>
<v1 p=0xa5b24 n=5 crc='0xaa3b80b9:5'>
00004: 45 46 47 48 49 ; EFGHI
</v1></ydvp>
Code for ydvp::pdump() appears below in this column (cf »). Let's repeat the two sample lines actually appending new iovecs to the ydvp so we can show both the new api and actual code for building:
iovp << abcd; // append abcd via yv::operator iovec() [4]
iovp << two; // append copy of two (already converted)
Those two lines invoke two variants of padd() methods in the ydvp api, by way of operator<<() inlines right after the class declaration:
class ydvp { // ... continued «
public:
unsigned padd(yv const& v); // append 1
unsigned padd(iovec const& iov); // append 1
unsigned padd(ydvp const& iter); // append N (all of source iter)
}; // ydvp
inline ydvp& operator<<(ydvp& p, yv const& x) { p.padd(x); return p; }
inline ydvp& operator<<(ydvp& p, iovec const& x) { p.padd(x); return p; }
inline ydvp& operator<<(ydvp& p, ydvp const& x) { p.padd(x); return p; }
Notice we also support operator<<() for ydvp on the right hand side (rhs) and not just single iovecs. The two iovec versions are simple:
unsigned ydvp::padd(iovec const& iov) { // append «
unsigned n = p_n; // current count of used iovecs
if (p_x > n) { // more room in vector?
p_v[n] = iov; // copy iovec to new high
p_n = n + 1; // one closer to p_x maximum
return 1; // length increase
}
else { // no slots available: log message
ylog(1, "ydvp holds only x=%d iovecs", (int) p_x);
}
return 0; // none added: no length delta
}
unsigned ydvp::padd(yv const& v) {
iovec iov = v; // convert to iovec and add:
return this->padd(iov);
}
Adding all iovecs in another ydvp is just as trivial. We return the total count of iovecs actually added, in case a caller wants to know:
unsigned ydvp::padd(ydvp const& iter) {
unsigned n = p_n; // save for return value
iovec* v = iter.p_v; // 1st incoming iovec
iovec* x = v + iter.p_n; // one past last in v
for ( ; v < x; ++v) { // another iter iovec?
this->padd(*v); // append each separately
}
return p_n - n; // increase in length, if any
}
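The capacity rule in padd() can be exercised in isolation. The sketch below is an assumption-laden miniature: IovVec, iov_add(), and iov_add_all() are hypothetical names, and unlike the code above it stops at the first failed append instead of logging once per rejected iovec.

```cpp
#include <sys/uio.h>   // struct iovec (POSIX)
#include <cassert>

struct IovVec {        // hypothetical: vector, logical length, capacity
    iovec*   v;        // first slot (like p_v)
    unsigned n;        // slots in use (like p_n)
    unsigned x;        // physical capacity (like p_x)
};

// Append one iovec when a slot is free; returns the length increase.
unsigned iov_add(IovVec& d, const iovec& one) {
    assert(d.v || d.x == 0);       // a nil vector has no capacity
    if (d.n < d.x) {
        d.v[d.n++] = one;          // copy into the next free slot
        return 1;
    }
    return 0;                      // full: nothing added
}

// Append up to count iovecs from src; returns how many actually fit.
unsigned iov_add_all(IovVec& d, const iovec* src, unsigned count) {
    unsigned before = d.n;
    for (unsigned i = 0; i < count; ++i)
        if (!iov_add(d, src[i]))
            break;                 // no room left, so stop early
    return d.n - before;
}
```

Returning the count added, rather than a bool, lets a caller detect a partial append without comparing lengths before and after.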
Though sample code above shows an iovec vector allocated separately from ydvp, you can allocate them together using ydvp subclasses defined in ydv.h that include built-in iovec vectors. For example, here are the first two which have 64 and 256 iovecs inside:
class yd64vp : public ydvp { // ydvp with built-in 64 iovecs «
public: // sample ydvp with in-built space for use in tests
enum { pe_max = 64 }; // arbitrary built-in iovecs
iovec p_iovec[ pe_max ]; // space for p_v in ydvp
yd64vp() : ydvp(p_iovec, /*len*/ 0, /*max*/ pe_max) { }
yd64vp(yo& o, ydvpz const& pz, bool quo)
: ydvp(p_iovec, 0, pe_max) { this->prender(o, pz, quo); }
yd64vp(ydvpz const& pz)
: ydvp(p_iovec, 0, pe_max) { this->pcut(pz); }
~yd64vp() { }
ydvp pbegin() const { return ydvp(p_v, p_n); } // new iter
y0x* pend() const { return (y0x*) 0; }
}; // yd64vp
class yd256vp : public ydvp { // ydvp w/ built-in 256 iovecs «
public: // sample ydvp with in-built space for use in tests
enum { pe_max = 256 }; // arbitrary built-in iovecs
iovec p_iovec[ pe_max ]; // space for p_v in ydvp
yd256vp() : ydvp(p_iovec, /*len*/ 0, /*max*/ pe_max) { }
yd256vp(yo& o, ydvpz const& pz, bool quo)
: ydvp(p_iovec, 0, pe_max) { this->prender(o, pz, quo); }
yd256vp(ydvpz const& pz)
: ydvp(p_iovec, 0, pe_max) { this->pcut(pz); }
~yd256vp() { }
ydvp pbegin() const { return ydvp(p_v, p_n); } // new iter
y0x* pend() const { return (y0x*) 0; }
}; // yd256vp
These ydvp subclasses with in-built iovec vectors are used by slice methods (see left column) to allocate temporary stack iovec vectors when building the subset described by a slice. Which version gets used depends on size of the original iovec vector — a slice will never be longer than the original, which helps explain use of ydvpz::z256() (and related helper methods) which allocates an instance of yd256vp when slicing ydvp with size under 256 (cf «). The inherited prender() call in the yd256vp constructor also appears column left (cf «).
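One alternative way to express "a vector with built-in slots" is a class template parameterized on capacity. This is only a sketch of the idea under that assumption: InlineIovVec is a hypothetical name, not something in ydv.h, and it does not derive from ydvp.

```cpp
#include <sys/uio.h>   // struct iovec (POSIX)
#include <cassert>

template <unsigned N>
class InlineIovVec {               // hypothetical; storage lives inside
    iovec    slots_[N];            // built-in space for N iovecs
    unsigned n_;                   // slots in use
public:
    InlineIovVec() : n_(0) {}
    static unsigned capacity() { return N; }
    unsigned size() const { return n_; }
    const iovec* data() const { return slots_; }
    const iovec& at(unsigned i) const { assert(i < n_); return slots_[i]; }
    bool add(const iovec& one) {   // false when every slot is taken
        if (n_ == N) return false;
        slots_[n_++] = one;
        return true;
    }
};
```

A stack instance such as `InlineIovVec<64> tmp;` costs no heap allocation, which is the same property the yd64vp and yd256vp classes above provide for slicing.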
quoting
Wil likes how easy it is to write iovecs to an output stream using earlier code. But Wil also wants to be able to write a debug description of iovec vector structure to an out stream. So given an instance of ydvp named src, Wil wants to debug print it with this syntax (see quote for more on quoting):
o << src.quote(); // debug print src
To make this work, Wil needs a debug printing ydvp method named pdump() and a new type for overloading operator<<() to call pdump(). So Wil adds the following to the class api:
class ydvp { // extending iovec vector iter api: «
// ...
struct Pq { ydvp const& q_p; Pq(ydvp const& p): q_p(p) { } };
Pq quote() const { return Pq(*this); } // to request dump
void pprint() const; // calls pdump(yout) (for gdb)
void pdump(yo& o) const; void pcite(yo& o) const;
}; // ydvp
inline yo& operator<<(yo& o, ydvp::Pq const& x) {
x.q_p.pdump(o); return o;
}
Of course, Wil did not try to make this look pretty. Partly this is because it's boilerplate: virtually every þ class worth printing has something like this in it. By convention, þ nested classes start uppercase with the same prefix as the methods. Also by convention, the quote wrapper class is just named q after the leading uppercase prefix. The quote() method simply does an inline wrap of the object's reference, so sizeof(Pq) is just the size of a pointer: sizeof(ydvp*). A compiler needn't even really construct an instance of struct Pq when the only thing that happens is passing the q_p ref to another method. The whole thing is just a dance with types to make overloading work. And because it's for debug printing, Wil doesn't care whether a compiler does indeed optimize the code; it's not going to matter. All that matters is how easily Wil can print what he wants with least effort. And it's nice this support is easy to clone from class to class. The pcite() and pprint() methods are also shown because they're always present when a dump() method is defined. By convention, cite() is always one line, and print() is for use under gdb (writing to stdout by one or another means) which calls dump(). Next is code to dump or cite; fine detail might be unclear until the crc, hex, out, or quote demos. Sample output appears above (cf «).
void ydvp::pdump(yo& o) const { // debug print ydvp «
if (p_v && p_n) { // nonempty?
yh32 h; h << *this; // crc via ydvp::pcrc()
o.oft("<ydvp v=%#lx n=%u i=%u x=%u sum=%u crc='%#lx:%lu'>",
(long) p_v, (int) p_n, (int) p_i, (int) p_x,
(int) this->psum(), (long) h.hcrc(), (long) h.hlen());
char vtag[ 32 ]; // for "v%u" format in sprintf()
n32 base = 0; // cumulative base offset
for (unsigned i = 0; i < p_n; i++) { // hex dump each iov
sprintf(vtag, "v%u", i); // include iov index in tag
o.on(); // newline then indent to tab depth
yv fragment(p_v[i]); // yv::yv(iovec const&)
fragment.vshow(o, vtag, /*maxlen*/ 0xffff, base);
base += fragment.v_n; // advance hex offset label
}
o.ouend("ydvp"); // out untab end (tag) writes "</ydvp>"
}
else // nil vec or zero length
this->pcite(o); // use the one line format on empty
}
Obviously this takes the yo api in out for granted. Think of yo::of() as fprintf(), with a variant named yo::oft() increasing indent level by one afterward (here t means tab). Dumping each iovec is done by yv::vshow() in the hex demo. The crc is generated by ydvp::pcrc() shown in the next section below; however, that uses api from the crc demo. But first let's show implementations of pcite() and pprint() (which uses the standard yout stream from the out demo).
void ydvp::pprint() const { // print under gdb «
yout << yendl; this->pdump(yout); yout << yendl << ynow;
}
void ydvp::pcite(yo& o) const { // one line debug print «
yh32 h; h << *this; // crc via ydvp::pcrc()
o.of("<ydvp v=%#lx n=%u i=%u x=%u sum=%u crc='%#lx:%lu'/>",
(long) p_v, (int) p_n, (int) p_i, (int) p_x,
(int) this->psum(), (long) h.hcrc(), (long) h.hlen());
}
Unlike a full debug print by pdump() invoked by a line like this:
o << src.quote(); // debug print src
... a one line print by pcite() is requested with this syntax:
o << ycite(src); // 1 line print «
which is based on template ycite() in mu.h and another operator<<() inline in ydv.h overloading yct<ydvp> as a rhs source:
template <typename T> struct yct { // mu.h: yct - 'cite' template «
T const& c_t; // a T wrapper requesting cite rather than dump
yct(T const& t) : c_t(t) { } // just capture pointer value &t
};
template <typename T> yct<T> ycite(T const& t) { return yct<T>(t); } // «
inline yo& operator<<(yo& o, yct<ydvp> const& x) { // ydv.h:
x.c_t.pcite(o); return o; // one line debug print of ydvp
}
This is clearly rather baroque as a means of making ycite(src) work as a rhs value for overloaded operators. But since only one line of inline C++ is needed per class to call a cite() method, this is quite terse — as boilerplate goes — even if the ycite() template looks obtuse.
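The same dance with types works with standard streams, which may make the pattern easier to see in isolation. In this sketch Point, Cite, cite(), and to_cite_string() are all hypothetical names, and std::ostream stands in for yo.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

struct Point {                         // hypothetical class worth printing
    int x, y;
    struct Quote { const Point& p; };  // wrapper type requesting a full dump
    Quote quote() const { return Quote{*this}; }
    void dump(std::ostream& o) const { // multi-line form
        o << "<point>\n  x=" << x << "\n  y=" << y << "\n</point>";
    }
    void cite(std::ostream& o) const { // one-line form
        o << "<point x=" << x << " y=" << y << "/>";
    }
};

// Generic wrapper requesting the one-line form of any T.
template <typename T> struct Cite { const T& t; };
template <typename T> Cite<T> cite(const T& t) { return Cite<T>{t}; }

// One inline per wrapper type selects which method gets called.
inline std::ostream& operator<<(std::ostream& o, const Point::Quote& q) {
    q.p.dump(o); return o;
}
inline std::ostream& operator<<(std::ostream& o, const Cite<Point>& c) {
    c.t.cite(o); return o;
}

// Convenience helper: the one-line form as a string.
std::string to_cite_string(const Point& p) {
    std::ostringstream s;
    s << cite(p);
    return s.str();
}
```

The wrappers hold only a reference, so they cost nothing to pass around; their sole job is to give overload resolution a distinct type to match.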
crc
Here we use the yh32 api in the crc demo to find the crc32 hash of ydvp content in the entire sequence. The ydvp::pcrc() method is used by another operator<<() inline as shown below.
void ydvp::pcrc(yh32& crc) const { // crc of each iovec «
for (unsigned i = 0; i < p_n; i++)
crc << p_v[i]; // inline yh32::hadd(iovec const&);
}
Did you think it would be more complex? The inline operator<<() methods from ydv.h and yh32.h are defined as follows:
inline yh32& operator<<(yh32& h, ydvp const& x) { x.pcrc(h); return h; }
Below is the subset of yh32's api in the crc demo related to iovec. (This is taking quite a lot from that demo, but at least this is clear.)
class yh32 { // crc32 based on zlib's ::crc32() «
private:
u32 h_len; // number of bytes added to crc
u32 h_crc; // resulting crc from applying crc32() to h_len bytes
void _hinit() { h_len = 0; h_crc = (u32) ::crc32(0L, Z_NULL, 0); }
public:
yh32() { _hinit(); }
u32 hlen() const { return h_len; }
u32 hcrc() const { return h_crc; }
void hadd(iovec const& v) {
n32 n = v.iov_len;
h_crc = ::crc32(h_crc, (Bytef*) v.iov_base, n); h_len += n;
}
yh32& operator<<(iovec const& v) {
if (v.iov_base && v.iov_len) hadd(v);
return *this;
}
}; // yh32
Note some processors have built-in crc32 instructions — you should use them when available. Otherwise code from zlib is good.
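For readers without zlib at hand, the sketch below shows why checksumming fragment by fragment gives the same result as checksumming the contiguous bytes. It swaps in a slow bitwise CRC-32 (reflected polynomial 0xEDB88320) in place of zlib's table-driven ::crc32(); the names crc32_update() and crc32_iov() are hypothetical.

```cpp
#include <sys/uio.h>   // struct iovec (POSIX)
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32. Pass 0 as the initial crc, then feed each result back
// in to continue over more bytes, the same chaining style as zlib.
std::uint32_t crc32_update(std::uint32_t crc, const void* data, std::size_t len) {
    const unsigned char* p = static_cast<const unsigned char*>(data);
    crc = ~crc;                               // undo the final inversion
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= p[i];
        for (int k = 0; k < 8; ++k)           // one bit at a time
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return ~crc;
}

// CRC of all content in an iovec vector, skipping empty fragments.
std::uint32_t crc32_iov(const iovec* v, unsigned n) {
    std::uint32_t crc = 0;
    for (unsigned i = 0; i < n; ++i)
        if (v[i].iov_base && v[i].iov_len)
            crc = crc32_update(crc, v[i].iov_base, v[i].iov_len);
    return crc;
}
```

Because the running crc carries all state between calls, how the bytes are split into iovecs has no effect on the final value; only their order matters.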
writing
Here Wil shows an odd but powerful member of ydv.h iovec classes (all first drafted in this form in 2004). This version of ydvo hasn't changed as many times as other classes in this demo. So ydvo is little integrated with other þ classes like yo in the out demo. It writes space described by ydvp using an api like yo — but it's not a yo subclass. (The out demo might include a class named ydvpo that is a yo subclass closely resembling ydvo shown here. If nothing else, you should get an idea Wil sees many ways to do a thing with small variations.) Before api and code for ydvo, let's first show an example of intended use, with the actual output written by this sample code. What you'll see is this:
The interesting part? Source and destination iovec vectors will share neither total length nor single iovec sizes, so it's scatter-gather to scatter-gather.
static const char* lo = "abcdefghijklmnopqrstuvwxyz";
static const char* hi = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
yv vlo(lo); yv vhi(hi); // both C strings as yv runs
yassert(vlo.v_n == 26 && vhi.v_n == 26); // must be so
Those are the lower and uppercase octet strings; vhi is the source sliced below. Let's fill a destination buffer with bytes from vlo:
u8 tmp1K[1024]; yb b1K(tmp1K, 0, 1024); // 1K on stack «
b1K << vlo;
yout << b1K.quote() << yendl << ynow;
The last line's debug print looks like this. (This part of the demo shortens hex characters per line to 12 from 16 shown earlier in this demo, to use fewer horizontal columns.)
<yb p=0xbffff228 n=26 x=1024 crc='#4c2750bd:26'>
00000: 61 62 63 64 65 66 67 68 69 6a 6b 6c ; abcdefghijkl
0000c: 6d 6e 6f 70 71 72 73 74 75 76 77 78 ; mnopqrstuvwx
00018: 79 7a ; yz
</yb>
Now we make a destination ydvp for iovecs describing slices of this buffer, so we can write them below with ydvo:
yd64vp pdest; // where we plan to overwrite «
// slices: "cdefg" and "jklmn" and "stuvw"
pdest << b1K(2, 5) << b1K(9, 5) << b1K(18, 5);
yout << pdest.quote() << yendl << ynow;
Dumping pdest prints the following on stdout. Note memory addresses for slices correspond to spots in the buffer above.
<ydvp v=0xbffff858 n=3 i=0 x=64 sum=15 crc='0x44844fea:15'>
<v0 p=0xbffff22a n=5 crc='0x299abc5:5'>
00000: 63 64 65 66 67 ; cdefg
</v0>
<v1 p=0xbffff231 n=5 crc='0xc2138302:5'>
00005: 6a 6b 6c 6d 6e ; jklmn
</v1>
<v2 p=0xbffff23a n=5 crc='0x7933ca9d:5'>
0000a: 73 74 75 76 77 ; stuvw
</v2></ydvp>
The source ydvp comes next, using iovec sizes that are not the same size as those in pdest shown above. Note we feel free to use iovecs slicing space in immutable static C strings, since we won't be writing them.
yd64vp psrc; // where we plan to read input «
// slices: "UVWXYZ" and "MNOPQR" and "EFGHIJ"
psrc << vhi(20, 6) << vhi(12, 6) << vhi(4, 6);
yout << psrc.quote() << yendl << ynow;
Dumping psrc shows the following internal ydvp structure:
<ydvp v=0xbffff640 n=3 i=0 x=64 sum=18 crc='0x69ddf01f:18'>
<v0 p=0xa6a1c n=6 crc='0x8e601468:6'>
00000: 55 56 57 58 59 5a ; UVWXYZ
</v0>
<v1 p=0xa6a14 n=6 crc='0x6cc9642f:6'>
00006: 4d 4e 4f 50 51 52 ; MNOPQR
</v1>
<v2 p=0xa6a0c n=6 crc='0xf61c77ab:6'>
0000c: 45 46 47 48 49 4a ; EFGHIJ
</v2></ydvp>
The next two lines create a ydvo out stream writing to pdest, then write the source iovecs to the stream. After that, both destinations are dumped: the iovec vector and the buffer into which it points.
ydvo odest(pdest); // writes to pdest «
odest << psrc;
yout << pdest.quote() << yendl << ynow;
yout << b1K.quote() << yendl << ynow;
Note how the six byte iovecs in the source flow into the five byte iovecs of the destination. As you can see in output for b1K, the previous lowercase content is untouched outside what was described by pdest. Because the source was larger than the destination, the last three bytes of the source were just ignored since no place could be found to put them.
<ydvp v=0xbffff858 n=3 i=0 x=64 sum=15 crc='0xa7f55aaf:15'>
<v0 p=0xbffff22a n=5 crc='0xd191e1e1:5'>
00000: 55 56 57 58 59 ; UVWXY
</v0>
<v1 p=0xbffff231 n=5 crc='0xbb710263:5'>
00005: 5a 4d 4e 4f 50 ; ZMNOP
</v1>
<v2 p=0xbffff23a n=5 crc='0x9a9fa5d4:5'>
0000a: 51 52 45 46 47 ; QREFG
</v2></ydvp>
<yb p=0xbffff228 n=26 x=1024 crc='#be7b05a4:26'>
00000: 61 62 55 56 57 58 59 68 69 5a 4d 4e ; abUVWXYhiZMN
0000c: 4f 50 6f 70 71 72 51 52 45 46 47 78 ; OPopqrQREFGx
00018: 79 7a ; yz
</yb>
The most obvious use of ydvo out streams? Updating content in pre-staged layouts, touching nothing outside ydvp iovec vector stencils.
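The flow of six byte source iovecs into five byte destination iovecs can be reproduced without any þ classes. This is a minimal sketch, not the ydvo implementation: iov_copy() is a hypothetical free function, and like ydvo it assumes source and destination bytes never overlap.

```cpp
#include <sys/uio.h>   // struct iovec (POSIX)
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy bytes from the src fragments into the dst fragments, in order,
// until either side runs out. Returns the number of bytes copied.
std::size_t iov_copy(const iovec* dst, unsigned ndst,
                     const iovec* src, unsigned nsrc) {
    std::size_t total = 0;
    std::size_t doff = 0, soff = 0;      // offsets inside current fragments
    unsigned d = 0, s = 0;               // current fragment indexes
    while (d < ndst && s < nsrc) {
        std::size_t room = dst[d].iov_len - doff;
        std::size_t have = src[s].iov_len - soff;
        if (room == 0) { ++d; doff = 0; continue; }   // dst fragment full
        if (have == 0) { ++s; soff = 0; continue; }   // src fragment drained
        std::size_t quantum = room < have ? room : have;
        std::memcpy(static_cast<char*>(dst[d].iov_base) + doff,
                    static_cast<const char*>(src[s].iov_base) + soff,
                    quantum);
        doff += quantum; soff += quantum; total += quantum;
    }
    return total;
}
```

Run against the same slices as the demo above, it copies 15 of the 18 source bytes and leaves every byte outside the destination stencil untouched.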
ydvo
Here's code and interface for ydvo:
class ydvo { // iovec vector sink (vaguely resembling yo) «
protected:
// o_base and o_len imitate two members in ydv:
iovec* o_base; // first of o_len iovec's in an array
size_t o_len; // length of contiguous o_base array of iovecs
// these members resemble those in yo:
iovec* o_vec; // the iovec used for o_0, o_p, and o_x:
u8* o_0; // origin: start of iovec
u8* o_p; // pointer inside iovec
u8* o_x; // one past last writable byte in iovec
n32 o_actual; // actual bytes written
public:
ydvo(yv const& one);
ydvo(iovec const& one);
ydvo(const iovec* base, size_t sz);
ydvo(ydvp const& iter);
int oindex() const { return o_vec - o_base; } // iovec in the array
n32 osize() const { return o_actual; }
n32 oactual() const { return o_actual; }
The actual number of bytes written to ydvo is returned by oactual(), but it's seldom up-to-date until after oflush() is called. No constructor takes a ydvpz slice because we don't want to generate space for the copy here. So presumably a caller will use one of the ydvp subclasses with a constructor calling prender() to convert a slice to a new iterator.
void _oc(int c);
bool _onext(); // advance to next iovec in array if one remains
Methods starting with underscores are usually private and internal. These internal methods are exposed as relatively harmless. Note ydvo::_oc() strongly resembles virtual method yo::_oc() in the out demo, which handles contiguous buffer exhaustion.
public:
void oflush(); // ensure o_actual is accurate «
n32 owrite(ydvp const& src); // append
n32 owrite(const void* base, size_t sz); // append
n32 ov(yv const& v) { return this->owrite(v.v_p, v.v_n); }
void ov(iovec const& v) { this->owrite(v.iov_base, v.iov_len); }
void os(const char* s) { yv v(s); owrite(v.v_p, v.v_n); } // append
void oc(int c) { if (o_p < o_x) *o_p++ = (u8) c; else _oc(c); } // «
Writing one byte is done by inline ydvo::oc(), which is important for the following reason: the design of ydvo revolves around this inline. Reading and writing bytes by inline pointer bumping is the main purpose for describing buffers by origin and max pointers in all þ stream classes. (It was one of the lessons learned from horrible Taligent i/o performance: primitive operations should not be virtual.)
ydvo& operator<<(iovec const& v) {
this->owrite(v.iov_base, v.iov_len); return *this;
}
ydvo& operator<<(yv const& v) {
this->owrite(v.v_p, v.v_n); return *this;
}
ydvo& operator<<(int c) {
if (o_p < o_x) *o_p++ = (u8) c; else this->_oc(c);
return *this;
}
struct Oq { ydvo const& q_o; Oq(ydvo const& o): q_o(o) { } };
Oq quote() const { return Oq(*this); } // to request dump
void oprint() const; void odump(yo& o) const; void ocite(yo& o) const;
public: // testing
// ydvo::otest() returns error count; zero is success
static int otest(); // (int argc, const char** argv);
}; // class ydvo
Convenience operator<<() inlines follow the class:
inline ydvo& operator<<(ydvo& o, y1now const&) { o.oflush(); return o; }
inline ydvo& operator<<(ydvo& o, ydvp const& x) { o.owrite(x); return o; }
inline ydvo& operator<<(ydvo& o, const char* s) { o.os(s); return o; }
inline ydvo& operator<<(ydvo& o, int c) { o.oc(c); return o; }
inline yo& operator<<(yo& o, ydvo::Oq const& x) {
x.q_o.odump(o); return o; }
inline yo& operator<<(yo& o, yct<ydvo> const& x) {
x.c_t.ocite(o); return o; }
The constructors have little substance. Each starts by populating the buffer with the first iovec, so origin o_0 points at the first byte and max o_x points one after the last. Pointer o_p starts at the origin and works toward o_x as a cursor. When the current buffer is exhausted, ydvo::_onext() loads the next iovec if any.
ydvo::ydvo(yv const& single) : o_base(0), o_len(0), // «
o_vec(0), o_0(0), o_p(0), o_x(0), o_actual(0) {
o_0 = o_p = (u8*) single.v_p;
o_x = o_0 + single.v_n;
}
ydvo::ydvo(iovec const& single) : o_base(0), o_len(0),
o_vec(0), o_0(0), o_p(0), o_x(0), o_actual(0) {
o_0 = o_p = (u8*) single.iov_base;
o_x = o_0 + single.iov_len;
}
ydvo::ydvo(const iovec* base, size_t ln)
: o_base((iovec*) base), o_len(ln),
o_vec((iovec*) base), o_0(0), o_p(0), o_x(0), o_actual(0) {
if (o_base && o_len) {
iovec v = o_base[0];
o_0 = o_p = (u8*) v.iov_base;
o_x = o_0 + v.iov_len;
}
else
o_len = 0;
}
ydvo::ydvo(ydvp const& vec)
: o_base(vec.p_v), o_len(vec.p_n),
o_vec(vec.p_v), o_0(0), o_p(0), o_x(0), o_actual(0) {
if (o_base && o_len) {
iovec v = o_base[0];
o_0 = o_p = (u8*) v.iov_base; // 1st in v
o_x = o_0 + v.iov_len; // one past last in v
}
else
o_len = 0;
}
Writing ydvp just writes each iovec in turn:
n32 ydvo::owrite(const ydvp& src) {
ydvp p(src, 0); // new copy we can modify «
uint32_t outActual = 0;
for ( ; p; ++p) {
const iovec& v = *p;
outActual += this->owrite(v.iov_base, v.iov_len);
}
return outActual;
}
Each inbound iovec is written by owrite() which loops over input fragments, advancing the destination iovec buffers with _onext() as necessary each time one is exhausted. When iovecs have substantial length, all the time spent using ydvo appears inside the call to ::memcpy(), which actually moves the bytes involved. (Note the assumption source and destination bytes will never overlap; otherwise we'd have to use ::memmove() instead.) In other words, as complex as code in this demo might look, time accrues mainly to moving bytes in ::memcpy(), and all other decision-making costs little. (This does have far-reaching consequences when you think about it.)
n32 ydvo::owrite(const void* base, size_t len) { // «
u32 outActual = 0; // bytes actually written
const u8* src = (const u8*) base;
const u8* end = src + len; // one beyond last byte to read
if (src && len) { // anything to read?
while (src < end) { // have not read everything yet?
u32 quantum = end - src; // bytes remaining
u32 room = o_x - o_p; // capacity in this iovec
if (!room) { // need to advance?
if (!this->_onext()) // no more capacity?
return outActual;
else
room = o_x - o_p; // capacity in new iovec
}
if (quantum > room) // more than fits here?
quantum = room; // max for this iovec
::memcpy(o_p, src, quantum); // copy into iovec
o_p += quantum; // note: o_actual changes in _onext()
src += quantum; // this many fewer bytes to read
outActual += quantum; // sum bytes written this call
}
}
return outActual;
}
Advancing to the next iovec in the destination occurs in _onext() which begins with an inline flush to keep o_actual correct: it counts the number of bytes by which o_p has ever been advanced ahead of origin o_0. (This occurs a byte at a time in inline ydvo::oc(), which doesn't count bytes since it can be put off.)
bool ydvo::_onext() { // go to next iovec if one exists «
if (o_p > o_0) { // flush to actual?
o_actual += o_p - o_0;
}
o_p = o_0 = o_x = 0; // mark buffer empty
if (o_base) { // another iovec?
iovec* vx = o_base + o_len; // end: one past last
while (++o_vec < vx) { // another iovec to see?
iovec v = *o_vec;
if (v.iov_base && v.iov_len) { // non-empty?
o_0 = o_p = (u8*) v.iov_base; // 1st in v
o_x = o_0 + v.iov_len; // one past last in v
return true; // success
}
}
}
return false; // exhausted storage capacity
}
When writing a single byte via inline oc() exhausts the current buffer, the fallback is _oc() below, which resembles the flushbuf primitive in standard C library macros, except _onext() above actually replenishes the buffer. The cost of this call is amortized over the many times oc() is called without exhausting the buffer, when all the i/o is done one byte at a time.
void ydvo::_oc(int byte) { // one octet only «
if (o_p < o_x)
*o_p++ = (u8) byte;
else {
if (this->_onext() && o_p < o_x)
*o_p++ = (u8) byte;
}
}
Flushing ydvo resembles flushing a ybo out stream writing to a yb buffer (see out for ybo and buf for yb): the final tally in size isn't right until the effect of bytewise writes can be gauged by measuring how far the pointer of cursor o_p has moved from origin o_0. The subtle line is updating o_0 to match o_p so those bytes won't be counted again by a later flush.
void ydvo::oflush() { // ensure o_actual is accurate «
// compare this to flush at start of _onext()
if (o_p > o_0) {
o_actual += o_p - o_0;
o_0 = o_p; // don't flush this part again
}
}
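The interplay of bytewise writes, deferred counting, and flush can be seen in a stripped-down sink. This sketch is loosely modeled on ydvo but is not its code: IovSink and its member names are hypothetical, and put() reports failure with a bool where oc() silently drops the byte.

```cpp
#include <sys/uio.h>   // struct iovec (POSIX)
#include <cassert>
#include <cstddef>

class IovSink {                        // hypothetical miniature of ydvo
    const iovec* vec_;                 // current iovec
    const iovec* end_;                 // one past the last iovec
    unsigned char* o0_;                // origin: first uncounted byte
    unsigned char* op_;                // cursor inside current iovec
    unsigned char* ox_;                // one past last writable byte
    std::size_t actual_;               // bytes counted so far

    void load(const iovec& v) {
        o0_ = op_ = static_cast<unsigned char*>(v.iov_base);
        ox_ = o0_ + v.iov_len;
    }
    bool next() {                      // move to next non-empty iovec
        flush();                       // count what was written here first
        while (vec_ != end_ && ++vec_ != end_) {
            if (vec_->iov_base && vec_->iov_len) { load(*vec_); return true; }
        }
        o0_ = op_ = ox_ = nullptr;     // mark buffer empty
        return false;                  // capacity exhausted
    }
public:
    IovSink(const iovec* v, unsigned n)
        : vec_(v), end_(v + n), o0_(nullptr), op_(nullptr), ox_(nullptr),
          actual_(0) {
        if (n && v[0].iov_base) load(v[0]);
    }
    bool put(int c) {                  // pointer bump on the fast path
        if (op_ == ox_ && !next()) return false;
        *op_++ = static_cast<unsigned char>(c);
        return true;
    }
    void flush() {                     // make actual() accurate
        actual_ += static_cast<std::size_t>(op_ - o0_);
        o0_ = op_;                     // don't count these bytes again
    }
    std::size_t actual() const { return actual_; }
};
```

As with oactual(), the count returned by actual() lags behind bytewise writes until flush() runs or the sink moves to another iovec.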
"Did I mention your naming scheme sucks?" Dex shot. "Not yet," Wil said. "I thought you forgot." |