

11sep09 « showing

reading and writing «

     Below you can find cursory versions of utilities for doing i/o: reading and writing in ways convenient for ease in testing. Typically these merely clone similar classes in thorn demos, but using slightly more agreeable names.

     Note this i/o code was going to appear on the main page as part of unit testing; but then printing support seemed a distraction. Obviously printing is easy, and deciding what to print is more interesting, usually. Code below shows how you can print: basically trivial effects with tedious api.

     For now, none of the i/o below is async. Also, none of the i/o is exotic, or optimized for a particular purpose. Printing for evidence purposes is assumed slow, in general, so ease and brevity in coding is usually given priority over better speed; minimizing code pain trumps speed bikeshedding.

11sep09 « purple

color emphasis «

     You can ignore code below shown in a smaller font with purple hue; it's included only for completeness. Only code in a larger font is described.

12sep09 « unlikely

     The following mu_unlikely() macro invokes a gcc builtin to hint that the tested condition is expected to be false. The macro yields the same zero or nonzero truth value as the bare test, while telling the compiler to favor the other branch.

#define mu_unlikely(x) __builtin_expect((x)!=0,0)

     This macro is often used in tests for hopefully rare error conditions.
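     As a minimal sketch of that usage: the names demo_unlikely and demo_sum below are hypothetical, and the fallback for compilers without the builtin is my assumption, not part of the original macro.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: fall back to a plain test on compilers lacking the builtin.
#if defined(__GNUC__) || defined(__clang__)
#define demo_unlikely(x) __builtin_expect((x) != 0, 0)
#else
#define demo_unlikely(x) ((x) != 0)
#endif

// Hypothetical helper: sums n bytes, treating a null pointer as a rare error.
inline long demo_sum(const unsigned char* p, std::size_t n) {
    if (demo_unlikely(p == 0)) {  // hint: this branch is rarely taken
        return -1;
    }
    long sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += p[i];
    return sum;
}
```

     The hint changes only how the compiler lays out branches; results are the same with or without it.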

12sep09 « iovecs

Iov «

     You can see more of the Iov clone of struct iovec on the main page, but the basic idea is simple: pointer plus length describes a single contiguous memory fragment at address v_p with v_n length in bytes:

struct Iov { // C++ class with all public members
  u8* v_p; // Base address.
  u32 v_n; // Length.

  Iov() : v_p(0), v_n(0) { }
  Iov(const void* p, u32 n) : v_p((u8*) p), v_n(n) { }

  explicit Iov(const char* cstr) // null terminated C string
  : v_p((u8*) cstr), v_n((cstr)? ::strlen(cstr) : 0) { }

  Iov& operator=(const char* cstr) {
    v_p = (u8*) cstr;
    v_n = (cstr)? ::strlen(cstr) : 0;
    return *this;
  }
  // ... more api elided
};

     Note Iov is interchangeable with iovec, and sometimes I cast from one to another after asserting their sizes are equal.
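     When a cast feels too trusting, a field by field copy works regardless of layout. The sketch below is only an illustration: DemoIov and demo_to_iovec are hypothetical names, and DemoIov copies just the two fields of Iov shown above. (My assumption here is that equal sizes alone need not prove the length fields line up, for example where size_t is wider than a 32-bit v_n.)

```cpp
#include <cassert>
#include <cstdint>
#include <sys/uio.h>  // struct iovec (POSIX)

// Hypothetical stand-in for Iov, reduced to its two fields.
struct DemoIov {
    std::uint8_t* v_p;  // base address
    std::uint32_t v_n;  // length in bytes
};

// Copy field by field instead of casting, so layout does not matter.
inline iovec demo_to_iovec(const DemoIov& v) {
    iovec out;
    out.iov_base = v.v_p;
    out.iov_len = v.v_n;
    return out;
}
```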

11sep09 « escape for html

filter «

     Nearly all C++ code shown on this site was run through a filter (shown below) which escapes three meta characters in HTML (<, >, and &) so C++ source code does not look like HTML markup. Of course, it's easy to do this by hand—but also very tedious for hundreds or thousands of lines.

     The pig test app shown on main executes the following block of code when "--" (two minuses) appears as a command line argument:

FdWrap wout(STDOUT_FILENO); // 1
FdWrap win(STDIN_FILENO); // 0
i32 actual = wout.whtml(win);
fprintf(stderr, "escaped: html=%d\n", (int) actual);

     Class FdWrap here is a file descriptor wrapper, cloning yfdw found in the fd demo under thorn. What's the basic idea? By wrapping file descriptor integers in a struct, you can attach file specific behavior to integers in a typesafe manner. (Parts of this also resemble the escape demo.)

struct FdWrap { // file descriptor wrapper
  int w_fd; // file descriptor

  FdWrap() : w_fd(-1) { } // invalid
  explicit FdWrap(int fd) : w_fd(fd) { } // as you say

  operator bool() const { return w_fd >= 0; } // valid
  int wfd() const { return w_fd; } // getter

  // whtml() calls wescape() with a map for html escape
  i32 whtml(FdWrap const& src) const;

  // wescape() writes src to w_fd, escaping mapped octets
  i32 wescape(const char* map[], const void* p, u32 n) const;
  i32 wescape(const char* map[], FdWrap const& src) const;

     The rest of the FdWrap api shown next can be ignored.

i32 wescape(const char* map[], Iov const& src) const; i32 wescape(const char* map[], iovec const& src) const; enum We_fd { // enums for standard fd's we_stdin = STDIN_FILENO, // 0 // unistd.h we_stdout = STDOUT_FILENO, // 1 // unistd.h we_stderr = STDERR_FILENO // 2 // unistd.h }; int wprintf(const char* fmt, ...) const; // cf printf() // werrno() logs errno for fun(w_fd, n), returns zero ssize_t werrno(const char* fun, int n) const; enum We_limit { // system related limits for api // class local enum for UIO_MAXIOV (4? 8? 16?) we_maxiov = 4 /*most iovecs at once (8? 16?)*/ }; ssize_t wput(IovPtr const& src) const; ssize_t wput(IovPtrCut const& src) const; ssize_t wput(const struct iovec *iov, int n) const; ssize_t wput(Iov const& src) const; ssize_t wput(iovec const& src) const; ssize_t wput(const char* cstr) const; // C string int wopen(const char *path, int flags, mode_t mode) { w_fd = ::open(path, flags, mode); // see: 'man 2 open' return w_fd; } int wcreat(const char *path, mode_t mode) { w_fd = ::creat(path, mode); // see: 'man 2 creat' return w_fd; } int wclose() { int d = w_fd; w_fd = -1; return ::close(d); } // see: man 2 close int wfcntl(/*int fd,*/ int cmd, int arg) { return ::fcntl(w_fd, cmd, arg); } // see: man 2 fcntl ssize_t wread(/*int d,*/ void *buf, size_t nbytes) { return ::read(w_fd, buf, nbytes); } // see: man 2 read ssize_t wreadv(/*int d,*/ const struct iovec *iov, int n) { return ::readv(w_fd, iov, n); } // see: man 2 readv ssize_t wpread(/*int d,*/ void *buf, size_t sz, off_t p) { return ::pread(w_fd, buf, sz, p); } // see: man 2 pread ssize_t wwrite(const void *buf, size_t nbytes) { return ::write(w_fd, buf, nbytes); } // see: man 2 write ssize_t wwritev(const struct iovec *iov, int n) { return ::writev(w_fd, iov, n); } // see: man 2 writev ssize_t wpwrite(const void *buf, size_t sz, off_t p) { return ::pwrite(w_fd, buf, sz, p); } // see: man 2 pwrite enum We_whence { // wlseek() vals for whence we_set = SEEK_SET, we_cur = SEEK_CUR, we_end 
= SEEK_END }; off_t wlseek(/*int fildes,*/ off_t offset, int whence) { return ::lseek(w_fd, offset, whence); } // cf man 2 lseek };

     Above you can see this pattern: any system api taking a file descriptor integer can be trivially added to this wrapper class as an inline.
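     To illustrate the pattern in isolation, here is a sketch with a hypothetical DemoFd wrapper (not the FdWrap above) that adds dup() as one more inline, then round trips three bytes through a pipe. All names starting with demo or Demo are mine.

```cpp
#include <cassert>
#include <unistd.h>  // pipe, read, write, close, dup

// Hypothetical mini wrapper (not the FdWrap above) showing the pattern.
struct DemoFd {
    int d_fd;
    explicit DemoFd(int fd) : d_fd(fd) { }
    bool valid() const { return d_fd >= 0; }
    ssize_t dwrite(const void* b, size_t n) { return ::write(d_fd, b, n); }
    ssize_t dread(void* b, size_t n) { return ::read(d_fd, b, n); }
    int ddup() const { return ::dup(d_fd); }  // one more call, one more inline
    int dclose() { int d = d_fd; d_fd = -1; return ::close(d); }
};

// Round trip three bytes through a pipe using only the wrapper.
inline bool demo_pipe_roundtrip() {
    int fds[2];
    if (::pipe(fds) != 0) return false;
    DemoFd in(fds[0]), out(fds[1]);
    char got[3] = { 0, 0, 0 };
    bool ok = out.dwrite("pig", 3) == 3 && in.dread(got, 3) == 3
        && got[0] == 'p' && got[1] == 'i' && got[2] == 'g';
    out.dclose();
    in.dclose();
    return ok && !in.valid() && !out.valid();
}
```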

inline FdWrap& operator<<(FdWrap& w, IovPtrCut const& x) { w.wput(x); return w; } inline FdWrap& operator<<(FdWrap& w, IovPtr const& x) { w.wput(x); return w; } inline FdWrap& operator<<(FdWrap& w, Iov const& x) { w.wput(x); return w; } inline FdWrap& operator<<(FdWrap& w, const char* x) { w.wput(x); return w; } inline FdWrap& operator<<(FdWrap& w, iovec const& x) { w.wput(x); return w; }

werrno «

     You can ignore these trivial methods:

// werrno() logs errno for fun(w_fd, n), returns zero ssize_t FdWrap::werrno(const char* fun, int n) const { pig_logf(1, "%s(fd=%d) n=%d errno=%d ('%s')", fun, w_fd, n, errno, ::strerror(errno)); return 0; } // wescape() writes src to w_fd, escaping mapped octets i32 FdWrap::wescape(const char* map[], Iov const& src) const { return this->wescape(map, src.v_p, src.v_n); } i32 FdWrap::wescape(const char* map[], iovec const& src) const { return this->wescape(map, src.iov_base, src.iov_len); }

whtml «

     The top level whtml() makes an octet map of bytes to be escaped, then passes this map to wescape() to actually do the work.

// whtml() calls wescape() with a map for html escape
i32 FdWrap::whtml(FdWrap const& src) const {
  const char* map[ 256 ]; // one for each octet value
  ::memset(map, 0, 256 * sizeof(const char*)); // zero
  map['&'] = "&amp;";
  map['<'] = "&lt;";
  map['>'] = "&gt;";
  // map[''] = "";
  return this->wescape(map, src);
}

wescape «

     Below we look at every input byte and escape all those with nonzero entries in the map by writing the map entry instead. Non-escaped bytes are only written when we hit the end, or hit an escaped byte.

i32 FdWrap::wescape(const char* map[], const void* cv, u32 n) const {
  static const char* s_msg = "FdWrap::wescape(m,cv,n)";
  const u8* p = (const u8*) cv;
  i32 sum = 0;
  i32 actual = 0;
  if (p && n && map) { // nonempty?
    const u8* o0 = p; // origin: bytes not yet written
    const u8* x = p + n; // one beyond last to write
    for (/*prep preincr*/ --p; ++p < x; ) {
      const char* esc = map[*p];
      if (esc) { // need to escape this byte?
        if (p > o0) { // write earlier bytes first?
          // Iov before(o0, p - o0); // before p
          actual = ::write(w_fd, o0, p - o0);
          sum += (actual < 0)? werrno(s_msg, p - o0) : actual;
        }
        int len = ::strlen(esc);
        actual = ::write(w_fd, esc, len);
        sum += (actual < 0)? werrno(s_msg, len) : actual;
        o0 = p + 1; // next byte to write is after current p
      } // if (map[c])
    } // for
    if (p > o0) { // at least the final byte was not escaped?
      // Iov last(o0, p - o0); // trailing bytes before end
      actual = ::write(w_fd, o0, p - o0);
      sum += (actual < 0)? werrno(s_msg, p - o0) : actual;
    }
  }
  return sum; // bytes written
}

     Note lack of buffering on the outbound leg to write(), which implies excessively high frequency of system calls for small numbers of bytes. You would not want to do this in a server bottleneck. But it's of no consequence in a command line tool, and makes the code much simpler, even if it makes you wince.
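     If the system call count ever did matter, one option is to build the escaped output in memory and write it once. The sketch below is an assumption-laden variant, not the code used on this site: demo_escape_html is a hypothetical name, and it appends to a std::string instead of calling write(), but batches unescaped runs the same way.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical in-memory variant: appends to a std::string instead of
// calling write(), but batches unescaped runs the same way.
inline std::string demo_escape_html(const std::string& in) {
    const char* map[256] = { 0 };  // one slot per octet value
    map[(unsigned char) '&'] = "&amp;";
    map[(unsigned char) '<'] = "&lt;";
    map[(unsigned char) '>'] = "&gt;";

    std::string out;
    std::size_t origin = 0;  // first byte not yet copied
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char* esc = map[(unsigned char) in[i]];
        if (esc) {
            out.append(in, origin, i - origin);  // pending run first
            out.append(esc);
            origin = i + 1;  // next run starts after the escaped byte
        }
    }
    out.append(in, origin, in.size() - origin);  // trailing run
    return out;
}
```

     For example, demo_escape_html("a<b>&c") yields "a&lt;b&gt;&amp;c".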

wescape read «

     Finally, here's the version of wescape() called earlier by whtml(), which reads the input file descriptor until either eof or an error, then runs all input bytes through the filter to escape mapped bytes.

i32 FdWrap::wescape(const char* map[], FdWrap const& src) const {
  char tmp[ 4096 ];
  i32 sum = 0;
  i32 actual = 0;
  do { // until input stream is exhausted
    actual = ::read(src.w_fd, tmp, 4096);
    if (actual >= 0) {
      sum += this->wescape(map, tmp, (u32) actual);
    }
    else {
      this->werrno("FdWrap::wescape(m,w)", 4096);
      return sum;
    }
  } while (actual > 0); // read more bytes from src?
  return sum;
}

     The mere 4K buffer shown here would need to be larger for better efficiency, since a bigger buffer means fewer read() calls.

12sep09 « singletons

     Types and singleton instances shown in this section exist only for the purpose of overloading methods. For example, type mu::Place1 and singleton instance mu_place exist only to override operator new like this:

inline void* // "placement" operator new:
operator new(size_t sz, void* ptr, mu::Place1& ignore) { return ptr; }

     In some environments, placement operator new is already declared globally; but then it isn't in other environments, making it hard for one code base to always work, unless you declare your own unique instance like this.
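     A small usage sketch may help. Everything named demo or Demo below is hypothetical, and the matching operator delete is my addition (it is used only if a constructor throws), not something shown in the original.

```cpp
#include <cassert>
#include <cstddef>

namespace demo { struct Place1 { }; }  // hypothetical tag type
static demo::Place1 demo_place;        // hypothetical singleton

// Overload selected only when the tag is passed, so it cannot collide
// with a placement new the environment may already declare.
inline void* operator new(std::size_t, void* ptr, demo::Place1&) { return ptr; }
// Matching delete, used only if a constructor throws.
inline void operator delete(void*, void*, demo::Place1&) { }

struct DemoPoint {
    int x, y;
    DemoPoint(int a, int b) : x(a), y(b) { }
};

// Construct a DemoPoint inside caller-provided storage.
inline DemoPoint* demo_make_point(void* storage, int a, int b) {
    return new (storage, demo_place) DemoPoint(a, b);
}
```

     The caller owns the storage and must keep it suitably aligned for the type constructed in it.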

mu «

     Below namespace mu is used to scope sample code, and presumably all sample code here and elsewhere also appears in the mu namespace. But the name doesn't matter as long as I'm consistent. I tend to use very short namespace symbols, like mu and ae; you might see me mix namespaces at times.

namespace mu { /* ----- singletons ----- */

struct Nil1 { int m1_nil; Nil1() : m1_nil(0) { }
  operator int() const { return 0; } };
struct Endl1 { int m1_endl; Endl1() : m1_endl(0) { } };
struct Subi1 { int m1_subi; Subi1() : m1_subi(0) { } };
struct Addi1 { int m1_addi; Addi1() : m1_addi(0) { } };
struct Place1 { int m1_place; Place1() : m1_place(0) { } };
struct Now1 { int m1_now; Now1() : m1_now(0) { } };
struct Reset1 { int m1_reset; Reset1() : m1_reset(0) { } };
struct Lf1 { int m1_lf; Lf1() : m1_lf(0) { } }; // lf
struct Gloss1 { int m1_gloss; Gloss1() : m1_gloss(0) { } };
struct Sync1 { int m1_sync; Sync1() : m1_sync(0) { } };

class End; // opaque pointer type for end() iteration methods

}; // namespace mu

     The following singleton instances all appear in a global scope and use a mu_ prefix to avoid collisions with other global symbols.

// global scope, NOT in namespace mu
mu::Place1 mu_place; // singleton for use with (eg) operator new()
mu::Nil1 mu_nil; // singleton for use with (eg) operator<<()
mu::Endl1 mu_endl; // singleton for use with (eg) operator<<()
mu::Subi1 mu_subi; // singleton for use with (eg) operator<<()
mu::Addi1 mu_addi; // singleton for use with (eg) operator<<()
mu::Now1 mu_now; // singleton for use with (eg) operator<<()
mu::Reset1 mu_reset; // singleton for use with (eg) operator<<()
mu::Lf1 mu_lf; // singleton for use with (eg) operator<<()
mu::Gloss1 mu_gloss; // singleton for use with (eg) operator<<()
mu::Sync1 mu_sync; // singleton for use with (eg) operator<<()
// end global scope NOT in namespace mu

12sep09 « sinks

out streams «

     Class Sink shown below is a clone of yo in the C++ out demo, and a clone of cy_sink in a related C sink demo. (It's been cloned a dozen or twenty times since the 90's.) The api basically represents buffered i/o, using a handful of virtual methods so a subclass can fulfill the api with little work.

     The concrete sink subclass shown next writes to a file descriptor. The pig test app shown on main creates a sink writing to stdout like this near the start of main(), so buffer space on the stack can be used globally thereafter. (You can obviously put global buffer space elsewhere.)

char buf[ 4096 + 4 ];
Iov bufiov(buf, 4096);
FdSink sout(bufiov, STDOUT_FILENO);
Sink& o = sout;

     Think of o as similar to cout, except an out stream in sample code is typically passed as an argument, instead of assuming global scope.

     In the out demo, all method names begin with a leading o for out; but below all method names begin with a leading s for sink. It makes no difference. (One purpose of a leading prefix on method names is to disambiguate superclass api under multiple inheritance; for example, some classes want to specialize both out stream and in stream api.)

sink «

     The main idea of an out stream sink is buffering: the api assumes you have a buffer in a sink, somewhere, with pointers to the start, middle, and end that can be used for fast inlines when writing single bytes. But unbuffered i/o can be had just by putting zero (or any single value used consistently) in base class pointer members, because this invokes buffer exhausted virtual behavior.
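     The buffering idea can be sketched in a few lines. DemoSink and DemoStringSink below are hypothetical stand-ins that collect output in a std::string instead of a file descriptor; writing the byte straight through when no buffer exists is a choice made for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical mini sink: three pointers plus one virtual for overflow.
struct DemoSink {
    std::uint8_t* s_0;  // buffer start
    std::uint8_t* s_p;  // cursor: next free byte
    std::uint8_t* s_x;  // one past buffer end
    DemoSink(std::uint8_t* b, unsigned n) : s_0(b), s_p(b), s_x(b + n) { }
    virtual ~DemoSink() { }
    virtual void putc_fn(int c) = 0;  // called only when no room is left
    void sc(int c) {                  // fast path stays inline
        if (s_p < s_x) *s_p++ = (std::uint8_t) c;
        else putc_fn(c);
    }
};

// Collects output in a std::string; a null buffer makes it unbuffered.
struct DemoStringSink : DemoSink {
    std::string out;
    int overflows;  // how often the virtual fallback ran
    DemoStringSink(std::uint8_t* b, unsigned n) : DemoSink(b, n), overflows(0) { }
    void flush() {
        if (s_p > s_0) out.append((const char*) s_0, s_p - s_0);
        s_p = s_0;  // buffer is empty again
    }
    virtual void putc_fn(int c) {
        ++overflows;
        flush();
        if (s_p < s_x) *s_p++ = (std::uint8_t) c;  // room after flush
        else out.push_back((char) c);              // unbuffered: emit now
    }
};

// Write text one byte at a time, flush, and return what was collected.
inline std::string demo_emit(DemoStringSink& s, const char* text) {
    for (; *text; ++text) s.sc(*text);
    s.flush();
    return s.out;
}
```

     With a four byte buffer, ten bytes reach the virtual fallback only twice; with a null buffer, every byte does.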

     A secondary idea in the sink api is providing methods specific to writing many basic types, or tweaking indentation semantics, using very short method names, so printing code is not quite so onerous to write. The following single letter abbreviations are used:

  • t - tab: indent one more on next newline
  • u - untab: indent one less on next newline
  • n - newline: end line then indent to tab depth
  • f - format: format as if by printf()
  • c - char: write a single octet (not character)
  • s - string: write null terminated C string
  • x - hex: write hex dump annotated by ascii
  • i - in: reads from an in-stream source
  • z - slice: reads from a slice source
  • p - pointer/position: one or more hex positions
  • 0 - zero: describes unexpected nil values

     (Note a complete lack of interest in standard C++ stream api.)
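     To make the t, u, n, and s letters concrete, here is a sketch of how they compose. DemoPrinter is a hypothetical in-memory model, not the Sink class; it indents two blanks per tab level, the same width stabs() uses further below.

```cpp
#include <cassert>
#include <string>

// Hypothetical in-memory printer modeling only t, u, n, and s.
struct DemoPrinter {
    std::string out;
    unsigned tab;
    DemoPrinter() : tab(0) { }
    void st() { ++tab; }           // t: indent one more on next newline
    void su() { if (tab) --tab; }  // u: indent one less on next newline
    void sn() {                    // n: end line, then indent to depth
        out.push_back('\n');
        out.append(tab * 2, ' ');  // two blanks per level
    }
    void ss(const char* s) { out.append(s); }  // s: C string
    // Compound names read left to right, one letter per step.
    void stn() { st(); sn(); }
    void suns(const char* s) { su(); sn(); ss(s); }
};

// Print an open brace, an indented body line, and a closing brace.
inline std::string demo_block() {
    DemoPrinter o;
    o.ss("{");
    o.stn();       // deeper, then newline indent
    o.ss("x;");
    o.suns("}");   // shallower, newline indent, then "}"
    return o.out;
}
```

     Here demo_block() returns "{\n  x;\n}": tab depth changes take effect only at the next newline.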

class Sink { /* 'sink' output stream */ public: uint8_t* s_0; uint8_t* s_p; uint8_t* s_x; int s_err; uint16_t s_tab; uint16_t s_pad;

     This version of the api has very little state. (You can reserve buffer space for direct writes from an in stream by supporting a take/give api shown in the out demo.) Virtual methods below link to implementations in subclass FdSink shown further below.

public: // virtual methods
  virtual int write_fn(const void* b, uint32_t n) = 0;
  virtual int flush_fn() = 0;
  virtual void putc_fn(int c) = 0; /* fallback */
  virtual ~Sink();

public: // public api
  Sink() {
    s_0 = s_p = s_x = 0; /* empty buffer */
    s_err = 0; s_tab = 0;
  }
  Sink(const Iov& v) {
    u8* p = v.v_p;
    if (p && v.v_n) { /* initial buffer exists? */
      s_0 = s_p = p;
      s_x = p + v.v_n; /* one past last buf byte */
    }
    else {
      s_0 = s_p = s_x = 0; /* empty buffer */
    }
    s_err = 0; s_tab = 0;
  }

  int sflush() { /* virtual dispatch */
    return this->flush_fn();
  }

int swrite(const void* b, uint32_t n) { /* virtual dispatch */ return write_fn(b, n); }

void sv(const Iov& v) { this->write_fn(v.v_p, v.v_n); } /* writev() writes each iov to sink s using swrite() */ int /* -1 on error; otherwise sum of bytes written */ swritev(const Iov* v, int cnt);

void /* sometimes virtual dispatch if needed */ sc(int c) { /* write one byte */ /* note: virtual putc() only called when buffer is full: */ if (s_p < s_x) { /* room for a byte in buffer? */ *s_p++ = (uint8_t) c; } else { putc_fn(c); /* flush and then handle byte */ } } void s1c(int c); /* non-inline c() above*/ void s2c(int a, int b); /* c(a); c(b); */ void s3c(int a, int b, int c); /* c(a); c(b); c(c); */ void sline() { sc('\n'); } /* newline only */

void sn(); /* newline then INDENT to depth */ void stabs(unsigned count); /* count tabs */

void st() { ++s_tab; } /* increase tab depth by one */ void stt() { s_tab += 2; } /* increase tab depth by two */

void su() { if (s_tab) --s_tab; } /* untab depth by one */ void suu() { if (s_tab >= 2) s_tab -= 2; } /* untab by two */ void /* tab, then newline indent */ stn() { st(); sn(); } void /* tab twice, then newline indent */ sttn() { stt(); sn(); } void /* untab, then newline indent */ sun() { su(); sn(); } void /* untab twice, then newline indent */ suun() { suu(); sn(); }

  void ss(const char* cstr) { /* write C string */
    (void) write_fn(cstr, strlen(cstr));
  }
  void /* newline indent, then string */
  sns(const char* cstr) { sn(); ss(cstr); }
  void /* string, then newline indent */
  ssn(const char* cstr) { ss(cstr); sn(); }
  void /* newline indent, string, tab */
  snst(const char* cstr) { sn(); ss(cstr); st(); }
  void /* tab, then newline indent, then string */
  stns(const char* cstr) { st(); sn(); ss(cstr); }
  void /* string, then tab */
  sst(const char* cstr) { ss(cstr); st(); }
  void /* untab, then string */
  sus(const char* cstr) { su(); ss(cstr); }
  void /* untab, newline indent, then string */
  suns(const char* cstr) { su(); sn(); ss(cstr); }

  void /* basically sink_2c('<', '/'); sink_s(tag); sink_c('>'); */
  send(const char* tag); /* '<' '/' tag '>' */
  void /* newline indent, then '<' '/' tag '>' */
  snend(const char* tag) { sn(); send(tag); }
  void /* untab, then '<' '/' tag '>' */
  suend(const char* tag) { su(); send(tag); }
  void /* untab, newline indent, then '<' '/' tag '>' */
  sunend(const char* tag) { sun(); send(tag); }
  void /* untab twice, newline indent, then '<' '/' tag '>' */
  suunend(const char* tag) { suun(); send(tag); }

  void stc(int c) { st(); s1c(c); }
  void suc(int c) { su(); s1c(c); }
  void sct(int c) { s1c(c); st(); }
  void scu(int c) { s1c(c); su(); }
  void scn(int c) { s1c(c); sn(); }
  void sctn(int c) { s1c(c); st(); sn(); }
  void scttn(int c) { s1c(c); stt(); sn(); }

     Sink methods containing f in the name use printf() style formatting. Note subclasses override almost none of the sink api because everything writes via virtual putc_fn() and write_fn() methods.

void /* f() is basically the same as printf() */ sf(const char* fmt, ...); /* f() */ void sfn(const char* fmt, ...); /* f(); n(); */ void sft(const char* fmt, ...); /* f(); t(); */ void sftn(const char* fmt, ...); /* f(); t(); n(); */ void snf(const char* fmt, ...); /* n(); f(); */ void snft(const char* fmt, ...); /* n(); f(); t(); */ }; // class mu::Sink

inline Sink& operator<<(Sink& o, Iov const& x) { if (x.v_p && x.v_n) { o.swrite(x.v_p, x.v_n); } return o; }

     The following inline << operators use the singleton types defined earlier to invoke sink api in response to "appending" a single instance as a meta operator. For example, you can flush like this: o << mu_now. The main purpose is brevity, since putting multiple operators on one line is clear and natural.

// ----- Sink inlines -----
inline Sink& operator<<(Sink& o, mu::Endl1 const& x) {
  o.sn(); return o;
}
inline Sink& operator<<(Sink& o, mu::Now1 const& x) {
  o.sflush(); return o;
}
inline Sink& operator<<(Sink& o, mu::Addi1 const& x) {
  o.st(); return o;
}
inline Sink& operator<<(Sink& o, mu::Subi1 const& x) {
  o.su(); return o;
}
inline Sink& operator<<(Sink& o, const char* s) {
  if (s) { o.ss(s); }
  return o;
}
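     Here is a sketch of how such a chain reads in practice. The types and instances below (demo::Endl1, demo_endl, DemoOut, and so on) are hypothetical stand-ins writing to a std::string, so the effect of each tag can be checked without a real sink.

```cpp
#include <cassert>
#include <string>

namespace demo {
struct Endl1 { };  // hypothetical tag types, one per meta operation
struct Addi1 { };
struct Subi1 { };
}
static demo::Endl1 demo_endl;
static demo::Addi1 demo_addi;
static demo::Subi1 demo_subi;

struct DemoOut {  // hypothetical in-memory stream
    std::string text;
    unsigned tab;
    DemoOut() : tab(0) { }
};

inline DemoOut& operator<<(DemoOut& o, const char* s) {
    if (s) o.text.append(s);
    return o;
}
// Each tag type selects one overload; the instance value is never read.
inline DemoOut& operator<<(DemoOut& o, demo::Addi1 const&) { ++o.tab; return o; }
inline DemoOut& operator<<(DemoOut& o, demo::Subi1 const&) {
    if (o.tab) --o.tab;
    return o;
}
inline DemoOut& operator<<(DemoOut& o, demo::Endl1 const&) {
    o.text.push_back('\n');
    o.text.append(o.tab * 2, ' ');  // newline, then indent to depth
    return o;
}

inline std::string demo_chain() {
    DemoOut o;
    o << "a" << demo_addi << demo_endl << "b" << demo_subi << demo_endl << "c";
    return o.text;
}
```

     The chain in demo_chain() produces "a\n  b\nc", with overload resolution on the tag type doing all the dispatch.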

12sep09 « Sink.cpp

     Non-inline implementations of Sink methods appear below.

Sink::~Sink() { s_0 = s_p = s_x = 0; }

     By convention, sink subclasses do not automatically flush. The base class merely ensures the buffer is "empty."

/* writev() writes each iov to sink s using sink_write_do() */ int /* -1 on error; otherwise sum of bytes written */ Sink::swritev(const Iov* iov, int cnt) { int sum = 0; for (int i = 0; i < cnt; ++i) { const Iov* v = iov + i; int actual = this->write_fn(v->v_p, v->v_n); if (actual >= 0) sum += actual; else return actual; /* stop on negative */ } return sum; }

     Methods below write one, two, and three bytes; note s1c() means the same thing as sc() but is not inline, for use when code size matters more than speed. One often wants to write just a few bytes.

void Sink::s1c(int c) { /* non-inline c() */ sc(c); /* inline */ } void Sink::s2c(int a, int b) { /* c(a); c(b); */ sc(a); /* inline */ sc(b); /* inline */ } void Sink::s3c(int a, int b, int c) { /* c(a); c(b); c(c); */ sc(a); /* inline */ sc(b); /* inline */ sc(c); /* inline */ }

     Method sn() means newline and then indent to current tab depth.

void Sink::sn() { /* newline then INDENT to depth */ sc('\n'); /* newline */ stabs(s_tab); /* indent */ } void /* basically sink_2c('<', '/'); sink_s(tag); sink_c('>'); */ Sink::send(const char* tag) { /* '<' '/' tag '>' */ sc('<'); /* inline */ sc('/'); /* inline */ this->write_fn(tag, strlen(tag)); sc('>'); /* inline */ }

     Method stabs() below is how sn() indents.

static const char* mu_blanks = /* ALL BLANKS on next line with comment ruler: */
  "                                                                        ";
/* 123456789_123456789_123456789_123456789_123456789_123456789_123456789_12 */
#define mu_blanks_len 72 /*length of mu_blanks above*/

void Sink::stabs(unsigned tab) {
  unsigned sz = tab * 2;
  if (sz > 1024) /* too big? */
    sz = 1024;
  while (sz) {
    unsigned quantum = (sz > mu_blanks_len)? mu_blanks_len: sz;
    this->write_fn(mu_blanks, quantum);
    sz -= quantum;
  }
}

     Support for printf() style formatting comes in several flavors, with options to newline/indent before or after, with or without an increase in tab depth.

void /* f() is basically the same as printf() */ Sink::sf(const char* fmt, ...) { /* f() */ char temp[ 2048 + 2 ]; va_list args; va_start(args,fmt); vsnprintf(temp, 2048, fmt, args); va_end(args); temp[2048] = 0; /* ensure end nul */ this->write_fn(temp, strlen(temp)); } void Sink::sfn(const char* fmt, ...) { /* f(); n(); */ char temp[ 2048 + 2 ]; va_list args; va_start(args,fmt); vsnprintf(temp, 2048, fmt, args); va_end(args); temp[2048] = 0; /* ensure end nul */ this->write_fn(temp, strlen(temp)); sn(); /* newline indent */ } void Sink::sft(const char* fmt, ...) { /* f(); t(); */ char temp[ 2048 + 2 ]; va_list args; va_start(args,fmt); vsnprintf(temp, 2048, fmt, args); va_end(args); temp[2048] = 0; /* ensure end nul */ this->write_fn(temp, strlen(temp)); st(); /* tab */ } void Sink::sftn(const char* fmt, ...) { /* f(); t(); n(); */ char temp[ 2048 + 2 ]; va_list args; va_start(args,fmt); vsnprintf(temp, 2048, fmt, args); va_end(args); temp[2048] = 0; /* ensure end nul */ this->write_fn(temp, strlen(temp)); st(); /* tab */ sn(); /* newline indent */ } void Sink::snf(const char* fmt, ...) { /* n(); f(); */ char temp[ 2048 + 2 ]; va_list args; va_start(args,fmt); vsnprintf(temp, 2048, fmt, args); va_end(args); temp[2048] = 0; /* ensure end nul */ sn(); /* newline indent */ this->write_fn(temp, strlen(temp)); } void Sink::snft(const char* fmt, ...) { /* n(); f(); t(); */ char temp[ 2048 + 2 ]; va_list args; va_start(args,fmt); vsnprintf(temp, 2048, fmt, args); va_end(args); temp[2048] = 0; /* ensure end nul */ sn(); this->write_fn(temp, strlen(temp)); st(); /* tab */ }

     Note all the buffers are only 2K in size. Why so small? Why not 4K or 8K? Because typically these methods are for writing small amounts of text, on the order of "around a line" or so. Of course, in your version you can make stack-based buffers as big as you like.

     Some implementations of vsnprintf() return the string length (C99 specifies the length the untruncated output would have had), so calling strlen() is unnecessary; but this code always works, and we're not very concerned about speed when printing, especially when you use printf() style formats. Feel free to use size returned by vsnprintf() in your rev.
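     A sketch of that variation, assuming a C99-conforming vsnprintf(): the return value is clamped because it reports the untruncated length. The name demo_appendf and the tiny 64 byte buffer are hypothetical, chosen to make truncation easy to see.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats into a fixed stack buffer, then appends
// to out using the length reported by vsnprintf() instead of strlen().
inline std::size_t demo_appendf(std::string& out, const char* fmt, ...) {
    char temp[64];
    va_list args;
    va_start(args, fmt);
    int n = std::vsnprintf(temp, sizeof(temp), fmt, args);
    va_end(args);
    if (n < 0) return 0;  // encoding or output error
    std::size_t len = (std::size_t) n;
    if (len >= sizeof(temp)) len = sizeof(temp) - 1;  // output was truncated
    out.append(temp, len);
    return len;  // bytes actually appended
}
```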

12sep09 « FdSink

basic sink subclass «

     The following sink subclass writes to a file descriptor. Note member variables are public, but altering these fields is not often a good idea.

#define FdSink_TAG 0x6664736b /*'fdsk'*/

class FdSink : public Sink {
public: // private by convention
  Iov f_iov; /* entire local buffer */
  uint32_t f_tag; /* must be TAG */
  int f_fd; /* file descriptor (where to write) */
  uint32_t f_out; /* bytes written to descriptor */

public:
  FdSink() : Sink(), f_iov(0,0), f_tag(FdSink_TAG)
  , f_fd(-1), f_out(0) { }
  FdSink(Iov const& v, int fd) : Sink(v), f_iov(v), f_tag(FdSink_TAG)
  , f_fd(fd), f_out(0) { }

public: // virtual methods
  virtual int write_fn(const void* b, uint32_t n);
  virtual int flush_fn();
  virtual void putc_fn(int c); /* fallback */
  virtual ~FdSink();
};

     Brevity of this api suggests the code must be simple—which it is, if you discount complexity around staging in the local buffer.

12sep09 « FdSink.cpp

     Non-inline implementations of FdSink methods appear below.

FdSink::~FdSink() { assert(FdSink_TAG == f_tag); f_tag = 0xcafedead; }

     The following static method handles actually writing to the file descriptor, while keeping statistics and responding to errors. A mu_printf() logging method used casually here presumably writes to stdout and then flushes.

static int /* internal/private write to descriptor */
_fdsink_write(FdSink* s, const void* buf, unsigned len) {
  const uint8_t* src = (const uint8_t*) buf;
  int tries = 0;
  int err = 0;
  unsigned outSize = 0;
  while (len) {
    if (++tries > 20) {
      mu_printf("fo_write() tries=%d errno=%d", tries, err);
      s->s_err = err;
      return outSize;
    }
    int did = ::write(s->f_fd, src, len);
    if (did < 0) {
      err = errno;
      if (EAGAIN == err || EINTR == err)
        continue; /* try again */
      mu_printf("write(m_fd=%d, len=%d)=%d errno=%d",
        (int) s->f_fd, (int) len, did, err);
      s->s_err = err;
      return -1;
    }
    if (did > (int) len) {
      mu_printf("fo_write(len=%d) actual=%d??", (int) len, did);
      did = len; /* do not subtract more than len */
    }
    src += did; len -= did;
    outSize += did; s->f_out += did;
  }
  return outSize;
}
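     Stripped of statistics and logging, the retry loop above reduces to something like the following sketch. The names demo_write_all and demo_write_all_roundtrip are hypothetical, and unlike the code above this version has no cap on tries and retries only on EINTR.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>  // pipe, read, write, close

// Hypothetical helper: keeps calling write() until len bytes are out,
// retrying only when the call was interrupted by a signal.
inline long demo_write_all(int fd, const void* buf, std::size_t len) {
    const char* src = (const char*) buf;
    std::size_t done = 0;
    while (done < len) {
        ssize_t did = ::write(fd, src + done, len - done);
        if (did < 0) {
            if (errno == EINTR) continue;  // interrupted: try again
            return -1;                     // real error: give up
        }
        if (did == 0) break;        // no progress: stop rather than spin
        done += (std::size_t) did;  // partial write: advance and loop
    }
    return (long) done;
}

// Push five bytes through a pipe and read them back.
inline bool demo_write_all_roundtrip() {
    int fds[2];
    if (::pipe(fds) != 0) return false;
    char got[5] = { 0 };
    bool ok = demo_write_all(fds[1], "hello", 5) == 5
        && ::read(fds[0], got, 5) == 5 && got[0] == 'h' && got[4] == 'o';
    ::close(fds[0]);
    ::close(fds[1]);
    return ok;
}
```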

     Central public writing method write_fn() below manages buffer space and defers actual writes to _fdsink_write() above.

int /* -1 on error; otherwise sum of bytes written */
FdSink::write_fn(const void* buf, uint32_t sz) {
  unsigned outSize = 0;
  int actual = 0;
  assert(FdSink_TAG == f_tag);
  const uint8_t* src = (const uint8_t*) buf;
  if (!sz) { return 0; } /* write nothing? */
  if (!src) { s_err = EINVAL; return -1; }
  if (mu_unlikely(s_p < s_0 || s_p > s_x)) {
    s_err = EINVAL;
    assert(s_p >= s_0 && s_p <= s_x); /* die */
    return -1; /* reached only when assert is compiled out */
  }
  unsigned room = s_x - s_p; /* space left in buf */
  if (s_p == s_0) { /* empty? NOTHING in local buf? */
    if (sz > f_iov.v_n) { /* cannot fit in buf? */
      actual = _fdsink_write(this, src, sz); /* direct */
      if (actual < 0)
        return -1;
      outSize += (unsigned) actual;
    }
    else { /* sz <= v_n implies it fits in buffer */
      memcpy(s_p, src, sz); /* safely put in buffer */
      s_p += sz; outSize += sz; /* advance */
      assert(s_p <= s_x); /* not beyond buf end */
    }
  }
  else { /* SOME already buffered bytes are present */
    if (sz > room) { /* more than fits in space left? */
      unsigned more = sz - room; /* excess bytes */
      if (room) { /* is there any more space left at all? */
        memcpy(s_p, src, room); /* fill, then flush */
        s_p += room; outSize += room; src += room;
      }
      actual = _fdsink_write(this, s_0, s_p - s_0);
      if (actual < 0)
        return -1;
      s_p = s_0; /* buffer is empty again */
      if (sz > f_iov.v_n) { /* cannot fit in buf? */
        actual = _fdsink_write(this, src, more); /* direct */
        if (actual < 0)
          return -1;
        outSize += (unsigned) actual;
      }
      else { /* buffer is flushed: remainder now fits */
        memcpy(s_p, src, more);
        s_p += more; outSize += more; src += more;
        assert(s_p <= s_x); /* not beyond buf end */
      }
    }
    else { /* room >= sz: input bytes fit in space left: */
      memcpy(s_p, src, sz);
      s_p += sz; outSize += sz;
    }
  }
  return outSize;
}

     Flush method flush_fn() checks whether cursor s_p is no longer the start of the buffer at s_0. When those two pointers differ, the buffer contains s_p-s_0 bytes that need to be written. Once the buffer has been flushed, setting cursor s_p back to the origin s_0 shows all the buffer is available again.

int FdSink::flush_fn() {
  assert(FdSink_TAG == f_tag);
  if (mu_unlikely(s_p < s_0)) {
    assert(s_p >= s_0); /* die */
    s_err = EINVAL;
    return EINVAL;
  }
  unsigned size = s_p - s_0;
  if (size) { /* anything in buffer? */
    if (f_fd >= 0) { /* descriptor is valid */
      s_p = s_0; /* empty again: origin */
      int actual = _fdsink_write(this, s_0, size);
      if (actual < 0) {
        return actual;
      }
    }
  }
  return 0;
}

     The flush method returns zero when no error occurs. So putc_fn() checks for a successful flush before attempting an inline write of the byte that did not fit in the buffer before. As long as the buffer has any space at all, indicated by s_x>s_p, space will be present to hold the input byte.

void FdSink::putc_fn(int c) {
  if (0 == this->flush_fn()) {
    if (s_p < s_x) {
      *s_p++ = (uint8_t) c;
    }
  }
}