This page is one of the thorn demos. Here we demo fd.

problem

     To illustrate simple and primitive file i/o using system calls, Wil wants a wrapper for int when it is used as a file descriptor in system calls. At minimum, the api for yfdw needs support for the w_fd descriptor state:

struct yfdw { // file descriptor wrapper
  int w_fd; // file descriptor
  yfdw() : w_fd(-1) { } // invalid
  explicit yfdw(int fd) : w_fd(fd) { } // as you say
  operator bool() const { return w_fd >= 0; } // valid
  int wfd() const { return w_fd; } // getter
}; // yfdw

     But in addition to this minimum, Wil adds many more methods in order to reveal intended system call use, and to support i/o with other þ types. For example, since this demo was actually motivated by the iovec demo, Wil adds direct support for ydvp for use in the iovec demo.

     Some of the extra api supports operator<<() using yfdw as the lhs (left hand side) of the binary operator. The following inlines, declared after the class, simply call yfdw::wput() in each case to write the rhs.

inline yfdw& operator<<(yfdw& w, ydvpz const& x) {
  w.wput(x); return w;
}
inline yfdw& operator<<(yfdw& w, ydvp const& x) {
  w.wput(x); return w;
}
inline yfdw& operator<<(yfdw& w, yv const& x) {
  w.wput(x); return w;
}
inline yfdw& operator<<(yfdw& w, iovec const& x) {
  w.wput(x); return w;
}
inline yfdw& operator<<(yfdw& w, const char* x) {
  w.wput(x); return w;
}
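
     To see why each operator returns its lhs, here is a minimal sketch of chaining. It does not use the real yfdw; fd_sink, put() and chain_demo() are hypothetical stand-ins invented for this illustration, and only the C string overload is modeled. The text goes through a pipe so the result can be read back.

```cpp
#include <unistd.h>
#include <cstring>
#include <string>

// Simplified stand-in for yfdw (hypothetical; C string overload only).
struct fd_sink {
  int fd;
  explicit fd_sink(int d) : fd(d) { }
  ssize_t put(const char* s) const {
    return s ? ::write(fd, s, std::strlen(s)) : 0;
  }
};

// Returning the lhs by reference is what lets several << calls chain.
inline fd_sink& operator<<(fd_sink& w, const char* s) { w.put(s); return w; }

// Writes three pieces into a pipe and returns what arrives at the far end.
std::string chain_demo() {
  int p[2];
  if (::pipe(p) != 0) return std::string();
  fd_sink out(p[1]);
  out << "fd=" << "demo" << "\n"; // three put() calls, left to right
  ::close(p[1]);
  char buf[64];
  ssize_t n = ::read(p[0], buf, sizeof buf);
  ::close(p[0]);
  return (n > 0) ? std::string(buf, (size_t) n) : std::string();
}
```

     Because the lhs is a non-const reference, the sink must be a named object; a temporary cannot start the chain.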

     The implementations of the last three wput() variants are trivial, as shown below after wprintf(), using the write() system call.

int yfdw::wprintf(const char* fmt, ...) const {
  char buf[4096 + 4]; // temp buffer
  va_list args;
  va_start(args,fmt);
  vsnprintf(buf, 4096, fmt, args); // room at end for nul
  va_end(args);
  buf[4096] = 0; // whether or not vsnprintf() wrote a nul
  return ::write(w_fd, buf, ::strlen(buf));
}

ssize_t yfdw::wput(const char* src) const {
  return (src)? ::write(w_fd, src, ::strlen(src)) : 0;
}
ssize_t yfdw::wput(yv const& src) const {
  return ::write(w_fd, src.v_p, src.v_n);
}
ssize_t yfdw::wput(iovec const& src) const {
  return ::write(w_fd, src.iov_base, src.iov_len);
}

     (The first two wput() methods for ydvp and ydvpz are part of the iovec demo, and only the first is described here, in the iovec section below. The ydvpz variant appears with other iovec content because it's very ydvp specific.)

system calls

     This section shows other methods Wil added to the yfdw api for typical system call usage. Gratuitous enums for well known file descriptors and for lseek() constants are given as well — these aim to clarify the api and reduce developer study elsewhere using system references.

     Comments are interspersed with api additions:

struct yfdw { // file descriptor wrapper
  int w_fd; // file descriptor
  // ...
  enum We_fd { // enums for standard fd's
    we_stdin = STDIN_FILENO, // 0 // unistd.h
    we_stdout = STDOUT_FILENO, // 1 // unistd.h
    we_stderr = STDERR_FILENO // 2 // unistd.h
  };

     These wrapper fd enums are for constructor use, especially in demos using standard i/o and not files.

  enum We_limit { // system related limits for api
    // class local enum for UIO_MAXIOV (4? 8? 16?)
    we_maxiov = 4 /*most iovecs at once (8? 16?)*/
  };
  ssize_t wwrite(const void *buf, size_t nbytes) {
    return ::write(w_fd, buf, nbytes); } // man 2 write
  ssize_t wwritev(const struct iovec *iov, int n) {
    return ::writev(w_fd, iov, n); } // man 2 writev
  ssize_t wpwrite(const void *buf, size_t sz, off_t p) {
    return ::pwrite(w_fd, buf, sz, p); }

     Writing occurs more than reading in demos, for debug printing purposes. These are direct representations of typical system calls used to write files via file descriptors: write(), pwrite(), and writev(). You should read the man pages for these. This api adds no extra spec, but a we_maxiov constant helps remind you a limit exists for the max number of iovecs writev() can take at one time. The actual UIO_MAXIOV system limit is hard to find on some systems, and might be as small as eight or sixteen. In any case, wput(ydvp&) loops and does iovecs N at a time for N=we_maxiov, but you're on your own with direct calls to writev().
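
     If you would rather ask the system than guess, one possible approach is sketched below. It assumes a POSIX system that reports the limit through sysconf(_SC_IOV_MAX) or the IOV_MAX macro; max_iovecs() is a hypothetical helper name, not part of yfdw. POSIX sets the minimum acceptable value at 16, which is used here as a fallback.

```cpp
#include <unistd.h>
#include <climits>

// Returns the most iovecs writev() should accept in one call here.
// Falls back to 16 (the POSIX minimum) when the system will not say.
long max_iovecs() {
  long n = ::sysconf(_SC_IOV_MAX); // -1 when the limit is indeterminate
#ifdef IOV_MAX
  if (n < 0) n = IOV_MAX; // compile time value from <limits.h>, if any
#endif
  return (n >= 16) ? n : 16;
}
```

     A small fixed constant like we_maxiov stays simpler for demo code; a runtime query only matters when large batches affect performance.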

  ssize_t wput(ydvp const& src) const;
  ssize_t wput(ydvpz const& src) const;
  ssize_t wput(const struct iovec *iov, int n) const;
  ssize_t wput(yv const& src) const;
  ssize_t wput(iovec const& src) const;
  ssize_t wput(const char* cstr) const; // C string

     Each wput() calls either write() or writev() as appropriate for the source. The first wput(ydvp&) just writes an iovec vector described in the iovec demo — the second wput(ydvpz&) is exotic because it first generates a new iovec vector corresponding to the offset and length slice of an existing iovec vector.
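
     The slicing idea can be sketched without ydvpz at all. The helper below is hypothetical (slice_iovecs() is a name invented here, and std::vector stands in for whatever storage the real code uses); it only illustrates how an offset and length select a sub-range of an existing iovec array without copying any bytes.

```cpp
#include <sys/uio.h>
#include <cstddef>
#include <vector>

// Hypothetical helper: append to out the iovecs covering bytes
// [off, off+len) of src[0..n). Returns the byte count actually covered,
// which is less than len when the slice runs past the end of src.
size_t slice_iovecs(const struct iovec* src, int n, size_t off, size_t len,
                    std::vector<struct iovec>& out) {
  size_t covered = 0;
  for (int i = 0; i < n && len > 0; ++i) {
    size_t have = src[i].iov_len;
    if (off >= have) { off -= have; continue; } // slice starts later
    struct iovec v;
    v.iov_base = (char*) src[i].iov_base + off; // skip leading bytes
    v.iov_len = have - off;
    if (v.iov_len > len) v.iov_len = len; // trim trailing bytes
    off = 0; // later iovecs are taken from their start
    len -= v.iov_len;
    covered += v.iov_len;
    out.push_back(v);
  }
  return covered;
}
```

     The resulting vector can be handed to writev() like any other iovec array, subject to the same per-call limit.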

     Note the absence of pwrite() semantics taking a position; the file's current position is the target. In the slice demo later, we'll add support for slicing yfdw as well as other types, and you'll be able to assign new rhs values for the lhs slice describing a destination range in a file.
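
     The difference between write() and pwrite() is easy to check. This sketch assumes a writable /tmp directory and uses mkstemp() for a scratch file; position_after_pwrite() is a hypothetical name for illustration only.

```cpp
#include <unistd.h>
#include <stdlib.h>

// Writes 5 bytes with write(), then 1 byte at offset 0 with pwrite(),
// and returns the current file position afterwards (or -1 on failure).
off_t position_after_pwrite() {
  char path[] = "/tmp/fd_demo_XXXXXX"; // assumed scratch location
  int fd = ::mkstemp(path);
  if (fd < 0) return -1;
  ::unlink(path); // file disappears once fd is closed
  off_t pos = -1;
  if (::write(fd, "hello", 5) == 5 && ::pwrite(fd, "J", 1, 0) == 1)
    pos = ::lseek(fd, 0, SEEK_CUR); // pwrite() left the position alone
  ::close(fd);
  return pos;
}
```

     The position reported is the one left by write(), since pwrite() takes its target from the explicit offset and does not move the file's current position.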

  ssize_t wread(void *buf, size_t nbytes) {
    return ::read(w_fd, buf, nbytes); } // man 2 read
  ssize_t wreadv(const struct iovec *iov, int n) {
    return ::readv(w_fd, iov, n); } // man 2 readv
  ssize_t wpread(void *buf, size_t sz, off_t p) {
    return ::pread(w_fd, buf, sz, p); } // man 2 pread

     Read methods closely mirror the write methods.

  int wopen(const char *path, int flags, mode_t mode) {
    w_fd = ::open(path, flags, mode); return w_fd; }
  int wcreat(const char *path, mode_t mode) {
    w_fd = ::creat(path, mode); return w_fd; }
  int wclose() { int d = w_fd; w_fd = -1; return ::close(d); }
  int wfcntl(/*int fd,*/ int cmd, int arg) {
    return ::fcntl(w_fd, cmd, arg); } // man 2 fcntl

     Perhaps you'll want to call other methods taking file descriptors; we should show a good sample of them here.

  enum We_whence { // wlseek() vals for whence
    we_set=SEEK_SET, we_cur=SEEK_CUR, we_end=SEEK_END
  };
  off_t wlseek(/*int fd,*/ off_t offset, int whence) {
    return ::lseek(w_fd, offset, whence); }

     Seek changes current file position. Note this is not a good idea for streams without useful seek behavior. (You should not seek sockets, nor stdin, stdout & stderr.)
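
     When it is unclear whether a descriptor can seek, a harmless probe is possible. The sketch below relies on lseek() failing with ESPIPE for pipes, FIFOs and sockets; fd_seekable() is a hypothetical helper, not part of the yfdw api.

```cpp
#include <unistd.h>
#include <cerrno>

// True when fd supports lseek(). Asking for the current position moves
// nothing; on a pipe, FIFO or socket the call fails and sets ESPIPE.
bool fd_seekable(int fd) {
  return ::lseek(fd, 0, SEEK_CUR) != (off_t) -1;
}
```

     Note that stdin or stdout redirected to a regular file will report as seekable, so the probe describes the descriptor, not the intent behind it.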

  // wescape() writes src to w_fd, escaping mapped octets
  i32 wescape(const char* map[], const void* p, n32) const;
  i32 wescape(const char* map[], yv const& src) const;
  i32 wescape(const char* map[], iovec const& src) const;
  i32 wescape(const char* map[], yfdw const& src) const;
  // whtml() calls wescape() with a map for html escape
  i32 whtml(yfdw const& src) const;

     This last set of escape methods are not system calls — instead they anticipate part of the escape demo that will appear later. All of these read from the rhs source and write to this yfdw as the lhs sink, but using map[] to test each octet to see if it should be replaced with a substitute string appearing at map[c] for some octet c.

     A whtml() method shown below uses wescape() to perform this mapping of &, <, and >:

  • & becomes &amp;
  • < becomes &lt;
  • > becomes &gt;

     The implementation of whtml() is short and looks like this (the code for wescape() appears later):

i32 yfdw::whtml(yfdw const& src) const {
  const char* map[ 256 ]; // one for each octet value
  ::memset(map, 0, 256 * sizeof(const char*)); // zero
  map['&'] = "&amp;";
  map['<'] = "&lt;";
  map['>'] = "&gt;";
  return this->wescape(map, src);
}

     The next section below shows the code for wescape(). First let's wrap up the end of the class api additions:

  // werrno() logs errno for fun(w_fd, n), returns zero
  ssize_t werrno(const char* fun, int n) const;
}; // class yfdw

     Use of werrno() to report errors is internal.

escape

     As a sample app showing the use of whtml() (listed a ways above) the following short main() was used to create a simple filter escaping the most problematic C++ source characters in html markup. All yfdw source today passed through this tool (self referential demo).

int main (int argc, char * const* argv) {
  yfdw wout(yfdw::we_stdout);
  yfdw win(yfdw::we_stdin);
  i32 actual = wout.whtml(win);
  fprintf(stderr, "escaped: html=%d\n", actual);
  return 0;
}

     The call to whtml() invokes this wescape() variant, which basically just loops over the input 4K at a time:

i32 yfdw::wescape(const char* map[], yfdw const& src) const {
  static const char* s_msg = "yfdw::wescape(m,w)";
  char tmp[ 4096 ];
  i32 sum = 0;
  i32 actual = 0;
  do { // until input stream is exhausted
    actual = ::read(src.w_fd, tmp, 4096);
    if (actual >= 0) {
      sum += this->wescape(map, tmp, (n32) actual);
    }
    else {
      this->werrno(s_msg, 4096);
      return sum;
    }
  } while (actual > 0); // read more bytes from src?
  return sum;
}

     Finally, the core of wescape() is just the following, which writes each non-escaped byte once and only once — in order — using a write() system call, delaying as long as possible until provoked by hitting either end of input or a mapped octet.

i32 yfdw::wescape(const char* map[],
  const void* cv, n32 n) const {
  static const char* s_msg = "yfdw::wescape(m,cv,n)";
  const u8* p = (const u8*) cv; // octet cursor
  i32 sum = 0;
  i32 actual = 0;
  if (p && n && map) { // nonempty?
    const u8* o0 = p; // origin: bytes not yet written
    const u8* x = p + n; // one beyond last to write
    for ( ; p < x; ++p) { // p never steps before the buffer start
      const char* esc = map[*p];
      if (esc) { // need to escape this byte?
        if (p > o0) { // write earlier bytes first?
          actual = ::write(w_fd, o0, p - o0); // before p
          sum += (actual < 0)? werrno(s_msg, p - o0) : actual;
        }
        int len = ::strlen(esc);
        actual = ::write(w_fd, esc, len);
        sum += (actual < 0)? werrno(s_msg, len) : actual;
        o0 = p + 1; // next byte to write is after current p
      } // if (map[*p])
    } // for
    if (p > o0) { // at least the final byte was not escaped?
      actual = ::write(w_fd, o0, p - o0); // trailing bytes
      sum += (actual < 0)? werrno(s_msg, p - o0) : actual;
    }
  }
  return sum; // bytes written
}

     Escaped output is not buffered here, causing more write() calls than necessary (easily corrected when better performance is needed using a buffered version of out streams inheriting from yo). But speed isn't the point of this demo — just primitive i/o and simple algorithms massaging output data. Wil even calls strlen() each time the same escape value in map[] is used, but that would vanish in the cost of a file i/o system call anyway.
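
     As a rough illustration of the buffered alternative, the sketch below escapes into a std::string first and then issues one write() for the whole result. It is not the yo based out stream the text refers to; escape_into(), html_escaped() and write_html() are hypothetical names, and holding everything in memory is an assumption that suits small inputs only.

```cpp
#include <unistd.h>
#include <cstring>
#include <string>

// Appends the escaped form of p[0..n) to out. map[c] is the substitute
// string for octet c, or null when c passes through unchanged.
void escape_into(std::string& out, const char* const map[256],
                 const void* p, size_t n) {
  const unsigned char* s = (const unsigned char*) p;
  for (size_t i = 0; i < n; ++i) {
    const char* esc = map[s[i]];
    if (esc) out.append(esc);
    else out.push_back((char) s[i]);
  }
}

// Escapes the three problem characters for html, entirely in memory.
std::string html_escaped(const std::string& text) {
  const char* map[256];
  std::memset(map, 0, sizeof map);
  map[(unsigned char) '&'] = "&amp;";
  map[(unsigned char) '<'] = "&lt;";
  map[(unsigned char) '>'] = "&gt;";
  std::string out;
  escape_into(out, map, text.data(), text.size());
  return out;
}

// One write() call for the whole escaped buffer (no retry on partial write).
ssize_t write_html(int fd, const std::string& text) {
  std::string out = html_escaped(text);
  return ::write(fd, out.data(), out.size());
}
```

     This trades memory for fewer system calls; the unbuffered version above needs no allocation at all.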

     More of this sort of escaping appears in the escape demo.

iovec

     "This section is properly part of the iovec demo," Wil introduced, "but I decided to show it here because almost none of it is specific to the ydvp iterator over iovec vectors."

     "Then what does it involve?" Stu asked.

     "Mostly struct iovec and writev()," Wil replied. "So these versions of wput() belong here with other code using basic system types and calls. The log calls one can easily ignore."

ssize_t yfdw::wput(ydvp const& src) const {
  return this->wput(src.pvec(), src.psize());
}

     Writing ydvp using the wput() method simply calls another variant shown next, taking iovec vector and length:

ssize_t yfdw::wput(const struct iovec *iov, int n) const {
  if (n <= we_maxiov)
    return ::writev(w_fd, iov, n);
  ssize_t sum = 0;
  const struct iovec* v = iov;
  while (n >= we_maxiov) { // write we_maxiov at a time?
    ssize_t here = ::writev(w_fd, v, we_maxiov); // writev() returns ssize_t
    if (here >= 0)
      sum += here;
    else { // log write failure
      ylog(1, "::writev(w_fd=%d, n=%d) errno=%d",
        (int) w_fd, (int) we_maxiov, (int) errno);
    }
    v += we_maxiov;
    n -= we_maxiov;
  }
  if (n) {
    ssize_t last = ::writev(w_fd, v, n);
    if (last >= 0)
      sum += last;
    else // log write failure
      ylog(1, "::writev(w_fd=%d, n=%d) errno=%d",
        (int) w_fd, (int) n, (int) errno);
  }
  return sum;
}

     "Looks like you're looping over the input iovecs a few at a time," Stu remarked. "Wasn't we_maxiov an enum defined in the class? With a value of four — why four by the way?"

     "Yes," Wil confirmed, "this wput() calls writev() with just a few iovecs at once; I chose a small number to make sure it was less than the system's definition of UIO_MAXIOV. Since this is just sample code whose speed doesn't matter, clarity was more important than writing more iovecs at once."

     "Error handling looks a little sloppy here," Stu observed.

     "Yes, I only gave it minimal attention," Wil said. "Normally I'd carefully check errno on errors and repeat my writes on EINTR for non-blocking file i/o. But that would involve more code and more detail. I'm aiming for less detail."
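
     The more careful handling Wil alludes to might look roughly like the sketch below, shown for plain write() to keep it short. write_all() is a hypothetical helper, not part of yfdw; it retries when a signal interrupts the call and continues after a partial write, and it leaves every other error to the caller.

```cpp
#include <unistd.h>
#include <cerrno>

// Keeps writing until all n bytes are out or a real error occurs.
// Returns the bytes written, or -1 with errno set on failure.
ssize_t write_all(int fd, const void* buf, size_t n) {
  const char* p = (const char*) buf;
  size_t left = n;
  while (left > 0) {
    ssize_t did = ::write(fd, p, left);
    if (did < 0) {
      if (errno == EINTR) continue; // interrupted by a signal: try again
      return -1;                    // caller inspects errno
    }
    if (did == 0) break;            // no progress; avoid spinning forever
    p += did;                       // partial write: advance and go on
    left -= (size_t) did;
  }
  return (ssize_t) (n - left);
}
```

     Doing the same for writev() takes more bookkeeping, because a partial write can end in the middle of an iovec.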

     "Why loop at all then?" asked Stu.

     "Because I expect sample code in demos to construct long iovec arrays," explained Wil. "Much longer than normal in real systems. But I still want the right output in demos."

guard

     "Um," Stu hesitantly interrupted. "Why would I use yfdw again? Why not just use file system calls directly?"

     "Whatever you like," replied Wil. "I only whipped up this file descriptor wrapper for demos, as a way to show how iovec content can be written without requiring an out stream."

     "So it's just a cheap disposable class?" Stu tested.

     "Sure," Wil agreed. "I might or might not use it in real code. It's just useful as a low complexity foil in demos."

     "Have any other file descriptor wrappers?" Stu asked.

     "Yes, check out this guard wrapper," Wil pointed.

class yfdg { // file descriptor guard
protected:
  int g_fd; // descriptor
  int g_status; // errno on err; zero otherwise
public:
  yfdg() : g_fd(-1), g_status(0) { }
  yfdg(int fd) : g_fd(fd), g_status(0) { }
  int gfd() const { return g_fd; }
  int gstatus() const { return g_status; } // close status
  bool ggood() const { return g_fd >= 0; } // not invalid
  int gclose(); // close the descriptor now (idempotent)
  void gforget() { g_fd = -1; g_status = 0; }
  int gsteal() { int fd = g_fd; g_fd = -1; return fd; }
  void gswap(yfdg& g) { int t = g_fd; g_fd = g.g_fd; g.g_fd = t; }
  ~yfdg() { gclose(); } // gclose() only acts first time
}; // class yfdg

     "What's it for?" Stu prompted. "One of those RAII things? resource acquisition is initialization?"

     "Yes, exactly," Wil agreed. "The main reason it exists is to provide the following gclose() method called from the destructor."

int yfdg::gclose() { // close descriptor now (idempotent)
  if (g_fd >= 0) {
    if (::close(g_fd) < 0) {
      g_status = errno;
      ylog(1, "::close(g_fd=%d) errno=%d ('%s')",
        (int) g_fd, (int) g_status, ::strerror(errno));
    }
    g_fd = -1;
  }
  return g_status;
}

     "What does idempotent mean again?" Stu puzzled.

     "That means doing it again has no new effect," Wil replied. "You can call gclose() more than one time, and you won't be punished for losing track of whether it needs doing."

     "And the whole purpose of the class is to close the file descriptor in the destructor?" Stu asked. "Isn't that overkill?"

     "Being careful seems like overkill?" Wil asked. "You can use it to ensure file descriptors are not leaked in contexts where you open a file (or something else) and must see it's closed at end of scope, no matter what. For example, if an exception is thrown, the destructor will be called in passing — basic RAII."
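
     The exception case can be demonstrated with a stripped down guard. Everything in this sketch is a hypothetical stand-in for yfdg (fd_guard, fd_is_open(), open_then_throw() and closed_on_throw() are names invented here), and copying is simply not addressed. A pipe supplies descriptors so nothing touches the file system.

```cpp
#include <unistd.h>
#include <fcntl.h>
#include <stdexcept>

// Minimal stand-in for yfdg: closes its descriptor at end of scope.
class fd_guard {
  int fd_;
public:
  explicit fd_guard(int fd) : fd_(fd) { }
  ~fd_guard() { if (fd_ >= 0) ::close(fd_); }
  int fd() const { return fd_; }
};

// True while fd names an open descriptor in this process.
bool fd_is_open(int fd) { return ::fcntl(fd, F_GETFD) != -1; }

// Opens a pipe, guards both ends, then throws. The read end's number is
// reported through seen so the caller can check what happened to it.
void open_then_throw(int& seen) {
  int p[2];
  if (::pipe(p) != 0) throw std::runtime_error("pipe failed");
  fd_guard rd(p[0]), wr(p[1]);
  seen = rd.fd();
  throw std::runtime_error("something went wrong after open");
}

// Returns true when stack unwinding closed the guarded descriptor.
bool closed_on_throw() {
  int seen = -1;
  try { open_then_throw(seen); }
  catch (std::runtime_error const&) { }
  return seen >= 0 && !fd_is_open(seen);
}
```

     No cleanup code appears in the catch block; both destructors run while the exception passes through open_then_throw().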

     "I see gsteal() and gswap() let me change my mind on the fly," Stu observed. "Do you use it for anything else?"

     "Yeah," nodded Wil. "I use yfdg as an input arg type to methods taking a file descriptor, to tell the receiver it now owns the file descriptor and must close it when it's done — or hand it off to someone else who promises to close it eventually."

     "A receiver uses gswap() to take possession?" Stu guessed.

     "That would work," Wil agreed. "I've done that to ensure someone stays responsible for closing a descriptor, whether or not an error occurs somewhere along the way."
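
     A handoff by swap might go something like this sketch. fd_owner, fd_open() and consume() are hypothetical names for illustration, with fd_owner playing the part of yfdg and its swap() the part of gswap().

```cpp
#include <unistd.h>
#include <fcntl.h>

struct fd_owner { // hypothetical stand-in for yfdg
  int fd;
  explicit fd_owner(int d = -1) : fd(d) { }
  ~fd_owner() { if (fd >= 0) ::close(fd); }
  void swap(fd_owner& o) { int t = fd; fd = o.fd; o.fd = t; }
};

// True while fd names an open descriptor in this process.
bool fd_open(int fd) { return ::fcntl(fd, F_GETFD) != -1; }

// Receiver: takes possession by swapping the caller's descriptor into a
// local guard, which closes it when this function returns.
int consume(fd_owner& given) {
  fd_owner mine;    // starts empty (-1)
  mine.swap(given); // given is now empty; mine is responsible
  return mine.fd;   // report which descriptor was taken
}
```

     After consume() returns, the caller's guard holds -1, so its destructor has nothing left to close and the descriptor is closed exactly once.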

     "What would happen if I assigned one yfdg instance to another?" Stu asked. "Does that leak a clobbered descriptor?"

     "Dang, you're right," Wil pointed at Stu. "Good call; let's fix that by making copy constructor and assignment operator private like this (put on one line to creep out Dex)."

class yfdg { // ...
private: // no copying allowed to prevent descriptor leaks:
  yfdg(yfdg const&); yfdg& operator=(yfdg const&); // no copy
}; // class yfdg
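
     With a C++11 compiler (an assumption; the code on this page predates it), the same intent can be stated with deleted functions, and a move constructor can make handing off ownership explicit. fd_guard11 is a hypothetical class for illustration, not a replacement for yfdg.

```cpp
#include <unistd.h>
#include <type_traits>
#include <utility>

class fd_guard11 { // hypothetical C++11 variant of the guard
  int fd_;
public:
  explicit fd_guard11(int fd = -1) : fd_(fd) { }
  fd_guard11(fd_guard11 const&) = delete;            // no copying
  fd_guard11& operator=(fd_guard11 const&) = delete; // no assignment
  fd_guard11(fd_guard11&& o) noexcept : fd_(o.fd_) { o.fd_ = -1; } // steal
  ~fd_guard11() { if (fd_ >= 0) ::close(fd_); }
  int fd() const { return fd_; }
};

// The compiler now rejects accidental copies at the point of use.
static_assert(!std::is_copy_constructible<fd_guard11>::value, "no copy");
static_assert(!std::is_copy_assignable<fd_guard11>::value, "no assign");
static_assert(std::is_move_constructible<fd_guard11>::value, "move ok");
```

     Private undefined declarations give the same protection to outside callers; deleted functions also catch copies made inside the class itself, and at compile time instead of link time.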

     "Well, it looks easy to throw new file descriptor wrappers together," Stu noted. "Have any odd ones up your sleeve?"

     "I can show you one more," Wil confirmed. "The next section's object saves and restores current file position."

pointer

     "Sometimes," Wil changed tack, "I want to change a file's position temporarily, then put it back again. I use yfdp to save current position in the constructor; the destructor restores that position, unless I call pforget() before destruction."

     "Why would you need this?" Stu asked.

     "Usually I just wanted the file length," Wil replied, "and I didn't want to use fstat() to find out. Maybe I should just use fstat(), but I developed a habit of using lseek() to eof for this."
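
     For comparison, the fstat() route is sketched below. file_length() and lengths_agree() are hypothetical helpers, and the scratch file under /tmp is an assumption made only so the two measurements can be compared.

```cpp
#include <sys/stat.h>
#include <unistd.h>
#include <stdlib.h>

// Returns the length of a regular file via fstat(), or -1 on failure.
// The file position is never touched, so nothing needs restoring.
off_t file_length(int fd) {
  struct stat st;
  if (::fstat(fd, &st) != 0) return -1;
  return st.st_size;
}

// Writes 5 bytes to an unlinked scratch file and measures it both ways.
bool lengths_agree() {
  char path[] = "/tmp/fd_len_XXXXXX"; // assumed scratch location
  int fd = ::mkstemp(path);
  if (fd < 0) return false;
  ::unlink(path);
  bool ok = ::write(fd, "hello", 5) == 5
         && file_length(fd) == 5
         && ::lseek(fd, 0, SEEK_END) == 5;
  ::close(fd);
  return ok;
}
```

     Since fstat() reads no shared state that a seek would change, it also sidesteps the position juggling that lseek() to eof requires.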

     "Which isn't thread safe," Stu observed.

     "Yes, of course," Wil granted. "But usually I only do this when first opening a file, before any other thread gets involved."

class yfdp { // auto restore original file pos for fd files
  int p_fd; // file descriptor
  i64 p_pos; // off_t returned from ::lseek(p_fd, 0, SEEK_CUR);
public:
  /// \brief constructor lseeks and prints error msg as necessary
  explicit yfdp(int fd); // p_pos = ::lseek(p_fd, 0, SEEK_CUR)
  /// \return original position of fd (captured by yfdp::yfdp())
  i64 ppos() const { return p_pos; } // negative when lseek failed
  int pfd() const { return p_fd; } // descriptor
  /// \brief ::lseek(p_fd, (off_t) p_pos, SEEK_SET) to restore pos
  void phalt(); // early ~yfdp() before destroyed (idempotent)
  /// \brief restore file pos to p_pos if pforget() was not called
  ~yfdp() { phalt(); } // phalt() only acts 1st time called
  /// \brief use forget if you only need constructor's current pos
  void pforget() { p_fd = -1; p_pos = -1; } // NO ~yfdp() restore
  /// \brief change file current pos to newPos (err msg at need)
  /// \return newPos on success, or -errno (neg errno) on failure
  i64 pseek(i64 newPos); // change file pos (restored by ~yfdp())
  /// \brief change file pos from p_pos to eofPos (err msg at need)
  /// \return file eof, or -errno if lseek(p_fd, 0, SEEK_END) fails
  i64 peof(); // change file pos to eof (length); ~yfdp() restores
}; // class yfdp

     "You wrote more comments than usual," Stu said.

     "I do that when guessing semantics for an api is harder," Wil explained. "I use single line summaries the most."

     "Looks okay. Where's the code?" Stu asked.

     "The constructor's trivial," Wil stated.

yfdp::yfdp(int fd) // p_pos = ::lseek(p_fd, 0, SEEK_CUR)
  : p_fd(fd), p_pos(::lseek(fd, 0, SEEK_CUR)) {
  if (p_pos < 0) {
    ylog(1, "::lseek(fd=%d, 0, SEEK_CUR)=%d errno=%d ('%s')\n",
      (int) p_fd, (int) p_pos, (int) errno, ::strerror(errno));
  }
}

     "Yes, 'where am I now?' is all it asks," Stu agreed.

     Wil nodded. "The destructor puts it back," he added.

void yfdp::phalt() {
  if (p_fd >= 0 && p_pos >= 0) {
    off_t back = ::lseek(p_fd, (off_t) p_pos, SEEK_SET);
    if (back < 0) {
      ylog(1, "::lseek(fd=%d, pos=%lld, SEEK_SET)=%d errno=%d\n",
        (int) p_fd, (long long) p_pos, (int) back, (int) errno);
    }
  }
  this->pforget(); // never halt twice
}

     "What's left?" Stu asked. "Just peof() and pseek()?"

i64 yfdp::peof() { // seek file eof (length); ~yfdp() restores
  off_t eofPos = ::lseek(p_fd, 0, SEEK_END);
  if (eofPos < 0) {
    int e = errno;
    ylog(1, "::lseek(fd=%d, 0, SEEK_END)=%d errno=%d\n",
      (int) p_fd, (int) eofPos, (int) errno);
    eofPos = (e >= 0)? -e : e; // neg return shows failure
  }
  return eofPos;
}
i64 yfdp::pseek(i64 newPos) { // alter file pos; ~yfdp() restores
  off_t after = ::lseek(p_fd, (off_t) newPos, SEEK_SET);
  if (after < 0) {
    int e = errno;
    ylog(1, "::lseek(fd=%d, pos=%lld, SEEK_SET)=%d errno=%d\n",
      (int) p_fd, (long long) newPos, // log the target pos, not p_pos
      (int) after, (int) errno);
    after = (e >= 0)? -e : e; // neg return shows failure
  }
  return after;
}

     "Yep," Wil confirmed. "Not much to this. But after calling peof(), the file position is at eof until you call phalt(), or yfdp gets destroyed. I hate seeing lseek() sprinkled in code, especially without any error checking. You can read the iovec demo now."
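
     A scoped use of this idea might look like the following sketch. pos_saver is a hypothetical, cut down stand-in for yfdp with no logging; length_by_seek() and position_restored() are likewise names made up for illustration, and the scratch file under /tmp is an assumption.

```cpp
#include <unistd.h>
#include <stdlib.h>

// Stand-in for yfdp: remembers the position, restores it on scope exit.
class pos_saver {
  int fd_;
  off_t pos_;
  pos_saver(pos_saver const&);            // no copy
  pos_saver& operator=(pos_saver const&); // no assignment
public:
  explicit pos_saver(int fd) : fd_(fd), pos_(::lseek(fd, 0, SEEK_CUR)) { }
  ~pos_saver() { if (fd_ >= 0 && pos_ >= 0) ::lseek(fd_, pos_, SEEK_SET); }
  off_t eof() { return ::lseek(fd_, 0, SEEK_END); } // moves to end of file
};

// File length by seeking to eof; the destructor puts the position back.
off_t length_by_seek(int fd) {
  pos_saver save(fd);
  return save.eof();
}

// Checks that the position set before the call survives it.
bool position_restored() {
  char path[] = "/tmp/fd_pos_XXXXXX"; // assumed scratch location
  int fd = ::mkstemp(path);
  if (fd < 0) return false;
  ::unlink(path);
  bool ok = ::write(fd, "hello world", 11) == 11
         && ::lseek(fd, 3, SEEK_SET) == 3
         && length_by_seek(fd) == 11
         && ::lseek(fd, 0, SEEK_CUR) == 3;
  ::close(fd);
  return ok;
}
```

     The caller of length_by_seek() never sees the seek to eof, which is the point of wrapping it.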