|
demos are
explained here;
a menu at top column right indexes actual topic demos.
Here we demo crc.
problem
A collision resistant checksum is often very useful for hashing as well as change detection. Wil uses crc32 as his default tool for this purpose, barring more specific requirements. A cyclic redundancy check (crc) is a collision resistant hash, and it's quite effective in 32-bits for distributing input byte streams over the full range of 32-bit unsigned integer values. But that's a far cry from no collisions. You'd need a cryptographic hash of much larger size if you can't bear collisions. But when collisions are okay (though rare) then crc32 is an excellent hash for populations in many millions. (However, crc's were really designed to detect bursty line errors in telecommunications. So crc's might not be as collision resistant as you'd like for other situations. Analyze effects in your situation.)
crc32
Wil uses crc32() from the zlib library, which is distributed on terms allowing free embedding in many products from gzip to your browser's content encoding systems. Since the code is so easy to get, Wil treats crc32() as one step away from a system library call. But it's in a third party library you have no trouble finding. Most of the yh32 class api below is just a wrapper around the crc32() function, with convenient operators. The speed is substantially better than extracting 32 bits from the 128 bits of an md5 hash, when memory pressure seems to penalize md5 with cache line misses. (In unit tests touching high locality memory in tight loops, md5 can look as fast as crc32, but this isn't the case under high i/o.) class yh32 { // crc32 based on zlib's ::crc32() «
private:
u32 h_len; // number of bytes added to crc
u32 h_crc; // crc from applying crc32() to h_len bytes
void _hinit() {
h_len = 0; h_crc = (u32)::crc32(0L, Z_NULL, 0); }
public:
u32 hlen() const { return h_len; }
u32 hcrc() const { return h_crc; } «
operator unsigned long() const { return h_crc; }
void hadd(const char* s) {
n32 n = ::strlen(s);
h_crc = ::crc32(h_crc, (Bytef*) s, n); h_len += n;
}
void hadd(iovec const& v) {
n32 n = v.iov_len;
h_crc = ::crc32(h_crc, (Bytef*) v.iov_base, n); h_len += n;
}
void hadd(yv const& src) {
n32 n = src.v_n;
h_crc = ::crc32(h_crc, (Bytef*) src.v_p, n); h_len += n;
}
void hadd(void* p, u32 n) {
h_crc = ::crc32(h_crc, (Bytef*) p, n); h_len += n;
}
yh32() { _hinit(); }
yh32(yv const& src) { _hinit(); if (src) hadd(src); }
yh32(void* p, u32 n) { _hinit(); if (p && n) hadd(p, n); }
yh32& operator<<(const char* s) { «
if (s) hadd(s); return *this; }
yh32& operator<<(iovec const& v) { «
if (v.iov_base && v.iov_len) hadd(v); return *this;
}
/// \return collision resistant hash of ptr address by crc32:
static u32 hashp(const void* p) { // crc32 hash 4-byte ptr «
//yv four(&p, sizeof(void*)); // ptr as 4-byte octet run
//yh32 x; x.hadd(four); return x.hcrc();
const void* tmp = p; // 4 bytes of address passed to crc32:
return (u32) ::crc32(0, (Bytef*) &tmp, (uInt) sizeof(void*));
}
struct Hq { yh32 const& q_h; Hq(yh32 const& h): q_h(h) { } };
Hq quote() const { return Hq(*this); } // to request dump
void hprint() const; // hdump() stdout for use under gdb
void hdump(yo& o) const; void hcite(yo& o) const;
}; // class yh32
inline yo& operator<<(yo& o, yh32::Hq const& x) {
x.q_h.hdump(o); return o; }
inline yo& operator<<(yo& o, yct<yh32> const& x) {
x.c_t.hcite(o); return o; }
Little of this api needs explanation if you've been reading other demos before this one. But yh32::hashp() is an oddity present in this api only as a convenience to code using pointer addresses as keys in hashmaps. When this might otherwise cause too many hashmap collisions — presumably because of memory alignment patterns — hashp() can be used to distribute keys more uniformly over a 32-bit address space, and at relatively little time cost. Wil does this in some hashmaps. Since most of this class is inlines, the yh32.cpp source file only contains printing method implementations. void yh32::hprint() const { // hdump() to stdout for gdb «
yout << yendl; this->hdump(yout); yout << yendl << ynow;
}
void yh32::hdump(yo& o) const { // here same as hcite() «
o.of("<yh32 crc=%#lx len=%lu/>", hcrc(), (long) hlen());
}
void yh32::hcite(yo& o) const { // cite is always one line «
o.of("<yh32 crc=%#lx len=%lu/>", hcrc(), (long) hlen());
}
For explanations of quote() and ycite() you should see the coming quote demo. They drive operator<<() overloading to call either hdump() or hcite() (which in this api does the same thing since there's not enough state to need more than one line of output). If you want to write your own version of crc32() you can find a good number of web pages elsewhere on this. Or you can pick apart the zlib library source code. A 64-bit crc appears column right because you might not find as many online references for this.
bare metal
"Why shouldn't I just call crc32() directly?" Dex challenged. "Go right ahead," Wil suggested. "Not selling me on advantages of yh32?" Dex asked. "No," Wil agreed. "You're no fun any more," Dex whined. "You never argue." "If you say so," Wil nodded. "I suppose," Dex considered slowly, "if I wanted to get a crc in a deep data structure and I already had some content before that folded into the crc, it might be hard to pass in starting state for crc32() and later get back both the new crc and the total length involved." "You're doing great," Wil urged. "Sell yourself some more." "Ha, you fell for it!" Dex crowed. "Crc classes suck." "Time's up," Wil looked at his watch. "Should I bill you? Or did you get insurance? You need a lot more therapy." "What drugs are you on?" Dex wondered. Wil jumped up to go. "Gotta run, see you next time." |
A submenu for demos appears below, letting you
go to the page on a topic written as a demo (as the
demos page defines it).
menu
thorn: todo, names, fd, iovec, assert, log, run, hex, crc « Þ, buf, in, out, quote, escape, compare, file, deck, cow, arc, blob, tree, slice, rand, time, stat, hash, heap, node, primes, page, book, pile, stack, atomic, lock, mutex, thread, map, meter, list, iter, ctype (mu: toy, peg, imm, tag, box, symbol, token, number, bigint, class, method, reader, writer, eval, env, vm, gc, world, pcode, compiler, asm, lathe, lisp, smalltalk, design, weight, jar, card, harp, debug, profile) Some demos are stubs: todo is a demo guide. See toy for mu updates on language pages; names introduces naming schemes.
polynomials
You might've found more material on crc code and algorithms in general, if this demo had been written first. But now Wil just wants to show code and get out. A little background context below is offered for token purposes, in case it whets your appetite. In the 80's Wil read about cyclic redundancy checks in the library, but now you have Wikipedia (cf Cyclic-redundancy-check). There you'll see gzip's polynomial 0x04c11db7 (reflected: 0xEDB88320). You can even find a good 64-bit polynomial listed there from the ECMA 182 spec, which took Wil long to find by search in 2002: 0x42f0e1eba9ea3693 and 0xc96c5795d7870f42 are considered variants of the same polynomial because they are left-to-right bitwise reflections of one another, meaning they have the same error detection strength. One caveat from old literature: don't invent your own polynomials because you've no idea what you're doing. So trust folks who chose the proven 32 and 64-bit polynomials. (But you can get several different checksums out of one polynomial by using the reflected form, and by applying them both in a couple different ways.) Your safest bet in performance is to use someone's tuned implementation. The code in zlib's crc32() was reworked not too long ago to process four bytes at a time for good net increase in speed. But you wouldn't care to code that yourself. And if you write your own 64-bit crc, you should expect poor 64-bit arithmetic performance in typical compiler output; so you might need to roll your own assembler if you must have 64 bits and not just 32.
crc64
To help roll your own crc64, this section gives you a head start. Making it run fast is your problem, capiche? Let's imitate yh32's api column left. class yh64 { // crc64 using ECMA-182 polynomial «
private:
u64 h_len;
u64 h_crc;
void _hinit() { h_len = 0; h_crc = (u64) 0xFFFFFFFFFFFFFFFFULL; }
static u64 _hcrc64(u64 crc, const void* p, n32 n);
static u64 _hsub64(u64 crc, const void* p, n32 n);
static const u64 kpoly = 0x42f0e1eba9ea3693ULL; // ECMA-182
public:
yh64() { _hinit(); }
yh64(const void* p, n32 n) { «
this->_hinit();
h_crc = this->_hcrc64(h_crc, p, n);
}
static void hterms(yo& o); // write 64-bit terms on out stream
u64 hlen() const { return h_len; }
u64 hcrc() const { return ~h_crc; } «
operator unsigned long long() const { return ~h_crc; }
void hcut(yv const& src) { // gratuitous variant
n32 n = src.v_n; // note: do NOT mix with hadd() calls
h_crc = _hsub64(h_crc, src.v_p, n); h_len += n;
}
void hadd(yv const& src) { n32 n = src.v_n;
h_crc = _hcrc64(h_crc, src.v_p, n); h_len += n;
}
yh64& operator<<(yv const& v) { if (v) hadd(v); return *this; }
};
Note the extra oddly named hcut() method when all the yh32 methods for adding to the crc were named hadd(). What's that about? It's an example of how to generate another crc64 from the same polynomial. You'll have to figure out why this works on your own time though. /*static*/ void yh64::hterms(yo& o) { // write 64-bit terms «
u64 table[256 + 1]; // to hold generated terms
u64* terms = table;
for (u64 i = 0; i < 256; ++i) {
u64 t = i << 56; // next term in u8-indexed lookup table
int bits = 8 + 1; // one per bit plus one for pre-decr
while (--bits) { // trigger polynomial if hits high position
t = ( t & 0x8000000000000000ULL )?
(t << 1) ^ kpoly : (t << 1);
}
terms[ i ] = t; // term for all eight bits of an input byte
}
o << "const u64 yh64_terms[] = {" << yendl;
u64* end = table + 256; // one beyond last table slot
for (/*prep preincr*/ terms -= 4; (terms += 4) < end; ) {
o.ofn(
" 0x%016llxULL, 0x%016llxULL, 0x%016llxULL, 0x%016llxULL,",
(long long) terms[ 0 ], (long long) terms[ 1 ],
(long long) terms[ 2 ], (long long) terms[ 3 ]);
}
o << " 0" << yendl << "};" << yendl << ynow;
}
Method hterms() generates a lookup table from the polynomial and writes it as source code, which is actually statically compiled in yh64.cpp instead of generating it on the fly this way. The table is too long to list here. Just run hterms() above (change it to use fprintf()) and you'll see it. The first and last pairs of terms in the table are listed next, so you can see if you're getting the expected output. Note the second term is the polynomial itself. 0x0000000000000000ULL, 0x42f0e1eba9ea3693ULL // first two terms
0xd80c07cd676f8394ULL, 0x9afce626ce85b507ULL // last two terms
The yh64_terms table is used in both the methods below. Each generates a different crc64 provided you don't mix one with the other in one stream. The "standard" one for operator<<() was chosen arbitrarily. /*static*/ u64 yh64::_hcrc64(u64 crc, const void* p, n32 n) { // «
const u8* s = (const u8*) p;
const u8* x = s + n;
const u64* terms = yh64_terms;
for ( /*prep preincr*/ --s; ++s < x; ) {
crc = ( crc << 8 ) ^ terms[ (( crc >> 56 ) ^ *s) & 0x0FF ];
}
return crc;
}
/*static*/ u64 yh64::_hsub64(u64 crc, const void* p, n32 n) { // «
const u8* s = (const u8*) p;
const u8* x = s + n;
const u64* terms = yh64_terms;
for ( /*prep preincr*/ --s; ++s < x; ) { // variant:
crc = ( crc >> 8 ) ^ terms[ (crc ^ *s) & 0x0FF ];
}
return crc;
}
These are serviceable, but you ought to tune them in hand-crafted assembler if you want speed in heavy use cases. You'd also best review code for accuracy since these versions were written quickly and casually. |
demos « Þ
+ todo + names + fd + iovec + assert + log + run + hex + crc « Þ + buf + in + out + quote + escape + compare + file + deck + cow + arc + blob + tree + slice + rand + time + stat + hash + heap + node + primes + page + book + pile + stack + atomic + lock + mutex + thread + map + meter + list + iter + ctype |