|
demos
Lately I write demos under thorn with a todo list using C++ under a BriarPig mu-babel license. The run and hex demos were done on 13apr2008; crc on 14apr2008; buf on 20apr2008; in on 27apr2008; ctype on 04may2008; out on 18may2008; slice on 25may2008; quote on 31may2008; escape on 31may2008; mutex on 14jun2008; rand on 16jun2008; stat on 17jun2008; primes on 19jun2008; list on 23jun2008; heap on 29jun2008; iter on 02jul2008 and 04jul2008; atomic on 06jul2008; node on 13jul2008; page on 23jul2008; hash on 27jul2008; book on 27jul2008; pile on 03aug2008; stack on 07aug2008; mu on 12aug2008; toy on 16aug2008; weight on 23aug2008; symbol on 02sep2008; imm on 04sep2008; gc on 06sep2008; map on 14sep2008; meter on 16sep2008; thread on 22oct2008; arc on 05nov2008; this menu links all demos: mu: toy, peg, imm, tag, box, symbol, token, number, bigint, class, method, reader, writer, eval, env, vm, gc, world, pcode, compiler, asm, lathe, lisp, smalltalk, design, weight, jar, card, harp, debug, profile thorn: todo, names, iovec, assert, log, run, hex, crc, buf, in, out, quote, escape, compare, file, deck, cow, arc, blob, tree, slice, rand, time, stat, heap, node, primes, page, book, pile, stack, atomic, lock, mutex, thread, map, list, iter, ctype 27nov08
¶
reckless stragglers
holiday ¶ I'm having a great holiday with my sons. Years ago I had a tendency to work through holidays, which wasn't a good idea. Now I'm sensitive to how few years I have my sons left at home, and I try to enjoy it while I still can. Once they're ready to leave home after school, I'll campaign to keep them near, but it might not work. I've been trying to decide what my plan for this site should be. I no longer think my sons will be that interested in learning to code, since I don't think it will be possible for them to learn how to code things they like doing on computers — playing games mostly. They might enjoy coding, but I shouldn't get my hopes up. There's no reason to get emotionally invested in something likely to disappoint. So far I think I might lean toward a stronger focus in prototyping and testing tools: drivers for other things I find more interesting. What would I find useful? A toy-scale async server framework for exercising pieces of things fitting into more complex systems. This is close to what I was already planning to do, but now there's not quite as strong an emphasis on doing real development, or aiming for end-user programming — at least without adding a lot more batteries than I'll include at first. So the toy language might aim to be a scripting backend first for guts of automated behavior in a daemon (which wouldn't have to be a daemon, but could be if you wanted). Anyway, I'm thinking less about a toy development framework for my sons, and more about a lightweight bit of breadth-first scaffolding to get stuff to happen in an async server, as a lowest common denominator for running code designed to operate in an async environment. Because async code is really irritating to exercise in a way you feel really covers code thoroughly. I need to update the home page (and other pages) to downplay a goal of pedagogical tools for my sons. Whatever I make might be useful for my sons, but I won't worry about it. virtual zombies ¶ My youngest son is having a blast playing Left 4 Dead: a zombie game from Valve (wikipedia), the makers of Portal and Half Life 2, which both my sons have played. (I've played neither. I play almost none of the games I buy my sons. The part I find interesting is game design, and I can pick that up from watching them play a few minutes. Almost all first person shooters are dreadfully boring as problems to solve. I'd play games more if I ever saw puzzles to investigate. The kinds of puzzle you find in things like Portal are almost engaging, except each one is done almost immediately. So I have trouble caring.) But like several other games on the Xbox running online to interact with other players, Left 4 Dead allows my younger son (let's call him Fenrir) to engage in deeply social behavior by playing in a virtual world while talking to other participants over his headset. Something about Left 4 Dead seems different from his recent addiction to playing Halo online. There's a lot more strategy, for several reasons. In Halo my sons would wander around shooting other players. Teams in Halo formed a basic element of organization, but had little practical effect. The really interesting part was how my sons talked continuously with other players over headsets while playing. The Xbox basically gives them a no-handset phone connection to other players in a conference call. While negotiating coordinated movements through three dimensional environments, my sons and other players stalked each other and traded negotiation and criticism continuously and fluidly. Fenrir was especially adept at giving other folks verbal instruction and direction in real-time while solving his own problems. I stopped seeing video games as a waste of time. Both my sons were getting terrific training in real time social communication — tracking what other folks are thinking while pursuing their own agendas, and casually fielding just-in-time negotiation. Playing Left 4 Dead is some kind of step up. The game has four principal main characters in the role of survivors of some kind of catastrophe populating the world with zombies in a cast of hundreds, at least. It works like a zombie movie, even to the extent of having movie posters featuring the main characters, and ending play with rolling credits like a movie — Fenrir said one player named MrReager was killed during a compaign, and the rolling credits said the "movie was made in the memory of MrReager." In other words, it's a little campy and it works really well. But the real genius in Left 4 Dead is subtle but quite direct: you can't survive the fight with zombies without cooperating, a lot. Players who wander too far ahead or too far behind get picked off by über zombies who'll kill the players unless saved by another player. When you're in the role of one of the survivors, your first goal is keeping your teammates alive, because otherwise no one will save you when you're the one being torn to pieces by a hyped-up zombie with the energy and agility of a werewolf. (The zombies come in several exotic flavors besides a common infected strain. The leaping, jumping, slashing monster type is called a Hunter. Others are all weirdly horrible and nimble; zombies aren't slow in this game.) Fenrir's favorite mode is called Versus in which a team of four survivors play against another team of four zombies. Each team of four trade places with the other team every time the survivors make it to the next home base, called a safe room. So teams take turns being survivors and zombies, in rotation. As zombies, players spawn as one of the several special zombie variants each time, with it's own special attack mode and attendant tactics. If you're killed as a zombie, you just respawn as another zombie — populating a small, intelligent, especially virulent subset of zombies in the AI driven landscape. When you die as a survivor, you're just dead until the others make it to the safe room, or until all the survivors are killed. Sometimes all the survivors are killed, and sometimes they all live. It depends partly on chance and partly on skill of the players. Each team can communicate with one another over headsets. Survivors can hear each other, and zombies can hear each other. But there's no crosstalk between teams. Zombies also have the extra advantage of being able to see outlines of survivors when out of range, in order to plan and time attacks better. There's a lot of strategy and Fenrir runs analysis continuously while playing, giving suggestions to other players verbally while timing his own actions. Good play for a zombie aims to exploit failures of survivors to watch one another, either because they don't stick together, or because they're fielding multiple attacks. It involves guessing what the other team might be thinking, and sussing what they're likely to give attention in small windows of time. It's fun to watch because there's no plot, just like a zombie movie. It's just staying alive in changing conditions with a lot of time pressure to make goals before you get worn down. naps ¶ Lately folks keep discussing whether taking naps is a good idea. Here's what I do: I often take very short naps — but only when I'm very sleepy. When I sleep well all the time (about seven hours a night is good), then I don't get sleepy during the day. But occasionally I get groggy at 9pm or 10pm. A nap of about ten or fifteen minutes removes the sleepiness, after which I get completely alert again. I started doing this about seven years ago. When I was really young, going to sleep was always a very high latency activity: 90 minutes was the minimum when I was 20, and I was usually groggy afterward. So I never napped. But my first startup after Netscape sometimes encouraged me to sleep not quite enough. Too many days of 6 or 6.5 hour nights in a row caused a sleep deficit making me sleepy in the afternoon. When it developed into a really intense impulse to sleep, it was far too distracting, so I stopped bucking my body's demand. To my surprise, I found merely losing consciousness completely was enough to vanquish the urge to sleep. About five or ten minutes would do it. If I did this in a somewhat uncomfortable position — in a reclining chair or even on the floor — I'd awake in a few minutes, the first time I awkwardly changed positions. And the intense urge to go to sleep was always gone. Within a few minutes of standing back up, I became incredibly alert once again. It was great. I assumed — without anything to back up my notion — that I was somehow gaming my brain's chemical system. Fatigue from sleep deficit got my system in a state where sleep was demanded. By giving in, just for a few minutes, this bypassed some logjam that otherwise would not go away. I thought of it as quickly rebooting. It only takes about sixty to ninety seconds for me to turn completely off when I'm that sleepy. Then for whatever bizarre reason, I'm good to go ten minutes later. I have no idea what I'm accomplishing, if anything, besides dismissing a distraction of sleepiness. All I know is it works. Now I seldom get sleepy in the day — maybe once every three weeks or so. But when I do, I find a dark room and lay down for ten minutes, completely blanking thoughts and letting my conscious mind vanish completely. Fifteen minutes later I'm invigorated about as much as from taking a cold shower. I have no idea why this is so effective. But you should give it a shot. 22nov08
¶
nullified threats
memory ¶ I'll turn 50 soon, and in theory it ought to affect my memory some day. However, I can barely tell any difference lately, and I'm not particularly worried: I suspect by the time I turn 60 I'll still have a better memory than average. My memory for names has gotten worse. If I haven't heard a person's name in years, I have trouble calling it to mind. It's the same old tip-of-the-tongue experience, but much more pronounced. I can only access those names in a neutral context, but not right at the moment I'm trying to finish a sentence where I'd like to insert the name. If when I start a sentence I don't already remember a name, I no longer assume I'll have it by the time I reach that part of a sentence, and try to avoid it. I didn't have that problem at all when I was young, so now my memory for names is more like an average 25 year old. When I was 25, if I knew I had known something once, I could access it in seconds at most. And my recall was excellent for anything I'd given any attention at all. If I had noticed a thing, I could recall it. My memory wasn't eidetic, because I wasn't able to parse visual details out of my visual memory I never noticed the first time. But in high school everyone assumed my memory was photographic. This was because I had 100% recall for anything rehearsed to the extent of discussing it in class, or listening to the teacher discuss it. This was one part really good episodic memory, and one part great random access semantic memory, and the two reinforced each other since I could prompt one with the other. You would really have to grill me in fantastic detail to establish what I didn't remember. By the age of five I noticed I was starting to recall too much detail about things I wasn't going to need in the future. So I tried to deal with this by paying less attention to stuff I knew was minutiae. In grade school I was an avid reader and studied memory mnemonics a little because I was interested in how my mind worked. And it also served as a guide to how to forget things — you just do the opposite of techniques for recall: don't rehearse things. Every time I heard a phone number I had to blank my mind not to memorize it. Unfortunately, I now have a lifetime of habits aiming at forgetting enough. It wouldn't hurt to rehearse more things once or twice, instead of carefully trying not to get something the first time. But I need to rehearse visually. (My verbal memory has never been good, and my short term memory for sound is much shorter than that of an average person. If I haven't turned a sound into picture or concept within a very small number of seconds — about three to five seconds — the sound is unrecoverably gone. Except for tone; for some reason I recall subtle emotional tone the same as episodic memory. When I remember what a person says, I'm actually remembering my model of what I think they said they believe, and not the words, because I don't think much in words.) In grade school teachers started showing me complex posters full of dozens of elaborate images — just for a few seconds — so they could set it aside and ask questions, to see whether I recalled all of it. They were testing me for eidetic memory because I seemed like the best candidate. But I was a disappointment: I only recalled those parts I set eyes on and started telling myself a narrative about. All those parts I recalled, but nothing I hadn't gotten to yet. My short term episodic memory always allowed me to replay recent history like rewinding video, but everything was fuzzy that I hadn't studied with the fovea of my eye. One of my junior high school English teachers told us a long anecdote about a medical student she'd known in college with photographic memory. Her eyes went to me often during the anecdote, as was common with teachers telling stories about memory oddities. The med student had gotten in trouble by virtue of getting a perfect score on some test, which authorities had assumed meant evidence of cheating; so he'd had to explain about his trick memory. But he'd paid a price for his memory, my English teacher explained — he believed it made him less intelligent than he would have been with a more normal memory. Or at least this was the moral my teacher wanted to put across: remembering too much could impact brilliance, and she certainly thought so. This is the law of even-handed gifts most people tend to believe — it's unfair for someone to be good at everything, so any gift must come at the cost of incapacity somewhere else. The end of the anecdote told how the med student had ended up the chief administrator of a hospital, instead of becoming a brilliant über doctor — a clearly second place prize to my teacher. I had the distinct impression I was supposed to get something out of this, but I never figured out what it was. I suspected her of schadenfreude. Now I understand the categorical error she made. What she had actually observed was that great memory can make a person lazy, since success can be had merely by remembering, and then thinking would not be necesssary. If you believed your memory was gospel, or that text of what you learned was gospel, you might avoid questioning the basis of what you knew — having the effect of making you less brilliant when you could not think outside the lines or outside the box (or outside whatever metaphor you use for canonical, socially constructed world views). I had always believed everything I was taught was at least partially if not totally cockeyed, as if twisted by inferior mental categories of folks not up to analyzing what they formalized and described. I was busy unraveling every underlying premise I managed to notice, hoping I could find a more sane way to recombine all the parts in some better order. That I remembered a lot didn't stop me at all from trying to tear it all apart. I also had no trouble remembering all my mental workshops strewn with wreckage from disassembled models of how things work. However, it didn't take long to encounter the real problem: if I had too many interesting ideas too often, what was going to cue me to bring them back to mind? By high school I started to notice fascinating ideas would slip for days at at time because I had too many things in my queue, and all of them were interesting. So I needed to find a way to prioritize since I didn't have bandwidth to come back to everything, not if I just kept steadily adding to my queue of ideas to study. In short, it was an indexing problem: I had trouble refinding every thing I thought about. I needed a mental index I could peruse as a mental tickle list. I couldn't write it down because it was complex, tree-shaped, and changed continuously as I paid attention. I tried mnemonics again. One common technique for remembering lists of things involved the use of a virtual house, with locations in the house where you could embed an image of something you wanted to recall later. My older brother had a model with ten locations. I tried imagining my explorations of ideas as happening in specific imaginary locales, so I could bookmark where I was and recall my place later by visting all those places to see what was there. (Over time, I have tended to use stories as things in which I embed things I want to recall later, since stories include not only location, but also plot, dialog, personal beliefs and actions of characters, and lots of texture for hooks to hang ideas. This approach has the effect of turning episodic memory into an index, since you can visit fictional episodes as easily as real ones, and you aren't constrained by a one-to-one ratio with real time.) In college, partly for amusement, I expanded my brother's ten-slot model into a model whose capacity accomodated one hundred images. Once I worked out the order of locations, I immediately put it to use performing parlor tricks. About a dozen times I asked a friend or relative to make a list of 100 things, preferrably with a visual representation; then I asked them to read me the numbered list in some scrambled order, crossing out each already-read item as we went. I needed about three or five seconds to cement each image in place. Then after hearing all of them, I could read them back in sorted order from one to one hundred, with perfect scores. A girl I knew in high school — Denise Peterson — found me in the college library one day and I did this trick for her, which she loved. Denise said she was in a psychology class, and asked me to come show this trick to her class. But I pointed out just how long this takes to perform. First you have to think of a hundred things and write each one down. Then you have to read each one to me, at a rate of maybe a dozen a minute. The whole thing would take at least twenty minutes. It would consume a lot of one class to introduce, perform, then follow up with discussion. I thought her classmates would hate it. I have no idea whether I can still do it. I'd have to rebuild the model, and I'm sure I'd be slower. Not very entertaining. One of my younger brothers tried to jam the program by making each item a long and involved phrase, such as, "The rotund, dancing, Franciscan monk with green toenails and torn raiments composed a rambling dirge in honor of half-forgotten and yet piquant errors of judgment." He was fond of working elaborate mental state into each one. I told him no, it had to be something I could see in my mind's eye, and no more than a short noun phrase with one or two adjectives. So he didn't see why it should be hard. He didn't offer to try though. Anyway, my memory isn't what it used to be, but it doesn't suck yet, and I suppose it never will. However, any bit of random nonsense I want to remember days from now I have to write down. The syntax for that arbitrary command line that effects a special purpose environment change? I can remember that for the next ten minutes, but that's it. If any part of the syntax meant anything to me, I might remember longer. Fortunately, this almost never comes up. I just wish I had needed to keep notes when I was younger, so I'd have better note taking habits. 21nov08
¶
baleful stares
mood ¶ I've been unusually content lately, maybe for no particular reason, or maybe because I have enough to think about. But it's pretty interesting not to have that sensation of starving for stimulus all the time despite consuming input that ought to be engaging, but isn't. Perhaps my schedule is is slightly closer to a balance lately. In the last week I saw one new movie, read one new novel, saw one new episode of television (House of course), had lunch with one old friend, emailed another old friend, had one eye exam for new glasses, watched my older son beat one new video game, started writing about one new project at work, ignored yet another idea for a funny fiction story, and worked out in the gym several times. I think I'll write a little more hobby code this weekend. But I hardly thought about my home project the last week because I had an absorbing problem from work to think about. So that's been running in the background of my mind all the time. It's a distributed storage problem, and a little on the wide open side, which I like a lot but makes me search through a large problem space looking for plausible hits. It's another look at dedup, but this time starting from first principles, seeing if anything new comes to mind. So far the ideas seem old and conventional, in so far as they match bits and pieces from the usual suspects. The total cognitive footprint is really small thus far, so it looks like a reductionist game play: covering the entire problem with a very small set of primitives. I keep wondering if another trick will suggest itself. But the only trick today is just: less is more. Having fewer constraints affords more options in organization and optimization with replaceable modules. The main advantage so far seems to be in error handling. It looks like I can cover almost all failures with one sort of unified behavior. Partly the trick here is problem definition. (Yes, changing problem definitions has a lot to do with good solutions.) This way of looking at it folds all the error situations into the same space as normal behavior. I think when I prototype this one it will come out sorta cool, especially when lots of parts lean toward functional data flow. Note I'm not going to be telling you about it. That's about as close as you can get. I certainly can't tell you the reductive arguments without spelling out the problem redefinition, which is where a lot of the secret sauce will be involved. I'm still spending a lot of time thinking about exactly which parts depend on which other details. optimization ¶ I rewrote one of the central algorithms in the old code base in a new form so it could more easily be done in hand-crafted assembler. The Intel architecture is really irritating when you can't get enough registers for as many variables as seem needed. |
Entries appear in reverse chronological order.
Content here is permanent: Each entry has a permalink
(¶) to
the long-lived persistent copy here. Clearly, to link
anything, you'd best link the permanent copy.
14nov08
¶
thoughtful faces
type classes ¶ Jonathan Shapiro kindly wrote a generous response on the bitc-dev mailing list to a question I asked recently at the Lambda The Ultimate blog. I asked about terms typeclasses and polyinstantiation — which I had not heard before (because I don't really bother to keep up with programming language terminology) — because naasking had suggested v-tables in BitC might be unnecessary given these other mechanisms. It looked like a good opportunity to ask a newbie question, since I never mind looking ignorant. (Which is convenient, since I'm often ignorant.) Over at LtU, David Barbour gave a good short description of polyinstantiation, including a comparison to static specialization in C++ templates, which made me infer the term means something like: "polymorphism via type specialization at point of instantiation — usually static but could be dynamic." (Because you can always make anything dynamic, of course: late binding is easy, while early binding is hard since it narrows options.) I'm replying here because I've written long posts at LtU in the past making me self-conscious about verbosity while still not letting me say as much as I'd like. I'm happy to respond at LtU when I can keep it short. Note I don't use my last name on this site, and I'd rather not see it used by others. (I had a taste once of greater notoriety, and I don't like it much. The first name I adopted a few years and use at LtU is just fine to keep a low profile.) Sorry for a long setup; the rest of this is response. C++ does tend to tie too many things together. I'm not an apologist for C++, just someone who uses it a lot. Please don't take me for one of those provincial coders who think v-tables are the right way to do things. An array of function pointers is just one representation of a map, and is equivalent to (but more fragile than) dictionaries in other formats. Smalltalk style method dictionaries are nice and flexible, and this hashmap approach to api dispatch is used by Self and Python and other languages to similar effect, though Python uses it for everything including member variables; I gather objects in Python are tuples in the form of mapped name/value pairs. As far as I'm concerned, Objective-C's approach to method maps and dispatch is superior to that in C++, but I haven't had opportunity to use it much. My work life would have been simpler if Objective-C had bested C++ many years ago. Oh well. When I asked about space efficiency of typeclasses as compared to v-tables in C++, I was wondering whether it resembled (just guessing blindly of course) something like Ian Piumarta's approach in cola, which I glanced at some months ago, just long enough to get an impression it aped structural aspects of both Smalltalk class pointers in objects and C++ vtable pointers in objects — embedding a reference to metainformation about a class somewhere in the physical object representation, where it can be found at runtime by dynamic dispatch for polymorphism. Oh, and yes, yes, I know Piumarta is a god (someone I admire) and I mean no disrespect by using casual verbs like ape. The answer I was hoping to hear was one like, "Every single object spends a little space on overhead to access class oriented metadata in a very direct manner, instead of finding very clever indirect ways to factor that overhead out of every object." I like my indirection straight and without too much gamesmanship in optimization. I have a knee jerk negative reaction to fractal sounding solutions (where every rule has an exception for special cases described by new rules with their own exceptions, ad infinitum). I'm not really familiar with Java interfaces, except that when I first read about them in the 90's they sounded so close to Objective-C interfaces I assumed it was a direct copy. (Or they're cousins since essentially the same community of folks would have been involved.) The material on dictionary resolution you mentioned sounds like the part I find interesting. The interactions between multiple types is likely where I'd worry about problems. What I'd really like is a simple operational definition of outcomes from a casual coder's point of view: where should I look in code to resolve conflicts myself? Can I tell — just by looking at the code — how the resolution will work out in the compiler? Or is it too complex to do by hand in practice? This bears very closely upon an aspect of generic functions in Dylan that put me off the language: I had trouble guessing which method would be called by my own casual, mental static typing. I didn't want the runtime to "do the right thing" — I wanted to know myself what would dispatch a priori. In other words, I want to do global static analysis myself, without any maybes or probablys involved. One of the reasons I liked Smalltalk with single inheritance was ease of deciding where to look to see what method executes, once I made an assumption about what actual types were passed to methods. If you like, think of me as coding in object oriented assembler. I still want to imagine a deterministic assembler model despite a nice high level language. I'm glad you wanted to resist dynamic allocation of dictionaries. As a user I would be satisfied by a feature allowing me to turn off dynamic dictionary allocation for types I cared most to control. Incidentally, I like your use of capsule as an alternative for object, since the length is short enough, the syllables few enough, the relation to encapsulation close enough, and odds of confusion with random associations low enough. Thanks a lot for expanding on typeclasses. Mechanics of dispatch is one of my favorite topics, which is why I had trouble resisting an urge to ask for elaboration. I think of you as someone whose time is much more valuable than mine, so I'm flattered to get my interest addressed. I'm not sure I was confused (and I lack the modesty to claim I was), but I sure was ignorant about typeclasses and I thank you for tutorage. I hope it helps other folks familiar with C++ become more familiar with BitC in areas permitting some cross over. In some ways C++ looks like a train headed to run into a mountain without benefit of a tunnel, so it would be nice to have options. simpler C++ rules ¶ Coincidentally (given the sub-topic in shap's remarks about interfaces generally being better than v-tables specifically) this afternoon I added similar remarks along these lines to a wiki page at work outlining future ground rules for a style of writing C++ embedded inside C oriented processes. My first draft of C++ usage rules concerned things not to do, in order to have no runtime impact on nearby C code linking no std C++ libs. The purpose of the wiki page was to inscribe a circle around a set of things to be explicitly permitted by C++ code lurking inside a C-only environment, as distinguished from another set of things explicitly denied to C++ in the same context. But a reviewer asked me to include guidelines on use of inheritance, explicitly spelling out what is considered good inheritance style. In particular, for components intended swappable with replacements, the best approach to minimize future conflict uses pure virtual abstract base class interfaces, with neither inherited state nor implementation to cause coupling in subclasses. Naturally I agreed and formalized rules a bit for the wiki page to encourage good inheritance style in the direction of abstract interfaces. But in practice, extremism of interface purism is one I seldom bother to pursue in my own casual projects, like ones I post on this site. In code you see around here, I make no effort to enforce pure separation of abstract interfaces from concrete subtypes implementing them. I could, but I don't. So, why not? When you cleanly separate abstract interfaces from concrete implementations, you tend to get at least two sets of names: the abstract ones and the concrete ones, with the obvious one-to-one mapping, of course. Proliferation of code names often upsets many people — or at least most folks I see all the time. I tend to create so many class names in the first place that I already get complaints, even before adding more by separating abstract from concrete. The number one request I get is always the same: please write less — less of everything: fewer classes, methods, docs, and checkin notes. Just less, please. When I try to make code replaceable by substitutes, I'm the only one who ever does it (currently anyway). And the abstraction making it possible is usually criticized. Because essentially, abstraction is also obfuscation. Code written abstractly is much easier to change, but is somewhat harder to grasp. The former counts for little, and the latter counts for a lot, in day jobs of late. Sample code on this site blurs edges between abstract and concrete, to make it more clear and to give me less to write, both. The result is hacking fodder, mostly. Which is more or less what it's for: grist for future mills. 12nov08
¶
war bucks
tinkering ¶ Rather than writing web pages, I've been tinkering with new code. I was aiming for a new blob api, but I veered sideways into a new file api rev because it seemed plausible to make a blob mimic a file. Then I kept veering into some pseudo memory mapping support for files. So effectively I'm cooking up some content for a file demo (which is still a stub) because it will help me frame some context for blob content. And it wouldn't hurt to have demos get slightly code heavy. But I'm sure I can go too far that way, getting too many code names. (In code design, when you favor orientation toward data, the result is often more flexible and adaptive. But when each type needs a unique name, you get a lot of names; for a complex task, like paging iovec-based files, it's easy to use many classes with only slightly different names, because their meanings are intertwined.) Since it has a high amusement factor just now, I'm not throttling much. So I might just crank out code for several related demos, all oriented around schlepping around bytes in blobs, files, and wherever. I might write demos as if Wil composed code in stream-of-consciousness mode, in response to Zé asking "what if" too many times. Half baked results would be both good and bad: fertile suggestions and lines of inquiry, but you'd need to boil it down yourself, if you want minimalism. tombstones ¶ This last week I updated code for a closed/probing hash map (using tombstones) so accumulated entropy in the form of tombstones could incrementally boil off. I was seeing too high a percentage of tombstones in map occupancy (bordering on pathological). So I thought about it; then I said oh yeah once I'd simulated it a little. In a closed hashing map which probes to find unused slots when the target bucket is used, you only know a search has ended when you see an empty slot. As a result, when you delete a map entry you must replace it with a tombstone if you don't want to rehash, in order to keep searches valid for buckets used downstream from that spot. When do tombstones go away? When a successor is an empty slot. So removing tombstones is dependent on getting them next to empty slots. High frequency of empty slots is important to maintain constant time access. I decided my last map implementation wasn't working hard enough to remove tombstones — I hadn't asked myself how to make them go away proactively. But once I did, an obvious approach suggested itself. Say you search for key K in a map, but the hash H=h(K) points at bucket B, and K is found in a slot downstream to the right of B (because the slots were used when K was added). When some older keys are removed, you might run across tombstones on the way to finding K. If you do, then swapping K with the first tombstone slot improves the map two ways: 1) K is closer to bucket B, and 2) a moved tombstone gets closer to the next empty slot. Both these improvements remove map entropy, and you can do this incrementally in every map lookup. So far I call this bubbling tombstones rightward because it pushes tombstones to the right on each swap, getting closer to the next empty slot each time, promoting conversion of tombstone slots to empty slots when they become adjacent. When the total empty slot count gets too low, the same tactic can be pursued more aggressively by considering other keys seen in passing besides K, the target of the search. For example, the next used key K' in a slot that is not equal to K might also rehash to an earlier tombstone, which would bubble another tombstone rightward. 09nov08
¶
shanty towns
archeology ¶ I started trawling through old docs about blob btrees in IronDoc, thinking about how to write a demo and whether to include significant code therein, and I noticed an odd thing: I wouldn't do blobs the same way now. So I've been asking myself: What changed? How is it so obvious to me now my earlier approaches were too complex? Did I get dumber? Did my standards change? Or did I learn useful rules of thumb to avoid complexity quagmires? The part that seems too complex now inserts an arbitrarily large data fragment in an existing blob stream. My approach aimed to do this in the cheapest way possible, by calculating final end effects on a btree, at the start before making any changes. As a result, my old code touches nodes the smallest number of times to build a new final tree. But that wasn't necessary. The analysis for cheapest cost was clever, gnarly, and long — all in a totally inappropriate way if you wanted someone else to grasp it all with little effort. It's as if I didn't care what it took for someone else to come up to speed. I only cared whether I grasped it. That's one red flag. Another is absence of the simple approach in old code: I didn't write a simpler but equivalent variant to which the complex version could be compared. Now that just seems wrong, for several reasons, the most important of which is denying others an easy version to walk through. My standards have moved markedly in a direction of letting others review code cheaply. An important quality in code is ease of letting others convince themselves it works. Ten years ago I was too eager to optimize code, when my priority should first have been to ensure it was obvious to others. Now I no longer feel very complex code is to be trusted, ever, without a simpler version to which it can be compared for equivalence. And even then only when complex optimization is needed to meet performance goals. The next version I write will do only the simple thing. But I don't expect to write a disk-based version using a page cache, because that involves details I don't need at first for a blob btree working in memory only. At the moment I'm inclined to insert a large new fragment into an old blob btree by first building a new separate blob btree describing only the incoming fragment. Then I'd write code to insert one blob btree into another at some offset. This would incur extra cost where the two trees join edges. But it's an easy divide and conquer strategy, whose code is re-used when you actually do insert one btree into another. 08nov08
¶
waterfalls of caches
job roles ¶ I had a really intense week I'm not saying much about, but I was really busy all the time. Friday at work included a meeting about what I'll be assigned to do next, which was largely crafted to suit whatever it was I wanted to be doing. So far it looks like I'll have a lot of interesting stuff to think about, with more on my plate than I'll have time to address, which I tend to like since being bored is worse than being slightly overloaded. arc addendum ¶ I added slightly more material to the arc demo so it addresses a few more quirks to arise later in the blob demo. The bottom of each column is longer, with more in the right column. copy-on-write ¶ Today I'm planning to write the cow (copy-on-write) demo next because that's a good way to offload more material that might otherwise creep into the blob and tree demos. They might be too long unless I manage to cut potential side issues. I was thinking about explaining IronDoc style structured storage transactions, and I realized it was technically another form of copy-on-write at the block API level, and the cow demo is still a stub, so I can cover that there. At the same time, I can ditch any other cow side-topics for blobs and btrees in that demo too. Here's a short version of IronDoc transactions. (I'll say this in more detail in the cow demo.) Files are allocated in blocks, mapped in an allocation bitmap of free blocks. A db file actually has two such allocation bitmaps: the old and new free bitmaps. A block can be (re)allocated only when it is both free now and also free previously, since otherwise modifying a block used in the last transaction would lose data you have not committed to discard. Any blocks used in the old worldview (not free in the old bitmap) are physically immutable, but can be logically modified because copy-on-write occurs. When modified, each old immutable block at offset X is remapped to a new offset at Y, where Y is allocated from new or previously empty space; for the rest of a transation, any attempt to read or write X actually uses the block at Y instead. Until a transaction commits, both X and Y are maintained independently to preserve both the old and new versions of X. Using this model, all changes to a db file occur in space that was either empty (free and containing don't-care values) or beyond the old eof from the view of the old database transaction snapshot. An abort for any reason (including process termination losing any unwritten changes) simply leaves a db file with the same content it had before — in old blocks. Changes occur only in unallocated space, from the view of the last committed global transaction. Some folks notice this feels morally equivalent to a log structured file system, since changes log into new space. This particular way of doing it was designed to suit end users for Apple MacOS environments, because a single file hosted both pre and post transaction commit data, so any abort or commit always had everything necessary in one file. This way, if a user crashed and afterwards immediately copied one document they knew about to new media, they would always get everything, because no additional special or hidden transaction log files were involved. The first priority was always the same: never lose the user's data. 05nov08
¶
airtight compartments
arc demo ¶ The new arc demo is now all finished. It overlaps with content to appear in a near future blob demo, while also covering arc oriented ideas applicable other sorts of data structures. (All map data structures indexing data found elsewhere basically contain arcs as entries.) 04nov08
¶
slap buttons
new games ¶ I bought my sons new games over the weekend, which they've played a lot since. Tonight they discovered a very funny feature of Little Big Planet requiring some explanation. Watching them use it was one of the funniest things I've seen them do in a long time. Note Little Big Planet (let's abbreviate to LBP) deserves a long review, because it's really interesting. Watching my two sons start a tutorial for creating new environments in LBP was the first time I thought I was seeing something new in software dev growing in a context of end user game creation tools. However, I'm not going to say much about what's cool in LBP tonight. I'll focus on just one funny part. But to set up the joke, first I'll tell you about my younger son (let's call him Fenrir, but that's not his name) when he played Fallout 3 first on Sunday. Fallout is a cool-looking, post-apocalyptic game made by the same folks who created Bioshock, which in some ways resembles an above-water version of Bioshock, in what I see so far. Part of the appeal is retro cultural elements. The game Fallout begins with the player's game character growing up — Fenrir was irritated when his character reached the age of ten and had to suffer the social burden of a conventional birthday party with a few post-apocalyptic twists. The game has dialog and permits Fenrir to choose one of several responses he can make. When adults patronized Fenrir's character, he said, "I want to slap you. Where's the slap button?" For the next half hour he complained about not being able to find the slap button. Slapping other characters was never one of the offered choices. Fenrir and his older brother made a running joke of the complaint. Okay, now let's fast forward to Little Big Planet again, still called LBP for short. The two of them played a couple games in two player mode, in which each controlled a little character in the form of a bean bag doll called a "sack person," resembling a generic, detailess beenie baby of a human being. LBP renders sack persons and their environments in exquisite three dimensional detail, mildly realistic physics motion, and good sound effects with wonderful background music. The game controls give each player fascinatingly detailed control over the movement and facial expressions of each sack person, so they have enormous personality when a player wants to emote through their sack person. So it's quite easy to project yourself onto the sack person you control. My sons spoke of their sack persons as themselves, and acted that way too, with respect to invasion of space and other forms of mutual interference. Just imagine their surprise when my older son made a sudden gesture, sending Fenrir's sack person flying in a full body sprawl, along with the meaty and satisfying smack sound of a righteous slap — a bit like a punch sound effect from other games, but more in the slap direction. Naturally they explored it more, losing themselves in an epic slapfest competition that was hilarious to watch. I found myself studying exactly what happens to the slapper and the slappee each time a slap event occurs. If you look closely, you eventually notice the sack person delivering the slap shakes his hand and arm afterward as if shaking off the pain of the impact. It's weirdly subtle. The slappee — the slapped sack person who falls — does a nice pratfall, typically rotating in the air about 270 degrees to land on their face. Each slap lifts the target sack person up off their feet as they spin in the air before falling in a heap. It's a tiny little physics drama/simulation of puppet scale comic violence. And something about it is really funny. As my sons played more game levels, they periodically stopped to slap the other sack person. And on the podium celebrating the winner of each level, they always slapped each other, sneaking up on each other for a sudden backhanded slapstick assault. One got up to take a rest room break and warned his brother, "Don't slap me." But he was hardly out the door before the first slap was delivered. The sack person with no controller stood up again after each slap, only to receive another slap meted out by the mischievious brother, which lead immediately to a fit of rage in the brother who left for a break, who shouted, "I told you not to slap me!" Anyway, it seemed sociologically interesting. 02nov08
¶
courtly knaves
partial arc demo ¶ The new arc demo is about two thirds done. I spent a lot of my writing time creating images instead of writing. I started with a very slow exploratory search for techniques that might work. I eventually made fast enough drawing progress, once I found a plausible way to render btrees in a way I could contrast in perspective views. But it was a slow ramp. btree figures ¶ I stayed up into the wee hours of the night last night drawing figures for the arc demo. Naturally explanations need to be written for full value. But here's a couple:
The (near) perspective view below was constructed by hand, shooting for a common vanishing pointing point for both layers. But the rear blue layer doesn't have a vanishing point matching the red layer's close enough. But it's good enough to get across the right idea. I added the faint background grid specifically because I knew it would help provide context revealing an intent to render perspective. The two graphs would just look distorted without the grid.
|