

This page describes ideas in thorn (the þ runtime). It's mainly about high-level patterns that might not be obvious, as an answer to: "What does þ do?" If you focus on kinds of effects, you see some categories. You might get more context from the why page, which explains what I get out of telling anyone about a þ utils library.

history

     I learned C++ in the late 80's after I'd written C for several thousand hours. In 1990 I took a job using C++ at Apple, and it's been my main language since. (I'd rather have used Smalltalk and Scheme more than I could.) Obviously I learned many things before later standards were invented. I'm sure some ideas would never have come to mind if I had learned C++ recently.

     I saw one of the worst C++ ideas pursued at Taligent in the early 90's: "everything must be an object." (That is: a class with an abstract api and totally hidden state.) This is a terrible idea, since it makes a bottom layer more complex, slow, and involuted. I'll spare you anecdotes. But once I knew having less abstraction was often good, it inspired simpler code:

  • primitive is good

     The smallest degree of abstraction is best, if it works. In other words, none at all is a desired default. So in þ you see lots of naked C showing through: as much as possible, on purpose. You can guess how it looks to folks with C++ religion: it's heresy. And it doesn't win pissing contests based on "my code's gnarly." Gnarly code is exactly the wrong thing.

classes

     Here I'll list ideas related to specific classes or groups of classes in þ, so I have a place to compare how ideas differ in classes with some relation. Such a high level view doesn't really need code specific details.

  • itemize class concepts

     I expect I'll write this incrementally over time, as I document the classes involved elsewhere. (This section might go through a lot of revision until I settle on a long term scheme.)

     [Last update: 09mar08.]

text

     Unfortunately, all this description is just text, and I only mention patterns in þ that might vary a little from other C++ libraries in terms of flavor. A better way to introduce ideas in þ might use animation, simulation, and responsive games. Maybe something like that will come later, using stuff I use þ to build. But those aren't thorn ideas and don't need þ code.

  • include concept overviews

     Writing this sort of concept overview isn't much fun. I'd rather just dive into class descriptions, and shortly I will. But since I hate reading tech material with no global overview, I feel obligated to provide one for þ. For coders seeking briefs: þ is just a C++ class library depending on little besides standard C libraries and a few other random things — like pthreads — that can be avoided if you don't use them. One can use a subset of þ classes to fix minor problems in systems, without dragging in a whole world of cranky new dependencies.

     It's sparse as libraries go, and I'll cover little more than needed to explain a toy language (coming soon to mu) used for driving and testing more of the þ code I'll publish later. Being able to tell what the library does is one of the points.

menu

     thorn: todo, names, fd, iovec, assert, log, run, hex, crc, buf, in, out, quote, escape, compare, file, deck, cow, arc, blob, tree, slice, rand, time, stat, hash, heap, node, primes, page, book, pile, stack, atomic, lock, mutex, thread, map, meter, list, iter, ctype

     (mu: toy, peg, imm, tag, box, symbol, token, number, bigint, class, method, reader, writer, eval, env, vm, gc, world, pcode, compiler, asm, lathe, lisp, smalltalk, design, weight, jar, card, harp, debug, profile)

     Some demos are stubs: todo is a demo guide. See toy for mu updates on language pages; names introduces naming schemes.


Overlapping categories are expected in these ideas, since one can pursue more than one goal at once, and ideas here amount to weakly defined simultaneous goals.

favoritism

     C++ isn't my favorite language. (I can't decide whether Smalltalk or Lisp is.) But it's what I use professionally since most often my job is making something go as fast as possible, as safely and efficiently as possible. Usually code starts out in C++ and that's it — we don't fix what's not broken.

  • results trump preference

     I'd be glad to use someone else's library for this sort of thing (go fast with less memory), but I sometimes think of a way to whip other choices, so my hand is forced. I think I could use something else for non-critical parts. But it would need to look simpler and more cost effective than solely C++.

visibility

     I err on the side of making everything public. So some classes look like naked C structs with methods. Each data member is public in simple data types when I can imagine some use wanting direct access without getting confused. I make members private when only bad things happen without privacy, or when someone using the type correctly can't tell whether bad things might happen.

  • public simple root types

     You can always embed simple public types inside more complex objects that strip away public visibility. Privacy is like salt: it's easy to add and hard to take away.

     You should never expose public access to refcounts. You'll be lucky if the count ever comes out right anyway. Refcounts are canonical examples of things you can't expose safely. You also can't reveal puzzling or unpredictable data structures.

     As a rule of thumb: if you can't define what something means (to someone else using little attention) or how a value is used publicly in safety, then make it private. Otherwise stay public until you get burned.
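     A minimal sketch of the salt rule, assuming C++11. The names Span and Counted are hypothetical and are not þ classes: the simple root type is all public, and a wrapper adds privacy later by embedding it next to a refcount nobody else can touch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical simple root type: every member is public because each one
// has an obvious meaning and direct access is safe.
struct Span {
    const char* ptr;   // first byte (not owned)
    std::size_t len;   // number of bytes

    bool empty() const { return len == 0; }
};

// Hypothetical wrapper: embeds the public type privately, so privacy is
// added on top. The refcount is never exposed directly.
class Counted {
public:
    explicit Counted(Span s) : span_(s), refs_(1) {}

    void acquire() { ++refs_; }

    // Returns true when the last reference is gone.
    bool release() {
        assert(refs_ > 0);
        return --refs_ == 0;
    }

    std::size_t size() const { return span_.len; }

private:
    Span     span_;   // public type, hidden by embedding
    unsigned refs_;   // callers cannot read or write the count
};
```

     Going the other way (making refs_ public after code depends on it being hidden) is the part that's hard to undo.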

layers

     I only layer for a purpose. Otherwise I want visibility for as great a distance as possible. Opaque firewalls are a nuisance, most often to the developer writing code. They are like tying your shoelaces to every nearby object each time you stop moving. Guess what happens when you want to walk?

  • layer less

     A good reason for layers is separation of responsibility, when visibility across a boundary would cause trouble or interfere with one side's right to rearrange code. But it comes at a cost: the ignorant side loses affordances to see and optimize.

mechanism vs policy

     When splitting code into manageable chunks, put mechanism on the bottom side and policy on the top side. Bottom-up code makes many things possible, but doesn't decide. Top-down code decides policy, but doesn't say exactly how to do it. Layers in call structure should push doing lower and deciding higher.

  • broad-base narrow-top pyramids

     Bottoms must provide many mechanisms, allow anything, and have no goals. Tops must provide specific entry points aiming at goals and purposes, decide how to invoke features, and have no mechanism constraints.
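     One way to picture the split, as a sketch only. Buf, buf_put, and append_record are hypothetical names invented for this illustration, not þ code. The bottom functions do what they are told and report what happened; the top function has one goal and makes all the decisions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Mechanism (bottom): a fixed buffer that copies bytes where it is told.
// It has no opinion about what to do when the buffer is full.
struct Buf {
    char        bytes[16];
    std::size_t used;
};

inline std::size_t buf_room(const Buf& b) { return sizeof(b.bytes) - b.used; }

inline std::size_t buf_put(Buf& b, const char* src, std::size_t n) {
    std::size_t fit = n < buf_room(b) ? n : buf_room(b);
    std::memcpy(b.bytes + b.used, src, fit);
    b.used += fit;
    return fit;   // caller decides what a short write means
}

inline void buf_clear(Buf& b) { b.used = 0; }

// Policy (top): one specific goal. The decision made here is "never split
// a record: if it does not fit, start over with an empty buffer".
// Returns false only when the record can never fit.
inline bool append_record(Buf& b, const char* rec) {
    std::size_t n = std::strlen(rec);
    if (n > sizeof(b.bytes)) return false;   // policy: reject oversize
    if (n > buf_room(b)) buf_clear(b);       // policy: drop old content
    return buf_put(b, rec, n) == n;
}
```

     A different top layer could reuse the same bottom with an opposite policy (say, truncate instead of drop) without touching buf_put at all.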

separation of concerns

     One can separate issues in interfaces that needn't be tied together, and one should separate concerns whenever tying unrelated semantics together causes hardship — or makes usage awkward, inefficient, or unstable. (You can just keep adding bad effects of coupling here; there's more.)

  • separate location and ownership

     One aim in þ separates memory management (ownership) from ability to use data wherever it happens to live (location). You can read and write bytes without any idea how long they live, as long as it's longer than a current operation. So some þ classes represent data without any idea who created it, or who owns it, or when it changes hands.

     In C++ libraries it's quite common to have objects automatically create storage for content, so api users needn't bother themselves with memory management. Storage is a nuisance, you see. But this typically has an effect of forcing data copies, and creating competition over who owns a canonical copy of data. It's a source of major costs or problems.

     It's useful to have raw interfaces that don't copy, able to refer to state managed by someone else. Among other things, this sort of code is exactly the way an owner wants to refer to state it owns. And it permits easy handling of state changes in the middle of transitions from one valid configuration to another.

     Referring to data does not always imply ownership. Knowing where something is — right now anyway — doesn't mean you know other things. Relax and let code be simpler.
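     A small sketch of location without ownership. The Slice here is a hypothetical stand-in written for this page, and I make no claim it matches the þ class of a similar name. It knows where bytes are and how many; it never allocates, copies, or frees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical non-owning view: knows where bytes are, not who owns them.
struct Slice {
    char*       ptr;
    std::size_t len;
};

// Works on bytes wherever they live; allocates and copies nothing.
inline std::size_t count_byte(Slice s, char c) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.len; ++i)
        if (s.ptr[i] == c) ++n;
    return n;
}

// Writes in place; the caller guarantees the bytes outlive the call.
inline void fill(Slice s, char c) { std::memset(s.ptr, c, s.len); }

// Sub-range of the same storage, still with no ownership.
inline Slice sub(Slice s, std::size_t off, std::size_t n) {
    assert(off <= s.len && n <= s.len - off);
    return Slice{s.ptr + off, n};
}
```

     The same functions work on stack arrays, heap blocks, or a mapped file, because the only promise they need is that the bytes last longer than the call.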

inheritance

     Inheritance is good for very small focused hierarchies, with little framework. Every technique has a cost, including inheritance (it increases complexity and dependencies). So I avoid it unless obvious benefit occurs. I prefer it to templates for generics as a more explicit, less complex approach. I use multiple inheritance at times, though I don't like it — usually to pursue more primitive structures. (If an object will go in N different well-known lists, I indirectly inherit a base link class N times for intrinsic links; it's creepy but effective.)
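     The intrinsic link trick might look roughly like this, assuming C++11. Every name below (Link, ReadyLink, TimerLink, Task, List) is hypothetical. Each empty intermediate class makes one inherited copy of Link distinct, so one object can sit in two lists at once with no per-node allocation.

```cpp
#include <cassert>
#include <cstddef>

// Base link: one intrusive "next" pointer.
struct Link {
    Link* next = nullptr;
};

// One empty class per well-known list keeps each inherited Link distinct.
struct ReadyLink : Link {};
struct TimerLink : Link {};

// The object carries two intrinsic links, inherited indirectly.
struct Task : ReadyLink, TimerLink {
    int id;
    explicit Task(int i) : id(i) {}
};

// Minimal singly linked stack over plain Link nodes.
struct List {
    Link* head = nullptr;
    void push(Link* l) { l->next = head; head = l; }
    Link* pop() {
        Link* l = head;
        if (l) head = l->next;
        return l;
    }
};

// Recover the Task from a link; the cast must name the right base,
// since a plain Task-to-Link conversion is ambiguous.
inline Task* from_ready(Link* l) {
    return static_cast<Task*>(static_cast<ReadyLink*>(l));
}
inline Task* from_timer(Link* l) {
    return static_cast<Task*>(static_cast<TimerLink*>(l));
}
```

     To push, name the base explicitly, for example ready.push(static_cast<ReadyLink*>(&task)). The creepy part is that nothing stops you from recovering a node with the wrong cast.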

  • few virtual interfaces

     It's tempting to factor virtual APIs cleverly, but don't. Large class hierarchies are a burden. Note documentation is a burden, too, since it obligates a library user to read. Since docs are required and good, a library should be designed for small docs.

  • smaller is better

     Instead of adding optional code you won't call now, add the idea to docs, which indirectly explains (by contrast) how the code that's already done works. It's impossible to get a balance right between less and more, but prioritize for "less."

testing

     A good way to test a library is using someone else's library with similar semantics. For this reason, it's a good idea for þ to use STL and any other library in unit tests to create state that should mirror what is stored in þ collections, so the two versions don't share the very same code.

  • compare equivalent systems

     In fact, this is one of the things you can use þ to do: see if some other library generates the same answers you expect from a þ version. But using STL as a yardstick is easier, since in unit tests one is often not very concerned about memory usage, and you don't really care about the effects of STL doing whatever it wants. The same idea also applies to using high level languages to verify some things that won't need a high level language in a deployed runtime. Just throw resources at it.
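     A sketch of that kind of test. SmallSet is a hypothetical stand-in for a collection under test (it is not a þ class), and std::set plays the yardstick. Both get the same deterministic inputs, and every answer is compared.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Stand-in for a collection under test: a small sorted array set.
struct SmallSet {
    int         items[64];
    std::size_t count = 0;

    // Index of the first element >= v (binary search).
    std::size_t lower(int v) const {
        std::size_t lo = 0, hi = count;
        while (lo < hi) {
            std::size_t mid = lo + (hi - lo) / 2;
            if (items[mid] < v) lo = mid + 1; else hi = mid;
        }
        return lo;
    }
    bool has(int v) const {
        std::size_t i = lower(v);
        return i < count && items[i] == v;
    }
    // Returns false for a duplicate or when the array is full.
    bool add(int v) {
        std::size_t i = lower(v);
        if ((i < count && items[i] == v) || count == 64) return false;
        for (std::size_t j = count; j > i; --j) items[j] = items[j - 1];
        items[i] = v;
        ++count;
        return true;
    }
};

// Drive both containers with the same inputs and compare every answer.
inline bool matches_std_set(unsigned seed, int steps) {
    SmallSet mine;
    std::set<int> ref;
    for (int s = 0; s < steps; ++s) {
        seed = seed * 1103515245u + 12345u;   // tiny LCG, deterministic
        int v = static_cast<int>((seed >> 16) % 50);
        bool a = mine.add(v);
        bool b = ref.insert(v).second;
        if (a != b || mine.count != ref.size()) return false;
    }
    std::size_t i = 0;
    for (int v : ref)                         // same order, same content
        if (mine.items[i++] != v) return false;
    return true;
}
```

     The yardstick is allowed to be slow and hungry. Values stay under 50 here so the fixed array never fills and the two sets should agree at every step.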