October 20, 2009

Web shortcomings and opportunities

There is a very interesting post itemizing some of the perceived shortcomings of the web over at the SpaceCollective.

Some of those problems have been addressed by researchers in the past - notable examples are Freeman and Gelernter's Lifestreams, and Nelson's transclusion (I assume this is what the author of the SpaceCollective article means by the term collage; in any case you can start using a limited form of transclusion today).

Some of those problems are being addressed on an ad-hoc basis by commercial companies today. If you take some time to think about the problems, there is still a wide open field of opportunity to provide commercial offerings around the ideas there.

October 18, 2009

The history of programming language syntax

http://news.bbc.co.uk/2/hi/technology/8306631.stm

Only if all the other "programming language designers" had the conscience to apologize for the pain their thoughtlessness has caused.

October 15, 2009

Computers are not getting faster

When we discuss the inclusion of a feature in Lua, we ask ourselves, "OK, but will it run in a microwave oven?"
--Luiz Henrique de Figueiredo

I recently started working with John Fremlin's TPD2 HTTP server, one of the fastest web servers currently available. TPD2 uses a lot of interesting (at least from a Lisp point of view) techniques to achieve high performance. This reminded me of something I noticed in Masterminds of Programming: both Chuck Moore and the creators of Lua emphasized the tendency of computers to scale down.

When laptops and servers are getting faster processors with larger caches and more cores (which doesn't really help because we don't know even know how to think about writing software for them), it's easy to overlook the fact that we're trying to imbue ever smaller objects with computational intelligence. Moore provided the extreme example of a chip having cores with 64 words of RAM and 64 words of ROM memory each. A less extreme example comes in the form of smartphones.

At the same time use of existing systems is increasing to the point where some people are measuring how much servers they buy by the number of tons of waste their packaging generates. If each of your servers could handle three times as many requests per second, you could fit it on a computer three times as small, or have a datacenter a third of the size. The latter is a pragmatic consideration applicable at any time, but the former enables entirely new patterns of use and interaction.

October 13, 2009

The impossibility of global state

(This post is a write-up of some ideas about concurrency that I've briefly discussed before).

By far the bulk of concurrency practice is aimed at dealing with systems based around shared, mutable state. Participants are asked to imagine a world with a global state whose changes are instantaneously observable by everyone. This world is a physical impossibility.

The global state illusion is based on elaborate consensus mechanisms. The concept only works because our perception of the flow of time is fixed vis-a-vis the rest of the world, which happens to be just quick enough to execute consensus algorithms over large networks that we don't get annoyed.

The common refrain to criticisms of the global state illusion is "but think of the bankers!" The consistent bank account with atomic deposits and withdrawals is an example found in almost all textbooks on concurrency. It is a neat and self-contained pedagogical tool for discussing the mechanisms needed to construct a global state illusion, but unfortunately its pervasive use seems to have done immense damage to any attempts to actually reason about concurrent systems.

Networked credit banking existed in 4th century BC Egypt. Banks still accept checks. Your bank account still has overdraft fees. Bank accounts work by reifying transactions as first-class objects. That this idea is considered novel is a testament to the pervasive confusion surrounding concurrency today.

What is the global state illusion really needed for? Distributed shared simulations. Most commonly seen in the guise of videogames. There, all the participants really do need to see a consistent view of the world with changes occurring in quantized time steps.

The key insight behind the global state illusion is that it is not time-scale independent. Whether the illusion can be maintained depends on how the participants perceive the flow of time relative to how quickly the mechanisms behind the illusion work. Banks could have implemented a pen-and-paper two-phase commit protocol for deposits and withdrawals before the advent of telecommunications. It would not have been very useful. This is the same exact issue affecting cache-coherency protocols in multi-core processors, at the micrometer scale.

The way forward in concurrency practice is the abandonment of the global state illusion in favor of constructing systems based on how concurrency works in the real world, taking cues and patterns from organizational and biological systems. This will entail the use of techniques from functional programming and linear logic.

October 10, 2009

Append-only databases

Two years ago, Pat Helland wrote Accountants Don't Use Erasers, which resonated widely in the blogosphere. The central premise, append-only databases, offered a glimpse of the promises of purely functional programming. The idea itself is probably as old as transactional databases, however it has been significantly delayed in practical realization due to the nature of storage hardware.

Append-only filesystems have been around for some time for hard-drives, used mainly for archival storage (the Plan9 Venti filesystem being the most widely known example). Append-only database implementations have been held back by the high seek latency of mechanical drives. This is changing as high-capacity flash drives are becoming available.

Over the summer, Slava Akhmechet (of Weblocks fame), along with Michael Glukhovsky and Leif Walsh, started RethinkDB, a project to build a new append-only MySQL backend. Aside from the performance improvement, RethinkDB manages to solve some glaring MySQL DB management problems: hot backups (without special tools) and fast failure recovery.

The possibilities for flash storage to enable significantly greater database performance were previously noted in several papers (see Lee et alia and Graefe; even the late Jim Gray weighed in on flash drives).