2006-04-15

The stigmergic user script pattern

In a recent (half-hour-long) podcast, Jon Udell and Steve Burbeck explore multi-cellular behaviour in both a biological and a computer science setting, and what this field could grow into in a world of computing and computer networks. It is a fresh perspective on message passing, emergent behaviour and stigmergic structures: concepts not all that familiar to people from typical engineering backgrounds, but which have been flourishing in biology for some time already.

Nevertheless, it is growing on us too, like mold; we are just not very far into putting it into words, common familiarity and shared frames of reference. The user script is slowly reshaping our understanding of the web into something it was not designed to be but, as many of us have noticed, a very powerful, shape-shifting organism. It's a paradigm shift I believe will evolve the web in a profound way, following in the trail of previous paradigm shifts:

  • The transition from the web as a set of loosely coupled static hypertext documents to server-side, dynamically generated pages

  • The large-scale client adoption of XMLHttpRequest, shifting the data flow of pre-AJAX days to one partly driven from the client side

  • The client-driven orchestration, or remixing, of what the web page does, and how, by way of the user script. This is one of the next steps presently evolving, and useful patterns and practices are slowly forming around it.

While standardization is only now slowly reaching XMLHttpRequest, and has not yet moved toward the user script dimension, we can in a way be thankful that only one of the mainstream browsers has so far offered a manageable user interface to the user script capability, making Greasemonkey a de facto standard in the absence of organized work in the field.

But I'm losing track here; let us get back to the podcast topics. A stigmergic structure is the result of emergent behaviour in a system. Examples from the podcast are Wikipedia and the Linux source code, each evolving over time from the contributions of a large number of people, building something fit for a purpose around some small set of basic principles governing its growth. Or search engines, harnessing the web into a very large data mine, growing larger and more resourceful by the second.

Stigmergy is the method of communication in these systems: one part sends a message to another by modifying their common environment. Editing a Wikipedia page, submitting a Linux patch, or writing a web page and registering it with the search engine (or being picked up by it from external linkage). And it is an emerging, very useful practice in user scripts too.

I recently dropped a reminder in my idea blog repository about finding or devising a gallery browser user script (and Instant Gallery is what came out of it): something to transform any page with image links into a common gallery browsing application customized to my own liking, giving me a user interface I like, can improve on incrementally and perfect, share with friends and the rest of the web, and use pan-web, without any prior cooperation from webmasters anywhere. This is the raw power of the user script.

And it is also where stigmergy comes into play. Not all web pages are made to my own and the W3C's standards, exposing image links as hypertext links that are handled the same way whether your browser was written this millennium or the former, is capable of javascript or not, and so on; some employ the most hideous of ways to render click-to-zoom images. The traditional programming approach would be to teach the gallery program how to handle this disorderly input, producing new versions any time we encounter a new format in need of conquering. But we don't have to.

Recognizing that we have a script that takes image links and makes a gallery from them, we can make a separate script that modifies a standards-defiant web site into a standards-compliant one with proper image links, prior to running the gallery script -- harvest the page for its image links, add them as proper links, and the gallery is set to go. Nothing gets added to the gallery tool, and other people who use other sites than I do can contribute easy plug-in fixes to use them with the script, without having to understand and modify my script or have my help doing it.
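
To make that concrete, here is a rough sketch of what such a fixup script could look like, for a hypothetical site that marks its thumbnails with a thumb class and hides the full-size image behind click-to-zoom scripting; the class name and the thumbnail-to-full-size URL rule are made-up assumptions for the sake of illustration:

// Hypothetical fixup user script: turn a site's click-to-zoom thumbnails
// into plain image links the gallery script understands. The "thumb" class
// and the _thumb URL naming scheme are inventions for this example.
(function()
{
  // collect candidates first, so rewriting the DOM does not disturb iteration
  var imgs = document.getElementsByTagName('img'), thumbs = [], i;
  for( i = 0; i < imgs.length; i++ )
    if( /\bthumb\b/.test( imgs[i].className ) )
      thumbs.push( imgs[i] );

  for( i = 0; i < thumbs.length; i++ )
  {
    var img = thumbs[i];
    var full = img.src.replace( /_thumb(\.\w+)$/, '$1' ); // assumed URL scheme
    var link = document.createElement( 'a' );
    link.href = full;                            // expose a proper image link
    img.parentNode.replaceChild( link, img );
    link.appendChild( img );                     // keep the thumbnail in place
  }
})();

The point is not the particulars, but that the fix lives entirely outside the gallery script.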

The web page, in a user script setting, is a stigmergic structure, where an ecology of user scripts can work together to build new and interesting applications, without the blessing of, or participation from, the hosting web site.

And plain image links in a page are a de facto micro format, specifying a gallery of images. The input fixup scripts can work their magic on a page for the benefit of any other script besides my (and Jos van den Oever's) gallery hack, which takes that micro format and renders a gallery from it. We could just as well plug in another script that uploads a copy of those images to a Flickr account, stores them in Google Base or on some social bookmarks service, or does anything else we might want to do with a bunch of images.
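
A minimal sketch of how a consuming script might harvest that micro format -- it assumes nothing about the page beyond anchors whose href points at an image file; what the script then does with the list (render a gallery, upload the images somewhere) is its own business:

// Harvest the "plain image links" micro format from the current page:
// every anchor whose href looks like it points at an image file.
function harvestImageLinks( doc )
{
  var links = doc.getElementsByTagName('a'), found = [];
  for( var i = 0; i < links.length; i++ )
    if( /\.(jpe?g|png|gif)([?#].*)?$/i.test( links[i].href ) )
      found.push( links[i].href );
  return found;
}

// any consuming script -- gallery, uploader, bookmarking tool -- starts here:
var images = harvestImageLinks( document );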

Making small scripts that bring structure to pages sharing some common traits, and other scripts that pick up on this structure and perform on it, is a very useful pattern from biology that has lots of merit in this computer science setting, too. Specialize, do your thing well, and share the result with other specialized entities who do theirs.

Next I would want to stigmergy-enable GM_xmlhttpRequest calls too, providing the same ecology of specialized helpers for processing page-external resources to the scripts I write that pull together data from many locations into tools like Book Burro. But let's play this one step at a time.
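
I have nothing concrete here yet, but a very rough sketch of the idea might be a thin wrapper around GM_xmlhttpRequest that runs a chain of registered helper functions over a fetched resource before handing it to the caller. The wrapper name and the fixup registry below are inventions of the moment, and I am glossing over how separate scripts would actually share that registry:

// Rough sketch only: a wrapper that lets specialized fixup helpers massage
// a fetched resource before the requesting script ever sees it.
var resourceFixups = [];  // helpers would register functions( url, text ) here

function stigmergicGet( url, onload )
{
  GM_xmlhttpRequest(
  {
    method: 'GET',
    url: url,
    onload: function( response )
    {
      var text = response.responseText;
      for( var i = 0; i < resourceFixups.length; i++ )
        text = resourceFixups[i]( url, text );  // each helper does its thing
      onload( text );
    }
  });
}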

2006-04-08

Removing cruft with FireBug

I had a sneak peek at Ajax Magazine today, and ended up in an article about the XML DOM (Microsoft-centric, basic level, nothing you couldn't piece together yourself from a skim of the available property and function names). Anyway, baffled by the overweight page template (yes, that presently applies to the ecmanaut root page too), I took to writing a little snippet of code to isolate the page content, dropping the cruft with just a bit of wizardry hand-waving at the FireBug command line. This is the code I came up with:
function isolate( node, cb )
{
  // walk up the DOM tree from the node we care about, and at every level
  // remove (or otherwise process) every sibling of the path we came from
  for( var parent; parent = node.parentNode; node = parent )
    for( var c = parent.childNodes, i = c.length-1; i>=0; i-- )
      if( c[i] != node )
        (cb||removeNode)( c[i] );
}

function removeNode( node )
{
  node.parentNode.removeChild(node);
}

function hideNode( node )
{
  if( node.style ) node.style.visibility = 'hidden';
}
It works like this -- pick the node in the page you care about (the post text) with the FireBug Inspect button, hit the FireBug command line and apply the piece of code, which iterates up the DOM tree from that node, removing (or hiding) all sibling nodes, wherein the cruft lieth. If you want to try it (and why not on this page?), you first need to add the methods to the page namespace -- cut and paste the following portion of code into your FireBug command line:

isolate = function( node, cb ){for( var parent; parent = node.parentNode; node = parent )for( var c = parent.childNodes, i = c.length-1; i>=0; i-- )if(c[i] != node)(cb||removeNode)( c[i] );}; removeNode = function( node ) { node.parentNode.removeChild(node); }; hideNode = function( node ){ if(node.style) node.style.visibility = 'hidden'; }

Assuming you use the default FireBug keyboard bindings and don't have a clashing extension like the Web Development Toolbar installed at the same time, this is how you repeat what I did. Press Ctrl+Shift+C and pick out the article node you want to isolate with the mouse (perhaps hitting the up arrow a few times to get enough context around the bits you want), hit return to select the node, then Ctrl+Shift+L to go to the command line, and type isolate($0) to remove all the cruft, or isolate($0, hideNode) to just hide it from view.

Once upon a time I used to do live page hacks like this by typing javascript: URLs into the Location field. FireBug makes hackery like this much less painful. I'm hopeful we will see hacks like these become even easier to share and enjoy, in the comfortable, low-threshold fashion Greasemonkey has set an example for.

Being able to package a small component for reuse is a very powerful property of a good tool. The era of the bookmarklet hack is probably not quite over yet.
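
For the flavour of it, here is a rough sketch of what a bookmarklet variant of the isolate hack might look like, assuming you first select a bit of text inside the part of the page you want to keep (it walks up from the selection, since a bookmarklet has no FireBug $0 to lean on):

javascript:(function(){var n=getSelection().anchorNode;if(!n)return;if(n.nodeType!=1)n=n.parentNode;for(var p;p=n.parentNode;n=p)for(var c=p.childNodes,i=c.length-1;i>=0;i--)if(c[i]!=n)p.removeChild(c[i]);})()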

2006-04-05

Improving page load times

Freshblog wonders if, and how, you can cut down on page load times where external script files are involved. Depending a bit on what those scripts do and how they do it, you can.

If you run multiple constant, feature-frozen scripts (as opposed to evolving, externally hosted scripts maintained by some third party, such as the Labelr code John just retired due to excessive lag), you can join them all into one single script, pasting them together into one file in the order you used to link them in your page, and link to that file instead. This does not apply to bits that need to sit at specific spots in your page template to do their job, for instance if they use document.write() to inject content where they are placed (Google AdSense does this, for instance).

This cut-and-paste job can bring tens of script fetches down to a single HTTP request, which, assuming you put the script on a fast and responsive server, does marvels for page load times by eliminating the dependency on a multitude of different web servers all being quick and responsive in serving your page its script code.

On a related note, if you link to lots of images on various web servers (hotlinking icons from social networking sites, feed readers, trendy eye candy, translation flags and whatnot) from lots of random locations, the same problem applies. If your blog is hosted by Blogger, I would warmly recommend using their quite excellent image hosting facility: download all the imagery you use to your hard disk, upload it once to the Blogger image servers, pick up the URLs it gets and use those URLs in your own template. It's a bit laborious, but your blog visitors will thank you.

Conversely, if your blog template includes very large amounts of inlined CSS code, as Blogger templates often do (since Blogger to date has not offered to host external CSS files for you), you may actually gain load time by ripping out all that code and putting it in an external file that gets loaded once and is cached by the visitor's browser, so consecutive page loads do not have to fetch the same CSS code over and over again.

There are several good page-load measuring tools available on the web, some of which give you useful statistics and overviews of what adds load time to your web pages:

  • Web Page Analyzer shows nice table views of component sizes and minimum load times for modem-bound visitors (and people on slow network connections in general).

  • Loading time checker is another, which lists a summary time and shows your images (though not those linked from CSS, which Web Page Analyzer handles).

  • Stop watch measures how long it takes your own browser to load the page, given your present state of page content and/or component caching.