XPath bookmarks

The web has no good way of bookmarking an arbitrary spot in a web page. With some help from the page author, we can bookmark a specific anchor or node id in the page, but most spots are still out of reach for bookmarks. I just tossed up a little user script that lets you bookmark any node in the page that an XPath query can address. It's mostly for XPath power users, for now, but it works well (and lets you load bookmarks made with that technique, which you might have been given by such people).

The source code (install from here) is extremely short:
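var path, node, hash = decodeURIComponent( location.hash||'' );
if( (path = /^#xpath:(.+)/.exec( hash )) &&
    (node = $X( path[1] )) )
  node.scrollIntoView(); // assumed action: this statement was lost from the listing

// return the first node matching the XPath expression, or null
function $X( xpath ) {
  return document.evaluate( xpath, document, null, 0, null ).iterateNext();
}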
Having installed that, you can load bookmarks like http://tibet.dharmakara.net/TibetABC.html#xpath:/html/body/h2[3], and get zoomed in to the right part of the page immediately (here, the part showing how the Tibetan numbers are spelled, what they look like, and approximately how to pronounce them, for us westerners).
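For the curious, here is a minimal sketch of the two halves of the trick as plain functions: reading the XPath out of a fragment, and building a bookmarkable URL from an XPath. The helper names xpathFromHash and xpathBookmark and the example.org URL are my own inventions for illustration, not part of the installed script.

```javascript
// Hypothetical helpers, not part of the original user script.

// Extract the XPath expression from a "#xpath:..." fragment, or null if absent.
function xpathFromHash(hash) {
  var match = /^#xpath:(.+)/.exec(decodeURIComponent(hash || ''));
  return match ? match[1] : null;
}

// Build a bookmarkable URL from a page URL and an XPath expression.
function xpathBookmark(pageUrl, xpath) {
  var base = String(pageUrl).split('#')[0]; // drop any existing fragment
  return base + '#xpath:' + encodeURI(xpath); // escape spaces, quotes, brackets
}

// Usage (made-up URL):
//   xpathBookmark('http://example.org/page.html', '/html/body/h2[3]')
```

Escaping on the way out and decoding on the way in keeps XPaths with spaces or quotes in them intact across the round trip.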

Scope user scripts to HTML pages

Most user scripts, and especially user scripts that write to the DOM (injecting a user interface of some sort, for instance) should, but don't, start with this line of code:
if( !(document.contentType||'').match( /html/i ) ) return;
Which means what? Well, it ascertains that the page loaded is HTML, which people tend to take for granted, but which is not the case for all pages on the web. In particular, it is not the case for text/plain pages, by the broad masses more commonly known as *.txt. In Firefox, text documents get rendered much like HTML pages, but wrapped in a <pre> element.

Saving the document will remove this "HTML enclosure", but if your script injected some other junk, like an interface of some sort with some text, for instance, the saved page will contain that too. This is probably not what you wanted. It is at the very least certainly not what unsuspecting users of the script wanted.
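If you would rather not rely on a bare return at the top of the script, the same guard can live inside a function. This is only a sketch: isHtmlType and main are hypothetical names, and the document is passed in as a parameter so the check can be tried outside a browser.

```javascript
// Hypothetical helper: true when a MIME type names some flavour of HTML.
function isHtmlType(contentType) {
  return /html/i.test(contentType || '');
}

// Hypothetical script body; returns false when it declined to run.
function main(doc) {
  if (!isHtmlType(doc.contentType)) return false; // leave text/plain and friends alone
  // ... inject the user interface here ...
  return true;
}

// Only run when there actually is a document (that is, in a browser).
if (typeof document !== 'undefined') main(document);
```

Note that the regexp lets application/xhtml+xml through as well as text/html, since both contain "html".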

Edit: a modern GM+Firefox combination gets to run on XHTML pages too. (This has not always been the case. Thanks for the correction, Rod McGuire!)

But please do decorate your scripts with the above line. It's royalty and patent free software with an irrevocable, DRM free license for all time. Public domain, at its best, working for you. Cheers!


Make low-tech people publishers with EditGrid

In the real world, I regularly sing (tenor) in my local choir. Choirs have some boring administrative burdens, like keeping track of what sheet music is being sung now, and what should be returned to the choir library. In our choir, it's manual labour handled by each member and coordinated by a clerk we elect every season.

Anyway, that person is typically more neat than technical. I crafted some help tooling for her (as administrator) that gives her a minimum-maintenance publishing system, to show us, right on our internal home page, what sheets we should have and what to return. She does not need to handle messy web tech, our web page needed no server side hacks, and all she does is edit what looks and feels like her old Excel document she used to keep our sheet music in, but at EditGrid:

I added another sheet for her, stating the out/in dates (to the precision she wants) and ids of the songs we sing (for her own reference, she adds some additional details; names and composers, typically):

This EditGrid spreadsheet is open for public browsing. Firefox+Firebug users beware: unless you turn off Firebug for that domain, on visiting an EditGrid spreadsheet, your whole Firefox session (all tabs of all open windows) will freeze beyond salvation (due to issues with Firebug's XmlHttpRequest monitor, if I remember correctly). To avoid that issue, first go to the front page and right-click the Firebug icon, selecting "Disable Firebug for www.editgrid.com".

Then it's safe proceeding to the spreadsheet itself. The administrator of course has an account with edit rights to the data; you will only be able to browse it.

As mentioned in a previous post, EditGrid can export data in almost any format you want, including JSON, if not out of the box yet (EditGrid devs: even if you don't support native JSONP yet, it would be helpful if you let users share their xsl transforms, so it gets easier to copy recipes such as this one). As I had already set up my account with a nice JSON data format exporter, I opted to reuse that (with the future option to make an Exhibit interface for the data set of what we sang when and the like, without any fuss).

With my data format (Exhibit JSON, actually), the file looks a little like this:

{"type":"Noter","löpnr":"M01","titel":"Beati sunt",...},
{"type":"Noter","löpnr":"M02","titel":"Partitur, Mariamusik",...},
{"type":"Ut/In","ut":"2007-","titel":"Sånger från Taizé",...},
{"type":"Ut/In","ut":"2005-","titel":"Himlen & jorden sjuder",...},

Then I wrote up some javascript for our web page that imports the data, listing works we should have and should return, depending on whether today's date is in the given date range or not. Notes past their due back date show up in the latter list, and notes that have not yet been handed out to us are not listed at all. There are some extra features for listing notes without any dates at all (meaning "you should have them, but I don't remember since how long back") and so on.
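The sorting rules described above boil down to something like the following sketch. It is not the code running on our page: the field names out and back, the full ISO date strings and the dates themselves are assumptions made up for the example (the real sheet stores its dates differently, and to whatever precision the clerk likes).

```javascript
// Assumed item shape (hypothetical): { titel, out, back }, where out and back
// are optional "YYYY-MM-DD" strings. Such strings compare correctly as text.
function classifyNotes(items, today) {
  var have = [], giveBack = [];
  items.forEach(function (item) {
    if (item.out && today < item.out) return; // not handed out yet: list nowhere
    if (item.back && today > item.back) giveBack.push(item); // past its due back date
    else have.push(item); // in range, open-ended, or without any dates at all
  });
  return { have: have, giveBack: giveBack };
}

// Usage, with invented dates:
var lists = classifyNotes([
  { titel: 'Beati sunt' },
  { titel: 'Sånger från Taizé', out: '2007-01-01' },
  { titel: 'Himlen & jorden sjuder', out: '2005-01-01', back: '2006-12-31' },
  { titel: 'Partitur, Mariamusik', out: '2008-01-01' }
], '2007-06-01');
```

Here lists.have ends up holding the first two works and lists.giveBack the third, while the fourth is left out of both.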

The result doesn't look like much, but it does the job very nicely. See the source code, liberated of blog template cruft, separately. (The bit at the end adding a timestamp to the URL is needed to prevent your browser from over-caching the data; EditGrid does not yet seem to send proper HTTP headers about content modification.)
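The timestamp trick can be sketched roughly like this. The function name withTimestamp, the parameter name t and the example.org URL are arbitrary choices for the example; any query parameter the server ignores would do.

```javascript
// Append the current time to a URL so every load looks like a new address
// to the browser cache. `now` is optional and mostly there for testing.
function withTimestamp(url, now) {
  var stamp = (now || new Date()).getTime(); // milliseconds since 1970
  var separator = url.indexOf('?') === -1 ? '?' : '&';
  return url + separator + 't=' + stamp;
}

// Usage (made-up URL):
//   script.src = withTimestamp('http://example.org/data.json');
```

The cost is that nothing is ever served from cache, which is fine for a small data file but worth keeping in mind for anything bigger.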

...What we're singing now? See for yourselves (second list folded by default, to conserve space):


XPath shorthands $x and $X

One good thing about work is I find it a lot easier to be rational about building (or buying) the best tools for the job. I've known for over a year that I should have the power tools I equip almost all my Greasemonkey scripts with in the Firebug console too, but never got further than requesting the feature, at some time, ages ago. Today I extended Firebug's $x(xpath) to handle $x(xpath, contextNode) too (catering for relative XPaths) and to return strings, numbers or booleans, when the expression resulted in such output.

$X is a variant on $x, which returns a single node rather than an array when the result is a node set: you get the first match, in document order. You'd be surprised how comfy and useful that is. These tools cut down the user script development feedback loop overhead for me quite noticeably. They also work even in a framed environment (when you pass a context node), which the former Firebug $x did not.
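For reference, the pair can be sketched roughly as follows. This is my reconstruction of the idea for use in a user script, not the patch submitted to Firebug; the numeric constants are the standard XPathResult type codes, spelled out so the sketch does not depend on a global XPathResult.

```javascript
// $x: evaluate an XPath, optionally relative to a context node.
// Returns a number, string or boolean for such expressions, else an array of nodes.
function $x(xpath, root) {
  var doc = root ? (root.ownerDocument || root) : document; // works inside frames too
  var got = doc.evaluate(xpath, root || doc, null, 0, null); // 0 = XPathResult.ANY_TYPE
  switch (got.resultType) {
    case 1: return got.numberValue;  // XPathResult.NUMBER_TYPE
    case 2: return got.stringValue;  // XPathResult.STRING_TYPE
    case 3: return got.booleanValue; // XPathResult.BOOLEAN_TYPE
  }
  // Node set: with ANY_TYPE this is an "unordered" iterator as far as the spec
  // goes, so document order is an assumption about the browser here.
  var nodes = [], node;
  while ((node = got.iterateNext())) nodes.push(node);
  return nodes;
}

// $X: like $x, but a node set is reduced to its first node (undefined if empty).
function $X(xpath, root) {
  var got = $x(xpath, root);
  return got instanceof Array ? got[0] : got;
}
```

Taking the document from the context node, rather than always using the global one, is what makes relative queries inside frames work.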

After patching up the build script a bit until it worked (so I could test it out), I submitted the patches to the Firebug list, so they might end up in the upstream 1.0.6 build as well.

Relative XPaths are very useful for answering lots of intricate questions about web pages that you'd otherwise have to work on for quite a while with the DOM inspector or Firebug's HTML view to figure out. If, like me, you like to bookmark comments you've written on some web page, you ideally want to find the closest anchor before the place where your comment showed up. Good blogs have easily clickable links around the comment to help you do that; others, like Ajaxian, don't.

With this hack, you find the closest preceding anchor by invoking Firebug's Inspector, picking out your comment, opening the Firebug command line (Command or Control + Shift + L), and typing, for instance,

node = $X('preceding::*[@name or @id][1]', $0); location.hash = '#' + (node.id||node.name);

Then it's just a matter of bookmarking as usual. Since Firebug's console features are not exposed to the page, you unfortunately cannot make a bookmarklet out of it, but it's nevertheless a good example of what you can do with this.


Trac(k)ing svn repositories

I had been dragging my feet for a while, hoping the Simile subversion repository would get a nice web based Trac timeline over commits and change sets, if I asked nicely and waited patiently. That very often helps with open source projects (Edit: this time too, eventually), but not always.

Last weekend I took to setting up a local mirror of the repository to set up my own, local, Trac, just to get that timeline/changeset browser combination. I find it indispensable for software development with more than one developer (and a very useful tool, even when you're on your own).

Thanks to SVK (a rather mature Perl concoction running atop the subversion filesystem and remote access layers -- see the svk book for more info, for instance), it is actually rather comfortable to set up your local subversion mirror of a remote repository, whether your own or someone else's. It even saves you some disk space compared to a common subversion repository.

Anyway, to do that I first created a vc user (for "version control", for lack of imagination) in System Preferences, as I wanted to avoid mixing up my own user's SVK depot (kept in ~/.svk) with the repository I wanted to get Trac coverage for. Running as that user, here are the steps I needed to take to get this running on my (fink enabled) macbook:
# installing the packages:
sudo fink install trac-py24
sudo fink install svk

# setting up the local repository mirror and syncing up:
svk mirror https://simile.mit.edu/repository //mirror/simile
svk sync //mirror/simile

# pointing Trac at it
mkdir ~/trac
trac-admin ~/trac/simile initenv
Then I got to answer some questions: name my Trac instance Simile, point it to the repository root /Users/vc/.svk/local, and say that it is an svn repository. After that I got pointed to the config file ~/trac/simile/conf/trac.ini, which needed some editing.

If you want full commit messages in the timeline (I do), make sure you keep wiki_format_messages in the [changeset] section and changeset_long_messages under [timeline] both set to true. You'd think these options are orthogonal from their names, but they are not; the latter is happily ignored if you turn off wikiml. So even if, like in my case, commit messages aren't wikiml markup tied to the wiki and issue tracker inside this Trac instance, it's either pretend it is, or get truncated commit messages.

Starting it is done with tracd --port 8000 ~/trac/simile (or something more permanent, by way of Apache or similar), and you can browse at http://localhost:8000/, once you have come this far. If you want some more linkage and fewer round trips between views, feel free to tuck in my Trac Timeline and Trac Changeset improver Greasemonkey scripts, crafted for this particular purpose.

Does anyone know of any Trac plugins to export changeset data as JSON / JSONP? I have some plans for equipping the setup with a facet browser and some commit message searchability, and it would be a great help if I didn't have to write them myself. Even better would be getting that installed at DevjaVu, so lots of other people could benefit from the same hack. Including my own repositories. (Which I could arguably mirror, but where is the fun or elegance in that?)