2006-01-31

Bookmarklet tool: Find links to you!

If you are the least bit interested in who links to you, and use some service that shows you the HTTP referrers of visitors arriving via inbound links from elsewhere, or watch trackbacks from other sites, and so on, you have most likely encountered this problem when visiting the referring page:

Where is that link to me?

I figured I'd make a quick bookmarklet to scroll to the exact spot in a page where the link is, or, when invoked again, to zoom further down the page to the next link, if there are any more.

It turned out quite okay, and as it wasn't much work making it customizable, I went that extra bit to make it a tool useful to almost anybody. Edit: as it eventually turned out, that is just almost true; I took a full hour to polish it up to work better than my original quick and dirty hack. It also illustrates a few good bookmarklet making techniques (more thorough descriptions of a few of these are presented at gazingus.org):

  • Encasing the script in an anonymous function casing, so it doesn't leak any variable or function names to the page you invoke it on, and won't upset any scripts running on the target page.

  • Creating support functions inside this function casing.

  • Using var to define function local variables.

  • Using the void operator to throw away the return value of a function, so the target page isn't replaced with a document with the value your bookmarklet produced.
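
A minimal skeleton illustrating those techniques (just an illustration, not one of the actual tools below) could look like this:

javascript:void(function()
{
  function report( msg )  // support function, invisible to the target page
  {
    alert( msg );
  }
  var count = document.links.length;  // var keeps count local to the casing
  report( 'This page has ' + count + ' links.' );
})()
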
I made two variants; the first bookmarklet prompts for a domain name (regexp) to match against all the links in the page, the second matches a fixed domain without ever prompting.

Click either of the configure buttons below to set up both scripts to the domain of your preference before bookmarking them. You can click either button again and again to reconfigure the links in the page and bookmark the resulting scripts, if you have a whole slew of sites you want bookmarklets for. For the script that prompts for a domain, this sets up the default suggestion; for the other one, it sets which links it will look for.

Regardless of how you configure the scripts, the test will only be performed against the domain name of links, case insensitively (as domain names are not case sensitive) -- if you want to change that, you should be advanced enough to be able to tweak the script to your liking on your own.

The first script, which prompts for a domain name regexp:
javascript:void(function()
{
  function Y( n )
  {
    var y = n.offsetTop;
    while( n = n.offsetParent )
      y += n.offsetTop;
    return y;
  }
  var l = document.links, i, u, y, o = [];
  if( u = prompt( 'Find links to what domain? (regexp)',
                  '^ecmanaut\.blogspot\.com$' ) )
  {
    u = new RegExp( u, 'i' );
    for( i = 0; i<l.length; i++ )
      if( l[i].host.match( u ) )
        o.push( Y(l[i]) );
    o.sort( function( a, b ){ return a - b; } );
    for( i = 0; i<o.length; i++ )
      if( (y = o[i]) > pageYOffset )
        return scrollTo( 0, y );
    alert('No more links found.');
  }
})()
The second script, which does not prompt for the domain matcher regexp:
javascript:void(function()
{
  function Y( n )
  {
    var y = n.offsetTop;
    while( n = n.offsetParent )
      y += n.offsetTop;
    return y;
  }
  var l = document.links, i, y, o = [],
      u = /^ecmanaut\.blogspot\.com$/i;
  for( i = 0; i<l.length; i++ )
    if( l[i].host.match( u ) )
      o.push( Y(l[i]) );
  o.sort( function( a, b ){ return a - b; } );
  for( i = 0; i<o.length; i++ )
    if( (y = o[i]) > pageYOffset )
      return scrollTo( 0, y );
  alert('No more links found.');
})()
If you want to, you can try clicking either script rather than bookmarking them, to go chasing around this page for links to places so you get a feel for how they work. The default setup prior to customization will look for links staying on this blog.

Things I (re)learned (or remembered a bit late, depending on how you see it) on writing the above scripts:

  • Chasing through document.links processes links in document order (DOM tree order), which is not to be confused with the top-down order of the fully laid-out page.

  • Array.prototype.sort() sorts elements as strings by default, which sorts the array [0,3,6,17,4711] as [0,17,3,4711,6] -- quite possibly not what you wanted. To sort by numeric value instead, assuming the array only contains numbers, provide a comparison function function( a, b ){ return a-b; } to the sort method; for example:
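
[0, 3, 6, 17, 4711].sort();                                    // yields [0, 17, 3, 4711, 6]
[0, 3, 6, 17, 4711].sort( function( a, b ){ return a - b; } ); // yields [0, 3, 6, 17, 4711]
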
So those of you who saw the post within the first half hour or so of my posting it might want to pick up the scripts again. (I opted not to clutter them up with change markers, to keep the code readable.)

If you use Firefox (or another Mozilla derivative) I recommend editing the bookmarks you save to give each a keyword ("links-to", for instance). That way, you won't have to put it on some panel (or remember where in your maze of bookmarks you put it) to access it when the need arises; just type links-to in the address field and hit return, and the script will be run (and pulled into the address field, should you want to edit it afterwards). This is a very useful technique for keeping lots and lots of tools (not to mention sites) easily accessible, if you, like me, find it easier to recall names than to traverse menu structures. Just hit Ctrl+L, type the script name and hit return. Quick and easy, and it even works in full-screen presentation mode when you have no menus or toolbars visible.

This is unfortunately as close to access keys for bookmarks as you get in the Mozilla browser family; go vote for bug 47199 now if you too want to put that feature on the development agenda. It was filed in the year 2000, has an embarrassing seven votes, and has not seen much action since. Imagine having any site or handy tool like this a keypress away. You know you want it. So off you go; vote away! Be heard.

JSONP: Why? How?

Codedread took me up on a few points worth discussing in relation to my previous post about JSONP:

  • Isn't the feed consumer very susceptible to incompatible changes in the source feed?

    Yes; naming issues are the same for JSON as for any XML dialect: the day your data provider breaks backwards compatibility with the format they previously committed to, your application breaks. It can be argued that widely adopted schemas such as RSS and Atom are safer bets to write code for, but in my opinion it's a bit of a moot point. As soon as you use somebody else's data, you are at their mercy to keep making it available to you in whatever format they picked. Content providers are still kings of their reign. When Del.icio.us has downtime (and doesn't care enough about its JSON feed consumers to degrade gracefully, still providing a valid, though empty, JSON feed), your application breaks.

  • How do you ensure that the JSONP feeds don’t embed bad code with side effects you did not opt in on?

    If you live solely client side, having just the provisions of the browser sandbox in unprivileged mode, as is the common case when JSONP is interesting at all: you can't. You are at the mercy of your feed provider's better judgement. Should you discover that they break the "contract" in passing code rather than data your way, wreak havoc and give them the devastating publicity they deserve. Pick your feed with the same attention to the trustworthiness of the source as when you pick your food. Don't eat it if you fear poisoned food. Or feeds.

    If you can't trust your feed source not to send code with unwanted side effects, and you have elevated privileges (either from being a signed script, or perhaps because you are a Greasemonkey user script or similar), so you can fetch content by way of XMLHttpRequest (side-stepping the same origin policy), you can use a JSON parser rather than eval() to process the feed. The JSON site provides a JSON parser written in javascript.

    Otherwise you would have to use some server side script to process and cleanse the feed first, in which case picking up just about any feed at all and reformatting it as JSONP would be the solution closest at hand. Of course this is always an option open to you regardless of the original format (RSS, Atom or anything else you know how to read), if you have a server side base of operations where you can cook your own JSONP. Making a generic XML to JSONP converter, as Jeff suggests, is of course a neat idea.

  • How do you include JSONP feeds dynamically into a web page?

    Point a script tag at the feed. If you plan on spawning multiple JSONP requests throughout the lifetime of your page, clean up too, removing the script tag you added for the request -- for instance from the callback you get when the script has loaded. (A minimal sketch follows after this list.)

  • What about XML?

    What you do with XML, you should probably keep doing with XML, as there is that much of a larger tool base available for leveraging it. JSONP isn't here to replace XML in any way; its strength is solely in overcoming the same origin policy of the browser security model. (Sure, the markup overhead of JSONP is roughly 50% less than that of XML, and XML formats are typically designed in a bulkier fashion than the typical JSON formats in prevalent use today, but neither is really much of an argument for most practical purposes.)

  • Why should you adopt JSONP too?

    The way I see it, providing JSONP feeds for external consumption is really only interesting when you invite the wide public to innovate around your data, client side, unaided by any kind of server side resources on their part. If you don't, there is not much point in doing this at all. Doing JSONP is just lowering the bar as far as you can possibly go, in inviting others to use your service programmatically.

    It's comparable with providing a dynamic image meant to sit on a web page, or a really small web page component meant to go in an iframe of its own. The striking difference is that your data becomes available to programmatic leverage, which neither the image nor the iframe does, due to the browser security model (not counting the provisions on offer for partially hiding them by way of CSS).

  • What is the guaranteed payoff with making a JSONP feed?

    None. If your users don't want to use your data, they won't. Nor will they, if they don't know it's there, or how to make use of it. JSONP isn't widely known yet, and javascript has much bad heritage of being a language people cut and paste from web sites to get nifty annoying page effects going on their web pages. To this public, a JSONP feed is useless. It's for the growing community of "Web 2.0 programmers", to use a nasty but understandable name for them, that your JSONP feed will be useful. And you get no guarantees that anyone will use it for anything either; your data might not be interesting, or you might smell like the kind of shady person that would salt their JSONP feed with nasty side effects, stifling people that would otherwise have considered building something with your data.

    The only firm guarantee is that nobody will use a feed you don't provide.
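
To make the dynamic inclusion point above concrete, here is a minimal sketch (the feed URL and callback name are made up for illustration) of adding a JSONP script tag and cleaning it up again from the callback:

function fetchJSONP( url )  // url already carries its ?callback=handleFeed parameter
{
  var script = document.createElement( 'script' );
  script.type = 'text/javascript';
  script.src = url;
  document.getElementsByTagName( 'head' )[0].appendChild( script );
  return script;
}

var tag = fetchJSONP( 'http://example.com/feed?callback=handleFeed' );

function handleFeed( data )  // the callback named in the URL above
{
  tag.parentNode.removeChild( tag );  // clean up the script tag again
  // ...do something useful with data here...
}
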
All of the above is the price we pay to use data we don't produce ourselves, and to overcome the same origin policy of the browser security model, which is the very heart of the JSONP idea. Nothing has changed in this field since the turn of the millennium (except that we now have a name for a best practices approach to the behaviour/design pattern), though the WHATWG is churning away at better hopes for the future. Until anything concrete comes out of that, though, we are stuck with JSONP for things like this. Be sure to make the best of it.

2006-01-30

JSONP: The recipe for visitor innovation

What RSS and Atom are to feeds, making your data easy for humans to access and subscribe to, JSONP is for making your data easy to access for web page applications. To date, Yahoo are more or less alone on the large player front in having realized this. (They are also only almost JSONP compliant, but it's close enough to still be very useful.)

Where RSS and Atom are well defined XML formats, JSON is more akin to XML itself, in just being a data encoding, and a very light-weight one at that. Besides being very light-weight, JSON is also the native javascript format for writing object literals, hence the name "JavaScript Object Notation". And since the native scripting language of web pages is javascript, it should come as no surprise that you open up your data for user innovation by providing it in JSONP form. The P stands for "with Padding", and it only defines a very basic URL calling convention: the remote web page supplies your feed generator with a callback query parameter, and the generator wraps (pads) the feed in a call to that callback, which is what makes it usable for the page that requested it.
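
For illustration, suppose a page wants the feed and names its callback handleVisitors (both the URL and the callback name here are made up); the page would include something like:

<script type="text/javascript"
        src="http://example.com/feed?callback=handleVisitors"></script>

and the feed generator would respond by wrapping its data in a call to that function:

handleVisitors( {"visitors": [ {"lat":"58.415294", "long":"15.602978"} ]} );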

So, as neither JSON nor JSONP specifies anything about the actual feed content format for data, you are encouraged to think up something for yourself whichever way you please, and remain consistent with yourself, as, if and when you might opt to upgrade the feeds you provide with more data. I'll pick GeoURL for an example here, which has an RSS feed that contains this item structure:
<item rdf:about="http://ecmanaut.blogspot.com/">
<title>ecmanaut</title>
<link>http://ecmanaut.blogspot.com/</link>
<description>Near Linköping.</description>
<geo:lat>58.415294</geo:lat>
<geo:long>15.602978</geo:long>
<geourl:longitude>15.602978</geourl:longitude>
<geourl:latitude>58.415294</geourl:latitude>
</item>
Formatted as JSON, paying attention to making the data as useful as possible to an application rather than formatting it for human consumption, it might end up looking like this instead:

{"title":"ecmanaut", "url":"http://ecmanaut.blogspot.com/", "cityname":"Linköping", "lat":"58.415294", "long":"15.602978"}

I replaced the "description" field with a "cityname" entry holding the name of the closest city. You can of course easily improve on this with other relevant data too; for instance, why not add "citylat" and "citylong" values while at it? Or, you might nest it stylishly in another object, as

{"title":"ecmanaut", "url":"http://ecmanaut.blogspot.com/", "city":{"name":"Linköping", "lat":"58.4166667", "long":"15.6166667"}, "lat":"58.415294", "long":"15.602978"}

which would allow an application to access the data as city.name, city.lat and city.long. (Or, given that long is a reserved word in javascript, perhaps more safely as city["long"]. Hence the common practice of naming longitudes lng in javascript code instead.)

The spaces in the above examples are purely optional, and were only added here for easy reading. Whether to provide latitudes as strings, as above, or as numeric literals (dropping the quotes) is of course up to you, but I deliberately chose the string representation above, as that leaves it up to the application whether to treat them as numbers to do math on, or as text data to do some kind of visual or URL formatting operations on. Using floats instead would have lost precision information, as decimal and rational numbers are represented by IEEE floating point numbers in javascript, meaning that there would be no telling the difference between 0.0 and 0.000, or between different numbers with many digits where the differences appear only near the end. Opting to use strings, your consumers can pick either; the difference is just another parseFloat() call at their end.
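
For example, assuming the nested variant above has been parsed into a variable called data (my name for it, nothing standardized):

var cityName = data.city.name;             // "Linköping", ready for display
var lat = parseFloat( data.lat );          // 58.415294 as a number, for doing math
var lng = parseFloat( data.city["long"] ); // bracket syntax, as long is a reserved word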

Anyway, you might ask yourself what good providing JSONP feeds does you, as a service and feed provider. At least you should, as it's a very good question. A JSONP feed will (technically) allow any web page anywhere on the web to use your data in any way it pleases. This may be what you want when you provide a blogging service you want to popularize, a user customizable guest book with built-in feed support et cetera, and it may be quite contrary to what you want for information whose reach you want to be in greater control of yourself. (Of course, you can technically limit the scope of a feed using referrer detection tricks just as you can to avoid image hot linking, and it naturally also has the same issues with browsers and proxies set up to provide some amount of browsing anonymization.)

The real value with JSONP distribution is this: user innovation. It allows a third party to very easily leverage the functionality your tool or service provides, mash it up with other tools or services, and build really great things in general; things you probably never would have thought of yourself, or if you did, at least not had the ambition of doing of your own accord. And definitely not at no investment cost to you.

Opening up your data allows anyone and everyone on the web to do those things for you, should they want to. Ask for a link back from their cool applications, and you have a sudden generator of incoming traffic and visitors that builds on its own as others innovate around your service. This is pretty much what the fluffy word Web 2.0 is all about, by the way, at least to me. Data transcending barriers between sites and applications. Users playing an active role on the web rather than consuming a web made by others to solve problems dictated by slowly rusting business models.

The visitor map on top of my blog is just one example of what has been (and hence can be) made with JSONP; the feed provider of coordinates for recent visitors is provided by GVisit, and the naked feed is combined with Google Maps using the public Google Maps API. Yes, that rather boring text data becomes the pretty eye candy that lets us boggle at the many interesting places from which people drop by for a casual read here.

Assuming GeoURL provided JSONP feeds too, I could have asked it for blogs and web pages created near places where recent visitors of mine reside, and perhaps feature a rolling feed of pages made in their surroundings, as my visitor map zooms around the corners of the globe, tracking my erratic visitor patterns as curiosity drives eyes this way from all over the world. Add to that some optional topic tags data to each URL from that service, and a way of filtering the feed on the same tags, and it might even be made an opt-in self moderated list of links to peers of mine around the world for other blogs on topics such as javascript, blog and web technology, or what have you.

As you see, it's actually rather easy to come up with good web services that interesting applications could be built around in an open community fashion, this way, and JSONP is the brilliantly no investment, user interest driven way of doing it. It's trivial to make a JSONP feed, it's easy to leverage it into applications of your own as a web programmer, and the possibilities they open up for are nothing short of astounding.

I warmly recommend following Yahoo's lead here, but don't squash the callback parameter the way they do; put it in the feed verbatim, so the application developer won't have to jump through needless hoops to put your feed to good use making tomorrow's killer applications.

Adding javascript to Blogger posts

First off: Blogger does not let you post <script> tags to your blog. (Edit: It does, at least in non-beta, though it warns you about doing so, probably assuming it wasn't done intentionally.) Which is annoying, but perhaps a design choice to prevent code from executing outside of the context of the blog when syndicated through the blog's Atom feed. Or just plain imposing "this should not be in a blog" opinions on bloggers. Whichever.

I'm a bit surprised, though, as we can still use onclick handlers and the like, which I have put to good use once in a while in previous articles to make small in-page utilities and more helpful self-configurable tutorials. Anyway, we can fortunately put javascript code in our blog templates, so for most purposes there are ways of getting around this small annoyance.

A while ago, I was experimenting with posting plain javascript code in Blogger posts with a template that would let external pages include the result as script tags. Not a unique thought, apparently, as Stephane Hamel has institutionalized the practice. And quite cleverly too, using the year part of the post date (minus 2000) for script major version, month number for minors, and page title for script name. Version 1.1 of the Hello World example, thus maps to http://code67.blogspot.com/2001/01/hello-world.html.

The scheme doesn't allow for .0 minors, though; I would have gone for subtracting one from the month number had I devised the system. (Maybe Stephane likes RCS revision numbers, starting at 1.1. ;-)

If you want to do a less elaborate version for inlining code in a blog post, and just want to put the code in the post you are currently writing, rather than pulling it in from an external URL, I would suggest another solution though. Put this block in your blog template:
<script type="text/javascript">
// Find all <code> elements; run the hidden ones as javascript.
var c = document.getElementsByTagName('code'), s, i;
// Strips the CDATA-ish start and end markers; \46 and \133 are octal
// regexp escapes for "&" and "[", keeping the literals out of Blogger's way.
var junk = /^\s*\46lt;\133\133CDATA\133|]]\46gt;\s*$/g;
for( i=0; i<c.length; i++ )
{
  s = c[i].getAttribute('style') || '';
  if( s.match( /display[\s:]+none/i ) )
    eval( c[i].innerHTML.replace( junk, '' ) );
}
</script>
(ideally in some function run once the onload event has fired) and it will let you use <code style="display:none"> <[[CDATA[ your javascript code here ]]></code> sections as if they were really <script type="text/javascript"> //<[[CDATA[ your javascript code here //]]></script> tags. They will of course not show up visibly in your feed, nor do any harm there, either.

Happy hacking!

2006-01-24

Google Maps to Panoramio user script

March 3 update:

The map object volunteered by the Google Maps site of today only offers the pan() method under its intact name, so this script, which needs all of getCenter(), getZoom(), getMapTypes() and getCurrentMapType() to work, no longer functions.

If you are familiar with the Google Maps API or sites using it, you might know that the Google logo in the bottom left corner of every view is a clickable link that brings up the same view you were seeing, but in Google Maps (or, actually, Google Local, but for most purposes the result is the same). A rather good feature, and especially so in times such as the present, when Google Local has two zoom levels more than all the sites using the Google Maps API.

The feature skew comes from the Google Maps team recently having pushed out new maps images to Google Local previously only available in Google Earth. I would presume we will eventually get to see these via the Maps API too, quite likely with the event of the forthcoming move to Maps API version 2.

Especially as the new API names zoom levels with 0 mapping to "see the whole world in one tile", which was previously "see the finest detail map imagery available". Clearly, the new world order will allow for future expansion a bit more safely.

Anyway, while I like the "zoom to Local" feature, I kind of lack a "zoom back to Panoramio" link in the other direction, and now I wrote myself one. Panoramio, if you haven't seen it before, is a very well designed huge worldwide photo album by Joaquín Cuenca Abela, who also keeps a good blog on his tampering with javascript and tools for the purpose. And you guessed it; it runs atop Google Maps.

A bit to my surprise, my script ended up number one at userscripts.org. (Have you started reusing the ID:s of deleted scripts, Jesse et al?) Just in case it's a temporary lapse and the above doesn't end up the script's permalink, you may pick it up at my own userscript repository too. That actually goes for most, not to say all, of my user scripts. If either archive is offline, try the other for a fallback.

When you have installed it, you will get a little "Panoramio" link sitting at the top right of your Google Maps / Google Local page, right next to the Help link. That should keep working until the Google people change layouts, or the javascript code running in the page to generate the maps changes incompatibly, neither of which might happen for quite a long time, I'd venture to guess.

At least I didn't fall into the trap of picking up any of the names of the crunched-into-randomness method and property names the Maps and other Google javascript code is otherwise famous for. Feel free to read the code; it's short, and contains a few useful ideas and techniques for Greasemonkey and user script authors.

2006-01-21

SVG challenge: Taken!

Unsurprisingly, I was not alone in wanting really pretty web stats graphs like the ones Scott at RAD.E8 design crafted (featured image), but in SVG form, as I proposed two weeks ago.

As it happened, within only the span of a few days, Jeff Schiller had taken on the challenge and made good progress towards those pretty SVGs. Being a somewhat seasoned SVG hacker, he even opted to introduce interactivity to the graph too, scripting it to show not only a static snapshot of a day's stats, but rather the entire history of his blog's visitors.

Jeff Schiller's take on SVG browser stats

SVG, like HTML and presumably many other W3C standards to come too, ties in very neatly with ecmascript; embedding code is just as easy as javascript in HTML in the web world. Arguably even somewhat easier, as there is less legacy pre-DOM days baggage to bother about, SVG not having been with us since the dawn of the web.

Anyway, Schiller had already given the matter some thought previously, so I presume he was really waiting for the right inspiration to get going, and from what I gathered of his comments on the post, he was going to leave it there, as it was.

Wanting to see how far off the SVG front (in released browsers) is from being able to cope with making something as pretty as the originals, I picked up the ball and played some more with the code.

My take on Scott's and Jeff's browser stats

As it turns out, we are pretty close; it's really just some filters left, to add gaussian blur shadows for text and pies, perhaps some <textPath> to bend the text around the edge of the circle perimeter, and of course much, much better font rendering. That aside, we are more or less already there. I actually thought it was still farther off than that, but the good browser developer people have been busy.

Of course, picking the right font is also of utmost importance, but as fonts go, I am very unseasoned myself, so I have not even tried here. Having said font on the client side is also an issue I don't think SVG addresses yet, but I presume some decent set of fallback font specifications might render more or less as intended too.

The data featured in my graph is actually what my traffic looks like, since I started tracking this blog with Google Analytics, formerly (and still, by some) known as the Urchin tracker. It was a bit of work bending out the data from the view and massaging it to sum up <1% slices into bigger chunks, summing up sibling minor versions and the occasional really exotic browser into categories, but most of it could hopefully be automated, given some work. I have not dared experiment too much yet, as I don't want to trip a lock-down in sector four.

I'm not saying that applies to all Google sites yet, nor that it doesn't, but as I haven't dived particularly deep into the fine print of the Analytics terms of service I have so far opted to be cautious.

The really great thing not yet mentioned in this story, is the file sizes of these pretty things. The PNG thumbnails I opted to put in this post (as there is still quirks and issues to circumvent when inlining SVG images into page content, Jeff reports) each weigh in around 40 kilobytes, or 70 for Jeff's dual graph. My SVG and its companion external javascript file weigh in at two plus seven kilobytes, or together about a fourth of the size of the thumbnail, and they scale very nicely to any size, not just the 200 pixels high versions seen here. Jeff's version (presently) is a hundred kilobytes, but also sports not the data for one, but for a hundred and seventy graphs you can instantly switch among, at the click of a mouse button in the left graph, or arrowing back and forth via the keyboard.

For comparison, two kilobytes is roughly the size of the tiny flag images at the top right of my blog for machine translation to other languages (very forcefully frowned upon by my girlfriend, by the way, translator by trade) -- so not only can we expect to see much prettier graphics once SVG imagery becomes mainstream on the web, but we will also get much speedier delivery of them. And more bandwidth to spare for the video blogs to swallow it all up again.

Good times ahead, people.

2006-01-18

Efficient code branching in javascript

First off: default optimization rules still apply. Lots of time and code readability is wasted on optimizations that do not speed applications up, or save memory, to any measurable degree. Don't go there unless that extra bit of efficiency is really needed, in which case you should do measurements to track down hot spots, and address those. Read through the slides on optimization from What Works in Software Development (via The Farm), for instance, if you are unfamiliar with when, where and how to do optimization.

Indeed, instead of seeing this as a tip on optimization, see this post more as a way of breaking free of programming style baggage you might have acquired from programming largely compile-time static environments such as Java, where the compile-time reality and looks of methods are set in stone. Javascript is, by contrast, a very dynamic language, and there are benefits to reap stemming from that fact.

Most javascript code today doing anything even remotely advanced with the DOM needs to be aware of the different ways of solving the same problems in different browsers. The IE reality is a harsh one in ways that its Mozilla or Opera counterparts are not, for instance, and depending on what you do, it might need doing somehow else for Safari to cooperate, and so on. So it has some kind of browser forking, executing different code in different browsers.

The approach is typically this: make a method that does Something. Prior to making this Something, test for which browser environment is running the code, and pick a code branch depending on the outcome. And the tip is just as simple: as you know that during the execution time of your script, the target browser will be the same browser exhibiting the same flaws as it was the first time you tested it, you may opt to do this test once, rather than every time your method is invoked. In effect, you customize your methods in an initialization step that sets up the code to run in the way the visiting browser wants it, and get rid of the excess code and testing baggage that would have allowed other execution paths for other browsers.

An example, testing for an archaic pre-DOM browser from Microsoft to implement one of the most frequently typed DOM methods, typically called something offensively long:
function $( id )
{
  if( document.getElementById )
    return document.getElementById( id );
  if( document.all )
    return document.all[id];
}

The exact same behaviour, minus the run-time testing overhead, could be expressed as:
if( document.getElementById )
  $ = function( id ){ return document.getElementById( id ); };
else if( document.all )
  $ = function( id ){ return document.all[id]; };
else
  $ = function(){ return undefined; };

You should also be aware that this, like any other possible optimization, might not always end up a speedup, even if it runs in some inner loop hot spot of your code. It could even slow down the code; measure, measure, measure! A bit of inlined bulky code testing for and calling document.getElementById directly when available, in every bit of code in your application that picks up a node by its id, might prove very much faster than the additional invocation of a javascript function. You won't know until you measure it in all the target environments it is supposed to run in. Which again should remind you of how expensive optimization can be, in terms of development time.

But it does not hurt knowing your options when you write code that needs branching in ways you could test for in advance, or on its first invocation only, rather than every time it gets run.
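
A sketch of that "first invocation only" flavour (my own illustration, not one of the examples above): the function performs the test once, replaces itself with the browser-appropriate version, and then delegates to it:

function $( id )
{
  // Decide on first call which implementation to use, and replace ourselves with it:
  if( document.getElementById )
    $ = function( id ){ return document.getElementById( id ); };
  else if( document.all )
    $ = function( id ){ return document.all[id]; };
  else
    $ = function(){ return undefined; };
  return $( id );
}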

2006-01-17

On Firefox extensions, stability, and standardization

I was very long overdue starting fresh with a new firefox profile, my prior one having made my installation very prone to frequent browser crashes, which isn't something you should just accept and live with. Firefox installations, much like Windows installations, have a tendency to rot and degrade with time, especially if you try out software by third party sources (which equals "not from the Mozilla foundation", in the case of firefox and "not from Microsoft", for Windows).

This is, of course, a natural phenomenon, and not just bashing programmers in general. All programmers make mistakes once in a while, and without quality control you know you can safely place your trust in, every piece of software you install on a computer system may jeopardize the stability of other related software. In a playful environment such as an open source browser that pretty much invites the world to poke about with it, or in an operating system used by a large part of the desktop machines throughout the world, the overall effect is more or less inevitable. Especially given my computing habits.

So I started a new profile. To those of you who take on the same procedure for the same primary reasons, I would recommend doing what I first thought I would: to not look at your former list of extensions, when reinstalling. Only pick the select few you know you use and without which you just don't feel comfortable. In my case, those would be something like


-- those being the ones I really wouldn't want to live without (nor migrate to competing browsers which lack their functionality). Some of them, like the Google Toolbar and keyconfig, only very recently made the list: the Google Toolbar for having until recently been a big ugly beastly thing rather than the set of slim, independently draggable widgets, placeable just where I want them, that it is now; Stylish, for being a relatively obscure sister of Greasemonkey, installing user css just as conveniently as Greasemonkey installs user scripts (never again shall I touch a Greasemonkey script that just adds styling to a site); and keyconfig, for plain not working in my probably severely bitrot ridden former configuration.

Stylish is worth mentioning a bit more in-depth, by the way, as I believe user css is still not a very familiar concept in large circles. First off, there are two domains for user css, at least in mozilla descendants: the HTML namespace, where web sites specify styles and layout for their page markup bone structure to make the colorful, pleasant eye candy web we see, and the XUL namespace, where browser developers style the bone structure of the browser interface itself -- the looks of buttons, bookmarks and menus, for instance. Stylish lets you add your own configuration on/off toggles to restyle either, and in the case of HTML, to do it on a site by site basis, just as does Greasemonkey for javascript code. And, hopefully, eventually to share them as *.user.css files we could set up a whole new section on userscripts.org for.

This way, I can quite comfortably remove the (to me) completely useless Go and Help menus, and get some more space for gadgets I do use, such as the Web Developer toolbar toggle, the Google search field and a GMail icon. And a slew of other things too, at least until I figure out how to add keyboard shortcuts to generic widgets in my browser interface, so I can for instance zoom up a directory level in the URL without clicking the Google Toolbar provided "parent directory" button, which recently succeeded my minimal ".." bookmarklet (which you might opt to bookmark; click it, and you will presently come to a non-existant page, since this blog lives in a place where I can't make a proper site structure with index pages strewn throughout it in suitable spots in URL space).
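
A minimal sketch of such a go-up-one-directory-level bookmarklet (my own approximation, simply stripping the last segment off the path) might read:

javascript:void(location.pathname = location.pathname.replace(/[^\/]+\/?$/, ''))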

Input on how to add such key bindings is highly welcome, by the way. In this department, competing browsers such as Opera are still light years ahead, having long ago standardized and centralized all key binding configuration. Keyconfig doesn't solve more than making it possible to add new keyboard shortcuts; for some things that were already bound, a clashing shortcut you add will not replace the old binding, but trigger both functions. Let's hope that solutions are not all too far off.

Anyway, I did not opt to installing the bare minimum extension set. Where I had previously had perhaps a screenful, say twenty, extensions, I now browsed through the first fifteen (of 97, at present) pages of popular firefox extensions at Mozilla Update, installing those I deemed interesting and useful enough to merit a peek. This yielded a much larger list of (mostly very light-weight) extensions, now closer to 50, all in all. I'll share some tips as I start to discover which ones are the true gems.

If you do this at home, you should first be aware of one thing: extensions typically add lots and lots of status bar buttons, Tools menu items, submenus, and web page context menu items, especially when you install them by the dozen. You will end up with an overstacked browser interface that hits usability hard. And on the status bar front, Mozilla development has not matured nearly as much as on the toolbar front (which unfortunately are two very different fronts), so you won't be able to click and drag to relocate the bits you want elsewhere, or drop the ones you don't want at all. Only the really good ones, such as Adblock's, offer that.

Reordering the list of extensions may change the order of icons in your status bar, but it's neither easy nor necessarily possible to get them how you want them, if you have strict preferences about looks.

I would like to take this opportunity to recommend that all mozilla extension authors provide an option, in the extension's configuration dialog (the one available from the Extensions menu), to not add things to menus and the status bar. The good thing about that route to configuration, as compared to, for instance, adding it to some menu or keyboard shortcut, is that it is standardized across all extensions, so your user base will be able to find it, and it will pollute neither key binding nor menu space for your users.

It also will not make those bindings clash with those of other extensions that also chose Ctrl+Shift+P for preferences. Pardon me for stating it in this not very flattering way, but when you think about it, it really is kind of an obvious source of trouble, wouldn't you say?

Another thing worth doing, usability wise, is trying to go with how applications usually behave. If your status bar icon has a menu you can access, it very frequently sits on the right mouse button, for instance. Similarly, if it has an on/off toggle bound to same icon, it very frequently sits on the left mouse button. And if you have a mode toggle that does not live in the status bar, try making it a widget that can be dragged to a toolbar somewhere, or in any case at least resist the temptation of putting it in a context menu. Imitating how others do lowers adoption thresholds and improves usability. Mainstream is good, in this context.

Thanks. (Your less outspoken user base will also silently be very thankful to you.) Now I'm looking forward to a few days of adding user styles to filter out some not very wanted recent sightings in my Tools menu, also conveniently made available from my Extensions menu where I roam in times of need. Or, actually, looking forward to having already done that and moved on to the pleasant experience of a slim weasel of a firefox, packed with a feature set very hard to beat. Without coming up with new and extraordinary yet-unthought-of extensions.

And do take inspiration from other extension wrights living in the creative field between excellence of design, visualization and use case, such as the just recently announced Safari Web Inspector, if you have a mac around (via Ajaxian). You can be as good, or better. It looks good on your CV list of accomplishments, too.

2006-01-15

Javascript IDEs, resources and load progress imagery

This all happened rather backwards, so I'll take it in reverse chronological order. Did you know what the best javascript IDE on the market is, at present? Well, neither did I, nor would I have guessed, either. (And while I can't claim perfect field knowledge yet, I sincerely hope I'll be rewarded with corrections for making such bold statements -- claim first place, show it off, and nobody will be more glad you did than me.)

As it turns out, it's Photoshop.

Yes. The graphics package from Adobe.

It's got a really good javascript debugger/profiler called "the ExtendScript Toolkit", and it ties in beautifully with Adobe Photoshop, ImageReady and Bridge, running scripts in those applications from a separate window of its own, single stepping them, adding break points. It has a very intuitive, overviewable and to the point interface you can reconfigure to fit your needs, it has its own unicode capable built-in editor with syntax highlighting support, call stack and variable inspection views, and everything else you would expect from a stand alone product. The Adobe + Macromedia merger is making more sense to me by the minute.

Finding this treasure, I was really taken by surprise, though it makes perfect sense. Photoshop embeds a javascript interpreter to do custom scripting of actions, and your own advanced automation of boring tasks. It's much like the AREXX interfaces of Amiga applications during the nineties, and you basically get access to everything you may want to script, as it should be.

I have been mildly aware of this since 2004, on finding the (quite excellent) ObjJob javascript DOM references, which cover not only the various browser DOM and ecmascript core object references but also the base SVG, Anark Studio and Photoshop DOMs. Among its better aspects is that it allows toggling inherited properties and methods on and off, and breaking down the DOMs by interface, which is great for finding what you are looking for quickly.

Not until today did I really look into any of it, though. I had set out to make myself some animated load progress images for a forthcoming upgrade of my topic navigation system, and figuring it's a blob of JSON being fetched from Del.icio.us (and since it's stylish), I thought I would base it on the JSON logo. My first shot at it ended up like this:

[Three animated JSON load progress images]

But I didn't realize it was antialiased towards a white background until I was done, and redoing the job from scratch to get it antialiased towards the somewhat darker background I had in mind did not feel like a very amusing prospect at the time. So I recalled the scriptability of Photoshop, and thought I should at least investigate.

Tranberry seems to be a very good resource on Photoshop javascript programming, and the Adobe reference guide PDFs are very good, too. I'll probably be back on the subject later on. If somebody in the meantime would pop by with a tip on already existing support or automated methods for making gif animations by rotating an image with Photoshop, I'm all ears, but otherwise I will probably eventually address it myself.

Unless I end up fawning over the ExtendScript Toolkit, every time I start poking on it.

2006-01-14

Flash for us non-clickies

I have long wanted to see the gap between browser javascript and flash narrow, and it seems to be slowly starting to happen about now. Brad GNUberg reports on a javascript to flash integration demo where a flash component sitting behind the page HTML communicates with the page javascript to do seamless animation in the background -- something browser javascript won't have until there are ways of synchronizing drawing with the monitor refresh (don't hold your breath). It's a rather nice tutorial on how to go about doing it, too.

2006-01-10

Socially made tag site navigation: any providers?

I have recently been in touch with Hans Persson of Project Runeberg, a volunteer effort to create free electronic editions of classic Nordic (Scandinavian) literature and make them openly available over the Internet. It's a well established project that has been running since 1992 and already spans in excess of a thousand volumes of works by 286 authors, a total of 387409 pages (and counting) scanned from works predating the reach of today's copyright rules. (Works by creators who died more than 70 years ago are all in the public domain.)

Needless to say, covering that many books, their web site is gigantic, and making a good and relevant navigation system is quite the undertaking, especially for a volunteer project on a limited work force and budget. So they are considering social bookmarking services as a means of making site navigation, too, backed by visitor interest, knitting together what has been bookmarked and tagged by their visitors with other pages at runeberg.org (their site) tagged with similar tags.

Before they embark on creating a bucketload of server side scripts and cron jobs to occasionally scrape social bookmarking services such as Del.icio.us or Technorati, I think it would be prudent to toss out the question of whether this is a service in the works, or already provided, by any of the tagging services, large or small. They already sit on the databases and are in a great position to set up a service to address this use case, which I believe has a huge potential of being useful to web sites of all topics and domains, all over the world, and probably one that could easily finance itself all of its own.

A quick off the top of my head sketch of a suitable mode of operation:

  • Given a URL, look up all tags from all taggers for that page.

  • Cross-reference those tags with other pages from the same domain, having tags in common with this page, sorted in descending order by how many.

  • Provide this dataset in JSONP form for the referring page to lay out as it wishes (a sketch of what such a response might look like follows after this list).

  • For points of style, embed a permalink URL in said JSONP package to an RSS feed of this dataset, covering links for the given search criteria in the future, as and when they are added.
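
Purely as illustration (the callback name and field names below are my own invention, not any existing service's API), such a JSONP response could look something like:

showRelated( {
  "url": "http://runeberg.org/some/page.html",
  "tags": ["literature", "nordic", "poetry"],
  "related": [
    { "url": "http://runeberg.org/another/page.html", "sharedTags": 3 },
    { "url": "http://runeberg.org/third/page.html",   "sharedTags": 2 }
  ],
  "feed": "http://tags.example.com/related.rss?url=http://runeberg.org/some/page.html"
} );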

Readers are most encouraged to expand on the ideas and link projects, present and future, that cover similar ground.
Categories:

2006-01-08

Merging RSS / Atom feeds, and teaming up

Where are the online tools for merging feeds? I thought this would be something a quick Google search, or a tags peek at Del.icio.us, away, but to my surprise, I only found two:
  1. feedjumbler, which was so broken to the naked eye that I had to peek at Google's cached version to see that it would indeed have been what I wanted, and

  2. Blogdigger, which would let me setup a merged feed, and then deliver an empty feed. Update: As noted in the comments below, this is actually an expected result, due to an (at my time of visiting) undisclosed initial indexing delay.

Oh, well. Maybe I just had really bad search karma today. The real solution to the above problem still lies over at userscripts.org, where Jesse and Britt have been really busy doing other things lately. With a bit of luck, I'll be able to join the team to do some tweaking and feature additions myself in a while, such as creating aggregate RSS feeds of all comments to scripts from a particular author, which was what I set out to do above.

This is one of the things I really like about many of the really good projects on the web today -- the barrier to joining in to help out has felt much lower than it used to do a few years ago, at least to me. Prove yourself capable and worthy of trust, acquaint or befriend the people behind the web fronts and team up with them, for the good of the project. You also make a lot of really good friends with people who, much like yourself, want to improve the web and its tools.

Everybody wins.
Categories:

AJAX × Date × Time × Time zones - best practices

Are you developing a site or application which relies on javascript, which somewhere on some page or in some view shows some time, or a date, or even a weekday to the visitor? Registration time, for instance. Or an advance notice of your next scheduled downtime? Anything at all! Do you also let visitors from other cities, and maybe even from other countries, visit your web site?

Great! Then this article is for you.

Because if you do, then you have run into the problem of time zones. Or, if you haven't, then your poor visitors have. In a typical scenario, you are in one time zone, your web server is in another, and the visitor is in some third place you can't even begin to imagine. Azerbaijan, perhaps. Or Tadjmbekistan. (No, I'm kidding; there is no Tadjmbekistan. ...or is there?)

Time zones are our friends, as common people; they make clock readings correspond vaguely to the height of the sun on the sky, letting us wake up almost at the same time in the morning wherever we are on the planet. Very convenient way of organizing work hours with light hours. It's not perfect, though, and in a vain attempt of doing some slight adjustments to the sun's annoying habit of drifting some over our hours, we adopted daylight savings time, DST for short, half of the year, shifting our scale by an hour to compensate. This has been widely regarded as a bad move, and has over the course of centuries in all likelihood given us much more trouble than ever the Y2K problem did. But we're used to it, and it's hard to switch back, even though we now have artificial lighting so we can perform work any time we well please anyway.

What time and day is it right now? You probably have an answer that fits your local reality well. Your web server probably has an idea of its own, too. It might even be in concert with yours; it mostly at least gets the weekday right. But on the other side of the globe, it might still be yesterday. Or already be tomorrow, from your point of view. And you bet you will have visitors from there, who find it at best bothersome that your web site insists on getting these things wrong.

So you append a "PST" to your time readings.

No, you don't. Humans hate performing time zone math. They are not even very good at it. And your visitors from Azerbaijan don't even know what it stands for. And those who do will wonder if that is Pacific Standard Time, or if it's maybe daylight savings time over there, and in either case, how many longitudes away is PST, anyway? Then they will very likely get an off by one error in their calculation anyway, and you bet they will be pissed at that self centered... Well, let's stop there; we all get the point.

So you introduce a time zone option in your user settings.

No, you don't. Or at least you shouldn't, if this is the problem you are trying to solve. A time zone choice is good if you, for example, want your site to be able to relate time in one visitor's time zone to the time in other visitor's time zones, but for now, we just want to format times for one visitor in her own frame of reference.

Besides, time zones are complicated things, with local political rules, all unlike the rest, and only roughly corresponding with the longitude of the visitor. "Naïve" time zones, like "UTC+1" for central Europe, for instance, will be off by an hour half of the year, and real time zones, like "Europe/Stockholm", where DST rules are taken into account, are a whole lot of work and knowledge to maintain, or have your programming language / environment maintain, and always get right. And was there really no Tadjmbekistan? If there were, would the guys who wrote the functions to handle the math know about its rulesets? And did they also cater for the changes it made on becoming the People's Democratic Republic of Tadjmbekistan last year?

Unlikely.

So, what do you do?

Your visitor's browser knows what time zone it is in. It even knows what time it is. And, it's programmable. See? You probably knew all those things already; after all you are developing applications for it, in it, and the rest is history. So you let the browser do the math for you. And for your visitor.

There is an easy way, and a hard way. And as the result will be the same, I suggest you don't bother much with javascript's meagre support for time zone arithmetic, date.getTimezoneOffset() and similar trickery, because there is a beautiful time zone called unix time, which all good programming environments handle, including javascript. What time is it in unix time, right this moment?

Click to find out.

Yes. A number. And it isn't even related to time zones; unix time is the same everywhere -- in space, too, actually. And your visitor's browser knows how to convert it to the time zone the computer was set up for. It's as trivial as passing that integer to the constructor of the javascript Date object, multiplied by a thousand to get it in milliseconds (the native time unit in javascript) rather than seconds. Similarly, if your environment provides microsecond timestamps, divide by a thousand instead.
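
In code, that conversion is a one-liner (the variable names here are mine, just for illustration):

var unixTime = 1136726801;               // seconds since 1970-01-01 00:00:00 UTC
var when = new Date( unixTime * 1000 );  // Date wants milliseconds
// when now renders in the visitor's own time zone, for instance via when.toString()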

Generalizing wildly, I'll assume you pull most times from a MySQL database, in which case you should tell it to give you unix times using SELECT UNIX_TIMESTAMP(modified) AS modified, for instance. Do it as close to the data source as possible, to have as little code as possible trying to apply a local understanding of date and time mathematics, and so avoid unnecessary pitfalls. Pass this integer along to the client, and format it appropriately for the visitor. I would suggest a date format which lists month names rather than figures, as variants on the X/Y/Z theme offer too many interpretations, forcing the reader to guess which variant you prefer, even if you are kind and foreseeing enough to supply four digit years.

Here, you might actually want to provide a configuration option on how the visitor prefers to read dates and time. But don't try being too clever, bundling the choice of date format with the choice of country, or interface language (as does Google Mail, who move on to downgrading functionality if you opt for a Swedish interface -- for instance, you suddenly can't choose sender addresses if you switch to the Swedish locale). Suggesting default choices based on locale choice is good, assuming what goes with what is not.

Here is the most trivial date formatting method possible to get you started or for debugging purposes, and one somewhat more sanitized version. Most of you will probably want to cook up something better; browse your options among the many available methods of the Date object for primitives. Just remember to stay clear of the legacy getYear() function; it's getFullYear() you want to use. If you use some library, it probably already has good date formatting methods that will do a good job if you pass them a Date object.
function formatDebug( timeInteger )
{
  return (new Date( timeInteger )).toString();
}

function formatTime( time )
{
  function zeropad( n ){ return n>9 ? n : '0'+n; }
  var t = time instanceof Date ? time : new Date( time );
  var Y = t.getFullYear();
  var M = t.getMonth(); // month-1
  var D = t.getDate();
  var d = t.getDay(); // 0..6 == sun..sat
  var day = ['Sun','Mon','Tue','Wed','Thu','Fri','Sat'][d];
  var mon = ['Jan','Feb','Mar','Apr','May','Jun',
             'Jul','Aug','Sep','Oct','Nov','Dec'][M];
  var h = t.getHours();
  var m = t.getMinutes();
  var s = t.getSeconds();
  return day +' '+ D +' '+ mon +', '+ Y +', '+
         zeropad(h)+':'+zeropad(m)+':'+zeropad(s);
}

As you are pioneering this field of end-user usability, you may want to state that times and dates are indeed given in the visitor's frame of reference, as people have generally come to expect to see times given in some random and hence typically fairly useless time zone. This can be seen as a not entirely bad reason for actually providing a time zone configuration option, should you want one. I would suggest defaulting it to "auto" or "local time" using the above method, though, as that is most likely what the user would want to see, anyway. This way, the configuration option actually doubles as documentation of what times the site shows, in a place a visitor is likely to look for it. To make it extra apparent that you render proper times, you might equip the page with the setting with a printout of present time, which the visitor will easily recognize as the time her computer clock shows (since they are in fact one and the same).

Similarly, if your web site does not require javascript, neither should you introduce this requirement just for handling date and times. You can do both, though, if you have the time zone configuration option doing time zone math server side, and provide the javascript version for locally rendered times. It might look like this in rendered code:
<noscript>Jan 8, 2006, 14:26</noscript>
<script type="text/javascript">
document.write(formatTime(new Date(1136726801*1000)))
</script>

...That's all we have time for today. Hopefully, you and your visitors will have all the time in the world, and get them properly in sync too. Best of luck to you all!

2006-01-07

Seventh druid, and a nifty XPath tool

Yesterday I spent a few off hours wandering the web, with an attentive eye to permalink structure and post HTML formation. Why? Off the top of my hood, well... It seemed like more fun being a druid than a mere cleric. I am not sure it was time well spent, but it was educational, and a peek at an aspect of the web I think we seldom take much note of as site visitors. My selection of sites to peek more closely at was (more or less) the sites on my OPML list of feeds I read; a cross-section of mostly javascript, greasemonkey and web 2.0 tech people, with the odd graphics designer and somewhat more mindful read sprinkled on top.

Unsurprisingly, almost all of them employ modern HTML constructs for site layout, with <div> tags and semantic markup for headers, lists and the like to draw up the general structure of pages, rather than the flurry of tables, font tags and spacer images of the primeval web. I believe we have the relative maturity of CSS and the hard work of template designers prior to the recent blogging explosion to thank for this. (And thank deity for that -- myself, I practically left the dirty web for a few years back in the nineties, having wanted to do many of the things that required the much cleaner web of today, and capabilities unavailable in javascript and the DOM back then.)

Anyway, to my surprise, one of the sites that turned out to be a mix and match of the bad old times and the good new times was Joel on Software. This is a hand-wrought site by a programmer for other programmers (mostly), and it employs table and tag attribute mayhem for the base site structure, with occasional classed divs for some of the content grouping inside. Not the worst tag soup design of late, but not as pretty as I had expected either.

I know, this is all not very interesting, but it gave me some good XPath exercise, trying to pick out specific nodes of the pages I was interested in. The kind of thing like "find the last <p> child of the first <td> element which has a <div class='slug'> child" (//td[div[@class='slug']]/p[last()]), which refreshes a lot more XPath expertise than a trivial "find the <div id='viewer'> element" (//div[@id='viewer']). Granted, neither of the above ensures on its own that it matches just one element (though the second will, in a well-formed document, where an id is unique), but for my purposes, that was not relevant.

And, better still, it gave birth to a little scriptlet for trying out XPath expressions on a web page, flashing the first element matched (or bringing up the expression again with an error message if it failed to find the node, or if you wrote some malformed XPath). For those of you who know the Firefox Document Inspector, the behaviour will feel familiar. Something like this ought to go into its "find" mode, by the way; I place this code in the public domain, should anyone want to submit it upstream.

By default, it suggests an expression matching divs with an empty class attribute, a very common start for most of the kind of things I usually look for; typically I add "post-body" or some other name between the apostrophes. I suggest typing in the expression you want and saving it to your clipboard (Ctrl+C), so you can reuse it once you have seen that it does what you wanted. Go ahead and try that on this page, if you like; it should flash the text body of this post a few times.
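
The scriptlet itself is not reproduced here, so what follows is a minimal sketch of the same idea rather than the original code; it leans on Firefox's document.evaluate, and does its flashing by crudely toggling visibility:

javascript:void(function()
{
  // a sketch in the spirit of the scriptlet described above, not the original script
  var xpath = prompt( 'XPath expression to test:', "//div[@class='']" );
  if( !xpath ) return;
  var node;
  try
  {
    node = document.evaluate( xpath, document, null,
      XPathResult.FIRST_ORDERED_NODE_TYPE, null ).singleNodeValue;
  }
  catch( e )
  {
    return alert( 'Malformed XPath expression:\n'+ xpath );
  }
  if( !node || node.nodeType != 1 ) // assumes you are after an element node
    return alert( 'No element matched:\n'+ xpath );
  var times = 6, orig = node.style.visibility, tick = setInterval( function()
  {
    node.style.visibility = (times-- % 2) ? 'hidden' : orig;
    if( times < 0 ) clearInterval( tick );
  }, 250 );
})()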

To go in the other direction -- that is, to find an XPath expression for a particular section of an unfamiliar page (assuming you are familiar with XPath syntax and workings) -- I again warmly recommend Aardvark (read my prior post about it), a great extension which, once invoked, shows node names, classes and ids of whatever the mouse hovers. You can even walk up through the page structure by repeatedly tapping W (to widen scope). Immensely useful. And, for tough nuts such as Joel -- the Document Inspector, for looking at the full, exact node structure of the surroundings of a node.

Having been made the seventh druid of the hoodwink society, I'm starting to feel right at home. I'll be back with some additional thoughts about permalinks shortly, a topic I feel deserves a whole article of its own. Permalinks, the remnants of the original idea of the URL, are very important technology we ought to pay much more attention to, and teach every new generation of people coming to the web about. But I will save that discussion for later.


By the way, if you find some article or tool of mine really useful, I would very much appreciate a small tip, say a dollar, for my work on it. Try giving my donation pane a spin; I try to keep it unobtrusive and out of the way so it does not disturb the readability of my posts. Don't feel obliged to, but it would encourage me to keep doing the kind of things others (besides myself) find value in.

2006-01-06

Prediction about SVG / Canvas

Prediction: when somebody does something as pretty as this using SVG or Canvas, a landslide of people will give up their previous-generation browsers. It doesn't have to be a plugin for Mint or Google Analytics, though that would certainly help.

The best thing about this is that I'm fairly sure this can already be done today. All it takes is the hard work of a good graphics artist with a knack for vector graphics, and perhaps some minor understanding of programming, or a programmer with the patience and follow-through to mimic the design closely enough in code, or parametrized vector paths. It's an interesting challenge, isn't it?

You would have Ajaxian beating down your door before you realized what happened, too. :-)

2006-01-05

hoodwink'd

I must say the sheer ambition of this project has me kind of stumped. Or the intense undergroundness about it. Or maybe the air of "hey, it can be done, and it's cool, so let's!" behind it. Plus, it's a very high profile example of the magics made possible by a common user script. Or a not so common user script, with some backend server support.

Picture the web. Yes, we are all familiar with it; it's mostly a content provider's world still. User scripts skewed this a bit in the direction of "a content provider's world, which geeks can remodel at will, sharing their site mods". Hoodwink brings it another large step towards a great big wiki, where any participant can scribble a comment somewhere for all to see, whether supported by the site host or not. How?

A hoodwinked browser adds comments and comment buttons to posts all over the web, layering another web on top of our familiar web. Not all web pages; only pages on web sites where a member of the community has tailored two parameters for what can be commented and where. The first is a regexp for URLs (to permalinks of commentable entities), the second an XPath selector (for picking out a spot in the web page where it would make sense to add a "comment this" button). Simple enough to work, obscure enough to look and feel like magic.
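
To make those two parameters concrete, a hypothetical site definition might look something like the sketch below; the format and names are made up for illustration, and hoodwink's actual configuration need not look like this at all:

// hypothetical example of the two pieces a druid supplies for a site
var site =
{
  // which URLs are commentable entries: a permalink regexp (made-up pattern)
  permalinks: /^http:\/\/example\.blogspot\.com\/\d{4}\/\d{2}\/[-\w]+\.html$/,
  // where to graft the "comment this" button: an XPath selector (made up too)
  commentSpot: "//div[@class='post-footer']"
};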

Comments are stored on a separate server hidden away behind additional layers of obscurity, where a common mortal does not risk treading by mistake; conscious effort is required to find your way into the system, and the chances of search engines scooping up the content are even smaller. This is indeed an operation in the mists of shadows.

If you want to participate in the hooded fellowship, instructions are available. They cover the prerequisites for teaching your browser the magic handshake needed to find any content on the hoodwink server at all. Once there, registered and let in among the druids, you find a customized user script that lets you see the hooded version of the web and participate -- both in scribbling and, should you want to, in becoming a druid yourself, adding site coverage by following the hints and tips section.

Be aware, though, if you mind your privacy, that any step you take on the hooded web with the script turned on will be shared with the hooded server, echoing back your complete click trail. It does not get exposed the way your Del.icio.us bookmarks show up for anyone to see, but your browsing history will leave your local computer, just as it does if you run the Google Toolbar and opt to show Page Rank, for instance (which also requires leaving a breadcrumb trail, to Google, wherever you stride across the web). On the other hand, recent Greasemonkey versions make it very easy to turn scripts on and off, with just a click on the Greasemonkey icon and checking or unchecking the script in the menu, so you can always switch back and forth between the underground-enhanced and the public web.

The good things do not end here, though. There is RSS too, and lots of it. This is in fact CommentBlogging in fully developed form, where the commentary you leave on blogs and other pages anywhere is recorded and neatly packaged up in RSS feeds by author, by site, and also in a joint feed of all recent comments, or winks, as they are called here. Very nicely executed indeed. A fourth feed is available for new sites spun into the network by the druidry, too.

You can't read the feeds in your web based feed readers, though. For technical reasons, probably very intentionally designed that way, you need to use a feed reader running on your own machine, or build a proxy for echoing the feeds to where you want them. I believe these obscurity measures are a way of limiting explosive growth by holding back the wide masses of common people, growing at a slow pace and attracting only a very geeky crowd of druids at the moment. I can see how that might be a rather good thing for a community like this.

Similar things have been done in the past, I believe by Alexa, using browser plugins, but I'm not sure what happened to them and whether they are still operational today. Doing it lower profile like this might make it last longer, spreading more slowly. Not to mention giving the air of exclusivity that lies around the project like a thick white mist. There is very beautiful Magick going on when visiting the core server itself, too, where an AJAX based forum is woven out of thin air, the AJAX code itself not even present in the pages loaded from the server, as it all resides in the local user script. All the composition into actual HTML pages full of life and discussion is driven from your own browser, not as traditionally from a recipe fetched from the web server. The web turned backwards, or inside out, again.

I really like the contribution climate they have devised too; weave in ten previously unhandled sites to become a druid, and have the present druids confirm that the sites you weave indeed work. After that, I presume, you are versed enough with the regexps and XPath to take part in the elder druids' duties, similar to how wizards have ruled the MUDs throughout the ages.

How I stumbled on this secret shrine hidden in the depths of the darkest woods? I just retraced my steps to the Greaseblog, reading up on articles I had missed, and found a new nice Greasemonkey presentation from early December in a post where Aaron shares and shows off a new presentation engine of his.

Thanks, Chief Monkey and Googler. :-)

2006-01-04

Unescaping HTML in javascript without function invocation

Here is a (rarely) useful tip on how you can effortlessly unescape HTML in a document, without even calling a function. It is probably mostly useful when debugging some live page where everything below some node has ended up one level of HTML quoting too much, and you want to see what it should actually look like. Or, in other words: do not use this in production code; first, because you should always go for fixing the problem at its source, second, because real string dabbling is still the more portable route, across new and aging browsers alike.

Anyway, if you have a node whose contents need unescaping a level, here is what you do: target.innerHTML = target.textContent; Yes, that would be all. Run it again to unescape another level of HTML escapes. To do this point-and-click style with the Firefox Document Inspector, invoke it first with Ctrl+Shift+I (or Command+Shift+I on the Mac, I presume), click the "select element by click" arrow, click your node in the document, switch to the "Javascript Object" view in the right panel, right-click Subject and pick "Evaluate Javascript...", then type or paste the magic statement from above. Done!
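
If you would rather run it as a bookmarklet than go through the Document Inspector, the same one-liner translates directly; a tiny sketch, where the id is made up for the example:

javascript:void(function()
{
  // "over-escaped" is a hypothetical id; substitute whatever node you are after
  var target = document.getElementById( 'over-escaped' );
  if( target )
    target.innerHTML = target.textContent;
})()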

If you want a test page to try it on, have a look at the last excerpt on this page on Yukuan's Blogmarks blog, generated by the Blogmarks sync feature (which seems either to escape some entries once too many, or, perhaps more likely, to unescape them once too few). (I would venture guessing that this is the effect of not properly taking notice of attributes in the imported dataset stating whether it is quoted or not, but I may be a bit off the mark, as I have been reading up on Atom and RSS and lost my sense of Blogmarks context somewhere in the middle of it. It's also five AM here; when did that happen?)

When done marvelling at the beauty of this nifty hack, do read the linked article on the Google Reader API; it's good for some interesting future prospects. When will we see a mashup of feeds and translation services, for instance? I find I would want to pick up Yukuan's feed and tell FeedBurner, or better still my feed reader, to translate it to English for me (adding some standard header bringing the reader's attention to the fact that the poor translation was the result of machine translation, rather than insanity on the part of the author) and otherwise show it like any other posts in my reading list.

I tried in vain to get Google Language Tools to translate Atom and RSS feeds alike; it only seems to chew text/html at present. I suppose we will see good things happen on this front too, eventually.

Web API access control best practices

This article discusses best practices for access control in modern web services, a field for some reason still embarrassingly set back in relation to the kinds and amounts of data we have started to work with through the emerging APIs of various services offered all over the web. Your data, your privacy and who gets what access to them are valuable resources, worth a protection seldom offered today. I'm going to ease into the subject with examples from three services: Blogmarks, Del.icio.us and Blogger.


I was just playing around with Blogmarks a bit today, after stumbling on News Berry, an edited "best picks" feed of Yukuan Jian's recent bookmarks, which I found very stylish indeed -- the screenshots that set Blogmarks apart from the other community tagging services I am familiar with really add that extra something to the mix.

So I thought I should investigate it: I passed by, imported my Del.icio.us account's data set and set up a bit of a test blog, which incidentally got flagged as a splog by the Blogger spam detection as I imported the day's worth of what was mostly CommentBlogging bookmarks. (Oh, well; if I'm lucky the Blogger people might clear it, I suppose, and if not, I'll try setting it up on some other blogging service. :-)

As I do not want to clutter this blog with more or less autogenerated content, I thought it best to make a new blog, and as I do not want to spread my Blogger login details around, I had to set up a separate Blogger identity for it too. Bothersome, but necessary at the moment, until the dawn of a more fine-grained password system that would allow me (or the entity using it, in this case the Blogmarks sync feature) to limit write access only to one particular blog. I assume such a system might be a bit beyond the short time span of Blogger development, unless things like this really take off and Blogger get appropriately uneasy about lots of their users sharing their passwords with random third parties. Because they well ought to.

I come back to this topic once in a while, because it is important to API designers for systems on the modern web, and few if any get it yet. If you provide a service with an open API, such as the Blogger Atom API, or the Del.icio.us API for managing your bookmarks, you should not offer just one single password that enables the holder to do anything, including erasing your blog or wiping all of your bookmarks. That is the equivalent of the root password, in Unix terminology. Yes, you should offer a root password. But for services such as Del.icio.us, where the typical innovators around your service will only need read access to your data -- for importing bookmarks, for instance -- you should also offer a different password for a lower privilege level, in this case read-only access. That way, your users can safely share that password with other services that feed on data off your repository, knowing that malicious code or system administrators cannot perform destructive transactions using your API methods, since they only have read access.

In the case of Blogger, the appropriate access granularity would, on top of full access to all administered blogs, also break down per blog into read-only or read/write access, multiplying the number of possible access modes to 2N+1: the number of blogs times the two access levels, plus the root password used by the human logging in to Blogger to write her posts, or using her preferred post editing tool to do the same. With three blogs, for instance, that means six scoped passwords plus the root password -- seven in all.

Bear in mind that this line of thinking is not specific to Blogger, or Del.icio.us, but lends itself to any system on the web offering access to user created data. You can do the math and resituating of the theory to your own case better than I could, I'm sure.

Only the root password needs to (and indeed should) be user configurable and treated by the web interface as a secret hush-hush entity. All other passwords could well be shown in some page view you visit when you need to share data with third party services. You might even offer the user the choice of tagging every password with the name or URL of the service she has used it with, for ease of tracking, and generate new passwords as needed, so every new service could get its own password for the same privilege level. Add a button next to each to terminate the password's validity, and the user will be able to kill off a service she is fed up with or for some reason no longer wants to place trust in, or whatever.
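
In code terms, the bookkeeping behind such a page need not be anything fancier than a list of issued passwords, each tagged with its scope, a user supplied label and a revocation flag; a hypothetical sketch, with every name and field made up for illustration:

// hypothetical data model for the password management page described above
var issuedPasswords =
[
  { secret:'readonly-token-1', scope:'read only', label:'Blogmarks sync', revoked:false },
  { secret:'readwrite-token-2', scope:'read/write: testblog', label:'post editor', revoked:false }
];

function revoke( secret ) // wired to the "terminate validity" button next to each entry
{
  for( var i = 0; i < issuedPasswords.length; i++ )
    if( issuedPasswords[i].secret == secret )
      issuedPasswords[i].revoked = true;
}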

We need to move towards advanced systems for access control like this, as web APIs mature and an increasing number of services become the bones and marrow of the web, building atop one another, until we look back on the web of the early decade and smile at the thought of specific sites being called "mashups" for picking up something from one service and joining it with another. It need not, and should not, be hard for a casual user to grasp, though, so don't go too far and make up scary-looking access matrices or similar.

Hopefully, some system similar to the one outlined above will evolve into a de facto standardized way of controlling data flow from one application to the next, so users will not have to face the burden of relearning every new service, as they so often do today. Rolling your own interface conventions is almost always the wrong choice. Today you are among the very first to do this, and hence have greater freedom to come up with something really good. Try going for something really simple, yet powerful. The ideas outlined above may go too far, and may not go far enough. Don't patent your interface and don't mark your territory; both practices limit the likelihood of your solution being adopted in the larger web community, marginalizing the reach and usefulness of your design work. But be good.

And offer your users the option of employing safe hex with the data you host for them. You only invite trouble tying them down to root access alone. If you would not offer a random third party your root password, why should you encourage your user base to be any less security minded?

Show your email address!

[mail icons: GMail, Bigfoot, Yahoo]

Have you too seen the nice "mail me" icons that sprout on blogs and web sites everywhere, and stopped for a moment wondering what makes them? I did some time ago, but didn't stop long enough to actually search for a source, and cooked up my own instead. (Hence the somewhat different size, adjusted to fit snugly in my blog sidebar.)

[mail icons: Hotmail, Lycos, MSN]

By a pleasant coincidence, I today stumbled on it (or one of them, anyway) in a post by Yukuan Jiang (in Chinese, so I assume you too might prefer to have Google translate it to English, or something close to it).

[mail icons: Mac, Netscape, Spymac]

I whole-heartedly recommend giving visitors options for ways of contacting you; it gives a much better impression than anonymous blogs where it is difficult to get in touch with the author. Everybody wins, and images like these don't get crawled by spam bots either. Leave out the additional HTML mailto: link around the image if you are really paranoid about getting spam that way.
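
In markup terms it boils down to something like the following, where the file name and address are of course made up for the example:

<!-- address shown as an image, with a convenience mailto: link around it -->
<a href="mailto:you@example.com"><img src="mail-me.png" alt="mail me" /></a>
<!-- or, if you are really paranoid about harvesters, the image alone -->
<img src="mail-me.png" alt="mail me" />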

[mail icons: AOL, AT&T, Blueyonder, Comcast, Cox, Earthlink, QQ, Rogers, Verizon, Sina, VIP.Sina, Sympatico, Rocketmail, SBCGlobal]