Unsurprisingly, almost all of them employ modern HTML constructs for site layout, with <div> tags and semantic markup for headers, lists and the like to draw up the general structure of pages, rather than the flurry of tables, font tags and spacer images of the primeval web. I believe we have the relative maturity of CSS and the hard work of template designers prior to the recent blogging explosion to thank for this. (And thank deity for that -- myself, I practically left the dirty web for a few years back in the nineties, having wanted to do many things that required the much cleaner web of today, and capabilities unavailable in JavaScript and the DOM back then.)
Anyway, to my surprise, one of the sites that mixed the bad old times with the fresh good times was Joel on Software. This is a hand-wrought site by a programmer for other programmers (mostly), and it employs table and tag attribute mayhem for base site structure, and occasionally classed divs for some of the content grouping inside. Not the worst tag soup design of late, but not as pretty as I had expected either.
I know, this is all not very interesting, but it gave me some good XPath exercise, trying to pick out specific nodes of the pages I was interested in. Something like "find the last <p> child of the first <td> element which has a <div class='slug'> child" (//td[div[@class='slug']]/p[last()]) exercises a lot more XPath than a trivial "find the <div id='viewer'> element" (//div[@id='viewer']). Granted, neither of the above ensures that it does not match more than one element (the second will match at most one in a valid document, where ids are unique), but for my purposes, that was not relevant.
And, better still, it gave birth to a little scriptlet for trying out XPath expressions on a web page, flashing the first element matched (or bringing up the expression again with an error message if it failed to find the node, or if you wrote some malformed XPath). For those of you familiar with the Firefox Document Inspector, the behaviour will be familiar. Something like this ought to go into its "find" mode, by the way; I place this code in the public domain, should anyone want to submit it upstream.
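For the curious, the idea behind such a scriptlet can be sketched roughly as follows. This is a minimal illustration, not the bookmarklet itself: the function names firstMatch, flash and tryXPath are made up for this example, the document and the prompt function are passed in as parameters so the pieces can be tried outside a browser, and the constant 9 is the numeric value of XPathResult.FIRST_ORDERED_NODE_TYPE.

```javascript
// Hypothetical helper: return the first node matching an XPath expression,
// or null when nothing matches. A malformed expression makes evaluate() throw.
function firstMatch(doc, xpath) {
  var FIRST_ORDERED_NODE_TYPE = 9; // value of XPathResult.FIRST_ORDERED_NODE_TYPE
  var result = doc.evaluate(xpath, doc, null, FIRST_ORDERED_NODE_TYPE, null);
  return result.singleNodeValue;
}

// Hypothetical helper: blink a node by toggling its visibility, then
// restore whatever inline visibility it had before. `schedule` defaults
// to setTimeout; it is a parameter only to make the sketch easy to test.
function flash(node, blinks, delay, schedule) {
  schedule = schedule || setTimeout;
  var original = node.style.visibility;
  var togglesLeft = blinks * 2;
  function toggle() {
    if (togglesLeft === 0) {
      node.style.visibility = original;
      return;
    }
    node.style.visibility = togglesLeft % 2 === 0 ? "hidden" : "visible";
    togglesLeft -= 1;
    schedule(toggle, delay);
  }
  toggle();
}

// Hypothetical driver: keep asking for an expression until one matches,
// showing an error message along with the previous attempt otherwise.
// `ask(message, defaultText)` is assumed to behave like window.prompt.
function tryXPath(doc, ask, suggestion) {
  var message = "XPath expression:";
  var xpath = suggestion;
  for (;;) {
    xpath = ask(message, xpath);
    if (xpath === null || xpath === "") return null; // user cancelled
    try {
      var node = firstMatch(doc, xpath);
      if (node) return node;
      message = "No node matched. XPath expression:";
    } catch (e) {
      message = "Malformed XPath (" + e.message + "). XPath expression:";
    }
  }
}
```

In a browser, one would call tryXPath with document, a small wrapper around prompt, and a starting suggestion, and then hand any node it returns to flash with, say, three blinks and a delay of 300 milliseconds.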
By default, it suggests an expression matching divs with an empty class attribute, a common starting point for the kind of things I usually look for; typically I add "post-body" or some other name between the apostrophes. I suggest typing in the expression you want and copying it to your clipboard (Ctrl+C), so you can reuse it once you have seen that it does what you want. Go ahead and try that on this page, if you like; it should flash the text body of this post a few times.
To go in the other direction, that is, to find an XPath for a particular section of an unfamiliar page (assuming you are familiar with XPath syntax and workings), I again (read my prior post about it) warmly recommend Aardvark, a great extension which, once invoked, shows node names, classes and ids of things hovered by the mouse. You can even walk up through page structure by repeatedly tapping W (to widen scope). Immensely useful. And, for tough nuts such as Joel, there is the Document Inspector, for looking at the full exact node structure of the surroundings of a node.
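The same walk up through the page structure can also be done by hand. The sketch below is not what Aardvark or the Document Inspector do internally; it is just one possible way to derive an XPath for a given element. The function name xpathFor is made up for this example, and it assumes a plain HTML document (no namespaces) and ids that are unique and contain no apostrophes.

```javascript
// Hypothetical helper: build an XPath leading to the given element,
// starting from the nearest ancestor with an id (assumed unique) and
// otherwise from the document root, using 1-based position indexes.
function xpathFor(element) {
  var steps = [];
  // nodeType 1 is an element; the walk stops at the document node.
  for (var node = element; node && node.nodeType === 1; node = node.parentNode) {
    var name = node.nodeName.toLowerCase();
    if (node.id) {
      // An id should identify a single element, so the path can start here.
      steps.unshift("//" + name + "[@id='" + node.id + "']");
      return steps.join("/");
    }
    // Count preceding element siblings of the same name for the position.
    var position = 1;
    for (var sib = node.previousSibling; sib; sib = sib.previousSibling) {
      if (sib.nodeType === 1 && sib.nodeName === node.nodeName) position += 1;
    }
    steps.unshift(name + "[" + position + "]");
  }
  return "/" + steps.join("/");
}
```

The paths this produces are exact but brittle, since a position index breaks as soon as the page gains or loses a sibling; an expression written by hand around a class name, as in the examples earlier, tends to survive page changes better.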
Having been made the seventh druid of the hoodwink society, I'm starting to feel right at home. I'll be back shortly with some additional thoughts about permalinks, which I feel deserve a whole article of their own. Permalinks, the remnants of the original idea of the URL, are a very important technology that we ought to pay much more attention to, and teach every new generation of people coming to the web about. But I will save that discussion for later.
By the way, if you find some article or tool of mine really useful, I would very much appreciate a small tip, say a dollar, for my work on it. Try giving my donation pane a spin; I try to keep it unobtrusive and out of the way so it does not disturb the readability of my posts. Don't feel obliged to, but it would encourage me to keep doing the kind of things others (besides myself) find value in.
Another great tool (I'm sure you've got that one installed) is the Web Developer Extension. The "CSS" → "View Style Information" also displays the DOM path for the hovered element.
But most of the time I use the good old DOM Inspector (Ctrl-Shift-I). I just can't work without the "Find a node to inspect by clicking it" button.
Yes, that (it's on Ctrl+Shift+Y) is useful too, but, as you say, it's hard to beat the inspector. What I don't like about the "CSS" → "View Style Information" visualizer is that it does not give any feedback about the size and span of the element hovered, which Aardvark (and Platypus, though I believe Platypus doesn't show any info about node ids and classes) does.
Nice bookmarklet! I don't use this stuff much, but that's primarily because it's not so easy to hack with as some things are -- this will make it easier :)
Do you know if there is a Javascript console for Internet Explorer? I was about to look at Script Debugger. Any experience with it?
How did you find out you were the seventh?
Quite unscientifically; just after my acquiring druidry status, I counted how many others had the shillings for it in the Druid Standings list, and found six, myself not included.