2006-01-02

Backing up your Blogger blogs

In response to a question from Hans Persson today on how to back up all of one's Blogger blog posts, I did some quick research into the topic, after recalling something I had seen about the Blogger Atom API.

Apparently, these things are not as obvious as one might hope, as even Blogger themselves recommend backing up your posts by changing your template to show all posts, republishing, saving the result somewhere, changing the template back and republishing again.

This is of course every bit as silly as it sounds, and far from the easiest solution at that. Instead, I recommend doing it like this:

  • Go to https://www.blogger.com/atom/ and type in your Blogger login information. This yields a very brief, to-the-point chunk of XML that at least Mozilla and Internet Explorer will render quite readably. It will look a little like this, with two almost identical rows for every blog you own:

    <feed>
      <userid>8852417</userid>
      <link href="https://www.blogger.com/atom/19000527" rel="service.post" title="Some assembly required" type="application/atom+xml"/>
      <link href="https://www.blogger.com/atom/19000527" rel="service.feed" title="Some assembly required" type="application/atom+xml"/>
      <link href="https://www.blogger.com/atom/15626356" rel="service.post" title="ecmanaut" type="application/atom+xml"/>
      <link href="https://www.blogger.com/atom/15626356" rel="service.feed" title="ecmanaut" type="application/atom+xml"/>
    </feed>

  • Mark and copy the URL in either href attribute (they are the same) of the blog you want to back up, and paste it into your browser's address field.

  • Save the page. Done!
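
For those who would rather script the second step than copy and paste, here is a minimal Python sketch that pulls the feed URL of each blog out of the XML shown above. The function name `list_blog_feeds` is my own invention, the sample is simply the listing from this post, and fetching the page (with your login) is deliberately left out, since the details of that are not covered here. The real response may carry an XML namespace, so the sketch ignores namespaces when matching tag names.

```python
import xml.etree.ElementTree as ET

# The listing from the post, used here as stand-in input.
SAMPLE = """<feed>
  <userid>8852417</userid>
  <link href="https://www.blogger.com/atom/19000527" rel="service.post" title="Some assembly required" type="application/atom+xml"/>
  <link href="https://www.blogger.com/atom/19000527" rel="service.feed" title="Some assembly required" type="application/atom+xml"/>
  <link href="https://www.blogger.com/atom/15626356" rel="service.post" title="ecmanaut" type="application/atom+xml"/>
  <link href="https://www.blogger.com/atom/15626356" rel="service.feed" title="ecmanaut" type="application/atom+xml"/>
</feed>"""


def list_blog_feeds(service_xml: str) -> dict[str, str]:
    """Map each blog title to the href of its rel="service.feed" link."""
    root = ET.fromstring(service_xml)
    feeds = {}
    for element in root.iter():
        # Drop any "{namespace}" prefix so this works with or without xmlns.
        tag = element.tag.rsplit("}", 1)[-1]
        if tag == "link" and element.get("rel") == "service.feed":
            feeds[element.get("title")] = element.get("href")
    return feeds


if __name__ == "__main__":
    for title, href in list_blog_feeds(SAMPLE).items():
        print(f"{title}: {href}")
```

Since both rows for a blog point at the same URL, picking the `service.feed` row rather than the `service.post` row is only a matter of taste.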

This will save the last 100 posts (the Blogger docs suggest 15, but are fortunately wrong there), published and drafts alike, along with creation times and other state for each post, in Blogger's slightly extended variant of the Atom format. To make a complete backup, you may also want to browse around the blog's settings pages, saving the blog template and perhaps a copy of each Settings page, so you can see how you had your date formats and whatnot set up at the time of the backup. I warmly recommend storing all of this information in separate directories by date and blog name.

The more ambitious of you may of course set up a cron job on some trusted machine that does incremental backups, picking up the latest 100 entries of the blog at regular intervals, storing them somewhere for safe-keeping. Should anyone have a better way of also picking up and storing blog configuration at large, your input is very welcome.
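
As a rough illustration of the "separate directories by date and blog name" layout, here is a small Python sketch that files an already downloaded feed away. Everything in it is an assumption of mine rather than anything Blogger prescribes: the names `backup_path` and `save_backup`, the `posts.atom.xml` file name, and the date-then-blog directory order. Downloading the feed itself is not shown.

```python
import datetime
import pathlib
import re


def backup_path(root, blog_name, day):
    """Build <root>/<YYYY-MM-DD>/<blog name>/posts.atom.xml (hypothetical layout)."""
    # Replace characters that are awkward in directory names with underscores.
    safe_name = re.sub(r"[^A-Za-z0-9._-]+", "_", blog_name).strip("_") or "blog"
    return pathlib.Path(root) / day.isoformat() / safe_name / "posts.atom.xml"


def save_backup(feed_xml: bytes, root, blog_name, day=None):
    """Write the raw feed bytes to the dated location and return the path."""
    day = day or datetime.date.today()
    target = backup_path(root, blog_name, day)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(feed_xml)
    return target


if __name__ == "__main__":
    # Placeholder content; a real run would pass the downloaded feed instead.
    print(save_backup(b"<feed/>", "backups", "Some assembly required"))
```

A nightly cron job could call something like `save_backup` once per blog. Running it twice on the same day simply overwrites that day's copy.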

14 comments:

  1. I don't think this works for larger blogs. My atom page apparently only goes back to last September. Perhaps there is a size limit to the output file?

  2. Indeed, it only lists the latest 100 posts, and this post (counting a few uncompleted drafts) is number 101, just pushing my first post off the page.

    Still, it is a somewhat useful default strategy for doing incremental post backups.

  3. Thank you indeed for this reply (received in another forum). I've just finished a very small cron job that fetches this page nightly for my blog and checks the result into CVS. I had just passed 90 posts, so it was about time...

  4. You just made my life easier. I am currently using the HTTrack method, which is a little cumbersome. This seems much much easier, and I'll just have to remember to do it often (which I do anyway).

  5. Hmm. I wonder if they changed the number of posts available recently. I'm pretty sure that when I messed with it before it did only return the last 15 posts as per the documentation.

    Nice that it returns 100 now.

  6. Hello! My name is Robert Gillis and I am new to Blogger (about a month now) but have set up several large blogs already. My question is this: once I have saved the XML file generated by your solution (I pasted the href for the blog into IE), then what? Suppose I wanted to restore this blog someplace else, or suppose my Blogger files were deleted and I needed to restore them. I guess my question is, how do I restore the blog once I have this XML file? Many thanks! robertxgillis@aol.com

  7. Hi, Robert again... If I may ask another question, how do you back up your blogs, and restore them if necessary, or move them someplace else?

  8. Good questions, both of them. Not having had the need yet, I have addressed neither, but should I end up having to restore them to Blogger, I would craft a tool to process the XML backup and post it back via the Blogger API, a post at a time. If such a tool already exists (not at all unlikely given the size of the platform, though I have not looked for one yet), I would use that instead.

    As for migrating to a platform other than Blogger, that is a problem that will have very different solutions depending on your destination platform of choice. I believe I have read articles on migrating from Blogger to Wordpress and to Movable Type, though the specifics have probably not been about using this backup as input.

    Unfortunately, blog land is still not quite mature enough to have standard operating procedures for cross platform data migration yet; that is still on the horizon, or beyond.

  9. On this backup, do you mean to just save it as a favorite, or copy and paste into Word?

  10. You'll want to save the data itself, as the URL is not a permalink. So, given how the question is phrased: copy and paste.

  11. i tried the method listed on blogger just once but it terrified me so much that i never did it again!

    yours worked like a charm. thank you very much. I wish there was a way around the 100 limit. luckily, at the moment, i have just one blog that goes over this limit.

  12. Thought this might be a place to introduce a new blog backup find that I have yet to use myself, but find intriguing, especially the PDF generation part. I have non-computer relatives that might be interested in my kids' blogs:
    http://asprise.com/product/blogcollector/index.php/
    Paying for backup depends on how much you value the content of your blog...

  13. If you have more than 100 articles on Blogspot, there's a bash script for backing up blogs:

    http://amalgamadeletras.blogspot.com/2006/06/hacer-un-backup-de-los-artculos-de.html

    This script for Linux uses bash, wget and bzip2, and you only need to change 4 variables to adjust it to your own blog.

    HTM

  14. I'll have to try the Atom method sometime. My main problem was extracting the raw data from my Blogger blog to store in a database, so I could add and retrieve it with a PHP script I'm writing.
    My Blogger blog is on my own domain anyway, so it's easy for me to download it every so often (when I remember!).
    I created a 'skeleton' template that contained just the bits I needed, uploaded that to Blogger, republished and downloaded (phew!).
    I was able to write a PHP script to go through all the archive directories (e.g. 2006/08/ etc.) and parse the HTML files for the important data using regular expressions (that was a bloody nightmare in itself lol).
    Blogger really should give you the option to download your blog as a .CSV or some other format.
    Thanks for the interesting info.

