Feed readers are lame and URLs are forever

— January 26, 2008 at 11:54 PST

When I changed my blog software from Typo to Mephisto, my article and feed URLs all changed. I didn't want to break all those old URLs, so I put a lot of effort into writing a ton of Apache mod_rewrite rules to redirect the old URLs to the new ones. I set them up as 301 permanent redirects to indicate that the old URL was defunct and to use the new one from then on. That means that the feed reader is supposed to change the URL for the feed permanently. But feed readers are lame and many treat the redirect as a temporary change, so I'm still getting requests for feed URLs that haven't existed in over a year.
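To make the expected behavior concrete, here is a minimal Ruby sketch of how a feed reader might treat redirect status codes. The helper name, the shape of its return value, and the example.com URLs are all hypothetical; this is not taken from any real feed reader, only an illustration of "permanent means update the stored URL".

```ruby
# Hypothetical helper: decide which URL to fetch now and which URL to keep
# in the subscription list, given the status code and Location header of
# the last response.
PERMANENT_REDIRECTS = [301, 308].freeze
TEMPORARY_REDIRECTS = [302, 303, 307].freeze

def resolve_feed_url(stored_url, status, location)
  if PERMANENT_REDIRECTS.include?(status) && location
    # Permanent move: follow it and remember the new URL from now on.
    { :fetch => location, :store => location }
  elsif TEMPORARY_REDIRECTS.include?(status) && location
    # Temporary move: follow it this time, but keep asking the old URL.
    { :fetch => location, :store => stored_url }
  else
    # Anything else: leave the subscription alone.
    { :fetch => stored_url, :store => stored_url }
  end
end

result = resolve_feed_url("http://example.com/old.xml", 301,
                          "http://example.com/new.atom")
# result[:store] now holds the new URL, so the old one is never requested again.
```

A reader that always takes the temporary branch is the behavior described above: it keeps following the redirect on every poll and never updates the subscription.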

If you're reading this article in your browser because your feed broke and you came here to see what happened, that means it's time to update the URL for your feeds. I'm sorry, but your feed reader has been told every hour for the last year that the URL has moved to a new location forever, and it chose to ignore that fact. I'm not going to keep supporting those old URLs. Here are the new URLs so you can manually update your feeds:

The main feed is hosted on FeedBurner: http://feeds.feedburner.com/hasmanythrough (and has been for the last year).

Categories have disappeared and articles are now organized only by tags. Old categories are now just tags. Tag feed URLs look like: http://blog.hasmanythrough.com/tag/rails.atom

I haven't set up comment feeds yet, but they should be along in a week or two at most.

Already I can see my FeedBurner subscription count has dropped significantly. I expect some of those losses are permanent, but I'm not going to cry about it. It's a good lesson though: URLs are forever. I like to keep that in mind as I design the URL scheme for a web app, and I like to have a scheme for URLs that isn't tightly coupled to the underlying model representation or database internals (like primary keys). Rails makes it so darn easy to expose the primary key of a record in the URL that nearly every Rails app in existence does just that. I've done it myself fairly often, but I strive not to do so in an app that's going to have a long life in the wild of the Internet. Teldra has no public routes that use the primary key.

If you want to avoid exposing primary keys in URLs, you should watch out for default routes and for map.resources calls. Anything that has an :id param in the route is going to get you. The easy way around that is to override the #to_param method for your ActiveRecord model to change the id from the primary key to something more meaningful. For example, here's the method from Teldra's Tag class:

def to_param
  name.gsub(/ /, '+')
end

So the tag "ruby on rails" is identified by the URL /tag/ruby+on+rails. Then you have to deal with the other side of things in the controller. Instead of doing the usual @tag = Tag.find(params[:id]), you want to find the tag using its name. The simplest way is to define this method in class Tag:

def self.find_by_param!(name)
  record = find(:first, :conditions => {:name => name})
  raise ActiveRecord::RecordNotFound if record.nil?
  record
end

Then in the controller do @tag = Tag.find_by_param!(params[:id]). Or you could fix your route to use :name instead of :id to make it even more clear. I like raising the RecordNotFound error so that Rails will give you a 404 page if you try to access a tag that doesn't exist. The exclamation mark at the end of the method name says we're raising an error sometimes, to distinguish from normal dynamic finder methods that return nil if no record is found.
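Here is a rough, Rails-free sketch of the round trip between to_param and find_by_param!, just to show the two halves fitting together. The in-memory STORE, the stand-in RecordNotFound class, and the explicit conversion of '+' back to spaces are assumptions made so the sketch runs on its own; the real methods above rely on ActiveRecord and on whatever Rails puts in params.

```ruby
# Stand-in for ActiveRecord::RecordNotFound so the sketch runs without Rails.
class RecordNotFound < StandardError; end

class Tag
  attr_reader :name

  # Hypothetical in-memory store standing in for the tags table.
  STORE = []

  def initialize(name)
    @name = name
    STORE << self
  end

  # Same idea as the to_param above: the URL segment is built from the name.
  def to_param
    name.gsub(/ /, '+')
  end

  # Assumption: the param arrives exactly as to_param produced it, so the
  # sketch turns '+' back into spaces before looking the name up.
  def self.find_by_param!(param)
    wanted = param.tr('+', ' ')
    record = STORE.find { |tag| tag.name == wanted }
    raise RecordNotFound, "no tag for #{param.inspect}" if record.nil?
    record
  end
end

tag   = Tag.new("ruby on rails")
path  = "/tag/#{tag.to_param}"                    # => "/tag/ruby+on+rails"
found = Tag.find_by_param!(path.split('/').last)  # the same Tag object
```

One caveat with this scheme: a tag whose name contains a literal '+' would not survive the round trip, so the name column needs to exclude that character or use a different separator.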

I think there's probably a little plugin that could be built around this pattern that would make it even easier. (makes a note)

By the way, I make it a practice to remove the default routes from routes.rb right away. I think they are a crutch and a cause of messy application design, and also a potential security hole or at least a source of unexpected behavior. If you want to have a firm grasp on the URLs of your application, get rid of the default routes and use only map.resource(s) and specific named routes for everything else.

Tags: rails, routes, teldra

7 comments

  1. Nate Klaiber, 2008-01-26 14:50:00

    " If you want to have a firm grasp on the URLs of your application, get rid of the default routes and use only map.resource(s) and specific named routes for everything else."

    This is an excellent tip, one that every beginner should follow. I know the defaults help you get up and running, but once you learn your way around - there is no reason to keep them in there.

  2. Ben, 2008-01-26 17:34:48

    I actually wrote a plugin (well, a patch for Rails that was rejected for being too magical) a while back that helps with part of this: http://agilewebdevelopment.com/plugins/magic_routing

    Basically, it lets you declare a route like this: map.tag '/tag/:slug', :controller => 'tags', :action => 'show'

    Then, you can call tag_path(tag) and - assuming your Tag model has a slug method - it'll automatically pull in that instead of the id. In other words, you don't have to resort to updating to_param and putting something that's not an ID in params[:id].

    Unfortunately, like so many plugins this one's been almost abandoned - I don't think it's even been tested with Rails 2.

  3. Luke Francl, 2008-01-26 18:51:01

    Good tips. I always delete the default routes as one of the first things I do.

    There's a couple plugins out there to help with this. PermalinkFu is one. I wrote something similar called url_key which will generate model-unique URL slugs so if you have multiple things called "foo", the first gets "foo" and the next get "foo-1", "foo-2", etc.

  4. rick, 2008-01-26 21:39:11

    permalink_fu does that too... I've thought about adding #find_by_permalink also, since I tend to implement that in every app now.

  5. traveler, 2008-01-27 00:43:50

    Honoring 301s automatically in a client-based feedreader is a bug. Reason: There are WiFi captive portals set up by greedy hotels and airports that redirect to the portal using a 301. (Yes, that is the actual bug -- I couldn't believe NNW was changing my feeds until I saw those 301s...)

  6. Josh Susser, 2008-01-27 08:11:19

    @traveler: The captive portal is misusing the 301 status code. But rather than ignoring 301s entirely, a client could prompt the user and ask what to do, have a global setting for handling 301s, or even notice that all feeds were being redirected and be smart enough to realize that something wonky was going on. Not honoring 301s at all is throwing the baby out with the bath water.

  7. W. Andrew Loe III, 2008-02-06 20:24:59

    Prompted by your comment, I did a search and actually found friendly_id to be the most comprehensive. http://randomba.org/articles/2008/01/18/friendlyid

    It does require you to track another column, but this prevents collisions, and they nicely support 301'ing old slugs until you expire them.
