validates_numericality_of :resource_name

— July 1, 2006 at 10:05 PDT


And now, more discussion of DHH's World of Resources ideas...

From what I saw at RailsConf, the reaction to the unveiling of ActiveResource was quite mixed. The two big concerns were, "is CRUD a good fit for my model/controller design?" and "should I use human-readable names for resources, or numeric IDs?". I've seen a fair bit of discussion on various blogs about the former, but none at all on the latter.

Some context: In his by now infamous keynote, DHH said that simple numeric IDs were the best way to name resources in URLs. This is analogous to ActiveRecord's standard of using auto-increment integers for primary keys instead of whatever natural key the data model may imply. For example:

natural: /employees/josh
numeric: /employees/13

DHH made the point that naming resources using what is effectively a portion of their contents is not a reliably permanent name. The city of Sunset, Florida changed its name to Sunrise (not surprising, given the local demographics) - should that make all URLs to "/cities/Sunset" invalid?

First off, let me say that if resource URLs are meant for consumption by another program via a RESTful web service, then having human-readable URLs is a non-issue. Machines are totally fine with numbers. However, low-level naming schemes have a tendency to bubble up to where the user can see them. Admit it: you've written at least one app in Rails that used primary keys in the URLs. Don't your users just love seeing their profiles displayed with the friendly URL "/users/show/13"?

I keep running into this issue myself, not only in software design, but even in how I choose permalink names for articles in this blog. It's nice to have a human-readable URL to paste into an email, as it gives a reader an idea of what the referenced article is about, and might make it easier to find a link in a list of bookmarks. But I also worry about the longevity of links, and managing identity equivalence. Bookmark services like del.icio.us don't equate two different URLs to the same page, so http://blog.hasmanythrough.com/articles/2006/06/30/cruddy-searches and http://blog.hasmanythrough.com/articles/read/375 will be counted separately. This may not seem like a big deal, but it not only affects the popularity ranking of a page, it also messes with finding areas of overlapping interest with other users. I wonder if the folks at bookmarking services are thinking about this issue; I assume Google is.

So there seem to be three issues to consider for URL names of resources:

  1. Readability by humans
  2. Longevity (permanence)
  3. Identity comparisons

I talked this issue over with a bunch of people at RailsConf, and the tentative consensus that emerged was this: Use numeric IDs for resource permalinks. Allow human-readable (or actually, human-writable) URLs as an alias or heuristic search for an entity, but do a permanent redirect (301) to the numeric permalink once the entity was found.
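
The consensus above can be sketched in a few lines of plain Ruby. This is only an illustration under stated assumptions, not anything from ActiveResource: the `EMPLOYEES` table and the `resolve_employee` helper are hypothetical, and a real app would query its database and issue the redirect from a controller.

```ruby
# Hypothetical in-memory data; a real app would query its database.
EMPLOYEES = {
  13 => { name: "josh" },
  14 => { name: "dave" }
}.freeze

# Resolve the last path segment of /employees/<segment>.
# Returns [status, record_or_location].
def resolve_employee(segment)
  if segment.match?(/\A\d+\z/)
    # Numeric permalink: serve the record directly, or 404 if unknown.
    record = EMPLOYEES[segment.to_i]
    return record ? [200, record] : [404, nil]
  end

  # Human-writable alias: look up by name (case-insensitive),
  # then answer with a permanent redirect to the numeric permalink.
  id, _record = EMPLOYEES.find { |_id, rec| rec[:name].casecmp?(segment) }
  id ? [301, "/employees/#{id}"] : [404, nil]
end
```

With this shape, `/employees/13` is the only URL that ever returns content, so bookmarking services and search engines see a single identity, while `/employees/josh` remains something a person can type.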

Looks good on paper, but I want to do some usability testing to see how it flies.

20 comments | rails

Comments
  1. Arpan, 2006-07-01 11:17:07

    How about doing it this way:

    /employees/13/josh/

    The program can just use the id that is given and ignore the name. The name can be just for display purposes.
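
A minimal sketch of Arpan's suggestion, assuming the `/employees/<id>/<name>/` layout shown in the comment. The helper name `employee_id_from_path` is made up for illustration; the point is only that the trailing name is decorative and never consulted for lookup.

```ruby
# Extract the numeric id from a path like /employees/13/josh/.
# The name segment is optional and is ignored entirely.
def employee_id_from_path(path)
  match = path.match(%r{\A/employees/(\d+)(?:/[^/]*)?/?\z})
  match && match[1].to_i
end

employee_id_from_path("/employees/13/josh/")   # id 13, name ignored
employee_id_from_path("/employees/13/joshua")  # still id 13 after a rename
employee_id_from_path("/employees/josh")       # nil, no numeric id present
```

Because the name is ignored, a renamed resource keeps working under its old URLs, at the cost of several URLs mapping to the same page unless the app also redirects to one canonical form.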

  2. Dr Nic, 2006-07-01 11:43:23

    "Allow human-readable (or actually, human-writable) URLs as an alias or heuristic search for an entity, but do a permanent redirect (301) to the numeric permalink once the entity was found."

    It would be nice then for the ActiveResource mechanism to support either auto-generation of readable names or a declaration on the AR such as:

    uniq_name :firstname, :lastname

    to generate a unique name key based upon one or more fields of the resource.

    It would also be nice for the AR to support auto-redirect when the unique name key.

    This is of course easier to write about than implement, but I think it would be a killer trick to make us fall in love with ARs.

    Nic
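
One way the `uniq_name` declaration Dr Nic imagines might look, as a plain Ruby sketch. Nothing here is an ActiveResource or ActiveRecord API: the `UniqName` module, the `name_key` method, and the `Employee` struct are all hypothetical, and the sketch only derives the key. Enforcing uniqueness and redirecting would still be up to the application.

```ruby
# Hypothetical mixin; not part of ActiveResource or ActiveRecord.
module UniqName
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Declare which attributes make up the readable name key.
    def uniq_name(*fields)
      @uniq_name_fields = fields
    end

    def uniq_name_fields
      @uniq_name_fields || []
    end
  end

  # Build the key: join the declared fields, lowercase them, and
  # replace runs of non-alphanumeric characters with a dash.
  def name_key
    raw = self.class.uniq_name_fields.map { |f| public_send(f).to_s }.join(" ")
    raw.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
  end
end

Employee = Struct.new(:id, :firstname, :lastname) do
  include UniqName
  uniq_name :firstname, :lastname
end

Employee.new(13, "Josh", "Smith").name_key  # "josh-smith"
```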

  3. J. Weir, 2006-07-01 11:48:08

    What is the occasion for the URL?

    Is this a URL the user is going to memorize?

    Then a word based url might be best.

    http://myspace.com/imhot

    Is this a URL which a user is not going to memorize?

    http://www.iht.com/article/128218

    (That is how it used to work, but they got silly.)

    Numbers ARE human readable, thanks to the Muslims. Numbers are very easy to write and verbalize.

    Which would you rather read over the phone:

    http://blog.hasmanythrough.com/articles/2006/06/30/cruddy-searches

    or

    http://blog.hasmanythrough.com/articles/read/375

    I use the 'over-the-phone' method for URLs. If you can't read it over the phone to your Chinese partner, who speaks only a little English, then it might not be a good URL.

  4. scott, 2006-07-01 12:01:46

    I realize that this is somewhat orthogonal to what you're talking about, but aren't there potential security issues around exposing the auto-increment keys in their naked form? Encrypting or encoding the primary key for presentation to the outside world might keep troublemakers from simply looping over your entire database and doing massive damage in the event that somebody discovers a security hole in Rails that makes your database records vulnerable. Plan for the worst, hope for the best! - Scott
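
Scott's idea can be approximated without encrypting anything: append a keyed hash to the id so that valid public identifiers cannot be produced by counting. This is a rough sketch under assumptions, not a recommendation. `SECRET`, `public_id`, and `internal_id` are invented names, the string comparison is not constant-time, and none of this replaces proper authorization checks.

```ruby
require "openssl"

# Assumption: a server-side secret; in practice load it from configuration.
SECRET = "change-me"

# Append a short HMAC tag so valid public ids cannot be guessed by counting.
def public_id(id)
  digest = OpenSSL::Digest.new("SHA256")
  tag = OpenSSL::HMAC.hexdigest(digest, SECRET, id.to_s)[0, 16]
  "#{id}-#{tag}"
end

# Return the integer id if the tag checks out, otherwise nil.
def internal_id(token)
  id, tag = token.split("-", 2)
  return nil unless id&.match?(/\A\d+\z/) && tag
  public_id(id) == token ? id.to_i : nil
end
```

Note that the numeric id is still visible in the token; the tag only makes it hard to forge a URL for a neighbouring record.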

  5. Drew, 2006-07-01 12:23:31

    Is '/articles/2006/06/30/cruddy-searches' parsable by rails? I thought the only possible format was 'model/action/id?parameters'. I would love to know how to implement something that will handle it other ways.

  6. Arthur, 2006-07-01 13:27:54

    look at the routing configuration, there you can do things like :controller/:year/:month/:day/:title and many more.
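
To show the idea behind a pattern like the one Arthur mentions, here is a plain Ruby sketch that turns `:placeholders` into named captures. It is not how Rails implements routing, and `compile_route` and `recognize` are hypothetical helpers written for this illustration.

```ruby
# Turn a pattern such as ":controller/:year/:month/:day/:title" into a
# regex with one named capture group per :placeholder.
def compile_route(pattern)
  source = pattern.split("/").map do |part|
    part.start_with?(":") ? "(?<#{part[1..-1]}>[^/]+)" : Regexp.escape(part)
  end.join("/")
  Regexp.new("\\A/?#{source}/?\\z")
end

# Return a hash of parameters, or nil when the path does not match.
def recognize(pattern, path)
  match = compile_route(pattern).match(path)
  match && match.named_captures
end

params = recognize(":controller/:year/:month/:day/:title",
                   "/articles/2006/06/30/cruddy-searches")
# params["controller"] is "articles", params["title"] is "cruddy-searches"
```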

  7. Drew, 2006-07-01 14:48:31

    I've never even seen that part of rails before! It's great, thanks!

  8. Greg, 2006-07-01 18:52:13

    I wonder what effect having a human-readable URL would have on search engine optimization, if any. That's always been a debatable subject.

    One way I can think of to handle a human-readable change like Sunset/Sunrise is to add a config value or route for any changes.

    I like human-readable URLs for certain cases. Take a large directory, say a statewide business directory. Visitors might get used to a human-readable URL for finding information, for instance /directory/17050 for a zip code or /directory/townname for a town. Changing a zip code (which happens) would require a database update and/or could be handled by a fairly quick rule in a config file. The rule in a config file might help any old links to the page while the database update handles everything else.

    It would also be nice to have human readable urls that could handle user typos. /directory/harsburg might find /directory/harrisburg instead and alert the user.

    This is an interesting discussion; maybe one we should ponder over a beer some night.
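
Greg's typo-tolerant lookup (harsburg finding harrisburg) could be sketched with a standard edit-distance calculation. The `TOWNS` list, the `suggest_town` helper, and the threshold of two edits are assumptions made for this example; a real directory would likely want a smarter index than a linear scan.

```ruby
# Classic dynamic-programming edit distance (Levenshtein).
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Hypothetical list of known town slugs.
TOWNS = %w[harrisburg hershey lancaster].freeze

# Return the closest known town within max_distance edits, or nil.
def suggest_town(input, max_distance = 2)
  needle = input.downcase
  best = TOWNS.min_by { |town| edit_distance(needle, town) }
  edit_distance(needle, best) <= max_distance ? best : nil
end
```

Here `suggest_town("harsburg")` finds "harrisburg" because the two differ by two inserted letters; the app could then redirect or show a "did you mean" notice.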

  9. Dave Teare, 2006-07-01 19:06:15

    At the end of the day, most users are never going to type the URL, so don't worry much about them. As for machines talking to machines, I don't care about them either - give them a meaningless id.

    What I *do* care about greatly is the Google Ranking of my site. For a spider, the difference between 1passwd.com/guide/about/going_beyond_firefox_password_manager and 1passwd.com/guide/123 is HUGE.

    OTOH, I agree that composite keys really suck in terms of longevity. It is possible for me to rename the section title in my guide and thereby screw everything up. This is a current limitation in my User Guide code.

    I think this is where it's important to do as Arpan said:

    "/employees/13/josh/ - The program can just use the id that is given and ignore the name. The name can be just for display purposes."

    This allows you to have good SEO with proper maintainability / longevity too.

  10. chris, 2006-07-01 22:29:30

  11. Bill Katz, 2006-07-02 12:50:36

    Josh, "readability by humans" can also be a business decision. When I was choosing how to name URLs for writer profiles, I went with what I thought my friends would like.

    http://www.writertopia.com/profiles/BrandonSanderson vs http://www.writertopia.com/profiles/38

    The former helps author branding, even if it has "writertopia.com" in the URL. 37signals gave subdomains out based on user-chosen names, right?

    For Writertopia profiles, the URL should be long-lived. Actors and actresses have to choose a screen name from a recognized name space. It's heavily regulated. Writers, as far as I can tell, don't have one big union that enforces the name space, but they tend to use one pen name for a chosen market.

    I like the aliasing safety valve. Drupal has that built into their CMS. It'd be nice to extend routes into an efficient, easy aliasing system.

  12. Duncan Ponting, 2006-07-02 14:29:07

    What do people think about:

    http://blog.hasmanythrough.com/articles/2006/06/30/cruddy-searches-375

    I realise it doesn't fix the Sunrise problem.

  13. Christian Romney, 2006-07-02 19:57:16

    "At the end of the day, most users are never going to type the URL, so don't worry much about them."

    Too hand-wavy for me. I've seen people do crazy things with address bars in usability labs, so I don't really know that I buy this argument without hard data for the particular website in question (and there are HUGE differences from app to app).

  14. Pat, 2006-07-02 23:42:30

    I prefer human-readable URLs, but I never know how to handle a conflict. For example the URL for this post is http://blog.hasmanythrough.com/articles/2006/07/01/validatesnumericalityof-resourcename. Well what if you had decided to make another post with the same title back on July 1st? Is it just ofresource_name2? Not sure what the great benefit is between that and using the primary key somewhere in the URL.

  15. john, 2006-07-03 02:34:01

    Meaningful URLs are important to me. I look at URLs all the time for extra information. Maybe I'm crazy. I do it a lot when I'm googling, or staring at a list of forgotten bookmarks, and in many other situations. You guys don't look at URLs for information about where they point?

    For most 'public' resources in my app I've just added a 'uri' column. On create I generate the uri from some other meaningful string attribute, check that it's unique, slap a 2 or 3 etc. on the end of it if it's not unique, and that's that. I can change the 'title' or 'name' or whatever, but I don't mess with the uri again. That's just out of laziness though. I've been thinking about adding a kind of polymorphic URL graveyard table: id, resource id, resource table, dead uri, active uri, or something to that effect. I haven't thought that much about it really, because things seem to be good enough as they are.

    Works for my purposes anyway.
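
The scheme john describes (derive a uri once from a title, append 2, 3, and so on when there is a conflict) might look roughly like this. The helper names are hypothetical, and the `taken` list stands in for a query against the 'uri' column. In a real app a unique index on that column would still be needed to guard against two simultaneous creates.

```ruby
# Turn a title into a URL-safe slug.
def slugify(title)
  title.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
end

# Pick a slug not already in `taken`, appending 2, 3, ... on conflict.
def unique_slug(title, taken)
  base = slugify(title)
  candidate = base
  suffix = 2
  while taken.include?(candidate)
    candidate = "#{base}-#{suffix}"
    suffix += 1
  end
  candidate
end

unique_slug("Cruddy Searches", [])                   # "cruddy-searches"
unique_slug("Cruddy Searches", ["cruddy-searches"])  # "cruddy-searches-2"
```

Since the slug is generated once at creation and never regenerated, later edits to the title do not break existing links, which is the longevity property the post is after.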

    @J. Weir, I don't think you'll be registering a lot of purely numeric and random domain names so you can read them over the phone easily. Anyway numbers are fine, sure. You are right. But numbers that mean nothing outside of a database don't do anybody any good.

    My vote goes for identifying resources based on something describing the resource and not auto-generated primary database keys.

  16. Rimantas, 2006-07-04 11:15:35

    Of course there can be many people who disagree about the importance of keywords in URLs. However, when a guy working at Google says that it is better to use dashes for separating keywords in URLs, you've got to believe it...

  17. Aristotle Pagaltzis, 2006-07-04 23:18:47

    This is a solved problem. I discussed an airtight approach in Transparent opaque changeable permanent URLs.

  18. Peter, 2006-07-05 00:07:32

    @Aristotle: very nice solution but someone with a different balance of priorities might say the numeric part shouldn't be used in the URL at all.


    @TheWholeWorld:

    Rails is cool because if Victoria's Secret had a Rails site I could type

    http://www.victoriassecret.com/category/bras

    to see a glossy splash page. I could jump a step, directly type

    http://www.victoriassecret.com/category/bras/products

    and see thumbnail photos* of plenty of hot girls wearing small lacy clothes.

    Apparently, and for good reasons, DHH would rather I use a primary key which is a bit trickier to remember

    http://www.victoriassecret.com/category/5/products

    However, a bra can be in more than one category. I am really searching for category_product associations with category_id=5. With the CRUD system, which is very flexible, I will soon have to type

    http://www.victoriassecret.com/category_product?category_id=5

    This looks like something from a PHP site I wrote a year ago, and it seems to be getting a bit ridiculous.

    A website is a CMS that guides the user through the data in a friendly fashion. Perhaps we need to have the flexible CRUD based core actions, which are very cool, and then layer some friendly actions or routes on top, which are also very cool.

    * thankfully clickable and expandable

  19. Peter, 2006-07-05 14:47:34

    Upon reconsideration, perhaps we are looking for a view of the products aspect of category 5. So soon maybe we will be making a GET to

    http://www.victoriassecret.com/category/5;products

  20. Michael @ SEOG, 2006-07-25 15:01:28

    Human-readable URLs make a dent when it comes to search engine rankings. I work as a search engine manager for search engine marketing and search engine optimization and play with Ruby on Rails on the side.

    A few reasons it makes a difference:

    1. Weight in Search Results Directly

    Perhaps not as much in Google, but definitely in MSN, having the keyword in the page title can help the ranking of the page. It is a limited space, so whatever keywords you choose to put there are probably pretty important, and it looks like the search engines view it that way as well.

    2. Keyword Weight on Links

    A lot of people might link to a blog or post by just dropping the URL in and having it link that way. If you were writing about a company named "ZONGA!" and 100 blogs linked to you, and your page was blog.com/zonga-rules while a competing blog's was blog.com/3, generally speaking you will be found for zonga and the other person may not be found at all. That is because, to the search engines, all those blogs were saying "ZONGA" and pointing to you.

    3. Human Clickthrough Factor

    Relevant page titles and descriptions definitely have an impact on how a user clicks on a result in a search engine. URLs probably also have an effect, and it seems to me a user is more likely to click when the URL looks like something they are interested in than when it is just some random id number. Sometimes the excerpt picked by the search engines can be funky, so why not give your user as many chances as possible to find out what you are talking about?
