Self-referential has_many :through associations

— October 30, 2007 at 18:55 PDT


This article updates a previous version for the Rails 2.0 way of things. Since not much has changed, I took the opportunity to make the example code more understandable. After all, not everyone is a discrete math geek.

The only significant difference from the previous article's example is that you no longer need to specify the :foreign_key when using the :class_name option in a belongs_to association. In Rails 2.0, the key is inferred from the association name instead of the class name. I also included the :dependent option because I feel it's too often overlooked.

These classes could be used to model a food chain. Spider eats fly, bird eats spider, cat leaves bird on pillow as gift...

create_table :animals do |t|
  t.string :species
end
create_table :hunts do |t|
  t.integer :predator_id
  t.integer :prey_id
  t.integer :capture_percent
end

class Animal < ActiveRecord::Base
  has_many :pursuits,  :foreign_key => 'predator_id',
                       :class_name => 'Hunt',
                       :dependent => :destroy
  has_many :preys,     :through => :pursuits
  has_many :escapes,   :foreign_key => 'prey_id',
                       :class_name => 'Hunt',
                       :dependent => :destroy
  has_many :predators, :through => :escapes
end
class Hunt < ActiveRecord::Base
  belongs_to :predator, :class_name => "Animal"
  belongs_to :prey,     :class_name => "Animal"
end

The Hunt model describes how likely a species of predator is to catch a species of prey. From the predator's perspective the hunt is a pursuit, but the ever-hopeful prey sees it as an escape. Note that you can model both kinds of hunts between the same pairings of animals: Some days you get the bear, some days the bear gets you.
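
To make the two directions concrete, here is a rough plain-Ruby sketch of what the four associations return. It is only a stand-in: there is no ActiveRecord and no database, the shared hunts array plays the role of the hunts table, and the capture_percent values are invented for illustration.

```ruby
# Plain-Ruby stand-in for the models above (no ActiveRecord, no database).
# The shared `hunts` array plays the role of the hunts table.
Hunt = Struct.new(:predator, :prey, :capture_percent)

class Animal
  attr_reader :species

  def initialize(species, hunts)
    @species = species
    @hunts   = hunts
  end

  # Mirrors: has_many :pursuits, :foreign_key => 'predator_id'
  def pursuits
    @hunts.select { |hunt| hunt.predator.equal?(self) }
  end

  # Mirrors: has_many :preys, :through => :pursuits
  def preys
    pursuits.map(&:prey)
  end

  # Mirrors: has_many :escapes, :foreign_key => 'prey_id'
  def escapes
    @hunts.select { |hunt| hunt.prey.equal?(self) }
  end

  # Mirrors: has_many :predators, :through => :escapes
  def predators
    escapes.map(&:predator)
  end
end

hunts  = []
fly    = Animal.new('fly', hunts)
spider = Animal.new('spider', hunts)
bird   = Animal.new('bird', hunts)
human  = Animal.new('human', hunts)
bear   = Animal.new('bear', hunts)

# capture_percent values below are made up for the example.
hunts << Hunt.new(spider, fly, 80)
hunts << Hunt.new(bird, spider, 40)
hunts << Hunt.new(human, bear, 10)   # some days you get the bear...
hunts << Hunt.new(bear, human, 30)   # ...some days the bear gets you

puts spider.preys.map(&:species).inspect      # ["fly"]
puts spider.predators.map(&:species).inspect  # ["bird"]
puts human.preys.map(&:species).inspect       # ["bear"]
puts human.predators.map(&:species).inspect   # ["bear"]
```

The same Hunt row shows up as a pursuit for its predator and as an escape for its prey, which is why one join table can be walked in both directions.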

22 comments
associations, rails, update

Comments
  1. Rev. Dan, 2007-10-30 19:49:40

    Sorry this is about 99.5% unrelated, but I've never heard anybody say "Some days you get the bear, some days the bear gets you." That totally explains the joke in The Big Lebowski... "Some days you get the bar, some days the bar gets you." (Just say "bear" with a twangy western accent and voila.)

    I guess some things are so "obvious" that they're obtuse.

    :dependent => :destroy

    I'm still not quite clear how dependencies work in ActiveRecord... if'n you haven't already written more on the subject maybe you could be convinced to do so? :)

    Thanks for the great blog Josh.

  2. Dustin, 2007-10-30 20:44:28

    Cool, thanks for this Josh - I'm wondering what you think about using a separate model for things like pursuits?

    I've heard people talk about creating separate models for relationships. Like when you have a Person model, and if those people get together, they now have a new model - a Relationship.

    I really don't know, just wondering what your opinion is... thanks!

  3. Josh Susser, 2007-10-30 20:57:57

    Rev. Dan: The :dependent option tells the model that when it is destroyed, the contents of that association should be dealt with so they don't have a belongs_to referencing a missing record. You can deal with the contents by destroying them (which instantiates and runs callbacks on the content objects), deleting them (by zapping the records in the db), or nulling out the reference. That's about it.

    Dustin: That's exactly what that example does. The Hunt is the model that describes the relationship between predator and prey Animal objects.

  4. Edoardo Marcora, 2007-10-31 16:08:15

    Is there a way, in Rails 2.0, to create polymorphic self-referential has_many :through associations, without resorting to the has_many_polymorphs plugin?

    Thanx in advance for your help and consideration,

    Dado

  5. Josh Susser, 2007-10-31 17:31:26

    Edoardo: Yes there is a way to do polymorphism in hmt. I'll be updating my old blog post on that fairly soon, since Rails 2 has a new feature that makes that easier to do. (Historical note: has_many_polymorphs was a response to that post of mine on polymorphic hmt.)

  6. ArthurGeek, 2007-11-01 12:47:51

    In Rails 2.0 you can use:

    create_table :hunts do |t|
      t.references :predator
      t.references :prey
      t.integer :capture_percent
    end

    Read more about that, here: http://dev.rubyonrails.org/changeset/7973

  7. shiki, 2007-11-08 16:26:56

    Hey, I've modeled my user-friends by cloning exactly what you do in this example, with "made_friend_requests" and "received_friend_requests" in place of "pursuits" and "escapes". But in another example on the net (http://www.aldenta.com/2006/11/10/has_many-through-self-referential-example/) they don't use that first "fake" association; they "link" directly to the User class [in your case the Animal class]. Since the other post is a little bit old, I was wondering if it's still a valid way of doing things, and if so whether it has some advantage or disadvantage. Could you please throw some light on the difference between implementing it like that versus implementing it like you suggest? Or is it that your example is not applicable in the user-friends case? I'm more than a Ruby and Rails newbie and, well, I'm a bit confused. Thanks beforehand.

  8. Josh Susser, 2007-11-09 00:52:32

    shiki: I'm not sure what you mean by a "fake" association. But one thing to notice is that my example has bidirectional associations (meaning you can traverse the join model in both directions), but the example you cited only allows navigating a friendship from a person to its friends, and not from a friend to people who have it as a friend.

  9. shiki, 2007-11-13 04:34:05

    You're so right, I think I just saw it... thanks a lot for this very useful example!

  10. CelesteKing, 2007-11-16 15:03:37

    Hey mister associations geek, could you help me please with AR thingie ?

    A little abstraction: I have apples and oranges that belong_to juice. Juice has_many apples, oranges. Now, I want to SELECT apples.name, oranges.name FROM apples JOIN oranges ON (apples.juice_id = oranges.juice_id). How do I do this with less pain? The point here is to get all apples and oranges having an identical common object (Juice).

    "has_many :apples, :through => :juice " doesn't work for [almost] obvious reasons... :/

    Thanks!

  11. Francisco Velez, 2007-11-21 04:33:04

    OK. Thank you, it's a very good example; it's just what I was looking for. But now, how can I use this kind of relationship with something like acts_as_tree?

  12. Danimal, 2007-11-22 05:54:32

    Josh,

    Thanks for all of this! I only wish I'd remembered this example about 4 hours earlier... would've saved me 3.9 hours of "spinning wheels".

    :-)

  13. JoeC, 2007-11-24 21:35:53

    Thanks so much for this. I'd been spinning my wheels for a couple of days trying to figure out why I kept getting the "could not find association" error.

    Thanks for passing on the knowledge!

    JoeC

  14. MJ, 2007-11-27 01:21:16

    Thanks for the good article. But I'm a little worried about the performance. I think a lot of SQL queries will be sent to the DB. In your example, fly -> spider -> bird -> cat, it would send 3 queries to the DB, right? The application I'm working on has a similar problem, and it has to render lots of food-chain-like relations. So I'm worried the slight increase in traffic will generate a huge number of requests to the DB.

  15. Josh Susser, 2007-11-28 15:43:57

    MJ: You're right, but what would the usage model be for that? I think typically a user action would result in moving only one level up or down the food chain, so that would involve only one query. There are certainly cases where this structure would give horrible performance as you describe - fetching all the tree-structured nested comments of a blog post is a good example. For those situations, you want to go with a different data structure (e.g. nested set works well for tree-structured comments).

    Also, keep in mind that fewer queries may give you better performance, but that's not a guarantee. DB join operations are expensive too. Sometimes avoiding the big nasty multi-table joins by doing more than one query is a win. You have to check your own data to see how to tune your own application.

  16. MJ, 2007-11-28 21:43:33

    Thanks Josh,
    I'm working on an application that has to describe the whole tree, plus the number of children each node has, every time. It looks tough, but the good news is that the tree won't change its structure, except for adding leaf nodes. So I'm trying to change the DB model.
    I have several ideas, but none of them really solves the whole problem. After I find the DB model, I think I'll have to spend lots of time matching it with Rails, because I'm quite new to Ruby & Rails. ;-)

  17. Gabe da Silveira, 2007-12-12 23:31:24

    The performance hit for multiple joins is usually lower than the database connection latency in my experience, except when you start to join several tables where the amount of data increases at a geometric rate where adding extra queries would only give you a polynomial increase. ActiveRecord::Base.find's :include option currently just does one big join, but there is work in progress to provide an interface that will allow you to break down includes into separate queries that still avoid the n+1 problem but without the geometric data expansion problem.

    On another note, one thing I need for a current project (how I found this post) is true bidirectional associations. That is, a single association that fetches rows from both directions. In the given example it would be akin to a has_many :predators_and_prey where it would do the join with an OR and use both foreign key columns so you could get both at once. Normally you could just do this with the two associations separately, but in my case the association truly is transitive, and I want the results sorted by created_at, so I think my best bet is to just skip associations and write methods to query directly for what I need.

  18. Josh Susser, 2007-12-15 16:57:34

    Gabe: You're probably right about the performance impact of multiple joins. As Nick Kallen said to me yesterday when talking about this issue, "I dream of having my Rails app be fast enough that db performance became the issue." With small datasets, the db won't be your long tent pole. But with large tables (hello, Twitter!), watch out...

    I'm looking forward to seeing the prefetch/caching alternative to :include make it into Rails. Rick Olson's unreleased plugin for doing that sounds very interesting. I think a lot of times it makes sense to do 3 queries instead of 1 with an outer join that takes a lot of work for the db to create and then even more work for Rails to consume.

    The kind of bidirectional association you describe is interesting. It's what most people want for a sibling-type relationship. I think you might be able to fake it up with an association extension or some custom finder_sql code and still get the other benefits of associations, but yes, the SQL would look pretty different.

  19. Ian, 2007-12-18 22:34:54

    Thank you so much for documenting this, it was driving me nuts trying to figure it out on my own. Would it also be possible to do something like this, except with a has_one type relationship? As an example, my application is tracking servers and ports and connections. A server can have many ports (easy enough) but each port can only connect to another port once. I was thinking to either:

    1) create a connections table with two columns for port_id
    2) create a connected_to column on ports with the port_id of the matching port.

    The first case is awkward because I would have to scan both columns to find if a port is connected and manage all that (no duplicate entries in either column, etc). The second I'm not sure how to implement such that the Rails classes all work similarly to your example. Ideally, I'd be able to find the other side via something like server.ports.first.other_side. I can manually make this work with transactions and such (find both ports and set the other_side_id to the corresponding port) but it just seems that there's a slicker way to do this.

    Any advice? It would sure be appreciated..

  20. Nathan, 2007-12-19 05:28:39

    Thanks so much Josh. I've been trying to figure out how to do this exact thing for a project I've been meaning to start. You've done a great job at explaining a very abstract concept. Thanks a ton and keep it up!

  21. Tim, 2007-12-22 07:28:55

    This finally worked for me when I added explicit foreign keys to the "belongs_to" lines:

    belongs_to :male,   :class_name => 'Indiv', :foreign_key => 'male_id'
    belongs_to :female, :class_name => 'Indiv', :foreign_key => 'female_id'

  22. Josh Susser, 2007-12-24 06:18:09

    Tim: The example is for Rails 2. For Rails 1.2, you need to specify the foreign keys explicitly. You should refer to the previous article for Rails 1.2 usage.

Sorry, comments for this article are closed.