Why aren't join models proxy collections?

— April 17, 2006 at 21:26 PDT


Who hasn't created a has_many :through association and then tried to use << to add an object to the association? It seems like the obvious thing to do, but it doesn't work. Well, it works until you save the model and find that the new record in the join model didn't get saved into the database. It took me a bit of head scratching, but I finally figured out why it doesn't work, and what to do about it.

To understand the problem, you have to dig into how ActiveRecord implements the convenience methods for working with associated objects. If you don't have a thing for the inner workings of Rails you might want to skip ahead.

As usual, I'll start with an example so we have something concrete to discuss. Consider a many-to-many mapping that matches technicians with their certified skills.

class Certification < ActiveRecord::Base
  belongs_to :technician
  belongs_to :skill
end
class Technician < ActiveRecord::Base
  has_many :certifications
  has_many :skills, :through => :certifications
end
class Skill < ActiveRecord::Base
  has_many :certifications
  has_many :technicians, :through => :certifications
end

Given these models (and the implied database schema), it's easy to map technicians to skills by creating new certification objects.

pat = Technician.find(1)
pat.skills.size             # => 4
macosx = Skill.find(10)
cert = Certification.new(:technician => pat,
                         :skill => macosx)
cert.save
pat.skills.size             # => 4
pat.skills(true).size       # => 5

After the call to cert.save, the skills accessor on pat still returns the cached collection of skills, so we need to call it passing true to force a reload. All that's great, but what if we try the following?

pat.skills.size             # => 5
winxp = Skill.find(13)
pat.skills << winxp
pat.skills.size             # => 6
pat.save
pat.skills(true).size       # => 5

The above shows that Pat's WinXP certification didn't get saved. Perhaps Pat will have to take the training again. But what happened? The << method works fine for regular has_many associations, and for has_and_belongs_to_many too. Why doesn't it work for has_many :through associations?

If you look at the API documentation for the has_many method, you can see it adds a collection attribute for managing the objects in the association. That attribute behaves mostly like an Array of associated objects, but it includes some extra methods that make working with those objects more convenient. That old Rails magic can make your life much easier, but it can also be confusing. If you evaluate pat.skills.class, the result will be Array. So how does an Array know how to build() and create() associated objects, or have << fix up their foreign keys?

Let me pretend I'm Penn Jillette for a moment and explain the secret behind the magic trick: It's not really an Array. In actuality it's an instance of AssociationCollection, which is a subclass of AssociationProxy. These classes follow the well-known Proxy pattern. They front a target object, provide some extra behavior on top of that object, and delegate all other methods to the target object. AssociationProxy delegates to a single model object, as for has_one or belongs_to. AssociationCollection delegates to an Array of model objects. Since the delegated methods include the class() method, if you ask the skills object what its class is, it will say it's an Array even though it's not. It will also do everything else an Array does, and then some.
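The trick is easy to reproduce in plain Ruby. What follows is a minimal sketch under stated assumptions, not the Rails source: SketchCollectionProxy and its build method are hypothetical names, records are plain Hashes, and the class inherits from BasicObject (available in newer Rubies) to get a nearly blank object. Rails arranges its own delegation differently, but the visible effect is the same.

```ruby
# Hypothetical stand-in for an association collection; not the Rails source.
class SketchCollectionProxy < BasicObject
  def initialize(owner_id, target)
    @owner_id = owner_id   # id of the object that owns the collection
    @target = target       # the plain Array being fronted
  end

  # Extra behavior layered on top of the Array: build a new record
  # (a Hash here) with its foreign key already filled in.
  def build(attributes)
    record = attributes.merge(:technician_id => @owner_id)
    @target.push(record)
    record
  end

  private

  # Every other message, including #class, is forwarded to the target.
  def method_missing(name, *args, &block)
    @target.__send__(name, *args, &block)
  end
end

certs = SketchCollectionProxy.new(1, [])
certs.build(:skill_id => 10)
puts certs.class                    # prints "Array"
puts certs.size                     # prints "1"
puts certs.first[:technician_id]    # prints "1"
```

Because BasicObject defines almost nothing, even the class message falls through to method_missing, which is why the proxy claims to be an Array.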

The final piece of the puzzle can be found in the inheritance hierarchy of the association classes.

AssociationProxy
  AssociationCollection
    HasAndBelongsToManyAssociation
    HasManyAssociation
  BelongsToAssociation
    HasOneAssociation
  HasManyThroughAssociation

This hierarchy shows that HasManyAssociation is a subclass of AssociationCollection, but HasManyThroughAssociation is a subclass of AssociationProxy. That means that HasManyThroughAssociation doesn't get the collection management methods that HasManyAssociation does, including <<. But if the association doesn't have a << method, why doesn't pat.skills << winxp throw an exception? Because the << method is delegated to the target array! The Array#<< method adds the winxp object to the target array, but doesn't do any ActiveRecord magic to fix its foreign key or make sure it gets saved. Well, why not?
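Before getting to the answer, the silent delegation just described can be reproduced in plain Ruby. This is only a sketch under stated assumptions: SketchThroughProxy and SKETCH_JOIN_ROWS are hypothetical names, the "table" is an in-memory Array of [technician_id, skill_id] pairs, and skills are represented by their ids alone.

```ruby
# Hypothetical in-memory stand-in for the certifications table:
# each row is [technician_id, skill_id].
SKETCH_JOIN_ROWS = [[1, 10], [1, 11]]

# Hypothetical proxy with no << of its own; not the Rails source.
class SketchThroughProxy < BasicObject
  def initialize(technician_id)
    @technician_id = technician_id
    @target = nil
  end

  # Re-read the "table", discarding whatever was only held in memory.
  def reload
    rows = ::SKETCH_JOIN_ROWS.select { |tech_id, _skill_id| tech_id == @technician_id }
    @target = rows.map { |_tech_id, skill_id| skill_id }
    self
  end

  private

  # No << is defined above, so << ends up here and runs Array#<<
  # against the cached target. SKETCH_JOIN_ROWS is never touched.
  def method_missing(name, *args, &block)
    reload if @target.nil?
    @target.__send__(name, *args, &block)
  end
end

skills = SketchThroughProxy.new(1)
puts skills.size          # prints "2"
skills << 13              # succeeds quietly via Array#<<
puts skills.size          # prints "3"
puts skills.reload.size   # prints "2"
```

The numbers follow the same pattern as Pat's disappearing WinXP certification: the cached array grows, and the reload brings back only what was actually stored.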

Think for a moment about what pat.skills << winxp would imply. Since skills are associated via a join model, that method would have to create a new Certification object to hold the mapping. The << method for has_and_belongs_to_many associations does something very close to that, creating a record in the appropriate join table. If it's good enough for has_and_belongs_to_many, why not for has_many :through?

The main difference between a simple has_and_belongs_to_many join table and a has_many :through join model is that the join model can have attributes other than the foreign keys for the records it is joining. In fact, if you didn't have those other attributes you probably wouldn't use a join model and would settle for a join table.

Let's say our Certification object has some extra attributes, like the date and level of the certification. Now let's get Pat certified for a new skill.

pat.skills << winvista

For that to do the right thing Rails would have to create a new Certification object. Alright, what values should it use for the certification date and level? Care to guess? Neither does Rails, so you can't do it that way.

cert = Certification.new(:technician => pat,
                         :skill => macosx,
                         :date => Date.new(2006,4,15),
                         :level => 1)
cert.save
pat.skills(true)

The above example works fine for certifying Pat in a new skill. Not as convenient as <<, but it has the advantage of working correctly.
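To make the guessing problem concrete, here is a small plain-Ruby sketch. It uses no ActiveRecord at all: SketchCertification, sketch_append, and sketch_certify are hypothetical names, and a Struct stands in for the join model with its date and level attributes.

```ruby
require "date"

# Hypothetical plain-Ruby stand-in for the Certification join model.
SketchCertification = Struct.new(:technician_id, :skill_id, :date, :level)

# What a generic << could do: it only knows the two records being joined.
def sketch_append(technician_id, skill_id)
  SketchCertification.new(technician_id, skill_id)
end

# What creating the join model directly can do: the caller supplies the rest.
def sketch_certify(technician_id, skill_id, date, level)
  SketchCertification.new(technician_id, skill_id, date, level)
end

guessed  = sketch_append(1, 13)
explicit = sketch_certify(1, 13, Date.new(2006, 4, 15), 1)

p guessed.date          # prints nil
p guessed.level         # prints nil
p explicit.date.to_s    # prints "2006-04-15"
p explicit.level        # prints 1
```

A model could supply defaults for those attributes, but that is a decision for the application rather than something << can infer from the two objects it is handed.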

In closing, I think that this all points out how different has_many :through associations are from regular has_many associations. I think it would probably have been better to coin a new name for the association rather than confuse things by reusing has_many for something that's not really a has_many anymore. At the very least, there's some work needed to fix the has_many documentation to describe the differences when using the :through option.

I do actually have some ideas about how to add some more magic and make join models easier to work with. However this article is long enough already, so I'll have to save that for next time.

18 comments. Tags: associations, rails

Comments
  1. Ted, 2006-04-18 06:15:12

    Josh, I too am attempting to solve this problem and you can see my latest attempt on my blog at http://trak3r.blogspot.com/2006/04/hasmany-continuing-sagas.html

  2. Riad, 2006-04-18 08:20:06

    Great article!

    The other day I wasted about two hours figuring out what you just described. I also felt that has_many_through should be its own association instead of being written like some special case of the has_many association, which it really isn't and which just adds to the confusion. Also, the documentation is really lacking in this regard. Until that is fixed, this whole << thing will be a major gotcha for people who work with :through for the first time.

    I'm sure that << will work with :through at some point though. Probably already in Rails 1.2.

  3. Josh Susser, 2006-04-18 09:46:14

    @Ted: I've been following your adventures on your blog. Funny that we're both running into the same problem at the same time.

    @Riad: Thanks. By the way, I haven't seen anything on the rails-core list or in trunk that suggests anyone is working to fix << in hmt. I guess it's up to us :-)

  4. Guido, 2006-04-21 10:09:24

    I really think for the sake of beauty and transparency << should work the way I intuitively feel it would work.

    Saying

     pat.skills << winvista
    

    really means I wholeheartedly don't care about the extra attributes. So it would then default to

    cert = Certification.new(:technician => pat,
                             :skill      => macosx)
    pat.skills(true)
    

    No guessing. It would be the responsibility of my model to supply the defaults.

  5. Anthony, 2006-04-25 13:34:38

    So the winvista skill is equal to the macosx skill? I guess if you know one, you know the other. ;-)

  6. choonkeat, 2006-05-04 07:26:05

    To fail fast on myself, whenever I define a :through association I'll probably add a method to croak if I ever use << in the app

    has_many :x, :through => :y do def <<(obj); raise "has_many :through doesn't support proxy collection << method"; end end

  7. Josh Susser, 2006-05-04 09:21:36

    @choonkeat: Good idea. Actually, that was recently added to trunk (in [4265]), so the next Rails release will do that for you and you won't have to add it yourself.

  8. Adam Keys, 2006-05-09 15:22:45

    Josh, it was nice to meet and chat with you at the SDForum Ruby Conference.

    I knew in the back of my head ye' olde' << wouldn't work on my join model. But I had to play with it myself and then read this to really get it. Thanks for putting effort into some really good documentation on this most perplexing cranny of ActiveRecord.

  9. Steven Luscher, 2006-06-09 21:26:37

    I have to admit, I'm one of the handful that wasted an hour or two trying to figure out why << wasn't doing the database magic I expected it to. I'm glad I found this article before too long… Thanks!

  10. Brian Donovan, 2006-06-12 22:43:18

    I think it should either do what Guido said, or should issue a warning like calling nil.id does.

  11. Josh Susser, 2006-06-12 22:51:46

    @brian: In the current trunk Rails, << on through associations will raise ActiveRecord::ReadOnlyAssociation.

  12. chuck, 2006-07-16 02:01:03

    I don't think it's reasonable to ask rails to handle this. The has_many-through association should be shorthand for saying that you have (in C) a structure like technician.certifications[].skill, and you want to be able to reference a flat, merged list of skills.

    In this case, you really do want to add a new certification.

    You gave the example for table1<-table2->table3. Consider the example for table1<-table2<-table3: parent<-child<-grandchild. Suppose you want to add a new grandchild. Well, we need to know which of your children should have that grandchild.

  13. james, 2006-07-24 16:08:56

    +1 to people who have wasted an hour prior to reading this. D'oh!

  14. alex, 2006-08-16 15:21:23

    +1 to people who waste more than one hour prior to reading this.

  15. Rob Pitt, 2006-10-15 11:57:16

    Last time I reported this as a bug (#5106), you closed it as invalid. Rails has always been about doing something the most obvious way and just like Guido says above so eloquently - that is exactly what it should do.

  16. Josh Susser, 2006-10-15 12:21:59

    @Rob: Progress has been made. See my more recent posting, "Magic join model creation", for a solution to your issue.

  17. Rob, 2006-10-16 10:24:56

    Cool, I had no idea this post was so old! Downloads an RSS reader that shows dates

    Awesome.

  18. xia, 2006-10-16 23:39:01

    does anyone else have a need for has_one through?

    i'd like it now for linking profile images to a user who may have more than one profile, but can only have one of each profile type

    this person seems to have requested it too http://dev.rubyonrails.org/ticket/4756
