Here's a gotcha: has_many :through associations do not support polymorphic access to the associated object. In this article I'll show the reasons for this limitation, and also provide an approach that lets you work with polymorphic associations without too much trouble.
Let's start with some example code so I have something concrete to talk about. Here are some models representing that an author can write either articles or books, and that each article or book can have multiple authors.
class Authorship < ActiveRecord::Base
  belongs_to :author
  belongs_to :publication, :polymorphic => true
end

class Author < ActiveRecord::Base
  has_many :authorships
  has_many :publications, :through => :authorships
end

class Article < ActiveRecord::Base
  has_many :authorships, :as => :publication
  has_many :authors, :through => :authorships
end

class Book < ActiveRecord::Base
  has_many :authorships, :as => :publication
  has_many :authors, :through => :authorships
end
That's all by the book (no pun intended) and follows the examples in the Rails 1.1 release. So where is the gotcha?
The problem
Given our example, we can ask an article or book for its authors:
my_article.authors # => [author1, author2, ...]
my_book.authors # => [author1, author2, ...]
However, we can't ask an author for all its publications!
an_author.publications # => ERROR
So what's going on? Why does traversing the join model only work in one direction? Well, let's take a look at what is in the join model table.
create_table "authorships" do |t|
  t.column "author_id", :integer
  t.column "publication_id", :integer
  t.column "publication_type", :string
end
The reference to the publication is essentially a compound foreign key. The _id field holds the id of the object's record in its own table, and the _type field holds the name of the Ruby class of that object. That information is combined to do the join to find a publication's authors. Look at the SQL Rails generates for article_1.authors:
SELECT authors.* FROM authors
INNER JOIN authorships ON authors.id = authorships.author_id
WHERE (authorships.publication_id = 1
AND authorships.publication_type = 'Article')
You can see in the WHERE clause how the join uses both the id and type fields to match the authors to the article. That's quite straightforward and works fine.
So what about the other direction? Why doesn't author_1.publications do the right thing? Let's try and write the SQL for it. First off, what about the SELECT clause?
SELECT publications.* FROM publications
That's not right, because there is no publications table. The author's publications are scattered among some number of tables: articles, books, and possibly other types we may decide to add later. I'm not an SQL god so I don't know if there is some way to indirect the name of the table within a query, or a way to return non-homogeneous results, but I'm betting even if there were it would be pretty gross. Well, maybe not much grosser than the SQL for :include, but still gross. In short, you can't traverse a :through association to a polymorphic data type because you can't tell what table it's in.
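The same point can be seen without a database. What follows is a minimal plain-Ruby sketch, not ActiveRecord code: the TABLES hash, the titles, and the resolve helper are all made up for illustration. It shows that the table to read from is only known once you have an individual authorships row in hand, which is why a single SELECT against one table can't express the lookup.

```ruby
# Hypothetical stand-ins for the articles and books tables.
TABLES = {
  "Article" => { 1 => "Polymorphism in Practice" },
  "Book"    => { 1 => "Cat's Cradle" }
}

# One authorships row: it carries both halves of the compound key.
Authorship = Struct.new(:author_id, :publication_id, :publication_type)

# Resolve a single row: the type picks the table, the id picks the record.
def resolve(authorship)
  TABLES.fetch(authorship.publication_type).fetch(authorship.publication_id)
end

# Two rows for the same author with the same publication_id but different types.
rows = [Authorship.new(7, 1, "Article"), Authorship.new(7, 1, "Book")]
titles = rows.map { |row| resolve(row) }
```

Both rows share publication_id 1, yet they resolve to different records because the type column sends each lookup to a different table.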
Another approach
Where does that leave us? Are polymorphic :through associations useless? Not at all. There are two ways you can make use of them to get to the polymorphic object. First, if you really need to get a polymorphic collection of the associated objects, you can roll your own.
def publications
  self.authorships.collect { |a| a.publication }
end
There may be some performance issues with doing a query for each publication, but at least you can do it.
The other option is to create associations for each of the polymorphic types. Like so:
class Authorship < ActiveRecord::Base
  belongs_to :author
  belongs_to :publication, :polymorphic => true
  belongs_to :article, :class_name => "Article",
                       :foreign_key => "publication_id"
  belongs_to :book, :class_name => "Book",
                    :foreign_key => "publication_id"
end

class Author < ActiveRecord::Base
  has_many :authorships
  has_many :articles, :through => :authorships, :source => :article,
                      :conditions => "authorships.publication_type = 'Article'"
  has_many :books, :through => :authorships, :source => :book,
                   :conditions => "authorships.publication_type = 'Book'"
end
This technique provides associations that let you access all the objects of a particular class. It's not totally polymorphic, but at least you can grab lots of objects in a single query. Speaking of which, let's use these new associations to improve the publications() method in class Author.
def publications
  self.articles + self.books
end
Now we're down to needing only one query for each class of object in the polymorphic association, which scales much better than one query for each object.
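To make the difference in query counts concrete, here is a rough plain-Ruby sketch. It does not touch ActiveRecord: FakeDB, its find_all method, and the sample data are hypothetical stand-ins, and each call to find_all is counted as one query.

```ruby
# Hypothetical in-memory "database"; each find_all call counts as one query.
module FakeDB
  TABLES = {
    "Article" => { 1 => "A1", 2 => "A2", 3 => "A3" },
    "Book"    => { 1 => "B1", 2 => "B2" }
  }

  @queries = 0

  class << self
    attr_reader :queries

    def reset!
      @queries = 0
    end

    # Fetch any number of ids from one table in a single "query".
    def find_all(type, ids)
      @queries += 1
      TABLES.fetch(type).values_at(*ids)
    end
  end
end

# [publication_type, publication_id] pairs from one author's authorships rows.
refs = [["Article", 1], ["Book", 1], ["Article", 2], ["Article", 3], ["Book", 2]]

# First approach: one lookup per authorship row.
FakeDB.reset!
per_object = refs.map { |type, id| FakeDB.find_all(type, [id]).first }
per_object_queries = FakeDB.queries

# Second approach: group the ids by type, then do one lookup per type.
FakeDB.reset!
ids_by_type = refs.group_by { |type, _id| type }
per_class = ids_by_type.flat_map do |type, pairs|
  FakeDB.find_all(type, pairs.map { |_type, id| id })
end
per_class_queries = FakeDB.queries
```

With five authorships spread over two types, the first approach issues five lookups and the second issues two. The second count grows with the number of publication classes, not with the number of publications.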
Whew!
So that's it. Polymorphic :through associations can only be traversed from the polymorphic side. However, you can use special associations with conditions to limit the association to a particular class of object and restore the use of association access by collection.
Exercise for the reader: Use the approach in this article to create a join model where both associated objects are polymorphic. Have fun with that.
I guess you could use a union to do the query but your solution is probably just as good for most cases.
Now you just need to wrap this up into a mixin and get it committed. Then everyone will be happy.
@Nolan: How would this approach work as a mixin? I see it as more of a pattern than a mixin, and can't see how to turn it into reusable code. What are you thinking of?
It's fitting you'd publish this given your domain name ;)
Great! I learned how to join two different types of results as one! ;]
In the above code I don't understand this:
:through => :dealings
Is this right, or a typo?
Maybe I'm misunderstanding the problem, but it seems like we have all the metadata we need to do the polymorphic query correctly in Active Record. We know that both Article and Book are publications (and any other model classes defined later), so why not run a query per distinct concrete subtype and aggregate the results in ruby, essentially pushing the code above down into AR?
It's very probable that I'm missing something.
cheers, carson
@Lou: thanks for catching that. Yes, that was a copy/paste error from my app code. Fixed now!
@Carson: I've thought a bit about how to do what you're suggesting. The roadblock I always hit is discovering what the concrete classes might be. Given dynamic class loading, I don't know how AR would be able to discover all the potential classes before doing the query. You might be able to get that to work in Rails for classes in the model directory, but ActiveRecord has to be able to work independent of Rails as well. I suppose you could specify the concrete classes as an option in the association or finder. I'll have to think about that some.
Nice write up!
However, I think that the second design approach loses much of the flexibility offered by polymorphic associations. It makes the Authorship and Author models tightly coupled. If you add a new model like Compendium, you have to add this association to both models. That's a smell ;)
Josh,
Can you override the "has_many" method to register the class in some backing hash when :through is passed in as an argument? Basically collect the information at class evaluation time, rather than trying to crawl over them at runtime?
I'm woefully ignorant on the backend of ActiveRecord (I work in a java shop with a custom scripting language and our own metadata layer) but this seems like a solvable problem. All the data is sitting there staring us in the face...
Again, I'm probably missing something.
Cheers, Carson
Josh,
Of course, rather than "override" I should say "rewrite" or "redefine", and rather than "crawl over them at runtime" I should say "crawl over them at a later time during runtime."
My small java brain is still adjusting to how to discuss ruby concepts competently.
Cheers, Carson
@Carson: It might be possible to do what you are suggesting. If this becomes a common problem, I suppose someone will try to code it up and see how well it works. My pragmatic governor says not to get involved in a science project like that when I've got a simple workaround that works fairly well.
If I were to go nuts I might try extending ActiveRecord to sniff out the classes in the join model table with one query (SELECT DISTINCT authorships.publication_type), then use that information to do a query with a badass :include of all those types. I'm not sure if the big dogs would be happy with a double query for an access being rolled into AR, but it might be worth investigating.
I hit this problem pretty hard, having lots of polymorphic relations. This blog entry helped a lot, thanks.
Now, I think this is important to have in AR, as at first I expected :through to work for any association. :through the pure beauty of it. :)
Thanks for the workaround. I have a question regarding updating and editing a has_many :through association. I was working with HABTM associations for some time and now I'm trying to figure out how to update and create new records in the related tables.
Following the given example, would it be possible to update the Author-Authorship-Article association in one step with:
@author.update_attributes(YYY)?
YYY = fitting parameters for the three models.
Thanks in advance.
Is all this fixed with rails 1.1.1? thanks
@Manuel: No, this hasn't changed in 1.1.1. There is better support for :include with polymorphic associations, but that is going from the polymorphic type, not to it.
@Mike: I don't think your update_attributes() call is going to do what you want. You'll have to work with attributes on the model object where they are defined. I'm working on a writeup on that, so stay tuned.
@josh: Now I'm pretty sure I have to do it 'by hand'. I spent the last evening trying to solve the problem of updating the join table when I want to create and/or delete one or more associations between Author and Book, for example.
Currently my favourite way of doing this is something like this:
-> Find the publication_id for all existing authorships for the given Author.id and put them in an array, e.g. existing_authorships.
-> Get all authorships that the user wants from a form (I use a check_box_tag for selecting the associated publications) and put them in an array, e.g. chosen_authorships.
Now I can loop over the array (existing_authorships - chosen_authorships) and delete every record in Authorship that matches the Author.id and an entry in the loop array. I can also loop over the array (chosen_authorships - existing_authorships) and add a record for the Author.id with each entry in the loop array.
Seems VERY complicated for rails-development - but it works :)
I'm looking forward to seeing your writeup. In the meantime I'll try to make my solution somehow generic to praise DRY ...
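The array-difference step described in this comment can be sketched in a few lines of plain Ruby. This is only an illustration under assumptions: the variable names and sample values are invented, and each entry is a [publication_type, publication_id] pair (not a bare id), so that an article and a book sharing an id are not confused.

```ruby
# Pairs currently linked to the author, read from the existing authorships rows.
existing = [["Article", 1], ["Book", 1], ["Book", 2]]

# Pairs ticked in the form. Form values arrive as strings, so convert the ids.
submitted = [["Book", "1"], ["Book", "2"], ["Article", "5"]]
chosen = submitted.map { |type, id| [type, id.to_i] }

# Array difference compares the pairs element by element.
to_delete = existing - chosen   # linked now, but no longer ticked
to_create = chosen - existing   # ticked, but not linked yet
```

Only the rows in to_delete and to_create need to be touched; authorships that appear in both arrays are left alone.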
Thanks for the writeup, Josh.
This helped me get Mailroom (http://sproutit.com/) upgraded to Rails 1.1.
I still don't 100% understand what ":source => :xxx" is doing.
Btw - are there any docs on this anywhere? I couldn't find them on either RailsManual.org or the api.
Great site, btw!
Here's what the docs say on :source:
Specifies the source association name used by has_many :through queries. Only use it if the name cannot be inferred from the association. has_many :subscribers, :through => :subscriptions will look for either :subscribers or :subscriber on Subscription, unless a :source is given.
If Subscription belongs_to :subscriber, :class_name => 'User', then Magazine has_many :subscribers, :through => :subscriptions works fine. However, if Subscription belongs_to :user, then you will need either Magazine has_many :users, :through => :subscriptions or has_many :subscribers, :through => :subscriptions, :source => :user.
Josh,
Excellent article (and a most excellent and highly referenced blog)! I have a question regarding your improved publications() method in the Author class. Say authorships had an attribute created_at. How would one merge and sort publications based on the created_at attribute?
Thanks,
Raja
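One possible answer, sketched in plain Ruby and not ActiveRecord: since created_at would live on the join row and not on the publication, sort the authorships first and then step across to each publication. The Struct definitions and sample dates here are hypothetical stand-ins for the real models.

```ruby
# Hypothetical stand-ins for the models; created_at sits on the join row.
Publication = Struct.new(:title)
Authorship  = Struct.new(:publication, :created_at)

authorships = [
  Authorship.new(Publication.new("Book B"),    Time.utc(2006, 3, 1)),
  Authorship.new(Publication.new("Article A"), Time.utc(2006, 1, 15)),
  Authorship.new(Publication.new("Article C"), Time.utc(2006, 4, 2))
]

# Sort the join rows by their timestamp, then collect the publication from each.
sorted_publications = authorships.sort_by { |a| a.created_at }
                                 .map { |a| a.publication }
titles = sorted_publications.map { |pub| pub.title }
```

In a real model this goes through the polymorphic publication association on each authorship, so it trades the one-query-per-class benefit of articles + books for the ability to sort on the join attribute.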
I still can't get this to work. These are my steps:
Then, according to the blog, Tag.find_by_name("123").articles should generate SQL like this:
However, the "conditions" of the association do not work in my environment. I only get this SQL:
Therefore, the wrong result sets are returned. What is the matter with the association?
@Charlie: My bad (I think). I could have sworn that code was working when I posted this article, but it doesn't now, even if I revert back to the edge revision of that date. Anyway, if you move the :conditions for the type test to the Tag model from the Tagging model, it works correctly and you get the expected SQL. (Don't forget to qualify the field with the table name.)
Yeah, it works now, thank you!
Thanks for the tip :)
While experimenting I just found that belongs_to :publication, :polymorphic => true has no effect. You can use belongs_to :foobar, :polymorphic => true or simply delete the line; it does not change anything. Rails builds the query with id and type only when you choose to specify :as => :name in other models! (tested with Rails 1.1.2)
Strange, isn't it?
Why don't you just use single table inheritance (http://wiki.rubyonrails.com/rails/pages/SingleTableInheritance)??
@François: Wow, that's odd. But some of that association code is pretty bizarre.
@Danny: STI works for some cases, but not for others. That's why they added the polymorphic feature in the first place.
First, thanks for the great article
Second, here's a solution to the reader's exercise. Welcome the not-yet-modularized acts-as-taggable-tag! http://oldmoe.blogspot.com/2006/05/actsastaggabletag.html
Hi Josh,
The topic you chose for your blog puts you in the strategic hot seat. :) This is the stuff people have to get right when designing their applications.
I'm struggling with a user/roles (or maybe user/behavior) question and I believe a polymorphic has_many :through is the answer. But I'm not sure.
This is slightly contrived, but a games example works well here. I'll have 3 separate game websites where a player can compete against the computer. At some point after a player registers for just one of the game sites (chess), she'll be prompted to 'add on' a registration for one of the other game sites (mastermind, checkers). So if a user chooses to register for all 3 sites, she should generate game histories with lots of stats specific to each game.
So there's a master account, and exactly 3 sub accounts. I don't want a person to create multiple chess accounts (just 1 or 0 chess accounts). I looked at STI, but it's extremely wasteful of hard drive space if there is not enough overlap between the objects being placed in the same table. There is very little overlap between these games, and the types of play statistics they generate. Regardless of the implementation I see in my mind 4 entities:
And I think that each game will generate its own statistics sets:
I may want to add on more games later.
Can you or one of your guests give me a start?
@royroy: The problem is potentially more complicated than you describe. If you want to build multiple sites operating off of shared user data, my guess is that you want to build them around separate databases and have a distinct database for user billing and provisioning. This is important so that doing upgrades, maintenance and production issues with one product don't impact the other products. You can set things up so different AR model classes talk to different databases within the same application.
So no, I don't think polymorphism is the way to go. I think you need to deal with the issue at a higher level.
@josh: Actually it *is* all in one application. With a single database & application for all three sites I can do billing per individual and run queries that capture interactions between the sites (e.g. linking chess playing style to MasterMind playing style). So it's 3 sites, one application.
Note that the AtomicPlayer (maybe should have just been called Player) is only there as a roll-up for each of the roles she plays. And that makes it the right place to put her name and single login.
@josh: I'll try to summarize...
Hey,
I wrote a mixin to enable easy implementation of Josh's second solution, above.
Take a look. I would appreciate your comments.
Evan
Josh, there may be an insidious trap in the approach you present. AR does not use the polymorphic type in building the SELECT expression for the belongs_to association (thankfully, it does for the has_many :through). For example, if you have identical keys in both the articles table and the books table (which would be very typical without UUIDs), my_authorship.books can retrieve a book NOT belonging to my_authorship.
Example:
All well and good. Now consider this snippet:
The first line works exactly as expected, returning my one and only authorship. The problem comes in the second line, for which AR will generate the following query:
At this point, it looks like I am claiming 'The Cat's Cradle' as the fruit of my labor and I am likely to have a lawsuit on my hands from Vonnegut's estate.
MORAL of the story: Publication references from the authorships table can't be trusted (see caveat below). And since Rails does not join the authorships table when querying the various publications tables, there is nothing that can be done as far as I can tell (AR version 1.14.3). Treat such references with extreme caution!
Fortunately, publication references from the authors table can be trusted: Rails joins the authorships table, allowing you to specify the publication_type in the supplied condition as you have shown above.
In an ideal world, belongs_to associations would allow you to
Caveat: You get a Get-Out-of-Corrupted-Database-Hell-Free card if you use UUIDs for all your publication classes.
NB: I have not actually seen this problem myself because I do use UUIDs and I am too lazy to conjure up a test case. So it is possible that ruby would barf (or maybe misbehave even more strangely) on the likely mismatch between the coded method invocations and the dynamically defined accessors from the DB columns. In the example I gave above, this would NOT happen, but in most real world examples there would probably be a mismatch. I would love for someone to let me know what really happens.
@Chris: the polymorphic :publication association in the authorship means that the join is done with not only the publication_id field, but also the publication_type. The problem you are worried about won't happen if you build your associations correctly, even if a book and magazine share the same id in different tables. The _type attribute will be different, so you'll only get the item from one table, not the other.
@Josh, That is exactly my point: THERE IS NO JOIN when querying the book table with a previously retrieved (in memory) authorship. Yes, there is a join when querying with an author (something like SELECT * from books JOIN authorships) but I emphasize that THERE IS NO JOIN if you start with an authorship!!!! This is logical: if I already have an authorship in memory, why would I need to join the authorships table? Indeed, Rails simply uses the foreign key for books, which is in memory in the authorship instance!
Check it out. In the console, retrieve an authorship record (call it a). Then query for the books belonging to that authorship (a.books) and examine the SQL generated (I examine the SQL by forcing an error with a bogus belongs_to foreign key). You will see no join, just a fast lookup in the books table using something like SELECT * FROM books WHERE books.id = <in-memory foreign key from previously retrieved authorship>.
Bluntly, the Authorship.belongs_to :book relationship is not polymorphic and Rails does not do a join or use publication_type. It is a simple belongs_to relationship and Rails treats it exactly as any other belongs_to relationship.
Depending on the specific data in the polymorphic class tables, the result could be
I'm sorry if my original explanation was not clear. I'm doubly sorry if this post did not clear it up!
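The trap can be reproduced in miniature without Rails. The following is a hedged plain-Ruby sketch, not the ActiveRecord implementation: the BOOKS and ARTICLES hashes, the article title, and the book_unchecked and book methods are invented for illustration. The unchecked method mimics a lookup by foreign key alone, and the guarded one shows what a type check would add.

```ruby
# Hypothetical tables; an article and a book happen to share id 1.
BOOKS    = { 1 => "Cat's Cradle" }
ARTICLES = { 1 => "My Only Article" }

Authorship = Struct.new(:publication_id, :publication_type) do
  # Mimics a plain belongs_to lookup: foreign key only, type ignored.
  def book_unchecked
    BOOKS[publication_id]
  end

  # Guarded variant: only look in BOOKS when the row really points at a Book.
  def book
    publication_type == "Book" ? BOOKS[publication_id] : nil
  end
end

# This authorship points at article 1, not at any book.
mine = Authorship.new(1, "Article")
unchecked = mine.book_unchecked
checked   = mine.book
```

Here unchecked comes back as the book with id 1 even though the authorship is for an article, while checked is nil.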
@chris: Oh yeah. I was rushing when reading your previous comment, that'll show me. I see what you mean now. At this point I'd just say to follow your "use with caution" warning.
I like polymorphic belongs_to associations less and less every day...
Can anyone come up with a DRY way to enumerate the polymorphic children of a particular abstract class (if I have that vocabulary right)? For example, how do you get a list of all the kinds of publications that have been defined (should return "Article" and "Book")?
Among other uses, right now if I want to let the user create a new publication, and I give her a pull-down list to choose between "Article" and "Book", I don't know how to produce that list in a non-redundant way. (If I add a new publication type later, I need to go back and update this list.)
Right now I'm keeping the list of publication types in a separate table.. but that seems kind of redundant since I need to remember to update that table every time I add a new publication type, as well as create the new model and table for the new publication type.
I think "subclass" was the word I was looking for. Can anyone come up with a DRY way to enumerate the polymorphic subclasses of a particular abstract class?
@dan: You could define a module Publication and include it in each of your publication classes. Implement a self.included method in the module and you can track the classes that include the module. That's how Rails does the Reloadable trick (well, at least until earlier today it was).
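A minimal sketch of that idea in plain Ruby, assuming a module named Publication and a hypothetical types accessor (neither is part of Rails). The Article and Book classes are bare stand-ins here; in an application they would be the ActiveRecord models.

```ruby
module Publication
  # Classes that have included this module, in the order they did so.
  def self.types
    @types ||= []
  end

  # Ruby calls this hook each time a class includes the module.
  def self.included(base)
    types << base
  end
end

class Article
  include Publication
end

class Book
  include Publication
end

# Names suitable for building a pull-down list.
type_names = Publication.types.map { |klass| klass.name }
```

A class only registers itself when its definition is evaluated, so the list contains just the publication classes that have been loaded so far.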
@josh: What happens if not every one of the publication classes has been loaded yet? Are they loaded at app startup or only when one of that kind of publication actually shows up? I guess if the latter, I'm probably outa luck no matter what..
RE Chris's problem, would it not be possible to work around this by specifying :conditions for the belongs_to?
Opps! sorry about the double submit.. quickie fingers! :S
"Exercise for the reader: Use the approach in this article to create a join model where both associated objects are polymorphic. Have fun with that."
Luckily, with edge rails, these double-sided polymorphic associations aren't hard.
Imagine a school that has lesson plans and field trips, and you want to track which students and teachers attended which. Later on, you may want to track which donors went to which fundraiser etc..., so you decide to go polymorphic. Here's the setup (only tested on the latest version of edge rails):
The participations table looks like:
Gotta love that.
I should note that this still won't work with eager loading - that is:
will still fail.
The has-many-polymorphs plugin now supports almost all of these constructs. It loads the child records in a single query, so although not true eager loading, you need only O(N)*2 queries to grab the parent and all the relationship contents.
It also supports STI and namespaced models every place you might possibly want them. I am working on double polymorphism today.
A lot of people already use has-many-polymorphs in production, so check it out if you have a need for this kind of stuff.