SJ logo

Polymorphic Association in Relational Database

20.04.202011 Min Read — In Database

Polymorphic Association

With polymorphic association, a model can belongs_to several models with a single association.

As an example, let's think of a e-commerce site, both individuals and groups can make an order to buy products. These individuals and groups are not related (other than both being a type of User, maybe), and so they have different data.

Without polymorphic associations, we would have something like this

  class Order
    belongs_to :person
    belongs_to :group
  end
  
  class Person
    has_many :orders
  end
  
  class Group
    has_many :orders
  end

So, the Order table would have to competing foreign keys: group_id and person_id . This can be a problem, for example, when trying to find the owner of an order, we would have to make a point to check both columns to find correct foreign key, rather than relying on one.

This is a polymorphic association addresses this issue by condensing this functionality into one association. But it's not easy to correctly represent the association in a relational database. There are four different ways:

  • The Join tables.
  • The Type column.
  • Reverse belongs to.
  • Exclusive arc.

The Type Column

A simple approach to connect an order to a resource is to use two columns on the order table: resource_type and resource_id . This approach was popularized by Ruby on Rails.

Source Code: Branch: type-column

  class CreateOrders < ActiveRecord::Migration[6.0]
    def change
      create_table :orders do |t|
        t.string :order_ref
        t.integer :owner_id
        t.string :owner_type
  
        t.timestamps
      end
    end
  end
  
  class Order < ApplicationRecord
    belongs_to :owner, polymorphic: true
  end
  
  class Person
    has_many :orders, as: :owner
  end
  
  class Group
    has_many :orders, as: :owner
  end
  
  # Seed
  # Person creating
  person = Person.create(name: "John Doe")
  
  # Group creating
  group = Group.create(group_name: "Jungles")
  
  # Orders
  Order.create(owner: person, order_ref: "Or001")
  Order.create(owner: group, order_ref: "Or002")
  Order.create(owner: person, order_ref: "Or003")
  
  # Rails console
  Person.first.orders
  =>[#<Order id: 1, order_ref: "Or001", owner_id: 1, owner_type: "Person", created_at: "2020-04-21 06:58:51", updated_at: "2020-04-21 06:58:51">,
  #<Order id: 3, order_ref: "Or003", owner_id: 1, owner_type: "Person", created_at: "2020-04-21 06:58:51", updated_at: "2020-04-21 06:58:51">]>
  
  Order.first.owner
  =>#<Person id: 1, name: "John Doe", created_at: "2020-04-21 06:58:51", updated_at: "2020-04-21 06:58:51">
  
  Order.second.owner
  =>#<Group id: 1, group_name: "Jungles", created_at: "2020-04-21 06:58:51", updated_at: "2020-04-21 06:58:51">
  
  # SQL query
  # select *
  # from orders
  # where owner_type='Order' and owner_id=1;

The Rails convention for naming a polymorphic association use "-able". This makes it clear in your relationship which class is polymorphic. But you can use whatever name that you like.

Warning

Polymorphic associations come with one huge red flag: compromised data integrity

In a normal belongs_to relationship, we use foreign keys for reference in an association.

They have more power than just forming a link, though. Foreign keys also prevent referential errors by requiring that the object referenced in the foreign table does, in fact, exist. If someone tries to create an object with a foreign key that references a null object, they will get an error.

Unfortunately, polymorphic classes can’t have foreign keys. We use the resource_type and resource_id columns in place of a foreign key. This means we lose the protection that foreign keys offer.

Rails and ActiveRecord help us out on the surface, but anyone with direct access to the database can create or update objects that reference null objects.

PROS

  • Easy to scale number of models: more models can be easily associated with the polymorphic class.
  • Follow the DRY principle, creates one class that can be used by many other classes.

CONS

  • More tables can make querying more difficult and expensive as the data grows.
  • Cannot have foreign key.
  • Your data integrity is compromised.

The Join tables

In this approach, we don't create foreign keys on either of the tables that we want to relate together. Instead, we create a join tables to connect them together.

Source Code: Branch: join-tables

  class CreateOrders < ActiveRecord::Migration[6.0]
    def change
      create_table :orders do |t|
        t.string :order_ref
  
        t.timestamps
      end
    end
  end
  
  class CreateOrdersPeople < ActiveRecord::Migration[6.0]
    def change
      create_table :orders_people do |t|
        t.references :order, foreign_key: true, index: { unique: true }
        t.references :person, foreign_key: true, index: true
        t.timestamps
      end
    end
  end
  
  class CreateOrdersGroups < ActiveRecord::Migration[6.0]
    def change
      create_table :orders_groups do |t|
        t.references :order, foreign_key: true, index: { unique: true }
        t.references :group, foreign_key: true, index: true
        t.timestamps
      end
    end
  end
  
  class Order < ApplicationRecord
    has_one :orders_person
    has_one :orders_group
  
    has_one :person, through: :orders_person
    has_one :group, through: :orders_group
  end
  
  class OrdersGroup < ApplicationRecord
    belongs_to :order
    belongs_to :group
  end
  
  class OrdersPerson < ApplicationRecord
    belongs_to :order
    belongs_to :person
  end
  
  class Person < ApplicationRecord
    has_many :orders_persons
    has_many :orders, through: :orders_persons
  end
  
  class Group < ApplicationRecord
    has_many :orders_groups
    has_many :orders, through: :orders_groups
  end
  
  # Seed
  person = Person.create(name: "John Doe")
  
  group = Group.create(group_name: "Jungles")
  
  order1 = Order.create(order_ref: "Or001")
  order2 = Order.create(order_ref: "Or002")
  order3 = Order.create(order_ref: "Or003")
  
  OrdersGroup.create(order: order1, group: group)
  OrdersPerson.create(order: order2, person: person)
  OrdersGroup.create(order: order3, group: group)
  
  # Rails console
  Person.first.orders
  [#<Order id: 2, order_ref: "Or002", created_at: "2020-04-21 07:45:03", updated_at: "2020-04-21 07:45:03">]
  
  Group.first.orders
  [#<Order id: 1, order_ref: "Or001", created_at: "2020-04-21 07:45:03", updated_at: "2020-04-21 07:45:03">,
  #<Order id: 3, order_ref: "Or003", created_at: "2020-04-21 07:45:03", updated_at: "2020-04-21 07:45:03">]
  
  Order.first.group
  #<Group id: 1, group_name: "Jungles", created_at: "2020-04-21 07:45:03", updated_at: "2020-04-21 07:45:03">

Pros

  • We're using foreign key constraints, so the database can ensure that any connection between an order and a resource (group/person) is valid.

Cons

  • There is no way to require that the resource (group/person) has an order. And it does not enforce uniqueness across the two owners (e.g one order could be incorrectly connected to both a person and a group).
    order4 = Order.create(order_ref: "Or004")
    person = Person.first
    group = Group.first
    
    OrdersGroup.create(order: order4, group: group)
    OrdersPerson.create(order: order4, person: person)

Reverse Belongs-To

We know an order logically belongs to the person or group, but these relationships can be reversed. Instead of having the foreign key on the orders table, we will adding an order_id to people and groups tables.

Source Code: Branch: reverse-belongs-to

  class CreatePeople < ActiveRecord::Migration[6.0]
    def change
      create_table :people do |t|
        t.string :name
        t.integer :order_id
  
        t.timestamps
      end
    end
  end
  
  class CreateGroups < ActiveRecord::Migration[6.0]
    def change
      create_table :groups do |t|
        t.string :group_name
        t.integer :order_id
  
        t.timestamps
      end
    end
  end
  
  class CreateOrders < ActiveRecord::Migration[6.0]
    def change
      create_table :orders do |t|
        t.string :order_ref
  
        t.timestamps
      end
    end
  end
  
  class Group < ApplicationRecord
    belongs_to :order
  end
  
  class Person < ApplicationRecord
    belongs_to :order
  end
  
  class Order < ApplicationRecord
    has_one :group
    has_one :person
  
    def owner
      @owner ||= group || person
    end
  end
  
  # Seed
  person = Person.create(name: "John Doe")
  group = Group.create(group_name: "Jungles")
  
  order1 = Order.create(order_ref: "Or001")
  order2 = Order.create(order_ref: "Or002")
  order3 = Order.create(order_ref: "Or003")
  
  person.update(order: order1)
  group.update(order: order3)
  
  # Rails console
  Order.first.person
  =>#<Person id: 1, name: "John Doe", order_id: 1, created_at: "2020-04-21 08:48:26", updated_at: "2020-04-21 08:48:26">
  
  Group.first.order
  =>#<Order id: 3, order_ref: "Or003", created_at: "2020-04-21 08:48:26", updated_at: "2020-04-21 08:48:26">
  
  Order.last.owner
  =>#<Group id: 1, group_name: "Jungles", order_id: 3, created_at: "2020-04-21 08:48:26", updated_at: "2020-04-21 08:48:26">

Pros

  • All resources (Person / Group) are guaranteed to have a valid references to an order.

Cons

  • There is no way to prevent an orphan order record. (e.g order2)
  • There is no way to prevent a Group and a Person from referencing the same order at the database level.

Exclusive Belongs To (aka Exclusive Arc)

Instead of having owner_id and owner_type combo, we will add two foreign keys: group_id and person_id to the Orders table.

  class CreatePeople < ActiveRecord::Migration[6.0]
    def change
      create_table :people do |t|
        t.string :name
  
        t.timestamps
      end
    end
  end
  
  class CreateGroups < ActiveRecord::Migration[6.0]
    def change
      create_table :groups do |t|
        t.string :group_name
  
        t.timestamps
      end
    end
  end
  
  class CreateOrders < ActiveRecord::Migration[6.0]
    def change
      create_table :orders do |t|
        t.string :order_ref
        t.references :group, foreign_key: true, index: true
        t.references :person, foreign_key: true, index: true
  
        t.timestamps
      end
    end
  end

To ensure that an order belongs to exactly one resource (group/person) at anytime, we need to do something. That's why this way is called Exclusive Arc.

So, technically, there are two ways to enforce the exclusive constraint.

  class Order < ApplicationRecord
    belongs_to :group, optional: true
    belongs_to :person, optional: true
  
    validate :can_be_only_one
  
    private
  
    def can_be_only_one
      return unless existing_group + existing_person != 1
  
      error.add(:base, "can only have one owner")
    end
  
    def existing_group
      group.present? ? 1 : 0
    end
  
    def existing_person
      person.present? ? 1 : 0
    end
  end
  
  class Person < ApplicationRecord
    has_many :orders
  end
  
  class Group < ApplicationRecord
    has_many :orders
  end
  
  # Seed
  person = Person.create(name: "John Doe")
  group = Group.create(group_name: "Jungles")
  
  order1 = Order.create(order_ref: "Or001", person: person)
  order2 = Order.create(order_ref: "Or002", group: group)
  order3 = Order.create(order_ref: "Or003", group: group)
  
  # Rails console
  Person.first.orders
  =>[#<Order id: 1, order_ref: "Or001", group_id: nil, person_id: 1, created_at: "2020-04-21 09:32:32", updated_at: "2020-04-21 09:32:32">]
  
  Group.first.orders
  =>[#<Order id: 2, order_ref: "Or002", group_id: 1, person_id: nil, created_at: "2020-04-21 09:32:32", updated_at: "2020-04-21 09:32:32">,
  #<Order id: 3, order_ref: "Or003", group_id: 1, person_id: nil, created_at: "2020-04-21 09:32:32", updated_at: "2020-04-21 09:32:32">]
  
  # Validates
  order = Order.new(order_ref: "Or004", group_id: Group.first.id, person_id: Person.first.id)
  order.valid? # false
  order.errors
  => #<ActiveModel::Errors:0x00007f90f4351998 @base=#<Order id: nil, order_ref: "Or004", group_id: 1, person_id: 1, created_at: nil, updated_at: nil>,
  # @messages={:base=>["can only have one owner"]}, @details={:base=>[{:error=>"can only have one owner"}]}>

NOTE: I'm using PostgreSQL for this way

    class AddCheckConstraintToOrder < ActiveRecord::Migration[6.0]
      def change
        execute <<-SQL
          ALTER TABLE orders ADD CONSTRAINT order_owner_check CHECK (
            (
              (group_id is not null)::integer +
              (person_id is not null)::integer
            ) = 1
          );
        SQL
      end
    end
    
    class Person < ApplicationRecord
      has_many :orders
    end
    
    class Group < ApplicationRecord
      has_many :orders
    end
    
    class Order < ApplicationRecord
      belongs_to :group, optional: true
      belongs_to :person, optional: true
    end
    
    # Seed
    person = Person.create(name: "John Doe")
    group = Group.create(group_name: "Jungles")
    
    order1 = Order.create(order_ref: "Or001", person: person)
    order2 = Order.create(order_ref: "Or002", group: group)
    order3 = Order.create(order_ref: "Or003", group: group)
    
    # Rails console
    Person.first.orders
    =>[#<Order id: 1, order_ref: "Or001", group_id: nil, person_id: 1, created_at: "2020-04-21 09:53:47", updated_at: "2020-04-21 09:53:47">]
    
    Group.first.orders
    =>[#<Order id: 2, order_ref: "Or002", group_id: 1, person_id: nil, created_at: "2020-04-21 09:53:47", updated_at: "2020-04-21 09:53:47">,
    #<Order id: 3, order_ref: "Or003", group_id: 1, person_id: nil, created_at: "2020-04-21 09:53:47", updated_at: "2020-04-21 09:53:47">]
    
    # Check constraint
    order = Order.new(order_ref: "Or004", group_id: Group.first.id, person_id: Person.first.id)
    order.save!
    (0.4ms)  BEGIN
      Order Create (1.9ms)  INSERT INTO "orders" ("order_ref", "group_id", "person_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4, $5) RETURNING "id"  [["order_ref", "Or004"], ["group_id", 1], ["person_id", 1], ["created_at", "2020-04-21 09:56:36.563553"], ["updated_at", "2020-04-21 09:56:36.563553"]]
       (0.3ms)  ROLLBACK
    Traceback (most recent call last):
            1: from (irb):10
    ActiveRecord::StatementInvalid (PG::CheckViolation: ERROR:  new row for relation "orders" violates check constraint "order_owner_check")
    DETAIL:  Failing row contains (5, Or004, 1, 1, 2020-04-21 09:56:36.563553, 2020-04-21 09:56:36.563553).

Pros

  • Ensure that an order belongs to exactly one resource (group/person) at any type.
  • An order can't be orphaned.

Cons

  • Take a little extra work to setup
  • It can get a little cumbersome if you have more than three resources.

Conclusion

Due to lack of data integrity guarantees, the Type Column should not be used. The only advantage it has is an ORM such as ActiveRecord may take it very easy to use

The Join tables approach is an improvement over polymorphic joins, but it requires creating extra table for each relationship.

The Reverse Belongs-To models approach has few critical cons.

So, the final approach, the Exclusive Arc. Data integrity is maintained via the check constraint.

However, there are few concerns:

  • Multiple null fields. But in the case of PostgreSQL, null values are almost free.
  • Adding a new table requires adding a column to the exclusive belongs-to table. If this was a large, heavily used table there might be an issue with how long the table would be locked. With PostgreSQL, this is not a problem. Nullable fields can be added quickly regardless of table size. The updated check constraint can also be added without blocking concurrent usage.

In conclusion, I suggest using an Exclusive Arc approach to represent a polymorphic association.

© 2020 by Silicon Jungles. All rights reserved.
Last build: 15.05.2020