Sooner or later you’ll want to subclass Ruby’s String, Array, or Hash. In this now-free RubyTapas video, you’ll learn why that’s a bad idea… and what to do instead.
Director’s commentary: This was originally published as RubyTapas episode #18 in November 2012. I think this one actually holds up pretty well in terms of script and pacing. And a quick IRB investigation seems to indicate that the demonstrated behavior is still true in Ruby 2.7.1: adding two instances of an Array subclass with the + operator returns a raw Array.
Here’s the video, and beneath it you’ll find the script and code.
Sooner or later you will find yourself wanting a data structure which is almost, but not quite, exactly like a built-in Ruby Array. For instance, let’s say we have a list of tags. We’d like it to behave more or less like an array, except that strings containing spaces should be separated into individual tags on insertion, and converting the list to a string should result in a space-separated list.
tags = TagList.new
tags << "foo" << "bar" << "baz buz"
The first thing that occurs to us is to make TagList a subclass of Array.
class TagList < Array
  def <<(tag)
    tag.to_s.strip.split.each do |t|
      super(t)
    end
    self
  end
  def to_s
    join(" ")
  end
end
tags = TagList.new
tags << "foo" << "bar" << "baz buz"
tags.to_s # => "foo bar baz buz"
tags.grep(/b/) # => ["bar", "baz", "buz"]
At first blush this seems like a perfect solution. Our custom insertion behavior works correctly, the object stringifies the way we want it to, and otherwise it behaves like a normal array.
But one day we discover a fly in the ointment. We have some code that merges two TagLists together. After they are merged, they stop behaving like TagLists!
tl1 = TagList.new(%w[apple banana])
tl2 = TagList.new(%w[peach pear])
tl1.to_s # => "apple banana"
tl2.to_s # => "peach pear"
tl3 = tl1 + tl2
tl3.to_s # => "[\"apple\", \"banana\", \"peach\", \"pear\"]"
On further investigation, we discover that the merged object isn’t even a TagList; it’s an ordinary Array!
tl3.class # => Array
What’s going on here?!
The explanation boils down to limitations in the Ruby implementation. For efficiency, many core class methods are coded in C instead of Ruby. And in some cases, such as this Array addition operator, they are implemented in such a way that the class of the return value is hardcoded.
If we subclass core classes, such as Array, String, and Hash, we will eventually run up against these limitations. The results can be surprising and frustrating.
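To see the hardcoded return classes for yourself, here is a minimal sketch. The class names MyList and MyText are hypothetical and exist only for this demonstration. It checks a few methods whose return class does not follow the subclass, plus one (#dup) that does.

```ruby
# Hypothetical subclasses, used only for this demonstration.
class MyList < Array; end
class MyText < String; end

list = MyList.new(%w[a b])
text = MyText.new("a")

# Array#+ and Array#map build a plain Array, whatever the receiver's class.
p((list + list).class)      # => Array
p(list.map(&:upcase).class) # => Array

# String#+ behaves the same way for String subclasses.
p((text + "b").class)       # => String

# #dup, by contrast, keeps the receiver's class.
p(list.dup.class)           # => MyList
```

Exactly which methods preserve the subclass has varied between Ruby versions, so it is worth checking the ones you depend on in the interpreter you actually run.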
Fortunately, there’s a way around this mess. Instead of subclassing, we can use delegation. Here’s a version of the TagList that doesn’t subclass Array, but is instead implemented in terms of an internal Array.
class TagList
  def initialize(*args)
    @list = Array.new(*args)
  end
  def <<(tag)
    tag.to_s.strip.split.each do |t|
      list << t
    end
    self
  end
  def to_s
    list.join(" ")
  end
  protected
  attr_reader :list
end
tl1 = TagList.new(%w[apple banana])
tl2 = TagList.new(%w[peach pear])
tl1.to_s # => "apple banana"
tl2.to_s # => "peach pear"
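Plain pass-through methods do not have to be written by hand. Ruby’s standard library includes the Forwardable module, which can generate them. The following is a minimal sketch and not part of the original episode; the choice of forwarded methods (size, empty?, include?) is an assumption made purely for illustration.

```ruby
require "forwardable"

# Simplified stand-in for the delegating TagList. The forwarded
# method names are chosen only for illustration.
class TagList
  extend Forwardable

  # Generates TagList#size, #empty? and #include?, each of which
  # simply calls the same method on @list.
  def_delegators :@list, :size, :empty?, :include?

  def initialize(*args)
    @list = Array.new(*args)
  end

  def <<(tag)
    tag.to_s.strip.split.each { |t| @list << t }
    self
  end

  def to_s
    @list.join(" ")
  end
end

tags = TagList.new
tags << "foo" << "baz buz"
puts tags.size            # prints 3
puts tags.include?("buz") # prints true
puts tags.to_s            # prints foo baz buz
```

Forwarding suits methods whose return value needs no wrapping. Anything that should hand back a TagList still needs its own definition.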
To make TagList addition work, we can add a “plus” operator that adds the internal arrays and then wraps the result in a TagList.
class TagList
  def initialize(*args)
    @list = Array.new(*args)
  end
  def <<(tag)
    tag.to_s.strip.split.each do |t|
      list << t
    end
    self
  end
  def to_s
    list.join(" ")
  end
  def +(other)
    self.class.new(list + other.list)
  end
  protected
  attr_reader :list
end
tl1 = TagList.new(%w[apple peach])
tl2 = TagList.new(%w[pear banana])
(tl1 + tl2).to_s # => "apple peach pear banana"
But what about all those other great Enumerable methods, like #map, #select, #grep, or #group_by? Do we have to delegate each one individually?
Thankfully, no. All we need to do is delegate one more method, #each, then include the Enumerable module. All of Enumerable’s methods are implemented in terms of #each, so our TagList now has the full power of Enumerable available.
class TagList
  include Enumerable
  def initialize(*args)
    @list = Array.new(*args)
  end
  def <<(tag)
    tag.to_s.strip.split.each do |t|
      list << t
    end
    self
  end
  def to_s
    list.join(" ")
  end
  def +(other)
    self.class.new(list + other.list)
  end
  def each(*args, &block)
    list.each(*args, &block)
  end
  protected
  attr_reader :list
end
tl1 = TagList.new(%w[apple peach pear banana])
tl1.grep(/p/) # => ["apple", "peach", "pear"]
tl1.map(&:reverse) # => ["elppa", "hcaep", "raep", "ananab"]
tl1.group_by(&:size)
# => {5=>["apple", "peach"], 4=>["pear"], 6=>["banana"]}
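Note that the Enumerable methods shown here return plain Arrays (or a Hash, for #group_by), not TagLists. If a TagList result is wanted, the result has to be wrapped explicitly. Here is a minimal sketch of one way to do that. The #map override is a hypothetical addition, not something the episode defines, and the class is trimmed to the parts the example needs.

```ruby
# Trimmed-down stand-in for the TagList above.
class TagList
  include Enumerable

  def initialize(*args)
    @list = Array.new(*args)
  end

  def each(*args, &block)
    @list.each(*args, &block)
  end

  def to_s
    @list.join(" ")
  end

  # Hypothetical override: let Enumerable#map do the work, then
  # wrap the resulting Array in a new TagList.
  def map(&block)
    return to_enum(:map) unless block
    self.class.new(super)
  end
end

tl = TagList.new(%w[pear apple])
reversed = tl.map(&:reverse)
p reversed.class                        # => TagList
puts reversed.to_s                      # prints raep elppa
p tl.select { |t| t.size > 4 }.class    # => Array (not overridden)
```

The same pattern applies to any other method that should return a TagList. The difference from subclassing is that each wrapped method is an explicit choice rather than a surprise.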
So the lesson here is simple: when you want something that behaves almost, but not quite, like a core Ruby class, just remember to use delegation rather than inheritance.
That’s all for now. Happy hacking!
The issue of return types is a tough one (covariance). Note that Ruby is trying to simplify how core classes behave, and they typically return the base type now.
It’s not clear how delegation actually helps here. In all cases, you need to redefine any Array method that returns a new Array.
For example, you didn’t mention it but
tl2 = tl1.map(&:reverse)
is not a TagList.
Note: typo, you meant `tags << "foo" << "bar" << "baz buz"`