Studying the Ruby Docile Gem to Learn About DSLs

A Stack Overflow user suggests studying the Docile source code to learn how to write DSLs in Ruby, but the source code is dense for programmers new to Ruby DSLs. This blog post demonstrates how to write a DSL that behaves similarly to the Docile gem, but with much less code.

At a fundamental level, Docile evaluates a block in the context of an object with the instance_eval() method:

arr = []

arr.instance_eval do
  push(1)
  push(2)
  reverse!
end

arr # => [2, 1]

The instance_eval() portion can be abstracted to a method as follows:

def dsl(obj, &block)
  obj.instance_eval(&block)
end

arr = []

dsl(arr) do
  push(1)
  push(2)
  pop
  push(3)
end

arr # => [1, 3]

The dsl() method takes an object and a block as arguments and simply sends the :instance_eval message to the object with the block as an argument.
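As written, dsl() returns whatever the last expression in the block evaluates to. A small variation, sketched here under the hypothetical name dsl_returning, hands back the object itself so the result can be built and assigned in one step:

```ruby
# Hypothetical variant of dsl() that always returns the object,
# regardless of what the block's last expression evaluates to.
def dsl_returning(obj, &block)
  obj.instance_eval(&block)
  obj
end

list = dsl_returning([]) do
  push(1)
  push(2)
  size # the block's own value (2) is discarded
end

list # => [1, 2]
```

Returning the object is only a convenience; the evaluation itself is unchanged.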

Suppose there is a Pizza class to make pizzas. The same dsl() method can be used to customize an instance of the Pizza class.

Pizza = Struct.new(:cheese, :pepperoni, :bacon, :sauce)
obj = Pizza.new

dsl(obj) do |pizza|
  pizza.cheese = true
  pizza.pepperoni = true
  pizza.sauce = :extra
end

obj # => #<struct Pizza cheese=true, pepperoni=true, bacon=nil, sauce=:extra>

The prior example uses a block variable (instance_eval passes the receiver to the block as an argument) because a bare cheese = true is interpreted by Ruby as local variable assignment. If the block variable is omitted, an explicit self is required to make it clear that self.cheese = true is a method call, not local variable assignment.

my_pie = Pizza.new

dsl(my_pie) do
  self.cheese = true
  self.pepperoni = false
  self.sauce = :none
end

my_pie # => #<struct Pizza cheese=true, pepperoni=false, bacon=nil, sauce=:none>
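To see the pitfall directly, here is a minimal sketch of what happens when self is left off. The Topping struct is a hypothetical stand-in for Pizza, and dsl() is repeated so the snippet runs on its own:

```ruby
Topping = Struct.new(:cheese, :sauce) # hypothetical stand-in for Pizza

def dsl(obj, &block)
  obj.instance_eval(&block)
end

slice = Topping.new

dsl(slice) do
  cheese = true        # local variable assignment; Topping#cheese= is never called
  self.sauce = :extra  # explicit receiver, so the setter method runs
  cheese               # reads the local variable, not the struct member
end

slice.cheese # => nil
slice.sauce  # => :extra
```

No error is raised for the bare assignment, which is what makes this mistake easy to miss.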

Writing Your First Ruby DSL (DSL to generate HTML)

DSLs are easy to write in Ruby and are an elegant way to solve well-defined problems. This post shows how to build a DSL that generates HTML, without going into as much detail as some of the other blog posts on the topic. Here is the desired behavior of the DSL:

html = HtmlDsl.new do
  html do
    head do
      title 'yoyo'
    end
    body do
      h1 'hey'
    end
  end
end
p html.result # => "<html><head><title>yoyo</title></head><body><h1>hey</h1></body></html>"

The HtmlDsl class is initialized with a block that specifies the nesting of the tags and the content. We’ll start by simplifying the problem and writing a DSL that does not handle nested tags:

class HtmlDsl
  attr_reader :result
  def initialize(&block)
    instance_eval(&block)
  end

  private

  def method_missing(name, *args)
    tag = name.to_s
    content = args.first
    @result ||= ''
    @result += "<#{tag}>#{content}</#{tag}>"
  end
end

html = HtmlDsl.new do
  h1 'h1 body'
  h2 'h2 body'
end
p html.result # => "<h1>h1 body</h1><h2>h2 body</h2>"

When the HtmlDsl class is initialized, instance_eval(&block) is run, which executes the block in the context of the newly created instance. The first line of the block calls the method :h1 with the argument ‘h1 body’. The HtmlDsl class does not define an h1 method, so method_missing() is called. In method_missing(), the name parameter equals the name of the method that was called (:h1 in this case) and the args parameter is an array of the arguments ([‘h1 body’]). The tag and content are concatenated and stored in the @result instance variable.
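To make the name and args parameters concrete, here is a small sketch that only records what method_missing() receives. The CallRecorder class is hypothetical and is not part of the HTML DSL:

```ruby
# Hypothetical class that logs every undefined method call it receives.
class CallRecorder
  attr_reader :calls

  def initialize
    @calls = []
  end

  private

  # name is a Symbol, args is an Array of the arguments that were passed
  def method_missing(name, *args)
    @calls << [name, args]
    nil
  end
end

rec = CallRecorder.new
rec.h1('h1 body')
rec.div('a', 'b')

rec.calls # => [[:h1, ["h1 body"]], [:div, ["a", "b"]]]
```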

We can adjust this code to handle nested blocks and account for the nested nature of HTML markup.

class HtmlDsl
  attr_reader :result
  def initialize(&block)
    instance_eval(&block)
  end

  private

  def method_missing(name, *args, &block)
    tag = name.to_s
    content = args.first
    @result ||= ''
    @result << "<#{tag}>"
    if block_given?
      instance_eval(&block)
    else
      @result << content.to_s # to_s guards against a tag called with no content
    end
    @result << "</#{tag}>"
  end
end

html = HtmlDsl.new do
  html do
    head do
      title 'yoyo'
    end
    body do
      h1 'hey'
    end
  end
end
p html.result
#=> "<html><head><title>yoyo</title></head><body><h1>hey</h1></body></html>"

HtmlDsl#method_missing() traverses the nested block structure and continues evaluating the nested blocks and adding opening tags to @result until hitting a method without a block. After hitting a method that doesn’t have a block, it will add tags and content associated with the method to @result and start adding the closing tags to @result.
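The order of that traversal can be made visible with a tracing variant. This is only a sketch: TracingHtmlDsl, @events, and @depth are hypothetical names, and the class records events instead of building an HTML string:

```ruby
# Hypothetical variant of HtmlDsl that logs the traversal order.
class TracingHtmlDsl
  attr_reader :events

  def initialize(&block)
    @events = []
    @depth = 0
    instance_eval(&block)
  end

  private

  def method_missing(name, *args, &block)
    @events << "#{'  ' * @depth}open #{name}"
    if block
      # descend into the nested block before closing this tag
      @depth += 1
      instance_eval(&block)
      @depth -= 1
    else
      @events << "#{'  ' * (@depth + 1)}text #{args.first}"
    end
    @events << "#{'  ' * @depth}close #{name}"
  end
end

trace = TracingHtmlDsl.new do
  body do
    h1 'hey'
  end
end

puts trace.events
# open body
#   open h1
#     text hey
#   close h1
# close body
```

Each opening event is logged before the nested block runs and each closing event after it returns, which is why the tags come out properly nested.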

I would like to thank Uri Agassi for helping me find a solution to this problem on Stack Overflow.

Ruby’s Method Lookup for attr_* Methods

Ruby’s attr_* methods are defined in the Module class as instance methods (private before Ruby 2.5, public since). The Module class is not in the ancestor hierarchy of a user-defined class, so at first glance it’s surprising that user-defined classes can call the attr_* methods.

class A; end
# Module is not one of A's ancestors
A.ancestors # => [A, Object, Kernel, BasicObject]
# No attr_* methods are available to instances of A, public or private
(A.instance_methods + A.private_instance_methods).grep(/attr/) # => []

The attr_* methods work, so they must be defined somewhere that a class can look them up.

class Dog
  attr_reader :name
  def initialize
    @name = 'fido'
  end
end
Dog.new.name # => 'fido'

The attr_reader method is called in the Dog class, so it needs to be defined somewhere in the singleton class ancestry chain, not the ancestor chain for the regular instance methods. It turns out that the Module class is in fact included in the singleton class ancestry chain.

class Dog; end
Dog.singleton_class.ancestors
# => [#<Class:Dog>, #<Class:Object>, #<Class:BasicObject>, Class, Module, Object, Kernel, BasicObject]
# attr_reader is an instance method of Module, so the Dog class object responds to it
Dog.respond_to?(:attr_reader, true) # => true

A user-defined class has different ancestor hierarchies for the regular class and the singleton class. I didn’t know about this until I went on a method hunt for the attr_* methods. When you’re working with Ruby and something doesn’t make sense, keep digging and asking questions – this approach will answer your questions and help other concepts to fall into place.
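The method hunt itself can be scripted. This sketch uses Object#method and Method#owner to ask Ruby where attr_reader lives; the Dog class is repeated so the snippet runs on its own:

```ruby
class Dog; end

# Ask Ruby which class or module actually defines attr_reader
Dog.method(:attr_reader).owner                 # => Module

# Module is reachable from the Dog class object...
Dog.singleton_class.ancestors.include?(Module) # => true

# ...but not from instances of Dog
Dog.ancestors.include?(Module)                 # => false
Dog.new.respond_to?(:attr_reader, true)        # => false
```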

Ways to Define Singleton Methods in Ruby

Singleton methods are methods that live in the singleton class and are only available for a single object (unlike regular instance methods that are available to all instances of the class). Singleton methods are often referred to as class methods, but that’s confusing because Ruby doesn’t have class methods. This post outlines the many different ways singleton methods can be defined.

Methods defined with dot notation are singleton methods for the receiver object.

s = 'a string'
def s.meth; end
s.singleton_methods # => [:meth]

The instance_eval() method evaluates the block in the context of the receiver, so methods defined in an instance_eval() block are singleton methods for the receiver.

x = 'moo'
x.instance_eval do
  def cow; end
end
x.singleton_methods # => [:cow]

Object#define_singleton_method() is the most explicit way to define singleton methods.

word = 'boo'
word.define_singleton_method(:hi) do
  'hey!'
end
word.hi # => 'hey!'
word.singleton_methods # => [:hi]

Singleton methods for class objects can be added by opening the singleton class and defining the methods. After all, singleton methods are simply instance methods defined in the singleton class.

class A
  class << self # opens the singleton class
    def coolio; end
  end # closes singleton class
end
A.singleton_methods # => [:coolio]

Extending a class with a module makes the module’s instance methods available as singleton methods of the class.

module M
  def surf; end
end

class Watersport
  extend M
end
Watersport.singleton_methods # => [:surf]

The most common technique for adding singleton methods to classes is with dot notation (same idea as the first example):

class Fish
  def Fish.swim; end
  def self.yummy?; end
end
Fish.singleton_methods # => [:swim, :yummy?]

When adding singleton methods to a class, it’s better to use the self keyword instead of the class name, so if the class name changes, edits are not required in multiple places.

There are multiple techniques to create singleton methods in Ruby, but there are not any ways to make class methods because there are no class methods in Ruby. There are singleton methods for class objects that function like class methods, but it’s better to not refer to these as ‘class methods’. Get familiar with all the different techniques to define singleton methods, since all of them are used in practice and it can be confusing if you don’t realize they’re all doing the same thing.
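As a quick check that these techniques really do the same thing, the sketch below defines four methods on a hypothetical Boat class in four different ways and then looks inside the singleton class:

```ruby
class Boat # hypothetical class
  def self.float; end   # dot notation with self

  class << self         # opening the singleton class
    def sink; end
  end
end

def Boat.anchor; end    # dot notation with the class name

Boat.define_singleton_method(:dock) { 'docked' }

# All four end up as instance methods of Boat's singleton class
Boat.singleton_class.instance_methods(false).sort # => [:anchor, :dock, :float, :sink]
Boat.singleton_methods.sort                       # => [:anchor, :dock, :float, :sink]
```

Methods gained through extend are a slight exception: they show up in singleton_methods but live in the module, not in the singleton class itself.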

Ruby’s BasicObject#instance_eval method

Here’s a description of the BasicObject#instance_eval method from the Ruby documentation (with some edits for clarity):

Evaluates the given block within the context of the receiver. In order to set the context, the variable self is set to the receiver while the code is executing, giving the code access to the receiver’s instance variables.

In other words, the instance_eval method assigns self to the receiver in the block, thus providing access to the receiver’s instance variables, public methods, and private methods.

class A
  def initialize
    @crunk = 'yeahya'
  end

  def grill
    'platinum'
  end

  private

  def rapper
    'lil john'
  end
end

a = A.new
a.instance_eval do
  p self # => #<A:0x007fbf1b3313a0 @crunk="yeahya">
  p @crunk # => "yeahya"
  p grill # => "platinum"
  p rapper # => "lil john"
end

Private methods are accessible whenever an explicit receiver is not required. In the instance_eval block, self is assigned to the receiver, so the implicit self can be relied on and private methods can be called. See this blog post for more information about private methods in Ruby.
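The contrast is easiest to see side by side. In this sketch (the Vault class is hypothetical), the same private method fails with an explicit receiver and succeeds inside instance_eval:

```ruby
class Vault # hypothetical class
  private

  def combination
    '12-34-56'
  end
end

vault = Vault.new

# Calling a private method with an explicit receiver raises NoMethodError
error = begin
  vault.combination
rescue NoMethodError => e
  e
end
error.class # => NoMethodError

# Inside instance_eval, self is vault, so the implicit receiver works
vault.instance_eval { combination } # => "12-34-56"
```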

instance_eval can be used to modify the instance variables of an object:

obj = Object.new
obj.instance_variable_set(:@b, 'bob')
obj.instance_eval do
  @b = 'brain'
end
obj.instance_variable_get(:@b) # => 'brain'

Methods defined in the instance_eval block are singleton methods and are only accessible by the receiver, not all instances of the class.

obj = Object.new
obj.instance_eval do
  def hi
    'objects have feelings too'
  end
end
p obj.hi # => 'objects have feelings too'
p obj.singleton_methods # => [:hi]
Object.new.hi # => NoMethodError: undefined method `hi'

The ability to change the context of a program to a specific object is another powerful feature of the Ruby programming language and is one of the many reasons Ruby is an awesome programming language.

Ruby Modules

The Ruby documentation has an awesome description of modules that is worthy of some elaboration. Here’s the description:

A Module is a collection of methods and constants. The methods in a module may be instance methods or module methods. Instance methods appear as methods in a class when the module is included, module methods do not. Conversely, module methods may be called without creating an encapsulating object, while instance methods may not. (See Module#module_function)

This post clarifies the Ruby documentation description by showing ‘module methods’ are actually singleton methods, demonstrating how to call a module’s singleton methods, and also demonstrating how to create a singleton method with the same name and functionality as an instance method.

module M
  def self.hi
    'hi from module method'
  end
end

M.hi # => 'hi from module method'

# another way to call M.hi
M::hi # => 'hi from module method'

Module methods are simply singleton methods for the module object.

module M
  def self.hi
    'hi from module method'
  end
end

M.singleton_methods # => [:hi]

# singleton methods are instance methods
# defined in the singleton class
M.singleton_class.instance_methods(false) # => [:hi]

This blog post has a more detailed explanation of singleton methods.

When a module is included in a class, the singleton methods are ignored and the instance methods become available to the class.

module Calculator
  def self.about
    'I like computation'
  end

  def add(x, y)
    x + y
  end
end

class A
  include Calculator
end

# instance methods are accessible
p A.new.add(3, 4) # => 7

# singleton methods are not included in the class
p A.about # => NoMethodError

A module’s singleton methods can be called without being included in a class, but a module’s instance methods need to be included in a class to be called.

module Helper
  def self.about
    'i am helpful'
  end

  def full_name(first, last)
    "#{first} #{last}"
  end
end

Helper.about # => 'i am helpful'

# Helper#full_name() can only be called if it's included in a class
Helper.full_name('bob', 'lob') # => NoMethodError

The Module#module_function() method makes a singleton method with the same name as the instance method, so the singleton method can be called when the module is not included in a class. Continuing the prior example, module_function() makes full_name() a singleton method, so Helper.full_name(‘bob’, ‘lob’) works.

module Helper
  def full_name(first, last)
    "#{first} #{last}"
  end
  module_function :full_name
end

Helper.full_name('bob', 'lob') # => 'bob lob'

Helper.singleton_methods # => [:full_name]
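One detail worth knowing is that module_function() also makes the instance-method version private. The sketch below uses hypothetical Greeter and Host names to show both sides:

```ruby
module Greeter # hypothetical module
  def greet(name)
    "hello #{name}"
  end
  module_function :greet
end

class Host # hypothetical class
  include Greeter

  def welcome(name)
    greet(name) # the private instance copy is callable from inside the class
  end
end

Greeter.greet('bob')                           # => "hello bob"
Host.new.welcome('bob')                        # => "hello bob"
Host.private_instance_methods.include?(:greet) # => true
Host.new.respond_to?(:greet)                   # => false
```

So the module can be used as a namespace of functions, or mixed in without widening the public interface of the including class.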

Here’s an alternate description of modules, inspired from the description in the Ruby documentation:

A Module is a collection of singleton methods, instance methods and constants. Instance methods appear as methods in a class when the module is included, singleton methods do not. Singleton methods may be called without being included in a class, but instance methods need to be included in a class to be called. The Module#module_function method can be used to make a singleton method with the same name as an instance method, so the singleton method can be called without being included in a class.

Bypassing Ruby Scope Gates (Flattening the Scope)

A Ruby program creates a new scope whenever it encounters a scope gate (the def, class, and module keywords). When a new scope is created, local variables from earlier scopes are not available, but the scope gates can be bypassed with closures. Read this blog post if you’re not familiar with Ruby’s scope gates.

The following code does not work because the class keyword creates a new scope:

# top level scope
x = 'bob'
class A # begin class scope
  puts x # => NameError
end # end class scope
# back in top level scope

The local variable x is defined in the top level scope (outer scope) and is not available in the class scope (inner scope). Ruby closures retain local variable bindings that are in place and can be used to bypass scope gates. Read this blog post if you’re not familiar with Ruby closures.
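Here is a minimal sketch of a closure carrying a binding through a scope gate. The names count, increment, and call_twice are hypothetical:

```ruby
count = 0
increment = lambda { count += 1 } # the lambda closes over count

# def is a scope gate, so count is not visible by name in here,
# but the lambda still holds its binding
def call_twice(callable)
  2.times { callable.call }
end

call_twice(increment)

count # => 2
```

The method never sees count directly; it only receives an object that remembers where count lives.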

The block passed to Class.new is a closure: it captures the top level local variable bindings and makes them available in the class definition.

# top level scope
x = 'bob'
A = Class.new do # begin closure
  puts x
end # end closure

Module#class_eval allows scope gates for existing classes to be bypassed.

class BBB; end
x = 'bob'
# use class_eval to bypass the scope gate for an existing class
BBB.class_eval do
  p x # => "bob" (the local variable is accessible)
end

# local variables are not accessible when
# a class is reopened with the class keyword
class BBB
  p x # => NameError
end

Method scope gates are similar to class scope gates and can be bypassed with a closure:

class A # begin class scope
  x = 'bob'
  define_method :hi do # begin closure
    x
  end # end closure
end # end class scope
A.new.hi # => 'bob'

The define_method closure captures the class’s local variable bindings and makes them available to the hi() method.

Using closures to bypass scope gates is referred to as flattening the scope.
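A flattened scope can also be shared between several methods. In this sketch (the Tally class and its method names are hypothetical), two define_method closures capture the same local variable:

```ruby
class Tally # hypothetical class
  total = 0 # local to the class body, invisible to methods defined with def

  define_method(:bump)  { total += 1 }
  define_method(:total) { total }
end

a = Tally.new
b = Tally.new
a.bump
b.bump

a.total # => 2
```

Because the variable belongs to the class body rather than to any instance, every Tally object sees the same count, which may or may not be what you want.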