Decoupling Persistence from your Domain

Note: this post is actually a README I’ve written; it’s available on GitHub, along with example source code.


In this README, I’ll show you two simple ways to isolate your business logic from your persistence concerns. The first example uses inheritance and a naming convention; the second, mixins.

However, I want you to walk through a refactoring to understand it. In general, I don’t recommend starting with a pattern unless you know it so well that refactoring into it would feel like a truly needless exercise.

Note: Check out the readme_simple_example directory for a working example of the code in this README. Or, check out the active_record_example directory for a larger example with both an in-memory persistence plugin and an ActiveRecord persistence plugin. (Caveat: I’ve only tested this code on MRI 1.9.3-p125.)

Twitter

Let’s develop Twitter. OK, not really, but let’s start with the following RSpec spec:

require 'ostruct'
require_relative '../twitter_user'

describe Twitter::User do
  let(:user)          { Twitter::User.new }
  let(:tweet)         { OpenStruct.new    }
  let(:tweet_factory) { -> do tweet end   }

  before do
    user.tweet_factory = tweet_factory
  end

  describe "#tweet" do
    it "uses the tweet factory to generate a new tweet" do
      user.tweet("hi").should == tweet
    end
  end
end

Basically, we’ve written a spec that says a “user” should be able to tweet. Since that’s the single most essential feature of the Twitter application, I thought it would make sense to start with that. If we can’t get this abstraction right, then we’re doomed.

The “tweet_factory” bit may seem a little odd. Essentially, we don’t want to tightly couple our User model to a Tweet model; instead, we’d simply like to inject a method for creating tweets into it at runtime. This makes it simpler to test, and makes our User model simpler to maintain. (If you’d like to learn more about this sort of dependency injection, I highly recommend purchasing Avdi’s ebook "Objects on Rails").

Run the spec, watch it fail, then write some code until it passes. You might end up with something like this:

module Twitter
  class User
    attr_writer :tweet_factory

    def tweet(content)
      @tweet_factory.call
    end
  end
end

Great! There are likely a couple of other features of our tweet method that we’ll want to go ahead and add:

#...
it "sets the content of the tweet to the desired text" do
  user.tweet("hi").content.should == "hi"
end

it "associates the tweet with the user" do
  user.tweet("hi").user.should == user
end

Simple enough. Let’s get these tests passing:

module Twitter
  class User
    attr_writer :tweet_factory

    def tweet(content)
      @tweet_factory.call.tap do |t|
        t.content = content
        t.user    = self
      end
    end
  end
end

We’re done! We’ve just implemented Twitter! Oh wait, what about persistence?

Saving users

Clearly, before we can launch our app into the real world, we’re going to need to persist our objects and retrieve them in various ways. Let’s start by simply adding specs for saving users, and for finding all users.

describe Twitter::User do
  #...

  describe ".all" do
    it "should default to empty" do
      Twitter::User.all.should be_empty
    end
  end

  describe "#save!" do
    it "should add the user to the list of all users" do
      user.save!
      Twitter::User.all.should include(user)
    end
  end
end

Seems simple enough. Let’s update our user class and make these specs pass:

module Twitter
  class User
    def self.all
      @users ||= []
    end

    attr_writer :tweet_factory

    def tweet(content)
      @tweet_factory.call.tap do |t|
        t.content = content
        t.user    = self
      end
    end

    def save!
      self.class.all << self
    end
  end
end

Great! There are, of course, flaws in our implementation. For starters, it’s not really even a persistence layer. These objects will die the second our script exits, never to return. Also, there’s a bug. Calling “#save!” multiple times will persist duplicate objects into “.all”. And if we build any more persistence specs, we will absolutely need a “User.truncate!” method that destroys all the users (we’ll want to run that before every test to ensure isolation between tests).

To make this article short, however, let’s ignore those problems and move on to some refactoring.
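If you do want to tackle the two smaller flaws before moving on, a minimal sketch might look like the following. It is only one possible approach, not the code from the example repository: the duplicate check relies on object identity via `include?`, and returning `self` from `save!` is my own assumption. The `tweet` method is left out to keep the sketch focused on persistence.

```ruby
module Twitter
  class User
    def self.all
      @users ||= []
    end

    # Empties the in-memory store; meant to be called before each test
    # so that specs stay isolated from one another.
    def self.truncate!
      all.clear
    end

    def save!
      users = self.class.all
      # Only append when this exact object has not been saved already.
      users << self unless users.include?(self)
      self # assumption: returning the user is convenient for chaining
    end
  end
end

user = Twitter::User.new
user.save!
user.save!                 # second call is a no-op
Twitter::User.all.size     # => 1
Twitter::User.truncate!
Twitter::User.all.empty?   # => true
```

In a spec file, `before { Twitter::User.truncate! }` would give each example a clean slate.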

Refactoring out persistence

We have a problem. Originally, we started out our User spec by describing what a User actually does on Twitter (they tweet!). But now we’ve muddied up our domain with the concerns of our persistence layer.

Is that really such a big deal? Maybe not. I mean, if you’re cool with creating inflexible, tightly coupled systems, then carry on.

There’s all kinds of different ways to solve this problem. There’s the “Rails Way” - pretend it’s not a problem ;-). There’s the data mapper pattern (described notably in Martin Fowler’s “Patterns of Enterprise Application Architecture”). There’s the Active Record pattern (of which the powerful ActiveRecord library is an implementation). There’s Avdi Grimm’s “fig leaf” approach described in his “Objects on Rails” book. There’s Piotr Solnica’s compositional approach described here. And I’m sure there are many, many more that I’m not aware of.

All of those approaches have their merits. In terms of level of effort, the approach I’m about to show may be the simplest (well, except for the “Rails Way”, of course), though it doesn’t go as far with decoupling as the Data Mapper pattern or Solnica’s method.

Let’s start by refactoring out the persistence into a separate class:

module Twitter
  module Persistence
    class User
      def self.all
        @users ||= []
      end

      def save!
        self.class.all << self
      end
    end
  end
end

module Twitter
  class User < Twitter::Persistence::User
    attr_writer :tweet_factory

    def tweet(content)
      @tweet_factory.call.tap do |t|
        t.content = content
        t.user    = self
      end
    end
  end
end

Now run the tests again. They should still all pass.

Next, move the Twitter::Persistence::User class into its own file. I called mine “in_memory_persistence.rb” - since what we’ve written is actually a simple in-memory persistence solution.

We can also move the persistence specs into their own spec file:

require_relative '../in_memory_persistence'

User = Twitter::Persistence::User

describe User do
  describe ".all" do
    it "should default to empty" do
      User.all.should be_empty
    end
  end

  describe "#save!" do
    it "should add the user to the list of all users" do
      user = User.new
      user.save!
      User.all.should include(user)
    end
  end
end

Now we simply need an abstract persistence layer stand-in to unit test our business logic. Create another file called “abstract_persistence_layer.rb” and place the following code in it:

module Twitter
  module Persistence
    class User; end
  end
end

Now you can require this file at the top of your user spec to get those tests to pass again.

Wins

In a way, we’ve isolated our persistence layer from our business logic. We can test them completely independently of each other. When we look at our domain models in this application, they should scream “TWITTER”, not “DATABASE”.

You may have noticed, but we’ve also written an integration test suite for our persistence layer. If we wanted to replace our in-memory persistence layer with a file-system persistence layer, or a database persistence layer, we could test it by simply replacing the require_relative '../in_memory_persistence' in our persistence spec with require_relative '../file_persistence' or require_relative '../database_persistence'. That seems like a nice win.

In reality, you won’t likely be replacing your persistence layer a lot. However, a nice side effect of this sort of decoupling is that it makes it possible to parallelize the work on our project. We could have one team develop the persistence layer while another develops the domain models. Other teams could work on various delivery mechanisms (e.g., a website, a REST API, a smartphone app, etc.) by requiring both the business logic layer and a persistence layer.

Note that we could have used mixins instead of inheritance to separate our persistence layer. In fact, I’d prefer that. We could remove the inheritance from our domain model completely, and simply let the persistence layer inject modules into our domain models for supporting the persistence concerns:

#user.rb
module Twitter
  class User
    def tweet(content)
      #...
    end
  end
end


#in_memory_persistence.rb
require_relative 'user'
module Twitter
  module Persistence
    module User
      def self.included(base)
        base.extend ClassMethods
      end

      module ClassMethods
        def all
          #...
        end
      end

      def save!
        #...
      end
    end
  end
end

Twitter::User.send :include, Twitter::Persistence::User

Now we no longer need to define an abstract persistence layer stand-in for testing our user domain model. The only problem with this approach is that it would likely make working with ORMs like ActiveRecord tricky, since they assume that they’re bolted on to your models with inheritance.
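For reference, here is the mixin version with the elided bodies filled in, using the same in-memory code from the inheritance example. Treat it as a sketch: both "files" are collapsed into one listing so it runs on its own, and the `Tweet` struct is a hypothetical stand-in for whatever the tweet factory would really build.

```ruby
# user.rb: pure domain logic, no superclass and no persistence.
module Twitter
  class User
    attr_writer :tweet_factory

    def tweet(content)
      @tweet_factory.call.tap do |t|
        t.content = content
        t.user    = self
      end
    end
  end
end

# in_memory_persistence.rb: the plugin that adds persistence.
module Twitter
  module Persistence
    module User
      # Hook that runs when the module is included; adds the class-level API.
      def self.included(base)
        base.extend ClassMethods
      end

      module ClassMethods
        def all
          @users ||= []
        end
      end

      def save!
        self.class.all << self
      end
    end
  end
end

Twitter::User.send :include, Twitter::Persistence::User

# Hypothetical tweet object, used only for this demonstration.
Tweet = Struct.new(:content, :user)

user = Twitter::User.new
user.tweet_factory = -> { Tweet.new }
user.tweet("hi").content          # => "hi"
user.save!
Twitter::User.all.include?(user)  # => true
```

The domain spec only needs to load user.rb; the persistence spec loads the plugin, which in turn pulls in the domain class.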

Sunday, March 18, 2012 — 1 note

Twitter: worst SMTP/IMAP implementation EVAR?

If you think about it, Twitter is the worst implementation of SMTP/IMAP ever.

First, let’s break it down:

  • Your timeline is your inbox. Twitter gives you a private inbox.
  • Your profile is really your Sent Box. Twitter makes your Sent Box public. (At least by default).

When you send a tweet, that tweet gets delivered to all of your followers. Kind of like an email, with the “To:” field defaulted to “All Followers”.

Catching on?

When you sign in to twitter.com and read your tweets, it’s basically an email client accessed over HTTP (which in turn accesses your inbox over IMAP).

When you send a tweet - well, that’s SMTP.

So why is Twitter the WORST IMPLEMENTATION OF SMTP/IMAP EVAR? Well, because they didn’t implement it with SMTP and IMAP. The best thing about email is that it is decentralized. In other words, it scaled. Kind of like DNS. That scaled too. Why? Because they’re both beautiful decentralized systems.

Twitter, on the other hand, is this hopelessly centralized hot mess. OK, that’s not entirely true. For a centralized service, Twitter is kicking ass at scaling it up. But, aside from their search service (which clearly does require some centralization of data), I can’t help but wonder if 90% of their scaling efforts could have been avoided had they simply used SMTP and IMAP to build twitter behind the scenes (along with a host of SMTP/IMAP servers all over the world).

Monday, February 13, 2012

Use-Case Driven Design: Interactors, Entities, and Boundaries

Wait, what? I thought it was Model-View-Controller. Nope. That’s for a user interface. The UI is a detail. The database is a detail. Your application is neither. And it’s certainly not a detail.

So what is your application? It starts with a use case. A use case makes no mention of the UI or the database. Your use case is delivery and storage agnostic. Your use case is about business logic.

Your application is divided into interactors and entities. The interactors contain use-case specific business logic. The entities contain use-case agnostic business logic. The interactors orchestrate the interaction of entities to satisfy a use-case.

The UI depends on your application (through Boundary interfaces). The persistence gateway depends on your application (through Boundary interfaces). Your application depends on neither the UI nor the persistence gateway. Your application is pure business logic. Not convinced? Watch Uncle Bob’s Ruby Midwest 2011 Keynote, “Architecture: The Lost Years”. Or better yet, check out Episode 7 of the cleancoders.

Uncle Bob makes a fabulous point in both of those videos. Software architecture isn’t about frameworks or databases. It’s not about Rails, or Struts, or Hibernate, or NoSQL. It’s about deferring decisions about those kinds of details as long as possible. Whenever you draw a boundary in your system, the lines of dependencies should only cross one way. That is the definition of software architecture.
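To make the vocabulary concrete, here is a minimal sketch of one use case. Every name in it (`Tweet`, `PostTweet`, `InMemoryTweetGateway`, the `save` method, the 140-character rule) is my own illustration rather than anything from the talks. Ruby has no formal interfaces, so the boundary is simply the set of methods the interactor expects its collaborator to respond to.

```ruby
# Entity: use-case agnostic business logic.
class Tweet
  MAX_LENGTH = 140

  attr_reader :author, :content

  def initialize(author, content)
    @author  = author
    @content = content
  end

  def valid?
    !content.empty? && content.length <= MAX_LENGTH
  end
end

# Interactor: use-case specific business logic. It talks to storage only
# through a boundary: any object that responds to #save(tweet).
class PostTweet
  def initialize(tweet_gateway)
    @tweet_gateway = tweet_gateway
  end

  def call(author, content)
    tweet = Tweet.new(author, content)
    return :rejected unless tweet.valid?

    @tweet_gateway.save(tweet)
    :posted
  end
end

# A detail: one possible gateway. It depends on the application
# (it knows about tweets); the application knows nothing about it.
class InMemoryTweetGateway
  attr_reader :tweets

  def initialize
    @tweets = []
  end

  def save(tweet)
    @tweets << tweet
  end
end

gateway    = InMemoryTweetGateway.new
post_tweet = PostTweet.new(gateway)

post_tweet.call("matt", "hi")        # => :posted
post_tweet.call("matt", "x" * 141)   # => :rejected
gateway.tweets.size                  # => 1
```

Swapping in a database-backed gateway would not touch `Tweet` or `PostTweet`, which is the one-way dependency described above.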

Saturday, February 11, 2012

NoSQL in the Real World - The Video, Pics and Slides

gilt-tech:

Thanks to everyone who came to Gilt Tech’s latest tech talk, NoSQL in the Real World.

We had a great series of speakers including:

  • Ara Anjargolian - Redis
  • Matt Parker - CouchDB
  • Sean Cribbs - Riak
  • Edward Capriolo - Cassandra
  • Luke Gotszling - MongoDB

Huge thanks to Rockman and Maureen for organizing the event, and to AOL ventures for sponsoring and hosting.

Check out the video …

The Pictures …

The Slides …

Riak

Redis

Cassandra

Membase and MongoDB

CouchDB

Saturday, February 11, 2012 — 6 notes

Independent Deployability with Ruby on Rails

In pretty much any well-designed application, you’ll discover the following phenomenon (among others): dependencies point from the concrete to the abstract.

What would that look like in a Rails application? For starters, Rails would depend on your application, but your application wouldn’t know a damn thing about Rails. In other words, your Gemfile would look like:

gem "rails"
gem "your_app"

"your_app" is an abstraction, containing all of your business logic, but completely unconcerned with concrete details like the UI - or the database. It should, however, define an interface that a data-persistence layer could plug into. We’ll need to update that Gemfile:

gem "rails"
gem "your_app"
gem "some_persistence_layer"

"some_persistence_layer" would depend on "your_app" (so that it can access the persistence interface), but "your_app" would know nothing about "some_persistence_layer". It’s basically a plugin to your application:

YourApp.persistence = SomePersistenceLayer
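Here is a rough sketch of what that plugin point could look like. In a real project each module would live in its own gem; they are collapsed into one listing here so it runs on its own. The `persistence` accessor, the `SignUp` use case, and the `save` method are all assumptions made for illustration.

```ruby
# your_app: pure business logic plus a slot for a persistence plugin.
module YourApp
  class << self
    # Whatever object is assigned here must respond to #save(record).
    attr_accessor :persistence
  end

  # Hypothetical use case that relies only on the configured plugin.
  class SignUp
    def call(name)
      YourApp.persistence.save(name: name)
    end
  end
end

# some_persistence_layer: a concrete detail that plugs into your_app.
module SomePersistenceLayer
  RECORDS = []

  def self.save(record)
    RECORDS << record
    record
  end
end

# Wiring, done once at boot (for example in a Rails initializer).
YourApp.persistence = SomePersistenceLayer

YourApp::SignUp.new.call("alice")
SomePersistenceLayer::RECORDS   # => [{:name=>"alice"}]
```

Nothing inside `YourApp` names `SomePersistenceLayer`; the only line that knows about both is the wiring line.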

So what do you get out of this? For starters, independent deployability. Of course, that doesn’t mean much since you’re packaging it all up with bundler and shipping it out, but there’s a valuable side effect of independent deployability: division of labor. You could have three teams working in parallel - one building the UI, one hammering out the actual app, and one wiring up the persistence layer.

You also stand a decent chance of ending up with a quick test suite, testing each module in isolation from everything else.

Note: I got the ideas in this blog post from watching several episodes of cleancoders.com.

Sunday, February 5, 2012

Specdown, README Driven Development, and the evolution of testing

Testing used to be simple. Derive a class from Test::Unit, write a method, make an assertion, red-green-refactor, and you’re done.

Then Dan North fucked all that up. Suddenly we had to actually talk to our stakeholders. We began drafting acceptance criteria in a domain specific language understandable by the programming-impaired. Our testing API exploded. Gone were the days of assert. Suddenly we had to memorize a plethora of methods in order to write our tests: feature, scenario, describe, context, it, should, include, be, example, shared_examples, include_examples, cover, etc. etc. etc.

In 2010, Tom Preston-Werner wrote a blog post about README Driven Development. He tried to strike a balance between over-specified waterfall techniques and under-specified agile cowboy coding.

I embraced README Driven Development. I started all of my projects with a README. I reaped all kinds of benefits from it:

  • I started out focused on my API, not implementation details
  • I thought about how my potential users might use my project (and how I might convince them to try it)
  • I wrote about my project without any of the constraints of Gherkin or Connextra

I write clear, concise, and comprehensive READMEs. Anyone should be able to read my README and learn everything they need to know to use my project.

But I had a problem. Every step I took after my README took me further and further from that clarity. My cucumber features replaced my concise prose with repetitive “Given/When/Then” scenarios. They hid my API behind step definitions, militating against the likelihood that they would ever serve as the primary documentation for my project. My RSpec suffered a similar fate; my narrative was lost behind a sea of syntax that was ironically intended to give tests a more human-readable cadence.

Even worse, I was now maintaining documentation about my project in three different places: my README, my cucumber, and my rspec. And my README could all too easily get out of date. Like code comments, it had the propensity to lie.

Then I had an idea: what if my README was executable? I wouldn’t need any Gherkin or step definitions or rspec. And even better, my README wouldn’t lie - at least no more than any other test.

So I created specdown. It makes your READMEs executable, letting you expose your API while giving you a way to hide your assertions behind plain English (or any other language). Check out the README and hit me up on twitter or github if you want to discuss or have any issues.

Saturday, February 4, 2012

Liquid Layouts and Matrix Transposition

Liquid layouts: the practice (among other things) of rendering a single ul/li group (like this):

  <ul>
    <li>1</li>
    <li>2</li>
    <li>3</li>
    <li>4</li>
    <li>5</li>
    <li>6</li>
    <li>7</li>
    <li>8</li>
    <li>9</li>
    <li>10</li>
    <li>11</li>
  </ul>

into a multi-column layout like this:
  
  1   2   3
  4   5   6
  7   8   9
  10  11
using CSS like this:

  <style type="text/css">
    ul {width: 30%}
    li {width: 33%; text-align: left; float: left}
  </style>
But what if you wanted the data sorted by column, not by row? For this, CSS is inadequate: it’s only capable of flowing from left to right, top to bottom. So in order to achieve a column sorting, we need to re-sort the data, like this:

  <ul>
    <li>1</li>
    <li>5</li>
    <li>9</li>
    <li>2</li>
    <li>6</li>
    <li>10</li>
    <li>3</li>
    <li>7</li>
    <li>11</li>
    <li>4</li>
    <li>8</li>
    <li></li>
  </ul>
Using the same CSS, this would render:
  1   5   9
  2   6   10
  3   7   11 
  4   8
My new liquidity gem adds a “column_sort” method to the Array class, making it simple to re-sort your collection for a column-sorted liquid layout.

It achieves this by copying the array it’s called on, padding the copy with nil until its length is a multiple of n, splitting it into n rows, performing a simple matrix transposition on those rows, and then flattening the result before returning.

For example, calling (1..11).to_a.column_sort(3) would perform the following transformations:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, nil]
[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, nil]]
[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, nil]]
[1, 5, 9, 2, 6, 10, 3, 7, 11, 4, 8, nil]
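The transformations above can be reproduced with a few lines of plain Ruby. This is a sketch of the technique, not the gem’s source: it is written as a stand-alone `column_sort(items, columns)` helper (a hypothetical signature) rather than an `Array` method, and the early return for an empty list is my own addition.

```ruby
# Re-sorts a row-ordered list so that it reads top-to-bottom, column by
# column, when rendered left-to-right in `columns` columns.
def column_sort(items, columns)
  return [] if items.empty?

  # Number of rows the rendered layout will have.
  rows = (items.length.to_f / columns).ceil

  # Pad a copy with nil so it divides evenly; the caller's array is untouched.
  padded = items + [nil] * (rows * columns - items.length)

  # One slice per column, transposed into rows, then flattened one level.
  slices = padded.each_slice(rows).to_a
  slices.transpose.flatten(1)
end

column_sort((1..11).to_a, 3)
# => [1, 5, 9, 2, 6, 10, 3, 7, 11, 4, 8, nil]
```

The trailing `nil` entries are what become empty `<li></li>` elements in the markup, keeping the float-based grid aligned.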

Sunday, April 4, 2010 — 13 notes