Materialized View Alternatives

A while back I wrote about materialized views in Postgres. My main motivation was making Postgres’ full-text search faster, and to be able to filter and sort using indexes when searching over data coming from many different tables. Since then, I’ve learned about some alternatives that I want to share.

First, a note about why materialized views are not always the right tool for the job:

1. When creating materialized views, you end writing SQL that duplicates logic you’ve already implemented in Ruby, like this:

class Person < ActiveRecord::Base
  def relatives_names
    relatives.map(&:name).join(", ")
  end
end


class CreateMissionsDatatables < ActiveRecord::Migration
  def up
    execute <<<-eos
      CREATE MATERIALIZED VIEW denormalized_people
      AS
      SELECT
        people.*,
        string_agg(relatives.name, ", ") as relative_names
      FROM people
        LEFT OUTER JOIN relatives ON people.id = relatives.person_id
    eos

In that example, you have to write two versions of how to find and format the relative_names method: one in Ruby and one in SQL. You also duplicate other things such as knowledge of how to join the people and relatives tables (you probably already have this as a has_many association in ActiveRecord).

2. It is pretty messy to write all the SQL to keep a materialized view up to date, even when using my ActiveRecord Materialized Views gem.  It’s even messier to do so in a way that updates rows immediately for 1-to-1 updates, but invalidates them for later updating for 1-to-n updates. (If that last sentence made no sense to you, then you can either check out Dan Chak’s chapter on materialized views, or you can just take my word that optimizing materialized view update triggers is NOT something you want to be thinking about)

Now, the first alternative I can recommend is Sunspot. Backed by the Solr search engine, Sunspot builds indexes that can be used either for full-text search, or for filtering and ordering. Sunspot can index the results of Ruby methods from your model layer, which means no more need to duplicate your logic in SQL!  Furthermore, it is easy to configure Sunspot to update indexes in the background, so you can worry less about optimizing performance of updating your indexes. The drawback is that Sunspot runs as a separate process, making your test suite and production setup all that more complicated.

The next alternative is probably not production ready, but it is an exciting big-picture idea about the future of databases. In a talk called Turning the Database Inside Out, Martin Kleppmann notes that there is this fundamental conflict in database design, between optimizing for DRYness and write speed through normalization, and optimizing for read speed through denormalization. Furthermore, write-ahead-logs are thought of as an implementation detail that are not really part of a database’s public API.  Kleppmann is working on a project called Samza that is rethinking that, by promoting the log to be the basic unit of data storage (that is, optimizing for write speed because appending to a log is very fast), then streaming those logs to other processes that build materialized views on top of them (that is, optimizing for speed on the read side as well).  You should watch the talk: it’s very interesting, and Kleppmann does a better job of explaining his project than I do.

Slim for static sites

I’m a big fan of preprocessing. I prefer Slim, Sass, and CoffeeScript to plain HTML, CSS, and Javascript. They are cleaner to read and easier to type. They are also very easy to use in Rails. But what about using them for static sites, without Rails?

Although Sass and CoffeeScript have command line compilers, I couldn’t find anything as easy to use for Slim.

In the end, I realized that the right way to go is using Middleman, a Ruby framework for building static sites and blogs. It uses the same sort of asset pipeline as Rails does, so all of the compiling and preprocessing is done for you. It’s livereload feature is a big win for productivity as well.  In short, it sets everything up exactly as you need it for a killer static site workflow that includes html and asset precompilation, without you having to worry about the detailed configuration.

So, how can you compile Slim on the command line to use in a static site?  My answer would be: don’t — save yourself some trouble and use Middleman instead!

Fixing font-awesome on Heroku

My icons from the font-awesome-rails gem were showing up fine in my development environment. But in production on Heroku, some showed up as empty boxes, and others did not show up at all.

In the “Network” tab of Google Chrome, I saw that the http response status for the font assets was: 206 Partial Content. Not very informative. A more useful message came from the “Console” tab:

Font from origin http://shielded-falls-2406.herokuapp.com has been blocked from loading by Cross-Origin Resource Sharing policy: No ‘Access-Control-Allow-Origin’ header is present on the requested resource. Origin http://expensemanager.teddywidom.com is therefore not allowed access.

I had recently added a custom domain for my app: expensemanager.teddywidom.com. But the fonts were being requested from the original heroku domain: shielded-falls-2406.herokuapp.com.

The solution was to add this line to config/environment/production.rb:

config.action_controller.asset_host = 'expensemanager.teddywidom.com'

(substituting your own host for ‘expensemanager.teddywidom.com’)

The one caveat to this is that now font-awesome icons will no longer work if you visit the app using the shielded-falls-2406.herokuapp.com domain.  If anyone knows a workaround to this, please share it with us!

Travis CI and Hound

Yesterday, I started using two tools that integrate with Github: Travis CI and Hound CI.

“CI” stands for continuous integration. Before new code is merged into a project, there are certain standard steps that should be taken every time.  CI is when those tasks are automated, so that they happen every single time before code is merged, without a person having to remember to do them.

For instance, every time a pull request is opened, Hound automatically comments on lines that contain stylistic violations. This way, style guidelines get enforced, but the people doing the code review get to spend their time and energy on more big-picture design issues.

Travis is CI for testing: every time there is a pull request, Travis pulls down the code, runs the full test suite (I believe you can even specify multiple versions of Ruby and maybe even gem versions if you want to test compatibility!!), and reports the result on the GitHub pull request. Now, we no longer have to worry about pulling the code down locally and running the test suite ourselves before we merge code: Travis takes care of that for us.

These are both awesome, awesome tools that can save Rubyists tons of time and make our code better. What’s better, they are super easy to set up, and free for open source projects!

Try them out!

Know your industry

I just finished reading Michael Lewis’ new book, Flash Boys. It was a great book (like everything else by Michael Lewis)!

Flash Boys is about high frequency trading (HFT) in financial markets. It has fascinating stories of many different involved parties: from people laying down ultra-straight fiber-optic cables in response to demand for the fastest New York/Chicago internet connection, to a gifted programmer who goes to prison for cloning a Goldman Sachs SVN repository, to a Candian banker who starts his own, fairer stock exchange.

There are so many themes in this book that are important for programmers to consider. One is the need to understand what your code is for. It sounds so simple, so obvious; borderline ridiculous. But in large organizations, there can be many layers of separation between the people who understand the “big picture” — how the business is going to make money — and the people writing the code to make that vision a reality. Consider this excerpt:

[The tech guy responsible for making the bank’s computers run faster] was also keenly aware that he had only the faintest idea of the reason for this incredible new lust for speed. He heard a lot of loose talk about “arbitrage”, but what, exactly was being arbitraged, and why did it need to be done so fast? […] Towards the end I’m like, ‘Excuse me sirs, but what are you doing in the bank?’

Note that “what are you doing in the bank?” was asked towards the end, and not towards the beginning!

HFT may be a special case, but when programmers understand the overall vision of the company they work for, it is a huge win for everybody. So here is a challenge to all my readers: the next time you are asked to implement a feature, go and spend a few days doing the job of whoever in your company is going to end up actually using the feature. Get to know them personally. Get to know the job. Understand how the feature will fit in to those people’s workflows. See if you can think of alternatives that would be even better.

Know your industry. Know your company. Understand the business. Understand the “why”. It’s not only more efficient, it’s more fun. And that’s as good a reason as any to give it a try!

Making junk into treasure

At the Steel City Ruby 2014 conference, Franklin Weber gave a talk called Utils is a Junk Drawer.  It was great!

He pointed out that in despite of trying to adhere to the DRY principle, and liking sharing our code with others, many projects have a “junk drawer” where general utility code that is unrelated to the project’s own business logic goes.

Often, this same code gets copy-pasted between many projects, or multiple people spend time developing the same utilities separately.

For instance, here is some utility code from a personal project of mine:

    # in config/initializers/date_time_formats.rb
    Date::DATE_FORMATS.merge!({
      :month => "%Y-%m";
    })

On his advice, I made this into a small gem called year_month, and published it.  I used this tutorial on making your own gem. One great thing about this tutorial was that you create all the files by hand — no code generators — which makes it easier to understand the minimal necessary structure for gems.