Nikolay Sturm's Blog

Musings of a DevOp

Multi-Database Setup With Rails and RSpec


Last week I had to integrate one of our Rails applications at work with an additional database. While the Rails parts of the setup are quite well documented on the net, my Google Fu failed me when it came to making it work with RSpec. So here are my findings…

Integration with Rails

Adding additional databases to a Rails application is quite simple: all it takes is a database connection in database.yml and establishing the proper connection in the associated models.

Let’s say we want to integrate the foo database. The additional connection might be:

foo:
  adapter: mysql2
  database: foo
  encoding: utf8
  username: root
  password: secret

To access the foo database, we conventionally put the models in a Foo namespace and have them establish the proper connection:

class Foo::Bar < ActiveRecord::Base
  establish_connection 'foo'
end

DRY it up

That’s almost it for the Rails part. At this point our application can access the foo database, but we may start wondering if we could somehow dry up database.yml, as it now contains at least three very similar entries.

The answer is merge keys. This slightly esoteric feature of YAML allows one block to include definitions from another block. In our case all our connections are identical, except for the database they use. Let’s create a block with common directives and merge it into each connection.

defaults: &defaults
  adapter: mysql2
  encoding: utf8
  username: root
  password: secret

development:
  database: development
  <<: *defaults

test:
  database: test
  <<: *defaults

foo:
  database: foo
  <<: *defaults
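
To check that the merge keys expand the way we expect, the file can be loaded from plain Ruby. The following is a minimal sketch, not what Rails does internally: it assumes a reasonably recent Ruby (one where YAML.safe_load accepts aliases: true) and inlines a shortened copy of the configuration instead of reading config/database.yml.

```ruby
require 'yaml'

# Shortened copy of the database.yml above, inlined for illustration.
database_yml = <<~CONFIG
  defaults: &defaults
    adapter: mysql2
    encoding: utf8
    username: root
    password: secret

  test:
    database: test
    <<: *defaults

  foo:
    database: foo
    <<: *defaults
CONFIG

# Aliases such as *defaults have to be allowed explicitly with safe_load.
config = YAML.safe_load(database_yml, aliases: true)

# Each connection now carries its own database plus the shared settings.
puts config['foo'].inspect
```

Every connection ends up with its own database entry plus the four shared settings from the defaults block.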

That looks much cleaner. Time to set up the database for our tests.

Integration with RSpec

In order to get reproducible test runs, the database used by test cases has to be in a defined state at all times. Rails ensures this property in principle by separating the test database from the development database via the Rails environment. The test database can easily be purged and should be managed by the test framework.

If we want to write tests against models using our new foo database, we should reproduce Rails’ setup and use one database in the development environment and a different one in the test environment.

Let’s start with the model. The easiest way to differentiate between Rails environments is to encode the environment in the name of the database connection. So we change our model to:

class Foo::Bar < ActiveRecord::Base
  establish_connection "foo_#{Rails.env}"
end

Now we have to adapt database.yml accordingly:

...
foo_development:
  database: foo
  <<: *defaults

foo_test:
  database: foo_test
  <<: *defaults

Having decoupled the development database from the test database, there is only one more issue: the test database is completely empty. We have to find a way to manage the schema.

Rails uses rake tasks for database management. The tasks most often used are rake db:migrate for updating the development database and dumping its schema and rake db:test:prepare to clean and update the test database. It would be nice if we could adapt these tasks to manage foo_test as well, so we don’t have to remember any new tasks.

A neat feature of rake is the possibility to define tasks multiple times, in effect adding code to existing tasks. In our case, we can extend db:schema:dump and db:test:load_schema to manage the foo_test database and completely hide this fact from the developer.
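
This append-on-redefinition behaviour is easy to see in isolation. The sketch below uses only the rake gem. The first definition merely stands in for the task Rails ships, and the log messages are made up for illustration.

```ruby
require 'rake'

log = []

# First definition, standing in for the task that Rails already provides.
Rake::Task.define_task('db:schema:dump') { log << 'dump primary schema' }

# Second definition under the same name: rake does not replace the task,
# it appends this block to the task's list of actions.
Rake::Task.define_task('db:schema:dump') { log << 'dump foo schema' }

Rake::Task['db:schema:dump'].invoke

puts log.inspect
```

Invoking the task once runs both blocks in the order they were defined, which is what lets us hook into the existing tasks without the developer noticing.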

Let’s put the following code into lib/tasks/db.rake:

namespace :db do
  namespace :schema do
    # desc 'Dump additional database schema'
    task :dump => [:environment, :load_config] do
      filename = "#{Rails.root}/db/foo_schema.rb"
      File.open(filename, 'w:utf-8') do |file|
        ActiveRecord::Base.establish_connection("foo_#{Rails.env}")
        ActiveRecord::SchemaDumper.dump(ActiveRecord::Base.connection, file)
      end
    end
  end

  namespace :test do
    # desc 'Purge and load foo_test schema'
    task :load_schema do
      # like db:test:purge
      abcs = ActiveRecord::Base.configurations
      ActiveRecord::Base.connection.recreate_database(abcs['foo_test']['database'], mysql_creation_options(abcs['foo_test']))
      # like db:test:load_schema
      ActiveRecord::Base.establish_connection('foo_test')
      ActiveRecord::Schema.verbose = false
      load("#{Rails.root}/db/foo_schema.rb")
    end
  end
end

Now, whenever we run a rake task that results in a schema dump, foo’s schema will be dumped to db/foo_schema.rb as well. Whenever we load a schema into our test database, foo_test will be updated, too.

That’s it, we are done. Time to write some specs…

Reusability in Configuration Management Systems


Ever since puppet and later chef took off, the community has been discussing code reusability (chef cookbooks or puppet modules). As far as I can see, there has hardly been any progress, and it is still a big PITA to use other people’s cookbooks or modules.

As I am mostly working with chef these days, I will present my ideas with respect to chef cookbooks. From my experience with puppet, I expect them to apply similarly.

The status quo is unusable

In recent months I have been incorporating some community chef cookbooks into our company setup and felt the pain first hand. Besides varying levels of quality, my main problem was with cookbooks being too tightly coupled to the author’s infrastructure. I could hardly find a cookbook that was usable as is. Instead I usually removed all the stuff I didn’t need, rewrote all the stuff I didn’t like and ended up with a more or less simple cookbook that did the job for our infrastructure. I believe there has to be a better way to do this.

When I was not working on chef, I was improving my Ruby skills and learning about principles of object-oriented software development. I wonder, if we consider Infrastructure as Code, wouldn’t it make sense to apply some good software development practices to cookbook development as well? So what actually makes a cookbook dependent on the author’s infrastructure?

Lack of Separation of Concerns

The main reason, in my opinion, is missing separation of concerns. Many cookbooks are basically just monolithic blocks of code without much abstraction. Recipes do everything: creating users and directories, installing software and configuring it the way the author likes.

Granted, people split their client setup from their server setup, but this isn’t enough to actually make cookbooks reusable. Cookbooks have to adapt easily to different contexts, different policies.

Ideas from Object-Oriented Software Development

Two principles of object-oriented software development I found particularly applicable to this problem are the Single Responsibility Principle and the Open/Closed Principle.

The Single Responsibility Principle

Applied to recipes, the SRP states:

A recipe should have a single responsibility.

In other words, it should just install an application user, or just create configuration files, or just delegate work to other recipes. The corollary to this principle, as in software development, is a cookbook with many short recipes, some of which do actual work (I consider these low-level recipes) and some of which just delegate tasks in order to compose a variant of the application (I consider these high-level recipes).

The Open/Closed Principle

Applied to cookbooks, the OCP states:

Cookbooks should be open for extension, but closed for modification.

It should not be necessary for users to change distributed recipes, files or templates. If a user wants to change some aspect of a cookbook, that should be possible via configuration data (cookbook attributes), or extension recipes (adding alternative recipes and using application composition).

An example of recipe composition

Let’s have a look at a sample cookbook’s recipes:

recipes
├── _group.rb
├── _server_config.rb
├── _server_runit.rb
├── _server_install_from_source.rb
├── _user.rb
├── _webui_config.rb
├── _webui_cronjobs.rb
├── _webui_source.rb
├── server.rb
└── webui.rb

In this example I prefixed all low-level recipes with an underscore to differentiate them from high-level, service composing recipes.

Our sample application consists of a server component and a webui that are both composed from a couple of low-level recipes.

Let’s have a look at the server composition:

This recipe composes a server by just including all relevant low-level recipes. In this convenience recipe, the cookbook author decided on installing the application from source and using runit to manage the server process.

Let’s assume a user prefers to install the application from a pre-built package. It is easy for him to extend the cookbook with a recipe of his own, like _server_install_from_package.rb that would just install the server package. He could then compose his custom server from a role or a custom_server.rb recipe like this one:

As you can see, the user added their internal company repository and installed their pre-built package. No changes to existing recipes were necessary and the user can still use most parts of the cookbook. If a package was generally available, the _server_install_from_package.rb recipe could even be added upstream.
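
Since chef recipes are plain Ruby, the two compositions can be sketched as sequences of include_recipe calls. This is only an illustration under several assumptions: the cookbook names myapp and mycompany_myapp are hypothetical, the order of the includes is a guess, and the include_recipe method defined at the top is a stand-in that just records names so the sketch runs outside of chef.

```ruby
# Stand-in for chef's include_recipe so the sketch runs in plain Ruby.
# Inside a real cookbook, chef provides this method itself.
INCLUDED_RECIPES = []
def include_recipe(name)
  INCLUDED_RECIPES << name
end

# myapp/recipes/server.rb (hypothetical): upstream convenience recipe,
# installing from source and managing the process with runit.
include_recipe 'myapp::_group'
include_recipe 'myapp::_user'
include_recipe 'myapp::_server_install_from_source'
include_recipe 'myapp::_server_config'
include_recipe 'myapp::_server_runit'

# mycompany_myapp/recipes/custom_server.rb (hypothetical): the user's own
# composition, swapping only the installation step for a local recipe.
include_recipe 'myapp::_group'
include_recipe 'myapp::_user'
include_recipe 'mycompany_myapp::_server_install_from_package'
include_recipe 'myapp::_server_config'
include_recipe 'myapp::_server_runit'

puts INCLUDED_RECIPES
```

Four of the five includes are identical in both compositions. Only the installation recipe differs, and it lives in the user’s own cookbook.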

As I said above, the SRP leads to many small low-level recipes, like _group.rb, which are concerned with just a single responsibility:

If the recipe turns out to contain just a single statement, so be it. If anyone ever has to implement group addition differently, for instance because they use specific company-wide IDs for their groups, they could easily replace this recipe or provide upstream with a better version that would optionally include a group ID setting. Other recipes, like _server_install_from_source.rb, probably contain more steps, like downloading, compiling and installing the application.

Conclusion

Applying the Single Responsibility and the Open/Closed Principles to the space of configuration management could be a road to furthering reusability of chef cookbooks and puppet modules. Having many small low-level recipes makes it easier for users to adapt cookbooks to their needs and extend them as local policy prescribes. Deviation from the upstream version is minimized.

What do you think of this approach? Do you have other ideas about furthering code reusability? Let us know in the comments!

Setting Up Tddium for a Rails Project


A lot has been written about the merits of continuous integration and I am a firm believer in it. However, while our team at work uses test-driven development, we still don’t have a CI server. This week I went on a quest to change that.

In order to minimize maintenance, we decided to give tddium a try. It is a hosted CI service for Ruby (on Rails) projects.

Setup

Initial setup was quite painless and works as described on their webpage:

  • sign up on the webpage
  • install the tddium gem
  • guided setup with the tddium tool, including github integration and email/campfire notification

For a regular Rails app, this should have been it. In our case, however, the tests wouldn’t even start, as our sample app uses multiple databases. While the initial install was a matter of minutes, at this point I began a multi-hour debug session, trying to figure out how to bend tddium to my will.

Debugging

The first problem was the Rails logger. Tddium seems to overwrite the standard logger with their own implementation that is not fully compatible with the standard Rails logger. In our case it doesn’t provide auto_flushing=. This, however, was easy to fix:

Rails.logger.auto_flushing = true if Rails.logger.respond_to?(:auto_flushing=)

The next problem was a much bigger beast: multi-database support. While the documentation roughly explains database setup and the use of hooks to add custom configuration, it took me quite some time to figure out how everything really works and how I could change the setup to suit my needs.

At this point I have to mention the slow feedback cycle tddium has: from running tddium spec to push code changes, to running the suite and seeing an error, it took 2 to 3 minutes. On the plus side, I am happy tddium spec exists at all. This way you can easily run the suite on locally committed changes and don’t have to push to github all the time. Great for testing stuff.

So, after many failures I was finally able to come up with this solution:

There are two things to notice:

  • I was only able to set up a 2nd database connection. Unfortunately the user is not permitted to create arbitrary databases. In our case this was enough to get the code to load and the test suite to run.
  • The socket entry is necessary for us but is not mentioned in the documentation. I extracted it from tddium’s database.yml.

At this point our test suite would run with two failing specs. In one case we actually executed code accessing the 2nd database. This was easy to fix in our specific case by rewriting the spec. The other problem was completely unrelated and not reproducible locally: some text saved to the database was not retrieved identically. We disabled this spec for now.

Conclusion

So far I am happy with tddium. It is a very lightweight CI service and, in the general case, much easier to set up than a custom CI server. If we didn’t have this multi-database setup, tddium would have been up and running within minutes. Pricing seems fair, although it lacks many features compared to Jenkins CI. I am looking forward to its evolution and hope integration of our other apps will be smoother. :-)

Thoughts About Security and Agile Development


When attending a security training for developers last week, I was astonished to find a security professional without programming background as our trainer. He mostly talked in waterfallish terms, just as I had done a decade ago when I was more interested in IT security than the physics I studied at university. So I began wondering about the role of security in agile development.

Security on the product level

As agile developers, we optimize our code for flexibility, so we can change it easily at any point in time. In this world, security at the product level is nothing more than a feature to add whenever it makes sense. It becomes a matter of continuous risk management, which is what feature prioritisation is anyway.

We might start a new product without any security whatsoever, because who needs security when we have no clue how to solve our business logic problems in the first place? However, with each feature our risk profile changes and we have to reassess our situation. We better implement that login functionality before we go into public beta, but maybe it’s still ok to run without any encryption. Once we launch the product, we better protect our servers with TLS and encrypt sensitive data in the database.

Security on the code level

The story changes, however, when it comes to our daily coding practice. I consider security another code quality like good design or freedom from defects. Just as we use practices like TDD or Continuous Integration, we should use secure coding practices like input validation and output encoding. These practices need to be embedded in our daily practice. We have to make ourselves aware of the trust boundaries where we need to apply them.
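
As a small illustration of what these two practices can look like in Ruby, here is a sketch using only the standard library. The username rule and the helper names valid_username? and render_comment are made up for this example. In a Rails app, model validations and the automatic escaping in views would normally do this job.

```ruby
require 'erb'

# Input validation at the trust boundary: accept only what we expect
# (3 to 20 lowercase letters, digits or underscores) and reject the rest.
def valid_username?(input)
  input.is_a?(String) && input.match?(/\A[a-z0-9_]{3,20}\z/)
end

# Output encoding: escape user-supplied text before embedding it in HTML.
def render_comment(text)
  "<p>#{ERB::Util.html_escape(text)}</p>"
end

puts valid_username?('nikolay_s')
puts valid_username?('bob<script>')
puts render_comment('<b>hi</b> & bye')
```

The validation uses an allow-list anchored with \A and \z, so embedded newlines cannot sneak extra content past the check. The escaping happens at the point of output rather than when the data is stored.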

What secure coding practices do you use? Let us know in the comments!

Stand-up Desk Migration for Little Money, an IKEA Hack


When I remodeled my flat a year ago, I chose to downsize my desk and put the PC in an underused corner of the living room. The problem was that the corner isn’t very wide, just 114cm. As most desks have a minimal width of 120cm, I was happy to find IKEA’s Vika line with the Vika Amon tabletop just 1m wide. Together with two Vika Lage legs and a Vika Annefors table leg with storage space to hold my PC it cost me just 43 Eur. As the Annefors is quite wide, I moved it a couple of cm to the side and gained some space to put small stuff like my mobile and wallet.

Every now and then I came across someone mentioning their move to a stand-up desk to counter the ill side-effects of sitting all day. One day I figured I’d give it a try, borrowed my girlfriend’s ironing board, put some books on top and my laptop on top of those.

It didn’t take me much getting used to and I was quickly oscillating between ironing board and couch with my laptop. Happy as I was, at some point my girlfriend needed her ironing board back, so I decided to upgrade to some kind of IKEA hack solution.

After discussing several options with my girlfriend, we settled on buying a second tabletop with some wooden legs to cut to proper length. While searching for the items, we came across the Lack side table for just 5 Eur. We quickly realised that this was the perfect solution and bought two.

These side tables mix perfectly with my Vika table, but together they are 10cm wider than the Amon tabletop. As described above, this wasn’t much of a problem for me, as I had positioned the Annefors a little to the side. This gave me an effective width of about 114cm to work with. So we cut the side table’s legs, left 2 of them a little longer to compensate for the missing tabletop on the side and there was my new stand-up desk. This is how it looks: