Nikolay Sturm's Blog

Musings about Development and Operations

Reducing the Noise in Git Diffs


Since we switched our Rails applications to SQL schemas, I have found it annoying to see line after line of boring AUTO_INCREMENT changes in diffs whenever something touched the database:

-) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
+) ENGINE=InnoDB AUTO_INCREMENT=12833 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

This made it especially hard to identify unexpected changes.

When I complained about this, a friend suggested git attributes. They let you set special options on a per-path basis. One such option is a diff driver, which can run a filter command over a file before it gets diffed.

In my case, I want to filter the AUTO_INCREMENT noise out of db/structure.sql, so the first step is to specify an attribute for this path:

$ cat .git/info/attributes
structure.sql diff=sql_schema

There are different ways to specify attributes. I chose .git/info/attributes because it is local to my clone and never committed, so my change doesn't interfere with my colleagues' setups.
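To check that the attribute is actually picked up for the schema file, git check-attr can be asked about the path. The snippet below is only a sketch: it builds a scratch repository with mktemp so it can be run anywhere. In a real project you would run just the last command from the repository root.

```shell
# Scratch repository, used only for this demonstration.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Same attribute line as above; a pattern without a slash matches the
# file name in any directory, so it also covers db/structure.sql.
mkdir -p .git/info
echo 'structure.sql diff=sql_schema' >> .git/info/attributes

# Ask git which diff driver applies to the path.
git check-attr diff -- db/structure.sql
# prints: db/structure.sql: diff: sql_schema
```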

The next step is to configure the filter to use for this attribute:

$ tail -2 .git/config
[diff "sql_schema"]
        textconv = sed -e '/^) ENGINE=InnoDB/s/AUTO_INCREMENT=[0-9]* //'

This filter selects all lines starting with ) ENGINE=InnoDB and removes the AUTO_INCREMENT option from them. It only affects what git shows when diffing; the file itself stays untouched. Now, whenever the schema changes, I only see the actual change, without any noise from my auto increment counters.
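To see what the filter does, you can feed it the two lines from the diff at the top of this post. This is a small sketch; the function name strip_auto_increment is made up for the example, while the sed expression is the one from the config.

```shell
# Same sed expression as in the textconv setting, wrapped in a function.
# The name strip_auto_increment is hypothetical.
strip_auto_increment() {
  sed -e '/^) ENGINE=InnoDB/s/AUTO_INCREMENT=[0-9]* //'
}

old=$(printf '%s\n' ') ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;' | strip_auto_increment)
new=$(printf '%s\n' ') ENGINE=InnoDB AUTO_INCREMENT=12833 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;' | strip_auto_increment)

# Both lines now read:
# ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
printf '%s\n' "$old"
printf '%s\n' "$new"
```

Since both sides come out identical after filtering, git has nothing left to report for these lines.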

If you work with multiple Rails applications, it might make sense to move the sed call into a script and reference that script from each repository's configuration instead.
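A minimal sketch of such a script follows. The location ~/bin and the name git-textconv-sql-schema are my assumptions, so pick whatever suits your setup. Git runs the textconv command with the path of the file to convert appended as an argument, which is why the script reads "$1".

```shell
# Hypothetical location and name for the helper; adjust to taste.
script="$HOME/bin/git-textconv-sql-schema"
mkdir -p "$HOME/bin"

# Write the helper. The quoted 'EOF' keeps $1 from being expanded here,
# so it ends up literally in the script.
cat > "$script" <<'EOF'
#!/bin/sh
# Print the schema file given as $1 with the AUTO_INCREMENT counters removed.
exec sed -e '/^) ENGINE=InnoDB/s/AUTO_INCREMENT=[0-9]* //' "$1"
EOF
chmod +x "$script"
```

With that in place, the textconv line in each repository can name the script instead of repeating the sed call. Alternatively, the driver can be defined once for all repositories with git config --global diff.sql_schema.textconv followed by the script path, which leaves only the attribute line to set per repository.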

Hope this helps.
