Don't make your database a slave to your ORM

Warning: This blogpost has been posted over two years ago. That is a long time in development-world! The story here may not be relevant, complete or secure. Code might not be complete or obsoleted, and even my current vision might have (completely) changed on the subject. So please do read further, but use it with caution.
Posted on 29 Oct 2011
Tagged with: [ databases ]  [ orm

ORM’s, or object-relational mappers, are a great way to convert (mostly) relational databases to classes in a object oriented language. It takes care of SO many things you do over and over again: fetch records from a table, populate an object, implement getters and setters, update or add records when needed etc. A lot of this work can be abstracted away by using patterns like ActiveRecord, table gateways and/or data mappers. An ORM will even abstract away this further as a whole and let you only deal with the resulting (domain) models. There is no immediate need to interact with any data storage of any kind. Who would not want this!?

I always learned that if something sounds too good to be true, it always is to good to be true. And in the case of ORM, this is quite frankly the case. Now this does not mean that I’m here to advise you to stop using ORM’s. On the contrary: ORM’s are great and may be suitable for about 80 percent of the time. But the catch is inside the last 20 percent. (Did you notice that the 80/20 rule can be applied to 80 percent of all things? :) )

Abstracting away your data is only one of the benefits of an ORM. There is another that people like to mention: it doesn’t matter which database you use (and it’s even possible not to use a database at all, but maybe a REST service or even something else like a memcache store). But would you really need this advantage? Microsoft claims that  portability is for canoes. It is a bold statement and even though I don’t necessarily agree with it, think about it: how many times in your life do you need to change your application from using a MySQL database to a PostgreSQL database to a Sybase to end up with Oracle?? Even without an ORM, portability still should not be too difficult when your architecture is right. And most probably, the least of your problems when you need a switch like that isn’t the code or the amount of work to change it, it’s the reason why you need to change databases to begin with.

The biggest problem I have with people using ORM’s is that they think they don’t need to know SQL anymore. Hurray! But this is were the last 20% will come and haunt you. In the end, queries generated by the ORM might not be the most optimized queries you need. Lot of ORM’s don’t really do joins, sub-queries, stored procedures etc efficiently or even correctly. Every database needs a lot of TLC, and you simply cannot auto-generate SQL and/or tables in the hope your database will use it efficiently. It needs indexing, it needs tuning, it needs optimization of queries and tables. Off course, this also is true for developers not using ORM’s, but most of the time they will understand SQL, or at least up to a certain point and know how to deal with indexing, inner/outer joins, covering indexes, selectivity and whatnot. A DBA cannot be replaced by a bit of code that generates some tables. This stuff is important, highly complex and cannot be ignored. Maybe someday ORM’s are that powerful that we can completely ignore the database layer, just like we don’t code directly on a CPU in assembly anymore (special cases not included), but let the C-compiler do its stuff. We cannot manually beat a compiler anymore, besides the occasional funroll-loops here and there. But until that day arrives when we can say the same thing for ORM’s, we cannot simply ignore the fact that we and we alone are in charge of the database.

Conclusion

I think the problem with ORM’s is not that they are evil, but the users who use them quickly turn evil when it comes to the underlying databases. By abstracting away the SQL, people tend to forget the importance of it and in the end not only will struggle with their ORM on one end, but with the SQL server on the other end. And fighting battles on two fronts is always a recipe for disaster. I don’t think the in the domain which ORM’s currently reside will ever can optimize and/or monitor your SQL server (farm) and you probably never want it to. So let ORM’s deal with the 80% of the queries, but make damn sure you’ve got the resources available to take care of the remaining 20%. And yes, this means we should create some queries manually..