development

Why PostgreSQL?

Quick overview of what PostgreSQL brings to the table that is not available in MySQL:

  • Uses MVCC for all tables providing:
    • Fully transactional including ACID compliance for consistency
    • Nested transactions
  • Largely SQL:2008 compliant
  • Foreign keys for any table
  • Advanced table partitioning
  • Highly sophisticated query planner/optimizer
    • Can split up a query for execution across multiple CPUs simultaneously
    • Collects internal statistics for adaptive query planning
    • Special genetic query optimizer for queries with large numbers of joins
    • Supports multiple indexes per table per query
  • Advanced support for query & results caching
  • Hot/online backup
  • Point-in-time-recovery
  • Write-ahead logs for fault-tolerance
  • Tablespaces for controlling physical disk layout
  • Native asynchronous replication guaranteeing identical results on all machines. Supports both:
    • Streaming replication
    • Hot standby
  • Partial indexes
  • Index creation/removal does not lock table
  • Full support for constraints
  • Transactional DDL - changes like table modifications can be placed inside a transaction and rolled back

Specific disadvantages to MySQL:

  • Confusion with table types - MyISAM vs InnoDB
  • Designed to scale out, not up - does not utilize larger numbers of cores efficiently and cannot spread queries across cores
  • Hot backup is difficult for databases containing both InnoDB and MyISAM tables
  • Replication is mediocre and error prone
  • InnoDB stores the data with the primary key, so any queries using secondary indices are slower
  • Subqueries not well optimized
  • Only uses a single index per table per query
  • Index creation/removal requires an exclusive write lock
  • MyISAM only offers table level locking which causes severe performance degradation under heavy concurrency
  • Limited support for constraints
  • No transactional DDL - changes like table modifications are automatically committed and cannot be rolled back

MySQL offers the following advantages over PostgreSQL:

  • MyISAM tables can offer better read performance, specifically for simple SELECT queries, but at the cost of no support for transactions, foreign keys or data guarantees
  • COUNT(*) is very fast on MyISAM and slow on PostgreSQL
  • INSERT IGNORE and INSERT ... ON DUPLICATE KEY UPDATE

 

Different content in Rails based on UserAgent

I was recently working on a website built using Rails that needed to render different content for certain user agents. Specifically, we needed simpler versions of certain pages for BlackBerry devices. Here's how I accomplished it.

First, I added a new mime-type for BlackBerry by adding the following line to config/initializers/mime_types.rb:

Mime::Type.register_alias "text/html", :blackberry

Next, I added two utility methods to app/controllers/application.rb (named application_controller.rb in Rails 2.3 and later):

# Returns true when the User-Agent belongs to a legacy (non-WebKit) BlackBerry browser
def is_blackberry?
  ua = request.user_agent
  return false if ua.nil?

  ua = ua.downcase
  return false unless ua.include?('blackberry')

  # Don't call the BlackBerry 9800 a BlackBerry, since it has a modern browser
  # based on WebKit:
  # Mozilla/5.0 (BlackBerry; U; BlackBerry 9800; en) AppleWebKit/534.1+ (KHTML, Like Gecko) Version/6.0.0.141 Mobile Safari/534.1+
  return false if ua.include?('webkit')

  # Must be a BlackBerry!
  true
end

# Sets the respond_to format to :blackberry for non-Ajax requests from a BlackBerry
def set_blackberry_format
  request.format = :blackberry if !request.xhr? && is_blackberry?
end
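
To sanity-check the detection rules outside of Rails, the same logic can be pulled into a plain Ruby method that takes the User-Agent string as an argument. This is only a sketch: legacy_blackberry? is a hypothetical name I'm using for illustration, and the first and last sample User-Agent strings are made up rather than captured from real devices.

```ruby
# Hypothetical standalone version of is_blackberry?. It applies the same rules
# but takes the User-Agent string directly instead of reading request.user_agent.
def legacy_blackberry?(user_agent)
  ua = user_agent.to_s.downcase # nil becomes "", which never matches
  return false unless ua.include?('blackberry')

  # WebKit-based BlackBerry browsers should get the regular HTML pages.
  !ua.include?('webkit')
end

samples = [
  # Illustrative legacy BlackBerry string (an assumption, not a captured value)
  'BlackBerry8330/4.3.0 Profile/MIDP-2.0 Configuration/CLDC-1.1',
  # BlackBerry 9800 string quoted in the comment above
  'Mozilla/5.0 (BlackBerry; U; BlackBerry 9800; en) AppleWebKit/534.1+ (KHTML, Like Gecko) Version/6.0.0.141 Mobile Safari/534.1+',
  # Illustrative desktop browser string
  'Mozilla/5.0 (Windows NT 6.1) Gecko/20100101 Firefox/4.0',
  nil
]

samples.each do |ua|
  puts "#{legacy_blackberry?(ua)}\t#{ua.inspect}"
end
```

Only the first sample should be treated as a BlackBerry. The WebKit-based 9800, the desktop browser and the missing header all fall through to the normal HTML format.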

With that in hand, it's easy to render BlackBerry-specific content on specific pages:

set_blackberry_format
respond_to do |format|
  format.blackberry
  format.html
  format.js { render :layout => false }
end

To Rewrite or Not to Rewrite: The Ugly Question

I recently had a discussion about the idea of rewriting software from scratch. I actually played the devil's advocate and argued against ever throwing out and rewriting, which really got me thinking about the whole concept.

The discussion centered around an article by Joel Spolsky (of Joel on Software) titled Things You Should Never Do, Part 1. His points against rewrites include:

  1. The ugly code you throw out has been hardened and tested. It's filled with bug fixes. You're throwing out that knowledge and expertise.
  2. You're throwing out market leadership and "giving a gift of two or three years to your competitors".
  3. You're not going to do a better job writing the code a second time than you did the first time, especially since it's unlikely you have the same team that wrote the earlier version.
  4. You will introduce new bugs.

Joel further argues that there are three major reasons developers want to rewrite code and none of them require rewrites:

  1. Architectural problems. The "you got your gui in my business logic" problem. This can be handled by small but steady code refactorings.
  2. Inefficiency. Again, can be handled by small code refactorings.
  3. The code is fugly. This may be due to complexity and bug fixes, in which case see point #1 above. Or it may be due to poor and changing naming conventions, in which case it can be fixed by a simple Find-Replace.

These are all excellent points. On some level, I agree with this entirely. Even many nasty combinations of all three problems can be solved by steady refactorings. I have worked for places where people pushed for rewrites that weren't necessary. But these were larger businesses with a well established core product. These were not early startups. That's why I believe Joel makes several assumptions which are fatal to his arguments.

First, he assumes the software project is really large and complex. While some of us may have worked on projects of that size and scope, quite a few of us work on much smaller projects. Simply put, it's a matter of scale.

Second, as a corollary of his first assumption, Joel also assumes that a rewrite requires years not months. Again, this is likely true for a product like Excel or Word... but this simply isn't true for many of the sites and products I've worked on. Furthermore, the use of agile or rapid development technologies such as Ruby on Rails can dramatically shrink this window.

Third, and perhaps most importantly, Joel assumes that the time required to cope with a messy code-base and make steady refactorings is significantly less than the time required to rewrite the app. And he assumes that's a worthwhile trade off. This may be clear cut for larger products or companies, but I question whether or not that's accurate for a startup. The more tangled your code, the longer it takes you to make changes. The longer it takes to make changes, the less nimble you are and the longer it takes you to respond to changes in company direction or marketplace demands.

It's that last point that I believe is most important to those of us working for small startups. We tend to be small young companies who are still striving to find our exact place in the wider world. We're often in cutting edge spaces where there is no clear cut path to success. And usually we're steadily seeing greater numbers of competitors in our space. It seems to me that agility is vitally important to people like us. We need to be able to make changes rapidly as our knowledge of the space evolves. Fundamentally, I think it's better to have a decent product/feature/whatever out in the hands of consumers than it is to have a nearly perfect product that's still under development. Don't get me wrong, I'm sure I'm preaching to the choir. :) But, I think it's critical to keep the need for agility and nimbleness in the forefront of our thoughts.

Fourth, Joel assumes any architectural problems can be solved by steady refactoring. Frankly, I disagree. I think there exist serious architectural flaws, especially related to scalability, that cannot be easily solved by refactoring. eBay, LinkedIn, Facebook and Yahoo have all had major rewrites in their history that were directly attributed to serious architectural failings.

That is not to say that a full rewrite is necessarily a desirable goal. :) However, it takes careful management and planning to avoid finding yourself in this position. eBay used to employ a strategy they called headroom, which basically set aside 20+% of all development time to refactor code and keep it in top working order. While I think it may be very difficult to employ such a strategy in a startup, it may be worth considering.
