Monday, April 23, 2007

Interview With SQL Server MVP Louis Davidson: Author Of Pro SQL Server 2005 Database Design and Optimization

Another cool SQL interview is the one with SQL Server MVP Louis Davidson.

Here is one of the questions:

What are/could be some consequences if your data model is not in Third Normal Form or even First Normal Form?

The earth could stop spinning on its axis, at least that is what I try to tell people. The more reasonable answer is that you will have nagging, annoying issues with your data that will require more and more management just to keep straight.

Normalization pushes data consistency issues to the foreground and forces you to deal with them. For example, take first normal form. One of the rules of first normal form is that every row is unique. A constant question on newsgroups/forums is: “I have duplicate rows, how do I get rid of them?” Well, go through this following messy set of steps, and then PROTECT YOURSELF USING KEYS!
Another concern of first normal form is scalar attributes. I like to define this as keeping your data broken down to the lowest form you will manipulate using SQL. So if you have to use substring for any common operation, they you likely have an issue. For example take an address, if you use addresses just as the user types them in, then just have a text column, put the address in it. But if you have a mailing system and you have to cleanse the data, and then break the address down into all of its constituent parts instead of going through the process every time you need an address. SQL is great for taking bits and pieces and constructing what you need, but lousy at breaking things down.

For third normal form, the problems are more ominous. If the data in the table isn’t really describing the thing that the table is modeling, you will end up with redundant data. Redundant data doesn’t sound so bad, until you have need to update the data. Consider the case where you have a table modeling a person, and in this table you have the person’s manager, the manager’s name and address. From this sentence, it almost sounds plausible and correct. It certainly is something you might want to see on a screen. But is the manager just the manager for the one person? If not you might have ten copies of the managers information. Now you have to make sure and deal with all ten copies, which is a real pain, to say the least.


Read the whole interview here: Interview With SQL Server MVP Louis Davidson: Author Of Pro SQL Server 2005 Database Design and Optimization

No comments: