FREE Subscription to Dr. Dobb’s Digest: Same Great Content, New Digital Edition
Site Archive (Complete)
Architecture & Design
Email
Print
Reprint

add to:
Del.icio.us
Digg
Google
Furl
Slashdot
Y! MyWeb
Blink
October 20, 2006

Relational Databases 101

(Page 8 of 11)
Creating Table Indexes

Once you define your database tables and the primary key, you're going to want to add indexes to improve query performance. Without indexes, you'll find that query performance is rather slow. If you take a look at the query plan being generated, you might find that SQL Server is not fetching rows efficiently by scanning the entire table each time. Again, the interactive Database Designer tool can help set up indexes. No, don't add too many, as each index must be updated as you insert new rows. Start with an index on the primary key columns—the tools should do that for you automatically. Once you populate your database with data, you can run the query analyzer to evaluate your indexes. This tool will tell you which indexes are helping and which are not, as well as where additional indexes will further improve performance.

Choosing the Right Data Type

When designing databases in the 1960s and 1970s, I was taught to be especially careful of how much space each data element consumed. Since hard disks were tiny by today's standards (the IBM 360 125 came with 7.25Mb to 100Mb drives),13 I was hard-pressed to minimize the amount of data stored in each "record." I economized by "coding" whenever and wherever I could. For example, a single column (byte) might contain several different types of data, depending on the value to be stored. When SQL Server and other relational databases were introduced, disk space was still expensive, but not nearly as much as in the mainframe days. However, more experienced database architects still choose column widths based on past experience and with the knowledge that more data means poorer performance.


13 See www-03.ibm.com/ibm/history/exhibits/mainframe/mainframe_PP3125.html


Unicode vs. ANSI

In situations where you need to store data in an "international" character set, whose characters are not supported by the ANSI set, you'll have to define your columns (and string literals) as Unicode. If you take this option, SQL Server stores 16 bits for each character instead of 8. It means the same four-character entry requires 4 bytes in ANSI and 16 bytes in Unicode columns. Just remember to prefix your string literals with "N", as in N'Fred', when building Unicode expressions—Visual Studio tools and wizards do this for you if they generate the query. There is another aspect to Unicode that might surprise you. When you define a column as nvarchar, you specify a maximum length, as shown in Figure 3.8.

[Click image to view at full size]
Figure 3.8: Creating a table with a Unicode column.

This DDL code allocates 50 bytes of space in the data row to the Author's name. However, this also means that the name must be no longer than 25 (16-bit) characters.

Char vs. VarChar

I've heard the debates over use of fixed-length datatypes (like char and nchar) over variable-length types (like varchar and nvarchar). In my practices, I rarely use the fixed-width types because they're problematic in a number of respects. These types are fine for columns whose data is always the same number of characters, but if you slip and provide a value that's shorter (or longer) than the defined size, SQL Server either pads the remaining space or truncates the data (often without notice). You'll also find that it's tough to create expressions against fixed-length columns unless you match the length of both operands. For example, if your fixed-length column can contain four characters, you'll have to write an expression that has exactly four characters, or an equality expression will always return False.

IF MyFixedCol = 'Fred'

This returns TRUE if MyFixedCol contains "Fred".

IF MyFixedCol = 'Fred '

But this returns FALSE.

For this reason (and others), I prefer to use variable-length types. They don't consume much extra space (if any) and when the data length varies, this approach can actually save space. When defining variable-length character columns, you specify the maximum amount of space to reserve for the column. This does not preallocate this space—it simply sets an upper limit. With SQL Server 2005, you can now define a varchar(max) or nvarchar(max) column that (like the TEXT datatype) can store up to 231 bytes and Unicode 230 bytes.

Previous Page | 1 Introduction | 2 Understanding Relational Database Normalization | 3 Creating Tables, Rows, and Columns | 4 Identifiers | 5 Using Identity or GUID Primary Keys | 6 Understanding Parents and Children | 7 Naming Objects | 8 Creating Table Indexes | 9 Decimal vs. Floating Point | 10 Working with Imprecise Numbers | 11 Summary Next Page
TOP 5 ARTICLES
No Top Articles.



MICROSITES
FEATURED TOPIC

ADDITIONAL TOPICS

INFO-LINK