The Schema Evolves

By Scott W. Ambler, August 01, 2004

Data professionals, too, can benefit from increased flexibility. Here's how to create a data model in the face of changing requirements. Part 2 of 3.

The Schema Evolves

Read Parts 1 & 3

By Scott W. Ambler

Agile software development techniques are quickly becoming the norm within the programming community—unfortunately, the same thing can’t be said of the data community. Last month, I described a data model’s evolution through the first three iterations of development of a fictitious Karate School Management System (KSMS), and this month, I continue the case study. We started the project by first identifying and then prioritizing a collection of user stories. Next, we assigned the user stories to iterations, and then started working in priority order. The evolution of the database schema went fairly smoothly, mainly because the requirements didn’t change during the first three iterations—a blissful idyll that would all too soon be cut short by grim reality.

At the end of the third iteration, the users chose to install the system and start working with it—enough functionality was in place that the system could start earning value. At that point, they decided to stop development to see how well the system actually worked and to give the developers a chance to go on summer vacation. When we returned, we discovered that our users had

rethought what they needed—coming up with some new requirements and reprioritizing existing ones. We had our work cut out for us.

The Requirements Changed—So What?

Our stakeholders originally wanted to run tournaments to earn extra money. After talking with a few people, they discovered that tournaments involve a lot of work, and they’d be lucky to break even. However, they still needed to run special events such as small internal tournaments and special training sessions to teach advanced techniques. They also realized that they’d forgotten to tell us about the family and child memberships they offer. Furthermore, some students study other styles such as tai chi and cardio kickboxing, and some study several styles.

We needed to rework our requirements to reflect these new priorities. The requirements were captured as user stories on index cards (see “The Requirements”). The developers estimated the effort to implement each requirement and the stakeholders prioritized them, enabling us to assign the user stories to upcoming iterations. The stakeholders were still free to introduce new requirements or rework existing ones, and the developers continued to work at the requirements in priority order.

Fluctuating requirements illuminate the advantages of an incremental approach. By releasing a portion of the functionality early, we enabled our users to better identify what they actually wanted. Had we tried to implement all the requirements at once, we would have delivered functionality that wasn’t actually needed in practice. In our continuing list of agile data modeling lessons, here’s lesson number five: Trying to define all the requirements up-front is a risky proposition.

Iteration Four

[click for larger image]
Iteration Four
To keep track of members in a given family, we added the Family table. We added a corresponding FamilyPOID column in Student to act as a foreign key to the new table.

Our users’ highest-priority requirements were to support family and child memberships. To keep track of members in a given family, we added the Family table. We added a corresponding FamilyPOID column in Student to act as a foreign key to the new table. Most students aren’t on a family membership, so this column often has a null value (see “Iteration Four”).

Adding support for child memberships was a little more difficult. Children have a different set of belts than adults; they also have striped belts in addition to the normal adult colors, as well as two additional colors: red and purple. This multicolor motivation system helps to keep kids engaged: Most adults understand that it could take six to 12 months to earn their next belt, but try explaining that to a four-year-old. To support this functionality, in addition to the Java code changes, we added an IsChild column to the Belt table and new rows for the child belts.

We also added an IsChild column to the Student table. People progress from the children’s to the adult classes when they’ve reached an appropriate level of maturity and skill, not just because of their age, so a birth-date column wasn’t appropriate.

These database changes were made in parallel to required code changes. In addition to taking an evolutionary approach to your database design, you must also take a collaborative approach. My development partner, Beverly, and I worked together, often pair programming, to evolve both the Java code and the database schema. I generally took the lead on the database work, and she took the lead on the Java work, but all in all, it was our teamwork that built the application. One of us didn’t design what needed to be done and hand it off to the other—this serial approach has a serious risk of communication errors. Nor did we go our separate ways and each do our own part of the work—this parallel approach carries a dangerous potential for double work (both of us would have explored the same schema issues, though from different viewpoints) and incompatible work (we could have easily made different schema design decisions). Lesson number six: Agile data modeling is both evolutionary and collaborative.

Iteration Five

[click for larger image]
Iteration Five
The fifth iteration focused on supporting non-karate training. These styles have their own belts, so a Style table was added to implement the current requirements. Furthermore, the Belt table now has a StylePOID column to track which style a given belt relates to—for example, a white belt record is used for both tai chi and karate.

The fifth iteration focused on supporting non-karate training. These styles have their own belts; in tai chi, for example, the belts are white or black, and cardio kickboxing uses no belts at all. A Style table was added to implement the current requirements. This may make it easy to support new styles in the future, although we won’t know for sure until we have actual requirements to do so. The StylePOID column was added to the Belt table to indicate which style a given belt relates to—a white belt record is used for both tai chi and karate (see “Iteration Five”).

To record the fact that someone can train in several styles, we introduced the StudentBelt associative table, which implements the many-to-many association between Student and Belt. The Java code, however, lacks a corresponding StudentBelt class because Java natively supports many-to-many associations via collections. Data professionals introduce associative tables to their designs quite naturally; similarly, Java programmers add collections to their business classes quite naturally, but each group may not be familiar with the other’s techniques. Luckily, Beverly and I were able to map the two schemas effectively once we discovered the differences. Lessons number seven and eight: You can always learn new skills from someone else, and it isn’t enough to specialize in one aspect of technology.

To initialize the StudentBelt table, we needed to migrate the data from the original StudentPOID and BeltPOID columns of the fourth iteration’s data schema. Each time we rework the existing schema, we may need to migrate existing data—test data that we use in our development environments as well as the actual production data. Regular data migration is the downside of evolutionary database development. It can be difficult, and it’s easy to say that this increased complexity is a good reason to develop the database schema up-front early in the project. But this view doesn’t hold water from an agile point of view. Even with a traditional approach, data must still be migrated occasionally, and new or updated requirements slip in regardless of how well your change management process tries to prevent it. So accept this fact and get good at data migration. Lesson number nine: Agilists choose to embrace some tasks that traditionalists avoid.

The final change to the schema was the addition of the StyleSequence column to the belt table. We needed to support the fact that each system has its own belt-earning order: Adult karate students move from white to yellow to orange and so on, whereas tai chi students move from white to black. Next month, I’ll describe how we tackled this challenge, and others, to complete the project.

In part one of this series, I explored how a database schema developed in an evolutionary manner on an agile project. In this column, I described the schema’s evolution to support new requirements, and how we discovered that we didn’t need or want to design the entire database schema early in the lifecycle. In the last installment, I’ll discuss the challenges involved with adopting an agile approach to data modeling in your organization—projects often don’t go this smoothly when legacy databases and legacy data professionals are involved—and I’ll describe the new skills that data professionals must learn to become agile.

The Requirements The following user stories describe the updated usage requirements for the fictitious Karate School Management System (KSMS).

Iteration	User Stories
1	Maintain student contact information Enroll student Drop student Record payment
2	Promote student to higher belt Invite student to grading E-mail membership to student Print membership for student
3	Schedule gradings Print certificate Put membership on hold
4	Enroll child student Offer family membership plan Support child belt system
5	Enroll student in tai chi Support tai chi belt system Enroll student in cardio kickboxing Support kickboxing belt system Support the belt order for each style
6	Maintain product information Sell product
7	Print catalog of products Order product for inventory Order product for student
8	Organize internal special events (special classes, internal tournaments and so on) Enroll student in special event Print special event certificate for student
—SA

Scott Ambler is a senior consultant with Ronin International Inc. His latest book is Agile Database Techniques from Wiley Publishing.

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

The Schema Evolves

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content

The Schema Evolves

Related Reading

News

Commentary

Slideshow

Video

Most Popular

More Insights

White Papers

Reports

Webcasts

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content