By Scott W. Ambler
Agile software development techniques are quickly becoming the norm within the programming community—unfortunately, the same thing can’t be said of the data community. Last month, I described a data model’s evolution through the first three iterations of development of a fictitious Karate School Management System (KSMS), and this month, I continue the case study. We started the project by first identifying and then prioritizing a collection of user stories. Next, we assigned the user stories to iterations, and then started working in priority order. The evolution of the database schema went fairly smoothly, mainly because the requirements didn’t change during the first three iterations—a blissful idyll that would all too soon be cut short by grim reality.
At the end of the third iteration, the users chose to install the system and start working with it—enough functionality was in place that the system could start earning value. At that point, they decided to stop development to see how well the system actually worked and to give the developers a chance to go on summer vacation. When we returned, we discovered that our users had
rethought what they needed—coming up with some new requirements and reprioritizing existing ones. We had our work cut out for us.
The Requirements Changed—So What?
Our stakeholders originally wanted to run tournaments to earn extra money. After talking with a few people, they discovered that tournaments involve a lot of work, and they’d be lucky to break even. However, they still needed to run special events such as small internal tournaments and special training sessions to teach advanced techniques. They also realized that they’d forgotten to tell us about the family and child memberships they offer. Furthermore, some students study other styles such as tai chi and cardio kickboxing, and some study several styles.
We needed to rework our requirements to reflect these new priorities. The requirements were captured as user stories on index cards (see “The Requirements”). The developers estimated the effort to implement each requirement and the stakeholders prioritized them, enabling us to assign the user stories to upcoming iterations. The stakeholders were still free to introduce new requirements or rework existing ones, and the developers continued to work at the requirements in priority order.
Fluctuating requirements illuminate the advantages of an incremental approach. By releasing a portion of the functionality early, we enabled our users to better identify what they actually wanted. Had we tried to implement all the requirements at once, we would have delivered functionality that wasn’t actually needed in practice. In our continuing list of agile data modeling lessons, here’s lesson number five: Trying to define all the requirements up-front is a risky proposition.
Iteration Four
[click for larger image] Iteration Four To keep track of members in a given family, we added the Family table. We added
a corresponding FamilyPOID column in Student to act as a foreign key to the
new table.
|
Family
table. We added a corresponding FamilyPOID
column in Student
to act as a foreign
key to the new table. Most students aren’t on a family membership, so
this column often has a null value (see “Iteration Four”).
Adding support for child memberships was a little more difficult. Children
have a different set of belts than adults; they also have striped belts in
addition to the normal adult colors, as well as two additional colors: red
and purple. This multicolor motivation system helps to keep kids engaged: Most
adults understand that it could take six to 12 months to earn their next belt,
but try explaining that to a four-year-old. To support this functionality,
in addition to the Java code changes, we added an IsChild
column to the Belt
table and new rows for the child belts.
We also added an IsChild
column to the Student
table. People progress from
the children’s to the adult classes when they’ve reached an appropriate
level of maturity and skill, not just because of their age, so a birth-date
column wasn’t appropriate.
These database changes were made in parallel to required code changes. In addition to taking an evolutionary approach to your database design, you must also take a collaborative approach. My development partner, Beverly, and I worked together, often pair programming, to evolve both the Java code and the database schema. I generally took the lead on the database work, and she took the lead on the Java work, but all in all, it was our teamwork that built the application. One of us didn’t design what needed to be done and hand it off to the other—this serial approach has a serious risk of communication errors. Nor did we go our separate ways and each do our own part of the work—this parallel approach carries a dangerous potential for double work (both of us would have explored the same schema issues, though from different viewpoints) and incompatible work (we could have easily made different schema design decisions). Lesson number six: Agile data modeling is both evolutionary and collaborative.
Iteration Five
[click for larger image] Iteration Five The fifth iteration focused on supporting non-karate training. These styles have their own belts, so a Style table was added to implement the current requirements.
Furthermore, the Belt table now has a StylePOID column to track which style
a given belt relates to—for example, a white belt record is used
for both tai chi and karate.
|
Style
table was added to implement
the current requirements. This may make it easy to support new styles in
the future, although we won’t know for sure until we have actual requirements
to do so. The StylePOID
column was added to the Belt
table to indicate which
style a given belt relates to—a white belt record is used for both
tai chi and karate (see “Iteration Five”).
To record the fact that someone can train in several styles, we introduced
the StudentBelt
associative table, which implements the many-to-many association
between Student
and Belt
. The Java code, however, lacks a corresponding StudentBelt
class because Java natively supports many-to-many associations via collections.
Data professionals introduce associative tables to their designs quite naturally;
similarly, Java programmers add collections to their business classes quite
naturally, but each group may not be familiar with the other’s techniques.
Luckily, Beverly and I were able to map the two schemas effectively once we
discovered the differences. Lessons number seven and eight: You can always
learn new skills from someone else, and it isn’t enough to specialize
in one aspect of technology.
To initialize the StudentBelt
table, we needed to migrate the data from the
original StudentPOID
and BeltPOID
columns of the fourth iteration’s data
schema. Each time we rework the existing schema, we may need to migrate existing
data—test data that we use in our development environments as well
as the actual production data. Regular data migration is the downside of evolutionary
database development. It can be difficult, and it’s easy to say that
this increased complexity is a good reason to develop the database schema up-front
early in the project. But this view doesn’t hold water from an agile
point of view. Even with a traditional approach, data must still be migrated
occasionally, and new or updated requirements slip in regardless of how well
your change management process tries to prevent it. So accept this fact and
get good at data migration. Lesson number nine: Agilists choose to embrace
some tasks that traditionalists avoid.
The final change to the schema was the addition of the StyleSequence column to the belt table. We needed to support the fact that each system has its own belt-earning order: Adult karate students move from white to yellow to orange and so on, whereas tai chi students move from white to black. Next month, I’ll describe how we tackled this challenge, and others, to complete the project.
In part one of this series, I explored how a database schema developed in an evolutionary manner on an agile project. In this column, I described the schema’s evolution to support new requirements, and how we discovered that we didn’t need or want to design the entire database schema early in the lifecycle. In the last installment, I’ll discuss the challenges involved with adopting an agile approach to data modeling in your organization—projects often don’t go this smoothly when legacy databases and legacy data professionals are involved—and I’ll describe the new skills that data professionals must learn to become agile.
Scott Ambler is a senior consultant with Ronin International Inc. His latest book is Agile Database Techniques from Wiley Publishing.