High-Definition Software Measurement


May 1999: Project and Process Management: High-Definition Software Measurement

My friend Seth called me recently with excitement in his voice. Among his many attributes, he’s an audiophile, an aficionado of high-end music reproduction. He had just acquired some silver core speaker cables, which run between his amplifier and speakers.

“You have to buy these cables,” he said. “The definition and resolution in the playback are incredible. The music’s dynamics are more vivid and alive.” He was thrilled that his system had even more capability and accuracy to extract information from a compact disc, making the musical experience sound more realistic.

You may be wondering how this conversation sparked a retrospective on software development. How does high-definition audio or television reproduction map to software measurement?

In high-definition audio, the finely articulated resolution might illuminate the emotional and stylistic nuances that distinguish Vladimir Horowitz’s interpretation of Mozart’s Piano Concerto No. 23 from Mitsuko Uchida’s. These details might be indistinguishable on a cheap player or an AM radio. In the same vein, high-definition television images look virtually three-dimensional, much better than Grandma’s old black-and-white Quasar.

You could describe many software management frameworks as the old stereo console or black-and-white TV, smearing vital management information into a blurred continuum. These frameworks can’t manage the multimillion-dollar investments that could determine whether or not a company will be competitive in the twenty-first century.

What Works

High-definition software measurement can reveal the detail you need to understand software productivity and what drives it. Its metrics can demonstrate ways to improve software processes in a more surgical and less shotgun-like manner. They can quantify the benefits, or the speed bumps, of technologies like Java, component-based development, object-oriented development, client/server, and frame engineering.

In my 13 years specializing in software development management, I’ve often been asked how software measurement, or metrics, can raise productivity. What should we measure to speed development and cut cycle time? How can we estimate better so our projects don’t miss their deadlines? People want to understand the differences between successful and unsuccessful projects and development methods.

Yet many software development managers have access only to what I call “low-definition” software management information. Low-definition information can’t reveal the level of detail you need to answer important questions. It’s often dated, unreliable, or inconsistent. The metrics are usually two-dimensional, such as productivity in units of output over effort, while the challenges you face are multidimensional: deadlines, cost targets, reliability requirements, and staffing. The result is projects that are completed late, over budget, and with poor reliability.

High-definition software measurement can reveal and help you understand the differences between various development processes. You need to know how to use that information to make realistic estimates and subsequent promises for new projects or releases to management, customers, and the marketplace. If you can’t do this, you’re flying blind.

Metrics that Matter

You can begin with what the Carnegie Mellon Software Engineering Institute describes as “the minimum data set,” otherwise known as “the four core metrics.” These are size, time, effort, and defects. For completed projects, the minimum data set links these metrics into a cohesive relationship. Projects take a certain duration (months), expending a certain amount of work effort (person-months), as a function of the number of people working on the project. Their hard work results in a system that represents a certain amount of functionality (size), at a certain level of quality (defects). Anyone embarking on a measurement program should start with these four core metrics. They will tell you what happened on past projects, and help you gauge the “functional throughput” recently demonstrated by your organization and its management processes.

Why these four? Often, projects are managed by just two metrics: project milestones and effort (proportional to cost). The other two, size and defects, are often neglected. Yet size and defect metrics are critical, as they represent what has been built (or will be built, for new projects) and the quality of the end result.

It would also help to add one or two more metrics for past projects. One is the amount of rework; another is the degree of software reuse. For additional background on these metrics and their use, see “The SEI Core Measures: Background Information and Recommendations for Use and Implementation” (The Journal of the Quality Assurance Institute, July 1994) by Anita Carleton, Robert Park, and Wolfhart Goethert.
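
To make the minimum data set concrete, here is a minimal sketch of how one completed-project record might be captured in a small historical database. The field names and sample values are my own illustrative assumptions, not SEI definitions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CompletedProject:
    """One row of a project history database (illustrative field names)."""
    name: str
    size: int                    # functionality built: new/changed SLOC or function points
    time_months: float           # elapsed schedule for the main build
    effort_person_months: float  # full-time-equivalent person-months expended
    defects: int                 # defects found during testing
    rework_person_months: Optional[float] = None  # optional fifth metric: rework
    reuse_percent: Optional[float] = None         # optional sixth metric: degree of reuse

# Hypothetical past project: 800 function points in 10 months for 80 person-months
history = [CompletedProject("billing-release-3", size=800, time_months=10.0,
                            effort_person_months=80.0, defects=120)]
```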

What Doesn’t Work

To help you determine how you can apply this information to new projects and understand the differences between effective and ineffective software development processes, let’s examine the dynamics of bad project estimation. Bad promises top the list: software development organizations are saddled with them, and the number of failed projects reflected in industry statistics shows how catastrophic the consequences are. In worst-case scenarios, degraded relationships between vendors and customers have caused a rise in software-related litigation.

Tom DeMarco, noted researcher and author of Peopleware: Productive Projects and Teams (Dorset House, 1999), says, “As impressive as growth of the software industry has been, it is outpaced by growth of software-related litigation. It is not unusual for a large software development organization today to have upwards of 50 active cases on its hands,” (Cutter IT Journal, April 1998).

Tim Lister, co-author of Peopleware, adds, “Most litigation ends up focused on measurement, management, requirements practice, or some combination thereof.” According to Lister, “The things you do to win a litigation—that is, to be the less damaged loser—are: do careful measurement work, focus on good management practice, and conduct exhaustive and thoughtful analysis of requirements. These are also, coincidentally, three of the principal things you should do to avoid litigation.”

Two common scenarios seem to contribute to this state of affairs:

Scenario A: First, management mandates the deadline. The project team has 10 months to deliver. The project is loosely determined (no one “sizes” the application, because requirements are in flux). A “bottom-up” estimate of the required work effort is constructed; let’s say it comes to 200 person-months. The 200 person-month estimate is mandated into a 10-month schedule. A team of 20 people is assigned to consume the 200 person-months over 10 months.

If the 10-month schedule gets compressed to 6 months, then management stuffs 200 person-months into 6 months, and assigns 33 people to the project.

Some of this scenario’s obvious and critical flaws:

• A mandated deadline set without regard to project requirements and team productivity.

• Subjective (and often optimistic) estimates of the effort required to complete each task.

• No software size estimation (for example, no estimates of the number of programs, modules, object classes, requirements, source lines of code, or function points).

• Violation of Brooks’s Law, which states that manpower and time are not interchangeable (for example, you cannot cut the schedule in half by doubling the staff).

Scenario B: Analysts determine that an application might total 80,000 source lines of code, or 800 function points. Management assumes the team can build at a rate of 1,000 lines of code, or 10 function points, per person-month of effort. Simple math yields that the project should expend 80 person-months. Eight people are available, so the schedule is 80/8 = 10 months. If management says to get it done in 6 months, then it is decided that 80/6 ≈ 13 people should be assigned to the project.
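
To see why these scenarios mislead, here is a minimal sketch of the naive ratio arithmetic they rely on, using the numbers from the scenarios above. Nothing in it accounts for the nonlinear relationship between schedule, staffing, and effort.

```python
# Scenario A: bottom-up effort guess, then a mandated deadline fixes the staffing.
effort_pm = 200                       # person-months from task-by-task guesses
staff_10_months = effort_pm / 10      # 20 people
staff_6_months = effort_pm / 6        # ~33 people when the deadline is compressed

# Scenario B: size and a flat productivity ratio fix effort, then staff fixes schedule.
size_sloc = 80_000
rate_sloc_per_pm = 1_000
effort_pm_b = size_sloc / rate_sloc_per_pm    # 80 person-months
schedule_with_8_people = effort_pm_b / 8      # 10 months
staff_for_6_months = effort_pm_b / 6          # ~13 people

# Both calculations treat people and months as freely interchangeable,
# which is exactly what Brooks's Law says they are not.
```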

A fatal flaw in this scenario is the use of simple and unreliable ratios of productivity such as source lines of code or function points per staff month (size/effort), which omit schedule metrics. Again, this violates Brooks’s Law.

Alternatives

While most people agree that Brooks’s Law is a given, it’s surprising how often development teams unknowingly violate it, often because of dictated schedules.

But here’s a rule of thumb: Given an uncompressed, nominal schedule, you might cut the time by 20% if you double the staff. Think 20% cut with 200% effort, based on regression analysis of thousands of completed projects, such as those cited in Lawrence H. Putnam and Ware Myers’s book Industrial Strength Software (IEEE Computer Society, 1997). Expect defects found during testing to rise about six-fold, according to these same statistics. Put simply, software schedules for complex, large projects are difficult to compress.
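
As a back-of-the-envelope illustration of that rule of thumb, with purely made-up nominal numbers:

```python
# Hypothetical nominal (uncompressed) plan
nominal_months = 10.0
nominal_effort_pm = 100.0
nominal_test_defects = 50

# Rule of thumb from the regression studies cited above: doubling the staff buys
# roughly a 20% schedule cut, at roughly double the effort and about six times
# the defects found during testing.
compressed_months = nominal_months * 0.8             # 8 months
compressed_effort_pm = nominal_effort_pm * 2.0       # 200 person-months
compressed_test_defects = nominal_test_defects * 6   # ~300 defects

print(compressed_months, compressed_effort_pm, compressed_test_defects)
```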

The best way out of this box is to negotiate functionality. You can do this by estimating backward. What do I mean by this? Put simply, you use your own metrics to find your own development proficiency. Let’s assume a 10-month deadline is a given. In this case, working the problem backward would mean determining how much functionality can be built within deadline X, with Y people, given proficiency Z. Proficiency Z has to be expressed in multiple dimensions: not only output per unit effort, but also output per unit time.

You can merge these latter two ratios into an efficiency parameter encompassing output per unit time, per unit effort, and map this parameter to a useful index. One such index is the process productivity index, put forth in the pioneering metrics research of Larry Putnam. It carries less risk of dangerous linear misinterpretation. Best of all, its calculation is in the public domain.

The concept of the index is straightforward. Higher values represent higher development proficiency, lower project complexity, or a combination of the two. Lower values represent the inverse. When researchers from my company calculate this across a heterogeneous industry database of several thousand projects, patterns emerge showing higher values for well-known information technology class applications such as billing and financial systems. The patterns also show lower values for complex, embedded engineering applications like factory automation, switching, or avionics systems.
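
As a sketch of how such an index can be calibrated and then used to estimate backward, here is one commonly published form of Putnam’s software equation, size = PP × (effort/B)^(1/3) × time^(4/3), with effort in person-years, time in years, and B a skills factor. The B = 1.0 simplification, the sample numbers, and the function names are my assumptions; commercial tools map the raw parameter onto a calibrated integer index using tables not reproduced here.

```python
def process_productivity(size, effort_person_years, time_years, b=1.0):
    """Calibrate a Putnam-style productivity parameter from a completed project."""
    return size / ((effort_person_years / b) ** (1 / 3) * time_years ** (4 / 3))

def size_for_deadline(pp, effort_person_years, time_years, b=1.0):
    """Estimate backward: how much functionality fits a given deadline and effort budget."""
    return pp * (effort_person_years / b) ** (1 / 3) * time_years ** (4 / 3)

# Calibrate from a hypothetical past project: 80,000 SLOC in 10 months using 80 person-months.
pp = process_productivity(80_000, effort_person_years=80 / 12, time_years=10 / 12)

# Negotiate functionality for a 6-month deadline with roughly 40 person-months available.
feasible_size = size_for_deadline(pp, effort_person_years=40 / 12, time_years=6 / 12)
print(f"calibrated parameter = {pp:.0f}; feasible size in 6 months = {feasible_size:.0f} SLOC")
```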

Productivity Charts

A straightforward yet valuable way to determine how fast your organization develops applications is to generate a chart showing the speed of past projects (time) across small, medium, and large projects (size), as shown in Figure 1. You can build this chart from the information you gathered on your past projects.

A schedule productivity chart has a horizontal, or “x,” axis labeled “size.” This measure is not team size, but rather the amount of new and changed code or functionality. (You could also measure the size of programs, modules, objects, function points, and so on.)

The vertical, or “y,” axis reflects the schedule in months of elapsed time. Chart the schedule by project for the software main build, or construction, phase. This comprises detailed design, coding, testing, integration, and deployment.

An upwardly sloping pattern means longer schedules for larger projects. If you draw a line through the lower limit, that line will represent the fastest schedules your projects exhibit. If you draw a line through the upper limit, it will show the longest schedules your projects exhibit. Projects with fewer requirements, larger teams, lower staff turnover, and better processes, tools, and methods tend to be on the lower bound. The converse is true of projects on the slower, or upper, bound. The lines that represent these bounds will likely curve, and together they show your projects’ range of performance.

For new projects, you can mark deadlines on the “y” axis and see the maximum amount of functionality that can be built in that time frame by reading the “x” axis. You can do the same with charts of effort as a function of project size to generate effort productivity charts, as shown in Figure 2. This will help you ensure that you don’t over-promise by taking on too much work (signing up for too much functionality) within a given deadline and effort (or cost) budget.
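
As a minimal sketch of how such a chart could be drawn from your own history, the following uses a handful of made-up data points; a real chart would use the projects in your historical database. Fitting the trend in log-log space reflects the curved bounds described above.

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative history only: (size in function points, main build schedule in months)
history = [(100, 4), (250, 6), (300, 9), (600, 10), (800, 14), (1200, 16), (1500, 22)]
size = np.array([s for s, _ in history], dtype=float)
months = np.array([m for _, m in history], dtype=float)

# Fit a power-law trend (a straight line in log-log space), then scale it to the
# fastest and slowest schedules the projects actually exhibited.
slope, intercept = np.polyfit(np.log(size), np.log(months), 1)
xs = np.linspace(size.min(), size.max(), 100)
trend = np.exp(intercept) * xs ** slope
residual_ratios = months / (np.exp(intercept) * size ** slope)
lower_bound = trend * residual_ratios.min()   # fastest schedules
upper_bound = trend * residual_ratios.max()   # slowest schedules

plt.scatter(size, months, label="completed projects")
plt.plot(xs, lower_bound, "--", label="fastest schedule bound")
plt.plot(xs, upper_bound, "--", label="slowest schedule bound")
plt.xlabel("size (function points)")
plt.ylabel("main build schedule (months)")
plt.legend()
plt.show()
```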

Commercial Software Estimating Tools

A common reaction to collecting software metrics, much less high-definition metrics, is, “We don’t have any data.” But you do. The information may be buried, but you can still gather it, one project and one day at a time. Start small and you’ll have a rather impressive historical database before long.

A wealth of information will begin to unfold, even with just a handful of recently completed projects. Say goodbye to Grandma’s old Quasar. You’ll see patterns that help your team estimate the people, schedule, and amount of functionality required for the next project. The charts I described will reveal valuable trends with credibility based on facts. You will begin to know your organization better.

Commercial estimating tools can greatly help you gather data by automating the hard bits. The best of them will use your own historical project data, letting you enter these factual profiles and calibrate efficiency values to your own development process, tool environments, and project complexity. Some may even contain modern industry benchmarks to help you sanity-check any scenario that you explore. The degree of sophistication and value these tools provide will be a function of the resolution of the information that they accept and produce.

Be aware that, just as with high-definition audio and video equipment, not all software estimation tools are alike. Tools will give you an answer, but how good will that answer be? Sometimes “freeware” is exactly that. Grandma’s Quasar and the new digital high-definition televisions both show an image, but they’re not the same.

You can find the Carnegie Mellon SEI cost and schedule estimating checklist in Robert Park’s report, “Checklists and Criteria for Evaluating the Cost and Schedule Estimating Capabilities of Software Organizations” (Carnegie Mellon University, 1995). The report tells you how to build six critical disciplines for reliable estimation. Estimation tools should provide capability in these important areas, with an adequate degree of resolution and sophistication. They are:

• A corporate memory (or historical database)

• Structured processes for estimating size and reuse

• Mechanisms for extrapolating from demonstrated accomplishments on past projects

• Audit trails (values for cost model parameters used to produce each estimate are recorded and explained)

• Integrity in dealing with dictated costs and schedules (imposed answers are only acceptable when you follow legitimate design-to-cost, or plan-to-cost, processes)

• Data collection and feedback processes that foster capturing and correctly interpreting data from work performed.

Striving for the Best

High-definition software measurement is within the reach of many software organizations. It starts with things like the SEI’s four core metrics, but can extend beyond them to add additional dimensionality. Why these initial four? Because they relate to what the software industry is striving to achieve—better, more reliable systems with more functionality, more speed, and at less cost.

To ensure that you make promises within reach, you need to make reliable estimates for how much your team can build, usually within predetermined deadlines. These estimates require scoping software size and getting a handle on what to sign up for, based on your organization’s historical capability. Estimating backward is crucial to negotiating realistic amounts of functionality that you can promise within a mandated deadline.

High-definition software measurement has two additional domains: long-term productivity and process improvement, and midstream project “runaway” prevention. Runaways happen all too often. They occur when, after a project is launched, demands for more functionality arise but the deadline is held fixed. High-definition software measurement can prevent the disastrous effects of these kinds of pressures.

Last, don’t reinvent the wheel. A lot has already been researched. Assembling useful information and applying it practically is within everyone’s reach. The world is complex, dynamic, and multidimensional, yet many people rely on two-dimensional information for multimillion-dollar decisions. Your software management “information displays” can and should rise to the challenge, and get out of the realm of Grandma’s old Quasar.

The Four Core Metrics

Here are the definitions of each metric described by the Carnegie Mellon Software Engineering Institute, along with moderate- and higher-resolution characteristics of each:

Size: What has been built, or what must be built, as countable entities.

• Moderate Resolution: Number of programs, modules, processes, function points, and lines of code. All represent the building blocks of a system.

• Higher Resolution: Functionality, including new code (or programs, function points, modules), changed code, and reused code without changes. In these cases, size should be counted in non-comment, non-blank source statements (as opposed to physical lines, which include comments and blanks). Also useful are size profiles of simple, moderate, and complex programs or entities (for example, source lines of code per program), and source lines of code per function point by development language.

Time: Elapsed time in months for each of the major development phases.

• Moderate Resolution: Elapsed time for the project from the start through deployment.

• Higher Resolution: Elapsed time for each major development phase (for example, the feasibility study phase, functional design phase, main build phase (detailed design, code, test), and maintenance phase). Relative proportions of these values as a function of the main build time (for example, five months functional design, ten months main build, two months overlap; therefore, functional design time equals 50% of the time spent in main build, with a 40% overlap). Extra credit: milestones throughout the design, code, build, and test phases as a percentage of the overall main build schedule (for future estimates).

Effort: Full-time equivalent person-months of effort expended during the project.

• Moderate Resolution: Person-months of effort expended throughout all phases.

• Higher Resolution: Person-months of effort broken down by development phase or by development labor category, relative proportions of these values as a function of the main build effort, amount of overtime spent or overtime by development phase or calendar time.

Defects: An error in analysis, design, or coding that affects required performance.

• Moderate Resolution: Number of defects found during testing.

• Higher Resolution: Number of defects by severity category (major, moderate, and minor) found throughout system integration testing. Rate of discovery over time. Also, defects reported within first 30 days, 60 days, or 90 days of operational service. Extra credit: defects found throughout early phases of unit coding and testing, and defects found during design and code walkthroughs.

Michael Mah

