Sweet Predictability


Software quality is a question of scale. With traditional methods, software developers produce their code as best they can and then they review this code, hold team inspections, and run tests. Their objective is to find and fix as many of the program's defects as they can. Although this approach has traditionally produced programs that run and even run pretty well, it has not produced code that is of high enough quality or sufficiently safe and secure to use for critical applications.

The CMM, or Capability Maturity Model, was introduced by the Software Engineering Institute to help organizations assess the capability of their software groups. At CMM level 1, typical delivered quality is about 7.5 defects per thousand lines of code (KLOC). By the time organizations reach CMM level 5, they have applied all of the traditional software quality methods (requirements inspections, design inspections, compiling, code inspections, and extensive testing) to find and fix the program's defects, and typical delivered quality improves to about one defect per KLOC. Although the improvement from level 1 to level 5 is substantial, today's large-scale software systems have millions of lines of code.

Therefore, a delivered 1,000,000-LOC program produced by a CMM level-5 organization would likely contain about 1,000 undiscovered defects. Can this still be considered a quality product? When a million-line program has 1,000 defects, it is almost certainly not of high enough quality to be safe or secure. So what would it take to get down to 100, or even 10, defects per million LOC (MLOC)? I counted the listing pages for one of my C++ programs and found that 996 LOC filled 30 pages. At one defect per KLOC, and assuming a similar number of LOC per page, level-5 organizations are achieving quality levels of about one delivered defect in every 30 pages of source-code listings. Although that is not bad when compared with other human-produced products, it is still 1,000 defects per MLOC. To get to 100 defects per MLOC, we could leave only one defect in every 300 listing pages, and for 10 defects per MLOC, we could miss at most one defect in 3,000 pages of listings.

I cannot imagine inspecting 3,000 pages of listings and missing at most one single defect. As this seems like a preposterous challenge, what can we do? At one defect in 30 pages, today's level-5 organizations are near the limit of what people can do with traditional software development methods. To achieve better quality, we must focus on the process.
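To make the page arithmetic concrete, here is a minimal sketch in Python. Its only input is the 996-LOC, 30-page listing mentioned above; the rest is unit conversion, and the function name is mine.

  # Translate delivered-defect density into listing pages per remaining defect,
  # using the article's figure of a 996-LOC program that filled 30 listing pages.
  loc_per_page = 996 / 30                    # about 33 LOC per page

  def pages_per_defect(defects_per_mloc):
      loc_per_defect = 1_000_000 / defects_per_mloc
      return loc_per_defect / loc_per_page

  for rate in (1000, 100, 10):               # delivered defects per million LOC
      print(f"{rate:5d} defects/MLOC -> one defect in about {pages_per_defect(rate):,.0f} pages")
  # 1,000 defects/MLOC is one defect in ~30 pages; 100 in ~300; 10 in ~3,000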

Recharged and Recycled

Chasing the elusive goal of high quality on a rehab job

In software development, we rarely have the opportunity to develop brand-new products.

Most of our work involves enhancing and fixing existing products—products with questionable quality histories. I am often asked, "How can I use the PSP on this kind of work?"

The first point to remember is that the PSP is based on six principles:

  1. To have predictable schedules, you must plan and track your work.
  2. To make accurate and trackable plans, you must make detailed plans.
  3. To make accurate detailed plans, you must base the plans on historical data.
  4. To get the data needed to make accurate plans, you must use a defined and measured personal process.
  5. To do high-quality work, you must measure and manage the quality of your development process.
  6. Because poor-quality work is not predictable, quality is a prerequisite to predictability.

Although the PSP course focuses on how to apply these principles when developing new or modified module-size programs, the principles are equally applicable to modifying and enhancing large existing systems. In fact, these same principles also apply to developing requirements, designing hardware devices, and writing books.

The PSP principles can be used and are very effective for maintaining and enhancing existing products, but the quality benefits are more variable. The governing factors are the size of the existing product, the nature of the enhancement work, the historical data available, the quality of the existing product, the goals for the work, and the skills and resources available. —WH



The Personal Software Process (PSP) has measures that help you do this. However, because many of its activities involve both personal and team work, you need the help of your teammates to produce the highest-quality software. Then, to further improve product quality, you must improve the quality of your process, which requires measuring and tracking your personal work. Although my focus here is principally on personal practices, the methods I describe are even more effective when used by all the members of a development team.

What Is Software Quality?

First, the product must work. If it has so many defects that it does not perform with reasonable consistency, users will not use it, regardless of its other attributes. If a minimum quality level is not achieved, nothing else matters. Beyond this quality threshold, the relative importance of performance, safety, security, usability, compatibility, functionality, and so forth depends on the user, the application, and the environment. However, if the software product does not provide the right functions when the user needs them, nothing else matters.

Though defects are only part of the quality story, they are the quality focus of the PSP. Because defects result from errors by individuals, to effectively manage defects, you must manage personal behavior. You are the source of the defects in your products and you are the only person who can prevent them. You are also the best person to find and fix them.

Personal Quality Practices

Most software professionals agree that it is a good idea to remove defects early, and they are even willing to try doing it. Beyond that, there is little agreement on how important this is. This is why, for example, developers will spend hours designing and coding a several-hundred-LOC program module and then spend only 10 or 15 minutes looking it over for any obvious problems. Although such superficial reviews may find something, the odds are against it. To see how important it is to develop quality practices, and to begin to appreciate the enormous cost and schedule benefits of doing high-quality work, consider data on the numbers of defects in typical products and the costs of finding and fixing them.

Data for several thousand programs show that even experienced developers inject 100 or more defects per KLOC. While there is wide variation among individuals, about half of the defects are typically found during compiling and the rest must be found in testing. Based on these data, a 50,000-LOC product would start with about 5,000 defects, and it would enter testing with about 50 or more defects per KLOC, or about 2,500 defects. Again, about half of these defects would be found in unit testing at a rate of about two to three defects per hour, and the rest must be found in system-level testing at an average cost of 10 to 20 hours each. Total testing times would then be about 13,000 to 25,000 hours. With roughly 2,000 working hours in a year, this would require 12 to 18 or more months of testing by a team of five developers. A heavy reliance on testing is obviously inefficient, time-consuming and unpredictable. What is worse, even after extensive testing, development groups often deliver poor-quality products to their users.
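As a rough illustration, the following Python fragment reproduces this estimate using the rates given above; every figure is the text's illustrative assumption rather than measured data.

  loc = 50_000
  injected = loc // 1000 * 100              # ~100 defects per KLOC -> 5,000 defects
  into_test = injected // 2                 # about half survive compiling -> 2,500

  unit_found = into_test // 2               # ~1,250 found in unit test
  unit_hours = (unit_found / 3, unit_found / 2)           # at 2 to 3 defects per hour

  system_found = into_test - unit_found     # ~1,250 left for system-level testing
  system_hours = (system_found * 10, system_found * 20)   # at 10 to 20 hours each

  low = unit_hours[0] + system_hours[0]
  high = unit_hours[1] + system_hours[1]
  print(f"Total test effort: {low:,.0f} to {high:,.0f} hours")
  # roughly 13,000 to 25,600 hours, i.e. years of work for a five-person team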

Many of the developers I have worked with agree that other people inject a lot of defects—they just don't believe that they do. After they have used the PSP and seen their own data, they realize that they are part of the quality problem. Then they are willing to change their personal development practices.

Comparing Defect-Inject and -Removal Rates

How to determine the reasonableness of review ratios

PSP data for 810 developers, covering the 3,240 programs they wrote with PSP2 and PSP2.1, show the defect-injection rates for detailed design and coding. These developers injected 9,302 defects in 4,624 hours of detailed design and 19,296 defects in 4,160 hours of coding, or 2.01 defects injected per hour of detailed design and 4.64 defects injected per hour of coding. The same data give the defect-removal rates for these developers in design and code reviews: on average, 3.32 defects removed per hour in design review and 6.04 defects per hour in code review. The ratios of these rates indicate that these developers must spend at least 76.8 percent of their coding time in code reviews to find the defects they inject while coding. Similarly, they must spend at least 60.5 percent of their design time in design reviews to find their design defects.

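The following sketch reproduces the rate arithmetic from these data; the variable names are mine.

  design_inject = 9302 / 4624     # ~2.01 defects injected per hour of detailed design
  code_inject = 19296 / 4160      # ~4.64 defects injected per hour of coding

  design_review_rate = 3.32       # defects removed per hour of design review
  code_review_rate = 6.04         # defects removed per hour of code review

  # To remove as many defects as are injected, review time must be at least
  # (injection rate / removal rate) times the corresponding development time.
  print(f"design review time / design time >= {design_inject / design_review_rate:.1%}")  # ~60 percent
  print(f"code review time / coding time  >= {code_inject / code_review_rate:.1%}")       # ~77 percent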

Finding Defects Early

With the PSP, personal design and code reviews find defects before testing. Then, with the Team Software Process (TSP), you use inspections. When they do proper reviews and inspections, TSP teams typically find close to 99 percent of the defects before even starting system-level testing. Test times are then cut by a factor of 10 or more. Instead of spending months in testing, TSP teams need only a few weeks ("Team Software Process in Practice," N. Davis and J. Mullaney, SEI Technical Report CMU/SEI-2003-TR-014, September 2003).

If reviews and inspections are so effective, why don't more organizations do them? There are two reasons. First, few software groups have the data needed to make sound quality plans, and without such plans, there is no apparent reason to spend much time on reviews and inspections. The quality problems become clear only during testing, and by then, all the developers can do is test and fix until they get the product to work. Second, without PSP data, developers do not know how many defects they inject or what it costs to find and fix them. Therefore, neither they nor their organizations can appreciate the enormous benefits of finding and fixing essentially all of the defects before testing starts.

Quality Measures

How do you measure quality and how can you use the resulting data? The defect content of a finished product indicates the effectiveness of the process for removing defects. This, however, is an after-the-fact measure. To manage the quality of our work, we must measure the work, not just its products. The available work measures are defect-removal yield, cost of quality (COQ), review rates, phase-time ratios and the process quality index (PQI).

The yield of a phase is the percentage of product defects that are removed in that phase. For example, if a product contained 19 defects at the start of unit testing, one was injected in test, and seven were found during testing, the unit test yield would be 100*7/20 = 35 percent.
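A minimal sketch of that calculation in Python (the function name is mine):

  def phase_yield(entering_defects, injected_in_phase, removed_in_phase):
      # Percentage of the defects present in a phase that the phase removes.
      return 100 * removed_in_phase / (entering_defects + injected_in_phase)

  print(phase_yield(19, 1, 7))   # unit-test yield: 35.0 percent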

The cost-of-quality measure—rather, the cost of poor quality—is a way to "quantify the size of the quality problem in language that will have impact on upper management," according to J. M. Juran and F.M. Gryna (Juran's Quality Control Handbook, 4th edition, McGraw-Hill, 1988). There are three principal components:

  1. Failure costs: The costs of diagnosing a failure, making necessary repairs, and getting back into operation
  2. Appraisal costs: The costs of evaluating the product to determine its quality level
  3. Prevention costs: The costs of identifying defect causes and devising actions to prevent them in the future

For the PSP, we use somewhat simpler definitions:

  • Failure costs for the PSP: The total cost of compiling and testing. Because defect-free compile and test times are typically small compared to total compile and test times, they are included in failure costs.
  • Appraisal costs for the PSP: The total time spent in design and code reviews and inspections. Because defect-repair costs are typically a small part of review and inspection costs, the PSP includes them in appraisal costs.

The PSP COQ Measures

Incorporate defect prevention into organization-wide standards.

Here's how to calculate the PSP cost-of-quality (COQ) measures:

  • Failure COQ=100*(compile time + test time)/(total development time)
  • Appraisal COQ=100*(design review time + code review time)/(total development time)
  • Total COQ=Appraisal COQ + Failure COQ
  • Appraisal as a percent of Total Quality Costs=100*(Appraisal COQ)/(Total COQ)
  • Appraisal to Failure Ratio (A/FR)=(Appraisal COQ)/(Failure COQ)

The A/FR measure is useful for tracking personal process improvement. The number of test defects is typically much lower for higher values of A/FR. Although a high A/FR implies lower defects, too high a value could mean that you are spending excessive time in reviews. The PSP guideline is an A/FR of about 2.0.

—WH
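Here is a small Python sketch of the COQ calculations from the sidebar above; the function and the sample times are mine, chosen only to illustrate the formulas.

  def coq(design_review, code_review, compile_time, test_time, total_dev_time):
      # All arguments are times in the same unit, e.g. hours.
      appraisal = 100 * (design_review + code_review) / total_dev_time
      failure = 100 * (compile_time + test_time) / total_dev_time
      return {
          "appraisal COQ": appraisal,
          "failure COQ": failure,
          "total COQ": appraisal + failure,
          "appraisal % of total": 100 * appraisal / (appraisal + failure),
          "A/FR": appraisal / failure,      # PSP guideline: aim for about 2.0
      }

  # A hypothetical 40-hour development job:
  print(coq(design_review=3, code_review=4, compile_time=1, test_time=2.5, total_dev_time=40))
  # A/FR = (3 + 4) / (1 + 2.5) = 2.0, right at the PSP guideline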



Review Rate Measures

Although the yield and COQ measures are useful, they measure what you did, not what you are doing. To do quality work, we need measures to guide what we do while we are doing it. In design or code reviews, the principal factors controlling yield are the time and the care a developer takes in doing the review.

The review rate and phase ratio measures provide a way to track and control review times. The review rate measure is principally used for code reviews and inspections, and measures the LOC, database elements or pages reviewed per hour. If, for example, you spent 20 minutes reviewing a 100-LOC program, your review rate would be 300 LOC per hour. There is no firm rate above which review yields are bad. However, the PSP data show that high yields are associated with low review rates, and that low yields are typical with high review rates. The PSP rule of thumb is that 200 LOC per hour is about the upper limit for effective review rates and that about 100 LOC per hour is recommended as the target review rate. The best guideline is to track your own review performance and find the highest rate at which you can review and still consistently get review yields of 70 percent or better.
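A trivial sketch of the rate calculation (the function name is mine):

  def review_rate(loc_reviewed, minutes_spent):
      return loc_reviewed / (minutes_spent / 60)      # LOC per hour

  print(review_rate(100, 20))   # 300 LOC/hour, well above the ~200 LOC/hour upper limit
  print(review_rate(100, 60))   # 100 LOC/hour, the recommended target rate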

How Much Time Is Enough?

Another set of useful quality measures is the ratio of the time spent in one process phase to the time spent in another. To measure process quality, the PSP uses the ratios of design time to coding time, design review time to design time, and code review time to coding time. One would expect increases in design time to improve product quality while correspondingly reducing coding time. Indeed, a 1-to-1 ratio of design to coding time appears to be a reasonable lower limit, with a ratio of 1.5 being optimum. This optimum point varies widely, but a useful rule of thumb is that design time should at least equal coding time. If it doesn't, you are probably doing a significant amount of design work while coding. Because developers typically inject more than twice as many defects per hour in coding as in design, designing while coding is not a sound quality practice.

Another useful ratio is review time divided by development time. In the PSP, the general guideline is that you should spend at least half as much time reviewing a product as you spent producing it. The same ratio holds for design and design review time, requirements and requirements review time, and so forth.

The PSP Process Quality Index (PQI)

PQI is the product of the following five quality elements:

  1. Design Quality: the minimum of either 1.0 or design time/coding time.
  2. Design Review Quality: the minimum of either 1.0 or 2 times design review time/design time.
  3. Code Review Quality: the minimum of either 1.0 or 2 times code review time/coding time.
  4. Code Quality: the minimum of either 1.0 or 20/(10+compile defects/KLOC).
  5. Program Quality: the minimum of either 1.0 or 10/(5+unit test defects/KLOC).

The development goal is to achieve a PQI value of 1.0 for every module and component of a system. When any part has a PQI that is significantly lower, review that component personally and have it reinspected by your team. Then either repair, redevelop or replace it. —WH
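The following Python sketch shows the PQI arithmetic; the function and the sample values are mine, and elements 1 through 3 are simply capped forms of the phase-time ratios discussed in the main text.

  def pqi(design_time, design_review_time, coding_time, code_review_time,
          compile_defects_per_kloc, unit_test_defects_per_kloc):
      elements = [
          min(1.0, design_time / coding_time),                  # design quality
          min(1.0, 2 * design_review_time / design_time),       # design review quality
          min(1.0, 2 * code_review_time / coding_time),         # code review quality
          min(1.0, 20 / (10 + compile_defects_per_kloc)),       # code quality
          min(1.0, 10 / (5 + unit_test_defects_per_kloc)),      # program quality
      ]
      product = 1.0
      for element in elements:
          product *= element
      return product

  # A module with skimpy reviews and a fair number of late defects:
  print(pqi(design_time=3, design_review_time=1.2, coding_time=3, code_review_time=1.2,
            compile_defects_per_kloc=12, unit_test_defects_per_kloc=6))
  # -> about 0.53, well below 1.0 and a candidate for reinspection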



Personal Quality Management

To effectively manage software quality, you must focus on both the defect-removal and the defect-injection processes. For example, some days you might inject a lot of defects because you are sick, have personal problems, or did not get enough sleep. Regardless of your training, skill or motivation, there is a distinct probability that any action you take will produce an error.

For software, the change process is the most problematic. When you recognize and address the change-quality problem, you can bring the percentage of erroneous fixes down to 5 percent or less. Some of the most error-prone programming actions involve interpreting requirements and making design decisions. Errors are also likely during logic design and coding. In short, every development action we take has a probability of injecting a defect.

The PSP strategy is to use your plans and historical data to guide your work. That is, start by striving to achieve a PQI value of 1.0 (see the sidebar "The PSP Process Quality Index (PQI)"). This requires that your work meet the PSP quality guidelines. Focus first on producing a thorough and complete design, and then document that design. Then, as you review the design, spend enough time to find the likely defects. If the design work took four hours, plan to spend at least two, and preferably three, hours doing the review. Plan the review steps based on the guidelines in the PSP Design Review Script. Then, in the code review, follow the PSP Code Review Script. Make sure that you take the time that your data say you must take to do a quality job.

The compiler will find other types of defects, as will every phase of testing. Surprisingly, however, when developers and their teams use reasonable care throughout the process, their finished products have essentially no defects. Of course, there is no way to know for certain that these products are defect-free, but many TSP teams have delivered products for which users have not reported defects. In short, quality work pays in reduced costs, shorter schedules, and higher-quality products.

Quality Starts with Me

What is right for you today will not likely be best forever. Not only will you develop new skills, but the technology and working environment will change and you will face different problems. You must thus continually track and assess the quality of your work. When improvement progress slows, set new goals and start again. Be alert for new techniques and methods, and use your data to determine what works best for you.


Watts Humphrey is a Fellow of the Software Engineering Institute at Carnegie Mellon University, where he founded the process program, led the PSP development work, and helped launch SEI's PSP certification program for software developers. Prior to joining SEI in 1986, he worked at IBM, where he directed commercial software development. He holds five U.S. patents, has written many technical articles, and has published 10 books. In 2005, in a White House ceremony, President Bush awarded Mr. Humphrey the National Medal of Technology, the nation's highest technical honor. This article is abridged from Chapter 8 of PSP: A Self-Improvement Process for Software Engineers, ISBN 0321305493, Copyright (c) 2005 by Addison-Wesley, a division of Pearson Education Inc. (www.awprofessional.com). Reprinted with permission.

