
Estimating Software Costs

October 01, 2000



Though everyone has a favorite theory as to why software failures occur, my experience and work with the MITRE Corporation and Booz-Allen & Hamilton have taught me that more projects are doomed by poor cost and schedule estimates than by technical, political or team problems. Capers Jones’ extensive research, found in his book Estimating Software Costs (McGraw-Hill, 1998), makes a similar claim. Yet few companies and individuals understand that software estimating can be a science, not just an art. It is possible to accurately and consistently predict development life cycle costs and schedules for a wide range of projects. In a series of four articles, I will provide a step-by-step tutorial in estimating the cost and schedule for your projects. You will be able to implement the concepts in these articles using nothing more complicated than a spreadsheet.

I’ll start by covering the various methods of estimating the size, or volume, of a program. The two traditional measures for this are lines of code and function points, although there are others. At the end of this article, I’ll show you how to prepare a preliminary, unadjusted estimate using this information.

In next month’s installment, entitled "Project Cost Adjustments," I’ll explain how to adjust project costs for variations in the project environment. At the end of that article, you will be able to create an accurate estimate of the time and cost required to develop a new application.

Part three, "Dealing with Reuse," explains how to quantify the impact of software reuse and commercial components or libraries on your estimate.

Finally, part four, "Creating the Project Plan," describes how to use your newfound insight into project cost and schedule to create a complete project plan.

The Estimating Life Cycle

Before discussing specific size measures, I must point out the limitations of software cost estimating at the macro level. As shown in Figure 1, the typical accuracy of cost estimates varies with the software development stage. Early uncertainty stems largely from variances in the input parameters to the estimate. Later uncertainty stems from variances in the estimating models themselves.

Figure 1. Cost Estimate Accuracy by Development Stage

Early uncertainty in cost estimates is due to variances in the input parameters to the estimate. Later uncertainty can be traced back to variances in the estimating models. While at the concept stage, when requirements may be hazy, the general purpose of the new software should be clear. At this point, estimates using informal techniques such as historical comparisons or group consensus should have an accuracy of plus or minus 50 percent. By the time the detailed design is complete, an implementation-oriented estimate will be accurate within plus or minus 10 percent.

Initially, at the concept stage, you may be presented with a vague definition of the project. Though the requirements may not yet be fully understood, the general purpose of the new software can be recognized. At this point, estimates with an accuracy of plus or minus 50 percent are typical for an experienced estimator using informal techniques such as historical comparisons or group consensus.

The key to accuracy lies in making periodic reestimates throughout the project life cycle, thereby identifying problems early enough to take corrective action.

Estimating Program Volume

The first step in preparing an estimate is to characterize the project volume. One measure is the number of source lines of code, or SLOC. A SLOC is a human-written line of code that is not a blank line or a comment. Do not count the same line more than once, even if the code appears several times in an application. When estimating, we typically work with a related number, thousands of SLOC, or KSLOC. SLOC as an estimating metric was popularized by Barry Boehm’s Constructive Cost Model, or COCOMO, found in his book Software Engineering Economics (Prentice Hall, 1981). The basic COCOMO model and the new COCOMO II model remain the most common estimating approaches.

I’ll discuss approaches to estimating KSLOC in more detail, but first, how do you convert from the number of KSLOC to an estimate for the project?

Let’s begin with the simplest estimate. If you know the number of KSLOC your developers must write, and you know the effort required per KSLOC, then you can multiply these two numbers together to arrive at the person months of effort required for your project. This concept is at the heart of all of the estimating models. Table 1 shows some common values that researchers have found for this linear productivity factor. Note that although language affects productivity in terms of functionality per hour, the effort required per line of code is language-independent. The values in the table are derived from work by Barry Boehm (COCOMO), Raymond Kyle and the U.S. Air Force Cost Analysis Agency’s revised COCOMO (REVIC), and firms or organizations working directly with the Cost Xpert Group.

Table 1. Common Values for the Linear Productivity Factor

| Project Type | Linear Productivity Factor (person months per KSLOC) |
| --- | --- |
| COCOMO II Default | 2.94 |
| Embedded Development | 2.58 |
| E-commerce Development | 3.60 |
| Web Development | 3.30 |
| Military Development | 2.77 |

If you know how many thousands of lines of code (KSLOC) your developers must write and you know the effort required per KSLOC, you can multiply these two numbers together to arrive at the person months of effort required for your project. This concept is at the heart of all of the estimating models.

OK, let’s apply this approach. Suppose we were going to build an e-commerce system consisting of 15,000 lines of code. How many person months of effort would this take using just this equation?

The answer is computed as follows:

$$\text{Effort} = \text{Productivity} \times \text{KSLOC} = 3.60 \times 15 = 54 \text{ person months}$$
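The same multiplication is easy to script if you prefer code to a spreadsheet. The following is a minimal sketch that assumes the Table 1 values; the dictionary keys and the function name are illustrative and do not come from any estimating tool.

```python
# Linear productivity factors from Table 1 (person months per KSLOC).
LINEAR_PRODUCTIVITY = {
    "cocomo_ii_default": 2.94,
    "embedded": 2.58,
    "e_commerce": 3.60,
    "web": 3.30,
    "military": 2.77,
}


def linear_effort(sloc, project_type):
    """Return person months as the productivity factor times KSLOC."""
    ksloc = sloc / 1000.0
    return LINEAR_PRODUCTIVITY[project_type] * ksloc


if __name__ == "__main__":
    # The 15,000 SLOC e-commerce system from the example above.
    print(round(linear_effort(15_000, "e_commerce"), 1))
```

Keeping the factors in one lookup table makes it simple to swap in values calibrated from your own completed projects.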

If all of your projects are small, then you can use this basic equation. Researchers have found, however, that productivity does vary with project size. In fact, large projects are significantly less productive than small projects—probably because they require increased coordination and communication time, plus more rework due to misunderstandings.

This productivity decrease with increasing project size is factored in by raising the number of KSLOC to a power greater than 1.0. This exponent penalizes large projects for their decreased efficiency. Table 2 shows some typical size penalty factors for various project types.

Table 2. Typical Size Penalty Factors for Various Project Types

| Project Type | Exponential Size Penalty Factor |
| --- | --- |
| COCOMO II Default | 1.052 |
| Embedded Development | 1.110 |
| E-commerce Development | 1.030 |
| Web Development | 1.030 |
| Military Development | 1.072 |

Productivity does vary with project size. In fact, large projects are significantly less productive than small projects—probably because they require increased coordination and communication time, plus more rework due to misunderstandings. The exponential factors above penalize large projects for decreased efficiency.

So, after we do a size penalty adjustment, how many person months of effort would our 15,000 lines of code e-commerce system require? The answer is computed as follows:

$$\text{Effort} = \text{Productivity} \times \text{KSLOC}^{\text{Penalty}} = 3.60 \times 15^{1.030} = 3.60 \times 16.27 = 58.6 \text{ person months}$$
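To see how the size penalty behaves as projects grow, the calculation can be sketched in a few lines of Python. The code below assumes the Table 1 and Table 2 values; the names are illustrative, and the 150 and 1,500 KSLOC sizes are hypothetical projects added only for comparison.

```python
# (linear productivity factor, exponential size penalty factor)
# per project type, taken from Tables 1 and 2.
FACTORS = {
    "cocomo_ii_default": (2.94, 1.052),
    "embedded": (2.58, 1.110),
    "e_commerce": (3.60, 1.030),
    "web": (3.30, 1.030),
    "military": (2.77, 1.072),
}


def penalized_effort(ksloc, project_type):
    """Return person months as productivity * KSLOC ** penalty."""
    productivity, penalty = FACTORS[project_type]
    return productivity * ksloc ** penalty


if __name__ == "__main__":
    # The 15 KSLOC example first, then two larger hypothetical sizes
    # to show effort per KSLOC rising with project size.
    for ksloc in (15, 150, 1500):
        effort = penalized_effort(ksloc, "e_commerce")
        print(f"{ksloc:>5} KSLOC: {effort:8.1f} person months, "
              f"{effort / ksloc:.2f} per KSLOC")
```

Because the exponent is greater than 1.0, the effort per KSLOC climbs steadily with size, which is the penalty the text describes.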

All of this is pretty straightforward. The next logical question is, "How do I know my project will end up as 15,000 SLOC?"

There are two main approaches to answering this question: direct estimation and function points with backfiring. Using either approach, the fundamental input variables are determined through expert opinion, often with your developers as the experts. The Delphi technique, described in Karl Wiegers’ article, "Stop Promising Miracles" (Feb. 2000), is a good way to cross-check the input variables.

Normally, the first step in estimating the number of lines of code is to break the project down into modules or some other logical grouping. For example, a very high level breakdown might be front-end processes, middle-tier processes and database code. Your developers then use their experience building similar systems to estimate the number of lines of code required.

We strongly recommend that you obtain three estimates for each input variable: a best case estimate, a worst case estimate and an expected case estimate. With these three inputs, you can then calculate the mean and standard deviation as

The standard deviation measures how far the final number can be expected to stray from the mean. For example, an estimate set at the mean plus three times the standard deviation gives roughly a 99 percent probability that your project will come in under your estimate.
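One common way to turn three-point inputs into a mean and standard deviation is the PERT approximation: the mean is (best + 4 × expected + worst) / 6 and the standard deviation is (worst − best) / 6. The sketch below assumes that method, which may differ from the formulas the author intended; the function name and the sample module sizes are hypothetical.

```python
def three_point(best, expected, worst):
    """PERT approximation (an assumed method): return (mean, std dev)."""
    mean = (best + 4 * expected + worst) / 6
    std_dev = (worst - best) / 6
    return mean, std_dev


if __name__ == "__main__":
    # Hypothetical SLOC estimates for a single module.
    mean, std_dev = three_point(best=4_000, expected=5_000, worst=9_000)
    print(f"mean: {mean:.0f} SLOC, standard deviation: {std_dev:.0f} SLOC")
    # Mean plus three standard deviations, the conservative figure
    # described in the text.
    print(f"mean + 3 sd: {mean + 3 * std_dev:.0f} SLOC")
```

The expected case carries four times the weight of either extreme, so a single pessimistic outlier shifts the mean only modestly while still widening the standard deviation.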

For more information, refer to Barry Boehm’s Software Engineering and Project Management (IEEE Press, 1987).

Estimating Function Points

An alternative to direct SLOC estimating is to start with function points, then use a process called backfiring to convert them to SLOC. Function points were first utilized by IBM Corp. as a measure of program volume. The idea is simple: The program’s delivered functionality (and hence, cost) is measured by the number of ways it must interact with the users.

To determine the number of function points, start by estimating the number of external inputs, external interface files, external outputs, external queries and logical internal tables.

External inputs are largely your data entry screens. If a screen contains a tabbed notebook or similar metaphor, each tab counts as a separate external input. External interface files are file-based inputs or outputs. Each record format within the file, or, in the case of XML, each data object type, counts as a separate interface file even if several reside in the same physical file. External outputs are your reports. External queries are message-based or external function-based communications into or out of your application. Finally, logical internal tables are the tables in the database, assuming the database is in third normal form or better.

To convert from these raw values into an actual count of function points, you multiply the raw numbers by a conversion factor from Table 3.

Table 3. Factors for Converting Raw Values to Function Points

| Raw Type | Function Point Conversion Factor |
| --- | --- |
| External inputs | 4 |
| External interface files | 7 |
| External outputs | 5 |
| External queries | 4 |
| Logical internal tables | 10 |

To determine the number of function points, start by estimating the number of external inputs, external interface files, external outputs, external queries and logical internal tables. To convert from these raw values into an actual count of function points, you multiply the raw numbers by the conversion factors above.

So, if we had a system consisting of 25 data entry screens, 5 interface files, 15 reports, 10 external queries and 20 logical internal tables, how many function points would we have?

The answer is computed as follows:

$$(25 \times 4) + (5 \times 7) + (15 \times 5) + (10 \times 4) + (20 \times 10) = 450 \text{ function points}$$
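The count above is a weighted sum, so it can be sketched as a small lookup and a loop. This is an illustration that assumes the Table 3 factors; the dictionary keys and function name are made up for the example.

```python
# Conversion factors from Table 3.
FP_FACTORS = {
    "external_inputs": 4,
    "external_interface_files": 7,
    "external_outputs": 5,
    "external_queries": 4,
    "logical_internal_tables": 10,
}


def function_points(counts):
    """Sum each raw count multiplied by its Table 3 conversion factor."""
    return sum(FP_FACTORS[kind] * n for kind, n in counts.items())


if __name__ == "__main__":
    example = {
        "external_inputs": 25,
        "external_interface_files": 5,
        "external_outputs": 15,
        "external_queries": 10,
        "logical_internal_tables": 20,
    }
    print(function_points(example))  # 450, matching the worked example
```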

The only remaining step is to use backfiring to convert from function points to an equivalent number of SLOC. This can be done using a table of language equivalencies. Capers Jones was a pioneer in this area, and his work still makes up approximately 70 percent of the published language efficiency values. Many of the values are published in his book Estimating Software Costs. See Table 4 for some common values.

Table 4. Lines of Code Per Function Point by Programming Language

| Language | SLOC per Function Point |
| --- | --- |
| C++ default | 53 |
| Cobol default | 107 |
| Delphi 5 | 18 |
| HTML 4 | 14 |
| Visual Basic 6 | 24 |
| SQL default | 13 |
| Java 2 default | 46 |

A table of language equivalencies lists a standard number of source lines of code (SLOC) per function point in a given programming language.

So, to implement the above project (450 function points) using Java 2 would require approximately the following number of SLOC:

$$450 \times 46 = 20{,}700 \text{ SLOC}$$

And would require the following effort to implement, assuming that this was an e-commerce system:

$$\text{Effort} = \text{Productivity} \times \text{KSLOC}^{\text{Penalty}} = 3.60 \times 20.7^{1.030} = 3.60 \times 22.67 = 81.61 \text{ person months}$$
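The whole chain, from function points through backfiring to effort, fits in one short script. The following sketch assumes the Table 4 language values and the e-commerce factors from Tables 1 and 2; the function and variable names are illustrative only.

```python
# SLOC per function point from Table 4.
SLOC_PER_FP = {
    "C++": 53,
    "Cobol": 107,
    "Delphi 5": 18,
    "HTML 4": 14,
    "Visual Basic 6": 24,
    "SQL": 13,
    "Java 2": 46,
}


def backfire(function_points, language):
    """Convert function points to an equivalent SLOC count."""
    return function_points * SLOC_PER_FP[language]


def effort_from_function_points(function_points, language,
                                productivity, penalty):
    """Return person months as productivity * KSLOC ** penalty."""
    ksloc = backfire(function_points, language) / 1000.0
    return productivity * ksloc ** penalty


if __name__ == "__main__":
    print(backfire(450, "Java 2"))  # 20700
    # E-commerce factors from Tables 1 and 2.
    effort = effort_from_function_points(450, "Java 2", 3.60, 1.030)
    print(round(effort, 1))
```

Changing only the language argument shows how strongly the backfired SLOC count, and therefore the effort, depends on the language chosen for the same 450 function points.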

As I discussed in my article "Estimating Internet Development" (E-development and Security, Aug. 2000), there are other approaches to calculating equivalent SLOC from a higher level input value. These include Internet points, Domino points and class-method points to name just a few. All of them work in a fashion analogous to function points.

In the next installment, I’ll cover the concept of project cost adjustments for variations in the project environment.

