Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.



Debugging Web Applications


As the old joke goes, the best way to end up with a bug-free program or script is to write it with no bugs to begin with. But that smug line ignores the realities of compressed schedules and budgets, constantly shifting requirements, the often-negative effects of maintenance by programmers unfamiliar with the original code, and ever-changing hosting environments. Months or years after its original release, following system upgrades and multiple security patches, your once-perfect code might be reduced to a bug-riddled albatross.

So how do you produce code, especially code that drives complex Web applications, without going crazy or losing your shirt? One answer, though certainly not the only answer, is to rigorously test and debug the entire code base.

Retellings of the story of the first bug (supposedly, a moth pulled from between the relays of the Mark II, one of the world's first computers) are legion. In the fifty-plus years since the Mark II, many debugging schemes have been devised to improve the overall quality of software. But for some reason, the overall quality of Web applications doesn't seem to have improved much since the introduction of the browser. Why is this?

Bad Practice

Inevitable factors will limit your ability to write code that lives up to the ubiquitous guidelines of programming best practices. So what approaches can you realistically take toward practical debugging?

The worst method, and the one that's least likely to help, is to simply debug your code if and when you find problems: Don't do any other preparation, don't test your code, and don't document it. You might be able to pull it off, as long as you have a thorough understanding of the intended and actual functionality provided, not to mention every quirk of the environments in which the code is intended to run. That includes all of the possible interactions and unintended consequences that may arise from the introduction of new variables, such as newly released browsers, system upgrades, and so forth. And, if you're a sadist, you might even name your functions and variables in ways that have nothing whatsoever to do with their purpose.

Not surprisingly, but somewhat sadly, this is the method that most of us use. Whether it's due to tight deadlines, low budgets, poorly documented requirements, low expectations, or simple vanity, we just don't give debugging the attention that it truly deserves.

How can we improve upon these bad programming practices? First, become familiar with the tools at your disposal that enable or enhance your debugging capabilities. You can then structure your code and your systems to take advantage of those tools. If your code is easy to test, you're more likely to test it, and others on your team will be, too. The more your code is tested under a wider variety of circumstances, the more likely you are to find the showstopper that otherwise would have brought you in early on a Sunday to do some last-minute damage control.

The Web's Unique Challenges

One reason for the poor state of Web debugging is that, despite the apparent simplicity of Web applications from the user's perspective, the Web is an exceedingly complex system. Web applications often involve multiple languages executing in multiple environments (client and server)—even using many different parsers and interpreters on several different network layers, which are all tied together by way of an overarching rendering engine. And, of course, the end user is an ingenious beast, likely to think of the most unexpected ways of breaking your application.

To illustrate, let's say you're writing a Web application in PHP. The PHP interpreter is embedded in the Web server, and must produce output that is acceptable to whatever post-processing the server must perform. That output must in turn be acceptable to the browser at the HTTP level.

Simple things such as the order of HTTP headers can cause problems that may only appear much later in the course of the page's operation, or even at a later point in the request sequence. For example, if the Set-Cookie header isn't sent at the right time, or is improperly formed, a session variable may not be set properly. What you think is the problem (the session is invalid) is really only a symptom of another problem (the cookie wasn't set properly), which is itself a side effect of the real problem (the order or format of the HTTP headers was incorrect).
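To make the cookie example concrete, here is a minimal sketch in Python rather than PHP. It assumes a WSGI-style application; the cookie name session_id, the value abc123, and the helper names are all hypothetical. The point is only that every header, including Set-Cookie, is handed over before any body bytes, and that a small check can confirm the cookie is well formed before you go hunting for a "broken session".

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id):
    """Return a (name, value) header pair for a session cookie."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id  # hypothetical cookie name
    cookie["session_id"]["path"] = "/"
    cookie["session_id"]["httponly"] = True
    # OutputString() yields only the header value, without "Set-Cookie: ".
    return ("Set-Cookie", cookie["session_id"].OutputString())


def app(environ, start_response):
    """Minimal WSGI app: all headers are supplied before the body."""
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        build_session_cookie("abc123"),  # hypothetical session value
    ]
    start_response("200 OK", headers)
    return [b"hello\n"]


def check_headers(headers):
    """Fail loudly if the session cookie is missing or cannot be parsed."""
    values = [v for (k, v) in headers if k.lower() == "set-cookie"]
    assert values, "no Set-Cookie header was sent"
    parsed = SimpleCookie()
    parsed.load(values[0])
    assert "session_id" in parsed, "session cookie is malformed"
    return parsed["session_id"].value
```

Running check_headers against the headers your application actually emits tests the cause (the header) rather than the symptom (the invalid session).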

Your application's output may contain HTML, which may in turn reference CSS style sheets, JavaScript code, form elements, and data. All of these elements may interact in unforeseen ways. Many of them may even perform some of the same functions as your server-side software, such as setting cookies or manipulating data from within JavaScript logic. Bugs or weaknesses in one part of your markup, scripts, or styles can result in unexpected data or behavior, far beyond anything you imagined when you wrote the code. And I haven't even mentioned databases, with their various wonderful ways of delivering data that you didn't expect.

What's more, in recent months there have been absurdly widespread reports of cross-site scripting bugs. This indicates that many systems are vulnerable to complex and unforeseen consequences of their remote applications interconnecting. It also means that more and more "black hats" are noting those interactions and preparing exploits just as fast as security-conscious Web developers can produce bug fixes. Can you confidently say that your systems are free of such vulnerabilities? How do you know?
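One way to start answering "How do you know?" is to feed hostile input to the code that builds your pages and inspect what comes out. The sketch below is a Python illustration, not a complete defense. The function render_comment and its markup are hypothetical, and it covers only the simplest case: user-supplied text echoed into an HTML body.

```python
from html import escape


def render_comment(author, text):
    """Build an HTML fragment from untrusted input, escaping both fields."""
    # escape() replaces &, <, > and, with quote=True, both quote characters.
    safe_author = escape(author, quote=True)
    safe_text = escape(text, quote=True)
    return f"<p><b>{safe_author}</b>: {safe_text}</p>"


def contains_raw_script(fragment):
    """Crude check: did a script tag survive into the output?"""
    return "<script" in fragment.lower()
```

A repeatable test can then pass a string such as <script>alert(1)</script> through render_comment and confirm that contains_raw_script returns False for the result.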

Even Web developers themselves are plagued with a variety of problems. In some cases, Web developers must coexist with other developers working simultaneously on the same site. At other times, they have the site all to themselves. Either way, they must work in a wide array of languages, each with its own quirks and vulnerabilities, which is more than any one person should be expected to track or understand. The available tools are, for the most part, relatively primitive. The libraries and other components may still be immature, while documentation may be inaccurate, out of date, or simply incomplete.

Writing Good Code

It's difficult, if not impossible, to write code that is completely free of bugs, especially when its output may be re-interpreted in several other environments. But it's possible to write code that's close enough to being bug-free that the end result is both useful and robust. If nothing else, well-designed code will be easy to fix when problems arise.

There are, as you might imagine, as many approaches to the task of debugging as there are varieties of languages, environments, and combinations thereof. I'd like to emphasize a fundamental principle: Code that works and is easy to debug and maintain is deliberately written for that purpose.

Thus, good code, no matter how simple, is well documented. Although documentation may take the form of inline comments in compiled or server-side code, it should be kept external for code that is optimized for delivery via the network. Its logic (functions, branches, methods, queries, and so forth) and data structures (variables, hashes, objects) should be clear and easy to read and understand. Good code is designed to allow for easy testing and debugging, possibly using "debug" and "production" versions of the same code. Often, good code is written as much to ease testing, debugging, and maintenance as to fulfill its specified purpose or function. If a project uses a language or environment that allows code execution to be traced or logged, good code makes use of that as well. Some code is designed to be easily tested at any level, whether as a standalone component or as a fully integrated part of the deployed system as a whole.
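As one illustration of "debug" and "production" versions of the same code, the following Python sketch uses the standard logging module so that a single switch turns tracing on or off. The logger name webapp, the environment variable APP_DEBUG, and the function apply_discount are assumptions made for this example only.

```python
import logging
import os


def configure_logging(debug=None):
    """Return a logger that traces in debug mode and stays quiet otherwise."""
    if debug is None:
        # Hypothetical switch: set APP_DEBUG=1 on the test server only.
        debug = os.environ.get("APP_DEBUG") == "1"
    logger = logging.getLogger("webapp")
    logger.setLevel(logging.DEBUG if debug else logging.WARNING)
    if not logger.handlers:  # avoid attaching duplicate handlers
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


def apply_discount(price, percent, logger):
    """The same code runs in both modes; only the trace output differs."""
    logger.debug("apply_discount(price=%r, percent=%r)", price, percent)
    result = round(price * (1 - percent / 100), 2)
    logger.debug("apply_discount -> %r", result)
    return result
```

Because the trace calls stay in the code, the production build is the debug build with the switch turned off, so the two cannot drift apart.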

And don't forget version control. You'll need it when you want to return to known, good versions of a routine or component and compare them with newer, buggier versions. Most version control systems also let you back out broken code, branch test code until it can be rigorously tested, and then re-integrate the new, known, good code into the larger system.

Use The Tools You Have

Be sure to test and debug each component both before and during the integration with a larger system. Document your tests, and if possible in your environment, write small tools that automatically repeat the tests and complain loudly when they fail. This can be as simple as writing tiny applications that feed a variety of data into your routines and test their output; or as complex as writing your routines so that they contain their own internal testing logic. Debug-friendly subroutines might check data that has been input before it's used to ensure that it's within acceptable or expected ranges. They also might test the output of those routines to make sure that nothing has gone amiss while you were manipulating the data.
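Both ideas in the paragraph above can be sketched in a few lines of Python. The routine average_rating is a hypothetical example that checks its input and its output, and run_tests is a tiny tool that feeds it a variety of data and reports anything unexpected. Treat it as a sketch of the approach, not a substitute for a real test framework.

```python
def average_rating(ratings):
    """Average a list of 1-to-5 star ratings, checking input and output."""
    # Check the input before using it.
    if not ratings:
        raise ValueError("ratings must not be empty")
    for rating in ratings:
        if not isinstance(rating, int) or not 1 <= rating <= 5:
            raise ValueError(f"rating out of range: {rating!r}")
    result = sum(ratings) / len(ratings)
    # Check the output before handing it back.
    assert 1 <= result <= 5, f"impossible average: {result!r}"
    return result


def run_tests():
    """Feed known data to the routine and return a list of failures."""
    good_cases = [([5, 5, 5], 5.0), ([1, 2, 3], 2.0), ([4], 4.0)]
    bad_inputs = [[], [0], [6], ["3"]]
    failures = []
    for data, expected in good_cases:
        actual = average_rating(data)
        if actual != expected:
            failures.append(f"{data!r}: expected {expected!r}, got {actual!r}")
    for data in bad_inputs:
        try:
            average_rating(data)
        except ValueError:
            continue  # rejected, as it should be
        failures.append(f"{data!r} was accepted but should be rejected")
    return failures
```

A scheduled job or build step can call run_tests and complain loudly (for example, by exiting with a nonzero status) whenever the returned list is not empty.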

If your system runs in the context of a Web server, see if it can open up so-called remote testing ports. These are usually just listeners that produce a stream of output showing what data is being manipulated by which functions, along with the output of those functions, which also gives you a big-picture view. They help because it's often more difficult to debug a system at the server level than to debug a client-side application within an integrated development environment (IDE). Many Web application environments, such as PHP version 3 and Java, provide these listeners as a matter of course. PHP version 4 doesn't provide a standard debug listener service, but there are add-ons that do.

IDEs can be of limited use with server-side code. But if you're fortunate enough to have a powerful IDE, you should find out whether it supports such practices as syntax validation, conditional and stepped execution, setting execution breakpoints and watches on variables, and examining the values of the variables and other data structures in use. Many IDEs offer far more than a simple syntax-coloring text editor does. Some of the more advanced ones may even run your code in the context in which it will be deployed, or at least emulate it under similar conditions. Some can even tie into remotely executing code and synchronize it with a source file that's open in an editing window.

The most sophisticated testing of all uses test harnesses that can emulate the actions taken by an end user, and log and react to responses by the browser and server. Though fairly common in old-school client-server software development, the use of test harnesses for Web-deployed software is rare. The practice is on the rise, however, particularly in the Java world.
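A very small harness of this kind can be sketched in Python. It assumes the application follows the WSGI calling convention, so the harness can call it directly without a network or a real browser. The application greeting_app and the Harness class are hypothetical stand-ins; a real harness would also exercise JavaScript and rendering, which this sketch does not attempt.

```python
import io
from urllib.parse import parse_qs, urlencode
from wsgiref.util import setup_testing_defaults


def greeting_app(environ, start_response):
    """Hypothetical app under test: GET shows a form, POST greets the user."""
    if environ["REQUEST_METHOD"] == "POST":
        size = int(environ.get("CONTENT_LENGTH") or 0)
        form = parse_qs(environ["wsgi.input"].read(size).decode("utf-8"))
        name = form.get("name", [""])[0].strip()
        if not name:
            start_response("400 Bad Request", [("Content-Type", "text/plain")])
            return [b"name is required"]
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [f"Hello, {name}!".encode("utf-8")]
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b'<form method="post"><input name="name"></form>']


class Harness:
    """Emulates a user's requests and logs every response status."""

    def __init__(self, app):
        self.app = app
        self.log = []

    def request(self, method, path, form=None):
        body = urlencode(form or {}).encode("utf-8")
        environ = {}
        setup_testing_defaults(environ)  # fills in the required WSGI keys
        environ.update({
            "REQUEST_METHOD": method,
            "PATH_INFO": path,
            "CONTENT_TYPE": "application/x-www-form-urlencoded",
            "CONTENT_LENGTH": str(len(body)),
            "wsgi.input": io.BytesIO(body),
        })
        captured = {}

        def start_response(status, headers, exc_info=None):
            captured["status"] = status
            captured["headers"] = headers

        text = b"".join(self.app(environ, start_response)).decode("utf-8")
        self.log.append((method, path, captured["status"]))
        return captured["status"], text
```

A test script can then walk through a user's session, for example fetching the form with a GET and submitting it with a POST, and compare each status and response body with what it expects while the log records the whole exchange.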

