Current State of Data Quality
Respondents were asked how they felt about the quality of their production data. Figure 2 summarizes the results by IT organization size. Data quality is highest within smaller organizations, presumably because they haven't had time to make serious mistakes and/or because they've adopted modern database development techniques. An interesting trend is that quality drops as the organization gets larger, until about the mid-size level, where quality starts to rise again. My guess is that mid-size organizations can still survive with a little bit of "data chaos", but eventually an organization reaches a certain size and needs to improve its data management approach if it's to survive.
The survey asked whether the respondents worked in organizations with defined service-level agreements (SLAs) for database performance and for database availability. Of the respondents who knew the answer (the survey allowed "I Don't Know" as a possible answer for many questions), 30 percent and 42 percent, respectively, responded positively. Having SLAs in place seemed to be correlated with improved data quality: 60 percent of respondents where one or both SLAs exist indicated that data quality was either perfect or pretty good, compared with 52 percent when no SLAs exist. Serious data quality problems were reported by 5 percent and 10.3 percent of these groups, respectively.
Figure 3 depicts the correlation of various approaches to data naming conventions with data quality. This is important because it is an indicator of the health of the relationship between the data management group and developers. The survey showed that when developers willingly followed the data naming conventions, data quality was better than when the conventions were enforced by the data group. Both of these approaches were much better than having inconsistently followed conventions, which, in turn, was better than having no data naming conventions at all. (At www.agiledata.org, I describe a collection of strategies for promoting a more effective, collaborative relationship between data professionals and developers.)