Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.




A Conversation with Jim Gray


DDJ: Okay, so we were talking about Sybase and stored procedures.

JG: You're right. It just seemed so obvious. But the fact is that you look at the calendar and it was five to seven years before the other vendors got it and put stored procedures into their systems. In fact, if I'm not mistaken, DB2 UDB put stored procedures in just last year. This is no criticism of them; it is just that it was not on the top of their list of things to do. Sybase came on the scene, so now we have the complete spectrum of players, almost. I'll just tell this story. It is hearsay, but it is second-hand information from the people who were actually on the scene.

About this time, MS-DOS was getting popular. I think 1985. IBM was not pleased with the notion that MS-DOS was getting popular, as opposed to IBM DOS. They had this plan called OS/2. OS/2 was a plan for IBM to reassert its leadership in the PC space. Ownership is probably a better word. They also had OS/2 Extended Edition. OS/2 Extended Edition was a lock-out-Microsoft, lock-out-Compaq, lock-out-everybody-else play. They wanted to reacquire ownership of the PC space. Microsoft said, "Oh my gosh. We need to have an MS OS/2 Extended Edition version." That meant that OS/2 Extended Edition was presentation services, or Presentation Manager. It was COBOL. It was the so-called IBM SAA architecture. It had a CS CS interface, and it had a SQL in it. Microsoft said, "Gee, we need a SQL." So they found this little start-up in Emeryville that was building a SQL system. They went to Ashton-Tate and said, "What have you got?" Because Ashton-Tate was the 800-pound gorilla in the PC database space -- dBase. Ashton-Tate said, "No, we don't have a SQL thing." Microsoft said, "IBM's got a SQL, so we have got to have a SQL." So, they found this little start-up in Emeryville called "Sybase". Sybase hadn't shipped a product yet, but they had a great story. Sybase said, "We'll make you a deal. You can use Sybase on all of Microsoft's platforms and you just give us the revenue. We will fix all the bugs and you do all the support." They had this sweetheart deal. Then, Microsoft shipped a product with Ashton-Tate. It was called Ashton-Tate Microsoft SQL Server.

Sybase kind of liked that name, so they came out with something called Sybase SQL Server. Sybase wanted the DEC VMS marketplace. They wanted the IBM mainframe marketplace. They wanted the Unix marketplace. They never thought the PC marketplace would ever amount to a hill of beans. They said, "Great. Microsoft is a channel for us." Now relational systems kind of catch on. Parallel relational systems -- Teradata comes on the scene. They are a data mining company.

Now, fast forward to 1995. NT is kind of catching on. Microsoft has been using Sybase SQL Server, whose product name is branded "Microsoft SQL Server." I joined Microsoft at that time -- about 1995. We, basically, decided this database stuff is important. We really can't count on Sybase for all our technology in this space. So, we cut a deal with Sybase so that we are now independent of them. It took about four years to ramp down. 1999 was the year when we actually ended up being independent of them. We have been beavering away on SQL Server, trying to make it better, and better, and better. At this point, it is interesting, you look out across the scene of the large software companies. . . Microsoft is a big software company, Oracle is a big software company, IBM is a big software company, and Computer Associates is a big software company. Sybase and Informix are also, actually, pretty good-sized. SAP is definitely coming up. So, there is an application layer that is coming up. Database companies are, in fact, a big deal in the software space. It is interesting, to me, that compilers have not become the same huge industry. Certainly, Excel is a giant business. Certainly, Lotus and Exchange, as mail servers, are a big business. In terms of programming languages, and SQL is fundamentally a programming language, databases have ended up being a very lucrative business.

DDJ: Maybe that is related to performance. Are there really any big differences between Microsoft SQL Server, Oracle, and Informix? Is there really a big difference here?

JG: So, if we could get the Oracle salesman, the Informix salesman, and the Microsoft salesman in the room together, they would each give you -- "Absolutely! Ours is better." In fact, everybody who buys one will tell you theirs is better. Again, at the whiteboard level, they are all identical. They all do SQL. . . you know. There is this TPC set of benchmarks. TPC stands for Transaction Processing Performance Council (www.tpc.org). This is like a level playing field. You can go there and say, "Okay, on this piece of hardware, this operating system, this database system, what kind of performance do you get?" You find that there are factors of two and three difference between them.

DDJ: The operating system here is a major factor?

JG: The operating system here is a major factor? The thing that is really a major factor is the hardware. Let's just take our good friend, Windows NT (soon to be called Windows 2000). It runs on 1 to 8 processor systems. It runs on machines that have up to 4 gigabytes of memory. It runs on machines that are. . . fundamentally Intel machines with a certain PCI bus structure, and so on. You take that and you stack it next to a Sun UE 10000 and, guess what, the UE 10000 has more memory (64 gig, instead of 4 gig), more processors (64 processors versus 8 processors), more buses, more this, more that. Incidentally, it does four times, or three times, as much throughput. I think the max that you get on an Intel box is like 40,000 transactions per minute and the other one is something like 120,000 or 130,000 transactions per minute. Part of the equation is the hardware base. How big an SMP can you get? How big a cluster can you get? Then, there is always a biggest SMP you can buy. Now, you take these SMPs and you stack them side by side to make a farm, or a cluster. In the TPC benchmarks there are the TPC-D benchmarks, which are now called H and R, I'm afraid to say, which are decision support benchmarks. The largest ones of those are always done as a cluster because, for these things, it is a question of how much processing and how much disk you can throw at them. So, some of the software systems do clustering well and some don't do clustering well. That is a dimension where the peak performance of one system can be dramatically better than the other.

DDJ: A lot of the logic that powers this technology was developed by you and others in the 1970s when systems were much slower. How well has this scaled in terms of speed? You can imagine it is a point where because of locking and friction between components. . .

JG: Scaling went amazingly well. One of the things that is strange to me is that when I programmed a lot, in the old days, we used to count instructions, count bytes, and count I/Os. Today, people are programming in Java, VBScript, JavaScript, and Perl. One thing that has happened is that we are using computers (they are programming in SQL, incidentally) the way God intended. We are not wasting time on register allocation, and stuff like that. We are thinking about the applications. A flip side of that is that we are programming in a style which lets these automatic tools generate the code for us. You will see a tool generate a SQL statement for you this long and hand it to the compiler. Frankly, the optimizer guys that I worked with never expected they would see a SQL statement that long. I think it is a real tribute to the technology that it has scaled up to be able to handle extraordinarily complex queries. The fact that processors are so fast, and so inexpensive, allows us to program fairly inefficiently from the point of view of somebody in the 1960s saying, "Gee, think of all the instructions you are wasting." A different way of thinking of it is to think about all the time you are saving. People time. I would say that the technology. . . The mindset that we had was to say, "Make it easy, make it easy, make it easy." Forget the cost of storage, and processing.

Moore's Law is on our side. Eventually the cost of storage and the cost of processing are going to be low enough that the dominant cost will be people cost. That has definitely panned out. I think the safe money is on this: in the future, people are going to want systems to be more automatic. They are going to want graphical user interfaces where the computer sees, and to some extent intuits, what they want, as opposed to having them write in this. . . Very few people now write in this arcane SQL language. People are using much more graphical interfaces.

