Optimizing Open-Source Software for Intel Architectures


MySQL Optimization

MySQL is a database application, which suggests a number of characteristics:

  • We expect the application to be large, and it is. MySQL has several hundred thousand lines of code; for example, the combined code size of all applications in the client directory, built with the Intel Compiler on IA-32, is over 15 MB. Optimizations that reduce code size may provide a benefit.
  • The application contains C++ source code, so make sure inlining is used. How much of the C++ language does the application use? MySQL uses classes, but does not take advantage of C++ exception handling or runtime type identification (RTTI). Consider options that limit the overhead of C++ features the application does not use.
  • Databases typically access large amounts of memory and, thus, optimizations for data access may be beneficial.
  • Databases typically involve many integer-based calculations and few floating-point calculations, so optimizations geared to floating point, such as vectorization, would not be expected to provide a performance increase.

The first optimization to try is the baseline optimization using -O2. For the Intel Compiler on IA-32, the -O2 option enables a broad range of compiler optimizations, such as partial redundancy elimination, strength reduction, Boolean propagation, graph-coloring register allocation, and sophisticated instruction selection and scheduling. In addition, single-file inlining occurs at -O2, so we expect some of its benefits. Because inlining is important to C++ performance, we also attempt more aggressive inlining by using single-file interprocedural optimization (-ip) and multiple-file interprocedural optimization (-ipo). The -ip option enables inlining similar to that enabled at -O2, but performs a few more analyses that should result in better performance. The -ipo option enables inlining across multiple files. Inlining tends to increase code size; however, if it shrinks the active part of the application by removing call and return instructions, the net result is a performance gain. Profile-guided optimization (-prof_use) pairs well with inlining because it records how many times each function is called and therefore guides the compiler to inline only frequently executed functions, which helps limit the code-size impact.
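The interaction between call counts and code size can be illustrated with a toy model. The sketch below is not the Intel Compiler's inlining algorithm; it is a minimal heuristic, assuming a hypothetical list of call sites with made-up sizes and profile counts, that inlines the hottest call sites first until a code-size budget is spent. The names CallSite, choose_inlines, min_calls, and size_budget are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class CallSite:
    callee: str        # hypothetical function name
    callee_size: int   # assumed size of the callee body, in bytes
    call_count: int    # times this call site ran during the profiling run


def choose_inlines(sites, min_calls, size_budget):
    """Pick callees to inline: hottest call sites first, within a size budget."""
    chosen = []
    remaining = size_budget
    for site in sorted(sites, key=lambda s: s.call_count, reverse=True):
        if site.call_count < min_calls:
            break  # every remaining site is colder than the threshold
        if site.callee_size <= remaining:
            chosen.append(site.callee)
            remaining -= site.callee_size
    return chosen
```

Without profile counts, a compiler has to guess which call sites are hot; with them, a cold but small function no longer competes with a frequently executed one for the same code-size budget.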

The -O3 option enables higher-level optimizations focused on data access. We try -O3 to see what performance benefit results. Finally, we expect that vectorization will not provide a performance benefit; however, vectorization is fairly easy to use, so we attempt it anyway. Table 1 summarizes the optimizations that will be attempted and the reasons for doing so.

Optimization                                        Expectation
-O2                                                 Baseline optimization
-O3                                                 Data access optimizations should provide benefit
Single-file interprocedural optimization (-ip)      Stronger inlining analysis over -O2
Multiple-file interprocedural optimization (-ipo)   Multiple-file inlining should bring further benefit
Profile-guided optimization (-prof_use)             Help performance through code-size optimizations
Vectorization (-xN)                                 Don’t expect an improvement

Table 1: Intel compiler optimization evaluation.

The use of GCC on Itanium-based systems running Linux focused on a few optimizations. The -O3 option is a superset of the -O2 optimizations and adds simple inlining of functions, so we expect -O3 to be beneficial. The -O3 optimization also includes register allocation that may benefit architectures with many registers, such as the Itanium Processor Family (IPF). MySQL uses a subset of C++, and GCC offers options that turn off the generation of C++ RTTI and exception-handling (EH) information (-fno-rtti and -fno-exceptions). These options may benefit performance by reducing code size and removing unnecessary exception-handling code. Be careful when disabling RTTI and EH generation: if linked-in libraries, or application code planned for the future, depend on this information, you may run into problems. One other optimization used is -felide-constructors, a minor C++ optimization.
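Before turning off RTTI and EH generation (in GCC, the -fno-rtti and -fno-exceptions options), it can help to check whether the source tree uses those features at all. The following is a rough sketch, not a substitute for a full build and test: it looks for the relevant C++ keywords line by line, strips only trailing // comments, and does not understand block comments, string literals, or dependencies hidden in linked-in libraries. The function name find_feature_uses is hypothetical.

```python
import re

# Keywords that indicate reliance on RTTI or on exception handling.
RTTI_PATTERN = re.compile(r"\b(dynamic_cast|typeid)\b")
EH_PATTERN = re.compile(r"\b(try|catch|throw)\b")


def find_feature_uses(source_text):
    """Return (rtti_lines, eh_lines): 1-based line numbers mentioning each feature."""
    rtti_lines, eh_lines = [], []
    for number, line in enumerate(source_text.splitlines(), start=1):
        code = line.split("//", 1)[0]  # crude: drop a trailing line comment
        if RTTI_PATTERN.search(code):
            rtti_lines.append(number)
        if EH_PATTERN.search(code):
            eh_lines.append(number)
    return rtti_lines, eh_lines
```

An empty result for every file is a hint, not proof, that the options are safe to apply; any hit marks code that needs a closer look first.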

A special benchmark called "SetQuery" (www.cs.umb.edu/~poneil/dbppp/) was developed to help in this optimization effort and was used to measure the performance of the MySQL database on Intel Architecture. SetQuery returns the time that the MySQL database takes to execute a set of SetQuery runs. The SetQuery benchmark measures database performance in a decision-support context such as data mining or management reporting. The benchmark calculates database performance in situations where querying the data is key to application performance, as opposed to reading and writing records back into the database.

The use of most of the optimizations was fairly straightforward; however, a few optimizations require some effort to test. Multiple-file interprocedural optimization essentially delays optimization until link time so that every file and function is visible during the process. To use -ipo properly, ensure that the compiler flags match the link flags and that the proper linker and archiver are used. The original build environment defines the linker as ld and the archiver as ar; these definitions were changed to xild and xiar, respectively.
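The -ipo requirements can be captured in a small helper that builds the settings in one place, so the compile and link flags cannot drift apart. This is a minimal sketch, assuming the build system reads the conventional CFLAGS, CXXFLAGS, LDFLAGS, LD, and AR variables; those variable names and the function name ipo_build_env are assumptions, not details taken from the MySQL build environment.

```python
def ipo_build_env(base_flags):
    """Return build-variable overrides for a multiple-file IPO build."""
    flags = list(base_flags)
    if "-ipo" not in flags:
        flags.append("-ipo")
    joined = " ".join(flags)
    return {
        "CFLAGS": joined,
        "CXXFLAGS": joined,
        "LDFLAGS": joined,  # link flags mirror the compile flags
        "LD": "xild",       # replaces ld
        "AR": "xiar",       # replaces ar
    }
```

For example, ipo_build_env(["-O2"]) yields "-O2 -ipo" for both the compile and link flags, together with the xild and xiar substitutions.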

One challenge in using profile-guided optimization is determining whether the profile information is correct and whether the compiler is using it properly. We were surprised by the profile-guided optimization results and questioned the validity of the profiling data. Two techniques for verifying profile information are manually inspecting the compilation output and using the profmerge facility to dump profile information. During compilation, if the compiler does not find profile information for some number of functions in the file being compiled, it emits this diagnostic:


WARNING: field.cc, total routines: 667, routines w/profile info: 9


If the compiler is able to find profile information for all functions in a file, it does not emit a diagnostic. Make sure the compiler either does not emit the above diagnostic or emits it with a reasonable number of routines reported as having profile information. If you know the most frequently executed functions in your application and the file containing them shows few or no routines compiled with profile information, the profile information may not have been applied correctly. The second technique to verify profile information is to use the profmerge application with the -dump option. profmerge -dump dumps the contents of the profile data file (pgopti.dpi). Search for a routine that is known to execute frequently and find the number of blocks in the function (BLOCKS:) and then the section "Block Execution Count Statistics," which contains counts of the number of times the blocks in the function were executed.
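Inspecting the compilation output by hand gets tedious on a large build, so the first technique can be partly automated. The sketch below assumes the diagnostic always has the exact shape shown earlier (file name, total routines, routines with profile information) and that the build output has been captured as text; the function names profile_coverage and low_coverage_files are hypothetical.

```python
import re

# Matches lines such as:
# WARNING: field.cc, total routines: 667, routines w/profile info: 9
DIAGNOSTIC = re.compile(
    r"WARNING:\s*(?P<file>\S+),\s*total routines:\s*(?P<total>\d+),"
    r"\s*routines w/profile info:\s*(?P<profiled>\d+)"
)


def profile_coverage(build_log):
    """Map each file named in a diagnostic to (profiled, total, fraction)."""
    coverage = {}
    for match in DIAGNOSTIC.finditer(build_log):
        total = int(match.group("total"))
        profiled = int(match.group("profiled"))
        fraction = profiled / total if total else 0.0
        coverage[match.group("file")] = (profiled, total, fraction)
    return coverage


def low_coverage_files(coverage, min_fraction):
    """List files whose share of profiled routines is below min_fraction."""
    return sorted(
        name for name, (_, _, fraction) in coverage.items()
        if fraction < min_fraction
    )
```

A low fraction is not necessarily a problem, because many routines may never run during the profiling workload. It becomes suspicious when the file is known to contain the application's hottest functions.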

