April 18, 2014

Stripped MySQL builds, the optimization that isn’t

I usually tell people to use official MySQL builds from MySQL, or from their operating system distribution if they don’t want to do that. (This assumes that there is no compelling reason to use third-party builds such as Percona’s.) Sometimes, though, people want to create their own builds, or use a build that is “optimized” for whatever reason.

This is usually a bad idea. Most “optimized” builds really aren’t. One of the common problems I see is that a lot of them, including sometimes those shipped with popular Linux distributions, are stripped. That is, the binaries have no symbols. That means you can’t profile them with oprofile, you can’t analyze waits with PMP, you can’t use GDB, you can’t figure out core dumps, and so on. Well, actually you can do some of these things, but only with some effort.
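To make "stripped" concrete, here is a minimal Python sketch that checks whether an ELF binary still carries its full symbol table (the SHT_SYMTAB section, which `strip` removes). The function names are my own, not part of any MySQL tooling, and the parser is deliberately naive: it assumes a well-formed file and ignores edge cases such as extended section numbering.

```python
import struct

SHT_SYMTAB = 2  # section type of the full symbol table that `strip` removes


def has_symbol_table(data: bytes) -> bool:
    """Return True if the ELF image contains a SHT_SYMTAB section."""
    if len(data) < 16 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little, 2 = big
    if is_64:
        # ELF64: e_shoff at 0x28, e_shentsize at 0x3A, e_shnum at 0x3C
        (shoff,) = struct.unpack_from(endian + "Q", data, 0x28)
        shentsize, shnum = struct.unpack_from(endian + "HH", data, 0x3A)
    else:
        # ELF32: e_shoff at 0x20, e_shentsize at 0x2E, e_shnum at 0x30
        (shoff,) = struct.unpack_from(endian + "I", data, 0x20)
        shentsize, shnum = struct.unpack_from(endian + "HH", data, 0x2E)
    for i in range(shnum):
        # sh_type is the second 4-byte field of every section header
        (sh_type,) = struct.unpack_from(endian + "I", data, shoff + i * shentsize + 4)
        if sh_type == SHT_SYMTAB:
            return True
    return False


def is_stripped(path: str) -> bool:
    """Read a binary from disk and report whether its symbol table is gone."""
    with open(path, "rb") as f:
        return not has_symbol_table(f.read())
```

For example, `is_stripped("/usr/sbin/mysqld")` would tell you whether that binary has lost its symbols; the path is only an assumption about where your distribution installs the server. The standard `file` utility reports the same "stripped" or "not stripped" status.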

And that’s why I tell people they generally should not try to “optimize” their server by custom compiling it. The top-tier users have reason to do so. Most users are better off putting their effort into measuring what’s happening, and a stripped build only makes that harder to do. Besides, it’s far too easy to subtly mess up a MySQL build and really end up with a problem.

About Baron Schwartz

Baron is the lead author of High Performance MySQL. He maintains a personal blog at Xaprb.

Comments

  1. kiwibob32 says:

    Or build with profiling options, run for a while, then recompile using the profiling information. Sure, you can lose some opportunity for debugging, but once you’ve gone with most distributions’ default compile options (CFLAGS="-O2 -g") you’ve already introduced significant complications because of the sorts of optimizations -O2 does (yes, -O3 is worse, but -O2 still messes with a lot, including instruction order). I don’t see why this is new info. You either build for speed (preferably knowing which optimizations actually do anything useful for your setup; profiling helps a lot) and/or build for debugging. Who says you can’t build twice, install the binaries to different paths, and switch to the debug build when you actually NEED to debug or run oprofile or whatnot?

  2. John Laur says:

    kiwibob32: Do you, by chance, run Gentoo? Or work for tumblr?

    When I have the extra time, or whenever performance becomes a problem, I can usually sit down, go through my application, and find queries to rewrite, schemas to optimize, old application code/cruft to remove or rewrite, or a place where I can start doing some extra caching or asynchronous operation. These types of optimizations can conservatively buy 500%+ performance improvements in very short order, and I know exactly how they will affect my code and the application. It’s not that the initial code was terrible (though sometimes I wonder why I went the long way around); it may just not have been written to scale as well as it could have, or to handle as much data as it has to now, 7 years after I wrote it. By contrast, I have no idea what changing some esoteric gcc flag will do to my database, and even if it doesn’t completely crash it and destroy my data, at most it will buy me half a percent of performance or so.

    My personal take on the issue is that you shouldn’t be fiddling with this type of optimization unless you work for an organization big enough that fiddling with MySQL optimizations, patches, and compiler flags is your full-time job. For pretty much anyone except companies entirely focused on MySQL performance, there’s basically no reason to spend time doing this; the benefit doesn’t justify the cost. The time should be spent elsewhere, such as improving horizontal scalability or reducing latency.

  3. kiwibob32 says:

    John,

    I was assuming the DBA work, like optimizing schemas, indexes, database parameters, and SQL statements, had already been done. The original post isn’t about optimizing in that sort of fashion anyway, so there is no need to address it. But yes, you’re right: that’s most often where you’ll see the biggest gains, hands down.

  4. Mark R says:

    You can probably save more time in the long run by running a debug build, as you can then look at the core dumps with gdb. Maybe the server runs for months without problems, then crashes?

    Developer / Ops time is a lot more valuable than server time.

  5. IMHO this is not so simple. Sometimes your only chance to optimize MySQL, Apache, etc. is to create your own custom build. This is especially true for mass web hosting and other businesses in which you cannot change the source code or SQL queries because they do not belong to you. Even if you see that your customer’s software is not optimized at all (wrong indexes, full scans, strange queries), you cannot do anything but take the customer’s money and try to optimize the server in any way possible.

  6. To be clear, we should state what a “debug” build is — this is one of the reasons building MySQL is subtle and easy to get wrong. If you connect to MySQL and see “debug” in the version number it reports, you have a severely crippled build that is only intended for developers to use, and will run catastrophically slowly.

    Now, a build that includes symbols (names of functions) is not appreciably slower. The binary is larger, but that’s not the same thing. That is why I said stripping the symbols is not an optimization.
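As a small sketch of that check: take the string the server returns from `SELECT VERSION()` and look for the debug marker among its dash-separated suffixes. The helper name and the sample version strings in the usage note are illustrative assumptions, not taken from a particular server.

```python
def is_debug_build(version: str) -> bool:
    """Return True if a MySQL version string carries the debug marker."""
    # Version strings are dash-separated; everything after the first
    # component is a suffix such as "debug" or "log".
    suffixes = version.lower().split("-")[1:]
    return "debug" in suffixes
```

With this, a string such as "5.6.17-debug-log" would be flagged as a debug build, while "5.6.17-log" would not. Whether the binary has symbols is a separate question that the version string does not answer.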

  7. John Laur says:

    Artur: If you are a hosting company and you have a customer bogging down some shared MySQL instance because of bad indices and bad queries, “creating a custom MySQL build” isn’t going to do jack squat to help the problem. There are much better ways of dealing with the issue such as VServer/OpenVZ or some other full-on virtualization and lowering the CPU/IO priorities of the misbehaving software.

    On a more general note, though, I was watching a talk recently about how Netflix goes to the extreme of having profiling running on all of their production code (Java) almost all the time. Once you get to the point where you are really only scaling horizontally, you can afford all of the overhead to do stuff like this. The database itself, though, seems to almost always be the point where we are still asked to scale vertically. So, all my earlier ranting aside, it’s probably one of the few pieces of software where it can make some economic sense to spend the time tweaking it. I still think it’s a kind of endgame optimization, and in 99% of cases it would be better to spend one’s time elsewhere.
