August 27, 2009

Looking at Redis

Posted by peter |

Recently I had a chance to take a look at Redis project, which is semi-persistent in memory database with idea somethat similar to memcache but richer feature set.

Redis has simple single process event driven design, which means it does not have to deal with any locks which is performance killer for a lot of applications. This however limits it scalability to single core. Still with 100K+ operations a second this single core performance will be good enough for many applications. Also nothing stops you from running many Redis instance on single server to get advantage of multiple cores.

I call Redis semi-persistent because it does not store the data on disk immediately but rather dumps its all database every so often – you have a choice of configuring time and number of updates between database dumps. Because dump is basically serial write Redis does not an expensive IO subsystem. Also because this dump is background it does not affect read/write performance to the database which is in memory. In the tests I’ve done I’ve seen Redis doing writes some 4MB/sec for probably 50% of test duration where Innodb had to write 50MB/sec for about third of throughput and doing a lot of random IO as it was doing it. This is among other things because Innodb has to flush full 16K pages while doing flush.
[read more...]

August 18, 2008

Worse than DDOS

Posted by peter |

Today I worked on rather interesting customer problem. Site was subject what was considered DDOS and solution was implemented to protect from it. However in addition to banning the intruders IPs it banned IPs of web services which were very actively used by the application which caused even worse problems by consuming all apache slots which were allocated to the problem. Here are couple of interesting lessons one can learn from it.
[read more...]

January 10, 2008

PHP vs. BIGINT vs. float conversion caveat

Posted by shodan |

Sometimes you need to work with big numbers in PHP (gulp). For example, sometimes 32-bit identifiers are not enough and you have to use BIGINT 64-bit ids; e.g. if you are encoding additional information like the server ID into high bits of the ID.

I had already written about the mess that 64-bit integers are in PHP. But if the numbers you use do not cover 64-bit range fully, floats might save the day. The trick is that PHP floats are in fact doubles, i.e. double-precision 64-bit numbers. They have 52 bits for mantissa, and integer values up to 2^53-1 can be stored exactly. So if you’re using up to 53 bits, you’re OK with floats.

However, there’s a conversion caveat you should be aware of.
[read more...]

November 22, 2007

Pitfall of proxying HTTP requests through Lighttpd

Posted by Maciej Dobrzanski |

Recently I had a case with a web server farm where a random node went down every few minutes. I don’t mean any of them rebooted except once or twice, but rather they were slowing down so much that practically stopped serving any requests and were being pulled out from the LVS cluster. The traffic was not any different than usual, all other elements of the system worked perfectly fine (e.g. databases, storage), no one started any backup in the middle of the day as it happens sometimes… so what was happening?
[read more...]

July 23, 2007

Sphinx: Going Beyond full text search

Posted by peter |

I’ve already wrote a few times about various projects using Sphinx with MySQL for scalable Full Text Search applications. For example on BoardReader we’re using this combination to build search against over 1 billion of forum posts totaling over 1.5TB of data handling hundreds of thousands of search queries per day.

The count of forum posts being large, is however not the largest we’ve got to deal in the project – number of links originating from forum posts is a bit larger number.

The task we had for links is to be able to search for links pointing to given web site as well. The challenge in this case is we do not only want to match links directed to “mysql.com” but links to “www.mysql.com” or “dev.mysql.com/download/” as well as they are all considered to belong to mysql.com domain, while searching for “dev.mysql.com/download”" will only match dev.mysql.com domain and files within /download/ directory.

Initially we implemented it in MySQL using partitioning by domain which link was pointing to. So “mysql.com” links were stored in one table group and “google.co.uk” on another. We still had serious challenges however – as each applies to many search URLS,
such as “dev.mysql.com/download/mysql-5.1.html” would match “mysql.com”, “dev.mysql.com”, “dev.mysql.com/download/” and
“dev.mysql.com/download/mysql-5.1.html” we could not use link=const where clause but had to use link like “prefix%” which means index could not be used to get 20 last links and filesort over millions of links we had to youtube.com wikipedia.org and other top domains was extremely slow. Not to mention counting number of links (and number number of distinct forum sites) pointing to the given URL or graphs showing number of links per day. To fight this problem we had to restrict number of days we allow to cover based on the amount of links to the domain… but for some top domains it was slow even with just 3 days worth of data.

You might point out if we had link_date between X and Y and link like “prefix%” kind of where clause we would not be able to use index past link_date part, it is true so we had to use link_date in ( ) and link like “prefix%” which allows to use both keyparts which is much better but not good enough.

Caching is not good enough in such case as we do not want a single user to wait for minutes. large variety of problematic search urls does not allow to use pre-caching not to mention general load on server such batch processing would put.

The first alternative to this approach was to store duplicate data storing link to “dev.mysql.com/download/mysql-5.1.html” as links to 4 url prefixes I mentioned above. Unfortunately this would blow up data stored quite significantly, requiring in average of 6 rows for each link and it does not solve all the problems – result counting and number of distinct sites were still pretty slow and we did not want to go into creating all this data as summary tables.

Instead we decided to use Sphinx for this kind of task which proved to be extremely good idea. We converted all URLs to search keywords and now these 6 rows become simply one row in sphinx index with 6 “keywords” – specially crafter strings which corresponded to the URLs. Of course we did not store these in the table but instead used UDF to convert URL to list of “keywords” on the fly.

As results we now can pull up results even for youtube.com for fractions of the second and we could show 3 months worth of data for any URLs. (We could use longer time span but we did not have enough memory for Sphinx attribute storage). It is especially great as there is still room for optimization – Sphinx stores word positions in the index, while we do not need them in this case as we’re doing kind of “boolean full text search”. Plus we can make index built sorted by timestamp which would allow to same on sorting which is now still happening.

Using Sphinx such non-traditional way required implementing some features more traditional for SQL databases rather than full text search applications. Group By was added to Sphinx so we could search number of matches per day, or number of matches per language.

For Domain Profile we’ve got to use even more of those features such as counting number of distinct sites pointing to the given url or domain etc. Honestly this is where we cheated a bit and distinct number is bit approximate for large numbers but it still works really well for our needs.

Sure we could use summary tables for domains but it would be a lot of work and raver inflexible if we would like to add some more features and take a lot of resources to update or periodically build for millions of domains.

As this part worked pretty well we also decided to use Sphinx for other part of the web site – Forum Site Profile. This uses some pre-generated data such as number of posts in total for forum or in thread but most of other stuff is built with Sphinx. This also uses fair amount of tricks using fake full text search to retrieve all posts from given web site or forum from the global sphinx index.

So in general we find parallel processing using sphinx pretty good solution for many data crunching needs especially when lack of parallel abilities in MySQL makes its use rather inconvenient and pre-generating data is inconvenient or impossible.

If you’re attending OSCON 2007 and would like to learn more about Sphinx we have a BOF on Thursday to talk on the topic.

July 6, 2007

PHP Large result sets and summary tables.

Posted by peter |

We’re working with web site preparing for massive growth. To make sure it handles large data sets as part of the process we work on generation test database of significant size as testing your application on table with 1000 rows may well give you very dangerous false sense of security.

One of the process web site had was creating of summary tables which was done by executing some huge group by query, doing some stuff with results and then populating tables. This all worked well for small tables… but not for larger ones.

First problem was PHP script generating the table took 10GB of RAM and was swapping development server which had just 4GB of Ram (and plenty of swap space) like crazy. Why ? Because by default mysql_query uses mysql_store_result C library call and buffers all result set in the process memory. Not good if there are over 50 millions of rows. Note this limit is not controlled by memory_limit PHP config variable because that only controls memory which passes via PHP memory management which does not apply to MySQL result set.

OK there is “easy” fix for this problem, you can use mysql_unbuffered_query instead and mysqli and PDO have their own way to reach similar behavior. This call users underlying mysql_use_result API call which does not store all result set in memory but instead streams it from the server, fetching in blocks. There are some limits as you can’t use mysql_num_rows() and mysql_data_seek() if you use this method but this is told in PHP manual and so easy to catch. There are however more differences which may cause things breakage

  1. Table Locks – Table locks are not cleared until you fetch whole result set if you’re reading from tables directly (if you do not have “using temporary” in EXPLAIN) this was not issue for given case as GROUP BY in question required temporary table plus it was test system anyway. The workaround for this one is to use SQL_BUFFER_RESULT hint if you need to release table locks early. It comes at cost of creating temporary table though which can be quite high.
  2. Sharing connection no more works If you use buffered query you can use same connection to run other queries, ie INSERTs and UPDATEs while you traverse with data. Not with unbuffered query because connection is still busy.
  3. Need more error checking If you use buffered query the only real call you can get errors is when you run mysql_query , mysql_fetch_row simply reads data from memory and so most applications do not care to check if there are any errors while fetching. With mysql_unbuffered_query data comes in portions so you can well get an error while fetching rows. If you do not check for error it can look as you’ve done with result set while you only processed a portion of it, which can cause rather hard to catch errors.
  4. Connection can timeout If you do not fetch data for long enough MySQL Server may think client is dead and close connection. This well may happen if you need long processing for each row or have long periodic data flushes, ie with multiple value INSERTs etc. This can be fixed by increasing net_write_timeout variable on the server so it gives you more time

But is this the only way ?

Of course not. First you should consider if you need to do processing in PHP at all. Many summary tables can be built by INSERT … SELECT, or some others purely SQL commands and it can be much more efficient. Another alternative is of course to use MySQL Stored Procedures which can be fit to do this simple job.

The downside of using these techniques if of course you’ve got to have summary tables and original tables on the same server which can limit your scalability. Using FEDERATED Tables can work for some cases in others script can be more efficient especially when multiple servers are involved and you want to do some parallel processing.

It is also good question if you need to query all result at once. It is rather efficient bur can cause problems with table locks and other issues plus if script aborts it may be hard to restart. So it may be better structuring your queries to process data by certain objects (ie City by one City at the time) or do INSERT … SELECT to the temporary table with auto_increment column and fetch data from this table using auto increment column ranges instead. I would especially recommend this last way for very long processes, ie if you need to check data against web services and so on – in this case the overhead of creating yet another temporary table is not so large.

March 27, 2007

Integers in PHP, running with scissors, and portability

Posted by shodan |

Until recently I thought that currently popular scripting languages, which mostly evolved over last 10 years or something, must allow for easier portability across different platforms compared to ye good olde C/C++.

After all, their development started a few decades after C, so its notorious caveats are all well-known and should be easy to avoid when designing a new language, right?

However, PHP just brought me a new definition of “portable” – and that was when working with… integers.

[read more...]

PHP Sessions – Files vs Database Based

Posted by peter |

One may think changing PHP session handler from file based to database driven is fully transparent. In many cases it is, sometimes however it may cause some unexpected problems as happened to one of our customers.

If you use file based sessions PHP will lock session file for whole script execution duration, which means all requests from the same sessions will be serialized on PHP level, which means they also will be serialized for single user on database level. If you change to store PHP sessions in MySQL instead this effect may be no more true and you may have number of requests executing for the same session at the same time. First of course means you may have your session data damaged because you will have lost session variables update from one of the script, in addition however you may run into database related issues of modifying user profile or other user/session related data in parallel, if you do not use transactions or lock tables.

So how you can get back your old file based session behavior with MySQL Sessions ?

If you have dedicated connection to session database and use Innodb tables for your session storage you can start transaction on the session start and use SELECT … FOR UPDATE to lock the session row in the session table for whole request length. On the end of the session the same row is updated and transaction is committed.

If you share session connection with other modules or do not use transactional tables for session you can use GET_LOCK to get same behavior. In the start of the session you can do SELECT GET_LOCK(‘‘,10) and in in the end of the request
SELECT RELEASE_LOCK(‘‘) where session_id is current session identifier. Note – setting this external lock on session name should be done before session data is read from database for things to work properly.

This approach assumes you do not use GET_LOCK in other places in your application as as soon as it is called second time previous lock is automatically released. The good thing about it however – you can use it as an extra to your current MySQL Sessions system without need to change how it works internally. If you do not use persistent connections you even do not have to release lock – as soon as connection is closed the lock is automatically released.

The value 10 in GET_LOCK is timeout in seconds – if lock can’t be granted for this amount of time it will return “0″ indicating lock was not granted in this case you can select to continue without session or may do something else, like logging error as this generally should not happen in well tuned applications.

February 11, 2007

Content delivery system design mistakes

Posted by peter |

This week I helped dealing with performance problems (part MySQL related and part related to LAMP in general) of system which does quite a bit of content delivery, serving file downloads and images – something a lot of web sites need to do these days. There were quite a bit of mistakes in design for this one which I though worth to note, adding some issues seen in other systems.

Note this list applies to static content distribution, dynamic content has some of its own issues which need different treatment.

DNS TTL Settings The system was using DNS based load balancing, using something like img23.domain.com to serve some of the images. I’m not big fan of purely DNS based load balancing and HA but it works if configured well. In this case however the problem was zero TTL set in DNS configuration. This obviously adds latency especially for “aggregate” pages which may require images to be pulled from 10 different image servers.

Keep Alive In my previous post I wrote you often do not need keep alive for dynamic pages (there are also exceptions) but you really should have Keep Alive enabled while serving images. It especially hurts not to have one if 30 thumbnails are loaded per page if you do not have one.
[read more...]

February 8, 2007

Debugging sleeping connections with MySQL

Posted by peter |

Have you ever seen connection in the SHOW PROCESSLIST output which is in “Sleep” state for a long time and you have no idea why this would happen ?

I see if frequently with web applications and it is often indication of trouble. Not only it means you may run out of MySQL connections quicker than you expected but it also frequently indicates serious problems in the application. If you do not use persistent connections and you have connection in Sleep stage for 600 seconds what could it be ? It may mean some of your pages take that long to generate (or might be the code simply gets into the tight loop and page never gets generated) it also could mean some of external Web Services are slow or not available and you’re not dealing with timeouts properly. Or may be you have several connections to MySQL server and right now running query which takes that long ? In any case it is something frequently worth looking at.

[read more...]