August 27, 2014

Why do you need many apache children ?

I already wrote kind of about same topic a while ago and now interesting real life case makes me to write again Most Web applications we’re working with have single tier web architecture, meaning there is just single set of apache servers server requests and nothing else – no dedicated server for static content, no […]

Exploring message brokers

Message brokers are not regularly covered here but are, nonetheless, important web-related technologies. Some time ago, I was asked by one of our customer to review a selection of OSS message brokers and propose a couple of good candidates. The requirements were fairly simple: behave well when there’s a large backlog of messages, be able […]

Heartbleed: Separating FAQ From FUD

If you’ve been following this blog (my colleague, David Busby, posted about it yesterday) or any tech news outlet in the past few days, you’ve probably seen some mention of the “Heartbleed” vulnerability in certain versions of the OpenSSL library. So what is ‘Heartbleed’, really? In short, Heartbleed is an information-leak issue. An attacker can […]

Why do we care about MySQL Performance at High Concurrency?

In many MySQL Benchmarks we can see performance compared with rather high level of concurrency. In some cases reaching 4,000 or more concurrent threads which hammer databases as quickly as possible resulting in hundreds or even thousands concurrently active queries. The question is how common is it in production ? The typical metrics to use […]

Distributed Set Processing with Shard-Query

Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: […]

Shard-Query turbo charges Infobright community edition (ICE)

Shard-Query is an open source tool kit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide and conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to […]

Introducing tcprstat, a TCP response time tool

Ignacio Nin and I (mostly Ignacio) have worked together to create tcprstat[1], a new tool that times TCP requests and prints out statistics on them. The output looks somewhat like vmstat or iostat, but we’ve chosen the statistics carefully so you can compute meaningful things about your TCP traffic. What is this good for? In […]

High-Performance Click Analysis with MySQL

We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work.  The first thing these have in common is that they’re generally some kind of loggable event. The next characteristic of a lot of these systems (real or planned) is […]

How Percona does a MySQL Performance Audit

Our customers or prospective customers often ask us how we do a performance audit (it’s our most popular service). I thought I should write a blog post that will both answer their question, so I can just reply “read all about it at this URL” and share our methodology with readers a little bit. This […]

Commodity Hardware, Commodity Software and Commodity People

In the previous post I mentioned not all architectures and solutions work for Commodity People, and people seems to agree with me. Number of vendors would claim they are in Commodity Software or Hardware business but few would probably mention they are doing it for Commodity People, because few people would like to be called […]