August 23, 2014

Distributed Set Processing with Shard-Query

Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: […]

Shard-Query EC2 images available

Infobright and InnoDB AMI images are now available There are now demonstration AMI images for Shard-Query. Each image comes pre-loaded with the data used in the previous Shard-Query blog post. The data in the each image is split into 20 “shards”. This blog post will refer to an EC2 instances as a node from here […]

Shard-Query turbo charges Infobright community edition (ICE)

Shard-Query is an open source tool kit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide and conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to […]

Impact of the number of idle connections in MySQL

Be careful with my findings, I appear to have compile in debug mode, I am redoing the benchmarks. Updated version here. I recently had to work with many customers having large number of connections opened in MySQL and although I told them this was not optimal, I had no solid arguments to present. More than […]

Watch out for your CRON jobs

Resolving extreme database overload for the customer recently I have found about 80 copies of same cron job running hammering the database. This number is rather extreme typically the affect is noticed and fixed well before that but the problem with run away cron jobs is way to frequent. If slow down happens on the […]

The perils of InnoDB with Debian and startup scripts

Are you running MySQL on Debian or Ubuntu with InnoDB? You might want to disable /etc/mysql/debian-start. When you run /etc/init.d/mysql start it runs this script, which runs mysqlcheck, which can destroy performance. It can happen on a server with MyISAM tables, if there are enough tables, but it is far worse on InnoDB. There are […]

Election night

Today was epoch day in American history. Maybe even most important day this year, but it’s not what I’d like to write about here. What does it mean for US citizens and all other people around the world? We know, but what does it mean for us – IT professionals and/or internet portals serving news […]

RAID and Scale Out Discussions

Just found this wonderful summary of articles by Jeremy and wanted to give some of my thoughts on the topic. First lets speak about death of the RAID. I think this is far from the case especially if you consider Software RAID here. For many workloads you would like to get RAID just for the […]

Master-Master or Master with Many Slaves

I just found post by Kevin, in which he criticizes Master-Master approach, finding Master with many slaves more optimal. There is surely room for master-N-slaves systems but I find Master-Master replication much better approach in many cases. Kevin Writes “It requires extra hardware thats sitting in standby and not being used (more money and higher […]

Wishes for new “Pure PHP” MySQL driver

If you’re following MySQL or PHP landscape you should have seen announcement by MySQL to develop pure PHP driver. If not – Here is FAQ . I’m to meet the team (Georg, Andrey etc) which will be developing this driver during my visit to Open Source Database Conference in November so I thought it would […]