July 31, 2014

Star Schema Bechmark: InfoBright, InfiniDB and LucidDB

In my previous rounds with DataWarehouse oriented engines I used single table without joins, and with small (as for DW) datasize (see http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/, http://www.mysqlperformanceblog.com/2009/10/26/air-traffic-queries-in-luciddb/, http://www.mysqlperformanceblog.com/2009/11/02/air-traffic-queries-in-infinidb-early-alpha/). Addressing these issues, I took Star Schema Benchmark, which is TPC-H modification, and tried run queries against InfoBright, InfiniDB, LucidDB and MonetDB. I did not get results for MonetDB, will […]

Analyzing air traffic performance with InfoBright and MonetDB

Accidentally me and Baron played with InfoBright (see http://www.mysqlperformanceblog.com/2009/09/29/quick-comparison-of-myisam-infobright-and-monetdb/) this week. And following Baron’s example I also run the same load against MonetDB. Reading comments to Baron’s post I tied to load the same data to LucidDB, but I was not successful in this. I tried to analyze a bigger dataset and I took public […]

High-Performance Click Analysis with MySQL

We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work.  The first thing these have in common is that they’re generally some kind of loggable event. The next characteristic of a lot of these systems (real or planned) is […]

The MySQL optimizer, the OS cache, and sequential versus random I/O

In my post on estimating query completion time, I wrote about how I measured the performance on a join between a few tables in a typical star schema data warehousing scenario. In short, a query that could take several days to run with one join order takes an hour with another, and the optimizer chose […]

Why MySQL could be slow with large tables ?

If you’ve been reading enough database related forums, mailing lists or blogs you probably heard complains about MySQL being unable to handle more than 1.000.000 (or select any other number) rows by some of the users. On other hand it is well known with customers like Google, Yahoo, LiveJournal,Technocarati MySQL has installations with many billions […]

ScaleArc: Benchmarking with sysbench

ScaleArc recently hired Percona to perform various tests on its database traffic management product. This post is the outcome of the benchmarks carried out by Uday Sawant (ScaleArc) and myself. You can also download the report directly as a PDF here. The goal of these benchmarks is to identify the potential overhead of the ScaleArc […]

Quick review of InfiniDB 4

I’ve mentioned InfiniDB before in a previous post titled, “Star Schema Bechmark: InfoBright, InfiniDB and LucidDB,” but it’s been 4 years since that was published. Recently I evaluated column-storage solutions for Percona Cloud Tools and took another look at InfiniDB 4. There was the release of version 4, which I think is worth attention. What […]

Measuring Max Replication Throughput on Percona XtraDB Cluster with wsrep_desync

Checking throughput with async MySQL replication Replication throughput is the measure of just how fast the slaves can apply replication (at least by my definition).  In MySQL async replication this is important to know because the single-threaded apply nature of async replication can be a write performance bottleneck.  In a production system, we can tell […]

Read/Write Splitting with PHP Webinar Questions Followup

Today I gave a presentation on “Read/Write Splitting with PHP” for Percona Webinars.  If you missed it, you can still register to view the recording and my slides. Thanks to everyone who attended, and especially to folks who asked the great questions.  I answered as many as I could during the session, but here are […]

Percona XtraDB Cluster: Failure Scenarios with only 2 nodes

During the design period of a new cluster, it is always advised to have at least 3 nodes (this is the case with PXC but it’s also the same with PRM). But why and what are the risks ? The goal of having more than 2 nodes, in fact an odd number is recommended in […]