Comments on: Using GROUP BY WITH ROLLUP for Reporting Performance Optimization
http://www.mysqlperformanceblog.com/2007/09/17/using-group-by-with-rollup-for-reporting-performance-optimization/
Everything about MySQL Performance
Sat, 21 Nov 2009 05:23:57 -0800

By: Dasher, Wed, 27 Aug 2008 10:46:17 +0000
http://www.mysqlperformanceblog.com/2007/09/17/using-group-by-with-rollup-for-reporting-performance-optimization/comment-page-1/#comment-349559

The Tag Cloud only changes when an article is added or changed.

So generate normalised tag values and store them in a table.
Then, when needed, update the tag table with the tags used for the post.

Each day, re-normalise the tag table.

You’ll need to normalise the values, otherwise you’ll end up with very large values over time, when what matters is the relative values of the tags.

For optimum performance:
If you’re using PHP for the site, you can also use APC to store the site-wide tag data (the HTML will be enough), and have the code that updates the tag table also update the APC tag data.
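The scheme described above could be sketched roughly as follows. This is only an illustration in Python with an in-memory SQLite database, not the commenter's code: the table name `tag_weight`, both function names and the scaling rule (divide by the current maximum) are assumptions, since the comment does not say how the values should be normalised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical summary table: one row per tag, holding its relative weight.
conn.execute("CREATE TABLE tag_weight (tag TEXT PRIMARY KEY, weight REAL NOT NULL)")


def add_post_tags(conn, tags):
    """Bump the stored weight of every tag used by a new or edited post."""
    for tag in tags:
        cur = conn.execute(
            "UPDATE tag_weight SET weight = weight + 1 WHERE tag = ?", (tag,)
        )
        if cur.rowcount == 0:  # tag not seen before, so create its row
            conn.execute(
                "INSERT INTO tag_weight (tag, weight) VALUES (?, 1)", (tag,)
            )


def renormalise(conn):
    """Daily job: rescale so the largest weight is 1.0 (assumed rule).

    Relative values between tags are preserved; absolute values stay small.
    """
    (top,) = conn.execute("SELECT MAX(weight) FROM tag_weight").fetchone()
    if top:
        conn.execute("UPDATE tag_weight SET weight = weight / ?", (top,))


add_post_tags(conn, ["mysql", "performance"])
add_post_tags(conn, ["mysql", "rollup"])
renormalise(conn)
cloud = dict(conn.execute("SELECT tag, weight FROM tag_weight"))
```

With this rule, tags added after a normalisation run count for more than older ones, so it behaves like a decay. Whether that is desirable depends on the site. The cached HTML (APC in the comment) would be rebuilt from `cloud` in the same code path that calls `add_post_tags`.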

By: peter, Mon, 11 Aug 2008 15:51:01 +0000
http://www.mysqlperformanceblog.com/2007/09/17/using-group-by-with-rollup-for-reporting-performance-optimization/comment-page-1/#comment-343608

vepa,

This may take long (though it can be OK if the load is IO bound). The first thing is that you should not run such complex queries in real time :)

By: vepa, Mon, 11 Aug 2008 13:39:53 +0000
http://www.mysqlperformanceblog.com/2007/09/17/using-group-by-with-rollup-for-reporting-performance-optimization/comment-page-1/#comment-343571

I’m having the same problem with count() for tag cloud generation. I have 300,000 different tags (groups) and 1.5 million records, and it takes 120 seconds to populate a tag cloud. Do you know any optimal ways to build tag clouds for huge websites?

I wanted to store the total numbers and the update time on each tag, but I am using the same tag cloud for different things, so I would need to store several totals for each tag, which would be 20 extra columns for each record.

By: richard, Mon, 14 Jan 2008 15:12:05 +0000
http://www.mysqlperformanceblog.com/2007/09/17/using-group-by-with-rollup-for-reporting-performance-optimization/comment-page-1/#comment-230943

In MySQL 5 and up you could do the following. Since you already have the counts, as long as you group by the same fields, using the sum function in the outer query does not change the counts, and you gain speed by using the LIMIT in the subquery.

SELECT grp
     , sum(cnt)
  FROM ( SELECT grp
              , count(*) cnt
           FROM dt
          WHERE slack LIKE 'a%'
          GROUP BY grp
          ORDER BY cnt DESC
          LIMIT 10
       ) t1
 GROUP BY grp WITH ROLLUP;

-richard
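One property of a query of this shape is worth checking: the total row sums only the groups that survive the LIMIT, not every matching row. The sketch below illustrates this in Python with SQLite. SQLite has no WITH ROLLUP, so a UNION ALL stands in for the super-aggregate row. The table `dt` and the columns `grp` and `slack` follow the query above, while the sample rows and `TOP_N` are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dt (grp INTEGER, slack TEXT)")
# Made-up data: group 1 has 3 matching rows, group 2 has 2, group 3 has 1.
rows = [(1, "aa"), (1, "ab"), (1, "ac"),
        (2, "aa"), (2, "ab"),
        (3, "aa"), (3, "zz")]
conn.executemany("INSERT INTO dt VALUES (?, ?)", rows)

TOP_N = 2  # stands in for the LIMIT 10 of the original query

query = """
WITH top AS (
    SELECT grp, COUNT(*) AS cnt
    FROM dt
    WHERE slack LIKE 'a%'
    GROUP BY grp
    ORDER BY cnt DESC, grp
    LIMIT ?
)
SELECT grp, cnt FROM top
UNION ALL                      -- emulates the ROLLUP total row
SELECT NULL, SUM(cnt) FROM top
"""
result = conn.execute(query, (TOP_N,)).fetchall()

# Total over all matching rows, for comparison with the rolled-up total.
(all_matching,) = conn.execute(
    "SELECT COUNT(*) FROM dt WHERE slack LIKE 'a%'"
).fetchone()
```

Here the total row reports 5 while 6 rows match the filter, so a report built this way should label that row as the total of the listed groups.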

By: links for 2007-10-11 - smalls blogger, Thu, 11 Oct 2007 00:37:19 +0000
http://www.mysqlperformanceblog.com/2007/09/17/using-group-by-with-rollup-for-reporting-performance-optimization/comment-page-1/#comment-176437

[...] Using GROUP BY WITH ROLLUP for Reporting Performance Optimization | MySQL Performance Blog Using GROUP BY WITH ROLLUP for Reporting Performance Optimization (tags: development performance mysql) [...]

By: peter, Wed, 03 Oct 2007 15:48:52 +0000
http://www.mysqlperformanceblog.com/2007/09/17/using-group-by-with-rollup-for-reporting-performance-optimization/comment-page-1/#comment-174598

Indeed. Your last name directly translates as “Hare” in Russian. I guess it comes from some Slavic country, but probably not Russia directly, as Russian last names are usually formed with a suffix such as “ov” or “ev”.

By: Larry Zaetz, Sun, 30 Sep 2007 23:14:53 +0000
http://www.mysqlperformanceblog.com/2007/09/17/using-group-by-with-rollup-for-reporting-performance-optimization/comment-page-1/#comment-173968

Hello Peter,
I was surprised to see a last name such as yours.
I was told that some of my family roots could have
had a similar name. Father was from a town that sounded
like “Chipchevitz”. Relatives close by came from Sarne.

By: peter, Tue, 25 Sep 2007 05:36:52 +0000
http://www.mysqlperformanceblog.com/2007/09/17/using-group-by-with-rollup-for-reporting-performance-optimization/comment-page-1/#comment-171974

Daniel,

Please compare the query I said can’t be run with your query. They are VERY different.

Indeed, you can get results sorted in either direction by the columns you use for GROUP BY,
but if you’re using GROUP BY WITH ROLLUP you can’t sort by the value of an aggregate function such as count(), avg() or sum(). I hope that makes things a bit clearer.
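A workaround sometimes used when rows must be ordered by an aggregate is to build the per-group rows and the total in a derived table and sort in the outer query. The sketch below is only an illustration of that idea, not something taken from the post. It runs on SQLite through Python, SQLite has no WITH ROLLUP, so UNION ALL produces the total row, and the table `dt`, its data and the `is_total` helper column are all made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dt (grp TEXT)")
conn.executemany(
    "INSERT INTO dt VALUES (?)",
    [("a",), ("b",), ("b",), ("b",), ("c",), ("c",)],
)

query = """
SELECT grp, cnt
FROM (
    -- per-group rows
    SELECT grp, COUNT(*) AS cnt, 0 AS is_total FROM dt GROUP BY grp
    UNION ALL
    -- grand total row, flagged so it can be kept last
    SELECT NULL, COUNT(*), 1 FROM dt
) AS t
ORDER BY is_total, cnt DESC   -- sort by the aggregate, total stays at the end
"""
report = conn.execute(query).fetchall()
```

The same outer-sort idea should carry over to a MySQL derived table whose inner query uses GROUP BY ... WITH ROLLUP, though that is not tested here.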

By: Daniel Ciulinaru, Fri, 21 Sep 2007 19:37:37 +0000
http://www.mysqlperformanceblog.com/2007/09/17/using-group-by-with-rollup-for-reporting-performance-optimization/comment-page-1/#comment-170653

Hi Peter,

Looks like you CAN order the results the way you want in a GROUP BY … WITH ROLLUP (at least in 5.1.11-beta; I haven’t checked other versions).
The key is the DESC keyword immediately after GROUP BY:

mysql> select IFNULL(date,'Total') as 'date_total',
    ->        no_of_files,
    ->        raw_size as 'raw_size',
    ->        compressed_size as 'compressed_size',
    ->        avg_ratio as 'avg_ratio'
    ->   from (select date(log_datestamp) as date,
    ->                count(log_instance) as no_of_files,
    ->                sum(log_uncompressed_size) as raw_size,
    ->                sum(log_compressed_size) as compressed_size,
    ->                round(avg(log_compress_ratio_percentage),3) as avg_ratio
    ->           from logs_stats
    ->          group by date desc with rollup) as a;
+------------+-------------+--------------+-----------------+-----------+
| date_total | no_of_files | raw_size     | compressed_size | avg_ratio |
+------------+-------------+--------------+-----------------+-----------+
| 2007-09-21 |          16 |   2605136552 |       131024683 |    94.988 |
| 2007-09-20 |         171 |  26946514751 |      1724108146 |    93.246 |
| 2007-09-19 |         355 |  53270319908 |     11372204678 |    79.300 |
| 2007-09-18 |         375 |  57924126151 |     12854828516 |    80.481 |
.........................................................................
| 2007-07-16 |           1 |    113834620 |        82608679 |    27.400 |
| 2007-06-08 |           1 |    132729275 |        12107490 |    90.900 |
| 2007-05-22 |           1 |    180724880 |        16671364 |    90.800 |
| 2007-05-03 |           1 |    180577870 |        16670719 |    90.800 |
| Total      |        5641 | 602786174381 |    126053043926 |    80.638 |
+------------+-------------+--------------+-----------------+-----------+
37 rows in set (0.04 sec)

“Sure, trust the gurus. Just don’t believe anything they say, especially when it comes to performance.”
– Steven Feuerstein

Cheers!

By: Data Management Today by Craig Mullins, Fri, 21 Sep 2007 16:05:27 +0000
http://www.mysqlperformanceblog.com/2007/09/17/using-group-by-with-rollup-for-reporting-performance-optimization/comment-page-1/#comment-170609

Log Buffer #63: a Carnival of the Vanities for DBAs…

Welcome to the 63rd edition of Log Buffer, the weekly review of database blogs. For those of you reading…
