July 31, 2014

The power of MySQL’s GROUP_CONCAT

In the very early days of Percona, Vadim wrote a very nice post about GROUP_CONCAT.

But I want to show you a bit more about it.

When is GROUP_CONCAT useful? When working with Support customers, I usually recommend it for aggregating many-to-many data. It makes the output simpler and easier to read, and it takes little effort to get working.

Some simple examples:

This is a test table:

Without grouping info the only way you can check things is:

But it looks much better and easier to read with GROUP_CONCAT:
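A minimal sketch of the idea, assuming a hypothetical table `group_c` with `parent_id` and `child_id` columns (the table name, column names and rows are illustrative assumptions, not the data from the post):

```sql
-- Hypothetical test table: each row links one parent to one child.
CREATE TABLE group_c (
  parent_id INT NOT NULL,
  child_id  INT NOT NULL
);

INSERT INTO group_c (parent_id, child_id) VALUES
  (1, 1), (1, 2), (1, 3),
  (2, 1), (2, 4), (2, 6),
  (3, 1), (3, 2),
  (4, 1),
  (5, 0);

-- Without grouping: one row per pair, so related values are spread over many rows.
SELECT parent_id, child_id
FROM group_c
ORDER BY parent_id, child_id;

-- With GROUP_CONCAT: one row per parent, children collapsed into a single string.
SELECT parent_id,
       GROUP_CONCAT(child_id ORDER BY child_id SEPARATOR ', ') AS children
FROM group_c
GROUP BY parent_id;
```

With these rows, parent 1 comes back as a single row with children `1, 2, 3` instead of three separate rows.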

Easy? Let’s go to production usage and some “real” examples :)

Assume you have 4 Support Engineers who were working with 6 Customers this week on 15 issues.

As usually happens, everyone (except those on vacation, of course :)) worked on everything with everybody.

How would you represent it?

Here is my way:

Create test tables:

  • engineers (id, name, surname, URL) – list of engineers
  • customers (id, company name, URL) – list of customers
  • issues (id, customer_id, description) – list of issues assigned to customers
  • workflow (id, engineer_id, issue_id) – list of actions: issues and engineers who worked on them
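One way the four tables above could be declared is sketched below. The column types, index names, the `company_name` spelling and the placeholder URLs are assumptions; the engineer names, customer abbreviations and issue descriptions in the sample rows are taken from the result shown later in the post, but the ids assigned to them here are arbitrary.

```sql
CREATE TABLE engineers (
  id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name    VARCHAR(50)  NOT NULL,
  surname VARCHAR(50)  NOT NULL,
  url     VARCHAR(255) NOT NULL
);

CREATE TABLE customers (
  id           INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  company_name VARCHAR(100) NOT NULL,
  url          VARCHAR(255) NOT NULL
);

CREATE TABLE issues (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  customer_id INT UNSIGNED NOT NULL,
  description VARCHAR(255) NOT NULL,
  KEY idx_customer (customer_id)
);

-- Many-to-many link: which engineer worked on which issue.
CREATE TABLE workflow (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  engineer_id INT UNSIGNED NOT NULL,
  issue_id    INT UNSIGNED NOT NULL,
  KEY idx_engineer (engineer_id),
  KEY idx_issue (issue_id)
);

-- A few sample rows; the URLs are placeholders, not real addresses.
INSERT INTO engineers (id, name, surname, url) VALUES
  (1, 'Miguel',  'Nieto',    'https://example.com/engineers/1'),
  (2, 'Michael', 'Rikmas',   'https://example.com/engineers/2'),
  (3, 'Marcos',  'Albe',     'https://example.com/engineers/3'),
  (4, 'Valerii', 'Kravchuk', 'https://example.com/engineers/4');

INSERT INTO customers (id, company_name, url) VALUES
  (1, 'OT', 'https://example.com/customers/1'),
  (2, 'PZ', 'https://example.com/customers/2'),
  (3, 'VK', 'https://example.com/customers/3');

INSERT INTO issues (id, customer_id, description) VALUES
  (1, 1, 'Fix replication'),
  (2, 2, 'Help with installation of Percona Cluster'),
  (3, 3, 'Hardware suggestions');

INSERT INTO workflow (engineer_id, issue_id) VALUES
  (1, 1), (2, 2), (3, 3), (2, 3);
```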

Examples:

List of issues for each engineer (GROUP_CONCAT):
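A query along these lines could produce that list. It is a sketch that assumes the engineers, workflow and issues tables described earlier, with the column names spelled as in the comment below:

```sql
-- Assumed columns: engineers(id, name, surname),
--                  workflow(engineer_id, issue_id),
--                  issues(id, description).
SELECT CONCAT(e.name, ' ', e.surname) AS engineer,
       GROUP_CONCAT(DISTINCT i.description
                    ORDER BY i.description
                    SEPARATOR '; ') AS issues
FROM engineers AS e
LEFT JOIN workflow AS w ON w.engineer_id = e.id
LEFT JOIN issues   AS i ON i.id = w.issue_id
GROUP BY e.id, e.name, e.surname
ORDER BY engineer;
```

The LEFT JOINs keep engineers who have no workflow rows (for example, someone on vacation); for them GROUP_CONCAT returns NULL.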

List of engineers for each customer (GROUP_CONCAT inside of GROUP_CONCAT):
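MySQL does not allow one aggregate call to be nested directly inside another, so one possible reading of "GROUP_CONCAT inside of GROUP_CONCAT" is an inner query that concatenates engineers per issue and an outer query that concatenates those per-issue strings per customer. The sketch below assumes the four tables described earlier; the column names, aliases and separators are illustrative choices.

```sql
-- Assumed columns: customers(id, company_name),
--                  issues(id, customer_id, description),
--                  workflow(engineer_id, issue_id),
--                  engineers(id, name, surname).
SELECT c.company_name AS customer,
       GROUP_CONCAT(per_issue.line
                    ORDER BY per_issue.issue_id
                    SEPARATOR ' | ') AS issues_and_engineers
FROM customers AS c
JOIN (
    -- Inner GROUP_CONCAT: one row per issue with its engineers in one string.
    SELECT i.id AS issue_id,
           i.customer_id,
           CONCAT(i.description, ': ',
                  GROUP_CONCAT(DISTINCT CONCAT(e.name, ' ', e.surname)
                               ORDER BY CONCAT(e.name, ' ', e.surname)
                               SEPARATOR ', ')) AS line
    FROM issues AS i
    JOIN workflow  AS w ON w.issue_id = i.id
    JOIN engineers AS e ON e.id = w.engineer_id
    GROUP BY i.id, i.customer_id, i.description
) AS per_issue ON per_issue.customer_id = c.id
GROUP BY c.id, c.company_name
ORDER BY customer;
```

Because the outer query uses an inner JOIN, customers without any worked issues are left out of the result.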

PHP/HTML? Why not? It’s easy :)

Source Code:

Result:

ID | Customer | Description | Engineers
1 | OT | Fix replication | Miguel Nieto
2 | PZ | Help with installation of Percona Cluster | Michael Rikmas
3 | VK | Hardware suggestions | Marcos Albe, Michael Rikmas
4 | FD | Error: no space left | Marcos Albe, Michael Rikmas, Miguel Nieto, Valerii Kravchuk
5 | AS | Help with setup daily backup by Xtrabackup | Marcos Albe, Miguel Nieto, Valerii Kravchuk
6 | SS | Poke sales about Support agreement renewal | Marcos Albe
7 | FD | Add more accounts for customer | Miguel Nieto, Valerii Kravchuk
8 | PZ | Create Hot Fix of Bug 1040735 | Marcos Albe, Michael Rikmas
9 | OT | Query optimisation | Marcos Albe, Miguel Nieto
10 | OT | Prepare custom build for Solaris | Miguel Nieto, Valerii Kravchuk
11 | PZ | explain about Percona Monitoring plugins | Valerii Kravchuk
12 | SS | Prepare access for customer servers for future work | Marcos Albe
13 | AS | Decribe load balancing for pt-online-schema-change | Marcos Albe
14 | FD | Managing deadlocks | Michael Rikmas, Valerii Kravchuk
15 | OT | Suggestions about buffer pool size | Marcos Albe, Miguel Nieto

That’s the power of GROUP_CONCAT!

About Michael Rikmas

Michael joined Percona in October 2007. He serves in several roles, including Percona's 24x7 support coverage. He has an undergraduate degree in computer science, and in 2010 he started pursuing studies to earn an MBA.

Comments

  1. SiteKickr says:

    Good call on this article, I almost never hear anything about this really useful SQL aggregate function.

  2. Gimmer says:

    Always worth remembering to set the session variable group_concat_max_len to a higher number if you are grouping excessively long lists

  3. Peter (Stig) Edwards says:

    One thing to watch out for with GROUP_CONCAT (and ORDER BY) is how it can result in tmp tables on disk, the example query to list issues for each engineer above causes a tmp table on disk to be created:

    show session status like 'Created_tmp_disk_tables';
    +-------------------------+-------+
    | Variable_name           | Value |
    +-------------------------+-------+
    | Created_tmp_disk_tables | 1     |
    +-------------------------+-------+

    set session group_concat_max_len=512;

    Stops the tmp table being created on disk. So does using ORDER BY NULL or removing the ORDER BY. Tested with mariadb 5.3.12, 10.0.4 and 5.6.13-rel61.0 Percona Server with XtraDB.

    http://dev.mysql.com/doc/refman/5.5/en/group-by-functions.html#function_group-concat

    The result type is TEXT or BLOB unless group_concat_max_len is less than or equal to 512, in which case the result type is VARCHAR or VARBINARY.

    http://dev.mysql.com/doc/refman/5.5/en/internal-temporary-tables.html

    Some conditions prevent the use of an in-memory temporary table, in which case the server uses an on-disk table instead:

    Presence of a BLOB or TEXT column in the table

    Presence of any string column in a GROUP BY or DISTINCT clause larger than 512 bytes

    Presence of any string column with a maximum length larger than 512 (bytes for binary strings, characters for nonbinary strings) in the SELECT list, if UNION or UNION ALL is used

    Also see:

    http://bugs.mysql.com/bug.php?id=14169 – When using GROUP_CONCAT() function with group_concat_max_len > 512 then the field type will be BLOB if ORDER BY is used, otherwise it will be VARCHAR.

    http://www.mysqlperformanceblog.com/2007/08/16/how-much-overhead-is-caused-by-on-disk-temporary-tables/

  4. Thanks, Peter, for the comment. This is very interesting info which should be taken into account.

  5. Roman Vynar says:

    Great topic, thanks!

  6. shamin says:

    What's the max length I can set for "SET SESSION group_concat_max_len=15000000;"? Is there any limit for the variable group_concat_max_len?

    thanks in advance.

  7. Michael Rikmas says:

    Shamin,

    Manual says it’s:
    for 32-bit systems: 4294967295
    for 64-bit systems: 18446744073709547520

    You can see details here:
    https://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_group_concat_max_len

  8. Thank you sooooooooooooooooooooooooooooo many much, It Rocks

  9. Thank you very much for the article with great examples.
