<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Blob Storage in Innodb</title>
	<atom:link href="http://www.mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb/</link>
	<description>Percona&#039;s MySQL &#38; InnoDB performance and scalability blog</description>
	<lastBuildDate>Sat, 11 Feb 2012 16:45:54 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: Tobias</title>
		<link>http://www.mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb/comment-page-1/#comment-766179</link>
		<dc:creator>Tobias</dc:creator>
		<pubDate>Sat, 05 Jun 2010 12:44:06 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2207#comment-766179</guid>
		<description>Thanks Peter,

finally a clear statement on storing images as blobs. :-) 
I have been looking for this single-line answer for quite a long time now...

Tobias :-)</description>
		<content:encoded><![CDATA[<p>Thanks Peter,</p>
<p>finally a clear statement on storing images as blobs. <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /><br />
I have been looking for this single-line answer for quite a long time now&#8230;</p>
<p>Tobias <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dani</title>
		<link>http://www.mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb/comment-page-1/#comment-726321</link>
		<dc:creator>Dani</dc:creator>
		<pubDate>Wed, 17 Feb 2010 20:08:26 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2207#comment-726321</guid>
		<description>@Tony -- Why store blobs in the DB as opposed to plain files in a filesystem?

I agree with you in general, but there are applications where storing tons of files in a filesystem is bad.

I ran into problems having many thousands (millions?) of files, where I ran out of inodes in the filesystem I was using. Besides disk size, I did not want to have to worry about keeping track of inodes, or about extra mkfs options to increase inode counts, whenever deploying a new server or increasing disk or logical volume size.
If you split things into multiple subdirectories then you also have to ensure your application(s) maintain those directories -- create new, delete old when empty, etc.
Also doing an &#039;ls&#039; on a huge directory will tie things up on your server. 

The more efficient thing to do is:
A) Create 1 table that has ONLY an id sequence column set as the primary key, and a 2nd column for the blob. This table could conceivably be in a different database if you so choose.

B) All other metadata about this blob must be held in a separate table, and the id column used to associate rows (1-to-1 relationship).

C) Do your hefty queries (with joins, complex where clauses, etc) ONLY against the metadata table, and when you actually need the blob data, query just that single row from your blob table.
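
A minimal sketch of steps A) to C) in Python. It uses the built-in sqlite3 module purely as a runnable stand-in, so it is not InnoDB, and the table and column names (blob_store, blob_meta, head) are hypothetical illustrations rather than anything from the post:

```python
import sqlite3

# In-memory SQLite database, used only as a runnable stand-in for MySQL/InnoDB.
conn = sqlite3.connect(":memory:")

# A) Blob-only table: an id primary key plus the blob column, nothing else.
conn.execute("CREATE TABLE blob_store (id INTEGER PRIMARY KEY, data BLOB NOT NULL)")

# B) Metadata table, linked 1-to-1 through the same id (hypothetical columns).
conn.execute(
    "CREATE TABLE blob_meta ("
    " id INTEGER PRIMARY KEY REFERENCES blob_store(id),"
    " filename TEXT NOT NULL,"
    " size_bytes INTEGER NOT NULL,"
    " head BLOB NOT NULL)"  # first 50 bytes of the blob, for searching
)


def store(filename, payload):
    """Insert the blob and its metadata row under one shared id."""
    cur = conn.execute("INSERT INTO blob_store (data) VALUES (?)", (payload,))
    blob_id = cur.lastrowid
    conn.execute(
        "INSERT INTO blob_meta (id, filename, size_bytes, head) VALUES (?, ?, ?, ?)",
        (blob_id, filename, len(payload), payload[:50]),
    )
    return blob_id


def find_ids(name_pattern):
    """C) Heavy filtering touches only the metadata table."""
    rows = conn.execute(
        "SELECT id FROM blob_meta WHERE filename LIKE ? ORDER BY id", (name_pattern,)
    )
    return [row[0] for row in rows]


def fetch(blob_id):
    """Fetch one blob by primary key, only when the data is actually needed."""
    row = conn.execute("SELECT data FROM blob_store WHERE id = ?", (blob_id,)).fetchone()
    return row[0] if row else None


store("logo.png", b"\x89PNG" + b"\x00" * 100)
store("notes.txt", b"hello blob storage")
png_ids = find_ids("%.png")
print(png_ids, len(fetch(png_ids[0])))
```

The search never reads the blob column; the blob table is only touched by primary key once a specific id is known.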

What you achieve:
1) A single large file in the filesystem for that 1 InnoDB table that holds the blob. Queries against this table will use the primary index, and will only pull from the actual table what you need.
If disk space usage is important, use row compression and pay the CPU price on your DB server, or have your application compress/decompress upon INSERT/SELECT to pay that price on your app. server.

2) A well-designed meta-data table will contain everything you need to know about that blob, and you could design things to even contain a field that is, say, the first 50 bytes of your blob -- if you need that for indexing or searching.

If you think you need multiple blob columns, then as Peter suggested, consider combining them into 1 blob and splitting it up on the application side. That might not be so easy either, so maybe use multiple &quot;blob-only&quot; tables that have just a few blob columns each -- fewer than 10, the point at which he said it gets problematic.</description>
		<content:encoded><![CDATA[<p>@Tony &#8212; Why store blobs in the DB as opposed to plain files in a filesystem?</p>
<p>I agree with you in general, but there are applications where storing tons of files in a filesystem is bad.</p>
<p>I ran into problems having many thousands (millions?) of files, where I ran out of inodes in the filesystem I was using. Besides disk size, I did not want to have to worry about keeping track of inodes, or about extra mkfs options to increase inode counts, whenever deploying a new server or increasing disk or logical volume size.<br />
If you split things into multiple subdirectories then you also have to ensure your application(s) maintain those directories &#8212; create new, delete old when empty, etc.<br />
Also doing an &#8216;ls&#8217; on a huge directory will tie things up on your server. </p>
<p>The more efficient thing to do is:<br />
A) Create 1 table that has ONLY an id sequence column set as the primary key, and a 2nd column for the blob. This table could conceivably be in a different database if you so choose.</p>
<p>B) All other metadata about this blob must be held in a separate table, and the id column used to associate rows (1-to-1 relationship).</p>
<p>C) Do your hefty queries (with joins, complex where clauses, etc) ONLY against the metadata table, and when you actually need the blob data, query just that single row from your blob table.</p>
<p>What you achieve:<br />
1) A single large file in the filesystem for that 1 InnoDB table that holds the blob. Queries against this table will use the primary index, and will only pull from the actual table what you need.<br />
If disk space usage is important, use row compression and pay the CPU price on your DB server, or have your application compress/decompress upon INSERT/SELECT to pay that price on your app. server.</p>
<p>2) A well-designed meta-data table will contain everything you need to know about that blob, and you could design things to even contain a field that is, say, the first 50 bytes of your blob &#8212; if you need that for indexing or searching.</p>
<p>If you think you need multiple blob columns, then as Peter suggested, consider combining them into 1 blob and splitting it up on the application side. That might not be so easy either, so maybe use multiple &#8220;blob-only&#8221; tables that have just a few blob columns each &#8212; fewer than 10, the point at which he said it gets problematic.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: paul</title>
		<link>http://www.mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb/comment-page-1/#comment-725786</link>
		<dc:creator>paul</dc:creator>
		<pubDate>Tue, 16 Feb 2010 16:10:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2207#comment-725786</guid>
		<description>Hi Peter,

re: &quot;If all columns do not fit to the page completely Innodb will automatically chose some of them to be on the page and some stored externally. This is not clearly documented neither can be hinted or seen.&quot;

Last year as a result of docs Bug#29042 (http://bugs.mysql.com/bug.php?id=29042) we did identify that the manual indeed implied some of the misconceptions to which you allude. We tried to make some improvements in the discussion. See:

http://dev.mysql.com/doc/refman/5.1/en/innodb-file-space.html:

&quot;The maximum row length, except for variable-length columns (VARBINARY, VARCHAR, BLOB and TEXT), is slightly less than half of a database page. That is, the maximum row length is about 8000 bytes. LONGBLOB and LONGTEXT columns must be less than 4GB, and the total row length, including BLOB and TEXT columns, must be less than 4GB.

If a row is less than half a page long, all of it is stored locally within the page. If it exceeds half a page, variable-length columns are chosen for external off-page storage until the row fits within half a page. For a column chosen for off-page storage, InnoDB stores the first 768 bytes locally in the row, and the rest externally into overflow pages. Each such column has its own list of overflow pages. The 768-byte prefix is accompanied by a 20-byte value that stores the true length of the column and points into the overflow list where the rest of the value is stored.&quot;
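
A rough sketch of that rule in Python, for illustration only. The 768 and 20 byte figures and the roughly 8000 byte limit come from the quoted text; moving the longest column first is an assumption (the quoted text does not say in which order columns are chosen), and the function name and column names are made up:

```python
HALF_PAGE_LIMIT = 8000   # "about 8000 bytes" in the quoted manual text
LOCAL_PREFIX = 768       # bytes of an off-page column kept locally in the row
POINTER_SIZE = 20        # bytes for the true length and overflow-list pointer


def choose_off_page(fixed_bytes, var_columns):
    """Return (column names moved off-page, resulting local row size).

    var_columns maps a column name to the length of its value in bytes.
    Assumption: the longest column is moved first.
    """
    stub = LOCAL_PREFIX + POINTER_SIZE
    row_size = fixed_bytes + sum(var_columns.values())
    off_page = []
    by_length = sorted(var_columns.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_length:
        # Stop once the row fits, or once moving a column would not shrink it.
        if HALF_PAGE_LIMIT >= row_size or stub >= size:
            break
        row_size -= size - stub
        off_page.append(name)
    return off_page, row_size


result = choose_off_page(100, {"title": 200, "body": 10000, "thumb": 5000})
print(result)
```

Here only the longest column has to move: once it is reduced to its 788-byte local stub, the rest of the row fits under the limit.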

That doesn&#039;t address newer storage formats provided by InnoDB Plugin, of course. Integrating Plugin information into the manual is a longer-term project still in progress.

Very interesting article, thanks for writing it.</description>
		<content:encoded><![CDATA[<p>Hi Peter,</p>
<p>re: &#8220;If all columns do not fit to the page completely Innodb will automatically chose some of them to be on the page and some stored externally. This is not clearly documented neither can be hinted or seen.&#8221;</p>
<p>Last year as a result of docs Bug#29042 (<a href="http://bugs.mysql.com/bug.php?id=29042" rel="nofollow">http://bugs.mysql.com/bug.php?id=29042</a>) we did identify that the manual indeed implied some of the misconceptions to which you allude. We tried to make some improvements in the discussion. See:</p>
<p><a href="http://dev.mysql.com/doc/refman/5.1/en/innodb-file-space.html" rel="nofollow">http://dev.mysql.com/doc/refman/5.1/en/innodb-file-space.html</a>:</p>
<p>&#8220;The maximum row length, except for variable-length columns (VARBINARY, VARCHAR, BLOB and TEXT), is slightly less than half of a database page. That is, the maximum row length is about 8000 bytes. LONGBLOB and LONGTEXT columns must be less than 4GB, and the total row length, including BLOB and TEXT columns, must be less than 4GB.</p>
<p>If a row is less than half a page long, all of it is stored locally within the page. If it exceeds half a page, variable-length columns are chosen for external off-page storage until the row fits within half a page. For a column chosen for off-page storage, InnoDB stores the first 768 bytes locally in the row, and the rest externally into overflow pages. Each such column has its own list of overflow pages. The 768-byte prefix is accompanied by a 20-byte value that stores the true length of the column and points into the overflow list where the rest of the value is stored.&#8221;</p>
<p>That doesn&#8217;t address newer storage formats provided by InnoDB Plugin, of course. Integrating Plugin information into the manual is a longer-term project still in progress.</p>
<p>Very interesting article, thanks for writing it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb/comment-page-1/#comment-725365</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Mon, 15 Feb 2010 17:02:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2207#comment-725365</guid>
		<description>Tony,

You&#039;re right. Storing images in MySQL is in most cases not a good idea, whether with the MyISAM or the InnoDB storage engine.  I&#039;m not sure how this post made you feel it is being recommended.</description>
		<content:encoded><![CDATA[<p>Tony,</p>
<p>You&#8217;re right. Storing images in MySQL is in most cases not a good idea, whether with the MyISAM or the InnoDB storage engine.  I&#8217;m not sure how this post made you feel it is being recommended.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tony</title>
		<link>http://www.mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb/comment-page-1/#comment-725107</link>
		<dc:creator>Tony</dc:creator>
		<pubDate>Mon, 15 Feb 2010 02:23:06 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2207#comment-725107</guid>
		<description>I seem to be missing the point??  WHY, WHY, WHY would you save images in your DB when saving the link/URL in a simple varchar field and letting the OS do the work on the image directory is MUCH more efficient??  BLOB, really?</description>
		<content:encoded><![CDATA[<p>I seem to be missing the point??  WHY, WHY, WHY would you save images in your DB when saving the link/URL in a simple varchar field and letting the OS do the work on the image directory is MUCH more efficient??  BLOB, really?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Shlomi Noach</title>
		<link>http://www.mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb/comment-page-1/#comment-722954</link>
		<dc:creator>Shlomi Noach</dc:creator>
		<pubDate>Thu, 11 Feb 2010 05:08:15 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2207#comment-722954</guid>
		<description>Weird. This post does not show up on planetmysql.</description>
		<content:encoded><![CDATA[<p>Weird. This post does not show up on planetmysql.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb/comment-page-1/#comment-722652</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Wed, 10 Feb 2010 17:51:41 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2207#comment-722652</guid>
		<description>Ivan,

Right, MySQL allocates memory for the full blob; furthermore, there may be several copies of the blob made during allocation and conversion.  I would not store 1GB blobs in MySQL.  I think some 64M is the practical upper size limit, and if you&#039;re using blobs larger than 500K I would really think about whether you&#039;re making the right choice.</description>
		<content:encoded><![CDATA[<p>Ivan,</p>
<p>Right, MySQL allocates memory for the full blob; furthermore, there may be several copies of the blob made during allocation and conversion.  I would not store 1GB blobs in MySQL.  I think some 64M is the practical upper size limit, and if you&#8217;re using blobs larger than 500K I would really think about whether you&#8217;re making the right choice.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb/comment-page-1/#comment-722645</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Wed, 10 Feb 2010 17:40:02 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2207#comment-722645</guid>
		<description>Vince,

Many people use MySQL for BLOB/TEXT fields to keep them accessible together with other data as well as to get MySQL backup and recovery, replication, ACID guarantees etc. 

If this is not needed, file storage can be a better choice (especially when blobs are really files such as images, PDF documents etc).  You can also use various NoSQL storage solutions.  BlobStreaming with PBXT is another interesting solution.</description>
		<content:encoded><![CDATA[<p>Vince,</p>
<p>Many people use MySQL for BLOB/TEXT fields to keep them accessible together with other data as well as to get MySQL backup and recovery, replication, ACID guarantees etc. </p>
<p>If this is not needed, file storage can be a better choice (especially when blobs are really files such as images, PDF documents etc).  You can also use various NoSQL storage solutions.  BlobStreaming with PBXT is another interesting solution.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb/comment-page-1/#comment-722644</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Wed, 10 Feb 2010 17:36:46 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2207#comment-722644</guid>
		<description>Shlomi,

As I understand it, it is only the TAIL which gets compressed.  So if you have a 10K blob (does not fit into the page) which can be compressed to 4K (would fit in the page), it will still use an external page: 768 bytes out of the 10K will be stored on the page and then the remaining 9K will be compressed to, say, 3K and stored on the 16K external page.  Whether 3K or 9K is used on a 16K page does not really make a lot of difference.

Where it does make a difference is with even larger blobs, which can use fewer external pages when compressed.  Check this out:

mysql&gt; create table comptest(b mediumblob);
Query OK, 0 rows affected (0.05 sec)
mysql&gt; insert into comptest values (repeat(&#039;a&#039;,1000000));
Query OK, 1 row affected (0.02 sec)


mysql&gt; show table status like &quot;comptest&quot; \G
*************************** 1. row ***************************
           Name: comptest
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 1
 Avg_row_length: 1589248
    Data_length: 1589248
Max_data_length: 0
   Index_length: 0
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2010-02-10 09:20:46
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)


mysql&gt; alter table comptest row_format=compressed;
Query OK, 1 row affected (0.62 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql&gt; show table status like &quot;comptest&quot; \G
*************************** 1. row ***************************
           Name: comptest
         Engine: InnoDB
        Version: 10
     Row_format: Compressed
           Rows: 1
 Avg_row_length: 32768
    Data_length: 32768
Max_data_length: 0
   Index_length: 0
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2010-02-10 09:21:08
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options: row_format=COMPRESSED
        Comment:
1 row in set (0.00 sec)

In this case this highly compressible long blob gets some 50 times smaller :)
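
For reference, the figures from the two SHOW TABLE STATUS outputs above can be checked with a few lines of Python. Only the numbers shown in the output are used; the variable names are just for illustration:

```python
PAGE_SIZE = 16 * 1024          # 16K InnoDB page
BLOB_BYTES = 1_000_000         # repeat('a', 1000000)
COMPACT_BYTES = 1_589_248      # Data_length with Row_format: Compact
COMPRESSED_BYTES = 32_768      # Data_length with Row_format: Compressed

# Space used by the uncompressed table, expressed in 16K pages.
compact_pages = COMPACT_BYTES // PAGE_SIZE
# How much smaller the table became after the ALTER TABLE.
ratio = COMPACT_BYTES / COMPRESSED_BYTES
# How much more space the Compact table uses than the raw blob payload.
overhead = COMPACT_BYTES / BLOB_BYTES

print(f"compact table: {compact_pages} pages of 16K")
print(f"compact vs compressed: {ratio:.1f}x")
print(f"compact vs raw payload: {overhead:.2f}x")
```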

Also note this less-than-1MB blob takes about 1.5M. This is because space allocation for BLOBs happens the same way as for other InnoDB segments: allocation happens in 16K pages for the first few pages and then space is allocated in 1MB blocks, which means &quot;waste&quot; can be a lot more than 16K.

This is actually a strange decision for blobs, as blobs never grow and could be allocated to exact size.  If a blob changes, it is stored in a new space.</description>
		<content:encoded><![CDATA[<p>Shlomi,</p>
<p>As I understand it, it is only the TAIL which gets compressed.  So if you have a 10K blob (does not fit into the page) which can be compressed to 4K (would fit in the page), it will still use an external page: 768 bytes out of the 10K will be stored on the page and then the remaining 9K will be compressed to, say, 3K and stored on the 16K external page.  Whether 3K or 9K is used on a 16K page does not really make a lot of difference. </p>
<p>Where it does make a difference is with even larger blobs, which can use fewer external pages when compressed.  Check this out:</p>
<p>mysql> create table comptest(b mediumblob);<br />
Query OK, 0 rows affected (0.05 sec)<br />
mysql> insert into comptest values (repeat(&#8216;a&#8217;,1000000));<br />
Query OK, 1 row affected (0.02 sec)</p>
<p>mysql> show table status like &#8220;comptest&#8221; \G<br />
*************************** 1. row ***************************<br />
           Name: comptest<br />
         Engine: InnoDB<br />
        Version: 10<br />
     Row_format: Compact<br />
           Rows: 1<br />
 Avg_row_length: 1589248<br />
    Data_length: 1589248<br />
Max_data_length: 0<br />
   Index_length: 0<br />
      Data_free: 0<br />
 Auto_increment: NULL<br />
    Create_time: 2010-02-10 09:20:46<br />
    Update_time: NULL<br />
     Check_time: NULL<br />
      Collation: latin1_swedish_ci<br />
       Checksum: NULL<br />
 Create_options:<br />
        Comment:<br />
1 row in set (0.00 sec)</p>
<p>mysql> alter table comptest row_format=compressed;<br />
Query OK, 1 row affected (0.62 sec)<br />
Records: 1  Duplicates: 0  Warnings: 0</p>
<p>mysql> show table status like &#8220;comptest&#8221; \G<br />
*************************** 1. row ***************************<br />
           Name: comptest<br />
         Engine: InnoDB<br />
        Version: 10<br />
     Row_format: Compressed<br />
           Rows: 1<br />
 Avg_row_length: 32768<br />
    Data_length: 32768<br />
Max_data_length: 0<br />
   Index_length: 0<br />
      Data_free: 0<br />
 Auto_increment: NULL<br />
    Create_time: 2010-02-10 09:21:08<br />
    Update_time: NULL<br />
     Check_time: NULL<br />
      Collation: latin1_swedish_ci<br />
       Checksum: NULL<br />
 Create_options: row_format=COMPRESSED<br />
        Comment:<br />
1 row in set (0.00 sec)</p>
<p>In this case this highly compressible long blob gets some 50 times smaller <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Also note this less-than-1MB blob takes about 1.5M. This is because space allocation for BLOBs happens the same way as for other InnoDB segments: allocation happens in 16K pages for the first few pages and then space is allocated in 1MB blocks, which means &#8220;waste&#8221; can be a lot more than 16K.</p>
<p>This is actually a strange decision for blobs, as blobs never grow and could be allocated to exact size.  If a blob changes, it is stored in a new space.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ivan Novick</title>
		<link>http://www.mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb/comment-page-1/#comment-722641</link>
		<dc:creator>Ivan Novick</dc:creator>
		<pubDate>Wed, 10 Feb 2010 17:16:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2207#comment-722641</guid>
		<description>Also note for very large blobs, as far as I recall, InnoDB allocates memory for the entire blob to be returned to the user.  So imagine a blob that is 1 GB in size and you have 25 concurrent requests for a blob of that size: InnoDB will need to allocate 25 GB of memory to satisfy these requests.  See: http://mysqlinsights.blogspot.com/2009/01/mysql-blobs-and-memory-allocation.html
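
The arithmetic, as a small Python sketch. The helper name and the copies_per_request parameter are hypothetical, added only to show how extra in-memory copies would multiply the total:

```python
GIB = 1024 ** 3


def peak_blob_memory(blob_bytes, concurrent_requests, copies_per_request=1):
    """Bytes needed if every request holds the full blob in memory at once."""
    return blob_bytes * concurrent_requests * copies_per_request


needed = peak_blob_memory(1 * GIB, 25)
print(f"{needed / GIB:.0f} GiB")
```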

The people at PBXT have created a solution for their storage engine using a streaming blob protocol, but this requires a change to the MySQL client as far as I know:  http://www.blobstreaming.org/</description>
		<content:encoded><![CDATA[<p>Also note for very large blobs, as far as I recall, InnoDB allocates memory for the entire blob to be returned to the user.  So imagine a blob that is 1 GB in size and you have 25 concurrent requests for a blob of that size: InnoDB will need to allocate 25 GB of memory to satisfy these requests.  See: <a href="http://mysqlinsights.blogspot.com/2009/01/mysql-blobs-and-memory-allocation.html" rel="nofollow">http://mysqlinsights.blogspot.com/2009/01/mysql-blobs-and-memory-allocation.html</a></p>
<p>The people at PBXT have created a solution for their storage engine using a streaming blob protocol, but this requires a change to the MySQL client as far as I know:  <a href="http://www.blobstreaming.org/" rel="nofollow">http://www.blobstreaming.org/</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>

