<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: When should you store serialized objects in the database?</title>
	<atom:link href="http://www.mysqlperformanceblog.com/2010/01/21/when-should-you-store-serialized-objects-in-the-database/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mysqlperformanceblog.com/2010/01/21/when-should-you-store-serialized-objects-in-the-database/</link>
	<description>Everything about MySQL Performance</description>
	<lastBuildDate>Thu, 29 Jul 2010 19:06:57 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=1836</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Diego</title>
		<link>http://www.mysqlperformanceblog.com/2010/01/21/when-should-you-store-serialized-objects-in-the-database/comment-page-1/#comment-720709</link>
		<dc:creator>Diego</dc:creator>
		<pubDate>Sat, 06 Feb 2010 20:20:07 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2128#comment-720709</guid>
		<description>Great post and discussion. IMO, unless someone is really uncomfortable with trying new DBs (which are somewhat less proven than MySQL), they shouldn&#039;t use MySQL that way. It&#039;s harder to administer than any NoSQL DB, and it&#039;s just not the best tool for storing key/value data.</description>
		<content:encoded><![CDATA[<p>Great post and discussion. IMO, unless someone is really uncomfortable with trying new DBs (which are somewhat less proven than MySQL), they shouldn&#8217;t use MySQL that way. It&#8217;s harder to administer than any NoSQL DB, and it&#8217;s just not the best tool for storing key/value data.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Leo Petr</title>
		<link>http://www.mysqlperformanceblog.com/2010/01/21/when-should-you-store-serialized-objects-in-the-database/comment-page-1/#comment-718949</link>
		<dc:creator>Leo Petr</dc:creator>
		<pubDate>Thu, 04 Feb 2010 00:16:12 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2128#comment-718949</guid>
		<description>Hi, I&#039;m dropping by via the High Scalability blog.

This is a neat technique.

Something similar but much simpler can be done in IBM DB2. DB2 has a built-in, indexable XML column type. There&#039;s native support for XPath and XQuery, so you can store a clob with arbitrary fields serialized as XML and then do SQL queries with XPath to extract arbitrary fields, run aggregation functions on them, etc. Effectively, this lets you do exactly the same thing except without the opaqueness and with potentially higher performance depending on what you want.

This is included in the free edition (ibm.com/db2/express/)

Disclaimer: I work on the DB2 team, the opinions are my own, etc.</description>
		<content:encoded><![CDATA[<p>Hi, I&#8217;m dropping by via the High Scalability blog.</p>
<p>This is a neat technique.</p>
<p>Something similar but much simpler can be done in IBM DB2. DB2 has a built-in, indexable XML column type. There&#8217;s native support for XPath and XQuery, so you can store a clob with arbitrary fields serialized as XML and then do SQL queries with XPath to extract arbitrary fields, run aggregation functions on them, etc. Effectively, this lets you do exactly the same thing except without the opaqueness and with potentially higher performance depending on what you want.</p>
<p>This is included in the free edition (ibm.com/db2/express/)</p>
<p>Disclaimer: I work on the DB2 team, the opinions are my own, etc.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Baron Schwartz</title>
		<link>http://www.mysqlperformanceblog.com/2010/01/21/when-should-you-store-serialized-objects-in-the-database/comment-page-1/#comment-714804</link>
		<dc:creator>Baron Schwartz</dc:creator>
		<pubDate>Wed, 27 Jan 2010 01:50:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2128#comment-714804</guid>
		<description>This is completely OT and not meant to be trolling, but one of the things I&#039;ve always thought Postgres could improve is making its data files less architecture-dependent.</description>
		<content:encoded><![CDATA[<p>This is completely OT and not meant to be trolling, but one of the things I&#8217;ve always thought Postgres could improve is making its data files less architecture-dependent.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rob Wultsch</title>
		<link>http://www.mysqlperformanceblog.com/2010/01/21/when-should-you-store-serialized-objects-in-the-database/comment-page-1/#comment-714776</link>
		<dc:creator>Rob Wultsch</dc:creator>
		<pubDate>Wed, 27 Jan 2010 00:01:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2128#comment-714776</guid>
		<description>Mark:
&quot;All table rows are structured in the same way. There is a fixed-size header (occupying 23 bytes on most machines), followed by an optional null bitmap, an optional object ID field, and the user data. The header is detailed in Table 53-4. The actual user data (columns of the row) begins at the offset indicated by t_hoff, which must always be a multiple of the MAXALIGN distance for the platform. The null bitmap is only present if the HEAP_HASNULL bit is set in t_infomask. If it is present it begins just after the fixed header and occupies enough bytes to have one bit per data column (that is, t_natts bits altogether). In this list of bits, a 1 bit indicates not-null, a 0 bit is a null. When the bitmap is not present, all columns are assumed not-null. The object ID is only present if the HEAP_HASOID bit is set in t_infomask. If present, it appears just before the t_hoff boundary. Any padding needed to make t_hoff a MAXALIGN multiple will appear between the null bitmap and the object ID. (This in turn ensures that the object ID is suitably aligned.) &quot;
http://www.postgresql.org/docs/8.4/interactive/storage-page-layout.html</description>
		<content:encoded><![CDATA[<p>Mark:<br />
&#8220;All table rows are structured in the same way. There is a fixed-size header (occupying 23 bytes on most machines), followed by an optional null bitmap, an optional object ID field, and the user data. The header is detailed in Table 53-4. The actual user data (columns of the row) begins at the offset indicated by t_hoff, which must always be a multiple of the MAXALIGN distance for the platform. The null bitmap is only present if the HEAP_HASNULL bit is set in t_infomask. If it is present it begins just after the fixed header and occupies enough bytes to have one bit per data column (that is, t_natts bits altogether). In this list of bits, a 1 bit indicates not-null, a 0 bit is a null. When the bitmap is not present, all columns are assumed not-null. The object ID is only present if the HEAP_HASOID bit is set in t_infomask. If present, it appears just before the t_hoff boundary. Any padding needed to make t_hoff a MAXALIGN multiple will appear between the null bitmap and the object ID. (This in turn ensures that the object ID is suitably aligned.) &#8221;<br />
<a href="http://www.postgresql.org/docs/8.4/interactive/storage-page-layout.html" rel="nofollow">http://www.postgresql.org/docs/8.4/interactive/storage-page-layout.html</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Brian Cavanagh</title>
		<link>http://www.mysqlperformanceblog.com/2010/01/21/when-should-you-store-serialized-objects-in-the-database/comment-page-1/#comment-714668</link>
		<dc:creator>Brian Cavanagh</dc:creator>
		<pubDate>Tue, 26 Jan 2010 15:53:14 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2128#comment-714668</guid>
		<description>Yeah, PostgreSQL might be a good choice if you can get around the lousy replication support, as it will let you store arrays and other data types natively in the fields, so you don&#039;t have to worry about an inaccessible object model.</description>
		<content:encoded><![CDATA[<p>Yeah, PostgreSQL might be a good choice if you can get around the lousy replication support, as it will let you store arrays and other data types natively in the fields, so you don&#8217;t have to worry about an inaccessible object model.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mark Callaghan</title>
		<link>http://www.mysqlperformanceblog.com/2010/01/21/when-should-you-store-serialized-objects-in-the-database/comment-page-1/#comment-713961</link>
		<dc:creator>Mark Callaghan</dc:creator>
		<pubDate>Mon, 25 Jan 2010 02:52:07 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2128#comment-713961</guid>
		<description>Josh - what is the cost of a null column in Postgres? Do possibly null columns require a bit in the row header?</description>
		<content:encoded><![CDATA[<p>Josh &#8211; what is the cost of a null column in Postgres? Do possibly null columns require a bit in the row header?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Josh Berkus</title>
		<link>http://www.mysqlperformanceblog.com/2010/01/21/when-should-you-store-serialized-objects-in-the-database/comment-page-1/#comment-713958</link>
		<dc:creator>Josh Berkus</dc:creator>
		<pubDate>Mon, 25 Jan 2010 02:44:59 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2128#comment-713958</guid>
		<description>I just thought I&#039;d mention where PostgreSQL is for this:  It&#039;s less costly to have null columns in PostgreSQL than in InnoDB, so having lots of null columns is not a reason to use E-Blob.  Postgres is also very good at storing blobs, and compresses them for out-of-line storage automatically.  Also, with GiST, GIN, expression indexes and HStore (and the upcoming HStore-to-JSON) there are indexing options for blobs. So that&#039;s two reasons TO use them.

Reasons to use E-Blob are:

a) the data stored varies wildly per entity, over time, or between customer installations;

b) the data in the blob is almost always retrieved, read, and updated all-at-once due to the application design;

c) the data is not going to be used for aggregation, ad-hoc querying, or response-time-sensitive filtering (since cheap indexes are impossible);

d) the data does not need to be constrained or used to enforce a constraint

Reasons not to use it are:

1) the above-mentioned update cost: update an entire 1K blob to change one value;

2) complete inability to enforce meaningful constraints on the data, thus allowing garbage to creep into the database;   

3) high cost of blanket updates to the data which might be required by application design changes.

Generally, I only consider e-blob for non-essential data which is going to vary by installation, or for specially structured data which is infrequently updated and thus works well with special index types which work with blobs.</description>
		<content:encoded><![CDATA[<p>I just thought I&#8217;d mention where PostgreSQL is for this:  It&#8217;s less costly to have null columns in PostgreSQL than in InnoDB, so having lots of null columns is not a reason to use E-Blob.  Postgres is also very good at storing blobs, and compresses them for out-of-line storage automatically.  Also, with GiST, GIN, expression indexes and HStore (and the upcoming HStore-to-JSON) there are indexing options for blobs. So that&#8217;s two reasons TO use them.</p>
<p>Reasons to use E-Blob are:</p>
<p>a) the data stored varies wildly per entity, over time, or between customer installations;</p>
<p>b) the data in the blob is almost always retrieved, read, and updated all-at-once due to the application design;</p>
<p>c) the data is not going to be used for aggregation, ad-hoc querying, or response-time-sensitive filtering (since cheap indexes are impossible);</p>
<p>d) the data does not need to be constrained or used to enforce a constraint</p>
<p>Reasons not to use it are:</p>
<p>1) the above-mentioned update cost: update an entire 1K blob to change one value;</p>
<p>2) complete inability to enforce meaningful constraints on the data, thus allowing garbage to creep into the database;   </p>
<p>3) high cost of blanket updates to the data which might be required by application design changes.</p>
<p>Generally, I only consider e-blob for non-essential data which is going to vary by installation, or for specially structured data which is infrequently updated and thus works well with special index types which work with blobs.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sean</title>
		<link>http://www.mysqlperformanceblog.com/2010/01/21/when-should-you-store-serialized-objects-in-the-database/comment-page-1/#comment-713728</link>
		<dc:creator>Sean</dc:creator>
		<pubDate>Sun, 24 Jan 2010 11:46:32 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2128#comment-713728</guid>
		<description>Yup, this is a great technique for storing user preferences, e.g. the layout of a page or enabling/disabling certain features. We just store a serialized associative array in the database for several different things on our site, and when the user logs in, we grab it, unserialize it, and store it in a session. Works great. It would be a huge pain in the ass if every time we wanted to add a new option to the set, we had to change the table structure.

As long as your data doesn&#039;t need to be indexed or searchable, this can save a ton of time, short term and long term.</description>
		<content:encoded><![CDATA[<p>Yup, this is a great technique for storing user preferences, e.g. the layout of a page or enabling/disabling certain features. We just store a serialized associative array in the database for several different things on our site, and when the user logs in, we grab it, unserialize it, and store it in a session. Works great. It would be a huge pain in the ass if every time we wanted to add a new option to the set, we had to change the table structure.</p>
<p>As long as your data doesn&#8217;t need to be indexed or searchable, this can save a ton of time, short term and long term.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Baron Schwartz</title>
		<link>http://www.mysqlperformanceblog.com/2010/01/21/when-should-you-store-serialized-objects-in-the-database/comment-page-1/#comment-713212</link>
		<dc:creator>Baron Schwartz</dc:creator>
		<pubDate>Fri, 22 Jan 2010 21:52:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2128#comment-713212</guid>
		<description>Steven, I agree with you about where to do compression.  Your idea about the client automatically handling it is great.  Why not mention it on the Drizzle mailing list?</description>
		<content:encoded><![CDATA[<p>Steven, I agree with you about where to do compression.  Your idea about the client automatically handling it is great.  Why not mention it on the Drizzle mailing list?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Morgan Tocker</title>
		<link>http://www.mysqlperformanceblog.com/2010/01/21/when-should-you-store-serialized-objects-in-the-database/comment-page-1/#comment-713197</link>
		<dc:creator>Morgan Tocker</dc:creator>
		<pubDate>Fri, 22 Jan 2010 20:26:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=2128#comment-713197</guid>
		<description>Robert - MongoDB uses a binary representation of JSON.  They call it BSON:
http://www.mongodb.org/display/DOCS/BSON

If you&#039;re talking about fast formats for MySQL, I liked Jeremy&#039;s suggestion (Google protobuf).</description>
		<content:encoded><![CDATA[<p>Robert &#8211; MongoDB uses a binary representation of JSON.  They call it BSON:<br />
<a href="http://www.mongodb.org/display/DOCS/BSON" rel="nofollow">http://www.mongodb.org/display/DOCS/BSON</a></p>
<p>If you&#8217;re talking about fast formats for MySQL, I liked Jeremy&#8217;s suggestion (Google protobuf).</p>
]]></content:encoded>
	</item>
</channel>
</rss>
