<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: To UUID or not to UUID ?</title>
	<atom:link href="http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/</link>
	<description>Everything about MySQL Performance</description>
	<lastBuildDate>Sat, 07 Nov 2009 18:35:44 -0800</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Josh P</title>
		<link>http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/comment-page-1/#comment-567152</link>
		<dc:creator>Josh P</dc:creator>
		<pubDate>Tue, 26 May 2009 06:51:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/#comment-567152</guid>
		<description>Here is the solution that I&#039;m considering for my project:

1. Concatenate the numeric user id (1-9 digits) and the UNIX timestamp (10 digits) 
--&gt; [e.g. &quot;3333&quot; . &quot;1111111111&quot; = &quot;33331111111111&quot;] ***GUARANTEED UNIQUE.

2. Convert to base-36 
--&gt; [ &quot;33331111111111&quot; --&gt; &quot;btc3myzev&quot;] --- (if I had 100 million users, the longest ID would still only be 12 characters)

3. Store as binary in CHAR().

My best guess is that this strategy is a win-win (for my situation) over GUIDs and INTs.  I have guaranteed-unique ids (since they are tied to the user id and the timestamp) that are available without querying the database. PLUS, they are SIGNIFICANTLY shorter than GUIDs (they are 1/3 the size).

Admittedly, I&#039;m relatively new to highly-scalable database architecture, so I&#039;d appreciate any thoughts or feedback.  Thanks.</description>
		<content:encoded><![CDATA[<p>Here is the solution that I&#8217;m considering for my project:</p>
<p>1. Concatenate the numeric user id (1-9 digits) and the UNIX timestamp (10 digits)<br />
&#8211;&gt; [e.g. "3333" . "1111111111" = "33331111111111"] ***GUARANTEED UNIQUE.</p>
<p>2. Convert to base-36<br />
&#8211;&gt; [ "33331111111111" --&gt; "btc3myzev"] &#8212; (if I had 100 million users, the longest ID would still only be 12 characters)</p>
<p>3. Store as binary in CHAR().</p>
<p>My best guess is that this strategy is a win-win (for my situation) over GUIDs and INTs.  I have guaranteed-unique ids (since they are tied to the user id and the timestamp) that are available without querying the database. PLUS, they are SIGNIFICANTLY shorter than GUIDs (they are 1/3 the size).</p>
<p>Admittedly, I&#8217;m relatively new to highly-scalable database architecture, so I&#8217;d appreciate any thoughts or feedback.  Thanks.</p>
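<p>A minimal Python sketch of the three steps above, for illustration only. The names <code>to_base36</code> and <code>make_id</code> are hypothetical, and the uniqueness guarantee assumes that a given user never generates more than one ID within the same second.</p>

```python
import time
from typing import Optional

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_base36(n: int) -> str:
    """Encode a non-negative integer as a lowercase base-36 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    digits = []
    while n:
        n, rem = divmod(n, 36)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def make_id(user_id: int, timestamp: Optional[int] = None) -> str:
    """Concatenate the user id with a 10-digit UNIX timestamp, then base-36 encode.

    Assumption: at most one ID per user per second, otherwise two IDs collide.
    """
    if timestamp is None:
        timestamp = int(time.time())
    # Zero-pad the timestamp to 10 digits so the concatenation stays unambiguous.
    return to_base36(int(f"{user_id}{timestamp:010d}"))


new_id = make_id(3333, 1111111111)
print(new_id, len(new_id))
# int(new_id, 36) recovers the concatenated decimal number.
```

<p>Python's built-in <code>int(s, 36)</code> reverses the encoding, so the user id and timestamp can be recovered by splitting off the last 10 decimal digits.</p>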
]]></content:encoded>
	</item>
	<item>
		<title>By: Rob</title>
		<link>http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/comment-page-1/#comment-525902</link>
		<dc:creator>Rob</dc:creator>
		<pubDate>Tue, 31 Mar 2009 21:16:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/#comment-525902</guid>
		<description>My biggest gripe about UUID() is that it doesn&#039;t generate random (or at least pseudo-random) values. A reason you would want to use UUID/GUID over auto-incrementing integers is when you&#039;re synchronizing data from multiple disparate sources. If you&#039;re constrained by incremental integers for primary keys, you&#039;re constantly updating IDs on record inserts during synchronization. If the UUID() values were random (like a Guid in .NET), there should be very little chance of collisions.

I would also like to see a UUID data type (which would use 16-byte binary representation instead of VARCHAR(36))</description>
		<content:encoded><![CDATA[<p>My biggest gripe about UUID() is that it doesn&#8217;t generate random (or at least pseudo-random) values. A reason you would want to use UUID/GUID over auto-incrementing integers is when you&#8217;re synchronizing data from multiple disparate sources. If you&#8217;re constrained by incremental integers for primary keys, you&#8217;re constantly updating IDs on record inserts during synchronization. If the UUID() values were random (like a Guid in .NET), there should be very little chance of collisions.</p>
<p>I would also like to see a UUID data type (which would use 16-byte binary representation instead of VARCHAR(36))</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: xli</title>
		<link>http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/comment-page-1/#comment-366358</link>
		<dc:creator>xli</dc:creator>
		<pubDate>Mon, 27 Oct 2008 16:46:50 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/#comment-366358</guid>
		<description>Hi, Peter

I noticed your benchmark, which shows a 200-times performance difference between auto-increment and UUID(). I&#039;m wondering if you did the same thing for InnoDB as well. We are using MySQL 5.0 with InnoDB, and we got bad lock conflicts when inserting rows into InnoDB tables with an auto-increment column, so we are considering switching to UUID() as the PK. I ran a very simple test: I wrote a stored procedure with a loop that inserts rows into a table. Inserting 100,000 rows into InnoDB tables took 254 sec for a table with auto-increment and 263 sec for a table that uses UUID() instead. The same test on MyISAM tables took 54 sec vs. 68 sec; the 54 sec is similar to what you got. What did I do wrong?</description>
		<content:encoded><![CDATA[<p>Hi, Peter</p>
<p>I noticed your benchmark, which shows a 200-times performance difference between auto-increment and UUID(). I&#8217;m wondering if you did the same thing for InnoDB as well. We are using MySQL 5.0 with InnoDB, and we got bad lock conflicts when inserting rows into InnoDB tables with an auto-increment column, so we are considering switching to UUID() as the PK. I ran a very simple test: I wrote a stored procedure with a loop that inserts rows into a table. Inserting 100,000 rows into InnoDB tables took 254 sec for a table with auto-increment and 263 sec for a table that uses UUID() instead. The same test on MyISAM tables took 54 sec vs. 68 sec; the 54 sec is similar to what you got. What did I do wrong?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: links for 2008-10-23 &#171; Object neo = neo Object</title>
		<link>http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/comment-page-1/#comment-364628</link>
		<dc:creator>links for 2008-10-23 &#171; Object neo = neo Object</dc:creator>
		<pubDate>Fri, 24 Oct 2008 04:31:06 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/#comment-364628</guid>
		<description>[...] To UUID or not to UUID ? &#124; MySQL Performance Blog (tags: uuid scalability) [...]</description>
		<content:encoded><![CDATA[<p>[...] To UUID or not to UUID ? | MySQL Performance Blog (tags: uuid scalability) [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Enlaces técnicos recomendados</title>
		<link>http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/comment-page-1/#comment-354050</link>
		<dc:creator>Enlaces técnicos recomendados</dc:creator>
		<pubDate>Wed, 10 Sep 2008 13:12:44 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/#comment-354050</guid>
		<description>[...] To UUID or not to UUID ? from MySQL Performance Blog [...]</description>
		<content:encoded><![CDATA[<p>[...] To UUID or not to UUID ? from MySQL Performance Blog [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Al T.</title>
		<link>http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/comment-page-1/#comment-278544</link>
		<dc:creator>Al T.</dc:creator>
		<pubDate>Tue, 15 Apr 2008 17:32:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/#comment-278544</guid>
		<description>I have seen those problems with UUIDs in MySQL when stored as text.  Ideally, MySQL would have a UUID column type to store the values as binary rather than strings but convert to string anytime the row is returned.  The ability to insert an ID in hex text and convert it to binary would be a must too.  That&#039;s the biggest problem with using binary fields: you can&#039;t read them or copy and paste them without conversion.  That change would provide better performance in terms of storage and indexing (a 16-byte column instead of 36).

In an effort to improve the situation, I created a simple UUID class capable of generating random UUIDs with an option to store them in Base64 rather than hex.  Since the length of the UUID is always constant, I was able to trim off the extra = in the Base64 conversion and come up with a case-sensitive 22-byte UUID representation (VTIW7xOgReOGrL3vMRjm4Q, for example).  The performance increase was enormous, the overhead over raw binary is small (22 bytes vs. 16), and you can convert to binary and hex at any time.

Later, as I thought more about the problem, I realized the Base64 encoding was inefficient for a 128-bit number.  In order to get maximum efficiency out of Base64, the number of bits needs to be divisible by 6.  So I created a new identifier that was only 72 bits.  (Yes, the collision probability goes up, but it is still one in 4.7 * 10^21.)  These UIDs (as I call them) only take 12 bytes to store in a binary column and strike a very good balance between speed and uniqueness.  They can also be translated to GUIDs (########-0000-0000-0000-00##########) and back when needed.  (If 72 bits is not enough, use 96 bits to make a 16-byte UID.)</description>
		<content:encoded><![CDATA[<p>I have seen those problems with UUIDs in MySQL when stored as text.  Ideally, MySQL would have a UUID column type to store the values as binary rather than strings but convert to string anytime the row is returned.  The ability to insert an ID in hex text and convert it to binary would be a must too.  That&#8217;s the biggest problem with using binary fields: you can&#8217;t read them or copy and paste them without conversion.  That change would provide better performance in terms of storage and indexing (a 16-byte column instead of 36).</p>
<p>In an effort to improve the situation, I created a simple UUID class capable of generating random UUIDs with an option to store them in Base64 rather than hex.  Since the length of the UUID is always constant, I was able to trim off the extra = in the Base64 conversion and come up with a case-sensitive 22-byte UUID representation (VTIW7xOgReOGrL3vMRjm4Q, for example).  The performance increase was enormous, the overhead over raw binary is small (22 bytes vs. 16), and you can convert to binary and hex at any time.</p>
<p>Later, as I thought more about the problem, I realized the Base64 encoding was inefficient for a 128-bit number.  In order to get maximum efficiency out of Base64, the number of bits needs to be divisible by 6.  So I created a new identifier that was only 72 bits.  (Yes, the collision probability goes up, but it is still one in 4.7 * 10^21.)  These UIDs (as I call them) only take 12 bytes to store in a binary column and strike a very good balance between speed and uniqueness.  They can also be translated to GUIDs (########-0000-0000-0000-00##########) and back when needed.  (If 72 bits is not enough, use 96 bits to make a 16-byte UID.)</p>
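<p>The Base64 idea can be sketched in Python as follows. This is an illustration under assumptions, not the UUID class described above: the function names are hypothetical, and it uses the standard Base64 alphabet from Python's <code>base64</code> module.</p>

```python
import base64
import os
import uuid


def uuid_to_b64(u: uuid.UUID) -> str:
    """16 raw bytes encode to 24 Base64 characters ending in '=='.

    The padding is always the same, so it can be dropped, leaving 22 characters.
    """
    return base64.b64encode(u.bytes).decode("ascii").rstrip("=")


def b64_to_uuid(s: str) -> uuid.UUID:
    """Restore the two stripped '=' characters and decode back to a UUID."""
    return uuid.UUID(bytes=base64.b64decode(s + "=="))


def short_uid() -> str:
    """72 random bits are 9 bytes, which is exactly 12 Base64 characters."""
    return base64.b64encode(os.urandom(9)).decode("ascii")


u = uuid.uuid4()
encoded = uuid_to_b64(u)
print(encoded, len(encoded))          # 22 characters
print(b64_to_uuid(encoded) == u)      # round trip
print(short_uid())                    # 12 characters, no padding
```

<p>Because 72 is a multiple of 24 bits, the 9-byte value needs no padding at all, which is the efficiency argument made above.</p>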
]]></content:encoded>
	</item>
	<item>
		<title>By: cybermonk</title>
		<link>http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/comment-page-1/#comment-254921</link>
		<dc:creator>cybermonk</dc:creator>
		<pubDate>Thu, 20 Mar 2008 17:54:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/#comment-254921</guid>
		<description>Well, it&#039;s too bad the comparison wasn&#039;t done using the UUID in binary format, autogenerating the GUID on the client side using Jimmy Nilsson&#039;s GUID.COMB. Can you do that comparison, Peter? NHibernate has an implementation of GUID.COMB.</description>
		<content:encoded><![CDATA[<p>Well, it&#8217;s too bad the comparison wasn&#8217;t done using the UUID in binary format, autogenerating the GUID on the client side using Jimmy Nilsson&#8217;s GUID.COMB. Can you do that comparison, Peter? NHibernate has an implementation of GUID.COMB.</p>
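<p>For readers unfamiliar with the idea, here is a rough Python sketch of a COMB-style identifier generated on the client: a 16-byte value that combines a timestamp with random bytes so that new keys sort roughly in insertion order. It is not Nilsson's or NHibernate's actual implementation. The function name <code>comb_id</code> is hypothetical, and placing the timestamp first is an assumption that suits a byte-wise left-to-right comparison.</p>

```python
import os
import time
from typing import Optional


def comb_id(now_ms: Optional[int] = None) -> bytes:
    """Return 16 bytes: a 6-byte millisecond timestamp followed by 10 random bytes.

    Assumption: the timestamp goes first so that byte-wise comparison
    orders newer values after older ones.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms.to_bytes(6, "big") + os.urandom(10)


first = comb_id()
print(first.hex(), len(first))
```

<p>Values created in different milliseconds compare in creation order, while the 10 random bytes keep collisions unlikely for values created in the same millisecond.</p>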
]]></content:encoded>
	</item>
	<item>
		<title>By: Anthony Mathews</title>
		<link>http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/comment-page-1/#comment-218410</link>
		<dc:creator>Anthony Mathews</dc:creator>
		<pubDate>Sat, 15 Dec 2007 20:09:55 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/#comment-218410</guid>
		<description>It appears that the benchmark done in this article was storing the UUID as an ASCII representation of a binary number when in fact the UUID should have been stored in binary form.  This is equivalent to searching using a 4-byte integer for the auto_increment primary key and then using a person&#039;s first, middle and last name to search for the UUID implementation.

Write a function that converts the value returned by the UUID function to binary and store it in a BINARY(16) column (128 bits).  Write yourself another function that will cast it back to a character string with hyphenation if you need to display it.

However, the primary use for UUIDs or GUIDs is data portability, not speed of searching.  If you have worked for large companies where you have redundant data stored in many locations, you have to manage this primary key much more closely and have the ability to generate something that you know will be unique across the company.  Integers and auto-increment will not cut it.</description>
		<content:encoded><![CDATA[<p>It appears that the benchmark done in this article was storing the UUID as an ASCII representation of a binary number when in fact the UUID should have been stored in binary form.  This is equivalent to searching using a 4-byte integer for the auto_increment primary key and then using a person&#8217;s first, middle and last name to search for the UUID implementation.</p>
<p>Write a function that converts the value returned by the UUID function to binary and store it in a BINARY(16) column (128 bits).  Write yourself another function that will cast it back to a character string with hyphenation if you need to display it.</p>
<p>However, the primary use for UUIDs or GUIDs is data portability, not speed of searching.  If you have worked for large companies where you have redundant data stored in many locations, you have to manage this primary key much more closely and have the ability to generate something that you know will be unique across the company.  Integers and auto-increment will not cut it.</p>
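<p>The two conversion functions suggested above might look like this on the client side, sketched in Python with the standard <code>uuid</code> module. The function names are hypothetical; a UUID is 128 bits, so the binary form is 16 bytes.</p>

```python
import uuid


def uuid_text_to_bin(text: str) -> bytes:
    """Convert the 36-character hyphenated form to 16 raw bytes for storage."""
    return uuid.UUID(text).bytes


def uuid_bin_to_text(raw: bytes) -> str:
    """Convert 16 raw bytes back to the lowercase hyphenated form for display."""
    return str(uuid.UUID(bytes=raw))


text = "6ccd780c-baba-1026-9564-0040f4311e29"
raw = uuid_text_to_bin(text)
print(len(raw))                 # 16
print(uuid_bin_to_text(raw))    # same string as `text`
```

<p>Inside MySQL the text-to-binary direction is commonly written as <code>UNHEX(REPLACE(UUID(), '-', ''))</code>, with <code>HEX()</code> used to read the value back.</p>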
]]></content:encoded>
	</item>
	<item>
		<title>By: scale-out: notes on sharding, unique keys, foreign keys&#8230; &#171; from Oracle to MySQL</title>
		<link>http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/comment-page-1/#comment-158651</link>
		<dc:creator>scale-out: notes on sharding, unique keys, foreign keys&#8230; &#171; from Oracle to MySQL</dc:creator>
		<pubDate>Thu, 23 Aug 2007 19:30:49 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/#comment-158651</guid>
		<description>[...] UUIDs are a bit ugly for reading and working with. (See the MySQL Performance Blog entry &#8220;To UUID or not to UUID&#8221; for performance [...]</description>
		<content:encoded><![CDATA[<p>[...] UUIDs are a bit ugly for reading and working with. (See the MySQL Performance Blog entry &#8220;To UUID or not to UUID&#8221; for performance [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/comment-page-1/#comment-106459</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Wed, 11 Apr 2007 09:34:29 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/#comment-106459</guid>
		<description>Jacob,

I can tell you what we do for http://www.boardreader.com with some billion rows which need quick retrieval.
The data is partitioned in &quot;table groups&quot; which are mapped to the servers.   We use 64-bit identifiers with the lowest byte used to store the table group.

Search is done using the &quot;Sphinx&quot; search engine, and in most cases we basically need to find rows by ID to show the result set.</description>
		<content:encoded><![CDATA[<p>Jacob,</p>
<p>I can tell you what we do for <a href="http://www.boardreader.com" rel="nofollow">http://www.boardreader.com</a> with some billion rows which need quick retrieval.<br />
The data is partitioned in &#8220;table groups&#8221; which are mapped to the servers.   We use 64-bit identifiers with the lowest byte used to store the table group.</p>
<p>Search is done using the &#8220;Sphinx&#8221; search engine, and in most cases we basically need to find rows by ID to show the result set.</p>
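<p>As an illustration of packing a table group into the low byte of a 64-bit identifier, here is a small Python sketch. It is not the actual boardreader.com code: the function names and the use of the upper 56 bits as a per-group sequence number are assumptions.</p>

```python
def make_row_id(sequence: int, table_group: int) -> int:
    """Pack a sequence number (upper 56 bits) and a table group (low byte)."""
    if not 0 <= table_group <= 0xFF:
        raise ValueError("table_group must fit in one byte")
    if not 0 <= sequence < (1 << 56):
        raise ValueError("sequence must fit in 56 bits")
    return (sequence << 8) | table_group


def split_row_id(row_id: int) -> tuple:
    """Return (sequence, table_group) for a packed identifier."""
    return row_id >> 8, row_id & 0xFF


row_id = make_row_id(12345, 7)
print(row_id)                  # 12345 * 256 + 7
print(split_row_id(row_id))    # (12345, 7)
```

<p>With this layout, <code>row_id & 0xFF</code> is enough to decide which table group (and therefore which server) holds a row, without any lookup.</p>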
]]></content:encoded>
	</item>
</channel>
</rss>
