<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Sphinx:  Going Beyond full text search</title>
	<atom:link href="http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/</link>
	<description>Percona&#039;s MySQL &#38; InnoDB performance and scalability blog</description>
	<lastBuildDate>Sat, 11 Feb 2012 16:45:54 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: MySQL Performance Blog &#187; FaceBook Search, Search for social networks</title>
		<link>http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/comment-page-1/#comment-155456</link>
		<dc:creator>MySQL Performance Blog &#187; FaceBook Search, Search for social networks</dc:creator>
		<pubDate>Sun, 12 Aug 2007 14:06:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/#comment-155456</guid>
		<description>[...] also already make sure Sphinx can be used efficiently to perform queries going beyond full text search. Such as finding the goods from the given group or Finding all Males looking for Females within 50 [...]</description>
		<content:encoded><![CDATA[<p>[...] also already make sure Sphinx can be used efficiently to perform queries going beyond full text search. Such as finding the goods from the given group or Finding all Males looking for Females within 50 [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael</title>
		<link>http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/comment-page-1/#comment-149956</link>
		<dc:creator>Michael</dc:creator>
		<pubDate>Fri, 27 Jul 2007 00:56:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/#comment-149956</guid>
		<description>Peter, thanks for the answer on SphinxSE. We&#039;ll probably use the native Sphinx API. My new question: do you know whether Sphinx allows the index to be read in reverse?

Michael</description>
		<content:encoded><![CDATA[<p>Peter, thanks for the answer on SphinxSE. We&#8217;ll probably use the native Sphinx API. My new question: do you know whether Sphinx allows the index to be read in reverse?</p>
<p>Michael</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/comment-page-1/#comment-149851</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Thu, 26 Jul 2007 18:11:47 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/#comment-149851</guid>
		<description>SELECT * FROM TBL WHERE KEYPART1=CONST and KEYPART2 LIKE &quot;PREFIX%&quot; ORDER BY KEYPART3 LIMIT 10;

Because KEYPART2 is matched by a prefix (range) condition, the index cannot return rows already ordered by KEYPART3, so this uses filesort (external sort), which is fatally slow for the millions of items we&#039;re looking at.

Regarding your comment about full text search, that is too simplified an understanding :) The process you&#039;re describing is stemming, which can be adjusted. In this case we do not use stemming.</description>
		<content:encoded><![CDATA[<p>SELECT * FROM TBL WHERE KEYPART1=CONST and KEYPART2 LIKE &#8220;PREFIX%&#8221; ORDER BY KEYPART3 LIMIT 10;</p>
<p>Because KEYPART2 is matched by a prefix (range) condition, the index cannot return rows already ordered by KEYPART3, so this uses filesort (external sort), which is fatally slow for the millions of items we&#8217;re looking at.</p>
<p>Regarding your comment about full text search, that is too simplified an understanding <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> The process you&#8217;re describing is stemming, which can be adjusted. In this case we do not use stemming.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: maht</title>
		<link>http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/comment-page-1/#comment-149837</link>
		<dc:creator>maht</dc:creator>
		<pubDate>Thu, 26 Jul 2007 17:03:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/#comment-149837</guid>
		<description>peter: I&#039;m sorry, I still don&#039;t understand the problem in that case, even after re-reading.

Is it just adding some extra columns to paths? Presumably not :)

Full text search breaks words down into lexemes (as it does for search queries), so that animation, animator and animated will all be returned for a search for animate.

I&#039;m still under the impression that this is not the behaviour you are exploiting, ergo there must be a better way of doing it.</description>
		<content:encoded><![CDATA[<p>peter: I&#8217;m sorry, I still don&#8217;t understand the problem in that case, even after re-reading.</p>
<p>Is it just adding some extra columns to paths? Presumably not <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Full text search breaks words down into lexemes (as it does for search queries), so that animation, animator and animated will all be returned for a search for animate.</p>
<p>I&#8217;m still under the impression that this is not the behaviour you are exploiting, ergo there must be a better way of doing it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/comment-page-1/#comment-149829</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Thu, 26 Jul 2007 16:35:47 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/#comment-149829</guid>
		<description>Maht, 

Your example does not solve the main problem I&#039;m describing: sorting combined with a prefix/range index lookup.

Regarding LIKE: it can be case sensitive or case insensitive depending on the collation you&#039;re using.</description>
		<content:encoded><![CDATA[<p>Maht, </p>
<p>Your example does not solve the main problem I&#8217;m describing: sorting combined with a prefix/range index lookup.</p>
<p>Regarding LIKE: it can be case sensitive or case insensitive depending on the collation you&#8217;re using.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: maht</title>
		<link>http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/comment-page-1/#comment-149755</link>
		<dc:creator>maht</dc:creator>
		<pubDate>Thu, 26 Jul 2007 09:35:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/#comment-149755</guid>
		<description>lol, they came from the pastebin I used first, not my day :)</description>
		<content:encoded><![CDATA[<p>lol, they came from the pastebin I used first, not my day <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: maht</title>
		<link>http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/comment-page-1/#comment-149754</link>
		<dc:creator>maht</dc:creator>
		<pubDate>Thu, 26 Jul 2007 09:34:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/#comment-149754</guid>
		<description>stupid blog software, these &#039;double&#039; quotes are supposed to be single.

What a lame way to sql escape.</description>
		<content:encoded><![CDATA[<p>stupid blog software, these &#8216;double&#8217; quotes are supposed to be single.</p>
<p>What a lame way to sql escape.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: maht</title>
		<link>http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/comment-page-1/#comment-149753</link>
		<dc:creator>maht</dc:creator>
		<pubDate>Thu, 26 Jul 2007 09:32:14 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/#comment-149753</guid>
		<description>You can break the URIs yourself into the components you want to search for.

Here&#039;s my PostgreSQL table layout, though it looks like fairly standard SQL to me.

    CREATE TABLE domains (
        domain VARCHAR NOT NULL PRIMARY KEY
    );
    CREATE TABLE subdomains (
        sub VARCHAR NOT NULL,
        domain VARCHAR NOT NULL REFERENCES domains ON DELETE CASCADE,
        PRIMARY KEY (sub, domain)
    );
    CREATE TABLE paths (
        proto VARCHAR NOT NULL,
        path VARCHAR NOT NULL,
        port INT,
        domain VARCHAR NOT NULL REFERENCES domains ON DELETE CASCADE,
        sub VARCHAR,
        FOREIGN KEY (sub, domain) REFERENCES subdomains (sub, domain) ON DELETE CASCADE
    );
    CREATE INDEX paths_idx ON paths (sub, domain, port, path);

    SELECT proto &#124;&#124; &#039;://&#039; &#124;&#124;
        coalesce(sub &#124;&#124; &#039;.&#039;, &#039;&#039;) &#124;&#124; domain &#124;&#124;
        coalesce(&#039;:&#039; &#124;&#124; port, &#039;&#039;) &#124;&#124; path
    FROM paths
    WHERE domain = &#039;mysql.com&#039;
        AND path LIKE &#039;/downloads/%&#039;;

Fun with LIKE/ILIKE: some web servers are case insensitive!</description>
		<content:encoded><![CDATA[<p>You can break the URIs yourself into the components you want to search for.</p>
<p>Here&#8217;s my PostgreSQL table layout, though it looks like fairly standard SQL to me.</p>
<p>    CREATE TABLE domains (<br />
        domain VARCHAR NOT NULL PRIMARY KEY<br />
    );<br />
    CREATE TABLE subdomains (<br />
        sub VARCHAR NOT NULL,<br />
        domain VARCHAR NOT NULL REFERENCES domains ON DELETE CASCADE,<br />
        PRIMARY KEY (sub, domain)<br />
    );<br />
    CREATE TABLE paths (<br />
        proto VARCHAR NOT NULL,<br />
        path VARCHAR NOT NULL,<br />
        port INT,<br />
        domain VARCHAR NOT NULL REFERENCES domains ON DELETE CASCADE,<br />
        sub VARCHAR,<br />
        FOREIGN KEY (sub, domain) REFERENCES subdomains (sub, domain) ON DELETE CASCADE<br />
    );<br />
    CREATE INDEX paths_idx ON paths (sub, domain, port, path);</p>
<p>    SELECT proto || '://' ||<br />
        coalesce(sub || '.', '') || domain ||<br />
        coalesce(':' || port, '') || path<br />
    FROM paths<br />
    WHERE domain = 'mysql.com'<br />
        AND path LIKE '/downloads/%';</p>
<p>Fun with LIKE/ILIKE: some web servers are case insensitive!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/comment-page-1/#comment-149208</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Tue, 24 Jul 2007 22:49:29 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/#comment-149208</guid>
		<description>Michael,

I think SphinxSE is a good way for MySQL users to start using Sphinx without leaving the MySQL environment.

In our projects we typically use the native Sphinx API instead, which lets us keep a stock MySQL version and is more transparent when it comes to understanding performance properties.

It is actually quite easy: use Sphinx to do the filtering, ordering and grouping and get the list of row IDs from it, then retrieve the data from MySQL.</description>
		<content:encoded><![CDATA[<p>Michael,</p>
<p>I think SphinxSE is a good way for MySQL users to start using Sphinx without leaving the MySQL environment.</p>
<p>In our projects we typically use the native Sphinx API instead, which lets us keep a stock MySQL version and is more transparent when it comes to understanding performance properties.</p>
<p>It is actually quite easy: use Sphinx to do the filtering, ordering and grouping and get the list of row IDs from it, then retrieve the data from MySQL.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peter</title>
		<link>http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/comment-page-1/#comment-149207</link>
		<dc:creator>peter</dc:creator>
		<pubDate>Tue, 24 Jul 2007 22:46:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/2007/07/23/sphinx-going-beyond-full-text-search/#comment-149207</guid>
		<description>Pedro Melo,

We used to use a similar approach. You&#039;re right that in this case MySQL uses an index to retrieve the data. The problem, however, is that for some URLs there would be millions of links, which have to be sorted with &quot;filesort&quot;, and that is too slow.

If the goal were to show any 10 links, this approach would work just fine :)</description>
		<content:encoded><![CDATA[<p>Pedro Melo,</p>
<p>We used to use a similar approach. You&#8217;re right that in this case MySQL uses an index to retrieve the data. The problem, however, is that for some URLs there would be millions of links, which have to be sorted with &#8220;filesort&#8221;, and that is too slow.</p>
<p>If the goal were to show any 10 links, this approach would work just fine <img src='http://www.mysqlperformanceblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
</channel>
</rss>

