<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Metasearch vs. Google Scholar</title>
	<atom:link href="http://scholarlypublishing.org/jpwilkin/archives/6/feed" rel="self" type="application/rss+xml" />
	<link>http://scholarlypublishing.org/jpwilkin/archives/6</link>
	<description>John's blog on libraries, library technology, and pizza</description>
	<lastBuildDate>Tue, 17 Aug 2010 20:41:32 -0700</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Tito Sierra</title>
		<link>http://scholarlypublishing.org/jpwilkin/archives/6/comment-page-1#comment-40</link>
		<dc:creator>Tito Sierra</dc:creator>
		<pubDate>Thu, 06 Dec 2007 02:38:13 +0000</pubDate>
		<guid isPermaLink="false">http://scholarlypublishing.org/jpwilkin/2007/11/05/metasearch-vs-google-scholar/#comment-40</guid>
		<description>Sorry for coming to this conversation so late... just stumbled upon this fine blog post today.  The experiment Jonathan R. is referring to is called Smart Subjects (just google &#039;smart subjects&#039; for more info).  It basically attempts to do what John W. describes in paragraph 9.  It takes an arbitrary user query as input and returns a short list of related library subjects. Rather than rely on database descriptions it searches across a large corpus of topical terms as a surrogate description.  Since the system is based on structured search indices, the response time is milliseconds.

Our ERM system has a short list of article databases mapped to each of these subjects, so it is possible to return a target list of eight databases based on a user query. I would just need to combine and de-dupe the databases associated with our library subjects. Jonathan, the only reason our current implementation doesn&#039;t just send the user&#039;s query into a targeted metasearch environment, or a specific database, is that our production metasearch application can&#039;t handle targeting of this sort. When I have some time I would like to continue with this experiment to create an explicit database recommendation service. This would ask the user to describe the topic they are interested in, in a few words, and it would output a listing of suggested databases to send their search to.

That said, David&#039;s comment is spot on. If the user types in terms that are subject specific, such as &quot;toxicity&quot;, we can make some inferences from this. If the user enters something vaguer, then auto-selection of database targets results in false positives. Taking David&#039;s query example of &#039;california indians&#039;, my system recommends a variety of subjects including Education, Psychology and History. This false positive problem could be addressed algorithmically by dropping recommendations that are too dissimilar from each other; in other words, only providing recommendations when there is a clear topical signal. Again, this is something I would like to experiment with in the future. It&#039;s unclear that this approach will solve the problems John W. describes with metasearch, but we ought to experiment just in case.</description>
		<content:encoded><![CDATA[<p>Sorry for coming to this conversation so late&#8230; just stumbled upon this fine blog post today.  The experiment Jonathan R. is referring to is called Smart Subjects (just google &#8216;smart subjects&#8217; for more info).  It basically attempts to do what John W. describes in paragraph 9.  It takes an arbitrary user query as input and returns a short list of related library subjects. Rather than rely on database descriptions it searches across a large corpus of topical terms as a surrogate description.  Since the system is based on structured search indices, the response time is milliseconds.</p>
<p>Our ERM system has a short list of article databases mapped to each of these subjects, so it is possible to return a target list of eight databases based on a user query. I would just need to combine and de-dupe the databases associated with our library subjects. Jonathan, the only reason our current implementation doesn&#8217;t just send the user&#8217;s query into a targeted metasearch environment, or a specific database, is that our production metasearch application can&#8217;t handle targeting of this sort. When I have some time I would like to continue with this experiment to create an explicit database recommendation service. This would ask the user to describe the topic they are interested in, in a few words, and it would output a listing of suggested databases to send their search to.</p>
<p>That said, David&#8217;s comment is spot on. If the user types in terms that are subject specific, such as &#8220;toxicity&#8221;, we can make some inferences from this. If the user enters something vaguer, then auto-selection of database targets results in false positives. Taking David&#8217;s query example of &#8216;california indians&#8217;, my system recommends a variety of subjects including Education, Psychology and History. This false positive problem could be addressed algorithmically by dropping recommendations that are too dissimilar from each other; in other words, only providing recommendations when there is a clear topical signal. Again, this is something I would like to experiment with in the future. It&#8217;s unclear that this approach will solve the problems John W. describes with metasearch, but we ought to experiment just in case.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeroen Bosman</title>
		<link>http://scholarlypublishing.org/jpwilkin/archives/6/comment-page-1#comment-37</link>
		<dc:creator>Jeroen Bosman</dc:creator>
		<pubDate>Fri, 30 Nov 2007 08:23:47 +0000</pubDate>
		<guid isPermaLink="false">http://scholarlypublishing.org/jpwilkin/2007/11/05/metasearch-vs-google-scholar/#comment-37</guid>
		<description>Local indexing is also a reality in Utrecht, The Netherlands. Our system, called Omega (at http://omega.library.uu.nl/seal/omegasearch.php?lan=en), manages so far to search some 70% of our licensed journals. It is hard work (getting the data, building and updating the filters, tweaking the relevancy ranking), but it can be done. And the system is intensively used, by undergraduates in particular. The unique selling point is that each and every result is available in full text with one click. Another good thing is that the system is much, much faster than the typical federated solution. Our main concerns right now are getting the metadata from the remaining long list of smaller publishers and, another pressing issue, deciding how to approach local indexing of ebook content.</description>
		<content:encoded><![CDATA[<p>Local indexing is also a reality in Utrecht, The Netherlands. Our system, called Omega (at <a href="http://omega.library.uu.nl/seal/omegasearch.php?lan=en" rel="nofollow">http://omega.library.uu.nl/seal/omegasearch.php?lan=en</a>), manages so far to search some 70% of our licensed journals. It is hard work (getting the data, building and updating the filters, tweaking the relevancy ranking), but it can be done. And the system is intensively used, by undergraduates in particular. The unique selling point is that each and every result is available in full text with one click. Another good thing is that the system is much, much faster than the typical federated solution. Our main concerns right now are getting the metadata from the remaining long list of smaller publishers and, another pressing issue, deciding how to approach local indexing of ebook content.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: WoW!ter</title>
		<link>http://scholarlypublishing.org/jpwilkin/archives/6/comment-page-1#comment-29</link>
		<dc:creator>WoW!ter</dc:creator>
		<pubDate>Sun, 18 Nov 2007 19:00:04 +0000</pubDate>
		<guid isPermaLink="false">http://scholarlypublishing.org/jpwilkin/2007/11/05/metasearch-vs-google-scholar/#comment-29</guid>
		<description>You may be interested in the definitions of metasearch and federated search given by Tamar Sadeh (Ex Libris):
&lt;b&gt;Sadeh, T.&lt;/b&gt; (2006). Google Scholar versus metasearch systems. &lt;i&gt;High Energy Physics Libraries Webzine&lt;/i&gt;(12). &lt;a href=&quot;http://library.cern.ch/HEPLW/12/papers/1/&quot; rel=&quot;nofollow&quot;&gt;http://library.cern.ch/HEPLW/12/papers/1/&lt;/a&gt;</description>
		<content:encoded><![CDATA[<p>You may be interested in the definitions of metasearch and federated search given by Tamar Sadeh (Ex Libris):<br />
<b>Sadeh, T.</b> (2006). Google Scholar versus metasearch systems. <i>High Energy Physics Libraries Webzine</i>(12). <a href="http://library.cern.ch/HEPLW/12/papers/1/" rel="nofollow">http://library.cern.ch/HEPLW/12/papers/1/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: tori</title>
		<link>http://scholarlypublishing.org/jpwilkin/archives/6/comment-page-1#comment-28</link>
		<dc:creator>tori</dc:creator>
		<pubDate>Sun, 18 Nov 2007 18:28:50 +0000</pubDate>
		<guid isPermaLink="false">http://scholarlypublishing.org/jpwilkin/2007/11/05/metasearch-vs-google-scholar/#comment-28</guid>
		<description>Well-thought-out and helpful essay. I enjoyed the breakdown of possible solutions *very* much. However, one thing I always find missing in discussions of search (GS or otherwise) is its natural connection to fulfillment or access. Finding something online is great. Now how do I &quot;get&quot; it in my hands or on my screen? Sometimes search ends up being a frustrating exercise in seeing a dangling carrot but never being able to grasp it. I know this opens up the Pandora&#039;s box of copyfight and ownership issues, but someone needs to acknowledge the elephant in the room at some point.</description>
		<content:encoded><![CDATA[<p>Well-thought-out and helpful essay. I enjoyed the breakdown of possible solutions *very* much. However, one thing I always find missing in discussions of search (GS or otherwise) is its natural connection to fulfillment or access. Finding something online is great. Now how do I &#8220;get&#8221; it in my hands or on my screen? Sometimes search ends up being a frustrating exercise in seeing a dangling carrot but never being able to grasp it. I know this opens up the Pandora&#8217;s box of copyfight and ownership issues, but someone needs to acknowledge the elephant in the room at some point.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mita</title>
		<link>http://scholarlypublishing.org/jpwilkin/archives/6/comment-page-1#comment-27</link>
		<dc:creator>Mita</dc:creator>
		<pubDate>Sun, 18 Nov 2007 13:06:28 +0000</pubDate>
		<guid isPermaLink="false">http://scholarlypublishing.org/jpwilkin/2007/11/05/metasearch-vs-google-scholar/#comment-27</guid>
		<description>(1) is not crazy talk. In fact, it&#039;s been done.

&lt;a href=&quot;http://www.ocul.on.ca/&quot; rel=&quot;nofollow&quot;&gt;OCUL&lt;/a&gt;, a Canadian consortium of university libraries from Ontario, negotiated with the major publishers (Elsevier, Springer and the other usual suspects) and loaded their content onto a single server called Scholars Portal (&lt;a href=&quot;http://www.scholarsportal.info/&quot; rel=&quot;nofollow&quot;&gt;http://www.scholarsportal.info/&lt;/a&gt;).

Not only does this make the librarians involved feel better about the future preservation of this content, it gives the opportunity for developing search beyond the constraints of metasearch. 

Scholars Portal will be moving to a new server, and a colleague and I have written about the possibilities of its development in a white paper called Scholr 2.0:
http://www.scholarsportal.info/commentpress/</description>
		<content:encoded><![CDATA[<p>(1) is not crazy talk. In fact, it&#8217;s been done.</p>
<p><a href="http://www.ocul.on.ca/" rel="nofollow">OCUL</a>, a Canadian consortium of university libraries from Ontario, negotiated with the major publishers (Elsevier, Springer and the other usual suspects) and loaded their content onto a single server called Scholars Portal (<a href="http://www.scholarsportal.info/" rel="nofollow">http://www.scholarsportal.info/</a>).</p>
<p>Not only does this make the librarians involved feel better about the future preservation of this content, it gives the opportunity for developing search beyond the constraints of metasearch. </p>
<p>Scholars Portal will be moving to a new server, and a colleague and I have written about the possibilities of its development in a white paper called Scholr 2.0:<br />
<a href="http://www.scholarsportal.info/commentpress/" rel="nofollow">http://www.scholarsportal.info/commentpress/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Wilkin’s blog &#187; Next Generation Library Systems</title>
		<link>http://scholarlypublishing.org/jpwilkin/archives/6/comment-page-1#comment-26</link>
		<dc:creator>John Wilkin’s blog &#187; Next Generation Library Systems</dc:creator>
		<pubDate>Fri, 16 Nov 2007 15:56:21 +0000</pubDate>
		<guid isPermaLink="false">http://scholarlypublishing.org/jpwilkin/2007/11/05/metasearch-vs-google-scholar/#comment-26</guid>
		<description>[...] Metasearch vs. Google Scholar [...]</description>
		<content:encoded><![CDATA[<p>[...] Metasearch vs. Google Scholar [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sue</title>
		<link>http://scholarlypublishing.org/jpwilkin/archives/6/comment-page-1#comment-25</link>
		<dc:creator>Sue</dc:creator>
		<pubDate>Sat, 10 Nov 2007 15:32:17 +0000</pubDate>
		<guid isPermaLink="false">http://scholarlypublishing.org/jpwilkin/2007/11/05/metasearch-vs-google-scholar/#comment-25</guid>
		<description>Has anyone tried to negotiate a license to locally house sets of databases? In the past lots of libraries loaded licensed content on local servers. At Hopkins we loaded both Wilson and SilverPlatter databases locally and only moved to vendor-served databases beginning in the late 90s. We abandoned local loading in part because of the resources it took to manage and update the databases and the search engines.

To me it&#039;s not so much a crazy idea as it is a very expensive idea. 

It&#039;s the kind of project that could be taken up by a consortium of research libraries. That approach would spread out the cost of supporting a &#039;metabase&#039; and pull together a larger set of staff to cooperate in the development.

Because, of course, after you get all the databases together you still have to develop algorithms that work across disparate types of data, as well as mechanisms for helping searchers move beyond the first set of results. Searching is, after all, only the first step.</description>
		<content:encoded><![CDATA[<p>Has anyone tried to negotiate a license to locally house sets of databases? In the past lots of libraries loaded licensed content on local servers. At Hopkins we loaded both Wilson and SilverPlatter databases locally and only moved to vendor-served databases beginning in the late 90s. We abandoned local loading in part because of the resources it took to manage and update the databases and the search engines.</p>
<p>To me it&#8217;s not so much a crazy idea as it is a very expensive idea. </p>
<p>It&#8217;s the kind of project that could be taken up by a consortium of research libraries. That approach would spread out the cost of supporting a &#8216;metabase&#8217; and pull together a larger set of staff to cooperate in the development.</p>
<p>Because, of course, after you get all the databases together you still have to develop algorithms that work across disparate types of data, as well as mechanisms for helping searchers move beyond the first set of results. Searching is, after all, only the first step.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jpwilkin</title>
		<link>http://scholarlypublishing.org/jpwilkin/archives/6/comment-page-1#comment-24</link>
		<dc:creator>jpwilkin</dc:creator>
		<pubDate>Fri, 09 Nov 2007 14:31:54 +0000</pubDate>
		<guid isPermaLink="false">http://scholarlypublishing.org/jpwilkin/2007/11/05/metasearch-vs-google-scholar/#comment-24</guid>
		<description>Nice cite, and a very important thread in the argument about how to make discovery work.  I should modify my statement about &quot;crazy talk&quot;!  I&#039;m fairly sure I don&#039;t buy the argument that we should be negotiating for local loading of this data, however.  Having some competitors for GS would be great, but advocating for sending users back to local discovery mechanisms, however successful, runs counter to taking advantage of the natural draw users have to higher-profile network-based services.  That&#039;s another topic, though, and the argument you make is an important one to take into account.</description>
		<content:encoded><![CDATA[<p>Nice cite, and a very important thread in the argument about how to make discovery work.  I should modify my statement about &#8220;crazy talk&#8221;!  I&#8217;m fairly sure I don&#8217;t buy the argument that we should be negotiating for local loading of this data, however.  Having some competitors for GS would be great, but advocating for sending users back to local discovery mechanisms, however successful, runs counter to taking advantage of the natural draw users have to higher-profile network-based services.  That&#8217;s another topic, though, and the argument you make is an important one to take into account.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Rochkind</title>
		<link>http://scholarlypublishing.org/jpwilkin/archives/6/comment-page-1#comment-23</link>
		<dc:creator>Jonathan Rochkind</dc:creator>
		<pubDate>Thu, 08 Nov 2007 17:10:52 +0000</pubDate>
		<guid isPermaLink="false">http://scholarlypublishing.org/jpwilkin/2007/11/05/metasearch-vs-google-scholar/#comment-23</guid>
		<description>Excellent. The next step is automatically using metasearch to _query_ those identified databases, and showing the user the results of that query, not just the databases that the user might want to query. Do it for them, and show them the results. 

Of course, I&#039;m not sure my MetaLib instance could handle that level of traffic. When I think about doing this, I think about automatically querying in like 25% of searches or something, to see what happens without killing my MetaLib server. MetaLib is a resource hog.</description>
		<content:encoded><![CDATA[<p>Excellent. The next step is automatically using metasearch to _query_ those identified databases, and showing the user the results of that query, not just the databases that the user might want to query. Do it for them, and show them the results. </p>
<p>Of course, I&#8217;m not sure my MetaLib instance could handle that level of traffic. When I think about doing this, I think about automatically querying in like 25% of searches or something, to see what happens without killing my MetaLib server. MetaLib is a resource hog.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Rochkind</title>
		<link>http://scholarlypublishing.org/jpwilkin/archives/6/comment-page-1#comment-22</link>
		<dc:creator>Jonathan Rochkind</dc:creator>
		<pubDate>Thu, 08 Nov 2007 17:07:51 +0000</pubDate>
		<guid isPermaLink="false">http://scholarlypublishing.org/jpwilkin/2007/11/05/metasearch-vs-google-scholar/#comment-22</guid>
		<description>To toot my own horn, this is the topic of my article published last February in Library Journal:

http://www.libraryjournal.com/article/CA6413442.html</description>
		<content:encoded><![CDATA[<p>To toot my own horn, this is the topic of my article published last February in Library Journal:</p>
<p><a href="http://www.libraryjournal.com/article/CA6413442.html" rel="nofollow">http://www.libraryjournal.com/article/CA6413442.html</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>
