<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Xtract &#187; Laaarge data sets</title>
	<atom:link href="http://www.xtract.com/category/laaarge-data-sets/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.xtract.com</link>
	<description>Corporate website</description>
	<lastBuildDate>Thu, 22 Apr 2010 13:02:06 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Social networking websites &#8211; Japan vs US</title>
		<link>http://www.xtract.com/culture/2008/10/15/social-networking-websites-japan-vs-us/</link>
		<comments>http://www.xtract.com/culture/2008/10/15/social-networking-websites-japan-vs-us/#comments</comments>
		<pubDate>Wed, 15 Oct 2008 07:49:05 +0000</pubDate>
		<dc:creator>Chris</dc:creator>
				<category><![CDATA[Culture]]></category>
		<category><![CDATA[Laaarge data sets]]></category>
		<category><![CDATA[Social Network Analytics]]></category>
		<category><![CDATA[customer insight]]></category>
		<category><![CDATA[japan]]></category>
		<category><![CDATA[online]]></category>
		<category><![CDATA[social networking]]></category>
		<category><![CDATA[US]]></category>
		<category><![CDATA[websites]]></category>

		<guid isPermaLink="false">http://www.xtract.com/?p=703</guid>
		<description><![CDATA[When I read Jay Alabaster&#8217;s article on how Japanese users behave on social networking websites, I realised how difficult it must be for some companies to get any customer insight from their customer base.
According to Jay, &#8220;the vast majority of mixi&#8217;s roughly 15 million users don&#8217;t reveal anything about themselves&#8221; and keep in tight [...]]]></description>
			<content:encoded><![CDATA[<p>When I read Jay Alabaster&#8217;s <a href="http://seattletimes.nwsource.com/html/businesstechnology/2008211943_btjapanshynet29.html" target="_blank">article</a> on how Japanese users behave on social networking websites, I realised how difficult it must be for some companies to get any customer insight from their customer base.</p>
<p>According to Jay, &#8220;the vast majority of <a href="http://mixi.jp" target="_blank">mixi</a>&#8217;s roughly 15 million users don&#8217;t reveal anything about themselves&#8221; and keep to tight groups. He adds that &#8220;fewer than half of <a href="http://jp.match.com" target="_blank">Match</a>&#8217;s paying members in Japan are willing to post their photos, compared with nearly all members in the U.S.&#8221;</p>
<p>It must be frustrating to sit on so much data and not be able to extract any useful insight from it. I wonder how companies like mixi.jp handle it, given that users have fake profiles, and how companies like match.com cope with users behaving so differently from culture to culture on the same service.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.xtract.com/culture/2008/10/15/social-networking-websites-japan-vs-us/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Matrices in the land of Tintin</title>
		<link>http://www.xtract.com/academic/2008/09/18/matrices-in-the-land-of-tintin/</link>
		<comments>http://www.xtract.com/academic/2008/09/18/matrices-in-the-land-of-tintin/#comments</comments>
		<pubDate>Thu, 18 Sep 2008 19:58:44 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Academic]]></category>
		<category><![CDATA[Academical]]></category>
		<category><![CDATA[Communities]]></category>
		<category><![CDATA[Laaarge data sets]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[food]]></category>
		<category><![CDATA[machinelearning]]></category>
		<category><![CDATA[scientific]]></category>

		<guid isPermaLink="false">http://www.xtract.com/?p=469</guid>
		<description><![CDATA[This week, the University of Antwerp has been hosting the ECML-PKDD conference. It is a good opportunity to hear the newest thinking in machine learning and knowledge discovery, and talk directly to researchers. The organizers have worked very hard to make the conference a success. One of their many good ideas is to have every [...]]]></description>
			<content:encoded><![CDATA[<p>This week, the University of Antwerp has been hosting the <a href="http://www.ecmlpkdd2008.org/">ECML-PKDD conference</a>. It is a good opportunity to hear the newest thinking in machine learning and knowledge discovery, and to talk directly to researchers. The organizers have worked very hard to make the conference a success. One of their many good ideas is to have every paper presented both as a talk and as a poster, so if you have questions that were not answered in the talk, the author can explain the work again using the poster as an aid.</p>
<p>On Tuesday I had the opportunity to chair the <a href="http://ecmlpkdd2008.org/schedule/session?id=4">Matrix Factorization session</a>, arguably the highest-quality research session at the conference: of the four papers presented, one received the Best Paper in Machine Learning award and another the Best Student Paper in Knowledge Discovery award.</p>
<p>To those of us who didn&#8217;t take <a href="http://ocw.mit.edu/OcwWeb/Mathematics/18-06Spring-2005/CourseHome/">Linear Algebra 101</a>, Matrix Factorization may sound imposing, but it is really a beautiful, unifying idea behind many techniques such as community discovery, document classification (<em>e.g.</em> into spam and non-spam emails), and <a href="http://en.wikipedia.org/wiki/Collaborative_filtering">collaborative filtering</a>, which is what Amazon and Netflix do when they recommend an item to you based on your previous purchases compared with those of other customers.</p>
<p>In the session, Ajit Singh gave <a href="http://www.ecmlpkdd2008.org/accepted-papers/abstract?id=467">a talk</a> on how the matrix factorization idea encompasses several methods that might not look like matrix algebra on the surface. Alexandros Karatzoglou <a href="http://www.ecmlpkdd2008.org/accepted-papers/abstract?id=243">explained several improvements</a> on Maximum Margin Matrix Factorization, one of the hottest collaborative filtering methods around. Pauli Miettinen <a href="http://www.ecmlpkdd2008.org/accepted-papers/abstract?id=338">discussed factorizing binary matrices</a>, which is quite a different problem from usual linear algebra methods, and <a href="http://www.ecmlpkdd2008.org/accepted-papers/abstract?id=234">Bin Cao <em>et al.</em>&#8217;s paper</a> was about a new adaptive way to compute a similarity metric for collaborative filtering.</p>
<div id="attachment_479" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.xtract.com/wp-content/uploads/2008/09/dessert.jpg"><img class="size-medium wp-image-479" title="dessert" src="http://www.xtract.com/wp-content/uploads/2008/09/dessert-300x142.jpg" alt="Dessert in style" width="300" height="142" /></a><p class="wp-caption-text">Dessert served in style at the conference banquet on Wednesday</p></div>
]]></content:encoded>
			<wfw:commentRss>http://www.xtract.com/academic/2008/09/18/matrices-in-the-land-of-tintin/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
