<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Digital Reasoning &#187; machine learning</title>
	<atom:link href="http://www.digitalreasoning.com/tag/machine-learning/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.digitalreasoning.com</link>
	<description>Automated Understanding for Big Data</description>
	<lastBuildDate>Wed, 25 Jan 2012 13:31:46 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
		<item>
		<title>Data Analytics: Should We Build Iron Man or R2D2?</title>
		<link>http://www.digitalreasoning.com/2010/blog/data-analytics-should-we-build-iron-man-or-r2d2/</link>
		<comments>http://www.digitalreasoning.com/2010/blog/data-analytics-should-we-build-iron-man-or-r2d2/#comments</comments>
		<pubDate>Tue, 27 Jul 2010 16:13:19 +0000</pubDate>
		<dc:creator>Harry Schultz</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[automation]]></category>
		<category><![CDATA[data analytics]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Tim Estes]]></category>
		<category><![CDATA[unstructured data]]></category>

		<guid isPermaLink="false">http://www.digitalreasoning.com/?p=1313</guid>
		<description><![CDATA[Earlier this year, Alex Handy wrote an intriguing article exploring the future of data analysis. In this article, Handy compared and contrasted two approaches to understanding the ever-increasing stream of data. One approach depends upon building &#8220;exoskeletal systems&#8221;, which enhance human comprehension. Handy draws connections between this solution and “Iron Man”. The other ]]></description>
			<content:encoded><![CDATA[<p>Earlier this year, <a title="Alex Handy Bio" href="http://www.sdtimes.com/about/AlexHandy" target="_blank">Alex Handy</a> wrote an intriguing <a title="SD Times Article on the future of Data Analytics" href="http://www.sdtimes.com/link/34139" target="_blank">article</a> exploring the future of data analysis. In this article, Handy compared and contrasted two approaches to understanding the ever-increasing stream of data. One approach depends upon building &#8220;exoskeletal systems&#8221;, which enhance human comprehension. Handy draws connections between this solution and “Iron Man”. The other approach would depend chiefly on autonomous robots or automated systems. This alternative, Handy suggests, is more like “R2D2” from Star Wars. Ultimately, Handy concludes that &#8220;[d]evelopers should build Iron Man, not R2D2.”</p>
<p>Here at Digital Reasoning, we have been dealing with the challenges of automated understanding of massive amounts of unstructured data for years. Knowing that Tim Estes, our CEO, might have a different view on this issue, I decided to interview him. Tim has worked within the realms of unstructured data analytics, artificial intelligence, and machine learning for the past decade.</p>
<p>The following is our interview:</p>
<p><strong>Jason Beck &#8211; In the article, one researcher suggests that developers shouldn’t build analytics robots, but rather “exoskeletal systems”. Do you agree?</strong></p>
<p><strong>Tim Estes -</strong> I think that it&#8217;s a matter of degree. The range of judgments that a machine can make as a proxy for the human is constantly and necessarily expanding. Even R2D2 was most famous for taking orders from Luke Skywalker to accomplish tasks ranging from fixing the X-wing in flight to cracking into computer networks.</p>
<p>Just to be a little more accurate &#8211; Iron Man wouldn&#8217;t work without an AI that is close to R2D2. Jarvis (the AI program that runs the Stark house and the Iron Man suit) is always chatting up Tony Stark about what&#8217;s going on with the suit and the risks that are present around him. The Iron Man analogy means we cede the full situational awareness (the sensory and data input space) to the machine, with the human making key decisions on the filtered and prioritized information. I think that&#8217;s about right.</p>
<p>R2D2 is distinct in having a measure of its own intentionality (i.e. it is autonomous in more dramatic ways than Jarvis/Iron Man suit), but they are much closer than you might think. Should humans get out of the loop in making analytic judgements? No more than we should have pilots out of the loop in flying commercial airliners at this time. But show me a pilot that can fly a 747 without computer assistance and guidance. We are already in the hybrid space. And the complexity of our technology and the explosion of the information created by machines and man assisted by machines means we will need ever-increasing automation in understanding.</p>
<p><strong>JB &#8211; Doesn’t the exponential growth of data and decreasing levels of available talent necessitate automated systems? </strong></p>
<p><strong>TE -</strong> Exactly. The notion that &#8220;augmented intelligence&#8221; can solve the full data problem is wishful thinking. Something has to read everything, and as a matter of scale that can no longer be a human. We have to make strides so that intelligent systems catch up with the complexity and scale of the data we are being inundated with.</p>
<p><strong>JB &#8211; Is this an Either-Or situation? Just because someone may prefer automated systems, does this assume that there won’t be any human in the loop?</strong></p>
<p><strong>TE -</strong> I think that&#8217;s the real issue &#8211; where is the dividing line right now and where is it going to be in 5 years? Right now &#8211; machines have to read and organize everything. The race is to see who can do it accurately, at scale, and focused on the entity level vs. the document level. In five years, the information overload will be so substantial that autonomous proxies or agents will likely be the baseline for all of these systems. In both situations, humans are in the loop. Now &#8211; they have much heavier lifting to do because nearly all of our enterprise information systems don&#8217;t really understand their data that well, so the burden is on the reader. That has to change. Even when it does, we will just be enabling the humans to make better decisions in less time and with less interruption of their daily lives.</p>
<p><strong>JB &#8211; Does the delineation between these two approaches represent a common split in the overall text analytics community?</strong></p>
<p><strong>TE -</strong> I think so. We can either be satisfied with augmenting the status quo or we can get to the root of the issue &#8211; that software doesn&#8217;t understand natural signals that make up unstructured data. We are in a place of diminishing returns with simple classifiers and <a title="ETL Architecture" href="http://en.wikipedia.org/wiki/Extract,_transform,_load" target="_blank">ETL (Extract, Transform, Load) architecture</a>. The more exciting alternative, however, is to go at the semantic and scale problems with the appropriate technologies and transform the enterprise to be entity-oriented.</p>
<p><strong>JB &#8211; Can you think of any example where someone tried to completely automate text mining?</strong></p>
<p><strong>TE -</strong> Not off the top of my head. I&#8217;m sure there have been some. But a lot of text mining is feeding either fancy search engines (such as faceted navigation and data-enriched topic clustering) or Business Intelligence frameworks.</p>
<p><strong>JB &#8211; What does the future look like regarding automation?</strong></p>
<p><strong>TE -</strong> It&#8217;s going to go from being reactive (search, research, and investigation) to being proactive (push, warnings, summaries). It&#8217;s going to go from two major silos inside the enterprise &#8211; the human-curated/structured data and the content management/unstructured data &#8211; to being one, unified entity-oriented data store. Once this is done, programs will constantly monitor this unified data store for areas of interest to users and start to screen most everything and prioritize it. Eventually, we&#8217;ll get some real next-generation automation out of this because there will be a class of actions that will be autonomously executed without requiring human intervention (such as determining the defense policy in a detected cyber-attack).</p>
<p><strong>JB &#8211; What other thoughts do you have about this?</strong></p>
<p><strong>TE -</strong> I think that as we weigh the risks or errors in additional automation, we need to be wary of irrational risk aversion. The poverty of attention that most people suffer from has very real consequences even if we don&#8217;t fully understand that right now. Solutions which give small, incremental gains are unlikely to get ahead of this increasingly detrimental phenomenon. Without something reading everything and getting smarter, we are simply rolling the dice on what we don&#8217;t have time to read or consider. That&#8217;s the other side of the coin of the incremental approach.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.digitalreasoning.com/2010/blog/data-analytics-should-we-build-iron-man-or-r2d2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

