<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Digital Reasoning &#187; natural language processing</title>
	<atom:link href="http://www.digitalreasoning.com/tag/natural-language-processing/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.digitalreasoning.com</link>
	<description>Automated Understanding for Big Data</description>
	<lastBuildDate>Wed, 25 Jan 2012 13:31:46 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
		<item>
		<title>The 451 Group Reports on Digital Reasoning</title>
		<link>http://www.digitalreasoning.com/2011/news/press-release/the-451-group-reports-on-digital-reasoning/</link>
		<comments>http://www.digitalreasoning.com/2011/news/press-release/the-451-group-reports-on-digital-reasoning/#comments</comments>
		<pubDate>Wed, 20 Jul 2011 12:41:10 +0000</pubDate>
		<dc:creator>Dave Danielson</dc:creator>
				<category><![CDATA[Press Release]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Entity Oriented Analytics]]></category>
		<category><![CDATA[natural language processing]]></category>
		<category><![CDATA[Nick Patience]]></category>
		<category><![CDATA[Synthesys]]></category>
		<category><![CDATA[the451]]></category>

		<guid isPermaLink="false">http://www.digitalreasoning.com/?p=3145</guid>
		<description><![CDATA[The 451 Group Recognizes Digital Reasoning’s Deep Expertise and Proven Success in Cloud Scale Data Analytics Nashville, TN – July 20, 2011 – Digital Reasoning™, the leader in unstructured data analytics at scale, today announced a recently published Impact Report by independent analyst firm The 451 Group titled, “Digital Reasoning Positions Military-tested Text Analysis Tools ]]></description>
			<content:encoded><![CDATA[<p><strong>The 451 Group Recognizes Digital Reasoning’s Deep Expertise and Proven Success in Cloud Scale Data Analytics</strong></p>
<p>Nashville, TN – July 20, 2011 – <a href="http://www.digitalreasoning.com/">Digital Reasoning™</a>, the leader in unstructured data analytics at scale, today announced that a recently published Impact Report by independent analyst firm The 451 Group, titled “<a href="http://www.digitalreasoning.com/wp-content/uploads/2011/07/451_Digital_Reasoning_Impact%20Report.pdf">Digital Reasoning Positions Military-tested Text Analysis Tools for Commercial Market,</a>” is now available for download on the company’s website.</p>
<p>In the report, Nick Patience, Research Director, Information Management at The 451 Group, states, “Digital Reasoning’s patented text analysis technology, deep level of experience and apparent success in the proving grounds of US government intelligence show that Digital Reasoning has a product to be reckoned with.”</p>
<p>“We are pleased to be recognized for our expertise in <a href="http://www.digitalreasoning.com/">big data</a> analytics,” said Tim Estes, CEO at Digital Reasoning. “Our proven and patented software, Synthesys, uncovers actionable intelligence from web content, email, social media, reports, and other unstructured data.”</p>
<p>Digital Reasoning recently announced Chinese language support for big data analytics in its flagship product <a href="http://www.digitalreasoning.com/products/" class="broken_link">Synthesys®</a>. Synthesys is an entity oriented cloud-scale analytic solution that enables enterprises and government agencies to automatically make sense of complex data. Built to address the most complicated data analytics challenges, Synthesys excels at extracting, resolving and linking entities and concepts to provide context to the newly discovered information.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.digitalreasoning.com/2011/news/press-release/the-451-group-reports-on-digital-reasoning/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Digital Reasoning Introduces Chinese Language Support for Big Data Analytics</title>
		<link>http://www.digitalreasoning.com/2011/news/press-release/digital-reasoning-introduces-chinese-language-support-for-big-data-analytics/</link>
		<comments>http://www.digitalreasoning.com/2011/news/press-release/digital-reasoning-introduces-chinese-language-support-for-big-data-analytics/#comments</comments>
		<pubDate>Tue, 07 Jun 2011 13:24:51 +0000</pubDate>
		<dc:creator>Dave Danielson</dc:creator>
				<category><![CDATA[Press Release]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Entity Oriented Analytics]]></category>
		<category><![CDATA[natural language processing]]></category>
		<category><![CDATA[Rob Metcalf]]></category>
		<category><![CDATA[Synthesys]]></category>
		<category><![CDATA[unstructured data]]></category>

		<guid isPermaLink="false">http://www.digitalreasoning.com/?p=3127</guid>
		<description><![CDATA[Synthesys Enables Cloud-Scale Entity Oriented Analytics in Chinese Arlington, VA and Nashville, TN – June 7, 2011 –Digital Reasoning™, the leader in unstructured data analytics at scale, today announced Chinese language support for its flagship product Synthesys®. Synthesys can now analyze the unstructured data from a variety of sources in both English and Chinese to ]]></description>
			<content:encoded><![CDATA[<p><strong><em>Synthesys Enables Cloud-Scale Entity Oriented Analytics in Chinese</em></strong></p>
<p><strong>Arlington, VA and Nashville, TN</strong> – June 7, 2011 – <a href="http://www.digitalreasoning.com/">Digital Reasoning</a><sup>™</sup>, the leader in unstructured data analytics at scale, today announced Chinese language support for its flagship product <a href="http://www.digitalreasoning.com/products/" class="broken_link">Synthesys</a>®. Synthesys can now analyze the unstructured data from a variety of sources in both English and Chinese to uncover potential threats, fraud, and political unrest. With this process automated, intelligence analysts can gain actionable intelligence in context quickly and without translation.</p>
<p>While English is still the most widely used language on the web, a recent report from <a href="http://thenextweb.com/asia/2010/12/21/chinese-the-new-dominant-language-of-the-internet-infographic/">The Next Web</a> suggests that “[i]t could be less than five years before Chinese becomes the dominant language on the Internet.”</p>
<p>“This is a significant milestone for our company,” said Rob Metcalf, President and COO of Digital Reasoning. “Whether for the public sector, financial services, health care or other enterprise applications, the next generation of Big Data solutions for unstructured data will need to natively support the world’s most widely spoken languages.”</p>
]]></content:encoded>
			<wfw:commentRss>http://www.digitalreasoning.com/2011/news/press-release/digital-reasoning-introduces-chinese-language-support-for-big-data-analytics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Understanding Big Data: An Interview with Research Scientist James Gardner</title>
		<link>http://www.digitalreasoning.com/2011/blog/understanding-big-data-an-interview-with-research-scientist-james-gardner/</link>
		<comments>http://www.digitalreasoning.com/2011/blog/understanding-big-data-an-interview-with-research-scientist-james-gardner/#comments</comments>
		<pubDate>Tue, 06 Dec 2011 10:16:05 +0000</pubDate>
		<dc:creator>Jason Beck</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Alan Turing]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Automated Understanding]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Daniel Kahneman]]></category>
		<category><![CDATA[Emory University]]></category>
		<category><![CDATA[entity extraction]]></category>
		<category><![CDATA[Entity Oriented Analytics]]></category>
		<category><![CDATA[Entity Resolution]]></category>
		<category><![CDATA[James Gardner]]></category>
		<category><![CDATA[Named Entity Recognition]]></category>
		<category><![CDATA[natural language processing]]></category>
		<category><![CDATA[Siri]]></category>
		<category><![CDATA[Synthesys]]></category>

		<guid isPermaLink="false">http://www.digitalreasoning.com/?p=3964</guid>
		<description><![CDATA[I recently sat down with James Gardner to discuss Big Data and the promise of Automated Understanding. James is a senior research scientist, and a doctoral candidate at Emory University, who works principally in the area of natural language processing – from tokenization to named entity extraction, entity resolution, fact extraction and relationship extraction. Since ]]></description>
			<content:encoded><![CDATA[<p><img class="alignnone size-thumbnail wp-image-3967" title="James Gardner" src="http://www.digitalreasoning.com/wp-content/uploads/2011/12/James-Gardner-150x150.jpg" alt="James Gardner, Senior Research Scientist" width="150" height="150" /></p>
<p>I recently sat down with James Gardner to discuss Big Data and the promise of Automated Understanding. James is a senior research scientist, and a doctoral candidate at Emory University, who works principally in the area of natural language processing – from tokenization to named entity extraction, entity resolution, fact extraction and relationship extraction.</p>
<p>Since his undergraduate studies, James has been interested in Artificial Intelligence. That interest extended to natural language processing and machine learning, which he saw could be applied to increasing privacy in medical records.</p>
<p>The following is our interview:</p>
<p>&nbsp;</p>
<p><strong>-What is Big Data and what are the inherent problems?</strong></p>
<p>Data is being generated with more velocity and variability now due to social networking, blogging, webpages, etc.</p>
<p>This velocity and, even more so, the variability (or heterogeneity) define big data. Think terabytes and petabytes (eventually exabytes and zettabytes) of unstructured or semi-structured information.</p>
<p>Big data increases storage cost, computation cost, and variety cost. Of these, the variety (or variability) cost is the most expensive, because of the large amount of human effort required to make the connections in the data or the development time necessary to integrate the systems.</p>
<p>I think the benefits of Big Data significantly outweigh the cost. It may be difficult for engineers to make response times fast for complex analytics over very large datasets, but those of us on the machine learning side really welcome Big Data, because more data in many cases allows for better predictions, especially in unsupervised or semi-supervised machine learning methods.</p>
<p>&nbsp;</p>
<p><strong>-What is Automated Understanding and how can it resolve the inherent problems within Big Data?</strong></p>
<p>The velocity and variability of this data make it unreadable, even by all humans. I think that most people read the text that they write, but they definitely aren&#8217;t reading all of their social networking or Internet traffic.</p>
<p>That last statement is alluding more toward the need to handle textual and “graphical” relationship content, but there is also a variety of video, audio, and temporal data associated with these events, in either metadata or even in the content itself. Computer-aided summarization of these complex relationships and discoveries of entities is crucial for business and intelligence analysts to be able to make decisions quickly without having to read every document or data signal necessary to make a decision.</p>
<p>Human-aided summarization of data, where the human is helping the computer, is the future of big data.</p>
<p>Automated Understanding&#8217;s goal is to minimize the amount of human effort necessary to summarize these entity relationships.</p>
<p>&nbsp;</p>
<p><strong>-Is Entity Oriented Analytics the best approach to understanding unstructured data?</strong></p>
<p>Humans think in terms of entities, and therefore computers need to present information in those terms, especially when dealing with big data.</p>
<p>When there are millions of mentions of a single individual in a dataset, you definitely don&#8217;t want to have to read every one of the documents, and you might not even care how many documents the entity occurs in if you can see relationships and other facts associated with the given entity.</p>
<p>This brings us to the point of how we even define entities. Entities, in the most general philosophical sense, are sets of subsets that are invariant over space and time. Notice the recursive definition. For most practical purposes we can define entities as things that exist in the real world and have certain properties that make them unique.</p>
<p>&nbsp;</p>
<p><strong>-What are some of the toughest problems we have to resolve to better make sense of unstructured data?</strong></p>
<p>Named Entity Recognition is considered by many to be a nearly solved problem. This is the case in many situations, but is not necessarily true for all languages or for messy data. Synthesys has made great strides in accomplishing this goal by using the latest and greatest techniques from the academic literature.</p>
<p>Even in those cases where the automated processes are not able to make a decision, we give the users the ability to train Synthesys.</p>
<p>This training allows Synthesys to continually improve and adapt to the users&#8217; needs. This training may be time-consuming, but it can get the job done much more quickly. We can teach Synthesys a new language very quickly. It takes years to teach a human a new language.</p>
<p>Entity resolution (including disambiguation and clustering), association generation, and relationship generation are the hardest problems with the most promise for really affecting how users interact with and make use of big data in the future.</p>
<p>&nbsp;</p>
<p><strong>-What does the recent release of Siri mean to the AI community?</strong></p>
<p>Siri is a pretty cool feature recently introduced for iOS devices. Speech recognition technology has finally evolved to the point that handheld devices can process audio and determine which application or task the user wishes to accomplish. This technology is actually not that new and we use very similar technologies and algorithms for dealing with various aspects of our text analytics, but Apple has really put together a neat combination of existing technologies (as they have always been very good at) that really allows users to more quickly, or at least with less effort, interact with their handheld devices.</p>
<p>I think Siri is like the Chatterbot systems from the 1990s with the added ability to interact with other applications on the device. This is likely going to turn out to be a great success for Apple, but I&#8217;m not really sure as to how much this will really affect the AI community. I think it&#8217;s more likely to persuade developers that it&#8217;s probably worthwhile to accept natural language commands to perform tasks rather than having users follow a strict API (or language) to communicate with their software. This, I think, is extremely important for human-computer interaction.</p>
<p>I think that developers should work to not only make it easy for humans to interact with the systems, but also create frameworks for systems to learn from one another through probabilistic interactions. Maybe call this computer-computer interaction.</p>
<p>We have been viewing computers in a binary (right, wrong, 0-1) light since the 1950s. This paradigm leaves us stuck with having to make a decision, and if the computer makes the wrong decision we automatically jump to the conclusion that computers are unable to reason as humans do. But think of the mistakes that humans make all the way through life. The whole “touch a hot stove and get burned” example. György Buzsáki in <em>Rhythms of the Brain</em> explains that this is the way learning happens. I interact with the environment and I get some sort of response. Sometimes those random interactions (or predictions) are accepted by the environment and are considered correct or valid interactions, while others aren&#8217;t and you end up with a first-degree burn. These unsuccessful predictions or mistakes are necessary for the learning process.</p>
<p>Alan Turing, way back in 1947, made the point that computers can never be both infallible and intelligent. Machine Learning research is now using this idea in full force. Many learning and inference algorithms are based on the idea that if the model makes a mistake then update the model, otherwise the model is already making the correct prediction so why change it. Very much like Occam’s razor combined with work only as long or as much as you need to.</p>
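<p>The mistake-driven idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how Synthesys works: the perceptron-style update rule, the function names <em>predict</em> and <em>train_mistake_driven</em>, and the toy data in <em>EXAMPLES</em> are all hypothetical choices made for the example.</p>

```python
from typing import List, Sequence, Tuple

# Hypothetical toy data: 2-D points labeled +1 or -1 (made up for illustration).
EXAMPLES = [
    ([2.0, 1.0], 1),
    ([1.0, 3.0], 1),
    ([-1.0, -2.0], -1),
    ([-2.0, -1.0], -1),
]


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> int:
    """Return +1 or -1 depending on the sign of the linear score."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


def train_mistake_driven(examples, epochs: int = 10) -> Tuple[List[float], float, int]:
    """Perceptron-style training: change the model only when it is wrong."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    total_mistakes = 0
    for _ in range(epochs):
        mistakes_this_pass = 0
        for x, label in examples:
            if predict(weights, bias, x) != label:
                # Mistake: nudge the model toward the correct answer.
                weights = [w + label * xi for w, xi in zip(weights, x)]
                bias += label
                mistakes_this_pass += 1
            # Correct prediction: leave the model untouched.
        total_mistakes += mistakes_this_pass
        if mistakes_this_pass == 0:
            break  # A full pass with no mistakes, so stop working.
    return weights, bias, total_mistakes


if __name__ == "__main__":
    w, b, m = train_mistake_driven(EXAMPLES)
    print("weights:", w, "bias:", b, "mistakes:", m)
```

<p>The loop stops as soon as a full pass over the data produces no mistakes, which is the “work only as long or as much as you need to” part of the idea. On data that a straight line cannot separate, this sketch would simply run out of passes rather than converge.</p>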
<p>I recently read an article written by the Nobel Prize-winning psychologist Daniel Kahneman, in which he describes his discovery of the “illusion of validity.” In the exercise he describes, eight soldiers try to get a log and themselves over a wall without ever touching it.</p>
<p>Kahneman and his fellow evaluators watched the soldiers and tried to infer who was the best candidate to be a leader. It turns out that the teachers in the leadership school told them that they were failing to make good predictions.</p>
<p>Kahneman explained that, in spite of knowing that their predictions were wrong, they continued to use the same metrics to make the predictions. This allowed him to conclude that WYSIATI, “What you see is all there is.”</p>
<p>I don&#8217;t believe that the test was flawed; this was more likely an issue with understanding the goal (as in the great business novel by Goldratt).</p>
<p>It&#8217;s very likely that if they had measured and weighted the appropriate observable components of the exercise relative to how well the soldiers performed in officer school rather than how well they performed the log-task, the psychologists would have made much better predictions.</p>
<p>This is one example where machine learning could be used to learn how to make accurate leadership predictions, considering the presence of both the prior data (log-task features) and the ground truth from the teachers in officer school. I think it&#8217;s important that we always consider the goal of computing.  We developed computers to help us accomplish tasks more quickly.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.digitalreasoning.com/2011/blog/understanding-big-data-an-interview-with-research-scientist-james-gardner/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Entity-Centric Advanced Analytics using Synthesys</title>
		<link>http://www.digitalreasoning.com/2011/blog/entity-centric-advanced-analytics-using-synthesys/</link>
		<comments>http://www.digitalreasoning.com/2011/blog/entity-centric-advanced-analytics-using-synthesys/#comments</comments>
		<pubDate>Fri, 03 Jun 2011 19:58:45 +0000</pubDate>
		<dc:creator>Dave Danielson</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Entity Oriented Analytics]]></category>
		<category><![CDATA[IQT]]></category>
		<category><![CDATA[natural language processing]]></category>
		<category><![CDATA[noETL]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[Synthesys]]></category>
		<category><![CDATA[Tim Estes]]></category>

		<guid isPermaLink="false">http://www.digitalreasoning.com/?p=3091</guid>
		<description><![CDATA[Tim Estes, CEO and founder of Digital Reasoning, was a featured author in a recent IQT Quarterly publication. In this article, Tim helps the reader understand the meaning of &#8220;Entity Oriented Analytics&#8221; including how the mission has evolved over the past decade to bring us to this place and why entity-orientation is so necessary in ]]></description>
			<content:encoded><![CDATA[<p><a href="wp-content/uploads/2011/06/IQT%20Quarterly_Spring%202011_Tech%20Corner.pdf" class="broken_link"><img class="alignright size-full wp-image-3094" title="IQT Tech Corner2" src="http://www.digitalreasoning.com/wp-content/uploads/2011/06/IQT-Tech-Corner2.jpg" alt="" width="240" height="140" /></a></p>
<p><strong>Tim Estes, CEO and founder of Digital Reasoning, was a featured author in a recent IQT Quarterly publication.</strong> In this article, Tim helps the reader understand the meaning of &#8220;Entity Oriented Analytics,&#8221; including how the mission has evolved over the past decade to bring us to this place and why entity-orientation is so necessary in today&#8217;s big data analytics solution architecture.</p>
<p>&nbsp;</p>
<p>In this article, Tim references a decade of lessons learned in the intelligence community addressing topics including:</p>
<ul>
<li>How to deal with challenges when you don&#8217;t know what you need in advance</li>
<li>Why static ontology can be limiting</li>
<li>Challenges relational databases face with entity-centric missions</li>
<li>Why analytics tools can be overwhelmed by these new data sets</li>
<li>Why there is no &#8220;uber algorithm&#8221; solution to all needs</li>
</ul>
<p>In summary, Tim presents the case for an architecture that is both revolutionary and enduring.</p>
<h2>View the full article<strong> <a title="Tim Estes IQT Quarterly Article" href="http://www.digitalreasoning.com/wp-content/uploads/2011/06/IQT%20Quarterly_Spring%202011_Tech%20Corner.pdf" target="_blank">by clicking here</a></strong></h2>
]]></content:encoded>
			<wfw:commentRss>http://www.digitalreasoning.com/2011/blog/entity-centric-advanced-analytics-using-synthesys/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Security through Obscurity</title>
		<link>http://www.digitalreasoning.com/2010/blog/security-through-obscurity/</link>
		<comments>http://www.digitalreasoning.com/2010/blog/security-through-obscurity/#comments</comments>
		<pubDate>Wed, 26 May 2010 22:58:04 +0000</pubDate>
		<dc:creator>Dave Danielson</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[data analytics]]></category>
		<category><![CDATA[natural language processing]]></category>
		<category><![CDATA[Synthesys]]></category>

		<guid isPermaLink="false">http://www.digitalreasoning.com/?p=1277</guid>
		<description><![CDATA[“Security through Obscurity” is a term often used to refer to security provided by keeping details of a system secret, or by making a system so obtuse that it is difficult to determine how it works, thus hiding its vulnerabilities. Unfortunately, I believe that there is also an application of this term to the need ]]></description>
			<content:encoded><![CDATA[<p>“Security through Obscurity” is a term often used to refer to security provided by keeping details of a system secret, or by making a system so obtuse that it is difficult to determine how it works, thus hiding its vulnerabilities. Unfortunately, I believe that there is also an application of this term to the need of identifying and tracking the important information hidden in the mountains of digital data generated each day.</p>
<p>While technology has provided several good paradigms for dealing with structured data (i.e. data that is structured in such a way to be easily decomposed into pre-defined fields), it has not kept pace with unstructured data, such as emails, blogs, web site content, etc. Thus, critical information is often kept “secret” through the obscurity of the sheer volume of data one must process, often manually, to reveal this information.</p>
<p>In response to this challenge, Digital Reasoning Systems, Inc has developed a comprehensive set of analytical tools packaged into product called Synthesys<sup>®</sup> that essentially decomposes unstructured text into meaningful information easily understood and manipulated by a user.</p>
<p>This technology is based on the premise that there is order inherent in all languages that can be discovered and mathematically modeled. This has led to the development of our advanced data analytics and knowledge abstraction for unstructured data, based on a distinctive, patented mathematical approach to natural language processing.</p>
<p>For a better understanding of Synthesys<sup>®</sup> and its capabilities, a downloadable white paper (Synthesys – Technology Overview) providing a high-level overview can be found <a title="Synthesys White Paper" href="http://www.digitalreasoning.com/synthesys-white-paper/" target="_blank">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.digitalreasoning.com/2010/blog/security-through-obscurity/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

