Last week, business intelligence vendor SAS announced its acquisition of Teragram, a provider of multilingual natural language processing technologies and text analytics. On the face of it, SAS appears to be catching up with some competitors, but the reality is that it is a step closer toward a new kind of BI.
BI has historically evolved around the use of data in databases (structured data) as a foundation for actionable intelligence. By moving into text analytics, BI vendors are effectively signalling a new kind of BI, one that takes into account the significance of unstructured information.
While structured data gives us the hard facts of business, unstructured data could provide additional details to enrich those facts and help decision makers gain better insight. This new style of BI is better suited to the era of Web 2.0, in which every organisation gathers and generates masses of unstructured content on an ongoing basis.
We have already seen BI vendors adding search technology to their products, mostly by embedding tools from the likes of Google into their applications. Text analytics goes further, however; search in this context is typically carried out by users to locate information that they know to exist.
With search, the results can be many pages long, perhaps returned in clusters and/or ranked for relevancy. Text analytics can take the results of search and identify key information that the user was not aware of, or automatically process large numbers of documents to identify information that is pertinent to the users' requirements.
Other vendors, for example Business Objects and Fast Search, have already started converging structured and unstructured data for business intelligence, albeit from different perspectives. However, SAS's atypical acquisition will no doubt give the trend added momentum. Teragram is a successful text analytics company whose product is embedded in Fast Search, Ask.com, and Yahoo, amongst others.
Furthermore, Teragram has MyGADs, a collaborative solution that enables the sharing of information among groups of users and access to the data through multiple channels, including Internet browser, mobile devices, and instant messaging services. Stakeholders can work together on a Wiki and search for stored information using Direct Answers technology, allowing members to ask specific questions and get actual answers rather than links to them.
SAS already had text analytics prior to the acquisition. It also had search and support for 15 languages, but Teragram will extend those capabilities much further.
To start with, it adds support for more languages, making a total of about 30. Other advantages of Teragram include scalable algorithms, rules generation and machine learning capabilities, automatic categorisation, sampling and enforcing of metadata, and many dictionaries, which SAS no doubt will customise in support of its industry-specific solutions. Teragram also benefits from a multivendor and grid-enabled architecture that fits in well with SAS.
The bigger picture is, however, much more interesting. It is about taking BI to the next level, not only to realise the inherent value of unstructured data but to make BI more pervasive and collaborative using mobile and Web 2.0 technologies.
Although other BI vendors have already moved in this direction, Teragram's MyGADs provides a very advanced capability in this area. As far as the organisation goes, it is expected that Teragram will maintain its independence, much as DataFlux has done, and thereby allow the spirit of 'co-opetition' to continue with partners/competitors.
The big question is whether customers are ready to take advantage of this new BI. The answer is perhaps not quite yet, but good functionality can be the catalyst.