Vendors are touting enterprise search as the next big thing in business intelligence. But is the current level of integration between the two sets of technologies only skin deep?
Pointing enterprise search at BI is akin to trying to index unstructured and structured data for analysis. The integration between search and BI is twofold: to bring unstructured data (via search indexes) into traditional BI and data warehousing environments; and to provide search capabilities directly against multidimensional data, rather than using traditional BI tools to perform OLAP functions like slice-and-dice, pivot, and drill-down.
This is a rapidly developing area in BI and one that has piqued the interest of vendors from both camps. Why? Because proponents claim:
* It is a more intuitive way to analyse data compared to using traditional BI tools and graphical dashboards. Taking its cue from a proven Internet search model like Google that has made the data strewn across the World Wide Web accessible to consumers at the touch of a few keystrokes, search is touted as a more familiar and proven interface for performing BI query and analysis.
* It is an easy way to bring unstructured data into the traditional BI mix that typically deals only with structured (relational or multidimensional) data.
* It is a way of making BI more pervasive by 'democratising' access and use across corporate masses of knowledge workers.
BI search party
Most of the leading BI vendors now offer search capabilities as part of their platforms, though many have done so by integrating with popular enterprise search technologies like Google's OneBox search appliance.
* Information Builders: WebFocus Intelligent Search is built on iWay Software's Enterprise Index system released last year, which extends the Google Search Appliance.
* Cognos: Cognos Go Search, which is part of the Cognos 8 BI platform, is a pre-indexed search tool that is used in tandem with Google OneBox and IBM OmniFind to present information in its appropriate context.
* SAS Institute: Enterprise Intelligence Search marries its Enterprise Intelligence BI platform with Google OneBox and IBM's OmniFind search engine to perform contextually relevant searches.
* Oracle: Enterprise Search 10g is a portable, pure web client built on the text search capabilities that Oracle has offered as part and parcel of its core database products for well over a decade.
It is not just BI vendors that are getting in on the action. Traditional search vendors are also nudging their software products into a BI and data warehousing context.
There are a couple of smaller niche players in the market developing structured search tools that work against OLAP data:
* CopperEye: Its Greenwich software effectively puts a search front-end onto structured flat file data that is automatically indexed as it is loaded into the system. The software is aimed largely at applications that require large amounts of data to be archived but at the same time be searchable periodically (typically telecoms call record data and e-mail).
* Ardentia Search: Develops a structured search tool that runs against OLAP cubes and Lotus Domino databases. The software works by indexing the cube and then searching and filtering against that cube and comes with segmentation, charting, and metrics tools.
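The "index the cube, then search and filter against it" idea can be illustrated with a minimal sketch. It is not any vendor's implementation: the fact rows, the dimension names and the `build_index` and `search` helpers are all hypothetical, and a real cube would hold far more data and metadata.

```python
from collections import defaultdict

# Hypothetical fact rows standing in for the cells of a small OLAP cube.
FACTS = [
    {"region": "North", "product": "Widgets", "year": "2006", "sales": 120},
    {"region": "North", "product": "Gadgets", "year": "2006", "sales": 80},
    {"region": "South", "product": "Widgets", "year": "2006", "sales": 200},
    {"region": "South", "product": "Widgets", "year": "2007", "sales": 150},
]
DIMENSIONS = ("region", "product", "year")


def build_index(facts):
    """Map each lowercased dimension member to the ids of the rows that carry it."""
    index = defaultdict(set)
    for row_id, row in enumerate(facts):
        for dim in DIMENSIONS:
            index[row[dim].lower()].add(row_id)
    return index


def search(index, facts, query):
    """Return the rows matching every recognised query term, plus their total sales."""
    matching = set(range(len(facts)))
    for term in query.lower().split():
        if term in index:  # terms that are not dimension members are ignored
            matching &= index[term]
    rows = [facts[i] for i in sorted(matching)]
    return rows, sum(row["sales"] for row in rows)


INDEX = build_index(FACTS)
rows, total = search(INDEX, FACTS, "widgets south")
print(len(rows), total)  # prints: 2 350
```

Typing "widgets south" here does the job of a slice on product and region followed by an aggregation, which is the sense in which a search box can stand in for slice-and-dice. A query with no recognised terms returns every row, which a real tool would probably treat differently.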
Meanwhile, larger enterprise search vendors are positioning their technologies to overcome the limitations of traditional data warehousing.
While acknowledging that recent moves by BI vendors to integrate their tools with OneBox and OmniFind make it easier for users to find relevant BI data, they believe this does not go far enough.
What is needed, they argue, is an ability for users to directly search and navigate BI data in an ad-hoc manner, then display relevant, usable information to users without the need for predefined report creation.
Enterprise search vendors like Fast Search and Transfer (FAST) and Autonomy are of the opinion that unstructured data naturally becomes more 'structured' as it is being pulled into BI and data warehousing environments. Both are now starting to apply techniques like indexing, entity extraction, dynamic association, profiling, and categorisation directly to data warehousing environments to improve the BI user experience.
* FAST: Its Adaptive Information Warehouse layers techniques like dynamic object association, linguistic-based data cleansing algorithms, data profiling and dashboard capabilities on top of an indexing engine that extracts entities and unique identifiers from unstructured data sources and puts them into a searchable master index. AIW is driven off a new statistical analysis portal product called Radar that integrates reporting technology acquired from Corporate Radar into the FAST search engine. The aim is to make search a core part of conventional data warehousing infrastructure like Oracle and IBM.
* Autonomy: Its Meaning Analytic Warehouse (which is now part of Autonomy's IDOL unstructured data management technology that analyses historical data and extracts meaning from it) searches, extracts concepts, transforms, and indexes unstructured data including documents, video, voice, e-mail, and other file types in a data warehouse in preparation for bulk BI analysis. Data is indexed according to its 'meaning', which is derived from IDOL's understanding of the concepts, context, and patterns contained in the information.
The aim is to then make the information (indexes) readily available for integration in third-party BI tools.
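To make "extracting entities and unique identifiers into a searchable master index" concrete, here is a deliberately simple sketch. It assumes regular expressions are an acceptable stand-in for the linguistic extraction these products perform. The patterns, the `CUST-` identifier format, the sample documents and the function names are all invented for illustration.

```python
import re
from collections import defaultdict

# Hypothetical patterns; a real engine would use far richer linguistic models.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "customer_id": re.compile(r"\bCUST-\d{4}\b"),
}

# Hypothetical unstructured sources keyed by a document id.
DOCUMENTS = {
    "call_note_1": "Spoke to jane@example.com about account CUST-0042.",
    "email_7": "CUST-0042 asked for a refund; cc bob@example.com.",
    "wiki_page": "Escalation contact: jane@example.com",
}


def build_master_index(documents):
    """Map each (entity type, value) pair to the documents that mention it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for kind, pattern in PATTERNS.items():
            for value in pattern.findall(text):
                index[(kind, value.lower())].add(doc_id)
    return index


def lookup(index, kind, value):
    """Return the sorted ids of documents that mention the given entity."""
    return sorted(index.get((kind, value.lower()), set()))


MASTER = build_master_index(DOCUMENTS)
print(lookup(MASTER, "customer_id", "CUST-0042"))  # ['call_note_1', 'email_7']
print(lookup(MASTER, "email", "jane@example.com"))  # ['call_note_1', 'wiki_page']
```

Because the index is keyed on an identifier rather than on a document, it starts to resemble a dimension that could be joined to structured records about the same customer, which is roughly the bridge into BI that the search vendors describe.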
These two approaches are interesting for a couple of reasons.
First, they seem to aim to replace the classic star schema data warehouse with a search index for storing both structured and unstructured information. Second, in FAST's case, it applies linguistic-based techniques like contextually aware search, fuzzy and phonetic matching, and rules to data cleansing.
Both are very different and potentially disruptive approaches to traditional data warehousing, which has relied on extract, transform and load (ETL) and data quality tools.
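As a rough illustration of fuzzy and phonetic matching applied to data cleansing, the sketch below maps misspelt names onto a canonical list. It is not FAST's algorithm. It assumes that the standard library's `difflib` similarity ratio and the classic Soundex code are reasonable stand-ins, and the canonical list, threshold and function names are hypothetical.

```python
from difflib import SequenceMatcher

# Classic Soundex letter groups.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """Return the four-character Soundex code for a name ('' if it has no letters)."""
    name = "".join(ch for ch in name.lower() if ch.isalpha())
    if not name:
        return ""
    result = name[0].upper()
    prev = _CODES.get(name[0], "")
    for ch in name[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            result += digit
        if ch not in "hw":  # h and w do not separate letters with the same code
            prev = digit
    return (result + "000")[:4]


# Hypothetical reference list of clean values.
CANONICAL = ["Smith", "Robert", "Johnson"]


def cleanse(value, canonical=CANONICAL, threshold=0.8):
    """Return the canonical spelling that best matches value, or None."""
    best, best_score = None, 0.0
    for candidate in canonical:
        score = SequenceMatcher(None, value.lower(), candidate.lower()).ratio()
        if soundex(value) == soundex(candidate):
            score = max(score, threshold)  # a phonetic match counts as good enough
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


print(cleanse("Smyth"))   # Smith
print(cleanse("Jonson"))  # Johnson
print(cleanse("Zebra"))   # None
```

Treating a phonetic match as sufficient is a design choice made for this example. A production data quality rule would more likely combine several signals and send borderline cases for review.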
The move to tighten up links between these two technologies is understandable. After all, if Internet search engines like Google can trawl through the entire web in seconds to access data then why can the same engine not be pointed at relatively smaller BI data repositories?
The growing use of BI search and text analytics is also part of a larger trend toward leveraging unstructured data in BI and data warehousing, which have previously relied almost exclusively on structured data. Unstructured and semi-structured (XML) data accounts for around 80% of enterprise data and continues to grow fast.
There is value to be had from that data in a decision support context such as BI or performance management. This requires BI tools and technologies to be able to reach into and intelligently index and catalogue information from a range of unstructured data sources including voice recognition, wikis, RSS feeds, instant messaging transcripts, and document management systems so that they can be massaged into a form that is usable in data warehousing and BI analysis processes.
BI vendors are certainly banging the search drum loudly. But some might argue that the integration provided by vendors from both camps runs only skin deep. By adding a search box on top of their BI tools, vendors like Information Builders and Cognos are simply making it easier for users to find reports relevant to analytic business processes. This is great when users do not know what they are looking for or cannot find what exists.
BI search might also be a stopgap alternative to a risky and expensive BI consolidation project, linking together data and reports strewn across multiple BI platforms. Again, that is particularly useful when the same business entities and processes are represented in reports from multiple BI platforms, which is often cited as a barrier to gaining a complete 360° analytic view of business performance. However, most of the BI search functions provided are vendor-specific, and thereby restricted to a particular set of data. Companies will need to work through some complex systems integration to provide cross-platform insights.
Meanwhile, search vendors like FAST and Autonomy are tackling integration from a different angle, by flipping BI on top of their respective search platforms to integrate and orchestrate all of the unstructured information for BI. While the connectivity issues between BI tools and unstructured sources might be getting ironed out, they assert that what BI search really needs is a mature data management infrastructure beneath it. For that reason the best search and text analytic deployments will be built on top of mature, reliable data warehouses.
But searching through structured data sources like OLAP cubes and data warehouses poses its own set of challenges for traditional search engines. The sticking point is the haziness of corporate metadata wrapped around the data. Comprehending and resolving inconsistent metadata is a problem that search technologies struggle with. Having an index on unstructured information allows it to be searched, but a deeper understanding of the metadata and business logic of that information is needed to make it truly useful in a BI context.
Regardless of all the vendor hype, BI search is far from being a universal turnkey system or a mainstream practice in BI and data warehousing today. Despite exhaustive connectivity to a host of unstructured and semi-structured data sources, many BI systems and data warehouses today still struggle with just structured data.
Throwing a mountain of unstructured data at them will not remove these shortcomings. In fact it might even make the problem worse. Only time will tell if these savvy search techniques can deliver on their promise of faster and more flexible data warehouses. But the promise is encouraging vendors from both sides of the fence to think differently about leveraging search indexes in BI and data warehousing environments in the future.
Similarly, the jury is out on whether search can deliver on the promise of ubiquitous BI use across the enterprise. Again the theory sounds promising, as a paradigm shift in the way corporate users generate and consume BI information.
After all, enterprise search is also based on a proven consumer-driven Internet search model. But that makes BI search more evolutionary than revolutionary.
Do not expect search to usurp traditional BI and data warehousing technologies just yet. But do expect it to be offered as a useful add-on to BI and data warehousing infrastructures that bring unstructured data into the BI fold and analytic mix.