October 18, 2011
This is a very smart move by Oracle. Until the Siebel and Hyperion acquisitions, Oracle was not a leader in the BI and analytics space. Those acquisitions put them squarely in the top three together with IBM and SAP. However, until this morning, Oracle played mostly in the traditional BI space: reporting, querying, and analytics based on relational databases. But these mainstream relational databases are an awkward fit for BI. You can use them, but doing so requires lots of tuning, customization, and constant optimization, all of which is difficult, time-consuming, and costly. The root cause is that row-based RDBMSes like IBM DB2, Microsoft SQL Server, Oracle, and Sybase ASE were originally designed and architected for transaction processing, not reporting and analysis. To tune such an RDBMS for BI usage, specifically data warehousing, architects usually:
- Denormalize data models to optimize reporting and analysis.
- Build indexes to optimize queries.
- Build aggregate tables to optimize summary queries.
- Build OLAP cubes to further optimize analytic queries.
Unfortunately, there’s one basic problem with these approaches: It’s impossible to build denormalized data models, indexes, and aggregates for every possible query that users will execute during the lifetime of a database. So BI pros must pick their battles and optimize the RDBMS based on current or near-term expected usage. But in today’s fast-paced business environment, “near-term” may mean days or even hours, requiring BI and data warehousing (DW) pros to spend a significant amount of their time doing nothing more than constantly optimizing and reoptimizing these databases.
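To make the tuning problem concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the row-based RDBMSes named above. The `sales` table, its columns, and the index and aggregate names are hypothetical, chosen only for illustration. It shows that an index and an aggregate table help the queries they were built for, while a query nobody anticipated still falls back to a full scan.

```python
import sqlite3

# In-memory database with a hypothetical sales table (illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, product TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('east', 'widget', 10.0),
        ('east', 'gadget', 5.0),
        ('west', 'widget', 7.5);
    -- Index built for queries that filter on region.
    CREATE INDEX idx_sales_region ON sales (region);
    -- Aggregate table built for per-region summary queries.
    CREATE TABLE sales_by_region AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;
""")


def uses_region_index(sql):
    """Return True if SQLite's query plan for `sql` mentions the region index."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each plan row is the human-readable detail string.
    return any("idx_sales_region" in row[-1] for row in plan)


# The anticipated query benefits from the index.
anticipated = uses_region_index("SELECT amount FROM sales WHERE region = 'east'")

# A query on a column nobody indexed gets no help from it.
unanticipated = uses_region_index("SELECT amount FROM sales WHERE product = 'widget'")

# The aggregate table answers the summary it was built for, and nothing else.
totals = dict(conn.execute("SELECT region, total FROM sales_by_region"))
```

Every new question of the second kind means another index or aggregate to design, build, and maintain, which is the treadmill described above.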
Additionally, these RDBMSes offer minimal support for:
- Unstructured content.
- Diverse data structures with ragged, sparse, and unbalanced hierarchies.
Common RDBMS agility challenges involve the management of unstructured content. The inverted index BI DBMS, which is what Endeca is, questions the assumption that you must start with a database and then worry about tuning it by building indexes on top of it. Instead, this approach builds one big index, but rather than just pointing to data sources, as traditional search engines like Google or Yahoo do, it embeds the data in the index itself. Typical use cases for inverted index DBMSes include:
- Managing a variety of unharmonized data types and data sources without resorting to complex ETL in a “load first, analyze next and then address data harmonization” approach.
- Combining SQL queries and keyword searches.
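The idea of an index that embeds the data itself can be sketched in a few lines of Python. This is a toy illustration under my own assumptions, not Endeca's actual design: the class name, the `field:token` posting convention, and the sample records are all hypothetical. Records with different fields are loaded as-is, with no upfront schema, and a single lookup combines a keyword search with a SQL-style field filter.

```python
from collections import defaultdict


class EmbeddedInvertedIndex:
    """Toy inverted index that stores the records themselves, not pointers."""

    def __init__(self):
        self.records = {}                 # record id -> the embedded record
        self.postings = defaultdict(set)  # term -> ids of records containing it

    def add(self, record_id, record):
        # "Load first": any dict is accepted, whatever fields it has.
        self.records[record_id] = record
        for field, value in record.items():
            for token in str(value).lower().split():
                self.postings[token].add(record_id)              # keyword term
                self.postings[f"{field}:{token}"].add(record_id)  # field-scoped term

    def search(self, keywords="", **filters):
        # Keywords match any field; filters match only the named field.
        terms = keywords.lower().split()
        terms += [
            f"{field}:{token}"
            for field, value in filters.items()
            for token in str(value).lower().split()
        ]
        if not terms:
            return []
        ids = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return [self.records[i] for i in sorted(ids)]


index = EmbeddedInvertedIndex()
index.add(1, {"type": "product", "name": "red widget", "price": 10})
index.add(2, {"type": "review", "text": "the widget broke", "stars": 2})
index.add(3, {"type": "product", "name": "blue gadget"})

keyword_hits = index.search("widget")                   # products and reviews alike
combined_hits = index.search("widget", type="product")  # keyword plus field filter
```

Note that the product and review records share no common structure beyond the `type` field, yet both are searchable immediately; harmonizing them can wait until after the first analysis.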
With this acquisition, Oracle has leapfrogged all other leading BI vendors in its capability to integrate unharmonized data sources and perform search-based BI. Here’s where the other vendors stand:
- A combination of SAP HANA and BusinessObjects Explorer can be positioned as a competitor to Endeca — but it’s a stretch.
- Microsoft FAST Search is a direct competitor, but Microsoft so far has not done much to position FAST Search as a BI tool.
- In-memory BI vendors like QlikView and Tibco Spotfire recently introduced search-like navigation in their tools, but their in-memory-only architecture is not scalable beyond a few hundred gigabytes.
So this acquisition now really differentiates Oracle’s BI suite, but it will not be without significant challenges for Oracle and Endeca. OBIEE is the strategic BI platform at Oracle. No ifs, ands, or buts. Even the ubiquitous Essbase is taking a back seat by being positioned mostly as a cubing engine with OBIEE as a recommended front end. As the first order of business, I expect the combined teams to come up with an SQL or MDX wrapper for Endeca so that OBIEE can be used to access its index. Beyond that, I expect that Oracle will position Endeca as a special-purpose BI tool.
What’s next? IBM buys Attivio.
For the non-BI implications of the deal, please take a look at Leslie Owens’ blog.