Users need search. OpenSearch is an open source search and analytics technology provided under the Apache 2.0 license and is known for its capabilities in website search, as well as in real-time application monitoring and log analytics.

Now celebrating one year under the stewardship of the Linux Foundation, OpenSearch was originally created back in 2021 by Amazon Web Services as a fork (a new, independently developed branch of an existing codebase, often prompted by developers’ disgruntlement with the original software… and even more commonly by license dissatisfaction) of Elasticsearch and the Kibana data visualization tool.

Integrated Visualization Toolset

OpenSearch has its own integrated data visualization toolset and is known for being highly scalable, due to its use of the Apache Lucene search library (a Java-based search engine software library). It is capable of anomaly detection, trace analytics (tracing being a key technique used to examine how an application is executing and performing) and advanced search techniques including “k-nearest neighbors” (k-NN), an algorithm that uses proximity between vectors to return the closest matches to a query and which also underpins classification and prediction in supervised machine learning.
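
As a concrete illustration of the k-NN capability mentioned above, here is a minimal sketch using the opensearch-py Python client to create a vector-enabled index and run a k-nearest neighbors query. The index name, field names, vector dimension and connection details are illustrative assumptions rather than anything specified by OpenSearch itself.

```python
# Minimal k-NN sketch using the opensearch-py client.
# Index name, field names, dimension and host are illustrative assumptions.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# Create an index whose "embedding" field stores vectors for k-NN search.
client.indices.create(
    index="products",
    body={
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "embedding": {"type": "knn_vector", "dimension": 4},
            }
        },
    },
)

# Index a document carrying a (toy, 4-dimensional) embedding.
client.index(
    index="products",
    body={"title": "trail running shoes", "embedding": [0.1, 0.9, 0.2, 0.4]},
    refresh=True,
)

# Ask for the 3 nearest neighbours of a query vector.
results = client.search(
    index="products",
    body={
        "size": 3,
        "query": {"knn": {"embedding": {"vector": [0.1, 0.8, 0.3, 0.4], "k": 3}}},
    },
)
for hit in results["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```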

Upbeat about this community-driven project, the Linux Foundation says that it has grown and that membership has expanded. Foundation members say that the project continues to deliver on its mission to drive the growth and development of OpenSearch’s open source AI-powered search, observability and analytics platform.

What Is Hybrid Lexical & Semantic Search?

With OpenSearch 3.2 arriving this summer, recent contributions demonstrate the impact of this technology’s observability, analytics, vector database and hybrid search capabilities. The updated hybrid search algorithms deliver faster query times and increased throughput, making hybrid search (the fusion of lexical keyword matching with semantic, meaning-based retrieval) more efficient for production workloads.
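
To make that lexical-plus-semantic fusion concrete, the hedged sketch below issues an OpenSearch hybrid query that combines a keyword match with a k-NN vector query, with a search pipeline normalizing and blending the two score sets. The pipeline name, index, fields, vectors and combination technique are assumptions chosen for illustration, and it presumes a vector-enabled index like the one sketched earlier.

```python
# Hybrid (lexical + vector) search sketch; names and settings are illustrative.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# A search pipeline normalizes and combines the scores of the two sub-queries.
client.transport.perform_request(
    "PUT",
    "/_search/pipeline/hybrid-pipeline",
    body={
        "phase_results_processors": [
            {
                "normalization-processor": {
                    "normalization": {"technique": "min_max"},
                    "combination": {"technique": "arithmetic_mean"},
                }
            }
        ]
    },
)

# One hybrid query wraps a lexical match and a k-NN vector query.
results = client.search(
    index="products",
    params={"search_pipeline": "hybrid-pipeline"},
    body={
        "query": {
            "hybrid": {
                "queries": [
                    {"match": {"title": "running shoes"}},
                    {"knn": {"embedding": {"vector": [0.1, 0.8, 0.3, 0.4], "k": 3}}},
                ]
            }
        }
    },
)
for hit in results["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```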

New vector database functionality is said to make it easier to develop and deploy search features. This includes native agentic AI support through the Model Context Protocol and GPU acceleration for index builds. Observability tools, including upgrades to PPL (Piped Processing Language, a key data query tool) backed by Apache Calcite (an open source framework for parsing, optimizing and executing SQL-style queries), are intended to bring efficiency to complex log analytics workstreams. Also new is cross-cluster search for traces, enabling trace analysis across clusters for enterprises running distributed systems.
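
For a flavor of what PPL looks like in practice, the sketch below sends a pipe-style log query to the _plugins/_ppl endpoint exposed by OpenSearch’s SQL/PPL plugin, again via opensearch-py. The web-logs index, field names and thresholds are hypothetical, invented purely for illustration.

```python
# PPL (Piped Processing Language) sketch; index and field names are illustrative.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# Count server errors per host over a hypothetical "web-logs" index,
# using the _plugins/_ppl endpoint provided by the SQL/PPL plugin.
response = client.transport.perform_request(
    "POST",
    "/_plugins/_ppl",
    body={
        "query": (
            "source=web-logs "
            "| where status >= 500 "
            "| stats count() as errors by host "
            "| sort - errors"
        )
    },
)
print(response)
```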

For those software engineers who want their search experience delivered in an oven-ready format, Amazon OpenSearch Service is an AWS-managed service for teams to run OpenSearch clusters without the infrastructure maintenance burden. With AWS looking after the management and monitoring backend from the cloud, the service now supports several of the new features that have been added to OpenSearch across multiple versions.

Search, Beyond Keywords

Senior manager for Amazon OpenSearch Service Pallavi Priyadarshini says that the search business has moved on; it is no longer just about keywords. With the rise of generative AI and large language models, she thinks, organizations now need search systems that can understand context, intent and meaning.

“OpenSearch has been actively innovating to meet this challenge head-on, with semantic search capabilities, hybrid search, multimodal data support and even conversational, agentic workflows,” said Priyadarshini, in an OpenSearchCon India summary report written in concert with foundation colleague Sreedhar Gade, an OpenSearch user and VP of engineering at customer engagement software company Freshworks. “At the heart of this transformation is the OpenSearch vector engine. It empowers developers to index and retrieve rich, unstructured data, from documents and images to embeddings generated by transformer models…making it easier than ever to build AI-native search applications.”

She insists that the team is “really pushing the boundaries of search” and that the release of OpenSearch 3.0 marked a major leap forward in this journey. As well as key indexing technique advancements, this version features vector quantization, i.e. compression techniques for vector representations that help reduce memory usage and improve query performance.
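
As a rough sketch of what quantization can look like to a developer, the example below maps a knn_vector field with an fp16 scalar-quantization encoder on the faiss engine, one of several compression options OpenSearch offers. Exact parameters vary by version, and the index name, field and dimension here are assumptions, so the current OpenSearch documentation should be checked before reusing this.

```python
# Hedged sketch: a knn_vector field configured with faiss fp16 scalar quantization,
# one flavor of the compression techniques referred to above. Names are illustrative.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

client.indices.create(
    index="docs-quantized",
    body={
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "embedding": {
                    "type": "knn_vector",
                    "dimension": 384,
                    "method": {
                        "name": "hnsw",
                        "engine": "faiss",
                        "space_type": "l2",
                        "parameters": {
                            # Scalar quantization stores each dimension in 16 bits
                            # instead of 32, roughly halving vector memory usage.
                            "encoder": {"name": "sq", "parameters": {"type": "fp16"}}
                        },
                    },
                }
            }
        },
    },
)
```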

Searching For Uber

Uber’s software engineering team has explained how it uses OpenSearch and why this functionality is so key to the popular ride-sharing app. The company says that search is a “foundational pillar of Uber’s user experience” that has a direct influence on central elements of the organization’s core business.

“Uber Eats users are faced with a staggering choice: over one million restaurants globally and typically more than a thousand dishes or restaurants per user session. Helping Uber Eats users navigate this vast space – based on preferences like cuisine, price, delivery time, dietary needs, or past behavior – is a deeply complex problem. The importance of search extends to the Uber Rides experience as well. Riders rely on intelligent destination search, autocomplete suggestions, and personalized predictions, all of which must be fast, reliable, and context-aware. Even the process of matching riders to drivers is, at its core, a search problem,” blogged the Uber software engineering team, explaining why its engineers have implemented OpenSearch.

The Uber team says that search matching must balance “multiple real-time signals”, i.e. driver availability, location, preferences and compliance rules. It says that these challenges illustrate how search and ranking are deeply embedded in Uber’s fulfillment logic, operating under strict latency and accuracy constraints.

Competitive Analysis: Enterprise Search In AI

In terms of other products and services in this space, Elasticsearch remains OpenSearch’s main competitor. The technology bids to offer increasingly advanced performance and features, but, unlike OpenSearch’s Apache 2.0 terms, it is sold under a more restrictive commercial license. Also here is Algolia, known for its API-first approach and its “typo tolerance” functions that allow users to misspell, mis-type and mis-ask their queries in a huge variety of ways. Many will enjoy the hundreds of ways Britney Spears can be wrongly typed, as showcased on this link.

Coveo also works in enterprise search, preferring to refer to the practice as AI-powered search and relevance; then there are Lucidworks, Glean, Yext… all of which work at different levels of maturity and machine learning-based enrichment. From the big players there are Google Cloud Search, IBM Watson Discovery and Microsoft Azure AI Search, the latter being Redmond’s managed search service, which offers natural language processing like many of the others.

Looking (sorry… searching) ahead at future trends in this space, OpenSearch will gain fans as a result of its open source pedigree, its alignment with AWS and the fact that it has addressed earlier speed and performance concerns (users suggest that it has progressed from good enough to really quite good), and because its community-centric approach means that hardcore developers and data scientists can give back and contribute to the project.