AI-Powered Product Discovery with Elastic Path: Semantic Search and Natural Language Commerce
The gap between how customers think about products and how keyword search engines index them has always been a source of friction in ecommerce. A B2B buyer searching for “something to clean oil off a concrete floor” does not want to type “industrial degreaser, solvent-based, concrete substrate compatibility”. They want the system to understand their intent and surface the right products.
Semantic search changes the equation. By representing products and queries as dense numerical vectors in a shared embedding space, it enables AI-powered ecommerce search with natural language to work at the level of meaning rather than lexical overlap. Elastic Path’s composable architecture gives development teams a structured pathway to implement natural language product discovery at scale — and this article is a technical guide to doing exactly that.
Keyword Search vs Semantic Search: The Architectural Difference
Understanding why semantic search produces better results requires understanding why keyword search fails.
How Keyword Search Works
Traditional keyword search — whether implemented with Elasticsearch, Solr, or a database full-text index — operates on token matching. A query is tokenised, stop words are removed, and the remaining terms are matched against an inverted index. Relevance ranking (BM25, TF-IDF) rewards documents containing query terms with high frequency relative to the corpus.
The fundamental limitation is that keyword search is blind to meaning. It cannot recognise that “office chair for tall people” and “ergonomic seating, extended seat height” describe the same purchase intent. B2B catalogues are particularly vulnerable: buyers use application-level language (“something to seal a flange joint at 200°C”) while product data uses specification-level language (“gasket material: PTFE, temperature rating: -200°C to 260°C”). These two representations share almost no tokens, so keyword search reliably fails to connect them.
How Semantic Search Works
Semantic search replaces token matching with vector similarity. An embedding model — typically a transformer-based neural network trained on large text corpora — encodes both the product data and the query into dense vectors of several hundred to several thousand floating-point dimensions. These vectors capture semantic relationships: words and phrases that appear in similar contexts across the training data end up positioned close together in vector space.
At query time, the user’s natural language input is encoded into a query vector. The search engine then finds the product vectors nearest the query vector, typically ranked by cosine similarity (equivalently, smallest angular distance). Products that are semantically related to the query surface even if they share no tokens with it.
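As a minimal illustration of the ranking metric, cosine similarity between two vectors can be computed in pure Python. The three-dimensional vectors here are toy stand-ins for real embeddings, which have hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" for illustration only.
query = [0.9, 0.1, 0.3]
product_a = [0.8, 0.2, 0.4]   # points in a similar direction to the query
product_b = [0.1, 0.9, -0.2]  # points in a very different direction

print(cosine_similarity(query, product_a))  # high, close to 1.0
print(cosine_similarity(query, product_b))  # much lower
```

In production the similarity computation happens inside the vector store's approximate nearest-neighbour index, not in application code, but the ranking principle is exactly this.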
This architecture enables AI-powered ecommerce search with natural language queries that would defeat any keyword system: “waterproof jacket for a Scottish winter” or “printer paper for a busy office that won’t jam”. The embedding model handles the vocabulary gap automatically.
Hybrid Search: Combining Both Approaches
In practice, the most robust production systems implement hybrid search: a combination of keyword matching and vector similarity, whose results are fused using reciprocal rank fusion or a learned ranker. Hybrid search retains the precision of keyword matching for exact SKU lookups, model numbers, and brand-specific queries, while adding the recall advantages of semantic matching for natural language and intent-driven queries. This is the architecture McKenna Consultants recommends for most Elastic Path implementations.
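Reciprocal rank fusion itself is simple to implement. A sketch, assuming each retriever returns an ordered list of product IDs (the constant k=60 is the value commonly used in the RRF literature; the SKU values are illustrative):

```python
from collections import defaultdict

def reciprocal_rank_fusion(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists into one combined ranking,
    rewarding items that rank highly in any individual list."""
    scores: dict[str, float] = defaultdict(float)
    for results in result_lists:
        for rank, product_id in enumerate(results, start=1):
            scores[product_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["SKU-101", "SKU-205", "SKU-333"]   # from BM25 / keyword search
semantic_hits = ["SKU-205", "SKU-777", "SKU-101"]  # from vector retrieval
print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
# SKU-205 wins: it ranks highly in both lists
```

Because the fused score depends only on rank positions, RRF needs no score normalisation between the keyword and vector retrievers, which is the main reason it is the default fusion choice for hybrid search.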
Elastic Path’s Native Search Capabilities
Elastic Path provides a built-in product search service as part of its composable commerce platform. The native search layer supports filtering, faceting, and basic text matching across catalogue data. For many B2C catalogues with modest complexity, the native offering handles standard search requirements adequately.
Elastic Path’s architecture also exposes the product catalogue through its EPCC (Elastic Path Commerce Cloud) API, which gives development teams clean access to catalogue data, product attributes, pricing, inventory, and hierarchy information. This API-first design is the foundation for integrating more capable search infrastructure when requirements demand it.
For Elastic Path semantic search AI integration, the key capability is the platform’s openness. Because Elastic Path does not lock product data behind a proprietary search layer, teams can synchronise catalogue data to an external vector store, build an embedding pipeline that runs on product upsert events, and replace the default search endpoint with a semantic layer — all without modifying the commerce backend itself.
Connecting an Embedding Pipeline to Elastic Path
The typical architecture for Elastic Path semantic search AI integration involves three components:
1. Catalogue synchronisation. A webhook-driven or polling process listens to Elastic Path product events and writes product records — including name, description, category hierarchy, specifications, application notes, and enriched search context — to a vector store.
2. Embedding generation. Each product record is passed through an embedding model to produce a dense vector representation. Commerce-specific fine-tuned models (Cohere, sentence-transformers shopping variants) often outperform general-purpose models. Vectors are stored alongside product metadata in the vector store.
3. Query embedding and retrieval. The user’s query is embedded using the same model; the vector store executes an approximate nearest-neighbour (ANN) query to retrieve the top-k products. Results are re-ranked if needed, enriched with live pricing and inventory from Elastic Path’s API, and returned to the frontend.
Suitable vector stores include Pinecone, Weaviate, Qdrant, and pgvector. Selection depends on deployment constraints, catalogue size, and query volume.
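The three components above can be sketched end to end with an in-memory stand-in. Everything here is an assumption for illustration: the embed() function is a toy trigram hash rather than a real embedding model, and the brute-force scan stands in for a proper ANN index in Pinecone, Weaviate, Qdrant, or pgvector:

```python
import math

def embed(text: str) -> list[float]:
    # Toy stand-in for a real embedding model: hash character trigrams
    # into a small normalised vector. Do not use in production.
    vec = [0.0] * 16
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3]) % 16] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

class VectorStore:
    """In-memory sketch of steps 2 and 3: store embeddings on upsert,
    retrieve nearest neighbours at query time."""

    def __init__(self) -> None:
        self.records: dict[str, tuple[list[float], dict]] = {}

    def upsert(self, product_id: str, text: str, metadata: dict) -> None:
        # Called from the catalogue-sync webhook handler (step 1).
        self.records[product_id] = (embed(text), metadata)

    def query(self, text: str, top_k: int = 3) -> list[str]:
        # Brute-force dot-product scan; real stores use ANN indexes.
        qv = embed(text)
        scored = [
            (sum(a * b for a, b in zip(qv, vec)), pid)
            for pid, (vec, _) in self.records.items()
        ]
        return [pid for _, pid in sorted(scored, reverse=True)[:top_k]]
```

In a real deployment the retrieval step would then enrich the returned product IDs with live pricing and inventory from the Elastic Path API before responding to the frontend.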
AI Agents and Conversational Product Discovery
Beyond single-query semantic search, AI agents open a more sophisticated ecommerce product discovery model: a conversational interface where buyers refine their requirements through natural dialogue, and an LLM orchestrates multiple API calls to surface and compare products.
The Conversational Discovery Pattern
In this pattern, an LLM acts as an orchestration layer between the buyer and the commerce API. A typical session might proceed as follows:
- User: “I need a pump for moving water in a light industrial setting, about 50 litres per minute.”
- Agent: Calls the semantic search API with the intent extracted from the message, retrieves candidate products, checks inventory via Elastic Path API, and returns a curated shortlist with a summary of trade-offs.
- User: “Which of those would handle slightly sandy water without clogging?”
- Agent: Uses the conversation history and product specifications to filter and re-rank, potentially fetching additional product attributes from Elastic Path to answer the follow-up.
The LLM does not directly query Elastic Path. Instead, it uses tools — structured function calls — that invoke the semantic search layer and the Elastic Path API. This tool-calling pattern, supported natively by the OpenAI and Anthropic APIs and by orchestration frameworks such as LangChain, keeps the LLM in an orchestration role rather than a retrieval role. The commerce API remains the authoritative source of truth for pricing and availability.
Implementing Tool Calls for Commerce Discovery
For B2B natural language product discovery implementations, the toolset exposed to the LLM typically includes:
- searchProducts(query: string, filters: object) — semantic search with optional structured filters for category, price range, or attribute values
- getProductDetails(productId: string) — fetch full product data including specifications and documentation links
- checkInventory(productId: string, quantity: int) — live stock check from Elastic Path
- getPricing(productId: string, accountId: string) — account-specific pricing where applicable in B2B scenarios
- compareProducts(productIds: string[]) — retrieve and format a comparison table
Each tool call is implemented as an API call to Elastic Path or the semantic search layer. The LLM selects which tools to invoke and in what sequence based on the buyer’s messages and the results of previous tool calls.
This architecture also enables agent-driven product discovery at the backend level — autonomous agents that periodically analyse customer journeys, identify catalogue gaps where searches returned no results, and flag product data quality issues for merchandising teams.
Product Data Enrichment for AI Readability
The quality of semantic search results is fundamentally constrained by the quality of the product data embedded into the vector store. Products with thin, specification-only descriptions perform poorly against natural language queries. Product data enrichment is therefore not optional — it is a prerequisite for effective AI-powered ecommerce search with natural language.
Structured Attribute Completeness
Before adding natural language content, ensure all structured attributes are populated. For semantic search, attributes serve two roles: they contribute to the text that gets embedded, and they provide filter dimensions that can be applied after vector retrieval to narrow results. Incomplete attribute data degrades both.
Audit your Elastic Path catalogue for attribute coverage. Common gaps in B2B catalogues include missing compatibility information, absent application context, and specification data that exists in PDFs but has not been extracted into structured fields. Prioritise attributes that buyers are likely to include in natural language queries: material, size, application, compatibility, and industry sector.
Natural Language Descriptions
Short, specification-focused descriptions embed poorly because they lack the contextual language that connects them to buyer intent. For each product category, write or generate supplementary descriptions covering: the problem the product solves, typical use contexts, the likely buyer profile, and common synonyms for the product type and its attributes.
These descriptions do not need to appear on product pages. Store them in a dedicated search_context field that is embedded but excluded from the rendered storefront, keeping the design clean while giving the embedding model rich semantic material.
Embedding Generation Strategy
The concatenation strategy matters significantly. Embedding a product name alone produces a narrow representation; embedding a structured concatenation of name, description, category path, key attributes, and search context captures far more of the product’s semantic surface area.
A practical template for B2B products:
{product_name}. {category_path}. {short_description}. Used for: {application_notes}. Compatible with: {compatibility_info}. {search_context}
Validate this template against a held-out set of representative buyer queries before indexing the full catalogue.
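The template above can be rendered with a small helper that skips empty fields, so products with incomplete data still embed cleanly. The field names follow the template and are illustrative, not a prescribed Elastic Path schema:

```python
def build_embedding_text(product: dict) -> str:
    """Concatenate product fields into the text that gets embedded,
    following the B2B template and skipping missing fields."""
    parts = [
        product.get("product_name"),
        product.get("category_path"),
        product.get("short_description"),
        f"Used for: {product['application_notes']}"
        if product.get("application_notes") else None,
        f"Compatible with: {product['compatibility_info']}"
        if product.get("compatibility_info") else None,
        product.get("search_context"),
    ]
    # Join present fields with sentence separators, normalising trailing dots.
    return ". ".join(p.rstrip(".") for p in parts if p) + "."

text = build_embedding_text({
    "product_name": "PTFE gasket sheet",
    "category_path": "Sealing > Gaskets",
    "application_notes": "sealing flange joints at high temperature",
})
print(text)
# PTFE gasket sheet. Sealing > Gaskets. Used for: sealing flange joints at high temperature.
```

This is the function to run both at indexing time and inside the webhook handler that re-embeds products on update, so the embedded text stays consistent across both paths.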
Keeping Embeddings Current
Embeddings must be regenerated whenever product data changes. Build embedding regeneration into your Elastic Path webhook handling: on product update events, re-embed the affected product and upsert the vector in the vector store. For bulk catalogue updates, trigger a full re-index job. Stale embeddings are a common source of degraded search quality in production systems.
Search Personalisation with Customer Context
Elastic Path’s B2B capabilities — account hierarchies, contract pricing, buyer roles — provide rich context for personalising semantic search results beyond static relevance scoring. Three techniques are particularly effective:
Account-level catalogue filtering. Use Elastic Path’s catalogue rules to determine which products are available to the buyer’s account and pre-filter the vector search to the eligible subset. This prevents surfacing products the buyer cannot purchase.
Purchase history re-ranking. Fetch the buyer’s order history from Elastic Path and boost products from categories they have previously purchased. Familiar options rank higher, reducing the cognitive load of product selection.
Session context accumulation. In conversational discovery sessions, accumulate signals — categories browsed, products viewed, queries submitted — and use these to adjust the query vector or apply soft filters to subsequent searches, giving the session a sense of adaptive intent understanding.
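Purchase history re-ranking, the second technique, reduces to an additive boost on the retrieval score. A minimal sketch, assuming each candidate carries its normalised similarity score and category; the boost value is a tunable assumption, not a recommended constant:

```python
def rerank_with_history(results: list[dict], purchased_categories: set[str],
                        boost: float = 0.15) -> list[dict]:
    """Boost products in categories the buyer has previously purchased
    from, then re-sort by the adjusted score."""
    def adjusted(product: dict) -> float:
        score = product["score"]
        if product["category"] in purchased_categories:
            score += boost
        return score
    return sorted(results, key=adjusted, reverse=True)

candidates = [
    {"id": "a", "category": "pumps", "score": 0.78},
    {"id": "b", "category": "valves", "score": 0.80},
]
# The buyer has ordered pumps before, so "a" overtakes "b".
print(rerank_with_history(candidates, purchased_categories={"pumps"}))
```

Keep the boost small relative to the score range: the goal is to break near-ties in favour of familiar categories, not to bury genuinely more relevant products.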
Native Elastic Path Search vs Algolia vs Coveo
Choosing the right search infrastructure is a significant architectural decision. Here is a practical comparison of the three main options for teams building on Elastic Path.
Elastic Path Native Search
Best for: Standard B2C catalogues with straightforward search requirements, teams that want to minimise infrastructure complexity, and projects where semantic search is not a current requirement but may be added later.
Strengths: Zero additional infrastructure, no data synchronisation overhead, fully integrated with catalogue and pricing data.
Limitations: Limited semantic search capabilities, less control over ranking algorithms, not designed for large B2B catalogues with complex attribute structures or natural language discovery requirements.
Algolia
Best for: Teams wanting powerful keyword and faceted search with excellent developer experience, and comfortable extending it with Algolia’s NeuralSearch for semantic capabilities.
Strengths: Extremely fast, mature faceting and A/B testing, NeuralSearch adds vector-based retrieval, strong frontend component library.
Limitations: Cost scales with record count and query volume. Data synchronisation pipeline required from Elastic Path. Less flexible for deep B2B personalisation. NeuralSearch is newer and less battle-tested than the core keyword offering.
When to choose Algolia: B2C or B2B2C contexts where search speed and merchandising flexibility are top priorities and the catalogue is large but not deeply technical.
Coveo
Best for: Large enterprise B2B implementations requiring AI-powered relevance, deep personalisation, and a unified index across commerce, content, and support.
Strengths: Purpose-built for enterprise AI-powered search, strong personalisation engine, proven B2B case studies, unified content and commerce index.
Limitations: Significant cost and implementation complexity. Requires substantial investment in integration and ongoing tuning. Often overkill for mid-market implementations.
When to choose Coveo: Enterprise B2B scenarios with complex catalogues and a requirement to unify product search with technical documentation or support content.
Our Recommendation
For most Elastic Path implementations, the decision sits between native search and Algolia. Native search is sufficient for straightforward requirements. Algolia adds meaningful value when merchandising control, search analytics, and speed are priorities. A custom semantic layer (vector store plus embedding pipeline) is the right choice when B2B natural language product discovery is a core requirement and the team wants full control over the relevance model without Algolia’s pricing exposure. Coveo makes sense only at enterprise scale with budget and timeline to match.
Measuring Search Relevance and Conversion Impact
Implementing AI-powered ecommerce search without a measurement framework leaves you unable to demonstrate value or diagnose degradation. Establish the following metrics before go-live.
Zero-result rate. The percentage of queries that return no results. For semantic search, this should approach zero — even poor queries return something. Monitor it as a quality gate.
Click-through rate (CTR) by rank position. Track which search results buyers click and at which position. Healthy semantic search produces high CTR at positions 1-3. A flat CTR distribution suggests ranking quality issues.
Search-to-conversion rate. The percentage of search sessions that result in an add-to-cart or purchase. Compare this metric before and after semantic search deployment using A/B testing where possible.
Query reformulation rate. How often does a buyer submit a second query immediately after the first? High reformulation rates indicate the first query failed to surface satisfying results.
Mean reciprocal rank (MRR). If you have a ground truth dataset of query/product relevance pairs — constructible from historical order data — MRR provides a single scalar metric of retrieval quality to track as embedding or ranking configuration changes.
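MRR is straightforward to compute once you have the ground truth pairs. A sketch, where each query contributes the reciprocal of the rank of its first relevant result (zero if nothing relevant was retrieved):

```python
def mean_reciprocal_rank(results_per_query: list[list[str]],
                         relevant_per_query: list[set[str]]) -> float:
    """Average over queries of 1/rank of the first relevant result;
    queries with no relevant result retrieved contribute 0."""
    total = 0.0
    for results, relevant in zip(results_per_query, relevant_per_query):
        for rank, product_id in enumerate(results, start=1):
            if product_id in relevant:
                total += 1.0 / rank
                break
    return total / len(results_per_query)

# Two evaluation queries: first relevant hit at rank 1, then at rank 2.
mrr = mean_reciprocal_rank(
    [["p1", "p2"], ["p9", "p4", "p5"]],
    [{"p1"}, {"p4"}],
)
print(mrr)  # (1/1 + 1/2) / 2 = 0.75
```

Re-run this against the same held-out query set after every embedding model or ranking configuration change, and treat a drop in MRR as a release blocker.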
Algolia and Coveo provide built-in analytics dashboards; for custom semantic layers, route search events into your analytics platform and build measurement dashboards in your BI tool of choice.
Implementation Roadmap
For teams starting from a standard Elastic Path deployment, the pragmatic sequence is:
- Audit and enrich product data — Complete attribute coverage and write search context fields for priority categories (4–8 weeks for a mid-size B2B catalogue).
- Deploy vector infrastructure — Stand up a vector store, configure the embedding pipeline, and index the enriched catalogue.
- Implement hybrid search — Augment or replace the default search endpoint with a hybrid keyword/semantic layer, measuring retrieval quality against a representative query set before go-live.
- Integrate with the storefront — Connect the new endpoint to the Elastic Path storefront, preserving existing facet and filter behaviour.
- Add conversational discovery — Deploy the LLM orchestration layer as an optional interface initially, alongside the standard search box.
- Personalise and measure — Introduce customer context signals, establish the measurement framework above, and iterate on ranking configuration.
How McKenna Consultants Can Help
McKenna Consultants brings together deep Elastic Path implementation experience and AI engineering capability. We have built composable commerce platforms on Elastic Path for B2B and B2C clients, and we design AI integrations that deliver real commercial impact beyond surface-level implementations.
For AI-powered ecommerce search with natural language, we provide end-to-end delivery: product data audit and enrichment, embedding pipeline architecture, vector store configuration, LLM orchestration for conversational discovery, and measurement frameworks to demonstrate ROI. We advise on the native Elastic Path vs Algolia vs Coveo decision for each client’s context, ensuring the investment matches the complexity of the problem.
If you are planning an Elastic Path semantic search AI initiative for 2026, or evaluating whether your current search infrastructure can support natural language product discovery, we would welcome the conversation.
Get in touch with McKenna Consultants to discuss your product discovery requirements.