Unnecessary Use of Embeddings for Simple Retrieval Tasks

CER:

CER-0273

Service Category

Cloud Provider

Snowflake

Service Name

Snowflake Cortex

Inefficiency Type

Misapplied Embedding Architecture

Explanation

Embeddings enable semantic similarity search by representing text as high-dimensional vectors. Keyword search, by contrast, returns results based on lexical matches, which is often sufficient for simple retrieval tasks such as FAQ matching, deterministic filtering, metadata lookup, or rule-based routing. When embeddings are used in these low-complexity scenarios, organizations pay for compute to generate the embeddings, storage for the vector columns, and compute-heavy cosine similarity searches, all without improving accuracy or user experience. In Snowflake, this can also increase warehouse load and query runtime.
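
To make the contrast concrete, here is a minimal sketch of FAQ matching by plain token overlap, with no embeddings involved. The FAQ entries, the stopword list, and the function names are hypothetical and exist only for illustration; a real workload would need its own tokenization and tie-breaking rules.

```python
import re

# Hypothetical FAQ entries; the ids and questions are made up for illustration.
FAQ = {
    "reset-password": "How do I reset my password",
    "billing-cycle": "When does my billing cycle start",
    "export-data": "How can I export my data to CSV",
}

# A small, assumed stopword list; tune it for the actual corpus.
STOPWORDS = {"how", "do", "i", "my", "the", "a", "to", "can", "does", "when", "is"}


def tokens(text):
    """Lowercase the text, split on non-alphanumerics, and drop stopwords."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def keyword_match(query, faq=FAQ):
    """Return the FAQ id sharing the most tokens with the query, or None."""
    query_tokens = tokens(query)
    best_id, best_score = None, 0
    for faq_id, question in faq.items():
        score = len(query_tokens & tokens(question))
        if score > best_score:
            best_id, best_score = faq_id, score
    return best_id


print(keyword_match("I forgot my password, how do I reset it?"))  # reset-password
```

If a matcher like this already resolves the queries a system actually receives, the embedding pipeline is unlikely to add value for that workload.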

Relevant Billing Model

Embedding generation and vector search operations consume Snowflake compute credits. Larger embeddings increase storage requirements and query processing costs. When embeddings are not necessary, both compute and storage consumption rise needlessly.
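
The effect of embedding size on storage can be estimated with simple arithmetic. The sketch below assumes 32-bit float elements (4 bytes each) and reports raw, uncompressed size only; actual billed storage in Snowflake depends on compression and table metadata, so treat the numbers as a rough upper-bound illustration. The row count and dimensions are hypothetical.

```python
BYTES_PER_FLOAT32 = 4  # assumption: each vector element is a 32-bit float


def raw_vector_bytes(rows, dimensions):
    """Uncompressed bytes needed to hold one vector column."""
    return rows * dimensions * BYTES_PER_FLOAT32


def to_gib(num_bytes):
    """Convert bytes to gibibytes."""
    return num_bytes / 1024**3


rows = 10_000_000  # hypothetical table size
for dimensions in (768, 1024):
    size = raw_vector_bytes(rows, dimensions)
    print(f"{dimensions} dims: {size:,} bytes (~{to_gib(size):.2f} GiB raw)")
```

Because the size scales linearly with both row count and dimension, moving from 768 to 1024 dimensions adds about a third more raw vector data for the same rows.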

Detection

Identify tables with vector columns used for retrieval tasks that follow deterministic or keyword patterns
Compare retrieval accuracy between vector search and simple keyword filtering
Review compute consumption for embedding-generation pipelines that process static or rarely changing data
Assess storage growth associated with large or unnecessary vector columns
Determine whether semantic search was adopted without a clear functional requirement
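
The accuracy comparison listed above can be sketched as a small harness. It assumes the top result per query has already been exported from both systems into dictionaries, along with a hand-labeled expected answer. The `min_gain` threshold, the function names, and the sample data are all hypothetical; pick a threshold that reflects what the extra compute and storage are worth to the workload.

```python
def top1_accuracy(predictions, expected):
    """Fraction of labeled queries whose predicted id equals the expected id."""
    if not expected:
        raise ValueError("expected must contain at least one labeled query")
    correct = sum(1 for query, want in expected.items() if predictions.get(query) == want)
    return correct / len(expected)


def compare(keyword_predictions, vector_predictions, expected, min_gain=0.02):
    """Score both approaches and flag whether vector search clears the bar."""
    keyword_accuracy = top1_accuracy(keyword_predictions, expected)
    vector_accuracy = top1_accuracy(vector_predictions, expected)
    return {
        "keyword_accuracy": keyword_accuracy,
        "vector_accuracy": vector_accuracy,
        # min_gain is an assumed, workload-specific threshold.
        "vector_justified": (vector_accuracy - keyword_accuracy) >= min_gain,
    }


# Made-up sample data: query -> expected document id, and each system's top hit.
expected = {"q1": "faq-1", "q2": "faq-2", "q3": "faq-3", "q4": "faq-4"}
keyword_hits = {"q1": "faq-1", "q2": "faq-2", "q3": "faq-3", "q4": "faq-9"}
vector_hits = {"q1": "faq-1", "q2": "faq-2", "q3": "faq-7", "q4": "faq-4"}

print(compare(keyword_hits, vector_hits, expected))
```

In this toy data both approaches answer three of four queries correctly, so the vector path shows no measurable gain and would be a candidate for removal.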

Remediation

Use keyword, metadata, or SQL-based filtering for simple retrieval workloads
Remove or stop generating embeddings where semantic similarity is not required
Drop unused vector columns to reduce storage cost
Benchmark simple search vs. vector search before allocating compute to embeddings
Periodically review vector-search usage to prevent unnecessary architectural complexity
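
As an illustration of the first remediation item, the sketch below answers a retrieval request with an equality predicate on metadata plus a keyword pattern, and no vector column. It uses Python's built-in SQLite purely so the example runs locally; the table, columns, and rows are hypothetical, and the same style of filter would be written in Snowflake SQL against the real table.

```python
import sqlite3

# In-memory SQLite database standing in for a warehouse table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE support_articles (id INTEGER PRIMARY KEY, category TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO support_articles (id, category, title) VALUES (?, ?, ?)",
    [
        (1, "account", "Reset a forgotten password"),
        (2, "billing", "Download a past invoice"),
        (3, "billing", "Change the payment method"),
    ],
)


def find_articles(conn, category, keyword):
    """Filter by exact category and a case-insensitive keyword in the title."""
    pattern = f"%{keyword.lower()}%"
    return conn.execute(
        "SELECT id, title FROM support_articles "
        "WHERE category = ? AND LOWER(title) LIKE ? ORDER BY id",
        (category, pattern),
    ).fetchall()


print(find_articles(conn, "billing", "invoice"))
```

A filter like this needs no embedding generation and no vector storage, and its results are deterministic and easy to audit.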

Relevant Documentation

https://docs.snowflake.com/en/guides-overview-cortex
