Vector or Graph?
A guide to choosing between vector and graph databases for your use case
Your choice of memory depends on the type of data you’re working with, how precise you need the retrieval to be, and how much data you have.
Generally, you should use a graph if you need to retrieve complex details that might not otherwise come up in a vector search. Good examples include voice bots, chat applications where accuracy is paramount, and other applications that require a high degree of recall.
You should use a vector database if you have a large amount of data to retrieve, you need to retrieve it quickly, and you don’t need the complex relationships that graphs can provide. Examples include knowledge bases, research papers, websites, and other large-volume collections.
Comparing Memory Types
Vector Memory
Vector memory excels at:
- Similarity search and nearest neighbor queries
- Storing high-dimensional data
- Working with large datasets like websites, research papers, or other collections
Vector search works by converting data (like text, images, or audio) into numerical vectors - essentially long lists of numbers that capture the key characteristics of that data. When you search, your query is also converted into a vector and the database finds the stored vectors that are mathematically closest to your query vector, typically using distance calculations like cosine similarity. This allows the database to find content that is conceptually similar, even if it doesn’t contain the exact same words or features you searched for.
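As a minimal sketch of the idea (not our internal implementation), vector search boils down to embedding a query and ranking stored vectors by a distance measure such as cosine similarity. The document IDs and 4-dimensional “embeddings” below are made up for illustration:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means orthogonal."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def search(query_vec: np.ndarray, stored: dict, top_k: int = 3):
    """Rank stored vectors by similarity to the query vector and return the top matches."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in stored.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Toy index with made-up 4-dimensional embeddings
stored = {
    "doc-a": np.array([0.90, 0.10, 0.00, 0.20]),
    "doc-b": np.array([0.10, 0.80, 0.30, 0.00]),
    "doc-c": np.array([0.85, 0.15, 0.05, 0.25]),
}

query = np.array([0.88, 0.12, 0.02, 0.22])
print(search(query, stored, top_k=2))  # doc-a and doc-c rank highest
```

Real vector databases use approximate nearest-neighbor indexes rather than the brute-force scan above, but the scoring principle is the same.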
For embedding, our default models include the Cohere family - embed-english-v3.0, embed-english-light-v3.0, and embed-multilingual-v3.0 - as well as BAAI/bge-small-en-v1.5 and mixedbread-ai/mxbai-embed-large-v1.
You may request another model to be used. Currently the following models are supported:
Model | Dimensions | Description | License | Size (GB) |
---|---|---|---|---|
cohere/embed-english-v3.0 | 1024 | A model that allows for text to be classified or turned into embeddings. English only. | Commercial | - |
cohere/embed-english-light-v3.0 | 384 | A smaller, faster version of embed-english-v3.0. Almost as capable, but a lot faster. English only. | Commercial | - |
cohere/embed-multilingual-v3.0 | 1024 | Provides multilingual classification and embedding support. See supported languages here. | Commercial | - |
cohere/embed-multilingual-light-v3.0 | 384 | A smaller, faster version of embed-multilingual-v3.0. Almost as capable, but a lot faster. Supports multiple languages. | Commercial | - |
BAAI/bge-small-en-v1.5 | 384 | Text embeddings, Unimodal (text), English, 512… | MIT | 0.067 |
BAAI/bge-small-zh-v1.5 | 512 | Text embeddings, Unimodal (text), Chinese, 512… | MIT | 0.090 |
snowflake/snowflake-arctic-embed-xs | 384 | Text embeddings, Unimodal (text), English, 512… | Apache-2.0 | 0.090 |
sentence-transformers/all-MiniLM-L6-v2 | 384 | Text embeddings, Unimodal (text), English, 256… | Apache-2.0 | 0.090 |
jinaai/jina-embeddings-v2-small-en | 512 | Text embeddings, Unimodal (text), English, 819… | Apache-2.0 | 0.120 |
BAAI/bge-small-en | 384 | Text embeddings, Unimodal (text), English, 512… | MIT | 0.130 |
snowflake/snowflake-arctic-embed-s | 384 | Text embeddings, Unimodal (text), English, 512… | Apache-2.0 | 0.130 |
nomic-ai/nomic-embed-text-v1.5-Q | 768 | Text embeddings, Multimodal (text, image), Eng… | Apache-2.0 | 0.130 |
BAAI/bge-base-en-v1.5 | 768 | Text embeddings, Unimodal (text), English, 512… | MIT | 0.210 |
sentence-transformers/paraphrase-multilingual-… | 384 | Text embeddings, Unimodal (text), Multilingual… | Apache-2.0 | 0.220 |
Qdrant/clip-ViT-B-32-text | 512 | Text embeddings, Multimodal (text&image), Engl… | MIT | 0.250 |
jinaai/jina-embeddings-v2-base-de | 768 | Text embeddings, Unimodal (text), Multilingual… | Apache-2.0 | 0.320 |
BAAI/bge-base-en | 768 | Text embeddings, Unimodal (text), English, 512… | MIT | 0.420 |
snowflake/snowflake-arctic-embed-m | 768 | Text embeddings, Unimodal (text), English, 512… | Apache-2.0 | 0.430 |
nomic-ai/nomic-embed-text-v1.5 | 768 | Text embeddings, Multimodal (text, image), Eng… | Apache-2.0 | 0.520 |
jinaai/jina-embeddings-v2-base-en | 768 | Text embeddings, Unimodal (text), English, 819… | Apache-2.0 | 0.520 |
nomic-ai/nomic-embed-text-v1 | 768 | Text embeddings, Multimodal (text, image), Eng… | Apache-2.0 | 0.520 |
snowflake/snowflake-arctic-embed-m-long | 768 | Text embeddings, Unimodal (text), English, 204… | Apache-2.0 | 0.540 |
mixedbread-ai/mxbai-embed-large-v1 | 1024 | Text embeddings, Unimodal (text), English, 512… | Apache-2.0 | 0.640 |
jinaai/jina-embeddings-v2-base-code | 768 | Text embeddings, Unimodal (text), Multilingual… | Apache-2.0 | 0.640 |
sentence-transformers/paraphrase-multilingual-… | 768 | Text embeddings, Unimodal (text), Multilingual… | Apache-2.0 | 1.000 |
snowflake/snowflake-arctic-embed-l | 1024 | Text embeddings, Unimodal (text), English, 512… | Apache-2.0 | 1.020 |
thenlper/gte-large | 1024 | Text embeddings, Unimodal (text), English, 512… | MIT | 1.200 |
BAAI/bge-large-en-v1.5 | 1024 | Text embeddings, Unimodal (text), English, 512… | MIT | 1.200 |
intfloat/multilingual-e5-large | 1024 | Text embeddings, Unimodal (text), Multilingual… | MIT | 2.240 |
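If you want to get a feel for one of the open-weight models in the table, you can load it locally with the sentence-transformers library and confirm its output dimensions. This is only a local sanity check, not how our hosted service invokes these models:

```python
# Requires: pip install sentence-transformers
from sentence_transformers import SentenceTransformer

# all-MiniLM-L6-v2 is one of the 384-dimensional models listed above.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
embeddings = model.encode(["What is vector search?", "How do graph databases work?"])

print(embeddings.shape)  # (2, 384): two inputs, 384 dimensions each
```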
Graph Memory
Graph memory is optimal for:
- High accuracy retrieval
- Relationship-based queries
- Path finding and traversal
- Complex interconnected data
Graph databases excel at retrieval that depends on explicit relationships between data points. While vector databases rely solely on semantic similarity computed through mathematical distance, graph databases can traverse the actual connections, context, and hierarchies within your data, uncovering insights that similarity search alone might miss. For example, in a customer support scenario, a graph database can follow relationship chains to understand that a user’s issue with “login problems” is actually related to a recent password reset attempt, which in turn is connected to a security update – connections that would not surface through vector similarity alone. This ability to extract and leverage structured relationships makes graph databases particularly powerful for use cases requiring high precision and complex contextual understanding.
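To make that support scenario concrete, here is a small, hypothetical sketch of the same chain as an adjacency-list graph with a breadth-first traversal. Real graph databases expose this through query languages (for example Cypher), but the traversal principle is the same:

```python
from collections import deque

# Hypothetical support graph: node -> list of (relationship, neighbor) edges
graph = {
    "login problems": [("caused_by", "password reset attempt")],
    "password reset attempt": [("triggered_by", "security update")],
    "security update": [],
}

def trace_chain(graph: dict, start: str) -> list:
    """Breadth-first traversal collecting the relationship chain from a starting node."""
    chain, queue, seen = [start], deque([start]), {start}
    while queue:
        node = queue.popleft()
        for relationship, neighbor in graph.get(node, []):
            if neighbor not in seen:
                chain.append(f"--{relationship}--> {neighbor}")
                seen.add(neighbor)
                queue.append(neighbor)
    return chain

print(" ".join(trace_chain(graph, "login problems")))
# login problems --caused_by--> password reset attempt --triggered_by--> security update
```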
Ontologies
An ontology is a formal definition of concepts, relationships, and categories that represents knowledge within a specific domain. It’s like a structured vocabulary that defines how different pieces of information relate to each other. For example, in a customer service ontology, you might define relationships like “Customer -> Places -> Order” or “Product -> Has Feature -> Specification.”
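As a rough illustration (the names below are hypothetical and not our ontology format), an ontology can be thought of as a set of allowed (subject type, relationship, object type) patterns that individual facts are validated against:

```python
# Hypothetical customer-service ontology: allowed (subject, relationship, object) patterns
ONTOLOGY = {
    ("Customer", "places", "Order"),
    ("Order", "contains", "Product"),
    ("Product", "has_feature", "Specification"),
}

def is_valid_fact(subject_type: str, relationship: str, object_type: str) -> bool:
    """A fact conforms to the ontology only if its pattern is explicitly defined."""
    return (subject_type, relationship, object_type) in ONTOLOGY

print(is_valid_fact("Customer", "places", "Order"))       # True
print(is_valid_fact("Customer", "has_feature", "Order"))  # False: not a defined relationship
```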
Ontologies are particularly valuable because they:
- Provide a consistent framework for organizing and understanding complex information
- Enable more precise and meaningful queries by leveraging defined relationships
- Help maintain data quality by enforcing structured relationships
- Support logical inference and reasoning about your data
While we provide default ontologies for common use cases, it’s important to tailor your ontology to your specific data and use case. A custom ontology ensures that the relationships and concepts match your exact needs and domain expertise. This customization can significantly improve the accuracy and relevance of your retrievals.
If you’d like to use a custom ontology for your application, please reach out to our team. We can help you design and implement an ontology that best serves your specific requirements.