I really don't want another database. I just want to have a solution built in for Postgres, and more specifically, RDS, which we use. I know there will be some extra difficulty that I will have to manage (e.g. reindexing to a new model that is outputting different embeddings), but I really don't want another piece of infrastructure.
If anyone from AWS/Google/Azure is listening, please add pgvector [1] into your managed Postgres offerings!
Totally with you on that - it's annoying to need multiple pieces of infrastructure, especially for solo developers or small teams. Oftentimes you want to filter further based on scalar fields/metadata such as timestamps, strings, numeric values, etc.
That's why we built attribute filtering into Milvus via a Mongo-esque interface. No SQL and not as performant as an RDBMS, but it's an option: https://milvus.io/docs/hybridsearch.md
Yes exactly. My company has asked AWS if they will be adding support for pgvector on RDS, but they haven't been able to confirm whether that will happen any time soon.
If the vectors are in the same database as the tabular/structured data, then text-to-SQL applications of LLMs are so much more powerful. The generative models will then be able to form complex queries that find similarity as well as perform aggregation, filtering, and joining across datasets. Doing this today with a separate dedicated vector DB is quite painful.
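As a sketch of what a single-database query enables (toy data and made-up column names; in Postgres with pgvector this would be one SELECT combining a distance ORDER BY with ordinary WHERE/JOIN clauses):

```python
import math

# Toy stand-in for rows that would all live in one relational database.
# The pgvector equivalent would be roughly:
#   SELECT id FROM products
#   WHERE category = 'shoes' AND price <= 100
#   ORDER BY embedding <=> $query LIMIT 2;
products = [
    (1, "shoes", 59.0, [0.9, 0.1, 0.0]),   # (id, category, price, embedding)
    (2, "shoes", 120.0, [0.8, 0.2, 0.1]),
    (3, "hats", 25.0, [0.1, 0.9, 0.2]),
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def similar_and_filtered(query_vec, category, max_price, k=2):
    """Similarity search plus scalar filtering in one 'query'."""
    rows = [p for p in products if p[1] == category and p[2] <= max_price]
    rows.sort(key=lambda p: cosine(query_vec, p[3]), reverse=True)
    return [p[0] for p in rows[:k]]

print(similar_and_filtered([1.0, 0.0, 0.0], "shoes", 100.0))  # [1]
```

With a separate vector DB, the filter and the similarity ranking live in two systems and you end up reconciling candidate lists by hand.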
You could write a FDW that reads/writes to a vector database using postgres id tagged vectors. You can write to it from postgres, reference it in queries, join on it, etc. That cuts out a lot of the pain from having separate databases, the only remaining issues are additional maintenance overhead and hidden performance cliffs.
With Postgres, you can do almost everything, including full-text search, but you still reach for Elasticsearch, Meilisearch, etc. when you need performance and advanced features. The multitool approach is suboptimal in most cases.
In small teams, the infrastructure often can't be fully utilized, so performance is not an issue. However, feature richness lets such a team deliver higher-level features faster. Think of an early-stage startup (one or two engineers) or a hairdresser-type business (they use a ready-made framework that targets a popular database and limits its features to reach a wide range of users). As a result, you can have a lot of such businesses, creating a very long tail.
SaaS products are infrastructure. Each different SaaS used is another piece that needs to be connected to the system and maintained; it thus becomes part of the system infrastructure. Each new SaaS piece has costs (time and money) associated with it.
That said, it's up to the individual company to decide if the added cost is worth it. Just because the cost exists doesn't mean it isn't worth it.
I work at Zilliz (https://zilliz.com) and am a part of the Milvus community. Here are some other resources in case anybody's interested in learning more about embeddings, vector search, and vector databases:
Not yet, but this functionality should be coming soon. We're currently working on adding the capability to call third party embedding APIs directly from a Zilliz Cloud instance.
I was mulling over the idea of building keyword image search (say, with CLIP-based embeddings). However, I'm not really sure about the cost, or whether this is the best solution. Do you have any case studies about large deployments of this software, and what the upper limits of scale might be?
As another commenter noted, Milvus is overkill and a "bit much" if you're learning/playing.
A good intro to the field, with a progression toward a full Milvus implementation, is to start with towhee[0] (which is also supported by Milvus).
towhee has an example to do exactly what you want with CLIP[1]. For icing on the cake the example notebook includes deployment of the model with Nvidia Triton Server[2], which IMO is hands down the best way to actually deploy an ML model[3].
It feels like there's an influx of "vector databases" right now. I haven't gotten a strong answer out of anyone on why you'd be better off using these over Redis, which offers vector storage with similarity search in a battle-tested OSS solution.
I'm currently evaluating different vector stores and passed on Redis today after spending about half a day looking into it. Here's my reasoning:
1. The Node.js client is designed to be just a thin wrapper around Redis commands. The client's docs basically just point you straight at the Redis docs.
2. The `@redis/search` API is slightly different from the FT.SEARCH Redis command's API. The difference is not documented and cost me ~20 minutes. This is a pretty big problem given item 1.
3. The `@redis/search` TypeScript types didn't allow some valid FT.SEARCH calls.
One thing to note is that for some use cases (IMO probably a large fraction of use cases right now), the vectors are derived data, so you don't necessarily need a super robust/battle-tested "database", you just need something with a simple happy-path API.
1. node-redis or ioredis? If you are using a "modern IDE" (VS Code/WebStorm/any IDE with support for TS language server) you should get autocomplete for all the commands (in both of them).
2. The `@redis/search` package extends `@redis/client` with support for all RediSearch commands (you should get autocomplete for all RediSearch commands as well). If you want the "all-in-one" package (support for vanilla Redis + all Redis modules), you can use the `redis` package instead.
3. Can you please share the command you are trying to run?
I haven't really looked into either. So far I've tested Pinecone, Redis, Chroma, Elasticsearch, and pgvector. I'm not really considering performance at all, just looking for something dead-simple to deploy and use. At the moment it looks like pgvector on Supabase is the winner.
Yeah, if you don't need performance, then just take a simple ANN library. No need for a database at all. In terms of databases, Qdrant and Pinecone are the simplest I've tried so far. But Pinecone isn't open source, so it's not an option for on-prem. pgvector is too much IMHO; I don't want all the other SQL stuff if I just need an NN search.
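For the "just take a simple ANN library" case, even exact brute force goes a long way at small scale; a stdlib-only sketch (not ANN, just exhaustive search):

```python
import math

def nearest(query, vectors, k=3):
    """Exact nearest-neighbour search by Euclidean distance.
    Fine up to roughly tens of thousands of vectors; beyond that,
    reach for an ANN index (HNSW, IVF, ...)."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
    return order[:k]

vecs = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [5.0, 5.0]]
print(nearest([0.0, 0.1], vecs, k=2))  # [0, 2]
```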
Not trying to say you're wrong in your decision, but in the context of the post you're replying to, aren't these points arguments against the node client rather than redis itself?
One positive thing I can say about Weaviate is that it's incredibly simple to get started with it, as it handles talking to the vectorizer(s) for you, and they provide many out of the box container images for them. With that we were able to integrate a similarity search into our product very quickly.
From what I can tell, its competitors generally don't do this; you have to generate the embeddings and handle all the back and forth with the embedding model yourself, which sets the barrier to integration much higher.
If you are at a level where you want a lot more control over embeddings, I agree it will often be better to just use the vector search capabilities that your existing solutions like Redis/Postgres are getting.
Redis uses too much memory and supports only two NN algorithms, FLAT (very slow) and HNSW. If you start indexing millions of large vectors, you will quickly understand the problem.
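A rough back-of-the-envelope (the constants here are assumptions; real overhead varies a lot by implementation) shows why memory bites at millions of large vectors:

```python
def hnsw_memory_gb(n_vectors, dim, m=16, bytes_per_float=4):
    """Very rough HNSW footprint: raw float vectors plus ~2*m graph
    links (8-byte ids) per vector. Ignores metadata, deleted slots, etc."""
    per_vector = dim * bytes_per_float + 2 * m * 8
    return n_vectors * per_vector / 1e9

# 10M 768-dim vectors already need tens of GB of RAM:
print(round(hnsw_memory_gb(10_000_000, 768), 1))  # 33.3
```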
That being said, many of these DBs are overcomplicated for most use cases. Redis HNSW will work for many use cases.
Enterprise has tiered memory. We have customers in the 50-100M range, and we've benchmarked more than that as well. Single-query latency will usually be better than most of the field. Also, we're working on other variants of our index that will scale to 500M+.
Redis does fairly well single-threaded - iirc it does better than Milvus. The problem is, once you move to larger applications and want higher performance, its architectural limitations make it impossible to scale.
Thanks for the link. Nice to see Marqo on there (disclaimer: I am a co-founder of Marqo). For anyone interested, it includes a really nice API for handling a lot of the manipulations and operations you want to do (adding, updating, and patching documents, filtering, embedding only a subset of fields, multi-modal querying, multi-modal document representations), which are absent from vector DBs. It also takes care of inference: https://github.com/marqo-ai/marqo
Tbh, it looks like a huge, overengineered legacy project. What is the point of having all these ANN indexes in place? Is it some kind of art collection? What's the sense when you can just have HNSW in memory, with quantization, or on disk, GPU-accelerated, etc.? There are already better alternatives, like Qdrant, which is written in Rust and super performant (https://github.com/qdrant/qdrant), or Weaviate, with a GraphQL interface (https://github.com/weaviate/weaviate).
One key differentiator of Milvus is the ability to use a variety of different indexing algorithms. Some will consume very little memory and provide significant speedup at the expense of recall, while others are more powerful at the expense of compute time or memory consumption.
The most commonly used one I've seen is HNSW, which greatly speeds up search at the expense of some extra memory consumption. Recall is also strong - for many large-scale use cases you can get 95%+ with properly tuned parameters.
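Recall here is measured against exact search; a trivial way to check it while tuning parameters (the IDs below are made up):

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true k nearest neighbours the ANN index returned."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

# e.g. HNSW returned 9 of the 10 true neighbours for this query:
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 42]
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(recall_at_k(approx, exact))  # 0.9
```

In practice you'd average this over many queries, using brute-force results as the ground truth.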
I haven’t tried many alternatives, but I needed a fast, self-hosted vector similarity search that could cull results based on a second criterion. Milvus has worked really well for me. It does take a ton of memory though, such that I can’t run it on small VMs, and it runs a large number of supporting services.
For those interested, here's a comparison with other open source vector databases: https://zilliz.com/comparison. For those who don't want to be burdened with installing and maintaining a local database, there's a managed service available as well: https://zilliz.com/cloud.
I'm affiliated with Qdrant and was quite surprised to see us listed in your comparisons with some false statements. If you claim to be the only database with billion-scale vector support, it would be great to make your benchmarks public, as we did: https://qdrant.tech/benchmarks/
Btw, Milvus is described in your comparisons as "a fully open source and independent project", while Weaviate and Qdrant, by contrast, are "maintained by a single commercial company offering a cloud version". Why, then, is the suggested way in the Milvus Quick Start on GitHub to use Zilliz Cloud?
Thanks! BTW it only compares open-source databases, and there's no Pinecone in it, which is a very popular cloud-managed option. Would be cool if you could benchmark/compare it.
Biggest difference is Milvus is self-hosted (manage your own infra) while Pinecone is a managed service. There are other differences but that’s often a driving factor.
We (Pinecone) tend to attract customers who want to start, ship, and scale quickly and reliably without worrying about infra and ops overhead.
I think this update introduces Milvus 2.0 and "Managed Milvus", which seems to be not self-hosted but managed in the cloud. You can create an account and see that it's similar to Pinecone.
I only heard about vector databases along with the recent advents of AI. Assuming they've been around for a while, what were the benefits of using them over "normal" search engines (e.g. ElasticSearch)?
Traditional search such as ES and Lucene rely primarily on bag-of-words retrieval and keyword matching e.g. BM25, TF-IDF, etc. Vector databases such as Milvus allow for _semantic_ search.
Here's a highly simplified example: if my query was "fields related to computer science", a semantic engine could return "statistics" and "electrical engineering" while avoiding things such as "social science" and "political science".
OpenSearch (the open-source Elasticsearch fork by AWS) has supported similarity search for embeddings for a couple of years now with its k-NN plugin. It supports multiple engines, including Faiss and nmslib (HNSW), and has post-result filtering support and replication support. A good option, imo.
ES has support for vector search now too. Really you want both in use cases where the user expects the top results to contain the search keywords but also wants results that are synonyms or conceptually similar. TF-IDF and BM25 help with the first part and vectors help with the second. Theoretically only vectors should be needed, but that isn't my experience in practice.
Totally agree. The thing is that ElasticSearch does not meet our requirements for vector search.
I am currently running Milvus + ElasticSearch, and it works perfectly. The latest Milvus version is super fast and scalable (>50M vectors). Haven't tried Zilliz Cloud; I have to find out what the cost is.
I am old school. IMO ElasticSearch is only good for keyword search and these so called "vector databases" products are only good for vector search.
Others have made other suggestions, but Vespa has two unique features. First, it is battle-tested at large scale; second, it supports combining the keyword and vector scores in several ways. The latter is something that other hybrid systems don't do very well, in my experience.
Didn't even realise Milvus was so lacking. https://github.com/marqo-ai/marqo also has a hybrid approach. It's just a more complete/end-to-end platform than Pinecone, so it really just depends on what you're building.
Could you please elaborate on how you utilize both of them together, and for which specific use case? I'm attempting to gain a better understanding of the hybrid approach.
The trick is to make ElasticSearch scores "comparable" to Milvus scores. There are lots of ways to do this, but no single good solution. For example, you could calculate the BM25 score offline, or use the TF-IDF score to do some kind of filtering. Again, there's no single perfect answer. You'd have to do a lot of experimenting with your own use case and your own data to get the best results.
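One rank-based alternative that sidesteps score calibration entirely is reciprocal rank fusion (RRF); whether it beats a hand-tuned score combination is use-case dependent:

```python
def rrf(rankings, k=60):
    """Reciprocal rank fusion: merge ranked ID lists from different
    systems (e.g. BM25 and vector search) without comparing raw
    scores. k=60 is the commonly used constant."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_top = ["d1", "d2", "d3"]
vector_top = ["d3", "d1", "d4"]
print(rrf([bm25_top, vector_top]))  # ['d1', 'd3', 'd2', 'd4']
```

Documents ranked highly by both systems float to the top; documents seen by only one system still get a chance.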
Also a lot of tuning needs to be done during all phases:
1) query pre-processing
2) query tokenizing
3) retrieval
4) ranking and reranking
I personally would not trust any universal "hybrid-search" solutions. They're all toy demos.
It usually takes 5-10 good engineers to build a decent search engine/system for any real use case. It also requires a lot of tuning, tricks, and hand-written rules to make things work.
Which is why Pinecone supports hybrid search, which has been shown to provide better results for out-of-domain use cases than either semantic search or keyword search alone: https://www.pinecone.io/learn/hybrid-search-intro/
IMO vector databases should not mess with ElasticSearch.
The real focus should be on improving the recall of vector search. It's a pity that nobody is doing real AI research here; the money is wasted on marketing and branding.
Somebody already mentioned it in this thread, but with Weaviate we aim to create a more end-to-end experience that can be used open source or as (hybrid) SaaS. For example, you can store the complete data object, use ANN and inverted-index filters (hybrid search), and add optional modules for vectorization. More info about the concepts behind Weaviate here: https://weaviate.io/developers/weaviate/concepts
Pinecone offers a free managed tier, which was quite nice until it lost my data last month. They did eventually recover it a few days later, to be fair to them.
I am not sure why Milvus is so popular. There are other pretty good options like Weaviate and Vespa - both are open source.
One important limitation of Milvus that I cannot ignore is that it does not support pre-filtering like Vespa and a few others do. Pre-filtering is critically important for use cases in e-commerce etc. where filters based on structured metadata are applied e.g. brand, category etc.
Can these databases do fast averaging of nearest neighbors for regression or do you have to retrieve the neighbors and manually compute a mean across them?
You'll have to manually compute the mean after retrieving the nearest neighbors. Automatic fast averaging could be an interesting feature for clustering, or for using a vector database to generate training data, though.
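A minimal sketch of that post-processing step (retrieve the k neighbours, then average their targets by hand; toy data, exhaustive search standing in for the database lookup):

```python
import math

def knn_regress(query, points, k=3):
    """k-NN regression on top of a plain nearest-neighbour lookup:
    the averaging happens client-side, not in the database."""
    neighbours = sorted(points, key=lambda p: math.dist(query, p[0]))[:k]
    return sum(target for _, target in neighbours) / k

# (vector, target) pairs; made-up data
points = [([0.0], 1.0), ([0.1], 2.0), ([0.2], 3.0), ([5.0], 100.0)]
print(knn_regress([0.0], points, k=3))  # 2.0
```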
[1] https://github.com/pgvector/pgvector