A blazing fast inference server for text embeddings models.
# Serve the default model
$ text-embeddings-inference

# Serve on a custom port
$ text-embeddings-inference --port 8080

# Serve a specific model from the Hugging Face Hub
$ text-embeddings-inference --model-id sentence-transformers/all-MiniLM-L6-v2

$ text-embeddings-inference --cuda-cores 8

# List all available options
$ text-embeddings-inference --help
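Once the server is running, clients talk to it over HTTP. A minimal Python sketch, assuming a server started with the commands above is listening on `http://127.0.0.1:8080` and exposes text-embeddings-inference's `POST /embed` route, which takes a JSON body of the form `{"inputs": [...]}` and returns one embedding vector per input:

```python
import json
import urllib.request


def build_payload(texts):
    # JSON body for the /embed route: {"inputs": ["text a", "text b", ...]}
    return json.dumps({"inputs": texts}).encode("utf-8")


def embed(texts, url="http://127.0.0.1:8080/embed"):
    """Request embeddings for a list of texts from a running
    text-embeddings-inference server; returns a list of float vectors."""
    req = urllib.request.Request(
        url,
        data=build_payload(texts),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The URL and port here are assumptions that should match whatever `--port` was passed when launching the server; only the standard library is used, so the snippet works without installing a client package.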