- nomic-embed-vision-v1.5 — paired, for image-to-text search
- clip-ViT-B-32-vision — paired, for image-to-text search
- nomic-v2-moe feature (candle backend)
- qwen3 feature (candle backend)
- qwen3 feature (candle backend, multimodal via Qwen3VLEmbedding)

Quantized versions of several of the models above are also available (append Q to the model enum variant, e.g. EmbeddingModel::BGESmallENV15Q).
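As an illustration of the naming convention, a quantized variant is initialized exactly like any other model; only the Q-suffixed enum variant changes (sketch, using the same InitOptions flow shown in the usage examples below):

```rust
use fastembed::{EmbeddingModel, InitOptions, TextEmbedding};

// Same initialization path as the full-precision model,
// but selecting the quantized BGESmallENV15Q variant.
let mut model = TextEmbedding::try_new(
    InitOptions::new(EmbeddingModel::BGESmallENV15Q).with_show_download_progress(true),
)?;
```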
If you would like to support this library, consider donating to our primary upstream dependency, ort (the Rust wrapper for the ONNX Runtime).
Run the following in your project directory:
cargo add fastembed
Or add the following line to your Cargo.toml:
[dependencies]
fastembed = "5"
use fastembed::{TextEmbedding, InitOptions, EmbeddingModel};
// With default options
let mut model = TextEmbedding::try_new(Default::default())?;
// With custom options
let mut model = TextEmbedding::try_new(
InitOptions::new(EmbeddingModel::AllMiniLML6V2).with_show_download_progress(true),
)?;
let documents = vec![
"passage: Hello, World!",
"query: Hello, World!",
"passage: This is an example passage.",
// Prefixes can be omitted, but are recommended
"fastembed-rs is licensed under Apache 2.0"
];
// Generate embeddings with the default batch size, 256
let embeddings = model.embed(documents, None)?;
println!("Embeddings length: {}", embeddings.len()); // -> Embeddings length: 4
println!("Embedding dimension: {}", embeddings[0].len()); // -> Embedding dimension: 384
The Qwen3 embedding models require the qwen3 feature flag (candle backend).
[dependencies]
fastembed = { version = "5", features = ["qwen3"] }
use candle_core::{DType, Device};
use fastembed::Qwen3TextEmbedding;
let device = Device::Cpu;
let model = Qwen3TextEmbedding::from_hf(
"Qwen/Qwen3-Embedding-0.6B",
&device,
DType::F32,
512,
)?;
// Text-only embedding with a Qwen3-VL embedding checkpoint is also supported:
// let model = Qwen3TextEmbedding::from_hf("Qwen/Qwen3-VL-Embedding-2B", &device, DType::F32, 512)?;
let embeddings = model.embed(&["query: ...", "passage: ..."])?;
println!("Embeddings length: {}", embeddings.len());
Multimodal text/image embedding with Qwen/Qwen3-VL-Embedding-2B:
use candle_core::{DType, Device};
use fastembed::Qwen3VLEmbedding;
let device = Device::Cpu;
let model = Qwen3VLEmbedding::from_hf(
"Qwen/Qwen3-VL-Embedding-2B",
&device,
DType::F32,
2048,
)?;
let image_embeddings = model.embed_images(&["tests/assets/image_0.png", "tests/assets/image_1.png"])?;
let text_embeddings = model.embed_texts(&["query: blue cat", "query: red cat"])?;
println!("Image embeddings: {}", image_embeddings.len());
println!("Text embeddings: {}", text_embeddings.len());
The nomic-embed-text-v2-moe model requires the nomic-v2-moe feature flag (candle backend). It is the first general-purpose MoE embedding model, supporting 100+ languages.
[dependencies]
fastembed = { version = "5", features = ["nomic-v2-moe"] }
use candle_core::{DType, Device};
use fastembed::NomicV2MoeTextEmbedding;
let device = Device::Cpu;
let model = NomicV2MoeTextEmbedding::from_hf(
"nomic-ai/nomic-embed-text-v2-moe",
&device,
DType::F32,
512,
)?;
let embeddings = model.embed(&["search_query: ...", "search_document: ..."])?;
println!("Embeddings length: {}", embeddings.len());
use fastembed::{SparseEmbedding, SparseInitOptions, SparseModel, SparseTextEmbedding};
// With default options
let mut model = SparseTextEmbedding::try_new(Default::default())?;
// With custom options
let mut model = SparseTextEmbedding::try_new(
SparseInitOptions::new(SparseModel::SPLADEPPV1).with_show_download_progress(true),
)?;
let documents = vec![
"passage: Hello, World!",
"query: Hello, World!",
"passage: This is an example passage.",
"fastembed-rs is licensed under Apache 2.0"
];
// Generate embeddings with the default batch size, 256
let embeddings: Vec<SparseEmbedding> = model.embed(documents, None)?;
use fastembed::{ImageEmbedding, ImageInitOptions, ImageEmbeddingModel};
// With default options
let mut model = ImageEmbedding::try_new(Default::default())?;
// With custom options
let mut model = ImageEmbedding::try_new(
ImageInitOptions::new(ImageEmbeddingModel::ClipVitB32).with_show_download_progress(true),
)?;
let images = vec!["assets/image_0.png", "assets/image_1.png"];
// Generate embeddings with the default batch size, 256
let embeddings = model.embed(images, None)?;
println!("Embeddings length: {}", embeddings.len()); // -> Embeddings length: 2
println!("Embedding dimension: {}", embeddings[0].len()); // -> Embedding dimension: 512
use fastembed::{TextRerank, RerankInitOptions, RerankerModel};
// With default options
let mut model = TextRerank::try_new(Default::default())?;
// With custom options
let mut model = TextRerank::try_new(
RerankInitOptions::new(RerankerModel::BGERerankerBase).with_show_download_progress(true),
)?;
let documents = vec![
"hi",
"The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear, is a bear species endemic to China.",
"panda is animal",
"i dont know",
"kind of mammal",
];
// Rerank with the default batch size, 256, and return the document contents
let results = model.rerank("what is panda?", documents, true, None)?;
println!("Rerank result: {:?}", results);
In addition, local model files can be used for inference via the try_new_from_user_defined(...) method of the respective struct.
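A minimal sketch of loading a local text-embedding model this way, assuming the model and tokenizer files are read into memory as raw bytes (the paths below are illustrative; check the crate docs for the exact UserDefinedEmbeddingModel constructor for your fastembed version):

```rust
use std::fs;

use fastembed::{
    InitOptionsUserDefined, TextEmbedding, TokenizerFiles, UserDefinedEmbeddingModel,
};

// Read the ONNX model and tokenizer files from disk (illustrative paths).
let onnx_file = fs::read("path/to/model.onnx")?;
let tokenizer_files = TokenizerFiles {
    tokenizer_file: fs::read("path/to/tokenizer.json")?,
    config_file: fs::read("path/to/config.json")?,
    special_tokens_map_file: fs::read("path/to/special_tokens_map.json")?,
    tokenizer_config_file: fs::read("path/to/tokenizer_config.json")?,
};
let user_model = UserDefinedEmbeddingModel::new(onnx_file, tokenizer_files);

// Initialize the embedder from the in-memory model instead of downloading one.
let mut model = TextEmbedding::try_new_from_user_defined(
    user_model,
    InitOptionsUserDefined::default(),
)?;
let embeddings = model.embed(vec!["passage: Hello, World!"], None)?;
```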