LLMWare: an open-source toolkit for enterprise document understanding, RAG, and small-model workflows

ci · 2026-01-18 20:03:01 · 60 views · 0 comments

llmware

🧰🛠️ A unified framework for building knowledge-based, local, private, and secure LLM applications

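llmware is optimized for AI PCs, local laptops, edge, and self-hosted deployments. It runs on a wide range of Windows, Mac, and Linux platforms and supports GGUF, OpenVINO, ONNXRuntime, ONNXRuntime-QNN (Qualcomm), WindowsLocalFoundry, and PyTorch, with a high-level interface that makes it easy to use the right inference technology for the target platform.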

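llmware has two main components:

Model catalog with 300+ models - models are prepackaged in quantized, optimized formats to take advantage of on-device GPU and NPU capabilities. The catalog covers the major open-source model families plus 50+ llmware fine-tuned SLIM, Bling, Dragon, and Industry-Bert models specialized for key tasks in enterprise process automation. The main cloud models from OpenAI, Anthropic, and Google are supported as well.
RAG pipeline - integrated components for the full lifecycle of connecting knowledge sources to generative AI models, with extensive document parsing and ingestion capabilities and the ability to build scalable knowledge bases.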

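By bringing these two components together, llmware offers a comprehensive set of tools for rapidly building knowledge-based enterprise LLM applications.

Our vision is that AI should be sustainable, accurate, and cost-effective, using the smallest possible compute footprint to get the job done.

In practice, all of our examples and models can run on-device - get started right away on your laptop.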

Join our Discord | Watch YouTube tutorials | Explore our model families on Hugging Face

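🎯 Key features

Writing code with llmware is based on a few main concepts:

Model catalog: access all models in the same way, with easy lookup, regardless of the underlying implementation.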

#   300+ models in the catalog, including 50+ RAG-optimized BLING, DRAGON, and Industry BERT models
#   Full support for GGUF, OpenVINO, Onnxruntime, HuggingFace, Sentence Transformers, and major API-based models
#   Easy to extend with custom models - see examples

from llmware.models import ModelCatalog
from llmware.prompts import Prompt

#   All models are accessed through the ModelCatalog
models = ModelCatalog().list_all_models()

#   To use any model in the ModelCatalog - call "load_model" and pass the model_name parameter
my_model = ModelCatalog().load_model("llmware/bling-phi-3-gguf")

#   Call the model for inference
output = my_model.inference("what is the future of AI?", add_context="Here is the article to read")

#   Call the model with streaming output
for token in my_model.stream("What is the future of AI?"):
    print(token, end="")

#   Integrate a model into a Prompt
prompter = Prompt().load_model("llmware/bling-tiny-llama-v0")
response = prompter.prompt_main("what is the future of AI?", context="Insert Sources of information")
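
The idea of reaching every model through one access pattern can be illustrated with a small registry. The sketch below is not llmware's implementation: `ToyCatalog`, `EchoBackend`, and `UpperBackend` are hypothetical names invented for illustration, and the only assumption is that a catalog maps a model name to a loader and that every loaded model exposes an `inference` method.

```python
from typing import Callable, Dict, List


class EchoBackend:
    """Hypothetical backend standing in for one runtime (for example a GGUF loader)."""

    def inference(self, prompt: str) -> str:
        return f"echo: {prompt}"


class UpperBackend:
    """Hypothetical backend standing in for a different runtime."""

    def inference(self, prompt: str) -> str:
        return prompt.upper()


class ToyCatalog:
    """Maps model names to loader callables so callers never touch a backend directly."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[[], object]] = {}

    def register(self, model_name: str, loader: Callable[[], object]) -> None:
        self._loaders[model_name] = loader

    def list_all_models(self) -> List[str]:
        return sorted(self._loaders)

    def load_model(self, model_name: str):
        if model_name not in self._loaders:
            raise KeyError(f"unknown model: {model_name}")
        return self._loaders[model_name]()


catalog = ToyCatalog()
catalog.register("toy-echo", EchoBackend)
catalog.register("toy-upper", UpperBackend)

# The calling code is identical no matter which backend sits behind the name.
for name in catalog.list_all_models():
    model = catalog.load_model(name)
    print(name, "->", model.inference("hello"))
```

Adding a custom model in this toy version is a single `register` call, which is the property the catalog concept is meant to provide.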

Library: ingest, organize, and index a collection of knowledge at scale - parse, text chunk, and embed.

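from llmware.library import Library

#   Parse and text chunk a set of documents (pdf, pptx, docx, xlsx, txt, csv, md, json/jsonl, wav, png, jpg, html)

#   Step 1 - Create a library, which is the "knowledge-base container" construct
#          - libraries have both text collection (DB) resources and file resources (e.g., llmware_data/accounts/{library_name})
#          - embeddings and queries are run against a library

lib = Library().create_new_library("my_library")

#   Step 2 - add_files is the universal ingestion function - point it at a local folder with mixed file types
#           - files are routed by file extension to the correct parser, then parsed, text chunked, and indexed in the text collection DB

lib.add_files("/folder/path/to/my/files")

#   Install an embedding on the library - pick an embedding model and a vector DB
lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="milvus", batch_size=500)

#   Add a second embedding to the same library (mix and match models + vector DBs)
lib.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="chromadb", batch_size=100)

#   Easy to create multiple libraries for different projects and groups

finance_lib = Library().create_new_library("finance_q4_2023")
finance_lib.add_files("/finance_folder/")

hr_lib = Library().create_new_library("hr_policies")
hr_lib.add_files("/hr_folder/")

#   Pull the library card with key metadata - documents, text chunks, images, tables, embedding record
lib_card = Library().get_library_card("my_library")

#   See all libraries
all_my_libs = Library().get_all_library_cards()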
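
The ingestion flow described in the comments above (route by file extension, parse, chunk, index) can be sketched in a few lines of standard-library Python. This is a simplified illustration, not how llmware parses documents: the parsers, the `PARSERS` table, the word-based chunking rule, and the record fields are all assumptions made for the example.

```python
from pathlib import Path
from typing import Callable, Dict, List


def parse_text(raw: str) -> str:
    """Hypothetical plain-text parser: returns the content unchanged."""
    return raw


def parse_csv(raw: str) -> str:
    """Hypothetical CSV parser: flattens commas into spaces."""
    return raw.replace(",", " ")


# Hypothetical routing table keyed by file extension.
PARSERS: Dict[str, Callable[[str], str]] = {
    ".txt": parse_text,
    ".md": parse_text,
    ".csv": parse_csv,
}


def chunk_text(text: str, max_words: int = 5) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def ingest(files: Dict[str, str], max_words: int = 5) -> List[dict]:
    """Route each file to a parser by extension, chunk it, and return index records."""
    records = []
    for file_name, raw in files.items():
        parser = PARSERS.get(Path(file_name).suffix.lower())
        if parser is None:
            continue  # unsupported file type: skipped in this sketch
        for block_id, chunk in enumerate(chunk_text(parser(raw), max_words)):
            records.append({"file_name": file_name, "block_id": block_id, "text": chunk})
    return records


# In-memory stand-ins for files in a folder (names and contents are made up).
sample_files = {
    "notes.txt": "one two three four five six seven",
    "table.csv": "a,b,c",
    "image.bin": "ignored",
}
records = ingest(sample_files)
for rec in records:
    print(rec)
```

Each record keeps the source file name next to the chunk text, which is what makes document-level filtering possible at query time.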

Query: query libraries with a mix of text, semantic, hybrid, metadata, and custom filters.

from llmware.retrieval import Query
from llmware.library import Library

#   Step 1 - load the previously created library
lib = Library().load_library("my_library")

#   Step 2 - create a query object and pass the library
q = Query(lib)

#   Step 3 - run lots of different queries (many other options in the examples)

#   basic text query
results1 = q.text_query("text query", result_count=20, exact_mode=False)

#   semantic query
results2 = q.semantic_query("semantic query", result_count=10)

#   text query restricted to certain documents in the library, with an "exact" match to the query
results3 = q.text_query_with_document_filter("new query", {"file_name": "selected file name"}, exact_mode=True)

#   to apply a specific embedding (if the library has several), pass the names when creating the query object
q2 = Query(lib, embedding_model_name="mini-lm-sbert", vector_db="milvus")
results4 = q2.semantic_query("new semantic query")
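
To make the difference between exact and non-exact text queries and the effect of a document filter concrete, here is a toy keyword search over in-memory records. It is a sketch under stated assumptions, not llmware's query engine: the `BLOCKS` data, the `text_query` helper, and the rule that non-exact mode requires every query word to appear somewhere in the block are all hypothetical choices made for the example.

```python
from typing import Dict, List, Optional

# Hypothetical index records; a real library stores far richer metadata.
BLOCKS = [
    {"file_name": "a.txt", "text": "The total amount of the invoice is due in 30 days"},
    {"file_name": "b.txt", "text": "Invoice number and total are listed below"},
    {"file_name": "b.txt", "text": "Trade surplus figures for September"},
]


def text_query(query: str, blocks: List[dict], exact_mode: bool = False,
               doc_filter: Optional[Dict[str, str]] = None,
               result_count: int = 20) -> List[dict]:
    """Keyword search; exact_mode needs the whole phrase, otherwise every word must appear."""
    q = query.lower()
    hits = []
    for block in blocks:
        # Skip blocks whose metadata does not match every key in the filter.
        if doc_filter and any(block.get(k) != v for k, v in doc_filter.items()):
            continue
        text = block["text"].lower()
        if exact_mode:
            matched = q in text
        else:
            words = text.split()
            matched = all(word in words for word in q.split())
        if matched:
            hits.append(block)
    return hits[:result_count]


loose = text_query("invoice total", BLOCKS)
strict = text_query("invoice total", BLOCKS, exact_mode=True)
filtered = text_query("invoice total", BLOCKS, doc_filter={"file_name": "b.txt"})
print(len(loose), len(strict), len(filtered))
```

A semantic query replaces the keyword test with a vector similarity score, which is why it depends on an embedding having been installed on the library first.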

Prompt with sources: the easiest way to combine knowledge retrieval with LLM inference.

from llmware.prompts import Prompt
from llmware.retrieval import Query
from llmware.library import Library

#   build a prompt
prompter = Prompt().load_model("llmware/bling-tiny-llama-v0")

#   add a file -> the file is parsed, text chunked, filtered by query, and then packaged as model-ready context,
#   batched if needed to fit the model context window

source = prompter.add_source_document("/folder/to/one/doc/", "filename", query="fast query")

#   attach query results (from a Query) to the prompt
my_lib = Library().load_library("my_library")
results = Query(my_lib).query("my query")
source2 = prompter.add_source_query_results(results)

#   run a new query against a library and load the results directly into the prompt
source3 = prompter.add_source_new_query(my_lib, query="my new query", query_type="semantic", result_count=15)

#   run inference with "prompt with sources"
responses = prompter.prompt_with_source("my query")

#   run fact checks - post inference
fact_check = prompter.evidence_check_sources(responses)

#   view the source materials (batched "model-ready" and attached to the prompt)
source_materials = prompter.review_sources_summary()

#   see the full prompt history
prompt_history = prompter.get_current_history()
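
The step of packaging sources as model-ready context, batched to fit the context window, can be pictured as greedy packing. The sketch below is an illustration only: `batch_sources` and `build_prompts` are hypothetical helpers, and counting whitespace-separated words is a stand-in for real token counting, which depends on the model's tokenizer.

```python
from typing import List


def batch_sources(chunks: List[str], max_words: int) -> List[str]:
    """Greedily pack chunks into batches whose word count stays within max_words.

    A single chunk longer than max_words still gets a batch of its own.
    """
    batches: List[str] = []
    current: List[str] = []
    current_len = 0
    for chunk in chunks:
        n = len(chunk.split())
        # Start a new batch when adding this chunk would exceed the budget.
        if current and current_len + n > max_words:
            batches.append("\n".join(current))
            current, current_len = [], 0
        current.append(chunk)
        current_len += n
    if current:
        batches.append("\n".join(current))
    return batches


def build_prompts(question: str, chunks: List[str], max_words: int) -> List[str]:
    """Produce one prompt per batch, each pairing a context batch with the question."""
    return [f"{context}\n\nQuestion: {question}" for context in batch_sources(chunks, max_words)]


chunks = ["alpha beta gamma", "delta epsilon", "zeta eta theta iota", "kappa"]
prompts = build_prompts("What is listed?", chunks, max_words=5)
for p in prompts:
    print(p)
    print("---")
```

Running one inference per batch yields a list of responses rather than a single answer, which is consistent with `prompt_with_source` above returning `responses` in the plural.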

RAG-optimized models - 1-7B parameter models designed for RAG workflow integration and for running locally.

```
"""This 'Hello World' example demonstrates how to get started using local BLING models with provided context,
including both PyTorch and GGUF versions."""

import time
from llmware.prompts import Prompt


def hello_world_questions():

    test_list = [

        {"query": "What is the total amount of the invoice?",
         "answer": "$22,500.00",
         "context": "Services Vendor Inc. \n100 Elm Street Pleasantville, NY \nTO Alpha Inc. 5900 1st Street "
                    "Los Angeles, CA \nDescription Front End Engineering Service $5000.00 \n Back End Engineering"
                    " Service $7500.00 \n Quality Assurance Manager $10,000.00 \n Total Amount $22,500.00 \n"
                    "Make all checks payable to Services Vendor Inc. Payment is due within 30 days."
                    "If you have any questions concerning this invoice, contact Bia Hermes. "
                    "THANK YOU FOR YOUR BUSINESS! INVOICE INVOICE # 0001 DATE 01/01/2022 FOR Alpha Project P.O. # 1000"},

        {"query": "What was the amount of the trade surplus?",
         "answer": "62.4 billion yen ($416.6 million)",
         "context": "Japan’s September trade balance swings into surplus, surprising expectations"
                    "Japan recorded a trade surplus of 62.4 billion yen ($416.6 million) for September, "
                    "beating expectations from economists polled by Reuters for a trade deficit of 42.5 "
                    "billion yen. Data from Japan’s customs agency revealed that exports in September "
                    "increased 4.3% year on year, while imports slid 16.3% compared to the same period "
                    "last year. According to FactSet, exports to Asia fell for the ninth straight month, "
                    "which reflected ongoing China weakness. Exports were supported by shipments to "
                    "Western markets, FactSet added. — Lim Hui Jie"},

        {"query": "When did the LISP machine market collapse?",
         "answer": "1987.",
         "context": "The attendees became the leaders of AI research in the 1960s."
                    " They and their students produced programs that the press described as 'astonishing': "
                    "computers were learning checkers strategies, solving word problems in algebra, "
                    "proving logical theorems and speaking English. By the middle of the 1960s, research in "
                    "the U.S. was heavily funded by the Department of Defense and laboratories had been "
                    "established around the world. Herbert Simon predicted, 'machines will be capable, "
                    "within twenty years, of doing any work a man can do'. Marvin Minsky agreed, writing, "
                    "'within a generation ... the problem of creating 'artificial intelligence' will "
                    "substantially be solved'. They had, however, underestimated the difficulty of the problem. "
                    "Both the U.S. and British governments cut off exploratory research in response "
                    "to the criticism of Sir James Lighthill and ongoing pressure from the US Congress "
                    "to fund more productive projects. Minsky's and Papert's book Perceptrons was understood "
                    "as proving that artificial neural networks approach would never be useful for solving "
                    "real-world tasks, thus discrediting the approach altogether. The 'AI winter', a period "
                    "when obtaining funding for AI projects was difficult, followed. In the early 1980s, "
                    "AI research was revived by the commercial success of expert systems, a form of AI "
                    "program that simulated the knowledge and analytical skills of human experts. By 1985, "
                    "the market for AI had reached over a billion dollars. At the same time, Japan's fifth "
                    "generation computer project inspired the U.S. and British governments to restore funding "
                    "for academic research. However, beginning with the collapse of the Lisp Machine market "
                    "in 1987, AI once again fell into disrepute, and a second, longer-lasting winter began."},

        {"query": "What is the current rate on 10-year treasuries?",
         "answer": "4.58%",
         "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data "
                    "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, "
                    "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy "
                    "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in "
                    "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 "
                    "jobs. However, wages rose less than expected last month. Stocks posted a stunning "
                    "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. "
                    "At its session low, the Dow had fallen as much as 198 points; it surged by more than "
                    "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during "
                    "their lowest points in the day. Traders were unclear of the reason for the intraday "
                    "reversal. Some noted it could be the softer wage number in the jobs report that made "
                    "investors rethink their earlier bearish stance. Others noted the pullback in yields from "
                    "the day’s highs. Part of the rally may just be to do a market that had gotten extremely "
                    "oversold with the S&P 500 at one point this week down more than 9% from its high earlier "
                    "this year. Yields initially surged after the report, with the 10-year Treasury rate trading "
                    "near its highest level in 14 years. The benchmark rate later eased from those levels, but "
                    "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back "
                    "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s "
                    "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries "
                    "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially "
                    "some oversold conditions.'"},

        {"query": "Is the expected gross margin greater than 70%?",
         "answer": "Yes, between 71.5% and 72.5%",
         "context": "Outlook NVIDIA’s outlook for the third quarter of fiscal 2024 is as follows:"
                    "Revenue is expected to be $16.00 billion, plus or minus 2%. GAAP and non-GAAP "
                    "gross margins are expected to be 71.5% and 72.5%, respectively, plus or minus "
                    "50 basis points. GAAP and non-GAAP operating expenses are expected to be "
                    "approximately $2.95 billion and $2.00 billion, respectively. GAAP and non-GAAP "
                    "other income and expense are expected to be an income of approximately $100 "
                    "million, excluding gains and losses from non-affiliated investments. GAAP and "
                    "non-GAAP tax rates are expected to be 14.5%, plus or minus 1%, excluding any discrete items."
                    "Highlights NVIDIA achieved progress since its previous earnings announcement "
                    "in these areas: Data Center Second-quarter revenue was a record $10.32 billion, "
                    "up 141% from the previous quarter and up 171% from a year ago. Announced that the "
                    "NVIDIA® GH200 Grace™ Hopper™ Superchip for complex AI and HPC workloads is shipping "
                    "this quarter, with a second-generation version with HBM3e memory expected to ship "
                    "in Q2 of calendar 2024. "},

        {"query": "What is Bank of America's rating on Target?",
         "answer": "Buy",
         "context": "Here are some of the tickers on my radar for Thursday, Oct. 12, taken directly from "
                    "my reporter’s notebook: It’s the one-year anniversary of the S&P 500′s bear market bottom "
                    "of 3,577. Since then, as of Wednesday’s close of 4,376, the broad market index "
                    "soared more than 22%. Hotter than expected September consumer price index, consumer "
                    "inflation. The Social Security Administration issues announced a 3.2% cost-of-living "
                    "adjustment for 2024. Chipotle Mexican Grill (CMG) plans price increases. Pricing power. "
                    "Cites consumer price index showing sticky retail inflation for the fourth time "
                    "in two years. Bank of America upgrades Target (TGT) to buy from neutral. Cites "
                    "risk/reward from depressed levels. Traffic could improve. Gross margin upside. "
                    "Merchandising better. Freight and transportation better. Target to report quarter "
                    "next month. In retail, the CNBC Investing Club portfolio owns TJX Companies (TJX), "
                    "the off-price jug
```

Project repository: https://github.com/llmware-ai/llmware
