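Guidance is an efficient programming paradigm for steering large language models. With Guidance, you can control how output is structured and get high-quality output for your use case, while reducing latency and cost compared with conventional prompting or fine-tuning. It lets users constrain generation (for example with regular expressions and context-free grammars) and seamlessly interleave control (conditionals, loops, tool use) with generation.
Guidance is available on PyPI and supports a variety of backends (Transformers, llama.cpp, OpenAI, etc.). If you already have the backend required for your model, you can simply run: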
pip install guidance
With Guidance, you can work with large language models using common Python idioms:
from guidance import system, user, assistant, gen
from guidance.models import Transformers
# LlamaCpp or one of the many other model classes could be used instead
phi_lm = Transformers("microsoft/Phi-4-mini-instruct")
# Model objects are immutable, so this is a copy
lm = phi_lm
with system():
    lm += "You are a helpful assistant"
with user():
    lm += "Hello. What is your name?"
with assistant():
    lm += gen(max_tokens=20)
print(lm)
Running this at the command line produces output like:
<|system|>You are a helpful assistant<|end|><|user|>Hello. What is your name?<|end|><|assistant|>I am Phi, an AI developed by Microsoft. How can I help you today?
When running in a Jupyter Notebook, Guidance provides a widget for a richer user experience.

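The transcript printed by the command-line run above is the model's chat template applied to the three role blocks. As a rough illustration only, the following standard-library sketch rebuilds that string from (role, text) pairs. The `render_chat` helper and the tag layout are assumptions inferred from the printed output; this is not how Guidance or the model applies its template internally.

```python
def render_chat(turns, open_role=None):
    """Join (role, text) turns in the <|role|>text<|end|> layout seen above.

    Hypothetical helper: the layout is inferred from the printed transcript.
    """
    rendered = "".join(f"<|{role}|>{text}<|end|>" for role, text in turns)
    if open_role is not None:
        # Leave the final turn open so that generated text can follow it.
        rendered += f"<|{open_role}|>"
    return rendered


prompt = render_chat(
    [
        ("system", "You are a helpful assistant"),
        ("user", "Hello. What is your name?"),
    ],
    open_role="assistant",
)
print(prompt)
```

Everything printed by this sketch corresponds to the part of the transcript that precedes the model's reply.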
Capturing generated text with Guidance is straightforward:
# Get a new copy of the model
lm = phi_lm
with system():
    lm += "You are a helpful assistant"
with user():
    lm += "Hello. What is your name?"
with assistant():
    lm += gen(name="lm_response", max_tokens=20)
print(f"{lm['lm_response']=}")
lm['lm_response']='I am Phi, an AI developed by Microsoft. How can I help you today?'
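Guidance provides an easy-to-use yet very powerful syntax for constraining the output of a language model. For example, a gen() call can be constrained to match a regular expression:
lm = phi_lm
with system():
    lm += "You are a teenager"
with user():
    lm += "How old are you?"
with assistant():
    lm += gen("lm_age", regex=r"\d+", temperature=0.8)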
print(f"The language model is {lm['lm_age']} years old")
The language model is 13 years old
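Without the constraint, the reply could just as well be "13", "thirteen", or "I am 13 years old", and the caller would have to parse defensively. The standard-library sketch below shows the kind of check the regex constraint makes unnecessary. `parse_age` is a hypothetical helper, not part of Guidance.

```python
import re

AGE_PATTERN = re.compile(r"\d+")  # same pattern as in the gen() call above


def parse_age(text):
    """Hypothetical helper: return the age as an int, or None if text is not all digits."""
    if AGE_PATTERN.fullmatch(text) is None:
        return None
    return int(text)


print(parse_age("13"))        # a constrained reply always takes this path
print(parse_age("thirteen"))  # unconstrained replies may not parse
print(parse_age("13 years"))
```

With the constraint in place, a plain `int(lm["lm_age"])` should be enough.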
Often, we know that the output has to be one item from a list known in advance. Guidance provides the select() function for this scenario:
from guidance import select
lm = phi_lm
with system():
    lm += "You are a geography expert"
with user():
    lm += """What is the capital of Sweden? Answer with the correct letter.
A) Helsinki
B) Reykjavík
C) Stockholm
D) Oslo
"""
with assistant():
    lm += select(["A", "B", "C", "D"], name="model_selection")
print(f"The model selected {lm['model_selection']}")
The model selected C
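Because the captured value can only be one of the listed options, mapping it back to the option text is a plain dictionary lookup with no fallback branch. A minimal sketch, in which `CAPITALS` and `describe_selection` are illustrative names that do not come from Guidance:

```python
# Option letters and their text, mirroring the prompt above (illustrative names).
CAPITALS = {"A": "Helsinki", "B": "Reykjavík", "C": "Stockholm", "D": "Oslo"}


def describe_selection(letter):
    """Hypothetical helper: turn a selected letter back into readable text."""
    # No error handling for unknown letters: the selection is limited to the keys.
    return f"{letter}) {CAPITALS[letter]}"


print(describe_selection("C"))
```

One possible design is to keep the options in a single mapping like this and pass `list(CAPITALS)` as the option list, so the prompt and the lookup cannot drift apart.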
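The constraint system offered by Guidance is extremely powerful. It can ensure that the output conforms to any context-free grammar (so long as the backend LLM has full support for Guidance). More on this below.
While iterating on a constraint, you can validate candidate strings locally and test a complete run against a Mock model.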
from guidance import gen
from guidance.models import Mock
grammar = "expr=" + gen(regex=r"\d+([+*]\d+)*", name="expr")
# 1) Validate strings directly against the grammar
assert grammar.match("expr=12+7*3") is not None
assert grammar.match("expr=12+*3") is None
# 2) Run the same grammar with a local Mock model
lm = Mock(b"<s>expr=12+7*3")
lm += grammar
print(lm["expr"]) # 12+7*3
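Once the captured text is known to match the pattern, downstream code can stay small. As an illustration, the standard-library sketch below evaluates such an expression with the usual precedence. `evaluate` is a hypothetical helper; the `fullmatch` guard only mirrors the regex used above so the sketch also works on unconstrained input.

```python
import re
from math import prod

EXPR_PATTERN = re.compile(r"\d+([+*]\d+)*")  # same shape as the constraint above


def evaluate(expr):
    """Hypothetical helper: evaluate digits joined by + and *, with * binding tighter."""
    if EXPR_PATTERN.fullmatch(expr) is None:
        raise ValueError(f"not a valid expression: {expr!r}")
    # Multiply the factors inside each '+'-separated term, then add the terms.
    return sum(prod(int(factor) for factor in term.split("*")) for term in expr.split("+"))


print(evaluate("12+7*3"))  # 12 + 21 = 33
```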
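With Guidance, you can create your own Guidance functions that interact with language models. These are marked with the @guidance decorator. Suppose we want to answer a large number of multiple-choice questions. We could do something like the following: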
import guidance
from guidance.models import Model
ASCII_OFFSET = ord("a")
@guidance
def zero_shot_multiple_choice(
    language_model: Model,
    question: str,
    choices: list[str],
):
    with user():
        language_model += question + "\n"
        for i, choice in enumerate(choices):
            language_model += f"{chr(i+ASCII_OFFSET)} : {choice}\n"
    with assistant():
        language_model += select(
            [chr(i + ASCII_OFFSET) for i in range(len(choices))], name="string_choice"
        )
    return language_model
Now, define some questions:
questions = [
    {
        "question": "Which state has the northernmost capital?",
        "choices": [
            "New South Wales",
            "Northern Territory",
            "Queensland",
            "South Australia",
            "Tasmania",
            "Victoria",
            "Western Australia",
        ],
        "answer": 1,
    },
    {
        "question": "Which of the following is venomous?",
        "choices": [
            "Kangaroo",
            "Koala Bear",
            "Platypus",
        ],
        "answer": 2,
    },
]
We can use the decorated function just like gen() or select(). The language_model argument is filled in automatically:
lm = phi_lm
with system():
    lm += "You are a student taking a multiple choice test."
for mcq in questions:
    lm_temp = lm + zero_shot_multiple_choice(question=mcq["question"], choices=mcq["choices"])
    converted_answer = ord(lm_temp["string_choice"]) - ASCII_OFFSET
    print(lm_temp)
    print(f"LM Answer: {converted_answer}, Correct Answer: {mcq['answer']}")
<|system|>You are a student taking a multiple choice test.<|end|><|user|>Which state has the northernmost capital?
a : New South Wales
b : Northern Territory
c : Queensland
d : South Australia
e : Tasmania
f : Victoria
g : Western Australia
<|end|><|assistant|>b
LM Answer: 1, Correct Answer: 1
<|system|>You are a student taking a multiple choice test.<|end|><|user|>Which of the following is venomous?
a : Kangaroo
b : Koala Bear
c : Platypus
<|end|><|assistant|>c
LM Answer: 2, Correct Answer: 2
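For a larger question set, the per-question printout can be replaced by a score. The following sketch converts selected letters back to indices, as the loop above does, and computes accuracy. `letter_to_index` and `accuracy` are hypothetical helpers and use no Guidance calls.

```python
ASCII_OFFSET = ord("a")


def letter_to_index(letter):
    """Map 'a' -> 0, 'b' -> 1, and so on (same conversion as in the loop above)."""
    return ord(letter) - ASCII_OFFSET


def accuracy(predicted_letters, correct_indices):
    """Hypothetical helper: fraction of questions whose selected letter is correct."""
    if len(predicted_letters) != len(correct_indices):
        raise ValueError("need exactly one prediction per question")
    if not correct_indices:
        return 0.0
    hits = sum(
        letter_to_index(letter) == answer
        for letter, answer in zip(predicted_letters, correct_indices)
    )
    return hits / len(correct_indices)


# The two answers shown in the transcript above, scored against the answer key.
print(accuracy(["b", "c"], [1, 2]))
```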
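Guidance functions can be composed to build a full context-free grammar. For example, we can create Guidance functions that build a simple HTML web page (note that this is not a full implementation of HTML). We start with a simple function that generates text containing no HTML tags. The function is marked as stateless to indicate that we intend to use it for composing a grammar:
@guidance(stateless=True)
def _gen_text(lm: Model):
    return lm + gen(regex="[^<>]+")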
We can then use this function to generate text inside an arbitrary HTML tag:
@guidance(stateless=True)
def _gen_text_in_tag(lm: Model, tag: str):
    lm += f"<{tag}>"
    lm += _gen_text()
    lm += f"</{tag}>"
    return lm
Now, create the page header. As part of this, we need to generate a page title:
@guidance(stateless=True)
def _gen_header(lm: Model):
    lm += "<head>\n"
    lm += _gen_text_in_tag("title") + "\n"
    lm += "</head>\n"
    return lm
The body of the HTML page will be filled with headings and paragraphs. We can define a function for each:
from guidance.library import one_or_more

@guidance(stateless=True)
def _gen_heading(lm: Model):
    lm += select(
        options=[_gen_text_in_tag("h1"), _gen_text_in_tag("h2"), _gen_text_in_tag("h3")]
    )
    lm += "\n"
    return lm

@guidance(stateless=True)
def _gen_para(lm: Model):
    lm += "<p>"
    lm += one_or_more(
        select(
            options=[
                _gen_text(),
                _gen_text_in_tag("em"),
                _gen_text_in_tag("strong"),
                "<br />",
            ],
        )
    )
    lm += "</p>\n"
    return lm
Now, define the function for the HTML body itself:
@guidance(stateless=True)
def _gen_body(lm: Model):
    lm += "<body>\n"
    lm += one_or_more(select(options=[_gen_heading(), one_or_more(_gen_para())]))
    lm += "</body>\n"
    return lm
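Next comes the function that generates the complete HTML page. We add the opening HTML tag, then generate the header, then the body, and finally append the closing HTML tag:
@guidance(stateless=True)
def _gen_html(lm: Model):
    lm += "<html>\n"
    lm += _gen_header()
    lm += _gen_body()
    lm += "</html>\n"
    return lm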
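The composed grammar only admits pages whose tags come from a small set and are properly nested. As a rough cross-check of that property outside Guidance, the sketch below walks a page with the standard library's `html.parser`. The `SubsetChecker` class, the `is_well_formed` function, and the sample page are all hypothetical; the sample is hand-written to follow the grammar's shape and is not model output.

```python
from html.parser import HTMLParser

# Tags the functions above can emit; <br /> is the only one without a closing tag.
ALLOWED_TAGS = {"html", "head", "title", "body", "h1", "h2", "h3", "p", "em", "strong", "br"}
VOID_TAGS = {"br"}


class SubsetChecker(HTMLParser):
    """Hypothetical checker: tracks open tags and flags anything outside the subset."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.ok = True

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS or attrs:
            self.ok = False
        elif tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br />: nothing to push or pop.
        if tag not in VOID_TAGS or attrs:
            self.ok = False

    def handle_endtag(self, tag):
        # A closing tag must match the most recently opened tag.
        if not self.stack or self.stack.pop() != tag:
            self.ok = False


def is_well_formed(page):
    checker = SubsetChecker()
    checker.feed(page)
    checker.close()
    return checker.ok and not checker.stack


sample_page = (
    "<html>\n<head>\n<title>My Life</title>\n</head>\n"
    "<body>\n<h1>Early years</h1>\n"
    "<p>I was <em>born</em>.<br />Then I grew up.</p>\n"
    "</body>\n</html>\n"
)
print(is_well_formed(sample_page))
print(is_well_formed("<html><body><p>unclosed</body></html>"))
```

The check is looser than the grammar itself (it ignores ordering such as head before body), so it serves as a sanity test and not as an equivalent definition.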
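Finally, we provide a user-friendly wrapper that lets us set the sampling temperature of the generation and capture the generated page under a name on the Model object:
from guidance.library import capture, with_temperature

@guidance(stateless=True)
def make_html(
    lm,
    name: str | None = None,
    *,
    temperature: float = 0.0,
):
    return lm + capture(
        with_temperature(_gen_html(), temperature=temperature),
        name=name,
    )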
Now, use it to generate a simple web page:
lm = phi_lm
with system():
    lm += "You are an expert in HTML"
with user():
    lm += "Create a simple and short web page about your life story."
with assistant():
    lm += make_html(name="html_text", temperature=0.7)
When this runs in a Jupyter Notebook with the widget active, the generated page is displayed with its tokens highlighted.

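Note the varying highlighting in the generated text. It shows another of Guidance's capabilities: token fast-forwarding. The constraints imposed by a grammar often mean that some tokens are known in advance. Guidance does not need the model to generate these tokens; it can insert them into the generation directly. This saves forward passes through the model and hence reduces GPU usage. For example, in the HTML generation above, Guidance always knows the most recently opened tag. If that tag is <h1>, then as soon as the model generates </, Guidance can fill in h1> without requiring a forward pass from the model.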
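The idea can be sketched as a toy model: if every continuation the grammar allows starts with the same text, that text can be emitted without consulting the model. The code below works on characters and on hand-written continuation lists, whereas Guidance works on tokens and derives the allowed continuations from the grammar; `forced_prefix` and `closing_tag_continuations` are hypothetical helpers.

```python
def forced_prefix(continuations):
    """Hypothetical helper: longest prefix shared by every allowed continuation."""
    if not continuations:
        return ""
    shortest = min(continuations, key=len)
    for i, ch in enumerate(shortest):
        if any(candidate[i] != ch for candidate in continuations):
            return shortest[:i]
    return shortest


def closing_tag_continuations(open_tags):
    """After "</" has been generated, only the most recently opened tag may close."""
    return [f"{open_tags[-1]}>"]


# Single allowed continuation: the whole closing tag is forced.
print(forced_prefix(closing_tag_continuations(["html", "body", "h1"])))  # h1>
# Several heading tags are allowed: only "<h" is forced, the digit needs the model.
print(forced_prefix(["<h1>", "<h2>", "<h3>"]))  # <h
```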
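A JSON schema is in effect a context-free grammar, so Guidance can use it to constrain an LLM. This is a common enough case that Guidance provides special support for it. A quick example based on a Pydantic model: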
import json
from pydantic import BaseModel, Field
from guidance import json as gen_json
class BloodPressure(BaseModel):
    systolic: int = Field(gt=300, le=400)
    diastolic: int = Field(gt=0, le=20)
    location: str = Field(max_length=50)
    model_config = dict(extra="forbid")

lm = phi_lm
with system():
    lm += "You are a doctor taking a patient's blood pressure taken from their arm"
with user():
    lm += "Report the blood pressure"
with assistant():
    lm += gen_json(name="bp", schema=BloodPressure)
print(f"{lm['bp']=}")
# Use Python's JSON library
loaded_json = json.loads(lm["bp"])
print(json.dumps(loaded_json, indent=4))
# Use Pydantic
result = BloodPressure.model_validate_json(lm["bp"])
print(result.model_dump_json(indent=8))
lm['bp']='{"systolic": 301, "diastolic": 15, "location": "arm"}'
{
    "systolic": 301,
    "diastolic": 15,
    "location": "arm"
}
{
        "systolic": 301,
        "diastolic": 15,
        "location": "arm"
}
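Note that the generated blood pressure is not one the model will have seen for a human: the schema's bounds, not the model's prior, determine the values. When generating JSON, a large number of tokens can often be fast-forwarded because of the structural constraints the schema imposes.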
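To get a feel for how much of such an output is dictated by the schema, the following sketch counts characters in the result shown above. It is a rough, character-level estimate under stated assumptions: keys are taken to appear in declaration order with exactly this whitespace, and the model is still responsible for deciding where each value ends. Guidance itself works on tokens, so the real savings will differ.

```python
import json

# The output shown above, copied verbatim.
output = '{"systolic": 301, "diastolic": 15, "location": "arm"}'
record = json.loads(output)

# Characters the model had to choose: the digits of each number and the
# contents of the string (braces, keys, colons and quotes come from the schema).
free_chars = sum(len(str(value)) for value in record.values())
forced_chars = len(output) - free_chars

print(f"{forced_chars} of {len(output)} characters follow from the schema's structure")
```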