
azure-storage-blob-py: Azure Blob Storage SDK for Python

 
sql · 2026-02-14 23:06:32

name: azure-storage-blob-py
description: |
  Azure Blob Storage SDK for Python. Use it to upload, download, and list blobs, and to manage containers and the blob lifecycle.
  Triggers: "blob storage", "BlobServiceClient", "ContainerClient", "BlobClient", "upload blob", "download blob".
package: azure-storage-blob


Azure Blob Storage Python SDK

Client library for Azure Blob Storage, an object storage service designed for unstructured data.

Installation

pip install azure-storage-blob azure-identity

Environment variables

AZURE_STORAGE_ACCOUNT_NAME=<your-storage-account-name>
# Or use the full URL
AZURE_STORAGE_ACCOUNT_URL=https://<account-name>.blob.core.windows.net
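
The two variables above are alternatives, so code that reads them needs a precedence rule. The sketch below is one possible approach, not part of the SDK: `resolve_account_url` is a hypothetical helper, and it assumes the full URL wins when both variables are set and that the account lives in the public cloud.

```python
import os


def resolve_account_url(env=None):
    """Return the Blob endpoint URL derived from the environment variables."""
    env = os.environ if env is None else env

    # Assumption: an explicit URL takes precedence over the account name.
    url = env.get("AZURE_STORAGE_ACCOUNT_URL")
    if url:
        return url.rstrip("/")

    name = env.get("AZURE_STORAGE_ACCOUNT_NAME")
    if name:
        # Public-cloud endpoint format; sovereign clouds use other suffixes.
        return f"https://{name}.blob.core.windows.net"

    raise RuntimeError(
        "Set AZURE_STORAGE_ACCOUNT_URL or AZURE_STORAGE_ACCOUNT_NAME"
    )
```

The returned string can be passed as the `account_url` argument of `BlobServiceClient`.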

Authentication

from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

credential = DefaultAzureCredential()
account_url = "https://<account-name>.blob.core.windows.net"

blob_service_client = BlobServiceClient(account_url, credential=credential)

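Client hierarchy

| Client | Purpose | How to get it |
|---|---|---|
| BlobServiceClient | Account-level operations | Instantiate directly |
| ContainerClient | Container operations | blob_service_client.get_container_client() |
| BlobClient | Single-blob operations | container_client.get_blob_client() |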

Core workflows

Create a container

container_client = blob_service_client.get_container_client("mycontainer")
container_client.create_container()
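
`create_container()` fails if the container already exists, so scripts that run more than once usually need a guard. A minimal sketch, assuming a check-then-create flow is acceptable; `ensure_container` is a hypothetical helper name, while `exists()` and `create_container()` are the `ContainerClient` methods.

```python
def ensure_container(container_client):
    """Create the container only if it does not exist yet.

    Returns True when the container was created, False when it was already there.
    """
    # Note: another process could create the container between these two calls.
    if container_client.exists():
        return False
    container_client.create_container()
    return True
```

Because of that race window, catching `azure.core.exceptions.ResourceExistsError` around `create_container()` is the more robust alternative when several writers may start at the same time.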

Upload a blob

import io

# Upload from a file path
blob_client = blob_service_client.get_blob_client(
    container="mycontainer",
    blob="sample.txt"
)

with open("./local-file.txt", "rb") as data:
    blob_client.upload_blob(data, overwrite=True)

# Upload from bytes or a string
blob_client.upload_blob(b"Hello, World!", overwrite=True)

# Upload from a stream
stream = io.BytesIO(b"Stream content")
blob_client.upload_blob(stream, overwrite=True)
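
When many local files are uploaded, each one needs a blob name and, ideally, a content type. The sketch below only does that planning step and makes no SDK calls; `plan_directory_upload` is a hypothetical helper, and the fallback type `application/octet-stream` is an assumption.

```python
import mimetypes
from pathlib import Path


def plan_directory_upload(root, prefix=""):
    """Yield (local_path, blob_name, content_type) for every file under root."""
    root = Path(root)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Blob names use "/" as the separator regardless of the local OS.
        blob_name = prefix + path.relative_to(root).as_posix()
        content_type, _ = mimetypes.guess_type(path.name)
        yield path, blob_name, content_type or "application/octet-stream"
```

Each tuple can then be fed to `upload_blob`, opening `local_path` in binary mode and passing `content_settings=ContentSettings(content_type=content_type)`.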

Download a blob

import io

blob_client = blob_service_client.get_blob_client(
    container="mycontainer",
    blob="sample.txt"
)

# Download to a file
with open("./downloaded.txt", "wb") as file:
    download_stream = blob_client.download_blob()
    file.write(download_stream.readall())

# Download into memory
download_stream = blob_client.download_blob()
content = download_stream.readall()  # bytes

# Read into an existing stream
stream = io.BytesIO()
num_bytes = blob_client.download_blob().readinto(stream)
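
`readall()` holds the whole blob in memory, which can be a problem for large files. A minimal sketch of a chunked alternative, using the `chunks()` iterator of the object returned by `download_blob()`; `download_to_file` is a hypothetical helper name.

```python
def download_to_file(blob_client, path):
    """Stream a blob to disk chunk by chunk and return the number of bytes written."""
    written = 0
    downloader = blob_client.download_blob()
    with open(path, "wb") as handle:
        # Only one chunk is held in memory at a time.
        for chunk in downloader.chunks():
            handle.write(chunk)
            written += len(chunk)
    return written
```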

List blobs

from azure.storage.blob import BlobPrefix

container_client = blob_service_client.get_container_client("mycontainer")

# List all blobs
for blob in container_client.list_blobs():
    print(f"{blob.name} - {blob.size} bytes")

# List by prefix (folder-like)
for blob in container_client.list_blobs(name_starts_with="logs/"):
    print(blob.name)

# Walk the blob hierarchy (virtual directories)
for item in container_client.walk_blobs(delimiter="/"):
    if isinstance(item, BlobPrefix):
        print(f"Directory: {item.name}")
    else:
        print(f"Blob: {item.name}")
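
Since directories are only a naming convention, per-folder statistics have to be derived from blob names. The sketch below sums sizes per first path segment; `size_by_top_level_prefix` is a hypothetical helper that accepts any iterable of objects with `name` and `size` attributes, such as the result of `list_blobs()`.

```python
from collections import defaultdict


def size_by_top_level_prefix(blobs, delimiter="/"):
    """Sum blob sizes per first path segment; blobs at the root go under ""."""
    totals = defaultdict(int)
    for blob in blobs:
        head, sep, _ = blob.name.partition(delimiter)
        # "logs/app.log" -> "logs/", "readme.txt" -> ""
        key = head + sep if sep else ""
        totals[key] += blob.size
    return dict(totals)
```

A call such as `size_by_top_level_prefix(container_client.list_blobs())` would return a dictionary mapping each top-level virtual directory to its total size in bytes.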

Delete a blob

blob_client.delete_blob()

# Alternatively, delete the blob together with its snapshots
blob_client.delete_blob(delete_snapshots="include")

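Performance tuning

from azure.storage.blob import BlobClient

# Configure block sizes for large uploads and downloads
blob_client = BlobClient(
    account_url=account_url,
    container_name="mycontainer",
    blob_name="large-file.zip",
    credential=credential,
    max_block_size=4 * 1024 * 1024,  # 4 MiB blocks
    max_single_put_size=64 * 1024 * 1024  # blobs up to 64 MiB are sent in a single request
)

# Parallel upload
with open("./large-file.zip", "rb") as data:
    blob_client.upload_blob(data, max_concurrency=4)

# Parallel download
download_stream = blob_client.download_blob(max_concurrency=4)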
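
The two size settings interact: a blob no larger than `max_single_put_size` goes up in one request, and anything bigger is split into blocks of `max_block_size`. The arithmetic can be sketched as below. `plan_upload` is a hypothetical helper, it models only the size rule (the SDK may behave differently for inputs such as non-seekable streams), and it relies on the service limit of 50,000 blocks per block blob.

```python
import math

MAX_BLOCKS_PER_BLOB = 50_000  # service limit for block blobs


def plan_upload(size_bytes,
                max_block_size=4 * 1024 * 1024,
                max_single_put_size=64 * 1024 * 1024):
    """Return (strategy, staged_block_count) for a blob of the given size."""
    if size_bytes <= max_single_put_size:
        return "single_put", 0

    blocks = math.ceil(size_bytes / max_block_size)
    if blocks > MAX_BLOCKS_PER_BLOB:
        raise ValueError(
            f"{blocks} blocks exceed the {MAX_BLOCKS_PER_BLOB}-block limit; "
            "increase max_block_size"
        )
    return "staged_blocks", blocks
```

With the values used above, a 100 MiB file would be staged as 25 blocks of 4 MiB, which is where `max_concurrency` starts to matter.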

SAS tokens

from datetime import datetime, timedelta, timezone
from azure.storage.blob import generate_blob_sas, BlobSasPermissions

sas_token = generate_blob_sas(
    account_name="<account-name>",
    container_name="mycontainer",
    blob_name="sample.txt",
    account_key="<account-key>",  # or pass user_delegation_key=... instead
    permission=BlobSasPermissions(read=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=1)
)

# Use the SAS token
blob_url = f"https://<account-name>.blob.core.windows.net/mycontainer/sample.txt?{sas_token}"
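
Plain string formatting works for simple names, but blob names with spaces or other reserved characters need percent-encoding before they go into a URL. A minimal sketch under that assumption; `build_blob_sas_url` is a hypothetical helper and it targets the public-cloud endpoint format only.

```python
from urllib.parse import quote


def build_blob_sas_url(account_name, container_name, blob_name, sas_token):
    """Join the blob endpoint, the percent-encoded blob path, and the SAS query string."""
    # Keep "/" unescaped so virtual directories stay readable in the URL.
    encoded_name = quote(blob_name, safe="/")
    base = f"https://{account_name}.blob.core.windows.net"
    # Tolerate tokens that were stored with a leading "?".
    return f"{base}/{container_name}/{encoded_name}?{sas_token.lstrip('?')}"
```

The resulting URL can be handed to a browser or HTTP client, or turned back into a client with `BlobClient.from_blob_url(url)`.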

Blob properties and metadata

# Get properties
properties = blob_client.get_blob_properties()
print(f"Size: {properties.size}")
print(f"Content type: {properties.content_settings.content_type}")
print(f"Last modified: {properties.last_modified}")

# Set metadata
blob_client.set_blob_metadata(metadata={"category": "logs", "year": "2024"})

# Set the content type
from azure.storage.blob import ContentSettings
blob_client.set_http_headers(
    content_settings=ContentSettings(content_type="application/json")
)
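
`set_blob_metadata` replaces all metadata on the blob, so changing a single key means reading the current values first. A minimal sketch of that read-modify-write step; `merge_blob_metadata` is a hypothetical helper, and the sequence is not atomic, so concurrent writers could still overwrite each other.

```python
def merge_blob_metadata(blob_client, updates):
    """Add or change metadata keys without dropping the existing ones."""
    # metadata can be empty or None on a blob that never had any set.
    current = blob_client.get_blob_properties().metadata or {}
    merged = {**current, **updates}
    blob_client.set_blob_metadata(metadata=merged)
    return merged
```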

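Async client

# The async clients need an async transport, e.g. pip install aiohttp
from azure.identity.aio import DefaultAzureCredential
from azure.storage.blob.aio import BlobServiceClient

async def upload_async():
    # Async credentials hold network resources, so close them with a context manager too
    async with DefaultAzureCredential() as credential:
        async with BlobServiceClient(account_url, credential=credential) as client:
            blob_client = client.get_blob_client("mycontainer", "sample.txt")

            with open("./file.txt", "rb") as data:
                await blob_client.upload_blob(data, overwrite=True)

# Async download
async def download_async():
    # Each function creates its own credential; the original reused an undefined name here
    async with DefaultAzureCredential() as credential:
        async with BlobServiceClient(account_url, credential=credential) as client:
            blob_client = client.get_blob_client("mycontainer", "sample.txt")

            stream = await blob_client.download_blob()
            data = await stream.readall()
            return data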
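
The async client pays off mainly when several transfers run at once. A minimal sketch of bounded concurrent uploads, assuming an async `ContainerClient` (or any object with a compatible `upload_blob(name, data, overwrite=...)` coroutine); `upload_many` and its `max_in_flight` parameter are hypothetical names.

```python
import asyncio


async def upload_many(container_client, items, max_in_flight=4):
    """Upload (blob_name, data) pairs concurrently, at most max_in_flight at a time."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def upload_one(name, data):
        # The semaphore caps how many uploads are in progress simultaneously.
        async with semaphore:
            await container_client.upload_blob(name, data, overwrite=True)
            return name

    # gather() preserves input order in its result list.
    return await asyncio.gather(*(upload_one(n, d) for n, d in items))
```

Inside an `async with BlobServiceClient(...) as client` block, the container client would come from `client.get_container_client("mycontainer")`.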

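Best practices

  1. Use DefaultAzureCredential instead of connection strings
  2. Use context managers for async clients
  3. Set overwrite=True explicitly when re-uploading; otherwise upload_blob raises ResourceExistsError if the blob already exists
  4. Use max_concurrency to speed up large file transfers
  5. Prefer readinto() over readall() to reduce memory use
  6. Use walk_blobs() for hierarchical listings
  7. Set an appropriate content type on blobs served to the web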