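Chat-Completions API Usage Guide

This document explains how to use the Chat-Completions API for AI conversation and text-generation tasks. The API is compatible with the OpenAI format and supports multiple models, including GPT, Claude, and Gemini.

Official documentation: https://platform.openai.com/docs/api-reference/chat

📝 Introduction

The Chat-Completions API is a flexible interface that gives simple access to current AI models. It supports:

💬 Text conversation: natural-language Q&A and dialogue
🖼️ Image analysis: multimodal content understanding
🔄 Streaming responses: output delivered incrementally in real time
🛠️ Function calling: tool integration and automation
📊 Structured output: JSON-formatted output
🎯 Multiple models: mainstream models from the major vendors

🔧 Interface Definition

Endpoint: /v1/chat/completions
Method: POST
Authentication: Bearer Token
Format: application/json

🔐 Authentication

Every request must include an Authorization field in the HTTP headers: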

http

Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

Replace YOUR_API_KEY with a valid API key generated on the platform.
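
As a minimal sketch of the header setup above, the helper below builds both headers from a key passed in or read from the environment. The function name `build_headers` and the environment variable name `API_KEY` are assumptions made for illustration; they are not defined by the API.

```python
import os

def build_headers(api_key=None):
    """Return the two headers that every request needs."""
    # Assumption: when no key is passed, it lives in an environment variable named API_KEY
    key = api_key or os.environ.get("API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, which matches the key-handling advice later in this guide.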

💡 Request Examples

Basic text conversation ✅

The simplest text Q&A scenario:

json

{
    "model": "gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": "Hello, please give an overview of the history of artificial intelligence"
        }
    ],
    "max_tokens": 1000,
    "temperature": 0.7
}

Image analysis conversation ✅

A multimodal conversation with image input:

json

{
    "model": "gemini-2.5-flash-all",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Please describe the contents of this image"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD..."
                    }
                }
            ]
        }
    ],
    "max_tokens": 500
}
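
The `url` field in the request above carries the image inline as a data URL. The sketch below shows one way to build such a URL from a local file, guessing the MIME type from the file extension with the standard library. The helper name `to_data_url` is hypothetical, and whether a given model accepts formats other than JPEG is not stated in this guide.

```python
import base64
import mimetypes

def to_data_url(image_path):
    """Encode a local image file as a data URL for the image_url field."""
    # Guess the MIME type from the extension, e.g. "image/png" for .png
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {image_path}")
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```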

Streaming response ✅

Enable real-time streaming output:

json

{
    "model": "gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": "Please write a poem about spring"
        }
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    }
}

Function calling ✅

Integrate external tools and functions:

json

{
    "model": "gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": "What is the weather like in Beijing today?"
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get weather information for the specified city",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {
                            "type": "string",
                            "description": "City name"
                        }
                    },
                    "required": ["city"]
                }
            }
        }
    ],
    "tool_choice": "auto"
}

JSON mode output ✅

Force the model to output structured JSON:

json

{
    "model": "gpt-4o",
    "messages": [
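        {
            "role": "system",
            "content": "You are a data extraction assistant. Extract the information from the user's input as JSON."
        },
        {
            "role": "user",
            "content": "Zhang San, male, 25 years old, software engineer, lives in Chaoyang District, Beijing"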
        }
    ],
    "response_format": {
        "type": "json_object"
    }
}
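
With JSON mode, the reply still arrives as a string in `message.content`, so the caller has to parse it. The following is a minimal sketch of that step, assuming the standard response shape shown in the response format section; the function name `parse_json_reply` is hypothetical.

```python
import json

def parse_json_reply(response):
    """Extract and parse the JSON object from a chat completion response."""
    choice = response["choices"][0]
    # A reply cut off by the token limit is likely to be incomplete JSON
    if choice.get("finish_reason") == "length":
        raise ValueError("Reply was truncated; the JSON may be incomplete")
    try:
        return json.loads(choice["message"]["content"])
    except (json.JSONDecodeError, TypeError) as exc:
        # TypeError covers a null content field
        raise ValueError("Reply is not valid JSON") from exc
```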

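📮 Request Parameters

Core parameters

Parameter	Type	Required	Default	Description
`model`	string	Yes	-	Name of the model to use
`messages`	array	Yes	-	List of conversation messages
`max_tokens`	integer	No	-	Maximum number of tokens to generate
`temperature`	number	No	1	Controls output randomness (0-2)
`top_p`	number	No	1	Nucleus sampling parameter (0-1)
`n`	integer	No	1	Number of responses to generate
`stream`	boolean	No	false	Whether to enable streaming output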
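
The table above can be turned into a small request builder. The sketch below checks the two documented ranges and leaves out any optional parameter that was not set, so the defaults in the table apply. The function name `build_payload` is an assumption for illustration, and the checks cover only what the table states.

```python
def build_payload(model, messages, max_tokens=None, temperature=None,
                  top_p=None, n=None, stream=None):
    """Build a request body, checking the ranges listed in the table."""
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    payload = {"model": model, "messages": messages}
    optional = {"max_tokens": max_tokens, "temperature": temperature,
                "top_p": top_p, "n": n, "stream": stream}
    # Leave out anything that was not set so the server-side defaults apply
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```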

The messages parameter

Structure of each object in the messages array:

json

{
    "role": "user|assistant|system|tool",
    "content": "message content",
    "name": "optional sender name",
    "tool_calls": "tool call information",
    "tool_call_id": "tool call ID"
}

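Roles:

system: system prompt that defines the AI's behavior and role
user: message entered by the user
assistant: reply from the AI assistant
tool: result returned from a tool call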
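
The `tool` role and the `tool_call_id` field come into play after the model asks for a function call. The sketch below shows the follow-up step under two assumptions: each entry in `tool_calls` has the OpenAI-format shape (`id`, plus `function.name` and `function.arguments` as a JSON-encoded string), and the caller keeps a dictionary of local functions. The names `run_tool_calls` and the stand-in `get_weather` implementation are hypothetical.

```python
import json

def run_tool_calls(assistant_message, tools):
    """Run each requested tool and return the messages to append to the history."""
    # The assistant message that requested the tools goes back into the history first
    follow_up = [assistant_message]
    for call in assistant_message.get("tool_calls") or []:
        func = tools[call["function"]["name"]]
        # Arguments arrive as a JSON-encoded string, not as an object
        args = json.loads(call["function"]["arguments"])
        result = func(**args)
        follow_up.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result, ensure_ascii=False),
        })
    return follow_up

# Hypothetical local tool standing in for a real weather lookup
def get_weather(city):
    return {"city": city, "forecast": "sunny"}
```

The returned messages are appended to the original `messages` list and sent in a second request, so the model can write its final answer from the tool results.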

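Advanced parameters

Parameter	Type	Description
`stop`	string/array	Sequences at which generation stops
`presence_penalty`	number	Presence penalty (-2.0 to 2.0)
`frequency_penalty`	number	Frequency penalty (-2.0 to 2.0)
`logit_bias`	object	Per-token generation bias
`user`	string	End-user identifier
`seed`	integer	Seed for deterministic sampling
`response_format`	object	Output format control
`tools`	array	List of available tools
`tool_choice`	string/object	Tool selection strategy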

📥 Response Format

Standard response

json

{
    "id": "chatcmpl-8XYZ123",
    "object": "chat.completion",
    "created": 1699000000,
    "model": "gpt-4o-2024-05-13",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "This is the AI's reply"
            },
            "logprobs": null,
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 15,
        "completion_tokens": 25,
        "total_tokens": 40,
        "completion_tokens_details": {
            "reasoning_tokens": 0
        }
    },
    "system_fingerprint": "fp_abc123"
}

Streaming response

json

data: {"id":"chatcmpl-8XYZ123","object":"chat.completion.chunk","created":1699000000,"model":"gpt-4o","choices":[{"index":0,"delta":{"role":"assistant","content":"你好"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-8XYZ123","object":"chat.completion.chunk","created":1699000000,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":"！"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-8XYZ123","object":"chat.completion.chunk","created":1699000000,"model":"gpt-4o","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}

data: [DONE]
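
The event lines above can be reassembled without any network code. The sketch below takes already-decoded text lines, joins the `delta.content` pieces, and keeps the `usage` block if one arrives. It assumes a single choice per request (`n` = 1) and that, with `include_usage` enabled, usage is delivered in a chunk whose `choices` list is empty, as in the OpenAI format. The function name `collect_stream` is hypothetical.

```python
import json

def collect_stream(lines):
    """Return (full_text, usage) from an iterable of decoded SSE lines."""
    pieces, usage = [], None
    for line in lines:
        # Blank separator lines and anything that is not a data event are skipped
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        # Keep the usage block if this chunk carries one
        usage = chunk.get("usage") or usage
        for choice in chunk.get("choices") or []:
            content = choice.get("delta", {}).get("content")
            if content:
                pieces.append(content)
    return "".join(pieces), usage
```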

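finish_reason values

Value	Description
`stop`	The model stopped naturally or hit a stop sequence
`length`	The maximum token limit was reached
`tool_calls`	The model called a tool
`content_filter`	Content was filtered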
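
Callers usually branch on `finish_reason` before using a reply. The sketch below maps the four values in the table to a next step. The action labels and the function name `next_action` are made up for illustration; what each action should do is left to the application.

```python
def next_action(choice):
    """Map a choice's finish_reason to what the caller should do next."""
    actions = {
        "stop": "use_reply",                 # complete answer, safe to use
        "length": "retry_with_more_tokens",  # reply was cut off
        "tool_calls": "run_tools",           # execute tools, then send results back
        "content_filter": "handle_filtered", # reply was withheld or altered
    }
    reason = choice.get("finish_reason")
    if reason not in actions:
        raise ValueError(f"Unexpected finish_reason: {reason!r}")
    return actions[reason]
```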

💻 Code Examples

Python examples

Basic conversation

python

import requests
import json

def chat_completion(api_key, base_url, messages, model="gpt-4o"):
    """Send a basic chat completion request and return the parsed JSON."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    
    data = {
        "model": model,
        "messages": messages,
        "max_tokens": 1000,
        "temperature": 0.7
    }
    
    response = requests.post(
        f"{base_url}/v1/chat/completions", 
        headers=headers, 
        data=json.dumps(data)
    )
    
    if response.status_code == 200:
        return response.json()
    else:
        raise Exception(f"API Error: {response.status_code} - {response.text}")

# Usage example
api_key = "YOUR_API_KEY"
base_url = "https://api.rainboxs.com"

messages = [
    {"role": "user", "content": "Please introduce the basic concepts of machine learning"}
]

result = chat_completion(api_key, base_url, messages)
print(result['choices'][0]['message']['content'])

Image analysis

python

import base64
import json

import requests

def encode_image(image_path):
    """Encode an image file as a base64 string."""
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

def analyze_image(api_key, base_url, image_path, question):
    """Ask a question about a local image."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    
    # Encode the image (the data URL below assumes a JPEG file)
    base64_image = encode_image(image_path)
    
    data = {
        "model": "gemini-2.5-flash-all",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:image/jpeg;base64,{base64_image}"
                        }
                    }
                ]
            }
        ],
        "max_tokens": 500
    }
    
    response = requests.post(
        f"{base_url}/v1/chat/completions",
        headers=headers,
        data=json.dumps(data)
    )
    
    return response.json()

# Usage example
result = analyze_image(
    api_key, 
    base_url, 
    "image.jpg", 
    "Please describe the contents of this image in detail"
)
print(result['choices'][0]['message']['content'])

Streaming output

python

import requests
import json

def stream_chat(api_key, base_url, messages, model="gpt-4o"):
    """Stream a chat completion and print the content as it arrives."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    
    data = {
        "model": model,
        "messages": messages,
        "stream": True,
        "stream_options": {"include_usage": True}
    }
    
    response = requests.post(
        f"{base_url}/v1/chat/completions",
        headers=headers,
        data=json.dumps(data),
        stream=True
    )
    
    for line in response.iter_lines():
        if line:
            line = line.decode('utf-8')
            if line.startswith('data: '):
                content = line[6:]  # Strip the 'data: ' prefix
                if content.strip() == '[DONE]':
                    break
                try:
                    chunk = json.loads(content)
                    # The usage-only chunk sent with include_usage has an empty choices list
                    if chunk.get('choices'):
                        delta = chunk['choices'][0]['delta']
                        # content can be missing or null, so skip those deltas
                        if delta.get('content'):
                            print(delta['content'], end='', flush=True)
                except json.JSONDecodeError:
                    continue

# Usage example
messages = [
    {"role": "user", "content": "Please write a poem about artificial intelligence"}
]

stream_chat(api_key, base_url, messages)

Node.js examples

Basic conversation

javascript

const axios = require('axios');

async function chatCompletion(apiKey, baseUrl, messages, model = 'gpt-4o') {
    const headers = {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
    };
    
    const data = {
        model: model,
        messages: messages,
        max_tokens: 1000,
        temperature: 0.7
    };
    
    try {
        const response = await axios.post(
            `${baseUrl}/v1/chat/completions`,
            data,
            { headers }
        );
        return response.data;
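    } catch (error) {
        // error.response.data is usually an object, so serialize it for a readable message
        const detail = JSON.stringify(error.response?.data);
        throw new Error(`API Error: ${error.response?.status} - ${detail}`);
    }
}

// Usage example
const apiKey = 'YOUR_API_KEY';
const baseUrl = 'https://api.rainboxs.com';

const messages = [
    { role: 'user', content: 'Please explain the principles of deep learning' }
];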

chatCompletion(apiKey, baseUrl, messages)
    .then(result => {
        console.log(result.choices[0].message.content);
    })
    .catch(error => {
        console.error('Error:', error.message);
    });

Function calling

javascript

async function functionCalling(apiKey, baseUrl, query) {
    const headers = {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
    };
    
    const data = {
        model: 'gpt-4o',
        messages: [
            { role: 'user', content: query }
        ],
        tools: [
            {
                type: 'function',
                function: {
                    name: 'get_weather',
                    description: 'Get weather information for the specified city',
                    parameters: {
                        type: 'object',
                        properties: {
                            city: {
                                type: 'string',
                                description: 'City name'
                            }
                        },
                        required: ['city']
                    }
                }
            }
        ],
        tool_choice: 'auto'
    };
    
    const response = await axios.post(
        `${baseUrl}/v1/chat/completions`,
        data,
        { headers }
    );
    
    return response.data;
}

// Usage example
functionCalling(apiKey, baseUrl, 'What is the weather like in Shanghai today?')
    .then(result => {
        const message = result.choices[0].message;
        if (message.tool_calls) {
            console.log('Tool calls:', message.tool_calls);
        } else {
            console.log('Reply:', message.content);
        }
    })
    .catch(error => {
        console.error('Error:', error.message);
    });

cURL examples

Basic request

bash

curl -X POST "https://api.rainboxs.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Hello, please introduce natural language processing"
      }
    ],
    "max_tokens": 1000,
    "temperature": 0.7
  }'

Streaming request

bash

# -N turns off curl's output buffering so chunks print as they arrive
curl -N -X POST "https://api.rainboxs.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Please write a story about a robot"
      }
    ],
    "stream": true,
    "max_tokens": 500
  }'

Go example

go

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

type Message struct {
    Role    string `json:"role"`
    Content string `json:"content"`
}

type ChatRequest struct {
    Model       string    `json:"model"`
    Messages    []Message `json:"messages"`
    MaxTokens   int       `json:"max_tokens,omitempty"`
    Temperature float64   `json:"temperature,omitempty"`
    Stream      bool      `json:"stream,omitempty"`
}

type ChatResponse struct {
    ID      string `json:"id"`
    Object  string `json:"object"`
    Created int64  `json:"created"`
    Model   string `json:"model"`
    Choices []struct {
        Index   int `json:"index"`
        Message struct {
            Role    string `json:"role"`
            Content string `json:"content"`
        } `json:"message"`
        FinishReason string `json:"finish_reason"`
    } `json:"choices"`
    Usage struct {
        PromptTokens     int `json:"prompt_tokens"`
        CompletionTokens int `json:"completion_tokens"`
        TotalTokens      int `json:"total_tokens"`
    } `json:"usage"`
}

func chatCompletion(apiKey, baseURL string, req ChatRequest) (*ChatResponse, error) {
    jsonData, err := json.Marshal(req)
    if err != nil {
        return nil, err
    }

    httpReq, err := http.NewRequest("POST", baseURL+"/v1/chat/completions", bytes.NewBuffer(jsonData))
    if err != nil {
        return nil, err
    }

    httpReq.Header.Set("Authorization", "Bearer "+apiKey)
    httpReq.Header.Set("Content-Type", "application/json")

    client := &http.Client{}
    resp, err := client.Do(httpReq)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

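    body, err := io.ReadAll(resp.Body)
    if err != nil {
        return nil, err
    }

    // Report non-200 responses as errors instead of decoding an error payload as a ChatResponse
    if resp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("API error: %d - %s", resp.StatusCode, body)
    }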

    var chatResp ChatResponse
    err = json.Unmarshal(body, &chatResp)
    if err != nil {
        return nil, err
    }

    return &chatResp, nil
}

func main() {
    apiKey := "YOUR_API_KEY"
    baseURL := "https://api.rainboxs.com"

    req := ChatRequest{
        Model: "gpt-4o",
        Messages: []Message{
            {Role: "user", Content: "Please describe the main features of the Go language"},
        },
        MaxTokens:   1000,
        Temperature: 0.7,
    }

    resp, err := chatCompletion(apiKey, baseURL, req)
    if err != nil {
        fmt.Printf("Error: %v\n", err)
        return
    }

    if len(resp.Choices) > 0 {
        fmt.Println(resp.Choices[0].Message.Content)
    }
}

🚀 Usage Scenarios

1. Customer service

python

def customer_service_chat(user_query):
    messages = [
        {
            "role": "system",
            "content": "You are a professional customer service assistant. Answer user questions in a friendly and accurate way."
        },
        {
            "role": "user",
            "content": user_query
        }
    ]
    
    return chat_completion(api_key, base_url, messages)

2. Content creation

python

def content_creation(topic, style="professional"):
    messages = [
        {
            "role": "system",
            "content": f"You are a {style} content creator. Create high-quality content based on the user's needs."
        },
        {
            "role": "user",
            "content": f"Please create content about '{topic}'"
        }
    ]
    
    return chat_completion(api_key, base_url, messages, model="gpt-4o")

3. Code assistant

python

def code_assistant(programming_task):
    messages = [
        {
            "role": "system",
            "content": "You are a professional programming assistant. Provide clear, runnable code solutions."
        },
        {
            "role": "user",
            "content": programming_task
        }
    ]
    
    return chat_completion(api_key, base_url, messages, model="gpt-4o")

4. Document analysis

python

def analyze_document(document_content, question):
    messages = [
        {
            "role": "system",
            "content": "You are a document analysis expert. Answer questions based on the provided document content."
        },
        {
            "role": "user",
            "content": f"Document content:\n{document_content}\n\nQuestion: {question}"
        }
    ]
    
    return chat_completion(api_key, base_url, messages)

⚙️ Best Practices

1. System prompt optimization

Example of a good system prompt:

json

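{
    "role": "system",
    "content": "You are a professional Python programming tutor. Follow these principles:\n1. Provide clear, executable code examples\n2. Explain how the code works\n3. Point out potential problems and suggest improvements\n4. Use concise, plain language"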
}

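2. Temperature tuning

Scenario	Recommended temperature	Notes
Factual Q&A	0.1-0.3	Accuracy matters
Creative writing	0.7-0.9	Creativity matters
Code generation	0.2-0.4	Balances accuracy and flexibility
Translation	0.1-0.2	Consistency matters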
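
One way to apply the table in code is a small lookup that returns the midpoint of the recommended range. This is only a sketch: the scenario keys, the function name `pick_temperature`, and the choice of the midpoint are assumptions, and the fallback of 0.7 simply mirrors the value used in the request examples in this guide.

```python
# Ranges copied from the table above; the scenario keys are hypothetical labels
TEMPERATURE_RANGES = {
    "factual_qa": (0.1, 0.3),
    "creative_writing": (0.7, 0.9),
    "code_generation": (0.2, 0.4),
    "translation": (0.1, 0.2),
}

def pick_temperature(scenario, default=0.7):
    """Return the midpoint of the recommended range for a scenario."""
    if scenario not in TEMPERATURE_RANGES:
        return default
    low, high = TEMPERATURE_RANGES[scenario]
    # Round to avoid floating-point noise in the request body
    return round((low + high) / 2, 2)
```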

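3. Token usage optimization

python

def optimize_token_usage(messages, max_tokens=4000):
    """Truncate long text messages when the estimated prompt size nears max_tokens."""
    # Only plain-string content is measured; list (multimodal) or null content is skipped
    texts = [msg['content'] for msg in messages if isinstance(msg.get('content'), str)]
    total_chars = sum(len(text) for text in texts)
    # Rough estimate of about 4 characters per token; the ratio differs for languages such as Chinese
    estimated_tokens = total_chars // 4
    
    if estimated_tokens <= max_tokens * 0.8:  # Leave room for the output
        return messages
    
    # Return truncated copies so the caller's messages are not modified
    trimmed = []
    for msg in messages:
        content = msg.get('content')
        if isinstance(content, str) and len(content) > 1000:
            msg = {**msg, 'content': content[:800] + "..."}
        trimmed.append(msg)
    return trimmed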

4. Error handling

python

import time
import random

def robust_chat_completion(api_key, base_url, messages, max_retries=3):
    """Chat completion with retries and exponential backoff."""
    for attempt in range(max_retries):
        try:
            return chat_completion(api_key, base_url, messages)
        except Exception:
            if attempt == max_retries - 1:
                raise  # Re-raise with the original traceback
            
            # Exponential backoff with random jitter
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(wait_time)
    
    return None

5. Handling streaming output

python

import json

def handle_stream_response(response):
    """Yield content pieces from a streaming response as they arrive."""
    for line in response.iter_lines():
        if line:
            line = line.decode('utf-8')
            if line.startswith('data: '):
                content = line[6:]
                if content.strip() == '[DONE]':
                    break
                
                try:
                    chunk = json.loads(content)
                    if chunk.get('choices'):
                        delta = chunk['choices'][0]['delta']
                        # Skip deltas whose content is missing or null
                        if delta.get('content'):
                            yield delta['content']
                except json.JSONDecodeError:
                    continue

# A generator's return value is not visible to a plain for loop, so build
# the full text by joining the pieces: "".join(handle_stream_response(response))

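🔍 Model Selection Guide

Model comparison

Model	Characteristics	Suitable scenarios	Cost
GPT-4o	Multimodal, strong performance	Complex reasoning, image analysis	High
GPT-3.5-turbo	Balances performance and cost	General conversation, content generation	Medium
Claude-3	Strong at long-text processing	Document analysis, long conversations	Medium-high
Gemini Pro	Google ecosystem	Multimodal tasks	Medium

Model selection advice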

python

def select_model(task_type, complexity="medium"):
    """Pick a model for the given task type and complexity."""
    model_map = {
        "text_generation": {
            "low": "gpt-3.5-turbo",
            "medium": "gpt-4o-mini", 
            "high": "gpt-4o"
        },
        "image_analysis": {
            "low": "gemini-pro-vision",
            "medium": "gpt-4o",
            "high": "gemini-2.5-flash-all"
        },
        "code_generation": {
            "low": "gpt-3.5-turbo",
            "medium": "gpt-4o-mini",
            "high": "gpt-4o"
        },
        "document_analysis": {
            "low": "gpt-3.5-turbo",
            "medium": "claude-3-sonnet",
            "high": "claude-3-opus"
        }
    }
    
    return model_map.get(task_type, {}).get(complexity, "gpt-4o")

🚨 Common Errors and Solutions

1. Authentication error (401)

json

{
    "error": {
        "message": "Invalid API key provided",
        "type": "invalid_request_error",
        "code": "invalid_api_key"
    }
}

Solutions:

Check that the API key is correct
Confirm the header format: Bearer sk-xxxxx
Verify that the key has not expired

2. Request too large (413)

json

{
    "error": {
        "message": "Request too large",
        "type": "invalid_request_error",
        "code": "request_too_large"
    }
}

Solutions:

Reduce the length of the input text
Reduce the image size
Process long documents in batches

3. Rate limit (429)

json

{
    "error": {
        "message": "Rate limit exceeded",
        "type": "rate_limit_error", 
        "code": "rate_limit_exceeded"
    }
}

Solutions:

Retry with exponential backoff
Limit the number of concurrent requests
Upgrade to a higher-tier plan

4. Model not available (404)

json

{
    "error": {
        "message": "Model not found",
        "type": "invalid_request_error",
        "code": "model_not_found"
    }
}

Solutions:

Check the spelling of the model name
Confirm that the model is supported
Fall back to an alternative model
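
Of the four errors above, only the rate limit is likely to succeed on an unchanged retry; the others need a fix to the key, the model name, or the request size first. The sketch below encodes that distinction. Treating 5xx responses as retryable is an assumption that goes beyond what this guide lists, and the function name `should_retry` is hypothetical.

```python
def should_retry(status_code):
    """Decide whether a failed request is worth retrying unchanged."""
    # 429 means the rate limit was hit; waiting and retrying can succeed
    if status_code == 429:
        return True
    # Assumption: 5xx responses are transient server-side failures
    if 500 <= status_code < 600:
        return True
    # 401, 404 and 413 need a corrected key, model name, or smaller request first
    return False
```

A retry loop can call this before sleeping, so that an invalid key fails immediately instead of after several backoff delays.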

📊 Performance Monitoring

1. Token usage statistics

python

class TokenTracker:
    def __init__(self):
        self.total_prompt_tokens = 0
        self.total_completion_tokens = 0
        self.total_requests = 0
    
    def track_usage(self, response):
        usage = response.get('usage', {})
        self.total_prompt_tokens += usage.get('prompt_tokens', 0)
        self.total_completion_tokens += usage.get('completion_tokens', 0)
        self.total_requests += 1
    
    def get_stats(self):
        return {
            'total_requests': self.total_requests,
            'total_prompt_tokens': self.total_prompt_tokens,
            'total_completion_tokens': self.total_completion_tokens,
            'total_tokens': self.total_prompt_tokens + self.total_completion_tokens,
            'avg_tokens_per_request': (self.total_prompt_tokens + self.total_completion_tokens) / max(self.total_requests, 1)
        }

2. Response time monitoring

python

import time
from functools import wraps

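def monitor_performance(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        # perf_counter is a monotonic clock intended for measuring elapsed time
        start_time = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs on success and on failure, so failed calls are timed too
            elapsed = time.perf_counter() - start_time
            print(f"API call took {elapsed:.2f} seconds")
    return wrapper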

@monitor_performance
def monitored_chat_completion(api_key, base_url, messages):
    return chat_completion(api_key, base_url, messages)

🔒 Security Considerations

1. API key security

python

import os
from dotenv import load_dotenv

# Store the API key in an environment variable
load_dotenv()
api_key = os.getenv('API_KEY')

# Do not hard-code keys in source code
# ❌ Wrong
# api_key = "sk-1234567890abcdef"

# ✅ Right
# api_key = os.getenv('API_KEY')

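2. Input validation

python

def validate_input(messages):
    """Validate the input messages."""
    if not isinstance(messages, list):
        raise ValueError("messages must be a list")
    
    for msg in messages:
        if not isinstance(msg, dict):
            raise ValueError("each message must be a dict")
        
        if 'role' not in msg:
            raise ValueError("each message must contain a role field")
        
        if msg['role'] not in ['system', 'user', 'assistant', 'tool']:
            raise ValueError("role must be one of: system, user, assistant, tool")
        
        # Assistant messages that carry tool_calls may omit content
        if 'content' not in msg and not msg.get('tool_calls'):
            raise ValueError("each message must contain a content field")
    
    return True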

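3. Content filtering

python

def content_filter(text):
    """Basic keyword-based content filter."""
    sensitive_words = ['sensitive_word_1', 'sensitive_word_2']  # Define as needed
    
    for word in sensitive_words:
        if word in text:
            return False, f"Contains sensitive content: {word}"
    
    return True, "Content is safe"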

📚 Related Resources

Official documentation

🔧 Model list and pricing
📊 Usage statistics dashboard

💡 Advanced Tips

1. Conversation context management

python

class ConversationManager:
    def __init__(self, max_history=10):
        self.messages = []
        self.max_history = max_history
    
    def add_message(self, role, content):
        self.messages.append({"role": role, "content": content})
        
        # Keep the conversation history within the limit
        if len(self.messages) > self.max_history:
            # Keep system messages and drop the oldest user/assistant messages
            system_msgs = [msg for msg in self.messages if msg['role'] == 'system']
            other_msgs = [msg for msg in self.messages if msg['role'] != 'system']
            keep = max(self.max_history - len(system_msgs), 0)
            # A slice of [-0:] would return the whole list, so handle keep == 0 explicitly
            self.messages = system_msgs + (other_msgs[-keep:] if keep else [])
    
    def get_messages(self):
        return self.messages
    
    def clear_history(self, keep_system=True):
        if keep_system:
            self.messages = [msg for msg in self.messages if msg['role'] == 'system']
        else:
            self.messages = []

2. Batch processing

python

async def batch_process(api_key, base_url, requests_list):
    """Send multiple requests concurrently and return the results in order."""
    import asyncio
    import aiohttp
    
    async def single_request(session, request_data):
        headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }
        
        async with session.post(
            f"{base_url}/v1/chat/completions",
            json=request_data,
            headers=headers
        ) as response:
            return await response.json()
    
    async with aiohttp.ClientSession() as session:
        tasks = [single_request(session, req) for req in requests_list]
        results = await asyncio.gather(*tasks)
        return results
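
The batch function above starts every request at once, which can run into the rate limit described earlier. The sketch below caps concurrency with a semaphore and uses only the standard library. The name `bounded_gather` and the default limit of 5 are assumptions, and the callables passed in stand for any zero-argument function that returns a coroutine.

```python
import asyncio

async def bounded_gather(coroutine_factories, limit=5):
    """Run the given coroutine factories with at most `limit` active at once."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(factory):
        # Each task waits here until one of the `limit` slots is free
        async with semaphore:
            return await factory()

    # gather returns results in the same order as the input
    return await asyncio.gather(*(run_one(f) for f in coroutine_factories))
```

Inside a function like the one above, each request would be wrapped as a factory, for example `lambda req=req: single_request(session, req)`, so that the coroutine is only created once a slot is available.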

3. Smart retry mechanism

python

import random
import time
from functools import wraps

def smart_retry(max_retries=3, base_delay=1, max_delay=60):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    if attempt == max_retries - 1:
                        raise  # Re-raise with the original traceback
                    
                    # Adjust the delay based on the error type
                    if "rate_limit" in str(e).lower():
                        delay = min(base_delay * (2 ** attempt) + random.uniform(0, 1), max_delay)
                    elif "timeout" in str(e).lower():
                        delay = min(base_delay * (attempt + 1), max_delay)
                    else:
                        delay = base_delay
                    
                    print(f"Request failed, retrying in {delay:.1f}s... (attempt {attempt + 1}/{max_retries})")
                    time.sleep(delay)
            
            return None
        return wrapper
    return decorator

@smart_retry(max_retries=5, base_delay=2)
def resilient_chat_completion(api_key, base_url, messages):
    return chat_completion(api_key, base_url, messages)

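🎯 Summary

The Chat-Completions API is a flexible interface that supports multiple AI models and usage scenarios. With this guide, you can:

✅ Master the basics - understand how to call the interface and configure parameters
✅ Implement advanced features - multimodal input, streaming output, function calling, and more
✅ Optimize performance - choose models sensibly, manage tokens, and handle errors
✅ Stay secure - protect API keys, validate input, and filter content
✅ Monitor usage - track performance metrics and usage statistics

Following the best practices in this guide will help you build stable, efficient AI applications. If you have any questions, see the related resources or contact technical support.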
