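Chat-Completions API Usage Guide
This guide explains how to use the Chat-Completions API for AI conversation and text generation tasks. The API is compatible with the OpenAI format and supports models from several vendors, including GPT, Claude, and Gemini.
Official documentation: https://platform.openai.com/docs/api-reference/chat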
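📝 Introduction
The Chat-Completions API is a flexible interface that gives simple access to current AI models. It supports:
- 💬 Text conversation: natural-language question answering and dialogue
- 🖼️ Image analysis: multimodal content understanding
- 🔄 Streaming responses: output delivered in real time as it is generated
- 🛠️ Function calling: tool integration and automation
- 📊 Structured output: JSON-formatted output
- 🎯 Multiple models: mainstream models from the major vendors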
🔧 Interface definition
- Endpoint: /v1/chat/completions
- Method: POST
- Authentication: Bearer Token
- Format: application/json
🔐 Authentication
Every request must include the Authorization field in its HTTP headers:
http
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
Replace YOUR_API_KEY with a valid API key generated on the platform.
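As a minimal sketch of the two headers above, the following builds a request without sending it, using only the Python standard library. The helper name `build_request` is hypothetical, and the base URL is the one this guide uses in its client examples; substitute your own.

```python
import json
import urllib.request


def build_request(api_key, base_url, payload):
    """Build (without sending) a POST request carrying the required headers."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request(
    "YOUR_API_KEY",
    "https://api.rainboxs.com",  # base URL taken from this guide's examples
    {"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]},
)
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without a key.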
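💡 Request examples
Basic text conversation ✅
The simplest text question-and-answer case:
json
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": "Hello, please give an overview of the history of artificial intelligence"
    }
  ],
  "max_tokens": 1000,
  "temperature": 0.7
}
Image analysis conversation ✅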
Multimodal conversation with image input:
json
{
  "model": "gemini-2.5-flash-all",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Please describe what is in this image"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD..."
          }
        }
      ]
    }
  ],
  "max_tokens": 500
}
Streaming response ✅
Receive output in real time as it is generated:
json
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": "Please write a poem about spring"
    }
  ],
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}
Function calling ✅
Integrate external tools and functions:
json
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": "What is the weather like in Beijing today?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather information for the specified city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {
              "type": "string",
              "description": "City name"
            }
          },
          "required": ["city"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}
JSON mode output ✅
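Force the model to return structured JSON:
json
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "You are a data extraction assistant. Extract the information in the user's input as JSON."
    },
    {
      "role": "user",
      "content": "Zhang San, male, 25 years old, software engineer, lives in Chaoyang District, Beijing"
    }
  ],
  "response_format": {
    "type": "json_object"
  }
}
📮 Request parameters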
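Core parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | - | Name of the model to use |
| messages | array | Yes | - | List of conversation messages |
| max_tokens | integer | No | - | Maximum number of tokens to generate |
| temperature | number | No | 1 | Controls output randomness (0-2) |
| top_p | number | No | 1 | Nucleus sampling parameter (0-1) |
| n | integer | No | 1 | Number of responses to generate |
| stream | boolean | No | false | Whether to enable streaming output |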
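One way to honor the defaults in the table is to send only the optional parameters the caller actually sets. The sketch below assumes a hypothetical helper named `build_payload`; it is an illustration, not part of the API.

```python
def build_payload(model, messages, **options):
    """Return a request body, dropping optional parameters left as None."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must be a non-empty list")
    payload = {"model": model, "messages": messages}
    # Only include optional parameters the caller actually set, so the
    # server-side defaults from the table apply to everything else.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


payload = build_payload(
    "gpt-4o",
    [{"role": "user", "content": "Hello"}],
    max_tokens=100,
    temperature=None,  # omitted from the body, so the default applies
)
print(payload)
```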
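messages parameter
Structure of each object in the message array:
json
{
  "role": "user|assistant|system|tool",
  "content": "message content",
  "name": "optional sender name",
  "tool_calls": "tool call information",
  "tool_call_id": "tool call ID"
}
Roles:
- system: system prompt that defines the AI's behavior and role
- user: message entered by the user
- assistant: reply from the AI assistant
- tool: result returned by a tool call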
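The assistant and tool roles come together when a tool is called: the assistant turn carries `tool_calls`, and each result goes back as a `tool` message whose `tool_call_id` matches. The following is a rough sketch under the OpenAI-compatible message shape; `get_weather` is a stand-in with made-up data, `append_tool_results` is a hypothetical helper, and the assistant message is written by hand rather than taken from a real response.

```python
import json


def get_weather(city):
    """Stand-in tool; a real implementation would query a weather service."""
    return {"city": city, "forecast": "sunny"}  # made-up data


def append_tool_results(messages, assistant_message, tools):
    """Append the assistant turn and one tool message per tool call."""
    messages = messages + [assistant_message]
    for call in assistant_message.get("tool_calls") or []:
        func = tools[call["function"]["name"]]
        # Arguments arrive as a JSON-encoded string, not a dict.
        args = json.loads(call["function"]["arguments"])
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(func(**args), ensure_ascii=False),
        })
    return messages


# Hand-written assistant message in the OpenAI-compatible shape.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Beijing\"}"},
    }],
}
history = [{"role": "user", "content": "What is the weather in Beijing today?"}]
history = append_tool_results(history, assistant_message, {"get_weather": get_weather})
print([m["role"] for m in history])
```

Sending the extended list back to the endpoint lets the model turn the tool result into a final answer.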
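Advanced parameters
| Parameter | Type | Description |
|---|---|---|
| stop | string/array | Sequences at which generation stops |
| presence_penalty | number | Presence penalty (-2.0 to 2.0) |
| frequency_penalty | number | Frequency penalty (-2.0 to 2.0) |
| logit_bias | object | Per-token generation bias |
| user | string | End-user identifier |
| seed | integer | Seed for deterministic sampling |
| response_format | object | Output format control |
| tools | array | List of available tools |
| tool_choice | string/object | Tool selection strategy |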
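The numeric ranges listed in the two parameter tables can be checked on the client before a request is sent. This is a small sketch; `RANGES` and `check_ranges` are hypothetical names, and the bounds are copied from the tables in this guide.

```python
# Bounds copied from the parameter tables in this guide.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}


def check_ranges(params):
    """Return a list of readable problems; an empty list means all values fit."""
    problems = []
    for name, (low, high) in RANGES.items():
        value = params.get(name)
        if value is not None and not low <= value <= high:
            problems.append(f"{name}={value} is outside [{low}, {high}]")
    return problems


print(check_ranges({"temperature": 0.7, "presence_penalty": 3}))
```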
📥 Response format
Standard response
json
{
  "id": "chatcmpl-8XYZ123",
  "object": "chat.completion",
  "created": 1699000000,
  "model": "gpt-4o-2024-05-13",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "This is the AI's reply"
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 25,
    "total_tokens": 40,
    "completion_tokens_details": {
      "reasoning_tokens": 0
    }
  },
  "system_fingerprint": "fp_abc123"
}
Streaming response
json
data: {"id":"chatcmpl-8XYZ123","object":"chat.completion.chunk","created":1699000000,"model":"gpt-4o","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-8XYZ123","object":"chat.completion.chunk","created":1699000000,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":"!"},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-8XYZ123","object":"chat.completion.chunk","created":1699000000,"model":"gpt-4o","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}
data: [DONE]
finish_reason values
| Value | Description |
|---|---|
| stop | The model stopped naturally or hit a stop sequence |
| length | The maximum token limit was reached |
| tool_calls | The model called a tool |
| content_filter | The content was filtered |
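A caller usually branches on `finish_reason` before using the reply. The sketch below shows one possible policy for a non-streaming response: raise on `content_filter`, flag `length` as truncated, and hand back any tool calls. The function name `extract_reply` and the policy itself are assumptions, not behavior defined by the API.

```python
def extract_reply(response):
    """Return (text, tool_calls, truncated) from a non-streaming response."""
    choice = response["choices"][0]
    message = choice["message"]
    reason = choice.get("finish_reason")
    if reason == "content_filter":
        raise RuntimeError("response was blocked by the content filter")
    truncated = reason == "length"  # output hit the token limit
    tool_calls = message.get("tool_calls") or []
    return message.get("content") or "", tool_calls, truncated


sample = {
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "partial answer"},
        "finish_reason": "length",
    }]
}
text, tool_calls, truncated = extract_reply(sample)
print(text, tool_calls, truncated)
```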
💻 Code examples
Python examples
Basic conversation
python
import json

import requests


def chat_completion(api_key, base_url, messages, model="gpt-4o"):
    """Send a basic chat completion request and return the parsed JSON."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    data = {
        "model": model,
        "messages": messages,
        "max_tokens": 1000,
        "temperature": 0.7
    }
    response = requests.post(
        f"{base_url}/v1/chat/completions",
        headers=headers,
        data=json.dumps(data),
        timeout=60  # avoid hanging forever on a stalled connection
    )
    if response.status_code == 200:
        return response.json()
    raise Exception(f"API Error: {response.status_code} - {response.text}")


# Usage example
api_key = "YOUR_API_KEY"
base_url = "https://api.rainboxs.com"
messages = [
    {"role": "user", "content": "Please explain the basic concepts of machine learning"}
]
result = chat_completion(api_key, base_url, messages)
print(result['choices'][0]['message']['content'])
Image analysis
python
import base64
import json

import requests


def encode_image(image_path):
    """Encode an image file as a base64 string."""
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')


def analyze_image(api_key, base_url, image_path, question):
    """Ask a question about a local image."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    # Encode the image
    base64_image = encode_image(image_path)
    data = {
        "model": "gemini-2.5-flash-all",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:image/jpeg;base64,{base64_image}"
                        }
                    }
                ]
            }
        ],
        "max_tokens": 500
    }
    response = requests.post(
        f"{base_url}/v1/chat/completions",
        headers=headers,
        data=json.dumps(data),
        timeout=60
    )
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return response.json()


# Usage example (api_key and base_url as defined in the basic example)
result = analyze_image(
    api_key,
    base_url,
    "image.jpg",
    "Please describe this image in detail"
)
print(result['choices'][0]['message']['content'])
Streaming output
python
import json

import requests


def stream_chat(api_key, base_url, messages, model="gpt-4o"):
    """Print a chat completion as it streams in."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    data = {
        "model": model,
        "messages": messages,
        "stream": True,
        "stream_options": {"include_usage": True}
    }
    response = requests.post(
        f"{base_url}/v1/chat/completions",
        headers=headers,
        data=json.dumps(data),
        stream=True
    )
    for line in response.iter_lines():
        if line:
            line = line.decode('utf-8')
            if line.startswith('data: '):
                content = line[6:]  # strip the 'data: ' prefix
                if content.strip() == '[DONE]':
                    break
                try:
                    chunk = json.loads(content)
                    # The usage chunk has an empty choices list, so skip it here
                    if 'choices' in chunk and chunk['choices']:
                        delta = chunk['choices'][0]['delta']
                        # content can be missing or null (e.g. the final chunk)
                        if delta.get('content'):
                            print(delta['content'], end='', flush=True)
                except json.JSONDecodeError:
                    continue


# Usage example
messages = [
    {"role": "user", "content": "Please write a poem about artificial intelligence"}
]
stream_chat(api_key, base_url, messages)
Node.js examples
Basic conversation
javascript
const axios = require('axios');

async function chatCompletion(apiKey, baseUrl, messages, model = 'gpt-4o') {
  const headers = {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  };
  const data = {
    model: model,
    messages: messages,
    max_tokens: 1000,
    temperature: 0.7
  };
  try {
    const response = await axios.post(
      `${baseUrl}/v1/chat/completions`,
      data,
      { headers }
    );
    return response.data;
  } catch (error) {
    // The error body is usually an object, so serialize it for the message
    const detail = JSON.stringify(error.response?.data);
    throw new Error(`API Error: ${error.response?.status} - ${detail}`);
  }
}

// Usage example
const apiKey = 'YOUR_API_KEY';
const baseUrl = 'https://api.rainboxs.com';
const messages = [
  { role: 'user', content: 'Please explain how deep learning works' }
];
chatCompletion(apiKey, baseUrl, messages)
  .then(result => {
    console.log(result.choices[0].message.content);
  })
  .catch(error => {
    console.error('Error:', error.message);
  });
Function calling
javascript
// Assumes axios, apiKey, and baseUrl from the basic conversation example
async function functionCalling(apiKey, baseUrl, query) {
  const headers = {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  };
  const data = {
    model: 'gpt-4o',
    messages: [
      { role: 'user', content: query }
    ],
    tools: [
      {
        type: 'function',
        function: {
          name: 'get_weather',
          description: 'Get weather information for the specified city',
          parameters: {
            type: 'object',
            properties: {
              city: {
                type: 'string',
                description: 'City name'
              }
            },
            required: ['city']
          }
        }
      }
    ],
    tool_choice: 'auto'
  };
  const response = await axios.post(
    `${baseUrl}/v1/chat/completions`,
    data,
    { headers }
  );
  return response.data;
}

// Usage example
functionCalling(apiKey, baseUrl, 'What is the weather like in Shanghai today?')
  .then(result => {
    const message = result.choices[0].message;
    if (message.tool_calls) {
      console.log('Tool calls:', message.tool_calls);
    } else {
      console.log('Reply:', message.content);
    }
  })
  .catch(error => {
    console.error('Error:', error.message);
  });
cURL examples
Basic request
bash
curl -X POST "https://api.rainboxs.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Hello, please introduce natural language processing"
      }
    ],
    "max_tokens": 1000,
    "temperature": 0.7
  }'
Streaming request
bash
# -N turns off curl's output buffering so chunks print as they arrive
curl -N -X POST "https://api.rainboxs.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Please write a story about a robot"
      }
    ],
    "stream": true,
    "max_tokens": 500
  }'
Go example
go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type ChatRequest struct {
	Model       string    `json:"model"`
	Messages    []Message `json:"messages"`
	MaxTokens   int       `json:"max_tokens,omitempty"`
	Temperature float64   `json:"temperature,omitempty"`
	Stream      bool      `json:"stream,omitempty"`
}

type ChatResponse struct {
	ID      string `json:"id"`
	Object  string `json:"object"`
	Created int64  `json:"created"`
	Model   string `json:"model"`
	Choices []struct {
		Index   int `json:"index"`
		Message struct {
			Role    string `json:"role"`
			Content string `json:"content"`
		} `json:"message"`
		FinishReason string `json:"finish_reason"`
	} `json:"choices"`
	Usage struct {
		PromptTokens     int `json:"prompt_tokens"`
		CompletionTokens int `json:"completion_tokens"`
		TotalTokens      int `json:"total_tokens"`
	} `json:"usage"`
}

// chatCompletion sends one non-streaming request and decodes the response.
func chatCompletion(apiKey, baseURL string, req ChatRequest) (*ChatResponse, error) {
	jsonData, err := json.Marshal(req)
	if err != nil {
		return nil, err
	}
	httpReq, err := http.NewRequest("POST", baseURL+"/v1/chat/completions", bytes.NewBuffer(jsonData))
	if err != nil {
		return nil, err
	}
	httpReq.Header.Set("Authorization", "Bearer "+apiKey)
	httpReq.Header.Set("Content-Type", "application/json")
	client := &http.Client{}
	resp, err := client.Do(httpReq)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	// Report HTTP errors instead of decoding an error body as a success
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("API error: %d - %s", resp.StatusCode, body)
	}
	var chatResp ChatResponse
	err = json.Unmarshal(body, &chatResp)
	if err != nil {
		return nil, err
	}
	return &chatResp, nil
}

func main() {
	apiKey := "YOUR_API_KEY"
	baseURL := "https://api.rainboxs.com"
	req := ChatRequest{
		Model: "gpt-4o",
		Messages: []Message{
			{Role: "user", Content: "Please describe the main features of the Go language"},
		},
		MaxTokens:   1000,
		Temperature: 0.7,
	}
	resp, err := chatCompletion(apiKey, baseURL, req)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}
	if len(resp.Choices) > 0 {
		fmt.Println(resp.Choices[0].Message.Content)
	}
}
🚀 Use cases
1. Customer service
python
def customer_service_chat(user_query):
    messages = [
        {
            "role": "system",
            "content": "You are a professional customer service assistant. Answer user questions in a friendly and accurate way."
        },
        {
            "role": "user",
            "content": user_query
        }
    ]
    return chat_completion(api_key, base_url, messages)
2. Content creation
python
def content_creation(topic, style="professional"):
    messages = [
        {
            "role": "system",
            "content": f"You are a {style} content creator. Produce high-quality content that meets the user's request."
        },
        {
            "role": "user",
            "content": f"Please write content about '{topic}'"
        }
    ]
    return chat_completion(api_key, base_url, messages, model="gpt-4o")
3. Code assistant
python
def code_assistant(programming_task):
    messages = [
        {
            "role": "system",
            "content": "You are a professional programming assistant. Provide clear, runnable code solutions."
        },
        {
            "role": "user",
            "content": programming_task
        }
    ]
    return chat_completion(api_key, base_url, messages, model="gpt-4o")
4. Document analysis
python
def analyze_document(document_content, question):
    messages = [
        {
            "role": "system",
            "content": "You are a document analysis expert. Answer questions based on the document provided."
        },
        {
            "role": "user",
            "content": f"Document:\n{document_content}\n\nQuestion: {question}"
        }
    ]
    return chat_completion(api_key, base_url, messages)
⚙️ Best practices
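1. System prompt optimization
An example of a good system prompt:
json
{
  "role": "system",
  "content": "You are a professional Python programming tutor. Follow these principles:\n1. Provide clear, executable code examples\n2. Explain how the code works\n3. Point out potential problems and suggest improvements\n4. Use concise, plain language"
}
2. Temperature tuning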
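| Scenario | Recommended temperature | Notes |
|---|---|---|
| Factual Q&A | 0.1-0.3 | Accuracy matters |
| Creative writing | 0.7-0.9 | Creativity matters |
| Code generation | 0.2-0.4 | Balances accuracy and flexibility |
| Translation | 0.1-0.2 | Consistency matters |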
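The table can be turned into a small lookup so callers do not hard-code values. This is a sketch: the task keys and the function name `recommended_temperature` are hypothetical, picking the midpoint of each range is an arbitrary choice, and the 0.7 fallback simply mirrors the value used in this guide's request examples.

```python
# Ranges copied from the table above; the keys are illustrative names.
TEMPERATURE_RANGES = {
    "factual_qa": (0.1, 0.3),
    "creative_writing": (0.7, 0.9),
    "code_generation": (0.2, 0.4),
    "translation": (0.1, 0.2),
}


def recommended_temperature(task, default=0.7):
    """Return the midpoint of the recommended range, or default if unknown."""
    low, high = TEMPERATURE_RANGES.get(task, (default, default))
    return round((low + high) / 2, 2)


for task in ("factual_qa", "creative_writing", "unknown_task"):
    print(task, recommended_temperature(task))
```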
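3. Token usage optimization
python
def optimize_token_usage(messages, max_tokens=4000):
    """Trim long messages when the estimated prompt size nears the budget."""
    # Rough estimate: about 4 characters per token. Assumes every
    # message's content is a plain string, not a multimodal list.
    total_chars = sum(len(msg['content']) for msg in messages)
    estimated_tokens = total_chars // 4
    if estimated_tokens > max_tokens * 0.8:  # leave room for the output
        # Truncate long messages (modifies the dicts in place)
        for msg in messages:
            if len(msg['content']) > 1000:
                msg['content'] = msg['content'][:800] + "..."
    return messages
4. Error handling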
python
import random
import time


def robust_chat_completion(api_key, base_url, messages, max_retries=3):
    """Chat completion with retries and exponential backoff."""
    for attempt in range(max_retries):
        try:
            return chat_completion(api_key, base_url, messages)
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: re-raise with the original traceback
            # Exponential backoff with jitter
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(wait_time)
    return None  # only reached when max_retries <= 0
5. Handling streaming output
python
import json


def handle_stream_response(response):
    """Yield content pieces from a streaming response as they arrive."""
    full_content = ""
    for line in response.iter_lines():
        if line:
            line = line.decode('utf-8')
            if line.startswith('data: '):
                content = line[6:]
                if content.strip() == '[DONE]':
                    break
                try:
                    chunk = json.loads(content)
                    if 'choices' in chunk and chunk['choices']:
                        delta = chunk['choices'][0]['delta']
                        # content can be missing or null in some chunks
                        if delta.get('content'):
                            content_piece = delta['content']
                            full_content += content_piece
                            yield content_piece
                except json.JSONDecodeError:
                    continue
    # In a generator, this value is only exposed as StopIteration.value
    return full_content
🔍 Model selection guide
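Comparison of mainstream models
| Model | Characteristics | Suitable for | Cost |
|---|---|---|---|
| GPT-4o | Multimodal, strong performance | Complex reasoning, image analysis | High |
| GPT-3.5-turbo | Balances performance and cost | General conversation, content generation | Medium |
| Claude-3 | Strong at long text | Document analysis, long conversations | Medium-high |
| Gemini Pro | Google ecosystem | Multimodal tasks | Medium |
Model selection suggestions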
python
def select_model(task_type, complexity="medium"):
    """Pick a model for the task type and complexity; falls back to gpt-4o."""
    model_map = {
        "text_generation": {
            "low": "gpt-3.5-turbo",
            "medium": "gpt-4o-mini",
            "high": "gpt-4o"
        },
        "image_analysis": {
            "low": "gemini-pro-vision",
            "medium": "gpt-4o",
            "high": "gemini-2.5-flash-all"
        },
        "code_generation": {
            "low": "gpt-3.5-turbo",
            "medium": "gpt-4o-mini",
            "high": "gpt-4o"
        },
        "document_analysis": {
            "low": "gpt-3.5-turbo",
            "medium": "claude-3-sonnet",
            "high": "claude-3-opus"
        }
    }
    return model_map.get(task_type, {}).get(complexity, "gpt-4o")
🚨 Common errors and solutions
1. Authentication error (401)
json
{
  "error": {
    "message": "Invalid API key provided",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
Solutions:
- Check that the API key is correct
- Confirm the header format: Bearer sk-xxxxx
- Verify that the key has not expired
2. Request too large (413)
json
{
  "error": {
    "message": "Request too large",
    "type": "invalid_request_error",
    "code": "request_too_large"
  }
}
Solutions:
- Shorten the input text
- Reduce the image size
- Process long documents in batches
3. Rate limit (429)
json
{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
Solutions:
- Retry with exponential backoff
- Limit the number of concurrent requests
- Upgrade to a higher-tier plan
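4. Model unavailable (404)
json
{
  "error": {
    "message": "Model not found",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}
Solutions:
- Check the spelling of the model name
- Confirm that the model is supported
- Fall back to an alternative model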
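The four cases above can be folded into one dispatch step that decides whether a failure is worth retrying. This is a sketch with hypothetical action names; treating 5xx responses as retryable is an assumption that this guide does not state.

```python
def classify_error(status_code):
    """Map an HTTP status code to a suggested action (illustrative names)."""
    if status_code == 429:
        return "retry_with_backoff"
    if status_code >= 500:
        return "retry_with_backoff"  # assumption: server errors are transient
    if status_code == 401:
        return "check_api_key"
    if status_code == 413:
        return "shrink_request"
    if status_code == 404:
        return "check_model_name"
    return "fail"


for code in (401, 404, 413, 429, 503):
    print(code, classify_error(code))
```

Only the retryable branch benefits from backoff; retrying a 401 or 404 repeats the same failure.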
📊 Performance monitoring
1. Token usage statistics
python
class TokenTracker:
    """Accumulate token usage across responses."""

    def __init__(self):
        self.total_prompt_tokens = 0
        self.total_completion_tokens = 0
        self.total_requests = 0

    def track_usage(self, response):
        usage = response.get('usage', {})
        self.total_prompt_tokens += usage.get('prompt_tokens', 0)
        self.total_completion_tokens += usage.get('completion_tokens', 0)
        self.total_requests += 1

    def get_stats(self):
        total_tokens = self.total_prompt_tokens + self.total_completion_tokens
        return {
            'total_requests': self.total_requests,
            'total_prompt_tokens': self.total_prompt_tokens,
            'total_completion_tokens': self.total_completion_tokens,
            'total_tokens': total_tokens,
            # max(..., 1) avoids division by zero before any request is tracked
            'avg_tokens_per_request': total_tokens / max(self.total_requests, 1)
        }
2. Response time monitoring
python
import time
from functools import wraps


def monitor_performance(func):
    """Print how long each call to the wrapped function takes."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start_time = time.perf_counter()  # monotonic clock, suited to timing
        result = func(*args, **kwargs)
        end_time = time.perf_counter()
        print(f"API call took {end_time - start_time:.2f} seconds")
        return result
    return wrapper


@monitor_performance
def monitored_chat_completion(api_key, base_url, messages):
    return chat_completion(api_key, base_url, messages)
🔒 Security considerations
1. API key security
python
import os
from dotenv import load_dotenv

# Store the API key in an environment variable
load_dotenv()
api_key = os.getenv('API_KEY')
# Never hard-code the key in source code
# ❌ Wrong
# api_key = "sk-1234567890abcdef"
# ✅ Right
# api_key = os.getenv('API_KEY')
2. Input validation
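python
def validate_input(messages):
    """Validate the structure of the input messages."""
    if not isinstance(messages, list):
        raise ValueError("messages must be a list")
    for msg in messages:
        if not isinstance(msg, dict):
            raise ValueError("each message must be a dict")
        if 'role' not in msg:
            raise ValueError("each message must include a role field")
        if msg['role'] not in ['system', 'user', 'assistant', 'tool']:
            raise ValueError("role must be a valid value")
        # Assistant messages that carry tool_calls may omit content
        has_tool_calls = msg['role'] == 'assistant' and msg.get('tool_calls')
        if 'content' not in msg and not has_tool_calls:
            raise ValueError("each message must include a content field")
    return True
3. Content filtering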
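python
def content_filter(text):
    """Basic keyword-based content filter."""
    sensitive_words = ['sensitive word 1', 'sensitive word 2']  # define as needed
    for word in sensitive_words:
        if word in text:
            return False, f"Contains sensitive content: {word}"
    return True, "Content is safe"
📚 Related resources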
Official documentation: https://platform.openai.com/docs/api-reference/chat
💡 Advanced techniques
1. Conversation context management
python
class ConversationManager:
    """Keep a bounded message history while preserving system messages."""

    def __init__(self, max_history=10):
        self.messages = []
        self.max_history = max_history

    def add_message(self, role, content):
        self.messages.append({"role": role, "content": content})
        # Keep the history within the limit
        if len(self.messages) > self.max_history:
            # Keep system messages and drop the oldest user/assistant messages
            system_msgs = [msg for msg in self.messages if msg['role'] == 'system']
            other_msgs = [msg for msg in self.messages if msg['role'] != 'system']
            keep = max(self.max_history - len(system_msgs), 0)
            # other_msgs[-0:] would return the whole list, so handle keep == 0
            self.messages = system_msgs + (other_msgs[-keep:] if keep else [])

    def get_messages(self):
        return self.messages

    def clear_history(self, keep_system=True):
        if keep_system:
            self.messages = [msg for msg in self.messages if msg['role'] == 'system']
        else:
            self.messages = []
2. Batch processing
python
import asyncio

import aiohttp


async def batch_process(api_key, base_url, requests_list):
    """Send several requests concurrently; results keep the input order."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }

    async def single_request(session, request_data):
        async with session.post(
            f"{base_url}/v1/chat/completions",
            json=request_data,
            headers=headers
        ) as response:
            return await response.json()

    async with aiohttp.ClientSession() as session:
        tasks = [single_request(session, req) for req in requests_list]
        results = await asyncio.gather(*tasks)
        return results
3. Smart retry mechanism
python
import random
import time
from functools import wraps


def smart_retry(max_retries=3, base_delay=1, max_delay=60):
    """Retry decorator that picks the delay from the kind of error."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    if attempt == max_retries - 1:
                        raise
                    # Adjust the strategy to the error type
                    if "rate_limit" in str(e).lower():
                        # Exponential backoff with jitter, capped at max_delay
                        delay = min(base_delay * (2 ** attempt) + random.uniform(0, 1), max_delay)
                    elif "timeout" in str(e).lower():
                        delay = base_delay * (attempt + 1)
                    else:
                        delay = base_delay
                    print(f"Request failed, retrying in {delay:.1f}s... (attempt {attempt + 1}/{max_retries})")
                    time.sleep(delay)
            return None  # only reached when max_retries <= 0
        return wrapper
    return decorator


@smart_retry(max_retries=5, base_delay=2)
def resilient_chat_completion(api_key, base_url, messages):
    return chat_completion(api_key, base_url, messages)
🎯 Summary
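The Chat-Completions API is a flexible interface that supports many AI models and use cases. With this guide, you can:
✅ Master the basics - understand how to call the endpoint and configure its parameters
✅ Implement advanced features - multimodal input, streaming output, function calling, and more
✅ Optimize performance - choose models sensibly, manage tokens, and handle errors
✅ Stay secure - protect API keys, validate input, and filter content
✅ Monitor usage - track performance metrics and usage statistics
Following the best practices in this guide will help you build stable, efficient AI applications. If you have questions, consult the related resources or contact technical support.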