SSE 在大语言模型中的流式数据处理简单封装

1216 字

6 分钟

SSE 在大语言模型中的流式数据处理简单封装

2025-04-01

前端

SSE

/

LLM

效果展示#

背景#

最近我在使用大语言模型做一些简单、好玩的工具应用，比如算命、生成名人大事记等，然而大语言模型返回的数据量比较大，如果一次性返回，那么用户体验会比较差，所以需要使用 SSE 来实现流式数据传输，提升用户体验。

SSE 是什么？#

SSE（Server-Sent Events）是一种单向服务器推送技术，即：

服务器 → 客户端

• 长连接（保持 HTTP 连接） • 逐步推送数据，适用于流式响应（如大模型返回 Token）

与 WebSocket 的区别

特性	SSE	WebSocket
连接方向	单向(服务器到客户端)	双向
适用场景	流式数据	实时聊天、多人协作
兼容性	所有现代浏览器	需要服务器和客户端都支持
连接数	受浏览器限制	无限制

LLM 中的 SSE 工作流程#

应用场景：ChatGPT、DeepSeek Chat、Gemini 等对话模型返回数据逐步传输，而不是一次性返回整个 JSON。

步骤#

客户端发起请求 • 使用 EventSource（浏览器）或 fetch + ReadableStream（Node.js）创建 SSE 连接
服务器逐步返回数据 • 服务器端不断向 SSE 连接推送 Token
客户端逐步解析数据 • 浏览器或前端代码监听 SSE 事件，逐步渲染消息

SSE 代码示例#

服务端 Node.js 代码示例

1
import OpenAI from 'openai';
2

3
const openai = new OpenAI({
4
  baseURL: 'https://api.deepseek.com',
5
  apiKey: process.env.DEEPSEEK_KEY,
6
});
7

8
const stream = await openai.chat.completions.create({
9
  model: "deepseek-chat",
10
  messages: [{ role: "user", content: "Hello, world!" }],
11
  stream: true,
12
});

客户端代码示例 JavaScript

1
const stream = await fetch('http://localhost:3000/sse', {
2
  method: 'GET',
3
}).then(res => res.body);
4

5
const reader = stream?.getReader();
6
const decoder = new TextDecoder();
7
while (true) {
8
  const { done, value } = await reader!.read();
9
  if (done) break;
10
  const text = decoder.decode(value);
11
  console.log(text);
12
}

简单封装#

基于大模型返回的 SSE 数据，进行简单封装，实现流式数据传输以及页面渲染，包含思考模型处理。

服务端#

函数定义#

1
// @/utils/server.ts
2

3
/**
4
 * 获取大模型响应流
5
 * @param completion 大模型响应流 await service.chat.completions.create 返回的流式数据 create({ stream: true })
6
 * @param model 模型名称
7
 * @param type 类型
8
 * @returns 流式数据
9
 */
10
export const getStreamData = (completion: Stream<ChatCompletionChunk>) => {
11
  let count = 0;
12
  let thinkingCount = 0;
13
  const stream = new ReadableStream({
14
    async start(controller) {
15
      try {
16
        console.log('Starting stream processing...');
17
        for await (const chunk of completion) {
18
          const { choices } = chunk;
19
          const delta = choices[0]?.delta as Delta ?? {};
20

21
          if (!delta) continue;
22

23
          const { reasoning_content = null, content = null } = delta;
24
          // 添加 thinking 标签
25
          if (count === 0 && reasoning_content) {
26
            controller.enqueue(new TextEncoder().encode('<thinking>'));
27
            // 等待 100ms 后避免批处理
28
            await new Promise(resolve => setTimeout(resolve, 100));
29
          }
30
          if (reasoning_content) {
31
            count++;
32
            thinkingCount++;
33
            controller.enqueue(new TextEncoder().encode(reasoning_content));
34
          }
35
          if (count - thinkingCount === 0 && thinkingCount !== 0 && content) {
36
            controller.enqueue(new TextEncoder().encode('</thinking>'));
37
            // 等待 100ms 后避免批处理
38
            await new Promise(resolve => setTimeout(resolve, 100));
39
          }
40
          if (content) {
41
            count++;
42
            controller.enqueue(new TextEncoder().encode(content));
43
          }
44
        }
45
      } catch (error) {
46
        console.error('Stream processing error:', error);
47
        controller.error(error);
48
      } finally {
49
        if (count === 0) {
50
          controller.enqueue(new TextEncoder().encode('[服务器繁忙，请稍后再试。]'));
51
        } else {
52
          controller.enqueue(new TextEncoder().encode('[DONE]'));
53
        }
54
        controller.close();
55
      }
56
    },
57
  });
58
  return stream;
59
};

使用#

1
import OpenAI from 'openai';
2
import { getStreamData } from '@/utils/server';
3

4
const openai = new OpenAI({
5
  baseURL: 'https://api.deepseek.com',
6
  apiKey: process.env.DEEPSEEK_KEY,
7
});
8

9
const completion = await openai.chat.completions.create({
10
  model: "deepseek-chat",
11
  messages: [{ role: "user", content: "Hello, world!" }],
12
  stream: true,
13
});
14
const stream = await getStreamData(completion);
15

16
// 返回流式数据
17
return new Response(stream, {
18
  headers: {
19
    'Content-Type': 'text/event-stream',
20
  },
21
});

客户端#

函数定义#

1
// @/utils/client.ts
2

3
/**
4
 * 解析流式数据
5
 * @param stream 流式数据 ReadableStream
6
 * @param options 选项
7
 * @param options.output 输出类型 默认 text 如果 output 为 json 则onResult 的 result 返回解析后的 json 对象
8
 * @param options.onStart 流式数据开始
9
 * @param options.onEnd 流式数据结束 形参 thinking 为思考内容，result 为结果
10
 * @param options.onchange 实时更新最新内容 形参 thinking 为思考内容，result 为结果
11
 * @param options.onThinkingStart 如果是思考模型，则 onThinkingStart 会触发
12
 * @param options.onThinkingEnd 如果是思考模型，则 onThinkingEnd 会触发
13
 */
14
export const parseReadableStream = async (stream: ReadableStream<Uint8Array<ArrayBufferLike>>, options: {
15
  output?: 'text' | 'json';
16
  onStart?: () => void;
17
  onEnd?: (thinking: string, result: string | Record<string, unknown>) => void;
18
  onchange?: (thinking: string, result: string) => void;
19
  onThinkingStart?: () => void;
20
  onThinkingEnd?: () => void;
21
}) => {
22
  const { output = 'text', onStart = () => {}, onEnd = () => {}, onThinkingStart = () => {}, onThinkingEnd = () => {}, onchange = () => {} } = options;
23
  const reader = stream?.getReader();
24
  const decoder = new TextDecoder();
25
  let isReasoning = false;
26
  let thinking = '';
27
  let content = '';
28
  onStart();
29
  while (true) {
30
    const { done, value } = await reader!.read();
31
    if (done) break;
32
    const text = decoder.decode(value);
33
    if (text.includes('<thinking>')) {
34
      isReasoning = true;
35
      onThinkingStart();
36
      continue;
37
    }
38
    if (text.includes('</thinking>')) {
39
      isReasoning = false;
40
      onThinkingEnd();
41
      continue;
42
    }
43
    if (text.includes('[DONE]')) {
44
      let result = content;
45
      if (output === 'json') {
46
        // 取出 ```json 和 ``` 之间的内容
47
        const jsonContent = result.match(/```json\s*([\s\S]*?)\s*```/)?.[1] || '';
48
        try {
49
          result = JSON.parse(jsonContent);
50
        } catch (error) {
51
          console.error(error);
52
        }
53
      }
54
      onEnd(thinking, result);
55
      break;
56
    }
57
    if (isReasoning) {
58
      thinking += text;
59
    } else {
60
      content += text;
61
    }
62
    onchange(thinking, content);
63
  }
64
};

使用#

1
import { parseReadableStream } from '@/utils/client';
2

3
const stream = await fetch('http://localhost:3000/sse', {
4
  method: 'GET',
5
}).then(res => res.body);
6

7
try {
8
  parseReadableStream(stream, {
9
  output: 'json',
10
  onStart: () => {
11
    console.log('开始');
12
  },
13
  onEnd: (thinking, result) => {
14
    console.log('结束', thinking, result);
15
  },
16
  onchange: (thinking, result) => {
17
    // 如果 output 为 json 则 result 为 已经解析后对象
18
    console.log('实时更新', thinking, result);
19
  },
20
  onThinkingStart: () => {
21
    console.log('思考开始');
22
  },
23
    onThinkingEnd: () => {
24
      console.log('思考结束');
25
    },
26
  });
27
} catch (error) {
28
  console.error(error);
29
}