Mirror of https://github.com/modelscope/FunASR (synced 2025-09-15 14:48:36 +08:00)

Merge branch 'main' of github.com:alibaba-damo-academy/FunASR (commit e899096ce4)
cd egs/aishell/paraformer

Then you can directly start the recipe as follows:

```sh
conda activate funasr
bash run.sh --CUDA_VISIBLE_DEVICES "0,1" --gpu_num 2
```

The training log files are saved in `${exp_dir}/exp/${model_dir}/log/train.log.*`, which can be viewed using the following command:
Users can use ModelScope for inference and fine-tuning based on a trained academic model.

### Decoding by CPU or GPU

We support both CPU and GPU decoding. For CPU decoding, set `gpu_inference=false` and use `njob` to specify the total number of CPU jobs. For GPU decoding, first set `gpu_inference=true`; then use `gpuid_list` to specify which GPUs to decode on and `njob` to specify the number of decoding jobs on each GPU.
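The relationship between `gpuid_list` and `njob` can be illustrated with a small sketch. This helper is illustrative, not part of FunASR (the recipes implement this inside their shell scripts): the total number of GPU decoding jobs is the number of GPUs times `njob`, with consecutive jobs placed on the same GPU.

```python
def assign_decode_jobs(gpuid_list: str, njob: int) -> dict:
    """Map decoding jobs to GPUs: each GPU in gpuid_list runs njob jobs.

    Illustrative sketch of the gpu_inference=true settings described above;
    gpuid_list is a comma-separated string such as "0,1".
    """
    gpus = [int(g) for g in gpuid_list.split(",")]
    # job index -> GPU id; total jobs = len(gpus) * njob
    return {job: gpus[job // njob] for job in range(len(gpus) * njob)}

# With gpuid_list="0,1" and njob=2 there are 4 jobs:
# jobs 0-1 run on GPU 0 and jobs 2-3 run on GPU 1.
jobs = assign_decode_jobs("0,1", njob=2)
```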
1739 funasr/runtime/deploy_tools/funasr-runtime-deploy-offline-cpu-en.sh (Normal file)
File diff suppressed because it is too large

211 funasr/runtime/docs/SDK_advanced_guide_offline_en.md (Normal file)
# Advanced Development Guide (File Transcription Service)

FunASR provides an English offline file transcription service that can be deployed locally or on a cloud server with one click. The core of the service is the FunASR runtime SDK, which has been open-sourced. FunASR-runtime combines capabilities such as speech endpoint detection (VAD), large-scale speech recognition (ASR) using Paraformer-large, and punctuation restoration (PUNC), all open-sourced by the speech laboratory of DAMO Academy on the ModelScope community, enabling accurate and efficient high-concurrency transcription of audio files.

This document serves as a development guide for the FunASR offline file transcription service. If you wish to quickly try the offline file transcription service, please refer to the one-click deployment example ([docs](./SDK_tutorial.md)).
## Installation of Docker

The following steps install Docker and the Docker images manually. If your Docker image has already been launched, you can skip this step.

### Installation of the Docker environment

```shell
# Ubuntu:
curl -fsSL https://test.docker.com -o test-docker.sh
sudo sh test-docker.sh
# Debian:
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
# CentOS:
curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun
# MacOS:
brew install --cask --appdir=/Applications docker
```

For more details, refer to the [docs](https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html).
### Starting Docker

```shell
sudo systemctl start docker
```

### Pulling and launching images

Use the following commands to pull and launch the Docker image for the FunASR runtime SDK:

```shell
sudo docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-en-cpu-0.1.0

sudo docker run -p 10095:10095 -it --privileged=true -v /root:/workspace/models registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-en-cpu-0.1.0
```

Introduction to command parameters:
```text
-p <host port>:<mapped docker port>: In the example, host machine (ECS) port 10095 is mapped to port 10095 in the Docker container. Make sure that port 10095 is open in the ECS security rules.

-v <host path>:<mounted Docker path>: In the example, the host machine path /root is mounted to the Docker path /workspace/models.
```
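After `docker run`, you can quickly verify that the mapped port is reachable before pointing a client at it. The helper below is illustrative, not part of FunASR; the host and port match the example above.

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Once the container is up, this should report True:
# print(port_is_open("127.0.0.1", 10095))
```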
## Starting the server

Use the following script to start the server:

```shell
nohup bash run_server.sh \
  --download-model-dir /workspace/models \
  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --model-dir damo/speech_paraformer-large_asr_nat-en-16k-common-vocab10020-onnx \
  --punc-dir damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx > log.out 2>&1 &

# If you want to disable SSL, add: --certfile 0
```
More details about the script run_server.sh:

The funasr-wss-server supports downloading models from ModelScope. You can set the model download directory (--download-model-dir, default /workspace/models) and the model IDs (--model-dir, --vad-dir, --punc-dir). Here is an example:

```shell
cd /workspace/FunASR/funasr/runtime/websocket/build/bin
./funasr-wss-server \
  --download-model-dir /workspace/models \
  --model-dir damo/speech_paraformer-large_asr_nat-en-16k-common-vocab10020-onnx \
  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --punc-dir damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx \
  --decoder-thread-num 32 \
  --io-thread-num 8 \
  --port 10095 \
  --certfile ../../../ssl_key/server.crt \
  --keyfile ../../../ssl_key/server.key
```
Introduction to command parameters:

```text
--download-model-dir: Model download directory; models are downloaded from ModelScope by setting the model IDs below.
--model-dir: ModelScope model ID of the ASR model.
--quantize: True for the quantized ASR model, False for the non-quantized ASR model. Default is True.
--vad-dir: ModelScope model ID of the VAD model.
--vad-quant: True for the quantized VAD model, False for the non-quantized VAD model. Default is True.
--punc-dir: ModelScope model ID of the PUNC model.
--punc-quant: True for the quantized PUNC model, False for the non-quantized PUNC model. Default is True.
--itn-dir: ModelScope model ID of the ITN model.
--port: Port number that the server listens on. Default is 10095.
--decoder-thread-num: Number of inference threads the server starts. Default is 8.
--io-thread-num: Number of IO threads the server starts. Default is 1.
--certfile <string>: SSL certificate file. Default is ../../../ssl_key/server.crt. To disable SSL, set it to "".
--keyfile <string>: SSL key file. Default is ../../../ssl_key/server.key. To disable SSL, set it to "".
```
The funasr-wss-server also supports loading models from a local path (see Preparing Model Resources for instructions on preparing local model resources). Here is an example:

```shell
cd /workspace/FunASR/funasr/runtime/websocket/build/bin
./funasr-wss-server \
  --model-dir /workspace/models/damo/speech_paraformer-large_asr_nat-en-16k-common-vocab10020-onnx \
  --vad-dir /workspace/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --punc-dir /workspace/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx \
  --decoder-thread-num 32 \
  --io-thread-num 8 \
  --port 10095 \
  --certfile ../../../ssl_key/server.crt \
  --keyfile ../../../ssl_key/server.key
```

After executing the above command, the offline file transcription service is started. If a model is specified as a ModelScope model ID, the following models are automatically downloaded from ModelScope:
[FSMN-VAD](https://www.modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx/summary),
[Paraformer-large](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-en-16k-common-vocab10020-onnx/summary),
[CT-Transformer](https://www.modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx/summary)

If you wish to deploy your fine-tuned model (e.g., 10epoch.pb), rename the model to model.pb, replace the original model.pb downloaded from ModelScope, and then set `model_dir` to that path.
## Starting the client

After deploying the FunASR offline file transcription service on the server, you can test and use it by following these steps. Currently, FunASR supports several kinds of clients. Below are command-line examples for the Python client, the C++ client, and a custom client based on the WebSocket communication protocol:

### python-client

```shell
python funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "./data/wav.scp" --send_without_sleep --output_dir "./results"
```

Introduction to command parameters:

```text
--host: the IP address of the server. It can be set to 127.0.0.1 for local testing.
--port: the port number the server listens on.
--audio_in: the audio input. It can be a path to a wav file or a wav.scp file (a Kaldi-style wav list in which each line contains a wav_id followed by a tab and a wav_path).
--output_dir: the path where recognition results are written.
--ssl: whether to use SSL encryption. SSL is used by default.
--mode: offline mode.
--hotword: if the ASR model is a hotword model, hotwords can be given as a *.txt file (one hotword per line) or as a space-separated string (e.g., 阿里巴巴 达摩院).
--use_itn: whether to apply inverse text normalization (ITN); 1 (default) enables it and 0 disables it.
```
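A `wav.scp` list like the one passed to `--audio_in` above can be generated with a few lines of Python. This helper is illustrative (the directory layout is an assumption, not part of FunASR):

```python
from pathlib import Path

def write_wav_scp(wav_dir: str, scp_path: str) -> int:
    """Write a Kaldi-style wav.scp: one `wav_id<TAB>wav_path` line per file.

    The wav_id is taken from the file name without its extension.
    Returns the number of entries written.
    """
    wavs = sorted(Path(wav_dir).glob("*.wav"))
    with open(scp_path, "w", encoding="utf-8") as f:
        for wav in wavs:
            f.write(f"{wav.stem}\t{wav.resolve()}\n")
    return len(wavs)

# Example: write_wav_scp("./data", "./data/wav.scp")
```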
### c++-client

```shell
./funasr-wss-client --server-ip 127.0.0.1 --port 10095 --wav-path test.wav --thread-num 1 --is-ssl 1
```

Introduction to command parameters:

```text
--server-ip: the IP address of the server. It can be set to 127.0.0.1 for local testing.
--port: the port number the server listens on.
--wav-path: the audio input. It can be a path to a wav file or a wav.scp file (a Kaldi-style wav list in which each line contains a wav_id followed by a tab and a wav_path).
--is-ssl: whether to use SSL encryption. SSL is used by default.
--hotword: if the ASR model is a hotword model, hotwords can be given as a *.txt file (one hotword per line) or as a space-separated string (e.g., 阿里巴巴 达摩院).
--use-itn: whether to apply inverse text normalization (ITN); 1 (default) enables it and 0 disables it.
```
### Custom client

If you want to build your own client, see the [WebSocket communication protocol](./websocket_protocol.md)
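As a starting point for a custom client, the sketch below shows only the general shape of such a session: an initial JSON configuration message followed by fixed-size binary audio chunks. The field names here are illustrative placeholders, not the real schema; the authoritative message format is defined in the WebSocket protocol document referenced above.

```python
import json

def build_config_message(wav_name: str, mode: str = "offline") -> str:
    """Build an initial JSON configuration message for a hypothetical
    file-transcription session. Field names are placeholders; consult the
    WebSocket protocol document for the actual schema."""
    return json.dumps({"mode": mode, "wav_name": wav_name, "is_speaking": True})

def chunk_audio(pcm_bytes: bytes, chunk_size: int = 3200):
    """Split raw PCM audio into fixed-size chunks for streaming to the server."""
    for i in range(0, len(pcm_bytes), chunk_size):
        yield pcm_bytes[i : i + chunk_size]
```

A real client would send `build_config_message(...)` as a text frame, stream each chunk as a binary frame, and then wait for the JSON result frame.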
## How to customize service deployment

The code for FunASR-runtime is open source. If the server and client cannot fully meet your needs, you can develop them further based on your own requirements:

### C++ client

https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/websocket

### Python client

https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/python/websocket

### C++ server
#### VAD
```c++
// The use of the VAD model consists of two steps, FsmnVadInit and FsmnVadInfer:
FUNASR_HANDLE vad_hanlde=FsmnVadInit(model_path, thread_num);
// Where model_path contains "model-dir" and "quantize", and thread_num is the ONNX thread count;
FUNASR_RESULT result=FsmnVadInfer(vad_hanlde, wav_file.c_str(), NULL, 16000);
// Where vad_hanlde is the return value of FsmnVadInit, wav_file is the path to the audio file, and sampling_rate is the sampling rate (default 16k).
```

See the usage example for details: [docs](https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/onnxruntime/bin/funasr-onnx-offline-vad.cpp)
#### ASR
```c++
// The use of the ASR model consists of two steps, FunOfflineInit and FunOfflineInfer:
FUNASR_HANDLE asr_hanlde=FunOfflineInit(model_path, thread_num);
// Where model_path contains "model-dir" and "quantize", and thread_num is the ONNX thread count;
FUNASR_RESULT result=FunOfflineInfer(asr_hanlde, wav_file.c_str(), RASR_NONE, NULL, 16000);
// Where asr_hanlde is the return value of FunOfflineInit, wav_file is the path to the audio file, and sampling_rate is the sampling rate (default 16k).
```

See the usage example for details: [docs](https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/onnxruntime/bin/funasr-onnx-offline.cpp)
#### PUNC
```c++
// The use of the PUNC model consists of two steps, CTTransformerInit and CTTransformerInfer:
FUNASR_HANDLE punc_hanlde=CTTransformerInit(model_path, thread_num);
// Where model_path contains "model-dir" and "quantize", and thread_num is the ONNX thread count;
FUNASR_RESULT result=CTTransformerInfer(punc_hanlde, txt_str.c_str(), RASR_NONE, NULL);
// Where punc_hanlde is the return value of CTTransformerInit, and txt_str is the input text.
```

See the usage example for details: [docs](https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/onnxruntime/bin/funasr-onnx-offline-punc.cpp)
271 funasr/runtime/docs/SDK_advanced_guide_offline_en_zh.md (Normal file)
# Development Guide for the FunASR English Offline File Transcription Service

FunASR provides an English offline file transcription service that can be deployed locally or on a cloud server with one click. Its core is the open-sourced FunASR runtime SDK. FunASR-runtime combines the speech endpoint detection (VAD), Paraformer-large speech recognition (ASR), and punctuation restoration (PUNC) capabilities open-sourced by the DAMO Academy speech laboratory on the ModelScope community, enabling accurate and efficient high-concurrency transcription of audio.

This document is the development guide for the FunASR offline file transcription service. If you want to quickly try the offline file transcription service, see [Quick Start](#quick-start).
## Server Configuration

Choose a server configuration appropriate to your business needs. The recommended configurations are:

- Configuration 1: (X86, compute-optimized) 4-core vCPU, 8 GB memory; a single machine supports about 32 concurrent requests
- Configuration 2: (X86, compute-optimized) 16-core vCPU, 32 GB memory; a single machine supports about 64 concurrent requests
- Configuration 3: (X86, compute-optimized) 64-core vCPU, 128 GB memory; a single machine supports about 200 concurrent requests

Detailed performance test report ([click here](./benchmark_onnx_cpp.md))

Cloud service providers offer a 3-month free trial for new users; application tutorial ([click here](https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/docs/aliyun_server_tutorial.md))
## Quick Start

### Launching the image

Pull and launch the FunASR runtime-SDK Docker image with the following commands:

```shell
sudo docker pull \
  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-en-cpu-0.1.0
mkdir -p ./funasr-runtime-resources/models
sudo docker run -p 10095:10095 -it --privileged=true \
  -v $PWD/funasr-runtime-resources/models:/workspace/models \
  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-en-cpu-0.1.0
```

If you have not installed Docker, see [Docker Installation](#docker-installation)
### Starting the server

After Docker starts, launch the funasr-wss-server service program:
```shell
cd FunASR/funasr/runtime
nohup bash run_server.sh \
  --download-model-dir /workspace/models \
  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --model-dir damo/speech_paraformer-large_asr_nat-en-16k-common-vocab10020-onnx \
  --punc-dir damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx > log.out 2>&1 &

# If you want to disable SSL, add the parameter: --certfile 0
```
For detailed server parameters, see [Server Parameters](#server-parameters)
### Client testing and usage

Download the client testing tool directory samples:
```shell
wget https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/sample/funasr_samples.tar.gz
```
Taking the Python client as an example: it supports multiple audio input formats (.wav, .pcm, .mp3, etc.), video input (.mp4, etc.), and multi-file wav.scp lists. For other client versions, see [Client Usage Details](#client-usage-details); for customized deployment, see [How to Customize Service Deployment](#how-to-customize-service-deployment)
```shell
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
```

------------------
## Docker Installation

The following steps install the Docker environment manually:

### Installing the Docker environment
```shell
# Ubuntu:
curl -fsSL https://test.docker.com -o test-docker.sh
sudo sh test-docker.sh
# Debian:
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
# CentOS:
curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun
# MacOS:
brew install --cask --appdir=/Applications docker
```

For installation details, see: https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html

### Starting Docker

```shell
sudo systemctl start docker
```
## Client Usage Details

After the FunASR service is deployed on the server, you can test and use the offline file transcription service with the following steps.
Clients are currently available in the following programming languages:

- [Python](#python-client)
- [CPP](#cpp-client)
- [HTML web version](#html-web-version)
- [Java](#java-client)

### python-client

To run the client directly for testing, refer to the following brief instructions, taking the Python version as an example:

```shell
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline \
  --audio_in "../audio/asr_example.wav" --output_dir "./results"
```

Command parameters:
```text
--host        IP address of the machine where the FunASR runtime-SDK service is deployed; defaults to the local IP
              (127.0.0.1). If the client and the service are not on the same machine, set it to the deployment machine's IP.
--port        deployment port number, e.g., 10095
--mode        offline means offline file transcription
--audio_in    audio to transcribe; supports a file path or a wav.scp file list
--thread_num  number of concurrent sending threads; default 1
--ssl         whether to verify the SSL certificate; default 1 (on), set 0 to disable
--hotword     if the model is a hotword model, hotwords can be set via *.txt (one hotword per line) or a
              space-separated string (e.g., 阿里巴巴 达摩院)
--use_itn     whether to apply ITN; default 1 (on), set 0 to disable
```
### cpp-client

After entering the samples/cpp directory, you can test with the CPP client as follows:
```shell
./funasr-wss-client --server-ip 127.0.0.1 --port 10095 --wav-path ../audio/asr_example.wav
```

Command parameters:
```text
--server-ip  IP address of the machine where the FunASR runtime-SDK service is deployed; defaults to the local IP
             (127.0.0.1). If the client and the service are not on the same machine, set it to the deployment machine's IP.
--port       deployment port number, e.g., 10095
--wav-path   audio file to transcribe; supports a file path
--hotword    if the model is a hotword model, hotwords can be set via *.txt (one hotword per line) or a
             space-separated string (e.g., 阿里巴巴 达摩院)
--use-itn    whether to apply ITN; default 1 (on), set 0 to disable
```
### HTML web version

Open html/static/index.html in a browser; the following page appears, supporting microphone input and file upload for a direct trial.

<img src="images/html.png" width="900"/>
### Java-client

```shell
FunasrWsClient --host localhost --port 10095 --audio_in ./asr_example.wav --mode offline
```
For details, see the [docs](../java/readme.md)
## Server Parameters

funasr-wss-server supports downloading models from ModelScope. Set the model download directory (--download-model-dir, default /workspace/models) and the model IDs (--model-dir, --vad-dir, --punc-dir), for example:
```shell
cd /workspace/FunASR/funasr/runtime/websocket/build/bin
./funasr-wss-server \
  --download-model-dir /workspace/models \
  --model-dir damo/speech_paraformer-large_asr_nat-en-16k-common-vocab10020-onnx \
  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --punc-dir damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx \
  --decoder-thread-num 32 \
  --io-thread-num 8 \
  --port 10095 \
  --certfile ../../../ssl_key/server.crt \
  --keyfile ../../../ssl_key/server.key
```
Command parameters:
```text
--download-model-dir  model download directory; models are fetched from ModelScope by model ID
--model-dir           ModelScope model ID of the ASR model
--quantize            True for the quantized ASR model, False for the non-quantized one; default True
--vad-dir             ModelScope model ID of the VAD model
--vad-quant           True for the quantized VAD model, False for the non-quantized one; default True
--punc-dir            ModelScope model ID of the PUNC model
--punc-quant          True for the quantized PUNC model, False for the non-quantized one; default True
--itn-dir             ModelScope model ID of the ITN model
--port                port the server listens on; default 10095
--decoder-thread-num  number of inference threads the server starts; default 8
--io-thread-num       number of IO threads the server starts; default 1
--certfile            SSL certificate file; default ../../../ssl_key/server.crt; set "" to disable SSL
--keyfile             SSL key file; default ../../../ssl_key/server.key; set "" to disable SSL
```
funasr-wss-server also supports loading models from a local path (see Preparing Model Resources for details on preparing local model resources), for example:
```shell
cd /workspace/FunASR/funasr/runtime/websocket/build/bin
./funasr-wss-server \
  --model-dir /workspace/models/damo/speech_paraformer-large_asr_nat-en-16k-common-vocab10020-onnx \
  --vad-dir /workspace/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --punc-dir /workspace/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx \
  --decoder-thread-num 32 \
  --io-thread-num 8 \
  --port 10095 \
  --certfile ../../../ssl_key/server.crt \
  --keyfile ../../../ssl_key/server.key
```
Command parameters:
```text
--model-dir           ASR model path; default /workspace/models/asr
--quantize            True for the quantized ASR model, False for the non-quantized one; default True
--vad-dir             VAD model path; default /workspace/models/vad
--vad-quant           True for the quantized VAD model, False for the non-quantized one; default True
--punc-dir            PUNC model path; default /workspace/models/punc
--punc-quant          True for the quantized PUNC model, False for the non-quantized one; default True
--itn-dir             ModelScope model ID of the ITN model
--port                port the server listens on; default 10095
--decoder-thread-num  number of inference threads the server starts; default 8
--io-thread-num       number of IO threads the server starts; default 1
--certfile            SSL certificate file; default ../../../ssl_key/server.crt; set "" to disable SSL
--keyfile             SSL key file; default ../../../ssl_key/server.key; set "" to disable SSL
```
After executing the above command, the offline file transcription service is started. If a model is specified as a ModelScope model ID, the following models are automatically downloaded from ModelScope:
[FSMN-VAD model](https://www.modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx/summary),
[Paraformer-large model](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-en-16k-common-vocab10020-onnx/summary),
[CT-Transformer punctuation model](https://www.modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx/summary)

If you want to deploy a fine-tuned model of your own (e.g., 10epoch.pb), rename it to model.pb, replace the model.pb downloaded from ModelScope, and set `model_dir` to that path.
## How to Customize Service Deployment

The FunASR-runtime code is open source. If the server and client do not fully meet your needs, you can develop them further based on your own requirements:

### C++ client

https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/websocket

### Python client

https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/python/websocket

### Custom client

If you want to build your own client, see the [WebSocket communication protocol](./websocket_protocol_zh.md)
### C++ server

#### VAD
```c++
// Using the VAD model involves two steps, FsmnVadInit and FsmnVadInfer:
FUNASR_HANDLE vad_hanlde=FsmnVadInit(model_path, thread_num);
// Here model_path contains "model-dir" and "quantize", and thread_num is the ONNX thread count;
FUNASR_RESULT result=FsmnVadInfer(vad_hanlde, wav_file.c_str(), NULL, 16000);
// Here vad_hanlde is the return value of FsmnVadInit, wav_file is the audio path, and sampling_rate is the sampling rate (default 16k)
```

For a usage example, see: https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/onnxruntime/bin/funasr-onnx-offline-vad.cpp
#### ASR
```c++
// Using the ASR model involves two steps, FunOfflineInit and FunOfflineInfer:
FUNASR_HANDLE asr_hanlde=FunOfflineInit(model_path, thread_num);
// Here model_path contains "model-dir" and "quantize", and thread_num is the ONNX thread count;
FUNASR_RESULT result=FunOfflineInfer(asr_hanlde, wav_file.c_str(), RASR_NONE, NULL, 16000);
// Here asr_hanlde is the return value of FunOfflineInit, wav_file is the audio path, and sampling_rate is the sampling rate (default 16k)
```

For a usage example, see: https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/onnxruntime/bin/funasr-onnx-offline.cpp
#### PUNC
```c++
// Using the PUNC model involves two steps, CTTransformerInit and CTTransformerInfer:
FUNASR_HANDLE punc_hanlde=CTTransformerInit(model_path, thread_num);
// Here model_path contains "model-dir" and "quantize", and thread_num is the ONNX thread count;
FUNASR_RESULT result=CTTransformerInfer(punc_hanlde, txt_str.c_str(), RASR_NONE, NULL);
// Here punc_hanlde is the return value of CTTransformerInit, and txt_str is the input text
```

For a usage example, see: https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/onnxruntime/bin/funasr-onnx-offline-punc.cpp
```shell
sudo docker pull \
  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.2.2
mkdir -p ./funasr-runtime-resources/models
sudo docker run -p 10095:10095 -it --privileged=true \
  -v $PWD/funasr-runtime-resources/models:/workspace/models \
  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.2.2
```
If you have not installed Docker, see [Docker Installation](#docker-installation)
Use the following command to pull and start the FunASR software package docker image:
```shell
sudo docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.2
mkdir -p ./funasr-runtime-resources/models
sudo docker run -p 10095:10095 -it --privileged=true -v $PWD/funasr-runtime-resources/models:/workspace/models registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.2
```
If you do not have Docker installed, please refer to [Docker Installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html)
```shell
sudo docker pull \
  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.2
mkdir -p ./funasr-runtime-resources/models
sudo docker run -p 10095:10095 -it --privileged=true \
  -v $PWD/funasr-runtime-resources/models:/workspace/models \
  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.2
```
If you have not installed Docker, see [Docker Installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/docker_zh.html)
198 funasr/runtime/docs/SDK_tutorial_en.md (Normal file)
([简体中文](./SDK_tutorial_en_zh.md)|English)

# Highlights

**FunASR offline file transcription service 1.0 has been released. Feel free to deploy and experience it!**

# FunASR Offline File Transcription Service

FunASR provides an offline file transcription service that can be easily deployed on a local or cloud server. Its core is the open-source FunASR runtime SDK. It integrates the speech endpoint detection (VAD), Paraformer-large speech recognition (ASR), and punctuation restoration (PUNC) capabilities released by the speech laboratory of DAMO Academy in the ModelScope community. With a complete speech recognition pipeline, it can turn tens of hours of audio or video into punctuated text, and it supports hundreds of concurrent transcription requests.
## Server Configuration

Users can choose an appropriate server configuration based on their business needs. The recommended configurations are:

- Configuration 1: (X86, compute-optimized) 4-core vCPU, 8 GB memory; a single machine can support about 32 concurrent requests.
- Configuration 2: (X86, compute-optimized) 16-core vCPU, 32 GB memory; a single machine can support about 64 concurrent requests.
- Configuration 3: (X86, compute-optimized) 64-core vCPU, 128 GB memory; a single machine can support about 200 concurrent requests.

Detailed performance [report](./benchmark_onnx_cpp.md)

Cloud service providers offer a 3-month free trial for new users; application tutorial ([docs](./aliyun_server_tutorial.md)).
## Quick Start

### Server Startup

`Note`: The one-click deployment tool covers installing Docker, downloading the Docker image, and starting the service. If you want to start from the FunASR Docker image instead, please refer to the development guide ([docs](./SDK_advanced_guide_offline.md)).

Download the deployment tool `funasr-runtime-deploy-offline-cpu-en.sh`:

```shell
curl -O https://raw.githubusercontent.com/alibaba-damo-academy/FunASR/main/funasr/runtime/deploy_tools/funasr-runtime-deploy-offline-cpu-en.sh;
# If there is a network problem, users in mainland China can use the following command:
# curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/funasr-runtime-deploy-offline-cpu-en.sh;
```

Run the deployment tool and press Enter at each prompt to complete the installation and deployment of the server. The one-click deployment tool currently supports Linux only; for other environments, please refer to the development guide ([docs](./SDK_advanced_guide_offline_en.md)).
```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh install --workspace /root/funasr-runtime-resources
```
### Client Testing and Usage

After running the installation instructions above, the client testing tool directory samples is downloaded into the default installation directory /root/funasr-runtime-resources ([download](https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/sample/funasr_samples.tar.gz)).
Taking the Python client as an example: it supports multiple audio input formats (.wav, .pcm, .mp3, etc.), video input (.mp4, etc.), and multi-file wav.scp lists. For other client versions, please refer to the [documentation](#detailed-description-of-client-usage).

```shell
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
```
## Detailed Description of Client Usage

After completing the FunASR runtime-SDK service deployment on the server, you can test and use the offline file transcription service through the following steps. The following programming language client versions are currently supported:

- [Python](#python-client)
- [CPP](#cpp-client)
- [html](#html-client)
- [java](#java-client)

For more client version support, please refer to the [development guide](./SDK_advanced_guide_offline_zh.md).
### python-client
|
||||
If you want to run the client directly for testing, you can refer to the following simple instructions, using the Python version as an example:
|
||||
|
||||
```shell
|
||||
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
|
||||
```
|
||||
|
||||
Command parameter instructions:
|
||||
```text
--host        IP address of the machine where the FunASR runtime-SDK service is deployed; defaults to the local address (127.0.0.1). If the client and the service are not on the same server, change it to the deployment machine's IP address.
--port        Deployment port number, e.g. 10095.
--mode        offline indicates offline file transcription.
--audio_in    Audio to be transcribed; supports a file path or a file list (wav.scp).
--thread_num  Number of concurrent sending threads; defaults to 1.
--ssl         Whether to enable SSL certificate verification; 1 (default) enables it, 0 disables it.
--hotword     If the acoustic model is a hotword model, sets the hotwords: a *.txt file (one hotword per line) or hotwords separated by spaces (e.g. 阿里巴巴 达摩院).
--use_itn     Whether to apply inverse text normalization (ITN); 1 (default) enables it, 0 disables it.
```
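As a sketch of the `wav.scp` file list mentioned above, each line maps an utterance ID to an audio file path (the IDs and paths below are hypothetical):

```shell
# Build a hypothetical wav.scp: one "<utterance_id> <audio_path>" pair per line.
cat > wav.scp <<'EOF'
utt1 /data/audio/asr_example_1.wav
utt2 /data/audio/asr_example_2.wav
EOF
cat wav.scp
```

The list can then be passed to the client as `--audio_in wav.scp` in place of a single file path.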

### cpp-client

After entering the samples/cpp directory, you can test with the CPP client. The command is as follows:

```shell
./funasr-wss-client --server-ip 127.0.0.1 --port 10095 --wav-path ../audio/asr_example.wav
```

Command parameter description:
```text
--server-ip   IP address of the machine where the FunASR runtime-SDK service is deployed; defaults to the local address (127.0.0.1). If the client and the service are not on the same server, change it to the deployment machine's IP address.
--port        Deployment port number, e.g. 10095.
--wav-path    Audio file to be transcribed; supports a file path.
--thread_num  Number of concurrent sending threads; defaults to 1.
--ssl         Whether to enable SSL certificate verification; 1 (default) enables it, 0 disables it.
--hotword     If the acoustic model is a hotword model, sets the hotwords: a *.txt file (one hotword per line) or hotwords separated by spaces (e.g. 阿里巴巴 达摩院).
--use-itn     Whether to apply inverse text normalization (ITN); 1 (default) enables it, 0 disables it.
```
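For the `--hotword` option described above, a minimal sketch of a hotword file, one hotword per line (the file name is hypothetical):

```shell
# Create a hypothetical hotword list: one hotword per line.
cat > hotword.txt <<'EOF'
阿里巴巴
达摩院
EOF
cat hotword.txt
```

It would then be passed as `--hotword hotword.txt`; alternatively, the hotwords can be given inline as a space-separated string.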

### html-client

To try it directly, open `html/static/index.html` in your browser. The following page appears, supporting both microphone input and file upload.

<img src="images/html.png" width="900"/>

### java-client

```shell
FunasrWsClient --host localhost --port 10095 --audio_in ./asr_example.wav --mode offline
```

For more details, please refer to the [docs](../java/readme.md).

## Server Usage Details

### Start the deployed FunASR service

If the computer has been restarted or Docker has been shut down after the one-click deployment, you can start the FunASR service directly with the following command. The startup configuration is the same as in the last one-click deployment.

```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh start
```

### Stop the FunASR service

```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh stop
```

### Release the FunASR service

Release the deployed FunASR service.

```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh remove
```

### Restart the FunASR service

Restart the FunASR service with the same configuration as in the last one-click deployment.

```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh restart
```

### Replace the model and restart the FunASR service

Replace the model currently in use and restart the FunASR service. The model must be an ASR/VAD/PUNC model from ModelScope, or a model fine-tuned from one in ModelScope.

```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update [--asr_model | --vad_model | --punc_model] <model_id or local model path>

e.g.
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update --asr_model damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
```

### Update parameters and restart the FunASR service

Update the configured parameters and restart the FunASR service for them to take effect. The updatable parameters include the host and Docker port numbers, as well as the numbers of inference and I/O threads.

```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update [--host_port | --docker_port] <port number>
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update [--decode_thread_num | --io_thread_num] <the number of threads>
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update [--workspace] <workspace in local>
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update [--ssl] <0: close SSL; 1: open SSL, default:1>

e.g.
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update --decode_thread_num 32
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update --workspace /root/funasr-runtime-resources
```

### Set SSL

SSL verification is enabled by default. If you need to disable it, set it when starting the service:

```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update --ssl 0
```

## Contact Us

If you encounter any problems during use, please join our user group for feedback.

| DingDing Group | Wechat |
|:----------------------------------------------------------------------------:|:--------------------------------------------------------------:|
| <div align="left"><img src="../../../docs/images/dingding.jpg" width="250"/> | <img src="../../../docs/images/wechat.png" width="232"/></div> |

204
funasr/runtime/docs/SDK_tutorial_en_zh.md
Normal file
@ -0,0 +1,204 @@
(简体中文|[English](./SDK_tutorial_en.md))

# FunASR英文离线文件转写服务便捷部署教程

FunASR提供可便捷本地或者云端服务器部署的离线文件转写服务,内核为FunASR已开源runtime-SDK。
集成了达摩院语音实验室在Modelscope社区开源的语音端点检测(VAD)、Paraformer-large语音识别(ASR)、标点恢复(PUNC) 等相关能力,拥有完整的语音识别链路,可以将几十个小时的音频或视频识别成带标点的文字,而且支持上百路请求同时进行转写。

# 发布日志

**FunASR英文离线文件转写服务1.0已发布,欢迎部署体验[快速上手](#快速上手)**

## 服务器配置

用户可以根据自己的业务需求,选择合适的服务器配置,推荐配置为:
- 配置1: (X86,计算型),4核vCPU,内存8G,单机可以支持大约32路的请求
- 配置2: (X86,计算型),16核vCPU,内存32G,单机可以支持大约64路的请求
- 配置3: (X86,计算型),64核vCPU,内存128G,单机可以支持大约200路的请求

详细性能测试报告([点击此处](./benchmark_onnx_cpp.md))

云服务厂商,针对新用户,有3个月免费试用活动,申请教程([点击此处](https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/docs/aliyun_server_tutorial.md))

## 快速上手

### 服务端启动

`注意`:一键部署工具,过程分为:安装docker、下载docker镜像、启动服务。如果用户希望直接从FunASR docker镜像启动,可以参考开发指南([点击此处](./SDK_advanced_guide_offline_en_zh.md))

下载部署工具`funasr-runtime-deploy-offline-cpu-en.sh`

```shell
curl -O https://raw.githubusercontent.com/alibaba-damo-academy/FunASR/main/funasr/runtime/deploy_tools/funasr-runtime-deploy-offline-cpu-en.sh;
# 如遇到网络问题,中国大陆用户,可以使用下面的命令:
# curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/funasr-runtime-deploy-offline-cpu-en.sh;
```

执行部署工具,在提示处输入回车键即可完成服务端安装与部署。目前便捷部署工具暂时仅支持Linux环境,其他环境部署参考开发指南([点击此处](./SDK_advanced_guide_offline_zh.md))
```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh install --workspace ./funasr-runtime-resources
```

### 客户端测试与使用

运行上面安装指令后,会在/root/funasr-runtime-resources(默认安装目录)中下载客户端测试工具目录samples(手动下载,[点击此处](https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/sample/funasr_samples.tar.gz)),
我们以Python语言客户端为例,进行说明,支持多种音频格式输入(.wav, .pcm, .mp3等),也支持视频输入(.mp4等),以及多文件列表wav.scp输入,其他版本客户端请参考文档([点击此处](#客户端用法详解))

```shell
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
```

## 客户端用法详解

在服务器上完成FunASR服务部署以后,可以通过如下的步骤来测试和使用离线文件转写服务。
目前分别支持以下几种编程语言客户端

- [Python](#python-client)
- [CPP](#cpp-client)
- [html](#html-client)
- [java](#java-client)

更多版本客户端支持请参考[websocket/grpc协议](./websocket_protocol_zh.md)

### python-client

若想直接运行client进行测试,可参考如下简易说明,以python版本为例:

```shell
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
```

命令参数说明:
```text
--host 为FunASR runtime-SDK服务部署机器ip,默认为本机ip(127.0.0.1),如果client与服务不在同一台服务器,需要改为部署机器ip
--port 10095 部署端口号
--mode offline表示离线文件转写
--audio_in 需要进行转写的音频文件,支持文件路径,文件列表wav.scp
--thread_num 设置并发发送线程数,默认为1
--ssl 设置是否开启ssl证书校验,默认1开启,设置为0关闭
--hotword 如果模型为热词模型,可以设置热词: *.txt(每行一个热词) 或者空格分隔的热词字符串 (阿里巴巴 达摩院)
--use_itn 设置是否使用itn,默认1开启,设置为0关闭
```

### cpp-client

进入samples/cpp目录后,可以用cpp进行测试,指令如下:
```shell
./funasr-wss-client --server-ip 127.0.0.1 --port 10095 --wav-path ../audio/asr_example.wav
```

命令参数说明:

```text
--server-ip 为FunASR runtime-SDK服务部署机器ip,默认为本机ip(127.0.0.1),如果client与服务不在同一台服务器,需要改为部署机器ip
--port 10095 部署端口号
--wav-path 需要进行转写的音频文件,支持文件路径
--thread_num 设置并发发送线程数,默认为1
--ssl 设置是否开启ssl证书校验,默认1开启,设置为0关闭
--hotword 如果模型为热词模型,可以设置热词: *.txt(每行一个热词) 或者空格分隔的热词字符串 (阿里巴巴 达摩院)
--use-itn 设置是否使用itn,默认1开启,设置为0关闭
```

### html-client

在浏览器中打开 html/static/index.html,即可出现如下页面,支持麦克风输入与文件上传,直接进行体验

<img src="images/html.png" width="900"/>

### java-client

```shell
FunasrWsClient --host localhost --port 10095 --audio_in ./asr_example.wav --mode offline
```
详细可以参考文档([点击此处](../java/readme.md))

## 服务端用法详解

### 启动已经部署过的FunASR服务

一键部署后若出现重启电脑等关闭Docker的动作,可通过如下命令直接启动FunASR服务,启动配置为上次一键部署的设置。

```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh start
```

### 关闭FunASR服务

```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh stop
```

### 释放FunASR服务

释放已经部署的FunASR服务。
```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh remove
```

### 重启FunASR服务

根据上次一键部署的设置重新启动FunASR服务。
```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh restart
```

### 替换模型并重启FunASR服务

替换正在使用的模型,并重新启动FunASR服务。模型需为ModelScope中的ASR/VAD/PUNC模型,或者从ModelScope中模型finetune后的模型。

```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update [--asr_model | --vad_model | --punc_model] <model_id or local model path>

e.g.
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update --asr_model damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
```

### 更新参数并重启FunASR服务

更新已配置参数,并重新启动FunASR服务生效。可更新参数包括宿主机和Docker的端口号,以及推理和IO的线程数量。

```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update [--host_port | --docker_port] <port number>
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update [--decode_thread_num | --io_thread_num] <the number of threads>
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update [--workspace] <workspace in local>
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update [--ssl] <0: close SSL; 1: open SSL, default:1>

e.g.
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update --decode_thread_num 32
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update --workspace /root/funasr-runtime-resources
```

### 关闭SSL证书

```shell
sudo bash funasr-runtime-deploy-offline-cpu-en.sh update --ssl 0
```

## 联系我们

在您使用过程中,如果遇到问题,欢迎加入用户群进行反馈

| 钉钉用户群 | 微信 |
|:----------------------------------------------------------------------------:|:-----------------------------------------------------:|
| <div align="left"><img src="../../../docs/images/dingding.jpg" width="250"/> | <img src="../../../docs/images/wechat.png" width="232"/></div> |

## 视频demo

[点击此处]()
8
funasr/runtime/docs/docker_offline_cpu_en_lists
Normal file
@ -0,0 +1,8 @@
DOCKER:
funasr-runtime-sdk-en-cpu-0.1.0
DEFAULT_ASR_MODEL:
damo/speech_paraformer-large_asr_nat-en-16k-common-vocab10020-onnx
DEFAULT_VAD_MODEL:
damo/speech_fsmn_vad_zh-cn-16k-common-onnx
DEFAULT_PUNC_MODEL:
damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx
@ -18,10 +18,6 @@
#define FUNASR_CALLBCK_PREFIX __stdcall
#endif

#ifdef __cplusplus

extern "C" {
#endif

typedef void* FUNASR_HANDLE;
typedef void* FUNASR_RESULT;
@ -122,7 +118,4 @@ _FUNASRAPI FUNASR_RESULT FunTpassInferBuffer(FUNASR_HANDLE handle, FUNASR_HANDLE
_FUNASRAPI void FunTpassUninit(FUNASR_HANDLE handle);
_FUNASRAPI void FunTpassOnlineUninit(FUNASR_HANDLE handle);

#ifdef __cplusplus

}
#endif

@ -9,6 +9,9 @@
#include "audio.h"
#include "precomp.h"

#ifdef _MSC_VER
#pragma warning(disable:4996)
#endif

#if defined(__APPLE__)
#include <string.h>
@ -423,7 +426,7 @@ bool Audio::FfmpegLoad(const char* buf, int n_file_len){
    return false;
#else
    // from buf
    char* buf_copy = (char *)malloc(n_file_len);
    void* buf_copy = av_malloc(n_file_len);
    memcpy(buf_copy, buf, n_file_len);

    AVIOContext* avio_ctx = avio_alloc_context(
@ -1,29 +1,34 @@
#pragma once
#include <algorithm>
#ifdef _WIN32
#include <codecvt>
#endif

namespace funasr {
typedef struct
{
    std::string msg="";
    std::string stamp="";
    std::string tpass_msg="";
    float snippet_time=0;
    std::string msg;
    std::string stamp;
    std::string tpass_msg;
    float snippet_time;
}FUNASR_RECOG_RESULT;

typedef struct
{
    std::vector<std::vector<int>>* segments;
    float snippet_time=0;
    float snippet_time;
}FUNASR_VAD_RESULT;

typedef struct
{
    string msg="";
    string msg;
    vector<string> arr_cache;
}FUNASR_PUNC_RESULT;

#ifdef _WIN32
#include <codecvt>

#define ORTSTRING(str) StrToWstr(str)
#define ORTCHAR(str) StrToWstr(str).c_str()

inline std::wstring String2wstring(const std::string& str, const std::string& locale)
{
@ -39,8 +44,15 @@ inline std::wstring StrToWstr(std::string str) {

}

#else

#define ORTSTRING(str) str
#define ORTCHAR(str) str

#endif

inline void GetInputName(Ort::Session* session, string& inputName,int nIndex=0) {
    size_t numInputNodes = session->GetInputCount();
    if (numInputNodes > 0) {
@ -17,7 +17,7 @@ void CTTransformerOnline::InitPunc(const std::string &punc_model, const std::str
    session_options.DisableCpuMemArena();

    try{
        m_session = std::make_unique<Ort::Session>(env_, punc_model.c_str(), session_options);
        m_session = std::make_unique<Ort::Session>(env_, ORTSTRING(punc_model).c_str(), session_options);
        LOG(INFO) << "Successfully load model from " << punc_model;
    }
    catch (std::exception const &e) {
@ -74,8 +74,8 @@ string CTTransformerOnline::AddPunc(const char* sz_input, vector<string> &arr_ca
    for (size_t i = 0; i < InputData.size(); i += TOKEN_LEN)
    {
        nDiff = (i + TOKEN_LEN) < InputData.size() ? (0) : (i + TOKEN_LEN - InputData.size());
        vector<int32_t> InputIDs(InputData.begin() + i, InputData.begin() + i + TOKEN_LEN - nDiff);
        vector<string> InputStr(strOut.begin() + i, strOut.begin() + i + TOKEN_LEN - nDiff);
        vector<int32_t> InputIDs(InputData.begin() + i, InputData.begin() + i + (TOKEN_LEN - nDiff));
        vector<string> InputStr(strOut.begin() + i, strOut.begin() + i + (TOKEN_LEN - nDiff));
        InputIDs.insert(InputIDs.begin(), RemainIDs.begin(), RemainIDs.end()); // RemainIDs+InputIDs;
        InputStr.insert(InputStr.begin(), RemainStr.begin(), RemainStr.end()); // RemainStr+InputStr;

@ -102,10 +102,10 @@ string CTTransformerOnline::AddPunc(const char* sz_input, vector<string> &arr_ca
            nSentEnd = nLastCommaIndex;
            Punction[nSentEnd] = PERIOD_INDEX;
        }
        RemainStr.assign(InputStr.begin() + nSentEnd + 1, InputStr.end());
        RemainIDs.assign(InputIDs.begin() + nSentEnd + 1, InputIDs.end());
        InputStr.assign(InputStr.begin(), InputStr.begin() + nSentEnd + 1); // minit_sentence
        Punction.assign(Punction.begin(), Punction.begin() + nSentEnd + 1);
        RemainStr.assign(InputStr.begin() + (nSentEnd + 1), InputStr.end());
        RemainIDs.assign(InputIDs.begin() + (nSentEnd + 1), InputIDs.end());
        InputStr.assign(InputStr.begin(), InputStr.begin() + (nSentEnd + 1)); // minit_sentence
        Punction.assign(Punction.begin(), Punction.begin() + (nSentEnd + 1));
    }

    for (auto& item : Punction)
@ -149,7 +149,7 @@ string CTTransformerOnline::AddPunc(const char* sz_input, vector<string> &arr_ca
            break;
        }
    }
    arr_cache.assign(sentence_words_list.begin() + nSentEnd + 1, sentence_words_list.end());
    arr_cache.assign(sentence_words_list.begin() + (nSentEnd + 1), sentence_words_list.end());

    if (sentenceOut.size() > 0 && m_tokenizer.IsPunc(sentenceOut[sentenceOut.size() - 1]))
    {
@ -17,7 +17,7 @@ void CTTransformer::InitPunc(const std::string &punc_model, const std::string &p
    session_options.DisableCpuMemArena();

    try{
        m_session = std::make_unique<Ort::Session>(env_, punc_model.c_str(), session_options);
        m_session = std::make_unique<Ort::Session>(env_, ORTSTRING(punc_model).c_str(), session_options);
        LOG(INFO) << "Successfully load model from " << punc_model;
    }
    catch (std::exception const &e) {
@ -66,8 +66,8 @@ string CTTransformer::AddPunc(const char* sz_input, std::string language)
    for (size_t i = 0; i < InputData.size(); i += TOKEN_LEN)
    {
        nDiff = (i + TOKEN_LEN) < InputData.size() ? (0) : (i + TOKEN_LEN - InputData.size());
        vector<int32_t> InputIDs(InputData.begin() + i, InputData.begin() + i + TOKEN_LEN - nDiff);
        vector<string> InputStr(strOut.begin() + i, strOut.begin() + i + TOKEN_LEN - nDiff);
        vector<int32_t> InputIDs(InputData.begin() + i, InputData.begin() + i + (TOKEN_LEN - nDiff));
        vector<string> InputStr(strOut.begin() + i, strOut.begin() + i + (TOKEN_LEN - nDiff));
        InputIDs.insert(InputIDs.begin(), RemainIDs.begin(), RemainIDs.end()); // RemainIDs+InputIDs;
        InputStr.insert(InputStr.begin(), RemainStr.begin(), RemainStr.end()); // RemainStr+InputStr;

@ -94,10 +94,10 @@ string CTTransformer::AddPunc(const char* sz_input, std::string language)
            nSentEnd = nLastCommaIndex;
            Punction[nSentEnd] = PERIOD_INDEX;
        }
        RemainStr.assign(InputStr.begin() + nSentEnd + 1, InputStr.end());
        RemainIDs.assign(InputIDs.begin() + nSentEnd + 1, InputIDs.end());
        InputStr.assign(InputStr.begin(), InputStr.begin() + nSentEnd + 1); // minit_sentence
        Punction.assign(Punction.begin(), Punction.begin() + nSentEnd + 1);
        RemainStr.assign(InputStr.begin() + (nSentEnd + 1), InputStr.end());
        RemainIDs.assign(InputIDs.begin() + (nSentEnd + 1), InputIDs.end());
        InputStr.assign(InputStr.begin(), InputStr.begin() + (nSentEnd + 1)); // minit_sentence
        Punction.assign(Punction.begin(), Punction.begin() + (nSentEnd + 1));
    }

    NewPunctuation.insert(NewPunctuation.end(), Punction.begin(), Punction.end());

@ -54,7 +54,7 @@ void FsmnVad::LoadConfigFromYaml(const char* filename){
void FsmnVad::ReadModel(const char* vad_model) {
  try {
    vad_session_ = std::make_shared<Ort::Session>(
        env_, vad_model, session_options_);
        env_, ORTCHAR(vad_model), session_options_);
    LOG(INFO) << "Successfully load model from " << vad_model;
  } catch (std::exception const &e) {
    LOG(ERROR) << "Error when load vad onnx model: " << e.what();
@ -1,9 +1,6 @@
#include "precomp.h"
#include <vector>
#ifdef __cplusplus

extern "C" {
#endif

// APIs for Init
_FUNASRAPI FUNASR_HANDLE FunASRInit(std::map<std::string, std::string>& model_path, int thread_num, ASR_TYPE type)
@ -257,12 +254,16 @@ extern "C" {
        int n_total = audio.GetQueueSize();
        float start_time = 0.0;
        std::string cur_stamp = "[";
        std::string lang = (offline_stream->asr_handle)->GetLang();
        while (audio.Fetch(buff, len, flag, start_time) > 0) {
            string msg = (offline_stream->asr_handle)->Forward(buff, len, true, hw_emb);
            std::vector<std::string> msg_vec = funasr::split(msg, '|');
            if(msg_vec.size()==0){
                continue;
            }
            if(lang == "en-bpe" and p_result->msg != ""){
                p_result->msg += " ";
            }
            p_result->msg += msg_vec[0];
            //timestamp
            if(msg_vec.size() > 1){
@ -282,7 +283,6 @@ extern "C" {
            p_result->stamp += cur_stamp + "]";
        }
        if(offline_stream->UsePunc()){
            string lang = (offline_stream->asr_handle)->GetLang();
            string punc_res = (offline_stream->punc_handle)->AddPunc((p_result->msg).c_str(), lang);
            p_result->msg = punc_res;
        }
@ -338,12 +338,16 @@ extern "C" {
        int n_total = audio.GetQueueSize();
        float start_time = 0.0;
        std::string cur_stamp = "[";
        std::string lang = (offline_stream->asr_handle)->GetLang();
        while (audio.Fetch(buff, len, flag, start_time) > 0) {
            string msg = (offline_stream->asr_handle)->Forward(buff, len, true, hw_emb);
            std::vector<std::string> msg_vec = funasr::split(msg, '|');
            if(msg_vec.size()==0){
                continue;
            }
            if(lang == "en-bpe" and p_result->msg != ""){
                p_result->msg += " ";
            }
            p_result->msg += msg_vec[0];
            //timestamp
            if(msg_vec.size() > 1){
@ -364,7 +368,6 @@ extern "C" {
            p_result->stamp += cur_stamp + "]";
        }
        if(offline_stream->UsePunc()){
            string lang = (offline_stream->asr_handle)->GetLang();
            string punc_res = (offline_stream->punc_handle)->AddPunc((p_result->msg).c_str(), lang);
            p_result->msg = punc_res;
        }
@ -688,8 +691,4 @@ extern "C" {
        delete tpass_online_stream;
    }

#ifdef __cplusplus

}
#endif
@ -1,5 +1,4 @@
#include "precomp.h"
#include <unistd.h>

namespace funasr {
OfflineStream::OfflineStream(std::map<std::string, std::string>& model_path, int thread_num)

@ -37,7 +37,7 @@ void Paraformer::InitAsr(const std::string &am_model, const std::string &am_cmvn
    session_options_.DisableCpuMemArena();

    try {
        m_session_ = std::make_unique<Ort::Session>(env_, am_model.c_str(), session_options_);
        m_session_ = std::make_unique<Ort::Session>(env_, ORTSTRING(am_model).c_str(), session_options_);
        LOG(INFO) << "Successfully load model from " << am_model;
    } catch (std::exception const &e) {
        LOG(ERROR) << "Error when load am onnx model: " << e.what();
@ -90,7 +90,7 @@ void Paraformer::InitAsr(const std::string &en_model, const std::string &de_mode
    session_options_.DisableCpuMemArena();

    try {
        encoder_session_ = std::make_unique<Ort::Session>(env_, en_model.c_str(), session_options_);
        encoder_session_ = std::make_unique<Ort::Session>(env_, ORTSTRING(en_model).c_str(), session_options_);
        LOG(INFO) << "Successfully load model from " << en_model;
    } catch (std::exception const &e) {
        LOG(ERROR) << "Error when load am encoder model: " << e.what();
@ -98,7 +98,7 @@
    }

    try {
        decoder_session_ = std::make_unique<Ort::Session>(env_, de_model.c_str(), session_options_);
        decoder_session_ = std::make_unique<Ort::Session>(env_, ORTSTRING(de_model).c_str(), session_options_);
        LOG(INFO) << "Successfully load model from " << de_model;
    } catch (std::exception const &e) {
        LOG(ERROR) << "Error when load am decoder model: " << e.what();
@ -153,7 +153,7 @@ void Paraformer::InitAsr(const std::string &am_model, const std::string &en_mode

    // offline
    try {
        m_session_ = std::make_unique<Ort::Session>(env_, am_model.c_str(), session_options_);
        m_session_ = std::make_unique<Ort::Session>(env_, ORTSTRING(am_model).c_str(), session_options_);
        LOG(INFO) << "Successfully load model from " << am_model;
    } catch (std::exception const &e) {
        LOG(ERROR) << "Error when load am onnx model: " << e.what();
@ -250,7 +250,7 @@ void Paraformer::InitHwCompiler(const std::string &hw_model, int thread_num) {
    hw_session_options.DisableCpuMemArena();

    try {
        hw_m_session = std::make_unique<Ort::Session>(hw_env_, hw_model.c_str(), hw_session_options);
        hw_m_session = std::make_unique<Ort::Session>(hw_env_, ORTSTRING(hw_model).c_str(), hw_session_options);
        LOG(INFO) << "Successfully load model from " << hw_model;
    } catch (std::exception const &e) {
        LOG(ERROR) << "Error when load hw compiler onnx model: " << e.what();

@ -17,6 +17,25 @@
#include <numeric>
#include <cstring>

#ifdef _WIN32
#include<io.h>
#ifndef R_OK
#define R_OK 4
#endif
#ifndef W_OK
#define W_OK 2
#endif
#ifndef X_OK
#define X_OK 0
#endif
#ifndef F_OK
#define F_OK 0
#endif
#define access _access
#else
#include <unistd.h>
#endif

using namespace std;
// third part
#if defined(__APPLE__)
@ -33,6 +52,8 @@ using namespace std;

// mine
#include <glog/logging.h>

#include "common-struct.h"
#include "com-define.h"
#include "commonfunc.h"

@ -1,5 +1,4 @@
#include "precomp.h"
#include <unistd.h>

namespace funasr {
TpassOnlineStream::TpassOnlineStream(TpassStream* tpass_stream, std::vector<int> chunk_size){

@ -1,5 +1,4 @@
#include "precomp.h"
#include <unistd.h>

namespace funasr {
TpassStream::TpassStream(std::map<std::string, std::string>& model_path, int thread_num)
@ -6,10 +6,34 @@ It has attracted many developers to participate in experiencing and developing.

- File transcription service, Mandarin, CPU version, done
- The real-time transcription service, Mandarin (CPU), done
- File transcription service, English, CPU version, done
- File transcription service, Mandarin, GPU version, in progress
- File transcription service, English, in progress
- and more.

## File Transcription Service, English (CPU)

Currently, the FunASR runtime-SDK supports the deployment of file transcription service, English (CPU version), with a complete speech recognition chain that can transcribe tens of hours of audio into punctuated text, and supports recognition for more than a hundred concurrent streams.

To meet the needs of different users, we have prepared different tutorials with text and images for both novice and advanced developers.

### Technical Principles

The technical principles and documentation behind FunASR explain the underlying technology, recognition accuracy, computational efficiency, and core advantages of the framework, including convenience, high precision, high efficiency, and support for long audio chains. For detailed information, please refer to the documentation available by [docs](https://mp.weixin.qq.com/s/DHQwbgdBWcda0w_L60iUww).

### Deployment Tutorial

The documentation mainly targets novice users who have no need for modifications or customization. It supports downloading model deployments from modelscope and also supports deploying models that users have fine-tuned. For detailed tutorials, please refer to [docs](docs/SDK_tutorial_en.md).

### Advanced Development Guide

The documentation mainly targets advanced developers who require modifications and customization of the service. It supports downloading model deployments from modelscope and also supports deploying models that users have fine-tuned. For detailed information, please refer to the documentation available by [docs](./docs/SDK_advanced_guide_offline_en.md)

### latest version & image ID
| image version | image ID | INFO |
|------------------------------|-----|------|
| funasr-runtime-sdk-en-cpu-0.1.0 | 4ce696fe9ba5 | |

## The real-time transcription service, Mandarin (CPU)

The FunASR real-time speech-to-text service software package not only performs real-time speech-to-text conversion, but also allows high-precision transcription text correction at the end of each sentence and outputs text with punctuation, supporting high-concurrency multiple requests.
@ -36,13 +60,13 @@ The document introduces the technology principles behind the service, recognitio

## File Transcription Service, Mandarin (CPU)

Currently, the FunASR runtime-SDK-0.0.1 version supports the deployment of file transcription service, Mandarin (CPU version), with a complete speech recognition chain that can transcribe tens of hours of audio into punctuated text, and supports recognition for more than a hundred concurrent streams.
Currently, the FunASR runtime-SDK supports the deployment of file transcription service, Mandarin (CPU version), with a complete speech recognition chain that can transcribe tens of hours of audio into punctuated text, and supports recognition for more than a hundred concurrent streams.

To meet the needs of different users, we have prepared different tutorials with text and images for both novice and advanced developers.

### Technical Principles

The technical principles and documentation behind FunASR explain the underlying technology, recognition accuracy, computational efficiency, and core advantages of the framework, including convenience, high precision, high efficiency, and support for long audio chains. For detailed information, please refer to the documentation available by [docs](https://mp.weixin.qq.com/s?__biz=MzA3MTQ0NTUyMw==&tempkey=MTIyNF84d05USjMxSEpPdk5GZXBJUFNJNzY0bU1DTkxhV19mcWY4MTNWQTJSYXhUaFgxOWFHZTZKR0JzWC1JRmRCdUxCX2NoQXg0TzFpNmVJX2R1WjdrcC02N2FEcUc3MDhzVVhpNWQ5clU4QUdqNFdkdjFYb18xRjlZMmc5c3RDOTl0U0NiRkJLb05ZZ0RmRlVkVjFCZnpXNWFBVlRhbXVtdWs4bUMwSHZnfn4%3D&chksm=1f2c3254285bbb42bc8f76a82e9c5211518a0bb1ff8c357d085c1b78f675ef2311f3be6e282c#rd).
The technical principles and documentation behind FunASR explain the underlying technology, recognition accuracy, computational efficiency, and core advantages of the framework, including convenience, high precision, high efficiency, and support for long audio chains. For detailed information, please refer to the documentation available by [docs](https://mp.weixin.qq.com/s/DHQwbgdBWcda0w_L60iUww).

### Deployment Tutorial
@ -7,10 +7,34 @@ SDK 支持以下几种服务部署:

- 中文离线文件转写服务(CPU版本),已完成
- 中文流式语音识别服务(CPU版本),已完成
- 英文离线文件转写服务(CPU版本),已完成
- 中文离线文件转写服务(GPU版本),进行中
- 英文离线转写服务,进行中
- 更多支持中

## 英文离线文件转写服务(CPU版本)

英文离线文件转写服务部署(CPU版本),拥有完整的语音识别链路,可以将几十个小时的长音频与视频识别成带标点的文字,而且支持上百路请求同时进行转写。
为了支持不同用户的需求,针对不同场景,准备了不同的图文教程:

### 便捷部署教程

适用场景为,对服务部署SDK无修改需求,部署模型来自于ModelScope,或者用户finetune,详细教程参考([点击此处](./docs/SDK_tutorial_en_zh.md))

### 开发指南

适用场景为,对服务部署SDK有修改需求,部署模型来自于ModelScope,或者用户finetune,详细文档参考([点击此处](./docs/SDK_advanced_guide_offline_en_zh.md))

### 技术原理揭秘

文档介绍了背后技术原理,识别准确率,计算效率等,以及核心优势介绍:便捷、高精度、高效率、长音频链路,详细文档参考([点击此处](https://mp.weixin.qq.com/s/DHQwbgdBWcda0w_L60iUww))

### 最新版本及image ID
| image version | image ID | INFO |
|------------------------------|-----|------|
| funasr-runtime-sdk-en-cpu-0.1.0 | 4ce696fe9ba5 | |

## 中文实时语音听写服务(CPU版本)

FunASR实时语音听写服务软件包,既可以实时地进行语音转文字,而且能够在说话句尾用高精度的转写文字修正输出,输出文字带有标点,支持高并发多路请求。