[feat][WIP]: support multistream overlap(dbo) for deepseek #941

zxdukki · 2025-05-23T15:46:25Z

What this PR does / why we need it?

Based on the dual-batch overlap (DBO) design proposed by the DeepSeek team and the fused MoE implementation in the vLLM project, this PR implements multi-stream (also known as dual-batch) overlap for DeepSeek with MLA on Ascend NPU. We split the model's input batch into two microbatches and then overlap the computation and communication ops in the attention and MoE layers across two streams to improve performance. The approach can be easily extended when dispatch/combine communications are added to the MoE layer.
Compared with the previously proposed draft, we dedicate one stream to computation ops and the other to communication ops. In our opinion, this makes it easier to control the execution order of the different ops and thus avoids contention for computation and communication resources.
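
To illustrate the idea, here is a minimal pure-Python sketch of the scheduling pattern described above. It is not the code in this PR and does not touch any NPU stream API. The op names (`attention`, `attention_comm`, `moe`, `moe_comm`), the `Op` class, and the helpers `split_batch` and `build_schedule` are all assumptions made for illustration; it only shows how offsetting the second microbatch by one step lets a compute stream and a communication stream stay busy at the same time.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

# Hypothetical per-layer op sequence: compute and communication alternate.
STEPS = [
    ("attention", "compute"),
    ("attention_comm", "comm"),
    ("moe", "compute"),
    ("moe_comm", "comm"),
]


@dataclass(frozen=True)
class Op:
    microbatch: int
    layer: int
    name: str
    kind: str  # "compute" or "comm"


def split_batch(tokens: Sequence[int]) -> Tuple[Sequence[int], Sequence[int]]:
    """Split the input batch into two microbatches of near-equal size."""
    mid = (len(tokens) + 1) // 2
    return tokens[:mid], tokens[mid:]


def microbatch_ops(microbatch: int, num_layers: int) -> List[Op]:
    """List the ops of one microbatch in program order."""
    return [
        Op(microbatch, layer, name, kind)
        for layer in range(num_layers)
        for name, kind in STEPS
    ]


def build_schedule(num_layers: int) -> List[Tuple[Optional[Op], Optional[Op]]]:
    """Return time slots as (compute_stream_op, comm_stream_op).

    Microbatch 1 starts one slot after microbatch 0, so whenever one
    microbatch occupies the compute stream the other occupies the
    communication stream.
    """
    ops0 = microbatch_ops(0, num_layers)
    ops1 = microbatch_ops(1, num_layers)
    slots = []
    for t in range(len(ops0) + 1):
        slot = {"compute": None, "comm": None}
        if t < len(ops0):
            slot[ops0[t].kind] = ops0[t]
        if t >= 1:
            slot[ops1[t - 1].kind] = ops1[t - 1]
        slots.append((slot["compute"], slot["comm"]))
    return slots


if __name__ == "__main__":
    for compute_op, comm_op in build_schedule(num_layers=1):
        print(compute_op, "|", comm_op)
```

In this toy model every op takes one slot, so one layer needs 5 slots for both microbatches instead of 8 when run back to back. Real op durations differ, so the actual gain depends on how well computation and communication times balance.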

Note that this PR is in progress. The benchmark performance will be updated soon.

ref: overlap for llama
ref: dbo in sglang

Does this PR introduce any user-facing change?

This PR adds a switch in vLLM (or an environment variable in vllm-ascend) to enable or disable multi-stream/DBO. The feature is disabled by default.
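
As a rough sketch of what an opt-in environment switch of this kind could look like, the snippet below reads a flag that is off unless explicitly set. The variable name `VLLM_ASCEND_ENABLE_DBO` and the helper `dbo_enabled` are placeholders chosen for illustration; this description does not state the actual name.

```python
import os
from typing import Mapping, Optional

# Placeholder name, assumed for illustration only.
DBO_ENV_VAR = "VLLM_ASCEND_ENABLE_DBO"


def dbo_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only if the switch is explicitly turned on."""
    env = os.environ if env is None else env
    # Missing or unrecognized values keep the feature disabled.
    return env.get(DBO_ENV_VAR, "0").strip().lower() in {"1", "true"}
```

Defaulting to off matches the stated behavior that multi-stream/DBO is not enabled unless the user asks for it.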

How was this patch tested?

Currently, it can be tested by adding '--enable-multi-stream' when starting the vLLM online service with the specified version of vLLM. We will update the related modifications in vLLM soon.

Any advice/discussion is welcome.

github-actions bot added the module:ops label May 23, 2025

[feat]: support dbo for deepseek

68070f1

zxdukki force-pushed the dev_multistream_overlap branch from 943d296 to 68070f1 Compare May 23, 2025 16:05
