Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion

Update: 2025-12-31

Description

🤗 Upvotes: 30 | cs.CV

Authors:

Hau-Shiang Shiu, Chin-Yang Lin, Zhixiang Wang, Chi-Wei Hsiao, Po-Fan Yu, Yu-Chih Chen, Yu-Lun Liu

Title:

Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion

Arxiv:

http://arxiv.org/abs/2512.23709v1

Abstract:

Diffusion-based video super-resolution (VSR) methods achieve strong perceptual quality but remain impractical for latency-sensitive settings due to reliance on future frames and expensive multi-step denoising. We propose Stream-DiffVSR, a causally conditioned diffusion framework for efficient online VSR. Operating strictly on past frames, it combines a four-step distilled denoiser for fast inference, an Auto-regressive Temporal Guidance (ARTG) module that injects motion-aligned cues during latent denoising, and a lightweight temporal-aware decoder with a Temporal Processor Module (TPM) that enhances detail and temporal coherence. Stream-DiffVSR processes 720p frames in 0.328 seconds on an RTX 4090 GPU and significantly outperforms prior diffusion-based methods. Compared with the online SOTA TMP, it boosts perceptual quality (LPIPS +0.095) while reducing latency by over 130x. Stream-DiffVSR achieves the lowest latency reported for diffusion-based VSR, reducing initial delay from over 4600 seconds to 0.328 seconds, thereby making it the first diffusion VSR method suitable for low-latency online deployment. Project page: https://jamichss.github.io/stream-diffvsr-project-page/
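
To make the "operating strictly on past frames" idea concrete, here is a minimal Python sketch of a causal, auto-regressive streaming loop. It is not the authors' implementation: `toy_step` is a hypothetical stand-in for the distilled denoiser, ARTG, and decoder, frames are plain lists of floats rather than image tensors, and all names are assumptions made for illustration. The only property it is meant to show is that each output depends on the current low-resolution frame and the previous output, never on future frames.

```python
from typing import Callable, Iterable, Iterator, List, Optional

Frame = List[float]  # toy stand-in for an image tensor (assumption)


def toy_step(lr_frame: Frame, prev_hr: Optional[Frame]) -> Frame:
    """Hypothetical placeholder for the real per-frame model.

    Upsamples 2x by repeating each sample, then averages with the
    previous output as a crude form of temporal guidance.
    """
    upsampled = [v for v in lr_frame for _ in range(2)]
    if prev_hr is None:  # first frame: no history to condition on
        return upsampled
    return [0.5 * cur + 0.5 * prev for cur, prev in zip(upsampled, prev_hr)]


def stream_vsr(
    frames: Iterable[Frame],
    step: Callable[[Frame, Optional[Frame]], Frame],
) -> Iterator[Frame]:
    """Yield one output per input frame, using only past information."""
    prev_hr: Optional[Frame] = None
    for lr_frame in frames:
        hr_frame = step(lr_frame, prev_hr)
        yield hr_frame      # available before the next frame is read
        prev_hr = hr_frame  # carried forward auto-regressively


def frames_per_second(seconds_per_frame: float) -> float:
    """Convert a per-frame latency into throughput."""
    return 1.0 / seconds_per_frame
```

Because the loop is a generator, the first output is produced after reading a single input frame, which is the sense in which initial delay equals per-frame latency. Plugging the reported 0.328 seconds into `frames_per_second` gives roughly 3 frames per second on the hardware quoted in the abstract.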



Jingwen Liang, Gengyu Wang
