ProgCo: Program Helps Self-Correction of Large Language Models

Update: 2025-01-04

Description

🤗 Upvotes: 17 | cs.CL, cs.AI, cs.LG

Authors:

Xiaoshuai Song, Yanan Wu, Weixun Wang, Jiaheng Liu, Wenbo Su, Bo Zheng

Title:

ProgCo: Program Helps Self-Correction of Large Language Models

Arxiv:

http://arxiv.org/abs/2501.01264v1

Abstract:

Self-Correction aims to enable large language models (LLMs) to self-verify and self-refine their initial responses without external feedback. However, LLMs often fail to self-verify effectively and to generate correct feedback, which misleads refinement and causes self-correction to fail, especially in complex reasoning tasks. In this paper, we propose Program-driven Self-Correction (ProgCo). First, program-driven verification (ProgVe) achieves complex verification logic and extensive validation through self-generated, self-executing verification pseudo-programs. Then, program-driven refinement (ProgRe) receives feedback from ProgVe and conducts dual reflection and refinement on both responses and verification programs, mitigating the misleading effect of incorrect feedback in complex reasoning tasks. Experiments on three instruction-following and mathematical benchmarks indicate that ProgCo achieves effective self-correction and can further enhance performance when combined with real program tools.
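To make the verify-then-refine loop in the abstract more concrete, here is a minimal Python sketch. It is not the authors' implementation. Every name in it (`Feedback`, `self_correct`, and the `toy_*` helpers) is hypothetical, and plain Python functions stand in for the LLM calls that would generate the response, write the verification program, and perform the refinements. The paper's verification pseudo-programs are executed by the LLM itself; this sketch runs a real Python function instead, which is closer to the "real program tools" variant the abstract mentions.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Feedback:
    """Result of running a verification program (hypothetical structure)."""
    passed: bool
    message: str


# A verifier maps a response to feedback. In this sketch it is a real
# Python callable standing in for a self-generated verification program.
Verifier = Callable[[str], Feedback]


def self_correct(
    task: str,
    respond: Callable[[str], str],
    write_verifier: Callable[[str], Verifier],
    refine_response: Callable[[str, str, Feedback], str],
    refine_verifier: Callable[[str, Verifier, Feedback], Verifier],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Verify-then-refine loop; returns the final response and rounds used."""
    response = respond(task)
    verifier = write_verifier(task)
    for round_no in range(max_rounds):
        feedback = verifier(response)
        if feedback.passed:
            return response, round_no
        # Dual refinement: revisit the response and the verifier, since
        # the feedback itself may be wrong.
        response = refine_response(task, response, feedback)
        verifier = refine_verifier(task, verifier, feedback)
    return response, max_rounds


# Toy stand-ins (assumptions for illustration only, no LLM involved).
def toy_respond(task: str) -> str:
    return "Hello There My Friend"


def toy_write_verifier(task: str) -> Verifier:
    def verify(response: str) -> Feedback:
        if len(response.split()) > 3:
            return Feedback(False, "too many words")
        if response != response.lower():
            return Feedback(False, "not lowercase")
        return Feedback(True, "ok")
    return verify


def toy_refine_response(task: str, response: str, feedback: Feedback) -> str:
    if feedback.message == "too many words":
        return " ".join(response.split()[:3])
    return response.lower()


def toy_refine_verifier(task: str, verifier: Verifier, feedback: Feedback) -> Verifier:
    # This toy never finds fault with the verifier and returns it unchanged.
    return verifier


result = self_correct(
    "Reply in at most three words, all lowercase.",
    toy_respond,
    toy_write_verifier,
    toy_refine_response,
    toy_refine_verifier,
)
# result == ('hello there my', 2)
```

In the toy run, the first round shortens the response, the second lowercases it, and the third verification passes. A real system would replace each `toy_*` function with a model call, and `refine_verifier` is where a faulty verification program would be repaired rather than trusted.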




ProgCo: Program Helps Self-Correction of Large Language Models

Jingwen Liang, Gengyu Wang
