Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM’s Reasoning Capability

Update: 2024-12-02

Description

This paper introduces cDPO, a method that identifies critical tokens, the tokens that lead an LLM to incorrect reasoning, and uses token-level rewards during alignment to improve performance on reasoning tasks.

https://arxiv.org/abs/2411.19943

YouTube: https://www.youtube.com/@ArxivPapers

TikTok: https://www.tiktok.com/@arxiv_papers

Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016

Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers

---

Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/support

Igor Melnyk