Evaluating Large Language Models for Cybersecurity Tasks: Challenges and Best Practices

Update: 2024-07-25

Description

How can we effectively use large language models (LLMs) for cybersecurity tasks? In this Carnegie Mellon University Software Engineering Institute podcast, Jeff Gennari and Sam Perl discuss applications for LLMs in cybersecurity, potential challenges, and recommendations for evaluating LLMs.

In Channel

Understanding Container Reproducibility Challenges: Stopping the Next Solar Winds

2025-07-30 (25:10)

Mitigating Cyber Risk with Secure by Design

2025-07-14 (32:29)

The Magic in the Middle: Evolving Scaled Software Solutions for National Defense

2025-06-18 (21:25)

Making Process Respectable Again: Advancing DevSecOps in the DoD Mission Space

2025-06-04 (44:26)

Deploying on the Edge

2025-05-28 (01:01:02)

The Best and Brightest: 6 Years of Supporting the President’s Cup Cybersecurity Competition

2025-05-12 (21:40)

Updating Risk Assessment in the CERT Secure Coding Standard

2025-04-17 (35:52)

Delivering Next Generation Cyber Capabilities to the DoD Warfighter

2025-04-15 (27:16)

Getting the Most Out of Your Insider Risk Data with IIDES

2025-03-26 (39:14)

Grace Lewis Outlines Vision for IEEE Computer Society Presidency

2025-03-11 (18:14)

Improving Machine Learning Test and Evaluation with MLTE

2025-03-03 (29:06)

DOD Software Modernization: SEI Impact and Innovation

2025-02-25 (27:12)

Securing Docker Containers: Techniques, Challenges, and Tools

2024-12-16 (39:09)

An Introduction to Software Cost Estimation

2024-12-04 (22:55)

Cybersecurity Metrics: Protecting Data and Understanding Threats

2024-10-11 (27:00)

3 Key Elements for Designing Secure Systems

2024-10-02 (36:28)

Using Role-Playing Scenarios to Identify Bias in LLMs

2024-09-16 (45:07)

Best Practices and Lessons Learned in Standing Up an AISIRT

2024-09-09 (38:29)

3 API Security Risks (and How to Protect Against Them)

2024-08-22 (19:28)

Evaluating Large Language Models for Cybersecurity Tasks: Challenges and Best Practices

2024-07-25 (43:05)
