Ep 32: Best Practices - Pipeline Development, Part Two
Update: 2024-03-23
Description
PHA4GE Ten Best Practices for Public Health Bioinformatics Pipelines:
https://github.com/pha4ge/public-health-pipeline-best-practices/blob/main/docs/pipeline-best-practices.md
Summary
In this episode, Kevin Libuit and Andrew Page discuss the ten best practices for public health pipeline development. They begin with the use of common file formats, which avoids reinventing the wheel and means parsers are already available in most languages. They then cover software testing, including automated testing and its integration with Docker containers, before turning to the importance of accessible benchmark or validation data sets and of clearly stated reference data requirements. They close with the value of hiring bioinformaticians with domain expertise and the documentation practices pipelines should follow.
Takeaways
Use common file formats to avoid reinventing the wheel and enable compatibility with other programs.
Implement software testing, including automated testing, to ensure functionality and identify bugs.
Provide benchmark or validation data sets to allow users to compare and evaluate the performance of the pipeline.
Consider the reference data requirements and ensure accessibility to curated databases.
Hire bioinformaticians with domain expertise to navigate the complexities of pipeline development.
Follow documentation practices, including communication of authorship, pipeline maintenance statements, and community guidelines for contribution and support.
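The software-testing takeaway can be illustrated with a minimal sketch using Python's built-in unittest module. The `gc_content` helper is a hypothetical pipeline function, not code from the episode; the point is that even a small automated test catches both wrong answers and bad inputs, and a suite like this can run automatically in CI, including inside a Docker container.

```python
import unittest

def gc_content(sequence: str) -> float:
    """Fraction of G/C bases in a nucleotide sequence (hypothetical pipeline helper)."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in sequence) / len(sequence)

class TestGcContent(unittest.TestCase):
    def test_known_value(self):
        # Two of the four bases are G or C.
        self.assertAlmostEqual(gc_content("atgc"), 0.5)

    def test_empty_sequence_rejected(self):
        # Bad input should fail loudly, not silently return 0.
        with self.assertRaises(ValueError):
            gc_content("")

if __name__ == "__main__":
    unittest.main()
```

Wiring a suite like this into a CI service is what turns testing from a one-off check into the automated testing the episode recommends.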