The HDF Group's Call the Doctor

Author: The HDF Group

Description

Find the latest information about the HDF file formats and new developments from The HDF Group and the HDF community.
63 Episodes
Aleksandar Jelenak hosted Call the Doctor on Tuesday, April 16, 2024. He talked about last week’s h5py 3.11 release and current interesting issues and PRs. Aleksandar has a background as a scientist working with geospatial and geoscience data, and now works at The HDF Group to make HDF5 and related tools more effective for other scientists. This session happened on April 16, 2024. You can also watch this episode online. Call the Doctor is a series of weekly, unscripted, live events! The HDF Group’s staff members will answer attendee questions and, for example, go over the previous week’s HDF Forum posts. The HDF Clinics are free sessions intended to help users tackle real-world HDF problems from a common cold to severe headaches and offer relief where that’s possible. As time permits, we will include how-tos, offer advice on tool usage, review your code samples, teach you survival in the documentation jungle, and discuss what’s new or just around the corner in the land of HDF. Join us every Tuesday at 12:20 p.m. Central (US/Canada) on Zoom!
The HDF Group’s John Readey talked about using HSDS with GitHub Codespaces. This session happened on April 9, 2024. You can also watch this episode online.
The HDF Group's Neil Fortner presented a design for a new feature we’d like to implement to bring crash proofing to HDF5. This feature will allow HDF5 files to be easily recoverable, even if application execution is interrupted due to an application crash, system crash, or hardware failure. The feature is designed to be easy to use and add minimal overhead.
If your research relies on high-performance computing clusters, this Call the Doctor, hosted by The HDF Group's Scot Breitenfeld, offered support for using HDF5. Scot presented recent subfiling performance results and answered some community questions on the efforts so far. This session happened on March 26, 2024. You can also watch this episode online.
The HDF Group's Aleksandar Jelenak hosted this episode of Call the Doctor. He went through some recent h5py issues and pull requests. He helped a user who was experiencing poor performance while streaming object references. He referenced this poster. Aleksandar also talked about fsspec, a Python project that provides a file-object-like interface for various remote storage systems. This session happened on March 19, 2024. You can watch this episode online.
Did you know that HSDS supports SQL-like query operations on datasets? In this session of Call the Doctor, John Readey describes this feature and illustrates it with some live examples. Links mentioned by John in the session: H5py + PyTables blog, query read example, query update example. This session was recorded on March 12, 2024. You can also watch this episode online.
In this episode of Call the Doctor, Director of Software Engineering Dana Robinson covers the schedule for new releases of HDF5, HDF4, and HDFView and what those new releases will include. A user also asked when The HDF Group would change the name to HDF5 2.0. This session happened on March 5, 2024. You can watch this episode online.
In this episode of "Call the Doctor," Director of Software Engineering Dana Robinson covers updates and changes in the HDF4 library. Dana discusses HDF 4.3.0 (released 2/29/24) and version 4.4, with plans to maintain HDF4 indefinitely. The focus is on improving HDF4's compatibility with modern compilers and platforms, addressing issues like the xdr library's compatibility problems and the deployment of internal header files. Significant changes include removing xdr from configuration options, improving compiler support, and cleaning up memory sanitizer issues. The episode also touches on restructuring the Fortran code, removing outdated Fortran 77 support, and potentially merging libraries into a single HDF4 library. Additionally, plans involve phasing out the old netCDF 2.3.2 API and associated tools in favor of newer alternatives. The HDF Group will continue modernizing HDF4 to ensure its long-term maintainability, despite potential disruptions caused by these changes. We do need to hear from users about these plans and how they might work or cause conflict with your code, so please reach out to us to let us know about your concerns on the HDF4 Forum or by emailing help@hdfgroup.org. This session happened on February 27, 2024. You can also watch this episode online.
In this episode of "Call the Doctor," Aleksandar Jelenak discusses ways to document and describe the content of HDF5 files in order to generate documentation about the file content, as inspired by this forum post from Mike Jackson of Blue Quartz. The session emphasizes the need for a format that is approachable, shareable, and easily parsable by multiple programming languages. Jelenak discusses various options that have been used, including Excel spreadsheets, JSON, and text formats like YAML. He also presents his own idea of using YAML documents to describe the content of HDF5 files in a hierarchical and straightforward manner. The session concludes with a discussion about the importance of bidirectionality in the toolchain and the potential for future developments in this area. This session happened on February 20, 2024. You can also watch this episode online.
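As a rough illustration of the idea of describing file content hierarchically, the sketch below renders a nested description as indented YAML-style text. It is only an assumption of what such a description might look like: the keys (attributes, groups, datasets), the example names, and the to_yaml_like helper are hypothetical and are not the format presented in the session. It uses only the Python standard library and does not read an actual HDF5 file.

```python
# Hypothetical in-memory description of an HDF5 file's hierarchy.
# The keys below are illustrative only, not a published schema.
content = {
    "attributes": {"title": "Example file"},
    "groups": {
        "measurements": {
            "datasets": {
                "temperature": {"dtype": "float32", "shape": [100, 50]},
            },
        },
    },
}


def to_yaml_like(node, indent=0):
    """Render nested dicts as indented 'key: value' lines (a small YAML subset)."""
    lines = []
    pad = "  " * indent
    for key, value in node.items():
        if isinstance(value, dict):
            # A nested mapping becomes a key followed by deeper-indented lines.
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml_like(value, indent + 1))
        else:
            # Scalars and lists are written inline after the key.
            lines.append(f"{pad}{key}: {value}")
    return lines


text = "\n".join(to_yaml_like(content))
print(text)
```

A text form like this can be diffed, shared, and parsed from many languages, which is the property the session description emphasizes.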
In this episode of Call the Doctor, The HDF Group’s Matt Larson will give an introduction to the HDF in the Cloud tutorials: REST VOL, HSDS, ROS3, and H5Coro. If you’re an experienced HDF5 user, come to this Call the Doctor session, take a look at the tutorial content, and feel free to make improvements or let us know of any issues.
For this week's Call the Doctor, The HDF Group's Jordan Henderson, author of the Adding support for 16-bit floating point and complex number datatypes to HDF5 RFC, briefly talked through that RFC and led a community discussion for feedback on the concept and proposal. For more discussion, and to participate, please visit this forum post: https://forum.hdfgroup.org/t/hdf5-rfc-adding-support-for-16-bit-floating-point-and-complex-number-datatypes-to-hdf5/11975 This session happened on February 6, 2024. You can also watch this session online.
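To give a feel for what a 16-bit floating point value can and cannot represent, here is a minimal sketch using Python's standard struct module. It assumes the 16-bit float in question is the IEEE 754 binary16 format; it does not use HDF5 or the API proposed in the RFC, and the roundtrip_half helper is a hypothetical name chosen for this example.

```python
import struct


def roundtrip_half(value):
    """Encode a Python float as IEEE 754 binary16, then decode it back."""
    packed = struct.pack("<e", value)  # "e" is struct's half-precision format code
    return packed, struct.unpack("<e", packed)[0]


packed, restored = roundtrip_half(0.1)
print(len(packed))  # number of bytes used by the 16-bit encoding
print(restored)     # nearest half-precision value to 0.1
```

The decoded value differs slightly from 0.1 because binary16 carries far fewer significand bits than a 64-bit float, which is the storage-versus-precision trade-off behind a 16-bit type.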
For an individual, creating an HDF5 tutorial is hard; making a good HDF5 tutorial is nearly impossible. There are many technical facets to cover; the global community it serves is phenomenally diverse, and so is its rich ecosystem. Wouldn’t it be great to have a tutorial that anyone could enjoy without pesky installation details, and everyone with an idea to improve upon it could contribute easily? Introducing the HDF5 Tutorial developed by the community for the community. Fork it on GitHub at https://github.com/HDFGroup/hdf5-tutorial. The HDF Group's Executive Director Gerd Heber uses this session of Call the Doctor to give an overview of the tutorial and introduce the underpinnings so that everyone so inclined can contribute and help create the best possible HDF5 tutorial ever. Would you like to discuss this tutorial with Gerd and others? Come to the forum and let us know your thoughts. This session was recorded on January 30, 2024. You can also watch this episode online.
In this episode of "Call the Doctor," Scot Breitenfeld of The HDF Group hosted an open help session for your HPC HDF5 questions. (Questions are welcome at any session, but we knew we had some community members with specific needs for this session.) A research engineer working on open-source thermohydraulics code discusses issues with writing results in the CGNS (CFD General Notation System) format. The engineer explores different approaches, including writing the CGNS skeleton on the master processor and distributing data writing on multiple nodes. However, there are concerns about the slowdown when increasing time steps, possibly due to opening and closing the HDF file at each time step. The discussion also touches on options like subfiling and using the core file driver to eliminate the file system component. Additionally, there's a brief inquiry about compressing large strings in HDF5. See this forum post for details. Overall, the episode addresses technical challenges in parallel file writing and optimization strategies. This session happened on January 23, 2024. You can also watch this episode online.
In this episode of "Call the Doctor," Aleksandar Jelenak delves into recent work on analyzing Earth science HDF5 files, comparing their original versions with cloud-optimized versions. Using data from the Global Ecosystem Dynamics Investigation Instrument (GEDI) on the International Space Station, he showcases the instrument's laser beams mapping Earth's vegetation. Aleksandar discusses the challenges of optimizing HDF5 files for cloud usage, emphasizing the need for user-friendly data for scientists worldwide. He presents a detailed analysis of chunked datasets, storage settings, and statistics, highlighting the potential benefits of cloud-optimized files. The episode concludes with performance comparisons between original and cloud-optimized files, shedding light on the advantages of efficient data storage and access. This session was recorded on January 16, 2024. You can also watch it online.
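The kind of chunk statistics mentioned above can be approximated with simple arithmetic. The sketch below is a generic illustration, not the analysis from the session: the dataset shape, chunk shapes, and the chunk_stats helper are hypothetical, and sizes are uncompressed. It computes how many chunks a dataset is split into and how large each chunk is for two candidate chunk shapes.

```python
import math


def chunk_stats(shape, chunks, itemsize):
    """Return (number of chunks, uncompressed bytes per chunk) for a chunked dataset."""
    # Each dimension needs ceil(extent / chunk_extent) chunks to cover it.
    per_dim = [math.ceil(n / c) for n, c in zip(shape, chunks)]
    n_chunks = math.prod(per_dim)
    chunk_bytes = math.prod(chunks) * itemsize
    return n_chunks, chunk_bytes


# Hypothetical 1-D dataset of 10 million 4-byte values, with two chunk shapes.
small = chunk_stats((10_000_000,), (1_000,), 4)
large = chunk_stats((10_000_000,), (1_000_000,), 4)
print(small)
print(large)
```

With the smaller chunk shape the same data is spread over a thousand times more chunks, which generally means many more separate reads when the file sits in object storage; this is one reason chunk size tends to be a central setting when files are prepared for cloud access.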
In this episode of Call the Doctor, The HDF Group's John Readey discusses upcoming features in the HSDS release. Since John is in China, the session is pre-recorded, with software engineer Matt Larson available to answer questions. The discussed features include shape reduction, broadcasting, UTF-8 fixed strings, quick scan, N-bit and scale offset filters, enhanced array type support, field ops for compound types, support for long attribute names, non-UTF-8 encodable attributes, multi-op attributes, long link names, and hyper chunking. John also introduces the concepts of async tasks for long-running operations and using Parquet for encoding chunks with variable-length types. The design doc for async tasks can be viewed on GitHub. The episode includes a demonstration of the attribute multi-op feature, showcasing its efficiency compared to a serial approach. You can participate in the discussion of features for HSDS 0.9 on the forum. This session happened on January 9, 2024. You can also watch this episode.
While we're off for this week, please enjoy this audio recording from the 2023 HDF5 Users Group meeting held in Ohio. This session was presented by Aleksandar Jelenak, The HDF Group. Zarr is a fairly recent format for multidimensional data arrays specifically targeting storage systems with a key-value interface. Some scientific communities interested in implementing scalable cloud-native data analysis are considering Zarr as their chosen data format because of its straightforward implementation in cloud object stores. The HDF Group developed its own cloud-native HDF5 format, called HSDS schema, about the same time as Zarr. Currently, only software developed by The HDF Group, HSDS, creates data in the HSDS schema. Since both Zarr and HSDS schema share the same design approach, it would be worthwhile to consider whether Zarr could serve as the cloud-native HDF5 format. The Zarr version 3 specification currently in development introduces the concept of extensions as a way to add more storage features. The goal of this session is to discuss the pros and cons of using Zarr v3 to formulate a new cloud-native HDF5 format. Some technical information will be provided with the aim of opening up discussion among all attendees. If you'd like, you can watch this session online.
This is a bonus episode recorded live in August 2023 at the HDF5 User Group meeting held in Ohio. Dana Robinson and Neil Fortner talked about some future work The HDF Group is planning, plus took questions, input, and discussion from the audience. Unfortunately, not every audience contribution will be easy to hear, but the majority of them in this recording are fairly audible. Sorry about that! You can watch this session online or access the slide deck.
While we're off for the holidays, enjoy this audio recording of the session "State of HDF5 and New Features" from Dana Robinson (Director of Software Engineering) and Neil Fortner (Chief HDF5 Software Architect), presented at the August 2023 HDF5 User Group meeting held in Ohio. Dana talked about some changes being made: the HDF5 Working Group meetings, the Sustaining Engineer of the Week, centering development on GitHub and the need for external Codeowners, a new process for change management, and some of the HDF5 issues and development work we plan to focus on. Neil talked about new features being added to HDF5: Multi Dataset I/O, Selection and Vector I/O, and the Subfiling VFD. Note: At the end, when we took questions from the audience, you'll find the audio is not that great. We apologize for that. Later sessions recorded during this event were better. You can watch the recording of this session on YouTube, and also access Dana's slide deck and Neil's slide deck if the audio experience isn't doing it for you.
The HDF Group’s Aleksandar Jelenak talked about the conda, mamba, and micromamba package managers. These package managers are frequently used in the data science communities. Aleksandar talked about the differences between these package managers and how to get started using them. You can also watch this episode online.
In this episode of "Call the Doctor," The HDF Group's John Readey explores the functionality of linked data sets in HSDS (Highly Scalable Data Service). Using a Python notebook running on AWS, he walks through examples using data from the National Renewable Energy Lab, which has substantial HDF5 and HSDS data freely accessible. John covers various aspects, including domain information, data set details, and how to read and analyze chunks. He delves into the specifics of the chunk layout, discussing file URIs, offsets, and sizes. Comparisons between HSDS and direct S3 access using the HDF5 library reveal differences in performance due to the sequential nature of the HDF5 library's requests. John concludes by demonstrating a new feature for querying specific data sets using hsls. You can also watch this session online.