
Storage Developer Conference

Author: SNIA Technical Council


Description

Every week the Storage Developer Conference (SDC) podcast presents important technical topics to the Storage Developer community. Each episode is hand selected by the SNIA Technical Council from the presentations at our annual Storage Developer Conference. The link to the slides is available in the show notes at www.snia.org/podcasts.
202 Episodes
This presentation provides an overview of the NVM Express® ratified technical proposal TP4146 Flexible Data Placement and shows how a host can manage its user data to capitalize on a lower Write Amplification Factor (WAF) by an SSD to extend the life of the device, improve performance, and lower latency. Mike Allison, the lead author of the technical proposal, will cover: a) The new terms associated with FDP (e.g., Placement Identifier, Reclaim Unit, Reclaim Unit Handle, Reclaim Groups, etc.); b) Enabling the FDP capability; c) Managing the Reclaim Unit Handles during namespace creation; d) How I/O writes place user data into a Reclaim Unit; e) The differences between FDP and other placement capabilities supported by NVM Express. Learning Objectives 1) Obtain an understanding of the architecture of the NVM Express Flexible Data Placement (FDP) capability; 2) Learn how to issue I/O write commands to place user data and avoid SSD garbage collection; 3) Understand the differences between FDP, ZNS, and Streams.
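To make the placement terms concrete, here is a minimal Python sketch (not from the talk or the specification) of the idea: a placement identifier in each write selects a Reclaim Unit Handle, so data with similar lifetimes fills the same Reclaim Unit and can later be invalidated together instead of triggering garbage collection. The class names and the reclaim unit size are illustrative assumptions.

```python
# Hypothetical model of FDP placement concepts; sizes and names are
# illustrative, not taken from TP4146.
from dataclasses import dataclass, field

RU_SIZE = 4  # reclaim unit capacity in blocks (illustrative)

@dataclass
class ReclaimUnit:
    blocks: list = field(default_factory=list)

    def is_full(self) -> bool:
        return len(self.blocks) >= RU_SIZE

@dataclass
class ReclaimUnitHandle:
    """Points at the reclaim unit currently receiving writes."""
    current: ReclaimUnit = field(default_factory=ReclaimUnit)
    sealed: list = field(default_factory=list)

    def write(self, lba: int) -> None:
        if self.current.is_full():
            self.sealed.append(self.current)  # sealed RU holds same-lifetime data
            self.current = ReclaimUnit()
        self.current.blocks.append(lba)

class FdpSsd:
    def __init__(self, num_handles: int = 4):
        self.handles = [ReclaimUnitHandle() for _ in range(num_handles)]

    def write(self, lba: int, placement_id: int = 0) -> None:
        # The placement identifier carried in the I/O write selects a handle,
        # so hot and cold data never share a reclaim unit -- lowering WAF.
        self.handles[placement_id].write(lba)

ssd = FdpSsd()
for lba in range(8):
    ssd.write(lba, placement_id=0)   # short-lived data
for lba in range(100, 104):
    ssd.write(lba, placement_id=1)   # long-lived data
print(len(ssd.handles[0].sealed), "reclaim unit(s) sealed for placement 0")
```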
SSDs that support Zoned Namespaces (ZNS) are increasingly popular for large-scale storage deployments due to their cost efficiency and performance improvements over conventional SSDs, which include a 3-4x throughput increase and prolonged SSD lifetime, as well as making QLC media available to I/O-heavy workloads. As the zoned storage hardware ecosystem has matured, its open-source software ecosystem has also grown. As a result, we are now entering a new stage that provides a solid foundation for large-scale cloud adoption. This talk describes SSDs that support Zoned Namespaces, the work of SNIA's Zoned Storage TWG to standardize ZNS SSD device models, and the quickly evolving software ecosystem across filesystems (f2fs, btrfs, SSDFS), database systems (RocksDB, TerarkDB, MySQL), and cloud orchestration platforms (Openstack and Kubernetes with Mayastor, Longhorn, SPDK's CSAL). Learning Objectives 1) ZNS SSDs; 2) Emerging hardware & software ecosystem for zoned storage; 3) Zoned storage cloud adoption.
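The device model behind those gains is the zone write pointer: writes within a zone must be sequential, and space is reclaimed by resetting whole zones rather than by device-side garbage collection. A minimal sketch of that constraint (zone sizes and state names are simplified, not the spec's):

```python
# Toy model of the ZNS sequential-write rule; real devices expose this
# through NVMe ZNS commands (zone append, zone reset, etc.).
class Zone:
    def __init__(self, start_lba: int, size: int):
        self.start = start_lba
        self.size = size
        self.write_pointer = start_lba  # next LBA that may be written
        self.state = "EMPTY"

    def write(self, lba: int, nblocks: int) -> None:
        # A write anywhere but the write pointer would be rejected.
        if lba != self.write_pointer:
            raise IOError(f"unaligned write at {lba}, wp={self.write_pointer}")
        if self.write_pointer + nblocks > self.start + self.size:
            raise IOError("write exceeds zone capacity")
        self.write_pointer += nblocks
        full = self.write_pointer == self.start + self.size
        self.state = "FULL" if full else "OPEN"

    def reset(self) -> None:
        # Whole-zone reset replaces per-block garbage collection.
        self.write_pointer = self.start
        self.state = "EMPTY"

zone = Zone(start_lba=0, size=16)
zone.write(0, 8)
zone.write(8, 8)
print(zone.state)  # FULL
zone.reset()
```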
The IEEE Security In Storage Work Group (SISWG) produces standards that many storage developers, storage vendors, and storage system operators care about, including: a) A family of standards on sanitization: the IEEE 2883 family; b) A family of standards on encryption methods for storage components: the IEEE 1619 family; c) A standard on Discovery, Authentication, and Authorization in Host Attachments of Storage Devices: the IEEE 1667 specification. IEEE has a different work group (IEEE P3172) focusing on post-quantum cryptography, but when its work is done, a family of standards recommending post-quantum encryption methods for various storage types (e.g., block, stream) may be appropriate for SISWG’s IEEE 1619 family. IEEE has another work group focusing on Zero Trust Security (ZTS, IEEE P2887); however, an application of those principles to storage devices and systems is also within the purview of the IEEE SISWG. Learning Objectives 1) Understand the scope of standards developed by the IEEE SISWG; 2) Understand the relevance of SISWG standards to the listener's business; 3) Understand how to participate in the IEEE SISWG.
The introduction of CXL has significantly advanced the enablement of memory disaggregation. Along with disaggregation has risen the need for reliable and effective ways to transparently tier data in real time between local direct-attached CPU memory and CXL pooled memory. While the CXL hardware-level elements have advanced in definition, the OS-level support, drivers, and application APIs that facilitate mass adoption are still very much under development and in the discovery phase. Even though memory tiering presents new challenges, we can learn a great deal from the evolution of storage from direct attached to storage area networks, software-defined storage, and early disaggregated/composable storage solutions such as NVMe over Fabrics. Presented from the viewpoint of a real-time block storage tiering architect with products deployed in more than 1 million PCs and servers.
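As a sketch of what "transparently tiering data in real time" means in practice, here is a toy promote/demote policy in Python. The access counter and capacity threshold are assumptions for illustration; a real OS-level tierer would rely on signals such as page faults or access-bit scans.

```python
# Toy two-tier placement policy: keep the hottest pages in local DRAM,
# demote the rest to CXL-attached memory. Purely illustrative.
from collections import Counter

DRAM_CAPACITY = 2        # pages that fit in the fast tier (tiny on purpose)
accesses = Counter()     # per-page access counts (our stand-in hotness signal)
dram, cxl = set(), set()

def touch(page: str) -> None:
    accesses[page] += 1
    if page not in dram:
        cxl.add(page)    # new/cold pages start in the far tier
    rebalance()

def rebalance() -> None:
    hottest = {p for p, _ in accesses.most_common(DRAM_CAPACITY)}
    for p in list(dram - hottest):
        dram.remove(p); cxl.add(p)     # demote cooled-off pages
    for p in hottest:
        cxl.discard(p); dram.add(p)    # promote hot pages

for p in ["a", "b", "a", "c", "a", "b"]:
    touch(p)
print("DRAM:", sorted(dram), "CXL:", sorted(cxl))  # DRAM: ['a', 'b'] CXL: ['c']
```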
It’s been a year since the announcement that Intel would “Wind Down” its Optane 3D XPoint memories. Has anything risen to take its place? Should it? This presentation reviews the alternatives to Optane that are now available or are in development, and evaluates the likelihood that one or more of these could fill the void that is being left behind. We will also briefly review the legacy Optane left behind to see how that legacy is likely to be used to support persistent memories in more diverse applications, including cache memory chiplets. Along the way we’ll show how Optane not only spawned new thinking on software, as embodied in the SNIA Nonvolatile Memory Programming Model, but also drove the creation of new communication protocols, particularly CXL and UCIe. Learning Objectives: 1) Understand the growing role of emerging memory technologies in future processors; 2) Learn how Persistence, NUMA, and Chiplets have blossomed in Optane's wake; 3) See how SNIA's NVM Programming Model will support tomorrow's software, even though Optane won't be using it.
One of the goals of HPE’s Spaceborne Computer program is proving the value of edge computing. Spaceborne Computer-1 (SBC-1) was launched in August of 2017 with the latest available COTS (Commercial off the Shelf) hardware, including twenty solid state disks (SSDs) for storage. The disappointing durability of those SSDs will be covered; the Failure Analysis (FA) performed upon Return To Earth (RTE) will be presented, and the mitigations made in Spaceborne Computer-2 will be detailed. HPE’s Spaceborne Computer-2 (SBC-2) launched in February of 2021 with over 6 TB of internal SSD storage. Onboard storage of ISS-generated “raw” data is critical to proving the value of edge computing, to delivering results and insights to scientists and researchers faster, and to enabling deeper exploration of the cosmos. The storage design will be summarized, including the concept of operations (ConOps) for backup and disaster recovery. Several successful SBC-2 edge computing experiments will be reviewed, all of which demonstrate a reduction in download size to Earth of at least 95%. Additionally, Spaceborne Computer-2 has access to the ISS Payloads Network Attached Storage (PL-NAS). The PL-NAS is a NASA file server with five hard drive bays that allows onboard ISS systems to access a shared folder location. A summary of the decision to opt for SSDs on HPE’s Spaceborne Computer instead of traditional hard drives will be presented. SBC-2 exploitation of the PL-NAS for EVA safety operations will be detailed. Finally, the market for anticipated edge services is being better defined, and concepts will be presented, all of which require stable and reliable storage at the edge. Learning Objectives: 1) Storage considerations for space-based storage; 2) Value of capable "raw" storage to support edge computing, AI/ML and HPC; 3) Lessons learned from storage experiments in space; 4) Failure rates of SSDs in Space; 5) Future space-based services requiring storage.
Azure Disks provide block storage for Azure Virtual Machines and are a core pillar of the Azure IaaS platform. In this talk, we will provide an overview of Direct Drive - Azure's next-generation block storage architecture. Direct Drive forms the foundation for a new family of Azure disk offerings, starting with Ultra Disk (Azure's highest performance disks). We will describe the challenges of providing durable, highly-available, high-performance disks at cloud scale as well as the software and hardware innovations that allow us to overcome these challenges.
For the past three decades, PCI-SIG® has delivered a succession of industry-leading PCI Express® (PCIe®) specifications that remain ahead of the increasing demand for a high-bandwidth, low-latency interconnect for compute-intensive systems in diverse market segments, including data centers, Artificial Intelligence and Machine Learning (AI/ML), high-performance computing (HPC) and storage applications. In early 2022, PCI-SIG released the PCIe 6.0 specification to members, doubling the data rate of the PCIe 5.0 specification to 64 GT/s (up to 256 GB/s for a x16 configuration). To achieve high data transfer rates with low latency, PCIe 6.0 technology adds innovative new features like Pulse Amplitude Modulation with 4 levels (PAM4) signaling, low-latency Forward Error Correction (FEC) and Flit-based encoding. PCIe 6.0 technology is an optimal solution to meet the demands of Artificial Intelligence and Machine Learning applications, which often require high data bandwidth, low latency transport channels. This presentation will explore the benefits of PCIe 6.0 architecture for storage and AI/ML workloads and its impact on next-generation cloud data centers. Attendees will also learn about the potential AI/ML use cases for PCIe 6.0 technology. Finally, the presentation will provide a preview of what is coming next for PCIe specifications.
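The headline rates are easy to sanity-check. One transfer carries one bit per lane, so a x16 link at 64 GT/s moves 128 GB/s in each direction, or 256 GB/s combined, before Flit framing and FEC overhead:

```python
# Raw PCIe bandwidth arithmetic (signaling rate only; protocol overhead
# such as Flit framing and FEC reduces the usable figure).
def pcie_bandwidth_gbytes(gt_per_s: float, lanes: int, duplex: bool = True) -> float:
    per_direction = gt_per_s * lanes / 8       # 1 bit per transfer -> bytes
    return per_direction * (2 if duplex else 1)

print(pcie_bandwidth_gbytes(32, 16))  # PCIe 5.0 x16 -> 128.0 GB/s combined
print(pcie_bandwidth_gbytes(64, 16))  # PCIe 6.0 x16 -> 256.0 GB/s, as cited
```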
Los Alamos is working to revolutionize how scientific data management is done, moving from large petabyte-sized files generated periodically by extreme-scale simulations to a record- and column-based approach. Along the journey, the NVMe Computational Storage efforts became a strategic way to help accomplish this revolution. Los Alamos has been working on a series of proof of concepts with a set of data storage industry partners, and these partnerships have proven to be the key to success. This talk will summarize the three proof-of-concept applications of Computational Storage and some of the industry partnership projects that have helped pave the way for LANL, and hopefully the industry, toward new approaches to large-scale data management.
Computational Storage is a new field that is addressing performance and scaling issues for compute with traditional server architectures. This is an active area of innovation in the industry where multiple device and solution providers are collaborating in defining this architecture while actively working to create new and exciting solutions. The SNIA Computational Storage TWG is leading the way with new interface definitions with Computational Storage APIs that work across different hardware architectures. Learn how these APIs may be applied and what types of problems they can help solve.
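The SNIA APIs themselves are defined in C, but the offload pattern they abstract is easy to sketch. In this hypothetical Python model (every name here is invented for illustration, not the TWG's interface), data is written to the device, a compute function runs next to it, and only the small result crosses the bus:

```python
# Hypothetical computational-storage offload pattern; not the SNIA API.
class ComputationalDevice:
    def __init__(self):
        self.storage = {}  # block address -> bytes
        self.functions = {
            "count_matches": lambda data, needle: data.count(needle),
        }

    def write_block(self, addr: int, data: bytes) -> None:
        self.storage[addr] = data

    def execute(self, func_name: str, addr: int, *args):
        # "Runs on the device": only the result is returned to the host.
        return self.functions[func_name](self.storage[addr], *args)

dev = ComputationalDevice()
dev.write_block(0, b"error ok error ok error")
print(dev.execute("count_matches", 0, b"error"))  # 3 -- only a few bytes moved
```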
Synthetic DNA-based data storage is attracting major interest due to the possibility of storing data over very long periods. This technology is a potential solution for current data centers, reducing energy consumption and physical storage space. The quantity of data generated has been growing exponentially, driven by new technologies and globalization, while storage capacity does not keep up with that growth.
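The basic encoding idea is simple to demonstrate: with four nucleotides, each base can carry two bits. The sketch below shows the textbook mapping; production codecs add error correction and avoid problematic sequences such as long homopolymer runs.

```python
# Textbook 2-bits-per-base DNA encoding (illustrative; real codecs are
# more elaborate to survive synthesis and sequencing errors).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {v: k for k, v in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i+2]] for i in range(0, len(bits), 2))

def decode(strand: str) -> bytes:
    bits = "".join(BASE_TO_BITS[b] for b in strand)
    return bytes(int(bits[i:i+8], 2) for i in range(0, len(bits), 8))

strand = encode(b"SDC")
print(strand)                  # 12 bases encode the 3 input bytes
assert decode(strand) == b"SDC"
```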
Data persistence on CXL is an essential enabler toward the goal of instant-on processing. DRAM class performance combined with non-volatility on CXL enables a new class of computing architectures that can exploit these features and solve real-world bottlenecks for system performance, data reliability, and recovery from power failures. New authentication methods also enhance the security of server data in a world of cyberattacks.
Large-scale data analytics, machine learning, and big data applications often require the storage of a massive amount of data. For cost-effective high bandwidth, many data centers have used tiered storage with warmer tiers made of flash or persistent memory modules and cooler tiers provisioned with high-density rotational drives. While ultra-fast data insertion and retrieval rates have been increasingly demonstrated by research communities and industry at warm storage, complex queries with predicates on multiple columns tend to still experience excessive delays when unordered, unindexed (or potentially only lightly indexed) data written in log-structured formats for high write bandwidth is subsequently read for ad-hoc analysis at row level. Queries run slowly because an entire dataset may have to be scanned in the absence of a full set of indexes on all columns. In the worst case, significant delays are experienced even when data is read from warm storage. A user sees even higher delays when data must be streamed from cool storage before analysis takes place. In this presentation, we present C2, a research collaboration between Seagate and Los Alamos National Lab (LANL) for the lab's next-generation campaign storage. Campaign is a scalable cool storage tier at LANL, managed by MarFS, that currently provides 60 PB of storage space for longer-term data storage. Cost-effective data protection is done through multi-level erasure coding at both node level and rack level. To prevent users from always having to read back all data for complex queries, C2 enables direct data analytics at the storage layer by leveraging Seagate Kinetic Drives to asynchronously add indexes to data at the per-drive level after data lands on the drives. Asynchronously constructed indexes cover all data columns and are read at query time by the drives to drastically reduce the amount of data that needs to be sent back to the querying client for result aggregation. Combining computational storage technologies with erasure-coding-based data protection schemes for rapid data analytics over cool storage presents unique challenges, in which individual drives may not be able to see complete data records and may not deliver the performance required by high-level data insertion, access, and protection workflows. We discuss those challenges in the talk, share our designs, and report early results.
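A toy version of the idea makes the data path clearer: records land on the drive in cheap, unordered log form; the drive builds column indexes asynchronously afterwards; and queries then return only matching records instead of the whole log. The structures below are simplified assumptions, not Seagate or LANL code.

```python
# Sketch of per-drive asynchronous indexing for query pushdown.
from collections import defaultdict

class Drive:
    def __init__(self):
        self.log = []                   # records stored in arrival order
        self.index = defaultdict(list)  # (column, value) -> log offsets

    def append(self, record: dict) -> None:
        self.log.append(record)         # ingest fast path: no index work

    def build_index(self, column: str) -> None:
        # Runs on the drive after data has landed, off the write path.
        for off, rec in enumerate(self.log):
            self.index[(column, rec[column])].append(off)

    def query(self, column: str, value) -> list:
        # Only matching records leave the drive for client-side aggregation.
        return [self.log[off] for off in self.index.get((column, value), [])]

drive = Drive()
for i in range(1000):
    drive.append({"id": i, "temp": i % 7})
drive.build_index("temp")
print(len(drive.query("temp", 3)))  # 143 records shipped back, not 1000
```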
Azure Block Storage, also referred to as Azure Disks, is the persistent block storage for Azure Virtual Machines and a core pillar of Azure IaaS infrastructure. Azure offers unique block storage capabilities that differentiate it from other cloud block storage offerings. In this talk, we will use a few of these capabilities as examples to reveal the technical designs behind them and how they are tied to our XStore storage architecture. Starting with fast restore from snapshot, we will share the CoR technology built to orchestrate instant Disk recovery from snapshots stored in different storage media. In addition, we will highlight how multi-protocol support is enabled on block storage for SCSI and REST access leveraging our 3-layer XStore architecture. We will conclude with recent enhancements to the XStore architecture and upcoming innovations.
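Assuming CoR stands for copy-on-read, the general pattern behind fast restore is worth sketching: the disk becomes usable immediately, and blocks are faulted in from the snapshot store on first access, then served locally thereafter. This illustrates only the generic technique; Azure's actual design is not described in this abstract.

```python
# Generic copy-on-read restore pattern (illustrative, not Azure's design).
class Snapshot:
    def __init__(self, blocks: list):
        self.blocks = blocks            # stands in for the remote snapshot store

    def read(self, i: int) -> bytes:
        return self.blocks[i]           # slow, remote read

class RestoredDisk:
    def __init__(self, snapshot: Snapshot, nblocks: int):
        self.snapshot = snapshot
        self.local = [None] * nblocks   # sparse local disk, starts empty

    def read(self, i: int) -> bytes:
        if self.local[i] is None:       # miss: copy the block on first read
            self.local[i] = self.snapshot.read(i)
        return self.local[i]            # subsequent reads are local

snap = Snapshot([b"block%d" % i for i in range(4)])
disk = RestoredDisk(snap, 4)            # "instant" restore: no bulk copy
print(disk.read(2))                     # faulted in from the snapshot
print(disk.read(2))                     # served from the local disk
```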
If you haven’t caught the new wave in storage management, it’s time to dive in. This presentation provides a broad look at the Redfish and Swordfish ReSTful hierarchies, maps these to some common applications, and provides an overview of the Swordfish tools and documentation ecosystem developed by SNIA’s Scalable Storage Management Technical Work Group (SSM TWG) and the Redfish Forum. It will also provide an overview of what’s new in ’22, including enhancements to NVMe support, storage fabric management, and capacity and performance metric management.
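Because both hierarchies are plain ReSTful HTTP, exploring them needs nothing more than GET requests. Here is a sketch of walking from the service root down to volumes; the service root URI is standard, but the host, credentials, and the presence of each collection below are placeholders that vary by implementation:

```python
# Walking a Redfish/Swordfish hierarchy with plain HTTP GETs.
# Host, credentials, and available collections are placeholders.
import requests

BASE = "https://redfish.example.com"   # placeholder service endpoint
AUTH = ("admin", "password")           # placeholder credentials

def get(path: str) -> dict:
    return requests.get(BASE + path, auth=AUTH, verify=False).json()

root = get("/redfish/v1/")             # standard Redfish service root
print(root.get("Name"))

# Follow the Storage collection (if the service exposes one) to its volumes.
storage = get(root["Storage"]["@odata.id"])
for member in storage.get("Members", []):
    subsystem = get(member["@odata.id"])
    volumes = get(subsystem["Volumes"]["@odata.id"])
    print(subsystem.get("Id"), [v["@odata.id"] for v in volumes.get("Members", [])])
```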
With Amazon S3 celebrating its sixteenth birthday this year, it's easy to forget just how revolutionary it was at its release. S3's buckets and objects were profoundly different from the directories and files that developers had been manipulating through filesystem APIs. What drove this innovation, and how does cloud object storage actually work? In this session, Pat Patterson, Chief Developer Evangelist at Backblaze, will trace the evolution of cloud object storage, explain the trade-offs in implementing secure, reliable, scalable online data storage, and give a detailed technical explanation of Backblaze B2 Cloud Storage’s implementation.
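A quick illustration of how uniform the object model has become: the same S3-style calls work against Backblaze B2 through its S3-compatible endpoint. The bucket name, region, and keys below are placeholders.

```python
# S3-compatible access to Backblaze B2 with boto3 (placeholders throughout).
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.us-west-004.backblazeb2.com",  # region varies
    aws_access_key_id="<application-key-id>",
    aws_secret_access_key="<application-key>",
)

# Objects are named blobs in a flat bucket namespace: there are no real
# directories, only keys that may contain "/" for listing convenience.
s3.put_object(Bucket="my-bucket", Key="logs/2022/09/12.txt", Body=b"hello")
obj = s3.get_object(Bucket="my-bucket", Key="logs/2022/09/12.txt")
print(obj["Body"].read())
```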
The volume of data stored continues to grow at about 40% annually. This trend now exceeds the device capacity growth rate of all existing commercial-scale media types, including HDD, Flash, Tape, and Optical, and the gap between growth rates is about 20%. That implies that the datacenter footprint for storage will approximately double every 3.5 years just to keep up. However, the roadmaps for ongoing density improvement make the situation much starker. Past 2030, growth in device capacities may slow substantially, leading to a need for 40x or more datacenter space and power by 2040 to keep up with data growth. Although the methods we use to store data by magnetizing materials or corralling electrons are true technological wonders, it is becoming apparent that if we don't want to impinge on data growth we may need a substantial paradigm shift in storage technology. Molecular storage is the panacea of storage density, and DNA is the leading contender, but we will also need to invest in technologies that allow for high-speed molecular storage. This is going to be a heavy lift, but if we want to intercept the coming storage capacity crunch we need to start work now.
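The compounding arithmetic behind those figures is easy to reproduce: a 20% annual gap between data growth and density growth doubles the footprint roughly every 3.5-4 years, and if density improvement stalls entirely the footprint tracks the full 40% data growth rate, roughly a 30x increase per decade:

```python
# Compounding check of the growth-gap arithmetic above.
import math

gap = 0.20                                # footprint growth with today's gap
print(math.log(2) / math.log(1 + gap))    # ~3.8 years per footprint doubling

stalled = 0.40                            # density stalls: footprint tracks
print((1 + stalled) ** 10)                # full data growth: ~28.9x per decade
```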
SMB3 has seen significant adoption as the storage protocol of choice for running private cloud deployments. In this iteration of the talk, we’ll update the audience on SMB protocol changes as well as improvements to the Windows implementation of the SMB server and client. Added to the SMB protocol is a new server-to-client notification mechanism, which enables a variety of novel use cases. We’ll present the details of the protocol messaging (new message types, etc.) as well as the one scenario that leverages this new mechanism (server-triggered graceful session closure). In addition to the protocol changes, we’ll provide an overview of the latest feature additions to the Windows SMB server and client: authentication rate limiting to protect against password spray attacks, and upcoming improvements to SMB over QUIC.
DNA data storage will dramatically affect the way organizations think about data retention, data protection, and archival by providing capacity density and longevity several orders of magnitude beyond anything available today, while reducing requirements for power, cooling, and fixity checks. One of the challenges of any long-term archival storage is being able to recover the data after possibly decades or longer. To do this, the reader must be able to bootstrap the archive, akin to how an OS is loaded after the master boot record is loaded. This talk will describe our initial work to define a standard schema for a self-describing DNA data archive sector zero, which will be as generic as possible, exploiting the format immutability of the natural DNA molecule to assure the archive can be bootstrapped by sequencers decades in the future, all while enabling archive writers to continue innovating in how the balance of the archive is synthesized. We call this the “DNA Rosetta Stone” project.
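To make "sector zero" concrete, here is a hypothetical shape such a self-describing header might take. Every field name here is invented for illustration; defining the real schema is exactly what the project described above is working to standardize.

```python
# Hypothetical self-describing sector zero for a DNA archive (illustrative).
import json

sector_zero = {
    "magic": "DNA-ARCHIVE",            # fixed marker a future reader can detect
    "schema_version": "0.1",
    "codec": {                         # how the rest of the archive is encoded
        "name": "example-2bit",
        "error_correction": "reed-solomon",
        "parameters": {"symbol_bits": 8, "parity_symbols": 32},
    },
    "payload": {"strand_count": 1_000_000, "index_offset": 0},
}

# Sector zero itself must use only a simple, immutable encoding so that a
# sequencer decades from now can bootstrap the rest of the archive from it.
print(json.dumps(sector_zero, indent=2))
```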
The industry needs a new storage medium that is more dense, durable, sustainable, and cost-effective to cope with the expected future growth of archival data. DNA, nature’s data storage medium, enters this picture at a time when synthesis and sequencing technologies for advanced medical and scientific applications are enabling the manipulation of synthetic DNA in ways previously unimagined. This session will provide an overview of why DNA data storage is compelling and what the DNA Data Storage Alliance is doing to help build an interoperable DNA data storage ecosystem.