The way that DASD (direct access storage device) data is stored on z/OS is very different from distributed/open systems. An understanding of just how z/OS datasets are created, located, accessed, and deleted is as important to end-users as it is to storage administrators and system programmers.
In our October 2021 educational webinar, DTS CTO Steve Pryor discussed this important topic, and how the concepts and structures invented in the 1960s have evolved into today’s high-availability, high-performance, high-reliability disk storage subsystems.
To the more seasoned Big Iron mainframers (and by seasoned we mean 20 years or more), the information covered might seem familiar. But much of this information is no longer widely disseminated, so while it may be a good review for some, for many the concepts of z/OS disk metadata and catalogs, presented by an authority in storage management, are high-value information.
The z/OS Storage Hierarchy
At the top of the food chain is an extensive catalog that keeps track of every dataset in every volume in the system. Each volume contains a table of contents (VTOC) just like the table of contents in the front of a book. The VTOC contains information about the locations of all the datasets, as well as what space on a particular volume is used and not used.
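The two-level lookup described above can be pictured with a toy model. This is a minimal sketch only: the dataset name, volume serial, and extent values are hypothetical, and plain dictionaries stand in for the real catalog and VTOC structures, which are far richer than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """One contiguous run of tracks on a volume (illustrative only)."""
    start_track: int
    track_count: int


# Hypothetical catalog: dataset name -> volume serial that holds it.
catalog = {"PAYROLL.MASTER.DATA": "VOL001"}

# Hypothetical VTOCs: one table per volume, dataset name -> its extents.
vtocs = {
    "VOL001": {
        "PAYROLL.MASTER.DATA": [Extent(150, 30), Extent(900, 15)],
    },
}


def locate(dsname):
    """Ask the catalog for the volume, then that volume's VTOC for the extents."""
    volser = catalog[dsname]
    extents = vtocs[volser][dsname]
    return volser, extents


volser, extents = locate("PAYROLL.MASTER.DATA")
total_tracks = sum(e.track_count for e in extents)
print(volser, extents, total_tracks)
```

The point of the sketch is the division of labor: the catalog answers "which volume?", and the VTOC on that volume answers "where on the volume?".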
The SMS (system-managed storage) subsystem handles all the information about the allocation needs of the dataset: physical characteristics, logical record length, performance needs, etc., and decides, given all of its characteristics, on which DASD volume to place the dataset. SMS is important to every aspect of the dataset life cycle, beginning with allocation and volume placement and extending to dataset usage, archiving, recall, and eventual expiration.
Naming conventions, which are very specific in z/OS and are critical for managing the large volumes of data common in z/OS systems, are covered in-depth during the webinar. Slides with examples of Mapping, Dataset Extents, and much more were included in the presentation.
The Evolution of z/OS Storage Systems
How did all of this evolve from the systems of old? With the development of larger, faster CPUs in the late 1970s / early 1980s came the need for expanded storage capacity.
Traditional disk drives, made up of a stack of platters in a cylinder with 56,664 bytes per track and 15 tracks per cylinder, existed well into the 1990s. Traditional drives were eventually replaced by RAID (redundant array of independent disks, originally “inexpensive disks”) storage, which was introduced in a famous 1988 SIGMOD paper by David Patterson, Garth Gibson, and Randy Katz. RAID simply spreads storage blocks across many different drives and lets a controller do the work of knitting them together. Not only does this make data access faster, but the parity blocks stored alongside the data also keep it accessible if a disk within the array suffers a hardware failure. Even with the advent of more modern storage systems, z/OS still operates its storage mapping as it did pre-1980, ensuring the absolute compatibility and continuing value of software investments that are the hallmark of mainframe systems.
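To see why parity lets an array survive the loss of one disk, here is a minimal sketch of XOR parity, the scheme used by the parity-based RAID levels. The block contents and function names are hypothetical, and a real controller works on much larger blocks and rotates parity across drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the one lost block by XOR-ing the survivors with the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three hypothetical data blocks, each living on a different drive.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)  # stored on a fourth drive

# Pretend the drive holding data[1] has failed.
recovered = rebuild_missing([data[0], data[2]], parity)
print(recovered == data[1])
```

Because XOR is its own inverse, combining the parity block with every surviving block cancels them out and leaves exactly the missing one.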
Carl Sagan said in 1980 that “You have to know the past to understand the present.” As installations grow ever larger and more complex, an understanding of how these critical legacy systems developed, and how they still operate, can inform today’s extensive data modernization initiatives.
Understanding z/OS Disk Metadata: Catalogs, VTOCs, VVDSs, Indexes, and More! is a 60-minute informative, educational look at a historic topic in the mainframe space. If you weren’t able to attend, you can view it on-demand and download a copy of the slide deck used in the presentation by using this link.
Unlike Partitioned data sets, which are useful if you have a bunch of different types of data that you want to keep all in a single data set, or VSAM data sets, which are useful for more sophisticated types of access by key or by relative record, sequential data sets are the simplest form of data set. Not surprisingly, they’re useful if you want to read the data in sequence, and when we specify the data set organization in the JCL or define the data set under ISPF, we specify a data set organization or DSORG equal to PS (or occasionally, PSU, or DA).
Sequential data sets can be one of three types, BASIC, LARGE, or Extended Format. If you don’t specify anything out of the ordinary, you get a Basic format data set, which is simply a collection of extents on a disk that is being pointed to by the Volume Table of Contents (VTOC) and the VSAM Volume Data Set (VVDS). Because it’s ordinary sequential data, you can use the Queued or Basic Sequential Access Methods (QSAM or BSAM) to write to it, but it has to be written sequentially — before you write record two, you must write record one.
Sequential data sets are original to z/OS (or rather to its predecessors, going back to the 60s), and thus have some limitations. Basic data sets can have no more than 16 extents per volume and a maximum of 65,535 tracks per volume. If you need to exceed that number, you can create a large format sequential data set by specifying the DSNTYPE=LARGE parameter. A large format data set is still limited to 16 extents per volume, but you can have up to 16,777,215 tracks per volume.
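To put the track limits above in perspective, the arithmetic can be sketched as follows. It assumes the traditional geometry of 56,664 bytes per track and 15 tracks per cylinder, and it computes raw track capacity only; the usable space depends on block size and is lower in practice.

```python
# Assumed traditional track geometry (raw capacity, ignoring blocking overhead).
BYTES_PER_TRACK = 56_664
TRACKS_PER_CYLINDER = 15

# Track limits quoted for basic and large format sequential data sets.
BASIC_MAX_TRACKS = 65_535
LARGE_MAX_TRACKS = 16_777_215


def raw_bytes(tracks):
    """Raw byte capacity of a given number of tracks."""
    return tracks * BYTES_PER_TRACK


def whole_cylinders(tracks):
    """Number of complete cylinders covered by a given number of tracks."""
    return tracks // TRACKS_PER_CYLINDER


basic_bytes = raw_bytes(BASIC_MAX_TRACKS)   # 3,713,475,240 bytes, roughly 3.46 GiB
large_bytes = raw_bytes(LARGE_MAX_TRACKS)   # 950,664,110,760 bytes, roughly 885 GiB

for label, tracks in (("basic", BASIC_MAX_TRACKS), ("large", LARGE_MAX_TRACKS)):
    gib = raw_bytes(tracks) / 2**30
    print(f"{label}: {tracks:,} tracks = {whole_cylinders(tracks):,} cylinders, about {gib:,.2f} GiB raw")
```

The 65,535 figure is simply the largest value that fits in a 16-bit field, and 16,777,215 is the largest that fits in 24 bits, which is why the two limits look the way they do.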
Because the limit of 16 extents is somewhat restrictive, you can use an extended-format sequential data set. Extended-format datasets are implicitly LARGE and can exceed 65,535 tracks, but more importantly, you can get up to 123 extents per volume, offering a few clear advantages. For one, an extended-format data set is much less likely to fail with an out-of-space error, and it can also be striped. With a single stripe, it’s just an ordinary sequential data set, but if it’s multi-striped, each volume can be read and written in parallel. For an application such as SMF data, where there’s a large amount of sequential data, it’s useful to stripe the data across multiple volumes so it can be read much more quickly.
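The idea behind striping can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a description of how the system implements it: the function names are hypothetical, and Python lists stand in for the volumes that hold each stripe.

```python
def stripe(blocks, stripe_count):
    """Deal blocks out round-robin across stripe_count stripes."""
    stripes = [[] for _ in range(stripe_count)]
    for index, block in enumerate(blocks):
        stripes[index % stripe_count].append(block)
    return stripes


def unstripe(stripes):
    """Interleave the stripes back into the original sequential order."""
    total = sum(len(s) for s in stripes)
    count = len(stripes)
    return [stripes[i % count][i // count] for i in range(total)]


records = [f"REC{n:03d}" for n in range(7)]
striped = stripe(records, 3)

# Each inner list could be read from its own volume at the same time.
for number, contents in enumerate(striped):
    print(f"stripe {number}: {contents}")

print(unstripe(striped) == records)
```

Since consecutive blocks land on different stripes, a sequential read can keep several volumes busy at once instead of waiting on one.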
Wealth of z/OS Webinar Training on DTSSoftware.com
This is a simple breakdown of sequential data sets, but DTS Software has a wealth of additional information on partitioned datasets, PDSEs, generation data groups, hierarchical file systems, and UNIX services on z/OS. To learn more, click here to view our August 2021 webinar on demand: “PDS, PDSE, GDG, HFS, USS and Other Strange z/OS Animals.” In this presentation, DTS Software CTO Steve Pryor takes a deep dive into the peculiarities of storage elements on z/OS to help you make sense of these often confounding creatures.
Mainframe storage has changed a lot over the years, and the amount of it available in terms of both memory and disk/tape has grown substantially. Although this gradual progression has over time liberated mainframers from many long-standing limitations, careful storage management has remained a central tenet of mainframe culture and a major differentiator on the platform.
As General and eventual President Eisenhower once said, “Plans are useless. Planning is indispensable.” By understanding current storage availability and outlining future needs, mainframers are able to make heavy and advanced use of the resources and features available to them, and they can do so without interruption. According to Simson Garfinkel and Gene Spafford, Ph.D., in their book, Practical Unix and Internet Security, a computing system is secure when you can count on it to behave in the expected manner. Storage is clearly important because it impacts availability on the mainframe, but it can also offer insights from a more conventional InfoSec standpoint.
Expanding the Event Horizon to Mainframe Storage
On most platforms, external security managers (ESMs) or internal security handlers are monitoring user accesses, failed logins, changing permissions, and other mainstream threat indicators. What they aren’t thinking about are certain events on the mainframe that could impact security from a storage management perspective, such as allocation, open/close, I/O, extents, and more.
At the same time, storage elements including generation data groups (GDGs), archives, and hierarchical storage management (HSM) products play a major role in overall functionality, which is why it’s crucial to have System Management Facilities (SMF) keeping records of salient happenings on the mainframe. Some of this record-keeping is automatic, but you also get to decide in some cases what’s salient. Those events might include logins, access to a secured resource, or anything that falls outside of normal, everyday activity. Relevant events in both SMF and other mainframe facilities will allow you to view security issues (hopefully in real time) and send alerts for remediation.
Storage is critical to security because it’s another vantage point from which to check whether the mainframe is operating in its “expected manner.” When storage events are given the care and attention they deserve, they can help inform security and reliability improvements that protect your organization’s most valuable IT asset.
For more information about how storage impacts security on the mainframe, check out DTS Software’s webinar Aggregation without Aggravation: When Putting More Log Data in Your SIEM Is a Good Thing. The presentation, which features DTS Software CTO Steve Pryor and Mainframe Analytics Ltd. Founder Reg Harbeck, is available on demand.