DTS Webinar Recap: We’ve Got a Problem — An Introduction to z/OS® Dumps and Debugging Tools

In large enterprises, the most complex and mission-critical business applications are entrusted to z/OS because of its unrivaled security and reliability. In any complex environment, however, unexpected errors and unplanned failures are bound to occur. When they do, there is an immediate need to understand the problem, find and fix the root cause, and prevent future errors. If you can apply automation to the remediation, you might qualify as a Mainframe Champion, as defined in the latest BMC Mainframe Survey.

Fortunately, z/OS programmers have access to a large set of debugging tools, including dumps, traces, log records, and more. The ability to leverage these tools, particularly system dumps, is an important part of a programmer’s job description and a daily workflow.

Like every operating system, z/OS has its own unique debugging facilities and its own way of performing debugging. This was the focus of our November educational webinar, presented by DTS CTO Steve Pryor.

Dumps and debugging is a vast subject, but some aspects are exclusive to z/OS, and that is where Pryor spends most of the hour-long session. In-depth examples can be found in the slide deck from the presentation, available for download here.

What Can an ABEND Tell You?

ABENDs (abnormal terminations) are of two types – User ABENDs (generated from an application or utility) and System ABENDs (caused by an error performing a system-related function). Debugging user abends requires an understanding of what the program is trying to do and what condition is indicated by the user abend, as specified in the program or utility documentation. System abends occur when a system function, such as obtaining virtual storage or other resources, fails, or when an instruction cannot be executed correctly. Typically, application programmers are called upon to resolve user abends, while system abends are addressed by the system programmer or storage administrator.

Most abends are accompanied by a formatted dump, placed on either a SYSUDUMP or SYSABEND dataset, or, for a larger system-related problem, a SYSMDUMP dataset. Many abends are related to supervisor call (SVC) instructions. In these cases, the last two digits of the abend code identify the SVC, a useful clue as to which type of system function failed and how to attack the problem.
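
As a rough illustration of that decoding step, the short Python sketch below pulls the last two hexadecimal digits out of an SVC-related system abend code and looks them up in a small, purely illustrative table (only a few well-known SVC numbers are shown; consult the IBM documentation for the full list).

```python
# Minimal sketch: decode the SVC number implied by an SVC-related system
# abend code. The table is a tiny, illustrative subset, not a full list.
KNOWN_SVCS = {19: "OPEN", 20: "CLOSE", 120: "GETMAIN/FREEMAIN"}

def svc_from_abend(code: str):
    """'S213' -> (19, 'OPEN'): the last two hex digits x'13' are SVC 19."""
    svc = int(code.lstrip("Ss")[-2:], 16)
    return svc, KNOWN_SVCS.get(svc, "see the IBM SVC summary")

print(svc_from_abend("S213"))   # an abend during OPEN processing
```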

In addition to SVCs, Program Exceptions indicate that the CPU cannot continue operation due to a problem with the instruction being executed. Program exceptions can be identified in many different ways, which are also covered in the webinar.

Common Error Types
What are the most common error types encountered in z/OS? From addressing and data errors to instruction errors and others, such as timing, loop, and wait errors, each is identifiable if you know what you are looking for.

While an ABEND is a “hard” error, errors such as “incorrect output” are logic errors, which are more difficult to debug and require more in-depth knowledge of the application.

You’ve Identified the Error – Now What?

Once the source of the error is known, what tools are available? The webinar discusses the most common ones, what they do, and what to expect from them. It also includes a brief conversation about sending the dump to IBM when necessary.

In closing, Pryor recommends a number of available reference materials and notes the types of issues covered in each.

Learn more about dumps and debugging tools in our webinar. “We’ve Got a Problem: An Introduction to z/OS® Dumps and Debugging Tools” is a 60-minute informative and educational look at an important topic in the mainframe space. If you weren’t able to attend, you can view it on-demand and download a copy of the slide deck used in the presentation by using this link.

DTS Webinar Recap: Understanding IBM® z/OS® Disk Metadata: Catalogs, VTOCs, VVDSs, Indexes, and More!

The way that DASD (direct access storage device) data is stored on z/OS is very different from distributed/open systems. An understanding of just how z/OS datasets are created, located, accessed, and deleted is as important to end-users as it is to storage administrators and system programmers.

In our October 2021 educational webinar, DTS CTO Steve Pryor discussed this important topic, and how the concepts and structures invented in the 1960s have evolved into today’s high-availability, high-performance, high-reliability disk storage subsystems.

To the more seasoned Big Iron mainframers (and by seasoned we mean 20 years or more), the information covered might seem familiar. But much of this information is no longer widely disseminated, so while it may be a good review for some, for many the concepts of z/OS disk metadata and catalogs, presented by an authority in storage management, are high-value information.

The z/OS Storage Hierarchy
At the top of the food chain is an extensive catalog structure that keeps track of every dataset on every volume in the system. Each volume contains a volume table of contents (VTOC), much like the table of contents at the front of a book. The VTOC records the locations of all the datasets on the volume, as well as which space on that volume is used and which is free.
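
To make the lookup chain concrete, here is a minimal Python sketch of the idea; the dataset name, volume serial, and track numbers are invented for illustration. The catalog answers “which volume?”, and that volume’s VTOC answers “where on the volume?”

```python
# Conceptual sketch of the lookup chain described above. All names and numbers
# are invented; real catalogs and VTOCs are on-disk structures, not Python dicts.
catalog = {"PROD.SALES.DATA": ["VOL001"]}          # catalog: dataset -> volume(s)

vtocs = {
    "VOL001": {
        "PROD.SALES.DATA": [(100, 149), (300, 329)],   # extents as (first, last) track
    },
}

def locate(dsn):
    """Step 1: the catalog gives the volume(s); step 2: each VTOC gives the extents."""
    for volume in catalog[dsn]:
        yield volume, vtocs[volume][dsn]

print(list(locate("PROD.SALES.DATA")))
```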

The SMS (system-managed storage) subsystem handles all the information about the allocation needs of the dataset: physical characteristics, logical record length, performance needs, etc. Given those characteristics, it decides on which DASD volume to place the dataset. SMS is important to every aspect of the dataset life cycle, beginning with allocation and volume placement and extending to dataset usage, archiving, recall, and eventual expiration.

Naming conventions, which are very specific in z/OS and are critical for managing the large volumes of data common in z/OS systems, are covered in-depth during the webinar. Slides with examples of Mapping, Dataset Extents, and much more were included in the presentation.

The Evolution of z/OS Storage Systems
How did all of this evolve from the systems of old? With the development of larger, faster CPUs in the late 1970s / early 1980s came the need for expanded storage capacity.

Traditional disk drives, built as a stack of platters whose vertically aligned tracks form cylinders (56,664 bytes per track and 15 tracks per cylinder), existed well into the 1990s. They were eventually replaced by RAID (redundant array of independent disks) storage, introduced in a famous 1988 SIGMOD paper by David Patterson, Garth Gibson, and Randy Katz. RAID simply spreads storage blocks across many different drives and lets a controller do the work of knitting them together. Not only does this make data access faster, but parity blocks also ensure the data remains accessible if a disk within the array fails. Even with the advent of more modern storage systems, z/OS still operates its storage mapping as it did pre-1980, ensuring the absolute compatibility and continuing value of software investments that are the hallmark of mainframe systems.
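
To make the parity idea concrete, here is a minimal Python sketch, not how a real RAID controller works, showing how an XOR parity block lets one lost data block be rebuilt from the surviving blocks.

```python
from functools import reduce

# Toy illustration of XOR parity: three data blocks striped across drives,
# plus a parity block that is the bitwise XOR of the data blocks.
data_blocks = [b"BLK0", b"BLK1", b"BLK2"]
parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*data_blocks))

# The drive holding BLK1 fails; rebuild its contents from the survivors and parity.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*survivors))
assert rebuilt == data_blocks[1]
```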

Carl Sagan said in 1980 that “You have to know the past to understand the present.” As installations grow ever larger and more complex, an understanding of how these critical legacy systems developed, and how they still operate, can inform today’s extensive data modernization initiatives.

Understanding z/OS Disk Metadata: Catalogs, VTOCs, VVDSs, Indexes, and More! is a 60-minute informative, educational look at a historic topic in the mainframe space. If you weren’t able to attend, you can view it on-demand and download a copy of the slide deck used in the presentation by using this link.

DTS Webinar Recap: Secure Data for Everyone – Pervasive Encryption and z/OS® Storage Management

Data security is making the news these days far too often, and for all the wrong reasons. Because the data in your IBM® z/OS® system is the most valuable and useful data in the enterprise, it is also the data cybercriminals want most. So while it has to remain available for your business to run, it must also be completely secure.

In our September 2021 webinar, DTS Software CTO Steve Pryor discussed, from a storage management perspective, some of the practical steps involved in making Pervasive Encryption a reality in your z/OS environment. IBM’s Pervasive Encryption initiative aims to achieve these availability and security goals by making encryption so seamless for the user that it’s implemented by default.

But as seamless as IBM intends encryption to be, you still must consider the following:

  1. Are you sure you’re taking the proper steps when encrypting datasets?
  2. How are you ensuring (and verifying) that the data is encrypted?

A few highlights of Pryor’s overview of encryption and z/OS storage management:

Why Encrypt?
Some of the reasons are obvious, such as regulations and data breaches. But there are other, less obvious reasons. Accidental (or intentional) exposure of sensitive data and insider attacks are two very real threats that must be considered.

Who Encrypts (or Decrypts)?
Pryor identifies three primary personas typically involved with encryption: the security administrator (most often the RACF security administrator), who’s responsible for system security policies; the storage administrator, who’s responsible for managing the data sets – their creation and the devices they’re allocated on; and the end-user, who ultimately uses encryption (or decryption) to read and write data.

Encryption in z/OS
The concept of “pervasive encryption” is simply that everything is encrypted. This includes data at rest, in use, and in flight. While this may seem like overkill, the upside is that by encrypting everything, regulatory requirements are met 100% of the time. Furthermore, existing security policy mechanisms are used to provide dataset-level encryption and to control access according to user privileges for added security.

Crypto Hardware
Crypto hardware for z/OS consists of two possible components, as well as the zPDT emulated adjunct processor. How does each function, and what are the features and benefits? Pryor clears up any questions with a quick overview.

Deep Dive in a Live Demo
Once the baseline is set, Pryor dives into the nuts and bolts of dataset encryption with a live demonstration and on-screen explanation of options and elements. He then addresses one of the most important aspects of encryption: key distribution. How do you distribute the keys for those people who need them and control the use of the keys? How do you rotate the keys and avoid compromised keys? How do you audit the system? All are crucial questions that must be considered carefully.
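
Key rotation in particular is easier to reason about with a small example. The sketch below uses Python’s third-party cryptography package to show the generic mechanics of rotating data to a new key while old ciphertext stays readable; it is not z/OS dataset encryption, which relies on ICSF-managed key labels and crypto hardware rather than keys held in application code.

```python
from cryptography.fernet import Fernet, MultiFernet   # pip install cryptography

# Generic key-rotation sketch (not z/OS dataset encryption): re-encrypt old
# ciphertext under a new key while data written with the old key stays readable.
old_key = Fernet(Fernet.generate_key())
new_key = Fernet(Fernet.generate_key())

ciphertext = old_key.encrypt(b"payroll record")   # data written before rotation

rotator = MultiFernet([new_key, old_key])          # new key first; old key kept for reads
rotated = rotator.rotate(ciphertext)               # now encrypted under new_key

assert new_key.decrypt(rotated) == b"payroll record"
```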

Secure Data for Everyone – Pervasive Encryption and z/OS Storage Management is an informative, educational look at a timely topic in the mainframe space. If you weren’t able to attend, you can view it on-demand and download a copy of the slide deck used in the presentation by using this link.

Enabling Event Analysis to Spot Unusual Access Patterns With DTS Software’s ACC Monarch

The direct access device space management (DADSM) component handles key functions in z/OS that dictate much of what happens to a dataset during its lifecycle. Creation and deletion are the most obvious, but this component can also extend a dataset to a new volume, release unused space using the partial release function, rename the dataset, and more. Just as on any other platform, datasets on z/OS have a largely predictable use pattern, which is why it’s a good idea to investigate when usage defies expectations. With the right solution in place, anomalies in the typical pattern of events can provide valuable insights to system administrators.

All DADSM events go through a system exit point or control point such as IGGPRE00 and IGGPOST0, at which point DTS Software’s ACC Monarch product can take control with automation and perform an array of actions. Using a policy rules language, ACC Monarch relies on IF statements to take action based on user-defined dataset characteristics. If the specified condition is met, the Dynamic Install Facility (DIF) started task performs the action.

A basic example of an action might be updating system control blocks, but actions could also include analysis operations such as writing records to DASD, writing records to a log file, or writing reports. These resources can be created using an arbitrary, user-defined record that isn’t necessarily an SMF record, and they can also be written directly to TCP/IP for analysis by Splunk or any other SIEM system. By enabling this kind of thorough analysis during the dataset lifecycle, organizations can spot unusual access patterns that could indicate a threat — and they can do it without the need to know assembler coding.
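
As a schematic of that IF-condition/action pattern, written in Python rather than ACC Monarch’s actual policy rules language, and with a hypothetical dataset name and SIEM endpoint, a rule might look something like this:

```python
import json, socket

# Schematic only: this shows the IF-condition/action pattern in Python, not the
# ACC Monarch policy rules language. Host, port, and dataset names are invented.
SIEM_HOST, SIEM_PORT = "siem.example.com", 6514

def rule_matches(event):
    # "IF the dataset is production payroll data AND the function is DELETE"
    return event["dsn"].startswith("PROD.PAYROLL.") and event["function"] == "DELETE"

def forward_to_siem(event):
    """Write one JSON record per event directly over TCP/IP to the SIEM."""
    with socket.create_connection((SIEM_HOST, SIEM_PORT)) as conn:
        conn.sendall((json.dumps(event) + "\n").encode())

event = {"dsn": "PROD.PAYROLL.MASTER", "function": "DELETE", "user": "JDOE"}
if rule_matches(event):
    forward_to_siem(event)      # replace the placeholder endpoint before running
```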

For more information about how storage event awareness can contribute to security, we encourage you to view our recent webinar on TechChannel, “Aggregation without Aggravation: When Putting More Log Data in Your SIEM is a Good Thing.” DTS Software CTO Steve Pryor and veteran mainframe expert Reg Harbeck offer insights into how you can leverage dfSMS events in conjunction with your existing SIEM data to build a more complete picture of the threats facing your organization.

Whitepaper Download: A Data Center Without Enforceable Standards Risks Much More Than Mere Storage Mismanagement.


Breaking Down Sequential Data Sets and Their Limitations on z/OS®

Unlike partitioned data sets, which are useful if you have several different types of data that you want to keep together in a single data set, or VSAM data sets, which are useful for more sophisticated access by key or by relative record, sequential data sets are the simplest form of data set. Not surprisingly, they’re useful when you want to read the data in sequence. When we specify the data set organization in the JCL, or define the data set under ISPF, we specify a data set organization (DSORG) of PS (or occasionally PSU or DA).

Sequential data sets can be one of three types: basic, large, or extended format. If you don’t specify anything out of the ordinary, you get a basic format data set, which is simply a collection of extents on a disk, pointed to by the volume table of contents (VTOC) and the VSAM volume data set (VVDS). Because it’s ordinary sequential data, you can use the queued or basic sequential access methods (QSAM or BSAM) to write to it, but it has to be written sequentially: before you write record two, you must write record one.

Sequential data sets are original to z/OS (dating back to OS/360 in the 1960s and MVS after it), and thus have some limitations. Basic format data sets can have no more than 16 extents per volume and a maximum of 65,535 tracks per volume. If you need to exceed that limit, you can create a large format sequential data set by specifying the DSNTYPE=LARGE parameter. A large format dataset is still limited to 16 extents per volume, but it can grow to 16,777,215 tracks per volume.
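
As a back-of-the-envelope illustration of what those track limits mean in bytes, the short Python sketch below multiplies them by the 56,664-bytes-per-track figure from the classic geometry described in the disk metadata recap above; treat the results as rough upper bounds, since real usable capacity depends on block sizes and inter-record gaps.

```python
# Rough capacity arithmetic for the limits quoted above (upper bounds only;
# actual usable space depends on how the data is blocked).
BYTES_PER_TRACK = 56_664                 # classic 3390-style track capacity

limits = {
    "Basic format": 65_535,              # tracks per volume
    "Large format (DSNTYPE=LARGE)": 16_777_215,
}

for name, tracks in limits.items():
    gib = tracks * BYTES_PER_TRACK / 2**30
    print(f"{name}: {tracks:,} tracks, roughly {gib:,.1f} GiB per volume")
```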

Because the limit of 16 extents is somewhat restrictive, you can use an extended-format sequential data set. Extended-format datasets are implicitly LARGE and can exceed 65,535 tracks, but more importantly, you can get up to 123 extents per volume, which offers a few clear advantages. For one, an extended-format data set fails with an out-of-space error much less often, but it can also be striped. With a single stripe, it’s just an ordinary sequential data set, but if it’s multi-striped, each volume can be read and written in parallel. For an application such as SMF data, where there’s a large amount of sequential data, it’s useful to stripe the data across multiple volumes so they can be read much more quickly.
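
The benefit of multi-striping is easiest to see with a toy model. The Python sketch below is purely conceptual; real striping is handled by the access methods, not application code. It places records round-robin across four hypothetical stripe volumes and reads them back in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy model of striping: records are placed round-robin across the stripes,
# so all stripe volumes can be read at the same time and re-interleaved.
stripes = 4
records = [f"SMF record {i}" for i in range(12)]
volumes = {s: records[s::stripes] for s in range(stripes)}   # round-robin placement

def read_stripe(s):                      # stand-in for reading one stripe's volume
    return volumes[s]

with ThreadPoolExecutor(max_workers=stripes) as pool:
    chunks = list(pool.map(read_stripe, range(stripes)))

reassembled = [rec for group in zip(*chunks) for rec in group]
assert reassembled == records            # original record order is preserved
```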

Wealth of z/OS Webinar Training on DTSSoftware.com

This is a simple breakdown of sequential data sets, but DTS Software has a wealth of additional information on partitioned datasets, PDSEs, generation data groups, hierarchical file structures, and UNIX services on z/OS. To learn more, click here to view our August 2021 webinar on demand: “PDS, PDSE, GDG, HFS, USS and Other Strange z/OS Animals.” In this presentation, DTS Software CTO Steve Pryor takes a deep dive into the peculiarities of storage elements on z/OS to help you make sense of these often confounding creatures.

Cyber Forensics — How Storage Plays a Critical Role in Security and Regulatory Compliance

Mainframe storage has changed a lot over the years, and the amount of it available in terms of both memory and disk/tape has grown substantially. Although this gradual progression has over time liberated mainframers from many long-standing limitations, careful storage management has remained a central tenet of mainframe culture and a major differentiator on the platform.

As General and eventual President Eisenhower once said, “Plans are useless. Planning is indispensable.” By understanding current storage availability and outlining future needs, mainframers are able to make heavy and advanced use of the resources and features available to them, and they can do so without interruption. According to Simson Garfinkel and Gene Spafford, Ph.D., in their book, Practical Unix and Internet Security, a computing system is secure when you can count on it to behave in the expected manner. Storage is clearly important because it impacts availability on the mainframe, but it can also offer insights from a more conventional InfoSec standpoint.

Expanding the Event Horizon to Mainframe Storage

On most platforms, external security managers (ESMs) or internal security handlers monitor user accesses, failed logins, changed permissions, and other mainstream threat indicators. What they aren’t thinking about are certain events on the mainframe that could impact security from a storage management perspective, such as allocation, open/close, I/O, extents, and more.

At the same time, storage elements including generation data groups (GDGs), archives, and hierarchical storage management (HSM) products play a major role in overall functionality, which is why it’s crucial to have the System Management Facilities (SMF) component keeping records of salient happenings on the mainframe. Some of this record-keeping is automatic, but in some cases you also get to decide what’s salient. Those events might include logins, access to a secured resource, or something happening outside of normal, everyday activity. Capturing relevant events in both SMF and other mainframe facilities allows you to view security issues (hopefully in real time) and send alerts for remediation.

Storage is critical to security because it’s another vantage point from which to confirm that the mainframe is operating in its “expected manner.” When storage events are given the care and attention they deserve, they can help inform security and reliability improvements that protect your organization’s most valuable IT asset.

For more information about how storage impacts security on the mainframe, check out DTS Software’s webinar Aggregation without Aggravation: When Putting More Log Data in Your SIEM Is a Good Thing. The presentation, which features DTS Software CTO Steve Pryor and Mainframe Analytics Ltd. Founder Reg Harbeck, is available on demand.