Understanding XMI Files¶

NETDATA/XMIT¶

NETDATA files are primarily used to transfer sequential and partitioned datasets between mainframe environments, sometimes via non mainframe environments. NETDATA is the official name for the file format of the output from the z/OS TRANSMIT, z/VM NETDATA or the opensource tool XMIT370. However, it is more often reffered to as an XMI file. This documentation uses XMI and NETDATA interchangably. Typically third parties and even IBM will refer to them as XMI files.

Note

Some quick terminology:

Dataset: a mainframe file, usually refered to as a sequential dataset or seq
Partitioned dataset: a mainframe folder usually refered to as a PDS
Member: files in a partitioned dataset
Unload: extracting data to be used elsewhere
LRECL: The record length. This is how long each line in a file is, padded with spaces.
RECFM: Record format, where:
- The first letter is one of F, V, U where:
  F = fixed length records
  
  V = Variable length records
  
  U = Unknown
- And additional letters may be:
  B = blocked
  
  A = ANSI control characters (for printers)
  
  M = machine control characters (for printers)
  
  S = standard blocks

XMI files contain either a sequential dataset or a partitioned dataset, and optionally a message. They cannot contain more than one dataset, paritioned or sequential at a time. They can, however, also include an optional message which is technically sequential dataset, however the dataset name is lost.

Sequential datasets are ‘unloaded’ by XMIT using the program INMCOPY whereas partitioned datasets are unloaded using a program called IEBCOPY.

Think about XMI files as tar files on Linux but only if you could add one file or one folder to the tar file. Oftentimes XMI files will contain nested XMI files due to this limitation.

XMI files are commonly used by IBM, Broadcom, and many other mainframe vendors to send files to customers. There’s also a large collection of software and programs made available for free using XMI by the amazing CBTTAPE project available at http://cbttape.org/cbtdowns.htm.

Creating XMI Files¶

To create a XMI file on z/OS you use the TSO program XMIT/TRANSMIT:

XMIT NODE.USER DATASET('DATASET.TO.SEND') OUTDATASET('OUTPUT.FROM.XMIT.XMI')

TRANSMIT NODE.USER DATASET('DATASET.TO.SEND') OUTDATASET('OUTPUT.FROM.XMIT.XMI')

You can also add a message to XMI files:

XMIT NODE.USER DATASET('DATASET.TO.SEND') OUTDATASET('OUTPUT.FROM.XMIT.XMI') MSGDATASET('SEQ.MSG.FILE')

If you are using TK4- you can use XMIT370 and some JCL to generate XMI files:

//XMIMAKE JOB (01),'COPY TO TAPE',CLASS=H,MSGCLASS=H,NOTIFY=HERC01
//* ------------------------------------------------------------------
//* CREATES XMILIB TEST XMIT FILES
//* ------------------------------------------------------------------
//* EXAMPLE 1: STEP XMITSEQ
//* CREATES THE XMI FILE PYTHON.XMI.SEQ.XMIT FROM THE
//* SEQUENTIAL DATASET PYTHON.XMI.SEQ
//XMITSEQ  EXEC PGM=XMIT370
//XMITLOG  DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//COPYR1   DD DUMMY
//SYSIN    DD DUMMY
//SYSUT1   DD DSN=PYTHON.XMI.SEQ,DISP=SHR
//SYSUT2   DD DSN=&&SYSUT2,UNIT=3390,
//         SPACE=(TRK,(255,255)),
//         DISP=(NEW,DELETE,DELETE)
//XMITOUT  DD DSN=PYTHON.XMI.SEQ.XMIT,DISP=(,CATLG,DELETE),
//            UNIT=3350,VOL=SER=KICKS,SPACE=(TRK,(50,50))
//* EXAMPLE 2: STEP XMIPDS
//* CREATES THE XMI FILE PYTHON.XMI.PDS.XMIT FROM THE
//* PARTITIONED DATASET PYTHON.XMI.PDS
//XMIPDS   EXEC PGM=XMIT370
//XMITLOG  DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//COPYR1   DD DUMMY
//SYSIN    DD DUMMY
//SYSUT1   DD DSN=PYTHON.XMI.PDS,DISP=SHR
//SYSUT2   DD DSN=&&SYSUT2,UNIT=3390,
//         SPACE=(TRK,(255,255)),
//         DISP=(NEW,DELETE,DELETE)
//XMITOUT  DD DSN=PYTHON.XMI.PDS.XMIT,DISP=(,CATLG,DELETE),
//            UNIT=3350,VOL=SER=KICKS,SPACE=(TRK,(50,50))

I’ll leave generating XMI files on z/VM up to the reader.

Transferring XMI files¶

XMI files (as with most mainframe files) are in EBCDIC, therefore to download the XMI file from the mainframe you will need to use FTP in binary file transfer mode. Fortunately enabling binary on FTP is simple, just issue the FTP command binary once connected and transfer the XMI file to your machine.

File Structure¶

XMI files are composed of control records which contain metadata and dataset information.

Control Records:

INMR01 - Header records
INMR02 - File control record(s)
INMR03 - Data control record(s)
INMR04 - User control record
INMR06 - Final record
INMR07 - Notification record

This library only processes INMR01, INRM02, INMR03, INMR04, and INMR06 records. INMR07 records are notification records and do not contain any files.

INMR records are composed of the name, two digit number (INMR01, etc) followed by IBM text units which contains metadata about the record.

Text units in INMR## records are broken down like this:

First two bytes are the ‘key’/type
Second two bytes are how many text unit records there are
Then records are broken down by size (two bytes) and the data
Data can be string, int or hex

More information about text units is available here: https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.1.0/com.ibm.zos.v2r1.ikjb400/txunit.htm

INRM01 Records¶

INRM01 records always contain the following text units:

INMFTIME - date/time the XMI was created
INMLRECL - Record length for this XMI
INMFNODE - name of the originating system
INMTNODE - name of the target system
INMFUID - userid of the person who created the XMI
INMTUID - userid of the user this XMI is being sent to

The following text units are optional:

INMFACK - notification receipt
INMFVERS - version number
INMNUMF - number of files (1 for dataset only; 2 when a message is also present — the message counts as file 1, the dataset as file 2)
INMUSERP - user options

INRM02 Records¶

An XMI file may contain multiple INMR02 control records. These records always contain the following text units:

INMDSORG - dataset organization
INMLRECL - record length
INMSIZE - size in bytes
INMUTILN - utility program

Optional text units include:

INMDSNAM - dataset name (absent on message INMR02 records — its absence is how parsers identify a message stream)
INMBLKSZ - block size
INMRECFM - record format
INMTERM - present (with count=0) on the message INMR02 only; marks this record as a message stream rather than a dataset
INMCREAT - the date the file was created

There are multiple other optional text units which can be read here: https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.3.0/com.ibm.zos.v2r3.ikjb400/inmr02.htm

The utility program (INMUTILN) defines how the file was generated:

INMCOPY - converts a sequential dataset (file) for XMI
IEBCOPY - converts a partitioned dataset (folder) for XMI
AMSCIPHR - encrypts the files in XMI; this library does not support extracting encrypted files

Each INMR02 carries a 4-byte file ordinal field immediately after the record name. This ordinal tells z/OS RECEIVE which INMR02 descriptor belongs to which INMR03 / data block pair. The number of INMR02 records depends on the content:

Sequential dataset only — one INMR02 (INMCOPY, file ordinal 1)
Partitioned dataset only — two INMR02 records (IEBCOPY then INMCOPY, both file ordinal 1)
Sequential dataset + message — two INMR02 records: message INMCOPY (ordinal 1) then dataset INMCOPY (ordinal 2)
Partitioned dataset + message — three INMR02 records: message INMCOPY (ordinal 1) then PDS IEBCOPY + INMCOPY (both ordinal 2)

Therefore, partitioned datasets will always have two or more INMR02 records.

INRM03 Records¶

Defines the file format and contains the following text units:

INMDSORG - dataset organization
INMLRECL - dataset record length
INMRECFM - dataset record format
INMSIZE - size of the dataset in bytes

INRM04 Records¶

INMR04 records are used to pass data to instalation specific exits (i.e. APIs).

Metadata¶

Let’s take a look at the file test_pds_msg.xmi (generated with XMIT on TSO) in the tests folder. Using this library we can extract the XMI metadata as json:

{
"INMR01": {
    "INMLRECL": 80,
    "INMFNODE": "SMOG",
    "INMFUID": "PHIL",
    "INMTNODE": "XMIT",
    "INMTUID": "PHIL",
    "INMFTIME": "2021-03-09T05:14:41.000000",
    "INMNUMF": 2
},
"INMR02": {
    "1": {
        "INMUTILN": "INMCOPY",
        "INMSIZE": 58786,
        "INMDSORG": "PS",
        "INMLRECL": 251,
        "INMBLKSZ": 3120,
        "INMRECFM": "VB",
        "numfile": 1
    },
    "2": {
        "INMUTILN": "IEBCOPY",
        "INMSIZE": 176358,
        "INMDSORG": "PO",
        "INMTYPE": "None",
        "INMLRECL": 80,
        "INMBLKSZ": 27920,
        "INMRECFM": "FB",
        "INMDIR": 6,
        "INMDSNAM": "PYTHON.XMI.PDS",
        "numfile": 2
    },
    "3": {
        "INMUTILN": "INMCOPY",
        "INMSIZE": 176358,
        "INMDSORG": "PS",
        "INMLRECL": 32756,
        "INMBLKSZ": 3120,
        "INMRECFM": "VS",
        "numfile": 2
    }
},
"INMR03": {
    "1": {
        "INMSIZE": 176358,
        "INMDSORG": "PS",
        "INMLRECL": 80,
        "INMRECFM": "?"
    },
    "2": {
        "INMSIZE": 176358,
        "INMDSORG": "PS",
        "INMLRECL": 80,
        "INMRECFM": "?"
    }
}
}

Notice that test_pds_msg.xmi had a message, hence there being three INMR02 records. And since it was a PDS it contains the records, IEBCOPY and another for INMCOPY.

Now lets look at the sequential dataset test_seq.xmi in the tests folder. This XMI file was generated with XMIT370.

{
"INMR01": {
    "INMLRECL": 80,
    "INMFNODE": "ORIGNODE",
    "INMFUID": "ORIGUID",
    "INMTNODE": "DESTNODE",
    "INMTUID": "DESTUID",
    "INMFTIME": "2021-03-09T04:53:18.000000",
    "INMNUMF": 1
},
"INMR02": {
    "1": {
        "INMUTILN": "INMCOPY",
        "INMSIZE": 0,
        "INMDSORG": "PS",
        "INMLRECL": 80,
        "INMBLKSZ": 3200,
        "INMRECFM": "FB",
        "numfile": 1,
    }
},
"INMR03": {
    "1": {
        "INMSIZE": 0,
        "INMDSORG": "PS",
        "INMLRECL": 80,
        "INMRECFM": "?"
    }
}
}

Notice how there is only one INMR02 record. Also notice that XMIT370 omits the INMDSNAM text unit for sequential files.

The File Contents XMI¶

After parsing the control records the actual file contents follow. If the file is a sequential dataset its easy enough to detect the mime type using file and extract its content. If the file is a PDS then that means it was “unloaded” using IEBCOPY which is a little more complicated.