Understanding XMI Files
=======================

NETDATA/XMIT
~~~~~~~~~~~~

NETDATA files are primarily used to transfer sequential and partitioned
datasets between mainframe environments, sometimes via non-mainframe
environments. NETDATA is the official name for the file format of the output
from the z/OS ``TRANSMIT``, z/VM ``NETDATA`` or the open source tool
``XMIT370``. However, it is more often referred to as an ``XMI`` file. This
documentation uses XMI and NETDATA interchangeably. Third parties, and even
IBM, typically refer to them as XMI files.

.. note::
    Some quick terminology:

    - **Dataset**: a mainframe file, usually referred to as a sequential
      dataset or seq
    - **Partitioned dataset**: a mainframe folder, usually referred to as a PDS
    - **Member**: a file in a partitioned dataset
    - **Unload**: extracting data to be used elsewhere
    - **LRECL**: The record length. This is how long each line in a file is,
      padded with spaces.
    - **RECFM**: Record format, where:

      - The first letter is one of F, V, U where:

        * F = fixed length records
        * V = variable length records
        * U = undefined length records

      - And additional letters may be:

        * B = blocked
        * A = ANSI control characters (for printers)
        * M = machine control characters (for printers)
        * S = standard blocks

An XMI file contains exactly one dataset, either sequential or partitioned,
and optionally a message. The message is technically a sequential dataset,
but its dataset name is not retained. Sequential datasets are 'unloaded' by
XMIT using the program ``INMCOPY``, whereas partitioned datasets are unloaded
using a program called ``IEBCOPY``.

Think of an XMI file as a tar file on Linux that can hold only one file or
one folder. Because of this limitation, XMI files often contain nested XMI
files.

XMI files are commonly used by IBM, Broadcom, and many other mainframe
vendors to send files to customers. There's also a large collection of
software and programs made available for free using XMI by the amazing
**CBTTAPE** project available at http://cbttape.org/cbtdowns.htm.

Creating XMI Files
~~~~~~~~~~~~~~~~~~

To create an XMI file on z/OS, use the TSO program ``XMIT``/``TRANSMIT``::

    XMIT NODE.USER DATASET('DATASET.TO.SEND') OUTDATASET('OUTPUT.FROM.XMIT.XMI')
    TRANSMIT NODE.USER DATASET('DATASET.TO.SEND') OUTDATASET('OUTPUT.FROM.XMIT.XMI')

You can also add a message to XMI files::

    XMIT NODE.USER DATASET('DATASET.TO.SEND') OUTDATASET('OUTPUT.FROM.XMIT.XMI') MSGDATASET('SEQ.MSG.FILE')

If you are using TK4- you can use ``XMIT370`` and some JCL to generate XMI
files:

.. code-block:: JCL

    //XMIMAKE  JOB (01),'COPY TO TAPE',CLASS=H,MSGCLASS=H,NOTIFY=HERC01
    //* ------------------------------------------------------------------
    //* CREATES XMILIB TEST XMIT FILES
    //* ------------------------------------------------------------------
    //* EXAMPLE 1: STEP XMITSEQ
    //* CREATES THE XMI FILE PYTHON.XMI.SEQ.XMIT FROM THE
    //* SEQUENTIAL DATASET PYTHON.XMI.SEQ
    //XMITSEQ  EXEC PGM=XMIT370
    //XMITLOG  DD SYSOUT=*
    //SYSPRINT DD SYSOUT=*
    //SYSUDUMP DD SYSOUT=*
    //COPYR1   DD DUMMY
    //SYSIN    DD DUMMY
    //SYSUT1   DD DSN=PYTHON.XMI.SEQ,DISP=SHR
    //SYSUT2   DD DSN=&&SYSUT2,UNIT=3390,
    //            SPACE=(TRK,(255,255)),
    //            DISP=(NEW,DELETE,DELETE)
    //XMITOUT  DD DSN=PYTHON.XMI.SEQ.XMIT,DISP=(,CATLG,DELETE),
    //            UNIT=3350,VOL=SER=KICKS,SPACE=(TRK,(50,50))
    //* EXAMPLE 2: STEP XMIPDS
    //* CREATES THE XMI FILE PYTHON.XMI.PDS.XMIT FROM THE
    //* PARTITIONED DATASET PYTHON.XMI.PDS
    //XMIPDS   EXEC PGM=XMIT370
    //XMITLOG  DD SYSOUT=*
    //SYSPRINT DD SYSOUT=*
    //SYSUDUMP DD SYSOUT=*
    //COPYR1   DD DUMMY
    //SYSIN    DD DUMMY
    //SYSUT1   DD DSN=PYTHON.XMI.PDS,DISP=SHR
    //SYSUT2   DD DSN=&&SYSUT2,UNIT=3390,
    //            SPACE=(TRK,(255,255)),
    //            DISP=(NEW,DELETE,DELETE)
    //XMITOUT  DD DSN=PYTHON.XMI.PDS.XMIT,DISP=(,CATLG,DELETE),
    //            UNIT=3350,VOL=SER=KICKS,SPACE=(TRK,(50,50))

Generating XMI files on z/VM is left as an exercise for the reader.

Transferring XMI files
~~~~~~~~~~~~~~~~~~~~~~

XMI files, like most mainframe files, are encoded in EBCDIC, so you must use
FTP in binary transfer mode to download them from the mainframe. Otherwise
the transfer translates the contents to ASCII and corrupts the file.
Enabling binary mode is simple: issue the FTP command ``binary`` once
connected, then transfer the XMI file to your machine.

File Structure
~~~~~~~~~~~~~~

XMI files are composed of control records which contain metadata and dataset
information.

Control Records:

* INMR01 - Header records
* INMR02 - File control record(s)
* INMR03 - Data control record(s)
* INMR04 - User control record
* INMR06 - Final record
* INMR07 - Notification record

This library only processes INMR01, INMR02, INMR03, INMR04, and INMR06
records. INMR07 records are notification records and do not contain any
files.

Each INMR record consists of the record name (``INMR`` plus a two-digit
number, e.g. INMR01) followed by IBM text units, which contain metadata
about the record.

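Since the record name itself is stored in EBCDIC, a parser has to decode it
before it can tell the control records apart. The following is a minimal
sketch, not this library's implementation. It assumes the caller has already
stripped any framing so that the six-byte name sits at the start of the
buffer, and that code page ``cp037`` is a suitable EBCDIC variant; the
function name ``control_record_name`` is hypothetical.

.. code-block:: python

    # Control record names listed in this section.
    KNOWN_RECORDS = {"INMR01", "INMR02", "INMR03", "INMR04", "INMR06", "INMR07"}


    def control_record_name(payload: bytes) -> str:
        """Decode the six-byte EBCDIC record name at the start of payload.

        Assumption: payload begins directly with the record name.
        """
        name = payload[:6].decode("cp037")
        if name not in KNOWN_RECORDS:
            raise ValueError(f"not a known control record: {name!r}")
        return name


    # Build a fake record: EBCDIC "INMR01" followed by four filler bytes.
    sample = "INMR01".encode("cp037") + b"\x00" * 4
    print(control_record_name(sample))

A lookup against a fixed set keeps the check strict, so data that merely
starts with printable EBCDIC characters is not mistaken for a control record.
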
Text units in INMR## records are broken down like this:

* The first two bytes are the 'key'/type
* The next two bytes are how many values the text unit holds
* Each value is then stored as a size (two bytes) followed by the data
* Data can be a string, an integer, or hex

More information about text units is available here:
https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.1.0/com.ibm.zos.v2r1.ikjb400/txunit.htm

INMR01 Records
~~~~~~~~~~~~~~

INMR01 records always contain the following text units:

* INMFTIME - date/time the XMI was created
* INMLRECL - record length for this XMI
* INMFNODE - name of the originating system
* INMTNODE - name of the target system
* INMFUID - userid of the person who created the XMI
* INMTUID - userid of the user this XMI is being sent to

The following text units are optional:

* INMFACK - notification receipt
* INMFVERS - version number
* INMNUMF - number of files (``1`` for a dataset only; ``2`` when a message
  is also present, in which case the message counts as file 1 and the
  dataset as file 2)
* INMUSERP - user options

INMR02 Records
~~~~~~~~~~~~~~

An XMI file may contain multiple INMR02 control records.
These records always contain the following text units:

* INMDSORG - dataset organization
* INMLRECL - record length
* INMSIZE - size in bytes
* INMUTILN - utility program

Optional text units include:

* INMDSNAM - dataset name (absent on message INMR02 records, which parsers
  can use to identify a message stream)
* INMBLKSZ - block size
* INMRECFM - record format
* INMTERM - present (with count=0) on the message INMR02 only; marks this
  record as a message stream rather than a dataset
* INMCREAT - the date the file was created

Several other optional text units are documented here:
https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.3.0/com.ibm.zos.v2r3.ikjb400/inmr02.htm

The utility program (*INMUTILN*) defines how the file was generated:

* INMCOPY - converts a sequential dataset (file) for XMI
* IEBCOPY - converts a partitioned dataset (folder) for XMI
* AMSCIPHR - encrypts the files in XMI; this library does not support
  extracting encrypted files

Each INMR02 carries a 4-byte file ordinal field immediately after the record
name. This ordinal tells z/OS RECEIVE which INMR02 descriptor belongs to
which INMR03 / data block pair.

The number of INMR02 records depends on the content:

* Sequential dataset only: one INMR02 (INMCOPY, file ordinal 1)
* Partitioned dataset only: two INMR02 records (IEBCOPY then INMCOPY, both
  file ordinal 1)
* Sequential dataset + message: two INMR02 records, message INMCOPY
  (ordinal 1) then dataset INMCOPY (ordinal 2)
* Partitioned dataset + message: three INMR02 records, message INMCOPY
  (ordinal 1) then PDS IEBCOPY + INMCOPY (both ordinal 2)

Therefore, partitioned datasets will always have two or more INMR02 records.

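The four cases above can be written down as a small function. This is only
an illustrative sketch of the rules in this section, not code from this
library; the name ``expected_inmr02`` and its parameters are hypothetical.

.. code-block:: python

    from typing import List, Tuple


    def expected_inmr02(is_pds: bool, has_message: bool) -> List[Tuple[str, int]]:
        """Return (utility, file ordinal) pairs in the order they appear."""
        records: List[Tuple[str, int]] = []
        ordinal = 1
        if has_message:
            # A message, when present, is always file 1.
            records.append(("INMCOPY", ordinal))
            ordinal += 1
        if is_pds:
            # A PDS is unloaded with IEBCOPY before INMCOPY.
            records.append(("IEBCOPY", ordinal))
        # Every dataset ends with an INMCOPY record sharing its ordinal.
        records.append(("INMCOPY", ordinal))
        return records


    for pds in (False, True):
        for msg in (False, True):
            print(f"pds={pds} message={msg}: {expected_inmr02(pds, msg)}")

Comparing the parsed INMR02 records against this expected layout is one way
to sanity-check a file before attempting to extract it.
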
INMR03 Records
~~~~~~~~~~~~~~

Defines the file format and contains the following text units:

* INMDSORG - dataset organization
* INMLRECL - dataset record length
* INMRECFM - dataset record format
* INMSIZE - size of the dataset in bytes

INMR04 Records
~~~~~~~~~~~~~~

INMR04 records are used to pass data to installation-specific exits
(i.e. APIs).

Metadata
~~~~~~~~

Let's take a look at the file ``test_pds_msg.xmi`` (generated with ``XMIT``
on TSO) in the ``tests`` folder. Using this library we can extract the XMI
metadata as JSON:

.. code-block:: json

    {
        "INMR01": {
            "INMLRECL": 80,
            "INMFNODE": "SMOG",
            "INMFUID": "PHIL",
            "INMTNODE": "XMIT",
            "INMTUID": "PHIL",
            "INMFTIME": "2021-03-09T05:14:41.000000",
            "INMNUMF": 2
        },
        "INMR02": {
            "1": {
                "INMUTILN": "INMCOPY",
                "INMSIZE": 58786,
                "INMDSORG": "PS",
                "INMLRECL": 251,
                "INMBLKSZ": 3120,
                "INMRECFM": "VB",
                "numfile": 1
            },
            "2": {
                "INMUTILN": "IEBCOPY",
                "INMSIZE": 176358,
                "INMDSORG": "PO",
                "INMTYPE": "None",
                "INMLRECL": 80,
                "INMBLKSZ": 27920,
                "INMRECFM": "FB",
                "INMDIR": 6,
                "INMDSNAM": "PYTHON.XMI.PDS",
                "numfile": 2
            },
            "3": {
                "INMUTILN": "INMCOPY",
                "INMSIZE": 176358,
                "INMDSORG": "PS",
                "INMLRECL": 32756,
                "INMBLKSZ": 3120,
                "INMRECFM": "VS",
                "numfile": 2
            }
        },
        "INMR03": {
            "1": {
                "INMSIZE": 176358,
                "INMDSORG": "PS",
                "INMLRECL": 80,
                "INMRECFM": "?"
            },
            "2": {
                "INMSIZE": 176358,
                "INMDSORG": "PS",
                "INMLRECL": 80,
                "INMRECFM": "?"
            }
        }
    }

Notice that ``test_pds_msg.xmi`` includes a message, which is why there are
three INMR02 records: one for the message and, because the dataset is a PDS,
one for ``IEBCOPY`` and another for ``INMCOPY``.

Now let's look at the sequential dataset ``test_seq.xmi`` in the ``tests``
folder. This XMI file was generated with ``XMIT370``.

.. code-block:: json

    {
        "INMR01": {
            "INMLRECL": 80,
            "INMFNODE": "ORIGNODE",
            "INMFUID": "ORIGUID",
            "INMTNODE": "DESTNODE",
            "INMTUID": "DESTUID",
            "INMFTIME": "2021-03-09T04:53:18.000000",
            "INMNUMF": 1
        },
        "INMR02": {
            "1": {
                "INMUTILN": "INMCOPY",
                "INMSIZE": 0,
                "INMDSORG": "PS",
                "INMLRECL": 80,
                "INMBLKSZ": 3200,
                "INMRECFM": "FB",
                "numfile": 1
            }
        },
        "INMR03": {
            "1": {
                "INMSIZE": 0,
                "INMDSORG": "PS",
                "INMLRECL": 80,
                "INMRECFM": "?"
            }
        }
    }

Notice how there is only one INMR02 record. Also notice that ``XMIT370``
omits the *INMDSNAM* text unit for sequential files.

The File Contents XMI
~~~~~~~~~~~~~~~~~~~~~

The actual file contents follow the control records. If the file is a
sequential dataset, it's easy enough to detect the MIME type using ``file``
and extract its content. If the file is a PDS, it was "unloaded" using
``IEBCOPY``, which makes extraction a little more complicated.
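
Which of those two paths applies can be read off the INMR02 metadata shown
earlier. The sketch below is an assumption about how a caller might make
that decision, not this library's API; it takes the metadata as a plain
dictionary shaped like the JSON above, and ``dataset_kind`` is a
hypothetical name.

.. code-block:: python

    def dataset_kind(metadata: dict) -> str:
        """Classify parsed XMI metadata by the utilities named in INMR02."""
        records = metadata.get("INMR02", {}).values()
        utilities = {rec.get("INMUTILN") for rec in records}
        if "AMSCIPHR" in utilities:
            return "encrypted"    # encrypted files are not supported
        if "IEBCOPY" in utilities:
            return "partitioned"  # contents need IEBCOPY unload handling
        return "sequential"       # contents can be extracted directly


    # Trimmed from the test_seq.xmi metadata shown earlier.
    seq_meta = {"INMR02": {"1": {"INMUTILN": "INMCOPY", "INMDSORG": "PS"}}}
    print(dataset_kind(seq_meta))  # sequential

Checking for ``IEBCOPY`` rather than counting INMR02 records keeps the
result correct whether or not the file also carries a message.
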