System/380 Principles of Operation - Version 2.X

This document was written mostly by Paul Edwards and is released to the public domain. It outlines the principles and history behind both MVS/380 (version 1.0 was released in June 2009) and VM/380 (version 1.0 was released in July 2009).

DRIVING FORCE
-------------

The lack of a free C compiler on z/OS meant that some people, both past and present, were unable to cost-justify the expense of a C compiler and thus had to write in a non-preferred language. The free GCC compiler has had an S/370 target for approximately two decades, but it was always a case of "so close and yet so far", as unfortunately the GCC compiler itself had not been written with a non-Unix-based C90 environment in mind. Even just opening a file - open() would often be used instead of fopen() - despite the fact that GCC was basically a text processing application that could have been written in pure C90. The effort to overcome these problems is a story in itself (documented further down), but the end result was that GCC (which inherently generates 31-bit clean code) was ported to the (free) 24-bit MVS 3.8j operating system. However, the memory constraint of approximately 9 MB, for both the 3 MB executable and its data, placed severe restrictions on how large a program being compiled could be before running out of memory. GCC was, for example, unable to recompile all of itself on MVS. On VM/370, where nearly 16 MB total is available, the situation was different - GCC could indeed recompile itself, although not all modules at full optimization (to do that required approximately 23 MB, including 3 MB for the compiler itself). Basically, GCC needed to be executed as a 31-bit application rather than being constrained to 24 bits by the operating system and hardware. With z/OS in mind, being 31-bit was never a problem. However, z/OS is not available for free, so hobbyists were not directly able to write software targeting z/OS.
Going through hoops, it was possible to verify that z/OS worked, but what was ideally needed was a 31-bit MVS, even if it was substandard compared to z/OS. Independently of the effort to make GCC work on an EBCDIC platform, a C runtime library (PDPCLIB) had been under development since 1994. Initially written on the PC, when access to C/370 was suddenly made available (in 1997), the mainframe (then OS/390) was able to be targeted. The design of PDPCLIB was such that all OS (in this case, MVS) access was done via one assembler file less than 1000 lines long. GCC meanwhile was of the order of 400,000 lines of C code, which then became 700,000 lines of assembler. The important thing about this generated assembler was that it was "stock standard". No OS calls, just clean 31-bit-capable code. Whether the executable was 24-bit or 31-bit came down to just 1000 lines of hand-written assembler. And these 1000 lines of code would determine the fate of EVERY C program, not just the GCC compiler itself. And some of those programs require even more memory than GCC. These 1000 lines of assembler code were eventually made AMODE 31, RMODE ANY for a good z/OS target. The cleanliness of the generated code and the (deliberate) isolation of OS-dependent code had always held out the hope that one day those 1000 lines could be replaced with something that would allow the rest of the compiler to run 31-bit, free from the constraints of a 24-bit operating system. Something like a standalone program. When it was time to analyze what could be done, it was noted that those 1000 lines could cope with being executed in AMODE 24, even if the caller was running AMODE 31, using ATL (above the 16 MB line) data, because the data that the assembler operated on was almost all obtained by the assembler code itself in a previous call, or resided on the stack (DSA - dynamic save area - which was allocated in the startup code, the other very small bit of assembler).
A simple AMODE switch to 24-bit should have been all that was required. What this meant was that if there was some way to get into 31-bit mode - noting that interrupts might need to be disabled to prevent the 24-bit operating system from interfering and getting confused - the C code would be able to run freely until it hit the assembler, at which point it could switch mode back and re-enable interrupts, and no-one would be any the wiser. Sophisticated OS services like SPIE would possibly not be available, and multitasking might have needed to be temporarily halted, but none of these things were actually required in the situation in question (a hobbyist system trying to do a large compile, and when complete, return to business as usual). MVS (24-bit) would have already loaded the 31-bit-capable code into memory, so it would just be sitting there waiting for an appropriate machine to execute it, ie something like this sequence:

1. Suspend the entire traditional S/370 machine when ready to enter 31-bit mode.

2. Switch in an artificial machine (resembling S/390 to some extent) that could cope with 31-bit memory addresses, all in real memory, thus allowing application (GCC) logic to access large data structures, but not requiring operating system services.

3. When the application was ready to do I/O, and thus switch back to 24-bit mode ready for interaction with the operating system, at the point of mode transition, switch out the artificial machine and switch in the S/370 machine, which would be unaware that anything had actually happened unless there was some timing issue or interrupt issue.

The question was probably not whether it was possible, but rather how much work was required to construct a machine capable of fulfilling such a requirement. In the end a method was found that only involved about 20 lines of code changes to the S/370 system provided by the Hercules emulator to produce an artificial S/380 system that was able to slot in very cleanly indeed.
Interrupts did not need to be disabled. The machine type didn't need to be repeatedly swapped. Real memory was not required. The simple technique involving mapping memory was first made to work in November 2007 (by Paul Edwards) and a formal release of the new architecture followed in December 2007. A new SVC (233) was introduced (by "somitcw") with MVS/380 0.2 to obtain and release ATL memory. By MVS/380 0.4 (March 2008) the SVC was hidden inside the GETMAIN (thus allowing source code compatibility with z/OS) until August 2008 (although not formally released until January 2009 with MVS/380 0.9), when Gerhard Postpischil developed a technique to intercept SVC 120, which provided 31-bit binary compatibility between MVS/380 and z/OS (at least for applications that conformed to the 31-bit interface that MVS/380 supported). At the same time (January 2009), CMS had also been modified (by Robert O'Hara) to allow the same source and binary compatibility between VM/380 0.9 and z/VM. In March 2009, Jason Winter introduced a far more sophisticated flavor of S/380 - providing for memory protection, multiple memory requests and even a semblance of virtual memory support. At time of writing, this version requires people to roll their own version of Hercules, and is dependent on Jason's JCC compiler. But for those prepared to work past those barriers to adoption, memory protection is technically available. Either implementation (or others in the pipeline) is transparent to the actual application (at least applications that use a heap and thus only make one memory request of the OS - which GCCMVS is able to generate, including when it generates itself). In June 2009 MVS/380 1.0 was released, followed in July 2009 by VM/380 1.0, after a suitable form of distribution was found for both products.
In August 2009, during some unrelated ftp work involving an SVC intercept, Adrian Sutherland proposed another radical change - offload the SVC 120 intercept, and other MVS functionality such as an SQL server, to the PC, using Hercules as the bridge, bypassing the need to port all software to MVS. The concept was proven for the SVC 120 intercept at least, and has opened the door for quite nice memory protection and virtual memory facilities. August 2009 also saw VM/380 1.1 released with some bug fixes. In January 2010, Michel Beaulieu managed to get a program to switch to 31-bit mode under DOS/VS under VM/380 under Hercules/380, and VSE/380 effectively came into existence. Also in January, Amrith K started working on disassemblies of MVS code. Recognizing the infrastructural problems that made this process inefficient, Gerhard made extensive changes to the disassembler while Paul organized rationalizing the source code and putting it into CVS. In January 2011, VSE/380 1.0 was released. In March 2011, MVS/380 1.1 was released, with Scott Cosel taking over. In October 2019, MVS/380 2.0 was released by Paul Edwards, and MVS/380 1.X ceased to be a supported target for 31-bit programs. Programs are now allowed to make multiple ATL memory requests.

S/380 HARDWARE ARCHITECTURE
---------------------------

Hercules was used to create the necessary "hardware". The existing S/370 was used as a base, and basically renamed to S/380, to avoid the need to create a 4th machine type. There is a flag to downgrade S/380 to S/370, but it is S/370 that is considered to be an "option". Some instructions available in S/390 were added to S/380. The way Hercules is constructed, this was a minor modification. The S/370 I/O remains. This is absolutely essential since all of MVS 3.8j uses it, and the goal is to not have to rewrite MVS, where the complete source code is not even available (not even PL/S-generated "source").
One of the S/390 instructions is "BSM", which allows switching to 31-bit mode. Some small changes (e.g. not using a fixed 24-bit address mask) were necessary to get S/380 to respect that mode change. The biggest change was what to do about the ATL (above the 16 MB line) memory. All of MVS 3.8j is 24-bit. Neither the operating system nor any applications ever reference memory above the line. Similarly, the S/370 architecture meant that there was no expectation for more than 16 MB of real memory to be used. All virtual memory references resolved to BTL (below the line) addresses. So there was nothing in existence (it was never logically possible to create such a thing) to interfere with any use of ATL memory. As such, the change required was decidedly simple - map any ATL reference into the equivalent ATL address of real memory. This meant that all address spaces resolve to the same address, so you can only run one 31-bit application at a time if you want to be assured of memory protection. Given the young state of S/380, and the body of current users, in practice this is a non-issue, as most people don't even run one 31-bit application, never mind having a requirement to multitask multiple 31-bit applications. In addition, storage keys were ignored so that the operating system didn't require modifications to set the storage key of ATL memory to 8 for use by problem-state programs. As noted earlier, these tradeoffs don't exist in Jason Winter's version. In order to run 31-bit applications under CMS, one more modification was required (that was not required for MVS). Since CMS runs under CP, and neither CP nor CMS are 31-bit aware, when a CMS application does an SVC, CMS doesn't simply load the old PSW to return. It instead constructs a new PSW, losing the old 31-bit status in the process. Actually, CP, running in ECMODE, does save the fact that the interrupt occurred in 31-bit mode, so when returning from hardware interrupts, there is no problem.
The problem only arises because CMS runs in BCMODE and thus gets an inherently 24-bit PSW. But it is CMS that needs to decide what the return address will be, and obviously, with zero knowledge of 31-bit, it can't construct the required PSW. This problem was circumvented by saving the return address during an SVC, and, when an LPSW was done, checking whether the address being returned to was the same address previously noted, and if so, restoring 31-bit mode prior to returning to the application. When CP is eventually modified to include similar logic, this change can be removed from Hercules/380 (where it doesn't really belong). Note that while this change satisfies the requirements for most SVCs, there are some SVCs that return control to a different address, thus bypassing S/380's ability to detect it. If calling these SVCs, the application is required to switch to 24-bit mode prior to invoking the SVC (this is not particularly onerous, since the application will already have to do such AMODE switches whenever calling the file I/O routines, which are done as calls, and aren't (or weren't) 31-bit clean, at least in the XA days). MVS does not have this problem, as the SVCs are not intercepted and the entire ECMODE (31-bit aware) PSW context is saved and restored on SVC return. S/380 was expanded to give more DAT options, as follows: If CR0.10 is not set (ie we are using S/370 mode), and if CR13 = 0, then ATL memory accesses are done to real storage, ignoring storage keys (for ATL accesses only), even though (S/370) DAT is switched on. This provides instant access to ATL memory for applications, before having to make any OS changes at all. This is not designed for serious use though, due to the lack of integrity in ATL memory for multiple running programs. It does provide very quick proof-of-concept though. If CR0.10 is not set, and if CR13 is non-zero, e.g.
even setting a dummy value of 0x00001000, then ATL memory accesses are done via a DAT mechanism instead of using real memory. This is a split DAT. With the above dummy value, the size of the table is only sufficient to address BTL memory (ie 16 MB), so that value of CR13 (note that CR13 is of the same format as CR1) effectively prevents access to ATL memory completely, if that is required for some reason. When CR13 contains a length signifying more than 16 MB of memory, then that is used to access ATL memory, and CR1 is ignored (ie split DAT). Also, CR0 is ignored as far as 64K vs 1 MB segments is concerned - 1 MB is used unconditionally (for ATL memory). However, even if CR13 is non-zero, it is ignored if CR0.10 is set to 1. Anyone who sets CR0.10 to signify XA DAT is expected to provide the sole DAT used, and not be required to mess around with CR13. This provides the flexibility of some ("legacy") tasks being able to use split DAT, while others can be made to use a proper XA DAT. The S/370 architecture offers 1 MB and 64K segments, and either may be selected here. It is also possible to enable extended addressing, which will allow multiple BTL address spaces. S/390 normally only provides 1 MB segments, but with Hercules/380 you can also have 64K segments should that serve some purpose. PDOS/380 was created to exercise the 4 distinct options for protected ATL access. First, the segment size is either 64K or 1 MB. Secondly, either a complete switch is made to the XA DAT, or else the DAT can be split, using 370 DAT below the line and 1 MB XA DAT above the line. These 2 choices create the 4 different flavors of protected S/380 - 64K, 1 MB, XA 64K and XA 1 MB. The normal one to choose is 1 MB, split DAT. Different projects to upgrade MVS 3.8j to S/380 may choose any of the 4 options Hercules/380 opens up. Also note that PDOS/370 provides ATL access via the same crude method that MVS/380 does. S/380 also introduces a new AMODE - AM32.
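As a summary of the DAT options just described, the CR0.10/CR13 selection rules can be modeled in C. This is a hypothetical sketch of the decision logic only, not the actual Hercules/380 source; the function and constant names are invented, and the "does CR13 cover more than 16 MB" check (which in reality decodes a segment table length in the same format as CR1) is reduced to a boolean parameter:

```c
#include <stdint.h>

enum dat_path {
    DAT_S370,      /* normal S/370 DAT via CR1 (BTL access) */
    DAT_XA,        /* full XA DAT - CR13 is ignored */
    ATL_REAL,      /* crude mode: ATL access maps straight to real storage */
    ATL_SPLIT,     /* split DAT: ATL access via the separate CR13 table */
    ATL_BLOCKED    /* CR13 table only covers BTL - ATL access prevented */
};

#define CR0_BIT10 (1UL << (31 - 10))   /* CR0.10 set = XA DAT selected */
#define LINE      0x1000000UL          /* the 16 MB line */

/* atl_covered: nonzero if the CR13 length signifies more than 16 MB. */
enum dat_path dat_path(uint32_t cr0, uint32_t cr13, uint32_t vaddr,
                       int atl_covered)
{
    if (cr0 & CR0_BIT10)
        return DAT_XA;      /* XA DAT is the sole DAT; CR13 ignored */
    if (vaddr < LINE)
        return DAT_S370;    /* BTL always goes through ordinary S/370 DAT */
    if (cr13 == 0)
        return ATL_REAL;    /* instant ATL access, no integrity */
    return atl_covered ? ATL_SPLIT : ATL_BLOCKED;
}
```

The split-DAT case is where a dummy CR13 value such as 0x00001000 lands: CR13 is non-zero, but its table is too short to cover ATL memory, so ATL access is effectively blocked.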
If AM32 is active, then the PSW will have bits 30 and 32 on, and bit 31 off. To get into AM32, you can either load a PSW that has those bits set appropriately, or you can configure the Hercules/380 "hardware" to convert any attempt to BSM to AM31 into a BSM to AM32. There are also options to force a BSM or LPSW to AM24 to instead activate AM32, and an attempt to get into AM64 can be translated into AM32 activation. So programs that are written to "step down" to work properly on MVS/XA, or do a mode switch for other reasons, can be forced to stay totally in AM32. AM32 has been tested on MVS/CMS/VSE/PDOS.

MVS/380 PROGRAMMING INTERFACE
-----------------------------

MVS 3.8k

The last free release of MVS was 3.8j. However, with the benefit of hindsight, specifically OS/390 allowing 31-bit programs to reside ATL (above the 16 MB line), it is useful to posit a new 24-bit release of MVS that allowed programs to be developed on MVS 3.8k (still totally 24/24), and run as 24/24 on MVS 3.8j and 31/ANY on OS/390 (ie that meet the requirement of ANY/ANY modules). This theoretical system (ie not yet built, although most concepts of MVS 3.8k are indeed present in MVS/380) has the following notable changes compared to MVS 3.8j: Macros such as PUT are 31-bit clean, with code like this:

         L     15,48(0,1)          LOAD PUT ROUTINE ADDR

changed to:

         SR    15,15
         ICM   15,B'0111',48+1(1)  LOAD PUT ROUTINE ADDR

The GETMAIN macro is modified to support LOC=BELOW as an alternative to using "GETMAIN R" (which also guarantees to get BTL storage on OS/390). MVS 3.8k happily accepts the LOC=BELOW at runtime (as MVS 3.8j already does). The assembler is modified to recognize the BSM and BASSM instructions, although those instructions should never be executed unless the application detects that the program is executing ATL, which can only be true if it is running on a 31-bit system that supports those instructions. The assembler also recognizes AMODE ANY and RMODE ANY.
The linker (IEWL) accepts "AMODE=31,RMODE=ANY" for marking ANY/ANY modules (IBM unfortunately prevented the use of the more accurate ANY/ANY labeling for load modules in OS/390). Even though ANY/ANY modules are appropriately labeled on the MVS 3.8k system, those attributes will be ignored by MVS 3.8k (the same as MVS 3.8j), and the module will be run 24/24. Needless to say, applications, as well as obeying those rules, must not introduce 24-bit dependencies of their own by putting rubbish into the high byte of an address. In fact, ideally there shouldn't even be rubbish put into the top bit of an address (ie using it as a flag), so that 32-bit modules can execute as AM32 or AM64. Applications should aim to be trimodal (not just bimodal), and designed to test at startup what AMODE and RMODE they were invoked as, and act appropriately. Acting appropriately includes that if the module was marked AM31/RM24, we may be running on MVS/XA, which requires I/O routines to be run AM24, so the main application should "step down" to AM24 (OS AMODE) prior to calling an I/O routine, and then it can restore the original (application) AMODE. If the module is running AM31/RM31 then no step-down is possible, so none should be done. If the module is running AM32 or AM64, but RM31, then it should step down to AM31. MVSSUPA has example code to do this; see the @@SETUP routine and the GAMOS and GAMAPP macros. Note that MVS 3.8k is somewhat similar to the 80386SX processor and the Win32s API, which allowed more advanced software to run in a somewhat degraded mode on older systems.

MEMORY

While there is nothing currently (ie this is subject to change without notice - and Jason Winter has a version where this does not apply) physically preventing an application from directly accessing ATL memory, the official interface is via the normal z/OS GETMAIN with the LOC=ANY parameter.
The MVS 3.8j macro was updated to allow this parameter, and the SVC 120 which it invokes is intercepted by an add-on program (SVC120I) that is usually run at system startup. SVC120I also allows the operator to partition the ATL memory to allow 31-bit programs that go through the proper interface to multitask while sharing the memory (although at time of writing, in the non-JW version, there is nothing preventing such applications from overwriting each other's memory). Programs that use the official interface are portable at both source and binary level to z/OS, since z/OS uses this exact facility to provide memory to applications. Currently the (non-JW) GETMAIN enhancement does not allow more than one simultaneous ATL memory request from the same program, although if the memory is first freed, it is then available to be reobtained. Depending on your application, this may or may not be a problem. C programs usually use a heap, thus a single request for a large chunk of memory is quite sufficient, and generally preferred for performance. This is certainly the case for users of PDPCLIB, and people using this library can have this done automatically for them. One more restriction on the (non-JW) GETMAIN is that only requests for a chunk of memory equal to or greater than 16 MB will go to ATL memory. The reason for this is that any such large request would otherwise fail, so there is no harm done by only honoring a single request for a block such as this. However, an application that codes LOC=ANY may do that to signify that it doesn't care if the memory resides above the line, even if it only requests a small amount of memory. So an application that requests 3 blocks of 1 MB of memory would fail on the second request if ATL memory is used, but would succeed if BTL memory is obtained. So ATL memory is (currently, non-JW) reserved for use in a very specific GETMAIN request. 
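The (non-JW) LOC=ANY routing policy just described - a single ATL chunk, granted only for requests of 16 MB or more, reobtainable once freed - can be sketched in C. This is an illustrative model only, not the actual SVC120I or GETMAIN intercept code; all names here are invented:

```c
/* Hypothetical model of the (non-JW) MVS/380 GETMAIN LOC=ANY policy:
   a request goes above the line only if it is at least 16 MB and no
   ATL block is currently outstanding for the program; everything else
   falls through to an ordinary BTL GETMAIN. */

#define LINE (16UL * 1024 * 1024)

struct task {
    int atl_in_use;   /* only one outstanding ATL request is allowed */
};

enum where { FROM_BTL, FROM_ATL };

enum where getmain_loc_any(struct task *t, unsigned long bytes)
{
    if (bytes >= LINE && !t->atl_in_use) {
        t->atl_in_use = 1;   /* the single ATL chunk is now taken */
        return FROM_ATL;
    }
    return FROM_BTL;         /* small or repeat requests stay below the line */
}

void freemain_atl(struct task *t)
{
    t->atl_in_use = 0;       /* once freed, ATL memory can be reobtained */
}
```

Under this model, a program making three 1 MB LOC=ANY requests is satisfied from BTL memory each time, which is why small requests are deliberately not sent above the line.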
This restriction is also expected to be lifted in the future to provide the same facilities that Jason Winter's version has. Beta MVS/380 2.0 has lifted the restrictions, and multiple ATL memory requests can be made, so long as the caller is in AM31.

ASSEMBLER

IFOX00 has some constraints that have been lifted, bringing it a little closer to IEV90. IEV90 has been defined as an alias to allow assembly JCL to be more compatible between MVS/380 and z/OS. It is possible to construct assembly JCL in such a way that the same JCL will work on both real IFOX00 and real IEV90, and it is this style that is thus portable between the two systems. Access to the S/390 instructions has been provided by copying SYS1.ZMACLIB into SYS1.MACLIB. This implementation is subject to change without notice.

FILES

A fundamental concept is how data is stored in a system. There are two basic forms of data - text data, meant to be read by humans in an editor, and binary data, designed to be read by software. Text data usually contains characters like the letter 'A' and spaces, and is divided into lines. On Unix systems a control character, the newline, x'0a', is inserted into the data stream to delimit lines. There are advantages and disadvantages to this system. On MVS, a completely different approach was taken. The length of the line is given upfront (as a binary value). This also has advantages and disadvantages. There are multiple ways of storing this length field, and MVS uses a few different ones internally. The most important one, and the one that programmers should be aware of, is the RDW format, since this will become visible (ie in the data stream) when reading text data in binary mode, depending on where the data is currently stored. Specifically, if the data is stored in a RECFM=V or VB dataset, then the RDWs will be visible. In addition, if a RECFM=V dataset is transferred to the PC in binary mode via ftp with the rdw option, then the RDW will appear in the data on the PC.
One of the advantages of the length-up-front format is that binary data can also be stored and not be misinterpreted as a line delimiter, so a V dataset can contain either text or binary data, and can be transferred to the PC intact (ie not losing the line (aka record) boundaries). The RDW format is a 4-byte field. The first two bytes are the length of the line plus the length (4) of the RDW itself, stored in big-endian format. The second two bytes are reserved, and should always be set to x'00'. Internally, MVS stores multiple lines together in a block, prepended with a BDW, which looks just like an RDW itself, but this should not be made visible to the programmer for normal applications. Internally there's even another level, a physical block size (aka sector size - but unlike on the PC, a variable sector size), maintained by the hardware, which similarly should not normally be made visible to the programmer. Unfortunately, for historical reasons, there is quite a lot of software written that makes use of either or both of the BDW and the hardware-stored size, and that software has thus become intimately tied to the hardware it resides on. Fortunately, the data that such software produces can usually be converted into a more logical form (ie the RDW standard) for subsequent manipulation by more generic programs - and converted back again too - using a standard utility, IDCAMS REPRO. So here are the general rules. If you wish to store text data, store it in RECFM=VB, with a recommended LRECL=255,BLKSIZE=6233, which provides a good sector size for most disks. If you wish to store binary data that by coincidence happens to be a sequence of variable-length blocks of data (ie records), then take advantage of the existing RDW convention and store it in the same RECFM=VB type dataset as above, and increase the LRECL from 255 to 6229 (giving a true record length of 6225) if desired.
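The RDW layout described above is easy to illustrate in C. This is a sketch only; make_rdw and rdw_datalen are invented names, not part of PDPCLIB or any IBM API:

```c
#include <stdint.h>

/* Build a 4-byte RDW for a record of 'datalen' bytes: the first two
   bytes hold datalen + 4 (the RDW counts itself) in big-endian order,
   and the last two bytes are reserved and must be x'00'. */
void make_rdw(unsigned char rdw[4], uint16_t datalen)
{
    uint16_t total = (uint16_t)(datalen + 4);
    rdw[0] = (unsigned char)(total >> 8);   /* big-endian high byte */
    rdw[1] = (unsigned char)(total & 0xFF);
    rdw[2] = 0;                             /* reserved */
    rdw[3] = 0;                             /* reserved */
}

/* Recover the data length (excluding the RDW itself) from an RDW. */
uint16_t rdw_datalen(const unsigned char rdw[4])
{
    return (uint16_t)((((uint16_t)rdw[0] << 8) | rdw[1]) - 4);
}
```

So an 80-byte record carries an RDW of x'0054 0000' (84 decimal), and the maximum VB record of 6225 data bytes carries x'1855 0000' (6229 decimal).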
Although, depending on available hardware, it is possible to go above 6229, if you have a need to do so you should probably treat the data as arbitrary data rather than trying to shoehorn it into the otherwise convenient VB format. If you are storing load modules, use a block size of 6144, which IBM chose as a reasonable value for the disks supported. For all other arbitrary binary data (e.g. a "zip" file) that doesn't neatly fit into fairly short "records", the data should be put into RECFM=U with a recommended LRECL=0,BLKSIZE=6233. With this format, it is essentially the same as a PC, except the sector size is 6233 instead of 512 or 2048 or whatever. The "U" means "Undefined", ie there is no record structure; it's just arbitrary bytes. However, there is one more exception. Many applications have a need to write binary data in fixed-length chunks (records). There is a convenient type of dataset to store that sort of data in as well - RECFM=FB. If your data consists of e.g. 80-byte records, you should store them in a dataset of format RECFM=FB,LRECL=80,BLKSIZE=6160. This is the largest number, less than or equal to 6233, that is a multiple of the LRECL. For co-existence with existing software - if the existing software writes to RECFM=U yet expects the block boundaries to be preserved (ie the data should really have been put into VB), then try to restrict the block size to 6225 so that it can be put into a proper VB dataset if required by other applications that are coded to expect proper RDWs. If the existing software reads only from RECFM=F, when the data actually contains e.g. assembler code, then use a utility like COPYFILE (with the -tt option) to convert it from its natural RECFM=V format into the F format. COPYFILE will append spaces to each line to pad out to the required width (and truncate them on return). If the existing software puts text data into RECFM=U as separate sectors for each line, then use IDCAMS REPRO to get it into its natural VB format.
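The FB block size rule above (the largest multiple of the LRECL not exceeding the recommended maximum) is simple integer arithmetic, sketched here with an invented helper name:

```c
/* Largest BLKSIZE <= the recommended maximum that is still a multiple
   of the LRECL - e.g. LRECL=80 with a 6233 maximum gives 6160, as
   recommended above.  Integer division truncates, which is the point. */
unsigned int fb_blksize(unsigned int lrecl, unsigned int max)
{
    return (max / lrecl) * lrecl;
}
```

The same arithmetic with the 3390 1/3-track maximum of 18452 gives the BLKSIZE=18400 recommended further down for 80-byte assembler files.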
If space efficiency on a 3390 is an absolute necessity, then 18452 will give 1/3 track blocking, while not eliminating the ability to use a 3350. However, if you would like to store an IEBCOPY image of your data files in a simple (RECFM=V rather than VS) format, then allow a 20-byte overhead for all datasets, which IEBCOPY requires (12 bytes overhead plus the BDW+RDW for RECFM=V). This would make the recommended block size 18432, which happens to be a multiple of 1024, which makes it ideally suited for load modules too. But normally this is not a consideration, so the recommended sizes are RECFM=VB,LRECL=255,BLKSIZE=18452 for a typical text file, RECFM=FB,LRECL=80,BLKSIZE=18400 for a typical assembler file, RECFM=U,LRECL=0,BLKSIZE=18452 for a zip file and RECFM=U,LRECL=0,BLKSIZE=18432 for a load module. Sometimes (e.g. when using unzip) a file may be put into RECFM=U without knowing whether it is binary or text. If Unix-style text data is stored in RECFM=U (ie with newline separators), then it can be moved into its more natural RECFM=VB format using COPYFILE (with the -tt option). If RDW-format data is stored in RECFM=U and you want to move it into its natural format, then use COPYFILE (with the -bb option) to do that. Fixed-length data can be moved across in exactly the same way. RDW-format data can also cope with being temporarily copied into a RECFM=F dataset and out again intact (because the mandatory NUL-padding at the end can be recognized by COPYFILE as the true end of the data, since it is an invalid RDW), but it is not recommended to store RDW data in RECFM=F. Hercules/380 provides a convenient interface for getting data directly from the PC into the appropriate target dataset format. This is done via extensions to the TDF format.
You can use the "TEXT" format to present text data as:

o RECFM=U with no newline characters (not recommended)
o RECFM=U with newline characters
o RECFM=F with space-padding
o RECFM=V (this is the recommended option)

You can use the "FIXED" format to represent binary data as either RECFM=U or RECFM=F. You can use the "RDWUND" format to present RDW data as RECFM=U. You can use the "RDWVAR" format to present RDW data as RECFM=V. You can use the "RDWFIX" format to present RDW data as RECFM=F with space-padding. You can use the "RDWFIXN" format to present RDW data as RECFM=F with NUL-padding. Similarly, if data is written to an output tape dataset, it can be extracted using "hetget" and put into the appropriate format. Ideally, RDW data would be written as RECFM=V, where a binary extraction will append the RDW automatically. Binary data of RECFM=U will lose any block boundaries, although if you know it is really RDW data, there is an option to force an RDW to be added. Text data is ideally written with RECFM=V, and a text extraction will convert that to normal newline (Unix/DOS) format. A text extraction of RECFM=F will strip blanks and add a newline. A text extraction of RECFM=U will simply do EBCDIC to ASCII conversion of all characters, including an expected newline, although if you know the data is missing newlines, those can be force-added at block boundaries. In general, binary data either needs to be of fixed length, unformatted (ie the program reading the data has its own mechanism for recognizing data), or have RDWs, in order to be able to be moved to another system and still be usable. So you have to be careful that you are storing your data in one of those formats on both MVS and the PC. Note that MVS applications are traditionally written to read/write records, or if not records, then blocks.
But these concepts do not exist in C, nor are they needed in the PDPCLIB implementation, which hides the records so that a normal C byte data stream is presented to the C program. Also note that reading/writing to devices is beyond the scope of PDPCLIB. Also note that Unix concepts such as file attributes (read/write/executable at user/group/world level) are not part of the C standard, do not exist on MVS, and applications should thus not be written to be dependent on these things.

PDPCLIB

Users of any C program linked with PDPCLIB will need to define 3 standard DDs - SYSIN, SYSPRINT and SYSTERM, corresponding to stdin, stdout and stderr. DCB information will need to be provided for new output datasets, unless the IBM default of RECFM=U is desired. The startup code is designed to expect parameters in either TSO or BATCH style and will adjust automatically. Files opened via fopen() can have a filename of "dd:xxx", which signifies that a DDNAME of "XXX" exists to be opened. Otherwise, the filename will be dynamically allocated to a generated DDNAME (PDPxxx). Text files (ie second parameter of fopen = "r" or "w") are processed as follows. If the file is F or FB, then trailing blanks will be stripped from records on input and replaced with a single newline character. On writing, the newline is stripped and records are blank-padded. If the line is longer than the LRECL, extraneous characters are silently discarded. If the file is V or VB, then on input the BDW and RDW are stripped, and a newline is added at the end of the record - unless the record consists of a single space, in which case the space is stripped as well. On write, empty lines have a space added. This is consistent with the handling of Variable records on MVS (e.g. when using the ISPF editor), and some standard IBM utilities (e.g. IEBCOMPR) cannot cope with truly empty records (RDW=4). If a line is longer than the maximum LRECL, extra characters are silently dropped.
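The F/FB text-mode read rule above (strip trailing blanks, present a single newline) can be sketched in C. This is an illustration of the described behavior, not the actual PDPCLIB code, and the function name is invented:

```c
#include <string.h>

/* Sketch of the F/FB text-mode read rule: trailing blanks are stripped
   from the fixed-length record and a single newline is appended.
   'out' must have room for lrecl + 2 bytes.  Returns the line length
   including the newline. */
size_t fb_record_to_line(const char *rec, size_t lrecl, char *out)
{
    size_t len = lrecl;
    while (len > 0 && rec[len - 1] == ' ')
        len--;                       /* strip trailing blanks */
    memcpy(out, rec, len);
    out[len++] = '\n';               /* padding becomes one newline */
    out[len] = '\0';
    return len;
}
```

An all-blank record thus comes back as an empty line (just the newline), mirroring the way the writing side blank-pads each line back out to the LRECL.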
If RECFM=U, then on read, block boundaries are not significant, and the byte stream is presented unchanged to the application. Unlike with IBM's C compiler, block boundaries don't get newline characters inserted into the byte stream. The reason for this is that inserting them would prevent a binary read of a text file from preserving the data, since the block boundaries would disappear in such a scenario. When writing to RECFM=U text files, data is written until a block is full. Unlike in IBM's implementation of C (which implies that it is not a good idea to store text data in RECFM=U if it can be avoided, and that RECFM=V is a better choice), newline characters do not cause a new block to be written. Once again, this allows a binary transmit of a RECFM=U file to have the line separators preserved when the data arrives at, say, the PC side. No special handling of the block boundaries needs to be done. Binary files (ie second parameter of fopen = "rb" or "wb") are processed as follows. If the file is F or FB, then on input, data will be presented to the application unchanged. On output, data is also written unchanged, except that the last record will be padded with NUL characters if required. If the file is V or VB, then on input the BDW will be stripped, but (unlike IBM C, which implies that it is not a good idea to store binary data in RECFM=V, and that RECFM=U is a better choice) the full RDW will be presented to the application. This makes the byte stream compatible with what a PC application would see when reading a VB file transferred via ftp with the "rdw" option. On write, an RDW needs to be provided by the application. Any invalid RDW causes a file error condition, and no further data is written, with one exception: an RDW of all-NUL is a signal to discard any further data written. This allows a binary copy of a V dataset to an F dataset to be copied back to V without change or error, even if NUL-padding was required.
(Note that this consideration doesn't apply to text files, since no RDW is provided by the application.) If a provided RDW is greater than the maximum LRECL, then the RDW will be silently adjusted and the extra data silently discarded. RECFM=U files will have the raw data presented as-is, with block boundaries ignored. Opening a PDS without a member for read will cause the directory to be read and presented as a byte stream. Any attempt to write to a PDS directory will cause an abend.

Here is a visual presentation of the different file formats on MVS. In all cases, a COPYFILE -tt was used to copy this instream data:

//IN DD *
ABC
DEF

GGGG
/*

into the different record formats available on MVS. Then a binary hexdump is done so that the internal representation is shown.

//  DCB=(RECFM=FB,LRECL=10,BLKSIZE=100)

000000  C1C2C340 40404040 4040C4C5 C6404040  ABC       DEF
000010  40404040 40404040 40404040 4040C7C7                GG
000020  C7C74040 40404040                    GG

//  DCB=(RECFM=VB,LRECL=10,BLKSIZE=100)

000000  00070000 C1C2C300 070000C4 C5C60005  ....ABC....DEF..
000010  00004000 080000C7 C7C7C7             .. ....GGGG

//  DCB=(RECFM=U,LRECL=0,BLKSIZE=100)

000000  C1C2C315 C4C5C615 15C7C7C7 C715      ABC.DEF..GGGG.

ADDRESSING MODE

In the past, programs desiring 31-bit execution may or may not have been entered in 31-bit mode directly, and were required to detect what mode they were called in and make the appropriate switch, then restore the caller's AMODE on exit. However, now all programs are expected to honor the AMODE/RMODE settings the module is marked with, and in the case of PDPCLIB programs, the beta now does that (function @@SETUP in mvssupa.asm). MVS/380 1.x will no longer be a supported platform for 31-bit programming.

GENERAL USABILITY

If you are considering using MVS/380 to do real work, bear in mind the following:
o Although CICS is not available, there is a commercial product KICKS which may do the job.
o The version of Cobol currently available is so old that the syntax is probably unsuitable.
  z/Cobol and Open Cobol are both potential solutions to this problem, but not yet implemented.
o There is a port of SQLite available in source form for the JCC compiler. There is also a plan to offload this major functionality to the PC, using Hercules as a bridge.
o tcp/ip and ftp are not currently available in a generally usable form, but are available (via bridge) in a demonstration form.
o RPF provides an ISPF-like environment, and REVIEW is an ISPF-like editor.

VM/380 PROGRAMMING INTERFACE
----------------------------

MEMORY

The interface for CMS programs is identical to that for MVS users. GETMAIN with LOC=ANY will obtain ATL storage. There is no partitioning facility available in CMS, but partitioning is not a real concept for CMS anyway. It would be a concept for CP, but there is no facility in CP for partitioning, nor any communication from CMS to CP. So only one guest OS should run an ATL-using application. CMS applications that obtain memory via this interface can also be ported to z/VM at both the source and binary levels.

PARAMETERS

VM/380 provides EPLIST support. This is the same in VM/380 as in z/VM. Parameters should be obtained the same way, by chaining back via the save areas. Once again, this is handled automatically for users of PDPCLIB.

EXEC2

VM/380 provides limited EXEC2 support similar to z/VM. As with z/VM, this is activated via &TRACE. For portable scripting (between VM/380 and z/VM), only EXEC2 is guaranteed to have EPLIST available, and only the subset of EXEC2 commands that are present in EXEC should be used. So that means restricting your EXEC2 scripts to:

&TRACE ALL | OFF
&ARGS
&BEGSTACK (no ALL)
&ERROR
&EXIT
&GOTO
&IF
&LOOP (but no conditions on the loop)
&READ
&SKIP
&STACK (no HT or RT)
&TYPE

No functions (&CONCAT, &DATATYPE, &LENGTH, &LITERAL, &SUBSTR).
No &*, &$, &DISK..., &EXEC, &GLOBAL, &READFLAG, &TYPEFLAG.

ASSEMBLER

On z/VM, the "ASSEMBLE" assembler is quite limited, and for programs with a large number of symbols you need to use "ASMAHL" instead.
VM/380 has simulated this by adding some limited enhancements to "ASSEMBLE", copying that to ASMAHL, and updating the maclibs to provide macros such as BSM. Naturally this is subject to change without notice, but the programming interface remains the same.

MACROS

z/VM rearranged the macro libraries (e.g. replacing CMSLIB with DMSGPI). To allow application portability, the macro library was copied to its new name, as well as having the GETMAIN macro updated (sourced from MVS) and having a BSM macro added to compensate for it not being internally defined in the assembler.

PDPCLIB

Users of any C program linked with PDPCLIB can either define the 3 standard DDs - SYSIN, SYSPRINT and SYSTERM, corresponding to stdin, stdout and stderr - or these will be allocated to the terminal dynamically. New files can either be defined with FILEDEF and opened by DDNAME by specifying a filename of "dd:xxx" where xxx is the DDNAME, or else they can be a full filename. If a full filename is specified, then on creation of an output binary file, DCB attributes are set to RECFM=F, LRECL=800. An output text file is set to RECFM=V, LRECL=2000 by default. Dynamically allocated files are given generated DDNAMEs of the format PDPxxx, where xxx is a number. The startup code is designed to detect an EPLIST, otherwise it gets parameters from a PLIST. However, if a SYSPARM filedef is in place, the parameters are obtained from the first line of that file instead. If both a SYSPARM (even a dummy one) and a parameter are provided, then special processing is signalled, on the assumption that this is an EXEC environment where only a PLIST is available, and the user has difficulty passing long and mixed-case parameters to the application. The parameter list will be lowercased, and only characters preceded by a "_" will be uppercased. Spaces will be stripped unless preceded by "_". If the first parameter is "_+" then the lower/upper rules are swapped. Two underscores will produce a single underscore.
Text files (ie second parameter of fopen = "r" or "w") are processed as follows. If the file is F, then trailing blanks will be stripped from records on input and replaced with a single newline character. On writing, the newline is stripped and records are blank-padded. If the line is longer than the LRECL, extraneous characters are silently discarded. If the file is V, then on input the BDW and RDW are stripped, and a newline is added at the end of the record, unless the record consists of a single space, in which case the space is stripped as well. On write, empty lines have a space added. This is consistent with the handling of Variable records on MVS (e.g. when using the ISPF editor). If a line is longer than the maximum LRECL, characters are silently dropped. Binary files (ie second parameter of fopen = "rb" or "wb") are processed as follows. If the file is F or FB, then on input, data will be presented to the application unchanged. On output, data is also written unchanged, except that the last record will be padded with NUL characters if required. If the file is V, then on input the BDW will be stripped, but the full RDW will be presented to the application. This makes the byte stream compatible with what a PC application would see when reading a VB file transferred via ftp with the "rdw" option. On write, an RDW needs to be provided by the application. Any invalid RDW causes a file error condition, and no further data is written, with one exception: an RDW of all-NUL is a signal to discard any further data written. This allows a binary copy of a V dataset to an F dataset to be copied back to V without change or error, even if NUL-padding was required.

ADDRESSING MODE

Programs desiring 31-bit execution may or may not be entered in 31-bit mode directly, and are required to detect what mode they were called in and make the appropriate switch, then restore the caller's AMODE on exit.
COMMON C CALLING CONVENTION
---------------------------

GCCMVS and GCCCMS (the C compilers bundled with MVS/380 and VM/380) generate a special entry point @@MAIN when a program with main() defined is processed. All function names are uppercased and truncated to 8 characters, and "_" is converted to "@". As such, @@MAIN is distinct from MAIN. @@MAIN simply branches to the assembler startup code (@@CRT0) and control is never returned to it. GCC-generated code pretty much follows the standard OS linkage conventions, except that the list of addresses passed to the called program via R1 is not terminated with a 1 in the top bit of a 32-bit address. In C you are expected to know how many arguments you'll have. In addition, integer parameters are not stored as addresses; instead their actual value is used. This is expected to change in the future to be compatible with IBM and the Language Environment, so macros should be used in preparation for this change. When @@CRT0 is invoked, it sets up a stack. The first 18 words of the stack are a standard OS register save area. The beta @@CRT0 in MVS calls @@SETUP to analyze the AMODE and RMODE it was executed in, and gracefully accepts whatever it was given. It sets a flag to do an appropriate AMODE switch before calling the MVS I/O routines, enabling a single load module to be shipped that works optimally in all of MVS 3.8j, MVS/XA, z/OS and MVS/380 2.0 beta. @@CRT0 then calls @@START, which in turn calls MAIN (the user's "main") - which is NOT @@MAIN (the entry point to the executable). Each routine's save area comes from the GCC stack allocated in @@CRT0. These save areas are chained following OS conventions, ie savearea+4 points to the previous save area, and savearea+8 points to the next one. A routine's save area includes space for its local variables. This amount is calculated by the compiler, and passed as the FRAME= parameter of the PDPPRLG macro.
So R13 points to an area that looks like this:

 0 - unused by C, but PL/I or Cobol might use it
 4 - backchain to previous save area
 8 - forward chain to next save area
12 - R14
16 - R15
20 - R0
24 - R1
28 - R2
32 - R3
36 - R4
40 - R5
44 - R6
48 - R7
52 - R8
56 - R9
60 - R10
64 - R11
68 - R12
72 - unused, but could be used to store a CRAB
76 - pointer to the top of the stack
80 - work area for compiler-generated code (CONVLO)
84 - work area for compiler-generated code (CONVHI)
88 - local variables begin

SEQUENCE NUMBERS
----------------

When IBM produced its first mainframes, there were no interactive CRTs with fullscreen editors which could be used to write program code. Instead, program code was fed onto the system using punched cards. Editing your program consisted of inserting physical cards into an appropriate spot in the card deck, or replacing cards. If you dropped this deck of cards, or lost track of what you had been doing, the thing that saved you was intrusive sequence numbers - ie not logical (invisible) line numbers, but physical sequence numbers that were included as part of the data. Not as intrusive as BASIC line numbers though, which were part of the programming logic. The mainframe sequence numbers were located in columns 73-80 (inclusive, 1-based counting). The assembler code (or indeed other language code, perhaps Cobol) never referred to these sequence numbers, but they were there to keep track of your editing. You might have assumed that once interactive program editors appeared with TSO, MVS programmers all breathed a sigh of relief now that horrible-looking sequence numbers were completely obsolete. You would be wrong. Instead, an entire culture was set up surrounding sequence numbers. Ordinary applications such as a "file difference" program would have specialized code in them to ignore sequence numbers, possibly even by default.
IEBUPDTE, MVS's answer to the "patch" program in Unix, was designed to operate against intrusive sequence numbers, instead of the logical line numbers used by "patch". Even modern programs like IBM's C compiler have an option to ignore columns 73-80 when compiling C source. Probably 95% of MVS programmers continue to use sequence numbers to this day. When using editors, sequence numbers are usually on by default, and an explicit "unnum" or "numbers off" is required to eradicate them. Intrusive sequence numbers are not part of the C standard, and as such, there is no need for a C programmer to use sequence numbers in their C code, nor do their programs need to cater for people still stuck in the card reader days. If people wish "diff" to ignore sequence numbers when comparing 2 files, then the onus is on them to strip sequence numbers from both files before calling "diff". You are free to join the 5% of MVS programmers who find intrusive sequence numbers to be a truly ugly sight. Just bear in mind that when you run GCCMVS and get a strange error message on line 1, yet can't see anything at all wrong with the line, if you scroll to the right you will probably find a sequence number in columns 73-80 which was added by default without your knowledge. The same thing applies if you are submitting your C code via instream JCL. The whole JCL file probably has sequence numbers. One thing of particular interest to note is that there is an extension to the Unix "patch" program in the MVS port. There is an option "-m" or "--mvs-num" which will preserve the original sequence numbers in the file being patched. This allows you to strip sequence numbers from all files, and use all common tools (CVS/SVN/diff/diff3/vi/etc) that are NOT aware of sequence numbers. When you have finished your work, you can then bring all the sequence numbers back from the dead using "patch -m".
FUTURE DIRECTION
----------------

The following projects are either underway or being considered:
o Port OpenCobol to MVS/380.
o Port z/Cobol to MVS/380.
o Port PDPCLIB to other mainframe C compilers (C/370, Dignus).
o Possible enablement of languages other than C in GCC (the C++ target could perhaps use stdcxx as the C++ runtime library).
o Use BREXX or Regina to provide REXX internally to CMS as per z/VM (prototype demonstrated in May 2009).
o Provide the equivalent of mingw (or maybe LE/370) to CMS applications by putting the C runtime library in shared memory, allowing small executables (prototype demonstrated in May 2009).
o Port the 31-bit version of RPF to MVS/380 and use ATL memory to allow editing of large files.
o Modify CP so that it is responsible for remembering which applications need the 31-bit restored, and remove this logic from Hercules/380.
o Memory protection for different address spaces accessing ATL memory (multiple solutions to this, some relatively easy, some requiring extensive OS modifications, some already in existence).
o Add memmgr to the GETMAIN intercepts, allowing an application to do multiple GETMAINs for ATL memory (now being done a different way).
o CICS programming interface provided via KICKS.
o Port SQLite to GCCMVS now that Jason Winter has ported it to JCC.
o Get GCCMVS/GCCMU running as a native application under MUSIC/SP, completing Dave Edwards's (RIP) project.
o Add TCP/IP to MVS and VM (Jason Winter has a solution of sorts to this, but it hasn't yet been integrated).
o A prelinker for GCC-generated code to allow long names and reentrancy.
o Enhancements to the CMS editor.
o Better cleanup on MVS of ATL memory for when a program terminates without doing a FREEMAIN.
o DFDSS-compatible backup (restore is already available), to provide a superior method of software distribution. This is currently in beta testing at the time of writing.
o Other unusual/niche EBCDIC programming environments exist, although not necessarily commercially sold/used. MUSIC/SP is one of those environments and has had GCC ported to it already. Others include ORVYL, ACP/TPF, TSS, OS/360, z/390, MTS, BS2000/OSD.
o Getting Linux Bigfoot (i370) to be able to boot could be useful.
o Getting z/Linux or OpenSolaris running under VM/380 and using the HLASM from GCCMVS may be useful.
o The following MVS usermods:
  - JES2 spool percentage display or spool full
  - EOV merge extents
  - SORT using 3390s properly
  - Linkedit to put zeroes in DS and ORG *+nnn
  - IEBCOPY overlays not allowed (sysgen MACRO?)
  - IEHMOVE repeated module load (minor logic change)
  - JCL accept apostrophes in SUBSYS field (CCSS?)
  - MPF support (big automation improvement)

GCC PORT HISTORY
----------------

The first i370 code generator for GCC was written in 1988-1989 by Jan Stein, targeting Amdahl's UTS. It was distributed to others to use as a base. In 1992 Dave Pitts picked that up, made modifications to it, and arranged with Richard Stallman to get it into the official GCC base, which happened in 1993 in GCC 2.5. Unfortunately, GCC itself was far from being C90-compliant, which would have made it easy to port to the mainframe (or any other) environment. Considering the fact that objectively all it did was read in a bunch of text files (C code) and produce another text file (the assembler code) - at least with the "-S" option specified - it should have been possible to have written it C90-compliant. One of the big problems was that the GCC coders had made assumptions that they were running on an ASCII host. To solve this problem meant going into the internals of the compiler to find out where that had been done and make the code generic. This work was largely done by Dave Pitts, and by 1998, GCC 2.8.1 had an OS/390 EBCDIC port.
Also in 1998, Linas Vepstas (with assistance from Dan Lepore and a machine courtesy of Melinda Varian) started making large-scale changes to the i370 target in support of an effort to port Linux to S/370. Independent of GCC, in mid-1994, Paul Edwards had set about trying to create a C runtime library (PDPCLIB) for the PC, and especially for PDOS (a replacement for MSDOS). In 1997, when access to a mainframe was temporarily available, he used that opportunity to port PDPCLIB to MVS. Although Paul was originally from an MVS background, he had been on Unix during PDPCLIB's history. In April 1998, Paul Edwards, shortly before he started working on a real MVS (called OS/390 at that time) system, had dusted off his 1997 MVS port of PDPCLIB, then contacted the GCC maintainers to ask about making modifications for MVS. He was unaware of the other two activities (Linas and Dave were in communication with each other though), and the GCC maintainer apparently didn't know either, so work was done on 2.7.2 and later 2.8.1 to try to make it C90-compliant, with a simple compilation procedure and a single executable, using Borland C++ and Watcom C++ on OS/2 (a deliberately alien platform), ready to be ported to MVS. The maintainers weren't too thrilled about changes being made to make gcc a single executable, but some of the other changes were accepted. Replacement Unix functions were written and the gcc executable was able to be compiled and linked (using AD/Cycle C/370) and display its usage. However, when doing a real compile, it went into a loop that required in-depth knowledge of the GCC application to resolve, so the effort was aborted at that point. In March 1999 the laptop with this work on it was stolen, so any GCC changes that the maintainers hadn't accepted were lost. However, the Unix I/O replacement functions had been backed up. In addition, the concept of converting GCC into a single, C90-compliant executable had come close to being proven.
Apparently encountering difficulty getting i370 mods into mainstream GCC, Dave had been adding his i370 mods to different versions of GCC since 1998 and maintaining them separately. Linas managed to get some, but not all, of his work into the GCC baseline (these additional changes made in 1999 would end up being lost from the active development stream until 2009). At around this time (1999) another development had been taking place - the introduction of Hercules, which allowed the S/370, 390 etc hardware to be emulated, thus allowing hobbyists to run old versions of MVS (which were public domain). So access to a mainframe ceased to be problematic, especially with the introduction of packaged systems like Tur(n)key from Volker Bandke. By late 2002, Dave was up to version 3.2 of GCC, working under z/OS with USS (Posix support). Paul made initial contact with Dave in November 2002 to inquire about the technical plausibility of a port to non-USS MVS. One year later, in November 2003, Paul Edwards, working with Phil Roberts, picked up this version with a view to getting it working natively on MVS 3.8j. The problems that Dave identified in any attempt to port to MVS 3.8j were that the size of the main part of the compiler (cc1) was 16 MB on OS/390, that the way the gcc driver loaded cpp, cc1 etc would need to be emulated somehow, and that a scheme would be needed to map Unix-style file names into MVS datasets. Not mentioned were the facts that the compiler had never been used to attempt to compile itself (which would have revealed that it was riddled with bugs), that it was riddled with non-C90 functions, and that it had other non-C90 dependencies such as relying on long function names being unique. However, as you can probably guess, there were solutions to all these problems. First, the 16 MB executable. PDPCLIB is quite small, possibly because it doesn't support VSAM files, Posix and many other nice-to-have features.
It did however have the ability to process text files, which is all that was required for the GCC application. While optimization wasn't switched on until years later, the entire optimized executable was eventually found to be just 3 MB (it was 4 MB unoptimized). MVS 3.8j gave about 9 MB of address space, and if abnormally stripped down, could provide upwards of 10 MB. This proved to be sufficient for most normal use. Abnormal use - such as recompiling GCC itself at full optimization - was not possible though. GCC is split up into multiple components, with a small "gcc" executable invoking the other executables in turn. However, this is fairly strange in that most of the code is in cc1 anyway, so there's not a lot to be gained. And the price is everything channelled via a "system" call, or fork/exec calls - which are all inherently non-portable. The solution here was to mask out all that channelling code and instead get gcc to call cc1 etc as normal function calls, to provide a single large gcc executable. This in turn meant that the function names needed to be unique across all the executables, so duplicate functions needed to be found and then renamed with a #define. The mapping of the include filenames was initially done by renaming them to 8-character unique names and changing the corresponding source code. The path searching for include files was nullified and replaced with DD lookups for INCLUDE and SYSINCL (the latter for system headers). Later on the "remap" facility was unearthed and all the renames in the source code were able to be reversed out. The includes for non-standard headers (fcntl.h, sys/stat.h etc) were #ifdef'ed out. These header files generally pointed to function calls which also didn't exist in C90. The simplest solution to this problem was to create a mini-Posix library where open() is defined in terms of fopen(). Some functions were made to return a reasonable value for the most common use.
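The mini-Posix idea can be illustrated with a sketch like the one below. This is not the actual PDPCLIB/GCCMVS shim, just the general technique: define the non-C90 calls in terms of their C90 stdio equivalents, with file descriptors as indexes into a table of FILE pointers, and the flag names defined locally.

```c
/* Sketch of a mini-Posix layer on top of C90 stdio.  The names and
   values below are local illustrations, not a real Posix header. */
#include <stdio.h>
#include <stddef.h>

#define O_RDONLY 0
#define O_WRONLY 1
#define MAXFILES 20

static FILE *handles[MAXFILES];

int open(const char *path, int flags)
{
    int fd;

    for (fd = 0; fd < MAXFILES; fd++)
    {
        if (handles[fd] == NULL)
        {
            handles[fd] = fopen(path, (flags == O_WRONLY) ? "wb" : "rb");
            if (handles[fd] == NULL) return -1;
            return fd;
        }
    }
    return -1;   /* table full */
}

long read(int fd, void *buf, long count)
{
    return (long)fread(buf, 1, (size_t)count, handles[fd]);
}

long write(int fd, const void *buf, long count)
{
    return (long)fwrite(buf, 1, (size_t)count, handles[fd]);
}

int close(int fd)
{
    int rc = fclose(handles[fd]);

    handles[fd] = NULL;
    return rc;
}
```

With a layer like this in place, code that insists on calling open() compiles and runs unmodified, while everything underneath remains pure C90.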
Anything abnormal needed a code change to get rid of the call, which wasn't needed in the first place in a text-processing program. One of the bugs hit early on was the fact that the compiler was converting static functions into 8-character assembler identifiers according to their actual name, which meant that they needed to be unique within the source file. When the dust settled, there were about 3000 functions that had to be #defined to new names, about half of them static (C90 requires statics to be unique beyond 8 characters, so it was the MVS port of GCC that was at fault). To make matters worse, the code was initially generating CSECTs for each function name. The IBM linker is designed to just take the largest CSECT of the same name and silently use that, instead of reporting a name clash. The code generator was changed to use unnamed CSECTs and use ENTRY statements instead, to ensure external identifiers were unique and clashes detected. Years later, the static bug was fixed and a tool developed to search out duplicates in the generated assembler so as to only keep those names that needed to be kept (ie external name clashes, which ended up being about 1300). Although the GNU config.h is annoying in that they don't provide a C90 version by default, and instead one needs to be constructed manually, it does have the advantage that all the remaps were able to be done in there and get picked up across the entire source base. While those other problems were time-consuming to resolve, they were nonetheless straightforward. It was the bugs that were the biggest obstacle. Without someone familiar with the compiler internals, it was sometimes necessary to hack the generated assembler. By 2004-01-11 (after having started in November 2003), the compiler was running natively on MVS. By March 2004, GCC 3.2 was able to recompile itself (at least on a 31-bit machine) and version 1.0 was released.
The most serious problem was with floating point - the native compiler was generating bad floating point values, and the workaround was to generate the value "1.0" all the time instead. This didn't cause a problem for the recompilation of GCC itself, because it apparently didn't use floating point for anything important. However, it meant that PDPCLIB had some defects due to this kludge that it wouldn't normally have had. Meanwhile, mainstream GCC was about to release the 3.4.x series, the last that would include the i370 target, as for the 4.x series they had decided to unceremoniously dump it! SVN revision 77216 on 2004-02-04 did the dumping - ie just as 15 years of effort was about to bear fruit. The GCC maintainers aren't MVS/EBCDIC users themselves (the S/390 port is Linux/ASCII), so it is a struggle to refit the EBCDIC support for each release, as it is either screwed with or dumped or the changes aren't accepted in the first place. So it always took a long time for the MVS version to come out, waiting on Dave Pitts to get the USS version working on the next release first. At this point (April 2004), Dave Wade picked up the MVS port to try to get it working on VM/CMS with a view to enabling BREXX to be ported. He succeeded in doing this, plus fixing the floating point bug, plus other bugs and unimplemented functionality in PDPCLIB, and in January 2006, version 2.0 was officially released. At around this time, Dave Pitts had independently moved his changes up to version 3.2.3, so the GCCMVS changes were reapplied on top of that. So version 1.0 of 3.2.3 was also released in January 2006. Version 2.0 followed a short while later (March 2006), mainly to enable building on Unix with later versions of GCC. Version 3.0 was released in August 2007 and significantly progressed the mainframe-ness of the compiler. The prologue/epilogue assembler macros were created rather than being done with a separate program. Include files could be concatenated.
It was fully 31-bit on z/OS (instead of being restricted to RMODE 24 due to the way the RDJFCB macro had been coded). Remap was made to work. The generated files (mainly generated from the machine description) were able to be generated on the MVS system. Optimization (-O2) was switched on, taking the executable size from 4 MB to 3 MB, although some code workarounds were needed to bypass optimizer bugs. Aliases for PDPCLIB modules were provided to enable automatic linking. Another code generator bug fix was applied. Also, on VM, it was now possible to get GCC to recompile itself unoptimized - or to create a hybrid where most of it was optimized, but a few modules were still unoptimized. This state of affairs was probably made possible earlier, when GCC had been modified to stop invoking setjmp() all the time - an overzealous implementation (later changed) which consumed a lot of memory saving the stack. Regardless, this was the point at which it was now possible to have a purely mainframe compiler able to recompile itself on a freely available mainframe operating system. Even the MVS version could theoretically be generated from VM/370. This was never tried, as it was academic and was soon replaced by an alternative and superior advance. Up until this point, Paul Edwards, due to his very old and very flakey PC, had never dared attempt to install Hercules to see GCC running for himself. Instead, all work had been done via email as he sent code to Phil Roberts and Phil sent back dumps, traces and, on the odd occasion, the result of a successful run. If Phil hadn't done this, everything would probably have been delayed by 4 years. By November 2007 Paul had purchased a new laptop and had Hercules running TK3 (rather than TK3SU1, as that required Hercules 3.05, which wasn't working due to another problem).
It was then discovered that there wasn't enough region in TK3 out of the box to compile many of the source files that had previously been set as compilable in 24-bit. Previously an elaborate scheme had been set up such that the JCL had "dummy" compiles where instead of doing a real compile, the old assembler (from the PC) was simply copied. On a 31-bit system, those dummy compiles were then globally changed to real compiles. The problem was that there was no good figure to use for available memory. Multiple attempts were made to find a "lowest common denominator", but even the same machine produced different results on multiple runs. By the 17th November 2007 the region had been lowered yet again, this time from 8500k to 8400k, but there was no end in sight to this problem. We were trying to get too much out of the 24-bit system and it was simply the wrong solution. This is why 31-bit systems exist and it was time to upgrade. On 14th November 2007 Paul had initiated a general query to find out the best way to force through essentially once-off 31-bit compiles on the 24-bit MVS, with a bit of "trickery" (the phrase actually used upon success was "Paul managed to ram a square peg in a round hole with a sledge hammer"). There was a wide variety of opinions and suggestions, and on the 20th November 2007 an S/380 [in practice, but still displaying S/370] machine was able to enter 31-bit mode and stay there with no complaint from MVS. On the 21st November the S/380 test program was able to write to ATL memory, although it wasn't until the 22nd November that this was realised due to confusion over the so-called "crap byte" that BALR inserts (and BASR doesn't). By 7th December, 2007 GCC had been compiled end-to-end (ie reproducing itself) on the S/380 eliminating any remaining doubt about whether it was technically possible or not. Version 4.0 was released in December 2007 and a heap manager (memmgr) was added which provided support for the newly created MVS/380. 
In addition, the PC and mainframe were producing identical assembler thanks to the -Os option being used, plus some other minor code changes, plus another code generation problem being fixed. This showed that there were no code generator bugs that had introduced a bug into the GCC executable itself. Later it was discovered (by Dave Edwards when he was doing work on MUSIC/SP) that -O2 causes different code (both forms valid) to be generated depending on the exact representation of floating point values. -Os does not appear to be sensitive to that. Ideally code shouldn't be written that is sensitive to that, but no-one knew where that was happening. Prior to Dave's discovery, it was assumed that one of the code generation bug fixes, or the generally random nature of those code changes, was responsible for the identical code. -Os had been switched on for an entirely different reason (ie an apparently incorrect claim that it produced significantly faster code than -O2 on MVS). Version 5.0 was released in March 2008 and the last major standards violation - requiring statics to be unique in the first 8 characters - was lifted, once it was discovered what needed to be changed so that static functions could be renamed to unique, generated names. Version 6.0 was released in January 2009 along with version 1.00 of PDPCLIB as (after 15 years) it became C90-compliant, at least on the mainframe (as far as it was known) - with the exception that there were still a lot of known compiler bugs which no-one involved knew how to fix. So finally there was a free (even for commercial use) C90-compliant (although given the known bugs, it would be best to give this "beta" status) environment on the mainframe. The VM build procedure was totally revamped, and techniques developed to allow traditional automatic linking. Plus it became a totally mainframe product as BISON and SED were provided on the mainframe so that nothing at all came from the PC. 
The 31-bit GCC executables produced for both MVS/380 and VM/380 were made available, unchanged, for z/OS and z/VM users. The z/OS deliverable was made available as an XMIT, while the z/VM deliverable was provided as an AWS tape. Also, output was switched to move mode rather than locate mode by default, which made debugging-via-printf much easier after an abend. The availability of a C compiler allowed a variety of other C products to be ported, and these were all bundled with MVS/380 0.9 and VM/380 0.9. They were bison, brexx, bwbasic, diffutils, flex, m4, patch, sed, minizip (zlib). The changes to brexx and bwbasic to support MVS and CMS were incorporated into the base product. In February 2009, Linas was contacted to see if he was interested in fixing some of the remaining compiler bugs (about 7 serious ones preventing code from being compiled), and it was discovered that some of his code changes were not even in the current version of the compiler. Paul merged in the remaining code changes, except for one, which ironically was the only change that fixed one of the 7 bugs, but as a side-effect created an even more serious bug in other code! However, this change allowed experimentation to find out what change was required to circumvent the problem in question. In April 2009, Paul had reached the point with Dave Pitts's unreleased 3.4.6 mods of being able to produce a single executable under Windows using gcc from Cygwin. This was a precursor for getting it to work on MVS with PDPCLIB. May 2009 saw two more advances. Robert O'Hara had produced GCCLIB, a native CMS runtime, which could be made resident, enabling small applications to be developed. Meanwhile, Linas passed on sufficient knowledge to Paul Edwards to enable him to fix GCC compiler bugs. This allowed the 15 or thereabouts known compiler bugs to be fixed at long last, producing a far more robust compiler. June 2009 saw the release of GCC 3.2.3 MVS 7.0, the first in-practice C90-compliant release. 
This was the point where GCC on MVS came of age, and the point at which C became a lingua franca for computers, with every major (or at least, more widely used than DOS/VSE) platform now speaking the language for no additional monetary cost. August 2009 saw the release of GCC 3.2.3 MVS 7.5, which saw extensive changes to the machine definition in order to eliminate the warnings it generated. Plus some other cleanups, such as protecting against unsupported things like compiling two programs or not producing assembler output. A new PDPCLIB was released which had Linux support too. September 2009 saw the port of 3.4.6, which had had multiple attempts made on it previously (stretching back several years), finally produce a self-compiling native compiler, at least with optimization switched off, and not including the generated files. October 2009 saw a revamp of PDPCLIB for MVS by Gerhard, which made things much more user-friendly by introducing default DCB information. November 2009 saw 3.4.6 having the 370 assembler files being completely generated via the normal (configure/make) build process. January 2010 saw 3.2.3 ported to MUSIC/SP. It had been held up due to an attempt to track down a paging error (MUSIC/SP was not configured with sufficient memory to prevent paging). March 2010 saw the release of the DOS/VS "5 pack", with S/380 support then added to Hercules/380, effectively creating a VSE/380, which then saw PDPCLIB ported to some extent. April 2010 saw GCC 3.2.3 running on VSE/380 and z/VSE, after bypassing various system restrictions. However, due to the incomplete PDPCLIB, GCC couldn't compile anything with #include in it. May 2010 saw the release of GCC 3.2.3 MVS 8.0, which shipped with PDPCLIB 3.00, which had a major revamp of the MVS code to support default DCB information and was able to run natively under TSO with PUTLINE. January 2011 saw the release of GCC 3.2.3 MVS 8.5, which had a properly working DOS/VSE port. Also VSE/380 1.0 was released. 
Also, PDOS had been developed and was a targeted system requiring some MVS changes. June 2011 saw the release of GCC 3.4.6 MVS 1.0, which had a completely revamped method of building that fit in with the configure/make paradigm. However, the generated files were no longer able to be generated natively on MVS. December 2014 Paul noticed that PDOS/390 was executing AMODE 24/RMODE 24 programs like "DIFF" in 31-bit mode, without complaint, meaning the modules were technically AMODE ANY, not AMODE 24. This opened up the potential to have a single RMODE 24 executable that GETMAINed storage with LOC=ANY, which would only get ATL memory on z/OS, and still work using BTL memory on MVS 3.8j. In January 2015 it was confirmed that a module could be built on MVS/380, could be marked as AMODE 31 and run on z/OS, and from printing addresses we could see that it was indeed using ATL memory. The same module also worked fine under MVS 3.8j, where it was found that the LOC=ANY bit was ignored. This was a fantastic technical advance, but there was yet another potential advance available. Could a module built on MVS/380 be marked ANY/ANY and actually run BTL on MVS 3.8j and ATL on z/OS? This was tested, and unbelievably, it actually worked. That meant that we had the ability to build perfect ANY/ANY modules on MVS/380 and the single executable was all that was required to be distributed. The GETMAIN could use the default LOC=RES also, and we were able to make the "370" and "390" flavours of PDPCLIB the same. Basically we could just build "390" modules and run them perfectly fine on MVS 3.8j (meaning "370" is now obsolete). Note that it was found that IBM unfortunately disallows a module to be marked ANY/ANY, so it needed to be marked 31/ANY instead, but it's still fine running these 31/ANY modules on MVS 3.8j, where the bits are ignored. January 2015 Johann Geyer released a modification to the IEWL linker so that AMODE 31 or ANY modules can be produced. 
Unfortunately he didn't get RMODE ANY working, so if building a 31/ANY module on MVS/380 it is still necessary to use the "PDS" command (ie PDS load-library, attrib module rmodeany, end). June 2015 Paul managed to get IEWL working correctly so we could have clean JCL to produce clean modules for z/OS. September 2016 Paul proposed a convoluted method to allow more than 2 GiB of host memory to be made available to MVS/380. November 2016 after a discussion, somitcw proposed a very clean "separate memory" implementation in Hercules which would allow each ATL-using program to have its own Hercules-managed memory. Paul made modifications to Hercules/380 to enable z/Arch instructions and 64-bit addressing in S/380 mode, while still using the S/370 PSW. Paul also produced a 64-bit Windows executable that could make more than 4 GiB of memory available to 64-bit MVS programs. October 2017 Greg Price got a 31-bit version of REVIEW to work under MVS/380 2.0 beta sufficiently, enabling datasets larger than 16 MiB to be edited. December 2017 Gerhard finalized changes to the SVC 120 intercept (which now included multiple ATL memory requests) that allowed core MVS/380 to be distributed as two 8k modules - an MVS380MN module, and a modified IEWFETCH called MVS380FT. March 2018 Paul added some level of TCP/IP support via an SVC 120 API call. April 2018 Peppe added Jason Winter's TCP/IP instruction to Hercules/380. May 2018 Paul added a "step-down" concept to mvssupa, allowing a single load module to be distributed that worked optimally in all of MVS 3.8j, MVS/XA, z/OS and the new beta MVS/380 2.0. Also Paul added AM32 to Hercules/380 and the ability for a BSM to AM31 to actually activate AM32 or AM64 or AM24 (plus other BSMs are configurable). GCC was able to rebuild itself running in AM32, but not AM64 due to the use of negative indexes. Gerhard fixed mvssupa so that VS worked properly. 
Paul made a modification to Hercules/380 for the Windows environment to support graphics on the Hercules console and joystick input, plus the ability to intercept TPUT and display the 3270 data stream on the Hercules console. October 2018 Paul got PDPCLIB working with modern IBM C again. October 2019 saw the release of GCC 3.2.3 MVS 9.0 and MVS/380 2.0. September 2020 Paul added the ability for a reference in TSO to a dataset name of "test.c" to be automatically translated into "c(test)", allowing the PC-style naming convention to work. That means "gcc -S test.c" works, with output going to "s(test)".