17 April 2014

I am HOBSET because the binder put me in APCL in CICS


So, after a break of a few years, I am plunging back into the world of CICS.

I needed to do some testing with writing out messages via CEEMOUT, a Language Environment callable service that issues messages and is used by languages (er, "members") for diagnostics and basic output. For COBOL, DISPLAY goes here; for PL/I, PUT without FILE; for C/C++, stdout/stderr. In batch, this is controlled by MSGFILE. Under CICS, CEEMOUT output goes to a transient data queue, by default CESE; this by default is defined as an extrapartition queue, and it is defined to DDNAME CEEOUT.

I started writing a driver program to put a new subroutine through its paces. However, when I tried to run it, I got an APCL abend. This is documented, in typical IBM-speak, as something like "LE recognized that this was an LE program, but it couldn't initialize it." Googling didn't help; the only references are to the messages and codes manuals, and one very old reference to a PL/I for MVS and VM manual.

Since I was using some old JCL of mine from prior places, I switched to JCL that is used in another product build. That worked, so it was time to find the differences. I started with the binding process, and it did not take long. The culprit was HOBSET=YES. This parm sets the AMODE bits on in V-cons, so they can be used in BSM and BASSM instructions if you are flipping AMODEs. I removed it, and voilà, the problem went away.
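
For anyone who hasn't flipped AMODEs lately, here is a rough picture of what "AMODE bit in a V-con" means, written as a C sketch rather than assembler. It assumes a 4-byte address constant whose high-order bit flags 31-bit addressing mode when the word is used as a BSM/BASSM target. The function names are made up for illustration, and this is not the binder's actual logic.

```c
#include <stdint.h>

/* High-order bit of a 4-byte address word (bit 0 in IBM bit numbering). */
#define AMODE31_BIT 0x80000000u

/* Hypothetical helper: mark an address word as "branch here in AMODE 31". */
uint32_t with_amode31(uint32_t addr)
{
    return addr | AMODE31_BIT;
}

/* Hypothetical helper: does this word carry the AMODE 31 flag? */
int is_amode31(uint32_t word)
{
    return (word & AMODE31_BIT) != 0;
}

/* Hypothetical helper: strip the flag to leave the 31-bit address. */
uint32_t address_part(uint32_t word)
{
    return word & 0x7FFFFFFFu;
}
```

The point of the sketch: with the bit set, the word is no longer a plain address, so any code that treats it as one sees a different value from what it would see without the option.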

I am debating whether or not to open a PMR. The restriction is not documented in either CICS or LE documentation. I'm OK with it being a restriction, but the lack of documentation bothers me.

03 March 2014

Why choosing the right tools for software development is très important, and why you should turn on compiler informational messages.

goto fail;

So, by now about half the country is familiar with what is being called the "goto fail" bug in Apple's Secure Sockets Layer implementation in both OS X and iOS. And, frankly, I'm surprised that this took forever to be noticed.

Why? Because any good C compiler would have caught it!!

The code in question:

Here's the code. It's pretty simple to understand; no knowledge of C is required. (Formatting and comment taken from Naked Security's article on the subject.)

hashOut.data = hashes + SSL_MD5_DIGEST_LEN;
hashOut.length = SSL_SHA1_DIGEST_LEN;
if ((err = SSLFreeBuffer(&hashCtx)) != 0)
    goto fail;
if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0)
    goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0)
    goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
    goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
    goto fail;
    goto fail;  /* MISTAKE! THIS LINE SHOULD NOT BE HERE */
if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
    goto fail;

err = sslRawVerify(...);

The code calls several functions, and if the return code from any of them is not 0 (a recognized programming convention: non-zero means something bad happened), the code goes to label fail (not pictured here). Note the commented goto fail line, however. There's no condition on it, so the computer always branches to fail, skipping the final sslRawVerify call, and the err variable will be 0 because of the successful return of the prior function. Big oops. Major oops. Catastrophic oops.
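
To see the effect in isolation, here is a minimal, self-contained sketch of the same control-flow shape. It is not Apple's code: step_ok, verify_signature, check_buggy and check_fixed are hypothetical stand-ins invented for this illustration.

```c
/* Hypothetical stand-in for the hash steps; returns 0 on success. */
static int step_ok(void)
{
    return 0;
}

/* Hypothetical stand-in for the final signature check:
   0 means the signature is good, non-zero means it is bad. */
static int verify_signature(int signature_is_valid)
{
    return signature_is_valid ? 0 : -1;
}

/* Buggy shape: the second goto is unconditional, so verify_signature()
   is never reached and err still holds 0 from the last successful step. */
int check_buggy(int signature_is_valid)
{
    int err;

    if ((err = step_ok()) != 0)
        goto fail;
    if ((err = step_ok()) != 0)
        goto fail;
        goto fail;  /* duplicated line: always taken */
    if ((err = step_ok()) != 0)
        goto fail;

    err = verify_signature(signature_is_valid);

fail:
    return err;
}

/* Same logic without the duplicated line: the check is reached. */
int check_fixed(int signature_is_valid)
{
    int err;

    if ((err = step_ok()) != 0)
        goto fail;
    if ((err = step_ok()) != 0)
        goto fail;
    if ((err = step_ok()) != 0)
        goto fail;

    err = verify_signature(signature_is_valid);

fail:
    return err;
}
```

Calling check_buggy(0) returns 0, so a bad signature is reported as good, while check_fixed(0) returns non-zero. For what it's worth, Clang has a -Wunreachable-code option that is not part of -Wall, and GCC from version 6 on has -Wmisleading-indentation; I would expect either to flag the buggy function, but try it on your own compiler level before relying on it.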

My take

Billions of bits and CPU cycles have already been burned on this. Heck, even some code analysis ISVs are screaming from the rooftops that their products would have caught it. And that's a good thing, because software applications are so complex these days that issues exist across multiple source code units, and a compiler + linker/binder isn't sophisticated enough to see those. But this bug is so simple that a good compiler should have caught it and issued a warning message, or even an informational message. But no one seems to have either noticed this, or run a compile with informational messages turned on. Informational messages can be a nuisance during the majority of the development cycle, but as you work out those final little nits, you should turn them on, because they can point out scenarios that can lead to critical issues down the road.

I have a suspicion that with today's DevOps, Agile, Scrum, and other "modern" development methodologies, the time for going through the code with a fine-toothed comb is thought of as unnecessary, because we have that two-week cycle deadline coming up, and frankly we need to get the code out, bugs and all.

But then of course you have to have a compiler that checks for this kind of stuff. And in my experience I have not seen this come out of any other C compiler except the IBM System z C compilers. This is probably because the prior IBM mainframe compilers, such as COBOL, have checked for this sort of thing for decades. For example, from Enterprise COBOL 4.2:

IGYOP3091-W     Code from "?" to "?" can never be executed and was therefore discarded.

And I remember this exact message (although its prefix was IGC) from the ANS Full COBOL V4 compiler when I was in college 35 years ago.

If other compilers have this checking available, this helps cement my feeling that code quality is being ignored in the interest of expediency, because developers aren't using it, obviously!

(If you do know how to generate these messages out of a particular compiler, please let us know in the comments. Some developers may not know how to get them.)

Sadly, as long as this misguided time-versus-money paradigm continues in the software development world, you're going to see stupid mistakes like this, or hard-coded admin IDs and passwords inside software (the persistent rumors about 2013's data breach at Target Stores keep saying this), or encryption set aside because it's too darned expensive (which is why the USA once again is playing catch-up to Europe in the credit card business).

Informational messages: they're put there for a reason. Use them to create better software.

25 June 2013

Pitfalls of reusing storage with ATTACH(X)

So here's a little something that bit me. It's something that I know about, and it has bitten me before. Be careful if you reuse storage.

Some development shops have macros that set up CALL parameter lists in a particular area. Or maybe you've done that yourself. It's tempting to use that area for parameters passed to an attached program (PARAM, MF=(E,area)). However, if the attaching program uses this area again before your attached program has a chance to get it, the attached program will pick up whatever is there. So the pointer you expect, maybe to a program communications vector table, may wind up being something completely different, say a log process block. The famous IBM line "unpredictable results will occur" becomes prophetic. In my case, it was an S052-0101 error, which is a problem with LXRES. Now my PC was not executing LXRES, but because the address it loaded did not come from the expected area, whatever garbage it picked up looked just like an LXRES request.
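
The same trap can be sketched in C as an analogy. This is not z/OS ATTACH and there is no real multitasking here: the "attached" work is just a structure that remembers the address of its parameter list and looks at it later. Every name (plist, task, scratch, attach_shared, attach_private, call_something_else, task_run) is hypothetical and exists only for this illustration.

```c
/* Hypothetical parameter list: one pointer, as the attached program sees it. */
struct plist {
    const char *target;
};

/* Hypothetical "attached" unit of work that has been scheduled but has not
   run yet. It holds only the ADDRESS of its parameter list. */
struct task {
    const struct plist *params;
};

/* Shared work area that the caller reuses for every parameter list. */
static struct plist scratch;

/* Risky: build the parameter list in the shared area. */
struct task attach_shared(const char *target)
{
    struct task t;

    scratch.target = target;
    t.params = &scratch;
    return t;
}

/* Any later call that builds its own list in the same area overwrites it. */
void call_something_else(const char *other)
{
    scratch.target = other;
}

/* Safer: build the list in storage owned by this one request. The caller
   must keep own_area intact until the task has picked up its parameters. */
struct task attach_private(struct plist *own_area, const char *target)
{
    struct task t;

    own_area->target = target;
    t.params = own_area;
    return t;
}

/* What the attached work sees when it finally gets control. */
const char *task_run(const struct task *t)
{
    return t->params->target;
}
```

If attach_shared("vector table") is followed by call_something_else("log block") before task_run gets control, the task sees "log block". With attach_private and its own area, it still sees "vector table".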