Engineering problems from a practical perspective.

6. - Closed Loops

How should a CPU handle sub-routine calls internally? One approach is to store the program counter address of the sub-routine call and, on reaching the end of the sub-routine, return control to that address 'plus one'. Alternatively, the CPU could store the program counter address 'plus one' at the time of the call and use that value to reload the program counter at the end of the routine. But what happens if neither approach is used, and how can it be fixed?
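The two conventions can be sketched as a toy model. The function names, the use of a Python list as the return stack, and the assumption that every instruction occupies exactly one address are all illustrative and are not taken from any real board.

```python
# Toy model: each instruction occupies one address, so "plus one" means
# "the instruction after the call". All names here are hypothetical.

def call_store_call_address(pc, stack, target):
    """Approach 1: save the address of the call instruction itself."""
    stack.append(pc)
    return target  # new program counter value


def return_add_one(stack):
    """Approach 1: add one to the saved address when returning."""
    return stack.pop() + 1


def call_store_next_address(pc, stack, target):
    """Approach 2: save the address of the instruction after the call."""
    stack.append(pc + 1)
    return target


def return_as_stored(stack):
    """Approach 2: reload the saved value unchanged."""
    return stack.pop()


stack = []
pc = call_store_call_address(0x10, stack, 0x80)
resume_1 = return_add_one(stack)

pc = call_store_next_address(0x10, stack, 0x80)
resume_2 = return_as_stored(stack)
```

Either way the program resumes at the instruction following the call; the only difference is whether the 'plus one' happens at call time or at return time.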

Marconi Instruments, circa 1983

The Advanced Systems Group had produced a specialist CPU board for controlling a high-speed digital test set. The board was of multi-layer construction and the designer had spent a lot of time checking his work before the board went to fabrication. Given the nature of the business there was little margin for error, as the first production batch of boards was earmarked for development, demonstrator and first customer applications. Failure was not an option!

To assist the Advanced Systems Group a crack commissioning team had been assembled. In part that was to relieve ASG from doing its own development but also to tap into the considerable experience of the commissioning engineers. (As an aside, a Service Department technician once complained about the status accorded to the Commissioning Engineers. It was pointed out to him that at least he had the assurance that the equipment he was called out to fix had worked at least once. For the commissioning team there was no such assurance: Had the equipment been put together correctly? Had it been built correctly? Had it been designed correctly? Far too often the latter was not the case.)

Step by step the board was put through its paces. All went fine until the sub-routine test, at which point the board got stuck in an infinite loop. It turned out that the board had been designed to store the calling program counter address and then to use that stored value, unchanged, as the return address, i.e. it went back to where it came from, which promptly made another call to the sub-routine that had just been completed. When the designer was told of this fault he was in shock. The hope had always been that the commissioning team would just pick up and clear manufacturing errors, solder bridges and the like.
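The fault is easy to reproduce in a toy simulator. The sketch below is only a model of the behaviour described: the instruction names, the single saved-address register, and the step budget used to detect the loop are assumptions made for illustration.

```python
def run_faulty(program, max_steps=20):
    """Run a tiny program on a model of the faulty board.

    Returns (number of calls made, whether HALT was reached).
    """
    pc, saved, calls = 0, None, 0
    for _ in range(max_steps):
        op, arg = program[pc]
        if op == "CALL":
            saved = pc       # fault: the call's own address is stored...
            pc = arg
            calls += 1
        elif op == "RET":
            pc = saved       # ...and reloaded unchanged, landing on the call again
        elif op == "HALT":
            return calls, True
        else:                # NOP: just move on to the next address
            pc += 1
    return calls, False      # step budget exhausted: stuck in the loop


# Hypothetical program: call a sub-routine at address 4, then halt.
program = {
    0: ("CALL", 4),
    1: ("HALT", None),
    4: ("NOP", None),
    5: ("RET", None),
}
calls, halted = run_faulty(program)
```

In this model the HALT at address 1 is never reached; the same call is simply repeated until the step budget runs out.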

How to fix it? Once the designer had stopped screaming it was suggested to him that the track leading to the lowest address bit line be cut and that line be 'tied up' with a resistor to 'force' a logic high. All that was then needed was to ensure that all sub-routine calls were made from 'even' addresses, the new return address now always being forced to be the next higher 'odd' address. Of course that meant that the program compilation stage had to slip in the odd NOP (No Operation) code to make things right. NOPs in old programs were invariably there for a purpose!
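The arithmetic behind the fix, and the padding it demands of the compilation stage, can be sketched as follows. This is a minimal illustration under stated assumptions: one instruction per address, a program starting at address 0, and hypothetical helper names. It ignores jump targets, which a real tool would have to relocate after padding.

```python
def return_address(call_address):
    """Stored call address with its lowest bit forced high by the pull-up resistor."""
    return call_address | 1


def pad_calls(program):
    """Insert a NOP before any CALL that would otherwise sit on an odd address."""
    padded = []
    for instruction in program:
        # len(padded) is the address the next instruction will occupy.
        if instruction == "CALL" and len(padded) % 2 == 1:
            padded.append("NOP")
        padded.append(instruction)
    return padded


# Hypothetical instruction stream: the first CALL would fall on address 1.
padded = pad_calls(["LOAD", "CALL", "ADD", "CALL", "HALT"])
# padded is now ["LOAD", "NOP", "CALL", "ADD", "CALL", "HALT"]
```

A call from even address 2 now returns to address 3, the next instruction. A call from an odd address would still return to itself, since forcing the lowest bit high leaves an odd address unchanged, which is why every call has to be aligned.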