DOW versus DOU - The Final Battle
By Réal Bédard
There has been a continuing debate amongst RPG/400 programmers as to the best way to code a file-reading loop. The three most common options are illustrated below. Let's examine each one, and see if we can put the subject to rest once and for all.
Method A
              READ  RECORD                80
    *IN80     DOWEQ *OFF
    THIS      IFEQ  THAT
              EXSR  STEP1
              ELSE
              EXSR  STEP2
              ENDIF
              READ  RECORD                80
              ENDDO
Method B
    *IN80     DOUEQ *ON
              READ  RECORD                80
    *IN80     IFEQ  *OFF
    THIS      IFEQ  THAT
              EXSR  STEP1
              ELSE
              EXSR  STEP2
              ENDIF
              ENDIF
              ENDDO
Method C
              SETOF                       80
    *IN80     DOWEQ *OFF
              READ  RECORD                80
    *IN80     IFEQ  *OFF
    THIS      IFEQ  THAT
              EXSR  STEP1
              ELSE
              EXSR  STEP2
              ENDIF
              ENDIF
              ENDDO
All three methods work, and will yield the exact same result. Now, let's take a closer look.
Methods B and C have two nested structures: the DOU/DOW and the IF. This creates confusion at the end of the loop, where we have two ENDs plus the other END introduced in the processing logic for the record. The confusion increases with the inclusion of many levels of IFs and DOs in the record-processing logic.
Method A, on the other hand, has only one level of nesting. This makes the code a lot simpler to read, especially when the record processing involves many nested structures. Method A has the added advantage of using the READ/DO and READ/ENDDO as brackets. When reading through a complex record-processing section of code, the READ/ENDDO stands out and signals the end of the READ loop. When a second READ loop is embedded within the first loop, clarity becomes very important; this principle is illustrated below in Example 1.
Example 1
              READ  FILEA                 80
    *IN80     DOWEQ *OFF
    KEY       CHAIN FILEB                 80
    *IN80     DOWEQ *OFF
              EXSR  STEP1
    THIS      IFEQ  THAT
              EXSR  STEP2A
              ELSE
              EXSR  STEP2B
              ENDIF
    KEY       READE FILEB                 80
              ENDDO
              EXSR  STEP3
              READ  FILEA                 80
              ENDDO
Method A is also great for handling groups of records. In the nested loops in Example 1, we can see that a CHAIN was used to position to the first record, whereas a READE was used to read the other records. This cannot be elegantly accomplished using Methods B or C. Method A can support many different reading techniques. The top READ could be replaced with a CHAIN, similar to that in Example 1, or by a combination of SETLL/READ. Typically, the start of the READ loop requires special positioning that fits well above the DO in Method A.
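For readers who do not have an RPG compiler at hand, the control flow of Example 1 can be modelled in Python. This is only a sketch: the in-memory lists and the helper names `read`, `chain` and `reade` are hypothetical stand-ins for the RPG operations, and a return value of `None` plays the role of indicator 80 being on.

```python
# Hypothetical in-memory stand-ins for the two files (not from the article).
FILEA = ["K1", "K2", "K3"]                       # one key per FILEA record
FILEB = [("K1", "a"), ("K1", "b"), ("K3", "c")]  # (key, data), sorted by key


def read(file, pos):
    """Like READ: position of the next record, or None at end of file."""
    nxt = pos + 1
    return nxt if nxt < len(file) else None


def chain(key):
    """Like CHAIN: position of the first FILEB record with this key, or None."""
    for i, (k, _) in enumerate(FILEB):
        if k == key:
            return i
    return None


def reade(pos, key):
    """Like READE: next position if its key still matches, else None."""
    nxt = pos + 1
    if nxt < len(FILEB) and FILEB[nxt][0] == key:
        return nxt
    return None


def process_groups():
    log = []
    a = read(FILEA, -1)                    # priming READ of FILEA
    while a is not None:                   # outer DOW
        key = FILEA[a]
        b = chain(key)                     # priming CHAIN into FILEB
        while b is not None:               # inner DOW
            log.append((key, FILEB[b][1])) # record processing (STEP1, STEP2x)
            b = reade(b, key)              # READE at the bottom of the loop
        log.append((key, "STEP3"))         # runs once per FILEA record
        a = read(FILEA, a)                 # READ at the bottom of the loop
    return log
```

Each loop has the same shape: a positioning read above the `while`, and a matching read as the last statement of the body. A key with no FILEB records (`K2` in this sample data) skips the inner loop entirely.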
Both Methods B and C check for the end of file twice in each iteration: once on the DO and once on the IF. Method A checks only once. Method A has two READs, whereas Methods B and C have only one. This causes an overhead of a few bytes, but so does the extra IF in Methods B and C. The difference is not significant and is far outweighed by the ease of maintenance of Method A.
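The number of end-of-file tests can be counted with a small Python model of the three loops. This is a sketch under stated assumptions, not RPG: the function `count_eof_tests` and its helpers are hypothetical, `None` stands in for indicator 80 being on, and the records themselves must not be `None`.

```python
def count_eof_tests(records, method):
    """Count how often each loop shape tests for end of file."""
    tests = 0
    it = iter(records)

    def read():
        return next(it, None)        # None means end of file

    def at_eof(rec):
        nonlocal tests
        tests += 1                   # every test of the indicator is counted
        return rec is None

    if method == "A":
        rec = read()                 # priming READ
        while not at_eof(rec):       # DOW test
            rec = read()             # READ at the bottom
    elif method == "B":
        while True:
            rec = read()
            if not at_eof(rec):      # inner IF test
                pass                 # record processing would go here
            if at_eof(rec):          # DOU test, evaluated at the bottom
                break
    else:                            # method "C"
        rec = object()               # SETOF 80: pretend a record is present
        while not at_eof(rec):       # DOW test
            rec = read()
            if not at_eof(rec):      # inner IF test
                pass                 # record processing would go here
    return tests
```

For a file of three records, this model reports 4 tests for Method A, 8 for Method B and 9 for Method C. Method C pays for one extra DOW test on entry, on top of the two tests per iteration.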
The DOU can be very confusing. Most hard-to-find file reading bugs are caused by incorrectly implemented DOU statements. One of the very few cases where DOU is handy is when deleting a group of records sharing the same key value; see Example 2, below.
Example 2
    *IN80     DOUEQ *ON
    KEY       DELET RECORD                80
              ENDDO
The problem with the RPG DOU is that it is written backwards. In the Pascal language, for example, the equivalent of the DOU is implemented as shown in Example 3.
Example 3
    REPEAT
      do this and that
    UNTIL (A > B)
The test of the condition actually happens at the bottom of the loop, as shown in the Pascal example. This is also true of RPG, but there the condition is written at the top of the loop. That's why programmers have to remember that DOU always executes the loop at least once, whereas DOW will do nothing if the condition is false to begin with.
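The difference between the two loop types can be made concrete with a short Python model. Python has no bottom-tested loop, so `dou` below emulates one; the names `dou`, `dow` and `count_passes` are hypothetical and serve only as an illustration.

```python
def dou(condition, body):
    """Model of DOU: run the body first, then test the exit condition."""
    while True:
        body()
        if condition():
            break


def dow(condition, body):
    """Model of DOW: test the condition first, then run the body."""
    while condition():
        body()


def count_passes(loop, condition):
    """Return how many times the loop body ran."""
    passes = []
    loop(condition, lambda: passes.append(1))
    return len(passes)


# The exit condition is already satisfied, yet the DOU body still runs once.
dou_passes = count_passes(dou, lambda: True)

# The continue condition is false from the start, so the DOW body never runs.
dow_passes = count_passes(dow, lambda: False)
```

Here `dou_passes` is 1 and `dow_passes` is 0, which is exactly the behavior the programmer has to keep in mind when the DOU condition appears at the top of the loop.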
If we examine Method C carefully, we note that it is really Method B in disguise. The SETOF forces the loop to always be executed once, like the DOU.
Method A is more intuitively logical when "read" in English. "Read a record, while not at the end, do this or that…, read the next record and check the loop condition again." If the file was empty to start with, the first READ will signal the end of file, and the record-processing loop will not be entered. That's logical!
Method B reads in English as follows: "Do the following until the end of the file (when we read this, we have not encountered the file yet! What file is it talking about? If there was code between the DO and the READ, this would be even more confusing), then, read the file, if not the end of file, do this or that, check the condition again". If the file is empty to start with, Method B enters the loop only to exit again. This is totally illogical.
Method C is similar to Method B with the added twist that it does a SETOF, so that the program will always enter the loop. This makes the DOW behave like a DOU; but why? If the file is empty to start with, Method C enters a processing loop, tries to read a record (finds the end of file), and then has to make its way out of the loop. It should not have entered the processing loop if there was nothing to do in the first place. Again, this is totally illogical.
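The empty-file behavior of the three methods can be traced with a Python model. As before, this is a sketch rather than RPG: the function `trace` and its event names are hypothetical, and `None` stands in for indicator 80 being on.

```python
def trace(method, records=()):
    """Return the sequence of events each loop shape produces."""
    events = []
    it = iter(records)

    def read():
        events.append("read")
        return next(it, None)            # None means end of file

    if method == "A":
        rec = read()                     # priming READ
        while rec is not None:           # DOW
            events.append("enter loop")
            events.append("process")
            rec = read()                 # READ at the bottom
    elif method == "B":
        while True:                      # DOU, tested at the bottom
            events.append("enter loop")
            rec = read()
            if rec is not None:          # inner IF
                events.append("process")
            if rec is None:
                break
    else:                                # method "C"
        rec = object()                   # SETOF 80 forces the first pass
        while rec is not None:           # DOW
            events.append("enter loop")
            rec = read()
            if rec is not None:          # inner IF
                events.append("process")
    return events
```

With no records, Method A produces only a single `read` event and never enters the loop, while Methods B and C both record `enter loop` before they discover that there is nothing to process.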
The first READ of method A is often called a "priming read". This READ primes or sets the condition that will be tested by the DO. The second read resets the condition in preparation for another test. Method A is commonly referred to in the academic community as the "Priming Read Method".
Conclusions
Obviously, we should never use Method C. It is a DOU in disguise. We should write what we mean.
Method B is more complex, prone to errors, more difficult to read, and less efficient.
Method A is the clear winner.
Réal Bédard, BMath, can be reached at Helix Data Processing Consultants Ltd., (905) 479-3780 or at realb@helixdp.com