DLSuperCBT is a resynchronizing byte compare program. It reports changes in a way that shows where the files differ: bytes that have been changed between the two files, added to the new file, or deleted from the old file. The program is unusual in that most other byte compare programs rarely display meaningful results once data shifts between the files. In other words, most byte compare programs cannot generate accurate results when byte inserts and deletes push the two files out of sync.
DLSuperCBT uses the same comparison algorithm as DLSuperC. The only notable difference in the process is that the unit of comparison is a single byte instead of a text line. Even though the byte compare unit is small and can take only one of 256 values (0 to 255, most likely heavily biased toward the ASCII text character values), meaningful results can be shown.
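The idea of resynchronizing on single bytes can be illustrated with a minimal sketch. It uses Python's standard `difflib.SequenceMatcher` as a stand-in; this is not the DLSuperC algorithm, and the function name `byte_diff` and the sample data are assumptions made for illustration only.

```python
from difflib import SequenceMatcher

def byte_diff(old, new):
    """Return (tag, old_start, old_end, new_start, new_end) runs.

    Each byte is one comparison unit, much as a text line is the
    unit in a line compare. SequenceMatcher is only a stand-in for
    the real difference algorithm.
    """
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return matcher.get_opcodes()

# Hypothetical sample data: two bytes ("XY") inserted in the new file.
old = b"HEADER-1234-TRAILER"
new = b"HEADER-12XY34-TRAILER"

for tag, o1, o2, n1, n2 in byte_diff(old, new):
    print(f"{tag:8} old[{o1:02X}:{o2:02X}] new[{n1:02X}:{n2:02X}]")
```

After the two inserted bytes, the matching run resumes at different offsets in each file, which is what a non-resynchronizing compare fails to detect.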
If there is a major problem, it is the large number of identical sequences that exist in any large data file. This should not be a problem for the DLSuperC difference algorithm, since its ability to sift through identical lines and isolate the changed lines has been demonstrated in the line compare results from DLSuperC. If there is any complaint about the DLSuperCBT output, it is that many differences (i.e., noisy output) tend to camouflage the meaningful results even though the differences are accurate. Another minor complaint could be the amount of time needed to inspect large files with large differences.
The most useful comparisons result from minimal data changes and a corresponding presentation that can be meaningfully interpreted. File compares performed with DLSuperCBF give results equivalent to those of DLSuperCBT when a comparison result of “the files are the same or different” is acceptable. DLSuperCBT does more than just obtain a same or different result. It indicates where in the files the differences occur and presents a view of the byte differences that is often helpful. For example, it is useful in showing data differences in embedded module data paths. The embedded data does not have to be the same length. The difference can be observed in the ASCII data.
The presentation report takes the form of a traditional data dump divided into two vertical sections. The left section of each output line gives the hexadecimal representation of the data. The right section shows the character equivalent (where one exists) of each byte listed. The leftmost address on each line is the new file byte offset. The old file byte offset of the first byte on the line is listed in the rightmost column of the report line. Hence, the addresses on the left are the running offsets through the new file, and the equivalent resynchronization addresses for the old file are on the right.
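The layout just described can be sketched as a small formatting routine. The exact column widths, the 16-bytes-per-line assumption, the use of "." for unprintable bytes, and the name `dump_line` are all assumptions for illustration; the actual DLSuperCBT report may differ in detail.

```python
def dump_line(tag, new_offset, data, old_offset):
    """Format one report line of up to 16 bytes (assumed layout).

    Columns: tag, new file offset, hex bytes, character view,
    old file offset of the first byte on the line.
    """
    hex_part = " ".join(f"{b:02X}" for b in data).ljust(16 * 3 - 1)
    # Show printable ASCII as-is; anything else as a placeholder dot.
    text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in data)
    text_part = text_part.ljust(16)
    return f"{tag:<3} {new_offset:08X}  {hex_part}  {text_part}  {old_offset:08X}"

# Hypothetical matching line: new file offset 3FF0, old file offset 3FE5.
line = dump_line("", 0x3FF0, b"Hello\x00World", 0x3FE5)
print(line)
```

Note that the two offsets on one line need not agree: the left one tracks the new file, the right one the old file.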
There are several preference options that enhance the DLSuperCBT process. The partial byte compare options (i.e., Nofs and Oofs) allow the user to specify the sections of each file to be compared. Files can be long, and sometimes only part of a data file is of interest. Another option (i.e., Chgv) sets the maximum number of matching data lines displayed before and after each displayed line of changes when using the Chng listing type. The user can also customize the output file allocation by selecting the Apnd preference option. This causes successive reports to accumulate in the same output file, which is opened as Mod instead of New. Lastly, the Noss option bypasses the reporting of the comparison statistics, for cases where the statistics are secondary to the reporting of the lines of data that changed.
Users will find that the operation of DLSuperCBT is much like that of DLSuperC, except that there are no data filters (i.e., Dp Lines) to screen out data to be ignored.
Panel Field Discussion
DLSuperCBT retains the look, feel and presentation of its companion products: DLSuperC, DLSuperCX, DLSuperCRV and DLSuperCBF. A main tab panel allows the user to enter the names of the two files to be compared, plus an optional slot for the name of a file to which the output results are sent directly. The compare report can be inspected in a separate panel when the compare finishes.
The user can also save or print the displayed report at the completion of the compare. The optional output file slot on the main panel is provided because byte differences tend to generate a large report, and reviewing the comparison results is best done in the user’s favorite editor after the run. In other words, the user gets two ways to save the results. The output file also automates the operation, so that the on-screen review can be bypassed and the results reviewed later.
The report and the output file contain identical information.
As with the DLSuperC program, the user can select the format of the output report from a list. The listing options are Ovsum, Delta, Chng, or Long. The selections range from a minimal to a maximum amount of output.
- Ovsum produces the least amount of output – only the total statistics are displayed.
- Delta displays only the changed lines, with one preceding matched line above each change. Note: Line compare does not display this single preceding line, but in byte compare the changed data appears naked without it.
- Chng displays the report with a default of up to 6 lines above and 6 lines below each group of changed data lines. The user can modify the “before and after number of matches” by supplying a preference option amount; a match line value from 1 to 9 is allowed.
- Long produces a report with the full amount of output. All changes and matching data are reported using this listing option.
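The Chng selection rule can be sketched as follows. This is a minimal illustration of "changed lines plus up to N matching lines on each side", assuming each report line carries a changed/matched flag; the function name `select_context` is hypothetical and the real program's handling of overlapping regions is not documented here.

```python
def select_context(changed, context=6):
    """Return the indexes of report lines to display.

    `changed` is a list of booleans, one per report line. Every
    changed line is kept, along with up to `context` matching
    lines before and after it.
    """
    keep = set()
    for i, is_changed in enumerate(changed):
        if is_changed:
            lo = max(0, i - context)
            hi = min(len(changed), i + context + 1)
            keep.update(range(lo, hi))
    return sorted(keep)

# Hypothetical report of 10 lines with a single changed line at index 5.
flags = [False] * 10
flags[5] = True
print(select_context(flags, context=2))
```

With a context value of 2, only lines 3 through 7 are shown; the Long listing corresponds to showing every line regardless of the flags.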
There are also the standard File Selection, About, and Help/Overview panel tabs. These panels provide the services expected from most interactive programs. The file selection panel allows the user to select any or all of the three files (two input and one output) before returning to the main panel. The main panel provides three drop-down combo boxes that retain up to 10 previously specified names for each file slot.
The Help tab contains this help file. It is loaded during program initialization from DLSuperCBT.rtf, which must reside in the same directory as the DLSuperCBT.exe program file.
Compare Criteria, Results and Output Interpretation
As previously mentioned, DLSuperCBT uses essentially the same compare algorithm to detect changes as the other DLSuperC programs. However, the change detection criterion targets binary data as well as ASCII text. Each byte in the file is inspected, so many sequences that appear equivalent in text files can be examined in more detail for suspected differences.
For example, a text file comparison program treats several line ending character sequences as equivalent. A CR-LF sequence and a single LF character might both be interpreted as a text line ending control sequence. Similarly, one file might end with an EOF character (i.e., 1A hex) that is interpreted as the end of a text file, whereas the comparison file may not have an EOF character at the end. The two files would not appear different to a line compare program. Lastly, null lines often compare as equivalent to lines composed only of blank characters. These interpretations and rules are followed for text comparisons with DLSuperC and DLSuperCX. A byte compare, by contrast, reports each of these cases as a difference.
Yet interpreting the output of DLSuperCBT can sometimes be difficult, even though there are only four classifications for a line of displayed data:
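A short sketch makes the contrast concrete. The `normalize_text` function below is a hypothetical approximation of the three text rules just listed (line endings, trailing 1A hex, blank-only lines); it is not the actual DLSuperC rule set.

```python
def normalize_text(data):
    """Approximate how a line compare might view a file (assumed rules)."""
    # A trailing EOF character (1A hex) is ignored.
    if data.endswith(b"\x1a"):
        data = data[:-1]
    # bytes.splitlines() treats CR-LF, LF and CR as line endings.
    lines = data.splitlines()
    # A line of only blanks is treated like a null line.
    return [line.rstrip(b" ") for line in lines]

# Hypothetical files: same text, different line endings, blanks and EOF.
file_a = b"one\r\ntwo\r\n   \r\n\x1a"
file_b = b"one\ntwo\n\n"

print(normalize_text(file_a) == normalize_text(file_b))  # same as text
print(file_a == file_b)                                  # different as bytes
```

The two files are equal under the text rules but differ byte for byte, which is exactly the kind of difference a byte compare exposes.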
- (Blank) – A matching line of binary data bytes found in both the new and old file at a beginning synchronized offset. The old file offset references the first matching old file data byte displayed on this common line.
- (I -) – An inserted string of data bytes from the new file not found in the old file. The new file offset continues ascending as the inserts are shown.
- (D -) – A deleted string of data bytes found only in the old file. The new file corresponding offset is shown by the left address position. The old file offset can be determined from the rightmost first-old-file-reference-address.
- (DR-) – A deleted-replaced string of data bytes found only in the old file at a string position continuous with the inserted bytes in the I - tag line above. Similarly, the old file offset can be read from the rightmost old file reference.
Even so, it is difficult to follow the data differences using the new file offset address on the left as a guide while also following the old file differences using the old file offsets in the right-side column. When the files match at the beginning, these offsets start with the same value. As data differences accumulate in each file, the offsets diverge, depending on the extent of the inserted and deleted bytes. As mentioned, the offset shown for any old file difference is that of the first differing byte. Whereas the new file offset of a line is always a multiple of 16 (e.g., 3FF0 hex), the old file first-change offset may be any odd or even value. It takes a little practice to visualize the position of old file deleted bytes relative to the new file address displayed. It is hard to visualize that old file deletions occur in “zero space” while the new file offsets retain their same relative value.
Similarly, a sequence of inserted new file bytes does not change the offset of the old file byte position. Only an inserted string with a corresponding deleted-replaced string of the same length advances both offsets equally.
Interpreting the output takes practice, and this description may not make it entirely clear. It is the best that can be done with the presentation model presently used. With some study, the output begins to make sense.
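The way the two offsets advance under the four classifications can be sketched in a few lines. As before, `difflib.SequenceMatcher` stands in for the real algorithm, the name `tagged_runs` is hypothetical, and the new file offset assigned to a DR- run is an assumption. Runs are not split into 16-byte report lines here.

```python
from difflib import SequenceMatcher

def tagged_runs(old, new):
    """Yield (tag, new_offset, old_offset, data) for each run."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for op, o1, o2, n1, n2 in matcher.get_opcodes():
        if op == "equal":
            yield ("   ", n1, o1, new[n1:n2])   # both offsets advance
        elif op == "insert":
            yield ("I -", n1, o1, new[n1:n2])   # only new offset advances
        elif op == "delete":
            yield ("D -", n1, o1, old[o1:o2])   # only old offset advances
        else:  # "replace": inserted bytes, then the old bytes replaced
            yield ("I -", n1, o1, new[n1:n2])
            yield ("DR-", n2, o1, old[o1:o2])   # assumed new offset

# Hypothetical data: "CD" deleted from the old file, "IJ" added at the end.
runs = list(tagged_runs(b"ABCDEFGH", b"ABEFGHIJ"))
for tag, new_off, old_off, data in runs:
    print(f"{tag} {new_off:08X}  {data!r:12}  {old_off:08X}")
```

In this example the D - run sits at new file offset 2 and the following matching run also starts at new file offset 2, while the old file offset has moved from 2 to 4: the deletion occupies "zero space" in the new file.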
Note: The I/O processing routines in DLSuperCBT cannot handle files greater than 2 GB. Program execution continues normally if the user wants to run the compare up to the 2 GB boundary. A user who is aware of the 2 GB boundary limitation may want to set the NOfs and OOfs process options beforehand.