Wednesday, 2 September 2015

How the assembler works (2)

How to create a translate table

This time we are going to manipulate the location counter a bit and show how we can use it to our benefit when programming. Here is the setup: we are to write a program that receives data as a parameter and translates all upper case letters to lower case (leaving the lower case letters as they are). For that we need a translate table of 256 bytes that describes the translation.

First things first. We want a table containing all hexadecimal values from X'00' to X'FF', 256 bytes in all. The first line reserves the space for it with DS and names it LC (Lower Case).

The second line, the ORG instruction, moves the location counter back to the start of LC. Compare the location counter on the third line with the one on the first line: both are 00019E.

The third line fills in the whole table of 256 bytes. It looks like magic but is obvious once you know how it works. We define a constant (DC) consisting of 256 one-byte values (AL1). Each of these 256 bytes takes the value of the current location counter (*), which increases by one for each byte, minus the start of the table (LC). The result is shown on the left of the listing: the empty LC definition is now filled with the values X'00' to X'FF'.

00019E                              183 LC       DS    CL256        
00029E                0029E 0019E   184          ORG   LC           
00019E 0001020304050607             185          DC    256Al1(*-LC) 
0001A6 08090A0B0C0D0E0F                                             
0001AE 1011121314151617                                             
0001B6 18191A1B1C1D1E1F                                             
0001BE 2021222324252627                                             
0001C6 28292A2B2C2D2E2F                                             
0001CE 3031323334353637                                             
0001D6 38393A3B3C3D3E3F                                             
0001DE 4041424344454647                                             
0001E6 48494A4B4C4D4E4F                                             
0001EE 5051525354555657                                             
0001F6 58595A5B5C5D5E5F                                             
0001FE 6061626364656667                                             
000206 68696A6B6C6D6E6F                                             
00020E 7071727374757677                                             
000216 78797A7B7C7D7E7F                                             
00021E 8081828384858687                                             
000226 88898A8B8C8D8E8F                                             
00022E 9091929394959697                                             
000236 98999A9B9C9D9E9F                                             
00023E A0A1A2A3A4A5A6A7                                             
000246 A8A9AAABACADAEAF                                             
00024E B0B1B2B3B4B5B6B7                                             
000256 B8B9BABBBCBDBEBF                                             
00025E C0C1C2C3C4C5C6C7                                             
000266 C8C9CACBCCCDCECF                                             
00026E D0D1D2D3D4D5D6D7                                             
000276 D8D9DADBDCDDDEDF                                             
00027E E0E1E2E3E4E5E6E7                                             
000286 E8E9EAEBECEDEEEF                                             
00028E F0F1F2F3F4F5F6F7                                             
000296 F8F9FAFBFCFDFEFF                                             
See *(1)
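
The evaluation of 256AL1(*-LC) can be modelled outside the assembler. The following is only a Python sketch of the idea, not assembler code: the names storage and location_counter are made up for the illustration, and the address X'19E' is taken from the listing.

```python
# Model of what the assembler does for:
#   LC  DS  CL256
#       ORG LC
#       DC  256AL1(*-LC)
LC = 0x19E                      # address of LC, taken from the listing

storage = {}                    # address -> byte value (stand-in for the object code)
location_counter = LC           # ORG LC moves the counter back to the table start

for _ in range(256):            # duplication factor 256
    # '*' is the current location counter, so *-LC is the offset into the table
    storage[location_counter] = location_counter - LC
    location_counter += 1       # each AL1 constant occupies one byte

table = bytes(storage[LC + i] for i in range(256))
print(table[:8].hex().upper())  # 0001020304050607, the first row of the listing
print(hex(location_counter))    # 0x29e, the first byte after the table
```

Because every byte is evaluated at its own location, one DC statement is enough to produce 256 different values.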

However, that alone is not going to help us a lot. We still need to replace all upper case letters with lower case letters. We do that by manipulating the location counter once again, this time moving it to the start of the upper case alphabet. The DC line shows that the counter is now at X'00025F'. Then we replace nine characters with their lower case equivalents, which we know have a value 64 less than the upper case letters. The EBCDIC letters come in groups of nine, except for the last group starting at 'S', which has eight.

00029E                0029E 0025F   186          ORG   LC+C'A'       
00025F 8182838485868788             187          DC    9AL1(*-LC-64) 
000267 89                                                            
000268                00268 0026F   188          ORG   LC+C'J'       
00026F 9192939495969798             189          DC    9AL1(*-LC-64) 
000277 99                                                            
000278                00278 00280   190          ORG   LC+C'S'       
000280 A2A3A4A5A6A7A8A9             191          DC    8AL1(*-LC-64) 
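
The three ORG/DC pairs can be checked with a small Python model. This is a sketch under assumptions: overlay is a hypothetical helper, and Python's cp037 codec stands in for the EBCDIC code page, which matches the listing for the letters A to Z.

```python
def overlay(table, start_char, count, codec="cp037"):
    """Model ORG LC+C'x' followed by DC nAL1(*-LC-64)."""
    start = start_char.encode(codec)[0]   # C'A' is X'C1' in EBCDIC
    for offset in range(start, start + count):
        table[offset] = offset - 64       # lower case sits 64 below upper case

lc = bytearray(range(256))                # identity table, as in the first listing
overlay(lc, "A", 9)                       # A to I
overlay(lc, "J", 9)                       # J to R
overlay(lc, "S", 8)                       # S to Z

print(lc[0xC1:0xCA].hex().upper())        # 818283848586878889
```

Only the 26 entries for the upper case letters change; every other byte still translates to itself.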

Danish has three extra vowels that we must translate too *(2).
000288                00288 00219   192          ORG   LC+C'Æ'       
000219 C0                           193          DC    C'æ'          
00021A                0021A 0021A   194          ORG   LC+C'Ø'       
00021A 6A                           195          DC    C'ø'          
00021B                0021B 001F9   196          ORG   LC+C'Å'       
0001F9 D0                           197          DC    C'å'          
0001FA                001FA 0029E   198          ORG   LC+256        
We simply use the actual character as the offset into the table and replace that entry with the lower case character.
Please note the last line, ORG LC+256. Forgetting it is a common mistake. What happens if we forget it? The next instruction or data constant will start at location counter 0001FA and overwrite part of the LC table.
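
The effect of the missing ORG can be illustrated with a little arithmetic in Python. This is a simplified sketch: assemble_next and the four-byte next_constant are hypothetical, and the table is modelled as a plain identity table with the letter overrides left out for brevity. The addresses X'19E' and X'1FA' come from the listing.

```python
LC = 0x19E                     # start of the table, from the listing
TABLE_END = LC + 256           # X'29E', where ORG LC+256 puts the counter

def assemble_next(location_counter, data):
    """Place data at the location counter over a fresh identity table.

    Returns the table offsets whose contents were overwritten.
    """
    storage = {LC + i: i for i in range(256)}
    for byte in data:
        storage[location_counter] = byte
        location_counter += 1
    return [addr - LC for addr in range(LC, TABLE_END) if storage[addr] != addr - LC]

next_constant = b"\xFF" * 4    # hypothetical 4-byte constant following the table

damaged = assemble_next(0x1FA, next_constant)      # final ORG forgotten
intact = assemble_next(TABLE_END, next_constant)   # final ORG present

print([hex(offset) for offset in damaged])   # ['0x5c', '0x5d', '0x5e', '0x5f']
print(intact)                                # []
```

Without the final ORG the new constant lands on table offsets X'5C' to X'5F', right after the entry for Å; with it, the constant starts at X'29E' and the table is untouched.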

By the way, the translate table can be used with the TR instruction, for example:
          TR    AREA,LC
Upper case letters in AREA will be converted to lower case.
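
For readers without an assembler at hand, Python's bytes.translate does the same byte-for-byte table lookup as TR. The sketch below rests on assumptions: build_lc is a hypothetical helper, the Danish code points are read off the listing above (Æ X'7B', Ø X'7C', Å X'5B'), and cp037 is used for the sample text only because it ships with Python. It agrees with the listing for A to Z but not for the three Danish letters.

```python
def build_lc():
    """Rebuild the LC table from the listing as a 256-byte translate table."""
    table = bytearray(range(256))                  # DC 256AL1(*-LC)
    for start, count in ((0xC1, 9), (0xD1, 9), (0xE2, 8)):
        for offset in range(start, start + count):
            table[offset] = offset - 64            # DC nAL1(*-LC-64)
    table[0x7B] = 0xC0                             # Æ -> æ
    table[0x7C] = 0x6A                             # Ø -> ø
    table[0x5B] = 0xD0                             # Å -> å
    return bytes(table)

LC = build_lc()

area = "Hello WORLD".encode("cp037")   # sample data in EBCDIC
area = area.translate(LC)              # the equivalent of TR AREA,LC
print(area.decode("cp037"))            # hello world
```

One difference is that TR changes AREA in place, while bytes.translate returns a new byte string.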


*(1) - If you wish to see the whole table, as here, precede the first line with PRINT DATA and follow the table with PRINT NODATA to switch it off again.
*(2) - The Scandinavian languages are a little bit more sophisticated than English. ;-)
