7 February 2004

Code that moves itself

I recently bought a small laptop on eBay for a hundred bucks (a Soyo PW-9800). I've decided to remove the Windows partition from the Soyo's hard disk and devote the whole hard disk to Karig. I want to be able to take the laptop to a café somewhere, plug it in, turn it on, and have it boot up into Karig, so I can start writing or coding while I have my coffee. So I'm dropping the tiny-compiler miniproject for now and concentrating on writing something I can store on the Soyo's hard disk.

So I'm working on a boot sector that will move itself down to address 0:0x0800 and then load 62KB from the hard disk (filling memory up to address 0:0xFFFF). This 62KB block would contain Karig's setup code, kernel, and compiler. I just had to figure out how to get NASM to produce the kind of code I wanted.

The problem

First of all, NASM complains when you include more than one ORG directive in a source-code file (it prints "error: program origin redefined" and produces no output file), so this code won't work:

    [ORG 0x7C00]
    ; copy code to 0x0800
    ; jump to code at 0x0800
    [ORG 0x0800]
    ; code continues

So I had to come up with a source-code file with one ORG directive, but which contained code to be executed from two locations in memory: The block of code that sets up and executes the copying of the boot-sector code to the new location had to be executed from the old location (0x7C00 to 0x7DFF), while everything else had to be executed from the new location (0x0800 to 0x09FF).

NASM uses the ORG directive to figure out how to translate a JMP statement into the correct machine code. If I use [ORG 0x7C00] and enter "JMP 0x0800", NASM produces three bytes of machine code. However, what NDISASM (the disassembler that comes with NASM) produces is not

    00000000  E90008  jmp 0x0800

as a newbie might expect, but rather

    00000000  E9FD8B  jmp 0x8c00

This requires a little study.
First, the '0x8c00' adds up, because NDISASM assumes "ORG 0", but I wanted "ORG 0x7C00", and if you add the word 0x7C00 to the word 0x8C00 (discarding the carry out of 16 bits), you get the word 0x0800, which is the JMP destination I wanted.

But second, look at the second and third bytes of machine code, "FD8B", which, read as a little-endian word, give "8BFD". Where did this come from? The answer is that, in near jumps, the address in a JMP instruction is really a signed number that is added to the processor's instruction pointer. In this case, 0x8C00 is really minus 0x7400, so the processor is subtracting 0x7400 from 0x7C00 and getting 0x0800. Also, 0x8BFD is really minus 0x7403. If the processor loads the JMP instruction and updates the instruction pointer as soon as it knows how many bytes the instruction takes up, then the address in the instruction pointer isn't 0x7C00, but 0x7C03, the address of the next instruction. Subtract 0x7403, and again you get 0x0800. So everything adds up here.

But the bulk of the code in the boot sector, and in the sectors that the boot sector is supposed to load, requires "[ORG 0x0800]" because the bulk of this code will be executed within the block of memory that begins at 0x0800, not the block of memory beginning at 0x7C00 where the BIOS loads the boot sector. The catch is that the code that actually jumps to the new block at 0x0800 must be executed from the old block at 0x7C00 and therefore requires an "[ORG 0x7C00]". If I try to do a "JMP 0x0800" after setting an "[ORG 0x0800]", I get

    00000000  E9FDFF  jmp 0x0

which is equivalent to a "JMP NEAR $" (a near jump to the current instruction, that is, a three-byte endless loop), which is not what I want at all.

Solution: Far jump

The solution is to use a far jump: "JMP 0:0x0800". A near jump, when executed, adds a number to the instruction pointer; a far jump, when executed, stores a number into the instruction pointer (and a new segment into CS), so its encoding does not depend on where the instruction itself sits. However, a far jump is five bytes long, and I wanted to save bytes wherever I could.
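To put the two encodings side by side, here is a minimal sketch that assembles both forms under a single "[ORG 0x7C00]". The file name jumps.asm and the two labels are made up for this illustration; the byte values in the comments follow from the arithmetic above and from the standard far-jump encoding (opcode EA, then the offset word, then the segment word).

```nasm
; Sketch only. Assemble with:  nasm -f bin jumps.asm -o jumps.bin
; Inspect with:                ndisasm -b 16 -o 0x7C00 jumps.bin
; (jumps.asm, near_form and far_form are hypothetical names.)
[BITS 16]
[ORG 0x7C00]

near_form:                      ; located at 0x7C00
        jmp     0x0800          ; near jump, 3 bytes: E9 FD 8B
                                ; rel16 = 0x0800 - (0x7C00 + 3) = 0x8BFD,
                                ; so the bytes change if the ORG changes

far_form:                       ; located at 0x7C03
        jmp     0:0x0800        ; far jump, 5 bytes: EA 00 08 00 00
                                ; absolute offset 0x0800, segment 0x0000,
                                ; so the bytes are the same under any ORG
```

Changing the ORG line to 0x0800 and reassembling should alter only the first instruction's bytes, which is the property the far jump is being used for here.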
(I don't know how much I can cram into 62KB, but I'd like to find out.)

Then I realized that every boot sector I've written since my first one started off with a far jump, which I used to straighten out the CS register (to ensure that it contained a zero, and not 0x07C0, which is what some BIOSes put in there before calling the boot sector, according to something I read somewhere). Suppose I began the boot sector with code that didn't depend on the exact content of CS (that is, it made no subroutine calls) and put the far jump at the end of this code instead of at the beginning? Before this, I had planned on using a far jump to straighten out CS and a near jump to the 0x0800 block; I'd replace these with a single far jump to the 0x0800 block that at the same time straightened out CS. I would thus save three bytes (the length of a near-jump instruction).

The code

There are two parts to this section. The first covers the new boot-sector statements needed to implement (1) copying the boot sector to 0x0800 and (2) jumping to the boot-sector copy without crashing. The second part covers changing the destination of the jump, so as to make room for some variables or other data in the boot sector, so that they will be copied along with the rest of the boot sector and will still be available after the jump has been made. The complete boot-sector code is here.

A simple jump

What I have here is code that jumps from address 0x7C00+N to address 0x0800+N+5 (five being the length in bytes of a far jump). The code is in two "stages," with the first stage being the copying and the far jump, and the second being whatever the code in the 0x0800 block does. The code is 16-bit, and the origin is 0x0800.
The Stage-One code should be able to run anywhere in memory, so that changing the origin won't change the machine code produced from these instructions (i.e., the fact that the ORG directive here is technically wrong for these instructions doesn't matter, because NASM will produce the correct machine code no matter what the ORG value is).

    [ORG 0x0800]
    [BITS 16]

First, I straighten out the four data-segment registers; all of these should point at segment zero. (Note that I've removed the far jump I've been using to straighten out the code-segment register.)

    stage_1:
        xor ax, ax
        mov ds, ax
        mov es, ax
        mov fs, ax
        mov gs, ax

Then I set up a stack, beginning at 0x0800 and growing downwards (so that the first word pushed will be written to 0x07FE).

        cli
        mov ss, ax
        mov sp, 0x0800

Having stored in the stack pointer the address to which the boot sector is to be copied, I start preparing for the move by copying the address into DI before re-enabling interrupts.

        mov di, sp
        sti

Now I finish setting up the move, which requires the following registers to be set up:

    DS:SI   source address
    ES:DI   destination address
    CX      number of words to move
    DF      direction flag (clear, so that SI and DI count upward)
So far, I've already set DS, ES, and DI. I still need to clear the direction flag and set SI and CX.

        cld
        mov cx, 256
        mov si, 0x7C00

Finally I can move the boot sector...

        rep movsw

...and make my far jump. Note that I have to put in "(stage_2 - stage_1)" because I want to skip over that part of the boot sector that has already been executed. I wrote that the far jump goes from 0x7C00+N to 0x0800+N+5; the "(stage_2 - stage_1)" is the "N+5" here.

        jmp 0:0x0800 + (stage_2 - stage_1)
    stage_2:

Verifying that the jump occurred

To ensure that the copy and jump worked, I wrote some Stage-Two test code. First, I wiped out the original boot sector by filling it with zeroes.

        cld
        xor al, al          ; AH is still zero from stage_1, so AX = 0 for STOSW
        mov di, 0x7C00
        mov cx, 256
        rep stosw

Then I dumped 64 bytes from the boot-sector copy, and 64 bytes from the old boot-sector block. As I had hoped, the memory dump occurred, the first four lines displayed machine-code bytes, and the last four lines displayed null bytes.

        call clear_screen
        mov bx, 0x0800      ; test code only
        call dump_16
        call dump_16
        call dump_16
        call dump_16
        mov bx, 0x7C00      ; test code only
        call dump_16
        call dump_16
        call dump_16
        call dump_16

Jumping over a block of variables

I wanted to be able to specify variables or constants right in the source code, and to have such data right in the boot sector so that I don't need to write code to set them up outside of the boot sector, and so they'd be copied too. I'm thinking of using this option when writing the code to load that 62KB of sectors from the disk.

You usually place such data between subroutines, after JMP or RET statements. That far jump I used to go to Stage Two didn't have to land at the instruction immediately following the copy of that far-jump instruction; it could land a little further up in memory, thus leaving a space between the far-jump instruction copy and the first instruction executed afterward. This space would be perfect for storing variables or tables if you have no other place for them.
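As a rough sketch of how a named variable might sit in that space, the boot sector below parks one byte between the far jump and stage_2 and writes to it from Stage Two. The name boot_drive and the hang loop are hypothetical, and the idea that the BIOS leaves the boot drive number in DL is a common convention that I am assuming here, not something the code above relies on.

```nasm
; Sketch only: a one-byte variable stored in the gap after the far jump.
; Assemble with: nasm -f bin gap.asm -o gap.bin   (gap.asm is a made-up name)
[ORG 0x0800]
[BITS 16]

stage_1:
        xor     ax, ax
        mov     ds, ax
        mov     es, ax
        cli
        mov     ss, ax
        mov     sp, 0x0800
        mov     di, sp              ; destination of the copy
        sti
        cld
        mov     cx, 256             ; 256 words = 512 bytes
        mov     si, 0x7C00          ; source: where the BIOS loaded us
        rep     movsw
        jmp     0:0x0800 + (stage_2 - stage_1)

boot_drive:                         ; hypothetical variable, copied with the sector
        db      0

stage_2:
        ; Because the origin is 0x0800, the label boot_drive resolves to the
        ; address inside the COPY, so it is only safe to use after the jump.
        mov     [boot_drive], dl    ; assumes DL still holds the BIOS drive number

hang:                               ; hypothetical placeholder for real Stage-Two code
        hlt
        jmp     hang

        times 510 - ($ - $$) db 0   ; pad to the boot signature
        dw      0xAA55
```

Since stage_1 sits exactly at the origin, "0x0800 + (stage_2 - stage_1)" and plain "stage_2" should evaluate to the same offset; the longer form just makes the arithmetic visible.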
So my first modification here will be to the far jump. The modified far jump will jump from 0x7C00+N to 0x0800+N+5+X, where X is the size of whatever non-code stuff I want to put into the boot sector. My second modification here will be to add test code to prove that the far jump is indeed landing at exactly the correct address, whether X is equal to zero or anything else.

Here follows the above source code, with the modifications pointed out as they come up. The first part of the boot sector is unchanged. Stage One still needs to straighten out the segment registers, set up the call stack, and copy the boot sector to address 0x0800.

    [ORG 0x0800]
    [BITS 16]

    stage_1:
        xor ax, ax
        mov ds, ax
        mov es, ax
        mov fs, ax
        mov gs, ax
        cli
        mov ss, ax
        mov sp, 0x0800
        mov di, sp
        sti
        cld
        mov cx, 256
        mov si, 0x7C00
        rep movsw

But here I add a line of test code. I want to use a counter to prove that my far jump lands in the right place. The counter is in BX, which I set to zero. (I have to put the instruction in Stage One, before the jump is made, to ensure that it is executed.)

        xor bx, bx

Then I jump. I don't really have to modify the jump instruction at all. All I have to do is store my data between the jump instruction and the stage_2 label.

        jmp 0:0x0800 + (stage_2 - stage_1)
        dd 0x12345678, 0x9ABCDEF0
        inc bx
        inc bx
        inc bx
    stage_2:

Note that if the jump lands a byte or two too low in memory, the counter will be incremented one or two too many times (each "inc bx" is a single byte). The test will succeed only if the counter contains a one at the end of this.

The counter should still be zero immediately after the jump. I increment it here, at the beginning of Stage Two. If the jump lands a byte or two too high in memory, BX will not be incremented.

        inc bx

I need a way to display the value of the counter on the screen. So I wipe out the original boot-sector code, as before, but I store BX to the first byte there.
        cld
        xor al, al
        mov di, 0x7C00
        mov cx, 256
        rep stosw
        mov [0x7C00], bx

The last dozen lines of source code are the same, but the effect has one difference: The first byte on the fifth line printed on the screen (beginning with 0000:7C00:) should be 01, not 00, because BX was just stored there, and BX should contain a one.

        call clear_screen
        mov bx, 0x0800
        call dump_16
        call dump_16
        call dump_16
        call dump_16
        mov bx, 0x7C00
        call dump_16
        call dump_16
        call dump_16
        call dump_16

Results

So the results need to be as follows:
So my code works. The lines that appear on my laptop look like this:

    0000:0800: EA 05 7C 00 00 8C C8 8E D8 8E C0 8E E0 8E E8 FA | ..| ...........
    0000:0810: 8E D0 BC 00 08 89 E7 FB B9 00 01 BE 00 7C F3 A5 | ... ..... .. |..
    0000:0820: 31 DB E9 09 8C 78 56 34 12 F0 DE BC 9A F4 43 FC | 1....xV4......C.
    0000:0830: 30 C0 BF 00 7C B9 00 01 F3 AB 89 1E 00 7C E8 20 | 0.. |. ..... |.
    0000:7C00: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |
    0000:7C10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |
    0000:7C20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |
    0000:7C30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |

Now I can get busy writing the code to read 62KB from the disk. It'll probably just be a rewrite of code already presented.