Search (using Google):  Web Karig

 

24 January 2004

(Revised 25 January 2004)

Threads in Karig

At this point it is way too early for me to write any code that handles threads. I'm still working on the prototype compiler; I need to finish more entries on finding memory; I need to finish my system loader before I really worry about Karig's kernel. But I decided to write this up, because I worked out the steps that Karig needs to take if it is to support doing things in the background. The steps are subject to change, of course, but as they are, they're worth sharing.

First of all, a thread is a string of instructions with its own data stack and return stack. The system can have several of these set up at the same time, but only one thread can be running at any given moment. The system keeps a schedule — a list of threads. When one thread stops running, the system grabs the next task from the schedule and starts it running. When the system reaches the bottom of the schedule, it starts over from the top, so that all threads are run in a continuous cycle.

This schedule can change at any time. Any thread can start a new thread, so that a secondary job can be completed in the background, and the original thread can continue to do its job independently. In addition, a thread can finish its job at any time and exit — that is, allow the system to remove it from the schedule.

Each thread usually runs only for a tiny fraction of a second before it stops running and the next thread starts. In this way, the system produces the illusion that the computer is doing several things at once, such as running a search and generating a report listing all documents that meet certain criteria, while at the same time converting your keystrokes into text, storing the text into the current document, and reformatting the document on the fly as you type.

I expect that the system will run a thread or two of its own. For example, Karig may run a thread to search the system to seek out "holes" in memory left by deleted objects and to close those holes by moving undeleted objects down in memory. (I'm considering a memory allocator that just keeps a pointer to the top of allocated memory and just moves the pointer up to allocate more memory.) If Karig has to do something fairly regularly, I'll want Karig to do it in a background thread — I don't want the system to just lock up while Karig is doing housekeeping.

Karig will offer cooperative multitasking. This will mean that each thread must either finish its work quickly and exit, or do a portion of its work and then yield. In cooperative multitasking, a thread has to cooperate with the other threads on the system and yield control to them before too much time has passed. Most modern operating systems offer pre-emptive multitasking, which interrupts threads at arbitrary moments, to ensure that no thread can take too much time away from other threads. I decided to stick with cooperative multitasking because it's simpler to implement — I don't have to deal with locks or critical sections.

The THREAD word

In Karig, one thread will be able to run a word in its own new thread with the thread word. This word will stop the current thread, add a new thread to the schedule, put the address of the word into the return stack of the new thread, and "return" to the new thread so that it starts running immediately.

I thought of two ways of doing this. The first way wasn't very good.

The bad version

In the first version, the thread word would be used like this:

word ... thread ... ;

Here, a word turns itself into a thread. Now why is this a really bad idea? To figure this out, you need to know what has to happen in order to stop one thread and start a new one.

To stop a thread so that it can be restarted, the system has to save its state so that when it comes time to let the thread run again, the state can be restored, and the thread will continue as if it had never stopped and nothing had changed. The state of a thread includes the contents of the processor's registers and the contents of the thread's data and return stacks. If the word thread occurs in the middle of the definition of a word word that is supposed to be run as a new thread, then of course there is the huge risk that the words that preceded thread in the definition changed the registers and stacks between the time that word was called and the time that thread was called. The idea is to allow the caller of word to be able to continue reliably after making the call to word, and that requires leaving the registers and stacks as they were when the call was made. So requiring thread to appear in the definition of word before word can be run in its own thread is dangerous.

It is also inflexible. In this scheme, word would always run in its own thread; there'd be no way to run it as part of the current thread without redefining the word.

The good version

In the second version, you'd use this syntax:

caller ... thread word ... ;

In other words, the word to be run as a thread appears immediately after the word thread. So if you wanted to print something in the background, you'd enter thread print. Note that you still have the option to run the word as part of the current thread, and you could run almost any word in its own thread.

Note that this means that if I enter thread word in a word definition, the compiler cannot compile word in the way it normally would — by adding a CALL instruction to the definition of word — because that would run the word in the current thread. So thread has to do something to make the compiler operate differently. For this to happen, thread needs to be one of those words that is not compiled, but executed at compile-time.

I thought of writing the compiler so that it exposed a few of its services, so that if you wanted to write a word that was to be executed at compile-time, you could use these services. These services would be very simple jobs, to be completed in only a handful of instructions. Three such services would be (1) to return the next word waiting to be processed, (2) to look a word up in the dictionary and return its address, and (3) generate a CALL instruction to an address. If the compiler was to process the words thread print, it would encounter thread, discover that this word was to be executed, and execute it. Then thread would ask the compiler for the next word, which would be print, and then ask for that word's address. Then thread could have the compiler generate a CALL to that address, or it could do something else with the address.

I expect I'll give these services names that begin with an asterisk, to mark them as words that the programmer would not normally use.

Here are the steps that I expect that thread will perform at compile-time:

  • Call compiler service *next to get next word.

  • Call compiler service *addr to look word up in dictionary and get address.

  • Compile code to push this address onto the data stack at run-time (probably by calling a compiler service *lit).

  • Compile code to call the kernel routine *thread. This routine "freezes" the current thread, sets up a new thread, and starts the new thread running. It also fixes the return stack of the "frozen" thread so that when the scheduler next "thaws" and runs the thread, the thread will continue running from the correct address within the original caller routine.

So the code added to the code space at compile-time by thread word looks something like this:

		lea	esi, [esi-4]           ; 3 bytes
		mov	[esi], eax             ; 2 bytes
		mov	eax, [address_of_word] ; 5 bytes
		call	_thread                ; 5 bytes

Here are the steps taken by the generated code at run-time:

  • Push address of word onto the data stack and call *thread. This adds to the current thread's return stack an address within caller.

  • The word address is copied into a variable, private to *thread, and is dropped from the data stack. (One advantage of cooperative multitasking is that routines don't have to be "thread-safe.")

  • The return address is left on the return stack, so that the scheduler can return to the correct location, when it is again caller's turn to run.

  • The *freeze routine is called. This saves all registers to the register space set aside for the current thread (adjusting ESP to account for the address to return from this routine).

  • If I decide that I want overflow protection for stacks, then I'll place each stack into its own 4KB page, with adjacent pages marked absent. If I do this, then *freeze will also mark the current thread's data- and return-stack pages absent.

  • Once the current thread is frozen, *thread creates a new thread. It asks for two unused pages for creating a new data and return stack, and it asks for space into which registers can be saved when it is time to freeze this new thread. (If there isn't space to do all this, then the error must be handled — the error should be recorded in a system log somewhere, and perhaps the current thread can be unfrozen and the word can be run in the current thread.)

  • The word address is pushed onto the (otherwise empty) new return stack.

  • The new thread is made the current thread, and the scheduler is updated.

  • The *thread routine returns, which starts the new thread running.

  • When the new thread calls yield, other threads (such as the original caller) can run.

  • When the new thread returns (i.e., tries to execute a RET instruction when the stack is empty), this triggers a stack underflow. The system takes over and quietly destroys the thread (thus eliminating the need for a special exit routine). (Obviously this requires that I set up an IDT and point it at a stack-underflow handler.)

More stuff about threads

Thread support in Karig will be simple. A thread in Karig will have its own data stack and return stack, but not its own address space. All threads will operate in the same address space, and the programmer would be expected to remember that there will be other threads running and that any thread that can't complete its work in a very short time should use yield liberally.

I have no plans to support priorities among threads. Karig will include a simple round-robin scheduler that simply runs each thread on the schedule in turn.

Many OSes support blocking a thread (refusing to let it run) until an event occurs, or letting a thread sleep (not run) until a certain amount of time has passed. I don't plan to support these, as they would complicate the scheduler (by requiring it to check for conditions and keep a log of which thread to wake up when a certain condition is true). Better that a thread handle its own blocking and sleeping, and check for its own condition, by defining a word that checks for the condition, yielding if the condition is false, and returning when it is true, like this:

wait-until-ready not-ready if yield then ;

Check the index for other entries.