Here's a bit of evil hackery to brighten your weekend.
It's quite simple to switch between coroutines using standard C: use setjmp() to save the state of the current coroutine, and longjmp() to jump to the target coroutine. It's also useful to be able to transfer a bit of data while transferring control. The function coto() (short for "coroutine goto") does this. First it saves its argument somewhere the target coroutine can find it, then it calls setjmp() which saves the current coroutine's state and returns 0, so it goes on to longjmp() to the target coroutine, where setjmp() returns 1 so coto() returns the saved argument.
static void *coarg; void *coto(jmp_buf here, jmp_buf there, void *arg) { coarg = arg; if (setjmp(here)) return(coarg); longjmp(there, 1); }
The other thing you need to be able to do is create coroutines. This requires allocating space for the coroutine's stack, initializing a stack frame there, and jumping to it. This is usually considered impossible to do without special support from the operating system or the run-time system - for example, see this paper (postscript). However if we can assume the stack is laid out contiguously in memory and that variable length arrays are allocated on the stack, we can do it using pure standard C99. (If your compiler doesn't support variable length arrays, alloca() might be available as an alternative.)
Each coroutine's stack is a chunk of the memory area used for the normal process stack. To allocate a new chunk of stack, we add the chunk size to the current top of stack (which allows space for future function calls by the current topmost coroutine) to give us a target location for the new topmost coroutine's stack. We then need to move the stack pointer to this new location, which we can do by creating a variable length array of the appropriate length. We can use the address of a local variable to approximate the current stack pointer when calculating this length. Finally, a simple function call creates the coroutine's initial stack frame and jumps to it.
The function cogo() (short for "coroutine start") implements this. It also saves the current coroutine's state before starting the new one.
#define STACKDIR - // set to + for upwards and - for downwards #define STACKSIZE (1<<12) static char *tos; // top of stack void *cogo(jmp_buf here, void (*fun)(void*), void *arg) { if (tos == NULL) tos = (char*)&arg; tos += STACKDIR STACKSIZE; char n[STACKDIR (tos - (char*)&arg)]; coarg = n; // ensure optimizer keeps n if (setjmp(here)) return(coarg); fun(arg); abort(); }
Here's a little test program to demonstrate these functions in action.
#define MAXTHREAD 10000 static jmp_buf thread[MAXTHREAD]; static int count = 0; static void comain(void *arg) { int *p = arg, i = *p; for (;;) { printf("coroutine %d at %p arg %p\n", i, (void*)&i, arg); int n = arc4random() % count; printf("jumping to %d\n", n); arg = coto(thread[i], thread[n], (char*)arg + 1); } } int main(void) { while (++count < MAXTHREAD) { printf("spawning %d\n", count); cogo(thread[0], comain, &count); } return 0; }
January 23 2010, 17:43:16 UTC 11 years ago
January 23 2010, 22:14:57 UTC 11 years ago
I should also note that Eric Freudenthal has also implemented the stack hack, though not so concisely. He even inflicts it on his students as part of a course on operating systems! http://robust.cs.utep.edu/freudent/courses/osWeb/labs/tthreads/
January 24 2010, 13:35:05 UTC 11 years ago
January 24 2010, 19:48:13 UTC 11 years ago Edited: January 24 2010, 19:49:46 UTC
When a function returns, its frame is deallocated only in the sense that it is liable to being overwritten - the memory isn't given back to the OS. So if we make sure we don't overwrite it, the frame can hang around as long as we want.
The purpose of the array n in cogo() is to provide space for the current stack to grow, without overwriting the stack frame created by cogo()'s call to fun(arg). Actually, it does a bit more than that because if the current coroutine is not the topmost chunk on the stack, the array has to span all the more recently created coroutines' stack chunks.
For a slightly more developed coroutine library (50ish lines of code and 150ish lines of comments) have a look at http://dotat.at/prog/picoro/. It has its own data type and a constructor function and everything!
Deleted comment
February 1 2010, 12:20:17 UTC 11 years ago
(I know that on BSD, setjmp and longjmp make system calls to set the signal mask, so one should use _setjmp and _longjmp instead.)
You don't have to put all the coroutine stacks in chunks of the main stack: you can instead malloc() them, and use the same stack pointer arithmetic trick (either a variable length array or a call to alloca) to make the first switch to the new stack :-)
Anonymous
July 1 2015, 14:26:48 UTC 5 years ago
Hey,
I picked up your picoro lib, its awesome! Im trying to lift stack size limits by switching stack to malloc(). Could you elaborate on this? Code is complex enough so i easily loose track of what should be happening there. I tried obvious thing of replacing stack array with dynamically allocated chunk of memory but something is missing and im not entirely sure what. Besides there probably should be some stack switching back? Also could you give me some hints on stack pointer manipulation tricks? All i know is unportable inline asm.
July 1 2015, 15:44:06 UTC 5 years ago
Look at coroutine1() in http://dotat.at/cgi/git?p=picoro.git;a=blob;f=picoro.c;hb=smaller
The dummy array saves some stack for the parent coroutine and moves the stack pointer to the child coroutine.
To use malloc() you need something like this (untested)
Anonymous
July 5 2015, 10:56:32 UTC 5 years ago
Question on Stack Allocation
Anonymous
February 24 2015, 17:42:45 UTC 6 years ago
Re: Question on Stack Allocation
February 24 2015, 18:27:45 UTC 6 years ago