I have written a program; suppose it's called worker.
(While the program is written in Haskell, I don't think that's particularly relevant to this post.)
When run, worker starts a bunch of copies of a script.
Under normal circumstances this script sets up a container using Linux cgroups and Linux user namespaces, but none of that is relevant: the strange behaviour in question occurs just fine without it. In fact, we'll let it start the following script, say ./sleep.sh:
#!/bin/bash
sleep 10
Clearly, there is no weird behaviour here, assuming that the system has bash under /bin, and mine does.
The copies of sleep.sh are started by passing ./sleep.sh to posix_spawnp(3).
(The Haskell process library does this for me.)
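For readers who want to poke at the same call outside Haskell, here is a minimal sketch in Python, whose os.posix_spawnp wraps the same libc function. It is not the actual worker: the helper names spawn_copies and wait_all are made up for illustration, and printing the message to stderr is an assumption.

```python
import errno
import os
import sys


def spawn_copies(program, copies):
    """Start `copies` instances of `program` via posix_spawnp and return their pids."""
    pids = []
    for _ in range(copies):
        try:
            # argv[0] is conventionally the program name; the environment is inherited.
            pid = os.posix_spawnp(program, [program], os.environ)
        except OSError as exc:
            if exc.errno == errno.EFAULT:
                # Same message as in the post; writing it to stderr is an assumption.
                print("Oops EFAULT", file=sys.stderr)
            raise
        pids.append(pid)
    return pids


def wait_all(pids):
    """Reap every child; return its exit code, or None if it was killed by a signal."""
    codes = []
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        codes.append(os.WEXITSTATUS(status) if os.WIFEXITED(status) else None)
    return codes
```

With an executable ./sleep.sh in the current directory, wait_all(spawn_copies("./sleep.sh", 4)) would start four copies and collect their exit codes once they finish.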
The thing is, occasionally (approximately once every 5 to 10 invocations of ./worker), posix_spawnp returns EFAULT ("Bad address").
The manpage for posix_spawnp says that:
ERRORS
The posix_spawn() and posix_spawnp() functions fail only in the case where the underlying fork(2), vfork(2) or clone(2) call fails; in these cases, these functions return an error number, which will be one of the errors described for fork(2), vfork(2) or clone(2).
In addition, these functions fail if:
ENOSYS Function not supported on this system.
Okay, so I should look for EFAULT in fork(2), vfork(2) and clone(2) to figure out what goes wrong, right?
Wrong.
Or, in any case, none of those manpages mention EFAULT.
I've looked through the source code of posix_spawnp in glibc and it at least doesn't return EFAULT directly; presumably, one of the subroutines it calls does.
glibc is large and I don't think looking through the entire call tree will be very productive, so I tried to diagnose the issue from the outside instead.
And this is where the weirdness starts.
Whenever my program encounters EFAULT from posix_spawnp, it prints Oops EFAULT; hence grepping for EFAULT gives output precisely if the error occurred in this run.
I get the following observations:
./worker 2>&1 | grep EFAULT: errors occur.
./worker 2>&1 | grep EFAULT | cat: errors DO NOT occur.
./worker 2>&1 | grep --line-buffered EFAULT | cat: errors occur.
./worker 2>&1 | grep --line-buffered EFAULT: errors occur.
("errors occur" means that once every few executions I get output indicating that EFAULT occurred; in the negative case I've run it for >20x the number of invocations that are necessary to produce EFAULT in the other cases, without any EFAULT.)
The only situation in which posix_spawnp seems to always succeed is when the stdout of the process that worker's output is piped to is block-buffered.
But this makes no sense: there shouldn't be any reasonable way for worker to determine whether this is the case!
Surely it can distinguish between ./worker | cat and ./worker (using isatty(3); this is precisely what grep does when not passed --line-buffered), but in all of the above cases the output is piped to another process anyway.
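To make concrete what a process can observe about its own stdout, here is a rough Python sketch of the isatty-style check. The describe_fd helper is hypothetical and not something worker contains. In all four pipelines above it would report "pipe" for worker's stdout, and nothing in it reveals how the reader on the other end buffers its own output.

```python
import os
import stat


def describe_fd(fd=1):
    """Classify a file descriptor using only what the process itself can see."""
    if os.isatty(fd):
        return "terminal"
    mode = os.fstat(fd).st_mode
    if stat.S_ISFIFO(mode):
        return "pipe"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"
```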
This is already spooky, but it gets even spookier: if I replace the invocation of ./sleep.sh by an invocation of sleep (i.e. removing the indirection of the shell script), errors occur in none of the above setups.
Somehow, starting a script is different from starting a native process (and changing bash to dash in sleep.sh doesn't change anything).
posix_spawnp shouldn't care what it is starting!
That's the job of the loader, as far as I know.
So what gives?
I'll try to reduce my own program to a minimal reproducer, and if I find anything I'll post an update here. In the meantime, spookiness.