I have written a program; suppose it's called worker.
(While the program is written in Haskell, I don't think that's particularly relevant to this post.)
(EDIT: Reproducer can be found here.)
(EDIT 2: Diagnosis by int-e on IRC here.)
When run, worker starts a bunch of copies of a script.
Under normal circumstances this script sets up a container using Linux cgroups and Linux user namespaces, but none of that is relevant because the strange behaviour in question occurs just fine without all of that -- in fact, we'll let it start the following script, say ./sleep.sh:
#!/bin/bash
sleep 10
Clearly, there is no weird behaviour here, assuming that the system has bash under /bin, and mine does.
The copies of sleep.sh are started by passing ./sleep.sh to posix_spawnp(3).
(The Haskell process library does this for me.)
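To show the call in isolation, here is a minimal sketch in Python rather than Haskell, on the assumption that Python's os.posix_spawnp ends up in the same libc function. The helper name spawn_script and the one-second stand-in for sleep.sh are made up for this illustration and are not part of worker.

```python
import errno
import os
import stat
import tempfile


def spawn_script(path):
    """Start `path` via posix_spawnp and return the child's pid.

    Raises OSError if the spawn fails; exc.errno then holds the error number.
    """
    # argv[0] is the script itself; the child gets a copy of our environment.
    return os.posix_spawnp(path, [path], dict(os.environ))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for ./sleep.sh, shortened to one second.
        script = os.path.join(tmp, "sleep.sh")
        with open(script, "w") as f:
            f.write("#!/bin/bash\nsleep 1\n")
        os.chmod(script, stat.S_IRWXU)

        pids = []
        for _ in range(4):
            try:
                pids.append(spawn_script(script))
            except OSError as exc:
                # Mirrors worker's "Oops EFAULT" message for whatever errno we got.
                print("Oops", errno.errorcode.get(exc.errno, exc.errno))
        for pid in pids:
            os.waitpid(pid, 0)
```

I would not expect this sketch to reproduce the EFAULT by itself; it only shows the shape of the call and where the error number would surface.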
The thing is, occasionally (approximately once every 5 to 10 invocations of ./worker), posix_spawnp returns EFAULT ("Bad address").
The manpage for posix_spawnp says that:
ERRORS
The posix_spawn() and posix_spawnp() functions fail only in the case where the underlying fork(2), vfork(2) or clone(2) call fails; in these cases, these functions return an error number, which will be one of the errors described for fork(2), vfork(2) or clone(2).
In addition, these functions fail if:
ENOSYS Function not supported on this system.
Okay, so I should look for EFAULT in fork(2), vfork(2) and clone(2) to figure out what goes wrong, right?
Wrong.
Or, in any case, none of those manpages mention EFAULT.
I've looked through the source code of posix_spawnp in glibc and it at least doesn't return EFAULT directly; presumably, one of the subroutines it calls does.
glibc is large and I don't think looking through the entire call tree will be very productive, so I tried to diagnose the issue from the outside instead.
And this is where the weirdness starts.
Whenever my program encounters EFAULT from posix_spawnp, it prints Oops EFAULT; hence grepping for EFAULT gives output precisely if the error occurred in this run.
I get the following observations:
./worker 2>&1 | grep EFAULT: errors occur.
./worker 2>&1 | grep EFAULT | cat: errors DO NOT occur.
./worker 2>&1 | grep --line-buffered EFAULT | cat: errors occur.
./worker 2>&1 | grep --line-buffered EFAULT: errors occur.
("Errors occur" means that once every few executions I get output indicating that EFAULT occurred; in the negative case I ran it for more than 20 times the number of invocations needed to produce EFAULT in the other cases, without seeing any EFAULT.)
The only situation in which posix_spawnp seems to always succeed is when the stdout of the process that worker's output is piped to is block-buffered.
But this makes no sense: there shouldn't be any reasonable way for worker to determine whether this is the case!
Sure, it can distinguish between ./worker | cat and ./worker (using isatty(3) -- this is precisely what grep does when not passed --line-buffered), but in all of the above cases the output is piped to another process anyway.
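To make concrete how little a process can learn about the other end of its stdout, here is a small sketch, again in Python and assuming os.isatty is a thin wrapper around isatty(3). The function name fd_kind is hypothetical.

```python
import os
import stat


def fd_kind(fd):
    """Classify a file descriptor as 'tty', 'pipe', or 'other'."""
    if os.isatty(fd):
        return "tty"
    if stat.S_ISFIFO(os.fstat(fd).st_mode):
        return "pipe"
    return "other"


if __name__ == "__main__":
    # Reports what our own stdout is attached to, not how the reader buffers.
    print(fd_kind(1))
```

Run directly in a terminal this prints tty; piped into cat it prints pipe. In all four pipelines above the answer for worker would be pipe, and nothing here reveals how the downstream grep buffers its own output.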
This is already spooky, but it gets even spookier: if I replace the invocation of ./sleep.sh by an invocation of sleep (i.e. removing the indirection of the shell script), errors occur in none of the above setups.
Somehow, starting a script is different from starting a native process (and changing bash to dash in sleep.sh doesn't change anything).
posix_spawnp shouldn't care what it is starting!
That's the job of the loader, as far as I know.
So what gives?
I'll try to reduce my own program to a minimal reproducer, and if I find anything I'll post an update to this post.
In the meantime, spookiness.
snap-server modifies the environment to set the locale, and setenv(3) is not atomic.
In particular, it breaks execve(2) when the two race, and that is what happens here.
All possible solutions to this problem are hacks.
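As an illustration of what one such hack could look like (not the fix actually used, and written in Python only as an analogy for the C-level race), the sketch below serializes environment writes and spawns behind one lock and hands the child a private snapshot of the environment. The names _env_lock, set_env_locked and spawn_locked are hypothetical. The big assumption is that every writer goes through the lock, which is exactly what you cannot guarantee when a library such as snap-server calls setenv by itself.

```python
import os
import threading

# Hypothetical process-wide lock guarding every environment write and spawn.
_env_lock = threading.Lock()


def set_env_locked(name, value):
    """Set an environment variable while holding the lock."""
    with _env_lock:
        os.environ[name] = value


def spawn_locked(path, argv):
    """Spawn a child with a snapshot of the environment taken under the lock."""
    with _env_lock:
        env = dict(os.environ)  # private copy handed to the child
        return os.posix_spawnp(path, argv, env)


if __name__ == "__main__":
    set_env_locked("WORKER_DEMO_LOCALE", "C")
    pid = spawn_locked("sh", ["sh", "-c", 'echo "$WORKER_DEMO_LOCALE"'])
    os.waitpid(pid, 0)
```

The lock is held for the whole spawn call on purpose: the point is that no write to the environment can happen between taking the snapshot and the child starting to execute.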