From 813aae5bddcb2f8e34371cd5be44f8dcbfdf8a04 Mon Sep 17 00:00:00 2001
From: Tom Smeding
Date: Thu, 8 Sep 2022 14:25:27 +0200
Subject: Add efault diagnosis, thanks int-e

---
 bugs/efault.html | 13 +++++++++----
 bugs/efault.md   | 13 +++++++++++--
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/bugs/efault.html b/bugs/efault.html
index d00a51c..6aceda4 100644
--- a/bugs/efault.html
+++ b/bugs/efault.html
@@ -1,7 +1,8 @@

The impossible EFAULT

I have written a program; suppose it's called worker.
-(While the program is written in Haskell, I don't think that's particularly relevant to this post.)
-(EDIT: Reproducer can be found here.)

+(While the program is written in Haskell, I don't think that's particularly relevant to this post.)

+

(EDIT: Reproducer can be found here.)

+

(EDIT 2: Diagnosis by int-e on irc here.)

When run, worker starts a bunch of copies of a script. Under normal circumstances this script sets up a container using Linux cgroups and Linux user namespaces, but none of that is relevant because the strange behaviour in question occurs just fine without all of that -- in fact, we'll let it start the following script, say ./sleep.sh:

#!/bin/bash
@@ -41,5 +42,9 @@ Somehow, starting a script is different from starting a native process (and chan
 posix_spawnp shouldn't care what it is starting!
 That's the job of the loader, as far as I know.
 So what gives?

-

I'll try to reduce my own program to a minimal reproducer, and if I find anything I'll post an update to this post.
-In the meantime, spookiness.

+

The cause

+

I'll try to reduce my own program to a minimal reproducer, and if I find anything I'll post an update to this post.
+In the meantime, spookiness.

+

snap-server modifies the environment to set the locale, and setenv(3) is not atomic.
+In particular, it breaks execve(2) when they race, and this is what happens.
+All possible solutions to this problem are hacks.

diff --git a/bugs/efault.md b/bugs/efault.md
index c66a6e1..1d434e2 100644
--- a/bugs/efault.md
+++ b/bugs/efault.md
@@ -2,8 +2,11 @@
 
 I have written a program; suppose it's called `worker`.
 (While the program is written in Haskell, I don't think that's particularly relevant to this post.)
+
 (EDIT: Reproducer can be found [here](https://git.tomsmeding.com/snap-efault/tree/).)
 
+(EDIT 2: Diagnosis by `int-e` on irc [here](https://paste.tomsmeding.com/D22SvR2T).)
+
 When run, `worker` starts a bunch of copies of a script.
 Under normal circumstances this script sets up a container using Linux cgroups and Linux user namespaces, but none of that is relevant because the strange behaviour in question occurs just fine without all of that -- in fact, we'll let it start the following script, say `./sleep.sh`:
 
@@ -54,5 +57,11 @@ Somehow, starting a script is different from starting a native process (and chan
 That's the job of the loader, as far as I know.
 So what gives?
 
-I'll try to reduce my own program to a minimal reproducer, and if I find anything I'll post an update to this post.
-In the meantime, spookiness.
+### The cause
+
+I'll try to reduce my own program to a minimal reproducer, and if I find anything I'll post an update to this post.
+In the meantime, spookiness.
+
+`snap-server` [modifies the environment](https://github.com/snapframework/snap-server/blob/8d89c10014d8d295bfbf5419bbb8551de32d7f85/src/Snap/Http/Server.hs#L161) to set the locale, and `setenv(3)` is not atomic.
+In particular, it breaks `execve(2)` when they race, and this is what happens.
+All possible solutions to this problem are hacks.
-- 
cgit v1.2.3-54-g00ecf
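
Note on the diagnosis above: the patch says that `setenv(3)` racing with `execve(2)` is the cause and that every fix is a hack. One such hack, sketched below in C, is to copy the environment once while the process is still single-threaded and then pass that private copy to every spawn, so a later `setenv` in another thread cannot move the array the kernel is reading. This is only a sketch under stated assumptions, not the fix used by the author or by `snap-server`: the helper names `snapshot_environ` and `spawn_with_env` are hypothetical, the snapshot is assumed to be taken before any thread can modify the environment, and the child is started by absolute path with `posix_spawn` so that no `PATH` lookup touches the live environment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <spawn.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: deep-copy the current environment into a private,
 * NULL-terminated array. Assumption: no other thread can call
 * setenv/putenv/unsetenv while this runs (e.g. call it at startup). */
static char **snapshot_environ(void) {
    size_t n = 0;
    if (environ != NULL) {
        while (environ[n] != NULL) n++;
    }
    char **copy = malloc((n + 1) * sizeof *copy);
    if (copy == NULL) return NULL;
    for (size_t i = 0; i < n; i++) {
        copy[i] = strdup(environ[i]);
        if (copy[i] == NULL) {
            /* Undo the partial copy on allocation failure. */
            while (i > 0) free(copy[--i]);
            free(copy);
            return NULL;
        }
    }
    copy[n] = NULL;
    return copy;
}

/* Hypothetical helper: run the program at absolute path `path` with the
 * private environment `envp`, wait for it, and return its exit status
 * (or -1 if spawning or waiting failed, or the child did not exit normally). */
static int spawn_with_env(const char *path, char *const argv[],
                          char *const envp[]) {
    assert(path != NULL && argv != NULL && envp != NULL);
    pid_t pid;
    if (posix_spawn(&pid, path, NULL, NULL, argv, envp) != 0) return -1;
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The trade-off is the one the patch hints at: children see the environment as it was at snapshot time, so any later `setenv` (such as a locale change) is not inherited unless the snapshot is retaken under a lock that also guards every environment write.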