author: Tom Smeding <tom@tomsmeding.com> 2022-09-08 14:25:27 +0200
committer: Tom Smeding <tom@tomsmeding.com> 2022-09-08 14:25:27 +0200
commit: 813aae5bddcb2f8e34371cd5be44f8dcbfdf8a04
tree: 2bd99d6dd9671b5628f2da729252f968c8629030
parent: f4288dd09a9be0dad2cf955695d66d1b37c107d2
Add efault diagnosis, thanks int-e
-rw-r--r-- bugs/efault.html 13
-rw-r--r-- bugs/efault.md 13
2 files changed, 20 insertions, 6 deletions
diff --git a/bugs/efault.html b/bugs/efault.html
index d00a51c..6aceda4 100644
--- a/bugs/efault.html
+++ b/bugs/efault.html
@@ -1,7 +1,8 @@
<h2>The impossible EFAULT</h2>
<p>I have written a program; suppose it's called <code>worker</code>.
-(While the program is written in Haskell, I don't think that's particularly relevant to this post.)
-(EDIT: Reproducer can be found <a href="https://git.tomsmeding.com/snap-efault/tree/">here</a>.)</p>
+(While the program is written in Haskell, I don't think that's particularly relevant to this post.)</p>
+<p>(EDIT: Reproducer can be found <a href="https://git.tomsmeding.com/snap-efault/tree/">here</a>.)</p>
+<p>(EDIT 2: Diagnosis by <code>int-e</code> on irc <a href="https://paste.tomsmeding.com/D22SvR2T">here</a>.)</p>
<p>When run, <code>worker</code> starts a bunch of copies of a script.
Under normal circumstances this script sets up a container using Linux cgroups and Linux user namespaces, but none of that is relevant because the strange behaviour in question occurs just fine without all of that -- in fact, we'll let it start the following script, say <code>./sleep.sh</code>:</p>
<pre><code class="language-bash">#!/bin/bash
@@ -41,5 +42,9 @@ Somehow, starting a script is different from starting a native process (and chan
<code>posix_spawnp</code> shouldn't care what it is starting!
That's the job of the loader, as far as I know.
So what gives?</p>
-<p>I'll try to reduce my own program to a minimal reproducer, and if I find anything I'll post an update to this post.
-In the meantime, spookiness.</p>
+<h3>The cause</h3>
+<p><s>I'll try to reduce my own program to a minimal reproducer, and if I find anything I'll post an update to this post.
+In the meantime, spookiness.</s></p>
+<p><code>snap-server</code> <a href="https://github.com/snapframework/snap-server/blob/8d89c10014d8d295bfbf5419bbb8551de32d7f85/src/Snap/Http/Server.hs#L161">modifies the environment</a> to set the locale, and <code>setenv(3)</code> is not atomic.
+In particular, it breaks <code>execve(2)</code> when they race, and this is what happens.
+All possible solutions to this problem are hacks.</p>
diff --git a/bugs/efault.md b/bugs/efault.md
index c66a6e1..1d434e2 100644
--- a/bugs/efault.md
+++ b/bugs/efault.md
@@ -2,8 +2,11 @@
I have written a program; suppose it's called `worker`.
(While the program is written in Haskell, I don't think that's particularly relevant to this post.)
+
(EDIT: Reproducer can be found [here](https://git.tomsmeding.com/snap-efault/tree/).)
+(EDIT 2: Diagnosis by `int-e` on irc [here](https://paste.tomsmeding.com/D22SvR2T).)
+
When run, `worker` starts a bunch of copies of a script.
Under normal circumstances this script sets up a container using Linux cgroups and Linux user namespaces, but none of that is relevant because the strange behaviour in question occurs just fine without all of that -- in fact, we'll let it start the following script, say `./sleep.sh`:
@@ -54,5 +57,11 @@ Somehow, starting a script is different from starting a native process (and chan
That's the job of the loader, as far as I know.
So what gives?
-I'll try to reduce my own program to a minimal reproducer, and if I find anything I'll post an update to this post.
-In the meantime, spookiness.
+### The cause
+
+<s>I'll try to reduce my own program to a minimal reproducer, and if I find anything I'll post an update to this post.
+In the meantime, spookiness.</s>
+
+`snap-server` [modifies the environment](https://github.com/snapframework/snap-server/blob/8d89c10014d8d295bfbf5419bbb8551de32d7f85/src/Snap/Http/Server.hs#L161) to set the locale, and `setenv(3)` is not atomic.
+In particular, it breaks `execve(2)` when they race, and this is what happens.
+All possible solutions to this problem are hacks.
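
Note on the diagnosis above: one of the "hacks" it alludes to could be to stop relying on the live process environment when spawning children. The sketch below is an illustration only, written in Python rather than the worker's Haskell, and is not taken from the post or from `snap-server`. It snapshots the environment once and passes that private copy explicitly to every child. The names `ENV_SNAPSHOT` and `spawn_with_snapshot` are hypothetical, and whether this fully avoids the race in a given runtime is an assumption that would need checking.

```python
import os
import subprocess

# Private copy of the environment, taken when this module is loaded.
# Assumption: this happens before any thread starts modifying the environment.
ENV_SNAPSHOT = dict(os.environ)


def spawn_with_snapshot(argv, extra=None):
    """Run argv with an explicit environment built from the snapshot.

    The child receives its own environment mapping, so later changes to
    the process-wide environment are not visible to it.
    """
    env = dict(ENV_SNAPSHOT)  # copy, so callers never mutate the snapshot
    if extra:
        env.update(extra)
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=True)
```

For example, `spawn_with_snapshot(["./sleep.sh"])` would start the script with the snapshot, regardless of any locale variables set later by another library. The cost is that intentional environment changes made after startup no longer reach children unless passed via `extra`.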