[REGRESSION] ptrace broken from "cgroup: cgroup v2 freezer" (76f969e)

From: Alex Xu (Hello71) @ 2019-05-13  1:20 UTC
  To: linux-kernel, tj, guro; +Cc: oleg, kernel-team

Hi,

I was trying to use strace recently and found that it exhibited some 
strange behavior. I produced this minimal test case:

#include <unistd.h>

int main() {
    write(1, "a", 1);
    return 0;
}

which, when built and run with "gcc test.c && strace ./a.out", produces
this strace output:

[ pre-main omitted ]
write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[ repeats forever ]

The correct result is of course:

[ pre-main omitted ]
write(1, "a", 1)                        = 1
exit_group(0)                           = ?
+++ exited with 0 +++

Strangely, this only occurs when standard output is a tty-like device.
Running "strace ./a.out" from a native Linux x86 console or a terminal
emulator triggers the abnormal behavior. However, the following commands
work correctly:

- strace ./a.out >/dev/null
- strace ./a.out >/tmp/a # /tmp is a standard tmpfs
- strace ./a.out >&- # causes -1 EBADF (Bad file descriptor)

"strace -o /tmp/a ./a.out" hangs and produces the above (infinite) 
output to /tmp/a.
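
To tell which of these cases a given run falls into, a helper along
these lines can classify a file descriptor the same way the list above
does. This is only a sketch: describe_fd and enum fd_kind are made-up
names, not part of the test case, and the helper relies solely on
isatty(3) and the errno it sets.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical classification, introduced only for this illustration. */
enum fd_kind {
    FD_TTY,     /* console or terminal emulator */
    FD_CLOSED,  /* descriptor is not open, as with ">&-" */
    FD_OTHER    /* regular file, pipe, /dev/null, ... */
};

/*
 * Hypothetical helper (not part of the original test case): report what
 * kind of object a file descriptor refers to.
 */
static enum fd_kind describe_fd(int fd)
{
    if (isatty(fd))
        return FD_TTY;
    /* isatty() returned 0 and set errno; EBADF means fd is not open. */
    return errno == EBADF ? FD_CLOSED : FD_OTHER;
}
```

Calling describe_fd(1) at the top of main() and reporting the result on
fd 2 would show which case a run is in without writing to the
descriptor under test.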

I bisected this to 76f969e, "cgroup: cgroup v2 freezer". Reverting the
entire patchset resolved the issue (reverting only that one commit
caused a conflict). I skimmed the patch and came up with the workaround
below, which also resolves the issue. I am not at all clear on the
technical workings of the patchset, but it seems to me that a process's
frozen status is supposed to be "suspended" while a frozen process is
ptraced, and "unsuspended" when ptracing ends. It therefore seems
suspicious to always "enter frozen" whether or not the cgroup is
actually frozen. It seems like the code should instead check whether the
cgroup is actually frozen and, if so, restore the frozen status.
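
To make the suggestion concrete, here is a userspace toy model of the
two behaviors. It is not kernel code and does not reflect the real data
structures or locking; every name in it (toy_cgroup, toy_task,
toy_enter_frozen_always, toy_enter_frozen_if_needed) is invented for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; these types do NOT exist in the kernel. */
struct toy_cgroup {
    bool freeze_requested;  /* someone asked to freeze this cgroup */
};

struct toy_task {
    const struct toy_cgroup *cgrp;
    bool frozen;            /* task is accounted as frozen */
};

/* The behavior questioned above: mark the task frozen on every stop. */
static void toy_enter_frozen_always(struct toy_task *t)
{
    assert(t != NULL);
    t->frozen = true;
}

/* The suggested behavior: only do so if the cgroup is being frozen. */
static void toy_enter_frozen_if_needed(struct toy_task *t)
{
    assert(t != NULL && t->cgrp != NULL);
    if (t->cgrp->freeze_requested)
        t->frozen = true;
}
```

In this model, a task in a cgroup that nobody asked to freeze ends up
marked frozen under the first variant and stays untouched under the
second, which is the distinction the paragraph above is pointing at.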

I am using systemd but not any other cgroup features. I tried in an 
initramfs environment (no systemd, /init -> shell script) and reproduced 
the failing test case.

Please CC me on replies.

Thanks,
Alex.

---
 kernel/signal.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 62f9aea4a15a..47145d9d89ca 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2110,7 +2110,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, kernel_siginfo_t
                preempt_disable();
                read_unlock(&tasklist_lock);
                preempt_enable_no_resched();
-               cgroup_enter_frozen();
+               //cgroup_enter_frozen();
                freezable_schedule();
        } else {
                /*
@@ -2289,7 +2289,7 @@ static bool do_signal_stop(int signr)
                }
 
                /* Now we don't run again until woken by SIGCONT or SIGKILL */
-               cgroup_enter_frozen();
+               //cgroup_enter_frozen();
                freezable_schedule();
                return true;
        } else {
-- 
2.21.0
