* Re: [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads [not found] ` <CAHk-=wh0KoEZXPYMGkfkeVEerSCEF1AiCZSvz9TRrx=Kj74D+Q@mail.gmail.com> @ 2021-05-04 8:39 ` Peter Zijlstra 2021-05-04 15:35 ` Borislav Petkov 0 siblings, 1 reply; 14+ messages in thread From: Peter Zijlstra @ 2021-05-04 8:39 UTC (permalink / raw) To: Linus Torvalds Cc: Andy Lutomirski, Thomas Gleixner, Stefan Metzmacher, Jens Axboe, Linux Kernel Mailing List, io-uring, the arch/x86 maintainers, linux-toolchains + linux-toolchains On Mon, May 03, 2021 at 12:14:45PM -0700, Linus Torvalds wrote: > On Mon, May 3, 2021 at 9:05 AM Andy Lutomirski <luto@amacapital.net> wrote: > > > > Linus, what is the actual effect of allowing gdb to attach these threads? Can we instead make all the regset ops do: > > > > if (not actually a user thread) return -EINVAL; > > I don't think it matters - the end result ends up being the same, ie > gdb gets confused about whether the (parent) thread is a 32-bit or > 64-bit one. > > So the basic issue is > > (a) we want the IO threads to look exactly like normal user threads > as far as the kernel is concerned, because we had way too many bugs > due to special cases. > > (b) but that means that they are also visible to user space, and then > gdb has this odd thing where it takes the 64-bit vs 32-bit data for > the whole process from one thread, and picks the worst possible thread > to do it (ie explicitly not even the main thread, so usually the IO > thread!) > > That (a) ended up really being critical. The issues with special cases > were just horrendous, both for security issues (ie "make them kernel > threads but carry user credentials" just caused lots of problems) but > also for various just random other state handling issues (signal state > in particular). > > So generally, the IO threads are now 100% normal threads - it's > literally just that they never return to user space because they are > always just doing the IO offload on the kernel side. > > That part is lovely, but part of the "100% IO threads" really is that > they share the signal struct too, which in turn means that they very > much show up as normal threads. Again, not a problem: they really > _are_ normal threads for all intents and purposes. > > But then that (b) issue means that gdb gets confused by them. I > personally think that's just a pure gdb mis-feature, but I also think > that "hey, if we just make the register state look like the main > thread, and unconfuse gdb that way, problem solved". > > So I'd actually rather not make these non-special threads any more > special at all. And I strongly suspect that making ptrace() not work > on them will just confuse gdb even more - so it would make them just > unnecessarily special in the kernel, for no actual gain. > > Is the right thing to do to fix gdb to not look at irrelevant thread B > when deciding whether thread A is 64-bit or not? Yeah, that seems like > obviously the RightThing(tm) to me. > > But at the same time, this is arguably about "regression", although at > the same time it's "gdb doesn't understand new user programs that use > new features, film at 11", so I think that argument is partly bogus > too. > > So my personal preference would be: > > - make those threads look even more like user threads, even if that > means giving them pointless user segment data that the threads > themselves will never use > > So I think Stefan's patch is reasonable, if not pretty. Literally > becasue of that "make these threads look even more normal" > > - ALSO fix gdb that is doing obviously garbage stupid things > > But I'm obviously not involved in that "ALSO fix gdb" part, and > arguably the kernel hack then makes it more likely that gdb will > continue doing its insane broken thing. Anybody on toolchains that can help get GDB fixed? ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads 2021-05-04 8:39 ` [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads Peter Zijlstra @ 2021-05-04 15:35 ` Borislav Petkov 2021-05-04 15:55 ` Simon Marchi 0 siblings, 1 reply; 14+ messages in thread From: Borislav Petkov @ 2021-05-04 15:35 UTC (permalink / raw) To: Peter Zijlstra Cc: Linus Torvalds, Andy Lutomirski, Thomas Gleixner, Stefan Metzmacher, Jens Axboe, Linux Kernel Mailing List, io-uring, the arch/x86 maintainers, linux-toolchains On Tue, May 04, 2021 at 10:39:23AM +0200, Peter Zijlstra wrote: > Anybody on toolchains that can help get GDB fixed? In the meantime, Tom is looking at fixing this, in case people wanna try gdb patches or give him a test case or so... https://sourceware.org/bugzilla/show_bug.cgi?id=27822 Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads 2021-05-04 15:35 ` Borislav Petkov @ 2021-05-04 15:55 ` Simon Marchi 2021-05-05 11:29 ` Stefan Metzmacher 0 siblings, 1 reply; 14+ messages in thread From: Simon Marchi @ 2021-05-04 15:55 UTC (permalink / raw) To: Borislav Petkov, Peter Zijlstra Cc: Linus Torvalds, Andy Lutomirski, Thomas Gleixner, Stefan Metzmacher, Jens Axboe, Linux Kernel Mailing List, io-uring, the arch/x86 maintainers, linux-toolchains On 2021-05-04 11:35 a.m., Borislav Petkov wrote: > On Tue, May 04, 2021 at 10:39:23AM +0200, Peter Zijlstra wrote: >> Anybody on toolchains that can help get GDB fixed? > > In the meantime, Tom is looking at fixing this, in case people wanna try > gdb patches or give him a test case or so... > > https://sourceware.org/bugzilla/show_bug.cgi?id=27822 Yes, please provide reproducing steps in that bug. Unlike what was said in this thread, some people do work on gdb and are willing to fix things, but they can only do so if they know about the problem. Simon ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads 2021-05-04 15:55 ` Simon Marchi @ 2021-05-05 11:29 ` Stefan Metzmacher 2021-05-05 21:59 ` Simon Marchi 0 siblings, 1 reply; 14+ messages in thread From: Stefan Metzmacher @ 2021-05-05 11:29 UTC (permalink / raw) To: Simon Marchi, Borislav Petkov, Peter Zijlstra Cc: Linus Torvalds, Andy Lutomirski, Thomas Gleixner, Jens Axboe, Linux Kernel Mailing List, io-uring, the arch/x86 maintainers, linux-toolchains Am 04.05.21 um 17:55 schrieb Simon Marchi: > On 2021-05-04 11:35 a.m., Borislav Petkov wrote: >> On Tue, May 04, 2021 at 10:39:23AM +0200, Peter Zijlstra wrote: >>> Anybody on toolchains that can help get GDB fixed? >> >> In the meantime, Tom is looking at fixing this, in case people wanna try >> gdb patches or give him a test case or so... >> >> https://sourceware.org/bugzilla/show_bug.cgi?id=27822 > > Yes, please provide reproducing steps in that bug. Unlike what was said > in this thread, some people do work on gdb and are willing to fix > things, but they can only do so if they know about the problem. See https://lore.kernel.org/io-uring/0375b37f-2e1e-7999-53b8-c567422aa181@samba.org/ and https://lore.kernel.org/io-uring/20210411152705.2448053-1-metze@samba.org/T/#m461f280e8c3d32a49bc7da7bb5e214e90d97cf65 The question is why does inferior_ptid doesn't represent the thread that was specified by 'gdb --pid PIDVAL' https://www.samba.org/~metze/strace-uring-fail.txt used "gdb --pid 1396" and does the following ptrace calls: # grep ptrace strace-uring-fail.txt > 18:46:35.319925 ptrace(PTRACE_ATTACH, 1396) = 0 <0.000048> > 18:46:35.321622 ptrace(PTRACE_ATTACH, 1397) = 0 <0.000059> > 18:46:35.322813 ptrace(PTRACE_ATTACH, 1398) = 0 <0.003052> > 18:46:35.327287 ptrace(PTRACE_ATTACH, 1399) = 0 <0.000028> > 18:46:35.334920 ptrace(PTRACE_GETREGS, 1396, NULL, 0x7ffed6173ea0) = 0 <0.000067> > 18:46:35.341506 ptrace(PTRACE_SETOPTIONS, 1410, NULL, PTRACE_O_TRACESYSGOOD) = 0 <0.000056> > 18:46:35.341681 ptrace(PTRACE_SETOPTIONS, 1410, NULL, PTRACE_O_TRACEFORK) = 0 <0.000051> > 18:46:35.341816 ptrace(PTRACE_SETOPTIONS, 1410, NULL, PTRACE_O_TRACEFORK|PTRACE_O_TRACEVFORKDONE) = 0 <0.000054> > 18:46:35.341957 ptrace(PTRACE_CONT, 1410, NULL, 0) = 0 <0.000056> > 18:46:35.345568 ptrace(PTRACE_GETEVENTMSG, 1410, NULL, [1411]) = 0 <0.000081> > 18:46:35.350541 ptrace(PTRACE_SETOPTIONS, 1410, NULL, PTRACE_O_EXITKILL) = 0 <0.000019> > 18:46:35.354010 ptrace(PTRACE_SETOPTIONS, 1397, NULL, PTRACE_O_TRACESYSGOOD|PTRACE_O_TRACEFORK|PTRACE_O_TRACEVFORK|PTRACE_O_TRACECLONE|PTRACE_O_TRACEEXEC|PTRACE_O_TRACEVFORKDONE) = 0 <0.000019> > 18:46:35.415730 ptrace(PTRACE_SETOPTIONS, 1396, NULL, PTRACE_O_TRACESYSGOOD|PTRACE_O_TRACEFORK|PTRACE_O_TRACEVFORK|PTRACE_O_TRACECLONE|PTRACE_O_TRACEEXEC|PTRACE_O_TRACEVFORKDONE) = 0 <0.000076> > 18:46:35.421076 ptrace(PTRACE_GETREGS, 1412, NULL, 0x7ffed6174980) = 0 <0.000088> > 18:46:35.429498 ptrace(PTRACE_PEEKUSER, 1397, 8*CS, [NULL]) = 0 <0.000022> > 18:46:35.429632 ptrace(PTRACE_PEEKUSER, 1397, 8*SS + 24, [NULL]) = 0 <0.000019> > 18:46:35.429732 ptrace(PTRACE_GETREGSET, 1397, NT_X86_XSTATE, [{iov_base=0x7ffed6174780, iov_len=576}]) = 0 <0.000030> > 18:46:35.435507 ptrace(PTRACE_GETREGS, 1397, NULL, 0x7ffed6173cb0) = 0 <0.000019> > 18:46:35.445877 ptrace(PTRACE_PEEKTEXT, 1397, 0x56357e99de00, [0x7f49d572b160]) = 0 <0.000057> > 18:46:35.446043 ptrace(PTRACE_PEEKTEXT, 1397, 0x7f49d572b168, [0x7f49d572b190]) = 0 <0.000049> > 18:46:35.447192 ptrace(PTRACE_PEEKTEXT, 1397, 0x7f49d572bbf0, [0x64762d78756e696c]) = 0 <0.000060> > 18:46:35.447368 ptrace(PTRACE_PEEKTEXT, 1397, 0x7f49d572bbf0, [0x64762d78756e696c]) = 0 <0.000075> > 18:46:35.447571 ptrace(PTRACE_PEEKTEXT, 1397, 0x7f49d572bbf8, [0x312e6f732e6f73]) = 0 <0.000070> > 18:46:35.447762 ptrace(PTRACE_PEEKTEXT, 1397, 0x7f49d572bbf8, [0x312e6f732e6f73]) = 0 <0.000067> > 18:46:35.448658 ptrace(PTRACE_PEEKTEXT, 1397, 0x7f49d572be10, [0x3638782f62696c2f]) = 0 <0.000076> > 18:46:35.448917 ptrace(PTRACE_PEEKTEXT, 1397, 0x7f49d572be10, [0x3638782f62696c2f]) = 0 <0.000050> > 18:46:35.449051 ptrace(PTRACE_PEEKTEXT, 1397, 0x7f49d572be18, [0x756e696c2d34365f]) = 0 <0.000045> > 18:46:35.449173 ptrace(PTRACE_PEEKTEXT, 1397, 0x7f49d572be18, [0x756e696c2d34365f]) = 0 <0.000043> > 18:46:35.449292 ptrace(PTRACE_PEEKTEXT, 1397, 0x7f49d572be20, [0x696c2f756e672d78]) = 0 <0.000042> > 18:46:35.449414 ptrace(PTRACE_PEEKTEXT, 1397, 0x7f49d572be20, [0x696c2f756e672d78]) = 0 <0.000048> ptrace(PTRACE_GETREGS, 1396, ... looks expected to me, but starting with ptrace(PTRACE_PEEKUSER, 1397, 8*CS, [NULL]) (which triggers the actual problem) it's unexpected to me why 1397 is used instead of 1396. 1397 is the iou-mgr-1396 iothread. I hope that helps! metze ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads 2021-05-05 11:29 ` Stefan Metzmacher @ 2021-05-05 21:59 ` Simon Marchi 2021-05-05 22:11 ` Andy Lutomirski 2021-05-05 22:21 ` Stefan Metzmacher 0 siblings, 2 replies; 14+ messages in thread From: Simon Marchi @ 2021-05-05 21:59 UTC (permalink / raw) To: Stefan Metzmacher, Borislav Petkov, Peter Zijlstra Cc: Linus Torvalds, Andy Lutomirski, Thomas Gleixner, Jens Axboe, Linux Kernel Mailing List, io-uring, the arch/x86 maintainers, linux-toolchains On 2021-05-05 7:29 a.m., Stefan Metzmacher wrote: > See https://lore.kernel.org/io-uring/0375b37f-2e1e-7999-53b8-c567422aa181@samba.org/ > and https://lore.kernel.org/io-uring/20210411152705.2448053-1-metze@samba.org/T/#m461f280e8c3d32a49bc7da7bb5e214e90d97cf65 > > The question is why does inferior_ptid doesn't represent the thread > that was specified by 'gdb --pid PIDVAL' Hi Stefan, When you attach to PIDVAL (assuming that PIDVAL is a thread-group leader), GDB attaches to all the threads of that thread group. The inferior_ptid global variable is "the thread we are currently working with", and changes whenever GDB wants to deal with a different thread. After attaching to all threads, GDB wants to know more about that process' architecture (that read_description call mentioned in [1]). The way this is implemented varies from arch to arch, but as you found out, for x86-64 it consists of peeking into a thread's CS/DS to determine whether the process is x86-64, x32 or i386. The thread used for this - the inferior_ptid value - just happens to be the latest thread we switched inferior_ptid to (presumably because we iterated on the thread list to do something and that was the last thread in the list). And up to now, this was working under the assumption that picking any thread of the process would yield the same values for that purpose. I don't think it was intentionally done this way, but it worked and didn't cause any trouble until now. With io threads, that assumption doesn't hold anymore. From what I understand, your v1 patch made it so that the kernel puts the CS/DS values GDB expects in the io threads (even though they are never actually used otherwise). I don't understand how your v2 patch addresses the problem though. I don't think it would be a problem on the GDB-side to make sure to fetch these values from a "standard" thread. Most likely the thread group leader, like Tom has proposed in [1]. Thanks, Simon [1] https://sourceware.org/bugzilla/show_bug.cgi?id=27822#c0 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads 2021-05-05 21:59 ` Simon Marchi @ 2021-05-05 22:11 ` Andy Lutomirski 2021-05-05 23:12 ` Borislav Petkov ` (2 more replies) 2021-05-05 22:21 ` Stefan Metzmacher 1 sibling, 3 replies; 14+ messages in thread From: Andy Lutomirski @ 2021-05-05 22:11 UTC (permalink / raw) To: Simon Marchi Cc: Stefan Metzmacher, Borislav Petkov, Peter Zijlstra, Linus Torvalds, Thomas Gleixner, Jens Axboe, Linux Kernel Mailing List, io-uring, the arch/x86 maintainers, linux-toolchains I'm not holding my breath, but: On Wed, May 5, 2021 at 2:59 PM Simon Marchi <simon.marchi@polymtl.ca> wrote: > > On 2021-05-05 7:29 a.m., Stefan Metzmacher wrote: > > See https://lore.kernel.org/io-uring/0375b37f-2e1e-7999-53b8-c567422aa181@samba.org/ > > and https://lore.kernel.org/io-uring/20210411152705.2448053-1-metze@samba.org/T/#m461f280e8c3d32a49bc7da7bb5e214e90d97cf65 > > > > The question is why does inferior_ptid doesn't represent the thread > > that was specified by 'gdb --pid PIDVAL' > > Hi Stefan, > > When you attach to PIDVAL (assuming that PIDVAL is a thread-group > leader), GDB attaches to all the threads of that thread group. The > inferior_ptid global variable is "the thread we are currently working > with", and changes whenever GDB wants to deal with a different thread. > > After attaching to all threads, GDB wants to know more about that > process' architecture (that read_description call mentioned in [1]). ^^^^^^ For what it's worth, this is already fundamentally incorrect. On x86_64 Linux, a process *does* *not* *have* an architecture. Every task on an x86_64 Linux host has a full 64-bit register state. The task can, and sometimes does, change CS using far transfers or other bizarre techniques, and neither the kernel nor GDB will be notified or have a chance to take any action in response. ELF files can be 32-bit, CS:rIP can point at 32-bit code, and system calls can be 32-bit (even from 64-bit code), but *tasks* are not 32-bit. Now I realize that the ptrace() API is awful and makes life difficult in several respects for no good reason but, if gdb is ever interested in fixing its ideas about architecture to understand that all tasks, even those that think of themselves as "compat", have full 64-bit state, I would be more than willing to improve the ptrace() API as needed to make this work well. Since I'm not holding my breath, please at least keep in mind that anything you do here is merely a heuristic, cannot be fully correct, and then whenever gdb determines that a thread group or a thread is "32-bit", gdb is actually deciding to operate in a degraded mode for that task, is not accurately representing the task state, and is at risk of crashing, malfunctioning, or crashing the inferior due to its incorrect assumptions. If you have ever attached gdb to QEMU's gdbserver and tried to debug the early boot process of a 64-bit Linux kernel, you may have encountered this class of bugs. gdb works very, very poorly for this use case. (To avoid confusion, this is not a universal property of Linux. arm64 and arm32 tasks on an arm64 Linux host are different and cannot arbitrarily switch modes.) --Andy ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads 2021-05-05 22:11 ` Andy Lutomirski @ 2021-05-05 23:12 ` Borislav Petkov 2021-05-05 23:22 ` Andy Lutomirski 2021-05-06 1:04 ` Simon Marchi 2021-05-06 9:47 ` David Laight 2 siblings, 1 reply; 14+ messages in thread From: Borislav Petkov @ 2021-05-05 23:12 UTC (permalink / raw) To: Andy Lutomirski Cc: Simon Marchi, Stefan Metzmacher, Peter Zijlstra, Linus Torvalds, Thomas Gleixner, Jens Axboe, Linux Kernel Mailing List, io-uring, the arch/x86 maintainers, linux-toolchains On Wed, May 05, 2021 at 03:11:18PM -0700, Andy Lutomirski wrote: > Since I'm not holding my breath, please at least keep in mind that > anything you do here is merely a heuristic, cannot be fully correct, > and then whenever gdb determines that a thread group or a thread is > "32-bit", gdb is actually deciding to operate in a degraded mode for > that task, is not accurately representing the task state, and is at > risk of crashing, malfunctioning, or crashing the inferior due to its > incorrect assumptions. If you have ever attached gdb to QEMU's > gdbserver and tried to debug the early boot process of a 64-bit Linux > kernel, you may have encountered this class of bugs. gdb works very, > very poorly for this use case. So we were talking about this with toolchain folks today and they gave me this example: Imagine you've stopped the target this way: <insn><-- stopped here <insn> <mode changing insn> <insn> <insn> ... now, if you dump rIP and say, rIP + the 10 following insns at the place you've stopped it, gdb cannot know that 2 insns further into the stream a <mode changing insn> is coming and it should change the disassembly of the insns after that <mode changing insn> to the new mode. Unless it goes and inspects all further instructions and disassembles them and analyzes the flow... So what you can do is (gdb) set arch ... at the <mode changing insn> to the mode you're changing to. Dunno, maybe I'm missing something but this sounds like without user help gdb can only assume things. Good night and good luck. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads 2021-05-05 23:12 ` Borislav Petkov @ 2021-05-05 23:22 ` Andy Lutomirski 0 siblings, 0 replies; 14+ messages in thread From: Andy Lutomirski @ 2021-05-05 23:22 UTC (permalink / raw) To: Borislav Petkov Cc: Andy Lutomirski, Simon Marchi, Stefan Metzmacher, Peter Zijlstra, Linus Torvalds, Thomas Gleixner, Jens Axboe, Linux Kernel Mailing List, io-uring, the arch/x86 maintainers, linux-toolchains On Wed, May 5, 2021 at 4:12 PM Borislav Petkov <bp@alien8.de> wrote: > > On Wed, May 05, 2021 at 03:11:18PM -0700, Andy Lutomirski wrote: > > Since I'm not holding my breath, please at least keep in mind that > > anything you do here is merely a heuristic, cannot be fully correct, > > and then whenever gdb determines that a thread group or a thread is > > "32-bit", gdb is actually deciding to operate in a degraded mode for > > that task, is not accurately representing the task state, and is at > > risk of crashing, malfunctioning, or crashing the inferior due to its > > incorrect assumptions. If you have ever attached gdb to QEMU's > > gdbserver and tried to debug the early boot process of a 64-bit Linux > > kernel, you may have encountered this class of bugs. gdb works very, > > very poorly for this use case. > > So we were talking about this with toolchain folks today and they gave > me this example: > > Imagine you've stopped the target this way: > > <insn><-- stopped here > <insn> > <mode changing insn> > <insn> > <insn> > ... > > now, if you dump rIP and say, rIP + the 10 following insns at the place > you've stopped it, gdb cannot know that 2 insns further into the stream > a > > <mode changing insn> > > is coming and it should change the disassembly of the insns after that > <mode changing insn> to the new mode. Unless it goes and inspects all > further instructions and disassembles them and analyzes the flow... That's fine. x86 machine code is like this. You also can't disassemble instructions before rIP accurately either. > > So what you can do is > > (gdb) set arch ... > > at the <mode changing insn> to the mode you're changing to. > > Dunno, maybe I'm missing something but this sounds like without user > help gdb can only assume things. In the tools/testing/x86/selftests directory, edited slightly for brevity: $ gdb ./test_syscall_vdso_32 (gdb) b call64_from_32 Breakpoint 1 at 0x80499ef: file thunks_32.S, line 19. (gdb) display/i $pc 1: x/i $pc <error: No registers.> (gdb) r Starting program: /home/luto/apps/linux/tools/testing/selftests/x86/test_syscall_vdso_32 ... [RUN] Executing 6-argument 32-bit syscall via VDSO Breakpoint 1, call64_from_32 () at thunks_32.S:19 19 mov 4(%esp), %eax 1: x/i $pc => 0x80499ef <call64_from_32>: mov 0x4(%esp),%eax (gdb) si 22 push %ecx 1: x/i $pc => 0x80499f3 <call64_from_32+4>: push %ecx (gdb) call64_from_32 () at thunks_32.S:23 23 push %edx 1: x/i $pc => 0x80499f4 <call64_from_32+5>: push %edx (gdb) call64_from_32 () at thunks_32.S:24 24 push %esi 1: x/i $pc => 0x80499f5 <call64_from_32+6>: push %esi (gdb) call64_from_32 () at thunks_32.S:25 25 push %edi 1: x/i $pc => 0x80499f6 <call64_from_32+7>: push %edi (gdb) call64_from_32 () at thunks_32.S:28 28 jmp $0x33,$1f 1: x/i $pc => 0x80499f7 <call64_from_32+8>: ljmp $0x33,$0x80499fe (gdb) info registers eax 0x80492e8 134517480 ecx 0x3f 63 edx 0x1 1 ebx 0xf7fc8550 -134445744 esp 0xffffc57c 0xffffc57c ebp 0xffffc5e8 0xffffc5e8 esi 0x0 0 edi 0x8049180 134517120 eip 0x80499f7 0x80499f7 <call64_from_32+8> eflags 0x292 [ AF SF IF ] cs 0x23 35 ss 0x2b 43 ds 0x2b 43 es 0x2b 43 fs 0x0 0 gs 0x63 99 (gdb) si 32 call *%rax 1: x/i $pc => 0x80499fe <call64_from_32+15>: call *%eax (gdb) info registers eax 0x80492e8 134517480 ^^^ Should be rax ecx 0x3f 63 edx 0x1 1 ebx 0xf7fc8550 -134445744 esp 0xffffc57c 0xffffc57c ebp 0xffffc5e8 0xffffc5e8 esi 0x0 0 edi 0x8049180 134517120 eip 0x80499fe 0x80499fe <call64_from_32+15> ^^^ r8, etc are all missing eflags 0x292 [ AF SF IF ] cs 0x33 51 ^^^ 64-bit! ss 0x2b 43 ds 0x2b 43 es 0x2b 43 fs 0x0 0 gs 0x63 99 (gdb) si poison_regs64 () at test_syscall_vdso.c:35 35 long syscall_addr; 1: x/i $pc => 0x80492e8 <poison_regs64>: dec %ecx (gdb) si 36 long get_syscall(char **envp) 1: x/i $pc => 0x80492ef <poison_regs64+7>: dec %ecx (gdb) set arch i386:x86-64 warning: Selected architecture i386:x86-64 is not compatible with reported target architecture i386 Architecture `i386:x86-64' not recognized. The target architecture is set to "auto" (currently "i386"). (gdb) set arch i386:x86-64:intel warning: Selected architecture i386:x86-64:intel is not compatible with reported target architecture i386 Architecture `i386:x86-64:intel' not recognized. The target architecture is set to "auto" (currently "i386"). I don't know enough about gdb internals to know precisely what failed here, but this did not work the way it should have. Sure, ptrace should provide a nice API to figure out that CS == 0x33 means long mode, but gdb could do a whole lot better here. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads 2021-05-05 22:11 ` Andy Lutomirski 2021-05-05 23:12 ` Borislav Petkov @ 2021-05-06 1:04 ` Simon Marchi 2021-05-06 15:11 ` Andy Lutomirski 2021-05-06 9:47 ` David Laight 2 siblings, 1 reply; 14+ messages in thread From: Simon Marchi @ 2021-05-06 1:04 UTC (permalink / raw) To: Andy Lutomirski Cc: Stefan Metzmacher, Borislav Petkov, Peter Zijlstra, Linus Torvalds, Thomas Gleixner, Jens Axboe, Linux Kernel Mailing List, io-uring, the arch/x86 maintainers, linux-toolchains On 2021-05-05 6:11 p.m., Andy Lutomirski wrote: > For what it's worth, this is already fundamentally incorrect. On > x86_64 Linux, a process *does* *not* *have* an architecture. Every > task on an x86_64 Linux host has a full 64-bit register state. The > task can, and sometimes does, change CS using far transfers or other > bizarre techniques, and neither the kernel nor GDB will be notified or > have a chance to take any action in response. ELF files can be > 32-bit, CS:rIP can point at 32-bit code, and system calls can be > 32-bit (even from 64-bit code), but *tasks* are not 32-bit. Thanks for that explanation: I didn't know that "32-bit" tasks had the same register state as any other task. I never really looked into it, but I was assuming that tasks were either 32-bit or 64-bit (based on the ELF header of the file they exec'd or something) and that 32-bit tasks had the register state as a task on an i386 machine would have. And that PEEKUSER would return the 64-bit register state for 64-bit tasks, and 32-bit register state for 32-bit tasks. I looked at how GDB reads registers from a "64-bit" task and a "32-bit" task (I have to quote now, since I now know it's an abuse of terminology) side by side. And indeed, GDB reads a full 64-bit state in both cases. For the 32-bit case, it picks the 32-bit values from that buffer. For example, to get the eax value it picks the low 4 bytes of rax (well, ax in user_regs_struct). So I suppose that if GDB wanted to tell nothing but the truth, it would present the full 64-bit register state to the user even when debugging a 32-bit program. But at the end of the day, the typical user debugging a 32-bit program on a 64-bit probably just wants the illusion that they are on i386. > Now I realize that the ptrace() API is awful and makes life difficult > in several respects for no good reason but, if gdb is ever interested > in fixing its ideas about architecture to understand that all tasks, > even those that think of themselves as "compat", have full 64-bit > state, I would be more than willing to improve the ptrace() API as > needed to make this work well. Just wondering, do you have specific ptrace shortcomings in mind when saying this? As I found above, ptrace lets us read the whole 64-bit register state. After that it's up to us to analyze the state of the program based on its registers and memory. What more could ptrace give us? > Since I'm not holding my breath, please at least keep in mind that > anything you do here is merely a heuristic, cannot be fully correct, > and then whenever gdb determines that a thread group or a thread is > "32-bit", gdb is actually deciding to operate in a degraded mode for > that task, is not accurately representing the task state, and is at > risk of crashing, malfunctioning, or crashing the inferior due to its > incorrect assumptions. If you have ever attached gdb to QEMU's > gdbserver and tried to debug the early boot process of a 64-bit Linux > kernel, you may have encountered this class of bugs. gdb works very, > very poorly for this use case. Yes, that QEMU case comes up often. I wish that things were better, but the reality is that this is an edge case, it would require somebody with that particular itch to scratch to work on GDB to improve that use case. So as you said, don't hold your breath :). I completely understand that GDB putting processes in the "32-bit" or "64-bit" bin is not the right thing to do in general, from a kernel perspective. But it converged to this because it's enough for and useful to the 99.9% of users who debug programs that don't do funky things. At least, it's good to know about it in case problems related to this arise in the future. Simon ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads 2021-05-06 1:04 ` Simon Marchi @ 2021-05-06 15:11 ` Andy Lutomirski 0 siblings, 0 replies; 14+ messages in thread From: Andy Lutomirski @ 2021-05-06 15:11 UTC (permalink / raw) To: Simon Marchi Cc: Andy Lutomirski, Stefan Metzmacher, Borislav Petkov, Peter Zijlstra, Linus Torvalds, Thomas Gleixner, Jens Axboe, Linux Kernel Mailing List, io-uring, the arch/x86 maintainers, linux-toolchains On Wed, May 5, 2021 at 6:04 PM Simon Marchi <simon.marchi@polymtl.ca> wrote: > > On 2021-05-05 6:11 p.m., Andy Lutomirski wrote: > I looked at how GDB reads registers from a "64-bit" task and a "32-bit" > task (I have to quote now, since I now know it's an abuse of > terminology) side by side. And indeed, GDB reads a full 64-bit state in > both cases. For the 32-bit case, it picks the 32-bit values from that > buffer. For example, to get the eax value it picks the low 4 bytes of > rax (well, ax in user_regs_struct). > > So I suppose that if GDB wanted to tell nothing but the truth, it would > present the full 64-bit register state to the user even when debugging a > 32-bit program. But at the end of the day, the typical user debugging a > 32-bit program on a 64-bit probably just wants the illusion that they > are on i386. True. I see no reason, especially by default, to show the extra registers. On the other hand, if the program switches modes, having gdb notice would be nice. And, if gdb handled this correctly, all this io_uring stuff would be entirely moot. The made-up register state of the io_uring thread would have no bearing on the debugging of other threads. > > > Now I realize that the ptrace() API is awful and makes life difficult > > in several respects for no good reason but, if gdb is ever interested > > in fixing its ideas about architecture to understand that all tasks, > > even those that think of themselves as "compat", have full 64-bit > > state, I would be more than willing to improve the ptrace() API as > > needed to make this work well. > > Just wondering, do you have specific ptrace shortcomings in mind when > saying this? As I found above, ptrace lets us read the whole 64-bit > register state. After that it's up to us to analyze the state of the > program based on its registers and memory. What more could ptrace give > us? Two specific issues come to mind: 1. PTRACE_GETREGSET and PTRACE_SETREGSET are terminally broken. See the comment above task_user_regset_view() in arch/x86/kernel/ptrace.c. We need a new version of those APIs that takes an e_machine parameter. (I don't even see how you can call these APIs safely at all, short of allocating a buffer with a guard page or intentionally over-allocating and calculating the maximum possible size of buffer that could be used in case of a screwup.) 2. There should be an API to either read the descriptor table or to look up a specific descriptor. How else are you supposed to know whether CS.L is set? (Keep in mind that 0x33 is not necessarily the only long mode segment that gets used. Linux on Xen PV has an extra one.) --Andy ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads 2021-05-05 22:11 ` Andy Lutomirski 2021-05-05 23:12 ` Borislav Petkov 2021-05-06 1:04 ` Simon Marchi @ 2021-05-06 9:47 ` David Laight 2021-05-06 9:53 ` David Laight 2 siblings, 1 reply; 14+ messages in thread From: David Laight @ 2021-05-06 9:47 UTC (permalink / raw) To: 'Andy Lutomirski', Simon Marchi Cc: Stefan Metzmacher, Borislav Petkov, Peter Zijlstra, Linus Torvalds, Thomas Gleixner, Jens Axboe, Linux Kernel Mailing List, io-uring, the arch/x86 maintainers, linux-toolchains > (To avoid confusion, this is not a universal property of Linux. arm64 > and arm32 tasks on an arm64 Linux host are different and cannot > arbitrarily switch modes.) Although there are patches lurking to change that. (not from me). David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads 2021-05-06 9:47 ` David Laight @ 2021-05-06 9:53 ` David Laight 0 siblings, 0 replies; 14+ messages in thread From: David Laight @ 2021-05-06 9:53 UTC (permalink / raw) To: 'Andy Lutomirski', 'Simon Marchi' Cc: 'Stefan Metzmacher', 'Borislav Petkov', 'Peter Zijlstra', 'Linus Torvalds', 'Thomas Gleixner', 'Jens Axboe', 'Linux Kernel Mailing List', 'io-uring', 'the arch/x86 maintainers', 'linux-toolchains@vger.kernel.org' > > (To avoid confusion, this is not a universal property of Linux. arm64 > > and arm32 tasks on an arm64 Linux host are different and cannot > > arbitrarily switch modes.) > > Although there are patches lurking to change that. > (not from me). Actually they may be just to allow 64bit tasks make 32bit system calls. The code is almost certainly still 54bit. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads 2021-05-05 21:59 ` Simon Marchi 2021-05-05 22:11 ` Andy Lutomirski @ 2021-05-05 22:21 ` Stefan Metzmacher 2021-05-05 23:15 ` Simon Marchi 1 sibling, 1 reply; 14+ messages in thread From: Stefan Metzmacher @ 2021-05-05 22:21 UTC (permalink / raw) To: Simon Marchi, Borislav Petkov, Peter Zijlstra Cc: Linus Torvalds, Andy Lutomirski, Thomas Gleixner, Jens Axboe, Linux Kernel Mailing List, io-uring, the arch/x86 maintainers, linux-toolchains Hi Simon, > When you attach to PIDVAL (assuming that PIDVAL is a thread-group > leader), GDB attaches to all the threads of that thread group. The > inferior_ptid global variable is "the thread we are currently working > with", and changes whenever GDB wants to deal with a different thread. > > After attaching to all threads, GDB wants to know more about that > process' architecture (that read_description call mentioned in [1]). > The way this is implemented varies from arch to arch, but as you found > out, for x86-64 it consists of peeking into a thread's CS/DS to > determine whether the process is x86-64, x32 or i386. The thread used > for this - the inferior_ptid value - just happens to be the latest > thread we switched inferior_ptid to (presumably because we iterated on > the thread list to do something and that was the last thread in the > list). And up to now, this was working under the assumption that > picking any thread of the process would yield the same values for that > purpose. I don't think it was intentionally done this way, but it > worked and didn't cause any trouble until now. > > With io threads, that assumption doesn't hold anymore. Yes, in 5.12 > From what I understand, your v1 patch made it so that the kernel puts the CS/DS > values GDB expects in the io threads (even though they are never > actually used otherwise). I don't understand how your v2 patch > addresses the problem though. v1 did clear everything and tried to keep some selected registers: 'memset(childregs, 0, sizeof(struct pt_regs));' childregs->cs = currenttrgs->cs; childregs->ss = currenttrgs->ss; childregs->ds = currenttrgs->ds; childregs->es = currenttrgs->es; v2 copies everything and only clears 3 selected registers (I added the last two for the PF_IO_WORKER case: *childregs = *current_pt_regs(); childregs->ax = 0; ... childregs->ip = 0; childregs->sp = 0; So regarding childregs->cs and childregs->ds the effect is the same. (Also note that on x86_64 ds in handled by savesegment(ds, p->thread.ds); already instead so the effective problem as only childregs->cs which is cleared in 5.12, but will be kept with both of my patches. > I don't think it would be a problem on the GDB-side to make sure to > fetch these values from a "standard" thread. Most likely the thread > group leader, like Tom has proposed in [1]. Ok. Is it clear now? metze ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads 2021-05-05 22:21 ` Stefan Metzmacher @ 2021-05-05 23:15 ` Simon Marchi 0 siblings, 0 replies; 14+ messages in thread From: Simon Marchi @ 2021-05-05 23:15 UTC (permalink / raw) To: Stefan Metzmacher, Borislav Petkov, Peter Zijlstra Cc: Linus Torvalds, Andy Lutomirski, Thomas Gleixner, Jens Axboe, Linux Kernel Mailing List, io-uring, the arch/x86 maintainers, linux-toolchains > Is it clear now? > metze Yes, thanks. I had missed that the point of the 2nd patch is to not do the memset for these threads. Makes sense now. Simon ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2021-05-06 15:11 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <8735v3ex3h.ffs@nanos.tec.linutronix.de> [not found] ` <3C41339D-29A2-4AB1-958F-19DB0A92D8D7@amacapital.net> [not found] ` <CAHk-=wh0KoEZXPYMGkfkeVEerSCEF1AiCZSvz9TRrx=Kj74D+Q@mail.gmail.com> 2021-05-04 8:39 ` [PATCH] io_thread/x86: don't reset 'cs', 'ss', 'ds' and 'es' registers for io_threads Peter Zijlstra 2021-05-04 15:35 ` Borislav Petkov 2021-05-04 15:55 ` Simon Marchi 2021-05-05 11:29 ` Stefan Metzmacher 2021-05-05 21:59 ` Simon Marchi 2021-05-05 22:11 ` Andy Lutomirski 2021-05-05 23:12 ` Borislav Petkov 2021-05-05 23:22 ` Andy Lutomirski 2021-05-06 1:04 ` Simon Marchi 2021-05-06 15:11 ` Andy Lutomirski 2021-05-06 9:47 ` David Laight 2021-05-06 9:53 ` David Laight 2021-05-05 22:21 ` Stefan Metzmacher 2021-05-05 23:15 ` Simon Marchi
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).