On Thu, Jan 13, 2022 at 1:17 PM Peter Maydell wrote: > On Sun, 9 Jan 2022 at 16:40, Warner Losh wrote: > > > > Implement host_signal_handler to handle signals generated by the host > > and to do safe system calls. > > > > Signed-off-by: Stacey Son > > Signed-off-by: Kyle Evans > > Signed-off-by: Warner Losh > > --- > > bsd-user/signal.c | 105 ++++++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 105 insertions(+) > > > > diff --git a/bsd-user/signal.c b/bsd-user/signal.c > > index b1331f63d61..a6e07277fb2 100644 > > --- a/bsd-user/signal.c > > +++ b/bsd-user/signal.c > > @@ -142,6 +142,111 @@ void force_sig_fault(int sig, int code, abi_ulong > addr) > > > > static void host_signal_handler(int host_sig, siginfo_t *info, void > *puc) > > { > > + CPUState *cpu = thread_cpu; > > + CPUArchState *env = cpu->env_ptr; > > + int sig; > > + target_siginfo_t tinfo; > > + ucontext_t *uc = puc; > > + uintptr_t pc = 0; > > + bool sync_sig = false; > > + > > + /* > > + * Non-spoofed SIGSEGV and SIGBUS are synchronous, and need special > > + * handling wrt signal blocking and unwinding. > > + */ > > + if ((host_sig == SIGSEGV || host_sig == SIGBUS) && info->si_code > > 0) { > > + MMUAccessType access_type; > > + uintptr_t host_addr; > > + abi_ptr guest_addr; > > + bool is_write; > > + > > + host_addr = (uintptr_t)info->si_addr; > > + > > + /* > > + * Convert forcefully to guest address space: addresses outside > > + * reserved_va are still valid to report via SEGV_MAPERR. > > + */ > > + guest_addr = h2g_nocheck(host_addr); > > + > > + pc = host_signal_pc(uc); > > + is_write = host_signal_write(info, uc); > > + access_type = adjust_signal_pc(&pc, is_write); > > + > > + if (host_sig == SIGSEGV) { > > + bool maperr = true; > > + > > + if (info->si_code == SEGV_ACCERR && h2g_valid(host_addr)) { > > + /* If this was a write to a TB protected page, restart. > */ > > + if (is_write && > > + handle_sigsegv_accerr_write(cpu, &uc->uc_sigmask, > > + pc, guest_addr)) { > > + return; > > + } > > + > > + /* > > + * With reserved_va, the whole address space is > PROT_NONE, > > + * which means that we may get ACCERR when we want > MAPERR. > > + */ > > + if (page_get_flags(guest_addr) & PAGE_VALID) { > > + maperr = false; > > + } else { > > + info->si_code = SEGV_MAPERR; > > + } > > + } > > + > > + sigprocmask(SIG_SETMASK, &uc->uc_sigmask, NULL); > > + cpu_loop_exit_sigsegv(cpu, guest_addr, access_type, maperr, > pc); > > + } else { > > + sigprocmask(SIG_SETMASK, &uc->uc_sigmask, NULL); > > + if (info->si_code == BUS_ADRALN) { > > + cpu_loop_exit_sigbus(cpu, guest_addr, access_type, pc); > > + } > > + } > > + > > + sync_sig = true; > > + } > > + > > + /* Get the target signal number. */ > > + sig = host_to_target_signal(host_sig); > > + if (sig < 1 || sig > TARGET_NSIG) { > > + return; > > + } > > + trace_user_host_signal(cpu, host_sig, sig); > > + > > + host_to_target_siginfo_noswap(&tinfo, info); > > + > > + queue_signal(env, sig, &tinfo); /* XXX how to cope with > failure? */ > > queue_signal() can't fail, so there is nothing to cope with. > (Your bsd-user version even has the right 'void' type -- > linux-user's returns 1 always and we never look at the return > value, so we should really switch that to void return too.) > Submitted a patch to that last bit. I'll remove this comment in the rework I'm doing. > > + /* > > + * Linux does something else here -> the queue signal may be wrong, > but > > + * maybe not. And then it does the rewind_if_in_safe_syscall > > + */ > > I think you have here a bit of a mix of linux-user's current design > and some older (broken) version. This is how linux-user works today: > Yes. I think we forked bsd-user from linux-user around 2015 and the old design dropped signal queueing in 2016 if I'm reading git log correctly. > * queue_signal() is a little bit misnamed, because there is no > "queue" here: there can only be at most one "queued" signal, > and it lives in the TaskState struct (which is user-only specific > information that hangs off the guest CPU struct) as the > TaskState::sync_signal field. The reason > we only have one at once is that queue_signal() is used only > for signals generated by QEMU itself by calling queue_signal() > directly or indirectly from the cpu_loop() code. The cpu loop > always calls process_pending_signals() at the end of its loop, > which will pick up a queued signal. We never call queue_signal() > twice in a row before getting back to process_pending_signals(), > so there's only ever at most one thing in the "queue". > * for all signals we get from the host except SIGSEGV/SIGBUS, > we track whether there's a host signal pending in the > TaskState::sigtab[] array (which is indexed by signal number). > We block all host signals except SIGSEGV/SIGBUS before calling > cpu_exit(), so we know we're not going to get more than one > of these at once (and it won't clash with a queue_signal() > signal either, as those use the sync_signal field, not the > sigtab[]). > * for host-sent non-spoofed (ie not sent via 'kill()') SIGSEGV/SIGBUS, > we know this was caused by a bit of generated code, so we just > use cpu_loop_exit_restore() to turn this into an EXCP_INTERRUPT > at the right guest PC > > I feel fairly strongly that you definitely want to use the same > design as current linux-user does for signals: > * getting this right is pretty tricky, and even if we get two > different designs to both have the same semantics it's going > to be pretty confusing > * we thought quite hard about the linux-user code at the time > and it's definitely less buggy than the previous design > * It's much easier to review the bsd-user code as "yes this is > doing the same thing linux-user does" than working through > a different approach from first principles > I agree. I like that design better than what's in bsd-user today and seems simpler to implement and get right. I'd add a 4th bullet point "It's easier to share code, either directly or via copying" > I don't have as strong an opinion on whether we should try to get > it into the tree that way from the start, or to put in whatever > you have currently and then fix it later. (More accurately, > I would prefer to review patches which use the same design > as linux-user but if that's going to be massively painful/slow > for you to get something upstream doing it that way around > I can probably live with the other approach...) > I'm going to start down that path and see if I can rework the patches and see if it solves some of the regressions we've seen in bsd-user between 6.1 and 6.2 as the signal stuff was reworked in linux-user to cope better with sigsegv and sigbus. If that works out OK, then I'll move forward with adjusting the fork and reflect that back into this patch series. > > + /* > > + * For synchronous signals, unwind the cpu state to the faulting > > + * insn and then exit back to the main loop so that the signal > > + * is delivered immediately. > > + XXXX Should this be in queue_signal? > > No, because queue_signal() is called for lots of ways to pend > a signal, most of which aren't real host signals. > OK. I now agree after reading the code. > > + */ > > + if (sync_sig) { > > + cpu->exception_index = EXCP_INTERRUPT; > > + cpu_loop_exit_restore(cpu, pc); > > + } > > + > > + rewind_if_in_safe_syscall(puc); > > + > > + /* > > + * Block host signals until target signal handler entered. We > > + * can't block SIGSEGV or SIGBUS while we're executing guest > > + * code in case the guest code provokes one in the window between > > + * now and it getting out to the main loop. Signals will be > > + * unblocked again in process_pending_signals(). > > + */ > > + sigfillset(&uc->uc_sigmask); > > + sigdelset(&uc->uc_sigmask, SIGSEGV); > > + sigdelset(&uc->uc_sigmask, SIGBUS); > > + > > + /* Interrupt the virtual CPU as soon as possible. */ > > + cpu_exit(thread_cpu); > > } > > > > void signal_init(void) > > thanks > Thank you for the insight and places to focus on making this code better. It's a bit hard to discover all this just from reading code in any sane amount of time and it is both quite helpful and much appreciated. Warner