From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 42FEDC433F5 for ; Sun, 16 Jan 2022 20:54:44 +0000 (UTC) Received: from localhost ([::1]:57888 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1n9CXy-0001Wd-UE for qemu-devel@archiver.kernel.org; Sun, 16 Jan 2022 15:54:42 -0500 Received: from eggs.gnu.org ([209.51.188.92]:38948) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1n9CWD-0000db-Mm for qemu-devel@nongnu.org; Sun, 16 Jan 2022 15:52:53 -0500 Received: from [2607:f8b0:4864:20::92f] (port=34755 helo=mail-ua1-x92f.google.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1n9CWB-0005II-83 for qemu-devel@nongnu.org; Sun, 16 Jan 2022 15:52:53 -0500 Received: by mail-ua1-x92f.google.com with SMTP id y4so26995845uad.1 for ; Sun, 16 Jan 2022 12:52:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20210112.gappssmtp.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=yuXamKcNia1YCbGnNg8jdyBlwoNUhQvQecO7nGhXuL8=; b=BAiPBXvCXvmxN0c9PiEBpTMU2wEwYa9Gq3KhAJWyRfsMPEY/K+XxQ9NzWCHib/E5iN vqoUEq/Ch6+0CMFPVZ1vPTqpo9tgUiC3m9up7qHai90BsVEfsvSE5+uf+Z6YGViEkaLK 4mELGz4e4uD1D7G3ZBGqDOqLTIFg7ZRRmk3SdaE+u+XN3xj580ao2BoJiy8ACtESUvJ6 Oul+EyRWbqzGwQtd1BQH/m5zv+DY7/ByeEFjAsbRuSmnlk/fdkYo5v1dHSHVQI5sKNdD NrUMg6CKl3YHGas7ngEFOlGY4/68K4ffz2Ui/JxTrqr3E0zW3Zq3TU6eG+B4CeCZZU/N /M6w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=yuXamKcNia1YCbGnNg8jdyBlwoNUhQvQecO7nGhXuL8=; b=Ju9tFEp7uDm73CsUDD/9hkijZAY2j5rs92H6ZTURZi1Nw7Gvaa0UoFZPaOGNrorMpp uR/ybbMzd0bTUCa6nEvNjW98UZlNfzta2qp3Wgt6KbuqWcdKRZCiNofsx+6O2i8a4J2v BaIvT3t+2RgukMAjjl+cO7f8JbfGm4aU5Jl4rPcbHphMgb8Iz8vVFvGkAxhOl17MUi4m B4XPMBdzxS2pWrm42FxRcSTCT/IOVhxHNWwBQ2yhBitSdkajIJwPPLehglzW+VNTpryz Hn4EG2fiGu4Si5d5BJYnAKOwqSbp54ZP3tqrGOmiXLWjfCArNb57zxVcv8RYupINnn20 892w== X-Gm-Message-State: AOAM532zmTjXJy5c3paCu2AMpBh96u0pxGbVrwdOfSz9KFoOBjs6Kq7Z bFv0tIAO2y3j6l53aB7Wtqch0w/ZAGHMYu6RXoF5FQ== X-Google-Smtp-Source: ABdhPJxOglSoN02Uxf9czk6vjKNa+31+piymkVP1HBe/QtmRFe5VjrDdY49dVsWEC0QE7aFVMsQepXxArpSXiUav6HQ= X-Received: by 2002:a67:edc5:: with SMTP id e5mr6855818vsp.6.1642366369670; Sun, 16 Jan 2022 12:52:49 -0800 (PST) MIME-Version: 1.0 References: <20220109161923.85683-1-imp@bsdimp.com> <20220109161923.85683-19-imp@bsdimp.com> In-Reply-To: From: Warner Losh Date: Sun, 16 Jan 2022 13:52:38 -0700 Message-ID: Subject: Re: [PATCH 18/30] bsd-user/signal.c: Implement host_signal_handler To: Peter Maydell Content-Type: multipart/alternative; boundary="0000000000000e389b05d5b938c1" X-Host-Lookup-Failed: Reverse DNS lookup failed for 2607:f8b0:4864:20::92f (failed) Received-SPF: none client-ip=2607:f8b0:4864:20::92f; envelope-from=wlosh@bsdimp.com; helo=mail-ua1-x92f.google.com X-Spam_score_int: -10 X-Spam_score: -1.1 X-Spam_bar: - X-Spam_report: (-1.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=0.001, PDS_HP_HELO_NORDNS=0.001, RCVD_IN_DNSWL_NONE=-0.0001, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kyle Evans , Stacey Son , QEMU Developers Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" --0000000000000e389b05d5b938c1 Content-Type: text/plain; charset="UTF-8" On Thu, Jan 13, 2022 at 1:17 PM Peter Maydell wrote: > On Sun, 9 Jan 2022 at 16:40, Warner Losh wrote: > > > > Implement host_signal_handler to handle signals generated by the host > > and to do safe system calls. > > > > Signed-off-by: Stacey Son > > Signed-off-by: Kyle Evans > > Signed-off-by: Warner Losh > > --- > > bsd-user/signal.c | 105 ++++++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 105 insertions(+) > > > > diff --git a/bsd-user/signal.c b/bsd-user/signal.c > > index b1331f63d61..a6e07277fb2 100644 > > --- a/bsd-user/signal.c > > +++ b/bsd-user/signal.c > > @@ -142,6 +142,111 @@ void force_sig_fault(int sig, int code, abi_ulong > addr) > > > > static void host_signal_handler(int host_sig, siginfo_t *info, void > *puc) > > { > > + CPUState *cpu = thread_cpu; > > + CPUArchState *env = cpu->env_ptr; > > + int sig; > > + target_siginfo_t tinfo; > > + ucontext_t *uc = puc; > > + uintptr_t pc = 0; > > + bool sync_sig = false; > > + > > + /* > > + * Non-spoofed SIGSEGV and SIGBUS are synchronous, and need special > > + * handling wrt signal blocking and unwinding. > > + */ > > + if ((host_sig == SIGSEGV || host_sig == SIGBUS) && info->si_code > > 0) { > > + MMUAccessType access_type; > > + uintptr_t host_addr; > > + abi_ptr guest_addr; > > + bool is_write; > > + > > + host_addr = (uintptr_t)info->si_addr; > > + > > + /* > > + * Convert forcefully to guest address space: addresses outside > > + * reserved_va are still valid to report via SEGV_MAPERR. > > + */ > > + guest_addr = h2g_nocheck(host_addr); > > + > > + pc = host_signal_pc(uc); > > + is_write = host_signal_write(info, uc); > > + access_type = adjust_signal_pc(&pc, is_write); > > + > > + if (host_sig == SIGSEGV) { > > + bool maperr = true; > > + > > + if (info->si_code == SEGV_ACCERR && h2g_valid(host_addr)) { > > + /* If this was a write to a TB protected page, restart. > */ > > + if (is_write && > > + handle_sigsegv_accerr_write(cpu, &uc->uc_sigmask, > > + pc, guest_addr)) { > > + return; > > + } > > + > > + /* > > + * With reserved_va, the whole address space is > PROT_NONE, > > + * which means that we may get ACCERR when we want > MAPERR. > > + */ > > + if (page_get_flags(guest_addr) & PAGE_VALID) { > > + maperr = false; > > + } else { > > + info->si_code = SEGV_MAPERR; > > + } > > + } > > + > > + sigprocmask(SIG_SETMASK, &uc->uc_sigmask, NULL); > > + cpu_loop_exit_sigsegv(cpu, guest_addr, access_type, maperr, > pc); > > + } else { > > + sigprocmask(SIG_SETMASK, &uc->uc_sigmask, NULL); > > + if (info->si_code == BUS_ADRALN) { > > + cpu_loop_exit_sigbus(cpu, guest_addr, access_type, pc); > > + } > > + } > > + > > + sync_sig = true; > > + } > > + > > + /* Get the target signal number. */ > > + sig = host_to_target_signal(host_sig); > > + if (sig < 1 || sig > TARGET_NSIG) { > > + return; > > + } > > + trace_user_host_signal(cpu, host_sig, sig); > > + > > + host_to_target_siginfo_noswap(&tinfo, info); > > + > > + queue_signal(env, sig, &tinfo); /* XXX how to cope with > failure? */ > > queue_signal() can't fail, so there is nothing to cope with. > (Your bsd-user version even has the right 'void' type -- > linux-user's returns 1 always and we never look at the return > value, so we should really switch that to void return too.) > Submitted a patch to that last bit. I'll remove this comment in the rework I'm doing. > > + /* > > + * Linux does something else here -> the queue signal may be wrong, > but > > + * maybe not. And then it does the rewind_if_in_safe_syscall > > + */ > > I think you have here a bit of a mix of linux-user's current design > and some older (broken) version. This is how linux-user works today: > Yes. I think we forked bsd-user from linux-user around 2015 and the old design dropped signal queueing in 2016 if I'm reading git log correctly. > * queue_signal() is a little bit misnamed, because there is no > "queue" here: there can only be at most one "queued" signal, > and it lives in the TaskState struct (which is user-only specific > information that hangs off the guest CPU struct) as the > TaskState::sync_signal field. The reason > we only have one at once is that queue_signal() is used only > for signals generated by QEMU itself by calling queue_signal() > directly or indirectly from the cpu_loop() code. The cpu loop > always calls process_pending_signals() at the end of its loop, > which will pick up a queued signal. We never call queue_signal() > twice in a row before getting back to process_pending_signals(), > so there's only ever at most one thing in the "queue". > * for all signals we get from the host except SIGSEGV/SIGBUS, > we track whether there's a host signal pending in the > TaskState::sigtab[] array (which is indexed by signal number). > We block all host signals except SIGSEGV/SIGBUS before calling > cpu_exit(), so we know we're not going to get more than one > of these at once (and it won't clash with a queue_signal() > signal either, as those use the sync_signal field, not the > sigtab[]). > * for host-sent non-spoofed (ie not sent via 'kill()') SIGSEGV/SIGBUS, > we know this was caused by a bit of generated code, so we just > use cpu_loop_exit_restore() to turn this into an EXCP_INTERRUPT > at the right guest PC > > I feel fairly strongly that you definitely want to use the same > design as current linux-user does for signals: > * getting this right is pretty tricky, and even if we get two > different designs to both have the same semantics it's going > to be pretty confusing > * we thought quite hard about the linux-user code at the time > and it's definitely less buggy than the previous design > * It's much easier to review the bsd-user code as "yes this is > doing the same thing linux-user does" than working through > a different approach from first principles > I agree. I like that design better than what's in bsd-user today and seems simpler to implement and get right. I'd add a 4th bullet point "It's easier to share code, either directly or via copying" > I don't have as strong an opinion on whether we should try to get > it into the tree that way from the start, or to put in whatever > you have currently and then fix it later. (More accurately, > I would prefer to review patches which use the same design > as linux-user but if that's going to be massively painful/slow > for you to get something upstream doing it that way around > I can probably live with the other approach...) > I'm going to start down that path and see if I can rework the patches and see if it solves some of the regressions we've seen in bsd-user between 6.1 and 6.2 as the signal stuff was reworked in linux-user to cope better with sigsegv and sigbus. If that works out OK, then I'll move forward with adjusting the fork and reflect that back into this patch series. > > + /* > > + * For synchronous signals, unwind the cpu state to the faulting > > + * insn and then exit back to the main loop so that the signal > > + * is delivered immediately. > > + XXXX Should this be in queue_signal? > > No, because queue_signal() is called for lots of ways to pend > a signal, most of which aren't real host signals. > OK. I now agree after reading the code. > > + */ > > + if (sync_sig) { > > + cpu->exception_index = EXCP_INTERRUPT; > > + cpu_loop_exit_restore(cpu, pc); > > + } > > + > > + rewind_if_in_safe_syscall(puc); > > + > > + /* > > + * Block host signals until target signal handler entered. We > > + * can't block SIGSEGV or SIGBUS while we're executing guest > > + * code in case the guest code provokes one in the window between > > + * now and it getting out to the main loop. Signals will be > > + * unblocked again in process_pending_signals(). > > + */ > > + sigfillset(&uc->uc_sigmask); > > + sigdelset(&uc->uc_sigmask, SIGSEGV); > > + sigdelset(&uc->uc_sigmask, SIGBUS); > > + > > + /* Interrupt the virtual CPU as soon as possible. */ > > + cpu_exit(thread_cpu); > > } > > > > void signal_init(void) > > thanks > Thank you for the insight and places to focus on making this code better. It's a bit hard to discover all this just from reading code in any sane amount of time and it is both quite helpful and much appreciated. Warner --0000000000000e389b05d5b938c1 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


=
On Thu, Jan 13, 2022 at 1:17 PM Peter= Maydell <peter.maydell@lina= ro.org> wrote:
On Sun, 9 Jan 2022 at 16:40, Warner Losh <imp@bsdimp.com> wrote:
>
> Implement host_signal_handler to handle signals generated by the host<= br> > and to do safe system calls.
>
> Signed-off-by: Stacey Son <sson@FreeBSD.org>
> Signed-off-by: Kyle Evans <kevans@freebsd.org>
> Signed-off-by: Warner Losh <imp@bsdimp.com>
> ---
>=C2=A0 bsd-user/signal.c | 105 ++++++++++++++++++++++++++++++++++++++++= ++++++
>=C2=A0 1 file changed, 105 insertions(+)
>
> diff --git a/bsd-user/signal.c b/bsd-user/signal.c
> index b1331f63d61..a6e07277fb2 100644
> --- a/bsd-user/signal.c
> +++ b/bsd-user/signal.c
> @@ -142,6 +142,111 @@ void force_sig_fault(int sig, int code, abi_ulon= g addr)
>
>=C2=A0 static void host_signal_handler(int host_sig, siginfo_t *info, v= oid *puc)
>=C2=A0 {
> +=C2=A0 =C2=A0 CPUState *cpu =3D thread_cpu;
> +=C2=A0 =C2=A0 CPUArchState *env =3D cpu->env_ptr;
> +=C2=A0 =C2=A0 int sig;
> +=C2=A0 =C2=A0 target_siginfo_t tinfo;
> +=C2=A0 =C2=A0 ucontext_t *uc =3D puc;
> +=C2=A0 =C2=A0 uintptr_t pc =3D 0;
> +=C2=A0 =C2=A0 bool sync_sig =3D false;
> +
> +=C2=A0 =C2=A0 /*
> +=C2=A0 =C2=A0 =C2=A0* Non-spoofed SIGSEGV and SIGBUS are synchronous,= and need special
> +=C2=A0 =C2=A0 =C2=A0* handling wrt signal blocking and unwinding.
> +=C2=A0 =C2=A0 =C2=A0*/
> +=C2=A0 =C2=A0 if ((host_sig =3D=3D SIGSEGV || host_sig =3D=3D SIGBUS)= && info->si_code > 0) {
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 MMUAccessType access_type;
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 uintptr_t host_addr;
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 abi_ptr guest_addr;
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 bool is_write;
> +
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 host_addr =3D (uintptr_t)info->si_addr= ;
> +
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 /*
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* Convert forcefully to guest addre= ss space: addresses outside
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* reserved_va are still valid to re= port via SEGV_MAPERR.
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0*/
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 guest_addr =3D h2g_nocheck(host_addr); > +
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 pc =3D host_signal_pc(uc);
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 is_write =3D host_signal_write(info, uc);=
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 access_type =3D adjust_signal_pc(&pc,= is_write);
> +
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 if (host_sig =3D=3D SIGSEGV) {
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 bool maperr =3D true;
> +
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (info->si_code =3D=3D= SEGV_ACCERR && h2g_valid(host_addr)) {
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 /* If this wa= s a write to a TB protected page, restart. */
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (is_write = &&
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= handle_sigsegv_accerr_write(cpu, &uc->uc_sigmask,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 pc, guest_addr)) {
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= return;
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 }
> +
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 /*
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* With = reserved_va, the whole address space is PROT_NONE,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* which= means that we may get ACCERR when we want MAPERR.
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0*/
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (page_get_= flags(guest_addr) & PAGE_VALID) {
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= maperr =3D false;
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 } else {
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= info->si_code =3D SEGV_MAPERR;
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 }
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 }
> +
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 sigprocmask(SIG_SETMASK, &a= mp;uc->uc_sigmask, NULL);
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 cpu_loop_exit_sigsegv(cpu, = guest_addr, access_type, maperr, pc);
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 } else {
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 sigprocmask(SIG_SETMASK, &a= mp;uc->uc_sigmask, NULL);
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (info->si_code =3D=3D= BUS_ADRALN) {
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 cpu_loop_exit= _sigbus(cpu, guest_addr, access_type, pc);
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 }
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 }
> +
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 sync_sig =3D true;
> +=C2=A0 =C2=A0 }
> +
> +=C2=A0 =C2=A0 /* Get the target signal number. */
> +=C2=A0 =C2=A0 sig =3D host_to_target_signal(host_sig);
> +=C2=A0 =C2=A0 if (sig < 1 || sig > TARGET_NSIG) {
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 return;
> +=C2=A0 =C2=A0 }
> +=C2=A0 =C2=A0 trace_user_host_signal(cpu, host_sig, sig);
> +
> +=C2=A0 =C2=A0 host_to_target_siginfo_noswap(&tinfo, info);
> +
> +=C2=A0 =C2=A0 queue_signal(env, sig, &tinfo);=C2=A0 =C2=A0 =C2=A0= =C2=A0/* XXX how to cope with failure? */

queue_signal() can't fail, so there is nothing to cope with.
(Your bsd-user version even has the right 'void' type --
linux-user's returns 1 always and we never look at the return
value, so we should really switch that to void return too.)

Submitted a patch to that last bit. I'll remove th= is comment in the rework I'm doing.
=C2=A0
> +=C2=A0 =C2=A0 /*
> +=C2=A0 =C2=A0 =C2=A0* Linux does something else here -> the queue = signal may be wrong, but
> +=C2=A0 =C2=A0 =C2=A0* maybe not.=C2=A0 And then it does the rewind_if= _in_safe_syscall
> +=C2=A0 =C2=A0 =C2=A0*/

I think you have here a bit of a mix of linux-user's current design
and some older (broken) version. This is how linux-user works today:

Yes. I think we forked bsd-user from linux-us= er around 2015 and the old
design dropped signal queueing in 2016= if I'm reading git log correctly.
=C2=A0
=C2=A0* queue_signal() is a little bit misnamed, because there is no
=C2=A0 =C2=A0"queue" here: there can only be at most one "qu= eued" signal,
=C2=A0 =C2=A0and it lives in the TaskState struct (which is user-only speci= fic
=C2=A0 =C2=A0information that hangs off the guest CPU struct) as the
=C2=A0 =C2=A0TaskState::sync_signal field. The reason
=C2=A0 =C2=A0we only have one at once is that queue_signal() is used only =C2=A0 =C2=A0for signals generated by QEMU itself by calling queue_signal()=
=C2=A0 =C2=A0directly or indirectly from the cpu_loop() code. The cpu loop<= br> =C2=A0 =C2=A0always calls process_pending_signals() at the end of its loop,=
=C2=A0 =C2=A0which will pick up a queued signal. We never call queue_signal= ()
=C2=A0 =C2=A0twice in a row before getting back to process_pending_signals(= ),
=C2=A0 =C2=A0so there's only ever at most one thing in the "queue&= quot;.
=C2=A0* for all signals we get from the host except SIGSEGV/SIGBUS,
=C2=A0 =C2=A0we track whether there's a host signal pending in the
=C2=A0 =C2=A0TaskState::sigtab[] array (which is indexed by signal number).=
=C2=A0 =C2=A0We block all host signals except SIGSEGV/SIGBUS before calling=
=C2=A0 =C2=A0cpu_exit(), so we know we're not going to get more than on= e
=C2=A0 =C2=A0of these at once (and it won't clash with a queue_signal()=
=C2=A0 =C2=A0signal either, as those use the sync_signal field, not the
=C2=A0 =C2=A0sigtab[]).
=C2=A0* for host-sent non-spoofed (ie not sent via 'kill()') SIGSEG= V/SIGBUS,
=C2=A0 =C2=A0we know this was caused by a bit of generated code, so we just=
=C2=A0 =C2=A0use cpu_loop_exit_restore() to turn this into an EXCP_INTERRUP= T
=C2=A0 =C2=A0at the right guest PC

I feel fairly strongly that you definitely want to use the same
design as current linux-user does for signals:
=C2=A0* getting this right is pretty tricky, and even if we get two
=C2=A0 =C2=A0different designs to both have the same semantics it's goi= ng
=C2=A0 =C2=A0to be pretty confusing
=C2=A0* we thought quite hard about the linux-user code at the time
=C2=A0 =C2=A0and it's definitely less buggy than the previous design =C2=A0* It's much easier to review the bsd-user code as "yes this = is
=C2=A0 =C2=A0doing the same thing linux-user does" than working throug= h
=C2=A0 =C2=A0a different approach from first principles

I agree. I like that design better than what's in bsd-= user today and
seems simpler to implement and get right. I'd = add a 4th bullet point
"It's easier to share code, eithe= r directly or via copying"
=C2=A0
I don't have as strong an opinion on whether we should try to get
it into the tree that way from the start, or to put in whatever
you have currently and then fix it later. (More accurately,
I would prefer to review patches which use the same design
as linux-user but if that's going to be massively painful/slow
for you to get something upstream doing it that way around
I can probably live with the other approach...)

I'm going to start down that path and see if I can rework the = patches
and see if it solves some of the regressions we've se= en in bsd-user
between 6.1 and 6.2 as the signal stuff was rework= ed in linux-user
to cope better with sigsegv and sigbus. If that = works out OK, then
I'll move forward with adjusting the fork = and reflect that back into
this patch series.
=C2=A0
> +=C2=A0 =C2=A0 /*
> +=C2=A0 =C2=A0 =C2=A0* For synchronous signals, unwind the cpu state t= o the faulting
> +=C2=A0 =C2=A0 =C2=A0* insn and then exit back to the main loop so tha= t the signal
> +=C2=A0 =C2=A0 =C2=A0* is delivered immediately.
> +=C2=A0 =C2=A0 =C2=A0XXXX Should this be in queue_signal?

No, because queue_signal() is called for lots of ways to pend
a signal, most of which aren't real host signals.
=
OK. I now agree after reading the code.
=C2=A0
> +=C2=A0 =C2=A0 =C2=A0*/
> +=C2=A0 =C2=A0 if (sync_sig) {
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 cpu->exception_index =3D EXCP_INTERRUP= T;
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 cpu_loop_exit_restore(cpu, pc);
> +=C2=A0 =C2=A0 }
> +
> +=C2=A0 =C2=A0 rewind_if_in_safe_syscall(puc);
> +
> +=C2=A0 =C2=A0 /*
> +=C2=A0 =C2=A0 =C2=A0* Block host signals until target signal handler = entered. We
> +=C2=A0 =C2=A0 =C2=A0* can't block SIGSEGV or SIGBUS while we'= re executing guest
> +=C2=A0 =C2=A0 =C2=A0* code in case the guest code provokes one in the= window between
> +=C2=A0 =C2=A0 =C2=A0* now and it getting out to the main loop. Signal= s will be
> +=C2=A0 =C2=A0 =C2=A0* unblocked again in process_pending_signals(). > +=C2=A0 =C2=A0 =C2=A0*/
> +=C2=A0 =C2=A0 sigfillset(&uc->uc_sigmask);
> +=C2=A0 =C2=A0 sigdelset(&uc->uc_sigmask, SIGSEGV);
> +=C2=A0 =C2=A0 sigdelset(&uc->uc_sigmask, SIGBUS);
> +
> +=C2=A0 =C2=A0 /* Interrupt the virtual CPU as soon as possible. */ > +=C2=A0 =C2=A0 cpu_exit(thread_cpu);
>=C2=A0 }
>
>=C2=A0 void signal_init(void)

thanks

Thank you for the insight and pl= aces to focus on making this code better. It's
a bit hard to = discover all this just from reading code in any sane amount of time
and it is both quite helpful and much appreciated.

<= div>Warner=C2=A0
--0000000000000e389b05d5b938c1--