* hardcoded SIGSEGV in __die() ?
From: Joakim Tjernlund @ 2020-03-23 14:17 UTC
To: linuxppc-dev

In __die(), see below, there is this call to notify_die() with SIGSEGV hardcoded, this seems odd
to me as the variable "err" holds the true signal (in my case SIGBUS).
Should not SIGSEGV be replaced with the true signal no.?

 Jocke

static int __die(const char *str, struct pt_regs *regs, long err)
{
	printk("Oops: %s, sig: %ld [#%d]\n", str, err, ++die_counter);

	if (IS_ENABLED(CONFIG_CPU_LITTLE_ENDIAN))
		printk("LE ");
	else
		printk("BE ");

	if (IS_ENABLED(CONFIG_PREEMPT))
		pr_cont("PREEMPT ");

	if (IS_ENABLED(CONFIG_SMP))
		pr_cont("SMP NR_CPUS=%d ", NR_CPUS);

	if (debug_pagealloc_enabled())
		pr_cont("DEBUG_PAGEALLOC ");

	if (IS_ENABLED(CONFIG_NUMA))
		pr_cont("NUMA ");

	pr_cont("%s\n", ppc_md.name ? ppc_md.name : "");

	if (notify_die(DIE_OOPS, str, regs, err, 255, SIGSEGV) == NOTIFY_STOP)
		return 1;

	print_modules();
	show_regs(regs);

	return 0;
}

* Re: hardcoded SIGSEGV in __die() ?
From: Christophe Leroy @ 2020-03-23 14:43 UTC
To: Joakim Tjernlund, linuxppc-dev

On 23/03/2020 at 15:17, Joakim Tjernlund wrote:
> In __die(), see below, there is this call to notify_die() with SIGSEGV hardcoded, this seems odd
> to me as the variable "err" holds the true signal (in my case SIGBUS)
> Should not SIGSEGV be replaced with the true signal no.?

As far as I can see, comes from
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=66fcb1059

Christophe

* Re: hardcoded SIGSEGV in __die() ?
From: Christophe Leroy @ 2020-03-23 14:45 UTC
To: Joakim Tjernlund, linuxppc-dev

On 23/03/2020 at 15:43, Christophe Leroy wrote:
> On 23/03/2020 at 15:17, Joakim Tjernlund wrote:
>> In __die(), see below, there is this call to notify_die() with
>> SIGSEGV hardcoded, this seems odd
>> to me as the variable "err" holds the true signal (in my case SIGBUS)
>> Should not SIGSEGV be replaced with the true signal no.?
>
> As far as I can see, comes from
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=66fcb1059

And
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ae87221d3ce49d9de1e43756da834fd0bf05a2ad
shows it is (was?) similar on x86.

Christophe

* Re: hardcoded SIGSEGV in __die() ?
From: Joakim Tjernlund @ 2020-03-23 15:08 UTC
To: christophe.leroy, linuxppc-dev

On Mon, 2020-03-23 at 15:45 +0100, Christophe Leroy wrote:
> On 23/03/2020 at 15:43, Christophe Leroy wrote:
> > On 23/03/2020 at 15:17, Joakim Tjernlund wrote:
> > > In __die(), see below, there is this call to notify_die() with
> > > SIGSEGV hardcoded, this seems odd
> > > to me as the variable "err" holds the true signal (in my case SIGBUS)
> > > Should not SIGSEGV be replaced with the true signal no.?
> >
> > As far as I can see, comes from
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=66fcb1059
>
> And
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ae87221d3ce49d9de1e43756da834fd0bf05a2ad
> shows it is (was?) similar on x86.

I tried to follow that chain thinking it would end up sending a signal to user space but I cannot see
that happening. Seems to be related to debugging.

In short, I cannot see any signal being delivered to user space. If so that would explain why
our user space process never dies.
Is there a signal hidden in machine_check handler for SIGBUS I cannot see?

 Jocke
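
A minimal user-space sketch of the observation above, assuming the kernel's notify_die() does what its definition in kernel/notifier.c suggests: it bundles its arguments, including the signal number, into a struct die_args and walks the die notifier chain used by debugging hooks such as kprobes and kgdb. Nothing on that path queues a signal for a process, so the hardcoded SIGSEGV is only a hint to those hooks. All names below (sketch_die_args, sketch_notify_die, recording_notifier) are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's NOTIFY_DONE / NOTIFY_STOP. */
enum { SKETCH_NOTIFY_DONE = 0, SKETCH_NOTIFY_STOP = 1 };

/* Hypothetical mirror of the data handed to each die notifier. */
struct sketch_die_args {
	const char *str;
	long err;
	int trapnr;
	int signr;
};

typedef int (*sketch_notifier_fn)(const struct sketch_die_args *args);

#define SKETCH_MAX_NOTIFIERS 4
static sketch_notifier_fn sketch_chain[SKETCH_MAX_NOTIFIERS];
static size_t sketch_chain_len;

/* What the example notifier last saw; stands in for a debugger's state. */
static int last_seen_signr;

/* Append a callback to the chain; returns 0 on success, -1 if full. */
int sketch_register_notifier(sketch_notifier_fn fn)
{
	assert(fn != NULL);
	if (sketch_chain_len == SKETCH_MAX_NOTIFIERS)
		return -1;
	sketch_chain[sketch_chain_len++] = fn;
	return 0;
}

/* Example debugging hook: records the signal number and lets the oops go on. */
int recording_notifier(const struct sketch_die_args *args)
{
	last_seen_signr = args->signr;
	return SKETCH_NOTIFY_DONE;
}

/*
 * Model of notify_die(): pack the arguments and call every notifier.
 * Note that no signal is sent to anyone; "sig" is just data for the hooks.
 */
int sketch_notify_die(const char *str, long err, int trap, int sig)
{
	struct sketch_die_args args = {
		.str = str, .err = err, .trapnr = trap, .signr = sig,
	};
	size_t i;

	for (i = 0; i < sketch_chain_len; i++) {
		if (sketch_chain[i](&args) == SKETCH_NOTIFY_STOP)
			return SKETCH_NOTIFY_STOP;
	}
	return SKETCH_NOTIFY_DONE;
}
```

After sketch_register_notifier(recording_notifier), a call to sketch_notify_die("Machine check", SIGBUS, 255, SIGSEGV) leaves SIGSEGV in last_seen_signr and returns SKETCH_NOTIFY_DONE; no process is signalled at any point.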

* Re: hardcoded SIGSEGV in __die() ?
From: Christophe Leroy @ 2020-03-23 15:31 UTC
To: Joakim Tjernlund, linuxppc-dev

On 23/03/2020 at 16:08, Joakim Tjernlund wrote:
> On Mon, 2020-03-23 at 15:45 +0100, Christophe Leroy wrote:
>> On 23/03/2020 at 15:43, Christophe Leroy wrote:
>>> On 23/03/2020 at 15:17, Joakim Tjernlund wrote:
>>>> In __die(), see below, there is this call to notify_die() with
>>>> SIGSEGV hardcoded, this seems odd
>>>> to me as the variable "err" holds the true signal (in my case SIGBUS)
>>>> Should not SIGSEGV be replaced with the true signal no.?
>>>
>>> As far as I can see, comes from
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=66fcb1059
>>
>> And
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ae87221d3ce49d9de1e43756da834fd0bf05a2ad
>> shows it is (was?) similar on x86.
>
> I tried to follow that chain thinking it would end up sending a signal to user space but I cannot see
> that happening. Seems to be related to debugging.
>
> In short, I cannot see any signal being delivered to user space. If so that would explain why
> our user space process never dies.
> Is there a signal hidden in machine_check handler for SIGBUS I cannot see?

Isn't it done in do_exit(), called from oops_end() ?

Christophe

* Re: hardcoded SIGSEGV in __die() ?
From: Joakim Tjernlund @ 2020-03-23 15:44 UTC
To: christophe.leroy, linuxppc-dev

On Mon, 2020-03-23 at 16:31 +0100, Christophe Leroy wrote:
> On 23/03/2020 at 16:08, Joakim Tjernlund wrote:
> > On Mon, 2020-03-23 at 15:45 +0100, Christophe Leroy wrote:
> > > On 23/03/2020 at 15:43, Christophe Leroy wrote:
> > > > On 23/03/2020 at 15:17, Joakim Tjernlund wrote:
> > > > > In __die(), see below, there is this call to notify_die() with
> > > > > SIGSEGV hardcoded, this seems odd
> > > > > to me as the variable "err" holds the true signal (in my case SIGBUS)
> > > > > Should not SIGSEGV be replaced with the true signal no.?
> > > >
> > > > As far as I can see, comes from
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=66fcb1059
> > >
> > > And
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ae87221d3ce49d9de1e43756da834fd0bf05a2ad
> > > shows it is (was?) similar on x86.
> >
> > I tried to follow that chain thinking it would end up sending a signal to user space but I cannot see
> > that happening. Seems to be related to debugging.
> >
> > In short, I cannot see any signal being delivered to user space. If so that would explain why
> > our user space process never dies.
> > Is there a signal hidden in machine_check handler for SIGBUS I cannot see?
>
> Isn't it done in do_exit(), called from oops_end() ?

hmm, so it seems. The odd thing though is that do_exit takes an exit code, not signal number.
Also, feels a bit odd to force an exit (that we haven't seen happening) rather than just a signal.

 Jocke

* RE: hardcoded SIGSEGV in __die() ?
From: David Laight @ 2020-03-25 17:02 UTC
To: 'Joakim Tjernlund', christophe.leroy, linuxppc-dev

From: Joakim Tjernlund
> Sent: 23 March 2020 15:45
...
> > > I tried to follow that chain thinking it would end up sending a signal to user space but I cannot see
> > > that happening. Seems to be related to debugging.
> > >
> > > In short, I cannot see any signal being delivered to user space. If so that would explain why
> > > our user space process never dies.
> > > Is there a signal hidden in machine_check handler for SIGBUS I cannot see?
> >
> > Isn't it done in do_exit(), called from oops_end() ?
>
> hmm, so it seems. The odd thing though is that do_exit takes an exit code, not signal number.
> Also, feels a bit odd to force an exit (that we haven't seen happening) rather than just a signal.

Isn't there something 'magic' that converts EFAULT into SIGSEGV?

	David

* Re: hardcoded SIGSEGV in __die() ?
From: Joakim Tjernlund @ 2020-03-25 17:09 UTC
To: christophe.leroy, linuxppc-dev, David.Laight

On Wed, 2020-03-25 at 17:02 +0000, David Laight wrote:
> From: Joakim Tjernlund
> > Sent: 23 March 2020 15:45
> ...
> > > > I tried to follow that chain thinking it would end up sending a
> > > > signal to user space but I cannot see
> > > > that happening. Seems to be related to debugging.
> > > >
> > > > In short, I cannot see any signal being delivered to user
> > > > space. If so that would explain why
> > > > our user space process never dies.
> > > > Is there a signal hidden in machine_check handler for SIGBUS I
> > > > cannot see?
> > >
> > > Isn't it done in do_exit(), called from oops_end() ?
> >
> > hmm, so it seems. The odd thing though is that do_exit takes an
> > exit code, not signal number.
> > Also, feels a bit odd to force an exit (that we haven't seen
> > happening) rather than just a signal.
>
> Isn't there something 'magic' that converts EFAULT into SIGSEGV?

I have tried to find out and I cannot see a signal being sent. Also, SEGV is wrong, this is a SIGBUS fault.
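
On the point raised earlier in this sub-thread, that do_exit() takes an exit code rather than a signal number: on Linux the two share one integer. The low seven bits of a task's exit code carry the terminating signal, while a normal exit status sits in the next byte. A raw code equal to a signal number, which is what do_exit(signr) appears to pass, is therefore reported to a waiting parent as death by that signal. Below is a small sketch using the standard <sys/wait.h> macros; the helper names are hypothetical, and the bit layout is the Linux one, assumed here rather than guaranteed by POSIX.

```c
#include <assert.h>
#include <signal.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: the wait status a parent would observe if the
 * kernel stored "code" unchanged as the exit code, as a call like
 * do_exit(signr) would. Only plain signal numbers are modelled.
 */
int status_from_raw_exit_code(int code)
{
	assert(code > 0 && code < 0x7f);
	return code;
}

/* Hypothetical helper: wait status for an ordinary exit(n), n in 0..255. */
int status_from_normal_exit(int n)
{
	assert(n >= 0 && n <= 255);
	return (n & 0xff) << 8;
}

/* Terminating signal encoded in the status, or 0 if it was a normal exit. */
int terminating_signal(int status)
{
	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Exit status encoded in the status, or -1 if the task died from a signal. */
int normal_exit_status(int status)
{
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With these helpers, terminating_signal(status_from_raw_exit_code(SIGBUS)) yields SIGBUS, while status_from_normal_exit(3) decodes as a normal exit with status 3. This only describes how a dying task is reported; it does not show a signal being queued to a task that keeps running.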

* Re: hardcoded SIGSEGV in __die() ?
From: Michael Ellerman @ 2020-03-26 0:28 UTC
To: Joakim Tjernlund, christophe.leroy, linuxppc-dev

Joakim Tjernlund <Joakim.Tjernlund@infinera.com> writes:
> On Mon, 2020-03-23 at 15:45 +0100, Christophe Leroy wrote:
> > On 23/03/2020 at 15:43, Christophe Leroy wrote:
> > > On 23/03/2020 at 15:17, Joakim Tjernlund wrote:
> > > > In __die(), see below, there is this call to notify_die() with
> > > > SIGSEGV hardcoded, this seems odd
> > > > to me as the variable "err" holds the true signal (in my case SIGBUS)
> > > > Should not SIGSEGV be replaced with the true signal no.?
> > >
> > > As far as I can see, comes from
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=66fcb1059
> >
> > And
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ae87221d3ce49d9de1e43756da834fd0bf05a2ad
> > shows it is (was?) similar on x86.
>
> I tried to follow that chain thinking it would end up sending a signal to user space but I cannot see
> that happening. Seems to be related to debugging.
>
> In short, I cannot see any signal being delivered to user space. If so that would explain why
> our user space process never dies.
> Is there a signal hidden in machine_check handler for SIGBUS I cannot see?

It's platform specific. What platform are you on?

See the ppc_md & cur_cpu_spec calls here:

void machine_check_exception(struct pt_regs *regs)
{
	int recover = 0;
	bool nested = in_nmi();
	if (!nested)
		nmi_enter();

	__this_cpu_inc(irq_stat.mce_exceptions);

	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);

	/* See if any machine dependent calls. In theory, we would want
	 * to call the CPU first, and call the ppc_md. one if the CPU
	 * one returns a positive number. However there is existing code
	 * that assumes the board gets a first chance, so let's keep it
	 * that way for now and fix things later. --BenH.
	 */
	if (ppc_md.machine_check_exception)
		recover = ppc_md.machine_check_exception(regs);
	else if (cur_cpu_spec->machine_check)
		recover = cur_cpu_spec->machine_check(regs);

	if (recover > 0)
		goto bail;


Either the ppc_md or cpu_spec handlers can send a signal, but after a
bit of grepping I think only the pseries and powernv ones do.

If you get into die() then it's an oops, which is not the same as a
normal signal.

cheers
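
The dispatch order in the excerpt above can be modelled in a few lines of plain C. This is a sketch only: sketch_machdep, sketch_cpu_spec and the two handlers are hypothetical stand-ins for ppc_md, cur_cpu_spec and the real machine check callbacks. It shows the property the comment describes: when a board hook exists, the CPU hook is never consulted, even if the board hook reports no recovery.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, empty stand-in for struct pt_regs. */
struct sketch_regs {
	int unused;
};

typedef int (*sketch_mc_handler)(struct sketch_regs *regs);

/* Hypothetical stand-in for the board-level hook table (ppc_md). */
struct sketch_machdep {
	sketch_mc_handler machine_check_exception;
};

/* Hypothetical stand-in for the CPU-level hook table (cur_cpu_spec). */
struct sketch_cpu_spec {
	sketch_mc_handler machine_check;
};

/*
 * Same if / else-if shape as the excerpt: the board hook gets the first
 * and only chance when it is set; the CPU hook runs only otherwise.
 * Returns 1 when the event counts as recovered (the "goto bail" path),
 * 0 when the caller would carry on toward die().
 */
int sketch_dispatch(const struct sketch_machdep *md,
		    const struct sketch_cpu_spec *cpu,
		    struct sketch_regs *regs)
{
	int recover = 0;

	assert(md != NULL && cpu != NULL);

	if (md->machine_check_exception)
		recover = md->machine_check_exception(regs);
	else if (cpu->machine_check)
		recover = cpu->machine_check(regs);

	return recover > 0;
}

/* Example handler that claims recovery, e.g. after signalling the task. */
int handler_recovers(struct sketch_regs *regs)
{
	(void)regs;
	return 1;
}

/* Example handler that only logs and reports no recovery. */
int handler_gives_up(struct sketch_regs *regs)
{
	(void)regs;
	return 0;
}
```

With a board hook set to handler_gives_up and a CPU hook set to handler_recovers, sketch_dispatch() returns 0; with no board hook it returns 1. On a platform whose handlers never report recovery, every machine check therefore falls through to the oops path.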

* Re: hardcoded SIGSEGV in __die() ?
From: Joakim Tjernlund @ 2020-03-27 10:10 UTC
To: christophe.leroy, mpe, linuxppc-dev

On Thu, 2020-03-26 at 11:28 +1100, Michael Ellerman wrote:
> Joakim Tjernlund <Joakim.Tjernlund@infinera.com> writes:
> > On Mon, 2020-03-23 at 15:45 +0100, Christophe Leroy wrote:
> > > On 23/03/2020 at 15:43, Christophe Leroy wrote:
> > > > On 23/03/2020 at 15:17, Joakim Tjernlund wrote:
> > > > > In __die(), see below, there is this call to notify_die() with
> > > > > SIGSEGV hardcoded, this seems odd
> > > > > to me as the variable "err" holds the true signal (in my case SIGBUS)
> > > > > Should not SIGSEGV be replaced with the true signal no.?
> > > >
> > > > As far as I can see, comes from
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=66fcb1059
> > >
> > > And
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ae87221d3ce49d9de1e43756da834fd0bf05a2ad
> > > shows it is (was?) similar on x86.
> >
> > I tried to follow that chain thinking it would end up sending a signal to user space but I cannot see
> > that happening. Seems to be related to debugging.
> >
> > In short, I cannot see any signal being delivered to user space. If so that would explain why
> > our user space process never dies.
> > Is there a signal hidden in machine_check handler for SIGBUS I cannot see?
>
> It's platform specific. What platform are you on?

I am on e500, e5500(e500mc) and 83xx :)

>
> See the ppc_md & cur_cpu_spec calls here:
>
> void machine_check_exception(struct pt_regs *regs)
> {
> 	int recover = 0;
> 	bool nested = in_nmi();
> 	if (!nested)
> 		nmi_enter();
>
> 	__this_cpu_inc(irq_stat.mce_exceptions);
>
> 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
>
> 	/* See if any machine dependent calls. In theory, we would want
> 	 * to call the CPU first, and call the ppc_md. one if the CPU
> 	 * one returns a positive number. However there is existing code
> 	 * that assumes the board gets a first chance, so let's keep it
> 	 * that way for now and fix things later. --BenH.
> 	 */
> 	if (ppc_md.machine_check_exception)
> 		recover = ppc_md.machine_check_exception(regs);
> 	else if (cur_cpu_spec->machine_check)
> 		recover = cur_cpu_spec->machine_check(regs);
>
> 	if (recover > 0)
> 		goto bail;
>
>
> Either the ppc_md or cpu_spec handlers can send a signal, but after a
> bit of grepping I think only the pseries and powernv ones do.

Seems so

>
> If you get into die() then it's an oops, which is not the same as a
> normal signal.

Exactly, and the die/OOPS does not seem to work as intended either. The system tries to limp along and generates
more similar OOPses and may even hang.

>
> cheers

* Re: hardcoded SIGSEGV in __die() ?
From: Joakim Tjernlund @ 2020-03-30 17:16 UTC
To: christophe.leroy, mpe, linuxppc-dev

On Thu, 2020-03-26 at 11:28 +1100, Michael Ellerman wrote:
> Joakim Tjernlund <Joakim.Tjernlund@infinera.com> writes:
> > On Mon, 2020-03-23 at 15:45 +0100, Christophe Leroy wrote:
> > > On 23/03/2020 at 15:43, Christophe Leroy wrote:
> > > > On 23/03/2020 at 15:17, Joakim Tjernlund wrote:
> > > > > In __die(), see below, there is this call to notify_die() with
> > > > > SIGSEGV hardcoded, this seems odd
> > > > > to me as the variable "err" holds the true signal (in my case SIGBUS)
> > > > > Should not SIGSEGV be replaced with the true signal no.?
> > > >
> > > > As far as I can see, comes from
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=66fcb1059
> > >
> > > And
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ae87221d3ce49d9de1e43756da834fd0bf05a2ad
> > > shows it is (was?) similar on x86.
> >
> > I tried to follow that chain thinking it would end up sending a signal to user space but I cannot see
> > that happening. Seems to be related to debugging.
> >
> > In short, I cannot see any signal being delivered to user space. If so that would explain why
> > our user space process never dies.
> > Is there a signal hidden in machine_check handler for SIGBUS I cannot see?
>
> It's platform specific. What platform are you on?
>
> See the ppc_md & cur_cpu_spec calls here:
>
> void machine_check_exception(struct pt_regs *regs)
> {
> 	int recover = 0;
> 	bool nested = in_nmi();
> 	if (!nested)
> 		nmi_enter();
>
> 	__this_cpu_inc(irq_stat.mce_exceptions);
>
> 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
>
> 	/* See if any machine dependent calls. In theory, we would want
> 	 * to call the CPU first, and call the ppc_md. one if the CPU
> 	 * one returns a positive number. However there is existing code
> 	 * that assumes the board gets a first chance, so let's keep it
> 	 * that way for now and fix things later. --BenH.
> 	 */
> 	if (ppc_md.machine_check_exception)
> 		recover = ppc_md.machine_check_exception(regs);
> 	else if (cur_cpu_spec->machine_check)
> 		recover = cur_cpu_spec->machine_check(regs);
>
> 	if (recover > 0)
> 		goto bail;
>
>
> Either the ppc_md or cpu_spec handlers can send a signal, but after a
> bit of grepping I think only the pseries and powernv ones do.
>
> If you get into die() then it's an oops, which is not the same as a
> normal signal.

I had a look at opal_machine_check and friends and came up with:

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 0381242920d9..12715d24141c 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -621,6 +621,11 @@ int machine_check_e500mc(struct pt_regs *regs)
 		       reason & MCSR_MEA ? "Effective" : "Physical", addr);
 	}
 
+	if ((user_mode(regs))) {
+		_exception(SIGBUS, regs, reason, regs->nip);
+		recoverable = 1;
+	}
+
 silent_out:
 	mtspr(SPRN_MCSR, mcsr);
 	return mfspr(SPRN_MCSR) == 0 && recoverable;
@@ -665,6 +670,10 @@ int machine_check_e500(struct pt_regs *regs)
 	if (reason & MCSR_BUS_RPERR)
 		printk("Bus - Read Parity Error\n");
 
+	if ((user_mode(regs))) {
+		_exception(SIGBUS, regs, reason, regs->nip);
+		return 1;
+	}
 	return 0;
 }
 
@@ -695,6 +704,10 @@ int machine_check_e200(struct pt_regs *regs)
 	if (reason & MCSR_BUS_WRERR)
 		printk("Bus - Write Bus Error on buffered store or cache line push\n");
 
+	if ((user_mode(regs))) {
+		_exception(SIGBUS, regs, reason, regs->nip);
+		return 1;
+	}
 	return 0;
 }
 #elif defined(CONFIG_PPC32)
@@ -731,6 +744,10 @@ int machine_check_generic(struct pt_regs *regs)
 	default:
 		printk("Unknown values in msr\n");
 	}
+	if ((user_mode(regs))) {
+		_exception(SIGBUS, regs, reason, regs->nip);
+		return 1;
+	}
 	return 0;
 }
 #endif /* everything else */

I don't really know what I am doing, does the above make sense to you?

 Jocke

end of thread, 2020-03-30 17:18 UTC

Thread overview: 11+ messages
2020-03-23 14:17 hardcoded SIGSEGV in __die() ? Joakim Tjernlund
2020-03-23 14:43 ` Christophe Leroy
2020-03-23 14:45   ` Christophe Leroy
2020-03-23 15:08     ` Joakim Tjernlund
2020-03-23 15:31       ` Christophe Leroy
2020-03-23 15:44         ` Joakim Tjernlund
2020-03-25 17:02           ` David Laight
2020-03-25 17:09             ` Joakim Tjernlund
2020-03-26  0:28       ` Michael Ellerman
2020-03-27 10:10         ` Joakim Tjernlund
2020-03-30 17:16         ` Joakim Tjernlund
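
As a reading aid for the patch in the last message of the thread (2020-03-30 17:16), here is a user-space model of the control flow it adds to each machine check handler: if the fault was taken in user mode, queue SIGBUS for the task and report the event as recovered; otherwise return 0 so the caller continues toward die(). The types and functions are hypothetical stand-ins (sketch_exception() for _exception(), a plain flag for user_mode(regs)), and the model says nothing about whether reporting recovery is safe for every MCSR reason bit, which is the question the thread leaves open.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for the interrupted task. */
struct sketch_task {
	int pending_signal;	/* 0 means nothing queued */
};

/* Hypothetical stand-in for struct pt_regs plus the current task. */
struct sketch_mc_regs {
	int from_user_mode;	/* models user_mode(regs) */
	struct sketch_task *task;
};

/* Models _exception(): queue a signal for the task that faulted. */
void sketch_exception(int signr, struct sketch_mc_regs *regs)
{
	assert(regs != NULL && regs->task != NULL);
	regs->task->pending_signal = signr;
}

/*
 * Models the added hunk: a user-mode machine check becomes SIGBUS and is
 * reported as recovered (1); a kernel-mode one is left unrecovered (0),
 * which in the kernel leads to the oops path.
 */
int sketch_machine_check(struct sketch_mc_regs *regs)
{
	if (regs->from_user_mode) {
		sketch_exception(SIGBUS, regs);
		return 1;
	}
	return 0;
}
```

Called with from_user_mode set, sketch_machine_check() returns 1 and leaves SIGBUS pending on the task; with it clear, it returns 0 and queues nothing.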