* [Xenomai] [PULL] ipipe: fixes for 3.14
From: Jan Kiszka @ 2015-02-16 17:35 UTC
To: Xenomai

The following changes since commit 00d8a6f2e95453f61ae97b6587dba03bc91c7e1a:

  arm/ipipe: Resolve trival merge conflicts (2015-01-14 17:00:22 +0100)

are available in the git repository at:

  git://git.xenomai.org/ipipe-jki

for you to fetch changes up to f982e48ce99e1bc9e71eac21830adac2db265c98:

  x86/ipipe: Restore invocation of ipipe_unlock_irq from startup_ioapic_irq (2015-02-16 18:22:21 +0100)

All patches apply to 3.16 as well but weren't tested there yet. Still
analyzing strange

traps: (systemd)[1420] trap invalid opcode ip:7f4d138f612b sp:7fff4fd8f568 error:0 in libpthread-2.19.so[7f4d138e5000+18000]

on the target under 3.14...

----------------------------------------------------------------
Jan Kiszka (3):
      arm/ipipe: Fix ret_from_exception for !CONFIG_IPIPE
      ipipe: Disable PROVE_RCU when I-pipe is on
      x86/ipipe: Restore invocation of ipipe_unlock_irq from startup_ioapic_irq

 arch/arm/kernel/entry-armv.S   | 4 +++-
 arch/x86/kernel/apic/io_apic.c | 1 +
 lib/Kconfig.debug              | 2 +-
 3 files changed, 5 insertions(+), 2 deletions(-)

* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
From: Philippe Gerum @ 2015-02-16 17:48 UTC
To: Jan Kiszka, Xenomai

On 02/16/2015 06:35 PM, Jan Kiszka wrote:
> All patches apply to 3.16 as well but weren't tested there yet. Still
> analyzing strange
>
> traps: (systemd)[1420] trap invalid opcode ip:7f4d138f612b sp:7fff4fd8f568 error:0 in libpthread-2.19.so[7f4d138e5000+18000]
>
> on the target under 3.14...
>

Register trashing on the fast call path (system_call_after_gs) maybe.
Forcing CONFIG_IPIPE_LEGACY there might help ruling this out.

--
Philippe.

* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
From: Jan Kiszka @ 2015-02-16 17:52 UTC
To: Philippe Gerum, Xenomai

On 2015-02-16 18:48, Philippe Gerum wrote:
> Register trashing on the fast call path (system_call_after_gs) maybe.
> Forcing CONFIG_IPIPE_LEGACY there might help ruling this out.

That's without any Xenomai configured. But I will check.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux

* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
From: Jan Kiszka @ 2015-02-16 18:52 UTC
To: Philippe Gerum, Xenomai

On 2015-02-16 18:52, Jan Kiszka wrote:
> On 2015-02-16 18:48, Philippe Gerum wrote:
>> Register trashing on the fast call path (system_call_after_gs) maybe.
>> Forcing CONFIG_IPIPE_LEGACY there might help ruling this out.
>
> That's without any Xenomai configured. But I will check.

This is where it traps in libpthread:

   1112b:	c7 f8 00 00 00 00 	xbeginq 11131 <__lll_lock_elision+0x51>

For some reason, glibc starts to believe it could use RTM on that
machine - although this is clearly not available (Haswell i7, thus a
buggy CPU). Need to understand which feature check goes wrong, and why
only on I-pipe enabled kernels.

The good news is that I still have a similar and unresolved report from
early customer tests. That could solve their issues as well (they
caused massive domain migrations).

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux

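As a cross-check of the numbers in the message above, the following Python sketch computes the offset of the faulting instruction inside libpthread from the trap line and decodes the fallback target of the xbegin from its instruction bytes. It assumes the usual XBEGIN rel32 encoding (C7 F8 followed by a 32-bit little-endian displacement, relative to the next instruction); the function names are made up for illustration.

```python
import struct


def trap_offset(ip: int, map_base: int) -> int:
    """Offset of the faulting instruction within the mapped object."""
    return ip - map_base


def decode_xbegin(addr: int, code: bytes) -> int:
    """Return the fallback address of an XBEGIN rel32 instruction at addr."""
    if len(code) != 6 or code[:2] != b"\xc7\xf8":
        raise ValueError("not an XBEGIN rel32 encoding")
    # Signed 32-bit displacement, relative to the end of the instruction.
    (rel32,) = struct.unpack("<i", code[2:6])
    return addr + len(code) + rel32


# ip and mapping base taken from the trap line, bytes from the disassembly.
offset = trap_offset(0x7F4D138F612B, 0x7F4D138E5000)
fallback = decode_xbegin(offset, bytes.fromhex("c7f800000000"))
print(hex(offset), hex(fallback))  # 0x1112b 0x11131
```

Both values agree with the disassembly line: the trap ip lands exactly on the xbegin at 1112b, and a zero displacement makes the fallback target the following instruction at 11131.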
* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
From: Jan Kiszka @ 2015-02-16 19:11 UTC
To: Philippe Gerum, Xenomai

On 2015-02-16 19:52, Jan Kiszka wrote:
> This is where it traps in libpthread:
>
>    1112b:	c7 f8 00 00 00 00 	xbeginq 11131 <__lll_lock_elision+0x51>
>
> For some reason, glibc starts to believe it could use RTM on that
> machine - although this is clearly not available (Haswell i7, thus a
> buggy CPU). Need to understand which feature check goes wrong, and why
> only on I-pipe enabled kernels.
>
> The good news is that I still have a similar and unresolved report from
> early customer tests. That could solve their issues as well (they
> caused massive domain migrations).

Wait - we were already one step further: a potential trigger for this
behaviour is a nullified pthread mutex. The decision to take the TSX
path is based on a field in that struct, and 0 means transactional. Guess
I need a better reproduction case than some systemd service.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux

* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
From: Philippe Gerum @ 2015-02-16 19:35 UTC
To: Jan Kiszka, Xenomai

On 02/16/2015 08:11 PM, Jan Kiszka wrote:
> Wait - we were already one step further: a potential trigger for this
> behaviour is a nullified pthread mutex. The decision to take the TSX
> path is based on a field in that struct, and 0 means transactional. Guess
> I need a better reproduction case than some systemd service.

Which glibc release are you running?

--
Philippe.

* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
From: Jan Kiszka @ 2015-02-16 20:15 UTC
To: Philippe Gerum, Xenomai

On 2015-02-16 20:35, Philippe Gerum wrote:
> Which glibc release are you running?

2.19.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux

* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
From: Jan Kiszka @ 2015-02-17 14:49 UTC
To: Philippe Gerum, Xenomai

On 2015-02-16 21:15, Jan Kiszka wrote:
> On 2015-02-16 20:35, Philippe Gerum wrote:
>> Which glibc release are you running?
>
> 2.19.

Retested and, although I could have sworn that !CONFIG_IPIPE doesn't
cause this effect, it does on 3.14.28. And checking CPU features again,
there is both "hle" and "rtm" - in contrast to a newer 3.18 kernel where
those are gone (same distro, same hardware).

So this is not an I-pipe issue, it's something related to the kernel or
its configuration. Need to study again where the microcode comes from. I
thought it was pulled from /lib/firmware/somewhere, thus would be shared
between all kernels on the same rootfs.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux

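The "hle"/"rtm" comparison between the two kernels can be scripted. A minimal Python sketch, assuming the usual x86 /proc/cpuinfo layout with a "flags" line per processor; the function names and the two sample strings are made up for illustration and are not output captured from the machine discussed here.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Return the feature flags of the first processor entry."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def tsx_flags(cpuinfo_text: str) -> dict:
    """Report whether the TSX-related flags 'hle' and 'rtm' are advertised."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in ("hle", "rtm")}


# Illustrative samples only (shortened flag lists, not real captures).
sample_with_tsx = "processor\t: 0\nflags\t\t: fpu vme avx2 hle rtm\n"
sample_without_tsx = "processor\t: 0\nflags\t\t: fpu vme avx2\n"

print(tsx_flags(sample_with_tsx))     # {'hle': True, 'rtm': True}
print(tsx_flags(sample_without_tsx))  # {'hle': False, 'rtm': False}
```

On a live system the same function could be fed the contents of /proc/cpuinfo, once per booted kernel, to compare what each one advertises.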
* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
From: Jan Kiszka @ 2015-02-17 17:27 UTC
To: Philippe Gerum, Xenomai

On 2015-02-17 15:49, Jan Kiszka wrote:
> Retested and, although I could have sworn that !CONFIG_IPIPE doesn't
> cause this effect, it does on 3.14.28. And checking CPU features again,
> there is both "hle" and "rtm" - in contrast to a newer 3.18 kernel where
> those are gone (same distro, same hardware).
>
> So this is not an I-pipe issue, it's something related to the kernel or
> its configuration. Need to study again where the microcode comes from. I
> thought it was pulled from /lib/firmware/somewhere, thus would be shared
> between all kernels on the same rootfs.

Just to close the topic: We had early microcode loading disabled in the
kernel, likely due to some historic reasons (there was a crash once).
That caused systemd to load libpthread before the microcode fix that
disables RTM was loaded. Once the microcode came in, the RTM
instructions suddenly start to cause #UD. Nice.

Not sure yet if this explains the issues in the customer application -
we will see later.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux

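The ordering problem described in the closing message can be modelled with a toy Python sketch. It is not glibc or kernel code: the classes and names are hypothetical, and it assumes only what the message states, namely that the library decides about RTM when it is loaded and that a later microcode update withdraws the feature.

```python
class Cpu:
    """Toy CPU whose RTM support can be withdrawn by a microcode update."""

    def __init__(self):
        self.rtm = True

    def load_microcode_fix(self):
        self.rtm = False

    def xbegin(self):
        if not self.rtm:
            raise RuntimeError("#UD: invalid opcode")
        return "transaction started"


class Library:
    """Toy library that probes for RTM once, at load time."""

    def __init__(self, cpu):
        self.cpu = cpu
        self.use_rtm = cpu.rtm  # cached; never re-checked afterwards

    def lock(self):
        return self.cpu.xbegin() if self.use_rtm else "plain lock"


def run(early_microcode: bool) -> str:
    cpu = Cpu()
    if early_microcode:
        cpu.load_microcode_fix()  # fix applied before any userspace runs
    lib = Library(cpu)            # e.g. systemd loading libpthread
    if not early_microcode:
        cpu.load_microcode_fix()  # late update, after the probe
    try:
        return lib.lock()
    except RuntimeError as exc:
        return str(exc)


print(run(early_microcode=True))   # plain lock
print(run(early_microcode=False))  # #UD: invalid opcode
```

With early loading the probe already sees the fixed feature set; with late loading the cached decision goes stale and the first transactional lock attempt faults.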