* [Xenomai] [PULL] ipipe: fixes for 3.14
@ 2015-02-16 17:35 Jan Kiszka
  2015-02-16 17:48 ` Philippe Gerum
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kiszka @ 2015-02-16 17:35 UTC (permalink / raw)
  To: Xenomai

The following changes since commit 00d8a6f2e95453f61ae97b6587dba03bc91c7e1a:

  arm/ipipe: Resolve trival merge conflicts (2015-01-14 17:00:22 +0100)

are available in the git repository at:

  git://git.xenomai.org/ipipe-jki 

for you to fetch changes up to f982e48ce99e1bc9e71eac21830adac2db265c98:

  x86/ipipe: Restore invocation of ipipe_unlock_irq from startup_ioapic_irq (2015-02-16 18:22:21 +0100)

All patches apply to 3.16 as well but weren't tested there yet. Still
analyzing strange

traps: (systemd)[1420] trap invalid opcode ip:7f4d138f612b sp:7fff4fd8f568 error:0 in libpthread-2.19.so[7f4d138e5000+18000]

on the target under 3.14...

----------------------------------------------------------------
Jan Kiszka (3):
      arm/ipipe: Fix ret_from_exception for !CONFIG_IPIPE
      ipipe: Disable PROVE_RCU when I-pipe is on
      x86/ipipe: Restore invocation of ipipe_unlock_irq from startup_ioapic_irq

 arch/arm/kernel/entry-armv.S   | 4 +++-
 arch/x86/kernel/apic/io_apic.c | 1 +
 lib/Kconfig.debug              | 2 +-
 3 files changed, 5 insertions(+), 2 deletions(-)



* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
  2015-02-16 17:35 [Xenomai] [PULL] ipipe: fixes for 3.14 Jan Kiszka
@ 2015-02-16 17:48 ` Philippe Gerum
  2015-02-16 17:52   ` Jan Kiszka
  0 siblings, 1 reply; 9+ messages in thread
From: Philippe Gerum @ 2015-02-16 17:48 UTC (permalink / raw)
  To: Jan Kiszka, Xenomai

On 02/16/2015 06:35 PM, Jan Kiszka wrote:
> The following changes since commit 00d8a6f2e95453f61ae97b6587dba03bc91c7e1a:
> 
>   arm/ipipe: Resolve trival merge conflicts (2015-01-14 17:00:22 +0100)
> 
> are available in the git repository at:
> 
>   git://git.xenomai.org/ipipe-jki 
> 
> for you to fetch changes up to f982e48ce99e1bc9e71eac21830adac2db265c98:
> 
>   x86/ipipe: Restore invocation of ipipe_unlock_irq from startup_ioapic_irq (2015-02-16 18:22:21 +0100)
> 
> All patches apply to 3.16 as well but weren't tested there yet. Still
> analyzing strange
> 
> traps: (systemd)[1420] trap invalid opcode ip:7f4d138f612b sp:7fff4fd8f568 error:0 in libpthread-2.19.so[7f4d138e5000+18000]
> 
> on the target under 3.14...
> 

Register trashing on the fast call path (system_call_after_gs) maybe.
Forcing CONFIG_IPIPE_LEGACY there might help ruling this out.

-- 
Philippe.



* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
  2015-02-16 17:48 ` Philippe Gerum
@ 2015-02-16 17:52   ` Jan Kiszka
  2015-02-16 18:52     ` Jan Kiszka
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kiszka @ 2015-02-16 17:52 UTC (permalink / raw)
  To: Philippe Gerum, Xenomai

On 2015-02-16 18:48, Philippe Gerum wrote:
> On 02/16/2015 06:35 PM, Jan Kiszka wrote:
>> The following changes since commit 00d8a6f2e95453f61ae97b6587dba03bc91c7e1a:
>>
>>   arm/ipipe: Resolve trival merge conflicts (2015-01-14 17:00:22 +0100)
>>
>> are available in the git repository at:
>>
>>   git://git.xenomai.org/ipipe-jki 
>>
>> for you to fetch changes up to f982e48ce99e1bc9e71eac21830adac2db265c98:
>>
>>   x86/ipipe: Restore invocation of ipipe_unlock_irq from startup_ioapic_irq (2015-02-16 18:22:21 +0100)
>>
>> All patches apply to 3.16 as well but weren't tested there yet. Still
>> analyzing strange
>>
>> traps: (systemd)[1420] trap invalid opcode ip:7f4d138f612b sp:7fff4fd8f568 error:0 in libpthread-2.19.so[7f4d138e5000+18000]
>>
>> on the target under 3.14...
>>
> 
> Register trashing on the fast call path (system_call_after_gs) maybe.
> Forcing CONFIG_IPIPE_LEGACY there might help ruling this out.

That's without any Xenomai configured. But I will check.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux



* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
  2015-02-16 17:52   ` Jan Kiszka
@ 2015-02-16 18:52     ` Jan Kiszka
  2015-02-16 19:11       ` Jan Kiszka
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kiszka @ 2015-02-16 18:52 UTC (permalink / raw)
  To: Philippe Gerum, Xenomai

On 2015-02-16 18:52, Jan Kiszka wrote:
> On 2015-02-16 18:48, Philippe Gerum wrote:
>> On 02/16/2015 06:35 PM, Jan Kiszka wrote:
>>> The following changes since commit 00d8a6f2e95453f61ae97b6587dba03bc91c7e1a:
>>>
>>>   arm/ipipe: Resolve trival merge conflicts (2015-01-14 17:00:22 +0100)
>>>
>>> are available in the git repository at:
>>>
>>>   git://git.xenomai.org/ipipe-jki 
>>>
>>> for you to fetch changes up to f982e48ce99e1bc9e71eac21830adac2db265c98:
>>>
>>>   x86/ipipe: Restore invocation of ipipe_unlock_irq from startup_ioapic_irq (2015-02-16 18:22:21 +0100)
>>>
>>> All patches apply to 3.16 as well but weren't tested there yet. Still
>>> analyzing strange
>>>
>>> traps: (systemd)[1420] trap invalid opcode ip:7f4d138f612b sp:7fff4fd8f568 error:0 in libpthread-2.19.so[7f4d138e5000+18000]
>>>
>>> on the target under 3.14...
>>>
>>
>> Register trashing on the fast call path (system_call_after_gs) maybe.
>> Forcing CONFIG_IPIPE_LEGACY there might help ruling this out.
> 
> That's without any Xenomai configured. But I will check.

This is where it traps in libpthread:

   1112b:       c7 f8 00 00 00 00       xbeginq 11131 <__lll_lock_elision+0x51>
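
As a cross-check, the offset 1112b in the disassembly can be derived from the kernel's "traps:" line by subtracting the mapping base from the faulting ip. The sketch below does that arithmetic and also decodes the branch target of the xbegin encoding shown above (opcode bytes c7 f8 followed by a 32-bit little-endian displacement relative to the next instruction). The helper names trap_offset and decode_xbegin are made up for this illustration, and the regular expression only assumes the log format seen in this thread.

```python
import re

# Matches the tail of a kernel "traps:" line as quoted in this thread.
TRAP_RE = re.compile(
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<err>[0-9a-f]+)"
    r" in (?P<obj>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def trap_offset(line):
    """Return (object name, offset of the faulting ip inside that mapping)."""
    m = TRAP_RE.search(line)
    if m is None:
        raise ValueError("not a recognised traps: line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    if not base <= ip < base + size:
        raise ValueError("ip lies outside the reported mapping")
    return m["obj"], ip - base


def decode_xbegin(addr, code):
    """Return the fallback target of an XBEGIN rel32 located at addr."""
    if len(code) != 6 or code[:2] != b"\xc7\xf8":
        raise ValueError("not an XBEGIN rel32 encoding")
    rel = int.from_bytes(code[2:], "little", signed=True)
    # The displacement is relative to the end of the 6-byte instruction.
    return addr + len(code) + rel


line = ("traps: (systemd)[1420] trap invalid opcode ip:7f4d138f612b "
        "sp:7fff4fd8f568 error:0 in libpthread-2.19.so[7f4d138e5000+18000]")
obj, off = trap_offset(line)
print(obj, hex(off))                                            # libpthread-2.19.so 0x1112b
print(hex(decode_xbegin(off, bytes.fromhex("c7f800000000"))))   # 0x11131
```

Both results agree with the objdump line: the trap is at 1112b and the zero displacement points at 11131, the instruction right after the xbegin.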

For some reason, glibc starts to believe it could use RTM on that
machine - although this is clearly not available (Haswell i7, thus a
buggy CPU). Need to understand which feature check goes wrong, and why
only on I-pipe enabled kernels.

The good news is that I still have a similar and unresolved report from
early customer tests. That could solve their issues as well (they
caused massive domain migrations).

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux



* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
  2015-02-16 18:52     ` Jan Kiszka
@ 2015-02-16 19:11       ` Jan Kiszka
  2015-02-16 19:35         ` Philippe Gerum
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kiszka @ 2015-02-16 19:11 UTC (permalink / raw)
  To: Philippe Gerum, Xenomai

On 2015-02-16 19:52, Jan Kiszka wrote:
> On 2015-02-16 18:52, Jan Kiszka wrote:
>> On 2015-02-16 18:48, Philippe Gerum wrote:
>>> On 02/16/2015 06:35 PM, Jan Kiszka wrote:
>>>> The following changes since commit 00d8a6f2e95453f61ae97b6587dba03bc91c7e1a:
>>>>
>>>>   arm/ipipe: Resolve trival merge conflicts (2015-01-14 17:00:22 +0100)
>>>>
>>>> are available in the git repository at:
>>>>
>>>>   git://git.xenomai.org/ipipe-jki 
>>>>
>>>> for you to fetch changes up to f982e48ce99e1bc9e71eac21830adac2db265c98:
>>>>
>>>>   x86/ipipe: Restore invocation of ipipe_unlock_irq from startup_ioapic_irq (2015-02-16 18:22:21 +0100)
>>>>
>>>> All patches apply to 3.16 as well but weren't tested there yet. Still
>>>> analyzing strange
>>>>
>>>> traps: (systemd)[1420] trap invalid opcode ip:7f4d138f612b sp:7fff4fd8f568 error:0 in libpthread-2.19.so[7f4d138e5000+18000]
>>>>
>>>> on the target under 3.14...
>>>>
>>>
>>> Register trashing on the fast call path (system_call_after_gs) maybe.
>>> Forcing CONFIG_IPIPE_LEGACY there might help ruling this out.
>>
>> That's without any Xenomai configured. But I will check.
> 
> This is where it traps in libpthread:
> 
>    1112b:       c7 f8 00 00 00 00       xbeginq 11131 <__lll_lock_elision+0x51>
> 
> For some reason, glibc starts to believe it could use RTM on that
> machine - although this is clearly not available (Haswell i7, thus a
> buggy CPU). Need to understand which feature check goes wrong, and why
> only on I-pipe enabled kernels.
> 
> The good news is that I still have a similar and unresolved report from
> early customer tests. That could solve their issues as well (they
> caused massive domain migrations).

Wait - we were already one step further: a potential trigger for this
behaviour is a nullified pthread mutex. The decision to take the TSX
path is based on a field in that struct, and 0 means transactional. Guess
I need a better reproduction case than some systemd service.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux



* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
  2015-02-16 19:11       ` Jan Kiszka
@ 2015-02-16 19:35         ` Philippe Gerum
  2015-02-16 20:15           ` Jan Kiszka
  0 siblings, 1 reply; 9+ messages in thread
From: Philippe Gerum @ 2015-02-16 19:35 UTC (permalink / raw)
  To: Jan Kiszka, Xenomai

On 02/16/2015 08:11 PM, Jan Kiszka wrote:
> On 2015-02-16 19:52, Jan Kiszka wrote:
>> On 2015-02-16 18:52, Jan Kiszka wrote:
>>> On 2015-02-16 18:48, Philippe Gerum wrote:
>>>> On 02/16/2015 06:35 PM, Jan Kiszka wrote:
>>>>> The following changes since commit 00d8a6f2e95453f61ae97b6587dba03bc91c7e1a:
>>>>>
>>>>>   arm/ipipe: Resolve trival merge conflicts (2015-01-14 17:00:22 +0100)
>>>>>
>>>>> are available in the git repository at:
>>>>>
>>>>>   git://git.xenomai.org/ipipe-jki 
>>>>>
>>>>> for you to fetch changes up to f982e48ce99e1bc9e71eac21830adac2db265c98:
>>>>>
>>>>>   x86/ipipe: Restore invocation of ipipe_unlock_irq from startup_ioapic_irq (2015-02-16 18:22:21 +0100)
>>>>>
>>>>> All patches apply to 3.16 as well but weren't tested there yet. Still
>>>>> analyzing strange
>>>>>
>>>>> traps: (systemd)[1420] trap invalid opcode ip:7f4d138f612b sp:7fff4fd8f568 error:0 in libpthread-2.19.so[7f4d138e5000+18000]
>>>>>
>>>>> on the target under 3.14...
>>>>>
>>>>
>>>> Register trashing on the fast call path (system_call_after_gs) maybe.
>>>> Forcing CONFIG_IPIPE_LEGACY there might help ruling this out.
>>>
>>> That's without any Xenomai configured. But I will check.
>>
>> This is where it traps in libpthread:
>>
>>    1112b:       c7 f8 00 00 00 00       xbeginq 11131 <__lll_lock_elision+0x51>
>>
>> For some reason, glibc starts to believe it could use RTM on that
>> machine - although this is clearly not available (Haswell i7, thus a
>> buggy CPU). Need to understand which feature check goes wrong, and why
>> only on I-pipe enabled kernels.
>>
>> The good news is that I still have a similar and unresolved report from
>> early customer tests. That could solve their issues as well (they
>> caused massive domain migrations).
> 
> Wait - we were already one step further: a potential trigger for this
> behaviour is a nullified pthread mutex. The decision to take the TSX
> path is based on a field in that struct, and 0 means transactional. Guess
> I need a better reproduction case than some systemd service.

Which glibc release are you running?

-- 
Philippe.



* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
  2015-02-16 19:35         ` Philippe Gerum
@ 2015-02-16 20:15           ` Jan Kiszka
  2015-02-17 14:49             ` Jan Kiszka
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kiszka @ 2015-02-16 20:15 UTC (permalink / raw)
  To: Philippe Gerum, Xenomai

On 2015-02-16 20:35, Philippe Gerum wrote:
> On 02/16/2015 08:11 PM, Jan Kiszka wrote:
>> On 2015-02-16 19:52, Jan Kiszka wrote:
>>> On 2015-02-16 18:52, Jan Kiszka wrote:
>>>> On 2015-02-16 18:48, Philippe Gerum wrote:
>>>>> On 02/16/2015 06:35 PM, Jan Kiszka wrote:
>>>>>> The following changes since commit 00d8a6f2e95453f61ae97b6587dba03bc91c7e1a:
>>>>>>
>>>>>>   arm/ipipe: Resolve trival merge conflicts (2015-01-14 17:00:22 +0100)
>>>>>>
>>>>>> are available in the git repository at:
>>>>>>
>>>>>>   git://git.xenomai.org/ipipe-jki 
>>>>>>
>>>>>> for you to fetch changes up to f982e48ce99e1bc9e71eac21830adac2db265c98:
>>>>>>
>>>>>>   x86/ipipe: Restore invocation of ipipe_unlock_irq from startup_ioapic_irq (2015-02-16 18:22:21 +0100)
>>>>>>
>>>>>> All patches apply to 3.16 as well but weren't tested there yet. Still
>>>>>> analyzing strange
>>>>>>
>>>>>> traps: (systemd)[1420] trap invalid opcode ip:7f4d138f612b sp:7fff4fd8f568 error:0 in libpthread-2.19.so[7f4d138e5000+18000]
>>>>>>
>>>>>> on the target under 3.14...
>>>>>>
>>>>>
>>>>> Register trashing on the fast call path (system_call_after_gs) maybe.
>>>>> Forcing CONFIG_IPIPE_LEGACY there might help ruling this out.
>>>>
>>>> That's without any Xenomai configured. But I will check.
>>>
>>> This is where it traps in libpthread:
>>>
>>>    1112b:       c7 f8 00 00 00 00       xbeginq 11131 <__lll_lock_elision+0x51>
>>>
>>> For some reason, glibc starts to believe it could use RTM on that
>>> machine - although this is clearly not available (Haswell i7, thus a
>>> buggy CPU). Need to understand which feature check goes wrong, and why
>>> only on I-pipe enabled kernels.
>>>
>>> The good news is that I still have a similar and unresolved report from
>>> early customer tests. That could solve their issues as well (they
>>> caused massive domain migrations).
>>
>> Wait - we were already one step further: a potential trigger for this
>> behaviour is a nullified pthread mutex. The decision to take the TSX
>> path is based on a field in that struct, and 0 means transactional. Guess
>> I need a better reproduction case than some systemd service.
> 
> Which glibc release are you running?

2.19.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux



* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
  2015-02-16 20:15           ` Jan Kiszka
@ 2015-02-17 14:49             ` Jan Kiszka
  2015-02-17 17:27               ` Jan Kiszka
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kiszka @ 2015-02-17 14:49 UTC (permalink / raw)
  To: Philippe Gerum, Xenomai

On 2015-02-16 21:15, Jan Kiszka wrote:
> On 2015-02-16 20:35, Philippe Gerum wrote:
>> On 02/16/2015 08:11 PM, Jan Kiszka wrote:
>>> On 2015-02-16 19:52, Jan Kiszka wrote:
>>>> On 2015-02-16 18:52, Jan Kiszka wrote:
>>>>> On 2015-02-16 18:48, Philippe Gerum wrote:
>>>>>> On 02/16/2015 06:35 PM, Jan Kiszka wrote:
>>>>>>> The following changes since commit 00d8a6f2e95453f61ae97b6587dba03bc91c7e1a:
>>>>>>>
>>>>>>>   arm/ipipe: Resolve trival merge conflicts (2015-01-14 17:00:22 +0100)
>>>>>>>
>>>>>>> are available in the git repository at:
>>>>>>>
>>>>>>>   git://git.xenomai.org/ipipe-jki 
>>>>>>>
>>>>>>> for you to fetch changes up to f982e48ce99e1bc9e71eac21830adac2db265c98:
>>>>>>>
>>>>>>>   x86/ipipe: Restore invocation of ipipe_unlock_irq from startup_ioapic_irq (2015-02-16 18:22:21 +0100)
>>>>>>>
>>>>>>> All patches apply to 3.16 as well but weren't tested there yet. Still
>>>>>>> analyzing strange
>>>>>>>
>>>>>>> traps: (systemd)[1420] trap invalid opcode ip:7f4d138f612b sp:7fff4fd8f568 error:0 in libpthread-2.19.so[7f4d138e5000+18000]
>>>>>>>
>>>>>>> on the target under 3.14...
>>>>>>>
>>>>>>
>>>>>> Register trashing on the fast call path (system_call_after_gs) maybe.
>>>>>> Forcing CONFIG_IPIPE_LEGACY there might help ruling this out.
>>>>>
>>>>> That's without any Xenomai configured. But I will check.
>>>>
>>>> This is where it traps in libpthread:
>>>>
>>>>    1112b:       c7 f8 00 00 00 00       xbeginq 11131 <__lll_lock_elision+0x51>
>>>>
>>>> For some reason, glibc starts to believe it could use RTM on that
>>>> machine - although this is clearly not available (Haswell i7, thus a
>>>> buggy CPU). Need to understand which feature check goes wrong, and why
>>>> only on I-pipe enabled kernels.
>>>>
>>>> The good news is that I still have a similar and unresolved report from
>>>> early customer tests. That could solve their issues as well (they
>>>> caused massive domain migrations).
>>>
>>> Wait - we were already one step further: a potential trigger for this
>>> behaviour is a nullified pthread mutex. The decision to take the TSX
>>> path is based on a field in that struct, and 0 means transactional. Guess
>>> I need a better reproduction case than some systemd service.
>>
>> Which glibc release are you running?
> 
> 2.19.

Retested and, although I could have sworn that !CONFIG_IPIPE doesn't
cause this effect, it does on 3.14.28. And checking CPU features again,
there is both "hle" and "rtm" - in contrast to a newer 3.18 kernel where
those are gone (same distro, same hardware).
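
For comparing kernels on the same box, a check along these lines could be run under each of them. It is only a sketch: it reads the "flags" and "microcode" fields of the first CPU entry in /proc/cpuinfo and reports whether "hle" and "rtm" are advertised. The function names cpu_features and tsx_flags are invented for this example.

```python
import os


def cpu_features(cpuinfo_text):
    """Return (set of CPU flags, microcode revision or None) for the first CPU."""
    flags, microcode = set(), None
    for raw in cpuinfo_text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # blank separator line between CPU entries
        key = key.strip()
        if key == "flags" and not flags:
            flags = set(value.split())
        elif key == "microcode" and microcode is None:
            microcode = value.strip()
    return flags, microcode


def tsx_flags(cpuinfo_text):
    """Return (sorted TSX-related flags that are present, microcode revision)."""
    flags, microcode = cpu_features(cpuinfo_text)
    return sorted(flags & {"hle", "rtm"}), microcode


if __name__ == "__main__" and os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as f:
        present, revision = tsx_flags(f.read())
    print("TSX flags advertised:", present or "none")
    print("microcode revision:", revision)
```

If the flag list differs between two kernels on identical hardware, comparing the reported microcode revision is a cheap next step before digging into the kernel configuration.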

So this is not an I-pipe issue, it's something related to the kernel or
its configuration. Need to study again where the microcode comes from. I
thought it was pulled from /lib/firmware/somewhere, thus would be shared
between all kernels on the same rootfs.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux



* Re: [Xenomai] [PULL] ipipe: fixes for 3.14
  2015-02-17 14:49             ` Jan Kiszka
@ 2015-02-17 17:27               ` Jan Kiszka
  0 siblings, 0 replies; 9+ messages in thread
From: Jan Kiszka @ 2015-02-17 17:27 UTC (permalink / raw)
  To: Philippe Gerum, Xenomai

On 2015-02-17 15:49, Jan Kiszka wrote:
> Retested and, although I could have sworn that !CONFIG_IPIPE doesn't
> cause this effect, it does on 3.14.28. And checking CPU features again,
> there is both "hle" and "rtm" - in contrast to a newer 3.18 kernel where
> those are gone (same distro, same hardware).
> 
> So this is not an I-pipe issue, it's something related to the kernel or
> its configuration. Need to study again where the microcode comes from. I
> thought it was pulled from /lib/firmware/somewhere, thus would be shared
> between all kernels on the same rootfs.

Just to close the topic: We had early microcode loading disabled in the
kernel, likely for historical reasons (there was a crash once).
That caused systemd to load libpthread before the microcode fix that
disables RTM was loaded. Once the microcode came in, the RTM
instructions suddenly started to cause #UD. Nice. Not sure yet if this
explains the issues in the customer application - we will see later.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux


