* weird issue with CentOS RT kernel, hangs hard under particular load.
@ 2019-06-28 19:35 Chris Friesen
  2019-07-03 11:21 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 3+ messages in thread
From: Chris Friesen @ 2019-06-28 19:35 UTC (permalink / raw)
  To: linux-rt-users

Hi all,

Not sure who I should be talking to about this, but we're hitting a 
weird issue with kernel-rt-3.10.0-957.12.2.rt56.929.el7.src.rpm from CentOS.

Basically, while running containerized OpenStack on top of Kubernetes it
just hangs after a while, with nothing useful in the logs, and so hard
that sysrq on the serial console doesn't work (and it does normally).

The equivalent non-RT kernel seems to work fine and does not show the
same problem.

Also, reducing the number of OpenStack services running seems to 
increase stability even though the CPUs aren't super busy in the failure 
scenario.

Anyone got any suggestions on where to start debugging it?  Kernel 
options to enable, extra debug logging, events to monitor?

Thanks,
Chris


* Re: weird issue with CentOS RT kernel, hangs hard under particular load.
  2019-06-28 19:35 weird issue with CentOS RT kernel, hangs hard under particular load Chris Friesen
@ 2019-07-03 11:21 ` Sebastian Andrzej Siewior
  2019-07-04 17:05   ` Chris Friesen
  0 siblings, 1 reply; 3+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-07-03 11:21 UTC (permalink / raw)
  To: Chris Friesen; +Cc: linux-rt-users

On 2019-06-28 13:35:51 [-0600], Chris Friesen wrote:
> Hi all,
Hi,

> Not sure who I should be talking to about this, but we're hitting a weird
> issue with kernel-rt-3.10.0-957.12.2.rt56.929.el7.src.rpm from CentOS.

3.10-rt56 is terribly old and unsupported [0].

> Anyone got any suggestions on where to start debugging it?  Kernel options
> to enable, extra debug logging, events to monitor?

You could try to enable MM debugging, lock debugging and so on.

[0] https://wiki.linuxfoundation.org/realtime/preempt_rt_versions

> Thanks,
> Chris

Sebastian
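
One way to act on the "MM debugging, lock debugging" suggestion is to check which debug options a given kernel build already has. The sketch below is illustrative only: the option list is an assumption (these symbols exist in mainline Kconfig, but their availability in a particular distribution kernel should be verified), and the helper names `parse_kernel_config` and `missing_debug_options` are hypothetical.

```python
import re
import sys

# Assumed set of MM/lock debugging options; verify each one against the
# Kconfig of the kernel actually being built.
DEBUG_OPTIONS = [
    "CONFIG_DEBUG_VM",         # MM debugging
    "CONFIG_PROVE_LOCKING",    # lockdep correctness checking
    "CONFIG_DEBUG_SPINLOCK",
    "CONFIG_DEBUG_MUTEXES",
    "CONFIG_LOCKUP_DETECTOR",  # soft/hard lockup watchdog
]


def parse_kernel_config(text):
    """Return {option: value} for every 'CONFIG_X=value' line.

    Lines such as '# CONFIG_X is not set' start with '#' and are skipped.
    """
    options = {}
    for line in text.splitlines():
        match = re.match(r"^(CONFIG_[A-Za-z0-9_]+)=(.*)$", line.strip())
        if match:
            options[match.group(1)] = match.group(2)
    return options


def missing_debug_options(text, wanted=DEBUG_OPTIONS):
    """List the wanted options that are neither built in (y) nor modular (m)."""
    options = parse_kernel_config(text)
    return [name for name in wanted if options.get(name) not in ("y", "m")]


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: check_config.py <path-to-kernel-config>")
    with open(sys.argv[1]) as handle:
        for name in missing_debug_options(handle.read()):
            print("not enabled:", name)
```

Running it against the config file shipped with the installed kernel (commonly under /boot) prints the options that would still need to be turned on in a rebuild.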


* Re: weird issue with CentOS RT kernel, hangs hard under particular load.
  2019-07-03 11:21 ` Sebastian Andrzej Siewior
@ 2019-07-04 17:05   ` Chris Friesen
  0 siblings, 0 replies; 3+ messages in thread
From: Chris Friesen @ 2019-07-04 17:05 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: linux-rt-users

On 7/3/2019 5:21 AM, Sebastian Andrzej Siewior wrote:
> On 2019-06-28 13:35:51 [-0600], Chris Friesen wrote:
>> Hi all,
> Hi,
> 
>> Not sure who I should be talking to about this, but we're hitting a weird
>> issue with kernel-rt-3.10.0-957.12.2.rt56.929.el7.src.rpm from CentOS.
> 
> 3.10-rt56 is terribly old and unsupported [0].

Unfortunately I'm stuck with CentOS kernels in the near term.

>> Anyone got any suggestions on where to start debugging it?  Kernel options
>> to enable, extra debug logging, events to monitor?
> 
> You could try to enable MM debugging, lock debugging and so on.

Thanks...I've enabled CONFIG_PROVE_LOCKING, hard/soft watchdog, and 
netconsole.  Hopefully I'll get something. :)

Chris
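
Since the machine hangs too hard for local logs, netconsole has to be pointed at a second host. As a rough sketch, the helper below assembles the `netconsole=` parameter string in the layout described in the kernel's netconsole documentation, `[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]`. The function name `netconsole_param` and the addresses in the usage example are illustrative assumptions, not values from this thread.

```python
def netconsole_param(tgt_ip, tgt_port=None, tgt_mac=None,
                     src_ip=None, src_port=None, dev=None):
    """Build a 'netconsole=' boot/module parameter string.

    Only the target IP is mandatory; every omitted field is left empty so
    the kernel falls back to its own default for that field.
    """
    def field(value):
        return "" if value is None else str(value)

    return "netconsole={}@{}/{},{}@{}/{}".format(
        field(src_port), field(src_ip), field(dev),
        field(tgt_port), tgt_ip, field(tgt_mac),
    )


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    print(netconsole_param("10.0.0.2", tgt_port=9353,
                           tgt_mac="12:34:56:78:9a:bc",
                           src_ip="10.0.0.1", src_port=4444, dev="eth1"))
```

The resulting string can be appended to the kernel command line or passed as a module option when loading netconsole; the receiving host then only needs something listening for UDP on the chosen target port.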

