Possible sandybridge livelock issue

* Possible sandybridge livelock issue
@ 2011-05-13 16:12 James Bottomley
  2011-05-13 16:36 ` Andi Kleen
  2011-05-16  6:29 ` Ingo Molnar
  0 siblings, 2 replies; 7+ messages in thread
From: James Bottomley @ 2011-05-13 16:12 UTC
  To: x86; +Cc: linux-mm, linux-kernel, Mel Gorman

We've just come off a large round of debugging a kswapd problem over on
linux-mm:

http://marc.info/?t=130392066000001

The upshot was that kswapd wasn't being allowed to sleep (which we're
now fixing).  However, in spite of intensive efforts, the actual hang
was only reproducible on sandybridge laptops.

When the hang occurred, kswapd basically pegged one core at 100% system
time.  This looks like there's something specific to sandybridge that
causes this type of bad interaction.  I was wondering if it could be
something to do with a scheduling problem in turbo mode?  Once kswapd
goes flat out, the core it's on will kick into turbo mode, which causes
it to get preferentially scheduled there, leading to the livelock.

The only evidence I have to support this theory is that when I reproduce
the problem with PREEMPT, the core pegs at 100% system time and stays
there even if I turn off the load.  However, if I can execute work that
causes kswapd to be kicked off the core it's running on, it will calm
back down and go to sleep.
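
For anyone who wants to try the "kick kswapd off its core" experiment, here
is a minimal sketch of one way it might be done from user space.  It is not
necessarily the method used above, and both helper names (last_cpu_of,
spin_on_cpu) are made up for illustration.  It assumes Linux, reads the
"processor" field (field 39) of /proc/<pid>/stat to find the CPU a task last
ran on, and then pins a busy loop to that CPU so the scheduler has competing
work there.  Whether this actually displaces kswapd depends on the scheduler
and on kswapd's own CPU affinity.

```python
import os
import time


def last_cpu_of(pid):
    """Return the CPU a task last ran on (field 39 of /proc/<pid>/stat)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # comm (field 2) may contain spaces, so split after the closing ')'.
    fields = data.rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state), so field 39 sits at index 36.
    return int(fields[36])


def spin_on_cpu(cpu, seconds):
    """Pin this process to `cpu`, busy-loop for `seconds`, restore affinity."""
    original = os.sched_getaffinity(0)
    os.sched_setaffinity(0, {cpu})
    try:
        deadline = time.monotonic() + seconds
        iterations = 0
        while time.monotonic() < deadline:
            iterations += 1
        return iterations
    finally:
        # Put the original affinity mask back even if interrupted.
        os.sched_setaffinity(0, original)
```

Usage would be along the lines of spin_on_cpu(last_cpu_of(kswapd_pid), 10),
where kswapd_pid is whatever "pgrep kswapd0" reports on the affected machine.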

James
