From: Daniel J Blueman <daniel@numascale-asia.com>
To: Hillf Danton <dhillf@gmail.com>
Cc: Jiri Slaby <jslaby@suse.cz>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Steffen Persvold <sp@numascale.com>,
	Ingo Molnar <mingo@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: kswapd craziness round 2
Date: Mon, 18 Feb 2013 23:05:56 +0800
Message-ID: <51224354.4010909@numascale-asia.com>
In-Reply-To: <CAJd=RBArPT8YowhLuE8YVGNfH7G-xXTOjSyDgdV2RsatL-9m+Q@mail.gmail.com>

On 18/02/2013 19:42, Hillf Danton wrote:
> On Mon, Feb 18, 2013 at 2:18 PM, Daniel J Blueman
> <daniel@numascale-asia.com> wrote:
>> On Monday, 18 February 2013 06:10:02 UTC+8, Jiri Slaby  wrote:
>>
>>> Hi,
>>>
>>> You still feel the sour taste of the "kswapd craziness in v3.7" thread,
>>> right? Welcome to the hell, part two :{.
>>>
>>> I believe this started happening after update from
>>> 3.8.0-rc4-next-20130125 to 3.8.0-rc7-next-20130211. The same as before,
>>> many hours of uptime are needed and perhaps some suspend/resume cycles
>>> too. Memory pressure is not high, plenty of I/O cache:
>>> # free
>>>               total       used       free     shared    buffers     cached
>>> Mem:       6026692    5571184     455508          0     351252    2016648
>>> -/+ buffers/cache:    3203284    2823408
>>> Swap:            0          0          0
>>>
>>> kswapd is working very hard, though:
>>> root       580  0.6  0.0      0     0 ?        S    úno12  46:21 [kswapd0]
>>>
>>> This happens on I/O activity right now. For example by updatedb or find
>>> /. This is what the stack trace of kswapd0 looks like:
>>> [<ffffffff8113c431>] shrink_slab+0xa1/0x2d0
>>> [<ffffffff8113ecd1>] kswapd+0x541/0x930
>>> [<ffffffff810a3000>] kthread+0xc0/0xd0
>>> [<ffffffff816beb5c>] ret_from_fork+0x7c/0xb0
>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> Likewise with 3.8-rc, I've been able to reproduce [1] a livelock scenario
>> which hoses the box and observe RCU stalls [2].
>>
>> There may be a connection; I'll do a bit more debugging in the next few
>> days.
>>
>> Daniel
>>
>> --- [1]
>>
>> 1. live-booted image using ramdisk
>> 2. boot 3.8-rc with <16GB memory and without swap
>> 3. run OpenMP NAS Parallel Benchmark dc.B against local disk (ie not
>> ramdisk)
>> 4. observe hang O(30) mins later
>>
>> --- [2]
>>
>> [ 2675.587878] INFO: rcu_sched self-detected stall on CPU { 5}  (t=24000
>> jiffies g=6313 c=6312 q=68)
>
> Does Ingo's revert help? https://lkml.org/lkml/2013/2/15/168

Close, but no cigar; I still hit this livelock on 3.8-rc7 with either
Ingo's revert or Linus's fix applied.

However, I am unable to reproduce the hang with 3.7.9, so I will begin
bisecting tomorrow, probably automating it via pexpect.
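
A rough sketch of how each bisection step could be classified for
"git bisect run", assuming the console output of the test boot has
already been captured into a string (for example by pexpect). The
classify() helper, the DONE_MARKER string and the "hang means bad" rule
are assumptions for illustration; only the RCU stall text comes from the
report quoted above. The exit codes follow the git bisect run
convention: 0 is good, 125 is skip, and other codes from 1 to 127 are bad.

```python
import re

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125

# Stall signature taken from the RCU report quoted in this thread.
STALL_RE = re.compile(r"INFO: rcu_sched self-detected stall on CPU")

# Hypothetical marker that a benchmark wrapper script would print on success.
DONE_MARKER = "BENCHMARK-DONE"


def classify(console_log: str, booted: bool = True) -> int:
    """Map one captured test run to a `git bisect run` exit code."""
    if not booted:
        # The kernel did not build or boot: this commit cannot be judged.
        return SKIP
    if STALL_RE.search(console_log):
        # An RCU stall was reported: treat the commit as bad.
        return BAD
    if DONE_MARKER in console_log:
        # The benchmark ran to completion without a stall.
        return GOOD
    # No completion marker before the capture ended: assume a hang.
    return BAD


if __name__ == "__main__":
    sample = "[ 2675.587878] INFO: rcu_sched self-detected stall on CPU { 5}"
    print(classify(sample))
```

A driver script would boot the candidate kernel, collect the console
output with a timeout comfortably above the roughly 30 minutes the hang
takes to appear, and pass the result of classify() to sys.exit().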

Thanks,
   Daniel
-- 
Daniel J Blueman
Principal Software Engineer, Numascale Asia
