From: Andreas Dilger <adilger@whamcloud.com>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] Hangs with cgroup memory controller
Date: Wed, 27 Jul 2011 13:16:28 -0600	[thread overview]
Message-ID: <5DBD4462-2AAA-4657-9EBB-9633336DD972@whamcloud.com> (raw)
In-Reply-To: <alpine.LFD.2.01.1107271922560.7176@sys880.ldn.framestore.com>

On 2011-07-27, at 12:57 PM, Mark Hills wrote:
> On Wed, 27 Jul 2011, Andreas Dilger wrote:
> [...] 
>> Possibly you can correlate reproducer cases with Lustre errors on the 
>> console?
> 
> I've managed to catch the bad state, on a clean client too -- there are no
> errors reported from Lustre in dmesg.
> 
> Here's the information reported by the cgroup. It seems that there's a
> discrepancy of two pages (visible in the 'cache' field and in pgpgin vs.
> pgpgout).

To dump the Lustre pagecache pages, use "lctl get_param llite.*.dump_page_cache",
which prints the inode, page index, read/write access, and page flags for each
cached page.

It wouldn't hurt to dump the kernel debug log, but it is unlikely to hold
anything useful.
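Roughly, the two captures above might look like this (untested sketch; assumes
root on the affected client and a single mounted Lustre filesystem -- adjust
the output paths and the "llite.*" glob to taste):

```shell
# Dump the pages Lustre still holds in the client page cache.
lctl get_param llite.*.dump_page_cache > /tmp/dump_page_cache.txt

# Dump (and clear) the kernel debug buffer, in case anything is in there.
lctl dk /tmp/lustre-debug.log
```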

> The process which was in the group terminated a long time ago.
> 
> I can leave the machine in this state until tomorrow, so any suggestions 
> for data to capture that could help trace this bug would be welcomed. 
> Thanks.
> 
> # cd /cgroup/p25321
> 
> # echo 1 > memory.force_empty
> <hangs: the bug>
> 
> # cat tasks
> <none>
> 
> # cat memory.max_usage_in_bytes 
> 1281351680
> 
> # cat memory.usage_in_bytes 
> 8192
> 
> # cat memory.stat 
> cache 8192                   <--- two pages
> rss 0
> mapped_file 0
> pgpgin 396369                <--- two pages higher than pgpgout
> pgpgout 396367
> swap 0
> inactive_anon 0
> active_anon 0
> inactive_file 0
> active_file 0
> unevictable 0
> hierarchical_memory_limit 8388608000
> hierarchical_memsw_limit 10485760000
> total_cache 8192
> total_rss 0
> total_mapped_file 0
> total_pgpgin 396369
> total_pgpgout 396367
> total_swap 0
> total_inactive_anon 0
> total_active_anon 0
> total_inactive_file 0
> total_active_file 0
> total_unevictable 0
> 
> # echo 1 > /proc/sys/vm/drop_caches
> <success>
> 
> # echo 2 > /proc/sys/vm/drop_caches
> <success>
> 
> # cat memory.stat
> <same as above>
> 
> -- 
> Mark
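
For what it's worth, the arithmetic in your memory.stat output is internally
consistent: the leftover 'cache' bytes and the pgpgin/pgpgout gap both point
at the same two pages. A quick sketch of that check (assuming 4 KiB pages,
which matches x86_64; field names taken verbatim from your memory.stat):

```python
# Sanity-check the cgroup counters quoted above.
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86_64

stat_text = """\
cache 8192
pgpgin 396369
pgpgout 396367
"""

# memory.stat is "name value" per line; parse into a dict of ints.
stat = {k: int(v) for k, v in (line.split() for line in stat_text.splitlines())}

stuck_pages = stat["cache"] // PAGE_SIZE
charge_gap = stat["pgpgin"] - stat["pgpgout"]

print(stuck_pages)  # 2 pages still charged to the cgroup as cache
print(charge_gap)   # 2 charges (pgpgin) that were never uncharged (pgpgout)
```

So whatever is pinning those pages, it is exactly two page charges that were
never given back to the controller.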


Cheers, Andreas
--
Andreas Dilger 
Principal Engineer
Whamcloud, Inc.


Thread overview: 11+ messages
2011-07-27 16:21 [Lustre-devel] Hangs with cgroup memory controller Mark Hills
2011-07-27 17:11 ` Andreas Dilger
2011-07-27 17:33   ` Mark Hills
2011-07-27 18:57   ` Mark Hills
2011-07-27 19:16     ` Andreas Dilger [this message]
2011-07-28 13:53       ` Mark Hills
2011-07-28 17:10         ` Andreas Dilger
2011-07-29 14:39           ` Mark Hills
2011-08-04 17:24             ` [Lustre-devel] Bad page state after unlink (was Re: Hangs with cgroup memory controller) Mark Hills
2011-07-29  7:15     ` [Lustre-devel] Hangs with cgroup memory controller Robin Humble
2011-07-29 16:42       ` Mark Hills
