From: Michal Hocko <mhocko@kernel.org>
To: Robert Kudyba <rkudyba@fordham.edu>
Cc: linux-kernel@vger.kernel.org
Subject: Re: rsync: page allocation stalls in kernel 4.9.10 to a VessRAID NAS
Date: Wed, 1 Mar 2017 18:36:26 +0100
Message-ID: <20170301173625.GA20360@dhcp22.suse.cz>
In-Reply-To: <F77DA4E6-EF9B-427D-8FE9-9FB940A9B009@fordham.edu>

On Wed 01-03-17 10:55:33, Robert Kudyba wrote:
> 
> > On Mar 1, 2017, at 3:06 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > 
> > On Tue 28-02-17 14:32:18, Robert Kudyba wrote:
> >> 
> >>> On Feb 28, 2017, at 11:56 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > [...]
> >>>> Will do. Here’s a perf report:
> >>> 
> >>> this will not tell us much. Tracepoints have much better chance to tell
> >>> us how reclaim is progressing.
> >> 
> >> I have SystemTap configured. Are there any scripts in the
> >> SystemTap_Beginners_Guide.pdf that I can run to help? Sorry, I’m
> >> brand new to tracepoints.
> >> 
> > 
> > I am not very familiar with SystemTap. What I meant was:
> >     mount -t tracefs none /trace
> >     echo 1 > /trace/events/vmscan/enable
> 
> OK, I did this. Is there another step?

Yeah, you have to read the actual tracing data. Sorry for not being
clear enough:

cat /trace/trace_pipe > output
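
Put together, the steps so far could look roughly like the sketch below.
The function name capture_vmscan, the output file name and the 60 second
window are illustrative assumptions, not anything from this thread. It
only assumes that reading trace_pipe blocks while no events arrive,
which is why the read is bounded with timeout(1).

```shell
#!/bin/sh
# Sketch: capture vmscan tracepoint output for a bounded time window.
# Usage: capture_vmscan TRACE_DIR OUT_FILE SECONDS   (hypothetical helper)
capture_vmscan() {
    trace_dir=$1 out=$2 secs=$3

    # Mount tracefs only if the vmscan events are not visible there yet.
    if [ ! -d "$trace_dir/events/vmscan" ]; then
        mkdir -p "$trace_dir" && mount -t tracefs none "$trace_dir" || return 1
    fi

    # Enable every tracepoint in the vmscan group.
    echo 1 > "$trace_dir/events/vmscan/enable" || return 1

    # trace_pipe blocks while waiting for events, so bound the read.
    timeout "$secs" cat "$trace_dir/trace_pipe" > "$out"

    # Switch the tracepoints off again once the window is over.
    echo 0 > "$trace_dir/events/vmscan/enable"
}

# Example (as root, while the rsync is running):
#   capture_vmscan /trace vmscan-trace.out 60
```

The capture is only useful if it overlaps with a stall, so it should be
started before the rsync reaches the point where the warnings show up.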

> >> I do see these “vmscan” from this command:
> >> stap -L 'kernel.trace("*")'|sort
> >> 
> >> kernel.trace("vmscan:mm_shrink_slab_end") $shr:struct shrinker* $nid:int $shrinker_retval:int $unused_scan_cnt:long int $new_scan_cnt:long int $total_scan:long int
> >> kernel.trace("vmscan:mm_shrink_slab_start") $shr:struct shrinker* $sc:struct shrink_control* $nr_objects_to_shrink:long int $pgs_scanned:long unsigned int $lru_pgs:long unsigned int $cache_items:long unsigned int $delta:long long unsigned int $total_scan:long unsigned int
> >> kernel.trace("vmscan:mm_vmscan_direct_reclaim_begin") $order:int $may_writepage:int $gfp_flags:gfp_t $classzone_idx:int
> >> kernel.trace("vmscan:mm_vmscan_direct_reclaim_end") $nr_reclaimed:long unsigned int
> >> kernel.trace("vmscan:mm_vmscan_kswapd_sleep") $nid:int
> >> kernel.trace("vmscan:mm_vmscan_kswapd_wake") $nid:int $zid:int $order:int
> >> kernel.trace("vmscan:mm_vmscan_lru_isolate") $classzone_idx:int $order:int $nr_requested:long unsigned int $nr_scanned:long unsigned int $nr_taken:long unsigned int $isolate_mode:isolate_mode_t $file:int
> >> kernel.trace("vmscan:mm_vmscan_lru_shrink_inactive") $nid:int $nr_scanned:long unsigned int $nr_reclaimed:long unsigned int $priority:int $file:int
> >> kernel.trace("vmscan:mm_vmscan_memcg_isolate") $classzone_idx:int $order:int $nr_requested:long unsigned int $nr_scanned:long unsigned int $nr_taken:long unsigned int $isolate_mode:isolate_mode_t $file:int
> >> kernel.trace("vmscan:mm_vmscan_memcg_reclaim_begin") $order:int $may_writepage:int $gfp_flags:gfp_t $classzone_idx:int
> >> kernel.trace("vmscan:mm_vmscan_memcg_reclaim_end") $nr_reclaimed:long unsigned int
> >> kernel.trace("vmscan:mm_vmscan_memcg_softlimit_reclaim_begin") $order:int $may_writepage:int $gfp_flags:gfp_t $classzone_idx:int
> >> kernel.trace("vmscan:mm_vmscan_memcg_softlimit_reclaim_end") $nr_reclaimed:long unsigned int
> >> kernel.trace("vmscan:mm_vmscan_wakeup_kswapd") $nid:int $zid:int $order:int
> >> kernel.trace("vmscan:mm_vmscan_writepage") $page:struct page*
> > 
> > this looks like it would achieve the same
> 
> Is there anything else I can provide?

I am not familiar with filesystems and their tracepoints, which might
tell us more.

[...]
> Mar  1 06:30:59 curie kernel: kworker/u16:1: 
> Mar  1 06:30:59 curie kernel: kthreadd: page allocation stalls for 10197ms, order:1
> Mar  1 06:30:59 curie kernel: page allocation stalls for 11274ms, order:1

OK, so unlike in the previous situation, this is a higher-order request
(order:1, i.e. 2 physically contiguous pages).
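
For reference, the order in these messages is a power-of-two exponent:
an order-N request asks for 2^N physically contiguous pages. A small
sketch of the arithmetic, assuming 4kB pages (the page size is an
assumption here, it is not stated in the report) and using a made-up
helper name:

```shell
#!/bin/sh
# order_to_kb ORDER [PAGE_KB]: size of an allocation of the given order.
# PAGE_KB defaults to 4, which is an assumption about this machine.
order_to_kb() {
    order=$1
    page_kb=${2:-4}
    pages=$((1 << order))      # order N means 2^N contiguous pages
    echo "order:$order = $pages pages = $((pages * page_kb))kB contiguous"
}

order_to_kb 1    # prints: order:1 = 2 pages = 8kB contiguous
```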

> Mar  1 06:31:02 curie kernel: Normal free:130224kB min:128484kB low:160604kB high:192724kB active_anon:0kB inactive_anon:20kB active_file:308296kB inactive_file:3864kB unevictable:0kB writepending:0kB present:892920kB managed:791152kB mlocked:0kB slab_reclaimable:271984kB slab_unreclaimable:35880kB kernel_stack:1808kB pagetables:0kB bounce:0kB free_pcp:248kB local_pcp:0kB free_cma:0kB

In this case there is a lot of page cache that could be reclaimed, but
most of it sits on the active LRU list. A bug that could cause this was
fixed by b4536f0c829c ("mm, memcg: fix the active list aging for lowmem
requests when memcg is enabled"), which made it into the 4.9.5 stable
release. You have said that you are using 4.9.12, so you should already
have it. So it seems that this page cache really is being activated by
the usage pattern.
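
The numbers behind that reading can be pulled out of the quoted Normal
zone line with a few lines of shell. This is only a sketch: zone_kb is a
made-up helper, and the line below keeps just a subset of the fields
from the log, with the values copied unchanged.

```shell
#!/bin/sh
# zone_kb FIELD LINE: print the kB value of one "name:NNNkB" field.
zone_kb() {
    printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1:\([0-9]*\)kB\$/\1/p"
}

# Subset of the fields from the Normal zone line quoted above.
line='Normal free:130224kB min:128484kB low:160604kB high:192724kB active_file:308296kB inactive_file:3864kB slab_reclaimable:271984kB'

free=$(zone_kb free "$line")
min=$(zone_kb min "$line")
active=$(zone_kb active_file "$line")
inactive=$(zone_kb inactive_file "$line")

# How far free memory is above the min watermark.
echo "free above min watermark: $((free - min))kB"
# Integer ratio of active to inactive file pages.
echo "active_file / inactive_file: $((active / inactive))"
```

With these values the zone is only 1740kB above its min watermark, and
the active file list is roughly 79 times the size of the inactive one.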

The rest of the report is rather garbled, but I assume that you simply
do not have enough physically contiguous free memory in lowmem. This
surely sounds like a 32-bit-specific problem which is not reasonably
fixable.
-- 
Michal Hocko
SUSE Labs
