bpf.vger.kernel.org archive mirror
From: Michal Hocko <mhocko@kernel.org>
To: Martynas Pumputis <m@lambda.lt>
Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net
Subject: Re: [PATCH] bpf: Try harder when allocating memory for maps
Date: Sun, 10 Mar 2019 08:13:18 +0100
Message-ID: <20190310071318.GW5232@dhcp22.suse.cz>
In-Reply-To: <2cbb3fe7-39a9-1b1e-e22f-4dffa20b9414@lambda.lt>

On Fri 08-03-19 21:02:41, Martynas Pumputis wrote:
> 
> 
> On 3/8/19 12:20 PM, Michal Hocko wrote:
> > On Fri 08-03-19 12:14:16, Martynas Pumputis wrote:
> > > 
> > > 
> > > On 3/8/19 9:44 AM, Michal Hocko wrote:
> > > > On Fri 08-03-19 09:08:57, Martynas Pumputis wrote:
> > > > > It has been observed that sometimes memory allocation for BPF maps
> > > > > fails when there is no obvious memory pressure in a system.
> > > > > 
> > > > > E.g. the map (BPF_MAP_TYPE_LRU_HASH, key=38, value=56, max_elems=524288)
> > > > > could not be created due to vmalloc being unable to allocate 75497472B,
> > > > > when the system's memory consumption (in MB) was the following:
> > > > > 
> > > > >       Total: 3942 Used: 837 (21.24%) Free: 138 Buffers: 239 Cached: 2727
> > > > 
> > > > Hmm, 75MB is quite large and much larger than the slab/page allocator
> > > > can provide, so this is not really a fragmentation issue. Vmalloc does
> > > > respect NORETRY, but considering that there shouldn't be much memory
> > > > pressure I wonder how NORETRY managed to fail the allocation. Do you
> > > > happen to have the allocation failure report?
> > > 
> > > I got /proc/{meminfo,vmstat,vmallocinfo} just after the allocation has
> > > failed:
> > > https://gist.github.com/brb/62092c1d83daa6527271b88f0352e32d
> > 
> > dmesg with the allocation failure report would be more helpful
> 
> https://gist.github.com/brb/2d7ac323d2e14cb7a38bacba301fe3af

Thanks!

tc: vmalloc: allocation failure, allocated 15609856 of 62918656 bytes, mode:0x6090c0(GFP_KERNEL|__GFP_NORETRY|__GFP_ZERO), nodemask=(null),cpuset=b389e318420d891300ad9658f8e056b59972fda9547dd566245a922c34bb9e42,mems_allowed=0
[...]
Node 0 DMA free:15728kB min:268kB low:332kB high:396kB active_anon:0kB inactive_anon:0kB active_file:44kB inactive_file:12kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
lowmem_reserve[]: 0 3419 3866 3866 3866
Node 0 DMA32 free:105004kB min:104588kB low:119468kB high:134348kB active_anon:526128kB inactive_anon:612kB active_file:862524kB inactive_file:1552884kB unevictable:0kB writepending:0kB present:3653568kB managed:3563596kB mlocked:0kB kernel_stack:7592kB pagetables:6636kB bounce:0kB free_pcp:916kB local_pcp:736kB free_cma:0kB
lowmem_reserve[]: 0 0 446 446 446
Node 0 Normal free:22844kB min:24160kB low:26104kB high:28048kB active_anon:92340kB inactive_anon:228kB active_file:160072kB inactive_file:82480kB unevictable:0kB writepending:0kB present:524288kB managed:457544kB mlocked:0kB kernel_stack:2224kB pagetables:3776kB bounce:0kB free_pcp:996kB local_pcp:672kB free_cma:0kB
lowmem_reserve[]: 0 0 0 0 0

Except for a strange cpuset value (which should be checked separately),
the allocation is restricted to node 0, which is pretty much out of
memory (below the min watermark once lowmem_reserve is accounted for).
There is still a lot of page cache to reclaim, so further reclaim is
quite likely to make progress. There is still 45MB to go and the page
cache is at least 1.5G, so there is some buffer to allocate from.
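
The arithmetic behind that reading of the report can be sketched as below. This is a simplified model, not the kernel's watermark code: it assumes 4 kB pages, assumes that lowmem_reserve[] is reported in pages and that the third entry (the Normal zone) is the one that applies to a GFP_KERNEL request, and ignores any order or alloc-flag adjustments. The struct and function names are made up for the illustration; the numbers are copied from the report above.

```c
#include <stddef.h>

#define PAGE_KB 4L /* assumption: 4 kB pages */

/* One "Node 0 <zone>" line of the report, reduced to the fields used here. */
struct zone_report {
    const char *name;
    long free_kb;          /* free: */
    long min_kb;           /* min:  */
    long inactive_file_kb; /* inactive_file: */
    long reserve_pages;    /* lowmem_reserve[] entry assumed to apply */
};

const struct zone_report zones[] = {
    { "DMA",    15728,  268,    12,      3866 },
    { "DMA32",  105004, 104588, 1552884, 446  },
    { "Normal", 22844,  24160,  82480,   0    },
};
const size_t nr_zones = sizeof(zones) / sizeof(zones[0]);

/* Free memory left once the lowmem reserve has been set aside. */
long usable_kb(const struct zone_report *z)
{
    return z->free_kb - z->reserve_pages * PAGE_KB;
}

/* Simplified check: the zone cannot serve the request without reclaim. */
int below_min(const struct zone_report *z)
{
    return usable_kb(z) <= z->min_kb;
}

/* Bytes vmalloc still had to allocate when it gave up. */
long remaining_bytes(void)
{
    return 62918656L - 15609856L;
}

/* Inactive file pages summed over all zones, in kB. */
long inactive_file_total_kb(void)
{
    long sum = 0;
    for (size_t i = 0; i < nr_zones; i++)
        sum += zones[i].inactive_file_kb;
    return sum;
}
```

Under these assumptions all three zones come out below min (264 kB, 103220 kB and 22844 kB usable against min values of 268 kB, 104588 kB and 24160 kB), 47308800 bytes (about 45MB) were still outstanding, and inactive file pages alone add up to 1635376 kB.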

That being said, __GFP_NORETRY did indeed cause a premature failure. Using
kvmalloc(GFP_KERNEL|__GFP_RETRY_MAYFAIL) would likely help here unless
the page cache is really hard to reclaim. Please note that this also
implies that requests which can be satisfied from the slab allocator will
retry harder as well. I am not sure this is desirable for those requests,
but your original patch does the same. If you want the
__GFP_RETRY_MAYFAIL behavior only for the vmalloc path, you would
need an open-coded version which adds the flag just to the
vmalloc fallback path.
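
The shape of such an open-coded version can be modeled in userspace as below. This is only a sketch of the control flow, not kernel code: the flag bits, the size cutoff and every model_* name are hypothetical, and calloc() stands in for both allocators. The point is that the cheap attempt keeps the NORETRY semantics while only the fallback gets RETRY_MAYFAIL.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag bits for this model only; not the kernel's values. */
#define MODEL_NORETRY       0x1u
#define MODEL_RETRY_MAYFAIL 0x2u

/* Hypothetical cutoff below which the slab-like path is tried first. */
#define MODEL_SLAB_LIMIT (32u * 1024u)

/* Record the flags each path was called with so the behavior is visible. */
unsigned int last_slab_flags;
unsigned int last_vmalloc_flags;

/* Stand-in for the slab allocator path. */
void *model_slab_alloc(size_t size, unsigned int flags)
{
    last_slab_flags = flags;
    return calloc(1, size);
}

/* Stand-in for the vmalloc path. */
void *model_vmalloc(size_t size, unsigned int flags)
{
    last_vmalloc_flags = flags;
    return calloc(1, size);
}

/*
 * Open-coded allocation: small requests try the slab-like path and fail
 * fast; anything larger, or a failed small request, falls back to the
 * vmalloc-like path, which is the only one asked to retry harder.
 */
void *model_map_area_alloc(size_t size)
{
    if (size <= MODEL_SLAB_LIMIT) {
        void *area = model_slab_alloc(size, MODEL_NORETRY);
        if (area)
            return area;
    }
    return model_vmalloc(size, MODEL_RETRY_MAYFAIL);
}
```

In this model a 64-byte request never reaches the fallback, while a request of the size from the failure report (62918656 bytes) goes straight to it with the retry flag set.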
-- 
Michal Hocko
SUSE Labs

Thread overview: 11+ messages
2019-03-08  8:08 [PATCH] bpf: Try harder when allocating memory for maps Martynas Pumputis
2019-03-08  8:44 ` Michal Hocko
2019-03-08 10:33   ` Daniel Borkmann
2019-03-08 10:55     ` Michal Hocko
2019-03-08 11:30       ` Daniel Borkmann
2019-03-08 12:00         ` Michal Hocko
2019-03-08 11:14   ` Martynas Pumputis
2019-03-08 11:20     ` Michal Hocko
2019-03-08 20:02       ` Martynas Pumputis
2019-03-10  7:13         ` Michal Hocko [this message]
2019-03-11 19:33           ` Martynas Pumputis
