Georgi Nikolov
System Administrator
www.icdsoft.com

On 07/26/2018 11:48 AM, Vlastimil Babka wrote:
On 07/26/2018 10:31 AM, Vlastimil Babka wrote:
On 07/26/2018 10:03 AM, Michal Hocko wrote:
On Thu 26-07-18 09:50:45, Vlastimil Babka wrote:
On 07/26/2018 09:42 AM, Michal Hocko wrote:
On Thu 26-07-18 09:34:58, Vlastimil Babka wrote:
On 07/26/2018 09:26 AM, Michal Hocko wrote:
On Thu 26-07-18 09:18:57, Vlastimil Babka wrote:
On 07/25/2018 09:52 PM, Andrew Morton wrote:

This is likely the kvmalloc() in xt_alloc_table_info(). Between 4.13 and
4.17 it shouldn't use __GFP_NORETRY, but looks like commit 0537250fdc6c
("netfilter: x_tables: make allocation less aggressive") was backported
to 4.14. Removing __GFP_NORETRY might help here, but would bring back other
issues. Less than 4MB is not that much though, maybe find some "sane"
limit and use __GFP_NORETRY only above that?
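
The "sane limit" idea above could be sketched roughly as follows. This is a userspace model only, not kernel code: the SKETCH_* flag values are stand-ins rather than the kernel's real GFP bits, and the 4 MiB cutoff is a hypothetical number picked for illustration, since the thread does not propose a concrete value.

```c
#include <stddef.h>

/* Stand-in flag bits for illustration only; NOT the kernel's real GFP values. */
#define SKETCH_GFP_KERNEL   0x1u
#define SKETCH_GFP_NORETRY  0x2u

/* Hypothetical cutoff; the thread does not name a concrete limit. */
#define SKETCH_SANE_LIMIT   ((size_t)4 << 20)   /* 4 MiB */

/*
 * Choose allocation flags for a table of sz bytes: allocations up to the
 * cutoff are allowed to retry, larger ones give up early (NORETRY).
 */
static unsigned int sketch_table_gfp(size_t sz)
{
    unsigned int gfp = SKETCH_GFP_KERNEL;

    if (sz > SKETCH_SANE_LIMIT)
        gfp |= SKETCH_GFP_NORETRY;

    return gfp;
}
```

In this model a request at or below the cutoff gets only the SKETCH_GFP_KERNEL bit, and anything above it also gets SKETCH_GFP_NORETRY. Whether such a cutoff would be acceptable, and where it should sit, is exactly the open question in the paragraph above.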
I have seen the same report via http://lkml.kernel.org/r/df6f501c-8546-1f55-40b1-7e3a8f54d872@icdsoft.com
and the reporter confirmed that kvmalloc is not the real culprit
http://lkml.kernel.org/r/d99a9598-808a-6968-4131-c3949b752004@icdsoft.com
Hmm but that was a revert of eacd86ca3b03 ("net/netfilter/x_tables.c: use
kvmalloc() in xt_alloc_table_info()") which was the 4.13 commit that
removed __GFP_NORETRY (there's no __GFP_NORETRY under net/netfilter in
v4.14). I assume it was reverted on top of vanilla v4.14, as there would
be a conflict on stable with the 0537250fdc6c backport. So what should be
tested to be sure is either vanilla v4.14 without stable backports, or
the latest v4.14.y with a revert of 0537250fdc6c.
But 0537250fdc6c simply restored the previous NORETRY behavior from
before eacd86ca3b03. So whatever causes these issues doesn't seem to be
directly related to the kvmalloc change. Or am I missing what you are
saying?
I'm saying that although it's not a regression, as you say (the
vmalloc() there was called without __GFP_NORETRY for only a few kernel
versions), it's still possible that removing __GFP_NORETRY will fix
the issue and thus we will rule out other possibilities.
http://lkml.kernel.org/r/d99a9598-808a-6968-4131-c3949b752004@icdsoft.com
claims that reverting eacd86ca3b03 didn't really help.
Ah, I see, that mail thread references a different kernel bugzilla
#200639 which doesn't mention 4.14, but outright blames commit
eacd86ca3b03. Yet the alloc fail message contains __GFP_NORETRY, so I
still suspect the kernel also had the 0537250fdc6c backport. Georgi, can you
please clarify which exact kernel version had the alloc failures, and
how exactly you tested the revert (which version was the baseline for
the revert)? Thanks.

Of course not. eacd86ca3b03 *removed* __GFP_NORETRY, so the revert
reintroduced it. I tried to explain it in the quoted part above starting
with "Hmm but that was revert of eacd86ca3b03 ...". What I'm saying is
that eacd86ca3b03 might have actually *fixed* (or rather prevented) this
alloc failure, if it were not for 0537250fdc6c and its 4.14 stable
backport (the kernel bugzilla report says 4.14, I'm assuming new enough
stable to contain 0537250fdc6c as the failure message contains
__GFP_NORETRY).

The mail you reference also says "seems that old version is masking
errors", which confirms that we are indeed looking at the right
vmalloc(), because eacd86ca3b03 also removed __GFP_NOWARN there (and
thus the revert reintroduced it).



    

Hello,
The kernel that has the allocation failures is 4.14.50.
Here is the patch applied to this version which masks errors:

--- net/netfilter/x_tables.c    2018-06-18 14:18:21.138347416 +0300
+++ net/netfilter/x_tables.c    2018-07-26 11:58:01.721932962 +0300
@@ -1059,9 +1059,19 @@
      * than shoot all processes down before realizing there is nothing
      * more to reclaim.
      */
-    info = kvmalloc(sz, GFP_KERNEL | __GFP_NORETRY);
+/*    info = kvmalloc(sz, GFP_KERNEL | __GFP_NORETRY);
     if (!info)
         return NULL;
+*/
+
+    if (sz <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
+        info = kmalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY);
+    if (!info) {
+        info = __vmalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY,
+                PAGE_KERNEL);
+        if (!info)
+        return NULL;
+    }
 
     memset(info, 0, sizeof(*info));
     info->size = size;
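
To make the size check in the patch concrete, here is a small userspace calculation of the kmalloc() cutoff. It is only a sketch: the 4 KiB page size is an assumption (typical for x86), the SKETCH_* names are made up for this example, and PAGE_ALLOC_COSTLY_ORDER is taken as 3, which is its value in the kernel headers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, as on typical x86 configurations. */
#define SKETCH_PAGE_SIZE                4096UL
/* PAGE_ALLOC_COSTLY_ORDER is 3 in the kernel headers. */
#define SKETCH_PAGE_ALLOC_COSTLY_ORDER  3

/* Largest size for which the patch above attempts kmalloc() first. */
static size_t sketch_kmalloc_cutoff(void)
{
    return SKETCH_PAGE_SIZE << SKETCH_PAGE_ALLOC_COSTLY_ORDER;
}

/* Mirrors the patch's check: sz <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER). */
static bool sketch_tries_kmalloc_first(size_t sz)
{
    return sz <= sketch_kmalloc_cutoff();
}
```

Under these assumptions the cutoff is 32 KiB, so a table of a few megabytes skips kmalloc() and goes straight to the __vmalloc() call, which still carries __GFP_NORETRY and __GFP_NOWARN.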


I will try to reproduce it with only

info = kvmalloc(sz, GFP_KERNEL);

Regards,

--
Georgi Nikolov