All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Shakeel Butt <shakeelb@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Thomas Lindroth <thomas.lindroth@gmail.com>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Subject: Re: [PATCH] memcg, kmem: do not fail __GFP_NOFAIL charges
Date: Mon, 9 Sep 2019 13:22:45 +0200	[thread overview]
Message-ID: <20190909112245.GH27159@dhcp22.suse.cz> (raw)
In-Reply-To: <CALvZod5w72jH8fJSFRaw7wgQTnzF6nb=+St-sSXVGSiG6Bs3Lg@mail.gmail.com>

On Fri 06-09-19 11:24:55, Shakeel Butt wrote:
> On Fri, Sep 6, 2019 at 5:56 AM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > From: Michal Hocko <mhocko@suse.com>
> >
> > Thomas has noticed the following NULL ptr dereference when using cgroup
> > v1 kmem limit:
> > BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> > PGD 0
> > P4D 0
> > Oops: 0000 [#1] PREEMPT SMP PTI
> > CPU: 3 PID: 16923 Comm: gtk-update-icon Not tainted 4.19.51 #42
> > Hardware name: Gigabyte Technology Co., Ltd. Z97X-Gaming G1/Z97X-Gaming G1, BIOS F9 07/31/2015
> > RIP: 0010:create_empty_buffers+0x24/0x100
> > Code: cd 0f 1f 44 00 00 0f 1f 44 00 00 41 54 49 89 d4 ba 01 00 00 00 55 53 48 89 fb e8 97 fe ff ff 48 89 c5 48 89 c2 eb 03 48 89 ca <48> 8b 4a 08 4c 09 22 48 85 c9 75 f1 48 89 6a 08 48 8b 43 18 48 8d
> > RSP: 0018:ffff927ac1b37bf8 EFLAGS: 00010286
> > RAX: 0000000000000000 RBX: fffff2d4429fd740 RCX: 0000000100097149
> > RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff9075a99fbe00
> > RBP: 0000000000000000 R08: fffff2d440949cc8 R09: 00000000000960c0
> > R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
> > R13: ffff907601f18360 R14: 0000000000002000 R15: 0000000000001000
> > FS:  00007fb55b288bc0(0000) GS:ffff90761f8c0000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000000000008 CR3: 000000007aebc002 CR4: 00000000001606e0
> > Call Trace:
> >  create_page_buffers+0x4d/0x60
> >  __block_write_begin_int+0x8e/0x5a0
> >  ? ext4_inode_attach_jinode.part.82+0xb0/0xb0
> >  ? jbd2__journal_start+0xd7/0x1f0
> >  ext4_da_write_begin+0x112/0x3d0
> >  generic_perform_write+0xf1/0x1b0
> >  ? file_update_time+0x70/0x140
> >  __generic_file_write_iter+0x141/0x1a0
> >  ext4_file_write_iter+0xef/0x3b0
> >  __vfs_write+0x17e/0x1e0
> >  vfs_write+0xa5/0x1a0
> >  ksys_write+0x57/0xd0
> >  do_syscall_64+0x55/0x160
> >  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> >  Tetsuo then noticed that this is because the __memcg_kmem_charge_memcg
> >  fails __GFP_NOFAIL charge when the kmem limit is reached. This is a
> >  wrong behavior because nofail allocations are not allowed to fail.
> >  Normal charge path simply forces the charge even if that means to cross
> >  the limit. Kmem accounting should be doing the same.
> >
> > Reported-by: Thomas Lindroth <thomas.lindroth@gmail.com>
> > Debugged-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
> > Cc: stable
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> 
> I wonder what has changed since
> <http://lkml.kernel.org/r/20180525185501.82098-1-shakeelb@google.com/>.

I have completely forgot about that one. It seems that we have just
repeated the same discussion again. This time we have a poor user who
actually enabled the kmem limit.

I guess there was no real objection to the change back then. The primary
discussion revolved around the fact that the accounting will stay broken
even when this particular part was fixed. Considering this leads to easy
to trigger crash (with the limit enabled) then I guess we should just
make it less broken and backport to stable trees and have a serious
discussion about discontinuing of the limit. Start by simply failing to
set any limit in the current upstream kernels.

-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2019-09-09 11:22 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-01 20:43 [BUG] Early OOM and kernel NULL pointer dereference in 4.19.69 Thomas Lindroth
2019-09-02  7:16 ` Michal Hocko
2019-09-02  7:27   ` Michal Hocko
2019-09-02 19:34   ` Thomas Lindroth
2019-09-03  7:41     ` Michal Hocko
2019-09-03 12:01       ` Thomas Lindroth
2019-09-03 12:05       ` Andrey Ryabinin
2019-09-03 12:22         ` Michal Hocko
2019-09-03 18:20           ` Thomas Lindroth
2019-09-03 19:36             ` Michal Hocko
     [not found] ` <666dbcde-1b8a-9e2d-7d1f-48a117c78ae1@I-love.SAKURA.ne.jp>
2019-09-03 18:25   ` Thomas Lindroth
     [not found]     ` <4d0eda9a-319d-1a7d-1eed-71da90902367@i-love.sakura.ne.jp>
2019-09-04 11:25       ` [BUG] kmemcg limit defeats __GFP_NOFAIL allocation Michal Hocko
     [not found]         ` <4d87d770-c110-224f-6c0c-d6fada90417d@i-love.sakura.ne.jp>
2019-09-04 11:59           ` Michal Hocko
     [not found]         ` <0056063b-46ff-0ebd-ff0d-c96a1f9ae6b1@i-love.sakura.ne.jp>
2019-09-04 14:29           ` Michal Hocko
     [not found]             ` <405ce28b-c0b4-780c-c883-42d741ec60e0@i-love.sakura.ne.jp>
2019-09-05 23:11               ` Thomas Lindroth
2019-09-06  7:27                 ` Michal Hocko
2019-09-06 10:54                   ` Andrey Ryabinin
2019-09-06 11:29                     ` Michal Hocko
2019-09-06 12:56 ` [PATCH] memcg, kmem: do not fail __GFP_NOFAIL charges Michal Hocko
2019-09-06 18:24   ` Shakeel Butt
2019-09-06 18:24     ` Shakeel Butt
2019-09-09 11:22     ` Michal Hocko [this message]
2019-09-11 12:00       ` Michal Hocko
2019-09-11 14:37         ` Andrew Morton
2019-09-11 15:16           ` Michal Hocko
2019-09-13  2:46             ` Shakeel Butt
2019-09-13  2:46               ` Shakeel Butt
2019-09-24 10:53   ` Michal Hocko
2019-09-24 23:06     ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190909112245.GH27159@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=aryabinin@virtuozzo.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=shakeelb@google.com \
    --cc=thomas.lindroth@gmail.com \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.