Date: Tue, 14 Apr 2020 21:25:58 -0700
From: Andrew Morton
To: paulfurtado91@gmail.com
Cc: bugzilla-daemon@bugzilla.kernel.org, Michal Hocko, linux-mm@kvack.org
Subject: Re: [Bug 207273] New: cgroup with 1.5GB limit and 100MB rss usage OOM-kills processes due to page cache usage after upgrading to kernel 5.4
Message-Id: <20200414212558.58eaab4de2ecf864eaa87e5d@linux-foundation.org>

(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Wed, 15 Apr 2020 01:32:12 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=207273
>
>             Bug ID: 207273
>            Summary: cgroup with 1.5GB limit and 100MB rss usage OOM-kills
>                     processes due to page cache usage after upgrading to
>                     kernel 5.4
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 5.4.20
>           Hardware: x86-64
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Page Allocator
>           Assignee: akpm@linux-foundation.org
>           Reporter: paulfurtado91@gmail.com
>         Regression: No
>
> Upon upgrading to kernel 5.4, we see constant OOM kills in database containers
> that are restoring from backups, with nearly no RSS memory usage. It appears
> all the memory is consumed by file_dirty, with applications using minimal
> memory. On kernel 4.14.146 and 4.19.75, we do not see this problem, so it
> appears to be a new regression.

Thanks.

That's an elderly kernel. Are you in a position to determine whether
contemporary kernels behave similarly?

> The full OOM log from dmesg shows:
>
> xtrabackup invoked oom-killer:
> gfp_mask=0x101c4a(GFP_NOFS|__GFP_HIGHMEM|__GFP_HARDWALL|__GFP_MOVABLE|__GFP_WRITE),
> order=0, oom_score_adj=993
> CPU: 9 PID: 50206 Comm: xtrabackup Tainted: G E
> 5.4.20-hs779.el6.x86_64 #1
> Hardware name: Amazon EC2 c5d.9xlarge/, BIOS 1.0 10/16/2017
> Call Trace:
> dump_stack+0x66/0x8b
> dump_header+0x4a/0x200
> oom_kill_process+0xd7/0x110
> out_of_memory+0x105/0x510
> mem_cgroup_out_of_memory+0xb5/0xd0
> try_charge+0x7b1/0x7f0
> mem_cgroup_try_charge+0x70/0x190
> __add_to_page_cache_locked+0x2b6/0x2f0
> ? scan_shadow_nodes+0x30/0x30
> add_to_page_cache_lru+0x4a/0xc0
> pagecache_get_page+0xf5/0x210
> grab_cache_page_write_begin+0x1f/0x40
> iomap_write_begin.constprop.33+0x1ee/0x320
> ? iomap_write_end+0x91/0x240
> iomap_write_actor+0x92/0x170
> ? iomap_dirty_actor+0x1b0/0x1b0
> iomap_apply+0xba/0x130
> ? iomap_dirty_actor+0x1b0/0x1b0
> iomap_file_buffered_write+0x62/0x90
> ? iomap_dirty_actor+0x1b0/0x1b0
> xfs_file_buffered_aio_write+0xca/0x310 [xfs]
> new_sync_write+0x11b/0x1b0
> vfs_write+0xad/0x1a0
> ksys_pwrite64+0x71/0x90
> do_syscall_64+0x4e/0x100
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f6085b181a3
> Code: 49 89 ca b8 12 00 00 00 0f 05 48 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8
> 8b f0 ff ff 48 89 04 24 49 89 ca b8 12 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8
> d1 f0 ff ff 48 89 d0 48 83 c4 08 48 3d 01
> RSP: 002b:00007ffd43632320 EFLAGS: 00000293 ORIG_RAX: 0000000000000012
> RAX: ffffffffffffffda RBX: 00007ffd43632400 RCX: 00007f6085b181a3
> RDX: 0000000000100000 RSI: 0000000004a54000 RDI: 0000000000000004
> RBP: 00007ffd43632590 R08: 0000000066e00000 R09: 00007ffd436325c0
> R10: 0000000066e00000 R11: 0000000000000293 R12: 0000000000100000
> R13: 0000000066e00000 R14: 0000000066e00000 R15: 0000000001acdd20
> memory: usage 1536000kB, limit 1536000kB, failcnt 0
> memory+swap: usage 1536000kB, limit 1536000kB, failcnt 490221
> kmem: usage 23164kB, limit 9007199254740988kB, failcnt 0
> Memory cgroup stats for
> /kubepods/burstable/pod6900693c-8b2c-4efe-ab52-26e4a6bd9e4c/83216944bb43baf32f0d43ef12c85ebaa2767b3f51846f5fa438bba00b4636d8:
> anon 72507392
> file 1474740224
> kernel_stack 774144
> slab 18673664
> sock 0
> shmem 0
> file_mapped 0
> file_dirty 1413857280
> file_writeback 60555264
> anon_thp 0
> inactive_anon 0
> active_anon 72585216
> inactive_file 368873472
> active_file 1106067456
> unevictable 0
> slab_reclaimable 11403264
> slab_unreclaimable 7270400
> pgfault 34848
> pgmajfault 0
> workingset_refault 0
> workingset_activate 0
> workingset_nodereclaim 0
> pgrefill 17089962
> pgscan 18425256
> pgsteal 602912
> pgactivate 17822046
> pgdeactivate 17089962
> pglazyfree 0
> pglazyfreed 0
> thp_fault_alloc 0
> thp_collapse_alloc 0
> Tasks state (memory values in pages):
> [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
> [  42046]   500 42046      257        1          32768        0           993 init
> [  43157]   500 43157   164204    18473         335872        0           993 vttablet
> [  50206]   500 50206   294931     8856         360448        0           993 xtrabackup
> oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=83216944bb43baf32f0d43ef12c85ebaa2767b3f51846f5fa438bba00b4636d8,mems_allowed=0,oom_memcg=/kubepods/burstable/pod6900693c-8b2c-4efe-ab52-26e4a6bd9e4c/83216944bb43baf32f0d43ef12c85ebaa2767b3f51846f5fa438bba00b4636d8,task_memcg=/kubepods/burstable/pod6900693c-8b2c-4efe-ab52-26e4a6bd9e4c/83216944bb43baf32f0d43ef12c85ebaa2767b3f51846f5fa438bba00b4636d8,task=vttablet,pid=43157,uid=500
> Memory cgroup out of memory: Killed process 43157 (vttablet) total-vm:656816kB,
> anon-rss:50572kB, file-rss:23320kB, shmem-rss:0kB, UID:500 pgtables:328kB
> oom_score_adj:993
>
> --
> You are receiving this mail because:
> You are the assignee for the bug.
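
For anyone checking the reporter's claim that nearly all of the memory is file_dirty, the counters in the quoted report can be tallied against the limit. The following is only a rough Python sketch, not something from the original report: the numbers are copied from the log above, the function and variable names are made up for illustration, and it assumes the "kB" figures in the memory: line are multiples of 1024 bytes.

```python
# Counters copied from the quoted OOM report. All names here are
# illustrative only; nothing below is a kernel or cgroup API.
LIMIT_KB = 1536000  # from "memory: usage 1536000kB, limit 1536000kB"

STATS = {
    "anon": 72507392,
    "file": 1474740224,
    "file_dirty": 1413857280,
    "file_writeback": 60555264,
    "pgscan": 18425256,
    "pgsteal": 602912,
}


def share_of_limit(nbytes, limit_kb=LIMIT_KB):
    """Return nbytes as a fraction of the limit (assumes 1 kB = 1024 bytes)."""
    return nbytes / (limit_kb * 1024)


def summarize(stats=STATS):
    """Relate anon and file usage to the cgroup limit."""
    dirty_or_wb = stats["file_dirty"] + stats["file_writeback"]
    return {
        "anon_share": share_of_limit(stats["anon"]),
        "file_share": share_of_limit(stats["file"]),
        "dirty_or_writeback_share": share_of_limit(dirty_or_wb),
        # File bytes that are neither dirty nor under writeback.
        "clean_file_bytes": stats["file"] - dirty_or_wb,
        # Pages reclaimed per page scanned.
        "reclaim_efficiency": stats["pgsteal"] / stats["pgscan"],
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        if isinstance(value, float):
            print(f"{name}: {value:.4f}")
        else:
            print(f"{name}: {value}")
```

With these inputs, dirty plus writeback pages account for roughly 94% of the limit and anon for under 5%, which is consistent with the description in the report. The low ratio of pgsteal to pgscan would also fit reclaim scanning pages it cannot yet free, though that reading is an inference and not something the log states.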