Date: Mon, 10 Jun 2019 12:39:46 +0200
From: Michal Hocko
To: Minchan Kim
Cc: Andrew Morton, linux-mm, LKML, stable@kernel.org, Wu Fangsuo, Pankaj Suryawanshi
Subject: Re: [PATCH] mm: fix trying to reclaim unevicable LRU page
Message-ID: <20190610103946.GE30967@dhcp22.suse.cz>
References: <20190524071114.74202-1-minchan@kernel.org> <20190528151407.GE1658@dhcp22.suse.cz>
 <20190530024229.GF229459@google.com> <20190604122806.GH4669@dhcp22.suse.cz> <20190610094222.GA55602@google.com>
In-Reply-To: <20190610094222.GA55602@google.com>

On Mon 10-06-19 18:42:22, Minchan Kim wrote:
> On Tue, Jun 04, 2019 at 02:28:06PM +0200, Michal Hocko wrote:
> > On Thu 30-05-19 11:42:29, Minchan Kim wrote:
> > > On Tue, May 28, 2019 at 05:14:07PM +0200, Michal Hocko wrote:
> > > > [Cc Pankaj Suryawanshi who has reported a similar problem
> > > > http://lkml.kernel.org/r/SG2PR02MB309806967AE91179CAFEC34BE84B0@SG2PR02MB3098.apcprd02.prod.outlook.com]
> > > >
> > > > On Fri 24-05-19 16:11:14, Minchan Kim wrote:
> > > > > There was below bugreport from Wu Fangsuo.
> > > > >
> > > > > 7200 [ 680.491097] c4 7125 (syz-executor) page:ffffffbf02f33b40 count:86 mapcount:84 mapping:ffffffc08fa7a810 index:0x24
> > > > > 7201 [ 680.531186] c4 7125 (syz-executor) flags: 0x19040c(referenced|uptodate|arch_1|mappedtodisk|unevictable|mlocked)
> > > > > 7202 [ 680.544987] c0 7125 (syz-executor) raw: 000000000019040c ffffffc08fa7a810 0000000000000024 0000005600000053
> > > > > 7203 [ 680.556162] c0 7125 (syz-executor) raw: ffffffc009b05b20 ffffffc009b05b20 0000000000000000 ffffffc09bf3ee80
> > > > > 7204 [ 680.566860] c0 7125 (syz-executor) page dumped because: VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page))
> > > > > 7205 [ 680.578038] c0 7125 (syz-executor) page->mem_cgroup:ffffffc09bf3ee80
> > > > > 7206 [ 680.585467] c0 7125 (syz-executor) ------------[ cut here ]------------
> > > > > 7207 [ 680.592466] c0 7125 (syz-executor) kernel BUG at /home/build/farmland/adroid9.0/kernel/linux/mm/vmscan.c:1350!
> > > > > 7223 [ 680.603663] c0 7125 (syz-executor) Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> > > > > 7224 [ 680.611436] c0 7125 (syz-executor) Modules linked in:
> > > > > 7225 [ 680.616769] c0 7125 (syz-executor) CPU: 0 PID: 7125 Comm: syz-executor Tainted: G S 4.14.81 #3
> > > > > 7226 [ 680.626826] c0 7125 (syz-executor) Hardware name: ASR AQUILAC EVB (DT)
> > > > > 7227 [ 680.633623] c0 7125 (syz-executor) task: ffffffc00a54cd00 task.stack: ffffffc009b00000
> > > > > 7228 [ 680.641917] c0 7125 (syz-executor) PC is at shrink_page_list+0x1998/0x3240
> > > > > 7229 [ 680.649144] c0 7125 (syz-executor) LR is at shrink_page_list+0x1998/0x3240
> > > > > 7230 [ 680.656303] c0 7125 (syz-executor) pc : [] lr : [] pstate: 60400045
> > > > > 7231 [ 680.666086] c0 7125 (syz-executor) sp : ffffffc009b05940
> > > > > ..
> > > > > 7342 [ 681.671308] c0 7125 (syz-executor) [] shrink_page_list+0x1998/0x3240
> > > > > 7343 [ 681.679567] c0 7125 (syz-executor) [] reclaim_clean_pages_from_list+0x3c0/0x4f0
> > > > > 7344 [ 681.688793] c0 7125 (syz-executor) [] alloc_contig_range+0x3bc/0x650
> > > > > 7347 [ 681.717421] c0 7125 (syz-executor) [] cma_alloc+0x214/0x668
> > > > > 7348 [ 681.724892] c0 7125 (syz-executor) [] ion_cma_allocate+0x98/0x1d8
> > > > > 7349 [ 681.732872] c0 7125 (syz-executor) [] ion_alloc+0x200/0x7e0
> > > > > 7350 [ 681.740302] c0 7125 (syz-executor) [] ion_ioctl+0x18c/0x378
> > > > > 7351 [ 681.747738] c0 7125 (syz-executor) [] do_vfs_ioctl+0x17c/0x1780
> > > > > 7352 [ 681.755514] c0 7125 (syz-executor) [] SyS_ioctl+0xac/0xc0
> > > > >
> > > > > Wu found it's due to [1]. Before that, unevictable page goes to cull_mlocked
> > > > > routine so that it couldn't reach the VM_BUG_ON_PAGE line.
> > > > >
> > > > > To fix the issue, this patch filter out unevictable LRU pages
> > > > > from the reclaim_clean_pages_from_list in CMA.
> > > >
> > > > The changelog is rather modest on details and I have to confess I have
> > > > little bit hard time to understand it. E.g. why do not we need to handle
> > > > the regular reclaim path?
> > >
> > > No need to pass unevictable pages into regular reclaim path if we are
> > > able to know in advance.
> >
> > I am sorry to be dense here. So what is the difference in the CMA path?
> > Am I right that the pfn walk (CMA) rather than LRU isolation (reclaim)
> > is the key differentiator?
>
> Yes.
> We could isolate unevictable LRU pages from the pfn walker to migrate and
> could discard clean file-backed pages to reduce migration latency in CMA
> path.

Please be explicit about that in the changelog. The fact that this is
not possible from the regular reclaim path is really important and not
obvious at first glance.

Thanks!
-- 
Michal Hocko
SUSE Labs
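
For readers following the thread, here is a rough userspace sketch of the
distinction discussed above. It is not the patch itself and not kernel code:
struct toy_page, cma_can_discard() and count_discardable() are hypothetical
names made up for illustration, and the flags are plain booleans rather than
real page flags. The only things taken from the thread are that the CMA pfn
walk can isolate unevictable LRU pages, that clean file-backed pages may be
discarded instead of migrated, and that the fix is to keep unevictable pages
out of that discard step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page: only the properties named in the thread. */
struct toy_page {
	bool file_backed;	/* page cache page rather than anonymous memory */
	bool dirty;		/* would need writeback before it could be dropped */
	bool unevictable;	/* e.g. mlocked; regular LRU isolation never picks it */
};

/*
 * Hypothetical predicate: may a page isolated by the pfn walk be handed to
 * reclaim for discarding? Assumption: the last test models the filter the
 * changelog describes; without it an unevictable page reaches reclaim.
 */
static bool cma_can_discard(const struct toy_page *p)
{
	return p->file_backed && !p->dirty && !p->unevictable;
}

/*
 * Count the pages in an isolated range that could be discarded. Everything
 * else in the range would have to be migrated instead.
 */
static size_t count_discardable(const struct toy_page *pages, size_t n)
{
	size_t i, hits = 0;

	assert(pages != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (cma_can_discard(&pages[i]))
			hits++;
	return hits;
}
```

In this model a clean, file-backed, mlocked page is rejected by
cma_can_discard() and so stays on the migration side, which is the behaviour
the changelog is being asked to spell out. The regular reclaim path needs no
such check because it takes its pages from the evictable LRU lists in the
first place.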