Subject: Re: [PATCH 2/3] xfs: add kmem_alloc_io()
To: Dave Chinner
Cc: Peter Zijlstra, Christoph Hellwig, linux-xfs@vger.kernel.org,
 Ingo Molnar, Will Deacon, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, penguin-kernel@I-love.SAKURA.ne.jp
References: <20190821083820.11725-3-david@fromorbit.com>
 <20190821232440.GB24904@infradead.org>
 <20190822003131.GR1119@dread.disaster.area>
 <20190822075948.GA31346@infradead.org>
 <20190822085130.GI2349@hirez.programming.kicks-ass.net>
 <20190822091057.GK2386@hirez.programming.kicks-ass.net>
 <20190822101441.GY1119@dread.disaster.area>
 <20190822120725.GA1119@dread.disaster.area>
 <20190822131739.GB1119@dread.disaster.area>
From: Vlastimil Babka
Date: Thu, 22 Aug 2019 16:26:42 +0200
In-Reply-To: <20190822131739.GB1119@dread.disaster.area>
X-Mailing-List: linux-kernel@vger.kernel.org

On 8/22/19 3:17 PM, Dave Chinner wrote:
> On Thu, Aug 22, 2019 at 02:19:04PM +0200, Vlastimil Babka wrote:
>> On 8/22/19 2:07 PM, Dave Chinner wrote:
>> > On Thu, Aug 22, 2019 at 01:14:30PM +0200, Vlastimil Babka wrote:
>> >
>> > No, the problem is this (using kmalloc as a general term for
>> > allocation, whether it be kmalloc, kmem_cache_alloc, alloc_page, etc)
>> >
>> >   some random kernel code
>> >   kmalloc(GFP_KERNEL)
>> >   reclaim
>> >   PF_MEMALLOC
>> >   shrink_slab
>> >   xfs_inode_shrink
>> >   XFS_ILOCK
>> >   xfs_buf_allocate_memory()
>> >   kmalloc(GFP_KERNEL)
>> >
>> > And so locks on inodes in reclaim are seen below reclaim. Then
>> > somewhere else we have:
>> >
>> >   some high level read-only xfs code like readdir
>> >   XFS_ILOCK
>> >   xfs_buf_allocate_memory()
>> >   kmalloc(GFP_KERNEL)
>> >   reclaim
>> >
>> > And this one throws false positive lockdep warnings because we
>> > called into reclaim with XFS_ILOCK held and GFP_KERNEL alloc
>>
>> OK, and what exactly makes this positive a false one? Why can't it continue like
>> the first example where reclaim leads to another XFS_ILOCK, thus deadlock?
>
> Because above reclaim we only have operations being done on
> referenced inodes, and below reclaim we only have unreferenced
> inodes. We never lock the same inode both above and below reclaim
> at the same time.
>
> IOWs, an operation above reclaim cannot see, access or lock
> unreferenced inodes, except in inode write clustering, and that uses
> trylocks so cannot deadlock with reclaim.
>
> An operation below reclaim cannot see, access or lock referenced
> inodes except during inode write clustering, and that uses trylocks
> so cannot deadlock with code above reclaim.

Thanks for elaborating. Perhaps lockdep experts (not me) would know how to
express that. If not possible, then replacing GFP_NOFS with __GFP_NOLOCKDEP
should indeed suppress the warning, while allowing FS reclaim.

> FWIW, I'm trying to make the inode writeback clustering go away from
> reclaim at the moment, so even that possibility is going away soon.
> That will change everything to trylocks in reclaim context, so
> lockdep is going to stop tracking it entirely.

That's also a nice solution :)

> Hmmm - maybe we're getting to the point where we actually
> don't need GFP_NOFS/PF_MEMALLOC_NOFS at all in XFS anymore.....
>
> Cheers,
>
> Dave.
>
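For anyone following along, the "false positive" argument quoted above can be
sketched with a toy userspace model. This is Python, not kernel code, and it is
only an illustration under stated assumptions: lockdep validates ordering
between lock classes rather than individual lock instances, and reclaim is
treated here as a pseudo-lock named fs_reclaim. The helpers record_chain() and
has_cycle(), and the labels for inodes A and B, are hypothetical names made up
for this sketch.

```python
from collections import defaultdict


def record_chain(graph, chain):
    """Add a 'held -> acquired next' edge for each adjacent pair in a chain."""
    for held, nxt in zip(chain, chain[1:]):
        graph[held].add(nxt)


def has_cycle(graph):
    """Return True if the directed dependency graph contains a cycle (DFS)."""
    WHITE, GREY, BLACK = 0, 1, 2
    state = defaultdict(int)  # every node starts WHITE (unvisited)

    def visit(node):
        state[node] = GREY  # node is on the current DFS path
        for nxt in graph.get(node, ()):
            if state[nxt] == GREY:
                return True  # back edge: a dependency cycle
            if state[nxt] == WHITE and visit(nxt):
                return True
        state[node] = BLACK  # fully explored, no cycle through here
        return False

    return any(state[n] == WHITE and visit(n) for n in list(graph))


# Class view: every inode lock collapses into one class, "XFS_ILOCK".
class_graph = defaultdict(set)
record_chain(class_graph, ["fs_reclaim", "XFS_ILOCK"])  # shrinker path
record_chain(class_graph, ["XFS_ILOCK", "fs_reclaim"])  # readdir-like path

# Instance view: the paths touch disjoint inodes (hypothetical A and B).
instance_graph = defaultdict(set)
record_chain(instance_graph, ["fs_reclaim", "ILOCK(unreferenced inode B)"])
record_chain(instance_graph, ["ILOCK(referenced inode A)", "fs_reclaim"])

print("class view reports a cycle:   ", has_cycle(class_graph))
print("instance view reports a cycle:", has_cycle(instance_graph))
```

The class-level graph has a cycle, which is what the warning reports. The
per-instance graph does not, provided the referenced/unreferenced split really
holds. The model leaves out the trylock case entirely.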