From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751932AbeDXQrD (ORCPT ); Tue, 24 Apr 2018 12:47:03 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:44808 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751513AbeDXQq7 (ORCPT ); Tue, 24 Apr 2018 12:46:59 -0400 Date: Tue, 24 Apr 2018 12:46:55 -0400 (EDT) From: Mikulas Patocka X-X-Sender: mpatocka@file01.intranet.prod.int.rdu2.redhat.com To: Michal Hocko cc: LKML , Artem Bityutskiy , Richard Weinberger , David Woodhouse , Brian Norris , Boris Brezillon , Marek Vasut , Cyrille Pitchen , "Theodore Ts'o" , Andreas Dilger , Steven Whitehouse , Bob Peterson , Trond Myklebust , Anna Schumaker , Adrian Hunter , Philippe Ombredanne , Kate Stewart , linux-mtd@lists.infradead.org, linux-ext4@vger.kernel.org, cluster-devel@redhat.com, linux-nfs@vger.kernel.org, linux-mm@kvack.org Subject: Re: vmalloc with GFP_NOFS In-Reply-To: <20180424162712.GL17484@dhcp22.suse.cz> Message-ID: References: <20180424162712.GL17484@dhcp22.suse.cz> User-Agent: Alpine 2.02 (LRH 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 24 Apr 2018, Michal Hocko wrote: > Hi, > it seems that we still have few vmalloc users who perform GFP_NOFS > allocation: > drivers/mtd/ubi/io.c > fs/ext4/xattr.c > fs/gfs2/dir.c > fs/gfs2/quota.c > fs/nfs/blocklayout/extent_tree.c > fs/ubifs/debug.c > fs/ubifs/lprops.c > fs/ubifs/lpt_commit.c > fs/ubifs/orphan.c > > Unfortunatelly vmalloc doesn't suppoer GFP_NOFS semantinc properly > because we do have hardocded GFP_KERNEL allocations deep inside the > vmalloc layers. That means that if GFP_NOFS really protects from > recursion into the fs deadlocks then the vmalloc call is broken. > > What to do about this? Well, there are two things. Firstly, it would be > really great to double check whether the GFP_NOFS is really needed. I > cannot judge that because I am not familiar with the code. It would be > great if the respective maintainers (hopefully get_maintainer.sh pointed > me to all relevant ones). If there is not reclaim recursion issue then > simply use the standard vmalloc (aka GFP_KERNEL request). > > If the use is really valid then we have a way to do the vmalloc > allocation properly. We have memalloc_nofs_{save,restore} scope api. How > does that work? You simply call memalloc_nofs_save when the reclaim > recursion critical section starts (e.g. when you take a lock which is > then used in the reclaim path - e.g. shrinker) and memalloc_nofs_restore > when the critical section ends. _All_ allocations within that scope > will get GFP_NOFS semantic automagically. If you are not sure about the > scope itself then the easiest workaround is to wrap the vmalloc itself > with a big fat comment that this should be revisited. > > Does that sound like something that can be done in a reasonable time? > I have tried to bring this up in the past but our speed is glacial and > there are attempts to do hacks like checking for abusers inside the > vmalloc which is just too ugly to live. > > Please do not hesitate to get back to me if something is not clear. > > Thanks! > -- > Michal Hocko > SUSE Labs I made a patch that adds memalloc_noio/fs_save around these calls a year ago: http://lkml.iu.edu/hypermail/linux/kernel/1707.0/01376.html Mikulas From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mikulas Patocka Subject: Re: vmalloc with GFP_NOFS Date: Tue, 24 Apr 2018 12:46:55 -0400 (EDT) Message-ID: References: <20180424162712.GL17484@dhcp22.suse.cz> Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Cc: LKML , Artem Bityutskiy , Richard Weinberger , David Woodhouse , Brian Norris , Boris Brezillon , Marek Vasut , Cyrille Pitchen , "Theodore Ts'o" , Andreas Dilger , Steven Whitehouse , Bob Peterson , Trond Myklebust , Anna Schumaker , Adrian Hunter , Philippe Ombredanne , Kate Stewart , linux-mtd@lists.infradead.org, linux-ext4@vger.k To: Michal Hocko Return-path: In-Reply-To: <20180424162712.GL17484@dhcp22.suse.cz> Sender: linux-kernel-owner@vger.kernel.org List-Id: linux-ext4.vger.kernel.org On Tue, 24 Apr 2018, Michal Hocko wrote: > Hi, > it seems that we still have few vmalloc users who perform GFP_NOFS > allocation: > drivers/mtd/ubi/io.c > fs/ext4/xattr.c > fs/gfs2/dir.c > fs/gfs2/quota.c > fs/nfs/blocklayout/extent_tree.c > fs/ubifs/debug.c > fs/ubifs/lprops.c > fs/ubifs/lpt_commit.c > fs/ubifs/orphan.c > > Unfortunatelly vmalloc doesn't suppoer GFP_NOFS semantinc properly > because we do have hardocded GFP_KERNEL allocations deep inside the > vmalloc layers. That means that if GFP_NOFS really protects from > recursion into the fs deadlocks then the vmalloc call is broken. > > What to do about this? Well, there are two things. Firstly, it would be > really great to double check whether the GFP_NOFS is really needed. I > cannot judge that because I am not familiar with the code. It would be > great if the respective maintainers (hopefully get_maintainer.sh pointed > me to all relevant ones). If there is not reclaim recursion issue then > simply use the standard vmalloc (aka GFP_KERNEL request). > > If the use is really valid then we have a way to do the vmalloc > allocation properly. We have memalloc_nofs_{save,restore} scope api. How > does that work? You simply call memalloc_nofs_save when the reclaim > recursion critical section starts (e.g. when you take a lock which is > then used in the reclaim path - e.g. shrinker) and memalloc_nofs_restore > when the critical section ends. _All_ allocations within that scope > will get GFP_NOFS semantic automagically. If you are not sure about the > scope itself then the easiest workaround is to wrap the vmalloc itself > with a big fat comment that this should be revisited. > > Does that sound like something that can be done in a reasonable time? > I have tried to bring this up in the past but our speed is glacial and > there are attempts to do hacks like checking for abusers inside the > vmalloc which is just too ugly to live. > > Please do not hesitate to get back to me if something is not clear. > > Thanks! > -- > Michal Hocko > SUSE Labs I made a patch that adds memalloc_noio/fs_save around these calls a year ago: http://lkml.iu.edu/hypermail/linux/kernel/1707.0/01376.html Mikulas From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mikulas Patocka Date: Tue, 24 Apr 2018 12:46:55 -0400 (EDT) Subject: [Cluster-devel] vmalloc with GFP_NOFS In-Reply-To: <20180424162712.GL17484@dhcp22.suse.cz> References: <20180424162712.GL17484@dhcp22.suse.cz> Message-ID: List-Id: To: cluster-devel.redhat.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit On Tue, 24 Apr 2018, Michal Hocko wrote: > Hi, > it seems that we still have few vmalloc users who perform GFP_NOFS > allocation: > drivers/mtd/ubi/io.c > fs/ext4/xattr.c > fs/gfs2/dir.c > fs/gfs2/quota.c > fs/nfs/blocklayout/extent_tree.c > fs/ubifs/debug.c > fs/ubifs/lprops.c > fs/ubifs/lpt_commit.c > fs/ubifs/orphan.c > > Unfortunatelly vmalloc doesn't suppoer GFP_NOFS semantinc properly > because we do have hardocded GFP_KERNEL allocations deep inside the > vmalloc layers. That means that if GFP_NOFS really protects from > recursion into the fs deadlocks then the vmalloc call is broken. > > What to do about this? Well, there are two things. Firstly, it would be > really great to double check whether the GFP_NOFS is really needed. I > cannot judge that because I am not familiar with the code. It would be > great if the respective maintainers (hopefully get_maintainer.sh pointed > me to all relevant ones). If there is not reclaim recursion issue then > simply use the standard vmalloc (aka GFP_KERNEL request). > > If the use is really valid then we have a way to do the vmalloc > allocation properly. We have memalloc_nofs_{save,restore} scope api. How > does that work? You simply call memalloc_nofs_save when the reclaim > recursion critical section starts (e.g. when you take a lock which is > then used in the reclaim path - e.g. shrinker) and memalloc_nofs_restore > when the critical section ends. _All_ allocations within that scope > will get GFP_NOFS semantic automagically. If you are not sure about the > scope itself then the easiest workaround is to wrap the vmalloc itself > with a big fat comment that this should be revisited. > > Does that sound like something that can be done in a reasonable time? > I have tried to bring this up in the past but our speed is glacial and > there are attempts to do hacks like checking for abusers inside the > vmalloc which is just too ugly to live. > > Please do not hesitate to get back to me if something is not clear. > > Thanks! > -- > Michal Hocko > SUSE Labs I made a patch that adds memalloc_noio/fs_save around these calls a year ago: http://lkml.iu.edu/hypermail/linux/kernel/1707.0/01376.html Mikulas