From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752142AbbKZPKX (ORCPT ); Thu, 26 Nov 2015 10:10:23 -0500 Received: from mail-wm0-f49.google.com ([74.125.82.49]:33443 "EHLO mail-wm0-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750786AbbKZPKT (ORCPT ); Thu, 26 Nov 2015 10:10:19 -0500 Date: Thu, 26 Nov 2015 16:10:17 +0100 From: Michal Hocko To: "Theodore Ts'o" Cc: Tetsuo Handa , linux-mm@kvack.org, Andrew Morton , LKML , john.johansen@canonical.com Subject: Re: [PATCH] jbd2: get rid of superfluous __GFP_REPEAT Message-ID: <20151126151017.GJ7953@dhcp22.suse.cz> References: <1446740160-29094-4-git-send-email-mhocko@kernel.org> <1446826623-23959-1-git-send-email-mhocko@kernel.org> <563D526F.6030504@I-love.SAKURA.ne.jp> <20151108050802.GB3880@thunk.org> <20151109081650.GA8916@dhcp22.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20151109081650.GA8916@dhcp22.suse.cz> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Ted, are there any objections for the patch or should I just repost it? On Mon 09-11-15 09:16:50, Michal Hocko wrote: > On Sun 08-11-15 00:08:02, Theodore Ts'o wrote: > > On Sat, Nov 07, 2015 at 10:22:55AM +0900, Tetsuo Handa wrote: > > > All jbd2_alloc() callers seem to pass GFP_NOFS. Therefore, use of > > > vmalloc() which implicitly passes GFP_KERNEL | __GFP_HIGHMEM can cause > > > deadlock, can't it? This vmalloc(size) call needs to be replaced with > > > __vmalloc(size, flags). > > > > jbd2_alloc is only passed in the bh->b_size, which can't be > > > PAGE_SIZE, so the code path that calls vmalloc() should never get > > called. When we conveted jbd2_alloc() to suppor sub-page size > > allocations in commit d2eecb039368, there was an assumption that it > > could be called with a size greater than PAGE_SIZE, but that's > > certaily not true today. > > Thanks for the clarification. Then the patch can be simplified even > more then. > --- > From fbf02c347dae8ee86e396bc769a88e85773db83e Mon Sep 17 00:00:00 2001 > From: Michal Hocko > Date: Wed, 21 Oct 2015 17:14:49 +0200 > Subject: [PATCH] jbd2: get rid of superfluous __GFP_REPEAT > > jbd2_alloc is explicit about its allocation preferences wrt. the > allocation size. Sub page allocations go to the slab allocator > and larger are using either the page allocator or vmalloc. This > is all good but the logic is unnecessarily complex. > 1) as per Ted, the vmalloc fallback is a left-over: > : jbd2_alloc is only passed in the bh->b_size, which can't be > > : PAGE_SIZE, so the code path that calls vmalloc() should never get > : called. When we conveted jbd2_alloc() to suppor sub-page size > : allocations in commit d2eecb039368, there was an assumption that it > : could be called with a size greater than PAGE_SIZE, but that's > : certaily not true today. > Moreover vmalloc allocation might even lead to a deadlock because > the callers expect GFP_NOFS context while vmalloc is GFP_KERNEL. > 2) Requests smaller than order-3 are go to the page allocator with > __GFP_REPEAT. The flag doesn't do anything useful for those because they > are smaller than PAGE_ALLOC_COSTLY_ORDER. > > Let's simplify the code flow and use the slab allocator for sub-page > requests and the page allocator for others. Even though order > 0 is > not currently used as per above leave that option open. > > Cc: "Theodore Ts'o" > Signed-off-by: Michal Hocko > --- > fs/jbd2/journal.c | 32 +++++++------------------------- > 1 file changed, 7 insertions(+), 25 deletions(-) > > diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c > index 81e622681c82..0145e7978ab4 100644 > --- a/fs/jbd2/journal.c > +++ b/fs/jbd2/journal.c > @@ -2299,18 +2299,10 @@ void *jbd2_alloc(size_t size, gfp_t flags) > > BUG_ON(size & (size-1)); /* Must be a power of 2 */ > > - flags |= __GFP_REPEAT; > - if (size == PAGE_SIZE) > - ptr = (void *)__get_free_pages(flags, 0); > - else if (size > PAGE_SIZE) { > - int order = get_order(size); > - > - if (order < 3) > - ptr = (void *)__get_free_pages(flags, order); > - else > - ptr = vmalloc(size); > - } else > + if (size < PAGE_SIZE) > ptr = kmem_cache_alloc(get_slab(size), flags); > + else > + ptr = (void *)__get_free_pages(flags, get_order(size)); > > /* Check alignment; SLUB has gotten this wrong in the past, > * and this can lead to user data corruption! */ > @@ -2321,20 +2313,10 @@ void *jbd2_alloc(size_t size, gfp_t flags) > > void jbd2_free(void *ptr, size_t size) > { > - if (size == PAGE_SIZE) { > - free_pages((unsigned long)ptr, 0); > - return; > - } > - if (size > PAGE_SIZE) { > - int order = get_order(size); > - > - if (order < 3) > - free_pages((unsigned long)ptr, order); > - else > - vfree(ptr); > - return; > - } > - kmem_cache_free(get_slab(size), ptr); > + if (size < PAGE_SIZE) > + kmem_cache_free(get_slab(size), ptr); > + else > + free_pages((unsigned long)ptr, get_order(size)); > }; > > /* > -- > 2.6.2 > > -- > Michal Hocko > SUSE Labs -- Michal Hocko SUSE Labs From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f46.google.com (mail-wm0-f46.google.com [74.125.82.46]) by kanga.kvack.org (Postfix) with ESMTP id 239086B0038 for ; Thu, 26 Nov 2015 10:10:20 -0500 (EST) Received: by wmec201 with SMTP id c201so26108832wme.1 for ; Thu, 26 Nov 2015 07:10:19 -0800 (PST) Received: from mail-wm0-f42.google.com (mail-wm0-f42.google.com. [74.125.82.42]) by mx.google.com with ESMTPS id gg6si21701862wjb.56.2015.11.26.07.10.18 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 26 Nov 2015 07:10:19 -0800 (PST) Received: by wmvv187 with SMTP id v187so35535774wmv.1 for ; Thu, 26 Nov 2015 07:10:18 -0800 (PST) Date: Thu, 26 Nov 2015 16:10:17 +0100 From: Michal Hocko Subject: Re: [PATCH] jbd2: get rid of superfluous __GFP_REPEAT Message-ID: <20151126151017.GJ7953@dhcp22.suse.cz> References: <1446740160-29094-4-git-send-email-mhocko@kernel.org> <1446826623-23959-1-git-send-email-mhocko@kernel.org> <563D526F.6030504@I-love.SAKURA.ne.jp> <20151108050802.GB3880@thunk.org> <20151109081650.GA8916@dhcp22.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20151109081650.GA8916@dhcp22.suse.cz> Sender: owner-linux-mm@kvack.org List-ID: To: Theodore Ts'o Cc: Tetsuo Handa , linux-mm@kvack.org, Andrew Morton , LKML , john.johansen@canonical.com Hi Ted, are there any objections for the patch or should I just repost it? On Mon 09-11-15 09:16:50, Michal Hocko wrote: > On Sun 08-11-15 00:08:02, Theodore Ts'o wrote: > > On Sat, Nov 07, 2015 at 10:22:55AM +0900, Tetsuo Handa wrote: > > > All jbd2_alloc() callers seem to pass GFP_NOFS. Therefore, use of > > > vmalloc() which implicitly passes GFP_KERNEL | __GFP_HIGHMEM can cause > > > deadlock, can't it? This vmalloc(size) call needs to be replaced with > > > __vmalloc(size, flags). > > > > jbd2_alloc is only passed in the bh->b_size, which can't be > > > PAGE_SIZE, so the code path that calls vmalloc() should never get > > called. When we conveted jbd2_alloc() to suppor sub-page size > > allocations in commit d2eecb039368, there was an assumption that it > > could be called with a size greater than PAGE_SIZE, but that's > > certaily not true today. > > Thanks for the clarification. Then the patch can be simplified even > more then. > --- > From fbf02c347dae8ee86e396bc769a88e85773db83e Mon Sep 17 00:00:00 2001 > From: Michal Hocko > Date: Wed, 21 Oct 2015 17:14:49 +0200 > Subject: [PATCH] jbd2: get rid of superfluous __GFP_REPEAT > > jbd2_alloc is explicit about its allocation preferences wrt. the > allocation size. Sub page allocations go to the slab allocator > and larger are using either the page allocator or vmalloc. This > is all good but the logic is unnecessarily complex. > 1) as per Ted, the vmalloc fallback is a left-over: > : jbd2_alloc is only passed in the bh->b_size, which can't be > > : PAGE_SIZE, so the code path that calls vmalloc() should never get > : called. When we conveted jbd2_alloc() to suppor sub-page size > : allocations in commit d2eecb039368, there was an assumption that it > : could be called with a size greater than PAGE_SIZE, but that's > : certaily not true today. > Moreover vmalloc allocation might even lead to a deadlock because > the callers expect GFP_NOFS context while vmalloc is GFP_KERNEL. > 2) Requests smaller than order-3 are go to the page allocator with > __GFP_REPEAT. The flag doesn't do anything useful for those because they > are smaller than PAGE_ALLOC_COSTLY_ORDER. > > Let's simplify the code flow and use the slab allocator for sub-page > requests and the page allocator for others. Even though order > 0 is > not currently used as per above leave that option open. > > Cc: "Theodore Ts'o" > Signed-off-by: Michal Hocko > --- > fs/jbd2/journal.c | 32 +++++++------------------------- > 1 file changed, 7 insertions(+), 25 deletions(-) > > diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c > index 81e622681c82..0145e7978ab4 100644 > --- a/fs/jbd2/journal.c > +++ b/fs/jbd2/journal.c > @@ -2299,18 +2299,10 @@ void *jbd2_alloc(size_t size, gfp_t flags) > > BUG_ON(size & (size-1)); /* Must be a power of 2 */ > > - flags |= __GFP_REPEAT; > - if (size == PAGE_SIZE) > - ptr = (void *)__get_free_pages(flags, 0); > - else if (size > PAGE_SIZE) { > - int order = get_order(size); > - > - if (order < 3) > - ptr = (void *)__get_free_pages(flags, order); > - else > - ptr = vmalloc(size); > - } else > + if (size < PAGE_SIZE) > ptr = kmem_cache_alloc(get_slab(size), flags); > + else > + ptr = (void *)__get_free_pages(flags, get_order(size)); > > /* Check alignment; SLUB has gotten this wrong in the past, > * and this can lead to user data corruption! */ > @@ -2321,20 +2313,10 @@ void *jbd2_alloc(size_t size, gfp_t flags) > > void jbd2_free(void *ptr, size_t size) > { > - if (size == PAGE_SIZE) { > - free_pages((unsigned long)ptr, 0); > - return; > - } > - if (size > PAGE_SIZE) { > - int order = get_order(size); > - > - if (order < 3) > - free_pages((unsigned long)ptr, order); > - else > - vfree(ptr); > - return; > - } > - kmem_cache_free(get_slab(size), ptr); > + if (size < PAGE_SIZE) > + kmem_cache_free(get_slab(size), ptr); > + else > + free_pages((unsigned long)ptr, get_order(size)); > }; > > /* > -- > 2.6.2 > > -- > Michal Hocko > SUSE Labs -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org