Linux-mm Archive on lore.kernel.org
 help / color / Atom feed
From: David Rientjes <rientjes@google.com>
To: Yang Shi <yang.shi@linux.alibaba.com>
Cc: mhocko@suse.com, dvyukov@google.com, catalin.marinas@arm.com,
	 akpm@linux-foundation.org, linux-mm@kvack.org,
	 linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: page_alloc: document kmemleak's non-blockable __GFP_NOFAIL case
Date: Sat, 13 Jul 2019 12:39:16 -0700 (PDT)
Message-ID: <alpine.DEB.2.21.1907131230280.246128@chino.kir.corp.google.com> (raw)
In-Reply-To: <1562964544-59519-1-git-send-email-yang.shi@linux.alibaba.com>

On Sat, 13 Jul 2019, Yang Shi wrote:

> When running ltp's oom test with kmemleak enabled, the below warning was
> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
> passed in:
> 
> WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50
> Modules linked in: loop dax_pmem dax_pmem_core
> ip_tables x_tables xfs virtio_net net_failover virtio_blk failover
> ata_generic virtio_pci virtio_ring virtio libata
> CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
> ...
>  kmemleak_alloc+0x4e/0xb0
>  kmem_cache_alloc+0x2a7/0x3e0
>  ? __kmalloc+0x1d6/0x470
>  ? ___might_sleep+0x9c/0x170
>  ? mempool_alloc+0x2b0/0x2b0
>  mempool_alloc_slab+0x2d/0x40
>  mempool_alloc+0x118/0x2b0
>  ? __kasan_check_read+0x11/0x20
>  ? mempool_resize+0x390/0x390
>  ? lock_downgrade+0x3c0/0x3c0
>  bio_alloc_bioset+0x19d/0x350
>  ? __swap_duplicate+0x161/0x240
>  ? bvec_alloc+0x1b0/0x1b0
>  ? do_raw_spin_unlock+0xa8/0x140
>  ? _raw_spin_unlock+0x27/0x40
>  get_swap_bio+0x80/0x230
>  ? __x64_sys_madvise+0x50/0x50
>  ? end_swap_bio_read+0x310/0x310
>  ? __kasan_check_read+0x11/0x20
>  ? check_chain_key+0x24e/0x300
>  ? bdev_write_page+0x55/0x130
>  __swap_writepage+0x5ff/0xb20
> 
> The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, kmemleak has
> __GFP_NOFAIL set all the time due to commit
> d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist
> with fault injection").
> 

It only clears __GFP_DIRECT_RECLAIM provisionally to see if the allocation 
would immediately succeed before falling back to the elements in the 
mempool.  If that fails, and the mempool is empty, mempool_alloc() 
attempts the allocation with __GFP_DIRECT_RECLAIM.  So for the problem 
described here, I think what we really want is this:

diff --git a/mm/mempool.c b/mm/mempool.c
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -386,7 +386,7 @@ void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask)
 	gfp_mask |= __GFP_NORETRY;	/* don't loop in __alloc_pages */
 	gfp_mask |= __GFP_NOWARN;	/* failures are OK */
 
-	gfp_temp = gfp_mask & ~(__GFP_DIRECT_RECLAIM|__GFP_IO);
+	gfp_temp = gfp_mask & ~(__GFP_DIRECT_RECLAIM|__GFP_IO|__GFP_NOFAIL);
 
 repeat_alloc:
 
But bio_alloc_bioset() plays with gfp_mask itself: are we sure that it 
isn't the one clearing __GFP_DIRECT_RECLAIM itself before falling back to 
saved_gfp?

In other words do we also want this?

diff --git a/block/bio.c b/block/bio.c
--- a/block/bio.c
+++ b/block/bio.c
@@ -462,16 +462,16 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, unsigned int nr_iovecs,
 		 * We solve this, and guarantee forward progress, with a rescuer
 		 * workqueue per bio_set. If we go to allocate and there are
 		 * bios on current->bio_list, we first try the allocation
-		 * without __GFP_DIRECT_RECLAIM; if that fails, we punt those
-		 * bios we would be blocking to the rescuer workqueue before
-		 * we retry with the original gfp_flags.
+		 * without __GFP_DIRECT_RECLAIM or __GFP_NOFAIL; if that fails,
+		 * we punt those bios we would be blocking to the rescuer
+		 * workqueue before we retry with the original gfp_flags.
 		 */
-
 		if (current->bio_list &&
 		    (!bio_list_empty(&current->bio_list[0]) ||
 		     !bio_list_empty(&current->bio_list[1])) &&
 		    bs->rescue_workqueue)
-			gfp_mask &= ~__GFP_DIRECT_RECLAIM;
+			gfp_mask &= ~(__GFP_DIRECT_RECLAIM |
+				      __GFP_NOFAIL);
 
 		p = mempool_alloc(&bs->bio_pool, gfp_mask);
 		if (!p && gfp_mask != saved_gfp) {


  reply index

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-12 20:49 Yang Shi
2019-07-13 19:39 ` David Rientjes [this message]
2019-07-15  3:43   ` Yang Shi
2019-07-15 13:18   ` Michal Hocko
2019-07-13 21:25 ` Matthew Wilcox
2019-07-15  3:47   ` Yang Shi
2019-07-15 13:06     ` Matthew Wilcox
2019-07-15 17:00       ` Yang Shi
2019-07-15 13:17 ` Michal Hocko
2019-07-15 15:01   ` Catalin Marinas
2019-07-15 15:18     ` Qian Cai
2019-07-15 16:58       ` Yang Shi
2019-07-16 17:38     ` Yang Shi

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.21.1907131230280.246128@chino.kir.corp.google.com \
    --to=rientjes@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=catalin.marinas@arm.com \
    --cc=dvyukov@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=yang.shi@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-mm Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-mm/0 linux-mm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-mm linux-mm/ https://lore.kernel.org/linux-mm \
		linux-mm@kvack.org linux-mm@archiver.kernel.org
	public-inbox-index linux-mm


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kvack.linux-mm


AGPL code for this site: git clone https://public-inbox.org/ public-inbox