All of lore.kernel.org
 help / color / mirror / Atom feed
From: Uladzislau Rezki <urezki@gmail.com>
To: Baoquan He <bhe@redhat.com>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Lorenzo Stoakes <lstoakes@gmail.com>,
	Christoph Hellwig <hch@infradead.org>,
	Matthew Wilcox <willy@infradead.org>,
	Dave Chinner <david@fromorbit.com>,
	Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>,
	Jens Axboe <axboe@kernel.dk>, Omar Sandoval <osandov@fb.com>
Subject: Re: [PATCH 1/1] mm: vmalloc: Fix lockdep warning
Date: Sat, 30 Mar 2024 13:55:29 +0100	[thread overview]
Message-ID: <ZggLwbnaeK0AsRwQ@pc636> (raw)
In-Reply-To: <ZgZxaOg6/qZAfUUA@MiWiFi-R3L-srv>

On Fri, Mar 29, 2024 at 03:44:40PM +0800, Baoquan He wrote:
> On 03/28/24 at 03:03pm, Uladzislau Rezki (Sony) wrote:
> > A lockdep reports a possible deadlock in the find_vmap_area_exceed_addr_lock()
> > function:
> > 
> > ============================================
> > WARNING: possible recursive locking detected
> > 6.9.0-rc1-00060-ged3ccc57b108-dirty #6140 Not tainted
> > --------------------------------------------
> > drgn/455 is trying to acquire lock:
> > ffff0000c00131d0 (&vn->busy.lock/1){+.+.}-{2:2}, at: find_vmap_area_exceed_addr_lock+0x64/0x124
> > 
> > but task is already holding lock:
> > ffff0000c0011878 (&vn->busy.lock/1){+.+.}-{2:2}, at: find_vmap_area_exceed_addr_lock+0x64/0x124
> > 
> > other info that might help us debug this:
> >  Possible unsafe locking scenario:
> > 
> >        CPU0
> >        ----
> >   lock(&vn->busy.lock/1);
> >   lock(&vn->busy.lock/1);
> > 
> >  *** DEADLOCK ***
> > 
> > indeed it can happen if the find_vmap_area_exceed_addr_lock()
> > gets called concurrently because it tries to acquire two nodes
> > locks. It was done to prevent removing a lowest VA found on a
> > previous step.
> > 
> > To address this a lowest VA is found first without holding a
> > node lock where it resides. As a last step we check if a VA
> > still there because it can go away, if removed, proceed with
> > next lowest.
> > 
> > Fixes: 53becf32aec1 ("mm: vmalloc: support multiple nodes in vread_iter")
> > Tested-by: Jens Axboe <axboe@kernel.dk>
> > Tested-by: Omar Sandoval <osandov@fb.com>
> > Reported-by: Jens Axboe <axboe@kernel.dk>
> > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > ---
> >  mm/vmalloc.c | 74 +++++++++++++++++++++++++++++++---------------------
> >  1 file changed, 44 insertions(+), 30 deletions(-)
> > 
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index e94ce4562805..a5a5dfc3843e 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -989,6 +989,27 @@ unsigned long vmalloc_nr_pages(void)
> >  	return atomic_long_read(&nr_vmalloc_pages);
> >  }
> >  
> > +static struct vmap_area *__find_vmap_area(unsigned long addr, struct rb_root *root)
> > +{
> > +	struct rb_node *n = root->rb_node;
> > +
> > +	addr = (unsigned long)kasan_reset_tag((void *)addr);
> > +
> > +	while (n) {
> > +		struct vmap_area *va;
> > +
> > +		va = rb_entry(n, struct vmap_area, rb_node);
> > +		if (addr < va->va_start)
> > +			n = n->rb_left;
> > +		else if (addr >= va->va_end)
> > +			n = n->rb_right;
> > +		else
> > +			return va;
> > +	}
> > +
> > +	return NULL;
> > +}
> > +
> >  /* Look up the first VA which satisfies addr < va_end, NULL if none. */
> >  static struct vmap_area *
> >  __find_vmap_area_exceed_addr(unsigned long addr, struct rb_root *root)
> > @@ -1025,47 +1046,40 @@ __find_vmap_area_exceed_addr(unsigned long addr, struct rb_root *root)
> >  static struct vmap_node *
> >  find_vmap_area_exceed_addr_lock(unsigned long addr, struct vmap_area **va)
> >  {
> > -	struct vmap_node *vn, *va_node = NULL;
> > -	struct vmap_area *va_lowest;
> > +	unsigned long va_start_lowest;
> > +	struct vmap_node *vn;
> >  	int i;
> >  
> > -	for (i = 0; i < nr_vmap_nodes; i++) {
> > +repeat:
> > +	for (i = 0, va_start_lowest = 0; i < nr_vmap_nodes; i++) {
> >  		vn = &vmap_nodes[i];
> >  
> >  		spin_lock(&vn->busy.lock);
> > -		va_lowest = __find_vmap_area_exceed_addr(addr, &vn->busy.root);
> > -		if (va_lowest) {
> > -			if (!va_node || va_lowest->va_start < (*va)->va_start) {
> > -				if (va_node)
> > -					spin_unlock(&va_node->busy.lock);
> > -
> > -				*va = va_lowest;
> > -				va_node = vn;
> > -				continue;
> > -			}
> > -		}
> > +		*va = __find_vmap_area_exceed_addr(addr, &vn->busy.root);
> > +
> > +		if (*va)
> > +			if (!va_start_lowest || (*va)->va_start < va_start_lowest)
> > +				va_start_lowest = (*va)->va_start;
> 
> How about below change about va_start_lowest? Personal preference, not
> strong opinion.
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 9b1a41e12d70..bd6a66c54ad2 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1046,19 +1046,19 @@ __find_vmap_area_exceed_addr(unsigned long addr, struct rb_root *root)
>  static struct vmap_node *
>  find_vmap_area_exceed_addr_lock(unsigned long addr, struct vmap_area **va)
>  {
> -	unsigned long va_start_lowest;
> +	unsigned long va_start_lowest = ULONG_MAX;
>  	struct vmap_node *vn;
>  	int i;
>  
>  repeat:
> -	for (i = 0, va_start_lowest = 0; i < nr_vmap_nodes; i++) {
> +	for (i = 0; i < nr_vmap_nodes; i++) {
>  		vn = &vmap_nodes[i];
>  
>  		spin_lock(&vn->busy.lock);
>  		*va = __find_vmap_area_exceed_addr(addr, &vn->busy.root);
>  
>  		if (*va)
> -			if (!va_start_lowest || (*va)->va_start < va_start_lowest)
> +			if ((*va)->va_start < va_start_lowest)
>  				va_start_lowest = (*va)->va_start;
>  		spin_unlock(&vn->busy.lock);
>  	}
> @@ -1069,7 +1069,7 @@ find_vmap_area_exceed_addr_lock(unsigned long addr, struct vmap_area **va)
>  	 * been removed concurrently thus we need to proceed
>  	 * with next one what is a rare case.
>  	 */
> -	if (va_start_lowest) {
> +	if (va_start_lowest != ULONG_MAX) {
>  		vn = addr_to_node(va_start_lowest);
>  
>  		spin_lock(&vn->busy.lock);
> 
> 
To me it looks as incomplete. The "va_start_lowest" should be initialized
when repeat. Otherwise we can end up with an infinite repeating because
va_start_lowest != ULONG_MAX.

> >  	}
> >  
> > -	return va_node;
> > -}
> > -
> > -static struct vmap_area *__find_vmap_area(unsigned long addr, struct rb_root *root)
> > -{
> > -	struct rb_node *n = root->rb_node;
> > +	/*
> > +	 * Check if found VA exists, it might it is gone away.
>                                         ~~~~ grammer mistake?
> > +	 * In this case we repeat the search because a VA has
> > +	 * been removed concurrently thus we need to proceed
> > +	 * with next one what is a rare case.
>                          ~~~~ typo, which?
> > +	 */
> > +	if (va_start_lowest) {
> > +		vn = addr_to_node(va_start_lowest);
> >  
> > -	addr = (unsigned long)kasan_reset_tag((void *)addr);
> > +		spin_lock(&vn->busy.lock);
> > +		*va = __find_vmap_area(va_start_lowest, &vn->busy.root);
> >  
> > -	while (n) {
> > -		struct vmap_area *va;
> > +		if (*va)
> > +			return vn;
> >  
> > -		va = rb_entry(n, struct vmap_area, rb_node);
> > -		if (addr < va->va_start)
> > -			n = n->rb_left;
> > -		else if (addr >= va->va_end)
> > -			n = n->rb_right;
> > -		else
> > -			return va;
> > +		spin_unlock(&vn->busy.lock);
> > +		goto repeat;
> >  	}
> 
> Other than above nickpick concerns, this looks good to me.
> 
> Reviewed-by: Baoquan He <bhe@redhat.com>
> 
Thank you!

--
Uladzislau Rezki

  reply	other threads:[~2024-03-30 12:55 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-28 14:03 [PATCH 1/1] mm: vmalloc: Fix lockdep warning Uladzislau Rezki (Sony)
2024-03-29  7:44 ` Baoquan He
2024-03-30 12:55   ` Uladzislau Rezki [this message]
2024-03-30 13:21     ` Baoquan He
2024-03-30 19:39       ` Uladzislau Rezki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZggLwbnaeK0AsRwQ@pc636 \
    --to=urezki@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=bhe@redhat.com \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lstoakes@gmail.com \
    --cc=oleksiy.avramchenko@sony.com \
    --cc=osandov@fb.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.