From: Hugh Dickins <hughd@google.com>
To: Yang Shi <shy828301@gmail.com>
Cc: ziy@nvidia.com, kirill.shutemov@linux.intel.com,
	wangyugui@e16-tech.com, hughd@google.com,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [v2 PATCH] mm: thp: check total_mapcount instead of page_mapcount
Date: Thu, 20 May 2021 22:06:29 -0700 (PDT)
Message-ID: <alpine.LSU.2.11.2105202120220.6466@eggly.anvils>
In-Reply-To: <20210513212334.217424-1-shy828301@gmail.com>

On Thu, 13 May 2021, Yang Shi wrote:

> When debugging the bug reported by Wang Yugui [1], try_to_unmap() may
> return false positive for PTE-mapped THP since page_mapcount() is used
> to check if the THP is unmapped, but it just checks compound mapcount and
> head page's mapcount.  If the THP is PTE-mapped and head page is not
> mapped, it may return false positive.
> 
> Use total_mapcount() instead of page_mapcount() for try_to_unmap() and
> do so for the VM_BUG_ON_PAGE in split_huge_page_to_list as well.
> 
> This changed the semantic of try_to_unmap(), but I don't see there is
> any usecase that expects try_to_unmap() just unmap one subpage of a huge
> page.  So using page_mapcount() seems like a bug.
> 
> [1] https://lore.kernel.org/linux-mm/20210412180659.B9E3.409509F4@e16-tech.com/
> 
> Signed-off-by: Yang Shi <shy828301@gmail.com>

I don't object to this patch, I've no reason to NAK it; but I'll
point out a few deficiencies which might make you want to revisit it.

> ---
> v2: Removed dead code and updated the comment of try_to_unmap() per Zi
>     Yan.
> 
>  mm/huge_memory.c | 11 +----------
>  mm/rmap.c        | 10 ++++++----
>  2 files changed, 7 insertions(+), 14 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 63ed6b25deaa..3b08b9ba1578 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2348,7 +2348,6 @@ static void unmap_page(struct page *page)
>  		ttu_flags |= TTU_SPLIT_FREEZE;
>  
>  	unmap_success = try_to_unmap(page, ttu_flags);
> -	VM_BUG_ON_PAGE(!unmap_success, page);

The unused variable unmap_success has already been reported and
dealt with.  But I couldn't tell what you intended: why change
try_to_unmap()'s output, if you then ignore it?

>  }
>  
>  static void remap_page(struct page *page, unsigned int nr)
> @@ -2718,7 +2717,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>  	}
>  
>  	unmap_page(head);
> -	VM_BUG_ON_PAGE(compound_mapcount(head), head);
> +	VM_BUG_ON_PAGE(total_mapcount(head), head);

And having forced try_to_unmap() to do the expensive-on-a-THP
total_mapcount() calculation, you now repeat it here.  Better
to stick with the previous VM_BUG_ON_PAGE(!unmap_success).

Or better, a VM_WARN_ONCE() accompanied by the dump_page()s as before,
to get out some possibly useful info, which this patch has deleted.
That probably belongs inside unmap_page() rather than cluttering up here.

VM_WARN_ONCE() because nothing in this patch fixes whatever Wang
Yugui is suffering from; and (aside from the BUG()) it's harmless,
because there are other ways in which the page_ref_freeze() can fail,
and that is allowed for.  We would like to know when this problem
occurs: there is something wrong, but no reason to crash.
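
To make the warn-once behaviour concrete, here is a rough userspace
sketch, not kernel code.  MODEL_WARN_ONCE, struct model_page,
model_warnings and model_unmap_page() are made-up names standing in
for VM_WARN_ONCE(), struct page and unmap_page(); the real macro and
the real page layout differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a THP: only the field this sketch needs. */
struct model_page {
	int total_mapcount;	/* mappings left after the unmap attempt */
};

static int model_warnings;	/* how many times the warning has fired */

/*
 * Userspace imitation of a warn-once macro: report the first time
 * cond is true at this call site, stay silent afterwards, never abort.
 */
#define MODEL_WARN_ONCE(cond, msg)				\
	do {							\
		static bool warned;				\
		if ((cond) && !warned) {			\
			warned = true;				\
			model_warnings++;			\
			fprintf(stderr, "WARNING: %s\n", (msg));	\
		}						\
	} while (0)

/*
 * Model of unmap_page(): the check lives here rather than in the
 * caller, and a leftover mapping is reported but does not crash.
 */
static void model_unmap_page(struct model_page *page)
{
	bool unmap_success;

	assert(page != NULL);
	unmap_success = (page->total_mapcount == 0);
	MODEL_WARN_ONCE(!unmap_success, "page still mapped after unmap");
}
```

Calling model_unmap_page() twice on a page whose total_mapcount is
still 1 bumps model_warnings only once, and the caller carries on to
whatever failure handling it already has.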

>  
>  	/* block interrupt reentry in xa_lock and spinlock */
>  	local_irq_disable();
> @@ -2758,14 +2757,6 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>  		__split_huge_page(page, list, end);
>  		ret = 0;
>  	} else {
> -		if (IS_ENABLED(CONFIG_DEBUG_VM) && mapcount) {
> -			pr_alert("total_mapcount: %u, page_count(): %u\n",
> -					mapcount, count);
> -			if (PageTail(page))
> -				dump_page(head, NULL);
> -			dump_page(page, "total_mapcount(head) > 0");
> -			BUG();
> -		}

This has always looked ugly (as if Kirill had hit an unsolved case),
so it is nice to remove it; but you're losing the dump_page() info,
and not really gaining anything more than a cosmetic cleanup.

>  		spin_unlock(&ds_queue->split_queue_lock);
>  fail:		if (mapping)
>  			xa_unlock(&mapping->i_pages);
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 693a610e181d..f52825b1330d 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1742,12 +1742,14 @@ static int page_not_mapped(struct page *page)
>  }
>  
>  /**
> - * try_to_unmap - try to remove all page table mappings to a page
> - * @page: the page to get unmapped
> + * try_to_unmap - try to remove all page table mappings to a page and the
> + *                compound page it belongs to
> + * @page: the page or the subpages of compound page to get unmapped
>   * @flags: action and flags
>   *
>   * Tries to remove all the page table entries which are mapping this
> - * page, used in the pageout path.  Caller must hold the page lock.
> + * page and the compound page it belongs to, used in the pageout path.
> + * Caller must hold the page lock.
>   *
>   * If unmap is successful, return true. Otherwise, false.
>   */
> @@ -1777,7 +1779,7 @@ bool try_to_unmap(struct page *page, enum ttu_flags flags)
>  	else
>  		rmap_walk(page, &rwc);
>  
> -	return !page_mapcount(page) ? true : false;
> +	return !total_mapcount(page) ? true : false;

That always made me wince: "return !total_mapcount(page);" surely.

Or slightly better, "return !page_mapped(page);", since at least that
one breaks out as soon as it sees a mapcount.  Though I guess I'm
being silly there, since that case should never occur, so both
total_mapcount() and page_mapped() scan through all pages.
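
The difference between the three checks can be sketched with a toy
userspace model.  Everything here is an assumption made for
illustration: struct model_thp, the model_*() helpers and the
"visited" counter are invented, and the real struct page stores its
mapcounts differently.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_SUBPAGES 8	/* a real 2MB THP on x86-64 has 512 */

/* Hypothetical, simplified THP. */
struct model_thp {
	int compound_mapcount;			/* PMD mappings of the whole page */
	int pte_mapcount[MODEL_NR_SUBPAGES];	/* PTE mappings of each subpage */
};

/* Like page_mapcount() on the head page: compound plus the head's own. */
static int model_head_mapcount(const struct model_thp *thp)
{
	return thp->compound_mapcount + thp->pte_mapcount[0];
}

/* Like total_mapcount(): always visits every subpage. */
static int model_total_mapcount(const struct model_thp *thp, int *visited)
{
	int i, ret = thp->compound_mapcount;

	*visited = 0;
	for (i = 0; i < MODEL_NR_SUBPAGES; i++) {
		ret += thp->pte_mapcount[i];
		(*visited)++;
	}
	return ret;
}

/* Like page_mapped(): stops at the first mapping it sees. */
static bool model_page_mapped(const struct model_thp *thp, int *visited)
{
	int i;

	*visited = 0;
	if (thp->compound_mapcount)
		return true;
	for (i = 0; i < MODEL_NR_SUBPAGES; i++) {
		(*visited)++;
		if (thp->pte_mapcount[i])
			return true;
	}
	return false;
}
```

With only subpage 3 PTE-mapped, model_head_mapcount() sees nothing
(the false positive the patch is about), model_total_mapcount() finds
the mapping after visiting all 8 subpages, and model_page_mapped()
finds it after 4.  Once nothing is mapped, both scans visit all 8.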

Or better, change try_to_unmap() to void: most callers ignore its
return value anyway, and make their own decisions; the remaining
few could be changed to do the same.  Though again, I may be
being silly, since the expensive THP case is not the common case.
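
As a sketch of what the void variant might look like from a caller's
side, again in userspace with invented names (struct toy_page, its
"pinned" flag, toy_try_to_unmap(), toy_page_mapped() and
toy_reclaim_one() are all hypothetical, and no real caller is being
quoted):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical page: just enough state for the sketch. */
struct toy_page {
	int mapcount;	/* remaining mappings */
	bool pinned;	/* invented flag: mappings cannot be removed */
};

/* void variant: does the unmapping work, reports nothing back. */
static void toy_try_to_unmap(struct toy_page *page)
{
	if (!page->pinned)
		page->mapcount = 0;
}

/* The caller's own check, made only by callers that care. */
static bool toy_page_mapped(const struct toy_page *page)
{
	return page->mapcount > 0;
}

/* Hypothetical caller that needs the answer and asks for itself. */
static bool toy_reclaim_one(struct toy_page *page)
{
	toy_try_to_unmap(page);
	if (toy_page_mapped(page))
		return false;	/* still mapped: keep the page */
	return true;		/* fully unmapped: may proceed */
}
```

Callers that never looked at the return value simply call
toy_try_to_unmap() and pay for no mapcount scan at all.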

>  }
>  
>  /**
> -- 
> 2.26.2
