linux-mtd.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org, linux-aio@kvack.org,
	linux-btrfs@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com,
	linux-mm@kvack.org, linux-xfs@vger.kernel.org,
	linux-nfs@vger.kernel.org, linux-ntfs-dev@lists.sourceforge.net,
	ocfs2-devel@oss.oracle.com, linux-mtd@lists.infradead.org,
	virtualization@lists.linux-foundation.org,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v2 07/19] mm/migrate: Convert expected_page_refs() to folio_expected_refs()
Date: Fri, 8 Jul 2022 04:29:14 +0100	[thread overview]
Message-ID: <Ysekikp60Hhs5lV0@casper.infradead.org> (raw)
In-Reply-To: <6e7599d1-8a5f-bf16-383c-febd753bd051@google.com>

On Thu, Jul 07, 2022 at 07:50:17PM -0700, Hugh Dickins wrote:
> On Wed, 8 Jun 2022, Matthew Wilcox (Oracle) wrote:
> 
> > Now that both callers have a folio, convert this function to
> > take a folio & rename it.
> > 
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  mm/migrate.c | 19 ++++++++++++-------
> >  1 file changed, 12 insertions(+), 7 deletions(-)
> > 
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 2975f0c4d7cf..2e2f41572066 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -336,13 +336,18 @@ void pmd_migration_entry_wait(struct mm_struct *mm, pmd_t *pmd)
> >  }
> >  #endif
> >  
> > -static int expected_page_refs(struct address_space *mapping, struct page *page)
> > +static int folio_expected_refs(struct address_space *mapping,
> > +		struct folio *folio)
> >  {
> > -	int expected_count = 1;
> > +	int refs = 1;
> > +	if (!mapping)
> > +		return refs;
> >  
> > -	if (mapping)
> > -		expected_count += compound_nr(page) + page_has_private(page);
> > -	return expected_count;
> > +	refs += folio_nr_pages(folio);
> > +	if (folio_get_private(folio))
> > +		refs++;
> > +
> > +	return refs;
> >  }
> >  
> >  /*
> > @@ -359,7 +364,7 @@ int folio_migrate_mapping(struct address_space *mapping,
> >  	XA_STATE(xas, &mapping->i_pages, folio_index(folio));
> >  	struct zone *oldzone, *newzone;
> >  	int dirty;
> > -	int expected_count = expected_page_refs(mapping, &folio->page) + extra_count;
> > +	int expected_count = folio_expected_refs(mapping, folio) + extra_count;
> >  	long nr = folio_nr_pages(folio);
> >  
> >  	if (!mapping) {
> > @@ -669,7 +674,7 @@ static int __buffer_migrate_folio(struct address_space *mapping,
> >  		return migrate_page(mapping, &dst->page, &src->page, mode);
> >  
> >  	/* Check whether page does not have extra refs before we do more work */
> > -	expected_count = expected_page_refs(mapping, &src->page);
> > +	expected_count = folio_expected_refs(mapping, src);
> >  	if (folio_ref_count(src) != expected_count)
> >  		return -EAGAIN;
> >  
> > -- 
> > 2.35.1
> 
> This commit (742e89c9e352d38df1a5825fe40c4de73a5d5f7a in pagecache.git
> folio/for-next and recent linux-next) is dangerously wrong, at least
> for swapcache, and probably for some others.
> 
> I say "dangerously" because it tells page migration a swapcache page
> is safe for migration when it certainly is not.
> 
> The fun that typically ensues is kernel BUG at include/linux/mm.h:750!
> put_page_testzero() VM_BUG_ON_PAGE(page_ref_count(page) == 0, page),
> if CONFIG_DEBUG_VM=y (bisecting for that is what brought me to this).
> But I guess you might get silent data corruption too.
> 
> I assumed at first that you'd changed the rules, and were now expecting
> any subsystem that puts a non-zero value into folio->private to raise
> its refcount - whereas the old convention (originating with buffer heads)
> is that setting PG_private says an extra refcount has been taken, please
> call try_to_release_page() to lower it, and maybe that will use data in
> page->private to do so; but page->private free for the subsystem owning
> the page to use as it wishes, no refcount implication.  But that you
> had missed updating swapcache.
> 
> So I got working okay with the patch below; but before turning it into
> a proper patch, noticed that there were still plenty of other places
> applying the test for PG_private: so now think that maybe you set out
> with intention as above, realized it wouldn't work, but got distracted
> before cleaning up some places you'd already changed.  And patch below
> now goes in the wrong direction.
> 
> Or maybe you didn't intend any change, but the PG_private test just got
> missed in a few places.  I don't know, hope you remember, but current
> linux-next badly inconsistent.
> Over to you, thanks,

Ugh.  The problem I'm trying to solve is that we're short on page flags.
We _seemed_ to have correlation between "page->private != NULL" and
"PG_private is set", and so I thought I could make progress towards
removing PG_private.  But the rule you set out above wasn't written down
anywhere that I was able to find it.

I'm about to go to sleep, but I'll think on this some more tomorrow.

> Hugh
> 
> --- a/mm/migrate.c	2022-07-06 14:24:44.499941975 -0700
> +++ b/mm/migrate.c	2022-07-06 15:49:25.000000000 -0700
> @@ -351,6 +351,10 @@ unlock:
>  }
>  #endif
>  
> +static inline bool folio_counted_private(struct folio *folio)
> +{
> +	return !folio_test_swapcache(folio) && folio_get_private(folio);
> +}
>  static int folio_expected_refs(struct address_space *mapping,
>  		struct folio *folio)
>  {
> @@ -359,7 +363,7 @@ static int folio_expected_refs(struct ad
>  		return refs;
>  
>  	refs += folio_nr_pages(folio);
> -	if (folio_get_private(folio))
> +	if (folio_counted_private(folio))
>  		refs++;
>  
>  	return refs;
> --- a/mm/vmscan.c	2022-07-06 14:24:44.531942217 -0700
> +++ b/mm/vmscan.c	2022-07-06 15:49:37.000000000 -0700
> @@ -2494,6 +2494,10 @@ shrink_inactive_list(unsigned long nr_to
>   * The downside is that we have to touch folio->_refcount against each folio.
>   * But we had to alter folio->flags anyway.
>   */
> +static inline bool folio_counted_private(struct folio *folio)
> +{
> +	return !folio_test_swapcache(folio) && folio_get_private(folio);
> +}
>  static void shrink_active_list(unsigned long nr_to_scan,
>  			       struct lruvec *lruvec,
>  			       struct scan_control *sc,
> @@ -2538,8 +2542,9 @@ static void shrink_active_list(unsigned
>  		}
>  
>  		if (unlikely(buffer_heads_over_limit)) {
> -			if (folio_get_private(folio) && folio_trylock(folio)) {
> -				if (folio_get_private(folio))
> +			if (folio_counted_private(folio) &&
> +			    folio_trylock(folio)) {
> +				if (folio_counted_private(folio))
>  					filemap_release_folio(folio, 0);
>  				folio_unlock(folio);
>  			}

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

  reply	other threads:[~2022-07-08  3:29 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-08 15:02 [PATCH v2 00/19] Convert aops->migratepage to aops->migrate_folio Matthew Wilcox (Oracle)
2022-06-08 15:02 ` [PATCH v2 01/19] secretmem: Remove isolate_page Matthew Wilcox (Oracle)
2022-06-09 12:46   ` David Hildenbrand
2022-06-08 15:02 ` [PATCH v2 02/19] mm: Convert all PageMovable users to movable_operations Matthew Wilcox (Oracle)
2022-06-09 10:23   ` David Hildenbrand
2022-06-08 15:02 ` [PATCH v2 03/19] fs: Add aops->migrate_folio Matthew Wilcox (Oracle)
2022-06-09 12:50   ` David Hildenbrand
2022-06-09 14:35     ` Matthew Wilcox
2022-06-10 10:17       ` David Hildenbrand
2022-06-08 15:02 ` [PATCH v2 04/19] mm/migrate: Convert fallback_migrate_page() to fallback_migrate_folio() Matthew Wilcox (Oracle)
2022-06-08 15:02 ` [PATCH v2 05/19] mm/migrate: Convert writeout() to take a folio Matthew Wilcox (Oracle)
2022-06-08 15:02 ` [PATCH v2 06/19] mm/migrate: Convert buffer_migrate_page() to buffer_migrate_folio() Matthew Wilcox (Oracle)
2022-06-08 15:02 ` [PATCH v2 07/19] mm/migrate: Convert expected_page_refs() to folio_expected_refs() Matthew Wilcox (Oracle)
2022-07-08  2:50   ` Hugh Dickins
2022-07-08  3:29     ` Matthew Wilcox [this message]
2022-06-08 15:02 ` [PATCH v2 08/19] btrfs: Convert btree_migratepage to migrate_folio Matthew Wilcox (Oracle)
2022-06-09 16:25   ` David Sterba
2022-06-08 15:02 ` [PATCH v2 09/19] nfs: Convert " Matthew Wilcox (Oracle)
2022-06-08 15:02 ` [PATCH v2 10/19] mm/migrate: Convert migrate_page() to migrate_folio() Matthew Wilcox (Oracle)
2022-06-09 16:26   ` David Sterba
2022-06-08 15:02 ` [PATCH v2 11/19] mm/migrate: Add filemap_migrate_folio() Matthew Wilcox (Oracle)
2022-06-08 15:22   ` Darrick J. Wong
2022-06-08 15:02 ` [PATCH v2 12/19] btrfs: Convert btrfs_migratepage to migrate_folio Matthew Wilcox (Oracle)
2022-06-09 16:33   ` David Sterba
2022-06-09 17:40     ` Matthew Wilcox
2022-06-09 23:05       ` David Sterba
2022-06-08 15:02 ` [PATCH v2 13/19] ubifs: Convert to filemap_migrate_folio() Matthew Wilcox (Oracle)
2022-06-08 15:02 ` [PATCH v2 14/19] f2fs: " Matthew Wilcox (Oracle)
2022-06-15  8:11   ` [f2fs-dev] " Chao Yu
2022-06-08 15:02 ` [PATCH v2 15/19] aio: Convert to migrate_folio Matthew Wilcox (Oracle)
2022-06-08 15:02 ` [PATCH v2 16/19] hugetlb: " Matthew Wilcox (Oracle)
2022-06-09  3:53   ` Muchun Song
2022-06-09 20:51   ` Mike Kravetz
2022-06-08 15:02 ` [PATCH v2 17/19] secretmem: " Matthew Wilcox (Oracle)
2022-06-08 15:02 ` [PATCH v2 18/19] fs: Remove aops->migratepage() Matthew Wilcox (Oracle)
2022-06-08 15:02 ` [PATCH v2 19/19] mm/folio-compat: Remove migration compatibility functions Matthew Wilcox (Oracle)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Ysekikp60Hhs5lV0@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=cluster-devel@redhat.com \
    --cc=hch@lst.de \
    --cc=hughd@google.com \
    --cc=linux-aio@kvack.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-mtd@lists.infradead.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-ntfs-dev@lists.sourceforge.net \
    --cc=linux-xfs@vger.kernel.org \
    --cc=ocfs2-devel@oss.oracle.com \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).