All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Jan Kara <jack@suse.cz>,
	William Kucharski <william.kucharski@oracle.com>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, hch@lst.de,
	hannes@cmpxchg.org, yang.shi@linux.alibaba.com,
	dchinner@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 00/16] Overhaul multi-page lookups for THP
Date: Mon, 16 Nov 2020 15:14:44 +0000	[thread overview]
Message-ID: <20201116151444.GB29991@casper.infradead.org> (raw)
In-Reply-To: <alpine.LSU.2.11.2011160128001.1206@eggly.anvils>

On Mon, Nov 16, 2020 at 02:34:34AM -0800, Hugh Dickins wrote:
> On Thu, 12 Nov 2020, Matthew Wilcox (Oracle) wrote:
> 
> > This THP prep patchset changes several page cache iteration APIs to only
> > return head pages.
> > 
> >  - It's only possible to tag head pages in the page cache, so only
> >    return head pages, not all their subpages.
> >  - Factor a lot of common code out of the various batch lookup routines
> >  - Add mapping_seek_hole_data()
> >  - Unify find_get_entries() and pagevec_lookup_entries()
> >  - Make find_get_entries only return head pages, like find_get_entry().
> > 
> > These are only loosely connected, but they seem to make sense together
> > as a series.
> > 
> > v4:
> >  - Add FGP_ENTRY, remove find_get_entry and find_lock_entry
> >  - Rename xas_find_get_entry to find_get_entry
> >  - Add "Optimise get_shadow_from_swap_cache"
> >  - Move "iomap: Use mapping_seek_hole_data" to this patch series
> >  - Rebase against next-20201112
> 
> I hope next-20201112 had nothing vital for this series, I applied
> it to v5.10-rc3, and have been busy testing huge tmpfs on that.

Thank you.  It's plain I'm not able to hit these cases ... I do run
xfstests against shmem, but that's obviously not good enough.  Can
you suggest something I should be running to improve my coverage?

> Fix to [PATCH v4 06/16] mm/filemap: Add helper for finding pages.
> I hit that VM_BUG_ON_PAGE(!thp_contains) when swapping, it is not
> safe without page lock, during the interval when shmem is moving a
> page between page cache and swap cache.  It could be tightened by
> passing in a new FGP to distinguish whether searching page or swap
> cache, but I think never tight enough in the swap case - because there
> is no rule for persisting page->private as there is for page->index.
> The best I could do is:

I'll just move this out to the caller who actually locks the page:

+++ b/mm/filemap.c
@@ -1839,7 +1839,6 @@ static inline struct page *find_get_entry(struct xa_state *xas, pgoff_t max,
                put_page(page);
                goto reset;
        }
-       VM_BUG_ON_PAGE(!thp_contains(page, xas->xa_index), page);
 
        return page;
 reset:
@@ -1923,6 +1922,8 @@ unsigned find_lock_entries(struct address_space *mapping, pgoff_t start,
                                goto put;
                        if (page->mapping != mapping || PageWriteback(page))
                                goto unlock;
+                       VM_BUG_ON_PAGE(!thp_contains(page, xas->xa_index),
+                                       page);
                }
                indices[pvec->nr] = xas.xa_index;
                if (!pagevec_add(pvec, page))

> Fix to [PATCH v4 07/16] mm/filemap: Add mapping_seek_hole_data.
> Crashed on a swap entry 0x2ff09, fairly obvious...

Whoops.  Thanks.

> Fix to [PATCH v4 15/16] mm/truncate,shmem: Handle truncates that split THPs.
> One machine ran fine, swapping and building in ext4 on loop0 on huge tmpfs;
> one machine got occasional pages of zeros in its .os; one machine couldn't
> get started because of ext4_find_dest_de errors on the newly mkfs'ed fs.
> The partial_end case was decided by PAGE_SIZE, when there might be a THP
> there.  The below patch has run well (for not very long), but I could
> easily have got it slightly wrong, off-by-one or whatever; and I have
> not looked into the similar code in mm/truncate.c, maybe that will need
> a similar fix or maybe not.
> 
> --- 5103w/mm/shmem.c	2020-11-12 15:46:21.075254036 -0800
> +++ 5103wh/mm/shmem.c	2020-11-16 01:09:35.431677308 -0800
> @@ -874,7 +874,7 @@ static void shmem_undo_range(struct inod
>  	long nr_swaps_freed = 0;
>  	pgoff_t index;
>  	int i;
> -	bool partial_end;
> +	bool same_page;
>  
>  	if (lend == -1)
>  		end = -1;	/* unsigned, so actually very big */
> @@ -907,16 +907,12 @@ static void shmem_undo_range(struct inod
>  		index++;
>  	}
>  
> -	partial_end = ((lend + 1) % PAGE_SIZE) > 0;
> +	same_page = (lstart >> PAGE_SHIFT) == end;
>  	page = NULL;
>  	shmem_getpage(inode, lstart >> PAGE_SHIFT, &page, SGP_READ);
>  	if (page) {
> -		bool same_page;
> -
>  		page = thp_head(page);
>  		same_page = lend < page_offset(page) + thp_size(page);
> -		if (same_page)
> -			partial_end = false;

I don't object to this patch at all, at least partly because it's shorter
and simpler!  I don't understand what it's solving, though.  The case
where there's a THP which covers partial_end is supposed to be handled
by the three lines above.

> Fix to [PATCH v4 15/16] mm/truncate,shmem: Handle truncates that split THPs.
> xfstests generic/012 on huge tmpfs hit this every time (when checking
> xfs_io commands available: later decides "not run" because no "fiemap").
> I grabbed this line unthinkingly from one of your later series, it fixes
> the crash; but once I actually thought about it when trying to track down
> weirder behaviours, realize that the kmap_atomic() and flush_dcache_page()
> in zero_user_segments() are not prepared for a THP - so highmem and
> flush_dcache_page architectures will be disappointed. If I searched
> through your other series, I might find the complete fix; or perhaps
> it's already there in linux-next, I haven't looked.

zero_user_segments() is fixed by "mm: Support THPs in zero_user_segments".
I think most recently posted here:
https://lore.kernel.org/linux-mm/20201026183136.10404-2-willy@infradead.org/

My fault for not realising this patch depended on that patch.  I did
test these patches stand-alone, but it didn't trigger this problem.

flush_dcache_page() needs to be called once for each sub-page.  We
really need a flush_dcache_thp() so that architectures can optimise this.
Although maybe now that's going to be called flush_dcache_folio().

> I also had noise from the WARN_ON(page_to_index(page) != index)
> in invalidate_inode_pages2_range(): but that's my problem, since
> for testing I add a dummy shmem_direct_IO() (return 0): for that
> I've now added a shmem_mapping() check at the head of pages2_range().

Ah, I have a later fix for invalidate_inode_pages2_range():
https://lore.kernel.org/linux-mm/20201026183136.10404-6-willy@infradead.org/

I didn't post it earlier because there aren't any filesystems currently
which use THPs and directIO ;-)

> That's all for now: I'll fire off more overnight testing.

Thanks!

  reply	other threads:[~2020-11-16 15:14 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-12 21:26 [PATCH v4 00/16] Overhaul multi-page lookups for THP Matthew Wilcox (Oracle)
2020-11-12 21:26 ` [PATCH v4 01/16] mm: Make pagecache tagged lookups return only head pages Matthew Wilcox (Oracle)
2020-11-14  9:53   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 02/16] mm/shmem: Use pagevec_lookup in shmem_unlock_mapping Matthew Wilcox (Oracle)
2020-11-14  9:53   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 03/16] mm/swap: Optimise get_shadow_from_swap_cache Matthew Wilcox (Oracle)
2020-11-14  9:53   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 04/16] mm: Add FGP_ENTRY Matthew Wilcox (Oracle)
2020-11-14 10:00   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 05/16] mm/filemap: Rename find_get_entry to mapping_get_entry Matthew Wilcox (Oracle)
2020-11-14 10:01   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 06/16] mm/filemap: Add helper for finding pages Matthew Wilcox (Oracle)
2020-11-14 10:03   ` Christoph Hellwig
2020-11-14 15:15     ` Matthew Wilcox
2020-11-12 21:26 ` [PATCH v4 07/16] mm/filemap: Add mapping_seek_hole_data Matthew Wilcox (Oracle)
2020-11-14 10:04   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 08/16] iomap: Use mapping_seek_hole_data Matthew Wilcox (Oracle)
2020-11-14 10:06   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 09/16] mm: Add and use find_lock_entries Matthew Wilcox (Oracle)
2020-11-14 10:07   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 10/16] mm: Add an 'end' parameter to find_get_entries Matthew Wilcox (Oracle)
2020-11-14 10:08   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 11/16] mm: Add an 'end' parameter to pagevec_lookup_entries Matthew Wilcox (Oracle)
2020-11-14 10:19   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 12/16] mm: Remove nr_entries parameter from pagevec_lookup_entries Matthew Wilcox (Oracle)
2020-11-14 10:19   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 13/16] mm: Pass pvec directly to find_get_entries Matthew Wilcox (Oracle)
2020-11-14 10:21   ` Christoph Hellwig
2020-11-14 15:22     ` Matthew Wilcox
2020-11-12 21:26 ` [PATCH v4 14/16] mm: Remove pagevec_lookup_entries Matthew Wilcox (Oracle)
2020-11-14 10:22   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 15/16] mm/truncate,shmem: Handle truncates that split THPs Matthew Wilcox (Oracle)
2020-11-12 21:26 ` [PATCH v4 16/16] mm/filemap: Return only head pages from find_get_entries Matthew Wilcox (Oracle)
2020-11-14 10:23   ` Christoph Hellwig
2020-11-16 10:34 ` [PATCH v4 00/16] Overhaul multi-page lookups for THP Hugh Dickins
2020-11-16 10:34   ` Hugh Dickins
2020-11-16 15:14   ` Matthew Wilcox [this message]
2020-11-16 21:27     ` Hugh Dickins
2020-11-16 21:27       ` Hugh Dickins
2020-11-17 15:39   ` Matthew Wilcox
2020-11-17 16:26     ` Hugh Dickins
2020-11-17 16:26       ` Hugh Dickins
2020-11-17 19:15       ` Matthew Wilcox
2020-11-17 23:43         ` Matthew Wilcox
2020-11-25  2:32           ` Matthew Wilcox
2020-11-25  2:50             ` Hugh Dickins
2020-11-25  2:50               ` Hugh Dickins
2020-11-25  2:56               ` Hugh Dickins
2020-11-25  2:56                 ` Hugh Dickins
2020-11-25 23:08             ` Andrew Morton
2020-11-26  0:11               ` Hugh Dickins
2020-11-26  0:11                 ` Hugh Dickins
2020-11-26 12:15                 ` Matthew Wilcox
2020-11-26 19:24                   ` Hugh Dickins
2020-11-26 19:24                     ` Hugh Dickins
2020-11-26 20:07                     ` Matthew Wilcox
2020-11-30 19:45                       ` Hugh Dickins
2020-11-30 19:45                         ` Hugh Dickins
2020-12-01  4:52                         ` Hugh Dickins
2020-12-01  4:52                           ` Hugh Dickins
     [not found]             ` <CGME20201203154604eucas1p200d001d25dd344a1dd1c7da34f35aad0@eucas1p2.samsung.com>
2020-12-03 15:46               ` Marek Szyprowski
     [not found]                 ` <CGME20201203172725eucas1p2fddec1d269c55095859d490942b78b93@eucas1p2.samsung.com>
2020-12-03 17:27                   ` Marek Szyprowski
2020-12-03 21:27                     ` Qian Cai
2020-12-03 21:27                       ` Qian Cai
2020-12-03 22:19                       ` Hugh Dickins
2020-12-03 22:19                         ` Hugh Dickins
2020-12-03 21:45                     ` Hugh Dickins
2020-12-03 21:45                       ` Hugh Dickins
2021-02-23 22:58 ` Andrew Morton
2021-02-23 23:27   ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201116151444.GB29991@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=dchinner@redhat.com \
    --cc=hannes@cmpxchg.org \
    --cc=hch@lst.de \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=william.kucharski@oracle.com \
    --cc=yang.shi@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.