From: Matthew Wilcox <willy@infradead.org>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Jan Kara <jack@suse.cz>,
	William Kucharski <william.kucharski@oracle.com>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, hch@lst.de,
	hannes@cmpxchg.org, yang.shi@linux.alibaba.com,
	dchinner@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 00/16] Overhaul multi-page lookups for THP
Date: Tue, 17 Nov 2020 19:15:13 +0000	[thread overview]
Message-ID: <20201117191513.GV29991@casper.infradead.org> (raw)
In-Reply-To: <alpine.LSU.2.11.2011170820030.1014@eggly.anvils>

On Tue, Nov 17, 2020 at 08:26:03AM -0800, Hugh Dickins wrote:
> On Tue, 17 Nov 2020, Matthew Wilcox wrote:
> > On Mon, Nov 16, 2020 at 02:34:34AM -0800, Hugh Dickins wrote:
> > > Fix to [PATCH v4 15/16] mm/truncate,shmem: Handle truncates that split THPs.
> > > One machine ran fine, swapping and building in ext4 on loop0 on huge tmpfs;
> > > one machine got occasional pages of zeros in its .o files; one machine couldn't
> > > get started because of ext4_find_dest_de errors on the newly mkfs'ed fs.
> > > The partial_end case was decided by PAGE_SIZE, when there might be a THP
> > > there.  The below patch has run well (for not very long), but I could
> > > easily have got it slightly wrong, off-by-one or whatever; and I have
> > > not looked into the similar code in mm/truncate.c, maybe that will need
> > > a similar fix or maybe not.
> > 
> > Thank you for the explanation in your later email!  There is indeed an
> > off-by-one, although in the safe direction.
> > 
> > > --- 5103w/mm/shmem.c	2020-11-12 15:46:21.075254036 -0800
> > > +++ 5103wh/mm/shmem.c	2020-11-16 01:09:35.431677308 -0800
> > > @@ -874,7 +874,7 @@ static void shmem_undo_range(struct inod
> > >  	long nr_swaps_freed = 0;
> > >  	pgoff_t index;
> > >  	int i;
> > > -	bool partial_end;
> > > +	bool same_page;
> > >  
> > >  	if (lend == -1)
> > >  		end = -1;	/* unsigned, so actually very big */
> > > @@ -907,16 +907,12 @@ static void shmem_undo_range(struct inod
> > >  		index++;
> > >  	}
> > >  
> > > -	partial_end = ((lend + 1) % PAGE_SIZE) > 0;
> > > +	same_page = (lstart >> PAGE_SHIFT) == end;
> > 
> > 'end' is exclusive, so this is always false.  Maybe something "obvious":
> > 
> > 	same_page = (lstart >> PAGE_SHIFT) == (lend >> PAGE_SHIFT);
> > 
> > (lend is inclusive, so lend in 0-4095 are all on the same page)
> 
> My brain is not yet in gear this morning, so I haven't given this the
> necessary thought; but I do have to question what you say there, and
> throw it back to you for further thought -
> 
> the first shmem_getpage(inode, lstart >> PAGE_SHIFT, &page, SGP_READ);
> the second shmem_getpage(inode, end, &page, SGP_READ).
> So same_page = (lstart >> PAGE_SHIFT) == end
> had seemed right to me.

I find both of these functions exceptionally confusing.  Does this
make it easier to understand?

@@ -859,22 +859,47 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend,
 {
        struct address_space *mapping = inode->i_mapping;
        struct shmem_inode_info *info = SHMEM_I(inode);
-       pgoff_t start = (lstart + PAGE_SIZE - 1) >> PAGE_SHIFT;
-       pgoff_t end = (lend + 1) >> PAGE_SHIFT;
+       pgoff_t start = lstart >> PAGE_SHIFT;
+       pgoff_t end = lend >> PAGE_SHIFT;
        struct pagevec pvec;
        pgoff_t indices[PAGEVEC_SIZE];
        struct page *page;
        long nr_swaps_freed = 0;
        pgoff_t index;
        int i;
-       bool same_page;
+       bool same_page = (start == end);
 
-       if (lend == -1)
-               end = -1;       /* unsigned, so actually very big */
+       page = NULL;
+       shmem_getpage(inode, start, &page, SGP_READ);
+       if (page) {
+               page = thp_head(page);
+               same_page = lend < page_offset(page) + thp_size(page);
+               set_page_dirty(page);
+               if (truncate_inode_partial_page(page, lstart, lend))
+                       start++;
+               else
+                       start = page->index + thp_nr_pages(page);
+               unlock_page(page);
+               put_page(page);
+               page = NULL;
+       }
+
+       if (!same_page)
+               shmem_getpage(inode, end, &page, SGP_READ);
+       if (page) {
+               page = thp_head(page);
+               set_page_dirty(page);
+               if (truncate_inode_partial_page(page, lstart, lend))
+                       end--;
+               else
+                       end = page->index - 1;
+               unlock_page(page);
+               put_page(page);
+       }
 
        pagevec_init(&pvec);
        index = start;
-       while (index < end && find_lock_entries(mapping, index, end - 1,
+       while (index <= end && find_lock_entries(mapping, index, end,
                        &pvec, indices)) {
                for (i = 0; i < pagevec_count(&pvec); i++) {
                        page = pvec.pages[i];
@@ -900,40 +925,11 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend,
                index++;
        }
 
-       same_page = (lend >> PAGE_SHIFT) == (lstart >> PAGE_SHIFT);
-       page = NULL;
-       shmem_getpage(inode, lstart >> PAGE_SHIFT, &page, SGP_READ);
-       if (page) {
-               page = thp_head(page);
-               same_page = lend < page_offset(page) + thp_size(page);
-               set_page_dirty(page);
-               if (!truncate_inode_partial_page(page, lstart, lend)) {
-                       start = page->index + thp_nr_pages(page);
-                       if (same_page)
-                               end = page->index;
-               }
-               unlock_page(page);
-               put_page(page);
-               page = NULL;
-       }
-
-       if (!same_page)
-               shmem_getpage(inode, end, &page, SGP_READ);
-       if (page) {
-               page = thp_head(page);
-               set_page_dirty(page);
-               if (!truncate_inode_partial_page(page, lstart, lend))
-                       end = page->index;
-               unlock_page(page);
-               put_page(page);
-       }
-
        index = start;
-       while (index < end) {
+       while (index <= end) {
                cond_resched();
 
-               if (!find_get_entries(mapping, index, end - 1, &pvec,
-                               indices)) {
+               if (!find_get_entries(mapping, index, end, &pvec, indices)) {
                        /* If all gone or hole-punch or unfalloc, we're done */
                        if (index == start || end != -1)
                                break;

That is, we change the definitions of start and end to the more natural
"index of the page which contains the first/last byte".  Then we deal with
the partial pages at the start and end of the range up front, and adjust
start and end appropriately.
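
To put concrete numbers on that convention, here is a throwaway userspace
sketch (not kernel code; PAGE_SHIFT and the byte offsets are made-up
assumptions, purely illustrative):

	#include <stdio.h>

	#define PAGE_SHIFT	12	/* assume 4kB pages */

	int main(void)
	{
		/* hypothetical range: remove bytes 0x3000..0x5fff inclusive */
		unsigned long long lstart = 0x3000, lend = 0x5fff;
		unsigned long start = lstart >> PAGE_SHIFT;	/* 3 */
		unsigned long end = lend >> PAGE_SHIFT;		/* 5 */
		unsigned long index;

		/* both loops in the patch now walk an inclusive range of page indices */
		for (index = start; index <= end; index++)
			printf("remove page index %lu\n", index);
		return 0;
	}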

I almost managed to get rid of 'same_page', until I thought about the case
where the page at start is a compound page and the split succeeds.  In that
case we have already dealt with the tail of the range and don't want to
deal with it again.
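
For instance, hand-working the numbers for that case (again a userspace
sketch with assumed THP geometry, not the real shmem code):

	#include <stdio.h>
	#include <stdbool.h>

	#define PAGE_SHIFT	12
	#define PAGE_SIZE	(1ULL << PAGE_SHIFT)

	int main(void)
	{
		/* assume a 2MB THP backing file bytes 0..0x1fffff (512 pages) */
		unsigned long long thp_offset = 0, thp_size = 512 * PAGE_SIZE;
		/* punch bytes 0x1000..0x2fff, entirely inside that THP */
		unsigned long long lstart = 0x1000, lend = 0x2fff;
		bool same_page = lend < thp_offset + thp_size;	/* true */

		printf("start=%llu end=%llu same_page=%d\n",
		       lstart >> PAGE_SHIFT, lend >> PAGE_SHIFT, same_page);
		/*
		 * With same_page true, the second shmem_getpage() for 'end' is
		 * skipped: if the THP at 'start' was split, its tail has
		 * already been dealt with by the first pass.
		 */
		return 0;
	}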

Thread overview: 58+ messages
2020-11-12 21:26 Matthew Wilcox (Oracle)
2020-11-12 21:26 ` [PATCH v4 01/16] mm: Make pagecache tagged lookups return only head pages Matthew Wilcox (Oracle)
2020-11-14  9:53   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 02/16] mm/shmem: Use pagevec_lookup in shmem_unlock_mapping Matthew Wilcox (Oracle)
2020-11-14  9:53   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 03/16] mm/swap: Optimise get_shadow_from_swap_cache Matthew Wilcox (Oracle)
2020-11-14  9:53   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 04/16] mm: Add FGP_ENTRY Matthew Wilcox (Oracle)
2020-11-14 10:00   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 05/16] mm/filemap: Rename find_get_entry to mapping_get_entry Matthew Wilcox (Oracle)
2020-11-14 10:01   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 06/16] mm/filemap: Add helper for finding pages Matthew Wilcox (Oracle)
2020-11-14 10:03   ` Christoph Hellwig
2020-11-14 15:15     ` Matthew Wilcox
2020-11-12 21:26 ` [PATCH v4 07/16] mm/filemap: Add mapping_seek_hole_data Matthew Wilcox (Oracle)
2020-11-14 10:04   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 08/16] iomap: Use mapping_seek_hole_data Matthew Wilcox (Oracle)
2020-11-14 10:06   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 09/16] mm: Add and use find_lock_entries Matthew Wilcox (Oracle)
2020-11-14 10:07   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 10/16] mm: Add an 'end' parameter to find_get_entries Matthew Wilcox (Oracle)
2020-11-14 10:08   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 11/16] mm: Add an 'end' parameter to pagevec_lookup_entries Matthew Wilcox (Oracle)
2020-11-14 10:19   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 12/16] mm: Remove nr_entries parameter from pagevec_lookup_entries Matthew Wilcox (Oracle)
2020-11-14 10:19   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 13/16] mm: Pass pvec directly to find_get_entries Matthew Wilcox (Oracle)
2020-11-14 10:21   ` Christoph Hellwig
2020-11-14 15:22     ` Matthew Wilcox
2020-11-12 21:26 ` [PATCH v4 14/16] mm: Remove pagevec_lookup_entries Matthew Wilcox (Oracle)
2020-11-14 10:22   ` Christoph Hellwig
2020-11-12 21:26 ` [PATCH v4 15/16] mm/truncate,shmem: Handle truncates that split THPs Matthew Wilcox (Oracle)
2020-11-12 21:26 ` [PATCH v4 16/16] mm/filemap: Return only head pages from find_get_entries Matthew Wilcox (Oracle)
2020-11-14 10:23   ` Christoph Hellwig
2020-11-16 10:34 ` [PATCH v4 00/16] Overhaul multi-page lookups for THP Hugh Dickins
2020-11-16 15:14   ` Matthew Wilcox
2020-11-16 21:27     ` Hugh Dickins
2020-11-17 15:39   ` Matthew Wilcox
2020-11-17 16:26     ` Hugh Dickins
2020-11-17 19:15       ` Matthew Wilcox [this message]
2020-11-17 23:43         ` Matthew Wilcox
2020-11-25  2:32           ` Matthew Wilcox
2020-11-25  2:50             ` Hugh Dickins
2020-11-25  2:56               ` Hugh Dickins
2020-11-25 23:08             ` Andrew Morton
2020-11-26  0:11               ` Hugh Dickins
2020-11-26 12:15                 ` Matthew Wilcox
2020-11-26 19:24                   ` Hugh Dickins
2020-11-26 20:07                     ` Matthew Wilcox
2020-11-30 19:45                       ` Hugh Dickins
2020-12-01  4:52                         ` Hugh Dickins
     [not found]             ` <CGME20201203154604eucas1p200d001d25dd344a1dd1c7da34f35aad0@eucas1p2.samsung.com>
2020-12-03 15:46               ` Marek Szyprowski
     [not found]                 ` <CGME20201203172725eucas1p2fddec1d269c55095859d490942b78b93@eucas1p2.samsung.com>
2020-12-03 17:27                   ` Marek Szyprowski
2020-12-03 21:27                     ` Qian Cai
2020-12-03 22:19                       ` Hugh Dickins
2020-12-03 21:45                     ` Hugh Dickins
2021-02-23 22:58 ` Andrew Morton
2021-02-23 23:27   ` Matthew Wilcox
