* [PATCH] fix page_count in ->iomap_migrate_page()
@ 2018-12-14 5:56 zhangjun
2018-12-14 11:25 ` Richard Weinberger
2018-12-15 4:26 ` Gao Xiang
0 siblings, 2 replies; 8+ messages in thread
From: zhangjun @ 2018-12-14 5:56 UTC (permalink / raw)
To: Alexander Viro
Cc: linux-fsdevel, linux-kernel, Richard Weinberger,
Darrick J . Wong, zhangjun
IOMAP uses PG_private a little different with buffer_head based
filesystem.
It uses it as marker and when set, the page counter is not incremented,
migrate_page_move_mapping() assumes that PG_private indicates a counter
of +1.
so, we have to pass a extra count of -1 to migrate_page_move_mapping()
if the flag is set.
Signed-off-by: zhangjun <openzhangj@gmail.com>
---
fs/iomap.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/iomap.c b/fs/iomap.c
index 64ce240..352e58a 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -544,8 +544,17 @@ iomap_migrate_page(struct address_space *mapping, struct page *newpage,
struct page *page, enum migrate_mode mode)
{
int ret;
+ int extra_count = 0;
- ret = migrate_page_move_mapping(mapping, newpage, page, NULL, mode, 0);
+ /*
+ * IOMAP uses PG_private as marker and does not raise the page counter.
+ * migrate_page_move_mapping() expects a incremented counter if PG_private
+ * is set. Therefore pass -1 as extra_count for this case.
+ */
+ if (page_has_private(page))
+ extra_count = -1;
+ ret = migrate_page_move_mapping(mapping, newpage, page,
+ NULL, mode, extra_count);
if (ret != MIGRATEPAGE_SUCCESS)
return ret;
--
2.7.4
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] fix page_count in ->iomap_migrate_page()
2018-12-14 5:56 [PATCH] fix page_count in ->iomap_migrate_page() zhangjun
@ 2018-12-14 11:25 ` Richard Weinberger
2018-12-14 12:26 ` Gao Xiang
2018-12-15 10:51 ` Christoph Hellwig
2018-12-15 4:26 ` Gao Xiang
1 sibling, 2 replies; 8+ messages in thread
From: Richard Weinberger @ 2018-12-14 11:25 UTC (permalink / raw)
To: zhangjun
Cc: Alexander Viro, linux-fsdevel, linux-kernel, Darrick J . Wong,
hch, bfoster, darrick.wong, Dave Chinner, akpm, kirill.shutemov,
mhocko, n-horiguchi, mgorman, aarcange, willy, linux, linux-mm,
Gao Xiang
[CC'ing authors of the code plus mm folks]
Am Freitag, 14. Dezember 2018, 06:56:01 CET schrieb zhangjun:
> IOMAP uses PG_private a little different with buffer_head based
> filesystem.
> It uses it as marker and when set, the page counter is not incremented,
> migrate_page_move_mapping() assumes that PG_private indicates a counter
> of +1.
> so, we have to pass a extra count of -1 to migrate_page_move_mapping()
> if the flag is set.
>
> Signed-off-by: zhangjun <openzhangj@gmail.com>
> ---
> fs/iomap.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/fs/iomap.c b/fs/iomap.c
> index 64ce240..352e58a 100644
> --- a/fs/iomap.c
> +++ b/fs/iomap.c
> @@ -544,8 +544,17 @@ iomap_migrate_page(struct address_space *mapping, struct page *newpage,
> struct page *page, enum migrate_mode mode)
> {
> int ret;
> + int extra_count = 0;
>
> - ret = migrate_page_move_mapping(mapping, newpage, page, NULL, mode, 0);
> + /*
> + * IOMAP uses PG_private as marker and does not raise the page counter.
> + * migrate_page_move_mapping() expects a incremented counter if PG_private
> + * is set. Therefore pass -1 as extra_count for this case.
> + */
> + if (page_has_private(page))
> + extra_count = -1;
> + ret = migrate_page_move_mapping(mapping, newpage, page,
> + NULL, mode, extra_count);
> if (ret != MIGRATEPAGE_SUCCESS)
> return ret;
This is the third place which needs this workaround.
UBIFS, F2FS, and now iomap.
I agree with Dave that nobody can assume that PG_private implies an additional
page reference.
But page migration does that. Including parts of the write back code.
Thanks,
//richard
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] fix page_count in ->iomap_migrate_page()
2018-12-14 11:25 ` Richard Weinberger
@ 2018-12-14 12:26 ` Gao Xiang
2018-12-14 13:35 ` Richard Weinberger
2018-12-15 10:51 ` Christoph Hellwig
1 sibling, 1 reply; 8+ messages in thread
From: Gao Xiang @ 2018-12-14 12:26 UTC (permalink / raw)
To: Richard Weinberger
Cc: zhangjun, Alexander Viro, linux-fsdevel, linux-kernel,
Darrick J . Wong, hch, bfoster, Dave Chinner, akpm,
kirill.shutemov, mhocko, n-horiguchi, mgorman, aarcange, willy,
linux, linux-mm, Gao Xiang
Hi Richard,
On 2018/12/14 19:25, Richard Weinberger wrote:
> This is the third place which needs this workaround.
> UBIFS, F2FS, and now iomap.
>
> I agree with Dave that nobody can assume that PG_private implies an additional
> page reference.
> But page migration does that. Including parts of the write back code.
It seems that it's clearly documented in
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/mm.h#n780
* A pagecache page contains an opaque `private' member, which belongs to the
* page's address_space. Usually, this is the address of a circular list of
* the page's disk buffers. PG_private must be set to tell the VM to call
* into the filesystem to release these pages.
*
* A page may belong to an inode's memory mapping. In this case, page->mapping
* is the pointer to the inode, and page->index is the file offset of the page,
* in units of PAGE_SIZE.
*
* If pagecache pages are not associated with an inode, they are said to be
* anonymous pages. These may become associated with the swapcache, and in that
* case PG_swapcache is set, and page->private is an offset into the swapcache.
*
* In either case (swapcache or inode backed), the pagecache itself holds one
* reference to the page. Setting PG_private should also increment the
* refcount. The each user mapping also has a reference to the page.
and when I looked into that, I found
https://lore.kernel.org/lkml/3CB3CA93.D141680B@zip.com.au/
Thanks,
Gao Xiang
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] fix page_count in ->iomap_migrate_page()
2018-12-14 12:26 ` Gao Xiang
@ 2018-12-14 13:35 ` Richard Weinberger
2018-12-14 13:55 ` Gao Xiang
0 siblings, 1 reply; 8+ messages in thread
From: Richard Weinberger @ 2018-12-14 13:35 UTC (permalink / raw)
To: Gao Xiang, Artem Bityutskiy
Cc: zhangjun, Alexander Viro, linux-fsdevel, linux-kernel,
Darrick J . Wong, hch, bfoster, Dave Chinner, akpm,
kirill.shutemov, mhocko, n-horiguchi, mgorman, aarcange, willy,
linux, linux-mm, Gao Xiang
Am Freitag, 14. Dezember 2018, 13:26:28 CET schrieb Gao Xiang:
> Hi Richard,
>
> On 2018/12/14 19:25, Richard Weinberger wrote:
> > This is the third place which needs this workaround.
> > UBIFS, F2FS, and now iomap.
> >
> > I agree with Dave that nobody can assume that PG_private implies an additional
> > page reference.
> > But page migration does that. Including parts of the write back code.
>
> It seems that it's clearly documented in
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/mm.h#n780
>
> * A pagecache page contains an opaque `private' member, which belongs to the
> * page's address_space. Usually, this is the address of a circular list of
> * the page's disk buffers. PG_private must be set to tell the VM to call
> * into the filesystem to release these pages.
> *
> * A page may belong to an inode's memory mapping. In this case, page->mapping
> * is the pointer to the inode, and page->index is the file offset of the page,
> * in units of PAGE_SIZE.
> *
> * If pagecache pages are not associated with an inode, they are said to be
> * anonymous pages. These may become associated with the swapcache, and in that
> * case PG_swapcache is set, and page->private is an offset into the swapcache.
> *
> * In either case (swapcache or inode backed), the pagecache itself holds one
> * reference to the page. Setting PG_private should also increment the
> * refcount. The each user mapping also has a reference to the page.
>
> and when I looked into that, I found
> https://lore.kernel.org/lkml/3CB3CA93.D141680B@zip.com.au/
Hmm, in case of UBIFS it seems easy. We can add a get/put_page() around setting/clearing
the flag.
I did that now and so far none of my tests exploded.
Artem, do you remember why UBIFS never raised the page counter when setting PG_private?
Thanks,
//richard
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] fix page_count in ->iomap_migrate_page()
2018-12-14 13:35 ` Richard Weinberger
@ 2018-12-14 13:55 ` Gao Xiang
0 siblings, 0 replies; 8+ messages in thread
From: Gao Xiang @ 2018-12-14 13:55 UTC (permalink / raw)
To: Richard Weinberger
Cc: Artem Bityutskiy, zhangjun, Alexander Viro, linux-fsdevel,
linux-kernel, Darrick J . Wong, hch, bfoster, Dave Chinner, akpm,
kirill.shutemov, mhocko, n-horiguchi, mgorman, aarcange, willy,
linux, linux-mm, Gao Xiang
Hi Richard,
On 2018/12/14 21:35, Richard Weinberger wrote:
> Hmm, in case of UBIFS it seems easy. We can add a get/put_page() around setting/clearing
> the flag.
> I did that now and so far none of my tests exploded.
Yes, many existed codes are based on this restriction in order to be freeable race-free.
and that's it since PG_Private was once introduced at first by Andrew Morton in 2002
for many Linux versions....and it's not bad I think... :)
Thanks,
Gao Xiang
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] fix page_count in ->iomap_migrate_page()
2018-12-14 5:56 [PATCH] fix page_count in ->iomap_migrate_page() zhangjun
2018-12-14 11:25 ` Richard Weinberger
@ 2018-12-15 4:26 ` Gao Xiang
1 sibling, 0 replies; 8+ messages in thread
From: Gao Xiang @ 2018-12-15 4:26 UTC (permalink / raw)
To: zhangjun
Cc: Alexander Viro, linux-fsdevel, linux-kernel, Richard Weinberger,
Darrick J . Wong, Andrew Morton
On 2018/12/14 13:56, zhangjun wrote:
> IOMAP uses PG_private a little different with buffer_head based
> filesystem.
> It uses it as marker and when set, the page counter is not incremented,
> migrate_page_move_mapping() assumes that PG_private indicates a counter
> of +1.
> so, we have to pass a extra count of -1 to migrate_page_move_mapping()
> if the flag is set.
>
> Signed-off-by: zhangjun <openzhangj@gmail.com>
> ---
I found that it fixed in
https://patchwork.kernel.org/patch/10684835/
and has been merged in
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=61c6de667263184125d5ca75e894fcad632b0dd3
It seems it has been corrected by Piotr.
Thanks,
Gao Xiang
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] fix page_count in ->iomap_migrate_page()
2018-12-14 11:25 ` Richard Weinberger
2018-12-14 12:26 ` Gao Xiang
@ 2018-12-15 10:51 ` Christoph Hellwig
2018-12-15 11:17 ` Richard Weinberger
1 sibling, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2018-12-15 10:51 UTC (permalink / raw)
To: Richard Weinberger
Cc: zhangjun, Alexander Viro, linux-fsdevel, linux-kernel,
Darrick J . Wong, hch, bfoster, Dave Chinner, akpm,
kirill.shutemov, mhocko, n-horiguchi, mgorman, aarcange, willy,
linux, linux-mm, Gao Xiang
FYI, for iomap we got a patch to just increment the page count when
setting the private data, and it finally got merged into mainline after
a while.
Not that it totally makes sense to me, but it is what it is. It would
just be nice if set_page_private took care of it and we had a
clear_page_private to undo it, making the whole scheme at lot more
obvious.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] fix page_count in ->iomap_migrate_page()
2018-12-15 10:51 ` Christoph Hellwig
@ 2018-12-15 11:17 ` Richard Weinberger
0 siblings, 0 replies; 8+ messages in thread
From: Richard Weinberger @ 2018-12-15 11:17 UTC (permalink / raw)
To: Christoph Hellwig
Cc: zhangjun, Alexander Viro, linux-fsdevel, linux-kernel,
Darrick J . Wong, bfoster, Dave Chinner, akpm, kirill.shutemov,
mhocko, n-horiguchi, mgorman, aarcange, willy, linux, linux-mm,
Gao Xiang
Am Samstag, 15. Dezember 2018, 11:51:12 CET schrieb Christoph Hellwig:
> FYI, for iomap we got a patch to just increment the page count when
> setting the private data, and it finally got merged into mainline after
> a while.
>
> Not that it totally makes sense to me, but it is what it is. It would
> just be nice if set_page_private took care of it and we had a
> clear_page_private to undo it, making the whole scheme at lot more
> obvious.
Yeah, UBIFS will go the same route. I have already a patch prepared
which increments the page count when UBIFS sets PG_private.
Thanks,
//richard
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2018-12-15 11:17 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-14 5:56 [PATCH] fix page_count in ->iomap_migrate_page() zhangjun
2018-12-14 11:25 ` Richard Weinberger
2018-12-14 12:26 ` Gao Xiang
2018-12-14 13:35 ` Richard Weinberger
2018-12-14 13:55 ` Gao Xiang
2018-12-15 10:51 ` Christoph Hellwig
2018-12-15 11:17 ` Richard Weinberger
2018-12-15 4:26 ` Gao Xiang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).