* [PATCH] fix page_count in ->iomap_migrate_page() @ 2018-12-14 5:56 zhangjun 2018-12-14 11:25 ` Richard Weinberger 2018-12-15 4:26 ` Gao Xiang 0 siblings, 2 replies; 8+ messages in thread From: zhangjun @ 2018-12-14 5:56 UTC (permalink / raw) To: Alexander Viro Cc: linux-fsdevel, linux-kernel, Richard Weinberger, Darrick J . Wong, zhangjun IOMAP uses PG_private a little different with buffer_head based filesystem. It uses it as marker and when set, the page counter is not incremented, migrate_page_move_mapping() assumes that PG_private indicates a counter of +1. so, we have to pass a extra count of -1 to migrate_page_move_mapping() if the flag is set. Signed-off-by: zhangjun <openzhangj@gmail.com> --- fs/iomap.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/fs/iomap.c b/fs/iomap.c index 64ce240..352e58a 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -544,8 +544,17 @@ iomap_migrate_page(struct address_space *mapping, struct page *newpage, struct page *page, enum migrate_mode mode) { int ret; + int extra_count = 0; - ret = migrate_page_move_mapping(mapping, newpage, page, NULL, mode, 0); + /* + * IOMAP uses PG_private as marker and does not raise the page counter. + * migrate_page_move_mapping() expects a incremented counter if PG_private + * is set. Therefore pass -1 as extra_count for this case. + */ + if (page_has_private(page)) + extra_count = -1; + ret = migrate_page_move_mapping(mapping, newpage, page, + NULL, mode, extra_count); if (ret != MIGRATEPAGE_SUCCESS) return ret; -- 2.7.4 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] fix page_count in ->iomap_migrate_page() 2018-12-14 5:56 [PATCH] fix page_count in ->iomap_migrate_page() zhangjun @ 2018-12-14 11:25 ` Richard Weinberger 2018-12-14 12:26 ` Gao Xiang 2018-12-15 10:51 ` Christoph Hellwig 2018-12-15 4:26 ` Gao Xiang 1 sibling, 2 replies; 8+ messages in thread From: Richard Weinberger @ 2018-12-14 11:25 UTC (permalink / raw) To: zhangjun Cc: Alexander Viro, linux-fsdevel, linux-kernel, Darrick J . Wong, hch, bfoster, darrick.wong, Dave Chinner, akpm, kirill.shutemov, mhocko, n-horiguchi, mgorman, aarcange, willy, linux, linux-mm, Gao Xiang [CC'ing authors of the code plus mm folks] Am Freitag, 14. Dezember 2018, 06:56:01 CET schrieb zhangjun: > IOMAP uses PG_private a little different with buffer_head based > filesystem. > It uses it as marker and when set, the page counter is not incremented, > migrate_page_move_mapping() assumes that PG_private indicates a counter > of +1. > so, we have to pass a extra count of -1 to migrate_page_move_mapping() > if the flag is set. > > Signed-off-by: zhangjun <openzhangj@gmail.com> > --- > fs/iomap.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/fs/iomap.c b/fs/iomap.c > index 64ce240..352e58a 100644 > --- a/fs/iomap.c > +++ b/fs/iomap.c > @@ -544,8 +544,17 @@ iomap_migrate_page(struct address_space *mapping, struct page *newpage, > struct page *page, enum migrate_mode mode) > { > int ret; > + int extra_count = 0; > > - ret = migrate_page_move_mapping(mapping, newpage, page, NULL, mode, 0); > + /* > + * IOMAP uses PG_private as marker and does not raise the page counter. > + * migrate_page_move_mapping() expects a incremented counter if PG_private > + * is set. Therefore pass -1 as extra_count for this case. > + */ > + if (page_has_private(page)) > + extra_count = -1; > + ret = migrate_page_move_mapping(mapping, newpage, page, > + NULL, mode, extra_count); > if (ret != MIGRATEPAGE_SUCCESS) > return ret; This is the third place which needs this workaround. UBIFS, F2FS, and now iomap. I agree with Dave that nobody can assume that PG_private implies an additional page reference. But page migration does that. Including parts of the write back code. Thanks, //richard ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] fix page_count in ->iomap_migrate_page() 2018-12-14 11:25 ` Richard Weinberger @ 2018-12-14 12:26 ` Gao Xiang 2018-12-14 13:35 ` Richard Weinberger 2018-12-15 10:51 ` Christoph Hellwig 1 sibling, 1 reply; 8+ messages in thread From: Gao Xiang @ 2018-12-14 12:26 UTC (permalink / raw) To: Richard Weinberger Cc: zhangjun, Alexander Viro, linux-fsdevel, linux-kernel, Darrick J . Wong, hch, bfoster, Dave Chinner, akpm, kirill.shutemov, mhocko, n-horiguchi, mgorman, aarcange, willy, linux, linux-mm, Gao Xiang Hi Richard, On 2018/12/14 19:25, Richard Weinberger wrote: > This is the third place which needs this workaround. > UBIFS, F2FS, and now iomap. > > I agree with Dave that nobody can assume that PG_private implies an additional > page reference. > But page migration does that. Including parts of the write back code. It seems that it's clearly documented in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/mm.h#n780 * A pagecache page contains an opaque `private' member, which belongs to the * page's address_space. Usually, this is the address of a circular list of * the page's disk buffers. PG_private must be set to tell the VM to call * into the filesystem to release these pages. * * A page may belong to an inode's memory mapping. In this case, page->mapping * is the pointer to the inode, and page->index is the file offset of the page, * in units of PAGE_SIZE. * * If pagecache pages are not associated with an inode, they are said to be * anonymous pages. These may become associated with the swapcache, and in that * case PG_swapcache is set, and page->private is an offset into the swapcache. * * In either case (swapcache or inode backed), the pagecache itself holds one * reference to the page. Setting PG_private should also increment the * refcount. The each user mapping also has a reference to the page. and when I looked into that, I found https://lore.kernel.org/lkml/3CB3CA93.D141680B@zip.com.au/ Thanks, Gao Xiang ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] fix page_count in ->iomap_migrate_page() 2018-12-14 12:26 ` Gao Xiang @ 2018-12-14 13:35 ` Richard Weinberger 2018-12-14 13:55 ` Gao Xiang 0 siblings, 1 reply; 8+ messages in thread From: Richard Weinberger @ 2018-12-14 13:35 UTC (permalink / raw) To: Gao Xiang, Artem Bityutskiy Cc: zhangjun, Alexander Viro, linux-fsdevel, linux-kernel, Darrick J . Wong, hch, bfoster, Dave Chinner, akpm, kirill.shutemov, mhocko, n-horiguchi, mgorman, aarcange, willy, linux, linux-mm, Gao Xiang Am Freitag, 14. Dezember 2018, 13:26:28 CET schrieb Gao Xiang: > Hi Richard, > > On 2018/12/14 19:25, Richard Weinberger wrote: > > This is the third place which needs this workaround. > > UBIFS, F2FS, and now iomap. > > > > I agree with Dave that nobody can assume that PG_private implies an additional > > page reference. > > But page migration does that. Including parts of the write back code. > > It seems that it's clearly documented in > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/mm.h#n780 > > * A pagecache page contains an opaque `private' member, which belongs to the > * page's address_space. Usually, this is the address of a circular list of > * the page's disk buffers. PG_private must be set to tell the VM to call > * into the filesystem to release these pages. > * > * A page may belong to an inode's memory mapping. In this case, page->mapping > * is the pointer to the inode, and page->index is the file offset of the page, > * in units of PAGE_SIZE. > * > * If pagecache pages are not associated with an inode, they are said to be > * anonymous pages. These may become associated with the swapcache, and in that > * case PG_swapcache is set, and page->private is an offset into the swapcache. > * > * In either case (swapcache or inode backed), the pagecache itself holds one > * reference to the page. Setting PG_private should also increment the > * refcount. The each user mapping also has a reference to the page. > > and when I looked into that, I found > https://lore.kernel.org/lkml/3CB3CA93.D141680B@zip.com.au/ Hmm, in case of UBIFS it seems easy. We can add a get/put_page() around setting/clearing the flag. I did that now and so far none of my tests exploded. Artem, do you remember why UBIFS never raised the page counter when setting PG_private? Thanks, //richard ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] fix page_count in ->iomap_migrate_page() 2018-12-14 13:35 ` Richard Weinberger @ 2018-12-14 13:55 ` Gao Xiang 0 siblings, 0 replies; 8+ messages in thread From: Gao Xiang @ 2018-12-14 13:55 UTC (permalink / raw) To: Richard Weinberger Cc: Artem Bityutskiy, zhangjun, Alexander Viro, linux-fsdevel, linux-kernel, Darrick J . Wong, hch, bfoster, Dave Chinner, akpm, kirill.shutemov, mhocko, n-horiguchi, mgorman, aarcange, willy, linux, linux-mm, Gao Xiang Hi Richard, On 2018/12/14 21:35, Richard Weinberger wrote: > Hmm, in case of UBIFS it seems easy. We can add a get/put_page() around setting/clearing > the flag. > I did that now and so far none of my tests exploded. Yes, many existed codes are based on this restriction in order to be freeable race-free. and that's it since PG_Private was once introduced at first by Andrew Morton in 2002 for many Linux versions....and it's not bad I think... :) Thanks, Gao Xiang ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] fix page_count in ->iomap_migrate_page() 2018-12-14 11:25 ` Richard Weinberger 2018-12-14 12:26 ` Gao Xiang @ 2018-12-15 10:51 ` Christoph Hellwig 2018-12-15 11:17 ` Richard Weinberger 1 sibling, 1 reply; 8+ messages in thread From: Christoph Hellwig @ 2018-12-15 10:51 UTC (permalink / raw) To: Richard Weinberger Cc: zhangjun, Alexander Viro, linux-fsdevel, linux-kernel, Darrick J . Wong, hch, bfoster, Dave Chinner, akpm, kirill.shutemov, mhocko, n-horiguchi, mgorman, aarcange, willy, linux, linux-mm, Gao Xiang FYI, for iomap we got a patch to just increment the page count when setting the private data, and it finally got merged into mainline after a while. Not that it totally makes sense to me, but it is what it is. It would just be nice if set_page_private took care of it and we had a clear_page_private to undo it, making the whole scheme at lot more obvious. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] fix page_count in ->iomap_migrate_page() 2018-12-15 10:51 ` Christoph Hellwig @ 2018-12-15 11:17 ` Richard Weinberger 0 siblings, 0 replies; 8+ messages in thread From: Richard Weinberger @ 2018-12-15 11:17 UTC (permalink / raw) To: Christoph Hellwig Cc: zhangjun, Alexander Viro, linux-fsdevel, linux-kernel, Darrick J . Wong, bfoster, Dave Chinner, akpm, kirill.shutemov, mhocko, n-horiguchi, mgorman, aarcange, willy, linux, linux-mm, Gao Xiang Am Samstag, 15. Dezember 2018, 11:51:12 CET schrieb Christoph Hellwig: > FYI, for iomap we got a patch to just increment the page count when > setting the private data, and it finally got merged into mainline after > a while. > > Not that it totally makes sense to me, but it is what it is. It would > just be nice if set_page_private took care of it and we had a > clear_page_private to undo it, making the whole scheme at lot more > obvious. Yeah, UBIFS will go the same route. I have already a patch prepared which increments the page count when UBIFS sets PG_private. Thanks, //richard ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] fix page_count in ->iomap_migrate_page() 2018-12-14 5:56 [PATCH] fix page_count in ->iomap_migrate_page() zhangjun 2018-12-14 11:25 ` Richard Weinberger @ 2018-12-15 4:26 ` Gao Xiang 1 sibling, 0 replies; 8+ messages in thread From: Gao Xiang @ 2018-12-15 4:26 UTC (permalink / raw) To: zhangjun Cc: Alexander Viro, linux-fsdevel, linux-kernel, Richard Weinberger, Darrick J . Wong, Andrew Morton On 2018/12/14 13:56, zhangjun wrote: > IOMAP uses PG_private a little different with buffer_head based > filesystem. > It uses it as marker and when set, the page counter is not incremented, > migrate_page_move_mapping() assumes that PG_private indicates a counter > of +1. > so, we have to pass a extra count of -1 to migrate_page_move_mapping() > if the flag is set. > > Signed-off-by: zhangjun <openzhangj@gmail.com> > --- I found that it fixed in https://patchwork.kernel.org/patch/10684835/ and has been merged in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=61c6de667263184125d5ca75e894fcad632b0dd3 It seems it has been corrected by Piotr. Thanks, Gao Xiang ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2018-12-15 11:17 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-12-14 5:56 [PATCH] fix page_count in ->iomap_migrate_page() zhangjun 2018-12-14 11:25 ` Richard Weinberger 2018-12-14 12:26 ` Gao Xiang 2018-12-14 13:35 ` Richard Weinberger 2018-12-14 13:55 ` Gao Xiang 2018-12-15 10:51 ` Christoph Hellwig 2018-12-15 11:17 ` Richard Weinberger 2018-12-15 4:26 ` Gao Xiang
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).