* [PATCH] netfs: Only call folio_start_fscache() one time for each folio
From: Dave Wysochanski @ 2023-06-08 21:41 UTC
To: David Howells; +Cc: linux-cachefs, linux-nfs
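If a network filesystem using netfs implements a clamp_length()
function, it can set subrequest lengths smaller than the page size,
so a single folio may be covered by more than one subrequest.
When we loop through the folios in netfs_rreq_unlock_folios() to
mark any folios that are to be written to the cache, we need to make
sure we call folio_start_fscache() only once for each folio, because
a second call finds PG_private_2 already set and trips the
VM_BUG_ON_FOLIO() check. Otherwise, this simple testcase: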
mount -o fsc,rsize=1024,wsize=1024 127.0.0.1:/export /mnt/nfs
dd if=/dev/zero of=/mnt/nfs/file.bin bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 0.0126359 s, 324 kB/s
cat /mnt/nfs/file.bin > /dev/null
will trigger an oops similar to the following:
...
page dumped because: VM_BUG_ON_FOLIO(folio_test_private_2(folio))
------------[ cut here ]------------
kernel BUG at include/linux/netfs.h:44!
...
CPU: 5 PID: 134 Comm: kworker/u16:5 Kdump: loaded Not tainted 6.4.0-rc5
...
RIP: 0010:netfs_rreq_unlock_folios+0x68e/0x730 [netfs]
...
Call Trace:
<TASK>
netfs_rreq_assess+0x497/0x660 [netfs]
netfs_subreq_terminated+0x32b/0x610 [netfs]
nfs_netfs_read_completion+0x14e/0x1a0 [nfs]
nfs_read_completion+0x2f9/0x330 [nfs]
rpc_free_task+0x72/0xa0 [sunrpc]
rpc_async_release+0x46/0x70 [sunrpc]
process_one_work+0x3bd/0x710
worker_thread+0x89/0x610
kthread+0x181/0x1c0
ret_from_fork+0x29/0x50
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
---
fs/netfs/buffered_read.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index 3404707ddbe7..0dafd970c1b6 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -21,6 +21,7 @@ void netfs_rreq_unlock_folios(struct netfs_io_request *rreq)
 	pgoff_t last_page = ((rreq->start + rreq->len) / PAGE_SIZE) - 1;
 	size_t account = 0;
 	bool subreq_failed = false;
+	bool folio_started;
 
 	XA_STATE(xas, &rreq->mapping->i_pages, start_page);
 
@@ -53,6 +54,7 @@ void netfs_rreq_unlock_folios(struct netfs_io_request *rreq)
 
 		pg_end = folio_pos(folio) + folio_size(folio) - 1;
 
+		folio_started = false;
 		for (;;) {
 			loff_t sreq_end;
 
@@ -60,8 +62,10 @@ void netfs_rreq_unlock_folios(struct netfs_io_request *rreq)
 				pg_failed = true;
 				break;
 			}
-			if (test_bit(NETFS_SREQ_COPY_TO_CACHE, &subreq->flags))
+			if (!folio_started && test_bit(NETFS_SREQ_COPY_TO_CACHE, &subreq->flags)) {
 				folio_start_fscache(folio);
+				folio_started = true;
+			}
 			pg_failed |= subreq_failed;
 			sreq_end = subreq->start + subreq->len - 1;
 			if (pg_end < sreq_end)
--
2.31.1
* Re: [Linux-cachefs] [PATCH] netfs: Only call folio_start_fscache() one time for each folio
From: David Wysochanski @ 2023-07-24 14:20 UTC
To: David Howells; +Cc: linux-nfs, linux-cachefs
David,
Just a friendly ping on this patch, as I haven't seen any response
to it, nor could I find it in any tree.
Also, there is a Red Hat bugzilla for it, so the patch should have had:
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2210612
* Re: [PATCH] netfs: Only call folio_start_fscache() one time for each folio
From: Jeff Layton @ 2023-09-11 17:02 UTC
To: Dave Wysochanski, David Howells; +Cc: linux-cachefs, linux-nfs
On Thu, 2023-06-08 at 17:41 -0400, Dave Wysochanski wrote:
> If a network filesystem using netfs implements a clamp_length()
> function, it can set subrequest lengths smaller than a page size.
> When we loop through the folios in netfs_rreq_unlock_folios() to
> set any folios to be written back, we need to make sure we only
> call folio_start_fscache() once for each folio. Otherwise,
> this simple testcase:
> mount -o fsc,rsize=1024,wsize=1024 127.0.0.1:/export /mnt/nfs
> dd if=/dev/zero of=/mnt/nfs/file.bin bs=4096 count=1
> 1+0 records in
> 1+0 records out
> 4096 bytes (4.1 kB, 4.0 KiB) copied, 0.0126359 s, 324 kB/s
> cat /mnt/nfs/file.bin > /dev/null
>
> will trigger an oops similar to the following:
> ...
> page dumped because: VM_BUG_ON_FOLIO(folio_test_private_2(folio))
> ------------[ cut here ]------------
> kernel BUG at include/linux/netfs.h:44!
> ...
> CPU: 5 PID: 134 Comm: kworker/u16:5 Kdump: loaded Not tainted 6.4.0-rc5
> ...
> RIP: 0010:netfs_rreq_unlock_folios+0x68e/0x730 [netfs]
> ...
> Call Trace:
> <TASK>
> netfs_rreq_assess+0x497/0x660 [netfs]
> netfs_subreq_terminated+0x32b/0x610 [netfs]
> nfs_netfs_read_completion+0x14e/0x1a0 [nfs]
> nfs_read_completion+0x2f9/0x330 [nfs]
> rpc_free_task+0x72/0xa0 [sunrpc]
> rpc_async_release+0x46/0x70 [sunrpc]
> process_one_work+0x3bd/0x710
> worker_thread+0x89/0x610
> kthread+0x181/0x1c0
> ret_from_fork+0x29/0x50
>
> Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> ---
> fs/netfs/buffered_read.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
> index 3404707ddbe7..0dafd970c1b6 100644
> --- a/fs/netfs/buffered_read.c
> +++ b/fs/netfs/buffered_read.c
> @@ -21,6 +21,7 @@ void netfs_rreq_unlock_folios(struct netfs_io_request *rreq)
> pgoff_t last_page = ((rreq->start + rreq->len) / PAGE_SIZE) - 1;
> size_t account = 0;
> bool subreq_failed = false;
> + bool folio_started;
nit: I'd move this declaration inside the xas_for_each loop, and just
initialize it to false there.
>
> XA_STATE(xas, &rreq->mapping->i_pages, start_page);
>
> @@ -53,6 +54,7 @@ void netfs_rreq_unlock_folios(struct netfs_io_request *rreq)
>
> pg_end = folio_pos(folio) + folio_size(folio) - 1;
>
> + folio_started = false;
> for (;;) {
> loff_t sreq_end;
>
> @@ -60,8 +62,10 @@ void netfs_rreq_unlock_folios(struct netfs_io_request *rreq)
> pg_failed = true;
> break;
> }
> - if (test_bit(NETFS_SREQ_COPY_TO_CACHE, &subreq->flags))
> + if (!folio_started && test_bit(NETFS_SREQ_COPY_TO_CACHE, &subreq->flags)) {
> folio_start_fscache(folio);
> + folio_started = true;
> + }
> pg_failed |= subreq_failed;
> sreq_end = subreq->start + subreq->len - 1;
> if (pg_end < sreq_end)
The logic looks correct though.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
* Re: [PATCH] netfs: Only call folio_start_fscache() one time for each folio
From: Jeff Layton @ 2023-09-13 11:40 UTC
To: David Howells; +Cc: linux-cachefs, linux-nfs, Dave Wysochanski
David, can you review/merge this patch? This apparently fixes a panic
with NFS and fscache.
Thanks,
--
Jeff Layton <jlayton@kernel.org>
* Re: [PATCH] netfs: Only call folio_start_fscache() one time for each folio
From: David Howells @ 2023-09-15 13:31 UTC
To: Dave Wysochanski; +Cc: dhowells, linux-cachefs, linux-nfs
Okay, this looks reasonable. Should I apply Jeff's suggestion before I send
it to Linus?
David
* Re: [PATCH] netfs: Only call folio_start_fscache() one time for each folio
From: David Wysochanski @ 2023-09-15 18:41 UTC
To: David Howells; +Cc: linux-cachefs, linux-nfs
On Fri, Sep 15, 2023 at 9:31 AM David Howells <dhowells@redhat.com> wrote:
>
> Okay, this looks reasonable. Should I apply Jeff's suggestion before I send
> it to Linus?
>
> David
>
I will send a v2 with Jeff's suggestion added, as well as
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2210612