From: Nathan Chancellor <nathan@kernel.org>
To: David Howells <dhowells@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>,
	Steve French <smfrench@gmail.com>,
	Matthew Wilcox <willy@infradead.org>,
	Marc Dionne <marc.dionne@auristor.com>,
	Paulo Alcantara <pc@manguebit.com>,
	Shyam Prasad N <sprasad@microsoft.com>,
	Tom Talpey <tom@talpey.com>,
	Dominique Martinet <asmadeus@codewreck.org>,
	Eric Van Hensbergen <ericvh@kernel.org>,
	Ilya Dryomov <idryomov@gmail.com>,
	Christian Brauner <christian@brauner.io>,
	linux-cachefs@redhat.com, linux-afs@lists.infradead.org,
	linux-cifs@vger.kernel.org, linux-nfs@vger.kernel.org,
	ceph-devel@vger.kernel.org, v9fs@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 37/40] netfs: Optimise away reads above the point at which there can be no data
Date: Thu, 21 Dec 2023 16:01:53 -0700
Message-ID: <20231221230153.GA1607352@dev-arch.thelio-3990X>
In-Reply-To: <20231221132400.1601991-38-dhowells@redhat.com>

Hi David,

On Thu, Dec 21, 2023 at 01:23:32PM +0000, David Howells wrote:
> Track the file position above which the server is not expected to have any
> data (the "zero point") and preemptively assume that we can satisfy
> requests by filling them with zeroes locally rather than attempting to
> download them if they're over that line - even if we've written data back
> to the server.  Assume that any data that was written back above that
> position is held in the local cache.  Note that we have to split requests
> that straddle the line.
> 
> Make use of this to optimise away some reads from the server.  We need to
> set the zero point in the following circumstances:
> 
>  (1) When we see an extant remote inode and have no cache for it, we set
>      the zero_point to i_size.
> 
>  (2) On local inode creation, we set zero_point to 0.
> 
>  (3) On local truncation down, we reduce zero_point to the new i_size if
>      the new i_size is lower.
> 
>  (4) On local truncation up, we don't change zero_point.
> 
>  (5) On local modification, we don't change zero_point.
> 
>  (6) On remote invalidation, we set zero_point to the new i_size.
> 
>  (7) If stored data is discarded from the pagecache or culled from fscache,
>      we must set zero_point above that if the data also got written to the
>      server.
> 
>  (8) If dirty data is written back to the server, but not fscache, we must
>      set zero_point above that.
> 
>  (9) If a direct I/O write is made, set zero_point above that.
> 
> Assuming the above, any read from the server at or above the zero_point
> position will return all zeroes.
> 
> The zero_point value can be stored in the cache, provided the above rules
> are applied to it by any code that culls part of the local cache.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Jeff Layton <jlayton@kernel.org>
> cc: linux-cachefs@redhat.com
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-mm@kvack.org
> ---
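
To make the "split requests that straddle the line" rule above concrete,
here is a minimal userspace sketch. It is not the netfs code: struct
read_split, split_at_zero_point() and zero_point_after_truncate() are
hypothetical names used only for illustration, and the model only decides
between "download from the server" and "fill with zeroes locally". It
ignores data that may be held in the pagecache or in fscache.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in netfs. */
struct read_split {
	uint64_t fetch_start;	/* start of the part to download */
	uint64_t fetch_len;	/* bytes to download from the server */
	uint64_t zero_start;	/* start of the part to clear locally */
	uint64_t zero_len;	/* bytes to fill with zeroes locally */
};

/* Split a read of [start, start + len) at zero_point. */
static struct read_split split_at_zero_point(uint64_t start, uint64_t len,
					     uint64_t zero_point)
{
	struct read_split s = { 0 };
	uint64_t end = start + len;

	assert(end >= start);	/* the model does not handle offset overflow */

	if (start >= zero_point) {
		/* Entirely at or above the line: nothing to download. */
		s.zero_start = start;
		s.zero_len = len;
	} else if (end <= zero_point) {
		/* Entirely below the line: download all of it. */
		s.fetch_start = start;
		s.fetch_len = len;
	} else {
		/* Straddles the line: download the head, zero the tail. */
		s.fetch_start = start;
		s.fetch_len = zero_point - start;
		s.zero_start = zero_point;
		s.zero_len = end - zero_point;
	}
	return s;
}

/* Rules (3) and (4): truncation down lowers zero_point, up leaves it. */
static uint64_t zero_point_after_truncate(uint64_t zero_point,
					  uint64_t new_size)
{
	return new_size < zero_point ? new_size : zero_point;
}
```

With zero_point at 8192, a read of 8192 bytes starting at 4096 would be
split into a 4096-byte download followed by a 4096-byte local zero-fill
under this model.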

<snip>

> diff --git a/include/linux/netfs.h b/include/linux/netfs.h
> index 8cde618cf6d9..a5374218efe4 100644
> --- a/include/linux/netfs.h
> +++ b/include/linux/netfs.h
> @@ -136,6 +136,8 @@ struct netfs_inode {
>  	struct fscache_cookie	*cache;
>  #endif
>  	loff_t			remote_i_size;	/* Size of the remote file */
> +	loff_t			zero_point;	/* Size after which we assume there's no data
> +						 * on the server */
>  	unsigned long		flags;
>  #define NETFS_ICTX_ODIRECT	0		/* The file has DIO in progress */
>  #define NETFS_ICTX_UNBUFFERED	1		/* I/O should not use the pagecache */
> @@ -463,22 +465,30 @@ static inline void netfs_inode_init(struct netfs_inode *ctx,
>  {
>  	ctx->ops = ops;
>  	ctx->remote_i_size = i_size_read(&ctx->inode);
> +	ctx->zero_point = ctx->remote_i_size;
>  	ctx->flags = 0;
>  #if IS_ENABLED(CONFIG_FSCACHE)
>  	ctx->cache = NULL;
>  #endif
> +	/* ->releasepage() drives zero_point */
> +	mapping_set_release_always(ctx->inode.i_mapping);
>  }

I bisected a crash that I see when trying to mount an NFS volume to this
change, which is commit 6e3c8451f624 ("netfs: Optimise away reads above the
point at which there can be no data") in next-20231221:

  [   45.964963] BUG: kernel NULL pointer dereference, address: 0000000000000078
  [   45.964975] #PF: supervisor write access in kernel mode
  [   45.964982] #PF: error_code(0x0002) - not-present page
  [   45.964987] PGD 0 P4D 0
  [   45.964996] Oops: 0002 [#1] PREEMPT SMP NOPTI
  [   45.965004] CPU: 2 PID: 2419 Comm: mount.nfs Not tainted 6.7.0-rc6-next-20231221-debug-09925-g857647efa9be #1 adbbe7bc5037c662bc8f9b8e78ccf16be15b5e58
  [   45.965014] Hardware name: HP HP Desktop M01-F1xxx/87D6, BIOS F.12 12/17/2020
  [   45.965019] RIP: 0010:nfs_alloc_inode+0xa2/0xc0 [nfs]
  [   45.965092] Code: 80 b0 01 00 00 00 00 00 00 48 c7 80 38 04 00 00 00 f7 1e c2 48 c7 80 58 04 00 00 00 00 00 00 48 c7 80 40 04 00 00 00 00 00 00 <f0> 80 0a 80 48 05 b8 01 00 00 e9 5f 2b 20 f5 66 66 2e 0f 1f 84 00
  [   45.965099] RSP: 0018:ffffc900058f7bc0 EFLAGS: 00010286
  [   45.965107] RAX: ffff8881958c7290 RBX: ffff888168f0f800 RCX: 0000000000000000
  [   45.965112] RDX: 0000000000000078 RSI: ffffffffc2140a71 RDI: ffff88817a12b880
  [   45.965118] RBP: ffff888168f0f800 R08: ffffc900058f7b70 R09: 88728c958188ffff
  [   45.965123] R10: 000000000003a5c0 R11: 0000000000000005 R12: ffffffffc22f1a80
  [   45.965128] R13: ffffc900058f7c30 R14: 0000000000000000 R15: 0000000000000002
  [   45.965134] FS:  00007ff78c318740(0000) GS:ffff8887ff280000(0000) knlGS:0000000000000000
  [   45.965140] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   45.965146] CR2: 0000000000000078 CR3: 000000018a514000 CR4: 0000000000350ef0
  [   45.965152] Call Trace:
  [   45.965160]  <TASK>
  [   45.965167]  ? __die+0x23/0x70
  [   45.965183]  ? page_fault_oops+0x173/0x4e0
  [   45.965197]  ? nfs_alloc_inode+0x21/0xc0 [nfs aac4a012b174ef6e5996d0df3638a0616e82eb47]
  [   45.965279]  ? exc_page_fault+0x7e/0x180
  [   45.965291]  ? asm_exc_page_fault+0x26/0x30
  [   45.965308]  ? nfs_alloc_inode+0x21/0xc0 [nfs aac4a012b174ef6e5996d0df3638a0616e82eb47]
  [   45.965374]  ? nfs_alloc_inode+0xa2/0xc0 [nfs aac4a012b174ef6e5996d0df3638a0616e82eb47]
  [   45.965441]  alloc_inode+0x1e/0xc0
  [   45.965452]  ? __pfx_nfs_find_actor+0x10/0x10 [nfs aac4a012b174ef6e5996d0df3638a0616e82eb47]
  [   45.965517]  iget5_locked+0x97/0xf0
  [   45.965525]  ? __pfx_nfs_init_locked+0x10/0x10 [nfs aac4a012b174ef6e5996d0df3638a0616e82eb47]
  [   45.965593]  nfs_fhget+0xe4/0x700 [nfs aac4a012b174ef6e5996d0df3638a0616e82eb47]
  [   45.965666]  nfs_get_root+0xc6/0x4a0 [nfs aac4a012b174ef6e5996d0df3638a0616e82eb47]
  [   45.965732]  ? kernfs_rename_ns+0x85/0x210
  [   45.965754]  nfs_get_tree_common+0xc7/0x520 [nfs aac4a012b174ef6e5996d0df3638a0616e82eb47]
  [   45.965826]  vfs_get_tree+0x29/0xf0
  [   45.965836]  fc_mount+0x12/0x40
  [   45.965846]  do_nfs4_mount+0x12e/0x370 [nfsv4 9bac1f2bd94d7294fbbaf875b7b5cec5adc527f5]
  [   45.965946]  nfs4_try_get_tree+0x48/0xd0 [nfsv4 9bac1f2bd94d7294fbbaf875b7b5cec5adc527f5]
  [   45.966034]  vfs_get_tree+0x29/0xf0
  [   45.966041]  ? srso_return_thunk+0x5/0x5f
  [   45.966051]  path_mount+0x4ca/0xb10
  [   45.966063]  __x64_sys_mount+0x11a/0x150
  [   45.966074]  do_syscall_64+0x64/0xe0
  [   45.966083]  ? do_syscall_64+0x70/0xe0
  [   45.966090]  ? syscall_exit_to_user_mode+0x2b/0x40
  [   45.966098]  ? srso_return_thunk+0x5/0x5f
  [   45.966106]  ? do_syscall_64+0x70/0xe0
  [   45.966113]  ? srso_return_thunk+0x5/0x5f
  [   45.966121]  ? exc_page_fault+0x7e/0x180
  [   45.966130]  entry_SYSCALL_64_after_hwframe+0x6c/0x74
  [   45.966138] RIP: 0033:0x7ff78c5f2a1e
  ...

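It appears that ctx->inode.i_mapping is NULL in netfs_inode_init() when it
is called from nfs_alloc_inode(): the faulting write is to address 0x78
(CR2 and RDX above), which looks like a small field offset from a NULL
pointer. The patch below cures the problem for me, but I am not sure
whether it is the proper fix.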
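
To illustrate why a NULL check alone might not be the whole story, here is
a small userspace model. It rests on an assumption not verified in this
mail: that the allocation-time hook runs before the generic inode setup
points i_mapping at the inode's own mapping. All model_* names are
hypothetical stand-ins, and the flag bit value is illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct address_space and struct inode. */
struct model_mapping {
	unsigned long flags;
};

struct model_inode {
	struct model_mapping *i_mapping;	/* NULL until generic init runs */
	struct model_mapping i_data;
};

#define MODEL_RELEASE_ALWAYS	(1UL << 7)	/* illustrative bit only */

/* Models the guarded call: skip the flag if there is no mapping yet. */
static int model_netfs_inode_init(struct model_inode *inode)
{
	assert(inode != NULL);
	if (!inode->i_mapping)
		return 0;	/* nothing to set the flag on */
	inode->i_mapping->flags |= MODEL_RELEASE_ALWAYS;
	return 1;
}

/* Models the generic step that points i_mapping at the inode's own data. */
static void model_generic_init(struct model_inode *inode)
{
	inode->i_data.flags = 0;
	inode->i_mapping = &inode->i_data;
}

/* netfs init at allocation time, generic init afterwards. */
static unsigned long model_init_early(struct model_inode *inode)
{
	inode->i_mapping = NULL;
	model_netfs_inode_init(inode);
	model_generic_init(inode);
	return inode->i_mapping->flags;
}

/* Generic init first, netfs init once the mapping exists. */
static unsigned long model_init_late(struct model_inode *inode)
{
	inode->i_mapping = NULL;
	model_generic_init(inode);
	model_netfs_inode_init(inode);
	return inode->i_mapping->flags;
}
```

In this model the guarded early call no longer faults, but the flag also
never gets set on that path, whereas the late ordering sets it. If the
real ordering matches the assumption, the guard would avoid the oops while
leaving the mapping without the release-always behaviour for filesystems
that initialise this way.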

Cheers,
Nathan

diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index a5374218efe4..8daaba665421 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -471,7 +471,8 @@ static inline void netfs_inode_init(struct netfs_inode *ctx,
 	ctx->cache = NULL;
 #endif
 	/* ->releasepage() drives zero_point */
-	mapping_set_release_always(ctx->inode.i_mapping);
+	if (ctx->inode.i_mapping)
+		mapping_set_release_always(ctx->inode.i_mapping);
 }
 
 /**
