All of lore.kernel.org
 help / color / mirror / Atom feed
* [GIT PULL] afs: Fixes for 3rd party-induced data corruption
@ 2021-09-20  7:09 David Howells
  0 siblings, 0 replies; 3+ messages in thread
From: David Howells @ 2021-09-20  7:09 UTC (permalink / raw)
  To: torvalds
  Cc: dhowells, Marc Dionne, Markus Suvanto, linux-afs, linux-fsdevel,
	linux-kernel

Hi Linus,

Can you pull these fixes for AFS that can cause data corruption due to
interaction with another client modifying data cached locally please[1]?

 (1) When d_revalidating a dentry, don't look at the inode to which it
     points.  Only check the directory to which the dentry belongs.  This
     was confusing things and causing the silly-rename cleanup code to
     remove the file now at the dentry of a file that got deleted.

 (2) Fix mmap data coherency.  When a callback break is received that
     relates to a file that we have cached, the data content may have been
     changed (there are other reasons, such as the user's rights having
     been changed).  However, we're checking it lazily, only on entry to
     the kernel, which doesn't happen if we have a writeable shared mapped
     page on that file.

     We make the kernel keep track of mmapped files and clear all PTEs
     mapping to that file as soon as the callback comes in by calling
     unmap_mapping_pages() (we don't necessarily want to zap the
     pagecache).  This causes the kernel to be reentered when userspace
     tries to access the mmapped address range again - and at that point we
     can query the server and, if we need to, zap the page cache.

     Ideally, I would check each file at the point of notification, but
     that involves poking the server[*] - which is holding an exclusive
     lock on the vnode it is changing, waiting for all the clients it
     notified to reply.  This could then deadlock against the server.
     Further, invalidating the pagecache might call ->launder_page(), which
     would try to write to the file, which would definitely deadlock.  (AFS
     doesn't lease file access).

     [*] Checking to see if the file content has changed is a matter of
     	 comparing the current data version number, but we have to ask the
     	 server for that.  We also need to get a new callback promise and
     	 we need to poke the server for that too.

 (3) Add some more points at which the inode is validated, since we're
     doing it lazily, notably in ->read_iter() and ->page_mkwrite(), but
     also when performing some directory operations.

     Ideally, checking in ->read_iter() would be done in some derivation of
     filemap_read().  If we're going to call the server to read the file,
     then we get the file status fetch as part of that.

 (4) The above is now causing us to make a lot more calls to afs_validate()
     to check the inode - and afs_validate() takes the RCU read lock each
     time to make a quick check (ie. afs_check_validity()).  This is
     entirely for the purpose of checking cb_s_break to see if the server
     we're using reinitialised its list of callbacks - however this isn't a
     very common event, so most of the time we're taking this needlessly.

     Add a new cell-wide counter to count the number of reinitialisations
     done by any server and check that - and only if that changes, take the
     RCU read lock and check the server list (the server list may change,
     but the cell a file is part of won't).

 (5) Don't update vnode->cb_s_break and ->cb_v_break inside the validity
     checking loop.  The cb_lock is done with read_seqretry, so we might go
     round the loop a second time after resetting those values - and that
     could cause someone else checking validity to miss something (I
     think).

Also included are patches for fixes for some bugs encountered whilst
debugging this.

 (6) Fix a leak of afs_read objects and fix a leak of keys hidden by that.

 (7) Fix a leak of pages that couldn't be added to extend a writeback.

 (8) Fix the maintenance of i_blocks when i_size is changed by a local
     write or a local dir edit[**].

     [**] Would you prefer this patch separately to the other patches?

David

Link: https://bugzilla.kernel.org/show_bug.cgi?id=214217 [1]
Link: https://lore.kernel.org/r/163111665183.283156.17200205573146438918.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163113612442.352844.11162345591911691150.stgit@warthog.procyon.org.uk/ # i_blocks patch
---

The following changes since commit b91db6a0b52e019b6bdabea3f1dbe36d85c7e52c:

  Merge tag 'for-5.15/io_uring-vfs-2021-08-30' of git://git.kernel.dk/linux-block (2021-08-30 19:39:59 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git tags/afs-fixes-20210913

for you to fetch changes up to 9d37e1cab2a9d2cee2737973fa455e6f89eee46a:

  afs: Fix updating of i_blocks on file/dir extension (2021-09-13 09:14:21 +0100)

----------------------------------------------------------------
AFS fixes

----------------------------------------------------------------
David Howells (8):
      afs: Fix missing put on afs_read objects and missing get on the key therein
      afs: Fix page leak
      afs: Add missing vnode validation checks
      afs: Fix incorrect triggering of sillyrename on 3rd-party invalidation
      afs: Fix mmap coherency vs 3rd-party changes
      afs: Try to avoid taking RCU read lock when checking vnode validity
      afs: Fix corruption in reads at fpos 2G-4G from an OpenAFS server
      afs: Fix updating of i_blocks on file/dir extension

 fs/afs/callback.c          | 44 ++++++++++++++++++++-
 fs/afs/cell.c              |  2 +
 fs/afs/dir.c               | 57 +++++++++------------------
 fs/afs/dir_edit.c          |  4 +-
 fs/afs/file.c              | 86 ++++++++++++++++++++++++++++++++++++++--
 fs/afs/fs_probe.c          |  8 +++-
 fs/afs/fsclient.c          | 31 +++++++++------
 fs/afs/inode.c             | 98 ++++++++++++++++++++--------------------------
 fs/afs/internal.h          | 21 ++++++++++
 fs/afs/protocol_afs.h      | 15 +++++++
 fs/afs/protocol_yfs.h      |  6 +++
 fs/afs/rotate.c            |  1 +
 fs/afs/server.c            |  2 +
 fs/afs/super.c             |  1 +
 fs/afs/write.c             | 29 +++++++++++---
 include/trace/events/afs.h |  8 +++-
 mm/memory.c                |  1 +
 17 files changed, 294 insertions(+), 120 deletions(-)
 create mode 100644 fs/afs/protocol_afs.h


^ permalink raw reply	[flat|nested] 3+ messages in thread
* [GIT PULL] afs: Fixes for 3rd party-induced data corruption
@ 2021-09-13 10:49 David Howells
  2021-09-20 23:32 ` pr-tracker-bot
  0 siblings, 1 reply; 3+ messages in thread
From: David Howells @ 2021-09-13 10:49 UTC (permalink / raw)
  To: torvalds
  Cc: dhowells, Marc Dionne, Markus Suvanto, linux-afs, linux-fsdevel,
	linux-kernel


Hi Linus,

Here are some fixes for AFS that can cause data corruption due to
interaction with another client modifying data cached locally[1].

 (1) When d_revalidating a dentry, don't look at the inode to which it
     points.  Only check the directory to which the dentry belongs.  This
     was confusing things and causing the silly-rename cleanup code to
     remove the file now at the dentry of a file that got deleted.

 (2) Fix mmap data coherency.  When a callback break is received that
     relates to a file that we have cached, the data content may have been
     changed (there are other reasons, such as the user's rights having
     been changed).  However, we're checking it lazily, only on entry to
     the kernel, which doesn't happen if we have a writeable shared mapped
     page on that file.

     We make the kernel keep track of mmapped files and clear all PTEs
     mapping to that file as soon as the callback comes in by calling
     unmap_mapping_pages() (we don't necessarily want to zap the
     pagecache).  This causes the kernel to be reentered when userspace
     tries to access the mmapped address range again - and at that point we
     can query the server and, if we need to, zap the page cache.

     Ideally, I would check each file at the point of notification, but
     that involves poking the server[*] - which is holding an exclusive
     lock on the vnode it is changing, waiting for all the clients it
     notified to reply.  This could then deadlock against the server.
     Further, invalidating the pagecache might call ->launder_page(), which
     would try to write to the file, which would definitely deadlock.  (AFS
     doesn't lease file access).

     [*] Checking to see if the file content has changed is a matter of
     	 comparing the current data version number, but we have to ask the
     	 server for that.  We also need to get a new callback promise and
     	 we need to poke the server for that too.

 (3) Add some more points at which the inode is validated, since we're
     doing it lazily, notably in ->read_iter() and ->page_mkwrite(), but
     also when performing some directory operations.

     Ideally, checking in ->read_iter() would be done in some derivation of
     filemap_read().  If we're going to call the server to read the file,
     then we get the file status fetch as part of that.

 (4) The above is now causing us to make a lot more calls to afs_validate()
     to check the inode - and afs_validate() takes the RCU read lock each
     time to make a quick check (ie. afs_check_validity()).  This is
     entirely for the purpose of checking cb_s_break to see if the server
     we're using reinitialised its list of callbacks - however this isn't a
     very common event, so most of the time we're taking this needlessly.

     Add a new cell-wide counter to count the number of reinitialisations
     done by any server and check that - and only if that changes, take the
     RCU read lock and check the server list (the server list may change,
     but the cell a file is part of won't).

 (5) Don't update vnode->cb_s_break and ->cb_v_break inside the validity
     checking loop.  The cb_lock is done with read_seqretry, so we might go
     round the loop a second time after resetting those values - and that
     could cause someone else checking validity to miss something (I
     think).

Also included are patches for fixes for some bugs encountered whilst
debugging this.

 (6) Fix a leak of afs_read objects and fix a leak of keys hidden by that.

 (7) Fix a leak of pages that couldn't be added to extend a writeback.

 (8) Fix the maintenance of i_blocks when i_size is changed by a local
     write or a local dir edit[**].

     [**] Would you prefer this patch separately to the other patches?

David

Link: https://bugzilla.kernel.org/show_bug.cgi?id=214217 [1]
Link: https://lore.kernel.org/r/163111665183.283156.17200205573146438918.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163113612442.352844.11162345591911691150.stgit@warthog.procyon.org.uk/ # i_blocks patch
---

The following changes since commit b91db6a0b52e019b6bdabea3f1dbe36d85c7e52c:

  Merge tag 'for-5.15/io_uring-vfs-2021-08-30' of git://git.kernel.dk/linux-block (2021-08-30 19:39:59 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git tags/afs-fixes-20210913

for you to fetch changes up to 9d37e1cab2a9d2cee2737973fa455e6f89eee46a:

  afs: Fix updating of i_blocks on file/dir extension (2021-09-13 09:14:21 +0100)

----------------------------------------------------------------
AFS fixes

----------------------------------------------------------------
David Howells (8):
      afs: Fix missing put on afs_read objects and missing get on the key therein
      afs: Fix page leak
      afs: Add missing vnode validation checks
      afs: Fix incorrect triggering of sillyrename on 3rd-party invalidation
      afs: Fix mmap coherency vs 3rd-party changes
      afs: Try to avoid taking RCU read lock when checking vnode validity
      afs: Fix corruption in reads at fpos 2G-4G from an OpenAFS server
      afs: Fix updating of i_blocks on file/dir extension

 fs/afs/callback.c          | 44 ++++++++++++++++++++-
 fs/afs/cell.c              |  2 +
 fs/afs/dir.c               | 57 +++++++++------------------
 fs/afs/dir_edit.c          |  4 +-
 fs/afs/file.c              | 86 ++++++++++++++++++++++++++++++++++++++--
 fs/afs/fs_probe.c          |  8 +++-
 fs/afs/fsclient.c          | 31 +++++++++------
 fs/afs/inode.c             | 98 ++++++++++++++++++++--------------------------
 fs/afs/internal.h          | 21 ++++++++++
 fs/afs/protocol_afs.h      | 15 +++++++
 fs/afs/protocol_yfs.h      |  6 +++
 fs/afs/rotate.c            |  1 +
 fs/afs/server.c            |  2 +
 fs/afs/super.c             |  1 +
 fs/afs/write.c             | 29 +++++++++++---
 include/trace/events/afs.h |  8 +++-
 mm/memory.c                |  1 +
 17 files changed, 294 insertions(+), 120 deletions(-)
 create mode 100644 fs/afs/protocol_afs.h


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-09-20 23:34 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-20  7:09 [GIT PULL] afs: Fixes for 3rd party-induced data corruption David Howells
  -- strict thread matches above, loose matches on Subject: below --
2021-09-13 10:49 David Howells
2021-09-20 23:32 ` pr-tracker-bot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.