* [PATCH/RFC 00/19] Support loop-back NFS mounts
From: NeilBrown @ 2014-04-16  4:03 UTC
  To: linux-mm, linux-nfs, linux-kernel
  Cc: xfs, Peter Zijlstra, Ingo Molnar, Ming Lei, netdev

A loop-back NFS mount is one where the NFS client and server run on
the same host.

The use-case for this is a high availability cluster with shared
storage.  The shared filesystem is mounted on any one machine and
NFS-mounted on the others.
If the NFS server fails, some other node will take over that service,
and then it will have a loop-back NFS mount which needs to keep
working.

This patch set addresses the "keep working" bit and specifically
addresses deadlocks and livelocks.
Allowing the fail-over itself to be deadlock free is a separate
challenge for another day.

The short description of how this works is:

deadlocks:
  - Elevate PF_FSTRANS to apply globally instead of just in NFS and XFS.
    PF_FSTRANS disables __GFP_FS in the same way that PF_MEMALLOC_NOIO
    disables __GFP_IO.
  - Set PF_FSTRANS in nfsd when handling requests related to
    memory reclaim, or requests which could block requests related
    to memory reclaim.
  - Use lockdep to find all consequent deadlocks from some other
    thread allocating memory while holding a lock that nfsd might
    want.
  - Fix those other deadlocks by setting PF_FSTRANS or using GFP_NOFS
    as appropriate.
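
To make the flag handling above concrete, here is a minimal user-space
sketch.  It is a model, not the kernel code: the bit values,
struct task_model, effective_gfp() and the two *_nested helpers are
stand-ins invented for illustration.  The helpers are only loosely
modelled on the current_{set,restore}_flags_nested pair that patch 01
promotes from XFS.

```c
#include <assert.h>

/* Illustrative bit values only; the real kernel constants differ. */
#define PF_MEMALLOC_NOIO 0x1u
#define PF_FSTRANS       0x2u
#define GFP_IO_BIT       0x10u  /* stands in for __GFP_IO */
#define GFP_FS_BIT       0x20u  /* stands in for __GFP_FS */
#define GFP_KERNEL_MODEL (GFP_IO_BIT | GFP_FS_BIT)

/* Hypothetical stand-in for the per-task state (current->flags). */
struct task_model {
	unsigned int flags;
};

/* Strip from a request whatever the task has forbidden for itself. */
static inline unsigned int effective_gfp(const struct task_model *t,
					 unsigned int gfp)
{
	if (t->flags & PF_MEMALLOC_NOIO)
		gfp &= ~GFP_IO_BIT;
	if (t->flags & PF_FSTRANS)
		gfp &= ~GFP_FS_BIT;
	return gfp;
}

/* Remember the old flags in *saved, then set the bits in f. */
static inline void set_flags_nested(struct task_model *t,
				    unsigned int *saved, unsigned int f)
{
	*saved = t->flags;
	t->flags |= f;
}

/* Put only the bits in f back to the state recorded in *saved. */
static inline void restore_flags_nested(struct task_model *t,
					const unsigned int *saved,
					unsigned int f)
{
	t->flags = (t->flags & ~f) | (*saved & f);
}
```

A caller would bracket the sensitive region with set_flags_nested()
and restore_flags_nested().  Because only the named bits are restored
from the saved value, an inner region cannot clear a flag that an
outer region set.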

livelocks:
  - Identify throttling during reclaim and bypass it when
    PF_LESS_THROTTLE is set.
  - Only set PF_LESS_THROTTLE for nfsd when handling write requests
    from the local host.
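
The livelock side can be sketched the same way.  Again this is a
user-space model under stated assumptions: struct worker_model,
struct request_model, begin_request(), end_request() and
should_throttle() are hypothetical names, and the flag value is made
up.  The sketch only shows the shape of the decision, not how the
patches implement it.

```c
#include <assert.h>
#include <stdbool.h>

#define PF_LESS_THROTTLE 0x4u	/* illustrative value, not the kernel's */

/* Hypothetical stand-ins for an nfsd thread and an incoming request. */
struct worker_model {
	unsigned int flags;
};

struct request_model {
	bool is_write;
	bool from_local_host;
};

/* Reclaim side: a task flagged PF_LESS_THROTTLE is never stalled. */
static inline bool should_throttle(const struct worker_model *w,
				   bool reclaim_congested)
{
	if (w->flags & PF_LESS_THROTTLE)
		return false;
	return reclaim_congested;
}

/* nfsd side: set the flag only for writes that arrive over loop-back.
 * Returns the previous flags so the caller can restore them. */
static inline unsigned int begin_request(struct worker_model *w,
					 const struct request_model *rq)
{
	unsigned int saved = w->flags;

	if (rq->is_write && rq->from_local_host)
		w->flags |= PF_LESS_THROTTLE;
	return saved;
}

/* Put PF_LESS_THROTTLE back to whatever it was before the request. */
static inline void end_request(struct worker_model *w, unsigned int saved)
{
	w->flags = (w->flags & ~PF_LESS_THROTTLE) |
		   (saved & PF_LESS_THROTTLE);
}
```

Keeping the flag scoped to local writes means remote clients are
still throttled as usual; only the case where throttling nfsd would
stall the very writeback that reclaim is waiting for is exempted.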

The last 12 patches address various deadlocks due to locking chains.
11 were found by lockdep, 2 by testing.  There is a reasonable chance
that there are more; I just need to exercise more code while
testing....

There is one issue that lockdep reports which I haven't fixed (I've
just hacked the code out for my testing).  That issue relates to
freeze_super().
I may not be interpreting the lockdep reports perfectly, but I think
they are basically saying that if I were to freeze a filesystem that
was exported to the local host, then we could end up deadlocking.
This is to be expected.  The NFS filesystem would need to be frozen
first.  I don't know how to tell lockdep that I know that is a problem
and I don't want to be warned about it.  Suggestions welcome.
Until this is addressed I cannot really ask others to test the code
with lockdep enabled.

There are more subsidiary places that I needed to add PF_FSTRANS than
I would have liked.  The thought keeps crossing my mind that maybe we
can get rid of __GFP_FS and require that memory reclaim never ever
block on a filesystem.  Then most of these patches go away.

Now that writeback doesn't happen from reclaim (but from kswapd), many
of the calls from reclaim to FS are gone.
The ->releasepage call is the only one that I *know* causes me
problems so I'd like to just say that that must never block.  I don't
really understand the consequences of that though.
There are a couple of other places where __GFP_FS is used and I'd need
to carefully analyze those.  But if someone just said "no, that is
impossible", I could be happy and stick with the current approach....
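
For what "must never block" could look like, here is a small
user-space sketch using a try-lock.  Everything in it is an
assumption for illustration: struct page_model and
release_page_nonblocking() are hypothetical, and a C11 atomic_flag
stands in for whatever lock a real filesystem would hold.  It says
nothing about whether existing ->releasepage implementations can be
converted this way.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical page whose filesystem-private data is guarded by a lock. */
struct page_model {
	atomic_flag busy;	/* set while the filesystem is using the page */
	bool has_private;	/* private data still attached */
};

/* Try to release the page.  Returns true on success, false if the
 * page is busy.  It never waits: a busy page is simply skipped. */
static inline bool release_page_nonblocking(struct page_model *p)
{
	if (atomic_flag_test_and_set(&p->busy))
		return false;		/* held elsewhere: give up */

	p->has_private = false;		/* drop the private data */
	atomic_flag_clear(&p->busy);
	return true;
}
```

The cost of this style is that reclaim may skip pages it could have
freed by waiting, which is the kind of consequence that would need
careful analysis.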

I've cc:ed Peter Zijlstra and Ingo Molnar only on the lockdep-related
patches, Ming Lei only on the PF_MEMALLOC_NOIO related patches,
and netdev only on the network-related patches.
There are probably other people I should CC.  Apologies if I missed you.
I'll ensure better coverage if the nfs/mm/xfs people are reasonably happy.

Comments, criticisms, etc most welcome.

Thanks,
NeilBrown


---

NeilBrown (19):
      Promote current_{set,restore}_flags_nested from xfs to global.
      lockdep: lockdep_set_current_reclaim_state should save old value
      lockdep: improve scenario messages for RECLAIM_FS errors.
      Make effect of PF_FSTRANS to disable __GFP_FS universal.
      SUNRPC: track whether a request is coming from a loop-back interface.
      nfsd: set PF_FSTRANS for nfsd threads.
      nfsd and VM: use PF_LESS_THROTTLE to avoid throttle in shrink_inactive_list.
      Set PF_FSTRANS while write_cache_pages calls ->writepage
      XFS: ensure xfs_file_*_read cannot deadlock in memory allocation.
      NET: set PF_FSTRANS while holding sk_lock
      FS: set PF_FSTRANS while holding mmap_sem in exec.c
      NET: set PF_FSTRANS while holding rtnl_lock
      MM: set PF_FSTRANS while allocating per-cpu memory to avoid deadlock.
      driver core: set PF_FSTRANS while holding gdp_mutex
      nfsd: set PF_FSTRANS when client_mutex is held.
      VFS: use GFP_NOFS rather than GFP_KERNEL in __d_alloc.
      VFS: set PF_FSTRANS while namespace_sem is held.
      nfsd: set PF_FSTRANS during nfsd4_do_callback_rpc.
      XFS: set PF_FSTRANS while ilock is held in xfs_free_eofblocks


 drivers/base/core.c             |    3 ++
 drivers/base/power/runtime.c    |    6 ++---
 drivers/block/nbd.c             |    6 ++---
 drivers/md/dm-bufio.c           |    6 ++---
 drivers/md/dm-ioctl.c           |    6 ++---
 drivers/mtd/nand/nandsim.c      |   28 ++++++---------------
 drivers/scsi/iscsi_tcp.c        |    6 ++---
 drivers/usb/core/hub.c          |    6 ++---
 fs/dcache.c                     |    4 ++-
 fs/exec.c                       |    6 +++++
 fs/fs-writeback.c               |    5 ++--
 fs/namespace.c                  |    4 +++
 fs/nfs/file.c                   |    3 +-
 fs/nfsd/nfs4callback.c          |    5 ++++
 fs/nfsd/nfs4state.c             |    3 ++
 fs/nfsd/nfssvc.c                |   24 ++++++++++++++----
 fs/nfsd/vfs.c                   |    6 +++++
 fs/xfs/kmem.h                   |    2 --
 fs/xfs/xfs_aops.c               |    7 -----
 fs/xfs/xfs_bmap_util.c          |    4 +++
 fs/xfs/xfs_file.c               |   12 +++++++++
 fs/xfs/xfs_linux.h              |    7 -----
 include/linux/lockdep.h         |    8 +++---
 include/linux/sched.h           |   32 +++++++++---------------
 include/linux/sunrpc/svc.h      |    2 ++
 include/linux/sunrpc/svc_xprt.h |    1 +
 include/net/sock.h              |    1 +
 kernel/locking/lockdep.c        |   51 ++++++++++++++++++++++++++++-----------
 kernel/softirq.c                |    6 ++---
 mm/migrate.c                    |    9 +++----
 mm/page-writeback.c             |    3 ++
 mm/page_alloc.c                 |   18 ++++++++------
 mm/percpu.c                     |    4 +++
 mm/slab.c                       |    2 ++
 mm/slob.c                       |    2 ++
 mm/slub.c                       |    1 +
 mm/vmscan.c                     |   31 +++++++++++++++---------
 net/core/dev.c                  |    6 ++---
 net/core/rtnetlink.c            |    9 ++++++-
 net/core/sock.c                 |    8 ++++--
 net/sunrpc/sched.c              |    5 ++--
 net/sunrpc/svc.c                |    6 +++++
 net/sunrpc/svcsock.c            |   10 ++++++++
 net/sunrpc/xprtrdma/transport.c |    5 ++--
 net/sunrpc/xprtsock.c           |   17 ++++++++-----
 45 files changed, 247 insertions(+), 149 deletions(-)
