* [RFC PATCH 0/6] xfs: truncate vs page fault IO exclusion
From: Dave Chinner @ 2015-01-07 22:25 UTC (permalink / raw)
  To: xfs; +Cc: linux-fsdevel, linux-mm

Hi folks,

This patch set is an attempt to stop the XFS truncate and hole-punch
code from racing with page faults that enter the IO path. Excluding
page faults is traditionally deadlock prone because of the lock order
inversion between the filesystem IO path locks and the mmap_sem.

To avoid this issue, I have introduced a new "i_mmaplock" rwsem into
the XFS code similar to the IO lock, but this lock is only taken in
the mmap fault paths on entry into the filesystem (i.e. ->fault and
->page_mkwrite).

The concept is that if we invalidate the page cache over a range
after taking both the existing i_iolock and the new i_mmaplock, we
will have closed off every vector for repopulation of the page cache
over the invalidated range until one of the IO and mmap locks has
been dropped. That is, we can guarantee that neither the syscall IO
path nor page faults will race with whatever operation the filesystem
is performing...
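
To make the exclusion concrete, here is a minimal single-threaded
sketch of the idea. It is not the XFS code: the toy_* names, the
counter-based rwsem and the per-page "cached" flags are all
hypothetical, and trylock semantics stand in for a fault that would
really sleep on the lock.

```c
#include <assert.h>
#include <stdbool.h>

#define NPAGES 8

/* Toy rwsem: records lock state only, no blocking (assumption). */
struct toy_rwsem {
	int readers;
	bool writer;
};

static bool toy_down_read_trylock(struct toy_rwsem *s)
{
	if (s->writer)
		return false;
	s->readers++;
	return true;
}

static void toy_up_read(struct toy_rwsem *s)
{
	assert(s->readers > 0);
	s->readers--;
}

static bool toy_down_write_trylock(struct toy_rwsem *s)
{
	if (s->writer || s->readers > 0)
		return false;
	s->writer = true;
	return true;
}

static void toy_up_write(struct toy_rwsem *s)
{
	assert(s->writer);
	s->writer = false;
}

/* Hypothetical stand-in for an inode with its page cache. */
struct toy_inode {
	struct toy_rwsem i_iolock;
	struct toy_rwsem i_mmaplock;
	bool cached[NPAGES];
};

/* Page fault entry: i_mmaplock shared, then populate the page. */
static bool toy_fault(struct toy_inode *ip, int page)
{
	assert(page >= 0 && page < NPAGES);
	if (!toy_down_read_trylock(&ip->i_mmaplock))
		return false;	/* a real fault would sleep here */
	ip->cached[page] = true;
	toy_up_read(&ip->i_mmaplock);
	return true;
}

/* Syscall IO entry: i_iolock shared, then populate the page. */
static bool toy_buffered_read(struct toy_inode *ip, int page)
{
	assert(page >= 0 && page < NPAGES);
	if (!toy_down_read_trylock(&ip->i_iolock))
		return false;	/* a real read would sleep here */
	ip->cached[page] = true;
	toy_up_read(&ip->i_iolock);
	return true;
}

/* Hole punch: both locks exclusive, then invalidate the range. */
static bool toy_punch_begin(struct toy_inode *ip, int first, int last)
{
	assert(first >= 0 && first <= last && last < NPAGES);
	if (!toy_down_write_trylock(&ip->i_iolock))
		return false;
	if (!toy_down_write_trylock(&ip->i_mmaplock)) {
		toy_up_write(&ip->i_iolock);
		return false;
	}
	for (int i = first; i <= last; i++)
		ip->cached[i] = false;
	return true;
}

/* Drop the locks in the reverse order of acquisition. */
static void toy_punch_end(struct toy_inode *ip)
{
	toy_up_write(&ip->i_mmaplock);
	toy_up_write(&ip->i_iolock);
}
```

Between toy_punch_begin() and toy_punch_end(), both toy_fault() and
toy_buffered_read() fail, so nothing can repopulate the invalidated
range in this model; once the locks are dropped, both succeed again.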

The introduction of a new lock is necessary to avoid deadlocks due
to mmap_sem entanglement. It has a defined lock order during page
faults of:

mmap_sem
-> i_mmaplock (read)
   -> page lock
      -> i_lock (get blocks)

This lock is then taken by any extent manipulation code in XFS in
addition to the IO lock, which gives a lock ordering of:

i_iolock (write)
-> i_mmaplock (write)
   -> page lock (data writeback, page invalidation)
      -> i_lock (data writeback)
   -> i_lock (modification transaction)

Hence we have consistent lock ordering (which has been validated so
far by testing with lockdep enabled) for page fault IO vs
truncate, hole punch, extent shifts, etc.
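
As a rough illustration of what "consistent lock ordering" means for
the two chains above, the sketch below records each chain as edges in
a "held before" graph and looks for a cycle, which is conceptually
what lockdep checks. Everything here is hypothetical userspace code,
not lockdep or XFS. The syscall_io_chain (a read/write syscall
faulting on the user buffer while holding i_iolock) and the
naive_fault_chain (a fault path taking i_iolock instead of a new
lock) are my assumptions about the mmap_sem entanglement, added only
to show what an inversion looks like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum lock_class {
	MMAP_SEM, I_IOLOCK, I_MMAPLOCK, PAGE_LOCK, I_LOCK, NR_CLASSES
};

/* edge[a][b] is true when b was taken while a was held. */
struct order_graph {
	bool edge[NR_CLASSES][NR_CLASSES];
};

static void graph_init(struct order_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Record one acquisition chain, listed outermost lock first. */
static void graph_add_chain(struct order_graph *g,
			    const enum lock_class *chain, size_t n)
{
	assert(n >= 1);
	for (size_t i = 0; i + 1 < n; i++)
		g->edge[chain[i]][chain[i + 1]] = true;
}

/* Depth-first search; state 1 = on the current path, 2 = done. */
static bool graph_visit(const struct order_graph *g, int node, int *state)
{
	state[node] = 1;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (!g->edge[node][next])
			continue;
		if (state[next] == 1)
			return true;	/* back edge: order inversion */
		if (state[next] == 0 && graph_visit(g, next, state))
			return true;
	}
	state[node] = 2;
	return false;
}

static bool graph_has_cycle(const struct order_graph *g)
{
	int state[NR_CLASSES] = {0};

	for (int n = 0; n < NR_CLASSES; n++)
		if (state[n] == 0 && graph_visit(g, n, state))
			return true;
	return false;
}

/* The chains from the cover letter. */
static const enum lock_class fault_chain[] =
	{ MMAP_SEM, I_MMAPLOCK, PAGE_LOCK, I_LOCK };
static const enum lock_class extent_writeback_chain[] =
	{ I_IOLOCK, I_MMAPLOCK, PAGE_LOCK, I_LOCK };
static const enum lock_class extent_trans_chain[] =
	{ I_IOLOCK, I_MMAPLOCK, I_LOCK };

/* Assumption: syscall IO faults on the user buffer under i_iolock. */
static const enum lock_class syscall_io_chain[] =
	{ I_IOLOCK, MMAP_SEM };
/* Assumption: a fault path that reused i_iolock under mmap_sem. */
static const enum lock_class naive_fault_chain[] =
	{ MMAP_SEM, I_IOLOCK };
```

Adding the three documented chains plus syscall_io_chain leaves the
graph acyclic. Adding naive_fault_chain on top closes the loop
mmap_sem -> i_iolock -> mmap_sem, which is the kind of inversion a
separate i_mmaplock is meant to avoid.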

This patchset passes xfstests and various benchmarks and stress
workloads, so the real question is now:

	What have I missed?

Comments, thoughts, flames?

-Dave.



Thread overview: 43+ messages
2015-01-07 22:25 [RFC PATCH 0/6] xfs: truncate vs page fault IO exclusion Dave Chinner
2015-01-07 22:25 ` [RFC PATCH 1/6] xfs: introduce mmap/truncate lock Dave Chinner
2015-01-22 13:09   ` Brian Foster
2015-01-22 21:30     ` Dave Chinner
2015-01-07 22:25 ` [RFC PATCH 2/6] xfs: use i_mmaplock on read faults Dave Chinner
2015-01-07 22:25 ` [RFC PATCH 3/6] xfs: use i_mmaplock on write faults Dave Chinner
2015-01-07 22:25 ` [RFC PATCH 4/6] xfs: take i_mmap_lock on extent manipulation operations Dave Chinner
2015-01-22 13:23   ` Brian Foster
2015-01-22 21:32     ` Dave Chinner
2015-01-07 22:25 ` [RFC PATCH 5/6] xfs: xfs_setattr_size no longer races with page faults Dave Chinner
2015-01-07 22:25 ` [RFC PATCH 6/6] xfs: lock out page faults from extent swap operations Dave Chinner
2015-01-22 13:41   ` Brian Foster
2015-01-08 11:34 ` [RFC PATCH 0/6] xfs: truncate vs page fault IO exclusion Jan Kara
2015-01-08 12:24 ` Christoph Hellwig
2015-01-08 21:45   ` Dave Chinner
2015-01-12 17:42   ` Jan Kara
2015-01-21 22:26     ` Dave Chinner
