All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christoph Hellwig <hch@lst.de>
To: stable@vger.kernel.org
Cc: linux-xfs@vger.kernel.org, Brian Foster <bfoster@redhat.com>,
	"Darrick J . Wong" <darrick.wong@oracle.com>
Subject: [PATCH 10/25] xfs: prevent multi-fsb dir readahead from reading random blocks
Date: Sat,  3 Jun 2017 15:18:21 +0200	[thread overview]
Message-ID: <20170603131836.26661-11-hch@lst.de> (raw)
In-Reply-To: <20170603131836.26661-1-hch@lst.de>

From: Brian Foster <bfoster@redhat.com>

commit cb52ee334a45ae6c78a3999e4b473c43ddc528f4 upstream.

Directory block readahead uses a complex iteration mechanism to map
between high-level directory blocks and underlying physical extents.
This mechanism attempts to traverse the higher-level dir blocks in a
manner that handles multi-fsb directory blocks and simultaneously
maintains a reference to the corresponding physical blocks.

This logic doesn't handle certain (discontiguous) physical extent
layouts correctly with multi-fsb directory blocks. For example,
consider the case of a 4k FSB filesystem with a 2 FSB (8k) directory
block size and a directory with the following extent layout:

 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
   0: [0..7]:          88..95            0 (88..95)             8
   1: [8..15]:         80..87            0 (80..87)             8
   2: [16..39]:        168..191          0 (168..191)          24
   3: [40..63]:        5242952..5242975  1 (72..95)            24

Directory block 0 spans physical extents 0 and 1, dirblk 1 lies
entirely within extent 2 and dirblk 2 spans extents 2 and 3. Because
extent 2 is larger than the directory block size, the readahead code
erroneously assumes the block is contiguous and issues a readahead
based on the physical mapping of the first fsb of the dirblk. This
results in read verifier failure and a spurious corruption or crc
failure, depending on the filesystem format.

Further, the subsequent readahead code responsible for walking
through the physical table doesn't correctly advance the physical
block reference for dirblk 2. Instead of advancing two physical
filesystem blocks, the first iteration of the loop advances 1 block
(correctly), but the subsequent iteration advances 2 more physical
blocks because the next physical extent (extent 3, above) happens to
cover more than dirblk 2. At this point, the higher-level directory
block walking is completely off the rails of the actual physical
layout of the directory for the respective mapping table.

Update the contiguous dirblock logic to consider the current offset
in the physical extent to avoid issuing directory readahead to
unrelated blocks. Also, update the mapping table advancing code to
consider the current offset within the current dirblock to avoid
advancing the mapping reference too far beyond the dirblock.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_dir2_readdir.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
index ba95846f8f8a..eba63160be6c 100644
--- a/fs/xfs/xfs_dir2_readdir.c
+++ b/fs/xfs/xfs_dir2_readdir.c
@@ -405,7 +405,8 @@ xfs_dir2_leaf_readbuf(
 		 * Read-ahead a contiguous directory block.
 		 */
 		if (i > mip->ra_current &&
-		    map[mip->ra_index].br_blockcount >= geo->fsbcount) {
+		    (map[mip->ra_index].br_blockcount - mip->ra_offset) >=
+		    geo->fsbcount) {
 			xfs_dir3_data_readahead(dp,
 				map[mip->ra_index].br_startoff + mip->ra_offset,
 				XFS_FSB_TO_DADDR(dp->i_mount,
@@ -438,7 +439,7 @@ xfs_dir2_leaf_readbuf(
 			 * The rest of this extent but not more than a dir
 			 * block.
 			 */
-			length = min_t(int, geo->fsbcount,
+			length = min_t(int, geo->fsbcount - j,
 					map[mip->ra_index].br_blockcount -
 							mip->ra_offset);
 			mip->ra_offset += length;
-- 
2.11.0


  parent reply	other threads:[~2017-06-03 13:19 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-03 13:18 4.9-stable updates for XFS Christoph Hellwig
2017-06-03 13:18 ` [PATCH 01/25] xfs: verify inline directory data forks Christoph Hellwig
2017-06-05 14:04   ` Greg KH
2017-06-05 14:04     ` Greg KH
2017-06-03 13:18 ` [PATCH 02/25] xfs: rework the inline directory verifiers Christoph Hellwig
2017-06-05 14:06   ` Greg KH
2017-06-05 14:37     ` Christoph Hellwig
2017-06-03 13:18 ` [PATCH 03/25] xfs: fix kernel memory exposure problems Christoph Hellwig
2017-06-03 13:18 ` [PATCH 04/25] xfs: use dedicated log worker wq to avoid deadlock with cil wq Christoph Hellwig
2017-06-03 13:18 ` [PATCH 05/25] xfs: fix over-copying of getbmap parameters from userspace Christoph Hellwig
2017-06-03 13:18 ` [PATCH 06/25] xfs: actually report xattr extents via iomap Christoph Hellwig
2017-06-03 13:18 ` [PATCH 07/25] xfs: drop iolock from reclaim context to appease lockdep Christoph Hellwig
2017-06-03 13:18 ` [PATCH 08/25] xfs: fix integer truncation in xfs_bmap_remap_alloc Christoph Hellwig
2017-06-03 13:18 ` [PATCH 09/25] xfs: handle array index overrun in xfs_dir2_leaf_readbuf() Christoph Hellwig
2017-06-03 13:18 ` Christoph Hellwig [this message]
2017-06-03 13:18 ` [PATCH 11/25] xfs: fix up quotacheck buffer list error handling Christoph Hellwig
2017-06-03 13:18 ` [PATCH 12/25] xfs: support ability to wait on new inodes Christoph Hellwig
2017-06-03 13:18 ` [PATCH 13/25] xfs: update ag iterator to support " Christoph Hellwig
2017-06-03 13:18 ` [PATCH 14/25] xfs: wait on new inodes during quotaoff dquot release Christoph Hellwig
2017-06-03 13:18 ` [PATCH 15/25] xfs: reserve enough blocks to handle btree splits when remapping Christoph Hellwig
2017-06-03 13:18 ` [PATCH 16/25] xfs: fix use-after-free in xfs_finish_page_writeback Christoph Hellwig
2017-06-03 13:18 ` [PATCH 17/25] xfs: fix indlen accounting error on partial delalloc conversion Christoph Hellwig
2017-06-03 13:18 ` [PATCH 18/25] xfs: BMAPX shouldn't barf on inline-format directories Christoph Hellwig
2017-06-03 13:18 ` [PATCH 19/25] xfs: bad assertion for delalloc an extent that start at i_size Christoph Hellwig
2017-06-03 13:18 ` [PATCH 20/25] xfs: xfs_trans_alloc_empty Christoph Hellwig
2017-06-05 15:07   ` Patch "xfs: xfs_trans_alloc_empty" has been added to the 3.18-stable tree gregkh
2017-06-05 15:19     ` Greg KH
2017-06-05 15:07   ` Patch "xfs: xfs_trans_alloc_empty" has been added to the 4.11-stable tree gregkh
2017-06-05 15:08   ` Patch "xfs: xfs_trans_alloc_empty" has been added to the 4.9-stable tree gregkh
2017-06-03 13:18 ` [PATCH 21/25] xfs: avoid mount-time deadlock in CoW extent recovery Christoph Hellwig
2017-06-03 13:18 ` [PATCH 22/25] xfs: fix unaligned access in xfs_btree_visit_blocks Christoph Hellwig
2017-06-03 13:18 ` [PATCH 23/25] xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff() Christoph Hellwig
2017-06-03 13:18 ` [PATCH 24/25] xfs: Fix missed holes in SEEK_HOLE implementation Christoph Hellwig
2017-06-03 13:18 ` [PATCH 25/25] xfs: Fix off-by-in in loop termination in xfs_find_get_desired_pgoff() Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170603131836.26661-11-hch@lst.de \
    --to=hch@lst.de \
    --cc=bfoster@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.