* [PATCH v2 00/10] xfs: stable fixes for v4.19.y
@ 2019-02-04 16:54 Luis Chamberlain
  2019-02-04 16:54 ` [PATCH v2 01/10] xfs: Fix xqmstats offsets in /proc/fs/xfs/xqmstat Luis Chamberlain
                   ` (12 more replies)
  0 siblings, 13 replies; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-04 16:54 UTC (permalink / raw)
  To: linux-xfs, gregkh, Alexander.Levin
  Cc: stable, amir73il, hch, Luis Chamberlain

Kernel stable team,

here is a v2 respin of my XFS stable patches for v4.19.y. The only
change in this series is adding the upstream commit to the commit log,
and I've now also Cc'd stable@vger.kernel.org as well. No other issues
were spotted or raised with this series.

Reviews, questions, or rants are greatly appreciated.

  Luis

Brian Foster (1):
  xfs: fix shared extent data corruption due to missing cow reservation

Carlos Maiolino (1):
  xfs: Fix xqmstats offsets in /proc/fs/xfs/xqmstat

Christoph Hellwig (1):
  xfs: cancel COW blocks before swapext

Christophe JAILLET (1):
  xfs: Fix error code in 'xfs_ioc_getbmap()'

Darrick J. Wong (1):
  xfs: fix PAGE_MASK usage in xfs_free_file_space

Dave Chinner (3):
  xfs: fix overflow in xfs_attr3_leaf_verify
  xfs: fix transient reference count error in
    xfs_buf_resubmit_failed_buffers
  xfs: delalloc -> unwritten COW fork allocation can go wrong

Eric Sandeen (1):
  xfs: fix inverted return from xfs_btree_sblock_verify_crc

Ye Yin (1):
  fs/xfs: fix f_ffree value for statfs when project quota is set

 fs/xfs/libxfs/xfs_attr_leaf.c | 11 +++++++++--
 fs/xfs/libxfs/xfs_bmap.c      |  5 ++++-
 fs/xfs/libxfs/xfs_btree.c     |  2 +-
 fs/xfs/xfs_bmap_util.c        | 10 ++++++++--
 fs/xfs/xfs_buf_item.c         | 28 +++++++++++++++++++++-------
 fs/xfs/xfs_ioctl.c            |  2 +-
 fs/xfs/xfs_qm_bhv.c           |  2 +-
 fs/xfs/xfs_reflink.c          |  1 +
 fs/xfs/xfs_stats.c            |  2 +-
 9 files changed, 47 insertions(+), 16 deletions(-)

-- 
2.18.0



* [PATCH v2 01/10] xfs: Fix xqmstats offsets in /proc/fs/xfs/xqmstat
  2019-02-04 16:54 [PATCH v2 00/10] xfs: stable fixes for v4.19.y Luis Chamberlain
@ 2019-02-04 16:54 ` Luis Chamberlain
  2019-02-04 16:54 ` [PATCH v2 02/10] xfs: cancel COW blocks before swapext Luis Chamberlain
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-04 16:54 UTC (permalink / raw)
  To: linux-xfs, gregkh, Alexander.Levin
  Cc: stable, amir73il, hch, Carlos Maiolino, Dave Chinner, Luis Chamberlain

From: Carlos Maiolino <cmaiolino@redhat.com>

commit 41657e5507b13e963be906d5d874f4f02374fd5c upstream.

The addition of FIBT, RMAP and REFCOUNT changed the offsets into the
__xfsstats structure.

This caused xqmstat_proc_show() to display garbage data via
/proc/fs/xfs/xqmstat, since it relies on the offsets marked via macros.

Fix it.
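
To illustrate the layout problem, here is a standalone sketch with
made-up counter counts (the real cumulative offsets live in
fs/xfs/xfs_stats.h): each END_* value is a running total, so inserting
the FIBT/RMAP/REFCOUNT sections between the inode btree counters and
the quota counters moves the point where the quota range starts.

#include <stdio.h>

/* Illustrative offsets only; the real values live in fs/xfs/xfs_stats.h */
enum {
        END_IBT_V2   = 100,                /* end of inode btree counters */
        END_FIBT_V2  = END_IBT_V2 + 15,    /* newly added section */
        END_RMAP_V2  = END_FIBT_V2 + 15,   /* newly added section */
        END_REFCOUNT = END_RMAP_V2 + 15,   /* newly added section */
        END_XQMSTAT  = END_REFCOUNT + 8,   /* quota counters end here */
};

int main(void)
{
        /*
         * Looping from END_IBT_V2 now dumps the new btree counters,
         * not the quota counters, hence the garbage in xqmstat.
         */
        printf("stale start: %d, correct start: %d, end: %d\n",
               END_IBT_V2, END_REFCOUNT, END_XQMSTAT);
        return 0;
}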

Fixes: 00f4e4f9 xfs: add rmap btree stats infrastructure
Fixes: aafc3c24 xfs: support the XFS_BTNUM_FINOBT free inode btree type
Fixes: 46eeb521 xfs: introduce refcount btree definitions
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/xfs_stats.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_stats.c b/fs/xfs/xfs_stats.c
index 4e4423153071..740ac9674848 100644
--- a/fs/xfs/xfs_stats.c
+++ b/fs/xfs/xfs_stats.c
@@ -119,7 +119,7 @@ static int xqmstat_proc_show(struct seq_file *m, void *v)
 	int j;
 
 	seq_printf(m, "qm");
-	for (j = XFSSTAT_END_IBT_V2; j < XFSSTAT_END_XQMSTAT; j++)
+	for (j = XFSSTAT_END_REFCOUNT; j < XFSSTAT_END_XQMSTAT; j++)
 		seq_printf(m, " %u", counter_val(xfsstats.xs_stats, j));
 	seq_putc(m, '\n');
 	return 0;
-- 
2.18.0



* [PATCH v2 02/10] xfs: cancel COW blocks before swapext
  2019-02-04 16:54 [PATCH v2 00/10] xfs: stable fixes for v4.19.y Luis Chamberlain
  2019-02-04 16:54 ` [PATCH v2 01/10] xfs: Fix xqmstats offsets in /proc/fs/xfs/xqmstat Luis Chamberlain
@ 2019-02-04 16:54 ` Luis Chamberlain
  2019-02-04 16:54 ` [PATCH v2 03/10] xfs: Fix error code in 'xfs_ioc_getbmap()' Luis Chamberlain
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-04 16:54 UTC (permalink / raw)
  To: linux-xfs, gregkh, Alexander.Levin
  Cc: stable, amir73il, hch, Christoph Hellwig, Dave Chinner, Luis Chamberlain

From: Christoph Hellwig <hch@lst.de>

commit 96987eea537d6ccd98704a71958f9ba02da80843 upstream.

We need to make sure we have no outstanding COW blocks before we swap
extents, as there is nothing preventing us from having preallocated COW
delalloc on either inode that swapext is called on.  That case can
easily be reproduced by running generic/324 in always_cow mode:

[  620.760572] XFS: Assertion failed: tip->i_delayed_blks == 0, file: fs/xfs/xfs_bmap_util.c, line: 1669
[  620.761608] ------------[ cut here ]------------
[  620.762171] kernel BUG at fs/xfs/xfs_message.c:102!
[  620.762732] invalid opcode: 0000 [#1] SMP PTI
[  620.763272] CPU: 0 PID: 24153 Comm: xfs_fsr Tainted: G        W         4.19.0-rc1+ #4182
[  620.764203] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
[  620.765202] RIP: 0010:assfail+0x20/0x28
[  620.765646] Code: 31 ff e8 83 fc ff ff 0f 0b c3 48 89 f1 41 89 d0 48 c7 c6 48 ca 8d 82 48 89 fa 38
[  620.767758] RSP: 0018:ffffc9000898bc10 EFLAGS: 00010202
[  620.768359] RAX: 0000000000000000 RBX: ffff88012f14ba40 RCX: 0000000000000000
[  620.769174] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffff828560d9
[  620.769982] RBP: ffff88012f14b300 R08: 0000000000000000 R09: 0000000000000000
[  620.770788] R10: 000000000000000a R11: f000000000000000 R12: ffffc9000898bc98
[  620.771638] R13: ffffc9000898bc9c R14: ffff880130b5e2b8 R15: ffff88012a1fa2a8
[  620.772504] FS:  00007fdc36e0fbc0(0000) GS:ffff88013ba00000(0000) knlGS:0000000000000000
[  620.773475] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  620.774168] CR2: 00007fdc3604d000 CR3: 0000000132afc000 CR4: 00000000000006f0
[  620.774978] Call Trace:
[  620.775274]  xfs_swap_extent_forks+0x2a0/0x2e0
[  620.775792]  xfs_swap_extents+0x38b/0xab0
[  620.776256]  xfs_ioc_swapext+0x121/0x140
[  620.776709]  xfs_file_ioctl+0x328/0xc90
[  620.777154]  ? rcu_read_lock_sched_held+0x50/0x60
[  620.777694]  ? xfs_iunlock+0x233/0x260
[  620.778127]  ? xfs_setattr_nonsize+0x3be/0x6a0
[  620.778647]  do_vfs_ioctl+0x9d/0x680
[  620.779071]  ? ksys_fchown+0x47/0x80
[  620.779552]  ksys_ioctl+0x35/0x70
[  620.780040]  __x64_sys_ioctl+0x11/0x20
[  620.780530]  do_syscall_64+0x4b/0x190
[  620.780927]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  620.781467] RIP: 0033:0x7fdc364d0f07
[  620.781900] Code: b3 66 90 48 8b 05 81 5f 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 28
[  620.784044] RSP: 002b:00007ffe2a766038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  620.784896] RAX: ffffffffffffffda RBX: 0000000000000025 RCX: 00007fdc364d0f07
[  620.785667] RDX: 0000560296ca2fc0 RSI: 00000000c0c0586d RDI: 0000000000000005
[  620.786398] RBP: 0000000000000025 R08: 0000000000001200 R09: 0000000000000000
[  620.787283] R10: 0000000000000432 R11: 0000000000000246 R12: 0000000000000005
[  620.788051] R13: 0000000000000000 R14: 0000000000001000 R15: 0000000000000006
[  620.788927] Modules linked in:
[  620.789340] ---[ end trace 9503b7417ffdbdb0 ]---
[  620.790065] RIP: 0010:assfail+0x20/0x28
[  620.790642] Code: 31 ff e8 83 fc ff ff 0f 0b c3 48 89 f1 41 89 d0 48 c7 c6 48 ca 8d 82 48 89 fa 38
[  620.793038] RSP: 0018:ffffc9000898bc10 EFLAGS: 00010202
[  620.793609] RAX: 0000000000000000 RBX: ffff88012f14ba40 RCX: 0000000000000000
[  620.794317] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffff828560d9
[  620.795025] RBP: ffff88012f14b300 R08: 0000000000000000 R09: 0000000000000000
[  620.795778] R10: 000000000000000a R11: f000000000000000 R12: ffffc9000898bc98
[  620.796675] R13: ffffc9000898bc9c R14: ffff880130b5e2b8 R15: ffff88012a1fa2a8
[  620.797782] FS:  00007fdc36e0fbc0(0000) GS:ffff88013ba00000(0000) knlGS:0000000000000000
[  620.798908] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  620.799594] CR2: 00007fdc3604d000 CR3: 0000000132afc000 CR4: 00000000000006f0
[  620.800424] Kernel panic - not syncing: Fatal exception
[  620.801191] Kernel Offset: disabled
[  620.801597] ---[ end Kernel panic - not syncing: Fatal exception ]---

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/xfs_bmap_util.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 6de8d90041ff..9d1e5c3a661e 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1824,6 +1824,12 @@ xfs_swap_extents(
 	if (error)
 		goto out_unlock;
 
+	if (xfs_inode_has_cow_data(tip)) {
+		error = xfs_reflink_cancel_cow_range(tip, 0, NULLFILEOFF, true);
+		if (error)
+			return error;
+	}
+
 	/*
 	 * Extent "swapping" with rmap requires a permanent reservation and
 	 * a block reservation because it's really just a remap operation
-- 
2.18.0



* [PATCH v2 03/10] xfs: Fix error code in 'xfs_ioc_getbmap()'
  2019-02-04 16:54 [PATCH v2 00/10] xfs: stable fixes for v4.19.y Luis Chamberlain
  2019-02-04 16:54 ` [PATCH v2 01/10] xfs: Fix xqmstats offsets in /proc/fs/xfs/xqmstat Luis Chamberlain
  2019-02-04 16:54 ` [PATCH v2 02/10] xfs: cancel COW blocks before swapext Luis Chamberlain
@ 2019-02-04 16:54 ` Luis Chamberlain
  2019-02-04 16:54 ` [PATCH v2 04/10] xfs: fix overflow in xfs_attr3_leaf_verify Luis Chamberlain
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-04 16:54 UTC (permalink / raw)
  To: linux-xfs, gregkh, Alexander.Levin
  Cc: stable, amir73il, hch, Christophe JAILLET, Darrick J . Wong,
	Luis Chamberlain

From: Christophe JAILLET <christophe.jaillet@wanadoo.fr>

commit 132bf6723749f7219c399831eeb286dbbb985429 upstream.

In this function, once 'buf' has been allocated, we unconditionally
return 0.
However, 'error' is set to some error codes in several error handling
paths.
Before commit 232b51948b99 ("xfs: simplify the xfs_getbmap interface")
this was not an issue because all error paths were returning directly,
but now that some cleanup at the end may be needed, we must propagate the
error code.
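
As a rough standalone sketch of that pattern (hypothetical helper and
error values, not the actual xfs_ioc_getbmap() body): once the error
paths funnel through a common cleanup label, the final return must hand
back 'error' rather than a constant.

#include <errno.h>
#include <stdlib.h>
#include <stdio.h>

/* Hypothetical helper standing in for the real bmap work */
static int step_one(void *buf)
{
        (void)buf;
        return -EINVAL;
}

static int do_op(void)
{
        void *buf;
        int error;

        buf = malloc(64);
        if (!buf)
                return -ENOMEM;         /* direct return: error preserved */

        error = step_one(buf);
        if (error)
                goto out_free_buf;      /* error must survive the cleanup */

        error = 0;
out_free_buf:
        free(buf);
        return error;                   /* "return 0;" here would discard it */
}

int main(void)
{
        printf("do_op() = %d\n", do_op());
        return 0;
}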

Fixes: 232b51948b99 ("xfs: simplify the xfs_getbmap interface")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/xfs_ioctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 0ef5ece5634c..bad90479ade2 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1616,7 +1616,7 @@ xfs_ioc_getbmap(
 	error = 0;
 out_free_buf:
 	kmem_free(buf);
-	return 0;
+	return error;
 }
 
 struct getfsmap_info {
-- 
2.18.0



* [PATCH v2 04/10] xfs: fix overflow in xfs_attr3_leaf_verify
  2019-02-04 16:54 [PATCH v2 00/10] xfs: stable fixes for v4.19.y Luis Chamberlain
                   ` (2 preceding siblings ...)
  2019-02-04 16:54 ` [PATCH v2 03/10] xfs: Fix error code in 'xfs_ioc_getbmap()' Luis Chamberlain
@ 2019-02-04 16:54 ` Luis Chamberlain
  2019-02-04 16:54 ` [PATCH v2 05/10] xfs: fix shared extent data corruption due to missing cow reservation Luis Chamberlain
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-04 16:54 UTC (permalink / raw)
  To: linux-xfs, gregkh, Alexander.Levin
  Cc: stable, amir73il, hch, Dave Chinner, Darrick J . Wong, Luis Chamberlain

From: Dave Chinner <dchinner@redhat.com>

commit 837514f7a4ca4aca06aec5caa5ff56d33ef06976 upstream.

generic/070 on 64k block size filesystems is failing with a verifier
corruption on writeback of an attribute leaf block:

[   94.973083] XFS (pmem0): Metadata corruption detected at xfs_attr3_leaf_verify+0x246/0x260, xfs_attr3_leaf block 0x811480
[   94.975623] XFS (pmem0): Unmount and run xfs_repair
[   94.976720] XFS (pmem0): First 128 bytes of corrupted metadata buffer:
[   94.978270] 000000004b2e7b45: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00  ........;.......
[   94.980268] 000000006b1db90b: 00 00 00 00 00 81 14 80 00 00 00 00 00 00 00 00  ................
[   94.982251] 00000000433f2407: 22 7b 5c 82 2d 5c 47 4c bb 31 1c 37 fa a9 ce d6  "{\.-\GL.1.7....
[   94.984157] 0000000010dc7dfb: 00 00 00 00 00 81 04 8a 00 0a 18 e8 dd 94 01 00  ................
[   94.986215] 00000000d5a19229: 00 a0 dc f4 fe 98 01 68 f0 d8 07 e0 00 00 00 00  .......h........
[   94.988171] 00000000521df36c: 0c 2d 32 e2 fe 20 01 00 0c 2d 58 65 fe 0c 01 00  .-2.. ...-Xe....
[   94.990162] 000000008477ae06: 0c 2d 5b 66 fe 8c 01 00 0c 2d 71 35 fe 7c 01 00  .-[f.....-q5.|..
[   94.992139] 00000000a4a6bca6: 0c 2d 72 37 fc d4 01 00 0c 2d d8 b8 f0 90 01 00  .-r7.....-......
[   94.994789] XFS (pmem0): xfs_do_force_shutdown(0x8) called from line 1453 of file fs/xfs/xfs_buf.c. Return address = ffffffff815365f3

This is failing this check:

                end = ichdr.freemap[i].base + ichdr.freemap[i].size;
                if (end < ichdr.freemap[i].base)
>>>>>                   return __this_address;
                if (end > mp->m_attr_geo->blksize)
                        return __this_address;

And from the buffer output above, the freemap array is:

	freemap[0].base = 0x00a0
	freemap[0].size = 0xdcf4	end = 0xdd94
	freemap[1].base = 0xfe98
	freemap[1].size = 0x0168	end = 0x10000
	freemap[2].base = 0xf0d8
	freemap[2].size = 0x07e0	end = 0xf8b8

These all look valid - the block size is 0x10000 and so from the
last check in the above verifier fragment we know that the end
of freemap[1] is valid. The problem is that end is declared as:

	uint16_t	end;

And (uint16_t)0x10000 = 0. So we have a verifier bug here, not a
corruption. Fix the verifier to use uint32_t types for the check and
hence avoid the overflow.
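
A standalone demonstration of the truncation, using the freemap[1]
numbers from the dump above (the variable names are mine, not the
verifier's):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        uint16_t base = 0xfe98, size = 0x0168;  /* freemap[1] from the dump */
        uint16_t end16 = base + size;           /* wraps to 0 */
        uint32_t end32 = (uint32_t)base + size; /* 0x10000 */

        /*
         * 0x10000 is exactly the 64k block size, so end32 passes the
         * "end > blksize" check, while end16 == 0 falsely trips the
         * "end < base" corruption check.
         */
        printf("16-bit end: 0x%x, 32-bit end: 0x%x\n", end16, end32);
        return 0;
}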

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=201577
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/libxfs/xfs_attr_leaf.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index 6fc5425b1474..2652d00842d6 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -243,7 +243,7 @@ xfs_attr3_leaf_verify(
 	struct xfs_mount		*mp = bp->b_target->bt_mount;
 	struct xfs_attr_leafblock	*leaf = bp->b_addr;
 	struct xfs_attr_leaf_entry	*entries;
-	uint16_t			end;
+	uint32_t			end;	/* must be 32bit - see below */
 	int				i;
 
 	xfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &ichdr, leaf);
@@ -293,6 +293,11 @@ xfs_attr3_leaf_verify(
 	/*
 	 * Quickly check the freemap information.  Attribute data has to be
 	 * aligned to 4-byte boundaries, and likewise for the free space.
+	 *
+	 * Note that for 64k block size filesystems, the freemap entries cannot
+	 * overflow as they are only be16 fields. However, when checking end
+	 * pointer of the freemap, we have to be careful to detect overflows and
+	 * so use uint32_t for those checks.
 	 */
 	for (i = 0; i < XFS_ATTR_LEAF_MAPSIZE; i++) {
 		if (ichdr.freemap[i].base > mp->m_attr_geo->blksize)
@@ -303,7 +308,9 @@ xfs_attr3_leaf_verify(
 			return __this_address;
 		if (ichdr.freemap[i].size & 0x3)
 			return __this_address;
-		end = ichdr.freemap[i].base + ichdr.freemap[i].size;
+
+		/* be care of 16 bit overflows here */
+		end = (uint32_t)ichdr.freemap[i].base + ichdr.freemap[i].size;
 		if (end < ichdr.freemap[i].base)
 			return __this_address;
 		if (end > mp->m_attr_geo->blksize)
-- 
2.18.0



* [PATCH v2 05/10] xfs: fix shared extent data corruption due to missing cow reservation
  2019-02-04 16:54 [PATCH v2 00/10] xfs: stable fixes for v4.19.y Luis Chamberlain
                   ` (3 preceding siblings ...)
  2019-02-04 16:54 ` [PATCH v2 04/10] xfs: fix overflow in xfs_attr3_leaf_verify Luis Chamberlain
@ 2019-02-04 16:54 ` Luis Chamberlain
  2019-02-04 16:54 ` [PATCH v2 06/10] xfs: fix transient reference count error in xfs_buf_resubmit_failed_buffers Luis Chamberlain
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-04 16:54 UTC (permalink / raw)
  To: linux-xfs, gregkh, Alexander.Levin
  Cc: stable, amir73il, hch, Brian Foster, Darrick J . Wong, Luis Chamberlain

From: Brian Foster <bfoster@redhat.com>

commit 59e4293149106fb92530f8e56fa3992d8548c5e6 upstream.

Page writeback indirectly handles shared extents via the existence
of overlapping COW fork blocks. If COW fork blocks exist, writeback
always performs the associated copy-on-write regardless of whether the
underlying blocks are actually shared. If the blocks are shared,
then overlapping COW fork blocks must always exist.

fstests shared/010 reproduces a case where a buffered write occurs
over a shared block without performing the requisite COW fork
reservation.  This ultimately causes writeback to the shared extent
and data corruption that is detected across md5 checks of the
filesystem across a mount cycle.

The problem occurs when a buffered write lands over a shared extent
that crosses an extent size hint boundary and that also happens to
have a partial COW reservation that doesn't cover the start and end
blocks of the data fork extent.

For example, a buffered write occurs across the file offset (in FSB
units) range of [29, 57]. A shared extent exists at blocks [29, 35]
and COW reservation already exists at blocks [32, 34]. After
accommodating a COW extent size hint of 32 blocks and the existing
reservation at offset 32, xfs_reflink_reserve_cow() allocates 32
blocks of reservation at offset 0 and returns with COW reservation
across the range of [0, 34]. The associated data fork extent is
still [29, 35], however, which isn't fully covered by the COW
reservation.

This leads to a buffered write at file offset 35 over a shared
extent without associated COW reservation. Writeback eventually
kicks in, performs an overwrite of the underlying shared block and
causes the associated data corruption.

Update xfs_reflink_reserve_cow() to accommodate the fact that a
delalloc allocation request may not fully cover the extent in the
data fork. Trim the data fork extent appropriately, just as is done
for shared extent boundaries and/or existing COW reservations that
happen to overlap the start of the data fork extent. This prevents
shared/010 failures due to data corruption on reflink enabled
filesystems.
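
The fix is a single xfs_trim_extent() call in xfs_reflink_reserve_cow().
As a rough sketch of the intent (a simplified stand-in, not the real
xfs_trim_extent() implementation), using the numbers from the example
above: clamp the data fork mapping to the blocks the COW reservation
actually covers, so the uncovered tail is not reported back as reserved.

#include <stdio.h>

/* Simplified stand-in for struct xfs_bmbt_irec (offsets in FSBs) */
struct irec {
        unsigned long long br_startoff;
        unsigned long long br_blockcount;
};

/* Rough sketch of trimming irec to [bno, bno + len); assumes overlap */
static void trim_extent(struct irec *irec, unsigned long long bno,
                        unsigned long long len)
{
        unsigned long long end = irec->br_startoff + irec->br_blockcount;

        if (irec->br_startoff < bno)
                irec->br_startoff = bno;
        if (end > bno + len)
                end = bno + len;
        irec->br_blockcount = end - irec->br_startoff;
}

int main(void)
{
        /* Data fork extent [29, 35], COW reservation covers [0, 34] */
        struct irec imap = { .br_startoff = 29, .br_blockcount = 7 };
        struct irec got  = { .br_startoff = 0,  .br_blockcount = 35 };

        trim_extent(&imap, got.br_startoff, got.br_blockcount);
        printf("imap trimmed to [%llu, %llu]\n", imap.br_startoff,
               imap.br_startoff + imap.br_blockcount - 1);
        return 0;
}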

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/xfs_reflink.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 42ea7bab9144..7088f44c0c59 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -302,6 +302,7 @@ xfs_reflink_reserve_cow(
 	if (error)
 		return error;
 
+	xfs_trim_extent(imap, got.br_startoff, got.br_blockcount);
 	trace_xfs_reflink_cow_alloc(ip, &got);
 	return 0;
 }
-- 
2.18.0



* [PATCH v2 06/10] xfs: fix transient reference count error in xfs_buf_resubmit_failed_buffers
  2019-02-04 16:54 [PATCH v2 00/10] xfs: stable fixes for v4.19.y Luis Chamberlain
                   ` (4 preceding siblings ...)
  2019-02-04 16:54 ` [PATCH v2 05/10] xfs: fix shared extent data corruption due to missing cow reservation Luis Chamberlain
@ 2019-02-04 16:54 ` Luis Chamberlain
  2019-02-04 16:54 ` [PATCH v2 07/10] xfs: delalloc -> unwritten COW fork allocation can go wrong Luis Chamberlain
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-04 16:54 UTC (permalink / raw)
  To: linux-xfs, gregkh, Alexander.Levin
  Cc: stable, amir73il, hch, Dave Chinner, Darrick J . Wong, Luis Chamberlain

From: Dave Chinner <dchinner@redhat.com>

commit d43aaf1685aa471f0593685c9f54d53e3af3cf3f upstream.

When retrying a failed inode or dquot buffer,
xfs_buf_resubmit_failed_buffers() clears all the failed flags from
the inode/dquot log items. In doing so, it also drops all the
reference counts on the buffer that the failed log items hold. This
means it can drop all the active references on the buffer and hence
free the buffer before it queues it for write again.

Putting the buffer on the delwri queue takes a reference to the
buffer (so that it hangs around until it has been written and
completed), but this goes bang if the buffer has already been freed.

Hence we need to add the buffer to the delwri queue before we remove
the failed flags from the log items attached to the buffer to ensure
it always remains referenced during the resubmit process.
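
The underlying rule, shown as a hedged standalone sketch with made-up
names (not the xfs_buf/log item code): take the reference for the new
user (the delwri queue) before dropping the references currently held
(the failed log items), so the count never transiently reaches zero.

#include <stdio.h>

/* Toy refcounted object; stands in for the buffer */
struct obj {
        int refcount;
};

static void obj_get(struct obj *o)
{
        o->refcount++;
}

static int obj_put(struct obj *o)
{
        /* the real code would free the object when this hits zero */
        return --o->refcount == 0;
}

int main(void)
{
        struct obj buf = { .refcount = 1 };     /* held by the failed log item */

        /* Safe ordering: take the queue reference first, then drop the old one */
        obj_get(&buf);                          /* delwri queue reference */
        obj_put(&buf);                          /* clear failed log item state */

        /* Doing the obj_put() first would have hit zero and freed the buffer */
        printf("refcount after resubmit: %d\n", buf.refcount);
        return 0;
}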

Reported-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/xfs_buf_item.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index 12d8455bfbb2..010db5f8fb00 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -1233,9 +1233,23 @@ xfs_buf_iodone(
 }
 
 /*
- * Requeue a failed buffer for writeback
+ * Requeue a failed buffer for writeback.
  *
- * Return true if the buffer has been re-queued properly, false otherwise
+ * We clear the log item failed state here as well, but we have to be careful
+ * about reference counts because the only active reference counts on the buffer
+ * may be the failed log items. Hence if we clear the log item failed state
+ * before queuing the buffer for IO we can release all active references to
+ * the buffer and free it, leading to use after free problems in
+ * xfs_buf_delwri_queue. It makes no difference to the buffer or log items which
+ * order we process them in - the buffer is locked, and we own the buffer list
+ * so nothing on them is going to change while we are performing this action.
+ *
+ * Hence we can safely queue the buffer for IO before we clear the failed log
+ * item state, therefore  always having an active reference to the buffer and
+ * avoiding the transient zero-reference state that leads to use-after-free.
+ *
+ * Return true if the buffer was added to the buffer list, false if it was
+ * already on the buffer list.
  */
 bool
 xfs_buf_resubmit_failed_buffers(
@@ -1243,16 +1257,16 @@ xfs_buf_resubmit_failed_buffers(
 	struct list_head	*buffer_list)
 {
 	struct xfs_log_item	*lip;
+	bool			ret;
+
+	ret = xfs_buf_delwri_queue(bp, buffer_list);
 
 	/*
-	 * Clear XFS_LI_FAILED flag from all items before resubmit
-	 *
-	 * XFS_LI_FAILED set/clear is protected by ail_lock, caller  this
+	 * XFS_LI_FAILED set/clear is protected by ail_lock, caller of this
 	 * function already have it acquired
 	 */
 	list_for_each_entry(lip, &bp->b_li_list, li_bio_list)
 		xfs_clear_li_failed(lip);
 
-	/* Add this buffer back to the delayed write list */
-	return xfs_buf_delwri_queue(bp, buffer_list);
+	return ret;
 }
-- 
2.18.0



* [PATCH v2 07/10] xfs: delalloc -> unwritten COW fork allocation can go wrong
  2019-02-04 16:54 [PATCH v2 00/10] xfs: stable fixes for v4.19.y Luis Chamberlain
                   ` (5 preceding siblings ...)
  2019-02-04 16:54 ` [PATCH v2 06/10] xfs: fix transient reference count error in xfs_buf_resubmit_failed_buffers Luis Chamberlain
@ 2019-02-04 16:54 ` Luis Chamberlain
  2019-02-04 16:54 ` [PATCH v2 08/10] fs/xfs: fix f_ffree value for statfs when project quota is set Luis Chamberlain
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-04 16:54 UTC (permalink / raw)
  To: linux-xfs, gregkh, Alexander.Levin
  Cc: stable, amir73il, hch, Dave Chinner, Darrick J . Wong, Luis Chamberlain

From: Dave Chinner <dchinner@redhat.com>

commit 9230a0b65b47fe6856c4468ec0175c4987e5bede upstream.

Long saga. There have been days spent following this through dead end
after dead end in multi-GB event traces. This morning, after writing
a trace-cmd wrapper that enabled me to be more selective about XFS
trace points, I discovered that I could get just enough essential
tracepoints enabled that there was a 50:50 chance the fsx config
would fail at ~115k ops. If it didn't fail at op 115547, I stopped
fsx at op 115548 anyway.

That gave me two traces - one where the problem manifested, and one
where it didn't. After refining the traces to have the necessary
information, I found that in the failing case there was a real
extent in the COW fork compared to an unwritten extent in the
working case.

Walking back through the two traces to the point where the COW fork
extents actually diverged, I found that the bad case had an extra
unwritten extent in it. This is likely because the bug it led me to
had triggered multiple times in those 115k ops, leaving stray
COW extents around. What I saw was a COW delalloc conversion to an
unwritten extent (as they should always be through
xfs_iomap_write_allocate()) resulted in a /written extent/:

xfs_writepage:        dev 259:0 ino 0x83 pgoff 0x17000 size 0x79a00 offset 0 length 0
xfs_iext_remove:      dev 259:0 ino 0x83 state RC|LF|RF|COW cur 0xffff888247b899c0/2 offset 32 block 152 count 20 flag 1 caller xfs_bmap_add_extent_delay_real
xfs_bmap_pre_update:  dev 259:0 ino 0x83 state RC|LF|RF|COW cur 0xffff888247b899c0/1 offset 1 block 4503599627239429 count 31 flag 0 caller xfs_bmap_add_extent_delay_real
xfs_bmap_post_update: dev 259:0 ino 0x83 state RC|LF|RF|COW cur 0xffff888247b899c0/1 offset 1 block 121 count 51 flag 0 caller xfs_bmap_add_ex

Basically, COW fork before:

	0 1            32          52
	+H+DDDDDDDDDDDD+UUUUUUUUUUU+
	   PREV		RIGHT

COW delalloc conversion allocates:

	  1	       32
	  +uuuuuuuuuuuu+
	  NEW

And the result according to the xfs_bmap_post_update trace was:

	0 1            32          52
	+H+wwwwwwwwwwwwwwwwwwwwwwww+
	   PREV

Which is clearly wrong - it should be a merged unwritten extent,
not a written extent.

That led me to look at the LEFT_FILLING|RIGHT_FILLING|RIGHT_CONTIG
case in xfs_bmap_add_extent_delay_real(), and sure enough, there's
the bug.

It takes the old delalloc extent (PREV) and adds the length of the
RIGHT extent to it, takes the start block from NEW, removes the
RIGHT extent and then updates PREV with the new extent.

What it fails to do is update PREV.br_state. For delalloc, this is
always XFS_EXT_NORM, while in this case we are converting the
delayed allocation to unwritten, so it needs to be updated to
XFS_EXT_UNWRITTEN. This LF|RF|RC case does not do this, and so
the resultant extent is always written.
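
A standalone sketch of that merge, with values loosely following the
trace above and a simplified stand-in for struct xfs_bmbt_irec (not the
kernel code): without copying new->br_state, the merged record keeps the
delalloc record's normal/written state.

#include <stdio.h>

enum ext_state { EXT_NORM, EXT_UNWRITTEN };

/* Simplified stand-in for struct xfs_bmbt_irec */
struct irec {
        unsigned long long br_startoff;
        unsigned long long br_startblock;   /* placeholder for delalloc */
        unsigned long long br_blockcount;
        enum ext_state     br_state;
};

int main(void)
{
        struct irec PREV  = {  1,   0, 31, EXT_NORM };      /* delalloc */
        struct irec RIGHT = { 32, 152, 20, EXT_UNWRITTEN };
        struct irec NEW   = {  1, 121, 31, EXT_UNWRITTEN }; /* allocation */

        /* The LF|RF|RC merge as the buggy code performed it */
        PREV.br_startblock = NEW.br_startblock;
        PREV.br_blockcount += RIGHT.br_blockcount;
        /* Missing: PREV.br_state = NEW.br_state; */

        printf("merged: offset %llu block %llu count %llu state %s\n",
               PREV.br_startoff, PREV.br_startblock, PREV.br_blockcount,
               PREV.br_state == EXT_UNWRITTEN ? "unwritten" : "written");
        return 0;
}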

And that's the bug I've been chasing for a week - a bmap btree bug,
not a reflink/dedupe/copy_file_range bug, but a BMBT bug introduced
with the recent in-core extent tree scalability enhancements.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/libxfs/xfs_bmap.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index a47670332326..3a496ffe6551 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -1683,10 +1683,13 @@ xfs_bmap_add_extent_delay_real(
 	case BMAP_LEFT_FILLING | BMAP_RIGHT_FILLING | BMAP_RIGHT_CONTIG:
 		/*
 		 * Filling in all of a previously delayed allocation extent.
-		 * The right neighbor is contiguous, the left is not.
+		 * The right neighbor is contiguous, the left is not. Take care
+		 * with delay -> unwritten extent allocation here because the
+		 * delalloc record we are overwriting is always written.
 		 */
 		PREV.br_startblock = new->br_startblock;
 		PREV.br_blockcount += RIGHT.br_blockcount;
+		PREV.br_state = new->br_state;
 
 		xfs_iext_next(ifp, &bma->icur);
 		xfs_iext_remove(bma->ip, &bma->icur, state);
-- 
2.18.0



* [PATCH v2 08/10] fs/xfs: fix f_ffree value for statfs when project quota is set
  2019-02-04 16:54 [PATCH v2 00/10] xfs: stable fixes for v4.19.y Luis Chamberlain
                   ` (6 preceding siblings ...)
  2019-02-04 16:54 ` [PATCH v2 07/10] xfs: delalloc -> unwritten COW fork allocation can go wrong Luis Chamberlain
@ 2019-02-04 16:54 ` Luis Chamberlain
  2019-02-04 16:54 ` [PATCH v2 09/10] xfs: fix PAGE_MASK usage in xfs_free_file_space Luis Chamberlain
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-04 16:54 UTC (permalink / raw)
  To: linux-xfs, gregkh, Alexander.Levin
  Cc: stable, amir73il, hch, Ye Yin, Darrick J . Wong, Luis Chamberlain

From: Ye Yin <dbyin@tencent.com>

commit de7243057e7cefa923fa5f467c0f1ec24eef41d2 upstream.

When project quota is set, we should use the inode limit minus the used count.
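
A worked example of the wrong versus right arithmetic with made-up
numbers (the field names follow statfs, the values are illustrative):
with an inode limit of 1000 and 300 inodes used under the project,
f_ffree should be 700; deriving it from the filesystem-wide f_ffree
instead yields an unrelated number.

#include <stdio.h>

int main(void)
{
        /* Illustrative numbers only */
        unsigned long long limit    = 1000;     /* project quota inode limit  */
        unsigned long long icount   = 300;      /* inodes used by the project */
        unsigned long long fs_ffree = 52000;    /* global free inode count    */

        unsigned long long f_files = limit;
        unsigned long long wrong = (f_files > icount) ? fs_ffree - icount : 0;
        unsigned long long right = (f_files > icount) ? f_files - icount : 0;

        printf("wrong f_ffree: %llu, correct f_ffree: %llu\n", wrong, right);
        return 0;
}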

Signed-off-by: Ye Yin <dbyin@tencent.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/xfs_qm_bhv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_qm_bhv.c b/fs/xfs/xfs_qm_bhv.c
index 73a1d77ec187..3091e4bc04ef 100644
--- a/fs/xfs/xfs_qm_bhv.c
+++ b/fs/xfs/xfs_qm_bhv.c
@@ -40,7 +40,7 @@ xfs_fill_statvfs_from_dquot(
 		statp->f_files = limit;
 		statp->f_ffree =
 			(statp->f_files > dqp->q_res_icount) ?
-			 (statp->f_ffree - dqp->q_res_icount) : 0;
+			 (statp->f_files - dqp->q_res_icount) : 0;
 	}
 }
 
-- 
2.18.0



* [PATCH v2 09/10] xfs: fix PAGE_MASK usage in xfs_free_file_space
  2019-02-04 16:54 [PATCH v2 00/10] xfs: stable fixes for v4.19.y Luis Chamberlain
                   ` (7 preceding siblings ...)
  2019-02-04 16:54 ` [PATCH v2 08/10] fs/xfs: fix f_ffree value for statfs when project quota is set Luis Chamberlain
@ 2019-02-04 16:54 ` Luis Chamberlain
  2019-02-04 16:54 ` [PATCH v2 10/10] xfs: fix inverted return from xfs_btree_sblock_verify_crc Luis Chamberlain
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-04 16:54 UTC (permalink / raw)
  To: linux-xfs, gregkh, Alexander.Levin
  Cc: stable, amir73il, hch, Darrick J. Wong, Luis Chamberlain

From: "Darrick J. Wong" <darrick.wong@oracle.com>

commit a579121f94aba4e8bad1a121a0fad050d6925296 upstream.

In commit e53c4b598, I *tried* to teach xfs to force writeback when we
fzero/fpunch right up to EOF so that if EOF is in the middle of a page,
the post-EOF part of the page gets zeroed before we return to userspace.
Unfortunately, I missed the part where PAGE_MASK is ~(PAGE_SIZE - 1),
which means that we totally fail to zero if we're fpunching and EOF is
within the first page.  Worse yet, the same PAGE_MASK thinko plagues the
filemap_write_and_wait_range call, so we'd initiate writeback of the
entire file, which (mostly) masked the thinko.

Drop the tricky PAGE_MASK and replace it with correct usage of PAGE_SIZE
and the proper rounding macros.
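
A quick numeric check of the thinko, assuming 4k pages and made-up
offsets (the macro definitions below follow the commit message's note
that PAGE_MASK is ~(PAGE_SIZE - 1)): the old '(offset + len) & PAGE_MASK'
test is zero whenever EOF sits in the first page, so the zeroing is
skipped there, and the old '& ~PAGE_MASK' range start lands near the
start of the file instead of at the EOF page.

#include <stdio.h>

#define PAGE_SIZE       4096UL
#define PAGE_MASK       (~(PAGE_SIZE - 1))

static void show(unsigned long end)     /* end == offset + len */
{
        printf("end=%lu old check=%lu new check=%lu "
               "old start=%lu new start=%lu\n",
               end,
               end & PAGE_MASK,         /* old: 0 when EOF is in page 0 */
               end & ~PAGE_MASK,        /* new: offset_in_page(end) */
               end & ~PAGE_MASK,        /* old writeback range start */
               end & PAGE_MASK);        /* new: round_down(end, PAGE_SIZE) */
}

int main(void)
{
        show(2000);     /* EOF in the first page: old check is 0, no zeroing */
        show(10192);    /* EOF in page 2: old start is 2000, most of the file */
        return 0;
}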

Fixes: e53c4b598 ("xfs: ensure post-EOF zeroing happens after zeroing part of a file")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/xfs_bmap_util.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 9d1e5c3a661e..211b06e4702e 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1175,9 +1175,9 @@ xfs_free_file_space(
 	 * page could be mmap'd and iomap_zero_range doesn't do that for us.
 	 * Writeback of the eof page will do this, albeit clumsily.
 	 */
-	if (offset + len >= XFS_ISIZE(ip) && ((offset + len) & PAGE_MASK)) {
+	if (offset + len >= XFS_ISIZE(ip) && offset_in_page(offset + len) > 0) {
 		error = filemap_write_and_wait_range(VFS_I(ip)->i_mapping,
-				(offset + len) & ~PAGE_MASK, LLONG_MAX);
+				round_down(offset + len, PAGE_SIZE), LLONG_MAX);
 	}
 
 	return error;
-- 
2.18.0



* [PATCH v2 10/10] xfs: fix inverted return from xfs_btree_sblock_verify_crc
  2019-02-04 16:54 [PATCH v2 00/10] xfs: stable fixes for v4.19.y Luis Chamberlain
                   ` (8 preceding siblings ...)
  2019-02-04 16:54 ` [PATCH v2 09/10] xfs: fix PAGE_MASK usage in xfs_free_file_space Luis Chamberlain
@ 2019-02-04 16:54 ` Luis Chamberlain
  2019-02-05  6:44 ` [PATCH v2 00/10] xfs: stable fixes for v4.19.y Amir Goldstein
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-04 16:54 UTC (permalink / raw)
  To: linux-xfs, gregkh, Alexander.Levin
  Cc: stable, amir73il, hch, Eric Sandeen, Darrick J . Wong, Luis Chamberlain

From: Eric Sandeen <sandeen@redhat.com>

commit 7d048df4e9b05ba89b74d062df59498aa81f3785 upstream.

xfs_btree_sblock_verify_crc is a bool so should not be returning
a failaddr_t; worse, if xfs_log_check_lsn fails it returns
__this_address which looks like a boolean true (i.e. success)
to the caller.

(interestingly xfs_btree_lblock_verify_crc doesn't have the issue)
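
The type confusion is easy to demonstrate in isolation; a hedged sketch
with stand-in names follows (failaddr_t in XFS is essentially a pointer
identifying the failing check): returning a non-NULL pointer from a
function declared bool converts to true, so the caller sees
"verification passed".

#include <stdbool.h>
#include <stdio.h>

/* Stand-in for xfs_btree_sblock_verify_crc(): declared bool */
static bool verify_crc(bool lsn_ok)
{
        static const char here;         /* stand-in for __this_address */

        if (!lsn_ok)
                return &here;           /* meant "failure", converts to true */
        return true;
}

int main(void)
{
        /* A bad LSN should fail verification, but reports success (1) */
        printf("verify_crc(bad lsn) = %d\n", verify_crc(false));
        return 0;
}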

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/libxfs/xfs_btree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 34c6d7bd4d18..bbdae2b4559f 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -330,7 +330,7 @@ xfs_btree_sblock_verify_crc(
 
 	if (xfs_sb_version_hascrc(&mp->m_sb)) {
 		if (!xfs_log_check_lsn(mp, be64_to_cpu(block->bb_u.s.bb_lsn)))
-			return __this_address;
+			return false;
 		return xfs_buf_verify_cksum(bp, XFS_BTREE_SBLOCK_CRC_OFF);
 	}
 
-- 
2.18.0



* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-04 16:54 [PATCH v2 00/10] xfs: stable fixes for v4.19.y Luis Chamberlain
                   ` (9 preceding siblings ...)
  2019-02-04 16:54 ` [PATCH v2 10/10] xfs: fix inverted return from xfs_btree_sblock_verify_crc Luis Chamberlain
@ 2019-02-05  6:44 ` Amir Goldstein
  2019-02-05 22:06 ` Dave Chinner
  2019-02-10  0:06 ` Sasha Levin
  12 siblings, 0 replies; 28+ messages in thread
From: Amir Goldstein @ 2019-02-05  6:44 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: linux-xfs, Greg KH, Sasha Levin, stable, Christoph Hellwig,
	Brian Foster, Carlos Maiolino, Eric Sandeen, Darrick J. Wong,
	Dave Chinner

On Mon, Feb 4, 2019 at 6:54 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> Kernel stable team,
>
> here is a v2 respin of my XFS stable patches for v4.19.y. The only
> change in this series is adding the upstream commit to the commit log,
> and I've now also Cc'd stable@vger.kernel.org as well. No other issues
> were spotted or raised with this series.
>
> Reviews, questions, or rants are greatly appreciated.

Luis,

Thanks a lot for doing this work.

For the sake of people not following "oscheck", could you please
write a list of the configurations you tested with xfstests? The auto
group? Any expunged tests we should know about?

I went over the candidate patches and to me, they all look like stable
worthy patches and I have not identified any dependencies.

Original authors and reviewers are in the best position to  verify
those assessments, so please guys, if each one of you acks his
own patch, that shouldn't take a lot of anyone's time.

Specifically, repeating Luis's request from v1 cover letter -
There are two patches by Dave ([6,7/10]) that are originally from
a 7 patch series of assorted fixes:
https://patchwork.kernel.org/cover/10689445/

Please confirm that those two patches do stand on their own.

Thanks,
Amir.


>
>   Luis
>
> Brian Foster (1):
>   xfs: fix shared extent data corruption due to missing cow reservation
>
> Carlos Maiolino (1):
>   xfs: Fix xqmstats offsets in /proc/fs/xfs/xqmstat
>
> Christoph Hellwig (1):
>   xfs: cancel COW blocks before swapext
>
> Christophe JAILLET (1):
>   xfs: Fix error code in 'xfs_ioc_getbmap()'
>
> Darrick J. Wong (1):
>   xfs: fix PAGE_MASK usage in xfs_free_file_space
>
> Dave Chinner (3):
>   xfs: fix overflow in xfs_attr3_leaf_verify
>   xfs: fix transient reference count error in
>     xfs_buf_resubmit_failed_buffers
>   xfs: delalloc -> unwritten COW fork allocation can go wrong
>
> Eric Sandeen (1):
>   xfs: fix inverted return from xfs_btree_sblock_verify_crc
>
> Ye Yin (1):
>   fs/xfs: fix f_ffree value for statfs when project quota is set
>
>  fs/xfs/libxfs/xfs_attr_leaf.c | 11 +++++++++--
>  fs/xfs/libxfs/xfs_bmap.c      |  5 ++++-
>  fs/xfs/libxfs/xfs_btree.c     |  2 +-
>  fs/xfs/xfs_bmap_util.c        | 10 ++++++++--
>  fs/xfs/xfs_buf_item.c         | 28 +++++++++++++++++++++-------
>  fs/xfs/xfs_ioctl.c            |  2 +-
>  fs/xfs/xfs_qm_bhv.c           |  2 +-
>  fs/xfs/xfs_reflink.c          |  1 +
>  fs/xfs/xfs_stats.c            |  2 +-
>  9 files changed, 47 insertions(+), 16 deletions(-)
>
> --
> 2.18.0
>


* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-04 16:54 [PATCH v2 00/10] xfs: stable fixes for v4.19.y Luis Chamberlain
                   ` (10 preceding siblings ...)
  2019-02-05  6:44 ` [PATCH v2 00/10] xfs: stable fixes for v4.19.y Amir Goldstein
@ 2019-02-05 22:06 ` Dave Chinner
  2019-02-06  4:05   ` Sasha Levin
  2019-02-08 19:48   ` Luis Chamberlain
  2019-02-10  0:06 ` Sasha Levin
  12 siblings, 2 replies; 28+ messages in thread
From: Dave Chinner @ 2019-02-05 22:06 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: linux-xfs, gregkh, Alexander.Levin, stable, amir73il, hch

On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
> Kernel stable team,
> 
> here is a v2 respin of my XFS stable patches for v4.19.y. The only
> change in this series is adding the upstream commit to the commit log,
> and I've now also Cc'd stable@vger.kernel.org as well. No other issues
> were spotted or raised with this series.
> 
> Reviews, questions, or rants are greatly appreciated.

Test results?

The set of changes look fine themselves, but as always, the proof is
in the testing...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-05 22:06 ` Dave Chinner
@ 2019-02-06  4:05   ` Sasha Levin
  2019-02-06 21:54     ` Dave Chinner
  2019-02-08 19:48   ` Luis Chamberlain
  1 sibling, 1 reply; 28+ messages in thread
From: Sasha Levin @ 2019-02-06  4:05 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Luis Chamberlain, linux-xfs, gregkh, Alexander.Levin, stable,
	amir73il, hch

On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
>On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
>> Kernel stable team,
>>
>> here is a v2 respin of my XFS stable patches for v4.19.y. The only
>> change in this series is adding the upstream commit to the commit log,
>> and I've now also Cc'd stable@vger.kernel.org as well. No other issues
>> were spotted or raised with this series.
>>
>> Reviews, questions, or rants are greatly appreciated.
>
>Test results?
>
>The set of changes look fine themselves, but as always, the proof is
>in the testing...

Luis noted on v1 that it passes through his oscheck test suite, and I
noted that I haven't seen any regression with the xfstests scripts I
have.

What sort of data are you looking for beyond "we didn't see a
regression"?

--
Thanks,
Sasha


* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-06  4:05   ` Sasha Levin
@ 2019-02-06 21:54     ` Dave Chinner
  2019-02-08  6:06       ` Sasha Levin
  0 siblings, 1 reply; 28+ messages in thread
From: Dave Chinner @ 2019-02-06 21:54 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Luis Chamberlain, linux-xfs, gregkh, Alexander.Levin, stable,
	amir73il, hch

On Tue, Feb 05, 2019 at 11:05:59PM -0500, Sasha Levin wrote:
> On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
> >On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
> >>Kernel stable team,
> >>
> >>here is a v2 respin of my XFS stable patches for v4.19.y. The only
> >>change in this series is adding the upstream commit to the commit log,
> >>and I've now also Cc'd stable@vger.kernel.org as well. No other issues
> >>were spotted or raised with this series.
> >>
> >>Reviews, questions, or rants are greatly appreciated.
> >
> >Test results?
> >
> >The set of changes look fine themselves, but as always, the proof is
> >in the testing...
> 
> Luis noted on v1 that it passes through his oscheck test suite, and I
> noted that I haven't seen any regression with the xfstests scripts I
> have.
> 
> What sort of data are you looking for beyond "we didn't see a
> regression"?

Nothing special, just a summary of what was tested so we have some
visibility of whether the testing covered the proposed changes
sufficiently.  i.e. something like:

	Patchset was run through ltp and the fstests "auto" group
	with the following configs:

	- mkfs/mount defaults
	- -m reflink=1,rmapbt=1
	- -b size=1k
	- -m crc=0
	....

	No new regressions were reported.


Really, all I'm looking for is a bit more context for the review
process - nobody remembers what configs other people test. However,
it's important in reviewing a backport to know whether a backport of
a fix for, say, a bug in the rmap code actually got exercised by the
tests on an rmap-enabled filesystem...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-06 21:54     ` Dave Chinner
@ 2019-02-08  6:06       ` Sasha Levin
  2019-02-08 20:06         ` Luis Chamberlain
                           ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Sasha Levin @ 2019-02-08  6:06 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Luis Chamberlain, linux-xfs, gregkh, Alexander.Levin, stable,
	amir73il, hch

On Thu, Feb 07, 2019 at 08:54:54AM +1100, Dave Chinner wrote:
>On Tue, Feb 05, 2019 at 11:05:59PM -0500, Sasha Levin wrote:
>> On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
>> >On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
>> >>Kernel stable team,
>> >>
>> >>here is a v2 respin of my XFS stable patches for v4.19.y. The only
>> >>change in this series is adding the upstream commit to the commit log,
>> >>and I've now also Cc'd stable@vger.kernel.org as well. No other issues
>> >>were spotted or raised with this series.
>> >>
>> >>Reviews, questions, or rants are greatly appreciated.
>> >
>> >Test results?
>> >
>> >The set of changes look fine themselves, but as always, the proof is
>> >in the testing...
>>
>> Luis noted on v1 that it passes through his oscheck test suite, and I
>> noted that I haven't seen any regression with the xfstests scripts I
>> have.
>>
>> What sort of data are you looking for beyond "we didn't see a
>> regression"?
>
>Nothing special, just a summary of what was tested so we have some
>visibility of whether the testing covered the proposed changes
>sufficiently.  i.e. something like:
>
>	Patchset was run through ltp and the fstests "auto" group
>	with the following configs:
>
>	- mkfs/mount defaults
>	- -m reflink=1,rmapbt=1
>	- -b size=1k
>	- -m crc=0
>	....
>
>	No new regressions were reported.
>
>
>Really, all I'm looking for is a bit more context for the review
>process - nobody remembers what configs other people test. However,
>it's important in reviewing a backport to know whether a backport of
>a fix for, say, a bug in the rmap code actually got exercised by the
>tests on an rmap-enabled filesystem...

Sure! Below are the various configs this was run against. There were
multiple runs over 48+ hours and no regressions from a 4.14.17 baseline
were observed.

[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs


[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs


[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -b size=1024,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs


[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs


[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0, -b size=512,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs


[default_pmem]
TEST_DEV=/dev/pmem0
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/pmem1"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)-pmem
MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/pmem2
FSTYP=xfs


[default_pmem]
TEST_DEV=/dev/pmem0
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/pmem1"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)-pmem
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/pmem2
FSTYP=xfs


[default_pmem]
TEST_DEV=/dev/pmem0
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/pmem1"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)-pmem
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -b size=1024,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/pmem2
FSTYP=xfs


[default_pmem]
TEST_DEV=/dev/pmem0
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/pmem1"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)-pmem
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/pmem2
FSTYP=xfs


[default_pmem]
TEST_DEV=/dev/pmem0
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/pmem1"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)-pmem
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0, -b size=512,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/pmem2
FSTYP=xfs


--
Thanks,
Sasha


* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-05 22:06 ` Dave Chinner
  2019-02-06  4:05   ` Sasha Levin
@ 2019-02-08 19:48   ` Luis Chamberlain
  2019-02-08 21:32     ` Dave Chinner
  2019-02-11 20:09     ` Luis Chamberlain
  1 sibling, 2 replies; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-08 19:48 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs, gregkh, Alexander.Levin, stable, amir73il, hch

On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
> On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
> > Kernel stable team,
> > 
> > here is a v2 respin of my XFS stable patches for v4.19.y. The only
> > change in this series is adding the upstream commit to the commit log,
> > and I've now also Cc'd stable@vger.kernel.org as well. No other issues
> > were spotted or raised with this series.
> > 
> > Reviews, questions, or rants are greatly appreciated.
> 
> Test results?
> 
> The set of changes look fine themselves, but as always, the proof is
> in the testing...

I first established a baseline for v4.19.18 with fstests using
a series of different sections to test against. I annotated the
failures on an expunge list and then used that expunge list to confirm
no regressions -- i.e. no failures if we skip the failures already known
for v4.19.18.

I use a section for each different configuration I test against. I only
test x86_64 for now, but am starting to create a baseline for ppc64le.

The sections I use:

  * xfs
  * xfs_nocrc
  * xfs_nocrc_512
  * xfs_reflink
  * xfs_reflink_1024
  * xfs_logdev
  * xfs_realtimedev

The sections definitions for these are below:

[xfs]
MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/loop15
FSTYP=xfs

[xfs_nocrc]
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/loop15
FSTYP=xfs

[xfs_nocrc_512]
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0, -b size=512,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/loop15
FSTYP=xfs

[xfs_reflink]
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/loop15
FSTYP=xfs

[xfs_reflink_1024]
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -b size=1024,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/loop15
FSTYP=xfs

[xfs_logdev]
MKFS_OPTIONS="-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0 -lsize=1g"
SCRATCH_LOGDEV=/dev/loop15
USE_EXTERNAL=yes
FSTYP=xfs

[xfs_realtimedev]
MKFS_OPTIONS="-f -lsize=1g"
SCRATCH_LOGDEV=/dev/loop15
SCRATCH_RTDEV=/dev/loop14
USE_EXTERNAL=yes
FSTYP=xfs

These are listed in my example.config which oscheck copies over to
/var/lib/xfstests/config/$(hostname).config upon install if you don't
have one.

I didn't find any regressions against my tests.

The baseline is reflected in oscheck's per-kernel-release expunge lists,
in this case expunges/4.19.18. A file exists for each section, listing
the tests known to fail.

I'll put them below for completeness, but all of these files are
present in my oscheck repository [0], which is what I use to track
fstests failure baselines for upstream kernels:

$ cat expunges/4.19.18/xfs/unassigned/xfs.txt
generic/091
generic/263
generic/464 # after ~6 runs
generic/475 # after ~15 runs
generic/484
xfs/191-input-validation
xfs/278
xfs/451
xfs/495
xfs/499

$ cat expunges/4.19.18/xfs/unassigned/xfs_nocrc.txt
generic/091
generic/263
generic/464 # after ~39 runs
generic/475 # after ~5-10 runs
generic/484
xfs/191-input-validation
xfs/273
xfs/278
xfs/451
xfs/495
xfs/499

$ cat expunges/4.19.18/xfs/unassigned/xfs_nocrc_512.txt
generic/091
generic/263
generic/475 # after ~33 runs
generic/482 # after ~16 runs
generic/484
xfs/071
xfs/191-input-validation
xfs/273
xfs/278
xfs/451
xfs/495
xfs/499

$ cat expunges/4.19.18/xfs/unassigned/xfs_reflink.txt
generic/091
generic/263
generic/464 # after ~1 run
generic/475 # after ~5 runs
generic/484
xfs/191-input-validation
xfs/278
xfs/451
xfs/495
xfs/499

$ cat expunges/4.19.18/xfs/unassigned/xfs_reflink_1024.txt
generic/091
generic/263
generic/475 # after ~2 runs
generic/484
xfs/191-input-validation
xfs/278
xfs/451
xfs/495
xfs/499

The xfs_logdev and xfs_realtimedev sections use an external log, and as
I have noted before, it seems more work is needed to determine whether
these are actual failures.

But for completeness, the tests which fstests says fail for these
sections are below:

$ cat expunges/4.19.18/xfs/unassigned/xfs_logdev.txt
generic/034
generic/039
generic/040
generic/041
generic/054
generic/055
generic/056
generic/057
generic/059
generic/065
generic/066
generic/073
generic/081
generic/090
generic/091
generic/101
generic/104
generic/106
generic/107
generic/177
generic/204
generic/207
generic/223
generic/260
generic/263
generic/311
generic/321
generic/322
generic/325
generic/335
generic/336
generic/341
generic/342
generic/343
generic/347
generic/348
generic/361
generic/376
generic/455
generic/459
generic/464 # fails after ~2 runs
generic/475 # fails after ~5 runs, crashes sometimes
generic/482
generic/483
generic/484
generic/489
generic/498
generic/500
generic/502
generic/510
generic/512
generic/520
shared/002
shared/298
xfs/030
xfs/033
xfs/045
xfs/070
xfs/137
xfs/138
xfs/191-input-validation
xfs/194
xfs/195
xfs/199
xfs/278
xfs/284
xfs/291
xfs/294
xfs/424
xfs/451
xfs/495
xfs/499

$ cat expunges/4.19.18/xfs/unassigned/xfs_realtimedev.txt
generic/034
generic/039
generic/040
generic/041
generic/054
generic/056
generic/057
generic/059
generic/065
generic/066
generic/073
generic/081
generic/090
generic/091
generic/101
generic/104
generic/106
generic/107
generic/177
generic/204
generic/207
generic/223
generic/260
generic/263
generic/311
generic/321
generic/322
generic/325
generic/335
generic/336
generic/341
generic/342
generic/343
generic/347
generic/348
generic/361
generic/376
generic/455
generic/459
generic/464 # fails after ~40 runs
generic/475 # fails, and sometimes crashes
generic/482
generic/483
generic/484
generic/489
generic/498
generic/500
generic/502
generic/510
generic/512
generic/520
shared/002
shared/298
xfs/002
xfs/030
xfs/033
xfs/068
xfs/070
xfs/137
xfs/138
xfs/191-input-validation
xfs/194
xfs/195
xfs/199
xfs/278
xfs/291
xfs/294
xfs/419
xfs/424
xfs/451
xfs/495
xfs/499

One curious thing perhaps worth noting is that I could not get
generic/464 to trigger on the xfs_nocrc_512 and xfs_reflink_1024 sections.

Although I don't have a full baseline for ppc64le, I did confirm that
backporting upstream commit 837514f7a4ca fixes the kernel.org bug
report [1] triggerable via generic/070 on ppc64le.

If you have any questions please let me know.

[0] https://gitlab.com/mcgrof/oscheck
[1] https://bugzilla.kernel.org/show_bug.cgi?id=201577

  Luis


* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-08  6:06       ` Sasha Levin
@ 2019-02-08 20:06         ` Luis Chamberlain
  2019-02-08 21:29         ` Dave Chinner
  2019-02-08 22:17         ` Luis Chamberlain
  2 siblings, 0 replies; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-08 20:06 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Dave Chinner, linux-xfs, gregkh, Alexander.Levin, stable, amir73il, hch

On Fri, Feb 08, 2019 at 01:06:20AM -0500, Sasha Levin wrote:
> On Thu, Feb 07, 2019 at 08:54:54AM +1100, Dave Chinner wrote:
> > On Tue, Feb 05, 2019 at 11:05:59PM -0500, Sasha Levin wrote:
> > > On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
> > > >On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
> > > >>Kernel stable team,
> > > >>
> > > >>here is a v2 respin of my XFS stable patches for v4.19.y. The only
> > > >>change in this series is adding the upstream commit to the commit log,
> > > >>and I've now also Cc'd stable@vger.kernel.org as well. No other issues
> > > >>were spotted or raised with this series.
> > > >>
> > > >>Reviews, questions, or rants are greatly appreciated.
> > > >
> > > >Test results?
> > > >
> > > >The set of changes look fine themselves, but as always, the proof is
> > > >in the testing...
> > > 
> > > Luis noted on v1 that it passes through his oscheck test suite, and I
> > > noted that I haven't seen any regression with the xfstests scripts I
> > > have.
> > > 
> > > What sort of data are you looking for beyond "we didn't see a
> > > regression"?
> > 
> > Nothing special, just a summary of what was tested so we have some
> > visibility of whether the testing covered the proposed changes
> > sufficiently.  i.e. something like:
> > 
> > 	Patchset was run through ltp and the fstests "auto" group
> > 	with the following configs:
> > 
> > 	- mkfs/mount defaults
> > 	- -m reflink=1,rmapbt=1
> > 	- -b size=1k
> > 	- -m crc=0
> > 	....
> > 
> > 	No new regressions were reported.
> > 
> > 
> > Really, all I'm looking for is a bit more context for the review
> > process - nobody remembers what configs other people test. However,
> > it's important in reviewing a backport to know whether a backport to
> > a fix, say, a bug in the rmap code actually got exercised by the
> > tests on an rmap enabled filesystem...
> 
> Sure! Below are the various configs this was run against.

To be clear, that was Sasha's own effort. I just replied with my own
set of tests and results against the baseline to confirm no regressions
were found.

My tests run on 8-core KVM VMs with 8 GiB of RAM, using qcow2 images
which reside on an XFS partition mounted on NVMe drives on the
hypervisor. The hypervisor runs CentOS 7 with kernel 3.10.0-862.3.2.el7.x86_64.

For the guest I use different qcow2 images. One is 100 GiB and is used
to expose a disk to the guest for storing the files used for the
SCRATCH_DEV_POOL. For the SCRATCH_DEV_POOL I use loopback devices,
backed by files created on the guest's own /media/truncated/ partition,
which sits on that 100 GiB disk. I end up with 8 loopback devices to
test with:

SCRATCH_DEV_POOL="/dev/loop5 /dev/loop6 /dev/loop6 /dev/loop7 /dev/loop8 /dev/loop9 /dev/loop10 /dev/loop11"

The loopback devices are set up using my oscheck's $(./gendisks.sh -d)
script.
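
For reference, a minimal sketch of what such a setup might look like
(this is not what gendisks.sh literally does; the backing file path,
count, and size are illustrative assumptions only):

# create sparse backing files on the guest and attach each to a free
# loop device
for i in $(seq 1 8); do
    truncate -s 12G /media/truncated/disk$i
    losetup --find --show /media/truncated/disk$i
done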

Since Sasha seems to have a system rigged for testing XFS, what I could
do is collaborate with Sasha to consolidate our sections for testing and
also have both of our systems run all tests, so that at least two
different test systems confirm no regressions. That is, if Sasha
is up for that. Otherwise I'll continue with whatever rig I can get
my hands on each time I test.

I have an expunge list and he has his own; we need to consolidate those
as well with time.

Since some tests have a failure rate which is not 1 -- i.e. they don't
fail 100% of the time -- I am considering adding a *spinner tester*
which runs each such test 1000 times and records when it first fails.
The assumption is that if a test can run 1000 times without failing,
we really shouldn't have it on an expunge list. If there is a better
term than failure rate let's use it; I'm just not familiar with one,
but I'm sure this nomenclature must exist.
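
As a rough sketch of what I have in mind -- assuming an fstests
checkout and a configured local.config; the run count and test name
are only examples:

# run one test repeatedly and report the iteration of the first failure
n=0
while [ $n -lt 1000 ]; do
    n=$((n + 1))
    ./check generic/464 || { echo "first failure on run $n"; break; }
done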

A curious thing I noted was that the ppc64le bug didn't actually fail
for me as a straightforward test. That is, I had to *first* manually
run mkfs.xfs with the big block specification on the partition used for
TEST_DEV and also on the first device in SCRATCH_DEV_POOL. Only after I
did this and then ran the test did I trigger the failure, with a 100%
failure rate.
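
For illustration, the manual step was roughly along these lines,
assuming "big block" here means a 64k filesystem block size (matching
the 64k page size); the device names are placeholders:

mkfs.xfs -f -b size=65536 /dev/vdb1    # partition used for TEST_DEV
mkfs.xfs -f -b size=65536 /dev/loop5   # first SCRATCH_DEV_POOL device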

It has me wondering how many other tests may fail if we did the same.

  Luis

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-08  6:06       ` Sasha Levin
  2019-02-08 20:06         ` Luis Chamberlain
@ 2019-02-08 21:29         ` Dave Chinner
  2019-02-09 17:53           ` Sasha Levin
  2019-02-08 22:17         ` Luis Chamberlain
  2 siblings, 1 reply; 28+ messages in thread
From: Dave Chinner @ 2019-02-08 21:29 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Luis Chamberlain, linux-xfs, gregkh, Alexander.Levin, stable,
	amir73il, hch

On Fri, Feb 08, 2019 at 01:06:20AM -0500, Sasha Levin wrote:
> On Thu, Feb 07, 2019 at 08:54:54AM +1100, Dave Chinner wrote:
> >On Tue, Feb 05, 2019 at 11:05:59PM -0500, Sasha Levin wrote:
> >>On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
> >>>On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
> >>>>Kernel stable team,
> >>>>
> >>>>here is a v2 respin of my XFS stable patches for v4.19.y. The only
> >>>>change in this series is adding the upstream commit to the commit log,
> >>>>and I've now also Cc'd stable@vger.kernel.org as well. No other issues
> >>>>were spotted or raised with this series.
> >>>>
> >>>>Reviews, questions, or rants are greatly appreciated.
> >>>
> >>>Test results?
> >>>
> >>>The set of changes look fine themselves, but as always, the proof is
> >>>in the testing...
> >>
> >>Luis noted on v1 that it passes through his oscheck test suite, and I
> >>noted that I haven't seen any regression with the xfstests scripts I
> >>have.
> >>
> >>What sort of data are you looking for beyond "we didn't see a
> >>regression"?
> >
> >Nothing special, just a summary of what was tested so we have some
> >visibility of whether the testing covered the proposed changes
> >sufficiently.  i.e. something like:
> >
> >	Patchset was run through ltp and the fstests "auto" group
> >	with the following configs:
> >
> >	- mkfs/mount defaults
> >	- -m reflink=1,rmapbt=1
> >	- -b size=1k
> >	- -m crc=0
> >	....
> >
> >	No new regressions were reported.
> >
> >
> >Really, all I'm looking for is a bit more context for the review
> >process - nobody remembers what configs other people test. However,
> >it's important in reviewing a backport to know whether a backport to
> >a fix, say, a bug in the rmap code actually got exercised by the
> >tests on an rmap enabled filesystem...
> 
> Sure! Below are the various configs this was run against. There were
> multiple runs over 48+ hours and no regressions from a 4.14.17 baseline
> were observed.

Thanks, Sasha. As an ongoing thing, I reckon a "grep _OPTIONS
<config_files>" (catches both mkfs and mount options) would be
sufficient as a summary of what was tested in the series
description...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-08 19:48   ` Luis Chamberlain
@ 2019-02-08 21:32     ` Dave Chinner
  2019-02-08 21:50       ` Luis Chamberlain
  2019-02-11 20:09     ` Luis Chamberlain
  1 sibling, 1 reply; 28+ messages in thread
From: Dave Chinner @ 2019-02-08 21:32 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: linux-xfs, gregkh, Alexander.Levin, stable, amir73il, hch

On Fri, Feb 08, 2019 at 11:48:29AM -0800, Luis Chamberlain wrote:
> On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
> > On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
> > > Kernel stable team,
> > > 
> > > here is a v2 respin of my XFS stable patches for v4.19.y. The only
> > > change in this series is adding the upstream commit to the commit log,
> > > and I've now also Cc'd stable@vger.kernel.org as well. No other issues
> > > were spotted or raised with this series.
> > > 
> > > Reviews, questions, or rants are greatly appreciated.
> > 
> > Test results?
> > 
> > The set of changes look fine themselves, but as always, the proof is
> > in the testing...
> 
> I've first established a baseline for v4.19.18 with fstests using
> a series of different sections to test against. I annotated the
> failures on an expunge list and then use that expunge list to confirm
> no regressions -- no failures if we skip the failures already known for
> v4.19.18.
> 
> Each different configuration I test against I use a section for. I only
> test x86_64 for now but am starting to create a baseline for ppc64le.
> 
> The sections I use:
> 
>   * xfs
>   * xfs_nocrc
>   * xfs_nocrc_512
>   * xfs_reflink
>   * xfs_reflink_1024
>   * xfs_logdev
>   * xfs_realtimedev

Yup, that seems to cover most common things :)

> The xfs_logdev and xfs_realtimedev sections use an external log, and as
> I have noted before it seems works is needed to rule out an actual
> failure.

Yeah, there are many tests that don't work properly with external
devices, esp. RT devices. That's a less critical area to cover, but
it's still good to run it :)

Thanks, Luis!

-Dave.

-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-08 21:32     ` Dave Chinner
@ 2019-02-08 21:50       ` Luis Chamberlain
  2019-02-10 22:12         ` Dave Chinner
  0 siblings, 1 reply; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-08 21:50 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs, gregkh, Alexander.Levin, stable, amir73il, hch

On Sat, Feb 09, 2019 at 08:32:01AM +1100, Dave Chinner wrote:
> On Fri, Feb 08, 2019 at 11:48:29AM -0800, Luis Chamberlain wrote:
> > On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
> > > On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
> > > > Kernel stable team,
> > > > 
> > > > here is a v2 respin of my XFS stable patches for v4.19.y. The only
> > > > change in this series is adding the upstream commit to the commit log,
> > > > and I've now also Cc'd stable@vger.kernel.org as well. No other issues
> > > > were spotted or raised with this series.
> > > > 
> > > > Reviews, questions, or rants are greatly appreciated.
> > > 
> > > Test results?
> > > 
> > > The set of changes look fine themselves, but as always, the proof is
> > > in the testing...
> > 
> > I've first established a baseline for v4.19.18 with fstests using
> > a series of different sections to test against. I annotated the
> > failures on an expunge list and then use that expunge list to confirm
> > no regressions -- no failures if we skip the failures already known for
> > v4.19.18.
> > 
> > Each different configuration I test against I use a section for. I only
> > test x86_64 for now but am starting to create a baseline for ppc64le.
> > 
> > The sections I use:
> > 
> >   * xfs
> >   * xfs_nocrc
> >   * xfs_nocrc_512
> >   * xfs_reflink
> >   * xfs_reflink_1024
> >   * xfs_logdev
> >   * xfs_realtimedev
> 
> Yup, that seems to cover most common things :)

To be clear, in the future I hope to also have a baseline for:

  * xfs_bigblock

But that is *currently* [0] only possible on the following architectures
with the respective kernel config:

aarch64:
CONFIG_ARM64_64K_PAGES=y

ppc64le:
CONFIG_PPC_64K_PAGES=y

[0] Someone is working on 64k pages on x86 I think?

  Luis

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-08  6:06       ` Sasha Levin
  2019-02-08 20:06         ` Luis Chamberlain
  2019-02-08 21:29         ` Dave Chinner
@ 2019-02-08 22:17         ` Luis Chamberlain
  2019-02-09 21:56           ` Sasha Levin
  2 siblings, 1 reply; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-08 22:17 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Dave Chinner, linux-xfs, gregkh, Alexander.Levin, stable, amir73il, hch

On Fri, Feb 08, 2019 at 01:06:20AM -0500, Sasha Levin wrote:
> Sure! Below are the various configs this was run against. There were
> multiple runs over 48+ hours and no regressions from a 4.14.17 baseline
> were observed.

In an effort to consolidate our sections:

> [default]
> TEST_DEV=/dev/nvme0n1p1
> TEST_DIR=/media/test
> SCRATCH_DEV_POOL="/dev/nvme0n1p2"
> SCRATCH_MNT=/media/scratch
> RESULT_BASE=$PWD/results/$HOST/$(uname -r)
> MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'

This matches my "xfs" section.

> USE_EXTERNAL=no
> LOGWRITES_DEV=/dev/nve0n1p3
> FSTYP=xfs
> 
> 
> [default]
> TEST_DEV=/dev/nvme0n1p1
> TEST_DIR=/media/test
> SCRATCH_DEV_POOL="/dev/nvme0n1p2"
> SCRATCH_MNT=/media/scratch
> RESULT_BASE=$PWD/results/$HOST/$(uname -r)
> MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1,'

This matches my "xfs_reflink"

> USE_EXTERNAL=no
> LOGWRITES_DEV=/dev/nvme0n1p3
> FSTYP=xfs
> 
> 
> [default]
> TEST_DEV=/dev/nvme0n1p1
> TEST_DIR=/media/test
> SCRATCH_DEV_POOL="/dev/nvme0n1p2"
> SCRATCH_MNT=/media/scratch
> RESULT_BASE=$PWD/results/$HOST/$(uname -r)
> MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -b size=1024,'

This matches my "xfs_reflink_1024" section.

> USE_EXTERNAL=no
> LOGWRITES_DEV=/dev/nvme0n1p3
> FSTYP=xfs
> 
> 
> [default]
> TEST_DEV=/dev/nvme0n1p1
> TEST_DIR=/media/test
> SCRATCH_DEV_POOL="/dev/nvme0n1p2"
> SCRATCH_MNT=/media/scratch
> RESULT_BASE=$PWD/results/$HOST/$(uname -r)
> MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0,'

This matches my "xfs_nocrc" section.

> USE_EXTERNAL=no
> LOGWRITES_DEV=/dev/nvme0n1p3
> FSTYP=xfs
> 
> 
> [default]
> TEST_DEV=/dev/nvme0n1p1
> TEST_DIR=/media/test
> SCRATCH_DEV_POOL="/dev/nvme0n1p2"
> SCRATCH_MNT=/media/scratch
> RESULT_BASE=$PWD/results/$HOST/$(uname -r)
> MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0, -b size=512,'

This matches my "xfs_nocrc_512" section.

> USE_EXTERNAL=no
> LOGWRITES_DEV=/dev/nvme0n1p3
> FSTYP=xfs
> 
> 
> [default_pmem]
> TEST_DEV=/dev/pmem0

I'll have to add this to my framework. Have you found pmem
issues not present on other sections?

> TEST_DIR=/media/test
> SCRATCH_DEV_POOL="/dev/pmem1"
> SCRATCH_MNT=/media/scratch
> RESULT_BASE=$PWD/results/$HOST/$(uname -r)-pmem
> MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'

OK so you just repeat the above options verbatim but for pmem.
Correct?

Any reason you don't name the sections with finer granularity?
It would help me in ensuring that, when we both revise our tests, we can
more easily tell whether we're talking about apples, pears, or bananas.

FWIW, I run two different bare metal hosts now, and each has a VM guest
per section above. One host I use for tracking stable, the other host
for my changes. This makes it harder to mess things up and lets me
re-test any time, fast.

I dedicate a VM guest to test *one* section. I do this with oscheck
easily:

./oscheck.sh --test-section xfs_nocrc | tee log-xfs-4.19.18+

For instance, this will just test the xfs_nocrc section. On average
each section takes about 1 hour to run.

I could run the tests on raw nvme and do away with the guests, but
that loses some of my ability to debug crashes easily, since I'd be out
on bare metal... but I'm curious, how long do your tests take? How about
per section? Say just the default "xfs" section?

IIRC you also had your system on Hyper-V :) so maybe you can still
debug crashes easily.

  Luis

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-08 21:29         ` Dave Chinner
@ 2019-02-09 17:53           ` Sasha Levin
  0 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2019-02-09 17:53 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Luis Chamberlain, linux-xfs, gregkh, Alexander.Levin, stable,
	amir73il, hch

On Sat, Feb 09, 2019 at 08:29:21AM +1100, Dave Chinner wrote:
>On Fri, Feb 08, 2019 at 01:06:20AM -0500, Sasha Levin wrote:
>> On Thu, Feb 07, 2019 at 08:54:54AM +1100, Dave Chinner wrote:
>> >On Tue, Feb 05, 2019 at 11:05:59PM -0500, Sasha Levin wrote:
>> >>On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
>> >>>On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
>> >>>>Kernel stable team,
>> >>>>
>> >>>>here is a v2 respin of my XFS stable patches for v4.19.y. The only
>> >>>>change in this series is adding the upstream commit to the commit log,
>> >>>>and I've now also Cc'd stable@vger.kernel.org as well. No other issues
>> >>>>were spotted or raised with this series.
>> >>>>
>> >>>>Reviews, questions, or rants are greatly appreciated.
>> >>>
>> >>>Test results?
>> >>>
>> >>>The set of changes look fine themselves, but as always, the proof is
>> >>>in the testing...
>> >>
>> >>Luis noted on v1 that it passes through his oscheck test suite, and I
>> >>noted that I haven't seen any regression with the xfstests scripts I
>> >>have.
>> >>
>> >>What sort of data are you looking for beyond "we didn't see a
>> >>regression"?
>> >
>> >Nothing special, just a summary of what was tested so we have some
>> >visibility of whether the testing covered the proposed changes
>> >sufficiently.  i.e. something like:
>> >
>> >	Patchset was run through ltp and the fstests "auto" group
>> >	with the following configs:
>> >
>> >	- mkfs/mount defaults
>> >	- -m reflink=1,rmapbt=1
>> >	- -b size=1k
>> >	- -m crc=0
>> >	....
>> >
>> >	No new regressions were reported.
>> >
>> >
>> >Really, all I'm looking for is a bit more context for the review
>> >process - nobody remembers what configs other people test. However,
>> >it's important in reviewing a backport to know whether a backport to
>> >a fix, say, a bug in the rmap code actually got exercised by the
>> >tests on an rmap enabled filesystem...
>>
>> Sure! Below are the various configs this was run against. There were
>> multiple runs over 48+ hours and no regressions from a 4.14.17 baseline
>> were observed.
>
>Thanks, Sasha. As an ongoing thing, I reckon a "grep _OPTIONS
><config_files>" (catches both mkfs and mount options) would be
>sufficient as a summary of what was tested in the series
>decription...

Will do.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-08 22:17         ` Luis Chamberlain
@ 2019-02-09 21:56           ` Sasha Levin
  2019-02-11 19:46             ` Luis Chamberlain
  0 siblings, 1 reply; 28+ messages in thread
From: Sasha Levin @ 2019-02-09 21:56 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Dave Chinner, linux-xfs, gregkh, Alexander.Levin, stable, amir73il, hch

On Fri, Feb 08, 2019 at 02:17:26PM -0800, Luis Chamberlain wrote:
>On Fri, Feb 08, 2019 at 01:06:20AM -0500, Sasha Levin wrote:
>> Sure! Below are the various configs this was run against. There were
>> multiple runs over 48+ hours and no regressions from a 4.14.17 baseline
>> were observed.
>
>In an effort to consolidate our sections:
>
>> [default]
>> TEST_DEV=/dev/nvme0n1p1
>> TEST_DIR=/media/test
>> SCRATCH_DEV_POOL="/dev/nvme0n1p2"
>> SCRATCH_MNT=/media/scratch
>> RESULT_BASE=$PWD/results/$HOST/$(uname -r)
>> MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
>
>This matches my "xfs" section.
>
>> USE_EXTERNAL=no
>> LOGWRITES_DEV=/dev/nve0n1p3
>> FSTYP=xfs
>>
>>
>> [default]
>> TEST_DEV=/dev/nvme0n1p1
>> TEST_DIR=/media/test
>> SCRATCH_DEV_POOL="/dev/nvme0n1p2"
>> SCRATCH_MNT=/media/scratch
>> RESULT_BASE=$PWD/results/$HOST/$(uname -r)
>> MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1,'
>
>This matches my "xfs_reflink"
>
>> USE_EXTERNAL=no
>> LOGWRITES_DEV=/dev/nvme0n1p3
>> FSTYP=xfs
>>
>>
>> [default]
>> TEST_DEV=/dev/nvme0n1p1
>> TEST_DIR=/media/test
>> SCRATCH_DEV_POOL="/dev/nvme0n1p2"
>> SCRATCH_MNT=/media/scratch
>> RESULT_BASE=$PWD/results/$HOST/$(uname -r)
>> MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -b size=1024,'
>
>This matches my "xfs_reflink_1024" section.
>
>> USE_EXTERNAL=no
>> LOGWRITES_DEV=/dev/nvme0n1p3
>> FSTYP=xfs
>>
>>
>> [default]
>> TEST_DEV=/dev/nvme0n1p1
>> TEST_DIR=/media/test
>> SCRATCH_DEV_POOL="/dev/nvme0n1p2"
>> SCRATCH_MNT=/media/scratch
>> RESULT_BASE=$PWD/results/$HOST/$(uname -r)
>> MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0,'
>
>This matches my "xfs_nocrc" section.
>
>> USE_EXTERNAL=no
>> LOGWRITES_DEV=/dev/nvme0n1p3
>> FSTYP=xfs
>>
>>
>> [default]
>> TEST_DEV=/dev/nvme0n1p1
>> TEST_DIR=/media/test
>> SCRATCH_DEV_POOL="/dev/nvme0n1p2"
>> SCRATCH_MNT=/media/scratch
>> RESULT_BASE=$PWD/results/$HOST/$(uname -r)
>> MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0, -b size=512,'
>
>This matches my "xfs_nocrc_512" section.
>
>> USE_EXTERNAL=no
>> LOGWRITES_DEV=/dev/nvme0n1p3
>> FSTYP=xfs
>>
>>
>> [default_pmem]
>> TEST_DEV=/dev/pmem0
>
>I'll have to add this to my framework. Have you found pmem
>issues not present on other sections?

Originally I added this because the xfs folks suggested that pmem vs
block exercises very different code paths and we should be testing both
of them.

Looking at the baseline I have, it seems that there are differences
between the failing tests. For example, with "MKFS_OPTIONS='-f -m
crc=1,reflink=0,rmapbt=0, -i sparse=0'", generic/524 seems to fail on
pmem but not on block.

>> TEST_DIR=/media/test
>> SCRATCH_DEV_POOL="/dev/pmem1"
>> SCRATCH_MNT=/media/scratch
>> RESULT_BASE=$PWD/results/$HOST/$(uname -r)-pmem
>> MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
>
>OK so you just repeat the above options vervbatim but for pmem.
>Correct?

Right.

>Any reason you don't name the sections with more finer granularity?
>It would help me in ensuring when we revise both of tests we can more
>easily ensure we're talking about apples, pears, or bananas.

Nope, I'll happily rename them if there are "official" names for them :)

>FWIW, I run two different bare metal hosts now, and each has a VM guest
>per section above. One host I use for tracking stable, the other host for
>my changes. This ensures I don't mess things up easier and I can re-test
>any time fast.
>
>I dedicate a VM guest to test *one* section. I do this with oscheck
>easily:
>
>./oscheck.sh --test-section xfs_nocrc | tee log-xfs-4.19.18+
>
>For instance will just test xfs_nocrc section. On average each section
>takes about 1 hour to run.

We have a similar setup then. I just spawn the VM on azure for each
section and run them all in parallel that way.

I thought oscheck runs everything on a single VM; is there a built-in
mechanism to spawn a VM for each config? If so, I can add some code
to support Azure and we can use the same codebase.

>I could run the tests on raw nvme and do away with the guests, but
>that loses some of my ability to debug on crashes easily and out to
>baremetal.. but curious, how long do your tests takes? How about per
>section? Say just the default "xfs" section?

I think that the longest config takes about 5 hours, otherwise
everything tends to take about 2 hours.

I basically run these on "repeat" until I issue a stop order, so in a
timespan of 48 hours some configs run ~20 times and some only ~10.

>IIRC you also had your system on hyperV :) so maybe you can still debug
>easily on crashes.
>
>  Luis

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-04 16:54 [PATCH v2 00/10] xfs: stable fixes for v4.19.y Luis Chamberlain
                   ` (11 preceding siblings ...)
  2019-02-05 22:06 ` Dave Chinner
@ 2019-02-10  0:06 ` Sasha Levin
  12 siblings, 0 replies; 28+ messages in thread
From: Sasha Levin @ 2019-02-10  0:06 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: linux-xfs, gregkh, Alexander.Levin, stable, amir73il, hch

On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
>Kernel stable team,
>
>here is a v2 respin of my XFS stable patches for v4.19.y. The only
>change in this series is adding the upstream commit to the commit log,
>and I've now also Cc'd stable@vger.kernel.org as well. No other issues
>were spotted or raised with this series.
>
>Reviews, questions, or rants are greatly appreciated.
>
>  Luis
>
>Brian Foster (1):
>  xfs: fix shared extent data corruption due to missing cow reservation
>
>Carlos Maiolino (1):
>  xfs: Fix xqmstats offsets in /proc/fs/xfs/xqmstat
>
>Christoph Hellwig (1):
>  xfs: cancel COW blocks before swapext
>
>Christophe JAILLET (1):
>  xfs: Fix error code in 'xfs_ioc_getbmap()'
>
>Darrick J. Wong (1):
>  xfs: fix PAGE_MASK usage in xfs_free_file_space
>
>Dave Chinner (3):
>  xfs: fix overflow in xfs_attr3_leaf_verify
>  xfs: fix transient reference count error in
>    xfs_buf_resubmit_failed_buffers
>  xfs: delalloc -> unwritten COW fork allocation can go wrong
>
>Eric Sandeen (1):
>  xfs: fix inverted return from xfs_btree_sblock_verify_crc
>
>Ye Yin (1):
>  fs/xfs: fix f_ffree value for statfs when project quota is set
>
> fs/xfs/libxfs/xfs_attr_leaf.c | 11 +++++++++--
> fs/xfs/libxfs/xfs_bmap.c      |  5 ++++-
> fs/xfs/libxfs/xfs_btree.c     |  2 +-
> fs/xfs/xfs_bmap_util.c        | 10 ++++++++--
> fs/xfs/xfs_buf_item.c         | 28 +++++++++++++++++++++-------
> fs/xfs/xfs_ioctl.c            |  2 +-
> fs/xfs/xfs_qm_bhv.c           |  2 +-
> fs/xfs/xfs_reflink.c          |  1 +
> fs/xfs/xfs_stats.c            |  2 +-
> 9 files changed, 47 insertions(+), 16 deletions(-)

Queued for 4.19, thank you.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-08 21:50       ` Luis Chamberlain
@ 2019-02-10 22:12         ` Dave Chinner
  0 siblings, 0 replies; 28+ messages in thread
From: Dave Chinner @ 2019-02-10 22:12 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: linux-xfs, gregkh, Alexander.Levin, stable, amir73il, hch

On Fri, Feb 08, 2019 at 01:50:57PM -0800, Luis Chamberlain wrote:
> On Sat, Feb 09, 2019 at 08:32:01AM +1100, Dave Chinner wrote:
> > On Fri, Feb 08, 2019 at 11:48:29AM -0800, Luis Chamberlain wrote:
> > > On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
> > > > On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
> > > > > Kernel stable team,
> > > > > 
> > > > > here is a v2 respin of my XFS stable patches for v4.19.y. The only
> > > > > change in this series is adding the upstream commit to the commit log,
> > > > > and I've now also Cc'd stable@vger.kernel.org as well. No other issues
> > > > > were spotted or raised with this series.
> > > > > 
> > > > > Reviews, questions, or rants are greatly appreciated.
> > > > 
> > > > Test results?
> > > > 
> > > > The set of changes look fine themselves, but as always, the proof is
> > > > in the testing...
> > > 
> > > I've first established a baseline for v4.19.18 with fstests using
> > > a series of different sections to test against. I annotated the
> > > failures on an expunge list and then use that expunge list to confirm
> > > no regressions -- no failures if we skip the failures already known for
> > > v4.19.18.
> > > 
> > > Each different configuration I test against I use a section for. I only
> > > test x86_64 for now but am starting to create a baseline for ppc64le.
> > > 
> > > The sections I use:
> > > 
> > >   * xfs
> > >   * xfs_nocrc
> > >   * xfs_nocrc_512
> > >   * xfs_reflink
> > >   * xfs_reflink_1024
> > >   * xfs_logdev
> > >   * xfs_realtimedev
> > 
> > Yup, that seems to cover most common things :)
> 
> To be clear in the future I hope to also have a baseline for:
> 
>   * xfs_bigblock
> 
> But that is *currently* [0] only possible on the following architectures
> with the respective kernel config:
> 
> aarch64:
> CONFIG_ARM64_64K_PAGES=y
> 
> ppc64le:
> CONFIG_PPC_64K_PAGES=y
> 
> [0] Someone is working on 64k pages on x86 I think?

Yup, I am, but that got derailed by wanting fsx coverage w/
dedup/clone/copy_file_range before going any further with it. That
was one of the triggers that led to finding all those data
corruption and API problems late last year...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-09 21:56           ` Sasha Levin
@ 2019-02-11 19:46             ` Luis Chamberlain
  0 siblings, 0 replies; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-11 19:46 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Dave Chinner, linux-xfs, gregkh, Alexander.Levin, stable, amir73il, hch

On Sat, Feb 09, 2019 at 04:56:27PM -0500, Sasha Levin wrote:
> On Fri, Feb 08, 2019 at 02:17:26PM -0800, Luis Chamberlain wrote:
> > On Fri, Feb 08, 2019 at 01:06:20AM -0500, Sasha Levin wrote:
> > Have you found pmem
> > issues not present on other sections?
> 
> Originally I've added this because the xfs folks suggested that pmem vs
> block exercises very different code paths and we should be testing both
> of them.
> 
> Looking at the baseline I have, it seems that there are differences
> between the failing tests. For example, with "MKFS_OPTIONS='-f -m
> crc=1,reflink=0,rmapbt=0, -i sparse=0'",

That's my "xfs" section.

> generic/524 seems to fail on pmem but not on block.

This is useful, thanks! Can you get the failure rate? How often does it
fail when you run the test? Always? Does it *never* fail on block? How
many consecutive runs did you do on block?

To help with this, oscheck has naggy-check.sh; you could run it until
a failure is hit:

./naggy-check.sh -f -s xfs generic/524

And on another host:

./naggy-check.sh -f -s xfs_pmem generic/524

> > Any reason you don't name the sections with more finer granularity?
> > It would help me in ensuring when we revise both of tests we can more
> > easily ensure we're talking about apples, pears, or bananas.
> 
> Nope, I'll happily rename them if there are "official" names for it :)

Well since I am pushing out the stable fixes and am using oscheck to
be transparent about how I test and what I track, and since I'm using
section names, yes it would be useful to me. Simply adding a _pmem
postfix to the pmem ones would suffice.

> > FWIW, I run two different bare metal hosts now, and each has a VM guest
> > per section above. One host I use for tracking stable, the other host for
> > my changes. This ensures I don't mess things up easier and I can re-test
> > any time fast.
> > 
> > I dedicate a VM guest to test *one* section. I do this with oscheck
> > easily:
> > 
> > ./oscheck.sh --test-section xfs_nocrc | tee log-xfs-4.19.18+
> > 
> > For instance will just test xfs_nocrc section. On average each section
> > takes about 1 hour to run.
> 
> We have a similar setup then. I just spawn the VM on azure for each
> section and run them all in parallel that way.

Indeed.

> I thought oscheck runs everything on a single VM,

By default it does.

> is it a built in
> mechanism to spawn a VM for each config?

Yes:

./oscheck.sh --test-section xfs_nocrc_512

For instance, this will test the xfs_nocrc_512 section *only* on that host.

> If so, I can add some code in
> to support azure and we can use the same codebase.

Groovy. I believe the next step would be for you to send me your delta
of expunges, and then I can run naggy-check.sh on them to see if I
can reach similar results. I believe you have a larger expunge list.
I suspect some of this may be because you may not have certain quirks
handled. We will see. But getting this right and syncing our testing
should yield good confirmation of failures.

> > I could run the tests on raw nvme and do away with the guests, but
> > that loses some of my ability to debug on crashes easily and out to
> > baremetal.. but curious, how long do your tests takes? How about per
> > section? Say just the default "xfs" section?
> 
> I think that the longest config takes about 5 hours, otherwise
> everything tends to take about 2 hours.

Oh wow, mine are only 1 hour each. Guess I got a decent rig now :)

> I basically run these on "repeat" until I issue a stop order, so in a
> timespan of 48 hours some configs run ~20 times and some only ~10.

I see... so you iterate over all tests many times a day and this is
how you've built your expunge list. Correct?

That could explain how you may end up with a larger set. It can also
mean some tests only fail at a non-100% failure rate; for these I'm
annotating the failure rate as a comment on each expunge line. Having a
consistent format for this, and a proper agreed-upon term, would be
good. Right now I just mention how often I have to run a test before
reaching a failure. This provides a rough estimate of how many times
one should iterate running the test in a loop before detecting a
failure. Of course this may not always be accurate, given that systems
vary and this could have an impact on the failure... but at least it
provides some guidance. It would be curious to see if we end up with
similar failure rates for tests that don't always fail. And if there is
a divergence, how big it could be.
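
For example, an annotated expunge line currently looks something like
this (the exact numbers are illustrative):

generic/464 # fails after ~40 runs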

  Luis

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 00/10] xfs: stable fixes for v4.19.y
  2019-02-08 19:48   ` Luis Chamberlain
  2019-02-08 21:32     ` Dave Chinner
@ 2019-02-11 20:09     ` Luis Chamberlain
  1 sibling, 0 replies; 28+ messages in thread
From: Luis Chamberlain @ 2019-02-11 20:09 UTC (permalink / raw)
  To: Dave Chinner
  Cc: xfs, Greg Kroah-Hartman, Sasha Levin, 4.2+,
	Amir Goldstein, Christoph Hellwig

On Fri, Feb 8, 2019 at 1:48 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> Perhaps worth noting which was curious is that I could not get to
> trigger generic/464 on sections xfs_nocrc_512 and xfs_reflink_1024.

Well I just hit a failure for generic/464 on 4.19.17 after ~3996 runs
for the xfs_nocrc_512 section, and after 7382 runs for
xfs_reflink_1024.
I've updated the expunge list to reflect the difficult-to-hit failure
of generic/464 and its failure rate on xfs_nocrc_512 in oscheck.

 Luis

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2019-02-11 20:09 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-04 16:54 [PATCH v2 00/10] xfs: stable fixes for v4.19.y Luis Chamberlain
2019-02-04 16:54 ` [PATCH v2 01/10] xfs: Fix xqmstats offsets in /proc/fs/xfs/xqmstat Luis Chamberlain
2019-02-04 16:54 ` [PATCH v2 02/10] xfs: cancel COW blocks before swapext Luis Chamberlain
2019-02-04 16:54 ` [PATCH v2 03/10] xfs: Fix error code in 'xfs_ioc_getbmap()' Luis Chamberlain
2019-02-04 16:54 ` [PATCH v2 04/10] xfs: fix overflow in xfs_attr3_leaf_verify Luis Chamberlain
2019-02-04 16:54 ` [PATCH v2 05/10] xfs: fix shared extent data corruption due to missing cow reservation Luis Chamberlain
2019-02-04 16:54 ` [PATCH v2 06/10] xfs: fix transient reference count error in xfs_buf_resubmit_failed_buffers Luis Chamberlain
2019-02-04 16:54 ` [PATCH v2 07/10] xfs: delalloc -> unwritten COW fork allocation can go wrong Luis Chamberlain
2019-02-04 16:54 ` [PATCH v2 08/10] fs/xfs: fix f_ffree value for statfs when project quota is set Luis Chamberlain
2019-02-04 16:54 ` [PATCH v2 09/10] xfs: fix PAGE_MASK usage in xfs_free_file_space Luis Chamberlain
2019-02-04 16:54 ` [PATCH v2 10/10] xfs: fix inverted return from xfs_btree_sblock_verify_crc Luis Chamberlain
2019-02-05  6:44 ` [PATCH v2 00/10] xfs: stable fixes for v4.19.y Amir Goldstein
2019-02-05 22:06 ` Dave Chinner
2019-02-06  4:05   ` Sasha Levin
2019-02-06 21:54     ` Dave Chinner
2019-02-08  6:06       ` Sasha Levin
2019-02-08 20:06         ` Luis Chamberlain
2019-02-08 21:29         ` Dave Chinner
2019-02-09 17:53           ` Sasha Levin
2019-02-08 22:17         ` Luis Chamberlain
2019-02-09 21:56           ` Sasha Levin
2019-02-11 19:46             ` Luis Chamberlain
2019-02-08 19:48   ` Luis Chamberlain
2019-02-08 21:32     ` Dave Chinner
2019-02-08 21:50       ` Luis Chamberlain
2019-02-10 22:12         ` Dave Chinner
2019-02-11 20:09     ` Luis Chamberlain
2019-02-10  0:06 ` Sasha Levin
