All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/23] xfsprogs-5.0: fix various problems
@ 2019-03-01 23:26 Darrick J. Wong
  2019-03-01 23:26 ` [PATCH 01/23] configure: use sys/xattr.h for fsetxattr detection Darrick J. Wong
                   ` (23 more replies)
  0 siblings, 24 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:26 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

Hi all,

Here are some fixes for xfsprogs 5.0:

Patch 1 fixes the autoconf xattr.h check to use the new location of
xattr.h

Patch 2 adds some missing return value checks in xfs_io's
copy_file_range implementation.

Patch 3 teaches xfs_io's statx command to print attributes_mask.

Patch 4 fixes xfs_scrub_all to discover the fs to physical device
mappings correctly when nvme devices are present in the system.

Patch 5 holds off automated xfs_scrub_all triggers until the system is
fully booted so that we don't end up racing metadata scrubbing with
system boot.

Patches 6-7 fix thread count discovery in xfs_scrub so that we stop
shadowing variables and estimate parallelism based at least somewhat on
the storage hardware and not just the number of CPUs.  In other words,
we'll stop pounding a spinning disk with all the CPUs.

Patches 8-9 restructure the read-verify pools to make them per-device
so that a media scan on heterogeneous storage won't flood a device with
low IOPS capacity.

Patch 10 teaches scrub not to complain if it can't close the mountpoint
after failing to open it.

Patch 11 teaches scrub to check filesystem labels for misleading
character sequences.

Patch 12 teaches mkfs to validate extent size hint parameters so that
we cannot format filesystems that immediately fail to mount.

Patches 13-14 fix some link count handling in xfs_repair when we're
(re)initializing the root directory and lost+found directories.

Patch 15 fixes some build warnings in xfs_repair.

Patch 16 fixes an xfs_db problem where finobt decoding was broken.

Patch 17 fixes a metadump problem where the wrong type was used when
dumping free inode btree blocks.

Patch 18 teaches xfs_info to be smarter about figuring out whether it
should be using a live query via xfs_spaceman or xfs_db.

Patch 19-20 fix a crash in xfs_repair where accidentally create
duplicate rmapbt records for blocks that are initially allocated to the
free space btrees but then are freed back to the AGFL. while rebuilding
the rmap btree.

Patches 21-23 fix some use-after-free and stale log item problems in the
inode and buffer log items by modifying libxfs to discard those log
items when committing or cancelling a transaction.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=xfsprogs-5.0-fixes

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 01/23] configure: use sys/xattr.h for fsetxattr detection
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
@ 2019-03-01 23:26 ` Darrick J. Wong
  2019-03-08  8:08   ` Christoph Hellwig
  2019-03-01 23:27 ` [PATCH 02/23] xfs_io: actually check copy file range helper return values Darrick J. Wong
                   ` (22 subsequent siblings)
  23 siblings, 1 reply; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:26 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

The only user of fsetxattr and HAVE_FSETXATTR is fsr, which includes
sys/xattr.h (from libc).  However, the m4 macro to detect fsetxattr
support requires attr/xattr.h (from libattr).  libattr dropped xattr.h
last year, so update the check.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 m4/package_libcdev.m4 |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


diff --git a/m4/package_libcdev.m4 b/m4/package_libcdev.m4
index 7ee4acdd..2c0c72ce 100644
--- a/m4/package_libcdev.m4
+++ b/m4/package_libcdev.m4
@@ -223,14 +223,14 @@ AC_DEFUN([AC_HAVE_FLS],
   ])
 
 #
-# Check if we have a fsetxattr call (Mac OS X)
+# Check if we have a fsetxattr call
 #
 AC_DEFUN([AC_HAVE_FSETXATTR],
   [ AC_CHECK_DECL([fsetxattr],
        have_fsetxattr=yes,
        [],
        [#include <sys/types.h>
-        #include <attr/xattr.h>]
+        #include <sys/xattr.h>]
        )
     AC_SUBST(have_fsetxattr)
   ])

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 02/23] xfs_io: actually check copy file range helper return values
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
  2019-03-01 23:26 ` [PATCH 01/23] configure: use sys/xattr.h for fsetxattr detection Darrick J. Wong
@ 2019-03-01 23:27 ` Darrick J. Wong
  2019-03-08  8:09   ` Christoph Hellwig
  2019-03-01 23:27 ` [PATCH 03/23] xfs_io: statx -r should print attributes_mask Darrick J. Wong
                   ` (21 subsequent siblings)
  23 siblings, 1 reply; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:27 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs, schumaker.anna

From: Darrick J. Wong <darrick.wong@oracle.com>

We need to check the return value of copy_src_filesize and
copy_dst_truncate because either could return -1 due to fstat/ftruncate
failure.

Fixes: 628e112afdd98c5 ("xfs_io: implement 'copy_range' command")
Cc: schumaker.anna@gmail.com
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 io/copy_file_range.c |   17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)


diff --git a/io/copy_file_range.c b/io/copy_file_range.c
index 4e2969c9..d069e5bb 100644
--- a/io/copy_file_range.c
+++ b/io/copy_file_range.c
@@ -120,11 +120,24 @@ copy_range_f(int argc, char **argv)
 		return 0;
 
 	if (src == 0 && dst == 0 && len == 0) {
-		len = copy_src_filesize(fd);
-		copy_dst_truncate();
+		off64_t	sz;
+
+		sz = copy_src_filesize(fd);
+		if (sz < 0 || (unsigned long long)sz > SIZE_MAX) {
+			ret = 1;
+			goto out;
+		}
+		len = sz;
+
+		ret = copy_dst_truncate();
+		if (ret < 0) {
+			ret = 1;
+			goto out;
+		}
 	}
 
 	ret = copy_file_range_cmd(fd, &src, &dst, len);
+out:
 	close(fd);
 	return ret;
 }

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 03/23] xfs_io: statx -r should print attributes_mask
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
  2019-03-01 23:26 ` [PATCH 01/23] configure: use sys/xattr.h for fsetxattr detection Darrick J. Wong
  2019-03-01 23:27 ` [PATCH 02/23] xfs_io: actually check copy file range helper return values Darrick J. Wong
@ 2019-03-01 23:27 ` Darrick J. Wong
  2019-03-08  8:09   ` Christoph Hellwig
  2019-03-01 23:27 ` [PATCH 04/23] xfs_scrub_all: walk the lsblk device/fs hierarchy correctly Darrick J. Wong
                   ` (20 subsequent siblings)
  23 siblings, 1 reply; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:27 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

We're dumping the raw structure, so we ought to dump everything,
including the attributes_mask field.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 io/stat.c  |    4 ++++
 io/statx.h |    2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)


diff --git a/io/stat.c b/io/stat.c
index 64662b43..517be66e 100644
--- a/io/stat.c
+++ b/io/stat.c
@@ -6,6 +6,9 @@
  * Portions of statx support written by David Howells (dhowells@redhat.com)
  */
 
+/* Try to pick up statx definitions from the system headers. */
+#include <linux/stat.h>
+
 #include "command.h"
 #include "input.h"
 #include "init.h"
@@ -272,6 +275,7 @@ dump_raw_statx(struct statx *stx)
 	printf("stat.ino = %llu\n", (unsigned long long)stx->stx_ino);
 	printf("stat.size = %llu\n", (unsigned long long)stx->stx_size);
 	printf("stat.blocks = %llu\n", (unsigned long long)stx->stx_blocks);
+	printf("stat.attributes_mask = 0x%llx\n", (unsigned long long)stx->stx_attributes_mask);
 	printf("stat.atime.tv_sec = %lld\n", (long long)stx->stx_atime.tv_sec);
 	printf("stat.atime.tv_nsec = %d\n", stx->stx_atime.tv_nsec);
 	printf("stat.btime.tv_sec = %lld\n", (long long)stx->stx_btime.tv_sec);
diff --git a/io/statx.h b/io/statx.h
index 4e4b31ee..4f40eaa1 100644
--- a/io/statx.h
+++ b/io/statx.h
@@ -107,7 +107,7 @@ struct statx {
 	__u64	stx_ino;	/* Inode number */
 	__u64	stx_size;	/* File size */
 	__u64	stx_blocks;	/* Number of 512-byte blocks allocated */
-	__u64	__spare1[1];
+	__u64	stx_attributes_mask; /* Mask to show what's supported in stx_attributes */
 	/* 0x40 */
 	struct statx_timestamp	stx_atime;	/* Last access time */
 	struct statx_timestamp	stx_btime;	/* File creation time */

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 04/23] xfs_scrub_all: walk the lsblk device/fs hierarchy correctly
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (2 preceding siblings ...)
  2019-03-01 23:27 ` [PATCH 03/23] xfs_io: statx -r should print attributes_mask Darrick J. Wong
@ 2019-03-01 23:27 ` Darrick J. Wong
  2019-03-01 23:27 ` [PATCH 05/23] xfs_scrub_all.timer: activate after most of the system is up Darrick J. Wong
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:27 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Back when I was designing xfs_scrub_all, I naïvely assumed that the
emitted output would always list physical storage before the virtual
devices stacked atop it.  However, this is not actually true when one
omits the "NAME" column, which is crucial to forcing the output (json or
otherwise) to capture the block device hierarchy.  If the assumption is
violated, the program crashes with a python exception.

To fix this, force the hierarchal json output and restructure the
discovery routines to walk the json object that we receive, from the top
(physical devices) downwards to wherever there are live xfs filesystems.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/xfs_scrub_all.in |   28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)


diff --git a/scrub/xfs_scrub_all.in b/scrub/xfs_scrub_all.in
index c4e9899d..5b76b49a 100644
--- a/scrub/xfs_scrub_all.in
+++ b/scrub/xfs_scrub_all.in
@@ -28,9 +28,21 @@ def DEVNULL():
 
 def find_mounts():
 	'''Map mountpoints to physical disks.'''
+	def find_xfs_mounts(bdev, fs, lastdisk):
+		'''Attach lastdisk to each fs found under bdev.'''
+		if bdev['fstype'] == 'xfs' and bdev['mountpoint'] is not None:
+			mnt = bdev['mountpoint']
+			if mnt in fs:
+				fs[mnt].add(lastdisk)
+			else:
+				fs[mnt] = set([lastdisk])
+		if 'children' not in bdev:
+			return
+		for child in bdev['children']:
+			find_xfs_mounts(child, fs, lastdisk)
 
 	fs = {}
-	cmd=['lsblk', '-o', 'KNAME,TYPE,FSTYPE,MOUNTPOINT', '-J']
+	cmd=['lsblk', '-o', 'NAME,KNAME,TYPE,FSTYPE,MOUNTPOINT', '-J']
 	result = subprocess.Popen(cmd, stdout=subprocess.PIPE)
 	result.wait()
 	if result.returncode != 0:
@@ -38,18 +50,12 @@ def find_mounts():
 	sarray = [x.decode(sys.stdout.encoding) for x in result.stdout.readlines()]
 	output = ' '.join(sarray)
 	bdevdata = json.loads(output)
+
 	# The lsblk output had better be in disks-then-partitions order
 	for bdev in bdevdata['blockdevices']:
-		if bdev['type'] in ('disk', 'loop'):
-			lastdisk = bdev['kname']
-		if bdev['fstype'] == 'xfs':
-			mnt = bdev['mountpoint']
-			if mnt is None:
-				continue
-			if mnt in fs:
-				fs[mnt].add(lastdisk)
-			else:
-				fs[mnt] = set([lastdisk])
+		lastdisk = bdev['kname']
+		find_xfs_mounts(bdev, fs, lastdisk)
+
 	return fs
 
 def kill_systemd(unit, proc):

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 05/23] xfs_scrub_all.timer: activate after most of the system is up
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (3 preceding siblings ...)
  2019-03-01 23:27 ` [PATCH 04/23] xfs_scrub_all: walk the lsblk device/fs hierarchy correctly Darrick J. Wong
@ 2019-03-01 23:27 ` Darrick J. Wong
  2019-03-01 23:27 ` [PATCH 06/23] xfs_scrub: rename the global nr_threads Darrick J. Wong
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:27 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

We really don't want the xfs_scrub_all timer triggering while the system
is booting up because not all the mounts will have finished, networking
might not be up for reporting, and slowing down bootup annoys people.
Therefore, delay the xfs_scrub_all service's activation until after the
system has started all the big pieces it's going to start.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/xfs_scrub_all.service.in |    1 +
 1 file changed, 1 insertion(+)


diff --git a/scrub/xfs_scrub_all.service.in b/scrub/xfs_scrub_all.service.in
index 66f82fc7..b1b80da4 100644
--- a/scrub/xfs_scrub_all.service.in
+++ b/scrub/xfs_scrub_all.service.in
@@ -2,6 +2,7 @@
 Description=Online XFS Metadata Check for All Filesystems
 ConditionACPower=true
 Documentation=man:xfs_scrub_all(8)
+After=paths.target multi-user.target network.target network-online.target systemd-networkd.service NetworkManager.service connman.service
 
 [Service]
 Type=oneshot

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 06/23] xfs_scrub: rename the global nr_threads
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (4 preceding siblings ...)
  2019-03-01 23:27 ` [PATCH 05/23] xfs_scrub_all.timer: activate after most of the system is up Darrick J. Wong
@ 2019-03-01 23:27 ` Darrick J. Wong
  2019-03-01 23:27 ` [PATCH 07/23] xfs_scrub: use datadev parallelization estimates for thread count Darrick J. Wong
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:27 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Various functions have nr_threads local variables that shadow the global
one.  Since the global one forces the number of threads we use, change
its name to remove this ambiguity and reflect what it really does.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/common.c    |    4 ++--
 scrub/disk.c      |    4 ++--
 scrub/xfs_scrub.c |    6 +++---
 scrub/xfs_scrub.h |    2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)


diff --git a/scrub/common.c b/scrub/common.c
index 78afc4bf..c877c7c8 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -231,8 +231,8 @@ unsigned int
 scrub_nproc(
 	struct scrub_ctx	*ctx)
 {
-	if (nr_threads)
-		return nr_threads;
+	if (force_nr_threads)
+		return force_nr_threads;
 	return ctx->nr_io_threads;
 }
 
diff --git a/scrub/disk.c b/scrub/disk.c
index 7daa508e..dd109533 100644
--- a/scrub/disk.c
+++ b/scrub/disk.c
@@ -75,8 +75,8 @@ unsigned int
 disk_heads(
 	struct disk		*disk)
 {
-	if (nr_threads)
-		return nr_threads;
+	if (force_nr_threads)
+		return force_nr_threads;
 	return __disk_heads(disk);
 }
 
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index b8138000..71fc274f 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -133,7 +133,7 @@ unsigned int			bg_mode;
 int				nproc;
 
 /* Number of threads we're allowed to use. */
-unsigned int			nr_threads;
+unsigned int			force_nr_threads;
 
 /* Verbosity; higher values print more information. */
 bool				verbose;
@@ -589,7 +589,7 @@ main(
 			}
 			break;
 		case 'b':
-			nr_threads = 1;
+			force_nr_threads = 1;
 			bg_mode++;
 			break;
 		case 'C':
@@ -659,7 +659,7 @@ main(
 			perror("nr_threads");
 			usage();
 		}
-		nr_threads = x;
+		force_nr_threads = x;
 	}
 
 	if (optind != argc - 1)
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index a961d8fd..a459e4b5 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -10,7 +10,7 @@ extern char *progname;
 
 #define _PATH_PROC_MOUNTS	"/proc/mounts"
 
-extern unsigned int		nr_threads;
+extern unsigned int		force_nr_threads;
 extern unsigned int		bg_mode;
 extern unsigned int		debug;
 extern int			nproc;

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 07/23] xfs_scrub: use datadev parallelization estimates for thread count
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (5 preceding siblings ...)
  2019-03-01 23:27 ` [PATCH 06/23] xfs_scrub: rename the global nr_threads Darrick J. Wong
@ 2019-03-01 23:27 ` Darrick J. Wong
  2019-03-01 23:27 ` [PATCH 08/23] xfs_scrub: don't expose internal pool state Darrick J. Wong
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:27 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

During phases 2-5, xfs_scrub should estimate the level of
parallelization possible on the data device to determine the number of
threads spawned to scrub filesystem metadata, not just blindly using the
number of CPUs.  This avoids flooding non-rotational storage with random
reads, which totally destroys performance and makes scrub runtimes
higher.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/phase1.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)


diff --git a/scrub/phase1.c b/scrub/phase1.c
index 2113014b..6b472147 100644
--- a/scrub/phase1.c
+++ b/scrub/phase1.c
@@ -109,13 +109,6 @@ _("Must be root to run scrub."));
 		return false;
 	}
 
-	ctx->nr_io_threads = nproc;
-	if (verbose) {
-		fprintf(stdout, _("%s: using %d threads to scrub.\n"),
-				ctx->mntpoint, scrub_nproc(ctx));
-		fflush(stdout);
-	}
-
 	if (!platform_test_xfs_fd(ctx->mnt_fd)) {
 		str_info(ctx, ctx->mntpoint,
 _("Does not appear to be an XFS filesystem!"));
@@ -193,6 +186,13 @@ _("Unable to find realtime device path."));
 		return false;
 	}
 
+	ctx->nr_io_threads = disk_heads(ctx->datadev);
+	if (verbose) {
+		fprintf(stdout, _("%s: using %d threads to scrub.\n"),
+				ctx->mntpoint, scrub_nproc(ctx));
+		fflush(stdout);
+	}
+
 	if (ctx->fsinfo.fs_log) {
 		ctx->logdev = disk_open(ctx->fsinfo.fs_log);
 		if (error) {

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 08/23] xfs_scrub: don't expose internal pool state
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (6 preceding siblings ...)
  2019-03-01 23:27 ` [PATCH 07/23] xfs_scrub: use datadev parallelization estimates for thread count Darrick J. Wong
@ 2019-03-01 23:27 ` Darrick J. Wong
  2019-03-01 23:27 ` [PATCH 09/23] xfs_scrub: one read/verify pool per disk Darrick J. Wong
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:27 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

In xfs_scrub, the read/verify pool tries to coalesce the media
verification requests into a smaller number of large IOs.  There's no
need to force callers to keep track of this internal state, so just move
all that into read_verify.c.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/phase6.c      |   23 +++++++----------------
 scrub/read_verify.c |   40 ++++++++++++++++++++++++++++++++++------
 scrub/read_verify.h |   16 ++++------------
 3 files changed, 45 insertions(+), 34 deletions(-)


diff --git a/scrub/phase6.c b/scrub/phase6.c
index ead48d77..fe121769 100644
--- a/scrub/phase6.c
+++ b/scrub/phase6.c
@@ -9,7 +9,6 @@
 #include <sys/statvfs.h>
 #include "handle.h"
 #include "path.h"
-#include "ptvar.h"
 #include "workqueue.h"
 #include "xfs_scrub.h"
 #include "common.h"
@@ -290,7 +289,6 @@ xfs_report_verify_errors(
 
 struct xfs_verify_extent {
 	struct read_verify_pool	*readverify;
-	struct ptvar		*rvstate;
 	struct bitmap		*d_bad;		/* bytes */
 	struct bitmap		*r_bad;		/* bytes */
 };
@@ -424,13 +422,13 @@ xfs_check_rmap(
 	/* Schedule the read verify command for (eventual) running. */
 	disk = xfs_dev_to_disk(ctx, map->fmr_device);
 
-	read_verify_schedule_io(ve->readverify, ptvar_get(ve->rvstate), disk,
-			map->fmr_physical, map->fmr_length, ve);
+	read_verify_schedule_io(ve->readverify, disk, map->fmr_physical,
+			map->fmr_length, ve);
 
 out:
 	/* Is this the last extent?  Fire off the read. */
 	if (map->fmr_flags & FMR_OF_LAST)
-		read_verify_force_io(ve->readverify, ptvar_get(ve->rvstate));
+		read_verify_force_io(ve->readverify);
 
 	return true;
 }
@@ -450,16 +448,10 @@ xfs_scan_blocks(
 	struct xfs_verify_extent	ve;
 	bool				moveon;
 
-	ve.rvstate = ptvar_init(scrub_nproc(ctx), sizeof(struct read_verify));
-	if (!ve.rvstate) {
-		str_errno(ctx, ctx->mntpoint);
-		return false;
-	}
-
 	moveon = bitmap_init(&ve.d_bad);
 	if (!moveon) {
 		str_errno(ctx, ctx->mntpoint);
-		goto out_ve;
+		goto out;
 	}
 
 	moveon = bitmap_init(&ve.r_bad);
@@ -469,7 +461,8 @@ xfs_scan_blocks(
 	}
 
 	ve.readverify = read_verify_pool_init(ctx, ctx->geo.blocksize,
-			xfs_check_rmap_ioerr, disk_heads(ctx->datadev));
+			xfs_check_rmap_ioerr, disk_heads(ctx->datadev),
+			scrub_nproc(ctx));
 	if (!ve.readverify) {
 		moveon = false;
 		str_info(ctx, ctx->mntpoint,
@@ -489,7 +482,6 @@ _("Could not create media verifier."));
 
 	bitmap_free(&ve.r_bad);
 	bitmap_free(&ve.d_bad);
-	ptvar_free(ve.rvstate);
 	return moveon;
 
 out_pool:
@@ -498,8 +490,7 @@ _("Could not create media verifier."));
 	bitmap_free(&ve.r_bad);
 out_dbad:
 	bitmap_free(&ve.d_bad);
-out_ve:
-	ptvar_free(ve.rvstate);
+out:
 	return moveon;
 }
 
diff --git a/scrub/read_verify.c b/scrub/read_verify.c
index 75cb53ca..b5774736 100644
--- a/scrub/read_verify.c
+++ b/scrub/read_verify.c
@@ -7,6 +7,7 @@
 #include <stdint.h>
 #include <stdlib.h>
 #include <sys/statvfs.h>
+#include "ptvar.h"
 #include "workqueue.h"
 #include "path.h"
 #include "xfs_scrub.h"
@@ -36,22 +37,40 @@
 /* Tolerate 64k holes in adjacent read verify requests. */
 #define RVP_IO_BATCH_LOCALITY	(65536)
 
+struct read_verify {
+	void			*io_end_arg;
+	struct disk		*io_disk;
+	uint64_t		io_start;	/* bytes */
+	uint64_t		io_length;	/* bytes */
+};
+
 struct read_verify_pool {
 	struct workqueue	wq;		/* thread pool */
 	struct scrub_ctx	*ctx;		/* scrub context */
 	void			*readbuf;	/* read buffer */
 	struct ptcounter	*verified_bytes;
+	struct ptvar		*rvstate;	/* combines read requests */
 	read_verify_ioerr_fn_t	ioerr_fn;	/* io error callback */
 	size_t			miniosz;	/* minimum io size, bytes */
 };
 
-/* Create a thread pool to run read verifiers. */
+/*
+ * Create a thread pool to run read verifiers.
+ *
+ * @miniosz is the minimum size of an IO to expect (in bytes).
+ * @ioerr_fn will be called when IO errors occur.
+ * @nproc is the maximum number of verify requests that may be sent to a disk
+ * at any given time.
+ * @submitter_threads is the number of threads that may be sending verify
+ * requests at any given time.
+ */
 struct read_verify_pool *
 read_verify_pool_init(
 	struct scrub_ctx		*ctx,
 	size_t				miniosz,
 	read_verify_ioerr_fn_t		ioerr_fn,
-	unsigned int			nproc)
+	unsigned int			nproc,
+	unsigned int			submitter_threads)
 {
 	struct read_verify_pool		*rvp;
 	bool				ret;
@@ -71,14 +90,20 @@ read_verify_pool_init(
 	rvp->miniosz = miniosz;
 	rvp->ctx = ctx;
 	rvp->ioerr_fn = ioerr_fn;
+	rvp->rvstate = ptvar_init(submitter_threads,
+			sizeof(struct read_verify));
+	if (rvp->rvstate == NULL)
+		goto out_counter;
 	/* Run in the main thread if we only want one thread. */
 	if (nproc == 1)
 		nproc = 0;
 	ret = workqueue_create(&rvp->wq, (struct xfs_mount *)rvp, nproc);
 	if (ret)
-		goto out_counter;
+		goto out_rvstate;
 	return rvp;
 
+out_rvstate:
+	ptvar_free(rvp->rvstate);
 out_counter:
 	ptcounter_free(rvp->verified_bytes);
 out_buf:
@@ -101,6 +126,7 @@ void
 read_verify_pool_destroy(
 	struct read_verify_pool		*rvp)
 {
+	ptvar_free(rvp->rvstate);
 	ptcounter_free(rvp->verified_bytes);
 	free(rvp->readbuf);
 	free(rvp);
@@ -186,16 +212,17 @@ _("Could not queue read-verify work."));
 bool
 read_verify_schedule_io(
 	struct read_verify_pool		*rvp,
-	struct read_verify		*rv,
 	struct disk			*disk,
 	uint64_t			start,
 	uint64_t			length,
 	void				*end_arg)
 {
+	struct read_verify		*rv;
 	uint64_t			req_end;
 	uint64_t			rv_end;
 
 	assert(rvp->readbuf);
+	rv = ptvar_get(rvp->rvstate);
 	req_end = start + length;
 	rv_end = rv->io_start + rv->io_length;
 
@@ -229,12 +256,13 @@ read_verify_schedule_io(
 /* Force any stashed IOs into the verifier. */
 bool
 read_verify_force_io(
-	struct read_verify_pool		*rvp,
-	struct read_verify		*rv)
+	struct read_verify_pool		*rvp)
 {
+	struct read_verify		*rv;
 	bool				moveon;
 
 	assert(rvp->readbuf);
+	rv = ptvar_get(rvp->rvstate);
 	if (rv->io_length == 0)
 		return true;
 
diff --git a/scrub/read_verify.h b/scrub/read_verify.h
index 38f1cd1a..1e7fd83f 100644
--- a/scrub/read_verify.h
+++ b/scrub/read_verify.h
@@ -16,21 +16,13 @@ typedef void (*read_verify_ioerr_fn_t)(struct scrub_ctx *ctx,
 
 struct read_verify_pool *read_verify_pool_init(struct scrub_ctx *ctx,
 		size_t miniosz, read_verify_ioerr_fn_t ioerr_fn,
-		unsigned int nproc);
+		unsigned int nproc, unsigned int submitter_threads);
 void read_verify_pool_flush(struct read_verify_pool *rvp);
 void read_verify_pool_destroy(struct read_verify_pool *rvp);
 
-struct read_verify {
-	void			*io_end_arg;
-	struct disk		*io_disk;
-	uint64_t		io_start;	/* bytes */
-	uint64_t		io_length;	/* bytes */
-};
-
-bool read_verify_schedule_io(struct read_verify_pool *rvp,
-		struct read_verify *rv, struct disk *disk, uint64_t start,
-		uint64_t length, void *end_arg);
-bool read_verify_force_io(struct read_verify_pool *rvp, struct read_verify *rv);
+bool read_verify_schedule_io(struct read_verify_pool *rvp, struct disk *disk,
+		uint64_t start, uint64_t length, void *end_arg);
+bool read_verify_force_io(struct read_verify_pool *rvp);
 uint64_t read_verify_bytes(struct read_verify_pool *rvp);
 
 #endif /* XFS_SCRUB_READ_VERIFY_H_ */

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 09/23] xfs_scrub: one read/verify pool per disk
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (7 preceding siblings ...)
  2019-03-01 23:27 ` [PATCH 08/23] xfs_scrub: don't expose internal pool state Darrick J. Wong
@ 2019-03-01 23:27 ` Darrick J. Wong
  2019-03-01 23:27 ` [PATCH 10/23] xfs_scrub: don't close mnt_fd when mnt_fd open fails Darrick J. Wong
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:27 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Simplify the read/verify pool code further by creating one pool per
disk.  This enables us to tailor the concurrency levels of each disk to
that specific disk so that if we have a mixed hdd/ssd environment we
don't flood the hdd with a lot of requests.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/phase6.c      |  110 ++++++++++++++++++++++++++++++++++++---------------
 scrub/read_verify.c |   29 ++++++-------
 scrub/read_verify.h |   10 +++--
 3 files changed, 98 insertions(+), 51 deletions(-)


diff --git a/scrub/phase6.c b/scrub/phase6.c
index fe121769..ccb795ab 100644
--- a/scrub/phase6.c
+++ b/scrub/phase6.c
@@ -33,18 +33,29 @@
  * and report the paths of the now corrupt files.
  */
 
+/* Verify disk blocks with GETFSMAP */
+
+struct xfs_verify_extent {
+	struct read_verify_pool	*rvp_data;
+	struct read_verify_pool	*rvp_log;
+	struct read_verify_pool	*rvp_realtime;
+	struct bitmap		*d_bad;		/* bytes */
+	struct bitmap		*r_bad;		/* bytes */
+};
+
 /* Find the fd for a given device identifier. */
-static struct disk *
-xfs_dev_to_disk(
-	struct scrub_ctx	*ctx,
-	dev_t			dev)
+static struct read_verify_pool *
+xfs_dev_to_pool(
+	struct scrub_ctx		*ctx,
+	struct xfs_verify_extent	*ve,
+	dev_t				dev)
 {
 	if (dev == ctx->fsinfo.fs_datadev)
-		return ctx->datadev;
+		return ve->rvp_data;
 	else if (dev == ctx->fsinfo.fs_logdev)
-		return ctx->logdev;
+		return ve->rvp_log;
 	else if (dev == ctx->fsinfo.fs_rtdev)
-		return ctx->rtdev;
+		return ve->rvp_realtime;
 	abort();
 }
 
@@ -285,14 +296,6 @@ xfs_report_verify_errors(
 	return xfs_scan_all_inodes(ctx, xfs_report_verify_inode, &vei);
 }
 
-/* Verify disk blocks with GETFSMAP */
-
-struct xfs_verify_extent {
-	struct read_verify_pool	*readverify;
-	struct bitmap		*d_bad;		/* bytes */
-	struct bitmap		*r_bad;		/* bytes */
-};
-
 /* Report an IO error resulting from read-verify based off getfsmap. */
 static bool
 xfs_check_rmap_error_report(
@@ -393,7 +396,9 @@ xfs_check_rmap(
 	void				*arg)
 {
 	struct xfs_verify_extent	*ve = arg;
-	struct disk			*disk;
+	struct read_verify_pool		*rvp;
+
+	rvp = xfs_dev_to_pool(ctx, ve, map->fmr_device);
 
 	dbg_printf("rmap dev %d:%d phys %"PRIu64" owner %"PRId64
 			" offset %"PRIu64" len %"PRIu64" flags 0x%x\n",
@@ -420,19 +425,32 @@ xfs_check_rmap(
 	/* XXX: Filter out directory data blocks. */
 
 	/* Schedule the read verify command for (eventual) running. */
-	disk = xfs_dev_to_disk(ctx, map->fmr_device);
-
-	read_verify_schedule_io(ve->readverify, disk, map->fmr_physical,
-			map->fmr_length, ve);
+	read_verify_schedule_io(rvp, map->fmr_physical, map->fmr_length, ve);
 
 out:
 	/* Is this the last extent?  Fire off the read. */
 	if (map->fmr_flags & FMR_OF_LAST)
-		read_verify_force_io(ve->readverify);
+		read_verify_force_io(rvp);
 
 	return true;
 }
 
+/* Wait for read/verify actions to finish, then return # bytes checked. */
+static uint64_t
+clean_pool(
+	struct read_verify_pool	*rvp)
+{
+	uint64_t		ret;
+
+	if (!rvp)
+		return 0;
+
+	read_verify_pool_flush(rvp);
+	ret += read_verify_bytes(rvp);
+	read_verify_pool_destroy(rvp);
+	return ret;
+}
+
 /*
  * Read verify all the file data blocks in a filesystem.  Since XFS doesn't
  * do data checksums, we trust that the underlying storage will pass back
@@ -445,7 +463,7 @@ bool
 xfs_scan_blocks(
 	struct scrub_ctx		*ctx)
 {
-	struct xfs_verify_extent	ve;
+	struct xfs_verify_extent	ve = { NULL };
 	bool				moveon;
 
 	moveon = bitmap_init(&ve.d_bad);
@@ -460,21 +478,43 @@ xfs_scan_blocks(
 		goto out_dbad;
 	}
 
-	ve.readverify = read_verify_pool_init(ctx, ctx->geo.blocksize,
-			xfs_check_rmap_ioerr, disk_heads(ctx->datadev),
+	ve.rvp_data = read_verify_pool_init(ctx, ctx->datadev,
+			ctx->geo.blocksize, xfs_check_rmap_ioerr,
 			scrub_nproc(ctx));
-	if (!ve.readverify) {
+	if (!ve.rvp_data) {
 		moveon = false;
 		str_info(ctx, ctx->mntpoint,
-_("Could not create media verifier."));
+_("Could not create data device media verifier."));
 		goto out_rbad;
 	}
+	if (ctx->logdev) {
+		ve.rvp_log = read_verify_pool_init(ctx, ctx->logdev,
+				ctx->geo.blocksize, xfs_check_rmap_ioerr,
+				scrub_nproc(ctx));
+		if (!ve.rvp_log) {
+			moveon = false;
+			str_info(ctx, ctx->mntpoint,
+	_("Could not create log device media verifier."));
+			goto out_datapool;
+		}
+	}
+	if (ctx->rtdev) {
+		ve.rvp_realtime = read_verify_pool_init(ctx, ctx->rtdev,
+				ctx->geo.blocksize, xfs_check_rmap_ioerr,
+				scrub_nproc(ctx));
+		if (!ve.rvp_realtime) {
+			moveon = false;
+			str_info(ctx, ctx->mntpoint,
+	_("Could not create realtime device media verifier."));
+			goto out_logpool;
+		}
+	}
 	moveon = xfs_scan_all_spacemaps(ctx, xfs_check_rmap, &ve);
 	if (!moveon)
-		goto out_pool;
-	read_verify_pool_flush(ve.readverify);
-	ctx->bytes_checked += read_verify_bytes(ve.readverify);
-	read_verify_pool_destroy(ve.readverify);
+		goto out_rtpool;
+	ctx->bytes_checked += clean_pool(ve.rvp_data);
+	ctx->bytes_checked += clean_pool(ve.rvp_log);
+	ctx->bytes_checked += clean_pool(ve.rvp_realtime);
 
 	/* Scan the whole dir tree to see what matches the bad extents. */
 	if (!bitmap_empty(ve.d_bad) || !bitmap_empty(ve.r_bad))
@@ -484,8 +524,14 @@ _("Could not create media verifier."));
 	bitmap_free(&ve.d_bad);
 	return moveon;
 
-out_pool:
-	read_verify_pool_destroy(ve.readverify);
+out_rtpool:
+	if (ve.rvp_realtime)
+		read_verify_pool_destroy(ve.rvp_realtime);
+out_logpool:
+	if (ve.rvp_log)
+		read_verify_pool_destroy(ve.rvp_log);
+out_datapool:
+	read_verify_pool_destroy(ve.rvp_data);
 out_rbad:
 	bitmap_free(&ve.r_bad);
 out_dbad:
diff --git a/scrub/read_verify.c b/scrub/read_verify.c
index b5774736..4a9b91f2 100644
--- a/scrub/read_verify.c
+++ b/scrub/read_verify.c
@@ -50,6 +50,7 @@ struct read_verify_pool {
 	void			*readbuf;	/* read buffer */
 	struct ptcounter	*verified_bytes;
 	struct ptvar		*rvstate;	/* combines read requests */
+	struct disk		*disk;		/* which disk? */
 	read_verify_ioerr_fn_t	ioerr_fn;	/* io error callback */
 	size_t			miniosz;	/* minimum io size, bytes */
 };
@@ -57,19 +58,18 @@ struct read_verify_pool {
 /*
  * Create a thread pool to run read verifiers.
  *
+ * @disk is the disk we want to verify.
  * @miniosz is the minimum size of an IO to expect (in bytes).
  * @ioerr_fn will be called when IO errors occur.
- * @nproc is the maximum number of verify requests that may be sent to a disk
- * at any given time.
  * @submitter_threads is the number of threads that may be sending verify
  * requests at any given time.
  */
 struct read_verify_pool *
 read_verify_pool_init(
 	struct scrub_ctx		*ctx,
+	struct disk			*disk,
 	size_t				miniosz,
 	read_verify_ioerr_fn_t		ioerr_fn,
-	unsigned int			nproc,
 	unsigned int			submitter_threads)
 {
 	struct read_verify_pool		*rvp;
@@ -89,6 +89,7 @@ read_verify_pool_init(
 		goto out_buf;
 	rvp->miniosz = miniosz;
 	rvp->ctx = ctx;
+	rvp->disk = disk;
 	rvp->ioerr_fn = ioerr_fn;
 	rvp->rvstate = ptvar_init(submitter_threads,
 			sizeof(struct read_verify));
@@ -97,7 +98,8 @@ read_verify_pool_init(
 	/* Run in the main thread if we only want one thread. */
 	if (nproc == 1)
 		nproc = 0;
-	ret = workqueue_create(&rvp->wq, (struct xfs_mount *)rvp, nproc);
+	ret = workqueue_create(&rvp->wq, (struct xfs_mount *)rvp,
+			disk_heads(disk));
 	if (ret)
 		goto out_rvstate;
 	return rvp;
@@ -150,17 +152,16 @@ read_verify(
 	rvp = (struct read_verify_pool *)wq->wq_ctx;
 	while (rv->io_length > 0) {
 		len = min(rv->io_length, RVP_IO_MAX_SIZE);
-		dbg_printf("diskverify %d %"PRIu64" %zu\n", rv->io_disk->d_fd,
-				rv->io_start, len);
-		sz = disk_read_verify(rv->io_disk, rvp->readbuf,
+		dbg_printf("diskverify %d %"PRIu64" %zu\n", rvp->disk->d_fd,
 				rv->io_start, len);
+		sz = disk_read_verify(rvp->disk, rvp->readbuf, rv->io_start,
+				len);
 		if (sz < 0) {
 			dbg_printf("IOERR %d %"PRIu64" %zu\n",
-					rv->io_disk->d_fd,
-					rv->io_start, len);
+					rvp->disk->d_fd, rv->io_start, len);
 			/* IO error, so try the next logical block. */
 			len = rvp->miniosz;
-			rvp->ioerr_fn(rvp->ctx, rv->io_disk, rv->io_start, len,
+			rvp->ioerr_fn(rvp->ctx, rvp->disk, rv->io_start, len,
 					errno, rv->io_end_arg);
 		}
 
@@ -184,11 +185,11 @@ read_verify_queue(
 	bool				ret;
 
 	dbg_printf("verify fd %d start %"PRIu64" len %"PRIu64"\n",
-			rv->io_disk->d_fd, rv->io_start, rv->io_length);
+			rvp->disk->d_fd, rv->io_start, rv->io_length);
 
 	tmp = malloc(sizeof(struct read_verify));
 	if (!tmp) {
-		rvp->ioerr_fn(rvp->ctx, rv->io_disk, rv->io_start,
+		rvp->ioerr_fn(rvp->ctx, rvp->disk, rv->io_start,
 				rv->io_length, errno, rv->io_end_arg);
 		return true;
 	}
@@ -212,7 +213,6 @@ _("Could not queue read-verify work."));
 bool
 read_verify_schedule_io(
 	struct read_verify_pool		*rvp,
-	struct disk			*disk,
 	uint64_t			start,
 	uint64_t			length,
 	void				*end_arg)
@@ -231,7 +231,7 @@ read_verify_schedule_io(
 	 * reporting is the same, and the two extents are close,
 	 * we can combine them.
 	 */
-	if (rv->io_length > 0 && disk == rv->io_disk &&
+	if (rv->io_length > 0 &&
 	    end_arg == rv->io_end_arg &&
 	    ((start >= rv->io_start && start <= rv_end + RVP_IO_BATCH_LOCALITY) ||
 	     (rv->io_start >= start &&
@@ -244,7 +244,6 @@ read_verify_schedule_io(
 			return read_verify_queue(rvp, rv);
 
 		/* Stash the new IO. */
-		rv->io_disk = disk;
 		rv->io_start = start;
 		rv->io_length = length;
 		rv->io_end_arg = end_arg;
diff --git a/scrub/read_verify.h b/scrub/read_verify.h
index 1e7fd83f..5fabe5e0 100644
--- a/scrub/read_verify.h
+++ b/scrub/read_verify.h
@@ -8,6 +8,7 @@
 
 struct scrub_ctx;
 struct read_verify_pool;
+struct disk;
 
 /* Function called when an IO error happens. */
 typedef void (*read_verify_ioerr_fn_t)(struct scrub_ctx *ctx,
@@ -15,13 +16,14 @@ typedef void (*read_verify_ioerr_fn_t)(struct scrub_ctx *ctx,
 		int error, void *arg);
 
 struct read_verify_pool *read_verify_pool_init(struct scrub_ctx *ctx,
-		size_t miniosz, read_verify_ioerr_fn_t ioerr_fn,
-		unsigned int nproc, unsigned int submitter_threads);
+		struct disk *disk, size_t miniosz,
+		read_verify_ioerr_fn_t ioerr_fn,
+		unsigned int submitter_threads);
 void read_verify_pool_flush(struct read_verify_pool *rvp);
 void read_verify_pool_destroy(struct read_verify_pool *rvp);
 
-bool read_verify_schedule_io(struct read_verify_pool *rvp, struct disk *disk,
-		uint64_t start, uint64_t length, void *end_arg);
+bool read_verify_schedule_io(struct read_verify_pool *rvp, uint64_t start,
+		uint64_t length, void *end_arg);
 bool read_verify_force_io(struct read_verify_pool *rvp);
 uint64_t read_verify_bytes(struct read_verify_pool *rvp);
 

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 10/23] xfs_scrub: don't close mnt_fd when mnt_fd open fails
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (8 preceding siblings ...)
  2019-03-01 23:27 ` [PATCH 09/23] xfs_scrub: one read/verify pool per disk Darrick J. Wong
@ 2019-03-01 23:27 ` Darrick J. Wong
  2019-03-01 23:28 ` [PATCH 11/23] xfs_scrub: check label for misleading characters Darrick J. Wong
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:27 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

If we fail to open the mountpoint during phase 1 of scrub, don't bother
trying to close the file descriptor since it's silly to spray error
messages about that.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/phase1.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)


diff --git a/scrub/phase1.c b/scrub/phase1.c
index 6b472147..04a5f4a9 100644
--- a/scrub/phase1.c
+++ b/scrub/phase1.c
@@ -59,9 +59,11 @@ xfs_cleanup_fs(
 	if (ctx->datadev)
 		disk_close(ctx->datadev);
 	fshandle_destroy();
-	error = close(ctx->mnt_fd);
-	if (error)
-		str_errno(ctx, _("closing mountpoint fd"));
+	if (ctx->mnt_fd >= 0) {
+		error = close(ctx->mnt_fd);
+		if (error)
+			str_errno(ctx, _("closing mountpoint fd"));
+	}
 	fs_table_destroy();
 
 	return true;

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 11/23] xfs_scrub: check label for misleading characters
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (9 preceding siblings ...)
  2019-03-01 23:27 ` [PATCH 10/23] xfs_scrub: don't close mnt_fd when mnt_fd open fails Darrick J. Wong
@ 2019-03-01 23:28 ` Darrick J. Wong
  2019-03-01 23:28 ` [PATCH 12/23] mkfs: validate extent size hint parameters Darrick J. Wong
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:28 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Make sure that we can retrieve the label and that it doesn't contain
anything potentially misleading.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 configure.ac          |    1 +
 include/builddefs.in  |    1 +
 m4/package_libcdev.m4 |   17 +++++++++++++++
 scrub/Makefile        |    4 ++++
 scrub/phase5.c        |   55 +++++++++++++++++++++++++++++++++++++++++++++++++
 scrub/unicrash.c      |   24 +++++++++++++++++++++
 scrub/unicrash.h      |    5 ++++
 7 files changed, 107 insertions(+)


diff --git a/configure.ac b/configure.ac
index ccc6e292..c64742b7 100644
--- a/configure.ac
+++ b/configure.ac
@@ -197,6 +197,7 @@ AC_HAVE_SG_IO
 AC_HAVE_HDIO_GETGEO
 AC_CONFIG_SYSTEMD_SYSTEM_UNIT_DIR
 AC_CONFIG_CROND_DIR
+AC_HAVE_GETFSLABEL
 
 if test "$enable_blkid" = yes; then
 AC_HAVE_BLKID_TOPO
diff --git a/include/builddefs.in b/include/builddefs.in
index c5b38b07..b4839d9a 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -117,6 +117,7 @@ HAVE_SYSTEMD = @have_systemd@
 SYSTEMD_SYSTEM_UNIT_DIR = @systemd_system_unit_dir@
 HAVE_CROND = @have_crond@
 CROND_DIR = @crond_dir@
+HAVE_GETFSLABEL = @have_getfslabel@
 
 GCCFLAGS = -funsigned-char -fno-strict-aliasing -Wall
 #	   -Wbitwise -Wno-transparent-union -Wno-old-initializer -Wno-decl
diff --git a/m4/package_libcdev.m4 b/m4/package_libcdev.m4
index 2c0c72ce..6f8e3811 100644
--- a/m4/package_libcdev.m4
+++ b/m4/package_libcdev.m4
@@ -429,3 +429,20 @@ AC_DEFUN([AC_PACKAGE_CHECK_LTO],
     AC_SUBST(lto_cflags)
     AC_SUBST(lto_ldflags)
   ])
+
+#
+# Check if we have the GETFSLABEL ioctl
+#
+AC_DEFUN([AC_HAVE_GETFSLABEL],
+  [ AC_MSG_CHECKING([for FS_IOC_GETFSLABEL])
+    AC_TRY_LINK([
+#define _GNU_SOURCE
+#include <linux/fs.h>
+    ], [
+         char label[FSLABEL_MAX];
+         ioctl(0, FS_IOC_GETFSLABEL, &label);
+    ], have_getfslabel=yes
+       AC_MSG_RESULT(yes),
+       AC_MSG_RESULT(no))
+    AC_SUBST(have_getfslabel)
+  ])
diff --git a/scrub/Makefile b/scrub/Makefile
index 6e155c2c..04921888 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -99,6 +99,10 @@ ifeq ($(HAVE_HDIO_GETGEO),yes)
 LCFLAGS += -DHAVE_HDIO_GETGEO
 endif
 
+ifeq ($(HAVE_GETFSLABEL),yes)
+LCFLAGS += -DHAVE_GETFSLABEL
+endif
+
 LDIRT = $(XFS_SCRUB_ALL_PROG) *.service *.cron
 
 default: depend $(LTCOMMAND) $(XFS_SCRUB_ALL_PROG) $(OPTIONAL_TARGETS)
diff --git a/scrub/phase5.c b/scrub/phase5.c
index 6ffcec2d..331492b7 100644
--- a/scrub/phase5.c
+++ b/scrub/phase5.c
@@ -11,6 +11,9 @@
 #ifdef HAVE_LIBATTR
 # include <attr/attributes.h>
 #endif
+#ifdef HAVE_GETFSLABEL
+# include <linux/fs.h>
+#endif
 #include "handle.h"
 #include "list.h"
 #include "path.h"
@@ -282,6 +285,54 @@ xfs_scrub_connections(
 	return *pmoveon ? 0 : XFS_ITERATE_INODES_ABORT;
 }
 
+#ifdef HAVE_GETFSLABEL
+/*
+ * Check the filesystem label for Unicode normalization problems or misleading
+ * sequences.
+ */
+static bool
+xfs_scrub_fs_label(
+	struct scrub_ctx		*ctx)
+{
+	char				label[FSLABEL_MAX];
+	struct unicrash			*uc = NULL;
+	bool				moveon = true;
+	int				error;
+
+	moveon = unicrash_fs_label_init(&uc, ctx);
+	if (!moveon)
+		return false;
+
+	/* Retrieve label; quietly bail if we don't support that. */
+	error = ioctl(ctx->mnt_fd, FS_IOC_GETFSLABEL, &label);
+	if (error) {
+		if (errno != EOPNOTSUPP && errno != ENOTTY) {
+			moveon = false;
+			perror(ctx->mntpoint);
+		}
+		goto out;
+	}
+
+	/* Ignore empty labels. */
+	if (label[0] == 0)
+		goto out;
+
+	/* Otherwise check for weirdness. */
+	if (uc)
+		moveon = unicrash_check_fs_label(uc, ctx->mntpoint, label);
+	else
+		moveon = xfs_scrub_check_name(ctx, ctx->mntpoint,
+				_("filesystem label"), label);
+	if (!moveon)
+		goto out;
+out:
+	unicrash_free(uc);
+	return moveon;
+}
+#else
+# define xfs_scrub_fs_label(c)	(true)
+#endif /* HAVE_GETFSLABEL */
+
 /* Check directory connectivity. */
 bool
 xfs_scan_connections(
@@ -296,6 +347,10 @@ _("Filesystem has errors, skipping connectivity checks."));
 		return true;
 	}
 
+	moveon = xfs_scrub_fs_label(ctx);
+	if (!moveon)
+		return false;
+
 	ret = xfs_scan_all_inodes(ctx, xfs_scrub_connections, &moveon);
 	if (!ret)
 		moveon = false;
diff --git a/scrub/unicrash.c b/scrub/unicrash.c
index a95fc305..121eedbc 100644
--- a/scrub/unicrash.c
+++ b/scrub/unicrash.c
@@ -465,6 +465,15 @@ unicrash_xattr_init(
 			is_only_root_writable(bstat));
 }
 
+/* Initialize the collision detector for a filesystem label. */
+bool
+unicrash_fs_label_init(
+	struct unicrash		**ucp,
+	struct scrub_ctx	*ctx)
+{
+	return unicrash_init(ucp, ctx, false, 16, true);
+}
+
 /* Free the crash detector. */
 void
 unicrash_free(
@@ -698,3 +707,18 @@ unicrash_check_xattr_name(
 	return __unicrash_check_name(uc, descr, _("extended attribute"),
 			attrname, 0);
 }
+
+/*
+ * Check the fs label for unicode normalization problems or misleading bits.
+ */
+bool
+unicrash_check_fs_label(
+	struct unicrash		*uc,
+	const char		*descr,
+	const char		*label)
+{
+	if (!uc)
+		return true;
+	return __unicrash_check_name(uc, descr, _("filesystem label"),
+			label, 0);
+}
diff --git a/scrub/unicrash.h b/scrub/unicrash.h
index 7d7276a8..85fcabc6 100644
--- a/scrub/unicrash.h
+++ b/scrub/unicrash.h
@@ -17,17 +17,22 @@ bool unicrash_dir_init(struct unicrash **ucp, struct scrub_ctx *ctx,
 		struct xfs_bstat *bstat);
 bool unicrash_xattr_init(struct unicrash **ucp, struct scrub_ctx *ctx,
 		struct xfs_bstat *bstat);
+bool unicrash_fs_label_init(struct unicrash **ucp, struct scrub_ctx *ctx);
 void unicrash_free(struct unicrash *uc);
 bool unicrash_check_dir_name(struct unicrash *uc, const char *descr,
 		struct dirent *dirent);
 bool unicrash_check_xattr_name(struct unicrash *uc, const char *descr,
 		const char *attrname);
+bool unicrash_check_fs_label(struct unicrash *uc, const char *descr,
+		const char *label);
 #else
 # define unicrash_dir_init(u, c, b)		(true)
 # define unicrash_xattr_init(u, c, b)		(true)
+# define unicrash_label_init(u, c)		(true)
 # define unicrash_free(u)			do {(u) = (u);} while (0)
 # define unicrash_check_dir_name(u, d, n)	(true)
 # define unicrash_check_xattr_name(u, d, n)	(true)
+# define unicrash_check_fs_label(u, d, n)	(true)
 #endif /* HAVE_LIBICU */
 
 #endif /* XFS_SCRUB_UNICRASH_H_ */

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 12/23] mkfs: validate extent size hint parameters
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (10 preceding siblings ...)
  2019-03-01 23:28 ` [PATCH 11/23] xfs_scrub: check label for misleading characters Darrick J. Wong
@ 2019-03-01 23:28 ` Darrick J. Wong
  2019-03-01 23:28 ` [PATCH 13/23] xfs_repair: reinitialize the root directory nlink correctly Darrick J. Wong
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:28 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Validate extent and cow extent size hints that are passed to mkfs so
that we avoid formatting a filesystem that will never mount.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 mkfs/xfs_mkfs.c |   64 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)


diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index d1387ddf..9e1c6ec5 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -2202,6 +2202,66 @@ validate_rtextsize(
 	ASSERT(cfg->rtextblocks);
 }
 
+/* Validate the incoming extsize hint as if we were a file. */
+static void
+validate_extsize_hint(
+	struct xfs_mount	*mp,
+	struct cli_params	*cli)
+{
+	xfs_failaddr_t		fa;
+	bool			rt;
+	uint16_t		flags = 0;
+
+	rt = (cli->fsx.fsx_xflags & XFS_DIFLAG_RTINHERIT);
+
+	if (cli->fsx.fsx_xflags & XFS_DIFLAG_EXTSZINHERIT)
+		flags |= XFS_DIFLAG_EXTSIZE;
+
+	fa = libxfs_inode_validate_extsize(mp, cli->fsx.fsx_extsize, S_IFREG,
+			flags);
+	if (rt && fa == NULL)
+		fa = libxfs_inode_validate_extsize(mp, cli->fsx.fsx_extsize,
+				S_IFREG, flags | XFS_DIFLAG_REALTIME);
+	if (fa == NULL)
+		return;
+
+	if (rt && mp->m_sb.sb_rextsize > 1)
+		fprintf(stderr,
+	_("illegal extent size hint %lld, must be less than %u and a multiple of %u.\n"),
+				(long long)cli->fsx.fsx_extsize,
+				min(MAXEXTLEN, mp->m_sb.sb_agblocks / 2),
+				mp->m_sb.sb_rextsize);
+	else
+		fprintf(stderr,
+	_("illegal extent size hint %lld, must be less than %u.\n"),
+				(long long)cli->fsx.fsx_extsize,
+				min(MAXEXTLEN, mp->m_sb.sb_agblocks / 2));
+	usage();
+}
+
+/* Validate the incoming CoW extsize hint as if we were a file. */
+static void
+validate_cowextsize_hint(
+	struct xfs_mount	*mp,
+	struct cli_params	*cli)
+{
+	xfs_failaddr_t		fa;
+	uint64_t		flags2 = 0;
+
+	if (cli->fsx.fsx_xflags & FS_XFLAG_COWEXTSIZE)
+		flags2 |= XFS_DIFLAG2_COWEXTSIZE;
+	fa = libxfs_inode_validate_cowextsize(mp, cli->fsx.fsx_cowextsize,
+			S_IFREG, 0, flags2);
+	if (fa == NULL)
+		return;
+
+	fprintf(stderr,
+_("illegal CoW extent size hint %lld, must be less than %u.\n"),
+			(long long)cli->fsx.fsx_cowextsize,
+			min(MAXEXTLEN, mp->m_sb.sb_agblocks / 2));
+	usage();
+}
+
 /*
  * Validate the configured stripe geometry, or is none is specified, pull
  * the configuration from the underlying device.
@@ -3945,6 +4005,10 @@ main(
 
 	finish_superblock_setup(&cfg, mp, sbp);
 
+	/* Validate the extent size hints now that @mp is fully set up. */
+	validate_extsize_hint(mp, &cli);
+	validate_cowextsize_hint(mp, &cli);
+
 	/* Print the intended geometry of the fs. */
 	if (!quiet || dry_run) {
 		struct xfs_fsop_geom	geo;

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 13/23] xfs_repair: reinitialize the root directory nlink correctly
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (11 preceding siblings ...)
  2019-03-01 23:28 ` [PATCH 12/23] mkfs: validate extent size hint parameters Darrick J. Wong
@ 2019-03-01 23:28 ` Darrick J. Wong
  2019-04-09 20:43   ` Eric Sandeen
  2019-03-01 23:28 ` [PATCH 14/23] xfs_repair: bump the irec on-disk nlink when adding lost+found Darrick J. Wong
                   ` (10 subsequent siblings)
  23 siblings, 1 reply; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:28 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

In mk_root_dir, we reinitialize the root directory inode with a link
count of 1.  This differs from mkfs parseproto, which initializes the
root to have a link count of 2.  The nlink discrepancy in repair is
caught and corrected during phase 7, but this is unnecessary since we
should set it properly in the first place.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 repair/phase6.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/repair/phase6.c b/repair/phase6.c
index 9477bc25..8a50b350 100644
--- a/repair/phase6.c
+++ b/repair/phase6.c
@@ -891,7 +891,7 @@ mk_root_dir(xfs_mount_t *mp)
 	ip->i_d.di_format = XFS_DINODE_FMT_EXTENTS;
 	ip->i_d.di_aformat = XFS_DINODE_FMT_EXTENTS;
 
-	set_nlink(VFS_I(ip), 1);	/* account for . */
+	set_nlink(VFS_I(ip), 2);	/* account for . and .. */
 
 	times = XFS_ICHGTIME_CHG | XFS_ICHGTIME_MOD;
 	if (ip->i_d.di_version == 3) {

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 14/23] xfs_repair: bump the irec on-disk nlink when adding lost+found
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (12 preceding siblings ...)
  2019-03-01 23:28 ` [PATCH 13/23] xfs_repair: reinitialize the root directory nlink correctly Darrick J. Wong
@ 2019-03-01 23:28 ` Darrick J. Wong
  2019-03-01 23:28 ` [PATCH 15/23] xfs_repair: fix uninitialized variable warnings Darrick J. Wong
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:28 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

We increment the nlink of the root directory inode when creating a
"lost+found" directory during phase 6, but we don't update the irec copy
of the root dir nlink.  This normally gets papered over by phase 7, but
this can fail badly if:

1) The root directory had an entry to a busted subdirectory, so
   that root directory will have nlink == 3, but in the ino_tree,
   counted_nlinks == 2 and disk_nlinks == 3.

2) Phase 6 creates lost+found to root the files that were in the busted
   directory, we'll set nlink = 4 and counted_nlinks = 3.  The correct
   nlink is 3 ('.', '..', 'lost+found'), not 4.

3) During phase 7, we see that counted_nlinks == disk_nlinks and so we
   totally fail to correct the on-disk inode.

4) A subsequent run of xfs_repair complains about the nlink being 4
   instead of 3.

To fix this, we have to adjust the irec's disk_nlinks in step 2 so that
phase 7 seeds that counted_nlinks < disk_nlinks and resets nlink to
counted_nlinks.  This can be reproduced somewhat frequently by xfs/117.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 repair/phase6.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)


diff --git a/repair/phase6.c b/repair/phase6.c
index 8a50b350..194cfdbf 100644
--- a/repair/phase6.c
+++ b/repair/phase6.c
@@ -1019,6 +1019,7 @@ mk_orphanage(xfs_mount_t *mp)
 	 */
 	set_inode_used(irec, ino_offset);
 	add_inode_ref(irec, ino_offset);
+	add_inode_reached(irec, ino_offset);
 
 	/*
 	 * now that we know the transaction will stay around,
@@ -1037,14 +1038,14 @@ mk_orphanage(xfs_mount_t *mp)
 
 	/*
 	 * bump up the link count in the root directory to account
-	 * for .. in the new directory
+	 * for .. in the new directory, and update the irec copy of the
+	 * on-disk nlink so we don't fail the link count check later.
 	 */
 	inc_nlink(VFS_I(pip));
-	add_inode_ref(find_inode_rec(mp,
-				XFS_INO_TO_AGNO(mp, mp->m_sb.sb_rootino),
-				XFS_INO_TO_AGINO(mp, mp->m_sb.sb_rootino)), 0);
-
-
+	irec = find_inode_rec(mp, XFS_INO_TO_AGNO(mp, mp->m_sb.sb_rootino),
+				  XFS_INO_TO_AGINO(mp, mp->m_sb.sb_rootino));
+	add_inode_ref(irec, 0);
+	set_inode_disk_nlinks(irec, 0, get_inode_disk_nlinks(irec, 0) + 1);
 
 	libxfs_trans_log_inode(tp, pip, XFS_ILOG_CORE);
 	libxfs_dir_init(tp, ip, pip);
@@ -1056,7 +1057,6 @@ mk_orphanage(xfs_mount_t *mp)
 	}
 	libxfs_irele(ip);
 	libxfs_irele(pip);
-	add_inode_reached(irec,ino_offset);
 
 	return(ino);
 }

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 15/23] xfs_repair: fix uninitialized variable warnings
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (13 preceding siblings ...)
  2019-03-01 23:28 ` [PATCH 14/23] xfs_repair: bump the irec on-disk nlink when adding lost+found Darrick J. Wong
@ 2019-03-01 23:28 ` Darrick J. Wong
  2019-03-01 23:28 ` [PATCH 16/23] xfs_db: fix finobt record decoding when sparse inodes enabled Darrick J. Wong
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:28 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Fix some uninitialized variable warnings because ASSERT disappears if
DEBUG is not defined.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 repair/dinode.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


diff --git a/repair/dinode.c b/repair/dinode.c
index f670bf87..c0a56daa 100644
--- a/repair/dinode.c
+++ b/repair/dinode.c
@@ -1176,8 +1176,8 @@ process_quota_inode(
 	struct xfs_buf		*bp;
 	xfs_filblks_t		dqchunklen;
 	uint			dqperchunk;
-	int			quota_type;
-	char			*quota_string;
+	int			quota_type = 0;
+	char			*quota_string = NULL;
 	xfs_dqid_t		dqid;
 	xfs_fileoff_t		qbno;
 	int			i;

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 16/23] xfs_db: fix finobt record decoding when sparse inodes enabled
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (14 preceding siblings ...)
  2019-03-01 23:28 ` [PATCH 15/23] xfs_repair: fix uninitialized variable warnings Darrick J. Wong
@ 2019-03-01 23:28 ` Darrick J. Wong
  2019-03-01 23:28 ` [PATCH 17/23] xfs_db: use TYP_FINOBT for finobt metadump Darrick J. Wong
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:28 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Use the sparse inobt record field decoder (inobt_spcrc_hfld) to decode
finobt records when sparse inodes are enabled.  Otherwise, xfs_db
prints out bogus things like:

recs[1] = [startino,freecount,free]
1:[214720,16429,0xfffffffffff80000]

There can never be 16429 records in an inode btree record; instead it
should print:

recs[1] = [startino,holemask,count,freecount,free]
1:[214720,0,64,45,0xfffffffffff80000]

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 db/type.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/db/type.c b/db/type.c
index f5f65042..d42935a2 100644
--- a/db/type.c
+++ b/db/type.c
@@ -107,7 +107,7 @@ static const typ_t	__typtab_crc[] = {
 	{ TYP_SYMLINK, "symlink", handle_struct, symlink_crc_hfld,
 		&xfs_symlink_buf_ops, XFS_SYMLINK_CRC_OFF },
 	{ TYP_TEXT, "text", handle_text, NULL, NULL, TYP_F_NO_CRC_OFF },
-	{ TYP_FINOBT, "finobt", handle_struct, inobt_crc_hfld,
+	{ TYP_FINOBT, "finobt", handle_struct, inobt_spcrc_hfld,
 		&xfs_inobt_buf_ops, XFS_BTREE_SBLOCK_CRC_OFF },
 	{ TYP_NONE, NULL }
 };

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 17/23] xfs_db: use TYP_FINOBT for finobt metadump
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (15 preceding siblings ...)
  2019-03-01 23:28 ` [PATCH 16/23] xfs_db: fix finobt record decoding when sparse inodes enabled Darrick J. Wong
@ 2019-03-01 23:28 ` Darrick J. Wong
  2019-03-01 23:28 ` [PATCH 18/23] xfs_info: use findmnt to handle mounted block devices Darrick J. Wong
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:28 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Use the correct xfs_db type for dumping free inode btree blocks.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 db/metadump.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/db/metadump.c b/db/metadump.c
index 6ecd5685..3cce3012 100644
--- a/db/metadump.c
+++ b/db/metadump.c
@@ -2607,7 +2607,7 @@ copy_inodes(
 		levels = be32_to_cpu(agi->agi_free_level);
 
 		finobt = 1;
-		if (!scan_btree(agno, root, levels, TYP_INOBT, &finobt,
+		if (!scan_btree(agno, root, levels, TYP_FINOBT, &finobt,
 				scanfunc_ino))
 			return 0;
 	}

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 18/23] xfs_info: use findmnt to handle mounted block devices
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (16 preceding siblings ...)
  2019-03-01 23:28 ` [PATCH 17/23] xfs_db: use TYP_FINOBT for finobt metadump Darrick J. Wong
@ 2019-03-01 23:28 ` Darrick J. Wong
  2019-03-01 23:28 ` [PATCH 19/23] libfrog: hoist bitmap out of scrub Darrick J. Wong
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:28 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Use findmnt to determine if the passed-in argument is associated with a
mount point, and if so, use spaceman to query the mounted filesystem.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 debian/control       |    2 +-
 spaceman/xfs_info.sh |   14 +++++++++++---
 2 files changed, 12 insertions(+), 4 deletions(-)


diff --git a/debian/control b/debian/control
index f4f807b0..0b3205f5 100644
--- a/debian/control
+++ b/debian/control
@@ -8,7 +8,7 @@ Standards-Version: 4.0.0
 Homepage: https://xfs.wiki.kernel.org/
 
 Package: xfsprogs
-Depends: ${shlibs:Depends}, ${misc:Depends}, python3:any
+Depends: ${shlibs:Depends}, ${misc:Depends}, python3:any, util-linux
 Provides: fsck-backend
 Suggests: xfsdump, acl, attr, quota
 Breaks: xfsdump (<< 3.0.0)
diff --git a/spaceman/xfs_info.sh b/spaceman/xfs_info.sh
index ecf17f61..70978164 100755
--- a/spaceman/xfs_info.sh
+++ b/spaceman/xfs_info.sh
@@ -24,11 +24,19 @@ set -- extra "$@"
 shift $OPTIND
 case $# in
 	1)
-		if [ -b "$1" ] || [ -f "$1" ]; then
-			xfs_db -p xfs_info -c "info" $OPTS "$1"
+		arg="$1"
+
+		# See if we can map the arg to a loop device
+		loopdev="$(losetup -n -O NAME -j "${arg}" 2> /dev/null)"
+		test -n "${loopdev}" && arg="${loopdev}"
+
+		# If we find a mountpoint for the device, do a live query;
+		# otherwise try reading the fs with xfs_db.
+		if mountpt="$(findmnt -f -n -o TARGET "${arg}" 2> /dev/null)"; then
+			xfs_spaceman -p xfs_info -c "info" $OPTS "${mountpt}"
 			status=$?
 		else
-			xfs_spaceman -p xfs_info -c "info" $OPTS "$1"
+			xfs_db -p xfs_info -c "info" $OPTS "${arg}"
 			status=$?
 		fi
 		;;

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 19/23] libfrog: hoist bitmap out of scrub
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (17 preceding siblings ...)
  2019-03-01 23:28 ` [PATCH 18/23] xfs_info: use findmnt to handle mounted block devices Darrick J. Wong
@ 2019-03-01 23:28 ` Darrick J. Wong
  2019-03-01 23:28 ` [PATCH 20/23] xfs_repair: correctly account for free space btree shrinks when fixing freelist Darrick J. Wong
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:28 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move the bitmap code to libfrog so that we can use bitmaps in
xfs_repair.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 include/bitmap.h |   24 +++
 libfrog/Makefile |    1 
 libfrog/bitmap.c |  393 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 scrub/Makefile   |    2 
 scrub/bitmap.c   |  393 ------------------------------------------------------
 scrub/bitmap.h   |   24 ---
 6 files changed, 418 insertions(+), 419 deletions(-)
 create mode 100644 include/bitmap.h
 create mode 100644 libfrog/bitmap.c
 delete mode 100644 scrub/bitmap.c
 delete mode 100644 scrub/bitmap.h


diff --git a/include/bitmap.h b/include/bitmap.h
new file mode 100644
index 00000000..e29a4335
--- /dev/null
+++ b/include/bitmap.h
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2018 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ */
+#ifndef LIBFROG_BITMAP_H_
+#define LIBFROG_BITMAP_H_
+
+struct bitmap {
+	pthread_mutex_t		bt_lock;
+	struct avl64tree_desc	*bt_tree;
+};
+
+bool bitmap_init(struct bitmap **bmap);
+void bitmap_free(struct bitmap **bmap);
+bool bitmap_set(struct bitmap *bmap, uint64_t start, uint64_t length);
+bool bitmap_iterate(struct bitmap *bmap,
+		bool (*fn)(uint64_t, uint64_t, void *), void *arg);
+bool bitmap_test(struct bitmap *bmap, uint64_t start,
+		uint64_t len);
+bool bitmap_empty(struct bitmap *bmap);
+void bitmap_dump(struct bitmap *bmap);
+
+#endif /* LIBFROG_BITMAP_H_ */
diff --git a/libfrog/Makefile b/libfrog/Makefile
index dbff9596..f5a0539b 100644
--- a/libfrog/Makefile
+++ b/libfrog/Makefile
@@ -12,6 +12,7 @@ LT_AGE = 0
 
 CFILES = \
 avl64.c \
+bitmap.c \
 convert.c \
 crc32.c \
 fsgeom.c \
diff --git a/libfrog/bitmap.c b/libfrog/bitmap.c
new file mode 100644
index 00000000..aecdba0d
--- /dev/null
+++ b/libfrog/bitmap.c
@@ -0,0 +1,393 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2018 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ */
+#include "xfs.h"
+#include <stdint.h>
+#include <stdlib.h>
+#include <assert.h>
+#include <pthread.h>
+#include "platform_defs.h"
+#include "avl64.h"
+#include "list.h"
+#include "bitmap.h"
+
+/*
+ * Space Efficient Bitmap
+ *
+ * Implements a space-efficient bitmap.  We use an AVL tree to manage
+ * extent records that tell us which ranges are set; the bitmap key is
+ * an arbitrary uint64_t.  The usual bitmap operations (set, clear,
+ * test, test and set) are supported, plus we can iterate set ranges.
+ */
+
+#define avl_for_each_range_safe(pos, n, l, first, last) \
+	for (pos = (first), n = pos->avl_nextino, l = (last)->avl_nextino; pos != (l); \
+			pos = n, n = pos ? pos->avl_nextino : NULL)
+
+#define avl_for_each_safe(tree, pos, n) \
+	for (pos = (tree)->avl_firstino, n = pos ? pos->avl_nextino : NULL; \
+			pos != NULL; \
+			pos = n, n = pos ? pos->avl_nextino : NULL)
+
+#define avl_for_each(tree, pos) \
+	for (pos = (tree)->avl_firstino; pos != NULL; pos = pos->avl_nextino)
+
+struct bitmap_node {
+	struct avl64node	btn_node;
+	uint64_t		btn_start;
+	uint64_t		btn_length;
+};
+
+static uint64_t
+extent_start(
+	struct avl64node	*node)
+{
+	struct bitmap_node	*btn;
+
+	btn = container_of(node, struct bitmap_node, btn_node);
+	return btn->btn_start;
+}
+
+static uint64_t
+extent_end(
+	struct avl64node	*node)
+{
+	struct bitmap_node	*btn;
+
+	btn = container_of(node, struct bitmap_node, btn_node);
+	return btn->btn_start + btn->btn_length;
+}
+
+static struct avl64ops bitmap_ops = {
+	extent_start,
+	extent_end,
+};
+
+/* Initialize a bitmap. */
+bool
+bitmap_init(
+	struct bitmap		**bmapp)
+{
+	struct bitmap		*bmap;
+
+	bmap = calloc(1, sizeof(struct bitmap));
+	if (!bmap)
+		return false;
+	bmap->bt_tree = malloc(sizeof(struct avl64tree_desc));
+	if (!bmap->bt_tree) {
+		free(bmap);
+		return false;
+	}
+
+	pthread_mutex_init(&bmap->bt_lock, NULL);
+	avl64_init_tree(bmap->bt_tree, &bitmap_ops);
+	*bmapp = bmap;
+
+	return true;
+}
+
+/* Free a bitmap. */
+void
+bitmap_free(
+	struct bitmap		**bmapp)
+{
+	struct bitmap		*bmap;
+	struct avl64node	*node;
+	struct avl64node	*n;
+	struct bitmap_node	*ext;
+
+	bmap = *bmapp;
+	avl_for_each_safe(bmap->bt_tree, node, n) {
+		ext = container_of(node, struct bitmap_node, btn_node);
+		free(ext);
+	}
+	free(bmap->bt_tree);
+	*bmapp = NULL;
+}
+
+/* Create a new bitmap extent node. */
+static struct bitmap_node *
+bitmap_node_init(
+	uint64_t		start,
+	uint64_t		len)
+{
+	struct bitmap_node	*ext;
+
+	ext = malloc(sizeof(struct bitmap_node));
+	if (!ext)
+		return NULL;
+
+	ext->btn_node.avl_nextino = NULL;
+	ext->btn_start = start;
+	ext->btn_length = len;
+
+	return ext;
+}
+
+/* Set a region of bits (locked). */
+static bool
+__bitmap_set(
+	struct bitmap		*bmap,
+	uint64_t		start,
+	uint64_t		length)
+{
+	struct avl64node	*firstn;
+	struct avl64node	*lastn;
+	struct avl64node	*pos;
+	struct avl64node	*n;
+	struct avl64node	*l;
+	struct bitmap_node	*ext;
+	uint64_t		new_start;
+	uint64_t		new_length;
+	struct avl64node	*node;
+	bool			res = true;
+
+	/* Find any existing nodes adjacent or within that range. */
+	avl64_findranges(bmap->bt_tree, start - 1, start + length + 1,
+			&firstn, &lastn);
+
+	/* Nothing, just insert a new extent. */
+	if (firstn == NULL && lastn == NULL) {
+		ext = bitmap_node_init(start, length);
+		if (!ext)
+			return false;
+
+		node = avl64_insert(bmap->bt_tree, &ext->btn_node);
+		if (node == NULL) {
+			free(ext);
+			errno = EEXIST;
+			return false;
+		}
+
+		return true;
+	}
+
+	assert(firstn != NULL && lastn != NULL);
+	new_start = start;
+	new_length = length;
+
+	avl_for_each_range_safe(pos, n, l, firstn, lastn) {
+		ext = container_of(pos, struct bitmap_node, btn_node);
+
+		/* Bail if the new extent is contained within an old one. */
+		if (ext->btn_start <= start &&
+		    ext->btn_start + ext->btn_length >= start + length)
+			return res;
+
+		/* Check for overlapping and adjacent extents. */
+		if (ext->btn_start + ext->btn_length >= start ||
+		    ext->btn_start <= start + length) {
+			if (ext->btn_start < start) {
+				new_start = ext->btn_start;
+				new_length += ext->btn_length;
+			}
+
+			if (ext->btn_start + ext->btn_length >
+			    new_start + new_length)
+				new_length = ext->btn_start + ext->btn_length -
+						new_start;
+
+			avl64_delete(bmap->bt_tree, pos);
+			free(ext);
+		}
+	}
+
+	ext = bitmap_node_init(new_start, new_length);
+	if (!ext)
+		return false;
+
+	node = avl64_insert(bmap->bt_tree, &ext->btn_node);
+	if (node == NULL) {
+		free(ext);
+		errno = EEXIST;
+		return false;
+	}
+
+	return res;
+}
+
+/* Set a region of bits. */
+bool
+bitmap_set(
+	struct bitmap		*bmap,
+	uint64_t		start,
+	uint64_t		length)
+{
+	bool			res;
+
+	pthread_mutex_lock(&bmap->bt_lock);
+	res = __bitmap_set(bmap, start, length);
+	pthread_mutex_unlock(&bmap->bt_lock);
+
+	return res;
+}
+
+#if 0	/* Unused, provided for completeness. */
+/* Clear a region of bits. */
+bool
+bitmap_clear(
+	struct bitmap		*bmap,
+	uint64_t		start,
+	uint64_t		len)
+{
+	struct avl64node	*firstn;
+	struct avl64node	*lastn;
+	struct avl64node	*pos;
+	struct avl64node	*n;
+	struct avl64node	*l;
+	struct bitmap_node	*ext;
+	uint64_t		new_start;
+	uint64_t		new_length;
+	struct avl64node	*node;
+	int			stat;
+
+	pthread_mutex_lock(&bmap->bt_lock);
+	/* Find any existing nodes over that range. */
+	avl64_findranges(bmap->bt_tree, start, start + len, &firstn, &lastn);
+
+	/* Nothing, we're done. */
+	if (firstn == NULL && lastn == NULL) {
+		pthread_mutex_unlock(&bmap->bt_lock);
+		return true;
+	}
+
+	assert(firstn != NULL && lastn != NULL);
+
+	/* Delete or truncate everything in sight. */
+	avl_for_each_range_safe(pos, n, l, firstn, lastn) {
+		ext = container_of(pos, struct bitmap_node, btn_node);
+
+		stat = 0;
+		if (ext->btn_start < start)
+			stat |= 1;
+		if (ext->btn_start + ext->btn_length > start + len)
+			stat |= 2;
+		switch (stat) {
+		case 0:
+			/* Extent totally within range; delete. */
+			avl64_delete(bmap->bt_tree, pos);
+			free(ext);
+			break;
+		case 1:
+			/* Extent is left-adjacent; truncate. */
+			ext->btn_length = start - ext->btn_start;
+			break;
+		case 2:
+			/* Extent is right-adjacent; move it. */
+			ext->btn_length = ext->btn_start + ext->btn_length -
+					(start + len);
+			ext->btn_start = start + len;
+			break;
+		case 3:
+			/* Extent overlaps both ends. */
+			ext->btn_length = start - ext->btn_start;
+			new_start = start + len;
+			new_length = ext->btn_start + ext->btn_length -
+					new_start;
+
+			ext = bitmap_node_init(new_start, new_length);
+			if (!ext)
+				return false;
+
+			node = avl64_insert(bmap->bt_tree, &ext->btn_node);
+			if (node == NULL) {
+				errno = EEXIST;
+				return false;
+			}
+			break;
+		}
+	}
+
+	pthread_mutex_unlock(&bmap->bt_lock);
+	return true;
+}
+#endif
+
+#ifdef DEBUG
+/* Iterate the set regions of this bitmap. */
+bool
+bitmap_iterate(
+	struct bitmap		*bmap,
+	bool			(*fn)(uint64_t, uint64_t, void *),
+	void			*arg)
+{
+	struct avl64node	*node;
+	struct bitmap_node	*ext;
+	bool			moveon = true;
+
+	pthread_mutex_lock(&bmap->bt_lock);
+	avl_for_each(bmap->bt_tree, node) {
+		ext = container_of(node, struct bitmap_node, btn_node);
+		moveon = fn(ext->btn_start, ext->btn_length, arg);
+		if (!moveon)
+			break;
+	}
+	pthread_mutex_unlock(&bmap->bt_lock);
+
+	return moveon;
+}
+#endif
+
+/* Do any bitmap extents overlap the given one?  (locked) */
+static bool
+__bitmap_test(
+	struct bitmap		*bmap,
+	uint64_t		start,
+	uint64_t		len)
+{
+	struct avl64node	*firstn;
+	struct avl64node	*lastn;
+
+	/* Find any existing nodes over that range. */
+	avl64_findranges(bmap->bt_tree, start, start + len, &firstn, &lastn);
+
+	return firstn != NULL && lastn != NULL;
+}
+
+/* Is any part of this range set? */
+bool
+bitmap_test(
+	struct bitmap		*bmap,
+	uint64_t		start,
+	uint64_t		len)
+{
+	bool			res;
+
+	pthread_mutex_lock(&bmap->bt_lock);
+	res = __bitmap_test(bmap, start, len);
+	pthread_mutex_unlock(&bmap->bt_lock);
+
+	return res;
+}
+
+/* Are none of the bits set? */
+bool
+bitmap_empty(
+	struct bitmap		*bmap)
+{
+	return bmap->bt_tree->avl_firstino == NULL;
+}
+
+#ifdef DEBUG
+static bool
+bitmap_dump_fn(
+	uint64_t		startblock,
+	uint64_t		blockcount,
+	void			*arg)
+{
+	printf("%"PRIu64":%"PRIu64"\n", startblock, blockcount);
+	return true;
+}
+
+/* Dump bitmap. */
+void
+bitmap_dump(
+	struct bitmap		*bmap)
+{
+	printf("BITMAP DUMP %p\n", bmap);
+	bitmap_iterate(bmap, bitmap_dump_fn, NULL);
+	printf("BITMAP DUMP DONE\n");
+}
+#endif
diff --git a/scrub/Makefile b/scrub/Makefile
index 04921888..1fb9ed1d 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -31,7 +31,6 @@ endif
 endif	# scrub_prereqs
 
 HFILES = \
-bitmap.h \
 common.h \
 counter.h \
 disk.h \
@@ -48,7 +47,6 @@ vfs.h \
 xfs_scrub.h
 
 CFILES = \
-bitmap.c \
 common.c \
 counter.c \
 disk.c \
diff --git a/scrub/bitmap.c b/scrub/bitmap.c
deleted file mode 100644
index aecdba0d..00000000
--- a/scrub/bitmap.c
+++ /dev/null
@@ -1,393 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0+
-/*
- * Copyright (C) 2018 Oracle.  All Rights Reserved.
- * Author: Darrick J. Wong <darrick.wong@oracle.com>
- */
-#include "xfs.h"
-#include <stdint.h>
-#include <stdlib.h>
-#include <assert.h>
-#include <pthread.h>
-#include "platform_defs.h"
-#include "avl64.h"
-#include "list.h"
-#include "bitmap.h"
-
-/*
- * Space Efficient Bitmap
- *
- * Implements a space-efficient bitmap.  We use an AVL tree to manage
- * extent records that tell us which ranges are set; the bitmap key is
- * an arbitrary uint64_t.  The usual bitmap operations (set, clear,
- * test, test and set) are supported, plus we can iterate set ranges.
- */
-
-#define avl_for_each_range_safe(pos, n, l, first, last) \
-	for (pos = (first), n = pos->avl_nextino, l = (last)->avl_nextino; pos != (l); \
-			pos = n, n = pos ? pos->avl_nextino : NULL)
-
-#define avl_for_each_safe(tree, pos, n) \
-	for (pos = (tree)->avl_firstino, n = pos ? pos->avl_nextino : NULL; \
-			pos != NULL; \
-			pos = n, n = pos ? pos->avl_nextino : NULL)
-
-#define avl_for_each(tree, pos) \
-	for (pos = (tree)->avl_firstino; pos != NULL; pos = pos->avl_nextino)
-
-struct bitmap_node {
-	struct avl64node	btn_node;
-	uint64_t		btn_start;
-	uint64_t		btn_length;
-};
-
-static uint64_t
-extent_start(
-	struct avl64node	*node)
-{
-	struct bitmap_node	*btn;
-
-	btn = container_of(node, struct bitmap_node, btn_node);
-	return btn->btn_start;
-}
-
-static uint64_t
-extent_end(
-	struct avl64node	*node)
-{
-	struct bitmap_node	*btn;
-
-	btn = container_of(node, struct bitmap_node, btn_node);
-	return btn->btn_start + btn->btn_length;
-}
-
-static struct avl64ops bitmap_ops = {
-	extent_start,
-	extent_end,
-};
-
-/* Initialize a bitmap. */
-bool
-bitmap_init(
-	struct bitmap		**bmapp)
-{
-	struct bitmap		*bmap;
-
-	bmap = calloc(1, sizeof(struct bitmap));
-	if (!bmap)
-		return false;
-	bmap->bt_tree = malloc(sizeof(struct avl64tree_desc));
-	if (!bmap->bt_tree) {
-		free(bmap);
-		return false;
-	}
-
-	pthread_mutex_init(&bmap->bt_lock, NULL);
-	avl64_init_tree(bmap->bt_tree, &bitmap_ops);
-	*bmapp = bmap;
-
-	return true;
-}
-
-/* Free a bitmap. */
-void
-bitmap_free(
-	struct bitmap		**bmapp)
-{
-	struct bitmap		*bmap;
-	struct avl64node	*node;
-	struct avl64node	*n;
-	struct bitmap_node	*ext;
-
-	bmap = *bmapp;
-	avl_for_each_safe(bmap->bt_tree, node, n) {
-		ext = container_of(node, struct bitmap_node, btn_node);
-		free(ext);
-	}
-	free(bmap->bt_tree);
-	*bmapp = NULL;
-}
-
-/* Create a new bitmap extent node. */
-static struct bitmap_node *
-bitmap_node_init(
-	uint64_t		start,
-	uint64_t		len)
-{
-	struct bitmap_node	*ext;
-
-	ext = malloc(sizeof(struct bitmap_node));
-	if (!ext)
-		return NULL;
-
-	ext->btn_node.avl_nextino = NULL;
-	ext->btn_start = start;
-	ext->btn_length = len;
-
-	return ext;
-}
-
-/* Set a region of bits (locked). */
-static bool
-__bitmap_set(
-	struct bitmap		*bmap,
-	uint64_t		start,
-	uint64_t		length)
-{
-	struct avl64node	*firstn;
-	struct avl64node	*lastn;
-	struct avl64node	*pos;
-	struct avl64node	*n;
-	struct avl64node	*l;
-	struct bitmap_node	*ext;
-	uint64_t		new_start;
-	uint64_t		new_length;
-	struct avl64node	*node;
-	bool			res = true;
-
-	/* Find any existing nodes adjacent or within that range. */
-	avl64_findranges(bmap->bt_tree, start - 1, start + length + 1,
-			&firstn, &lastn);
-
-	/* Nothing, just insert a new extent. */
-	if (firstn == NULL && lastn == NULL) {
-		ext = bitmap_node_init(start, length);
-		if (!ext)
-			return false;
-
-		node = avl64_insert(bmap->bt_tree, &ext->btn_node);
-		if (node == NULL) {
-			free(ext);
-			errno = EEXIST;
-			return false;
-		}
-
-		return true;
-	}
-
-	assert(firstn != NULL && lastn != NULL);
-	new_start = start;
-	new_length = length;
-
-	avl_for_each_range_safe(pos, n, l, firstn, lastn) {
-		ext = container_of(pos, struct bitmap_node, btn_node);
-
-		/* Bail if the new extent is contained within an old one. */
-		if (ext->btn_start <= start &&
-		    ext->btn_start + ext->btn_length >= start + length)
-			return res;
-
-		/* Check for overlapping and adjacent extents. */
-		if (ext->btn_start + ext->btn_length >= start ||
-		    ext->btn_start <= start + length) {
-			if (ext->btn_start < start) {
-				new_start = ext->btn_start;
-				new_length += ext->btn_length;
-			}
-
-			if (ext->btn_start + ext->btn_length >
-			    new_start + new_length)
-				new_length = ext->btn_start + ext->btn_length -
-						new_start;
-
-			avl64_delete(bmap->bt_tree, pos);
-			free(ext);
-		}
-	}
-
-	ext = bitmap_node_init(new_start, new_length);
-	if (!ext)
-		return false;
-
-	node = avl64_insert(bmap->bt_tree, &ext->btn_node);
-	if (node == NULL) {
-		free(ext);
-		errno = EEXIST;
-		return false;
-	}
-
-	return res;
-}
-
-/* Set a region of bits. */
-bool
-bitmap_set(
-	struct bitmap		*bmap,
-	uint64_t		start,
-	uint64_t		length)
-{
-	bool			res;
-
-	pthread_mutex_lock(&bmap->bt_lock);
-	res = __bitmap_set(bmap, start, length);
-	pthread_mutex_unlock(&bmap->bt_lock);
-
-	return res;
-}
-
-#if 0	/* Unused, provided for completeness. */
-/* Clear a region of bits. */
-bool
-bitmap_clear(
-	struct bitmap		*bmap,
-	uint64_t		start,
-	uint64_t		len)
-{
-	struct avl64node	*firstn;
-	struct avl64node	*lastn;
-	struct avl64node	*pos;
-	struct avl64node	*n;
-	struct avl64node	*l;
-	struct bitmap_node	*ext;
-	uint64_t		new_start;
-	uint64_t		new_length;
-	struct avl64node	*node;
-	int			stat;
-
-	pthread_mutex_lock(&bmap->bt_lock);
-	/* Find any existing nodes over that range. */
-	avl64_findranges(bmap->bt_tree, start, start + len, &firstn, &lastn);
-
-	/* Nothing, we're done. */
-	if (firstn == NULL && lastn == NULL) {
-		pthread_mutex_unlock(&bmap->bt_lock);
-		return true;
-	}
-
-	assert(firstn != NULL && lastn != NULL);
-
-	/* Delete or truncate everything in sight. */
-	avl_for_each_range_safe(pos, n, l, firstn, lastn) {
-		ext = container_of(pos, struct bitmap_node, btn_node);
-
-		stat = 0;
-		if (ext->btn_start < start)
-			stat |= 1;
-		if (ext->btn_start + ext->btn_length > start + len)
-			stat |= 2;
-		switch (stat) {
-		case 0:
-			/* Extent totally within range; delete. */
-			avl64_delete(bmap->bt_tree, pos);
-			free(ext);
-			break;
-		case 1:
-			/* Extent is left-adjacent; truncate. */
-			ext->btn_length = start - ext->btn_start;
-			break;
-		case 2:
-			/* Extent is right-adjacent; move it. */
-			ext->btn_length = ext->btn_start + ext->btn_length -
-					(start + len);
-			ext->btn_start = start + len;
-			break;
-		case 3:
-			/* Extent overlaps both ends. */
-			ext->btn_length = start - ext->btn_start;
-			new_start = start + len;
-			new_length = ext->btn_start + ext->btn_length -
-					new_start;
-
-			ext = bitmap_node_init(new_start, new_length);
-			if (!ext)
-				return false;
-
-			node = avl64_insert(bmap->bt_tree, &ext->btn_node);
-			if (node == NULL) {
-				errno = EEXIST;
-				return false;
-			}
-			break;
-		}
-	}
-
-	pthread_mutex_unlock(&bmap->bt_lock);
-	return true;
-}
-#endif
-
-#ifdef DEBUG
-/* Iterate the set regions of this bitmap. */
-bool
-bitmap_iterate(
-	struct bitmap		*bmap,
-	bool			(*fn)(uint64_t, uint64_t, void *),
-	void			*arg)
-{
-	struct avl64node	*node;
-	struct bitmap_node	*ext;
-	bool			moveon = true;
-
-	pthread_mutex_lock(&bmap->bt_lock);
-	avl_for_each(bmap->bt_tree, node) {
-		ext = container_of(node, struct bitmap_node, btn_node);
-		moveon = fn(ext->btn_start, ext->btn_length, arg);
-		if (!moveon)
-			break;
-	}
-	pthread_mutex_unlock(&bmap->bt_lock);
-
-	return moveon;
-}
-#endif
-
-/* Do any bitmap extents overlap the given one?  (locked) */
-static bool
-__bitmap_test(
-	struct bitmap		*bmap,
-	uint64_t		start,
-	uint64_t		len)
-{
-	struct avl64node	*firstn;
-	struct avl64node	*lastn;
-
-	/* Find any existing nodes over that range. */
-	avl64_findranges(bmap->bt_tree, start, start + len, &firstn, &lastn);
-
-	return firstn != NULL && lastn != NULL;
-}
-
-/* Is any part of this range set? */
-bool
-bitmap_test(
-	struct bitmap		*bmap,
-	uint64_t		start,
-	uint64_t		len)
-{
-	bool			res;
-
-	pthread_mutex_lock(&bmap->bt_lock);
-	res = __bitmap_test(bmap, start, len);
-	pthread_mutex_unlock(&bmap->bt_lock);
-
-	return res;
-}
-
-/* Are none of the bits set? */
-bool
-bitmap_empty(
-	struct bitmap		*bmap)
-{
-	return bmap->bt_tree->avl_firstino == NULL;
-}
-
-#ifdef DEBUG
-static bool
-bitmap_dump_fn(
-	uint64_t		startblock,
-	uint64_t		blockcount,
-	void			*arg)
-{
-	printf("%"PRIu64":%"PRIu64"\n", startblock, blockcount);
-	return true;
-}
-
-/* Dump bitmap. */
-void
-bitmap_dump(
-	struct bitmap		*bmap)
-{
-	printf("BITMAP DUMP %p\n", bmap);
-	bitmap_iterate(bmap, bitmap_dump_fn, NULL);
-	printf("BITMAP DUMP DONE\n");
-}
-#endif
diff --git a/scrub/bitmap.h b/scrub/bitmap.h
deleted file mode 100644
index e9746a12..00000000
--- a/scrub/bitmap.h
+++ /dev/null
@@ -1,24 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0+
-/*
- * Copyright (C) 2018 Oracle.  All Rights Reserved.
- * Author: Darrick J. Wong <darrick.wong@oracle.com>
- */
-#ifndef XFS_SCRUB_BITMAP_H_
-#define XFS_SCRUB_BITMAP_H_
-
-struct bitmap {
-	pthread_mutex_t		bt_lock;
-	struct avl64tree_desc	*bt_tree;
-};
-
-bool bitmap_init(struct bitmap **bmap);
-void bitmap_free(struct bitmap **bmap);
-bool bitmap_set(struct bitmap *bmap, uint64_t start, uint64_t length);
-bool bitmap_iterate(struct bitmap *bmap,
-		bool (*fn)(uint64_t, uint64_t, void *), void *arg);
-bool bitmap_test(struct bitmap *bmap, uint64_t start,
-		uint64_t len);
-bool bitmap_empty(struct bitmap *bmap);
-void bitmap_dump(struct bitmap *bmap);
-
-#endif /* XFS_SCRUB_BITMAP_H_ */

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 20/23] xfs_repair: correctly account for free space btree shrinks when fixing freelist
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (18 preceding siblings ...)
  2019-03-01 23:28 ` [PATCH 19/23] libfrog: hoist bitmap out of scrub Darrick J. Wong
@ 2019-03-01 23:28 ` Darrick J. Wong
  2019-03-01 23:29 ` [PATCH 21/23] libxfs: free buffer log item in libxfs_trans_brelse Darrick J. Wong
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:28 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

When we fix the freelist at the end of build_agf_agfl in phase 5 of
repair, we need to create incore rmap records for the blocks that get
added to the AGFL.  We can't let the regular freelist fixing code use
the regular on-disk rmapbt update code because the rmapbt isn't fully
set up yet.

Unfortunately, the original code fails to account for the fact that the
free space btrees can shrink when we allocate blocks to fix the
freelist; those blocks are also put on the freelist, but there are
already incore rmaps for all the free space btree blocks.  We must not
create (redundant) incore rmaps for those blocks.  If we do, repair
fails with a complaint that rebuilding the rmapbt failed during phase 5.
xfs/137 on a 1k block size occasionally triggers this bug.

To fix the problem, construct a bitmap of all OWN_AG blocks that we know
about before traversing the AGFL, and only create new incore rmaps for
those AGFL blocks that are not already tracked in the bitmap.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 repair/rmap.c |   54 ++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 44 insertions(+), 10 deletions(-)


diff --git a/repair/rmap.c b/repair/rmap.c
index d0156f9d..19cceca3 100644
--- a/repair/rmap.c
+++ b/repair/rmap.c
@@ -12,6 +12,7 @@
 #include "dinode.h"
 #include "slab.h"
 #include "rmap.h"
+#include "bitmap.h"
 
 #undef RMAP_DEBUG
 
@@ -450,15 +451,16 @@ rmap_store_ag_btree_rec(
 	struct xfs_buf		*agflbp = NULL;
 	struct xfs_trans	*tp;
 	__be32			*agfl_bno, *b;
+	struct xfs_ag_rmap	*ag_rmap = &ag_rmaps[agno];
+	struct bitmap		*own_ag_bitmap = NULL;
 	int			error = 0;
 
 	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
 		return 0;
 
 	/* Release the ar_rmaps; they were put into the rmapbt during p5. */
-	free_slab(&ag_rmaps[agno].ar_rmaps);
-	error = init_slab(&ag_rmaps[agno].ar_rmaps,
-				  sizeof(struct xfs_rmap_irec));
+	free_slab(&ag_rmap->ar_rmaps);
+	error = init_slab(&ag_rmap->ar_rmaps, sizeof(struct xfs_rmap_irec));
 	if (error)
 		goto err;
 
@@ -478,19 +480,50 @@ rmap_store_ag_btree_rec(
 	 * rmap, we only need to add rmap records for AGFL blocks past
 	 * that point in the AGFL because those blocks are a result of a
 	 * no-rmap no-shrink freelist fixup that we did earlier.
+	 *
+	 * However, some blocks end up on the AGFL because the free space
+	 * btrees shed blocks as a result of allocating space to fix the
+	 * freelist.  We already created in-core rmap records for the free
+	 * space btree blocks, so we must be careful not to create those
+	 * records again.  Create a bitmap of already-recorded OWN_AG rmaps.
 	 */
+	error = init_slab_cursor(ag_rmap->ar_raw_rmaps, rmap_compare, &rm_cur);
+	if (error)
+		goto err;
+	if (!bitmap_init(&own_ag_bitmap)) {
+		error = -ENOMEM;
+		goto err_slab;
+	}
+	while ((rm_rec = pop_slab_cursor(rm_cur)) != NULL) {
+		if (rm_rec->rm_owner != XFS_RMAP_OWN_AG)
+			continue;
+		if (!bitmap_set(own_ag_bitmap, rm_rec->rm_startblock,
+					rm_rec->rm_blockcount)) {
+			error = EFSCORRUPTED;
+			goto err_slab;
+		}
+	}
+	free_slab_cursor(&rm_cur);
+
+	/* Create rmaps for any AGFL blocks that aren't already rmapped. */
 	agfl_bno = XFS_BUF_TO_AGFL_BNO(mp, agflbp);
-	b = agfl_bno + ag_rmaps[agno].ar_flcount;
+	b = agfl_bno + ag_rmap->ar_flcount;
 	while (*b != cpu_to_be32(NULLAGBLOCK) &&
 	       b - agfl_bno < libxfs_agfl_size(mp)) {
-		error = rmap_add_ag_rec(mp, agno, be32_to_cpu(*b), 1,
-				XFS_RMAP_OWN_AG);
-		if (error)
-			goto err;
+		xfs_agblock_t	agbno;
+
+		agbno = be32_to_cpu(*b);
+		if (!bitmap_test(own_ag_bitmap, agbno, 1)) {
+			error = rmap_add_ag_rec(mp, agno, agbno, 1,
+					XFS_RMAP_OWN_AG);
+			if (error)
+				goto err;
+		}
 		b++;
 	}
 	libxfs_putbuf(agflbp);
 	agflbp = NULL;
+	bitmap_free(&own_ag_bitmap);
 
 	/* Merge all the raw rmaps into the main list */
 	error = rmap_fold_raw_recs(mp, agno);
@@ -498,8 +531,7 @@ rmap_store_ag_btree_rec(
 		goto err;
 
 	/* Create cursors to refcount structures */
-	error = init_slab_cursor(ag_rmaps[agno].ar_rmaps, rmap_compare,
-			&rm_cur);
+	error = init_slab_cursor(ag_rmap->ar_rmaps, rmap_compare, &rm_cur);
 	if (error)
 		goto err;
 
@@ -542,6 +574,8 @@ rmap_store_ag_btree_rec(
 err:
 	if (agflbp)
 		libxfs_putbuf(agflbp);
+	if (own_ag_bitmap)
+		bitmap_free(&own_ag_bitmap);
 	return error;
 }
 

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 21/23] libxfs: free buffer log item in libxfs_trans_brelse
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (19 preceding siblings ...)
  2019-03-01 23:28 ` [PATCH 20/23] xfs_repair: correctly account for free space btree shrinks when fixing freelist Darrick J. Wong
@ 2019-03-01 23:29 ` Darrick J. Wong
  2019-03-01 23:29 ` [PATCH 22/23] libxfs: free inode item when committing transaction Darrick J. Wong
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:29 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

If we're going to putbuf a buffer at the bottom of libxfs_trans_brelse,
that means that the buffer is clean and not held, and therefore we need
to detach the buffer from the transaction prior to releasing the buffer.
For whatever reason, we forget to free the buffer's b_log_item (though
we set b_transp to NULL), which means that if the buffer is immediately
freed or picked back up to write an inode core (which changes
b_log_item), we'll leak the buf item.

Therefore, free the buffer log item like the kernel does, which stops
the leak.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 libxfs/trans.c |    3 +++
 1 file changed, 3 insertions(+)


diff --git a/libxfs/trans.c b/libxfs/trans.c
index 46ff8b4a..b0a04ecd 100644
--- a/libxfs/trans.c
+++ b/libxfs/trans.c
@@ -570,6 +570,8 @@ libxfs_trans_brelse(
 	xfs_trans_del_item(&bip->bli_item);
 	if (bip->bli_flags & XFS_BLI_HOLD)
 		bip->bli_flags &= ~XFS_BLI_HOLD;
+	kmem_zone_free(xfs_buf_item_zone, bip);
+	bp->b_log_item = NULL;
 	bp->b_transp = NULL;
 	libxfs_putbuf(bp);
 }
@@ -856,6 +858,7 @@ inode_item_done(
 		return;
 	}
 
+	ASSERT(bp->b_log_item == NULL);
 	bp->b_log_item = iip;
 	error = libxfs_iflush_int(ip, bp);
 	if (error) {

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 22/23] libxfs: free inode item when committing transaction
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (20 preceding siblings ...)
  2019-03-01 23:29 ` [PATCH 21/23] libxfs: free buffer log item in libxfs_trans_brelse Darrick J. Wong
@ 2019-03-01 23:29 ` Darrick J. Wong
  2019-03-01 23:29 ` [PATCH 23/23] libxfs: free buffer and inode log items when cancelling a transaction Darrick J. Wong
  2019-03-04 20:58 ` [PATCH 24/23] xfs_io: don't walk off the end of argv in fzero_f Darrick J. Wong
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:29 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

When we commit a transaction with a dirtied inode, free the inode item
once we're done, and check that we never ijoin a transaction twice.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 libxfs/trans.c |   13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)


diff --git a/libxfs/trans.c b/libxfs/trans.c
index b0a04ecd..0f9f12b6 100644
--- a/libxfs/trans.c
+++ b/libxfs/trans.c
@@ -380,6 +380,7 @@ libxfs_trans_ijoin(
 	xfs_inode_log_item_t	*iip;
 
 	ASSERT(ip->i_transp == NULL);
+	ASSERT(ip->i_temp == NULL);
 	if (ip->i_itemp == NULL)
 		xfs_inode_item_init(ip, ip->i_mount);
 	iip = ip->i_itemp;
@@ -842,11 +843,8 @@ inode_item_done(
 	mp = iip->ili_item.li_mountp;
 	ASSERT(ip != NULL);
 
-	if (!(iip->ili_fields & XFS_ILOG_ALL)) {
-		ip->i_transp = NULL;	/* disassociate from transaction */
-		iip->ili_flags = 0;	/* reset all flags */
-		return;
-	}
+	if (!(iip->ili_fields & XFS_ILOG_ALL))
+		goto free;
 
 	/*
 	 * Get the buffer containing the on-disk inode.
@@ -867,7 +865,6 @@ inode_item_done(
 		return;
 	}
 
-	ip->i_transp = NULL;	/* disassociate from transaction */
 	bp->b_log_item = NULL;	/* remove log item */
 	bp->b_transp = NULL;	/* remove xact ptr */
 	libxfs_writebuf(bp, 0);
@@ -875,6 +872,10 @@ inode_item_done(
 	fprintf(stderr, "flushing dirty inode %llu, buffer %p\n",
 			ip->i_ino, bp);
 #endif
+free:
+	ip->i_transp = NULL;	/* disassociate from transaction */
+	ip->i_itemp = NULL;
+	kmem_zone_free(xfs_ili_zone, iip);
 }
 
 static void

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 23/23] libxfs: free buffer and inode log items when cancelling a transaction
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (21 preceding siblings ...)
  2019-03-01 23:29 ` [PATCH 22/23] libxfs: free inode item when committing transaction Darrick J. Wong
@ 2019-03-01 23:29 ` Darrick J. Wong
  2019-03-04 20:58 ` [PATCH 24/23] xfs_io: don't walk off the end of argv in fzero_f Darrick J. Wong
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-01 23:29 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

When we're cancelling a transaction, cancel the log items instead of
leaking them, now that we no longer recycle them.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 libxfs/trans.c |   73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)


diff --git a/libxfs/trans.c b/libxfs/trans.c
index 0f9f12b6..c36837b1 100644
--- a/libxfs/trans.c
+++ b/libxfs/trans.c
@@ -24,6 +24,7 @@ STATIC struct xfs_trans *xfs_trans_dup(struct xfs_trans *tp);
 static int xfs_trans_reserve(struct xfs_trans *tp, struct xfs_trans_res *resp,
 		uint blocks, uint rtextents);
 static int __xfs_trans_commit(struct xfs_trans *tp, bool regrant);
+static void trans_cancel(struct xfs_trans *tp);
 
 /*
  * Simple transaction interface
@@ -326,6 +327,7 @@ libxfs_trans_cancel(
 	if (tp->t_flags & XFS_TRANS_PERM_LOG_RES)
 		xfs_defer_cancel(tp);
 
+	trans_cancel(tp);
 	xfs_trans_free_items(tp);
 	xfs_trans_free(tp);
 
@@ -878,6 +880,55 @@ inode_item_done(
 	kmem_zone_free(xfs_ili_zone, iip);
 }
 
+static void
+inode_item_cancel(
+	struct xfs_inode_log_item	*iip)
+{
+	struct xfs_inode		*ip;
+
+	ip = iip->ili_inode;
+	ASSERT(ip != NULL);
+
+	if (iip->ili_fields & XFS_ILOG_ALL) {
+#ifdef XACT_DEBUG
+		fprintf(stderr, "cancelling dirty inode %lu\n", ip->i_ino);
+#endif
+	}
+
+	ip->i_transp = NULL;	/* disassociate from transaction */
+	ip->i_itemp = NULL;
+	kmem_zone_free(xfs_ili_zone, iip);
+}
+
+static void
+buf_item_cancel(
+	struct xfs_buf_log_item	*bip)
+{
+	struct xfs_buf		*bp;
+	int			hold;
+	extern kmem_zone_t	*xfs_buf_item_zone;
+
+	bp = bip->bli_buf;
+	ASSERT(bp != NULL);
+
+	hold = (bip->bli_flags & XFS_BLI_HOLD);
+	if (bip->bli_flags & XFS_BLI_DIRTY) {
+#ifdef XACT_DEBUG
+		fprintf(stderr, "cancelling dirty buffer %p (hold=%d)\n",
+			bp, hold);
+#endif
+	}
+	if (hold)
+		bip->bli_flags &= ~XFS_BLI_HOLD;
+	else
+		libxfs_putbuf(bp);
+
+	/* release the buf item */
+	bp->b_log_item = NULL;			/* remove log item */
+	bp->b_transp = NULL;			/* remove xact ptr */
+	kmem_zone_free(xfs_buf_item_zone, bip);
+}
+
 static void
 buf_item_done(
 	xfs_buf_log_item_t	*bip)
@@ -928,6 +979,27 @@ trans_committed(
 	}
 }
 
+static void
+trans_cancel(
+	struct xfs_trans	*tp)
+{
+	struct xfs_log_item	*lip, *next;
+
+	list_for_each_entry_safe(lip, next, &tp->t_items, li_trans) {
+		xfs_trans_del_item(lip);
+
+		if (lip->li_type == XFS_LI_BUF)
+			buf_item_cancel((xfs_buf_log_item_t *)lip);
+		else if (lip->li_type == XFS_LI_INODE)
+			inode_item_cancel((xfs_inode_log_item_t *)lip);
+		else {
+			fprintf(stderr, _("%s: unrecognised log item type\n"),
+				progname);
+			ASSERT(0);
+		}
+	}
+}
+
 static void
 buf_item_unlock(
 	xfs_buf_log_item_t	*bip)
@@ -1033,6 +1105,7 @@ __xfs_trans_commit(
 	return 0;
 
 out_unreserve:
+	trans_cancel(tp);
 	xfs_trans_free_items(tp);
 	xfs_trans_free(tp);
 	return error;

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 24/23] xfs_io: don't walk off the end of argv in fzero_f
  2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
                   ` (22 preceding siblings ...)
  2019-03-01 23:29 ` [PATCH 23/23] libxfs: free buffer and inode log items when cancelling a transaction Darrick J. Wong
@ 2019-03-04 20:58 ` Darrick J. Wong
  23 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2019-03-04 20:58 UTC (permalink / raw)
  To: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

The fzero_f function doesn't check that there are enough non-switch
parameters to supply offset and length arguments to fallocate.  As a
result, we can walk off the end of the argv array and crash.  A
secondary problem is that we don't use getopt to detect the -k, which is
not how most xfs_io commands work.

Therefore, use getopt to detect the -k argument and rewire the offset
and length interpretation code to check optind and use argv correctly.
This bug is trivially reproduced by "xfs_io -c 'fzero -k 0' /some/file".

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 io/prealloc.c |   20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/io/prealloc.c b/io/prealloc.c
index 9a372bae..6d452354 100644
--- a/io/prealloc.c
+++ b/io/prealloc.c
@@ -285,18 +285,24 @@ fzero_f(
 {
 	xfs_flock64_t	segment;
 	int		mode = FALLOC_FL_ZERO_RANGE;
-	int		index = 1;
+	int		c;
 
-	if (strncmp(argv[index], "-k", 3) == 0) {
-		mode |= FALLOC_FL_KEEP_SIZE;
-		index++;
+	while ((c = getopt(argc, argv, "k")) != EOF) {
+		switch (c) {
+		case 'k':
+			mode |= FALLOC_FL_KEEP_SIZE;
+			break;
+		default:
+			command_usage(&fzero_cmd);
+		}
 	}
+        if (optind != argc - 2)
+                return command_usage(&fzero_cmd);
 
-	if (!offset_length(argv[index], argv[index + 1], &segment))
+	if (!offset_length(argv[optind], argv[optind + 1], &segment))
 		return 0;
 
-	if (fallocate(file->fd, mode,
-			segment.l_start, segment.l_len)) {
+	if (fallocate(file->fd, mode, segment.l_start, segment.l_len)) {
 		perror("fallocate");
 		return 0;
 	}

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 01/23] configure: use sys/xattr.h for fsetxattr detection
  2019-03-01 23:26 ` [PATCH 01/23] configure: use sys/xattr.h for fsetxattr detection Darrick J. Wong
@ 2019-03-08  8:08   ` Christoph Hellwig
  0 siblings, 0 replies; 29+ messages in thread
From: Christoph Hellwig @ 2019-03-08  8:08 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 02/23] xfs_io: actually check copy file range helper return values
  2019-03-01 23:27 ` [PATCH 02/23] xfs_io: actually check copy file range helper return values Darrick J. Wong
@ 2019-03-08  8:09   ` Christoph Hellwig
  0 siblings, 0 replies; 29+ messages in thread
From: Christoph Hellwig @ 2019-03-08  8:09 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, schumaker.anna

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 03/23] xfs_io: statx -r should print attributes_mask
  2019-03-01 23:27 ` [PATCH 03/23] xfs_io: statx -r should print attributes_mask Darrick J. Wong
@ 2019-03-08  8:09   ` Christoph Hellwig
  0 siblings, 0 replies; 29+ messages in thread
From: Christoph Hellwig @ 2019-03-08  8:09 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Mar 01, 2019 at 03:27:07PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> We're dumping the raw structure, so we ought to dump everything,
> including the attributes_mask field.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 13/23] xfs_repair: reinitialize the root directory nlink correctly
  2019-03-01 23:28 ` [PATCH 13/23] xfs_repair: reinitialize the root directory nlink correctly Darrick J. Wong
@ 2019-04-09 20:43   ` Eric Sandeen
  0 siblings, 0 replies; 29+ messages in thread
From: Eric Sandeen @ 2019-04-09 20:43 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 3/1/19 5:28 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> In mk_root_dir, we reinitialize the root directory inode with a link
> count of 1.  This differs from mkfs parseproto, which initializes the
> root to have a link count of 2.  The nlink discrepancy in repair is
> caught and corrected during phase 7, but this is unnecessary since we
> should set it properly in the first place.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>


I have this nagging feeling that we've pingponged some of this back
and forth, but this seems totally legitimate, so ...
embrace the fear ;)

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

> ---
>  repair/phase6.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> 
> diff --git a/repair/phase6.c b/repair/phase6.c
> index 9477bc25..8a50b350 100644
> --- a/repair/phase6.c
> +++ b/repair/phase6.c
> @@ -891,7 +891,7 @@ mk_root_dir(xfs_mount_t *mp)
>  	ip->i_d.di_format = XFS_DINODE_FMT_EXTENTS;
>  	ip->i_d.di_aformat = XFS_DINODE_FMT_EXTENTS;
>  
> -	set_nlink(VFS_I(ip), 1);	/* account for . */
> +	set_nlink(VFS_I(ip), 2);	/* account for . and .. */
>  
>  	times = XFS_ICHGTIME_CHG | XFS_ICHGTIME_MOD;
>  	if (ip->i_d.di_version == 3) {
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2019-04-09 20:43 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-01 23:26 [PATCH 00/23] xfsprogs-5.0: fix various problems Darrick J. Wong
2019-03-01 23:26 ` [PATCH 01/23] configure: use sys/xattr.h for fsetxattr detection Darrick J. Wong
2019-03-08  8:08   ` Christoph Hellwig
2019-03-01 23:27 ` [PATCH 02/23] xfs_io: actually check copy file range helper return values Darrick J. Wong
2019-03-08  8:09   ` Christoph Hellwig
2019-03-01 23:27 ` [PATCH 03/23] xfs_io: statx -r should print attributes_mask Darrick J. Wong
2019-03-08  8:09   ` Christoph Hellwig
2019-03-01 23:27 ` [PATCH 04/23] xfs_scrub_all: walk the lsblk device/fs hierarchy correctly Darrick J. Wong
2019-03-01 23:27 ` [PATCH 05/23] xfs_scrub_all.timer: activate after most of the system is up Darrick J. Wong
2019-03-01 23:27 ` [PATCH 06/23] xfs_scrub: rename the global nr_threads Darrick J. Wong
2019-03-01 23:27 ` [PATCH 07/23] xfs_scrub: use datadev parallelization estimates for thread count Darrick J. Wong
2019-03-01 23:27 ` [PATCH 08/23] xfs_scrub: don't expose internal pool state Darrick J. Wong
2019-03-01 23:27 ` [PATCH 09/23] xfs_scrub: one read/verify pool per disk Darrick J. Wong
2019-03-01 23:27 ` [PATCH 10/23] xfs_scrub: don't close mnt_fd when mnt_fd open fails Darrick J. Wong
2019-03-01 23:28 ` [PATCH 11/23] xfs_scrub: check label for misleading characters Darrick J. Wong
2019-03-01 23:28 ` [PATCH 12/23] mkfs: validate extent size hint parameters Darrick J. Wong
2019-03-01 23:28 ` [PATCH 13/23] xfs_repair: reinitialize the root directory nlink correctly Darrick J. Wong
2019-04-09 20:43   ` Eric Sandeen
2019-03-01 23:28 ` [PATCH 14/23] xfs_repair: bump the irec on-disk nlink when adding lost+found Darrick J. Wong
2019-03-01 23:28 ` [PATCH 15/23] xfs_repair: fix uninitialized variable warnings Darrick J. Wong
2019-03-01 23:28 ` [PATCH 16/23] xfs_db: fix finobt record decoding when sparse inodes enabled Darrick J. Wong
2019-03-01 23:28 ` [PATCH 17/23] xfs_db: use TYP_FINOBT for finobt metadump Darrick J. Wong
2019-03-01 23:28 ` [PATCH 18/23] xfs_info: use findmnt to handle mounted block devices Darrick J. Wong
2019-03-01 23:28 ` [PATCH 19/23] libfrog: hoist bitmap out of scrub Darrick J. Wong
2019-03-01 23:28 ` [PATCH 20/23] xfs_repair: correctly account for free space btree shrinks when fixing freelist Darrick J. Wong
2019-03-01 23:29 ` [PATCH 21/23] libxfs: free buffer log item in libxfs_trans_brelse Darrick J. Wong
2019-03-01 23:29 ` [PATCH 22/23] libxfs: free inode item when committing transaction Darrick J. Wong
2019-03-01 23:29 ` [PATCH 23/23] libxfs: free buffer and inode log items when cancelling a transaction Darrick J. Wong
2019-03-04 20:58 ` [PATCH 24/23] xfs_io: don't walk off the end of argv in fzero_f Darrick J. Wong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.