* [PATCH v11 00/27] xfsprogs: online scrub/repair support
@ 2018-01-06 1:51 Darrick J. Wong
2018-01-06 1:51 ` [PATCH 01/27] xfs_scrub: create online filesystem scrub program Darrick J. Wong
` (30 more replies)
0 siblings, 31 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:51 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
Hi all,
This is the eleventh revision of a patchset that adds support for online
metadata scrubbing and repair to the XFS userland tools. Since v10 I've
rebased to the latest for-next, fixed some wonky error messages, and fixed
a few minor problems I found via code inspection. However, this patch
series is more or less the same as v10.
We start by creating the basic shell of the program that can do argument
parsing and error reporting, create some abstractions for the XFS ioctls
that we use to iterate and scrub metadata, and then tie together all the
in-kernel scrubbing in separate scrub phases.
Next, we move on to checking the directory tree for connectivity and
naming problems and add the infrastructure to perform an (optional) scan
of the in-use parts of the disk media. We also implement a minimal
preen -- if the fs checks out, we can try to run fstrim; and some basic
progress reporting if the program is running interactively.
Finally, we add some wrapper scripts to schedule scrubs of all the
mounted filesystems, and the systemd / cron infrastructure needed to
automatically scan everything once a week. All of
this is disabled by default. The systemd integration allows us to give
scrub exactly the privileges it needs while walling off the rest of the
system.
If you're going to start using this mess, you probably ought to just
pull from my git tree for xfsprogs[1]. This series relies on the
libfrog patches sent earlier. Kernel support will appear in 4.15.
Comments and questions are, as always, welcome.
--D
[1] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=djwong-devel
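
As a rough illustration of what "tie together all the in-kernel scrubbing
in separate scrub phases" can look like, here is a minimal sketch of a
table-driven phase dispatcher. It is not taken from the series: every name
prefixed with sketch_ is hypothetical, and the real dispatcher arrives in a
later patch and may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context; the real struct scrub_ctx carries far more state. */
struct sketch_ctx {
	unsigned int	phases_run;
};

/* One phase: a description plus a function that returns false to stop. */
struct sketch_phase {
	const char	*descr;
	bool		(*fn)(struct sketch_ctx *ctx);
};

/* Example phase bodies, present only so the sketch can be exercised. */
static bool
sketch_phase_ok(struct sketch_ctx *ctx)
{
	(void)ctx;
	return true;
}

static bool
sketch_phase_fail(struct sketch_ctx *ctx)
{
	(void)ctx;
	return false;
}

/* Run phases in order; stop at the first one that asks us not to move on. */
static bool
sketch_run_phases(
	struct sketch_ctx		*ctx,
	const struct sketch_phase	*phases,
	size_t				nr)
{
	size_t				i;

	assert(ctx != NULL && phases != NULL);
	for (i = 0; i < nr; i++) {
		if (!phases[i].fn)
			continue;	/* phase not applicable; skip it */
		ctx->phases_run++;
		if (!phases[i].fn(ctx))
			return false;
	}
	return true;
}
```

A table like this keeps optional phases (the media scan, for example) easy
to switch off by leaving the function pointer NULL.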
* [PATCH 01/27] xfs_scrub: create online filesystem scrub program
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
@ 2018-01-06 1:51 ` Darrick J. Wong
2018-01-12 0:16 ` Eric Sandeen
2018-01-12 1:07 ` Eric Sandeen
2018-01-06 1:51 ` [PATCH 02/27] xfs_scrub: common error handling Darrick J. Wong
` (29 subsequent siblings)
30 siblings, 2 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:51 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Create the foundations of a filesystem scrubbing tool that asks the
kernel to inspect all metadata in the filesystem and (ultimately) to
repair anything that's broken. Also create the man page for the
utility.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
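
The EXIT CODE section of the man page below describes the exit status as
a sum of conditions. Because each condition is a distinct power of two,
the sum is the same as a bitwise OR. A minimal sketch of composing and
decoding such a status follows; the SKETCH_RET_* names and both helpers are
assumptions made for illustration, not identifiers from this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Exit status bits as listed in the EXIT CODE section of xfs_scrub.8. */
#define SKETCH_RET_CORRUPT	1	/* errors left uncorrected */
#define SKETCH_RET_UNOPTIMIZED	2	/* optimizations possible */
#define SKETCH_RET_OPERROR	4	/* operational error */
#define SKETCH_RET_SYNTAX	8	/* usage or syntax error */

/* Combine independent conditions into one exit status. */
static int
sketch_exit_status(
	bool	corrupt,
	bool	unoptimized,
	bool	operror)
{
	int	ret = 0;

	if (corrupt)
		ret |= SKETCH_RET_CORRUPT;
	if (unoptimized)
		ret |= SKETCH_RET_UNOPTIMIZED;
	if (operror)
		ret |= SKETCH_RET_OPERROR;
	return ret;
}

/* Does the status carry the given condition? */
static bool
sketch_status_has(
	int	status,
	int	bit)
{
	/* Callers must pass exactly one of the SKETCH_RET_* bits. */
	assert(bit > 0 && (bit & (bit - 1)) == 0);
	return (status & bit) != 0;
}
```

A wrapper script could use the same masks to tell "corruption found"
apart from "the scrub itself failed to run".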
.gitignore | 1
Makefile | 3 +
man/man8/xfs_scrub.8 | 117 ++++++++++++++++++++++++++++++++++++++++++
scrub/Makefile | 42 +++++++++++++++
scrub/common.c | 20 +++++++
scrub/common.h | 23 ++++++++
scrub/xfs_scrub.c | 109 +++++++++++++++++++++++++++++++++++++++
scrub/xfs_scrub.h | 23 ++++++++
tools/find-api-violations.sh | 2 -
9 files changed, 338 insertions(+), 2 deletions(-)
create mode 100644 man/man8/xfs_scrub.8
create mode 100644 scrub/Makefile
create mode 100644 scrub/common.c
create mode 100644 scrub/common.h
create mode 100644 scrub/xfs_scrub.c
create mode 100644 scrub/xfs_scrub.h
diff --git a/.gitignore b/.gitignore
index e839e2a..a3db640 100644
--- a/.gitignore
+++ b/.gitignore
@@ -68,6 +68,7 @@ cscope.*
/repair/xfs_repair
/rtcp/xfs_rtcp
/spaceman/xfs_spaceman
+/scrub/xfs_scrub
# generated crc files
/libxfs/crc32selftest
diff --git a/Makefile b/Makefile
index 0dce80a..3bd0796 100644
--- a/Makefile
+++ b/Makefile
@@ -48,7 +48,7 @@ LIBFROG_SUBDIR = libfrog
DLIB_SUBDIRS = libxlog libxcmd libhandle
LIB_SUBDIRS = libxfs $(DLIB_SUBDIRS)
TOOL_SUBDIRS = copy db estimate fsck growfs io logprint mkfs quota \
- mdrestore repair rtcp m4 man doc debian spaceman
+ mdrestore repair rtcp m4 man doc debian spaceman scrub
ifneq ("$(PKG_PLATFORM)","darwin")
TOOL_SUBDIRS += fsr
@@ -91,6 +91,7 @@ repair: libxlog libxcmd
copy: libxlog
mkfs: libxcmd
spaceman: libxcmd
+scrub: libhandle libxcmd
ifeq ($(HAVE_BUILDDEFS), yes)
include $(BUILDRULES)
diff --git a/man/man8/xfs_scrub.8 b/man/man8/xfs_scrub.8
new file mode 100644
index 0000000..95f4fea
--- /dev/null
+++ b/man/man8/xfs_scrub.8
@@ -0,0 +1,117 @@
+.TH xfs_scrub 8
+.SH NAME
+xfs_scrub \- scrub the contents of an XFS filesystem
+.SH SYNOPSIS
+.B xfs_scrub
+[
+.B \-abemnTvVxy
+]
+.I mount-point
+.br
+.B xfs_scrub \-V
+.SH DESCRIPTION
+.B xfs_scrub
+attempts to check and repair all metadata in a mounted XFS filesystem.
+.PP
+.B xfs_scrub
+asks the kernel to scrub all metadata objects in the filesystem.
+Metadata records are scanned for obviously bad values and then
+cross-referenced against other metadata.
+The goal is to establish a reasonable confidence about the consistency
+of the overall filesystem by examining the consistency of individual
+metadata records against the other metadata in the filesystem across the
+entire filesystem.
+Damaged metadata can be rebuilt from other metadata if there is
+sufficient redundancy (and no other corruption) in the metadata.
+.PP
+This utility does not know how to correct all errors.
+If the tool cannot fix the detected errors, you must unmount the
+filesystem and run
+.B xfs_repair
+to fix the problems.
+If this tool is not run with either of the
+.B \-n
+or
+.B \-y
+options, then it will optimize the filesystem when possible,
+but it will not try to fix errors.
+.SH OPTIONS
+.TP
+.BI \-a " errors"
+Abort if more than this many errors are found on the filesystem.
+.TP
+.B \-b
+Run in background mode.
+If the option is specified once, only run a single scrubbing thread at a
+time.
+If given more than once, an artificial delay of 100us is added to each
+scrub call to reduce CPU overhead even further.
+.TP
+.B \-e
+Specifies what happens when errors are detected.
+If
+.IR shutdown
+is given, the filesystem will be taken offline if errors are found.
+Not all backends can shut down a filesystem.
+If
+.IR continue
+is given, no action is taken if errors are found.
+This is the default.
+.TP
+.BI \-m " file"
+Search this file for mounted filesystems instead of /etc/mtab.
+.TP
+.B \-n
+Dry run, do not modify anything in the filesystem.
+This disables all preening and optimization behaviors, and disables
+calling FITRIM on the free space after a successful run.
+.TP
+.B \-T
+Print timing and memory usage information for each phase.
+.TP
+.B \-v
+Enable verbose mode, which prints periodic status updates.
+.TP
+.B \-V
+Prints the version number and exits.
+.TP
+.B \-x
+Scrub all file data too.
+The block list will be sorted in disk order for better performance.
+.B xfs_scrub
+will issue O_DIRECT reads to the block device directly.
+If the block device is a SCSI disk, it will issue READ VERIFY commands
+directly to the disk.
+.TP
+.B \-y
+Try to repair all filesystem errors.
+If the errors cannot be fixed online, then the filesystem must be taken
+offline for repair.
+.SH EXIT CODE
+The exit code returned by
+.B xfs_scrub
+is the sum of the following conditions:
+.br
+\ 0\ \-\ No errors
+.br
+\ 1\ \-\ File system errors left uncorrected
+.br
+\ 2\ \-\ File system optimizations possible
+.br
+\ 4\ \-\ Operational error
+.br
+\ 8\ \-\ Usage or syntax error
+.br
+.SH CAVEATS
+.B xfs_scrub
+is an immature utility!
+This program takes advantage of in-kernel scrubbing to verify a given
+data structure with locks held.
+This can tie up the system for a while.
+The kernel must support the BULKSTAT, FSGEOMETRY, FSCOUNTS, GET_RESBLKS,
+GETBMAPX, GETFSMAP, INUMBERS, and SCRUB_METADATA ioctls.
+.PP
+If errors are found and cannot be repaired, the filesystem must be taken
+offline and repaired.
+.SH SEE ALSO
+.BR xfs_repair (8).
diff --git a/scrub/Makefile b/scrub/Makefile
new file mode 100644
index 0000000..62cca3b
--- /dev/null
+++ b/scrub/Makefile
@@ -0,0 +1,42 @@
+#
+# Copyright (C) 2018 Oracle. All Rights Reserved.
+#
+
+TOPDIR = ..
+include $(TOPDIR)/include/builddefs
+
+# On linux we get fsmap from the system or define it ourselves
+# so include this based on platform type. If this reverts to only
+# the autoconf check w/o local definition, change to testing HAVE_GETFSMAP
+SCRUB_PREREQS=$(PKG_PLATFORM)
+
+ifeq ($(SCRUB_PREREQS),linux)
+LTCOMMAND = xfs_scrub
+INSTALL_SCRUB = install-scrub
+endif # scrub_prereqs
+
+HFILES = \
+common.h \
+xfs_scrub.h
+
+CFILES = \
+common.c \
+xfs_scrub.c
+
+LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD)
+LTDEPENDENCIES += $(LIBHANDLE) $(LIBFROG)
+LLDFLAGS = -static
+
+default: depend $(LTCOMMAND)
+
+include $(BUILDRULES)
+
+install: default $(INSTALL_SCRUB)
+
+install-scrub:
+ $(INSTALL) -m 755 -d $(PKG_ROOT_SBIN_DIR)
+ $(LTINSTALL) -m 755 $(LTCOMMAND) $(PKG_ROOT_SBIN_DIR)
+
+install-dev:
+
+-include .dep
diff --git a/scrub/common.c b/scrub/common.c
new file mode 100644
index 0000000..0a58c16
--- /dev/null
+++ b/scrub/common.c
@@ -0,0 +1,20 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include "common.h"
diff --git a/scrub/common.h b/scrub/common.h
new file mode 100644
index 0000000..1082296
--- /dev/null
+++ b/scrub/common.h
@@ -0,0 +1,23 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_COMMON_H_
+#define XFS_SCRUB_COMMON_H_
+
+#endif /* XFS_SCRUB_COMMON_H_ */
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
new file mode 100644
index 0000000..4f26855
--- /dev/null
+++ b/scrub/xfs_scrub.c
@@ -0,0 +1,109 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include "xfs_scrub.h"
+
+/*
+ * XFS Online Metadata Scrub (and Repair)
+ *
+ * The XFS scrubber uses custom XFS ioctls to probe more deeply into the
+ * internals of the filesystem. It takes advantage of scrubbing ioctls
+ * to check all the records stored in a metadata object and to
+ * cross-reference those records against the other filesystem metadata.
+ *
+ * After the program gathers command line arguments to figure out
+ * exactly what the user wants the program to do, scrub
+ * execution is split up into several separate phases:
+ *
+ * The "find geometry" phase queries XFS for the filesystem geometry.
+ * The block devices for the data, realtime, and log devices are opened.
+ * Kernel ioctls are test-queried to see if they actually work (the scrub
+ * ioctl in particular), and any other filesystem-specific information
+ * is gathered.
+ *
+ * In the "check internal metadata" phase, we call the metadata scrub
+ * ioctl to check the filesystem's internal per-AG btrees. This
+ * includes the AG superblock, AGF, AGFL, and AGI headers, freespace
+ * btrees, the regular and free inode btrees, the reverse mapping
+ * btrees, and the reference counting btrees. If the realtime device is
+ * enabled, the realtime bitmap and reverse mapping btrees are checked.
+ * Quotas, if enabled, are also checked in this phase.
+ *
+ * Each AG (and the realtime device) has its metadata checked in a
+ * separate thread for better performance. Errors in the internal
+ * metadata can be fixed here prior to the inode scan; refer to the
+ * section about the "repair filesystem" phase for more information.
+ *
+ * The "scan all inodes" phase uses BULKSTAT to scan all the inodes in
+ * an AG in disk order. The BULKSTAT information provides enough
+ * information to construct a file handle that is used to check the
+ * following parts of every file:
+ *
+ * - The inode record
+ * - All three block forks (data, attr, CoW)
+ * - If it's a symlink, the symlink target.
+ * - If it's a directory, the directory entries.
+ * - All extended attributes
+ * - The parent pointer
+ *
+ * Multiple threads are started to check the inodes of each AG in
+ * parallel. Errors in file metadata can be fixed here; see the section
+ * about the "repair filesystem" phase for more information.
+ *
+ * Next comes the (configurable) "repair filesystem" phase. The user
+ * can instruct this program to fix all problems encountered; to fix
+ * only optimality problems and leave the corruptions; or not to touch
+ * the filesystem at all. Any metadata repairs that did not succeed in
+ * the previous two phases are retried here; if there are uncorrectable
+ * errors, xfs_scrub stops here.
+ *
+ * The next phase is the "check directory tree" phase. In this phase,
+ * every directory is opened (via file handle) to confirm that each
+ * directory is connected to the root. Directory entries are checked
+ * for ambiguous Unicode normalization mappings, which is to say that we
+ * look for pairs of entries whose utf-8 strings normalize to the same
+ * code point sequence and map to different inodes, because that could
+ * be used to trick a user into opening the wrong file. The names of
+ * extended attributes are checked for Unicode normalization collisions.
+ *
+ * In the "verify data file integrity" phase, we employ GETFSMAP to read
+ * the reverse-mappings of all AGs and issue direct-reads of the
+ * underlying disk blocks. We rely on the underlying storage to have
+ * checksummed the data blocks appropriately. Multiple threads are
+ * started to check each AG in parallel; a separate thread pool is used
+ * to handle the direct reads.
+ *
+ * In the "check summary counters" phase, use GETFSMAP to tally up the
+ * blocks and BULKSTAT to tally up the inodes we saw and compare that to
+ * the statfs output. This gives the user a rough estimate of how
+ * thorough the scrub was.
+ */
+
+/* Program name; needed for libxcmd error reports. */
+char *progname = "xfs_scrub";
+
+int
+main(
+ int argc,
+ char **argv)
+{
+ fprintf(stderr, "XXX: This program is not complete!\n");
+ return 4;
+}
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
new file mode 100644
index 0000000..ff9c24d
--- /dev/null
+++ b/scrub/xfs_scrub.h
@@ -0,0 +1,23 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_XFS_SCRUB_H_
+#define XFS_SCRUB_XFS_SCRUB_H_
+
+#endif /* XFS_SCRUB_XFS_SCRUB_H_ */
diff --git a/tools/find-api-violations.sh b/tools/find-api-violations.sh
index 3b976d3..cb075ba 100755
--- a/tools/find-api-violations.sh
+++ b/tools/find-api-violations.sh
@@ -6,7 +6,7 @@
# NOTE: This script doesn't look for API violations in function parameters.
-tool_dirs="copy db estimate fsck fsr growfs io logprint mdrestore mkfs quota repair rtcp"
+tool_dirs="copy db estimate fsck fsr growfs io logprint mdrestore mkfs quota repair rtcp scrub"
# Calls to xfs_* functions in libxfs/*.c without the libxfs_ prefix
find_possible_api_calls() {
* [PATCH 02/27] xfs_scrub: common error handling
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
2018-01-06 1:51 ` [PATCH 01/27] xfs_scrub: create online filesystem scrub program Darrick J. Wong
@ 2018-01-06 1:51 ` Darrick J. Wong
2018-01-12 1:15 ` Eric Sandeen
2018-01-06 1:51 ` [PATCH 03/27] xfs_scrub: set up command line argument parsing Darrick J. Wong
` (28 subsequent siblings)
30 siblings, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:51 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Standardize how we record and report errors.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
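
For readers skimming the diff, the reporting pattern used below can be
summarized in a small standalone sketch: a variadic helper that prints
under a lock and bumps a counter, wrapped in a macro that records the call
site. The sketch_ names and the explicit FILE pointer argument are
assumptions for illustration; the patch's own helpers write to
stderr/stdout directly and consult the global debug flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical, cut-down context: just the lock and one counter. */
struct sketch_ctx {
	pthread_mutex_t		lock;
	unsigned long long	errors_found;
};

/* Print "Error: <descr>: <message> (<file> line <n>)" and count it. */
static void
sketch_report(
	struct sketch_ctx	*ctx,
	FILE			*fp,
	const char		*descr,
	const char		*file,
	int			line,
	const char		*format,
	...)
{
	va_list			args;

	assert(ctx != NULL && fp != NULL);

	/* Hold the lock so concurrent reports do not interleave. */
	pthread_mutex_lock(&ctx->lock);
	fprintf(fp, "Error: %s: ", descr);
	va_start(args, format);
	vfprintf(fp, format, args);
	va_end(args);
	fprintf(fp, " (%s line %d)\n", file, line);
	ctx->errors_found++;
	pthread_mutex_unlock(&ctx->lock);
}

/* The macro captures the caller's file and line automatically. */
#define sketch_error(ctx, fp, descr, ...) \
	sketch_report(ctx, fp, descr, __FILE__, __LINE__, __VA_ARGS__)
```

A caller would write something like
sketch_error(&ctx, stderr, "inode 128", "bad %s", "magic"); the counter
update and the output happen under the same lock, so the tally stays
consistent with what was printed.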
scrub/common.c | 141 +++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/common.h | 28 +++++++++++
scrub/xfs_scrub.c | 8 +++
scrub/xfs_scrub.h | 12 +++++
4 files changed, 189 insertions(+)
diff --git a/scrub/common.c b/scrub/common.c
index 0a58c16..3c89b7d 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -17,4 +17,145 @@
* along with this program; if not, write the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
*/
+#include <stdio.h>
+#include <pthread.h>
+#include <stdbool.h>
+#include "platform_defs.h"
+#include "xfs.h"
+#include "xfs_scrub.h"
#include "common.h"
+
+/*
+ * Reporting Status to the Console
+ *
+ * We aim for a roughly standard reporting format -- the severity of the
+ * status being reported, a textual description of the object being
+ * reported, and whatever the status happens to be.
+ *
+ * Errors are the most severe and reflect filesystem corruption.
+ * Warnings indicate that something is amiss and needs the attention of
+ * the administrator, but does not constitute a corruption. Information
+ * is merely advisory.
+ */
+
+/* Too many errors? Bail out. */
+bool
+xfs_scrub_excessive_errors(
+ struct scrub_ctx *ctx)
+{
+ bool ret;
+
+ pthread_mutex_lock(&ctx->lock);
+ ret = ctx->max_errors > 0 && ctx->errors_found >= ctx->max_errors;
+ pthread_mutex_unlock(&ctx->lock);
+
+ return ret;
+}
+
+/* Print an error string and whatever error is stored in errno. */
+void
+__str_errno(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ const char *file,
+ int line)
+{
+ char buf[DESCR_BUFSZ];
+
+ pthread_mutex_lock(&ctx->lock);
+ fprintf(stderr, _("Error: %s: %s."), descr,
+ strerror_r(errno, buf, DESCR_BUFSZ));
+ if (debug)
+ fprintf(stderr, _(" (%s line %d)"), file, line);
+ fprintf(stderr, "\n");
+ ctx->runtime_errors++;
+ pthread_mutex_unlock(&ctx->lock);
+}
+
+/* Print an error string and some error text. */
+void
+__str_error(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ const char *file,
+ int line,
+ const char *format,
+ ...)
+{
+ va_list args;
+
+ pthread_mutex_lock(&ctx->lock);
+ fprintf(stderr, _("Error: %s: "), descr);
+ va_start(args, format);
+ vfprintf(stderr, format, args);
+ va_end(args);
+ if (debug)
+ fprintf(stderr, _(" (%s line %d)"), file, line);
+ fprintf(stderr, "\n");
+ ctx->errors_found++;
+ pthread_mutex_unlock(&ctx->lock);
+}
+
+/* Print a warning string and some warning text. */
+void
+__str_warn(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ const char *file,
+ int line,
+ const char *format,
+ ...)
+{
+ va_list args;
+
+ pthread_mutex_lock(&ctx->lock);
+ fprintf(stderr, _("Warning: %s: "), descr);
+ va_start(args, format);
+ vfprintf(stderr, format, args);
+ va_end(args);
+ if (debug)
+ fprintf(stderr, _(" (%s line %d)"), file, line);
+ fprintf(stderr, "\n");
+ ctx->warnings_found++;
+ pthread_mutex_unlock(&ctx->lock);
+}
+
+/* Print an informational string and some informational text. */
+void
+__str_info(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ const char *file,
+ int line,
+ const char *format,
+ ...)
+{
+ va_list args;
+
+ pthread_mutex_lock(&ctx->lock);
+ fprintf(stdout, _("Info: %s: "), descr);
+ va_start(args, format);
+ vfprintf(stdout, format, args);
+ va_end(args);
+ if (debug)
+ fprintf(stdout, _(" (%s line %d)"), file, line);
+ fprintf(stdout, "\n");
+ fflush(stdout);
+ pthread_mutex_unlock(&ctx->lock);
+}
+
+/* Catch fatal errors from pieces we import from xfs_repair. */
+void __attribute__((noreturn))
+do_error(char const *msg, ...)
+{
+ va_list args;
+
+ fprintf(stderr, _("\nfatal error -- "));
+
+ va_start(args, msg);
+ vfprintf(stderr, msg, args);
+ va_end(args);
+ if (dumpcore)
+ abort();
+ exit(1);
+}
diff --git a/scrub/common.h b/scrub/common.h
index 1082296..f620620 100644
--- a/scrub/common.h
+++ b/scrub/common.h
@@ -20,4 +20,32 @@
#ifndef XFS_SCRUB_COMMON_H_
#define XFS_SCRUB_COMMON_H_
+/*
+ * When reporting a defective metadata object to the console, this
+ * is the size of the buffer to use to store the description of that
+ * item.
+ */
+#define DESCR_BUFSZ 256
+
+bool xfs_scrub_excessive_errors(struct scrub_ctx *ctx);
+
+void __str_errno(struct scrub_ctx *ctx, const char *descr, const char *file,
+ int line);
+void __str_error(struct scrub_ctx *ctx, const char *descr, const char *file,
+ int line, const char *format, ...);
+void __str_warn(struct scrub_ctx *ctx, const char *descr, const char *file,
+ int line, const char *format, ...);
+void __str_info(struct scrub_ctx *ctx, const char *descr, const char *file,
+ int line, const char *format, ...);
+void __record_repair(struct scrub_ctx *ctx, const char *descr, const char *file,
+ int line, const char *format, ...);
+void __record_preen(struct scrub_ctx *ctx, const char *descr, const char *file,
+ int line, const char *format, ...);
+
+#define str_errno(ctx, str) __str_errno(ctx, str, __FILE__, __LINE__)
+#define str_error(ctx, str, ...) __str_error(ctx, str, __FILE__, __LINE__, __VA_ARGS__)
+#define str_warn(ctx, str, ...) __str_warn(ctx, str, __FILE__, __LINE__, __VA_ARGS__)
+#define str_info(ctx, str, ...) __str_info(ctx, str, __FILE__, __LINE__, __VA_ARGS__)
+#define dbg_printf(fmt, ...) {if (debug > 1) {printf(fmt, __VA_ARGS__);}}
+
#endif /* XFS_SCRUB_COMMON_H_ */
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index 4f26855..10116a8 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -18,6 +18,8 @@
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
*/
#include <stdio.h>
+#include <pthread.h>
+#include <stdbool.h>
#include "xfs_scrub.h"
/*
@@ -99,6 +101,12 @@
/* Program name; needed for libxcmd error reports. */
char *progname = "xfs_scrub";
+/* Debug level; higher values mean more verbosity. */
+unsigned int debug;
+
+/* Should we dump core if errors happen? */
+bool dumpcore;
+
int
main(
int argc,
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index ff9c24d..f19ac6b 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -20,4 +20,16 @@
#ifndef XFS_SCRUB_XFS_SCRUB_H_
#define XFS_SCRUB_XFS_SCRUB_H_
+extern unsigned int debug;
+extern bool dumpcore;
+
+struct scrub_ctx {
+ /* Mutable scrub state; use lock. */
+ pthread_mutex_t lock;
+ unsigned long long max_errors;
+ unsigned long long runtime_errors;
+ unsigned long long errors_found;
+ unsigned long long warnings_found;
+};
+
#endif /* XFS_SCRUB_XFS_SCRUB_H_ */
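
The threshold test in xfs_scrub_excessive_errors() has two boundary
behaviors that are easy to miss: a max_errors of zero means "no limit",
and the check trips as soon as the count reaches the limit (>=), not when
it exceeds it. The following standalone sketch mirrors that logic with a
hypothetical cut-down context so the boundaries can be exercised in
isolation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down context for illustration only. */
struct sketch_ctx {
	pthread_mutex_t		lock;
	unsigned long long	max_errors;	/* 0 means "no limit" */
	unsigned long long	errors_found;
};

/* Same comparison as the patch: limit set and count has reached it. */
static bool
sketch_excessive_errors(
	struct sketch_ctx	*ctx)
{
	bool			ret;

	assert(ctx != NULL);
	pthread_mutex_lock(&ctx->lock);
	ret = ctx->max_errors > 0 && ctx->errors_found >= ctx->max_errors;
	pthread_mutex_unlock(&ctx->lock);

	return ret;
}
```

Worker threads would typically poll a helper like this between work items
and stop queueing new work once it returns true.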
* [PATCH 03/27] xfs_scrub: set up command line argument parsing
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
2018-01-06 1:51 ` [PATCH 01/27] xfs_scrub: create online filesystem scrub program Darrick J. Wong
2018-01-06 1:51 ` [PATCH 02/27] xfs_scrub: common error handling Darrick J. Wong
@ 2018-01-06 1:51 ` Darrick J. Wong
2018-01-11 23:39 ` Eric Sandeen
2018-01-12 1:30 ` Eric Sandeen
2018-01-06 1:51 ` [PATCH 04/27] xfs_scrub: dispatch the various phases of the scrub program Darrick J. Wong
` (27 subsequent siblings)
30 siblings, 2 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:51 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Parse command line options in order to set up the context in which we
will scrub the filesystem.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
scrub/common.h | 8 ++
scrub/xfs_scrub.c | 207 +++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/xfs_scrub.h | 34 +++++++++
3 files changed, 249 insertions(+)
diff --git a/scrub/common.h b/scrub/common.h
index f620620..15a59bd 100644
--- a/scrub/common.h
+++ b/scrub/common.h
@@ -48,4 +48,12 @@ void __record_preen(struct scrub_ctx *ctx, const char *descr, const char *file,
#define str_info(ctx, str, ...) __str_info(ctx, str, __FILE__, __LINE__, __VA_ARGS__)
#define dbg_printf(fmt, ...) {if (debug > 1) {printf(fmt, __VA_ARGS__);}}
+/* Is this debug tweak enabled? */
+static inline bool
+debug_tweak_on(
+ const char *name)
+{
+ return debug && getenv(name) != NULL;
+}
+
#endif /* XFS_SCRUB_COMMON_H_ */
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index 10116a8..9db3b41 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -20,7 +20,12 @@
#include <stdio.h>
#include <pthread.h>
#include <stdbool.h>
+#include <stdlib.h>
+#include "platform_defs.h"
+#include "xfs.h"
+#include "input.h"
#include "xfs_scrub.h"
+#include "common.h"
/*
* XFS Online Metadata Scrub (and Repair)
@@ -107,11 +112,213 @@ unsigned int debug;
/* Should we dump core if errors happen? */
bool dumpcore;
+/* Display resource usage at the end of each phase? */
+bool display_rusage;
+
+/* Background mode; higher values insert more pauses between scrub calls. */
+unsigned int bg_mode;
+
+/* Maximum number of processors available to us. */
+int nproc;
+
+/* Number of threads we're allowed to use. */
+unsigned int nr_threads;
+
+/* Verbosity; if set, print more information. */
+bool verbose;
+
+/* Should we scrub the data blocks? */
+bool scrub_data;
+
+/* Size of a memory page. */
+long page_size;
+
+static void __attribute__((noreturn))
+usage(void)
+{
+ fprintf(stderr, _("Usage: %s [OPTIONS] mountpoint\n"), progname);
+ fprintf(stderr, _("-a:\tStop after this many errors are found.\n"));
+ fprintf(stderr, _("-b:\tBackground mode.\n"));
+ fprintf(stderr, _("-e:\tWhat to do if errors are found.\n"));
+ fprintf(stderr, _("-m:\tPath to /etc/mtab.\n"));
+ fprintf(stderr, _("-n:\tDry run. Do not modify anything.\n"));
+ fprintf(stderr, _("-T:\tDisplay timing/usage information.\n"));
+ fprintf(stderr, _("-v:\tVerbose output.\n"));
+ fprintf(stderr, _("-V:\tPrint version.\n"));
+ fprintf(stderr, _("-x:\tScrub file data too.\n"));
+ fprintf(stderr, _("-y:\tRepair all errors.\n"));
+
+ exit(8);
+}
+
int
main(
int argc,
char **argv)
{
+ int c;
+ char *mtab = NULL;
+ char *repairstr = "";
+ struct scrub_ctx ctx = {0};
+ unsigned long long total_errors;
+ bool moveon = true;
+ static bool injected;
+ int ret = 0;
+
fprintf(stderr, "XXX: This program is not complete!\n");
return 4;
+
+ progname = basename(argv[0]);
+ setlocale(LC_ALL, "");
+ bindtextdomain(PACKAGE, LOCALEDIR);
+ textdomain(PACKAGE);
+
+ pthread_mutex_init(&ctx.lock, NULL);
+ ctx.mode = SCRUB_MODE_DEFAULT;
+ ctx.error_action = ERRORS_CONTINUE;
+ while ((c = getopt(argc, argv, "a:bde:m:nTvxVy")) != EOF) {
+ switch (c) {
+ case 'a':
+ ctx.max_errors = cvt_u64(optarg, 10);
+ if (errno) {
+ perror(optarg);
+ usage();
+ }
+ break;
+ case 'b':
+ nr_threads = 1;
+ bg_mode++;
+ break;
+ case 'd':
+ debug++;
+ dumpcore = true;
+ break;
+ case 'e':
+ if (!strcmp("continue", optarg))
+ ctx.error_action = ERRORS_CONTINUE;
+ else if (!strcmp("shutdown", optarg))
+ ctx.error_action = ERRORS_SHUTDOWN;
+ else
+ usage();
+ break;
+ case 'm':
+ mtab = optarg;
+ break;
+ case 'n':
+ if (ctx.mode != SCRUB_MODE_DEFAULT) {
+ fprintf(stderr,
+_("Only one of the options -n or -y may be specified.\n"));
+ return 1;
+ }
+ ctx.mode = SCRUB_MODE_DRY_RUN;
+ break;
+ case 'T':
+ display_rusage = true;
+ break;
+ case 'v':
+ verbose = true;
+ break;
+ case 'V':
+ fprintf(stdout, _("%s version %s\n"), progname,
+ VERSION);
+ fflush(stdout);
+ exit(0);
+ case 'x':
+ scrub_data = true;
+ break;
+ case 'y':
+ if (ctx.mode != SCRUB_MODE_DEFAULT) {
+ fprintf(stderr,
+_("Only one of the options -n or -y may be specified.\n"));
+ return 1;
+ }
+ ctx.mode = SCRUB_MODE_REPAIR;
+ break;
+ case '?':
+ /* fall through */
+ default:
+ usage();
+ }
+ }
+
+ /* Override thread count if debugging. */
+ if (debug_tweak_on("XFS_SCRUB_THREADS")) {
+ unsigned int x;
+
+ x = cvt_u32(getenv("XFS_SCRUB_THREADS"), 10);
+ if (errno) {
+ perror("nr_threads");
+ usage();
+ }
+ nr_threads = x;
+ }
+
+ if (optind != argc - 1)
+ usage();
+
+ ctx.mntpoint = strdup(argv[optind]);
+
+ /*
+ * If the user did not specify an explicit mount table, try to use
+ * /proc/mounts if it is available, else /etc/mtab. We prefer
+ * /proc/mounts because it is kernel controlled, while /etc/mtab
+ * may contain garbage that userspace tools like pam_mounts wrote
+ * into it.
+ */
+ if (!mtab) {
+ if (access(_PATH_PROC_MOUNTS, R_OK) == 0)
+ mtab = _PATH_PROC_MOUNTS;
+ else
+ mtab = _PATH_MOUNTED;
+ }
+
+ /* How many CPUs? */
+ nproc = sysconf(_SC_NPROCESSORS_ONLN);
+ if (nproc < 1)
+ nproc = 1;
+
+ /* Set up a page-aligned buffer for read verification. */
+ page_size = sysconf(_SC_PAGESIZE);
+ if (page_size < 0) {
+ str_errno(&ctx, ctx.mntpoint);
+ goto out;
+ }
+
+ if (debug_tweak_on("XFS_SCRUB_FORCE_REPAIR") && !injected) {
+ ctx.mode = SCRUB_MODE_REPAIR;
+ injected = true;
+ }
+
+ if (xfs_scrub_excessive_errors(&ctx))
+ str_info(&ctx, ctx.mntpoint, _("Too many errors; aborting."));
+
+ if (debug_tweak_on("XFS_SCRUB_FORCE_ERROR"))
+ str_error(&ctx, ctx.mntpoint, _("Injecting error."));
+
+out:
+ total_errors = ctx.errors_found + ctx.runtime_errors;
+ if (ctx.need_repair)
+ repairstr = _(" Unmount and run xfs_repair.");
+ if (total_errors && ctx.warnings_found)
+ fprintf(stderr,
+_("%s: %llu errors and %llu warnings found.%s\n"),
+ ctx.mntpoint, total_errors, ctx.warnings_found,
+ repairstr);
+ else if (total_errors && ctx.warnings_found == 0)
+ fprintf(stderr,
+_("%s: %llu errors found.%s\n"),
+ ctx.mntpoint, total_errors, repairstr);
+ else if (total_errors == 0 && ctx.warnings_found)
+ fprintf(stderr,
+_("%s: %llu warnings found.\n"),
+ ctx.mntpoint, ctx.warnings_found);
+ if (ctx.errors_found)
+ ret |= 1;
+ if (ctx.warnings_found)
+ ret |= 2;
+ if (ctx.runtime_errors)
+ ret |= 4;
+ free(ctx.mntpoint);
+
+ return ret;
}
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index f19ac6b..03d6012 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -20,16 +20,50 @@
#ifndef XFS_SCRUB_XFS_SCRUB_H_
#define XFS_SCRUB_XFS_SCRUB_H_
+#define _PATH_PROC_MOUNTS "/proc/mounts"
+
+extern unsigned int nr_threads;
+extern unsigned int bg_mode;
extern unsigned int debug;
+extern int nproc;
+extern bool display_rusage;
extern bool dumpcore;
+extern bool verbose;
+extern bool scrub_data;
+extern long page_size;
+
+enum scrub_mode {
+ SCRUB_MODE_DRY_RUN,
+ SCRUB_MODE_PREEN,
+ SCRUB_MODE_REPAIR,
+};
+#define SCRUB_MODE_DEFAULT SCRUB_MODE_PREEN
+
+enum error_action {
+ ERRORS_CONTINUE,
+ ERRORS_SHUTDOWN,
+};
struct scrub_ctx {
+ /* Immutable scrub state. */
+
+ /* Strings we need for presentation */
+ char *mntpoint;
+ char *blkdev;
+
+ /* What does the user want us to do? */
+ enum scrub_mode mode;
+
+ /* How does the user want us to react to errors? */
+ enum error_action error_action;
+
/* Mutable scrub state; use lock. */
pthread_mutex_t lock;
unsigned long long max_errors;
unsigned long long runtime_errors;
unsigned long long errors_found;
unsigned long long warnings_found;
+ bool need_repair;
};
#endif /* XFS_SCRUB_XFS_SCRUB_H_ */
* [PATCH 04/27] xfs_scrub: dispatch the various phases of the scrub program
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (2 preceding siblings ...)
2018-01-06 1:51 ` [PATCH 03/27] xfs_scrub: set up command line argument parsing Darrick J. Wong
@ 2018-01-06 1:51 ` Darrick J. Wong
2018-01-06 1:51 ` [PATCH 05/27] xfs_scrub: figure out how many threads we're going to need Darrick J. Wong
` (26 subsequent siblings)
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:51 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Create the dispatching routines that we'll use to call out to each
separate phase of the program.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
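For readers following along, here is a minimal sketch of the table-driven dispatch idea, assuming nothing beyond standard C. All demo_* names are hypothetical and are not part of this patch. Unlike the loop in this patch, which stops at the first entry whose fn is NULL, the sketch ends on a NULL descr and merely skips entries that have no function yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct scrub_ctx; it only counts phases run. */
struct demo_ctx {
	unsigned int	phases_run;
};

/* One table entry: a description plus the function that does the work. */
struct demo_phase {
	const char	*descr;
	bool		(*fn)(struct demo_ctx *);
};

/* Example phase that always succeeds. */
static bool demo_ok(struct demo_ctx *ctx)
{
	ctx->phases_run++;
	return true;
}

/* Example phase that always fails. */
static bool demo_fail(struct demo_ctx *ctx)
{
	ctx->phases_run++;
	return false;
}

/*
 * Walk the table until the NULL-descr sentinel.  Returns 0 if every phase
 * succeeded, otherwise the 1-based number of the phase that failed.
 */
static unsigned int
demo_run_phases(
	const struct demo_phase	*phases,
	struct demo_ctx		*ctx)
{
	const struct demo_phase	*sp;
	unsigned int		phase;

	assert(phases != NULL && ctx != NULL);
	for (phase = 1, sp = phases; sp->descr; sp++, phase++) {
		if (!sp->fn)
			continue;	/* placeholder phase, nothing to run yet */
		if (!sp->fn(ctx))
			return phase;
	}
	return 0;
}
```

A caller would build an array of struct demo_phase ending in a { NULL, NULL } entry and treat a nonzero return as "scrub aborted in that phase".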
configure.ac | 1
include/builddefs.in | 1
m4/package_libcdev.m4 | 18 +++
scrub/Makefile | 4 +
scrub/common.c | 63 +++++++++++
scrub/common.h | 4 +
scrub/xfs_scrub.c | 275 +++++++++++++++++++++++++++++++++++++++++++++++++
7 files changed, 366 insertions(+)
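The unit-scaling helpers added to scrub/common.c can be illustrated with a loop-based sketch. This is not the code in the patch, which uses an if/else chain and also honours the debug level; demo_space_units is a hypothetical name. It keeps the same strict "greater than" comparison, so exactly 1024 bytes is still reported in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Scale a byte count to the largest binary prefix that it exceeds. */
static double
demo_space_units(unsigned long long bytes, const char **units)
{
	static const char *names[] = { "B", "KiB", "MiB", "GiB", "TiB" };
	int	i;

	assert(units != NULL);
	for (i = 4; i > 0; i--) {
		/* 1ULL << (10 * i) is 2^10, 2^20, 2^30 or 2^40. */
		if (bytes > (1ULL << (10 * i))) {
			*units = names[i];
			return (double)bytes / (double)(1ULL << (10 * i));
		}
	}
	*units = names[0];
	return (double)bytes;
}
```

The returned value and the unit string are meant to be printed together, for example with a "%.1f%s" format as phase_end() does.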
diff --git a/configure.ac b/configure.ac
index f83d581..796a91b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -165,6 +165,7 @@ AC_HAVE_GETFSMAP
AC_HAVE_STATFS_FLAGS
AC_HAVE_MAP_SYNC
AC_HAVE_DEVMAPPER
+AC_HAVE_MALLINFO
if test "$enable_blkid" = yes; then
AC_HAVE_BLKID_TOPO
diff --git a/include/builddefs.in b/include/builddefs.in
index 9470703..28cf0d8 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -119,6 +119,7 @@ HAVE_GETFSMAP = @have_getfsmap@
HAVE_STATFS_FLAGS = @have_statfs_flags@
HAVE_MAP_SYNC = @have_map_sync@
HAVE_DEVMAPPER = @have_devmapper@
+HAVE_MALLINFO = @have_mallinfo@
GCCFLAGS = -funsigned-char -fno-strict-aliasing -Wall
# -Wbitwise -Wno-transparent-union -Wno-old-initializer -Wno-decl
diff --git a/m4/package_libcdev.m4 b/m4/package_libcdev.m4
index 71cedc5..d3955f0 100644
--- a/m4/package_libcdev.m4
+++ b/m4/package_libcdev.m4
@@ -344,3 +344,21 @@ AC_DEFUN([AC_HAVE_MAP_SYNC],
AC_MSG_RESULT(no))
AC_SUBST(have_map_sync)
])
+
+#
+# Check if we have a mallinfo libc call
+#
+AC_DEFUN([AC_HAVE_MALLINFO],
+ [ AC_MSG_CHECKING([for mallinfo ])
+ AC_TRY_COMPILE([
+#include <malloc.h>
+ ], [
+ struct mallinfo test;
+
+ test.arena = 0; test.hblkhd = 0; test.uordblks = 0; test.fordblks = 0;
+ test = mallinfo();
+ ], have_mallinfo=yes
+ AC_MSG_RESULT(yes),
+ AC_MSG_RESULT(no))
+ AC_SUBST(have_mallinfo)
+ ])
diff --git a/scrub/Makefile b/scrub/Makefile
index 62cca3b..097ec84 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -27,6 +27,10 @@ LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD)
LTDEPENDENCIES += $(LIBHANDLE) $(LIBFROG)
LLDFLAGS = -static
+ifeq ($(HAVE_MALLINFO),yes)
+LCFLAGS += -DHAVE_MALLINFO
+endif
+
default: depend $(LTCOMMAND)
include $(BUILDRULES)
diff --git a/scrub/common.c b/scrub/common.c
index 3c89b7d..9880ab5 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -159,3 +159,66 @@ do_error(char const *msg, ...)
abort();
exit(1);
}
+
+double
+timeval_subtract(
+ struct timeval *tv1,
+ struct timeval *tv2)
+{
+ return ((tv1->tv_sec - tv2->tv_sec) +
+ ((double) (tv1->tv_usec - tv2->tv_usec)) / 1000000);
+}
+
+/* Produce human readable disk space output. */
+double
+auto_space_units(
+ unsigned long long bytes,
+ char **units)
+{
+ if (debug > 1)
+ goto no_prefix;
+ if (bytes > (1ULL << 40)) {
+ *units = "TiB";
+ return (double)bytes / (1ULL << 40);
+ } else if (bytes > (1ULL << 30)) {
+ *units = "GiB";
+ return (double)bytes / (1ULL << 30);
+ } else if (bytes > (1ULL << 20)) {
+ *units = "MiB";
+ return (double)bytes / (1ULL << 20);
+ } else if (bytes > (1ULL << 10)) {
+ *units = "KiB";
+ return (double)bytes / (1ULL << 10);
+ }
+
+no_prefix:
+ *units = "B";
+ return bytes;
+}
+
+/* Produce human readable discrete number output. */
+double
+auto_units(
+ unsigned long long number,
+ char **units)
+{
+ if (debug > 1)
+ goto no_prefix;
+ if (number > 1000000000000ULL) {
+ *units = "T";
+ return number / 1000000000000.0;
+ } else if (number > 1000000000ULL) {
+ *units = "G";
+ return number / 1000000000.0;
+ } else if (number > 1000000ULL) {
+ *units = "M";
+ return number / 1000000.0;
+ } else if (number > 1000ULL) {
+ *units = "K";
+ return number / 1000.0;
+ }
+
+no_prefix:
+ *units = "";
+ return number;
+}
diff --git a/scrub/common.h b/scrub/common.h
index 15a59bd..3afc616 100644
--- a/scrub/common.h
+++ b/scrub/common.h
@@ -56,4 +56,8 @@ debug_tweak_on(
return debug && getenv(name) != NULL;
}
+double timeval_subtract(struct timeval *tv1, struct timeval *tv2);
+double auto_space_units(unsigned long long kilobytes, char **units);
+double auto_units(unsigned long long number, char **units);
+
#endif /* XFS_SCRUB_COMMON_H_ */
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index 9db3b41..a9c185b 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -21,6 +21,8 @@
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
+#include <sys/time.h>
+#include <sys/resource.h>
#include "platform_defs.h"
#include "xfs.h"
#include "input.h"
@@ -151,6 +153,267 @@ usage(void)
exit(16);
}
+#ifndef RUSAGE_BOTH
+# define RUSAGE_BOTH (-2)
+#endif
+
+/* Get resource usage for ourselves and all children. */
+static int
+scrub_getrusage(
+ struct rusage *usage)
+{
+ struct rusage cusage;
+ int err;
+
+ err = getrusage(RUSAGE_BOTH, usage);
+ if (!err)
+ return err;
+
+ err = getrusage(RUSAGE_SELF, usage);
+ if (err)
+ return err;
+
+ err = getrusage(RUSAGE_CHILDREN, &cusage);
+ if (err)
+ return err;
+
+ usage->ru_minflt += cusage.ru_minflt;
+ usage->ru_majflt += cusage.ru_majflt;
+ usage->ru_nswap += cusage.ru_nswap;
+ usage->ru_inblock += cusage.ru_inblock;
+ usage->ru_oublock += cusage.ru_oublock;
+ usage->ru_msgsnd += cusage.ru_msgsnd;
+ usage->ru_msgrcv += cusage.ru_msgrcv;
+ usage->ru_nsignals += cusage.ru_nsignals;
+ usage->ru_nvcsw += cusage.ru_nvcsw;
+ usage->ru_nivcsw += cusage.ru_nivcsw;
+ return 0;
+}
+
+/*
+ * Scrub Phase Dispatch
+ *
+ * The operations of the scrub program are split up into several
+ * different phases. Each phase builds upon the metadata checked in the
+ * previous phase, which is to say that we may skip phase (X + 1) if our
+ * scans in phase (X) reveal corruption. A phase may be skipped
+ * entirely.
+ */
+
+/* Resource usage for each phase. */
+struct phase_rusage {
+ struct rusage ruse;
+ struct timeval time;
+ unsigned long long verified_bytes;
+ void *brk_start;
+ const char *descr;
+};
+
+/* Operations for each phase. */
+#define DATASCAN_DUMMY_FN ((void *)1)
+#define REPAIR_DUMMY_FN ((void *)2)
+struct phase_ops {
+ char *descr;
+ bool (*fn)(struct scrub_ctx *);
+ bool must_run;
+};
+
+/* Start tracking resource usage for a phase. */
+static bool
+phase_start(
+ struct phase_rusage *pi,
+ unsigned int phase,
+ const char *descr)
+{
+ int error;
+
+ memset(pi, 0, sizeof(*pi));
+ error = scrub_getrusage(&pi->ruse);
+ if (error) {
+ perror(_("getrusage"));
+ return false;
+ }
+ pi->brk_start = sbrk(0);
+
+ error = gettimeofday(&pi->time, NULL);
+ if (error) {
+ perror(_("gettimeofday"));
+ return false;
+ }
+
+ pi->descr = descr;
+ if ((verbose || display_rusage) && descr) {
+ fprintf(stdout, _("Phase %u: %s\n"), phase, descr);
+ fflush(stdout);
+ }
+ return true;
+}
+
+/* Report usage stats. */
+static bool
+phase_end(
+ struct phase_rusage *pi,
+ unsigned int phase)
+{
+ struct rusage ruse_now;
+#ifdef HAVE_MALLINFO
+ struct mallinfo mall_now;
+#endif
+ struct timeval time_now;
+ char phasebuf[DESCR_BUFSZ];
+ double dt;
+ unsigned long long in, out;
+ unsigned long long io;
+ double i, o, t;
+ double din, dout, dtot;
+ char *iu, *ou, *tu, *dinu, *doutu, *dtotu;
+ int error;
+
+ if (!display_rusage)
+ return true;
+
+ error = gettimeofday(&time_now, NULL);
+ if (error) {
+ perror(_("gettimeofday"));
+ return false;
+ }
+ dt = timeval_subtract(&time_now, &pi->time);
+
+ error = scrub_getrusage(&ruse_now);
+ if (error) {
+ perror(_("getrusage"));
+ return false;
+ }
+
+ if (phase)
+ snprintf(phasebuf, DESCR_BUFSZ, _("Phase %u: "), phase);
+ else
+ phasebuf[0] = 0;
+
+#define kbytes(x) (((unsigned long)(x) + 1023) / 1024)
+#ifdef HAVE_MALLINFO
+
+ mall_now = mallinfo();
+ fprintf(stdout, _("%sMemory used: %luk/%luk (%luk/%luk), "),
+ phasebuf,
+ kbytes(mall_now.arena), kbytes(mall_now.hblkhd),
+ kbytes(mall_now.uordblks), kbytes(mall_now.fordblks));
+#else
+ fprintf(stdout, _("%sMemory used: %luk, "),
+ phasebuf,
+ (unsigned long) kbytes(((char *) sbrk(0)) -
+ ((char *) pi->brk_start)));
+#endif
+#undef kbytes
+
+ fprintf(stdout, _("time: %5.2f/%5.2f/%5.2fs\n"),
+ timeval_subtract(&time_now, &pi->time),
+ timeval_subtract(&ruse_now.ru_utime, &pi->ruse.ru_utime),
+ timeval_subtract(&ruse_now.ru_stime, &pi->ruse.ru_stime));
+
+ /* I/O usage */
+ in = ((unsigned long long)ruse_now.ru_inblock -
+ pi->ruse.ru_inblock) << BBSHIFT;
+ out = ((unsigned long long)ruse_now.ru_oublock -
+ pi->ruse.ru_oublock) << BBSHIFT;
+ io = in + out;
+ if (io) {
+ i = auto_space_units(in, &iu);
+ o = auto_space_units(out, &ou);
+ t = auto_space_units(io, &tu);
+ din = auto_space_units(in / dt, &dinu);
+ dout = auto_space_units(out / dt, &doutu);
+ dtot = auto_space_units(io / dt, &dtotu);
+ fprintf(stdout,
+_("%sI/O: %.1f%s in, %.1f%s out, %.1f%s tot\n"),
+ phasebuf, i, iu, o, ou, t, tu);
+ fprintf(stdout,
+_("%sI/O rate: %.1f%s/s in, %.1f%s/s out, %.1f%s/s tot\n"),
+ phasebuf, din, dinu, dout, doutu, dtot, dtotu);
+ }
+ fflush(stdout);
+
+ return true;
+}
+
+/* Run all the phases of the scrubber. */
+static bool
+run_scrub_phases(
+ struct scrub_ctx *ctx)
+{
+ struct phase_ops phases[] =
+ {
+ {
+ .descr = _("Find filesystem geometry."),
+ },
+ {
+ .descr = _("Check internal metadata."),
+ },
+ {
+ .descr = _("Scan all inodes."),
+ },
+ {
+ .descr = _("Defer filesystem repairs."),
+ .fn = REPAIR_DUMMY_FN,
+ },
+ {
+ .descr = _("Check directory tree."),
+ },
+ {
+ .descr = _("Verify data file integrity."),
+ .fn = DATASCAN_DUMMY_FN,
+ },
+ {
+ .descr = _("Check summary counters."),
+ },
+ {
+ NULL
+ },
+ };
+ struct phase_rusage pi;
+ struct phase_ops *sp;
+ bool moveon = true;
+ unsigned int debug_phase = 0;
+ unsigned int phase;
+
+ if (debug && debug_tweak_on("XFS_SCRUB_PHASE"))
+ debug_phase = atoi(getenv("XFS_SCRUB_PHASE"));
+
+ /* Run all phases of the scrub tool. */
+ for (phase = 1, sp = phases; sp->fn; sp++, phase++) {
+ /* Skip certain phases unless they're turned on. */
+ if (sp->fn == REPAIR_DUMMY_FN ||
+ sp->fn == DATASCAN_DUMMY_FN)
+ continue;
+
+ /* Allow debug users to force a particular phase. */
+ if (debug_phase && phase != debug_phase && !sp->must_run)
+ continue;
+
+ /* Run this phase. */
+ moveon = phase_start(&pi, phase, sp->descr);
+ if (!moveon)
+ break;
+ moveon = sp->fn(ctx);
+ if (!moveon) {
+ str_info(ctx, ctx->mntpoint,
+_("Scrub aborted after phase %u."),
+ phase);
+ break;
+ }
+ moveon = phase_end(&pi, phase);
+ if (!moveon)
+ break;
+
+ /* Too many errors? */
+ moveon = !xfs_scrub_excessive_errors(ctx);
+ if (!moveon)
+ break;
+ }
+
+ return moveon;
+}
+
int
main(
int argc,
@@ -160,6 +423,7 @@ main(
char *mtab = NULL;
char *repairstr = "";
struct scrub_ctx ctx = {0};
+ struct phase_rusage all_pi;
unsigned long long total_errors;
bool moveon = true;
static bool injected;
@@ -272,6 +536,11 @@ _("Only one of the options -n or -y may be specified.\n"));
mtab = _PATH_MOUNTED;
}
+ /* Initialize overall phase stats. */
+ moveon = phase_start(&all_pi, 0, NULL);
+ if (!moveon)
+ goto out;
+
/* How many CPUs? */
nproc = sysconf(_SC_NPROCESSORS_ONLN);
if (nproc < 1)
@@ -289,6 +558,11 @@ _("Only one of the options -n or -y may be specified.\n"));
injected = true;
}
+ /* Scrub a filesystem. */
+ moveon = run_scrub_phases(&ctx);
+ if (!moveon)
+ ret |= 4;
+
if (xfs_scrub_excessive_errors(&ctx))
str_info(&ctx, ctx.mntpoint, _("Too many errors; aborting."));
@@ -318,6 +592,7 @@ _("%s: %llu warnings found.\n"),
ret |= 2;
if (ctx.runtime_errors)
ret |= 4;
+ phase_end(&all_pi, 0);
free(ctx.mntpoint);
return ret;
* [PATCH 05/27] xfs_scrub: figure out how many threads we're going to need
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (3 preceding siblings ...)
2018-01-06 1:51 ` [PATCH 04/27] xfs_scrub: dispatch the various phases of the scrub program Darrick J. Wong
@ 2018-01-06 1:51 ` Darrick J. Wong
2018-01-06 1:52 ` [PATCH 06/27] xfs_scrub: create an abstraction for a block device Darrick J. Wong
` (25 subsequent siblings)
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:51 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Create the plumbing to figure out how many threads we're going to want
to do all of our scrubbing.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
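As a rough illustration of the selection logic described above, the sketch below restates it with the global override passed in as a parameter. The demo_* names and the parameter layout are assumptions made for the example; the patch itself reads the nr_threads global and ctx->nr_io_threads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the one scrub_ctx field that matters here. */
struct demo_ctx {
	unsigned int	nr_io_threads;
};

/* An explicit override wins; otherwise use the per-filesystem default. */
static unsigned int
demo_nproc(unsigned int nr_threads, const struct demo_ctx *ctx)
{
	assert(ctx != NULL);
	return nr_threads ? nr_threads : ctx->nr_io_threads;
}

/*
 * Workqueue variant: a count of one is turned into zero, which the caller
 * is expected to interpret as "run the work in the calling thread".
 */
static unsigned int
demo_nproc_workqueue(unsigned int nr_threads, const struct demo_ctx *ctx)
{
	unsigned int	x = demo_nproc(nr_threads, ctx);

	return x == 1 ? 0 : x;
}
```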
scrub/common.c | 26 ++++++++++++++++++++++++++
scrub/common.h | 2 ++
scrub/xfs_scrub.h | 3 +++
3 files changed, 31 insertions(+)
diff --git a/scrub/common.c b/scrub/common.c
index 9880ab5..75c6df5 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -222,3 +222,29 @@ auto_units(
*units = "";
return number;
}
+
+/* How many threads to kick off? */
+unsigned int
+scrub_nproc(
+ struct scrub_ctx *ctx)
+{
+ if (nr_threads)
+ return nr_threads;
+ return ctx->nr_io_threads;
+}
+
+/*
+ * How many threads to kick off for a workqueue? If we only want one
+ * thread, save ourselves the overhead and just run it in the main thread.
+ */
+unsigned int
+scrub_nproc_workqueue(
+ struct scrub_ctx *ctx)
+{
+ unsigned int x;
+
+ x = scrub_nproc(ctx);
+ if (x == 1)
+ x = 0;
+ return x;
+}
diff --git a/scrub/common.h b/scrub/common.h
index 3afc616..41b3ea7 100644
--- a/scrub/common.h
+++ b/scrub/common.h
@@ -59,5 +59,7 @@ debug_tweak_on(
double timeval_subtract(struct timeval *tv1, struct timeval *tv2);
double auto_space_units(unsigned long long kilobytes, char **units);
double auto_units(unsigned long long number, char **units);
+unsigned int scrub_nproc(struct scrub_ctx *ctx);
+unsigned int scrub_nproc_workqueue(struct scrub_ctx *ctx);
#endif /* XFS_SCRUB_COMMON_H_ */
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index 03d6012..7f1dcb1 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -57,6 +57,9 @@ struct scrub_ctx {
/* How does the user want us to react to errors? */
enum error_action error_action;
+ /* Number of threads for metadata scrubbing */
+ unsigned int nr_io_threads;
+
/* Mutable scrub state; use lock. */
pthread_mutex_t lock;
unsigned long long max_errors;
* [PATCH 06/27] xfs_scrub: create an abstraction for a block device
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (4 preceding siblings ...)
2018-01-06 1:51 ` [PATCH 05/27] xfs_scrub: figure out how many threads we're going to need Darrick J. Wong
@ 2018-01-06 1:52 ` Darrick J. Wong
2018-01-11 23:24 ` Eric Sandeen
2018-01-06 1:52 ` [PATCH 07/27] xfs_scrub: find XFS filesystem geometry Darrick J. Wong
` (24 subsequent siblings)
30 siblings, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:52 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Create an abstraction to handle all of our low level disk operations.
We'll eventually use it to bind to a fs mount point and block device.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
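The "disk heads" estimate for rotational devices can be looked at in isolation. The sketch below is a pure-function restatement under the assumption that the minimum and optimal IO sizes have already been fetched (the patch obtains them with the BLKIOMIN and BLKIOOPT ioctls); demo_disk_heads is a hypothetical name, and zero stands for "hint unavailable".

```c
#include <assert.h>

/*
 * Estimate how many IOs to keep in flight on a rotational device.
 * iomin and ioopt are the minimum and optimal IO sizes in bytes.
 */
static unsigned int
demo_disk_heads(unsigned int nproc, unsigned int iomin, unsigned int ioopt)
{
	unsigned int	heads;

	assert(nproc >= 1);
	if (iomin == 0 || ioopt == 0)
		return 2;	/* no hint: same fallback guess as the patch */

	/* A stripe of N members tends to report ioopt == N * iomin. */
	heads = ioopt / iomin;
	if (heads < 1)
		heads = 1;
	return heads < nproc ? heads : nproc;
}
```

For instance, a device reporting a 64KiB minimum and a 256KiB optimal IO size would yield four heads on a machine with at least four CPUs.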
scrub/Makefile | 2 +
scrub/disk.c | 164 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/disk.h | 39 +++++++++++++
3 files changed, 205 insertions(+)
create mode 100644 scrub/disk.c
create mode 100644 scrub/disk.h
diff --git a/scrub/Makefile b/scrub/Makefile
index 097ec84..c3a9986 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -17,10 +17,12 @@ endif # scrub_prereqs
HFILES = \
common.h \
+disk.h \
xfs_scrub.h
CFILES = \
common.c \
+disk.c \
xfs_scrub.c
LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD)
diff --git a/scrub/disk.c b/scrub/disk.c
new file mode 100644
index 0000000..d4bf81f
--- /dev/null
+++ b/scrub/disk.c
@@ -0,0 +1,164 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/ioctl.h>
+#include <sys/statvfs.h>
+#include <sys/vfs.h>
+#include <linux/fs.h>
+#include "platform_defs.h"
+#include "libfrog.h"
+#include "xfs_scrub.h"
+#include "disk.h"
+
+/*
+ * Disk Abstraction
+ *
+ * These routines help us to discover the geometry of a block device,
+ * estimate the number of concurrent IOs that we can send to it, and
+ * abstract the process of performing read verification of disk blocks.
+ */
+
+/* Figure out how many disk heads are available. */
+static unsigned int
+__disk_heads(
+ struct disk *disk)
+{
+ int iomin;
+ int ioopt;
+ unsigned short rot;
+ int error;
+
+ /* If it's not a block device, throw all the CPUs at it. */
+ if (!S_ISBLK(disk->d_sb.st_mode))
+ return nproc;
+
+ /* Non-rotational device? Throw all the CPUs. */
+ rot = 1;
+ error = ioctl(disk->d_fd, BLKROTATIONAL, &rot);
+ if (error == 0 && rot == 0)
+ return nproc;
+
+ /*
+ * Sometimes we can infer the number of devices from the
+ * min/optimal IO sizes.
+ */
+ iomin = ioopt = 0;
+ if (ioctl(disk->d_fd, BLKIOMIN, &iomin) == 0 &&
+ ioctl(disk->d_fd, BLKIOOPT, &ioopt) == 0 &&
+ iomin > 0 && ioopt > 0) {
+ return min(nproc, max(1, ioopt / iomin));
+ }
+
+ /* Rotating device? I guess? */
+ return 2;
+}
+
+/* Figure out how many disk heads are available. */
+unsigned int
+disk_heads(
+ struct disk *disk)
+{
+ if (nr_threads)
+ return nr_threads;
+ return __disk_heads(disk);
+}
+
+/* Open a disk device and discover its geometry. */
+struct disk *
+disk_open(
+ const char *pathname)
+{
+ struct disk *disk;
+ int lba_sz;
+ int error;
+
+ disk = calloc(1, sizeof(struct disk));
+ if (!disk)
+ return NULL;
+
+ disk->d_fd = open(pathname, O_RDONLY | O_DIRECT | O_NOATIME);
+ if (disk->d_fd < 0)
+ goto out_free;
+
+ /* Try to get LBA size. */
+ error = ioctl(disk->d_fd, BLKSSZGET, &lba_sz);
+ if (error)
+ lba_sz = 512;
+ disk->d_lbalog = log2_roundup(lba_sz);
+
+ /* Obtain disk's stat info. */
+ error = fstat(disk->d_fd, &disk->d_sb);
+ if (error)
+ goto out_close;
+
+ /* Determine bdev size, block size, and offset. */
+ if (S_ISBLK(disk->d_sb.st_mode)) {
+ error = ioctl(disk->d_fd, BLKGETSIZE64, &disk->d_size);
+ if (error)
+ disk->d_size = 0;
+ error = ioctl(disk->d_fd, BLKBSZGET, &disk->d_blksize);
+ if (error)
+ disk->d_blksize = 0;
+ disk->d_start = 0;
+ } else {
+ disk->d_size = disk->d_sb.st_size;
+ disk->d_blksize = disk->d_sb.st_blksize;
+ disk->d_start = 0;
+ }
+
+ return disk;
+out_close:
+ close(disk->d_fd);
+out_free:
+ free(disk);
+ return NULL;
+}
+
+/* Close a disk device. */
+int
+disk_close(
+ struct disk *disk)
+{
+ int error = 0;
+
+ if (disk->d_fd >= 0)
+ error = close(disk->d_fd);
+ disk->d_fd = -1;
+ free(disk);
+ return error;
+}
+
+/* Read-verify an extent of a disk device. */
+ssize_t
+disk_read_verify(
+ struct disk *disk,
+ void *buf,
+ uint64_t start,
+ uint64_t length)
+{
+ return pread(disk->d_fd, buf, length, start);
+}
diff --git a/scrub/disk.h b/scrub/disk.h
new file mode 100644
index 0000000..834678e
--- /dev/null
+++ b/scrub/disk.h
@@ -0,0 +1,39 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_DISK_H_
+#define XFS_SCRUB_DISK_H_
+
+struct disk {
+ struct stat d_sb;
+ int d_fd;
+ int d_lbalog;
+ unsigned int d_flags;
+ unsigned int d_blksize; /* bytes */
+ uint64_t d_size; /* bytes */
+ uint64_t d_start; /* bytes */
+};
+
+unsigned int disk_heads(struct disk *disk);
+struct disk *disk_open(const char *pathname);
+int disk_close(struct disk *disk);
+ssize_t disk_read_verify(struct disk *disk, void *buf, uint64_t startblock,
+ uint64_t blockcount);
+
+#endif /* XFS_SCRUB_DISK_H_ */
* [PATCH 07/27] xfs_scrub: find XFS filesystem geometry
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (5 preceding siblings ...)
2018-01-06 1:52 ` [PATCH 06/27] xfs_scrub: create an abstraction for a block device Darrick J. Wong
@ 2018-01-06 1:52 ` Darrick J. Wong
2018-01-06 1:52 ` [PATCH 08/27] xfs_scrub: add inode iteration functions Darrick J. Wong
` (23 subsequent siblings)
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:52 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Discover the geometry of the XFS filesystem that we've been told to
scan, and set up some common functions that will be used by the
scrub phases.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
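Phase 1 derives several log2 values from the geometry (blocklog, inodelog, inopblog, agblklog). The sketch below shows one way such helpers could be written; demo_highbit32 and demo_log2_roundup are hypothetical stand-ins and only approximate what the highbit32() and log2_roundup() helpers used by the patch are assumed to compute for nonzero inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the highest set bit, i.e. floor(log2(v)); v must be nonzero. */
static int
demo_highbit32(uint32_t v)
{
	int	bit = 0;

	assert(v != 0);
	while (v >>= 1)
		bit++;
	return bit;
}

/* Smallest n such that (1 << n) >= v; v must be nonzero. */
static int
demo_log2_roundup(uint32_t v)
{
	int	n = demo_highbit32(v);

	/* Anything that is not a power of two needs one more bit. */
	return (v & (v - 1)) ? n + 1 : n;
}
```

With a 4096-byte block and a 512-byte inode this gives a blocklog of 12 and an inodelog of 9, so the inodes-per-block log is 12 - 9 = 3, or eight inodes per block.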
scrub/Makefile | 5 +
scrub/common.c | 72 +++++++++++++++++
scrub/common.h | 10 ++
scrub/disk.c | 3 +
scrub/phase1.c | 223 +++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/xfs_scrub.c | 35 ++++++++
scrub/xfs_scrub.h | 29 +++++++
7 files changed, 376 insertions(+), 1 deletion(-)
create mode 100644 scrub/phase1.c
diff --git a/scrub/Makefile b/scrub/Makefile
index c3a9986..5239dae 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -23,6 +23,7 @@ xfs_scrub.h
CFILES = \
common.c \
disk.c \
+phase1.c \
xfs_scrub.c
LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD)
@@ -33,6 +34,10 @@ ifeq ($(HAVE_MALLINFO),yes)
LCFLAGS += -DHAVE_MALLINFO
endif
+ifeq ($(HAVE_SYNCFS),yes)
+LCFLAGS += -DHAVE_SYNCFS
+endif
+
default: depend $(LTCOMMAND)
include $(BUILDRULES)
diff --git a/scrub/common.c b/scrub/common.c
index 75c6df5..252809d 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -20,8 +20,11 @@
#include <stdio.h>
#include <pthread.h>
#include <stdbool.h>
+#include <sys/statvfs.h>
#include "platform_defs.h"
#include "xfs.h"
+#include "xfs_fs.h"
+#include "path.h"
#include "xfs_scrub.h"
#include "common.h"
@@ -248,3 +251,72 @@ scrub_nproc_workqueue(
x = 0;
return x;
}
+
+/*
+ * Check if the argument is either the device name or mountpoint of a mounted
+ * filesystem.
+ */
+#define MNTTYPE_XFS "xfs"
+static bool
+find_mountpoint_check(
+ struct stat *sb,
+ struct mntent *t)
+{
+ struct stat ms;
+
+ if (S_ISDIR(sb->st_mode)) { /* mount point */
+ if (stat(t->mnt_dir, &ms) < 0)
+ return false;
+ if (sb->st_ino != ms.st_ino)
+ return false;
+ if (sb->st_dev != ms.st_dev)
+ return false;
+ if (strcmp(t->mnt_type, MNTTYPE_XFS) != 0)
+ return false;
+ } else { /* device */
+ if (stat(t->mnt_fsname, &ms) < 0)
+ return false;
+ if (sb->st_rdev != ms.st_rdev)
+ return false;
+ if (strcmp(t->mnt_type, MNTTYPE_XFS) != 0)
+ return false;
+ /*
+ * Make sure the mountpoint given by mtab is accessible
+ * before using it.
+ */
+ if (stat(t->mnt_dir, &ms) < 0)
+ return false;
+ }
+
+ return true;
+}
+
+/* Check that our alleged mountpoint is in mtab */
+bool
+find_mountpoint(
+ char *mtab,
+ struct scrub_ctx *ctx)
+{
+ struct mntent_cursor cursor;
+ struct mntent *t = NULL;
+ bool found = false;
+
+ if (platform_mntent_open(&cursor, mtab) != 0) {
+ fprintf(stderr, _("Error: can't get mntent entries.\n"));
+ exit(1);
+ }
+
+ while ((t = platform_mntent_next(&cursor)) != NULL) {
+ /*
+ * Keep jotting down matching mount details; newer mounts are
+ * towards the end of the file (hopefully).
+ */
+ if (find_mountpoint_check(&ctx->mnt_sb, t)) {
+ ctx->mntpoint = strdup(t->mnt_dir);
+ ctx->blkdev = strdup(t->mnt_fsname);
+ found = true;
+ }
+ }
+ platform_mntent_close(&cursor);
+ return found;
+}
diff --git a/scrub/common.h b/scrub/common.h
index 41b3ea7..fed95df 100644
--- a/scrub/common.h
+++ b/scrub/common.h
@@ -62,4 +62,14 @@ double auto_units(unsigned long long number, char **units);
unsigned int scrub_nproc(struct scrub_ctx *ctx);
unsigned int scrub_nproc_workqueue(struct scrub_ctx *ctx);
+#ifndef HAVE_SYNCFS
+static inline int syncfs(int fd)
+{
+ sync();
+ return 0;
+}
+#endif
+
+bool find_mountpoint(char *mtab, struct scrub_ctx *ctx);
+
#endif /* XFS_SCRUB_COMMON_H_ */
diff --git a/scrub/disk.c b/scrub/disk.c
index d4bf81f..546a06c 100644
--- a/scrub/disk.c
+++ b/scrub/disk.c
@@ -31,6 +31,9 @@
#include <linux/fs.h>
#include "platform_defs.h"
#include "libfrog.h"
+#include "xfs.h"
+#include "path.h"
+#include "xfs_fs.h"
#include "xfs_scrub.h"
#include "disk.h"
diff --git a/scrub/phase1.c b/scrub/phase1.c
new file mode 100644
index 0000000..65409d3
--- /dev/null
+++ b/scrub/phase1.c
@@ -0,0 +1,223 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <mntent.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <sys/statvfs.h>
+#include <sys/vfs.h>
+#include <fcntl.h>
+#include <dirent.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <pthread.h>
+#include <errno.h>
+#include <linux/fs.h>
+#include "libfrog.h"
+#include "workqueue.h"
+#include "input.h"
+#include "path.h"
+#include "handle.h"
+#include "bitops.h"
+#include "xfs_arch.h"
+#include "xfs_format.h"
+#include "avl64.h"
+#include "list.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "disk.h"
+
+/* Phase 1: Find filesystem geometry (and clean up after) */
+
+/* Shut down the filesystem. */
+void
+xfs_shutdown_fs(
+ struct scrub_ctx *ctx)
+{
+ int flag;
+
+ flag = XFS_FSOP_GOING_FLAGS_LOGFLUSH;
+ str_info(ctx, ctx->mntpoint, _("Shutting down filesystem!"));
+ if (ioctl(ctx->mnt_fd, XFS_IOC_GOINGDOWN, &flag))
+ str_errno(ctx, ctx->mntpoint);
+}
+
+/* Clean up the XFS-specific state data. */
+bool
+xfs_cleanup_fs(
+ struct scrub_ctx *ctx)
+{
+ if (ctx->fshandle)
+ free_handle(ctx->fshandle, ctx->fshandle_len);
+ if (ctx->rtdev)
+ disk_close(ctx->rtdev);
+ if (ctx->logdev)
+ disk_close(ctx->logdev);
+ if (ctx->datadev)
+ disk_close(ctx->datadev);
+ fshandle_destroy();
+ close(ctx->mnt_fd);
+ fs_table_destroy();
+
+ return true;
+}
+
+/*
+ * Bind to the mountpoint, read the XFS geometry, bind to the block devices.
+ * Anything we've already built will be cleaned up by xfs_cleanup_fs.
+ */
+bool
+xfs_setup_fs(
+ struct scrub_ctx *ctx)
+{
+ struct fs_path *fsp;
+ int error;
+
+ /*
+ * Open the directory with O_NOATIME. For mountpoints owned
+ * by root, this should be sufficient to ensure that we have
+ * CAP_SYS_ADMIN, which we probably need to do anything fancy
+ * with the (XFS driver) kernel.
+ */
+ ctx->mnt_fd = open(ctx->mntpoint, O_RDONLY | O_NOATIME | O_DIRECTORY);
+ if (ctx->mnt_fd < 0) {
+ if (errno == EPERM)
+ str_info(ctx, ctx->mntpoint,
+_("Must be root to run scrub."));
+ else
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+
+ error = fstat(ctx->mnt_fd, &ctx->mnt_sb);
+ if (error) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+ error = fstatvfs(ctx->mnt_fd, &ctx->mnt_sv);
+ if (error) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+ error = fstatfs(ctx->mnt_fd, &ctx->mnt_sf);
+ if (error) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+
+ ctx->nr_io_threads = nproc;
+ if (verbose) {
+ fprintf(stdout, _("%s: using %d threads to scrub.\n"),
+ ctx->mntpoint, scrub_nproc(ctx));
+ fflush(stdout);
+ }
+
+ if (!platform_test_xfs_fd(ctx->mnt_fd)) {
+ str_error(ctx, ctx->mntpoint,
+_("Does not appear to be an XFS filesystem!"));
+ return false;
+ }
+
+ /*
+ * Flush everything out to disk before we start checking.
+ * This seems to reduce the incidence of stale file handle
+ * errors when we open things by handle.
+ */
+ error = syncfs(ctx->mnt_fd);
+ if (error) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+
+ /* Retrieve XFS geometry. */
+ error = ioctl(ctx->mnt_fd, XFS_IOC_FSGEOMETRY, &ctx->geo);
+ if (error) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+
+ ctx->agblklog = log2_roundup(ctx->geo.agblocks);
+ ctx->blocklog = highbit32(ctx->geo.blocksize);
+ ctx->inodelog = highbit32(ctx->geo.inodesize);
+ ctx->inopblog = ctx->blocklog - ctx->inodelog;
+
+ error = path_to_fshandle(ctx->mntpoint, &ctx->fshandle,
+ &ctx->fshandle_len);
+ if (error) {
+ perror(_("getting fshandle"));
+ return false;
+ }
+
+ /* Go find the XFS devices if we have a usable fsmap. */
+ fs_table_initialise(0, NULL, 0, NULL);
+ errno = 0;
+ fsp = fs_table_lookup(ctx->mntpoint, FS_MOUNT_POINT);
+ if (!fsp) {
+ str_error(ctx, ctx->mntpoint,
+_("Unable to find XFS information."));
+ return false;
+ }
+ memcpy(&ctx->fsinfo, fsp, sizeof(struct fs_path));
+
+ /* Did we find the log and rt devices, if they're present? */
+ if (ctx->geo.logstart == 0 && ctx->fsinfo.fs_log == NULL) {
+ str_error(ctx, ctx->mntpoint,
+_("Unable to find log device path."));
+ return false;
+ }
+ if (ctx->geo.rtblocks && ctx->fsinfo.fs_rt == NULL) {
+ str_error(ctx, ctx->mntpoint,
+_("Unable to find realtime device path."));
+ return false;
+ }
+
+ /* Open the raw devices. */
+ ctx->datadev = disk_open(ctx->fsinfo.fs_name);
+ if (!ctx->datadev) {
+ str_errno(ctx, ctx->fsinfo.fs_name);
+ return false;
+ }
+
+ if (ctx->fsinfo.fs_log) {
+ ctx->logdev = disk_open(ctx->fsinfo.fs_log);
+ if (!ctx->logdev) {
+ str_errno(ctx, ctx->fsinfo.fs_log);
+ return false;
+ }
+ }
+ if (ctx->fsinfo.fs_rt) {
+ ctx->rtdev = disk_open(ctx->fsinfo.fs_rt);
+ if (!ctx->rtdev) {
+ str_errno(ctx, ctx->fsinfo.fs_rt);
+ return false;
+ }
+ }
+
+ /*
+ * Everything's set up, which means any failures recorded after
+ * this point are most probably corruption errors (as opposed to
+ * purely setup errors).
+ */
+ ctx->need_repair = true;
+ return true;
+}
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index a9c185b..a733b8f 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -23,9 +23,12 @@
#include <stdlib.h>
#include <sys/time.h>
#include <sys/resource.h>
+#include <sys/statvfs.h>
#include "platform_defs.h"
#include "xfs.h"
+#include "xfs_fs.h"
#include "input.h"
+#include "path.h"
#include "xfs_scrub.h"
#include "common.h"
@@ -345,6 +348,8 @@ run_scrub_phases(
{
{
.descr = _("Find filesystem geometry."),
+ .fn = xfs_setup_fs,
+ .must_run = true,
},
{
.descr = _("Check internal metadata."),
@@ -426,6 +431,7 @@ main(
struct phase_rusage all_pi;
unsigned long long total_errors;
bool moveon = true;
+ bool ismnt;
static bool injected;
int ret = 0;
@@ -522,6 +528,15 @@ _("Only one of the options -n or -y may be specified.\n"));
ctx.mntpoint = strdup(argv[optind]);
+ /* Find the mount record for the passed-in argument. */
+ if (stat(argv[optind], &ctx.mnt_sb) < 0) {
+ fprintf(stderr,
+ _("%s: could not stat: %s: %s\n"),
+ progname, argv[optind], strerror(errno));
+ ret |= 8;
+ goto out;
+ }
+
/*
* If the user did not specify an explicit mount table, try to use
* /proc/mounts if it is available, else /etc/mtab. We prefer
@@ -541,6 +556,15 @@ _("Only one of the options -n or -y may be specified.\n"));
if (!moveon)
goto out;
+ ismnt = find_mountpoint(mtab, &ctx);
+ if (!ismnt) {
+ fprintf(stderr,
+_("%s: Not an XFS mount point or block device.\n"),
+ ctx.mntpoint);
+ ret |= 8;
+ goto out;
+ }
+
/* How many CPUs? */
nproc = sysconf(_SC_NPROCESSORS_ONLN);
if (nproc < 1)
@@ -569,6 +593,11 @@ _("Only one of the options -n or -y may be specified.\n"));
if (debug_tweak_on("XFS_SCRUB_FORCE_ERROR"))
str_error(&ctx, ctx.mntpoint, _("Injecting error."));
+ /* Clean up scan data. */
+ moveon = xfs_cleanup_fs(&ctx);
+ if (!moveon)
+ ret |= 8;
+
out:
total_errors = ctx.errors_found + ctx.runtime_errors;
if (ctx.need_repair)
@@ -586,13 +615,17 @@ _("%s: %llu errors found.%s\n"),
fprintf(stderr,
_("%s: %llu warnings found.\n"),
ctx.mntpoint, ctx.warnings_found);
- if (ctx.errors_found)
+ if (ctx.errors_found) {
+ if (ctx.error_action == ERRORS_SHUTDOWN)
+ xfs_shutdown_fs(&ctx);
ret |= 1;
+ }
if (ctx.warnings_found)
ret |= 2;
if (ctx.runtime_errors)
ret |= 4;
phase_end(&all_pi, 0);
+ free(ctx.blkdev);
free(ctx.mntpoint);
return ret;
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index 7f1dcb1..2be7c65 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -51,15 +51,38 @@ struct scrub_ctx {
char *mntpoint;
char *blkdev;
+ /* Mountpoint info */
+ struct stat mnt_sb;
+ struct statvfs mnt_sv;
+ struct statfs mnt_sf;
+
+ /* Open block devices */
+ struct disk *datadev;
+ struct disk *logdev;
+ struct disk *rtdev;
+
/* What does the user want us to do? */
enum scrub_mode mode;
/* How does the user want us to react to errors? */
enum error_action error_action;
+ /* fd to filesystem mount point */
+ int mnt_fd;
+
/* Number of threads for metadata scrubbing */
unsigned int nr_io_threads;
+ /* XFS specific geometry */
+ struct xfs_fsop_geom geo;
+ struct fs_path fsinfo;
+ unsigned int agblklog;
+ unsigned int blocklog;
+ unsigned int inodelog;
+ unsigned int inopblog;
+ void *fshandle;
+ size_t fshandle_len;
+
/* Mutable scrub state; use lock. */
pthread_mutex_t lock;
unsigned long long max_errors;
@@ -67,6 +90,12 @@ struct scrub_ctx {
unsigned long long errors_found;
unsigned long long warnings_found;
bool need_repair;
+ bool preen_triggers[XFS_SCRUB_TYPE_NR];
};
+/* Phase helper functions */
+void xfs_shutdown_fs(struct scrub_ctx *ctx);
+bool xfs_cleanup_fs(struct scrub_ctx *ctx);
+bool xfs_setup_fs(struct scrub_ctx *ctx);
+
#endif /* XFS_SCRUB_XFS_SCRUB_H_ */
* [PATCH 08/27] xfs_scrub: add inode iteration functions
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (6 preceding siblings ...)
2018-01-06 1:52 ` [PATCH 07/27] xfs_scrub: find XFS filesystem geometry Darrick J. Wong
@ 2018-01-06 1:52 ` Darrick J. Wong
2018-01-06 1:52 ` [PATCH 09/27] xfs_scrub: add space map " Darrick J. Wong
` (22 subsequent siblings)
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:52 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
These helpers enable userspace to count or iterate all inodes in a
filesystem. The counting function uses INUMBERS, while the inode
iterator uses INUMBERS and BULKSTAT to iterate over every inode that
should be in the filesystem.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
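A small illustration of the INUMBERS/BULKSTAT reconciliation step, for
reviewers who want to poke at the idea without an XFS filesystem handy.
This is only a sketch under stated assumptions, not the code in the
patch: struct fake_bstat, reconcile_chunk() and INODES_PER_CHUNK are
hypothetical names, no ioctl is issued, and got[] stands in for the
inode numbers that a BULKSTAT call might have returned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INODES_PER_CHUNK 64	/* assumed to mirror XFS_INODES_PER_CHUNK */

/* Hypothetical stand-in for the few stat fields this sketch needs. */
struct fake_bstat {
	uint64_t	ino;
	bool		placeholder;	/* true if bulkstat skipped it */
};

/*
 * Walk the allocation mask of one inode chunk and emit exactly one
 * record per allocated inode, in ascending order.  got[] holds the
 * inode numbers a (simulated) bulkstat returned, sorted ascending.
 * Any inode the mask says is allocated but got[] lacks is emitted
 * as a placeholder.  Returns the number of records written to out[],
 * which must have room for INODES_PER_CHUNK entries.
 */
size_t
reconcile_chunk(
	uint64_t		startino,
	uint64_t		allocmask,
	const uint64_t		*got,
	size_t			nr_got,
	struct fake_bstat	*out)
{
	size_t			g = 0;	/* next unconsumed got[] entry */
	size_t			n = 0;	/* records emitted so far */
	int			i;

	assert(nr_got <= INODES_PER_CHUNK);

	for (i = 0; i < INODES_PER_CHUNK; i++) {
		uint64_t	ino = startino + i;

		/* Skip inodes that the mask says are free. */
		if (!(allocmask & (1ULL << i)))
			continue;

		out[n].ino = ino;
		if (g < nr_got && got[g] == ino) {
			/* Bulkstat returned this inode; consume it. */
			out[n].placeholder = false;
			g++;
		} else {
			/* Allocated but missing: inject a placeholder. */
			out[n].placeholder = true;
		}
		n++;
	}
	return n;
}
```

With a mask of 0x7 starting at inode 128 and a simulated bulkstat that
returned only 128 and 130, the sketch emits three records and marks
129 as the placeholder, which is the case the iterator wants to force
the scrub code to look at.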
scrub/Makefile | 2
scrub/inodes.c | 284 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/inodes.h | 32 ++++++
3 files changed, 318 insertions(+)
create mode 100644 scrub/inodes.c
create mode 100644 scrub/inodes.h
diff --git a/scrub/Makefile b/scrub/Makefile
index 5239dae..4d1c908 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -18,11 +18,13 @@ endif # scrub_prereqs
HFILES = \
common.h \
disk.h \
+inodes.h \
xfs_scrub.h
CFILES = \
common.c \
disk.c \
+inodes.c \
phase1.c \
xfs_scrub.c
diff --git a/scrub/inodes.c b/scrub/inodes.c
new file mode 100644
index 0000000..694bca7
--- /dev/null
+++ b/scrub/inodes.c
@@ -0,0 +1,284 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <pthread.h>
+#include <sys/statvfs.h>
+#include "platform_defs.h"
+#include "xfs.h"
+#include "xfs_arch.h"
+#include "xfs_format.h"
+#include "handle.h"
+#include "path.h"
+#include "workqueue.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "inodes.h"
+
+/*
+ * Iterate a range of inodes.
+ *
+ * This is a little more involved than repeatedly asking BULKSTAT for a
+ * buffer's worth of stat data for some number of inodes. We want to
+ * scan as many of the inodes that the inobt thinks exist as we can,
+ * including the broken ones, but if we ask BULKSTAT for n inodes
+ * starting at x, it skips the bad ones and fills from beyond (x + n).
+ *
+ * Therefore, we ask INUMBERS to return one inobt chunk's worth of inode
+ * bitmap information. Then we try to BULKSTAT only the inodes that
+ * were present in that chunk, and compare what we got against what
+ * INUMBERS said was there. If there's a mismatch, we know that we have
+ * an inode that fails the verifiers, so we inject placeholder bulkstat
+ * information to force the scrub code to deal with the broken inodes.
+ *
+ * If the iteration function returns ESTALE, that means that the inode
+ * has been deleted and possibly recreated since the BULKSTAT call. We
+ * will refresh the stat information and try again up to 30 times before
+ * reporting the staleness as an error.
+ */
+
+/*
+ * Call into the filesystem for inode/bulkstat information and call our
+ * iterator function. We'll try to fill the bulkstat information in
+ * batches, but we also can detect iget failures.
+ */
+static bool
+xfs_iterate_inodes_range(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ void *fshandle,
+ uint64_t first_ino,
+ uint64_t last_ino,
+ xfs_inode_iter_fn fn,
+ void *arg)
+{
+ struct xfs_fsop_bulkreq igrpreq = {0};
+ struct xfs_fsop_bulkreq bulkreq = {0};
+ struct xfs_fsop_bulkreq onereq = {0};
+ struct xfs_handle handle;
+ struct xfs_inogrp inogrp;
+ struct xfs_bstat bstat[XFS_INODES_PER_CHUNK] = {0};
+ char idescr[DESCR_BUFSZ];
+ char buf[DESCR_BUFSZ];
+ struct xfs_bstat *bs;
+ __u64 last_stale = first_ino - 1;
+ __u64 igrp_ino;
+ __u64 oneino;
+ __u64 ino;
+ __s32 bulklen = 0;
+ __s32 onelen = 0;
+ __s32 igrplen = 0;
+ bool moveon = true;
+ int i;
+ int error;
+ int stale_count = 0;
+
+ onereq.lastip = &oneino;
+ onereq.icount = 1;
+ onereq.ocount = &onelen;
+
+ bulkreq.lastip = &ino;
+ bulkreq.icount = XFS_INODES_PER_CHUNK;
+ bulkreq.ubuffer = &bstat;
+ bulkreq.ocount = &bulklen;
+
+ igrpreq.lastip = &igrp_ino;
+ igrpreq.icount = 1;
+ igrpreq.ubuffer = &inogrp;
+ igrpreq.ocount = &igrplen;
+
+ memcpy(&handle.ha_fsid, fshandle, sizeof(handle.ha_fsid));
+ handle.ha_fid.fid_len = sizeof(xfs_fid_t) -
+ sizeof(handle.ha_fid.fid_len);
+ handle.ha_fid.fid_pad = 0;
+
+ /* Find the inode chunk & alloc mask */
+ igrp_ino = first_ino;
+ error = ioctl(ctx->mnt_fd, XFS_IOC_FSINUMBERS, &igrpreq);
+ while (!error && igrplen) {
+ /* Load the inodes. */
+ ino = inogrp.xi_startino - 1;
+ bulkreq.icount = inogrp.xi_alloccount;
+ error = ioctl(ctx->mnt_fd, XFS_IOC_FSBULKSTAT, &bulkreq);
+ if (error)
+ str_warn(ctx, descr, "%s", strerror_r(errno,
+ buf, DESCR_BUFSZ));
+
+ /* Did we get exactly the inodes we expected? */
+ for (i = 0, bs = bstat; i < XFS_INODES_PER_CHUNK; i++) {
+ if (!(inogrp.xi_allocmask & (1ULL << i)))
+ continue;
+ if (bs->bs_ino == inogrp.xi_startino + i) {
+ bs++;
+ continue;
+ }
+
+ /* Load the one inode. */
+ oneino = inogrp.xi_startino + i;
+ onereq.ubuffer = bs;
+ error = ioctl(ctx->mnt_fd, XFS_IOC_FSBULKSTAT_SINGLE,
+ &onereq);
+ if (error || bs->bs_ino != inogrp.xi_startino + i) {
+ memset(bs, 0, sizeof(struct xfs_bstat));
+ bs->bs_ino = inogrp.xi_startino + i;
+ bs->bs_blksize = ctx->mnt_sv.f_frsize;
+ }
+ bs++;
+ }
+
+ /* Iterate all the inodes. */
+ for (i = 0, bs = bstat; i < inogrp.xi_alloccount; i++, bs++) {
+ if (bs->bs_ino > last_ino)
+ goto out;
+
+ handle.ha_fid.fid_ino = bs->bs_ino;
+ handle.ha_fid.fid_gen = bs->bs_gen;
+ error = fn(ctx, &handle, bs, arg);
+ switch (error) {
+ case 0:
+ break;
+ case ESTALE:
+ if (last_stale == inogrp.xi_startino)
+ stale_count++;
+ else {
+ last_stale = inogrp.xi_startino;
+ stale_count = 0;
+ }
+ if (stale_count < 30) {
+ igrp_ino = inogrp.xi_startino;
+ goto igrp_retry;
+ }
+ snprintf(idescr, DESCR_BUFSZ, "inode %"PRIu64,
+ (uint64_t)bs->bs_ino);
+ str_warn(ctx, idescr, "%s", strerror_r(error,
+ buf, DESCR_BUFSZ));
+ break;
+ case XFS_ITERATE_INODES_ABORT:
+ error = 0;
+ /* fall thru */
+ default:
+ moveon = false;
+ errno = error;
+ goto err;
+ }
+ if (xfs_scrub_excessive_errors(ctx)) {
+ moveon = false;
+ goto out;
+ }
+ }
+
+igrp_retry:
+ error = ioctl(ctx->mnt_fd, XFS_IOC_FSINUMBERS, &igrpreq);
+ }
+
+err:
+ if (error) {
+ str_errno(ctx, descr);
+ moveon = false;
+ }
+out:
+ return moveon;
+}
+
+/* BULKSTAT wrapper routines. */
+struct xfs_scan_inodes {
+ xfs_inode_iter_fn fn;
+ void *arg;
+ bool moveon;
+};
+
+/* Scan all the inodes in an AG. */
+static void
+xfs_scan_ag_inodes(
+ struct workqueue *wq,
+ xfs_agnumber_t agno,
+ void *arg)
+{
+ struct xfs_scan_inodes *si = arg;
+ struct scrub_ctx *ctx = (struct scrub_ctx *)wq->wq_ctx;
+ char descr[DESCR_BUFSZ];
+ uint64_t ag_ino;
+ uint64_t next_ag_ino;
+ bool moveon;
+
+ snprintf(descr, DESCR_BUFSZ, _("dev %d:%d AG %u inodes"),
+ major(ctx->fsinfo.fs_datadev),
+ minor(ctx->fsinfo.fs_datadev),
+ agno);
+
+ ag_ino = (__u64)agno << (ctx->inopblog + ctx->agblklog);
+ next_ag_ino = (__u64)(agno + 1) << (ctx->inopblog + ctx->agblklog);
+
+ moveon = xfs_iterate_inodes_range(ctx, descr, ctx->fshandle, ag_ino,
+ next_ag_ino - 1, si->fn, si->arg);
+ if (!moveon)
+ si->moveon = false;
+}
+
+/* Scan all the inodes in a filesystem. */
+bool
+xfs_scan_all_inodes(
+ struct scrub_ctx *ctx,
+ xfs_inode_iter_fn fn,
+ void *arg)
+{
+ struct xfs_scan_inodes si;
+ xfs_agnumber_t agno;
+ struct workqueue wq;
+ int ret;
+
+ si.moveon = true;
+ si.fn = fn;
+ si.arg = arg;
+
+ ret = workqueue_create(&wq, (struct xfs_mount *)ctx,
+ scrub_nproc_workqueue(ctx));
+ if (ret) {
+ str_error(ctx, ctx->mntpoint, _("Could not create workqueue."));
+ return false;
+ }
+
+ for (agno = 0; agno < ctx->geo.agcount; agno++) {
+ ret = workqueue_add(&wq, xfs_scan_ag_inodes, agno, &si);
+ if (ret) {
+ si.moveon = false;
+ str_error(ctx, ctx->mntpoint,
+_("Could not queue AG %u bulkstat work."), agno);
+ break;
+ }
+ }
+
+ workqueue_destroy(&wq);
+
+ return si.moveon;
+}
+
+/*
+ * Open a file by handle, or return a negative error code.
+ */
+int
+xfs_open_handle(
+ struct xfs_handle *handle)
+{
+ return open_by_fshandle(handle, sizeof(*handle),
+ O_RDONLY | O_NOATIME | O_NOFOLLOW | O_NOCTTY);
+}
diff --git a/scrub/inodes.h b/scrub/inodes.h
new file mode 100644
index 0000000..693cb05
--- /dev/null
+++ b/scrub/inodes.h
@@ -0,0 +1,32 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_INODES_H_
+#define XFS_SCRUB_INODES_H_
+
+typedef int (*xfs_inode_iter_fn)(struct scrub_ctx *ctx,
+ struct xfs_handle *handle, struct xfs_bstat *bs, void *arg);
+
+#define XFS_ITERATE_INODES_ABORT (-1)
+bool xfs_scan_all_inodes(struct scrub_ctx *ctx, xfs_inode_iter_fn fn,
+ void *arg);
+
+int xfs_open_handle(struct xfs_handle *handle);
+
+#endif /* XFS_SCRUB_INODES_H_ */
* [PATCH 09/27] xfs_scrub: add space map iteration functions
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (7 preceding siblings ...)
2018-01-06 1:52 ` [PATCH 08/27] xfs_scrub: add inode iteration functions Darrick J. Wong
@ 2018-01-06 1:52 ` Darrick J. Wong
2018-01-06 1:52 ` [PATCH 10/27] xfs_scrub: add file " Darrick J. Wong
` (21 subsequent siblings)
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:52 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
These helpers enable userspace to iterate all the space map information
in a filesystem. The iteration function uses GETFSMAP.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
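For anyone checking the per-AG key arithmetic, here is a minimal sketch
of how the low and high physical byte offsets for one AG can be
derived from the geometry. It is an illustration only: struct
byte_range and ag_byte_range() are hypothetical names that do not
appear in the patch, and the sketch fills in neither a struct fsmap
nor calls GETFSMAP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of inclusive byte offsets on the data device. */
struct byte_range {
	uint64_t	low;
	uint64_t	high;
};

/*
 * Compute the inclusive physical byte range covered by one AG.
 * The multiplications are widened to 64 bits first so that large
 * filesystems do not overflow 32-bit arithmetic.
 */
struct byte_range
ag_byte_range(
	uint32_t		agno,
	uint32_t		agblocks,
	uint32_t		blocksize)
{
	uint64_t		bperag = (uint64_t)agblocks * blocksize;
	struct byte_range	r;

	assert(bperag != 0);

	r.low = agno * bperag;
	r.high = (agno + 1ULL) * bperag - 1;
	return r;
}
```

Adjacent AGs produce adjacent ranges, so handing each AG's range to
its own work item should cover the data device without gaps or
overlap, assuming every AG is agblocks long (the last AG may be
shorter, in which case the high key simply runs past the end).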
scrub/Makefile | 2
scrub/spacemap.c | 256 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/spacemap.h | 31 +++++++
3 files changed, 289 insertions(+)
create mode 100644 scrub/spacemap.c
create mode 100644 scrub/spacemap.h
diff --git a/scrub/Makefile b/scrub/Makefile
index 4d1c908..24e0c44 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -19,6 +19,7 @@ HFILES = \
common.h \
disk.h \
inodes.h \
+spacemap.h \
xfs_scrub.h
CFILES = \
@@ -26,6 +27,7 @@ common.c \
disk.c \
inodes.c \
phase1.c \
+spacemap.c \
xfs_scrub.c
LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD)
diff --git a/scrub/spacemap.c b/scrub/spacemap.c
new file mode 100644
index 0000000..2dc6e2b
--- /dev/null
+++ b/scrub/spacemap.c
@@ -0,0 +1,256 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <string.h>
+#include <pthread.h>
+#include <sys/statvfs.h>
+#include "workqueue.h"
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "path.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "spacemap.h"
+
+/*
+ * Filesystem space map iterators.
+ *
+ * Logically, we call GETFSMAP to fetch a set of space map records and
+ * call a function to iterate over the records. However, that's not
+ * what actually happens -- the work is split into separate items, with
+ * each AG, the realtime device, and the log device getting their own
+ * work items. For an XFS with a realtime device and an external log,
+ * this means that we can have up to ($agcount + 2) threads running at
+ * once.
+ *
+ * This comes into play if we want to have per-workitem memory. Maybe.
+ * XXX: do we really need all that ?
+ */
+
+#define FSMAP_NR 65536
+
+/* Iterate all the fs block mappings between the two keys. */
+bool
+xfs_iterate_fsmap(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ struct fsmap *keys,
+ xfs_fsmap_iter_fn fn,
+ void *arg)
+{
+ struct fsmap_head *head;
+ struct fsmap *p;
+ bool moveon = true;
+ int i;
+ int error;
+
+ head = malloc(fsmap_sizeof(FSMAP_NR));
+ if (!head) {
+ str_errno(ctx, descr);
+ return false;
+ }
+
+ memset(head, 0, sizeof(*head));
+ memcpy(head->fmh_keys, keys, sizeof(struct fsmap) * 2);
+ head->fmh_count = FSMAP_NR;
+
+ while ((error = ioctl(ctx->mnt_fd, FS_IOC_GETFSMAP, head)) == 0) {
+ for (i = 0, p = head->fmh_recs;
+ i < head->fmh_entries;
+ i++, p++) {
+ moveon = fn(ctx, descr, p, arg);
+ if (!moveon)
+ goto out;
+ if (xfs_scrub_excessive_errors(ctx)) {
+ moveon = false;
+ goto out;
+ }
+ }
+
+ if (head->fmh_entries == 0)
+ break;
+ p = &head->fmh_recs[head->fmh_entries - 1];
+ if (p->fmr_flags & FMR_OF_LAST)
+ break;
+ fsmap_advance(head);
+ }
+
+ if (error) {
+ str_errno(ctx, descr);
+ moveon = false;
+ }
+out:
+ free(head);
+ return moveon;
+}
+
+/* GETFSMAP wrapper routines. */
+struct xfs_scan_blocks {
+ xfs_fsmap_iter_fn fn;
+ void *arg;
+ bool moveon;
+};
+
+/* Iterate all the reverse mappings of an AG. */
+static void
+xfs_scan_ag_blocks(
+ struct workqueue *wq,
+ xfs_agnumber_t agno,
+ void *arg)
+{
+ struct scrub_ctx *ctx = (struct scrub_ctx *)wq->wq_ctx;
+ struct xfs_scan_blocks *sbx = arg;
+ char descr[DESCR_BUFSZ];
+ struct fsmap keys[2];
+ off64_t bperag;
+ bool moveon;
+
+ bperag = (off64_t)ctx->geo.agblocks *
+ (off64_t)ctx->geo.blocksize;
+
+ snprintf(descr, DESCR_BUFSZ, _("dev %d:%d AG %u fsmap"),
+ major(ctx->fsinfo.fs_datadev),
+ minor(ctx->fsinfo.fs_datadev),
+ agno);
+
+ memset(keys, 0, sizeof(struct fsmap) * 2);
+ keys->fmr_device = ctx->fsinfo.fs_datadev;
+ keys->fmr_physical = agno * bperag;
+ (keys + 1)->fmr_device = ctx->fsinfo.fs_datadev;
+ (keys + 1)->fmr_physical = ((agno + 1) * bperag) - 1;
+ (keys + 1)->fmr_owner = ULLONG_MAX;
+ (keys + 1)->fmr_offset = ULLONG_MAX;
+ (keys + 1)->fmr_flags = UINT_MAX;
+
+ moveon = xfs_iterate_fsmap(ctx, descr, keys, sbx->fn, sbx->arg);
+ if (!moveon)
+ sbx->moveon = false;
+}
+
+/* Iterate all the reverse mappings of a standalone device. */
+static void
+xfs_scan_dev_blocks(
+ struct scrub_ctx *ctx,
+ int idx,
+ dev_t dev,
+ struct xfs_scan_blocks *sbx)
+{
+ struct fsmap keys[2];
+ char descr[DESCR_BUFSZ];
+ bool moveon;
+
+ snprintf(descr, DESCR_BUFSZ, _("dev %d:%d fsmap"),
+ major(dev), minor(dev));
+
+ memset(keys, 0, sizeof(struct fsmap) * 2);
+ keys->fmr_device = dev;
+ (keys + 1)->fmr_device = dev;
+ (keys + 1)->fmr_physical = ULLONG_MAX;
+ (keys + 1)->fmr_owner = ULLONG_MAX;
+ (keys + 1)->fmr_offset = ULLONG_MAX;
+ (keys + 1)->fmr_flags = UINT_MAX;
+
+ moveon = xfs_iterate_fsmap(ctx, descr, keys, sbx->fn, sbx->arg);
+ if (!moveon)
+ sbx->moveon = false;
+}
+
+/* Iterate all the reverse mappings of the realtime device. */
+static void
+xfs_scan_rt_blocks(
+ struct workqueue *wq,
+ xfs_agnumber_t agno,
+ void *arg)
+{
+ struct scrub_ctx *ctx = (struct scrub_ctx *)wq->wq_ctx;
+
+ xfs_scan_dev_blocks(ctx, agno, ctx->fsinfo.fs_rtdev, arg);
+}
+
+/* Iterate all the reverse mappings of the log device. */
+static void
+xfs_scan_log_blocks(
+ struct workqueue *wq,
+ xfs_agnumber_t agno,
+ void *arg)
+{
+ struct scrub_ctx *ctx = (struct scrub_ctx *)wq->wq_ctx;
+
+ xfs_scan_dev_blocks(ctx, agno, ctx->fsinfo.fs_logdev, arg);
+}
+
+/* Scan all the blocks in a filesystem. */
+bool
+xfs_scan_all_spacemaps(
+ struct scrub_ctx *ctx,
+ xfs_fsmap_iter_fn fn,
+ void *arg)
+{
+ struct workqueue wq;
+ struct xfs_scan_blocks sbx;
+ xfs_agnumber_t agno;
+ int ret;
+
+ sbx.moveon = true;
+ sbx.fn = fn;
+ sbx.arg = arg;
+
+ ret = workqueue_create(&wq, (struct xfs_mount *)ctx,
+ scrub_nproc_workqueue(ctx));
+ if (ret) {
+ str_error(ctx, ctx->mntpoint, _("Could not create workqueue."));
+ return false;
+ }
+ if (ctx->fsinfo.fs_rt) {
+ ret = workqueue_add(&wq, xfs_scan_rt_blocks,
+ ctx->geo.agcount + 1, &sbx);
+ if (ret) {
+ sbx.moveon = false;
+ str_error(ctx, ctx->mntpoint,
+_("Could not queue rtdev fsmap work."));
+ goto out;
+ }
+ }
+ if (ctx->fsinfo.fs_log) {
+ ret = workqueue_add(&wq, xfs_scan_log_blocks,
+ ctx->geo.agcount + 2, &sbx);
+ if (ret) {
+ sbx.moveon = false;
+ str_error(ctx, ctx->mntpoint,
+_("Could not queue logdev fsmap work."));
+ goto out;
+ }
+ }
+ for (agno = 0; agno < ctx->geo.agcount; agno++) {
+ ret = workqueue_add(&wq, xfs_scan_ag_blocks, agno, &sbx);
+ if (ret) {
+ sbx.moveon = false;
+ str_error(ctx, ctx->mntpoint,
+_("Could not queue AG %u fsmap work."), agno);
+ break;
+ }
+ }
+out:
+ workqueue_destroy(&wq);
+
+ return sbx.moveon;
+}
diff --git a/scrub/spacemap.h b/scrub/spacemap.h
new file mode 100644
index 0000000..9ee46f7
--- /dev/null
+++ b/scrub/spacemap.h
@@ -0,0 +1,31 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_SPACEMAP_H_
+#define XFS_SCRUB_SPACEMAP_H_
+
+typedef bool (*xfs_fsmap_iter_fn)(struct scrub_ctx *ctx, const char *descr,
+ struct fsmap *fsr, void *arg);
+
+bool xfs_iterate_fsmap(struct scrub_ctx *ctx, const char *descr,
+ struct fsmap *keys, xfs_fsmap_iter_fn fn, void *arg);
+bool xfs_scan_all_spacemaps(struct scrub_ctx *ctx, xfs_fsmap_iter_fn fn,
+ void *arg);
+
+#endif /* XFS_SCRUB_SPACEMAP_H_ */
* [PATCH 10/27] xfs_scrub: add file space map iteration functions
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (8 preceding siblings ...)
2018-01-06 1:52 ` [PATCH 09/27] xfs_scrub: add space map " Darrick J. Wong
@ 2018-01-06 1:52 ` Darrick J. Wong
2018-01-11 23:19 ` Eric Sandeen
2018-01-06 1:52 ` [PATCH 11/27] xfs_scrub: filesystem counter collection functions Darrick J. Wong
` (20 subsequent siblings)
30 siblings, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:52 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
These helpers enable userspace to iterate all the space map information
for a file. The iteration function uses GETBMAPX.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
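GETBMAPX works in 512-byte basic blocks while the iterator's callers
work in bytes, so the loop converts in both directions and then
advances the query past the last mapping returned. The sketch below
models just that arithmetic. It is an assumption-laden illustration:
struct bb_query, bytes_to_bb(), bb_to_bytes() and advance_query() are
hypothetical names, BB_SHIFT is assumed to equal the XFS BBSHIFT value
of 9, and no ioctl is issued.

```c
#include <assert.h>
#include <stdint.h>

#define BB_SHIFT	9	/* 512-byte basic blocks (assumed) */
#define BB_SIZE		(1ULL << BB_SHIFT)

/* Hypothetical query cursor; both fields are in basic blocks. */
struct bb_query {
	int64_t		offset;
	int64_t		length;
};

/* Bytes to basic blocks, rounding up to a whole block. */
uint64_t
bytes_to_bb(
	uint64_t	bytes)
{
	return (bytes + BB_SIZE - 1) >> BB_SHIFT;
}

/* Basic blocks to bytes. */
uint64_t
bb_to_bytes(
	uint64_t	bb)
{
	return bb << BB_SHIFT;
}

/*
 * Move the query so that it starts right after the last mapping
 * returned (last_off/last_len, in basic blocks) and shrink the
 * remaining length by the distance covered.
 */
void
advance_query(
	struct bb_query	*q,
	int64_t		last_off,
	int64_t		last_len)
{
	int64_t		new_off = last_off + last_len;

	assert(new_off >= q->offset);

	q->length -= new_off - q->offset;
	q->offset = new_off;
}
```

For example, a query covering blocks 0 through 99 whose last returned
mapping was 16 blocks at offset 8 resumes at offset 24 with 76 blocks
left to examine.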
scrub/Makefile | 2 +
scrub/filemap.c | 158 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/filemap.h | 39 ++++++++++++++
3 files changed, 199 insertions(+)
create mode 100644 scrub/filemap.c
create mode 100644 scrub/filemap.h
diff --git a/scrub/Makefile b/scrub/Makefile
index 24e0c44..a3534e6 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -18,6 +18,7 @@ endif # scrub_prereqs
HFILES = \
common.h \
disk.h \
+filemap.h \
inodes.h \
spacemap.h \
xfs_scrub.h
@@ -25,6 +26,7 @@ xfs_scrub.h
CFILES = \
common.c \
disk.c \
+filemap.c \
inodes.c \
phase1.c \
spacemap.c \
diff --git a/scrub/filemap.c b/scrub/filemap.c
new file mode 100644
index 0000000..1c3c1cc
--- /dev/null
+++ b/scrub/filemap.c
@@ -0,0 +1,158 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/statvfs.h>
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "path.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "filemap.h"
+
+/*
+ * These routines provide a simple interface to query the block
+ * mappings of the fork of a given inode via GETBMAPX and call a
+ * function to iterate each mapping result.
+ */
+
+#define BMAP_NR 2048
+
+/* Iterate all the extent block mappings between the key and fork end. */
+bool
+xfs_iterate_filemaps(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ int fd,
+ int whichfork,
+ struct xfs_bmap *key,
+ xfs_bmap_iter_fn fn,
+ void *arg)
+{
+ struct fsxattr fsx;
+ struct getbmapx *map;
+ struct getbmapx *p;
+ struct xfs_bmap bmap;
+ char bmap_descr[DESCR_BUFSZ];
+ bool moveon = true;
+ xfs_off_t new_off;
+ int getxattr_type;
+ int i;
+ int error;
+
+ switch (whichfork) {
+ case XFS_ATTR_FORK:
+ snprintf(bmap_descr, DESCR_BUFSZ, _("%s attr"), descr);
+ break;
+ case XFS_COW_FORK:
+ snprintf(bmap_descr, DESCR_BUFSZ, _("%s CoW"), descr);
+ break;
+ case XFS_DATA_FORK:
+ snprintf(bmap_descr, DESCR_BUFSZ, _("%s data"), descr);
+ break;
+ default:
+ abort();
+ }
+
+ map = calloc(BMAP_NR, sizeof(struct getbmapx));
+ if (!map) {
+ str_errno(ctx, bmap_descr);
+ return false;
+ }
+
+ map->bmv_offset = BTOBB(key->bm_offset);
+ map->bmv_block = BTOBB(key->bm_physical);
+ if (key->bm_length == 0)
+ map->bmv_length = ULLONG_MAX;
+ else
+ map->bmv_length = BTOBB(key->bm_length);
+ map->bmv_count = BMAP_NR;
+ map->bmv_iflags = BMV_IF_NO_DMAPI_READ | BMV_IF_PREALLOC |
+ BMV_IF_NO_HOLES;
+ switch (whichfork) {
+ case XFS_ATTR_FORK:
+ getxattr_type = XFS_IOC_FSGETXATTRA;
+ map->bmv_iflags |= BMV_IF_ATTRFORK;
+ break;
+ case XFS_COW_FORK:
+ map->bmv_iflags |= BMV_IF_COWFORK;
+ getxattr_type = FS_IOC_FSGETXATTR;
+ break;
+ case XFS_DATA_FORK:
+ getxattr_type = FS_IOC_FSGETXATTR;
+ break;
+ default:
+ abort();
+ }
+
+ error = ioctl(fd, getxattr_type, &fsx);
+ if (error < 0) {
+ str_errno(ctx, bmap_descr);
+ moveon = false;
+ goto out;
+ }
+
+ while ((error = ioctl(fd, XFS_IOC_GETBMAPX, map)) == 0) {
+ for (i = 0, p = &map[i + 1]; i < map->bmv_entries; i++, p++) {
+ bmap.bm_offset = BBTOB(p->bmv_offset);
+ bmap.bm_physical = BBTOB(p->bmv_block);
+ bmap.bm_length = BBTOB(p->bmv_length);
+ bmap.bm_flags = p->bmv_oflags;
+ moveon = fn(ctx, bmap_descr, fd, whichfork, &fsx,
+ &bmap, arg);
+ if (!moveon)
+ goto out;
+ if (xfs_scrub_excessive_errors(ctx)) {
+ moveon = false;
+ goto out;
+ }
+ }
+
+ if (map->bmv_entries == 0)
+ break;
+ p = map + map->bmv_entries;
+ if (p->bmv_oflags & BMV_OF_LAST)
+ break;
+
+ new_off = p->bmv_offset + p->bmv_length;
+ map->bmv_length -= new_off - map->bmv_offset;
+ map->bmv_offset = new_off;
+ }
+
+ /*
+ * Pre-reflink filesystems don't know about CoW forks, so don't
+ * be too surprised if it fails.
+ */
+ if (whichfork == XFS_COW_FORK && error && errno == EINVAL)
+ error = 0;
+
+ if (error)
+ str_errno(ctx, bmap_descr);
+out:
+ memcpy(key, map, sizeof(struct getbmapx));
+ free(map);
+ return moveon;
+}
diff --git a/scrub/filemap.h b/scrub/filemap.h
new file mode 100644
index 0000000..30d53d0
--- /dev/null
+++ b/scrub/filemap.h
@@ -0,0 +1,39 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_FILEMAP_H_
+#define XFS_SCRUB_FILEMAP_H_
+
+/* inode fork block mapping */
+struct xfs_bmap {
+ uint64_t bm_offset; /* file offset of segment in bytes */
+ uint64_t bm_physical; /* physical starting byte */
+ uint64_t bm_length; /* length of segment, bytes */
+ uint32_t bm_flags; /* output flags */
+};
+
+typedef bool (*xfs_bmap_iter_fn)(struct scrub_ctx *ctx, const char *descr,
+ int fd, int whichfork, struct fsxattr *fsx,
+ struct xfs_bmap *bmap, void *arg);
+
+bool xfs_iterate_filemaps(struct scrub_ctx *ctx, const char *descr, int fd,
+ int whichfork, struct xfs_bmap *key, xfs_bmap_iter_fn fn,
+ void *arg);
+
+#endif /* XFS_SCRUB_FILEMAP_H_ */
* [PATCH 11/27] xfs_scrub: filesystem counter collection functions
From: Darrick J. Wong @ 2018-01-06 1:52 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Add helper functions to count the inodes in the filesystem (one AG at a
time, via INUMBERS) and to estimate the data, realtime, and inode counters.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
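A note on the per-AG inode ranges used by the counting helper: each AG's
range is found by shifting the AG number past the inode-per-block and
block-per-AG bits. The sketch below is only an illustration of that
arithmetic. The function names and INODES_PER_CHUNK are stand-ins invented
for this note (the patch itself uses ctx->inopblog, ctx->agblklog and
XFS_INODES_PER_CHUNK), and the value 64 is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define INODES_PER_CHUNK 64	/* assumed stand-in for XFS_INODES_PER_CHUNK */

/* First inode number that can belong to allocation group agno. */
static uint64_t
ag_first_ino(uint32_t agno, unsigned int inopblog, unsigned int agblklog)
{
	return (uint64_t)agno << (inopblog + agblklog);
}

/* Last inode number that can belong to allocation group agno. */
static uint64_t
ag_last_ino(uint32_t agno, unsigned int inopblog, unsigned int agblklog)
{
	uint64_t	first = ag_first_ino(agno, inopblog, agblklog);
	uint64_t	last;

	/* One less than the first inode of the following AG. */
	last = (((uint64_t)agno + 1) << (inopblog + agblklog)) - 1;

	/* Same alignment expectations as the range counter's ASSERTs. */
	assert(!(first & (INODES_PER_CHUNK - 1)));
	assert(last & (INODES_PER_CHUNK - 1));
	return last;
}
```

With inopblog = 3 and agblklog = 20, AG 1 would cover inode numbers
8388608 through 16777215 under these assumptions.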
scrub/Makefile | 2
scrub/fscounters.c | 212 ++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/fscounters.h | 29 +++++++
3 files changed, 243 insertions(+)
create mode 100644 scrub/fscounters.c
create mode 100644 scrub/fscounters.h
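For reviewers, the block estimate boils down to two adjustments on top of
the statvfs numbers: the log blocks are added back to the total when
logstart is nonzero, and the still-available reserved blocks are added back
to the free count. A minimal sketch of just that arithmetic follows. The
struct and function names are hypothetical and exist only for this note;
the patch reads the inputs from fstatvfs(), the cached geometry, and
XFS_IOC_GET_RESBLKS.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bundle of the inputs the estimate needs. */
struct est_input {
	uint64_t	f_blocks;	/* statvfs total blocks */
	uint64_t	f_bfree;	/* statvfs free blocks */
	uint64_t	logstart;	/* geometry logstart; 0 skips the log */
	uint64_t	logblocks;	/* size of the log in blocks */
	uint64_t	resblks_avail;	/* reserved blocks still available */
};

/* Total data device blocks, with the log added back when logstart != 0. */
static uint64_t
est_data_blocks(const struct est_input *in)
{
	assert(in);
	return in->f_blocks + (in->logstart ? in->logblocks : 0);
}

/* Free data blocks, with the reserve pool added back. */
static uint64_t
est_data_free(const struct est_input *in)
{
	assert(in);
	return in->f_bfree + in->resblks_avail;
}
```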
diff --git a/scrub/Makefile b/scrub/Makefile
index a3534e6..5397339 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -19,6 +19,7 @@ HFILES = \
common.h \
disk.h \
filemap.h \
+fscounters.h \
inodes.h \
spacemap.h \
xfs_scrub.h
@@ -27,6 +28,7 @@ CFILES = \
common.c \
disk.c \
filemap.c \
+fscounters.c \
inodes.c \
phase1.c \
spacemap.c \
diff --git a/scrub/fscounters.c b/scrub/fscounters.c
new file mode 100644
index 0000000..4294bf3
--- /dev/null
+++ b/scrub/fscounters.c
@@ -0,0 +1,212 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <sys/statvfs.h>
+#include "platform_defs.h"
+#include "xfs.h"
+#include "xfs_arch.h"
+#include "xfs_fs.h"
+#include "xfs_format.h"
+#include "path.h"
+#include "workqueue.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "fscounters.h"
+
+/*
+ * Filesystem counter collection routines. We can count the number of
+ * inodes in the filesystem, and we can estimate the block counters.
+ */
+
+/* Count the number of inodes in the filesystem. */
+
+/* INUMBERS wrapper routines. */
+struct xfs_count_inodes {
+ bool moveon;
+ uint64_t counters[0];
+};
+
+/*
+ * Count the number of inodes. Use INUMBERS to figure out how many inodes
+ * exist in the filesystem, assuming we've already scrubbed that.
+ */
+static bool
+xfs_count_inodes_range(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ uint64_t first_ino,
+ uint64_t last_ino,
+ uint64_t *count)
+{
+ struct xfs_fsop_bulkreq igrpreq = {0};
+ struct xfs_inogrp inogrp;
+ __u64 igrp_ino;
+ uint64_t nr = 0;
+ __s32 igrplen = 0;
+ int error;
+
+ ASSERT(!(first_ino & (XFS_INODES_PER_CHUNK - 1)));
+ ASSERT((last_ino & (XFS_INODES_PER_CHUNK - 1)));
+
+ igrpreq.lastip = &igrp_ino;
+ igrpreq.icount = 1;
+ igrpreq.ubuffer = &inogrp;
+ igrpreq.ocount = &igrplen;
+
+ igrp_ino = first_ino;
+ error = ioctl(ctx->mnt_fd, XFS_IOC_FSINUMBERS, &igrpreq);
+ while (!error && igrplen && inogrp.xi_startino < last_ino) {
+ nr += inogrp.xi_alloccount;
+ error = ioctl(ctx->mnt_fd, XFS_IOC_FSINUMBERS, &igrpreq);
+ }
+
+ if (error) {
+ str_errno(ctx, descr);
+ return false;
+ }
+
+ *count = nr;
+ return true;
+}
+
+/* Scan all the inodes in an AG. */
+static void
+xfs_count_ag_inodes(
+ struct workqueue *wq,
+ xfs_agnumber_t agno,
+ void *arg)
+{
+ struct xfs_count_inodes *ci = arg;
+ struct scrub_ctx *ctx = (struct scrub_ctx *)wq->wq_ctx;
+ char descr[DESCR_BUFSZ];
+ uint64_t ag_ino;
+ uint64_t next_ag_ino;
+ bool moveon;
+
+ snprintf(descr, DESCR_BUFSZ, _("dev %d:%d AG %u inodes"),
+ major(ctx->fsinfo.fs_datadev),
+ minor(ctx->fsinfo.fs_datadev),
+ agno);
+
+ ag_ino = (__u64)agno << (ctx->inopblog + ctx->agblklog);
+ next_ag_ino = (__u64)(agno + 1) << (ctx->inopblog + ctx->agblklog);
+
+ moveon = xfs_count_inodes_range(ctx, descr, ag_ino, next_ag_ino - 1,
+ &ci->counters[agno]);
+ if (!moveon)
+ ci->moveon = false;
+}
+
+/* Count all the inodes in a filesystem. */
+bool
+xfs_count_all_inodes(
+ struct scrub_ctx *ctx,
+ uint64_t *count)
+{
+ struct xfs_count_inodes *ci;
+ xfs_agnumber_t agno;
+ struct workqueue wq;
+ bool moveon;
+ int ret;
+
+ ci = calloc(1, sizeof(struct xfs_count_inodes) +
+ (ctx->geo.agcount * sizeof(uint64_t)));
+ if (!ci)
+ return false;
+ ci->moveon = true;
+
+ ret = workqueue_create(&wq, (struct xfs_mount *)ctx,
+ scrub_nproc_workqueue(ctx));
+ if (ret) {
+ moveon = false;
+ str_error(ctx, ctx->mntpoint, _("Could not create workqueue."));
+ goto out_free;
+ }
+ for (agno = 0; agno < ctx->geo.agcount; agno++) {
+ ret = workqueue_add(&wq, xfs_count_ag_inodes, agno, ci);
+ if (ret) {
+ ci->moveon = false;
+ str_error(ctx, ctx->mntpoint,
+_("Could not queue AG %u icount work."), agno);
+ break;
+ }
+ }
+ workqueue_destroy(&wq);
+
+ for (agno = 0; agno < ctx->geo.agcount; agno++)
+ *count += ci->counters[agno];
+ moveon = ci->moveon;
+
+out_free:
+ free(ci);
+ return moveon;
+}
+
+/* Estimate the number of blocks and inodes in the filesystem. */
+bool
+xfs_scan_estimate_blocks(
+ struct scrub_ctx *ctx,
+ unsigned long long *d_blocks,
+ unsigned long long *d_bfree,
+ unsigned long long *r_blocks,
+ unsigned long long *r_bfree,
+ unsigned long long *f_files,
+ unsigned long long *f_free)
+{
+ struct xfs_fsop_counts fc;
+ struct xfs_fsop_resblks rb = {0};
+ struct statvfs sfs;
+ int error;
+
+ /* Grab the fstatvfs counters, since it has to report accurately. */
+ error = fstatvfs(ctx->mnt_fd, &sfs);
+ if (error) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+
+ /* Fetch the filesystem counters. */
+ error = ioctl(ctx->mnt_fd, XFS_IOC_FSCOUNTS, &fc);
+ if (error) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+
+ /*
+ * XFS reserves some blocks to prevent hard ENOSPC, so add those
+ * blocks back to the free data counts.
+ */
+ error = ioctl(ctx->mnt_fd, XFS_IOC_GET_RESBLKS, &rb);
+ if (error)
+ str_errno(ctx, ctx->mntpoint);
+ sfs.f_bfree += rb.resblks_avail;
+
+ *d_blocks = sfs.f_blocks + (ctx->geo.logstart ? ctx->geo.logblocks : 0);
+ *d_bfree = sfs.f_bfree;
+ *r_blocks = ctx->geo.rtblocks;
+ *r_bfree = fc.freertx;
+ *f_files = sfs.f_files;
+ *f_free = sfs.f_ffree;
+
+ return true;
+}
diff --git a/scrub/fscounters.h b/scrub/fscounters.h
new file mode 100644
index 0000000..40a4c05
--- /dev/null
+++ b/scrub/fscounters.h
@@ -0,0 +1,29 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_FSCOUNTERS_H_
+#define XFS_SCRUB_FSCOUNTERS_H_
+
+bool xfs_scan_estimate_blocks(struct scrub_ctx *ctx,
+ unsigned long long *d_blocks, unsigned long long *d_bfree,
+ unsigned long long *r_blocks, unsigned long long *r_bfree,
+ unsigned long long *f_files, unsigned long long *f_free);
+bool xfs_count_all_inodes(struct scrub_ctx *ctx, uint64_t *count);
+
+#endif /* XFS_SCRUB_FSCOUNTERS_H_ */
* [PATCH 12/27] xfs_scrub: wrap the scrub ioctl
From: Darrick J. Wong @ 2018-01-06 1:52 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
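Create wrappers around the scrub ioctl (XFS_IOC_SCRUB_METADATA), probe for
kernel scrub support during phase 1, and add a background_sleep() helper to
throttle scrub activity.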
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
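To make the outcome handling easier to review: after a successful ioctl,
the check wrapper turns the output flags plus the program mode into either
"done" or "schedule a repair". The sketch below restates only that decision.
The flag values, enum names and classify() are stand-ins made up for this
note (the real XFS_SCRUB_OFLAG_* bits come from xfs_fs.h), and it assumes
the modes are ordered dry-run < preen < repair, as the comparisons in the
patch imply.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in flag bits; not the real XFS_SCRUB_OFLAG_* values. */
#define OFLAG_CORRUPT	(1u << 0)
#define OFLAG_PREEN	(1u << 1)
#define OFLAG_XCORRUPT	(1u << 2)

/* Assumed ordering: each mode permits everything the previous one does. */
enum mode { MODE_DRY_RUN, MODE_PREEN, MODE_REPAIR };
enum outcome { OUT_DONE, OUT_REPAIR };

static enum outcome
classify(uint32_t flags, enum mode mode)
{
	assert(mode <= MODE_REPAIR);

	/* Corruption or a cross-reference disagreement needs a real repair. */
	if (flags & (OFLAG_CORRUPT | OFLAG_XCORRUPT))
		return mode < MODE_REPAIR ? OUT_DONE : OUT_REPAIR;

	/* Merely unoptimized metadata only needs preening. */
	if (flags & OFLAG_PREEN)
		return mode < MODE_PREEN ? OUT_DONE : OUT_REPAIR;

	return OUT_DONE;
}
```

Note that corruption takes priority: metadata flagged both corrupt and
preenable is not scheduled for repair in preen mode.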
scrub/Makefile | 2
scrub/common.c | 19 ++
scrub/common.h | 1
scrub/phase1.c | 8 +
scrub/scrub.c | 620 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/scrub.h | 62 ++++++
6 files changed, 712 insertions(+)
create mode 100644 scrub/scrub.c
create mode 100644 scrub/scrub.h
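On the throttle: the intent stated in the comment is 100ms of sleep for
each -b past the first. Since nanosleep() takes a struct timespec, the
delay has to be split into whole seconds and leftover nanoseconds. The
following is a sketch of that conversion under those assumptions;
throttle_delay() and the NS_PER_* macros are names invented for this note
and are not part of the patch.

```c
#include <assert.h>
#include <time.h>

#define NS_PER_MS	1000000LL
#define NS_PER_S	1000000000LL

/* Delay for a given number of -b options: 100ms per -b past the first. */
static struct timespec
throttle_delay(int bg_mode)
{
	struct timespec	tv = { 0 };
	long long	ns;

	if (bg_mode < 2)
		return tv;

	ns = 100 * NS_PER_MS * (bg_mode - 1);
	tv.tv_sec = ns / NS_PER_S;
	tv.tv_nsec = ns % NS_PER_S;

	/* nanosleep() rejects tv_nsec outside [0, 1e9). */
	assert(tv.tv_nsec >= 0 && tv.tv_nsec < NS_PER_S);
	return tv;
}
```

A caller would pass the address of the result to nanosleep(); with twelve
-b options this works out to 1 second plus 100ms.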
diff --git a/scrub/Makefile b/scrub/Makefile
index 5397339..915b801 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -21,6 +21,7 @@ disk.h \
filemap.h \
fscounters.h \
inodes.h \
+scrub.h \
spacemap.h \
xfs_scrub.h
@@ -31,6 +32,7 @@ filemap.c \
fscounters.c \
inodes.c \
phase1.c \
+scrub.c \
spacemap.c \
xfs_scrub.c
diff --git a/scrub/common.c b/scrub/common.c
index 252809d..eb602a8 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -320,3 +320,22 @@ find_mountpoint(
platform_mntent_close(&cursor);
return found;
}
+
+/*
+ * Sleep for 100ms * however many -b we got past the initial one.
+ * This is an (albeit clumsy) way to throttle scrub activity.
+ */
+void
+background_sleep(void)
+{
+ unsigned long long time;
+ struct timespec tv;
+
+ if (bg_mode < 2)
+ return;
+
+ time = 100000 * (bg_mode - 1); /* microseconds */
+ tv.tv_sec = time / 1000000;
+ tv.tv_nsec = (time % 1000000) * 1000;
+ nanosleep(&tv, NULL);
+}
diff --git a/scrub/common.h b/scrub/common.h
index fed95df..81e83c2 100644
--- a/scrub/common.h
+++ b/scrub/common.h
@@ -71,5 +71,6 @@ static inline int syncfs(int fd)
#endif
bool find_mountpoint(char *mtab, struct scrub_ctx *ctx);
+void background_sleep(void);
#endif /* XFS_SCRUB_COMMON_H_ */
diff --git a/scrub/phase1.c b/scrub/phase1.c
index 65409d3..d7a321f 100644
--- a/scrub/phase1.c
+++ b/scrub/phase1.c
@@ -46,6 +46,7 @@
#include "xfs_scrub.h"
#include "common.h"
#include "disk.h"
+#include "scrub.h"
/* Phase 1: Find filesystem geometry (and clean up after) */
@@ -168,6 +169,13 @@ _("Does not appear to be an XFS filesystem!"));
return false;
}
+ /* Do we have kernel-assisted metadata scrubbing? */
+ if (!xfs_can_scrub_fs_metadata(ctx) || !xfs_can_scrub_inode(ctx) ||
+ !xfs_can_scrub_bmap(ctx) || !xfs_can_scrub_dir(ctx) ||
+ !xfs_can_scrub_attr(ctx) || !xfs_can_scrub_symlink(ctx) ||
+ !xfs_can_scrub_parent(ctx))
+ return false;
+
/* Go find the XFS devices if we have a usable fsmap. */
fs_table_initialise(0, NULL, 0, NULL);
errno = 0;
diff --git a/scrub/scrub.c b/scrub/scrub.c
new file mode 100644
index 0000000..98e7e0d
--- /dev/null
+++ b/scrub/scrub.c
@@ -0,0 +1,620 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/statvfs.h>
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "path.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "scrub.h"
+#include "xfs_errortag.h"
+
+/* Online scrub and repair wrappers. */
+
+/* Type info and names for the scrub types. */
+enum scrub_type {
+ ST_NONE, /* disabled */
+ ST_AGHEADER, /* per-AG header */
+ ST_PERAG, /* per-AG metadata */
+ ST_FS, /* per-FS metadata */
+ ST_INODE, /* per-inode metadata */
+};
+struct scrub_descr {
+ const char *name;
+ enum scrub_type type;
+};
+
+/* These must correspond to XFS_SCRUB_TYPE_ */
+static const struct scrub_descr scrubbers[XFS_SCRUB_TYPE_NR] = {
+ [XFS_SCRUB_TYPE_PROBE] =
+ {"metadata", ST_NONE},
+ [XFS_SCRUB_TYPE_SB] =
+ {"superblock", ST_AGHEADER},
+ [XFS_SCRUB_TYPE_AGF] =
+ {"free space header", ST_AGHEADER},
+ [XFS_SCRUB_TYPE_AGFL] =
+ {"free list", ST_AGHEADER},
+ [XFS_SCRUB_TYPE_AGI] =
+ {"inode header", ST_AGHEADER},
+ [XFS_SCRUB_TYPE_BNOBT] =
+ {"freesp by block btree", ST_PERAG},
+ [XFS_SCRUB_TYPE_CNTBT] =
+ {"freesp by length btree", ST_PERAG},
+ [XFS_SCRUB_TYPE_INOBT] =
+ {"inode btree", ST_PERAG},
+ [XFS_SCRUB_TYPE_FINOBT] =
+ {"free inode btree", ST_PERAG},
+ [XFS_SCRUB_TYPE_RMAPBT] =
+ {"reverse mapping btree", ST_PERAG},
+ [XFS_SCRUB_TYPE_REFCNTBT] =
+ {"reference count btree", ST_PERAG},
+ [XFS_SCRUB_TYPE_INODE] =
+ {"inode record", ST_INODE},
+ [XFS_SCRUB_TYPE_BMBTD] =
+ {"data block map", ST_INODE},
+ [XFS_SCRUB_TYPE_BMBTA] =
+ {"attr block map", ST_INODE},
+ [XFS_SCRUB_TYPE_BMBTC] =
+ {"CoW block map", ST_INODE},
+ [XFS_SCRUB_TYPE_DIR] =
+ {"directory entries", ST_INODE},
+ [XFS_SCRUB_TYPE_XATTR] =
+ {"extended attributes", ST_INODE},
+ [XFS_SCRUB_TYPE_SYMLINK] =
+ {"symbolic link", ST_INODE},
+ [XFS_SCRUB_TYPE_PARENT] =
+ {"parent pointer", ST_INODE},
+ [XFS_SCRUB_TYPE_RTBITMAP] =
+ {"realtime bitmap", ST_FS},
+ [XFS_SCRUB_TYPE_RTSUM] =
+ {"realtime summary", ST_FS},
+ [XFS_SCRUB_TYPE_UQUOTA] =
+ {"user quotas", ST_FS},
+ [XFS_SCRUB_TYPE_GQUOTA] =
+ {"group quotas", ST_FS},
+ [XFS_SCRUB_TYPE_PQUOTA] =
+ {"project quotas", ST_FS},
+};
+
+/* Format a scrub description. */
+static void
+format_scrub_descr(
+ char *buf,
+ size_t buflen,
+ struct xfs_scrub_metadata *meta,
+ const struct scrub_descr *sc)
+{
+ switch (sc->type) {
+ case ST_AGHEADER:
+ case ST_PERAG:
+ snprintf(buf, buflen, _("AG %u %s"), meta->sm_agno,
+ _(sc->name));
+ break;
+ case ST_INODE:
+ snprintf(buf, buflen, _("Inode %"PRIu64" %s"),
+ (uint64_t)meta->sm_ino, _(sc->name));
+ break;
+ case ST_FS:
+ snprintf(buf, buflen, _("%s"), _(sc->name));
+ break;
+ case ST_NONE:
+ assert(0);
+ break;
+ }
+}
+
+/* Predicates for scrub flag state. */
+
+static inline bool is_corrupt(struct xfs_scrub_metadata *sm)
+{
+ return sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT;
+}
+
+static inline bool is_unoptimized(struct xfs_scrub_metadata *sm)
+{
+ return sm->sm_flags & XFS_SCRUB_OFLAG_PREEN;
+}
+
+static inline bool xref_failed(struct xfs_scrub_metadata *sm)
+{
+ return sm->sm_flags & XFS_SCRUB_OFLAG_XFAIL;
+}
+
+static inline bool xref_disagrees(struct xfs_scrub_metadata *sm)
+{
+ return sm->sm_flags & XFS_SCRUB_OFLAG_XCORRUPT;
+}
+
+static inline bool is_incomplete(struct xfs_scrub_metadata *sm)
+{
+ return sm->sm_flags & XFS_SCRUB_OFLAG_INCOMPLETE;
+}
+
+static inline bool is_suspicious(struct xfs_scrub_metadata *sm)
+{
+ return sm->sm_flags & XFS_SCRUB_OFLAG_WARNING;
+}
+
+/* Should we fix it? */
+static inline bool needs_repair(struct xfs_scrub_metadata *sm)
+{
+ return is_corrupt(sm) || xref_disagrees(sm);
+}
+
+/* Warn about strange circumstances after scrub. */
+static inline void
+xfs_scrub_warn_incomplete_scrub(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ struct xfs_scrub_metadata *meta)
+{
+ if (is_incomplete(meta))
+ str_info(ctx, descr, _("Check incomplete."));
+
+ if (is_suspicious(meta)) {
+ if (debug)
+ str_info(ctx, descr, _("Possibly suspect metadata."));
+ else
+ str_warn(ctx, descr, _("Possibly suspect metadata."));
+ }
+
+ if (xref_failed(meta))
+ str_info(ctx, descr, _("Cross-referencing failed."));
+}
+
+/* Do a read-only check of some metadata. */
+static enum check_outcome
+xfs_check_metadata(
+ struct scrub_ctx *ctx,
+ int fd,
+ struct xfs_scrub_metadata *meta,
+ bool is_inode)
+{
+ char buf[DESCR_BUFSZ];
+ unsigned int tries = 0;
+ int code;
+ int error;
+
+ assert(!debug_tweak_on("XFS_SCRUB_NO_KERNEL"));
+ assert(meta->sm_type < XFS_SCRUB_TYPE_NR);
+ format_scrub_descr(buf, DESCR_BUFSZ, meta, &scrubbers[meta->sm_type]);
+
+ dbg_printf("check %s flags %xh\n", buf, meta->sm_flags);
+retry:
+ error = ioctl(fd, XFS_IOC_SCRUB_METADATA, meta);
+ if (debug_tweak_on("XFS_SCRUB_FORCE_REPAIR") && !error)
+ meta->sm_flags |= XFS_SCRUB_OFLAG_CORRUPT;
+ if (error) {
+ code = errno;
+ switch (code) {
+ case ENOENT:
+ /* Metadata not present, just skip it. */
+ return CHECK_DONE;
+ case ESHUTDOWN:
+ /* FS already crashed, give up. */
+ str_error(ctx, buf,
+_("Filesystem is shut down, aborting."));
+ return CHECK_ABORT;
+ case EIO:
+ case ENOMEM:
+ /* Abort on I/O errors or insufficient memory. */
+ str_errno(ctx, buf);
+ return CHECK_ABORT;
+ case EDEADLOCK:
+ case EBUSY:
+ case EFSBADCRC:
+ case EFSCORRUPTED:
+ /*
+ * The first two should never escape the kernel,
+ * and the other two should be reported via sm_flags.
+ */
+ str_error(ctx, buf,
+_("Kernel bug! errno=%d"), code);
+ /* fall through */
+ default:
+ /* Operational error. */
+ str_errno(ctx, buf);
+ return CHECK_DONE;
+ }
+ }
+
+ /*
+ * If the kernel says the test was incomplete or that there was
+ * a cross-referencing discrepancy but no obvious corruption,
+ * we'll try the scan again, just in case the fs was busy.
+ * Only retry so many times.
+ */
+ if (tries < 10 && (is_incomplete(meta) ||
+ (xref_disagrees(meta) && !is_corrupt(meta)))) {
+ tries++;
+ goto retry;
+ }
+
+ /* Complain about incomplete or suspicious metadata. */
+ xfs_scrub_warn_incomplete_scrub(ctx, buf, meta);
+
+ /*
+ * If we need repairs or there were discrepancies, schedule a
+ * repair if desired, otherwise complain.
+ */
+ if (is_corrupt(meta) || xref_disagrees(meta)) {
+ if (ctx->mode < SCRUB_MODE_REPAIR) {
+ str_error(ctx, buf,
+_("Repairs are required."));
+ return CHECK_DONE;
+ }
+
+ return CHECK_REPAIR;
+ }
+
+ /*
+ * If we could optimize, schedule a repair if desired,
+ * otherwise complain.
+ */
+ if (is_unoptimized(meta)) {
+ if (ctx->mode < SCRUB_MODE_PREEN) {
+ if (!is_inode) {
+ /* AG or FS metadata, always warn. */
+ str_info(ctx, buf,
+_("Optimization is possible."));
+ } else if (!ctx->preen_triggers[meta->sm_type]) {
+ /* File metadata, only warn once per type. */
+ pthread_mutex_lock(&ctx->lock);
+ if (!ctx->preen_triggers[meta->sm_type])
+ ctx->preen_triggers[meta->sm_type] = true;
+ pthread_mutex_unlock(&ctx->lock);
+ }
+ return CHECK_DONE;
+ }
+
+ return CHECK_REPAIR;
+ }
+
+ /* Everything is ok. */
+ return CHECK_DONE;
+}
+
+/* Bulk-notify user about things that could be optimized. */
+void
+xfs_scrub_report_preen_triggers(
+ struct scrub_ctx *ctx)
+{
+ int i;
+
+ for (i = 0; i < XFS_SCRUB_TYPE_NR; i++) {
+ pthread_mutex_lock(&ctx->lock);
+ if (ctx->preen_triggers[i]) {
+ ctx->preen_triggers[i] = false;
+ pthread_mutex_unlock(&ctx->lock);
+ str_info(ctx, ctx->mntpoint,
+_("Optimizations of %s are possible."), scrubbers[i].name);
+ } else {
+ pthread_mutex_unlock(&ctx->lock);
+ }
+ }
+}
+
+/* Scrub metadata, saving corruption reports for later. */
+static bool
+xfs_scrub_metadata(
+ struct scrub_ctx *ctx,
+ enum scrub_type scrub_type,
+ xfs_agnumber_t agno)
+{
+ struct xfs_scrub_metadata meta = {0};
+ const struct scrub_descr *sc;
+ enum check_outcome fix;
+ int type;
+
+ sc = scrubbers;
+ for (type = 0; type < XFS_SCRUB_TYPE_NR; type++, sc++) {
+ if (sc->type != scrub_type)
+ continue;
+
+ meta.sm_type = type;
+ meta.sm_flags = 0;
+ meta.sm_agno = agno;
+ background_sleep();
+
+ /* Check the item. */
+ fix = xfs_check_metadata(ctx, ctx->mnt_fd, &meta, false);
+ switch (fix) {
+ case CHECK_ABORT:
+ return false;
+ case CHECK_REPAIR:
+ /* fall through */
+ case CHECK_DONE:
+ continue;
+ case CHECK_RETRY:
+ abort();
+ break;
+ }
+ }
+
+ return true;
+}
+
+/*
+ * Scrub primary superblock. This will be useful if we ever need to hook
+ * a filesystem-wide pre-scrub activity off of the sb 0 scrubber (which
+ * currently does nothing).
+ */
+bool
+xfs_scrub_primary_super(
+ struct scrub_ctx *ctx)
+{
+ struct xfs_scrub_metadata meta = {
+ .sm_type = XFS_SCRUB_TYPE_SB,
+ };
+ enum check_outcome fix;
+
+ /* Check the item. */
+ fix = xfs_check_metadata(ctx, ctx->mnt_fd, &meta, false);
+ switch (fix) {
+ case CHECK_ABORT:
+ return false;
+ case CHECK_REPAIR:
+ /* fall through */
+ case CHECK_DONE:
+ return true;
+ case CHECK_RETRY:
+ abort();
+ break;
+ }
+
+ return true;
+}
+
+/* Scrub each AG's header blocks. */
+bool
+xfs_scrub_ag_headers(
+ struct scrub_ctx *ctx,
+ xfs_agnumber_t agno)
+{
+ return xfs_scrub_metadata(ctx, ST_AGHEADER, agno);
+}
+
+/* Scrub each AG's metadata btrees. */
+bool
+xfs_scrub_ag_metadata(
+ struct scrub_ctx *ctx,
+ xfs_agnumber_t agno)
+{
+ return xfs_scrub_metadata(ctx, ST_PERAG, agno);
+}
+
+/* Scrub whole-FS metadata. */
+bool
+xfs_scrub_fs_metadata(
+ struct scrub_ctx *ctx)
+{
+ return xfs_scrub_metadata(ctx, ST_FS, 0);
+}
+
+/* Scrub inode metadata. */
+static bool
+__xfs_scrub_file(
+ struct scrub_ctx *ctx,
+ uint64_t ino,
+ uint32_t gen,
+ int fd,
+ unsigned int type)
+{
+ struct xfs_scrub_metadata meta = {0};
+ enum check_outcome fix;
+
+ assert(type < XFS_SCRUB_TYPE_NR);
+ assert(scrubbers[type].type == ST_INODE);
+
+ meta.sm_type = type;
+ meta.sm_ino = ino;
+ meta.sm_gen = gen;
+
+ /* Scrub the piece of metadata. */
+ fix = xfs_check_metadata(ctx, fd, &meta, true);
+ if (fix == CHECK_ABORT)
+ return false;
+ if (fix == CHECK_DONE)
+ return true;
+
+ return true;
+}
+
+bool
+xfs_scrub_inode_fields(
+ struct scrub_ctx *ctx,
+ uint64_t ino,
+ uint32_t gen,
+ int fd)
+{
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_INODE);
+}
+
+bool
+xfs_scrub_data_fork(
+ struct scrub_ctx *ctx,
+ uint64_t ino,
+ uint32_t gen,
+ int fd)
+{
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_BMBTD);
+}
+
+bool
+xfs_scrub_attr_fork(
+ struct scrub_ctx *ctx,
+ uint64_t ino,
+ uint32_t gen,
+ int fd)
+{
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_BMBTA);
+}
+
+bool
+xfs_scrub_cow_fork(
+ struct scrub_ctx *ctx,
+ uint64_t ino,
+ uint32_t gen,
+ int fd)
+{
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_BMBTC);
+}
+
+bool
+xfs_scrub_dir(
+ struct scrub_ctx *ctx,
+ uint64_t ino,
+ uint32_t gen,
+ int fd)
+{
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_DIR);
+}
+
+bool
+xfs_scrub_attr(
+ struct scrub_ctx *ctx,
+ uint64_t ino,
+ uint32_t gen,
+ int fd)
+{
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_XATTR);
+}
+
+bool
+xfs_scrub_symlink(
+ struct scrub_ctx *ctx,
+ uint64_t ino,
+ uint32_t gen,
+ int fd)
+{
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_SYMLINK);
+}
+
+bool
+xfs_scrub_parent(
+ struct scrub_ctx *ctx,
+ uint64_t ino,
+ uint32_t gen,
+ int fd)
+{
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_PARENT);
+}
+
+/* Test the availability of a kernel scrub command. */
+static bool
+__xfs_scrub_test(
+ struct scrub_ctx *ctx,
+ unsigned int type,
+ bool repair)
+{
+ struct xfs_scrub_metadata meta = {0};
+ int error;
+
+ if (debug_tweak_on("XFS_SCRUB_NO_KERNEL"))
+ return false;
+
+ meta.sm_type = type;
+ if (repair)
+ meta.sm_flags |= XFS_SCRUB_IFLAG_REPAIR;
+ error = ioctl(ctx->mnt_fd, XFS_IOC_SCRUB_METADATA, &meta);
+ if (!error)
+ return true;
+ switch (errno) {
+ case EROFS:
+ str_info(ctx, ctx->mntpoint,
+_("Filesystem is mounted read-only; cannot proceed."));
+ return false;
+ case ENOTRECOVERABLE:
+ str_info(ctx, ctx->mntpoint,
+_("Filesystem is mounted norecovery; cannot proceed."));
+ return false;
+ case EOPNOTSUPP:
+ case ENOTTY:
+ str_info(ctx, ctx->mntpoint,
+_("Kernel %s %s facility is required."),
+ _(scrubbers[type].name),
+ repair ? _("repair") : _("scrub"));
+ return false;
+ case ENOENT:
+ /* Scrubber says not present on this fs; that's fine. */
+ return true;
+ default:
+ str_info(ctx, ctx->mntpoint, "%s", strerror(errno));
+ return true;
+ }
+ return error == 0 || (error && errno != EOPNOTSUPP && errno != ENOTTY);
+}
+
+bool
+xfs_can_scrub_fs_metadata(
+ struct scrub_ctx *ctx)
+{
+ return __xfs_scrub_test(ctx, XFS_SCRUB_TYPE_PROBE, false);
+}
+
+bool
+xfs_can_scrub_inode(
+ struct scrub_ctx *ctx)
+{
+ return __xfs_scrub_test(ctx, XFS_SCRUB_TYPE_INODE, false);
+}
+
+bool
+xfs_can_scrub_bmap(
+ struct scrub_ctx *ctx)
+{
+ return __xfs_scrub_test(ctx, XFS_SCRUB_TYPE_BMBTD, false);
+}
+
+bool
+xfs_can_scrub_dir(
+ struct scrub_ctx *ctx)
+{
+ return __xfs_scrub_test(ctx, XFS_SCRUB_TYPE_DIR, false);
+}
+
+bool
+xfs_can_scrub_attr(
+ struct scrub_ctx *ctx)
+{
+ return __xfs_scrub_test(ctx, XFS_SCRUB_TYPE_XATTR, false);
+}
+
+bool
+xfs_can_scrub_symlink(
+ struct scrub_ctx *ctx)
+{
+ return __xfs_scrub_test(ctx, XFS_SCRUB_TYPE_SYMLINK, false);
+}
+
+bool
+xfs_can_scrub_parent(
+ struct scrub_ctx *ctx)
+{
+ return __xfs_scrub_test(ctx, XFS_SCRUB_TYPE_PARENT, false);
+}
diff --git a/scrub/scrub.h b/scrub/scrub.h
new file mode 100644
index 0000000..0b454df
--- /dev/null
+++ b/scrub/scrub.h
@@ -0,0 +1,62 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_SCRUB_H_
+#define XFS_SCRUB_SCRUB_H_
+
+/* Online scrub and repair. */
+enum check_outcome {
+ CHECK_DONE, /* no further processing needed */
+ CHECK_REPAIR, /* schedule this for repairs */
+ CHECK_ABORT, /* end program */
+ CHECK_RETRY, /* repair failed, try again later */
+};
+
+void xfs_scrub_report_preen_triggers(struct scrub_ctx *ctx);
+bool xfs_scrub_primary_super(struct scrub_ctx *ctx);
+bool xfs_scrub_ag_headers(struct scrub_ctx *ctx, xfs_agnumber_t agno);
+bool xfs_scrub_ag_metadata(struct scrub_ctx *ctx, xfs_agnumber_t agno);
+bool xfs_scrub_fs_metadata(struct scrub_ctx *ctx);
+
+bool xfs_can_scrub_fs_metadata(struct scrub_ctx *ctx);
+bool xfs_can_scrub_inode(struct scrub_ctx *ctx);
+bool xfs_can_scrub_bmap(struct scrub_ctx *ctx);
+bool xfs_can_scrub_dir(struct scrub_ctx *ctx);
+bool xfs_can_scrub_attr(struct scrub_ctx *ctx);
+bool xfs_can_scrub_symlink(struct scrub_ctx *ctx);
+bool xfs_can_scrub_parent(struct scrub_ctx *ctx);
+
+bool xfs_scrub_inode_fields(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
+ int fd);
+bool xfs_scrub_data_fork(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
+ int fd);
+bool xfs_scrub_attr_fork(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
+ int fd);
+bool xfs_scrub_cow_fork(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
+ int fd);
+bool xfs_scrub_dir(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
+ int fd);
+bool xfs_scrub_attr(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
+ int fd);
+bool xfs_scrub_symlink(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
+ int fd);
+bool xfs_scrub_parent(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
+ int fd);
+
+#endif /* XFS_SCRUB_SCRUB_H_ */
* [PATCH 13/27] xfs_scrub: scan filesystem and AG metadata
From: Darrick J. Wong @ 2018-01-06 1:52 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
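Add phase 2: scrub the primary superblock, then queue the per-AG header and
btree scrubs and the whole-filesystem metadata scrub on a workqueue.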
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
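The per-AG work item checks the AG headers first and only moves on to the
btrees if the headers passed; any failure clears a shared "moveon" flag
that is never set back to true. Below is a single-threaded sketch of that
ordering, under the assumption that the checks can be modelled as plain
callbacks. The callback type, scan_one_ag() and the demo_* helpers are
hypothetical; the patch runs one such item per AG on a workqueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef bool (*ag_check_fn)(uint32_t agno, void *priv);

/*
 * Check one AG: headers first, btrees only if the headers passed.
 * Clears *moveon on failure and never sets it back to true.
 */
static void
scan_one_ag(uint32_t agno, ag_check_fn headers, ag_check_fn btrees,
	    void *priv, bool *moveon)
{
	assert(headers && btrees && moveon);

	if (!headers(agno, priv) || !btrees(agno, priv))
		*moveon = false;
}

/* Example callbacks: fail the header check for one chosen AG. */
struct demo {
	uint32_t	bad_agno;
	unsigned int	btree_calls;
};

static bool
demo_headers(uint32_t agno, void *priv)
{
	struct demo	*d = priv;

	return agno != d->bad_agno;
}

static bool
demo_btrees(uint32_t agno, void *priv)
{
	struct demo	*d = priv;

	(void)agno;
	d->btree_calls++;
	return true;
}
```

Running scan_one_ag() over four AGs with bad_agno = 2 leaves the flag false
and calls the btree check for the other three AGs only.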
scrub/Makefile | 1
scrub/phase2.c | 133 +++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/xfs_scrub.c | 1
scrub/xfs_scrub.h | 1
4 files changed, 136 insertions(+)
create mode 100644 scrub/phase2.c
diff --git a/scrub/Makefile b/scrub/Makefile
index 915b801..9edc933 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -32,6 +32,7 @@ filemap.c \
fscounters.c \
inodes.c \
phase1.c \
+phase2.c \
scrub.c \
spacemap.c \
xfs_scrub.c
diff --git a/scrub/phase2.c b/scrub/phase2.c
new file mode 100644
index 0000000..153ae02
--- /dev/null
+++ b/scrub/phase2.c
@@ -0,0 +1,133 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/statvfs.h>
+#include "xfs.h"
+#include "path.h"
+#include "workqueue.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "scrub.h"
+
+/* Phase 2: Check internal metadata. */
+
+/* Scrub each AG's metadata btrees. */
+static void
+xfs_scan_ag_metadata(
+ struct workqueue *wq,
+ xfs_agnumber_t agno,
+ void *arg)
+{
+ struct scrub_ctx *ctx = (struct scrub_ctx *)wq->wq_ctx;
+ bool *pmoveon = arg;
+ bool moveon;
+ char descr[DESCR_BUFSZ];
+
+ snprintf(descr, DESCR_BUFSZ, _("AG %u"), agno);
+
+ /*
+ * First we scrub and fix the AG headers, because we need
+ * them to work well enough to check the AG btrees.
+ */
+ moveon = xfs_scrub_ag_headers(ctx, agno);
+ if (!moveon)
+ goto err;
+
+ /* Now scrub the AG btrees. */
+ moveon = xfs_scrub_ag_metadata(ctx, agno);
+ if (!moveon)
+ goto err;
+
+ return;
+err:
+ *pmoveon = false;
+}
+
+/* Scrub whole-FS metadata. */
+static void
+xfs_scan_fs_metadata(
+ struct workqueue *wq,
+ xfs_agnumber_t agno,
+ void *arg)
+{
+ struct scrub_ctx *ctx = (struct scrub_ctx *)wq->wq_ctx;
+ bool *pmoveon = arg;
+ bool moveon;
+
+ moveon = xfs_scrub_fs_metadata(ctx);
+ if (!moveon)
+ *pmoveon = false;
+}
+
+/* Scan all filesystem metadata. */
+bool
+xfs_scan_metadata(
+ struct scrub_ctx *ctx)
+{
+ struct workqueue wq;
+ xfs_agnumber_t agno;
+ bool moveon = true;
+ int ret;
+
+ ret = workqueue_create(&wq, (struct xfs_mount *)ctx,
+ scrub_nproc_workqueue(ctx));
+ if (ret) {
+ str_error(ctx, ctx->mntpoint, _("Could not create workqueue."));
+ return false;
+ }
+
+ /*
+ * In case we ever use the primary super scrubber to perform fs
+ * upgrades (followed by a full scrub), do that before we launch
+ * anything else.
+ */
+ moveon = xfs_scrub_primary_super(ctx);
+ if (!moveon)
+ goto out;
+
+ for (agno = 0; moveon && agno < ctx->geo.agcount; agno++) {
+ ret = workqueue_add(&wq, xfs_scan_ag_metadata, agno, &moveon);
+ if (ret) {
+ moveon = false;
+ str_error(ctx, ctx->mntpoint,
+_("Could not queue AG %u scrub work."), agno);
+ goto out;
+ }
+ }
+
+ if (!moveon)
+ goto out;
+
+ ret = workqueue_add(&wq, xfs_scan_fs_metadata, 0, &moveon);
+ if (ret) {
+ moveon = false;
+ str_error(ctx, ctx->mntpoint,
+_("Could not queue filesystem scrub work."));
+ goto out;
+ }
+
+out:
+ workqueue_destroy(&wq);
+ return moveon;
+}
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index a733b8f..be52a98 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -353,6 +353,7 @@ run_scrub_phases(
},
{
.descr = _("Check internal metadata."),
+ .fn = xfs_scan_metadata,
},
{
.descr = _("Scan all inodes."),
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index 2be7c65..4c3882b 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -97,5 +97,6 @@ struct scrub_ctx {
void xfs_shutdown_fs(struct scrub_ctx *ctx);
bool xfs_cleanup_fs(struct scrub_ctx *ctx);
bool xfs_setup_fs(struct scrub_ctx *ctx);
+bool xfs_scan_metadata(struct scrub_ctx *ctx);
#endif /* XFS_SCRUB_XFS_SCRUB_H_ */
* [PATCH 14/27] xfs_scrub: thread-safe stats counter
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (12 preceding siblings ...)
2018-01-06 1:52 ` [PATCH 13/27] xfs_scrub: scan filesystem and AG metadata Darrick J. Wong
@ 2018-01-06 1:52 ` Darrick J. Wong
2018-01-06 1:53 ` [PATCH 15/27] xfs_scrub: scan inodes Darrick J. Wong
` (16 subsequent siblings)
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:52 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Create a threaded stats counter that we'll use to track scan progress.
This includes things like how many of the disk blocks we've scanned,
or later how much progress we've made in each phase.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
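For anyone who hasn't seen this pattern before, here is a minimal
standalone sketch of the idea behind ptvar and ptcounter. It is not the
code from this patch: the slot_counter names, the fixed 64-byte slot,
the fixed slot count, and the four-thread demo are all assumptions made
for illustration. Each thread claims one padded slot under a mutex on
first use, caches the pointer with pthread_setspecific(), and afterwards
increments its own slot without taking a lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for ptvar/ptcounter; every name here is made up. */
#define SLOT_BYTES	64	/* assumed cacheline size */
#define NR_SLOTS	4	/* assumed maximum number of threads */

struct slot {
	uint64_t	val;
	char		pad[SLOT_BYTES - sizeof(uint64_t)];
};

struct slot_counter {
	pthread_key_t	key;		/* caches each thread's slot pointer */
	pthread_mutex_t	lock;		/* protects nr_used */
	size_t		nr_used;
	struct slot	slots[NR_SLOTS];
};

struct slot_counter *slot_counter_init(void)
{
	struct slot_counter	*sc = calloc(1, sizeof(*sc));

	if (!sc || pthread_mutex_init(&sc->lock, NULL))
		goto out;
	if (pthread_key_create(&sc->key, NULL)) {
		pthread_mutex_destroy(&sc->lock);
		goto out;
	}
	return sc;
out:
	free(sc);
	return NULL;
}

void slot_counter_free(struct slot_counter *sc)
{
	pthread_key_delete(sc->key);
	pthread_mutex_destroy(&sc->lock);
	free(sc);
}

/* Add to the calling thread's slot, claiming one on first use. */
void slot_counter_add(struct slot_counter *sc, uint64_t nr)
{
	struct slot	*s = pthread_getspecific(sc->key);

	if (!s) {
		/* First access from this thread: hand out the next slot. */
		pthread_mutex_lock(&sc->lock);
		assert(sc->nr_used < NR_SLOTS);
		s = &sc->slots[sc->nr_used++];
		pthread_mutex_unlock(&sc->lock);
		pthread_setspecific(sc->key, s);
	}
	s->val += nr;	/* no lock: only this thread touches this slot */
}

uint64_t slot_counter_value(struct slot_counter *sc)
{
	uint64_t	sum = 0;
	size_t		i;
	pthread_mutex_lock(&sc->lock);
	for (i = 0; i < sc->nr_used; i++)
		sum += sc->slots[i].val;
	pthread_mutex_unlock(&sc->lock);
	return sum;
}

static void *worker(void *arg)
{
	int	i;
	for (i = 0; i < 1000; i++)
		slot_counter_add(arg, 1);
	return NULL;
}

uint64_t slot_counter_demo(void)
{
	struct slot_counter	*sc = slot_counter_init();
	pthread_t		tids[NR_SLOTS];
	uint64_t		sum;
	int			i, started = 0;
	if (!sc)
		return 0;
	for (i = 0; i < NR_SLOTS; i++)
		if (pthread_create(&tids[started], NULL, worker, sc) == 0)
			started++;
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	sum = slot_counter_value(sc);
	slot_counter_free(sc);
	return sum;
}
```

slot_counter_demo() reads the total only after joining the workers, so
the sum there is exact. A read taken while workers are still running
would only be approximate, which matches the caveat in the counter.c
comment.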
include/ptvar.h | 32 +++++++++++++
libfrog/Makefile | 1
libfrog/ptvar.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/Makefile | 2 +
scrub/counter.c | 104 ++++++++++++++++++++++++++++++++++++++++++
scrub/counter.h | 29 ++++++++++++
6 files changed, 301 insertions(+)
create mode 100644 include/ptvar.h
create mode 100644 libfrog/ptvar.c
create mode 100644 scrub/counter.c
create mode 100644 scrub/counter.h
diff --git a/include/ptvar.h b/include/ptvar.h
new file mode 100644
index 0000000..6308228
--- /dev/null
+++ b/include/ptvar.h
@@ -0,0 +1,32 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef LIBFROG_PERCPU_H_
+#define LIBFROG_PERCPU_H_
+
+struct ptvar;
+
+typedef bool (*ptvar_iter_fn)(struct ptvar *ptv, void *data, void *foreach_arg);
+
+struct ptvar *ptvar_init(size_t nr, size_t size);
+void ptvar_free(struct ptvar *ptv);
+void *ptvar_get(struct ptvar *ptv);
+bool ptvar_foreach(struct ptvar *ptv, ptvar_iter_fn fn, void *foreach_arg);
+
+#endif /* LIBFROG_PERCPU_H_ */
diff --git a/libfrog/Makefile b/libfrog/Makefile
index 4c15605..230b08f 100644
--- a/libfrog/Makefile
+++ b/libfrog/Makefile
@@ -16,6 +16,7 @@ convert.c \
list_sort.c \
paths.c \
projects.c \
+ptvar.c \
radix-tree.c \
topology.c \
util.c \
diff --git a/libfrog/ptvar.c b/libfrog/ptvar.c
new file mode 100644
index 0000000..3654706
--- /dev/null
+++ b/libfrog/ptvar.c
@@ -0,0 +1,133 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdint.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <string.h>
+#include <assert.h>
+#include <pthread.h>
+#include <unistd.h>
+#include "platform_defs.h"
+#include "ptvar.h"
+
+/*
+ * Per-thread Variables
+ *
+ * This data structure manages a lockless per-thread variable. We
+ * implement this by allocating an array of memory regions, and as each
+ * thread tries to acquire its own region, we hand out the array
+ * elements to each thread. This way, each thread gets its own
+ * cacheline and (after the first access) doesn't have to contend for a
+ * lock for each access.
+ */
+struct ptvar {
+ pthread_key_t key;
+ pthread_mutex_t lock;
+ size_t nr_used;
+ size_t nr_counters;
+ size_t data_size;
+ unsigned char data[0];
+};
+#define PTVAR_SIZE(nr, sz) (sizeof(struct ptvar) + ((nr) * (sz)))
+
+/* Initialize per-thread counter. */
+struct ptvar *
+ptvar_init(
+ size_t nr,
+ size_t size)
+{
+ struct ptvar *ptv;
+ int ret;
+
+#ifdef _SC_LEVEL1_DCACHE_LINESIZE
+ /* Try to prevent cache pingpong by aligning to cacheline size. */
+ size = max(size, sysconf(_SC_LEVEL1_DCACHE_LINESIZE));
+#endif
+
+ ptv = malloc(PTVAR_SIZE(nr, size));
+ if (!ptv)
+ return NULL;
+ ptv->data_size = size;
+ ptv->nr_counters = nr;
+ ptv->nr_used = 0;
+ memset(ptv->data, 0, nr * size);
+ ret = pthread_mutex_init(&ptv->lock, NULL);
+ if (ret)
+ goto out;
+ ret = pthread_key_create(&ptv->key, NULL);
+ if (ret)
+ goto out_mutex;
+ return ptv;
+
+out_mutex:
+ pthread_mutex_destroy(&ptv->lock);
+out:
+ free(ptv);
+ return NULL;
+}
+
+/* Free per-thread counter. */
+void
+ptvar_free(
+ struct ptvar *ptv)
+{
+ pthread_key_delete(ptv->key);
+ pthread_mutex_destroy(&ptv->lock);
+ free(ptv);
+}
+
+/* Get a reference to this thread's variable. */
+void *
+ptvar_get(
+ struct ptvar *ptv)
+{
+ void *p;
+
+ p = pthread_getspecific(ptv->key);
+ if (!p) {
+ pthread_mutex_lock(&ptv->lock);
+ assert(ptv->nr_used < ptv->nr_counters);
+ p = &ptv->data[(ptv->nr_used++) * ptv->data_size];
+ pthread_setspecific(ptv->key, p);
+ pthread_mutex_unlock(&ptv->lock);
+ }
+ return p;
+}
+
+/* Iterate all of the per-thread variables. */
+bool
+ptvar_foreach(
+ struct ptvar *ptv,
+ ptvar_iter_fn fn,
+ void *foreach_arg)
+{
+ size_t i;
+ bool ret = true;
+
+ pthread_mutex_lock(&ptv->lock);
+ for (i = 0; i < ptv->nr_used; i++) {
+ ret = fn(ptv, &ptv->data[i * ptv->data_size], foreach_arg);
+ if (!ret)
+ break;
+ }
+ pthread_mutex_unlock(&ptv->lock);
+
+ return ret;
+}
diff --git a/scrub/Makefile b/scrub/Makefile
index 9edc933..30dbe54 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -17,6 +17,7 @@ endif # scrub_prereqs
HFILES = \
common.h \
+counter.h \
disk.h \
filemap.h \
fscounters.h \
@@ -27,6 +28,7 @@ xfs_scrub.h
CFILES = \
common.c \
+counter.c \
disk.c \
filemap.c \
fscounters.c \
diff --git a/scrub/counter.c b/scrub/counter.c
new file mode 100644
index 0000000..ced3cf3
--- /dev/null
+++ b/scrub/counter.c
@@ -0,0 +1,104 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdint.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <string.h>
+#include <assert.h>
+#include <pthread.h>
+#include "ptvar.h"
+#include "counter.h"
+
+/*
+ * Per-Thread Counters
+ *
+ * This is a global counter object that uses per-thread counters to
+ * count things without having to contend for a single shared lock.
+ * Provided we know the number of threads that will be accessing the
+ * counter, each thread gets its own thread-specific counter variable.
+ * Changing the value is fast, though retrieving the value is expensive
+ * and approximate.
+ */
+struct ptcounter {
+ struct ptvar *var;
+};
+
+/* Initialize per-thread counter. */
+struct ptcounter *
+ptcounter_init(
+ size_t nr)
+{
+ struct ptcounter *p;
+
+ p = malloc(sizeof(struct ptcounter));
+ if (!p)
+ return NULL;
+ p->var = ptvar_init(nr, sizeof(uint64_t));
+ if (!p->var) {
+ free(p);
+ return NULL;
+ }
+ return p;
+}
+
+/* Free per-thread counter. */
+void
+ptcounter_free(
+ struct ptcounter *ptc)
+{
+ ptvar_free(ptc->var);
+ free(ptc);
+}
+
+/* Add a quantity to the counter. */
+void
+ptcounter_add(
+ struct ptcounter *ptc,
+ int64_t nr)
+{
+ uint64_t *p;
+
+ p = ptvar_get(ptc->var);
+ *p += nr;
+}
+
+static bool
+ptcounter_val_helper(
+ struct ptvar *ptv,
+ void *data,
+ void *foreach_arg)
+{
+ uint64_t *sum = foreach_arg;
+ uint64_t *count = data;
+
+ *sum += *count;
+ return true;
+}
+
+/* Return the approximate value of this counter. */
+uint64_t
+ptcounter_value(
+ struct ptcounter *ptc)
+{
+ uint64_t sum = 0;
+
+ ptvar_foreach(ptc->var, ptcounter_val_helper, &sum);
+ return sum;
+}
diff --git a/scrub/counter.h b/scrub/counter.h
new file mode 100644
index 0000000..2aac795
--- /dev/null
+++ b/scrub/counter.h
@@ -0,0 +1,29 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_COUNTER_H_
+#define XFS_SCRUB_COUNTER_H_
+
+struct ptcounter;
+struct ptcounter *ptcounter_init(size_t nr);
+void ptcounter_free(struct ptcounter *ptc);
+void ptcounter_add(struct ptcounter *ptc, int64_t nr);
+uint64_t ptcounter_value(struct ptcounter *ptc);
+
+#endif /* XFS_SCRUB_COUNTER_H_ */
* [PATCH 15/27] xfs_scrub: scan inodes
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (13 preceding siblings ...)
2018-01-06 1:52 ` [PATCH 14/27] xfs_scrub: thread-safe stats counter Darrick J. Wong
@ 2018-01-06 1:53 ` Darrick J. Wong
2018-01-06 1:53 ` [PATCH 16/27] xfs_scrub: check directory connectivity Darrick J. Wong
` (15 subsequent siblings)
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:53 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Scan all the inodes in the system for problems.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
scrub/Makefile | 1
scrub/phase3.c | 152 +++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/xfs_scrub.c | 1
scrub/xfs_scrub.h | 2 +
4 files changed, 156 insertions(+)
create mode 100644 scrub/phase3.c
diff --git a/scrub/Makefile b/scrub/Makefile
index 30dbe54..e0d15d8 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -35,6 +35,7 @@ fscounters.c \
inodes.c \
phase1.c \
phase2.c \
+phase3.c \
scrub.c \
spacemap.c \
xfs_scrub.c
diff --git a/scrub/phase3.c b/scrub/phase3.c
new file mode 100644
index 0000000..b3fc510
--- /dev/null
+++ b/scrub/phase3.c
@@ -0,0 +1,152 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/statvfs.h>
+#include "xfs.h"
+#include "path.h"
+#include "workqueue.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "counter.h"
+#include "inodes.h"
+#include "scrub.h"
+
+/* Phase 3: Scan all inodes. */
+
+/*
+ * Run a per-file metadata scanner. We use the ino/gen interface to
+ * ensure that the inode we're checking matches what the inode scan
+ * told us to look at.
+ */
+static bool
+xfs_scrub_fd(
+ struct scrub_ctx *ctx,
+ bool (*fn)(struct scrub_ctx *, uint64_t,
+ uint32_t, int),
+ struct xfs_bstat *bs)
+{
+ return fn(ctx, bs->bs_ino, bs->bs_gen, ctx->mnt_fd);
+}
+
+struct scrub_inode_ctx {
+ struct ptcounter *icount;
+ bool moveon;
+};
+
+/* Verify the contents, xattrs, and extent maps of an inode. */
+static int
+xfs_scrub_inode(
+ struct scrub_ctx *ctx,
+ struct xfs_handle *handle,
+ struct xfs_bstat *bstat,
+ void *arg)
+{
+ struct scrub_inode_ctx *ictx = arg;
+ struct ptcounter *icount = ictx->icount;
+ bool moveon = true;
+ int fd = -1;
+
+ background_sleep();
+
+ /* Try to open the inode to pin it. */
+ if (S_ISREG(bstat->bs_mode)) {
+ fd = xfs_open_handle(handle);
+ /* Stale inode means we scan the whole cluster again. */
+ if (fd < 0 && errno == ESTALE)
+ return ESTALE;
+ }
+
+ /* Scrub the inode. */
+ moveon = xfs_scrub_fd(ctx, xfs_scrub_inode_fields, bstat);
+ if (!moveon)
+ goto out;
+
+ /* Scrub all block mappings. */
+ moveon = xfs_scrub_fd(ctx, xfs_scrub_data_fork, bstat);
+ if (!moveon)
+ goto out;
+ moveon = xfs_scrub_fd(ctx, xfs_scrub_attr_fork, bstat);
+ if (!moveon)
+ goto out;
+ moveon = xfs_scrub_fd(ctx, xfs_scrub_cow_fork, bstat);
+ if (!moveon)
+ goto out;
+
+ if (S_ISLNK(bstat->bs_mode)) {
+ /* Check symlink contents. */
+ moveon = xfs_scrub_symlink(ctx, bstat->bs_ino,
+ bstat->bs_gen, ctx->mnt_fd);
+ } else if (S_ISDIR(bstat->bs_mode)) {
+ /* Check the directory entries. */
+ moveon = xfs_scrub_fd(ctx, xfs_scrub_dir, bstat);
+ }
+ if (!moveon)
+ goto out;
+
+ /* Check all the extended attributes. */
+ moveon = xfs_scrub_fd(ctx, xfs_scrub_attr, bstat);
+ if (!moveon)
+ goto out;
+
+ /* Check parent pointers. */
+ moveon = xfs_scrub_fd(ctx, xfs_scrub_parent, bstat);
+ if (!moveon)
+ goto out;
+
+out:
+ ptcounter_add(icount, 1);
+ if (fd >= 0)
+ close(fd);
+ if (!moveon)
+ ictx->moveon = false;
+ return ictx->moveon ? 0 : XFS_ITERATE_INODES_ABORT;
+}
+
+/* Verify all the inodes in a filesystem. */
+bool
+xfs_scan_inodes(
+ struct scrub_ctx *ctx)
+{
+ struct scrub_inode_ctx ictx;
+ bool ret;
+
+ ictx.moveon = true;
+ ictx.icount = ptcounter_init(scrub_nproc(ctx));
+ if (!ictx.icount) {
+ str_error(ctx, ctx->mntpoint, _("Could not create counter."));
+ return false;
+ }
+
+ ret = xfs_scan_all_inodes(ctx, xfs_scrub_inode, &ictx);
+ if (!ret)
+ ictx.moveon = false;
+ if (!ictx.moveon)
+ goto free;
+ xfs_scrub_report_preen_triggers(ctx);
+ ctx->inodes_checked = ptcounter_value(ictx.icount);
+
+free:
+ ptcounter_free(ictx.icount);
+ return ictx.moveon;
+}
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index be52a98..5bde6cf 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -357,6 +357,7 @@ run_scrub_phases(
},
{
.descr = _("Scan all inodes."),
+ .fn = xfs_scan_inodes,
},
{
.descr = _("Defer filesystem repairs."),
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index 4c3882b..41d471b 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -89,6 +89,7 @@ struct scrub_ctx {
unsigned long long runtime_errors;
unsigned long long errors_found;
unsigned long long warnings_found;
+ unsigned long long inodes_checked;
bool need_repair;
bool preen_triggers[XFS_SCRUB_TYPE_NR];
};
@@ -98,5 +99,6 @@ void xfs_shutdown_fs(struct scrub_ctx *ctx);
bool xfs_cleanup_fs(struct scrub_ctx *ctx);
bool xfs_setup_fs(struct scrub_ctx *ctx);
bool xfs_scan_metadata(struct scrub_ctx *ctx);
+bool xfs_scan_inodes(struct scrub_ctx *ctx);
#endif /* XFS_SCRUB_XFS_SCRUB_H_ */
* [PATCH 16/27] xfs_scrub: check directory connectivity
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (14 preceding siblings ...)
2018-01-06 1:53 ` [PATCH 15/27] xfs_scrub: scan inodes Darrick J. Wong
@ 2018-01-06 1:53 ` Darrick J. Wong
2018-01-06 1:53 ` [PATCH 17/27] xfs_scrub: warn about suspicious characters in directory/xattr names Darrick J. Wong
` (14 subsequent siblings)
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:53 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Opening directories by file handle will cause the kernel to perform
parent lookups all the way to the root directory. Take advantage of
this to ensure that directories actually connect to the root. Some
day we'll have parent pointers and can make this more comprehensive.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
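The connectivity checker labels each inode as "inode N (agno/agino)".
As a rough illustration of that split, here is a standalone sketch. It
assumes the same layout the patch relies on, where the low
(inopblog + agblklog) bits of an inode number are the AG-relative inode
number and the remaining high bits are the AG number. The helper names
and the geometry values in the usage note are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helpers; not part of the patch. */
struct ino_parts {
	uint32_t	agno;	/* allocation group number */
	uint32_t	agino;	/* inode number within that AG */
};

struct ino_parts split_ino(uint64_t ino, unsigned int inopblog,
			   unsigned int agblklog)
{
	unsigned int		shift = inopblog + agblklog;
	struct ino_parts	p;

	assert(shift < 64);	/* a wider shift would be undefined */

	/* Same result as dividing and taking the remainder by 2^shift. */
	p.agno = (uint32_t)(ino >> shift);
	p.agino = (uint32_t)(ino & ((1ULL << shift) - 1));
	return p;
}

/* Build an "inode N (agno/agino)" label; returns snprintf's result. */
int describe_ino(char *buf, size_t len, uint64_t ino,
		 unsigned int inopblog, unsigned int agblklog)
{
	struct ino_parts	p = split_ino(ino, inopblog, agblklog);

	return snprintf(buf, len, "inode %" PRIu64 " (%u/%u)",
			ino, (unsigned int)p.agno, (unsigned int)p.agino);
}
```

With the made-up geometry inopblog = 4 and agblklog = 16, inode 3145859
is (3 << 20) + 131, so it splits into AG 3 and AG-relative inode 131.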
scrub/Makefile | 1 +
scrub/phase5.c | 101 +++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/xfs_scrub.c | 1 +
scrub/xfs_scrub.h | 1 +
4 files changed, 104 insertions(+)
create mode 100644 scrub/phase5.c
diff --git a/scrub/Makefile b/scrub/Makefile
index e0d15d8..adb868e 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -36,6 +36,7 @@ inodes.c \
phase1.c \
phase2.c \
phase3.c \
+phase5.c \
scrub.c \
spacemap.c \
xfs_scrub.c
diff --git a/scrub/phase5.c b/scrub/phase5.c
new file mode 100644
index 0000000..0b161e3
--- /dev/null
+++ b/scrub/phase5.c
@@ -0,0 +1,101 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/statvfs.h>
+#include "xfs.h"
+#include "path.h"
+#include "workqueue.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "inodes.h"
+#include "scrub.h"
+
+/* Phase 5: Check directory connectivity. */
+
+/*
+ * Verify the connectivity of the directory tree.
+ * We know that the kernel's open-by-handle function will try to reconnect
+ * parents of an opened directory, so we'll accept that as sufficient.
+ */
+static int
+xfs_scrub_connections(
+ struct scrub_ctx *ctx,
+ struct xfs_handle *handle,
+ struct xfs_bstat *bstat,
+ void *arg)
+{
+ bool *pmoveon = arg;
+ char descr[DESCR_BUFSZ];
+ bool moveon = true;
+ xfs_agnumber_t agno;
+ xfs_agino_t agino;
+ int fd = -1;
+
+ agno = bstat->bs_ino / (1ULL << (ctx->inopblog + ctx->agblklog));
+ agino = bstat->bs_ino % (1ULL << (ctx->inopblog + ctx->agblklog));
+ snprintf(descr, DESCR_BUFSZ, _("inode %"PRIu64" (%u/%u)"),
+ (uint64_t)bstat->bs_ino, agno, agino);
+ background_sleep();
+
+ /* Open the dir, let the kernel try to reconnect it to the root. */
+ if (S_ISDIR(bstat->bs_mode)) {
+ fd = xfs_open_handle(handle);
+ if (fd < 0) {
+ if (errno == ESTALE)
+ return ESTALE;
+ str_errno(ctx, descr);
+ goto out;
+ }
+ }
+
+out:
+ if (fd >= 0)
+ close(fd);
+ if (!moveon)
+ *pmoveon = false;
+ return *pmoveon ? 0 : XFS_ITERATE_INODES_ABORT;
+}
+
+/* Check directory connectivity. */
+bool
+xfs_scan_connections(
+ struct scrub_ctx *ctx)
+{
+ bool moveon = true;
+ bool ret;
+
+ if (ctx->errors_found) {
+ str_info(ctx, ctx->mntpoint,
+_("Filesystem has errors, skipping connectivity checks."));
+ return true;
+ }
+
+ ret = xfs_scan_all_inodes(ctx, xfs_scrub_connections, &moveon);
+ if (!ret)
+ moveon = false;
+ if (!moveon)
+ return false;
+ xfs_scrub_report_preen_triggers(ctx);
+ return true;
+}
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index 5bde6cf..64517f4 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -365,6 +365,7 @@ run_scrub_phases(
},
{
.descr = _("Check directory tree."),
+ .fn = xfs_scan_connections,
},
{
.descr = _("Verify data file integrity."),
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index 41d471b..c9f53d8 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -100,5 +100,6 @@ bool xfs_cleanup_fs(struct scrub_ctx *ctx);
bool xfs_setup_fs(struct scrub_ctx *ctx);
bool xfs_scan_metadata(struct scrub_ctx *ctx);
bool xfs_scan_inodes(struct scrub_ctx *ctx);
+bool xfs_scan_connections(struct scrub_ctx *ctx);
#endif /* XFS_SCRUB_XFS_SCRUB_H_ */
* [PATCH 17/27] xfs_scrub: warn about suspicious characters in directory/xattr names
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (15 preceding siblings ...)
2018-01-06 1:53 ` [PATCH 16/27] xfs_scrub: check directory connectivity Darrick J. Wong
@ 2018-01-06 1:53 ` Darrick J. Wong
2018-01-06 1:53 ` [PATCH 18/27] xfs_scrub: warn about normalized Unicode name collisions Darrick J. Wong
` (13 subsequent siblings)
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:53 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Look for control characters and punctuation that interfere with shell
globbing in directory entry names and extended attribute key names.
Technically these aren't filesystem corruptions because names are
arbitrary sequences of bytes, but they've been known to cause problems
in the Unix environment so warn if we see them.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
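To make the check concrete, here is a small standalone sketch of the two
steps described above: spotting control bytes in a name and escaping the
name for display. It is an illustration under assumptions, not the code
in this patch. The function names are hypothetical, and it treats bytes
as unsigned explicitly instead of relying on -funsigned-char.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: true if name has a byte in 1..31 or byte 127. */
bool name_has_control_char(const char *name)
{
	const unsigned char	*p;

	/* The loop stops at NUL, so "< 32" here really means 1..31. */
	for (p = (const unsigned char *)name; *p; p++)
		if (*p < 32 || *p == 127)
			return true;
	return false;
}

/*
 * Hypothetical helper: return a copy of the name with every
 * non-printing byte written as \xNN.  Caller must free the result.
 */
char *escape_name(const char *in)
{
	const unsigned char	*p;
	char			*out;
	char			*q;

	assert(in != NULL);

	/* Worst case is four output bytes per input byte, plus the NUL. */
	out = malloc(strlen(in) * 4 + 1);
	if (!out)
		return NULL;
	for (p = (const unsigned char *)in, q = out; *p; p++) {
		if (isprint(*p))
			*q++ = (char)*p;
		else
			q += sprintf(q, "\\x%02x", *p);
	}
	*q = '\0';
	return out;
}
```

For example, a name consisting of 'a', a tab, and 'b' is flagged by
name_has_control_char(), and escape_name() renders the tab as the four
characters \x09. In the default "C" locale, isprint() also rejects
bytes above 127, so those get escaped as well.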
configure.ac | 2 +
debian/control | 2 -
include/builddefs.in | 1
m4/Makefile | 1
m4/package_attr.m4 | 23 ++++++
scrub/Makefile | 6 ++
scrub/common.c | 54 ++++++++++++++
scrub/common.h | 4 +
scrub/phase5.c | 192 ++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/xfs_scrub.h | 1
10 files changed, 285 insertions(+), 1 deletion(-)
create mode 100644 m4/package_attr.m4
diff --git a/configure.ac b/configure.ac
index 796a91b..e2e3f66 100644
--- a/configure.ac
+++ b/configure.ac
@@ -166,6 +166,8 @@ AC_HAVE_STATFS_FLAGS
AC_HAVE_MAP_SYNC
AC_HAVE_DEVMAPPER
AC_HAVE_MALLINFO
+AC_PACKAGE_WANT_ATTRIBUTES_H
+AC_HAVE_LIBATTR
if test "$enable_blkid" = yes; then
AC_HAVE_BLKID_TOPO
diff --git a/debian/control b/debian/control
index f5980b2..f664a6b 100644
--- a/debian/control
+++ b/debian/control
@@ -3,7 +3,7 @@ Section: admin
Priority: optional
Maintainer: XFS Development Team <linux-xfs@vger.kernel.org>
Uploaders: Nathan Scott <nathans@debian.org>, Anibal Monsalve Salazar <anibal@debian.org>
-Build-Depends: uuid-dev, dh-autoreconf, debhelper (>= 5), gettext, libtool, libreadline-gplv2-dev | libreadline5-dev, libblkid-dev (>= 2.17), linux-libc-dev, libdevmapper-dev
+Build-Depends: uuid-dev, dh-autoreconf, debhelper (>= 5), gettext, libtool, libreadline-gplv2-dev | libreadline5-dev, libblkid-dev (>= 2.17), linux-libc-dev, libdevmapper-dev, libattr1-dev
Standards-Version: 3.9.1
Homepage: https://xfs.wiki.kernel.org/
diff --git a/include/builddefs.in b/include/builddefs.in
index 28cf0d8..cc1b7e2 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -120,6 +120,7 @@ HAVE_STATFS_FLAGS = @have_statfs_flags@
HAVE_MAP_SYNC = @have_map_sync@
HAVE_DEVMAPPER = @have_devmapper@
HAVE_MALLINFO = @have_mallinfo@
+HAVE_LIBATTR = @have_libattr@
GCCFLAGS = -funsigned-char -fno-strict-aliasing -Wall
# -Wbitwise -Wno-transparent-union -Wno-old-initializer -Wno-decl
diff --git a/m4/Makefile b/m4/Makefile
index 77f2edd..d5f1d2f 100644
--- a/m4/Makefile
+++ b/m4/Makefile
@@ -17,6 +17,7 @@ LSRCFILES = \
package_blkid.m4 \
package_devmapper.m4 \
package_globals.m4 \
+ package_attr.m4 \
package_libcdev.m4 \
package_pthread.m4 \
package_sanitizer.m4 \
diff --git a/m4/package_attr.m4 b/m4/package_attr.m4
new file mode 100644
index 0000000..4324923
--- /dev/null
+++ b/m4/package_attr.m4
@@ -0,0 +1,23 @@
+AC_DEFUN([AC_PACKAGE_WANT_ATTRIBUTES_H],
+ [
+ AC_CHECK_HEADERS(attr/attributes.h)
+ ])
+
+#
+# Check if we have an ATTR_ROOT flag and libattr structures
+#
+AC_DEFUN([AC_HAVE_LIBATTR],
+ [ AC_MSG_CHECKING([for struct attrlist_cursor])
+ AC_TRY_COMPILE([
+#include <sys/types.h>
+#include <attr/attributes.h>
+ ], [
+struct attrlist_cursor *cur;
+struct attrlist *list;
+struct attrlist_ent *ent;
+int flags = ATTR_ROOT;
+ ], have_libattr=yes
+ AC_MSG_RESULT(yes),
+ AC_MSG_RESULT(no))
+ AC_SUBST(have_libattr)
+ ])
diff --git a/scrub/Makefile b/scrub/Makefile
index adb868e..67ac6af 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -53,8 +53,14 @@ ifeq ($(HAVE_SYNCFS),yes)
LCFLAGS += -DHAVE_SYNCFS
endif
+ifeq ($(HAVE_LIBATTR),yes)
+LCFLAGS += -DHAVE_LIBATTR
+endif
+
default: depend $(LTCOMMAND)
+phase5.o: $(TOPDIR)/include/builddefs
+
include $(BUILDRULES)
install: default $(INSTALL_SCRUB)
diff --git a/scrub/common.c b/scrub/common.c
index eb602a8..b02e7fc 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -339,3 +339,57 @@ background_sleep(void)
tv.tv_nsec = time % 1000000;
nanosleep(&tv, NULL);
}
+
+/*
+ * Return the input string with non-printing bytes escaped.
+ * Caller must free the buffer.
+ */
+char *
+string_escape(
+ const char *in)
+{
+ char *str;
+ const char *p;
+ char *q;
+ int x;
+
+ str = malloc(strlen(in) * 4 + 1);
+ if (!str)
+ return NULL;
+ for (p = in, q = str; *p != '\0'; p++) {
+ if (isprint(*p)) {
+ *q = *p;
+ q++;
+ } else {
+ x = sprintf(q, "\\x%02x", *p);
+ q += x;
+ }
+ }
+ *q = '\0';
+ return str;
+}
+
+/*
+ * Record another naming warning, and decide if it's worth
+ * complaining about.
+ */
+bool
+should_warn_about_name(
+ struct scrub_ctx *ctx)
+{
+ bool whine;
+ bool res;
+
+ pthread_mutex_lock(&ctx->lock);
+ ctx->naming_warnings++;
+ whine = ctx->naming_warnings == TOO_MANY_NAME_WARNINGS;
+ res = ctx->naming_warnings < TOO_MANY_NAME_WARNINGS;
+ pthread_mutex_unlock(&ctx->lock);
+
+ if (whine && !(debug || verbose))
+ str_info(ctx, ctx->mntpoint,
+_("More than %u naming warnings, shutting up."),
+ TOO_MANY_NAME_WARNINGS);
+
+ return debug || verbose || res;
+}
diff --git a/scrub/common.h b/scrub/common.h
index 81e83c2..e5a13d8 100644
--- a/scrub/common.h
+++ b/scrub/common.h
@@ -72,5 +72,9 @@ static inline int syncfs(int fd)
bool find_mountpoint(char *mtab, struct scrub_ctx *ctx);
void background_sleep(void);
+char *string_escape(const char *in);
+
+#define TOO_MANY_NAME_WARNINGS 10000
+bool should_warn_about_name(struct scrub_ctx *ctx);
#endif /* XFS_SCRUB_COMMON_H_ */
diff --git a/scrub/phase5.c b/scrub/phase5.c
index 0b161e3..98d30f8 100644
--- a/scrub/phase5.c
+++ b/scrub/phase5.c
@@ -20,10 +20,15 @@
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
+#include <dirent.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/statvfs.h>
+#ifdef HAVE_LIBATTR
+# include <attr/attributes.h>
+#endif
#include "xfs.h"
+#include "handle.h"
#include "path.h"
#include "workqueue.h"
#include "xfs_scrub.h"
@@ -34,6 +39,181 @@
/* Phase 5: Check directory connectivity. */
/*
+ * Warn about problematic bytes in a directory/attribute name. That means
+ * terminal control characters and escape sequences, since those could be used
+ * to do something naughty to the user's computer and/or break scripts. XFS
+ * doesn't consider any byte sequence invalid, so don't flag these as errors.
+ */
+static bool
+xfs_scrub_check_name(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ const char *namedescr,
+ const char *name)
+{
+ const char *p;
+ bool bad = false;
+ char *errname;
+
+ /* Complain about zero length names. */
+ if (*name == '\0' && should_warn_about_name(ctx)) {
+ str_warn(ctx, descr, _("Zero length name found."));
+ return true;
+ }
+
+ /* control characters */
+ for (p = name; *p; p++) {
+ if ((*p >= 1 && *p <= 31) || *p == 127) {
+ bad = true;
+ break;
+ }
+ }
+
+ if (bad && should_warn_about_name(ctx)) {
+ errname = string_escape(name);
+ if (!errname) {
+ str_errno(ctx, descr);
+ return false;
+ }
+ str_info(ctx, descr,
+_("Control character found in %s name \"%s\"."),
+ namedescr, errname);
+ free(errname);
+ }
+
+ return true;
+}
+
+/*
+ * Iterate a directory looking for filenames with problematic
+ * characters.
+ */
+static bool
+xfs_scrub_scan_dirents(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ int *fd)
+{
+ DIR *dir;
+ struct dirent *dentry;
+ bool moveon = true;
+
+ dir = fdopendir(*fd);
+ if (!dir) {
+ str_errno(ctx, descr);
+ goto out;
+ }
+ *fd = -1; /* closedir will close *fd for us */
+
+ dentry = readdir(dir);
+ while (dentry) {
+ moveon = xfs_scrub_check_name(ctx, descr, _("directory"),
+ dentry->d_name);
+ if (!moveon)
+ break;
+ dentry = readdir(dir);
+ }
+
+ closedir(dir);
+out:
+ return moveon;
+}
+
+#ifdef HAVE_LIBATTR
+/* Routines to scan all of an inode's xattrs for name problems. */
+struct xfs_attr_ns {
+ int flags;
+ const char *name;
+};
+
+static const struct xfs_attr_ns attr_ns[] = {
+ {0, "user"},
+ {ATTR_ROOT, "system"},
+ {ATTR_SECURE, "secure"},
+ {0, NULL},
+};
+
+/*
+ * Check all the xattr names in a particular namespace of a file handle
+ * for Unicode normalization problems or collisions.
+ */
+static bool
+xfs_scrub_scan_fhandle_namespace_xattrs(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ struct xfs_handle *handle,
+ const struct xfs_attr_ns *attr_ns)
+{
+ struct attrlist_cursor cur;
+ char attrbuf[XFS_XATTR_LIST_MAX];
+ char keybuf[NAME_MAX + 1];
+ struct attrlist *attrlist = (struct attrlist *)attrbuf;
+ struct attrlist_ent *ent;
+ bool moveon = true;
+ int i;
+ int error;
+
+ memset(attrbuf, 0, XFS_XATTR_LIST_MAX);
+ memset(&cur, 0, sizeof(cur));
+ memset(keybuf, 0, NAME_MAX + 1);
+ error = attr_list_by_handle(handle, sizeof(*handle), attrbuf,
+ XFS_XATTR_LIST_MAX, attr_ns->flags, &cur);
+ while (!error) {
+ /* Examine the xattrs. */
+ for (i = 0; i < attrlist->al_count; i++) {
+ ent = ATTR_ENTRY(attrlist, i);
+ snprintf(keybuf, NAME_MAX, "%s.%s", attr_ns->name,
+ ent->a_name);
+ moveon = xfs_scrub_check_name(ctx, descr,
+ _("extended attribute"), keybuf);
+ if (!moveon)
+ goto out;
+ }
+
+ if (!attrlist->al_more)
+ break;
+ error = attr_list_by_handle(handle, sizeof(*handle), attrbuf,
+ XFS_XATTR_LIST_MAX, attr_ns->flags, &cur);
+ }
+ if (error && errno != ESTALE)
+ str_errno(ctx, descr);
+out:
+ return moveon;
+}
+
+/*
+ * Check all the xattr names in all the xattr namespaces for problematic
+ * characters.
+ */
+static bool
+xfs_scrub_scan_fhandle_xattrs(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ struct xfs_handle *handle)
+{
+ const struct xfs_attr_ns *ns;
+ bool moveon = true;
+
+ for (ns = attr_ns; ns->name; ns++) {
+ moveon = xfs_scrub_scan_fhandle_namespace_xattrs(ctx, descr,
+ handle, ns);
+ if (!moveon)
+ break;
+ }
+ return moveon;
+}
+#else
+static inline bool
+xfs_scrub_scan_fhandle_xattrs(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ struct xfs_handle *handle)
+{
+ return true;
+}
+#endif /* HAVE_LIBATTR */
+
+/*
* Verify the connectivity of the directory tree.
* We know that the kernel's open-by-handle function will try to reconnect
* parents of an opened directory, so we'll accept that as sufficient.
@@ -58,6 +238,11 @@ xfs_scrub_connections(
(uint64_t)bstat->bs_ino, agno, agino);
background_sleep();
+ /* Warn about naming problems in xattrs. */
+ moveon = xfs_scrub_scan_fhandle_xattrs(ctx, descr, handle);
+ if (!moveon)
+ goto out;
+
/* Open the dir, let the kernel try to reconnect it to the root. */
if (S_ISDIR(bstat->bs_mode)) {
fd = xfs_open_handle(handle);
@@ -69,6 +254,13 @@ xfs_scrub_connections(
}
}
+ /* Warn about naming problems in the directory entries. */
+ if (fd >= 0 && S_ISDIR(bstat->bs_mode)) {
+ moveon = xfs_scrub_scan_dirents(ctx, descr, &fd);
+ if (!moveon)
+ goto out;
+ }
+
out:
if (fd >= 0)
close(fd);
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index c9f53d8..66003e4 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -90,6 +90,7 @@ struct scrub_ctx {
unsigned long long errors_found;
unsigned long long warnings_found;
unsigned long long inodes_checked;
+ unsigned long long naming_warnings;
bool need_repair;
bool preen_triggers[XFS_SCRUB_TYPE_NR];
};
* [PATCH 18/27] xfs_scrub: warn about normalized Unicode name collisions
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (16 preceding siblings ...)
2018-01-06 1:53 ` [PATCH 17/27] xfs_scrub: warn about suspicious characters in directory/xattr names Darrick J. Wong
@ 2018-01-06 1:53 ` Darrick J. Wong
2018-01-16 23:52 ` Eric Sandeen
2018-01-06 1:53 ` [PATCH 19/27] xfs_scrub: create a bitmap data structure Darrick J. Wong
` (12 subsequent siblings)
30 siblings, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:53 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Iterate all directory and xattr names to look for name collisions
amongst Unicode normalized names. This is generally a sign of buggy
programs or malicious duplicate files.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
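For readers skimming the series, here is a minimal sketch of the collision
check described above. It is not the code in this patch: it assumes the
caller has already normalized each name (the patch does that with
u8_normalize() and UNINORM_NFKC from libunistring), and the names
remember_name(), hash_name() and NR_BUCKETS are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NR_BUCKETS	16	/* hypothetical fixed size for the sketch */

enum { NAME_NOMEM = -1, NAME_OK = 0, NAME_COLLISION = 1 };

struct entry {
	struct entry	*next;
	uint64_t	ino;
	char		name[];	/* normalized name, NUL terminated */
};

static struct entry *buckets[NR_BUCKETS];

/* Rotate left by 7 bits, then XOR in each byte of the name. */
static uint32_t hash_name(const char *name)
{
	uint32_t h = 0;

	while (*name)
		h = ((h << 7) | (h >> 25)) ^ (unsigned char)*name++;
	return h;
}

/*
 * Remember an already-normalized name.  Returns NAME_OK if the name is
 * new or was seen before with the same inode, NAME_COLLISION if it was
 * seen before with a different inode, and NAME_NOMEM if malloc fails.
 */
static int remember_name(const char *norm, uint64_t ino)
{
	struct entry **ep;
	struct entry *e;
	size_t len;

	assert(norm != NULL);
	len = strlen(norm);

	/* Walk the bucket chain looking for the same normalized name. */
	ep = &buckets[hash_name(norm) % NR_BUCKETS];
	for (; *ep != NULL; ep = &(*ep)->next) {
		if (strcmp((*ep)->name, norm) == 0)
			return (*ep)->ino == ino ? NAME_OK : NAME_COLLISION;
	}

	/* Not found; append a new record at the end of the chain. */
	e = malloc(sizeof(*e) + len + 1);
	if (!e)
		return NAME_NOMEM;
	e->next = NULL;
	e->ino = ino;
	memcpy(e->name, norm, len + 1);
	*ep = e;
	return NAME_OK;
}
```

Feeding remember_name() the same normalized string twice with the same
inode number is treated as harmless, since the two names refer to the same
file; a second sighting with a different inode number is the case worth a
warning. For xattrs there is no inode to compare, so the patch warns on any
repeat.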
configure.ac | 2
debian/control | 2
include/builddefs.in | 2
m4/Makefile | 1
m4/package_unistring.m4 | 19 ++
scrub/Makefile | 12 +
scrub/common.c | 20 ++
scrub/common.h | 3
scrub/phase5.c | 53 +++++-
scrub/unicrash.c | 399 +++++++++++++++++++++++++++++++++++++++++++++++
scrub/unicrash.h | 49 ++++++
scrub/xfs_scrub.c | 2
12 files changed, 546 insertions(+), 18 deletions(-)
create mode 100644 m4/package_unistring.m4
create mode 100644 scrub/unicrash.c
create mode 100644 scrub/unicrash.h
diff --git a/configure.ac b/configure.ac
index e2e3f66..fc44bd5 100644
--- a/configure.ac
+++ b/configure.ac
@@ -168,6 +168,8 @@ AC_HAVE_DEVMAPPER
AC_HAVE_MALLINFO
AC_PACKAGE_WANT_ATTRIBUTES_H
AC_HAVE_LIBATTR
+AC_PACKAGE_WANT_UNINORM_H
+AC_HAVE_U8NORMALIZE
if test "$enable_blkid" = yes; then
AC_HAVE_BLKID_TOPO
diff --git a/debian/control b/debian/control
index f664a6b..36d1bd8 100644
--- a/debian/control
+++ b/debian/control
@@ -3,7 +3,7 @@ Section: admin
Priority: optional
Maintainer: XFS Development Team <linux-xfs@vger.kernel.org>
Uploaders: Nathan Scott <nathans@debian.org>, Anibal Monsalve Salazar <anibal@debian.org>
-Build-Depends: uuid-dev, dh-autoreconf, debhelper (>= 5), gettext, libtool, libreadline-gplv2-dev | libreadline5-dev, libblkid-dev (>= 2.17), linux-libc-dev, libdevmapper-dev, libattr1-dev
+Build-Depends: uuid-dev, dh-autoreconf, debhelper (>= 5), gettext, libtool, libreadline-gplv2-dev | libreadline5-dev, libblkid-dev (>= 2.17), linux-libc-dev, libdevmapper-dev, libattr1-dev, libunistring-dev
Standards-Version: 3.9.1
Homepage: https://xfs.wiki.kernel.org/
diff --git a/include/builddefs.in b/include/builddefs.in
index cc1b7e2..1c264a0 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -36,6 +36,7 @@ LIBEDITLINE = @libeditline@
LIBREADLINE = @libreadline@
LIBBLKID = @libblkid@
LIBDEVMAPPER = @libdevmapper@
+LIBUNISTRING = @libunistring@
LIBXFS = $(TOPDIR)/libxfs/libxfs.la
LIBFROG = $(TOPDIR)/libfrog/libfrog.la
LIBXCMD = $(TOPDIR)/libxcmd/libxcmd.la
@@ -121,6 +122,7 @@ HAVE_MAP_SYNC = @have_map_sync@
HAVE_DEVMAPPER = @have_devmapper@
HAVE_MALLINFO = @have_mallinfo@
HAVE_LIBATTR = @have_libattr@
+HAVE_U8NORMALIZE = @have_u8normalize@
GCCFLAGS = -funsigned-char -fno-strict-aliasing -Wall
# -Wbitwise -Wno-transparent-union -Wno-old-initializer -Wno-decl
diff --git a/m4/Makefile b/m4/Makefile
index d5f1d2f..61d617e 100644
--- a/m4/Makefile
+++ b/m4/Makefile
@@ -22,6 +22,7 @@ LSRCFILES = \
package_pthread.m4 \
package_sanitizer.m4 \
package_types.m4 \
+ package_unistring.m4 \
package_utilies.m4 \
package_uuiddev.m4 \
multilib.m4 \
diff --git a/m4/package_unistring.m4 b/m4/package_unistring.m4
new file mode 100644
index 0000000..9cbfcb0
--- /dev/null
+++ b/m4/package_unistring.m4
@@ -0,0 +1,19 @@
+AC_DEFUN([AC_PACKAGE_WANT_UNINORM_H],
+ [ AC_CHECK_HEADERS(uninorm.h)
+ if test $ac_cv_header_uninorm_h = no; then
+ AC_CHECK_HEADERS(uninorm.h,, [
+ echo
+ echo 'WARNING: could not find a valid uninorm.h header.'])
+ fi
+ ])
+
+AC_DEFUN([AC_HAVE_U8NORMALIZE],
+ [ AC_CHECK_LIB(unistring, u8_normalize,[
+ libunistring=-lunistring
+ have_u8normalize=yes
+ ],[
+ echo
+ echo 'WARNING: xfs_scrub will not be built with Unicode libraries.'])
+ AC_SUBST(libunistring)
+ AC_SUBST(have_u8normalize)
+ ])
diff --git a/scrub/Makefile b/scrub/Makefile
index 67ac6af..858bc40 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -24,6 +24,7 @@ fscounters.h \
inodes.h \
scrub.h \
spacemap.h \
+unicrash.h \
xfs_scrub.h
CFILES = \
@@ -41,8 +42,8 @@ scrub.c \
spacemap.c \
xfs_scrub.c
-LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD)
-LTDEPENDENCIES += $(LIBHANDLE) $(LIBFROG)
+LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD) $(LIBUNISTRING)
+LTDEPENDENCIES += $(LIBHANDLE) $(LIBFROG) $(LIBUNISTRING)
LLDFLAGS = -static
ifeq ($(HAVE_MALLINFO),yes)
@@ -57,9 +58,14 @@ ifeq ($(HAVE_LIBATTR),yes)
LCFLAGS += -DHAVE_LIBATTR
endif
+ifeq ($(HAVE_U8NORMALIZE),yes)
+CFILES += unicrash.c
+LCFLAGS += -DHAVE_U8NORMALIZE
+endif
+
default: depend $(LTCOMMAND)
-phase5.o: $(TOPDIR)/include/builddefs
+phase5.o unicrash.o xfs.o: $(TOPDIR)/include/builddefs
include $(BUILDRULES)
diff --git a/scrub/common.c b/scrub/common.c
index b02e7fc..10c4017 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -75,6 +75,26 @@ __str_errno(
pthread_mutex_unlock(&ctx->lock);
}
+/* Print a warning string and whatever error is stored in errno. */
+void
+__str_errno_warn(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ const char *file,
+ int line)
+{
+ char buf[DESCR_BUFSZ];
+
+ pthread_mutex_lock(&ctx->lock);
+ fprintf(stderr, _("Warning: %s: %s."), descr,
+ strerror_r(errno, buf, DESCR_BUFSZ));
+ if (debug)
+ fprintf(stderr, _(" (%s line %d)"), file, line);
+ fprintf(stderr, "\n");
+ ctx->warnings_found++;
+ pthread_mutex_unlock(&ctx->lock);
+}
+
/* Print an error string and some error text. */
void
__str_error(
diff --git a/scrub/common.h b/scrub/common.h
index e5a13d8..dd2070e 100644
--- a/scrub/common.h
+++ b/scrub/common.h
@@ -41,11 +41,14 @@ void __record_repair(struct scrub_ctx *ctx, const char *descr, const char *file,
int line, const char *format, ...);
void __record_preen(struct scrub_ctx *ctx, const char *descr, const char *file,
int line, const char *format, ...);
+void __str_errno_warn(struct scrub_ctx *, const char *descr, const char *file,
+ int line);
#define str_errno(ctx, str) __str_errno(ctx, str, __FILE__, __LINE__)
#define str_error(ctx, str, ...) __str_error(ctx, str, __FILE__, __LINE__, __VA_ARGS__)
#define str_warn(ctx, str, ...) __str_warn(ctx, str, __FILE__, __LINE__, __VA_ARGS__)
#define str_info(ctx, str, ...) __str_info(ctx, str, __FILE__, __LINE__, __VA_ARGS__)
+#define str_errno_warn(ctx, str) __str_errno_warn(ctx, str, __FILE__, __LINE__)
#define dbg_printf(fmt, ...) {if (debug > 1) {printf(fmt, __VA_ARGS__);}}
/* Is this debug tweak enabled? */
diff --git a/scrub/phase5.c b/scrub/phase5.c
index 98d30f8..8b8aeed 100644
--- a/scrub/phase5.c
+++ b/scrub/phase5.c
@@ -35,6 +35,7 @@
#include "common.h"
#include "inodes.h"
#include "scrub.h"
+#include "unicrash.h"
/* Phase 5: Check directory connectivity. */
@@ -92,8 +93,10 @@ static bool
xfs_scrub_scan_dirents(
struct scrub_ctx *ctx,
const char *descr,
- int *fd)
+ int *fd,
+ struct xfs_bstat *bstat)
{
+ struct unicrash *uc = NULL;
DIR *dir;
struct dirent *dentry;
bool moveon = true;
@@ -105,15 +108,24 @@ xfs_scrub_scan_dirents(
}
*fd = -1; /* closedir will close *fd for us */
+ moveon = unicrash_dir_init(&uc, ctx, bstat);
+ if (!moveon)
+ goto out_unicrash;
+
dentry = readdir(dir);
while (dentry) {
moveon = xfs_scrub_check_name(ctx, descr, _("directory"),
dentry->d_name);
if (!moveon)
break;
+ moveon = unicrash_check_dir_name(uc, descr, dentry);
+ if (!moveon)
+ break;
dentry = readdir(dir);
}
+ unicrash_free(uc);
+out_unicrash:
closedir(dir);
out:
return moveon;
@@ -142,6 +154,7 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
struct scrub_ctx *ctx,
const char *descr,
struct xfs_handle *handle,
+ struct xfs_bstat *bstat,
const struct xfs_attr_ns *attr_ns)
{
struct attrlist_cursor cur;
@@ -149,10 +162,15 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
char keybuf[NAME_MAX + 1];
struct attrlist *attrlist = (struct attrlist *)attrbuf;
struct attrlist_ent *ent;
+ struct unicrash *uc;
bool moveon = true;
int i;
int error;
+ moveon = unicrash_xattr_init(&uc, ctx, bstat);
+ if (!moveon)
+ return false;
+
memset(attrbuf, 0, XFS_XATTR_LIST_MAX);
memset(&cur, 0, sizeof(cur));
memset(keybuf, 0, NAME_MAX + 1);
@@ -168,6 +186,9 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
_("extended attribute"), keybuf);
if (!moveon)
goto out;
+ moveon = unicrash_check_xattr_name(uc, descr, keybuf);
+ if (!moveon)
+ goto out;
}
if (!attrlist->al_more)
@@ -178,6 +199,7 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
if (error && errno != ESTALE)
str_errno(ctx, descr);
out:
+ unicrash_free(uc);
return moveon;
}
@@ -189,14 +211,15 @@ static bool
xfs_scrub_scan_fhandle_xattrs(
struct scrub_ctx *ctx,
const char *descr,
- struct xfs_handle *handle)
+ struct xfs_handle *handle,
+ struct xfs_bstat *bstat)
{
const struct xfs_attr_ns *ns;
bool moveon = true;
for (ns = attr_ns; ns->name; ns++) {
moveon = xfs_scrub_scan_fhandle_namespace_xattrs(ctx, descr,
- handle, ns);
+ handle, bstat, ns);
if (!moveon)
break;
}
@@ -217,6 +240,8 @@ xfs_scrub_scan_fhandle_xattrs(
* Verify the connectivity of the directory tree.
* We know that the kernel's open-by-handle function will try to reconnect
* parents of an opened directory, so we'll accept that as sufficient.
+ *
+ * Check for potential Unicode collisions in names.
*/
static int
xfs_scrub_connections(
@@ -227,7 +252,7 @@ xfs_scrub_connections(
{
bool *pmoveon = arg;
char descr[DESCR_BUFSZ];
- bool moveon = true;
+ bool moveon;
xfs_agnumber_t agno;
xfs_agino_t agino;
int fd = -1;
@@ -238,10 +263,10 @@ xfs_scrub_connections(
(uint64_t)bstat->bs_ino, agno, agino);
background_sleep();
- /* Warn about naming problems in xattrs. */
- moveon = xfs_scrub_scan_fhandle_xattrs(ctx, descr, handle);
- if (!moveon)
- goto out;
+ /* Warn about naming problems in xattrs. */
+ moveon = xfs_scrub_scan_fhandle_xattrs(ctx, descr, handle, bstat);
+ if (!moveon)
+ goto out;
/* Open the dir, let the kernel try to reconnect it to the root. */
if (S_ISDIR(bstat->bs_mode)) {
@@ -254,12 +279,12 @@ xfs_scrub_connections(
}
}
- /* Warn about naming problems in the directory entries. */
- if (fd >= 0 && S_ISDIR(bstat->bs_mode)) {
- moveon = xfs_scrub_scan_dirents(ctx, descr, &fd);
- if (!moveon)
- goto out;
- }
+ /* Warn about naming problems in the directory entries. */
+ if (fd >= 0 && S_ISDIR(bstat->bs_mode)) {
+ moveon = xfs_scrub_scan_dirents(ctx, descr, &fd, bstat);
+ if (!moveon)
+ goto out;
+ }
out:
if (fd >= 0)
diff --git a/scrub/unicrash.c b/scrub/unicrash.c
new file mode 100644
index 0000000..25d6701
--- /dev/null
+++ b/scrub/unicrash.c
@@ -0,0 +1,399 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <dirent.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/statvfs.h>
+#include <unistr.h>
+#include <uninorm.h>
+#include "xfs.h"
+#include "path.h"
+#include "xfs_scrub.h"
+#include "common.h"
+
+/*
+ * Detect collisions of Unicode-normalized names.
+ *
+ * Record all the name->ino mappings in a directory/xattr, with a twist!
+ * The twist is that we perform unicode normalization on every name we
+ * see, so that we can warn about a directory containing more than one
+ * directory entry that normalizes to the same Unicode string. These
+ * entries are at best a sign of Unicode mishandling, or some sort of
+ * weird name substitution attack if the entries do not point to the
+ * same inode. Warn if we see multiple dirents that do not all point to
+ * the same inode.
+ *
+ * For extended attributes we perform the same collision checks on the
+ * attribute names, though any collision is enough to trigger a warning.
+ *
+ * We flag these collisions as warnings and not errors because XFS
+ * treats names as a sequence of arbitrary nonzero bytes. While a
+ * Unicode collision is not technically a filesystem corruption, we
+ * ought to say something if there's a possibility for misleading a
+ * user.
+ *
+ * To normalize, we use Unicode NFKC. We use the composing
+ * normalization mode (e.g. "E WITH ACUTE" instead of "E" then "ACUTE")
+ * because that's what W3C (and in general Linux) uses. This enables us
+ * to detect multiple object names that normalize to the same name and
+ * could be confusing to users. Furthermore, we use the compatibility
+ * mode to detect names with compatible but different code points to
+ * strengthen those checks.
+ */
+
+struct name_entry {
+ struct name_entry *next;
+ xfs_ino_t ino;
+ size_t uninamelen;
+ uint8_t uniname[0];
+};
+#define NAME_ENTRY_SZ(nl) (sizeof(struct name_entry) + 1 + \
+ (nl * sizeof(uint8_t)))
+
+struct unicrash {
+ struct scrub_ctx *ctx;
+ bool compare_ino;
+ size_t nr_buckets;
+ struct name_entry *buckets[0];
+};
+#define UNICRASH_SZ(nr) (sizeof(struct unicrash) + \
+ (nr * sizeof(struct name_entry *)))
+
+/*
+ * We only care about validating utf8 collisions if the underlying
+ * system configuration says we're using utf8. If the language
+ * specifier string used to output messages has ".UTF-8" somewhere in
+ * its name, then we conclude utf8 is in use. Otherwise, no checking is
+ * performed.
+ *
+ * Most modern Linux systems default to utf8, so the only time this
+ * check will return false is if the administrator configured things
+ * this way or if things are so messed up there is no locale data at
+ * all.
+ */
+#define UTF8_STR ".UTF-8"
+#define UTF8_STRLEN (sizeof(UTF8_STR) - 1)
+static bool
+is_utf8_locale(void)
+{
+ const char *msg_locale;
+ static int answer = -1;
+
+ if (answer != -1)
+ return answer;
+
+ msg_locale = setlocale(LC_MESSAGES, NULL);
+ if (msg_locale == NULL)
+ return false;
+
+ if (strstr(msg_locale, UTF8_STR) != NULL)
+ answer = 1;
+ else
+ answer = 0;
+ return answer;
+}
+
+/* Set up unicrash global state. */
+void
+unicrash_setup(void)
+{
+ is_utf8_locale();
+}
+
+/* Initialize the collision detector. */
+static bool
+unicrash_init(
+ struct unicrash **ucp,
+ struct scrub_ctx *ctx,
+ bool compare_ino,
+ size_t nr_buckets)
+{
+ struct unicrash *p;
+
+ if (!is_utf8_locale()) {
+ *ucp = NULL;
+ return true;
+ }
+
+ if (nr_buckets > 65536)
+ nr_buckets = 65536;
+ else if (nr_buckets < 16)
+ nr_buckets = 16;
+
+ p = calloc(1, UNICRASH_SZ(nr_buckets));
+ if (!p)
+ return false;
+ p->ctx = ctx;
+ p->nr_buckets = nr_buckets;
+ p->compare_ino = compare_ino;
+ *ucp = p;
+
+ return true;
+}
+
+/* Initialize the collision detector for a directory. */
+bool
+unicrash_dir_init(
+ struct unicrash **ucp,
+ struct scrub_ctx *ctx,
+ struct xfs_bstat *bstat)
+{
+ /*
+ * Assume 64 bytes per dentry, clamp buckets between 16 and 64k.
+ * Same general idea as dir_hash_init in xfs_repair.
+ */
+ return unicrash_init(ucp, ctx, true, bstat->bs_size / 64);
+}
+
+/* Initialize the collision detector for an extended attribute. */
+bool
+unicrash_xattr_init(
+ struct unicrash **ucp,
+ struct scrub_ctx *ctx,
+ struct xfs_bstat *bstat)
+{
+ /* Assume 16 attributes per extent for lack of a better idea. */
+ return unicrash_init(ucp, ctx, false, 16 * (1 + bstat->bs_aextents));
+}
+
+/* Free the crash detector. */
+void
+unicrash_free(
+ struct unicrash *uc)
+{
+ struct name_entry *ne;
+ struct name_entry *x;
+ size_t i;
+
+ if (!uc)
+ return;
+
+ for (i = 0; i < uc->nr_buckets; i++) {
+ for (ne = uc->buckets[i]; ne != NULL; ne = x) {
+ x = ne->next;
+ free(ne);
+ }
+ }
+ free(uc);
+}
+
+/* Steal the dirhash function from libxfs, avoid linking with libxfs. */
+
+#define rol32(x, y) (((x) << (y)) | ((x) >> (32 - (y))))
+
+/*
+ * Implement a simple hash on a character string.
+ * Rotate the hash value by 7 bits, then XOR each character in.
+ * This is implemented with some source-level loop unrolling.
+ */
+static xfs_dahash_t
+unicrash_hashname(
+ const uint8_t *name,
+ size_t namelen)
+{
+ xfs_dahash_t hash;
+
+ /*
+ * Do four characters at a time as long as we can.
+ */
+ for (hash = 0; namelen >= 4; namelen -= 4, name += 4)
+ hash = (name[0] << 21) ^ (name[1] << 14) ^ (name[2] << 7) ^
+ (name[3] << 0) ^ rol32(hash, 7 * 4);
+
+ /*
+ * Now do the rest of the characters.
+ */
+ switch (namelen) {
+ case 3:
+ return (name[0] << 14) ^ (name[1] << 7) ^ (name[2] << 0) ^
+ rol32(hash, 7 * 3);
+ case 2:
+ return (name[0] << 7) ^ (name[1] << 0) ^ rol32(hash, 7 * 2);
+ case 1:
+ return (name[0] << 0) ^ rol32(hash, 7 * 1);
+ default: /* case 0: */
+ return hash;
+ }
+}
+
+/*
+ * Normalize a name according to Unicode NFKC normalization rules.
+ * Returns true if the name was already normalized.
+ */
+static bool
+unicrash_normalize(
+ const char *in,
+ uint8_t *out,
+ size_t outlen)
+{
+ size_t inlen = strlen(in);
+
+ assert(inlen <= outlen);
+ if (!u8_normalize(UNINORM_NFKC, (const uint8_t *)in, inlen,
+ out, &outlen)) {
+ /* Didn't normalize, just return the same buffer. */
+ memcpy(out, in, inlen + 1);
+ return true;
+ }
+ out[outlen] = 0;
+ return outlen == inlen ? memcmp(in, out, inlen) == 0 : false;
+}
+
+/* Complain about Unicode problems. */
+static void
+unicrash_complain(
+ struct unicrash *uc,
+ const char *descr,
+ const char *what,
+ bool normal,
+ bool unique,
+ const char *name,
+ uint8_t *uniname)
+{
+ char *bad1 = NULL;
+ char *bad2 = NULL;
+
+ bad1 = string_escape(name);
+ bad2 = string_escape((char *)uniname);
+
+ if (!normal && should_warn_about_name(uc->ctx))
+ str_info(uc->ctx, descr,
+_("Unicode name \"%s\" in %s should be normalized as \"%s\"."),
+ bad1, what, bad2);
+ if (!unique)
+ str_warn(uc->ctx, descr,
+_("Duplicate normalized Unicode name \"%s\" found in %s."),
+ bad1, what);
+
+ free(bad1);
+ free(bad2);
+}
+
+/*
+ * Try to add a name -> ino entry to the collision detector. The name
+ * must be normalized according to Unicode NFKC normalization rules to
+ * detect byte-unique names that map to the same sequence of Unicode
+ * code points.
+ *
+ * This function returns false only if memory allocation fails. On
+ * success, *unique is true if the name is new or (for directories)
+ * matches an existing record with the same inode, and false if the
+ * name collides with an existing record otherwise.
+ */
+static bool
+unicrash_add(
+ struct unicrash *uc,
+ uint8_t *uniname,
+ xfs_ino_t ino,
+ bool *unique)
+{
+ struct name_entry *ne;
+ struct name_entry *x;
+ struct name_entry **nep;
+ size_t uninamelen = u8_strlen(uniname);
+ size_t bucket;
+ xfs_dahash_t hash;
+
+ /* Do we already know about that name? */
+ hash = unicrash_hashname(uniname, uninamelen);
+ bucket = hash % uc->nr_buckets;
+ for (nep = &uc->buckets[bucket], ne = *nep; ne != NULL; ne = x) {
+ if (u8_strcmp(uniname, ne->uniname) == 0) {
+ *unique = uc->compare_ino ? ne->ino == ino : false;
+ return true;
+ }
+ nep = &ne->next;
+ x = ne->next;
+ }
+
+ /* Remember that name. */
+ x = malloc(NAME_ENTRY_SZ(uninamelen));
+ if (!x)
+ return false;
+ x->next = NULL;
+ x->ino = ino;
+ x->uninamelen = uninamelen;
+ memcpy(x->uniname, uniname, uninamelen + 1);
+ *nep = x;
+ *unique = true;
+
+ return true;
+}
+
+/* Check a name for unicode normalization problems or collisions. */
+static bool
+__unicrash_check_name(
+ struct unicrash *uc,
+ const char *descr,
+ const char *namedescr,
+ const char *name,
+ xfs_ino_t ino)
+{
+ uint8_t uniname[(NAME_MAX * 2) + 1];
+ bool moveon;
+ bool normal;
+ bool unique;
+
+ memset(uniname, 0, (NAME_MAX * 2) + 1);
+ normal = unicrash_normalize(name, uniname, NAME_MAX * 2);
+ moveon = unicrash_add(uc, uniname, ino, &unique);
+ if (!moveon)
+ return false;
+
+ if (normal && unique)
+ return true;
+
+ unicrash_complain(uc, descr, namedescr, normal, unique, name,
+ uniname);
+ return true;
+}
+
+/* Check a directory entry for unicode normalization problems or collisions. */
+bool
+unicrash_check_dir_name(
+ struct unicrash *uc,
+ const char *descr,
+ struct dirent *dentry)
+{
+ if (!uc)
+ return true;
+ return __unicrash_check_name(uc, descr, _("directory"),
+ dentry->d_name, dentry->d_ino);
+}
+
+/*
+ * Check an extended attribute name for unicode normalization problems
+ * or collisions.
+ */
+bool
+unicrash_check_xattr_name(
+ struct unicrash *uc,
+ const char *descr,
+ const char *attrname)
+{
+ if (!uc)
+ return true;
+ return __unicrash_check_name(uc, descr, _("extended attribute"),
+ attrname, 0);
+}
diff --git a/scrub/unicrash.h b/scrub/unicrash.h
new file mode 100644
index 0000000..d4561b7
--- /dev/null
+++ b/scrub/unicrash.h
@@ -0,0 +1,49 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_UNICRASH_H_
+#define XFS_SCRUB_UNICRASH_H_
+
+struct unicrash;
+
+/* Unicode name collision detection. */
+#ifdef HAVE_U8NORMALIZE
+
+struct dirent;
+
+void unicrash_setup(void);
+bool unicrash_dir_init(struct unicrash **ucp, struct scrub_ctx *ctx,
+ struct xfs_bstat *bstat);
+bool unicrash_xattr_init(struct unicrash **ucp, struct scrub_ctx *ctx,
+ struct xfs_bstat *bstat);
+void unicrash_free(struct unicrash *uc);
+bool unicrash_check_dir_name(struct unicrash *uc, const char *descr,
+ struct dirent *dirent);
+bool unicrash_check_xattr_name(struct unicrash *uc, const char *descr,
+ const char *attrname);
+#else
+# define unicrash_setup()
+# define unicrash_dir_init(u, c, b) (true)
+# define unicrash_xattr_init(u, c, b) (true)
+# define unicrash_free(u) do {(u) = (u);} while (0)
+# define unicrash_check_dir_name(u, d, n) (true)
+# define unicrash_check_xattr_name(u, d, n) (true)
+#endif /* HAVE_U8NORMALIZE */
+
+#endif /* XFS_SCRUB_UNICRASH_H_ */
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index 64517f4..f7e4e37 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -31,6 +31,7 @@
#include "path.h"
#include "xfs_scrub.h"
#include "common.h"
+#include "unicrash.h"
/*
* XFS Online Metadata Scrub (and Repair)
@@ -529,6 +530,7 @@ _("Only one of the options -n or -y may be specified.\n"));
if (optind != argc - 1)
usage();
+ unicrash_setup();
ctx.mntpoint = strdup(argv[optind]);
/* Find the mount record for the passed-in argument. */
* [PATCH 19/27] xfs_scrub: create a bitmap data structure
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (17 preceding siblings ...)
2018-01-06 1:53 ` [PATCH 18/27] xfs_scrub: warn about normalized Unicode name collisions Darrick J. Wong
@ 2018-01-06 1:53 ` Darrick J. Wong
2018-01-06 1:53 ` [PATCH 20/27] xfs_scrub: create infrastructure to read verify data blocks Darrick J. Wong
` (11 subsequent siblings)
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:53 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Create an efficient tree-based bitmap data structure. We will use this
during the data block scan to record the LBAs of IO errors so that we
can report broken files to userspace.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
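To illustrate the idea of storing set bits as extents rather than one bit
per key, here is a minimal sketch. It deliberately swaps the AVL tree used
in this patch for a small sorted array to stay short, so it shows only the
merge rule (overlapping or adjacent ranges collapse into one record), not
the patch's data structure. range_set_add(), range_set_test() and
MAX_EXTENTS are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_EXTENTS	64	/* hypothetical capacity for the sketch */

struct extent {
	uint64_t	start;
	uint64_t	length;
};

/* Extents are kept sorted, non-overlapping and non-adjacent. */
struct range_set {
	size_t		nr;
	struct extent	ext[MAX_EXTENTS];
};

/* Set bits [start, start + length); returns false if out of room. */
static bool range_set_add(struct range_set *rs, uint64_t start,
			  uint64_t length)
{
	uint64_t end = start + length;
	size_t first = 0;
	size_t last;

	/* Reject empty ranges and ranges that wrap around. */
	assert(end > start);

	/* Skip extents that end strictly before the new range begins. */
	while (first < rs->nr &&
	       rs->ext[first].start + rs->ext[first].length < start)
		first++;

	/* Absorb every extent that overlaps or touches the new range. */
	for (last = first; last < rs->nr && rs->ext[last].start <= end;
	     last++) {
		uint64_t e_end = rs->ext[last].start + rs->ext[last].length;

		if (rs->ext[last].start < start)
			start = rs->ext[last].start;
		if (e_end > end)
			end = e_end;
	}

	/* A plain insertion needs one free slot. */
	if (last == first && rs->nr == MAX_EXTENTS)
		return false;

	/* Replace ext[first..last) with the single merged extent. */
	memmove(&rs->ext[first + 1], &rs->ext[last],
		(rs->nr - last) * sizeof(rs->ext[0]));
	rs->ext[first].start = start;
	rs->ext[first].length = end - start;
	rs->nr = rs->nr - (last - first) + 1;
	return true;
}

/* Is this bit set? */
static bool range_set_test(const struct range_set *rs, uint64_t bit)
{
	size_t i;

	for (i = 0; i < rs->nr; i++) {
		if (bit >= rs->ext[i].start &&
		    bit - rs->ext[i].start < rs->ext[i].length)
			return true;
	}
	return false;
}
```

Setting [10, 15) and [20, 25) leaves two records; setting [15, 20)
afterwards bridges them into a single [10, 25) record. That is why the
memory cost tracks the number of distinct bad ranges rather than the size
of the key space, which matters when the keys are LBAs on a large disk.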
scrub/Makefile | 2
scrub/bitmap.c | 410 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/bitmap.h | 38 +++++
3 files changed, 450 insertions(+)
create mode 100644 scrub/bitmap.c
create mode 100644 scrub/bitmap.h
diff --git a/scrub/Makefile b/scrub/Makefile
index 858bc40..a9aaa99 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -16,6 +16,7 @@ INSTALL_SCRUB = install-scrub
endif # scrub_prereqs
HFILES = \
+bitmap.h \
common.h \
counter.h \
disk.h \
@@ -28,6 +29,7 @@ unicrash.h \
xfs_scrub.h
CFILES = \
+bitmap.c \
common.c \
counter.c \
disk.c \
diff --git a/scrub/bitmap.c b/scrub/bitmap.c
new file mode 100644
index 0000000..a88fd0e
--- /dev/null
+++ b/scrub/bitmap.c
@@ -0,0 +1,410 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <assert.h>
+#include <inttypes.h>
+#include <pthread.h>
+#include "platform_defs.h"
+#include "avl64.h"
+#include "list.h"
+#include "bitmap.h"
+
+/*
+ * Space Efficient Bitmap
+ *
+ * Implements a space-efficient bitmap. We use an AVL tree to manage
+ * extent records that tell us which ranges are set; the bitmap key is
+ * an arbitrary uint64_t. The usual bitmap operations (set, clear,
+ * test, test and set) are supported, plus we can iterate set ranges.
+ */
+
+#define avl_for_each_range_safe(pos, n, l, first, last) \
+ for (pos = (first), n = pos->avl_nextino, l = (last)->avl_nextino; pos != (l); \
+ pos = n, n = pos ? pos->avl_nextino : NULL)
+
+#define avl_for_each_safe(tree, pos, n) \
+ for (pos = (tree)->avl_firstino, n = pos ? pos->avl_nextino : NULL; \
+ pos != NULL; \
+ pos = n, n = pos ? pos->avl_nextino : NULL)
+
+#define avl_for_each(tree, pos) \
+ for (pos = (tree)->avl_firstino; pos != NULL; pos = pos->avl_nextino)
+
+struct bitmap_node {
+ struct avl64node btn_node;
+ uint64_t btn_start;
+ uint64_t btn_length;
+};
+
+static uint64_t
+extent_start(
+ struct avl64node *node)
+{
+ struct bitmap_node *btn;
+
+ btn = container_of(node, struct bitmap_node, btn_node);
+ return btn->btn_start;
+}
+
+static uint64_t
+extent_end(
+ struct avl64node *node)
+{
+ struct bitmap_node *btn;
+
+ btn = container_of(node, struct bitmap_node, btn_node);
+ return btn->btn_start + btn->btn_length;
+}
+
+static struct avl64ops bitmap_ops = {
+ extent_start,
+ extent_end,
+};
+
+/* Initialize a bitmap. */
+bool
+bitmap_init(
+ struct bitmap **bmapp)
+{
+ struct bitmap *bmap;
+
+ bmap = calloc(1, sizeof(struct bitmap));
+ if (!bmap)
+ return false;
+ bmap->bt_tree = malloc(sizeof(struct avl64tree_desc));
+ if (!bmap->bt_tree) {
+ free(bmap);
+ return false;
+ }
+
+ pthread_mutex_init(&bmap->bt_lock, NULL);
+ avl64_init_tree(bmap->bt_tree, &bitmap_ops);
+ *bmapp = bmap;
+
+ return true;
+}
+
+/* Free a bitmap. */
+void
+bitmap_free(
+ struct bitmap **bmapp)
+{
+ struct bitmap *bmap;
+ struct avl64node *node;
+ struct avl64node *n;
+ struct bitmap_node *ext;
+
+ bmap = *bmapp;
+ avl_for_each_safe(bmap->bt_tree, node, n) {
+ ext = container_of(node, struct bitmap_node, btn_node);
+ free(ext);
+ }
+ free(bmap->bt_tree);
+ *bmapp = NULL;
+}
+
+/* Create a new bitmap extent node. */
+static struct bitmap_node *
+bitmap_node_init(
+ uint64_t start,
+ uint64_t len)
+{
+ struct bitmap_node *ext;
+
+ ext = malloc(sizeof(struct bitmap_node));
+ if (!ext)
+ return NULL;
+
+ ext->btn_node.avl_nextino = NULL;
+ ext->btn_start = start;
+ ext->btn_length = len;
+
+ return ext;
+}
+
+/* Set a region of bits (locked). */
+static bool
+__bitmap_set(
+ struct bitmap *bmap,
+ uint64_t start,
+ uint64_t length)
+{
+ struct avl64node *firstn;
+ struct avl64node *lastn;
+ struct avl64node *pos;
+ struct avl64node *n;
+ struct avl64node *l;
+ struct bitmap_node *ext;
+ uint64_t new_start;
+ uint64_t new_length;
+ struct avl64node *node;
+ bool res = true;
+
+ /* Find any existing nodes adjacent or within that range. */
+ avl64_findranges(bmap->bt_tree, start - 1, start + length + 1,
+ &firstn, &lastn);
+
+ /* Nothing, just insert a new extent. */
+ if (firstn == NULL && lastn == NULL) {
+ ext = bitmap_node_init(start, length);
+ if (!ext)
+ return false;
+
+ node = avl64_insert(bmap->bt_tree, &ext->btn_node);
+ if (node == NULL) {
+ free(ext);
+ errno = EEXIST;
+ return false;
+ }
+
+ return true;
+ }
+
+ assert(firstn != NULL && lastn != NULL);
+ new_start = start;
+ new_length = length;
+
+ avl_for_each_range_safe(pos, n, l, firstn, lastn) {
+ ext = container_of(pos, struct bitmap_node, btn_node);
+
+ /* Bail if the new extent is contained within an old one. */
+ if (ext->btn_start <= start &&
+ ext->btn_start + ext->btn_length >= start + length)
+ return res;
+
+ /* Check for overlapping and adjacent extents. */
+ if (ext->btn_start + ext->btn_length >= start ||
+ ext->btn_start <= start + length) {
+ if (ext->btn_start < start) {
+ new_start = ext->btn_start;
+ new_length += ext->btn_length;
+ }
+
+ if (ext->btn_start + ext->btn_length >
+ new_start + new_length)
+ new_length = ext->btn_start + ext->btn_length -
+ new_start;
+
+ avl64_delete(bmap->bt_tree, pos);
+ free(ext);
+ }
+ }
+
+ ext = bitmap_node_init(new_start, new_length);
+ if (!ext)
+ return false;
+
+ node = avl64_insert(bmap->bt_tree, &ext->btn_node);
+ if (node == NULL) {
+ free(ext);
+ errno = EEXIST;
+ return false;
+ }
+
+ return res;
+}
+
+/* Set a region of bits. */
+bool
+bitmap_set(
+ struct bitmap *bmap,
+ uint64_t start,
+ uint64_t length)
+{
+ bool res;
+
+ pthread_mutex_lock(&bmap->bt_lock);
+ res = __bitmap_set(bmap, start, length);
+ pthread_mutex_unlock(&bmap->bt_lock);
+
+ return res;
+}
+
+#if 0 /* Unused, provided for completeness. */
+/* Clear a region of bits. */
+bool
+bitmap_clear(
+ struct bitmap *bmap,
+ uint64_t start,
+ uint64_t len)
+{
+ struct avl64node *firstn;
+ struct avl64node *lastn;
+ struct avl64node *pos;
+ struct avl64node *n;
+ struct avl64node *l;
+ struct bitmap_node *ext;
+ uint64_t new_start;
+ uint64_t new_length;
+ struct avl64node *node;
+ int stat;
+
+ pthread_mutex_lock(&bmap->bt_lock);
+ /* Find any existing nodes over that range. */
+ avl64_findranges(bmap->bt_tree, start, start + len, &firstn, &lastn);
+
+ /* Nothing, we're done. */
+ if (firstn == NULL && lastn == NULL) {
+ pthread_mutex_unlock(&bmap->bt_lock);
+ return true;
+ }
+
+ assert(firstn != NULL && lastn != NULL);
+
+ /* Delete or truncate everything in sight. */
+ avl_for_each_range_safe(pos, n, l, firstn, lastn) {
+ ext = container_of(pos, struct bitmap_node, btn_node);
+
+ stat = 0;
+ if (ext->btn_start < start)
+ stat |= 1;
+ if (ext->btn_start + ext->btn_length > start + len)
+ stat |= 2;
+ switch (stat) {
+ case 0:
+ /* Extent totally within range; delete. */
+ avl64_delete(bmap->bt_tree, pos);
+ free(ext);
+ break;
+ case 1:
+ /* Extent is left-adjacent; truncate. */
+ ext->btn_length = start - ext->btn_start;
+ break;
+ case 2:
+ /* Extent is right-adjacent; move it. */
+ ext->btn_length = ext->btn_start + ext->btn_length -
+ (start + len);
+ ext->btn_start = start + len;
+ break;
+ case 3:
+ /* Extent overlaps both ends; split it in two. */
+ new_start = start + len;
+ new_length = ext->btn_start + ext->btn_length -
+ new_start;
+ ext->btn_length = start - ext->btn_start;
+
+ ext = bitmap_node_init(new_start, new_length);
+ if (!ext) {
+ pthread_mutex_unlock(&bmap->bt_lock);
+ return false;
+ }
+
+ node = avl64_insert(bmap->bt_tree, &ext->btn_node);
+ if (node == NULL) {
+ free(ext);
+ pthread_mutex_unlock(&bmap->bt_lock);
+ errno = EEXIST;
+ return false;
+ }
+ break;
+ }
+ }
+
+ pthread_mutex_unlock(&bmap->bt_lock);
+ return true;
+}
+#endif
+
+#ifdef DEBUG
+/* Iterate the set regions of this bitmap. */
+bool
+bitmap_iterate(
+ struct bitmap *bmap,
+ bool (*fn)(uint64_t, uint64_t, void *),
+ void *arg)
+{
+ struct avl64node *node;
+ struct bitmap_node *ext;
+ bool moveon = true;
+
+ pthread_mutex_lock(&bmap->bt_lock);
+ avl_for_each(bmap->bt_tree, node) {
+ ext = container_of(node, struct bitmap_node, btn_node);
+ moveon = fn(ext->btn_start, ext->btn_length, arg);
+ if (!moveon)
+ break;
+ }
+ pthread_mutex_unlock(&bmap->bt_lock);
+
+ return moveon;
+}
+#endif
+
+/* Do any bitmap extents overlap the given one? (locked) */
+static bool
+__bitmap_test(
+ struct bitmap *bmap,
+ uint64_t start,
+ uint64_t len)
+{
+ struct avl64node *firstn;
+ struct avl64node *lastn;
+
+ /* Find any existing nodes over that range. */
+ avl64_findranges(bmap->bt_tree, start, start + len, &firstn, &lastn);
+
+ return firstn != NULL && lastn != NULL;
+}
+
+/* Is any part of this range set? */
+bool
+bitmap_test(
+ struct bitmap *bmap,
+ uint64_t start,
+ uint64_t len)
+{
+ bool res;
+
+ pthread_mutex_lock(&bmap->bt_lock);
+ res = __bitmap_test(bmap, start, len);
+ pthread_mutex_unlock(&bmap->bt_lock);
+
+ return res;
+}
+
+/* Are none of the bits set? */
+bool
+bitmap_empty(
+ struct bitmap *bmap)
+{
+ return bmap->bt_tree->avl_firstino == NULL;
+}
+
+#ifdef DEBUG
+static bool
+bitmap_dump_fn(
+ uint64_t startblock,
+ uint64_t blockcount,
+ void *arg)
+{
+ printf("%"PRIu64":%"PRIu64"\n", startblock, blockcount);
+ return true;
+}
+
+/* Dump bitmap. */
+void
+bitmap_dump(
+ struct bitmap *bmap)
+{
+ printf("BITMAP DUMP %p\n", bmap);
+ bitmap_iterate(bmap, bitmap_dump_fn, NULL);
+ printf("BITMAP DUMP DONE\n");
+}
+#endif
diff --git a/scrub/bitmap.h b/scrub/bitmap.h
new file mode 100644
index 0000000..e8dcd4f
--- /dev/null
+++ b/scrub/bitmap.h
@@ -0,0 +1,38 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_BITMAP_H_
+#define XFS_SCRUB_BITMAP_H_
+
+struct bitmap {
+ pthread_mutex_t bt_lock;
+ struct avl64tree_desc *bt_tree;
+};
+
+bool bitmap_init(struct bitmap **bmap);
+void bitmap_free(struct bitmap **bmap);
+bool bitmap_set(struct bitmap *bmap, uint64_t start, uint64_t length);
+bool bitmap_iterate(struct bitmap *bmap,
+ bool (*fn)(uint64_t, uint64_t, void *), void *arg);
+bool bitmap_test(struct bitmap *bmap, uint64_t start,
+ uint64_t len);
+bool bitmap_empty(struct bitmap *bmap);
+void bitmap_dump(struct bitmap *bmap);
+
+#endif /* XFS_SCRUB_BITMAP_H_ */
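
The bitmap in this patch stores set ranges as non-overlapping extents instead of individual bits, merging neighbours on every set. Below is a minimal sketch of that idea, assuming a small sorted array in place of the avl64 tree the patch uses. The names struct ext_bitmap, ext_bitmap_set, ext_bitmap_test and the EXT_MAX capacity are hypothetical and not part of xfsprogs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXT_MAX 64	/* hypothetical fixed capacity for this sketch */

struct ext {
	uint64_t	start;
	uint64_t	len;
};

struct ext_bitmap {
	struct ext	e[EXT_MAX];	/* sorted by start, never touching */
	size_t		nr;
};

/* Set [start, start + len), merging overlapping or adjacent extents. */
static bool
ext_bitmap_set(struct ext_bitmap *b, uint64_t start, uint64_t len)
{
	uint64_t	end = start + len;
	size_t		i = 0, j, k;

	assert(len > 0 && end > start);

	/* Skip extents that end strictly before the new range begins. */
	while (i < b->nr && b->e[i].start + b->e[i].len < start)
		i++;

	/* Absorb every extent that overlaps or touches the new range. */
	for (j = i; j < b->nr && b->e[j].start <= end; j++) {
		if (b->e[j].start < start)
			start = b->e[j].start;
		if (b->e[j].start + b->e[j].len > end)
			end = b->e[j].start + b->e[j].len;
	}

	if (j == i) {
		/* Nothing absorbed: open a slot at index i. */
		if (b->nr == EXT_MAX)
			return false;
		for (k = b->nr; k > i; k--)
			b->e[k] = b->e[k - 1];
		b->nr++;
	} else {
		/* Extents [i, j) collapse into one: close the gap. */
		for (k = j; k < b->nr; k++)
			b->e[i + 1 + (k - j)] = b->e[k];
		b->nr -= j - i - 1;
	}
	b->e[i].start = start;
	b->e[i].len = end - start;
	return true;
}

/* Is any part of [start, start + len) set? */
static bool
ext_bitmap_test(const struct ext_bitmap *b, uint64_t start, uint64_t len)
{
	size_t		i;

	for (i = 0; i < b->nr; i++)
		if (b->e[i].start < start + len &&
		    start < b->e[i].start + b->e[i].len)
			return true;
	return false;
}
```

Setting 10+5, then 20+5, then 15+5 leaves a single extent covering 10 through 24, which is the merging behaviour __bitmap_set aims for. The array keeps the sketch short; a balanced tree scales better when there are many extents.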
* [PATCH 20/27] xfs_scrub: create infrastructure to read verify data blocks
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (18 preceding siblings ...)
2018-01-06 1:53 ` [PATCH 19/27] xfs_scrub: create a bitmap data structure Darrick J. Wong
@ 2018-01-06 1:53 ` Darrick J. Wong
2018-01-06 1:53 ` [PATCH 21/27] xfs_scrub: scrub file " Darrick J. Wong
` (10 subsequent siblings)
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:53 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Manage the scheduling, issuance, and reporting of data block
verification reads. This enables us to combine adjacent (or nearly
adjacent) read requests, and to take advantage of high-IOPS devices by
issuing IO from multiple threads.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
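
The batching described above can be sketched in isolation. This is an illustration only, assuming a single stashed request per caller and a callback in place of the workqueue. struct pending_io, schedule_io, force_io and log_issue are hypothetical names; only the 64k locality window is taken from the patch. The sketch always stashes the new request after issuing the old one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BATCH_LOCALITY	65536ULL	/* tolerate 64k holes between requests */

struct pending_io {
	uint64_t	start;		/* bytes */
	uint64_t	length;		/* bytes; 0 means nothing is stashed */
};

/* Hypothetical sink that stands in for queueing work to a thread pool. */
struct issue_log {
	unsigned int	nr;		/* number of IOs issued */
	uint64_t	bytes;		/* total bytes issued */
};

typedef void (*issue_fn)(uint64_t start, uint64_t length, void *arg);

static void
log_issue(uint64_t start, uint64_t length, void *arg)
{
	struct issue_log	*log = arg;

	(void)start;
	log->nr++;
	log->bytes += length;
}

/* Is the new request close enough to the stashed one to combine them? */
static bool
can_merge(const struct pending_io *p, uint64_t start, uint64_t length)
{
	if (p->length == 0)
		return false;
	if (start >= p->start)
		return start <= p->start + p->length + BATCH_LOCALITY;
	return p->start <= start + length + BATCH_LOCALITY;
}

/* Combine with the stashed request if possible, else issue and restash. */
static void
schedule_io(struct pending_io *p, uint64_t start, uint64_t length,
	    issue_fn issue, void *arg)
{
	uint64_t	p_end = p->start + p->length;
	uint64_t	req_end = start + length;

	assert(length > 0);
	if (can_merge(p, start, length)) {
		if (start < p->start)
			p->start = start;
		p->length = (req_end > p_end ? req_end : p_end) - p->start;
		return;
	}
	if (p->length > 0)
		issue(p->start, p->length, arg);
	p->start = start;
	p->length = length;
}

/* Push out whatever is still stashed. */
static void
force_io(struct pending_io *p, issue_fn issue, void *arg)
{
	if (p->length == 0)
		return;
	issue(p->start, p->length, arg);
	p->length = 0;
}
```

Two 4k requests at offsets 0 and 8192 collapse into one 12k read, while a request 1M away forces the stashed read out first. Reading across a small hole wastes a little bandwidth but saves a separate IO, which is the trade the 64k window makes.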
scrub/Makefile | 2
scrub/read_verify.c | 268 +++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/read_verify.h | 50 ++++++++++
scrub/xfs_scrub.h | 3 +
4 files changed, 323 insertions(+)
create mode 100644 scrub/read_verify.c
create mode 100644 scrub/read_verify.h
diff --git a/scrub/Makefile b/scrub/Makefile
index a9aaa99..3b3eb95 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -23,6 +23,7 @@ disk.h \
filemap.h \
fscounters.h \
inodes.h \
+read_verify.h \
scrub.h \
spacemap.h \
unicrash.h \
@@ -40,6 +41,7 @@ phase1.c \
phase2.c \
phase3.c \
phase5.c \
+read_verify.c \
scrub.c \
spacemap.c \
xfs_scrub.c
diff --git a/scrub/read_verify.c b/scrub/read_verify.c
new file mode 100644
index 0000000..244626d
--- /dev/null
+++ b/scrub/read_verify.c
@@ -0,0 +1,268 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <sys/statvfs.h>
+#include "workqueue.h"
+#include "path.h"
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "counter.h"
+#include "disk.h"
+#include "read_verify.h"
+
+/*
+ * Read Verify Pool
+ *
+ * Manages the data block read verification phase. The caller submits
+ * verification requests, which are then queued to run on a thread pool
+ * worker. Adjacent (or nearly adjacent) requests can be combined
+ * to reduce overhead when free space fragmentation is high. The thread
+ * pool takes care of issuing multiple IOs to the device, if possible.
+ */
+
+/*
+ * Perform all IO in 32M chunks. This cannot exceed 65536 sectors
+ * because that's the biggest SCSI VERIFY(16) we dare to send.
+ */
+#define RVP_IO_MAX_SIZE (33554432)
+#define RVP_IO_MAX_SECTORS (RVP_IO_MAX_SIZE >> BBSHIFT)
+
+/* Tolerate 64k holes in adjacent read verify requests. */
+#define RVP_IO_BATCH_LOCALITY (65536)
+
+struct read_verify_pool {
+ struct workqueue wq; /* thread pool */
+ struct scrub_ctx *ctx; /* scrub context */
+ void *readbuf; /* read buffer */
+ struct ptcounter *verified_bytes;
+ read_verify_ioerr_fn_t ioerr_fn; /* io error callback */
+ size_t miniosz; /* minimum io size, bytes */
+};
+
+/* Create a thread pool to run read verifiers. */
+struct read_verify_pool *
+read_verify_pool_init(
+ struct scrub_ctx *ctx,
+ size_t miniosz,
+ read_verify_ioerr_fn_t ioerr_fn,
+ unsigned int nproc)
+{
+ struct read_verify_pool *rvp;
+ bool ret;
+ int error;
+
+ rvp = calloc(1, sizeof(struct read_verify_pool));
+ if (!rvp)
+ return NULL;
+
+ error = posix_memalign((void **)&rvp->readbuf, page_size,
+ RVP_IO_MAX_SIZE);
+ if (error || !rvp->readbuf)
+ goto out_free;
+ rvp->verified_bytes = ptcounter_init(nproc);
+ if (!rvp->verified_bytes)
+ goto out_buf;
+ rvp->miniosz = miniosz;
+ rvp->ctx = ctx;
+ rvp->ioerr_fn = ioerr_fn;
+ /* Run in the main thread if we only want one thread. */
+ if (nproc == 1)
+ nproc = 0;
+ ret = workqueue_create(&rvp->wq, (struct xfs_mount *)rvp, nproc);
+ if (ret)
+ goto out_counter;
+ return rvp;
+
+out_counter:
+ ptcounter_free(rvp->verified_bytes);
+out_buf:
+ free(rvp->readbuf);
+out_free:
+ free(rvp);
+ return NULL;
+}
+
+/* Finish up any read verification work. */
+void
+read_verify_pool_flush(
+ struct read_verify_pool *rvp)
+{
+ workqueue_destroy(&rvp->wq);
+}
+
+/* Finish up any read verification work and tear it down. */
+void
+read_verify_pool_destroy(
+ struct read_verify_pool *rvp)
+{
+ ptcounter_free(rvp->verified_bytes);
+ free(rvp->readbuf);
+ free(rvp);
+}
+
+/*
+ * Issue a read-verify IO in big batches.
+ */
+static void
+read_verify(
+ struct workqueue *wq,
+ xfs_agnumber_t agno,
+ void *arg)
+{
+ struct read_verify *rv = arg;
+ struct read_verify_pool *rvp;
+ unsigned long long verified = 0;
+ ssize_t sz;
+ ssize_t len;
+
+ rvp = (struct read_verify_pool *)wq->wq_ctx;
+ while (rv->io_length > 0) {
+ len = min(rv->io_length, RVP_IO_MAX_SIZE);
+ dbg_printf("diskverify %d %"PRIu64" %zd\n", rv->io_disk->d_fd,
+ rv->io_start, len);
+ sz = disk_read_verify(rv->io_disk, rvp->readbuf,
+ rv->io_start, len);
+ if (sz < 0) {
+ dbg_printf("IOERR %d %"PRIu64" %zd\n",
+ rv->io_disk->d_fd,
+ rv->io_start, len);
+ /* IO error, so try the next logical block. */
+ len = rvp->miniosz;
+ rvp->ioerr_fn(rvp->ctx, rv->io_disk, rv->io_start, len,
+ errno, rv->io_end_arg);
+ }
+
+ verified += len;
+ rv->io_start += len;
+ rv->io_length -= len;
+ }
+
+ free(rv);
+ ptcounter_add(rvp->verified_bytes, verified);
+}
+
+/* Queue a read verify request. */
+static bool
+read_verify_queue(
+ struct read_verify_pool *rvp,
+ struct read_verify *rv)
+{
+ struct read_verify *tmp;
+ bool ret;
+
+ dbg_printf("verify fd %d start %"PRIu64" len %"PRIu64"\n",
+ rv->io_disk->d_fd, rv->io_start, rv->io_length);
+
+ tmp = malloc(sizeof(struct read_verify));
+ if (!tmp) {
+ rvp->ioerr_fn(rvp->ctx, rv->io_disk, rv->io_start,
+ rv->io_length, errno, rv->io_end_arg);
+ return true;
+ }
+ memcpy(tmp, rv, sizeof(*tmp));
+
+ ret = workqueue_add(&rvp->wq, read_verify, 0, tmp);
+ if (ret) {
+ str_error(rvp->ctx, rvp->ctx->mntpoint,
+_("Could not queue read-verify work."));
+ free(tmp);
+ return false;
+ }
+ rv->io_length = 0;
+ return true;
+}
+
+/*
+ * Issue an IO request. We'll batch subsequent requests if they're
+ * within 64k of each other.
+ */
+bool
+read_verify_schedule_io(
+ struct read_verify_pool *rvp,
+ struct read_verify *rv,
+ struct disk *disk,
+ uint64_t start,
+ uint64_t length,
+ void *end_arg)
+{
+ uint64_t req_end;
+ uint64_t rv_end;
+
+ assert(rvp->readbuf);
+ req_end = start + length;
+ rv_end = rv->io_start + rv->io_length;
+
+ /*
+ * If we have a stashed IO, we haven't changed fds, the error
+ * reporting is the same, and the two extents are close,
+ * we can combine them.
+ */
+ if (rv->io_length > 0 && disk == rv->io_disk &&
+ end_arg == rv->io_end_arg &&
+ ((start >= rv->io_start && start <= rv_end + RVP_IO_BATCH_LOCALITY) ||
+ (rv->io_start >= start &&
+ rv->io_start <= req_end + RVP_IO_BATCH_LOCALITY))) {
+ rv->io_start = min(rv->io_start, start);
+ rv->io_length = max(req_end, rv_end) - rv->io_start;
+ } else {
+ /* Otherwise, issue the stashed IO (if there is one) */
+ if (rv->io_length > 0 && !read_verify_queue(rvp, rv))
+ return false;
+
+ /* Stash the new IO. */
+ rv->io_disk = disk;
+ rv->io_start = start;
+ rv->io_length = length;
+ rv->io_end_arg = end_arg;
+ }
+
+ return true;
+}
+
+/* Force any stashed IOs into the verifier. */
+bool
+read_verify_force_io(
+ struct read_verify_pool *rvp,
+ struct read_verify *rv)
+{
+ bool moveon;
+
+ assert(rvp->readbuf);
+ if (rv->io_length == 0)
+ return true;
+
+ moveon = read_verify_queue(rvp, rv);
+ if (moveon)
+ rv->io_length = 0;
+ return moveon;
+}
+
+/* How many bytes has this process verified? */
+uint64_t
+read_verify_bytes(
+ struct read_verify_pool *rvp)
+{
+ return ptcounter_value(rvp->verified_bytes);
+}
diff --git a/scrub/read_verify.h b/scrub/read_verify.h
new file mode 100644
index 0000000..cea7a08
--- /dev/null
+++ b/scrub/read_verify.h
@@ -0,0 +1,50 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_READ_VERIFY_H_
+#define XFS_SCRUB_READ_VERIFY_H_
+
+struct scrub_ctx;
+struct read_verify_pool;
+
+/* Function called when an IO error happens. */
+typedef void (*read_verify_ioerr_fn_t)(struct scrub_ctx *ctx,
+ struct disk *disk, uint64_t start, uint64_t length,
+ int error, void *arg);
+
+struct read_verify_pool *read_verify_pool_init(struct scrub_ctx *ctx,
+ size_t miniosz, read_verify_ioerr_fn_t ioerr_fn,
+ unsigned int nproc);
+void read_verify_pool_flush(struct read_verify_pool *rvp);
+void read_verify_pool_destroy(struct read_verify_pool *rvp);
+
+struct read_verify {
+ void *io_end_arg;
+ struct disk *io_disk;
+ uint64_t io_start; /* bytes */
+ uint64_t io_length; /* bytes */
+};
+
+bool read_verify_schedule_io(struct read_verify_pool *rvp,
+ struct read_verify *rv, struct disk *disk, uint64_t start,
+ uint64_t length, void *end_arg);
+bool read_verify_force_io(struct read_verify_pool *rvp, struct read_verify *rv);
+uint64_t read_verify_bytes(struct read_verify_pool *rvp);
+
+#endif /* XFS_SCRUB_READ_VERIFY_H_ */
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index 66003e4..31a927c 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -83,6 +83,9 @@ struct scrub_ctx {
void *fshandle;
size_t fshandle_len;
+ /* Data block read verification buffer */
+ void *readbuf;
+
/* Mutable scrub state; use lock. */
pthread_mutex_t lock;
unsigned long long max_errors;
* [PATCH 21/27] xfs_scrub: scrub file data blocks
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (19 preceding siblings ...)
2018-01-06 1:53 ` [PATCH 20/27] xfs_scrub: create infrastructure to read verify data blocks Darrick J. Wong
@ 2018-01-06 1:53 ` Darrick J. Wong
2018-01-11 23:25 ` Eric Sandeen
2018-01-06 1:53 ` [PATCH 22/27] xfs_scrub: optionally use SCSI READ VERIFY commands to scrub data blocks on disk Darrick J. Wong
` (9 subsequent siblings)
30 siblings, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:53 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Read all data blocks from the disk, hoping to catch IO errors.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
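
A rough sketch of what "reading to catch IO errors" can look like: read in large chunks and, when a chunk fails, narrow the failure down to one minimum-sized unit before moving on. This is not the patch's read_verify() loop, which advances by miniosz after any failed read without rereading. struct fake_disk, fake_read and verify_range are hypothetical, and the "device" is simulated with a single bad byte range so the example stays self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a device with one bad byte range. */
struct fake_disk {
	uint64_t	bad_start;
	uint64_t	bad_len;
	uint64_t	bad_bytes_seen;	/* bytes reported as unreadable */
};

/* Pretend to read [start, start + len); fail if it touches the bad range. */
static bool
fake_read(const struct fake_disk *d, uint64_t start, uint64_t len)
{
	return !(start < d->bad_start + d->bad_len &&
		 d->bad_start < start + len);
}

/* Verify a byte range in max_io chunks, falling back to min_io on error. */
static void
verify_range(struct fake_disk *d, uint64_t start, uint64_t length,
	     uint64_t max_io, uint64_t min_io)
{
	assert(min_io > 0 && max_io >= min_io);

	while (length > 0) {
		uint64_t	len = length < max_io ? length : max_io;

		if (!fake_read(d, start, len)) {
			/* Retry just the first minimum-sized unit. */
			if (len > min_io)
				len = min_io;
			if (!fake_read(d, start, len))
				d->bad_bytes_seen += len;
		}

		/* Advance past whatever was actually checked. */
		start += len;
		length -= len;
	}
}
```

With a 4k bad region at offset 8192, verifying 64k in 16k chunks with a 4k minimum reports exactly 4k as unreadable. The cost of the narrower report is one extra small read per failed chunk.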
configure.ac | 2
include/builddefs.in | 2
m4/package_libcdev.m4 | 28 +++
scrub/Makefile | 7 -
scrub/phase6.c | 516 +++++++++++++++++++++++++++++++++++++++++++++++++
scrub/vfs.c | 221 +++++++++++++++++++++
scrub/vfs.h | 31 +++
scrub/xfs_scrub.c | 4
scrub/xfs_scrub.h | 2
9 files changed, 811 insertions(+), 2 deletions(-)
create mode 100644 scrub/phase6.c
create mode 100644 scrub/vfs.c
create mode 100644 scrub/vfs.h
diff --git a/configure.ac b/configure.ac
index fc44bd5..8eda010 100644
--- a/configure.ac
+++ b/configure.ac
@@ -170,6 +170,8 @@ AC_PACKAGE_WANT_ATTRIBUTES_H
AC_HAVE_LIBATTR
AC_PACKAGE_WANT_UNINORM_H
AC_HAVE_U8NORMALIZE
+AC_HAVE_OPENAT
+AC_HAVE_FSTATAT
if test "$enable_blkid" = yes; then
AC_HAVE_BLKID_TOPO
diff --git a/include/builddefs.in b/include/builddefs.in
index 1c264a0..2f8d33f 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -123,6 +123,8 @@ HAVE_DEVMAPPER = @have_devmapper@
HAVE_MALLINFO = @have_mallinfo@
HAVE_LIBATTR = @have_libattr@
HAVE_U8NORMALIZE = @have_u8normalize@
+HAVE_OPENAT = @have_openat@
+HAVE_FSTATAT = @have_fstatat@
GCCFLAGS = -funsigned-char -fno-strict-aliasing -Wall
# -Wbitwise -Wno-transparent-union -Wno-old-initializer -Wno-decl
diff --git a/m4/package_libcdev.m4 b/m4/package_libcdev.m4
index d3955f0..e0abc12 100644
--- a/m4/package_libcdev.m4
+++ b/m4/package_libcdev.m4
@@ -362,3 +362,31 @@ AC_DEFUN([AC_HAVE_MALLINFO],
AC_MSG_RESULT(no))
AC_SUBST(have_mallinfo)
])
+
+#
+# Check if we have an openat call
+#
+AC_DEFUN([AC_HAVE_OPENAT],
+ [ AC_CHECK_DECL([openat],
+ have_openat=yes,
+ [],
+ [#include <sys/types.h>
+ #include <sys/stat.h>
+ #include <fcntl.h>]
+ )
+ AC_SUBST(have_openat)
+ ])
+
+#
+# Check if we have a fstatat call
+#
+AC_DEFUN([AC_HAVE_FSTATAT],
+ [ AC_CHECK_DECL([fstatat],
+ have_fstatat=yes,
+ [],
+ [#define _GNU_SOURCE
+ #include <sys/types.h>
+ #include <sys/stat.h>
+ #include <unistd.h>])
+ AC_SUBST(have_fstatat)
+ ])
diff --git a/scrub/Makefile b/scrub/Makefile
index 3b3eb95..4b70efa 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -8,9 +8,9 @@ include $(TOPDIR)/include/builddefs
# On linux we get fsmap from the system or define it ourselves
# so include this based on platform type. If this reverts to only
# the autoconf check w/o local definition, change to testing HAVE_GETFSMAP
-SCRUB_PREREQS=$(PKG_PLATFORM)
+SCRUB_PREREQS=$(PKG_PLATFORM)$(HAVE_OPENAT)$(HAVE_FSTATAT)
-ifeq ($(SCRUB_PREREQS),linux)
+ifeq ($(SCRUB_PREREQS),linuxyesyes)
LTCOMMAND = xfs_scrub
INSTALL_SCRUB = install-scrub
endif # scrub_prereqs
@@ -27,6 +27,7 @@ read_verify.h \
scrub.h \
spacemap.h \
unicrash.h \
+vfs.h \
xfs_scrub.h
CFILES = \
@@ -41,9 +42,11 @@ phase1.c \
phase2.c \
phase3.c \
phase5.c \
+phase6.c \
read_verify.c \
scrub.c \
spacemap.c \
+vfs.c \
xfs_scrub.c
LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD) $(LIBUNISTRING)
diff --git a/scrub/phase6.c b/scrub/phase6.c
new file mode 100644
index 0000000..5ecb8dc
--- /dev/null
+++ b/scrub/phase6.c
@@ -0,0 +1,516 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <dirent.h>
+#include <sys/statvfs.h>
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "handle.h"
+#include "path.h"
+#include "ptvar.h"
+#include "workqueue.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "bitmap.h"
+#include "disk.h"
+#include "filemap.h"
+#include "inodes.h"
+#include "read_verify.h"
+#include "spacemap.h"
+#include "vfs.h"
+
+/*
+ * Phase 6: Verify data file integrity.
+ *
+ * Identify potential data block extents with GETFSMAP, then feed those
+ * extents to the read-verify pool to get the verify commands batched,
+ * issued, and (if there are problems) reported back to us. If there
+ * are errors, we'll record the bad regions and (if available) use rmap
+ * to tell us if metadata are now corrupt. Otherwise, we'll scan the
+ * whole directory tree looking for files that overlap the bad regions
+ * and report the paths of the now corrupt files.
+ */
+
+/* Find the disk for a given device identifier. */
+static struct disk *
+xfs_dev_to_disk(
+ struct scrub_ctx *ctx,
+ dev_t dev)
+{
+ if (dev == ctx->fsinfo.fs_datadev)
+ return ctx->datadev;
+ else if (dev == ctx->fsinfo.fs_logdev)
+ return ctx->logdev;
+ else if (dev == ctx->fsinfo.fs_rtdev)
+ return ctx->rtdev;
+ abort();
+}
+
+/* Find the device major/minor for a given disk. */
+static dev_t
+xfs_disk_to_dev(
+ struct scrub_ctx *ctx,
+ struct disk *disk)
+{
+ if (disk == ctx->datadev)
+ return ctx->fsinfo.fs_datadev;
+ else if (disk == ctx->logdev)
+ return ctx->fsinfo.fs_logdev;
+ else if (disk == ctx->rtdev)
+ return ctx->fsinfo.fs_rtdev;
+ abort();
+}
+
+struct owner_decode {
+ uint64_t owner;
+ const char *descr;
+};
+
+static const struct owner_decode special_owners[] = {
+ {XFS_FMR_OWN_FREE, "free space"},
+ {XFS_FMR_OWN_UNKNOWN, "unknown owner"},
+ {XFS_FMR_OWN_FS, "static FS metadata"},
+ {XFS_FMR_OWN_LOG, "journalling log"},
+ {XFS_FMR_OWN_AG, "per-AG metadata"},
+ {XFS_FMR_OWN_INOBT, "inode btree blocks"},
+ {XFS_FMR_OWN_INODES, "inodes"},
+ {XFS_FMR_OWN_REFC, "refcount btree"},
+ {XFS_FMR_OWN_COW, "CoW staging"},
+ {XFS_FMR_OWN_DEFECTIVE, "bad blocks"},
+ {0, NULL},
+};
+
+/* Decode a special owner. */
+static const char *
+xfs_decode_special_owner(
+ uint64_t owner)
+{
+ const struct owner_decode *od = special_owners;
+
+ while (od->descr) {
+ if (od->owner == owner)
+ return od->descr;
+ od++;
+ }
+
+ return NULL;
+}
+
+/* Routines to translate bad physical extents into file paths and offsets. */
+
+struct xfs_verify_error_info {
+ struct bitmap *d_bad; /* bytes */
+ struct bitmap *r_bad; /* bytes */
+};
+
+/* Report if this extent overlaps a bad region. */
+static bool
+xfs_report_verify_inode_bmap(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ int fd,
+ int whichfork,
+ struct fsxattr *fsx,
+ struct xfs_bmap *bmap,
+ void *arg)
+{
+ struct xfs_verify_error_info *vei = arg;
+ struct bitmap *bmp;
+
+ /* Only report errors for real extents. */
+ if (bmap->bm_flags & (BMV_OF_PREALLOC | BMV_OF_DELALLOC))
+ return true;
+
+ if (fsx->fsx_xflags & FS_XFLAG_REALTIME)
+ bmp = vei->r_bad;
+ else
+ bmp = vei->d_bad;
+
+ if (!bitmap_test(bmp, bmap->bm_physical, bmap->bm_length))
+ return true;
+
+ str_error(ctx, descr,
+_("offset %llu failed read verification."), bmap->bm_offset);
+ return true;
+}
+
+/* Iterate the extent mappings of a file to report errors. */
+static bool
+xfs_report_verify_fd(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ int fd,
+ void *arg)
+{
+ struct xfs_bmap key = {0};
+ bool moveon;
+
+ /* data fork */
+ moveon = xfs_iterate_filemaps(ctx, descr, fd, XFS_DATA_FORK, &key,
+ xfs_report_verify_inode_bmap, arg);
+ if (!moveon)
+ return false;
+
+ /* attr fork */
+ moveon = xfs_iterate_filemaps(ctx, descr, fd, XFS_ATTR_FORK, &key,
+ xfs_report_verify_inode_bmap, arg);
+ if (!moveon)
+ return false;
+ return true;
+}
+
+/* Report read verify errors in unlinked (but still open) files. */
+static int
+xfs_report_verify_inode(
+ struct scrub_ctx *ctx,
+ struct xfs_handle *handle,
+ struct xfs_bstat *bstat,
+ void *arg)
+{
+ char descr[DESCR_BUFSZ];
+ char buf[DESCR_BUFSZ];
+ bool moveon;
+ int fd;
+ int error;
+
+ snprintf(descr, DESCR_BUFSZ, _("inode %"PRIu64" (unlinked)"),
+ (uint64_t)bstat->bs_ino);
+
+ /* Ignore linked files and things we can't open. */
+ if (bstat->bs_nlink != 0)
+ return 0;
+ if (!S_ISREG(bstat->bs_mode) && !S_ISDIR(bstat->bs_mode))
+ return 0;
+
+ /* Try to open the inode. */
+ fd = xfs_open_handle(handle);
+ if (fd < 0) {
+ error = errno;
+ if (error == ESTALE)
+ return error;
+
+ str_warn(ctx, descr, "%s", strerror_r(error, buf, DESCR_BUFSZ));
+ return error;
+ }
+
+ /* Go find the badness. */
+ moveon = xfs_report_verify_fd(ctx, descr, fd, arg);
+ close(fd);
+
+ return moveon ? 0 : XFS_ITERATE_INODES_ABORT;
+}
+
+/* Scan a directory for matches in the read verify error list. */
+static bool
+xfs_report_verify_dir(
+ struct scrub_ctx *ctx,
+ const char *path,
+ int dir_fd,
+ void *arg)
+{
+ return xfs_report_verify_fd(ctx, path, dir_fd, arg);
+}
+
+/*
+ * Scan the inode associated with a directory entry for matches with
+ * the read verify error list.
+ */
+static bool
+xfs_report_verify_dirent(
+ struct scrub_ctx *ctx,
+ const char *path,
+ int dir_fd,
+ struct dirent *dirent,
+ struct stat *sb,
+ void *arg)
+{
+ bool moveon;
+ int fd;
+
+ /* Ignore things we can't open. */
+ if (!S_ISREG(sb->st_mode) && !S_ISDIR(sb->st_mode))
+ return true;
+
+ /* Ignore . and .. */
+ if (!strcmp(".", dirent->d_name) || !strcmp("..", dirent->d_name))
+ return true;
+
+ /*
+ * Open the file named by this dirent relative to dir_fd for
+ * badblocks scanning. The directory itself is scanned by
+ * xfs_report_verify_dir.
+ */
+ fd = openat(dir_fd, dirent->d_name,
+ O_RDONLY | O_NOATIME | O_NOFOLLOW | O_NOCTTY);
+ if (fd < 0)
+ return true;
+
+ /* Go find the badness. */
+ moveon = xfs_report_verify_fd(ctx, path, fd, arg);
+ close(fd);
+
+ return moveon;
+}
+
+/* Given bad extent lists for the data & rtdev, find bad files. */
+static bool
+xfs_report_verify_errors(
+ struct scrub_ctx *ctx,
+ struct bitmap *d_bad,
+ struct bitmap *r_bad)
+{
+ struct xfs_verify_error_info vei;
+ bool moveon;
+
+ vei.d_bad = d_bad;
+ vei.r_bad = r_bad;
+
+ /* Scan the directory tree to get file paths. */
+ moveon = scan_fs_tree(ctx, xfs_report_verify_dir,
+ xfs_report_verify_dirent, &vei);
+ if (!moveon)
+ return false;
+
+ /* Scan for unlinked files. */
+ return xfs_scan_all_inodes(ctx, xfs_report_verify_inode, &vei);
+}
+
+/* Verify disk blocks with GETFSMAP */
+
+struct xfs_verify_extent {
+ struct read_verify_pool *readverify;
+ struct ptvar *rvstate;
+ struct bitmap *d_bad; /* bytes */
+ struct bitmap *r_bad; /* bytes */
+};
+
+/* Report an IO error resulting from read-verify based off getfsmap. */
+static bool
+xfs_check_rmap_error_report(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ struct fsmap *map,
+ void *arg)
+{
+ const char *type;
+ char buf[32];
+ uint64_t err_physical = *(uint64_t *)arg;
+ uint64_t err_off;
+
+ if (err_physical > map->fmr_physical)
+ err_off = err_physical - map->fmr_physical;
+ else
+ err_off = 0;
+
+ snprintf(buf, 32, _("disk offset %"PRIu64),
+ (uint64_t)BTOBB(map->fmr_physical + err_off));
+
+ if (map->fmr_flags & FMR_OF_SPECIAL_OWNER) {
+ type = xfs_decode_special_owner(map->fmr_owner);
+ str_error(ctx, buf,
+_("%s failed read verification."),
+ type);
+ }
+
+ /*
+ * XXX: If we had a getparent() call we could report IO errors
+ * efficiently. Until then, we'll have to scan the dir tree
+ * to find the bad file's pathname.
+ */
+
+ return true;
+}
+
+/*
+ * Remember a read error for later, and see if rmap will tell us about the
+ * owner ahead of time.
+ */
+void
+xfs_check_rmap_ioerr(
+ struct scrub_ctx *ctx,
+ struct disk *disk,
+ uint64_t start,
+ uint64_t length,
+ int error,
+ void *arg)
+{
+ struct fsmap keys[2];
+ char descr[DESCR_BUFSZ];
+ struct xfs_verify_extent *ve = arg;
+ struct bitmap *tree;
+ dev_t dev;
+ bool moveon;
+
+ dev = xfs_disk_to_dev(ctx, disk);
+
+ /*
+ * If we don't have parent pointers, save the bad extent for
+ * later rescanning.
+ */
+ if (dev == ctx->fsinfo.fs_datadev)
+ tree = ve->d_bad;
+ else if (dev == ctx->fsinfo.fs_rtdev)
+ tree = ve->r_bad;
+ else
+ tree = NULL;
+ if (tree) {
+ moveon = bitmap_set(tree, start, length);
+ if (!moveon)
+ str_errno(ctx, ctx->mntpoint);
+ }
+
+ snprintf(descr, DESCR_BUFSZ, _("dev %d:%d ioerr @ %"PRIu64":%"PRIu64" "),
+ major(dev), minor(dev), start, length);
+
+ /* Go figure out which blocks are bad from the fsmap. */
+ memset(keys, 0, sizeof(struct fsmap) * 2);
+ keys->fmr_device = dev;
+ keys->fmr_physical = start;
+ (keys + 1)->fmr_device = dev;
+ (keys + 1)->fmr_physical = start + length - 1;
+ (keys + 1)->fmr_owner = ULLONG_MAX;
+ (keys + 1)->fmr_offset = ULLONG_MAX;
+ (keys + 1)->fmr_flags = UINT_MAX;
+ xfs_iterate_fsmap(ctx, descr, keys, xfs_check_rmap_error_report,
+ &start);
+}
+
+/* Schedule a read-verify of a (data block) extent. */
+static bool
+xfs_check_rmap(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ struct fsmap *map,
+ void *arg)
+{
+ struct xfs_verify_extent *ve = arg;
+ struct disk *disk;
+
+ dbg_printf("rmap dev %d:%d phys %"PRIu64" owner %"PRId64
+ " offset %"PRIu64" len %"PRIu64" flags 0x%x\n",
+ major(map->fmr_device), minor(map->fmr_device),
+ (uint64_t)map->fmr_physical, (int64_t)map->fmr_owner,
+ (uint64_t)map->fmr_offset, (uint64_t)map->fmr_length,
+ map->fmr_flags);
+
+ /* "Unknown" extents should be verified; they could be data. */
+ if ((map->fmr_flags & FMR_OF_SPECIAL_OWNER) &&
+ map->fmr_owner == XFS_FMR_OWN_UNKNOWN)
+ map->fmr_flags &= ~FMR_OF_SPECIAL_OWNER;
+
+ /*
+ * We only care about read-verifying data extents that have been
+ * written to disk. This means we can skip "special" owners
+ * (metadata), xattr blocks, unwritten extents, and extent maps.
+ * These should all get checked elsewhere in the scrubber.
+ */
+ if (map->fmr_flags & (FMR_OF_PREALLOC | FMR_OF_ATTR_FORK |
+ FMR_OF_EXTENT_MAP | FMR_OF_SPECIAL_OWNER))
+ goto out;
+
+ /* XXX: Filter out directory data blocks. */
+
+ /* Schedule the read verify command for (eventual) running. */
+ disk = xfs_dev_to_disk(ctx, map->fmr_device);
+
+ read_verify_schedule_io(ve->readverify, ptvar_get(ve->rvstate), disk,
+ map->fmr_physical, map->fmr_length, ve);
+
+out:
+ /* Is this the last extent? Fire off the read. */
+ if (map->fmr_flags & FMR_OF_LAST)
+ read_verify_force_io(ve->readverify, ptvar_get(ve->rvstate));
+
+ return true;
+}
+
+/*
+ * Read verify all the file data blocks in a filesystem. Since XFS doesn't
+ * do data checksums, we trust that the underlying storage will pass back
+ * an IO error if it can't retrieve whatever we previously stored there.
+ * If we hit an IO error, we'll record the bad blocks in a bitmap and then
+ * scan the extent maps of the entire fs tree (and the unlinked inodes)
+ * to figure out which files are now broken.
+ */
+bool
+xfs_scan_blocks(
+ struct scrub_ctx *ctx)
+{
+ struct xfs_verify_extent ve;
+ bool moveon;
+
+ ve.rvstate = ptvar_init(scrub_nproc(ctx), sizeof(struct read_verify));
+ if (!ve.rvstate) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+
+ moveon = bitmap_init(&ve.d_bad);
+ if (!moveon) {
+ str_errno(ctx, ctx->mntpoint);
+ goto out_ve;
+ }
+
+ moveon = bitmap_init(&ve.r_bad);
+ if (!moveon) {
+ str_errno(ctx, ctx->mntpoint);
+ goto out_dbad;
+ }
+
+ ve.readverify = read_verify_pool_init(ctx, ctx->geo.blocksize,
+ xfs_check_rmap_ioerr, disk_heads(ctx->datadev));
+ if (!ve.readverify) {
+ moveon = false;
+ str_error(ctx, ctx->mntpoint,
+_("Could not create media verifier."));
+ goto out_rbad;
+ }
+ moveon = xfs_scan_all_spacemaps(ctx, xfs_check_rmap, &ve);
+ if (!moveon)
+ goto out_pool;
+ read_verify_pool_flush(ve.readverify);
+ ctx->bytes_checked += read_verify_bytes(ve.readverify);
+ read_verify_pool_destroy(ve.readverify);
+
+ /* Scan the whole dir tree to see what matches the bad extents. */
+ if (!bitmap_empty(ve.d_bad) || !bitmap_empty(ve.r_bad))
+ moveon = xfs_report_verify_errors(ctx, ve.d_bad, ve.r_bad);
+
+ bitmap_free(&ve.r_bad);
+ bitmap_free(&ve.d_bad);
+ ptvar_free(ve.rvstate);
+ return moveon;
+
+out_pool:
+ read_verify_pool_destroy(ve.readverify);
+out_rbad:
+ bitmap_free(&ve.r_bad);
+out_dbad:
+ bitmap_free(&ve.d_bad);
+out_ve:
+ ptvar_free(ve.rvstate);
+ return moveon;
+}
diff --git a/scrub/vfs.c b/scrub/vfs.c
new file mode 100644
index 0000000..6a51090
--- /dev/null
+++ b/scrub/vfs.c
@@ -0,0 +1,221 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <dirent.h>
+#include <sys/types.h>
+#include <sys/statvfs.h>
+#include "xfs.h"
+#include "handle.h"
+#include "path.h"
+#include "workqueue.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "vfs.h"
+
+/*
+ * Helper functions to assist in traversing a directory tree using regular
+ * VFS calls.
+ */
+
+/* Scan a filesystem tree. */
+struct scan_fs_tree {
+ unsigned int nr_dirs;
+ pthread_mutex_t lock;
+ pthread_cond_t wakeup;
+ struct stat root_sb;
+ bool moveon;
+ scan_fs_tree_dir_fn dir_fn;
+ scan_fs_tree_dirent_fn dirent_fn;
+ void *arg;
+};
+
+/* Per-work-item scan context. */
+struct scan_fs_tree_dir {
+ char *path;
+ struct scan_fs_tree *sft;
+ bool rootdir;
+};
+
+/* Scan a directory sub tree. */
+static void
+scan_fs_dir(
+ struct workqueue *wq,
+ xfs_agnumber_t agno,
+ void *arg)
+{
+ struct scrub_ctx *ctx = (struct scrub_ctx *)wq->wq_ctx;
+ struct scan_fs_tree_dir *sftd = arg;
+ struct scan_fs_tree *sft = sftd->sft;
+ DIR *dir;
+ struct dirent *dirent;
+ char newpath[PATH_MAX];
+ struct scan_fs_tree_dir *new_sftd;
+ struct stat sb;
+ int dir_fd;
+ int error;
+
+ /* Open the directory. */
+ dir_fd = open(sftd->path, O_RDONLY | O_NOATIME | O_NOFOLLOW | O_NOCTTY);
+ if (dir_fd < 0) {
+ if (errno != ENOENT)
+ str_errno(ctx, sftd->path);
+ goto out;
+ }
+
+ /* Caller-specific directory checks. */
+ if (!sft->dir_fn(ctx, sftd->path, dir_fd, sft->arg)) {
+ sft->moveon = false;
+ goto out;
+ }
+
+ /* Iterate the directory entries. */
+ dir = fdopendir(dir_fd);
+ if (!dir) {
+ str_errno(ctx, sftd->path);
+ goto out;
+ }
+ rewinddir(dir);
+ for (dirent = readdir(dir); dirent != NULL; dirent = readdir(dir)) {
+ snprintf(newpath, PATH_MAX, "%s/%s", sftd->path,
+ dirent->d_name);
+
+ /* Get the stat info for this directory entry. */
+ error = fstatat(dir_fd, dirent->d_name, &sb,
+ AT_NO_AUTOMOUNT | AT_SYMLINK_NOFOLLOW);
+ if (error) {
+ str_errno(ctx, newpath);
+ continue;
+ }
+
+ /* Ignore files on other filesystems. */
+ if (sb.st_dev != sft->root_sb.st_dev)
+ continue;
+
+ /* Caller-specific directory entry function. */
+ if (!sft->dirent_fn(ctx, newpath, dir_fd, dirent, &sb,
+ sft->arg)) {
+ sft->moveon = false;
+ break;
+ }
+
+ if (xfs_scrub_excessive_errors(ctx)) {
+ sft->moveon = false;
+ break;
+ }
+
+ /* If directory, call ourselves recursively. */
+ if (S_ISDIR(sb.st_mode) && strcmp(".", dirent->d_name) &&
+ strcmp("..", dirent->d_name)) {
+ new_sftd = malloc(sizeof(struct scan_fs_tree_dir));
+ if (!new_sftd) {
+ str_errno(ctx, newpath);
+ sft->moveon = false;
+ break;
+ }
+ new_sftd->path = strdup(newpath);
+ new_sftd->sft = sft;
+ new_sftd->rootdir = false;
+ pthread_mutex_lock(&sft->lock);
+ sft->nr_dirs++;
+ pthread_mutex_unlock(&sft->lock);
+ error = workqueue_add(wq, scan_fs_dir, 0, new_sftd);
+ if (error) {
+ str_error(ctx, ctx->mntpoint,
+_("Could not queue subdirectory scan work."));
+ sft->moveon = false;
+ break;
+ }
+ }
+ }
+
+ /* Close dir, go away. */
+ error = closedir(dir);
+ if (error)
+ str_errno(ctx, sftd->path);
+
+out:
+ pthread_mutex_lock(&sft->lock);
+ sft->nr_dirs--;
+ if (sft->nr_dirs == 0)
+ pthread_cond_signal(&sft->wakeup);
+ pthread_mutex_unlock(&sft->lock);
+
+ free(sftd->path);
+ free(sftd);
+}
+
+/* Scan the entire filesystem. */
+bool
+scan_fs_tree(
+ struct scrub_ctx *ctx,
+ scan_fs_tree_dir_fn dir_fn,
+ scan_fs_tree_dirent_fn dirent_fn,
+ void *arg)
+{
+ struct workqueue wq;
+ struct scan_fs_tree sft;
+ struct scan_fs_tree_dir *sftd;
+ int ret;
+
+ sft.moveon = true;
+ sft.nr_dirs = 1;
+ sft.root_sb = ctx->mnt_sb;
+ sft.dir_fn = dir_fn;
+ sft.dirent_fn = dirent_fn;
+ sft.arg = arg;
+ pthread_mutex_init(&sft.lock, NULL);
+ pthread_cond_init(&sft.wakeup, NULL);
+
+ sftd = malloc(sizeof(struct scan_fs_tree_dir));
+ if (!sftd) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+ sftd->path = strdup(ctx->mntpoint);
+ sftd->sft = &sft;
+ sftd->rootdir = true;
+
+ ret = workqueue_create(&wq, (struct xfs_mount *)ctx,
+ scrub_nproc_workqueue(ctx));
+ if (ret) {
+ str_error(ctx, ctx->mntpoint, _("Could not create workqueue."));
+ goto out_free;
+ }
+ ret = workqueue_add(&wq, scan_fs_dir, 0, sftd);
+ if (ret) {
+ str_error(ctx, ctx->mntpoint,
+_("Could not queue directory scan work."));
+ goto out_free;
+ }
+
+ pthread_mutex_lock(&sft.lock);
+ while (sft.nr_dirs > 0)
+ pthread_cond_wait(&sft.wakeup, &sft.lock);
+ pthread_mutex_unlock(&sft.lock);
+ workqueue_destroy(&wq);
+
+ return sft.moveon;
+out_free:
+ free(sftd->path);
+ free(sftd);
+ return false;
+}
diff --git a/scrub/vfs.h b/scrub/vfs.h
new file mode 100644
index 0000000..100eb18
--- /dev/null
+++ b/scrub/vfs.h
@@ -0,0 +1,31 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_VFS_H_
+#define XFS_SCRUB_VFS_H_
+
+typedef bool (*scan_fs_tree_dir_fn)(struct scrub_ctx *, const char *,
+ int, void *);
+typedef bool (*scan_fs_tree_dirent_fn)(struct scrub_ctx *, const char *,
+ int, struct dirent *, struct stat *, void *);
+
+bool scan_fs_tree(struct scrub_ctx *ctx, scan_fs_tree_dir_fn dir_fn,
+ scan_fs_tree_dirent_fn dirent_fn, void *arg);
+
+#endif /* XFS_SCRUB_VFS_H_ */
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index f7e4e37..fa1d089 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -390,6 +390,10 @@ run_scrub_phases(
/* Run all phases of the scrub tool. */
for (phase = 1, sp = phases; sp->fn; sp++, phase++) {
+ /* Turn on certain phases if user said to. */
+ if (sp->fn == DATASCAN_DUMMY_FN && scrub_data)
+ sp->fn = xfs_scan_blocks;
+
/* Skip certain phases unless they're turned on. */
if (sp->fn == REPAIR_DUMMY_FN ||
sp->fn == DATASCAN_DUMMY_FN)
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index 31a927c..026631a 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -93,6 +93,7 @@ struct scrub_ctx {
unsigned long long errors_found;
unsigned long long warnings_found;
unsigned long long inodes_checked;
+ unsigned long long bytes_checked;
unsigned long long naming_warnings;
bool need_repair;
bool preen_triggers[XFS_SCRUB_TYPE_NR];
@@ -105,5 +106,6 @@ bool xfs_setup_fs(struct scrub_ctx *ctx);
bool xfs_scan_metadata(struct scrub_ctx *ctx);
bool xfs_scan_inodes(struct scrub_ctx *ctx);
bool xfs_scan_connections(struct scrub_ctx *ctx);
+bool xfs_scan_blocks(struct scrub_ctx *ctx);
#endif /* XFS_SCRUB_XFS_SCRUB_H_ */
* [PATCH 22/27] xfs_scrub: optionally use SCSI READ VERIFY commands to scrub data blocks on disk
From: Darrick J. Wong @ 2018-01-06 1:53 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
If we sense that we're talking to a raw SCSI disk, use the SCSI VERIFY
command to ask the disk to verify its own media internally. This can
sharply reduce the runtime of the data block verification phase on
devices whose internal bandwidth exceeds their link bandwidth.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
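For reviewers who don't have the SCSI block command layout memorized, here
is a minimal standalone sketch of the two pieces of arithmetic this patch
relies on: packing the LBA and block count into a VERIFY(16) CDB in
big-endian order, and converting a byte range into logical blocks (start
rounded down, length rounded up). The helper names put_be, build_verify16,
bytes_to_lba_trunc and bytes_to_lba_roundup are made up for this note and
are not part of the patch; the patch itself open-codes the shifts and uses
the BTOLBAT/BTOLBA macros.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VERIFY16_OPCODE	0x8F
#define VERIFY16_CDBLEN	16

/* Store the low nbytes of v into p[0..nbytes-1], most significant first. */
static void
put_be(unsigned char *p, uint64_t v, int nbytes)
{
	int i;

	assert(nbytes > 0 && nbytes <= 8);
	for (i = nbytes - 1; i >= 0; i--) {
		p[i] = v & 0xff;
		v >>= 8;
	}
}

/* Bytes 2-9 carry the LBA, bytes 10-13 the block count; the rest stay 0. */
static void
build_verify16(unsigned char cdb[VERIFY16_CDBLEN], uint64_t lba,
	       uint32_t nblocks)
{
	memset(cdb, 0, VERIFY16_CDBLEN);
	cdb[0] = VERIFY16_OPCODE;
	put_be(&cdb[2], lba, 8);
	put_be(&cdb[10], nblocks, 4);
}

/* Byte offset to LBA, rounding down (used for the start of a range). */
static uint64_t
bytes_to_lba_trunc(uint64_t bytes, unsigned int lbalog)
{
	return bytes >> lbalog;
}

/* Byte count to LBA count, rounding up (used for the length of a range). */
static uint64_t
bytes_to_lba_roundup(uint64_t bytes, unsigned int lbalog)
{
	return (bytes + (1ULL << lbalog) - 1) >> lbalog;
}
```

With 512-byte logical blocks (lbalog == 9), a 4097-byte length rounds up to
9 blocks while a 4097-byte start offset truncates to LBA 8.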
configure.ac | 2 +
include/builddefs.in | 2 +
m4/package_libcdev.m4 | 30 ++++++++++
scrub/Makefile | 8 +++
scrub/disk.c | 146 +++++++++++++++++++++++++++++++++++++++++++++++++
scrub/disk.h | 1
6 files changed, 188 insertions(+), 1 deletion(-)
diff --git a/configure.ac b/configure.ac
index 8eda010..bb032e5 100644
--- a/configure.ac
+++ b/configure.ac
@@ -172,6 +172,8 @@ AC_PACKAGE_WANT_UNINORM_H
AC_HAVE_U8NORMALIZE
AC_HAVE_OPENAT
AC_HAVE_FSTATAT
+AC_HAVE_SG_IO
+AC_HAVE_HDIO_GETGEO
if test "$enable_blkid" = yes; then
AC_HAVE_BLKID_TOPO
diff --git a/include/builddefs.in b/include/builddefs.in
index 2f8d33f..d44faf9 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -125,6 +125,8 @@ HAVE_LIBATTR = @have_libattr@
HAVE_U8NORMALIZE = @have_u8normalize@
HAVE_OPENAT = @have_openat@
HAVE_FSTATAT = @have_fstatat@
+HAVE_SG_IO = @have_sg_io@
+HAVE_HDIO_GETGEO = @have_hdio_getgeo@
GCCFLAGS = -funsigned-char -fno-strict-aliasing -Wall
# -Wbitwise -Wno-transparent-union -Wno-old-initializer -Wno-decl
diff --git a/m4/package_libcdev.m4 b/m4/package_libcdev.m4
index e0abc12..9258c27 100644
--- a/m4/package_libcdev.m4
+++ b/m4/package_libcdev.m4
@@ -390,3 +390,33 @@ AC_DEFUN([AC_HAVE_FSTATAT],
#include <unistd.h>])
AC_SUBST(have_fstatat)
])
+
+#
+# Check if we have the SG_IO ioctl
+#
+AC_DEFUN([AC_HAVE_SG_IO],
+ [ AC_MSG_CHECKING([for struct sg_io_hdr ])
+ AC_TRY_COMPILE([#include <scsi/sg.h>],
+ [
+ struct sg_io_hdr hdr;
+ ioctl(0, SG_IO, &hdr);
+ ], have_sg_io=yes
+ AC_MSG_RESULT(yes),
+ AC_MSG_RESULT(no))
+ AC_SUBST(have_sg_io)
+ ])
+
+#
+# Check if we have the HDIO_GETGEO ioctl
+#
+AC_DEFUN([AC_HAVE_HDIO_GETGEO],
+ [ AC_MSG_CHECKING([for struct hd_geometry ])
+ AC_TRY_COMPILE([#include <linux/hdreg.h>],
+ [
+ struct hd_geometry hdr;
+ ioctl(0, HDIO_GETGEO, &hdr);
+ ], have_hdio_getgeo=yes
+ AC_MSG_RESULT(yes),
+ AC_MSG_RESULT(no))
+ AC_SUBST(have_hdio_getgeo)
+ ])
diff --git a/scrub/Makefile b/scrub/Makefile
index 4b70efa..1fb6e84 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -70,6 +70,14 @@ CFILES += unicrash.c
LCFLAGS += -DHAVE_U8NORMALIZE
endif
+ifeq ($(HAVE_SG_IO),yes)
+LCFLAGS += -DHAVE_SG_IO
+endif
+
+ifeq ($(HAVE_HDIO_GETGEO),yes)
+LCFLAGS += -DHAVE_HDIO_GETGEO
+endif
+
default: depend $(LTCOMMAND)
phase5.o unicrash.o xfs.o: $(TOPDIR)/include/builddefs
diff --git a/scrub/disk.c b/scrub/disk.c
index 546a06c..35c5a76 100644
--- a/scrub/disk.c
+++ b/scrub/disk.c
@@ -29,12 +29,19 @@
#include <sys/statvfs.h>
#include <sys/vfs.h>
#include <linux/fs.h>
+#ifdef HAVE_SG_IO
+# include <scsi/sg.h>
+#endif
+#ifdef HAVE_HDIO_GETGEO
+# include <linux/hdreg.h>
+#endif
#include "platform_defs.h"
#include "libfrog.h"
#include "xfs.h"
#include "path.h"
#include "xfs_fs.h"
#include "xfs_scrub.h"
+#include "common.h"
#include "disk.h"
/*
@@ -90,12 +97,119 @@ disk_heads(
return __disk_heads(disk);
}
+/*
+ * Execute a SCSI VERIFY(16) to verify disk contents.
+ * For devices that support this command, this can sharply reduce the
+ * runtime of the data block verification phase if the storage device's
+ * internal bandwidth exceeds its link bandwidth. However, it only
+ * works if we're talking to a raw SCSI device, and only if we trust the
+ * firmware.
+ */
+#ifdef HAVE_SG_IO
+# define SENSE_BUF_LEN 64
+# define VERIFY16_CMDLEN 16
+# define VERIFY16_CMD 0x8F
+
+# ifndef SG_FLAG_Q_AT_TAIL
+# define SG_FLAG_Q_AT_TAIL 0x10
+# endif
+static int
+disk_scsi_verify(
+ struct disk *disk,
+ uint64_t startblock, /* lba */
+ uint64_t blockcount) /* lba */
+{
+ struct sg_io_hdr iohdr;
+ unsigned char cdb[VERIFY16_CMDLEN];
+ unsigned char sense[SENSE_BUF_LEN];
+ uint64_t llba;
+ uint64_t veri_len = blockcount;
+ int error;
+
+ assert(!debug_tweak_on("XFS_SCRUB_NO_SCSI_VERIFY"));
+
+ llba = startblock + (disk->d_start >> BBSHIFT);
+
+ /* Borrowed from sg_verify */
+ cdb[0] = VERIFY16_CMD;
+ cdb[1] = 0; /* skip PI, DPO, and byte check. */
+ cdb[2] = (llba >> 56) & 0xff;
+ cdb[3] = (llba >> 48) & 0xff;
+ cdb[4] = (llba >> 40) & 0xff;
+ cdb[5] = (llba >> 32) & 0xff;
+ cdb[6] = (llba >> 24) & 0xff;
+ cdb[7] = (llba >> 16) & 0xff;
+ cdb[8] = (llba >> 8) & 0xff;
+ cdb[9] = llba & 0xff;
+ cdb[10] = (veri_len >> 24) & 0xff;
+ cdb[11] = (veri_len >> 16) & 0xff;
+ cdb[12] = (veri_len >> 8) & 0xff;
+ cdb[13] = veri_len & 0xff;
+ cdb[14] = 0;
+ cdb[15] = 0;
+ memset(sense, 0, SENSE_BUF_LEN);
+
+ /* v3 SG_IO */
+ memset(&iohdr, 0, sizeof(iohdr));
+ iohdr.interface_id = 'S';
+ iohdr.dxfer_direction = SG_DXFER_NONE;
+ iohdr.cmdp = cdb;
+ iohdr.cmd_len = VERIFY16_CMDLEN;
+ iohdr.sbp = sense;
+ iohdr.mx_sb_len = SENSE_BUF_LEN;
+ iohdr.flags |= SG_FLAG_Q_AT_TAIL;
+ iohdr.timeout = 30000; /* 30s */
+
+ error = ioctl(disk->d_fd, SG_IO, &iohdr);
+ if (error)
+ return error;
+
+ dbg_printf("VERIFY(16) fd %d lba %"PRIu64" len %"PRIu64" info %x "
+ "status %d masked %d msg %d host %d driver %d "
+ "duration %d resid %d\n",
+ disk->d_fd, startblock, blockcount, iohdr.info,
+ iohdr.status, iohdr.masked_status, iohdr.msg_status,
+ iohdr.host_status, iohdr.driver_status, iohdr.duration,
+ iohdr.resid);
+
+ if (iohdr.info & SG_INFO_CHECK) {
+ dbg_printf("status: msg %x host %x driver %x\n",
+ iohdr.msg_status, iohdr.host_status,
+ iohdr.driver_status);
+ errno = EIO;
+ return -1;
+ }
+
+ return error;
+}
+#else
+# define disk_scsi_verify(...) (ENOTTY)
+#endif /* HAVE_SG_IO */
+
+/* Test whether the device accepts the SCSI VERIFY command. */
+static bool
+disk_can_scsi_verify(
+ struct disk *disk)
+{
+ int error;
+
+ if (debug_tweak_on("XFS_SCRUB_NO_SCSI_VERIFY"))
+ return false;
+
+ error = disk_scsi_verify(disk, 0, 1);
+ return error == 0;
+}
+
/* Open a disk device and discover its geometry. */
struct disk *
disk_open(
const char *pathname)
{
+#ifdef HAVE_HDIO_GETGEO
+ struct hd_geometry bdgeo;
+#endif
struct disk *disk;
+ bool suspicious_disk = false;
int lba_sz;
int error;
@@ -126,13 +240,34 @@ disk_open(
error = ioctl(disk->d_fd, BLKBSZGET, &disk->d_blksize);
if (error)
disk->d_blksize = 0;
- disk->d_start = 0;
+#ifdef HAVE_HDIO_GETGEO
+ error = ioctl(disk->d_fd, HDIO_GETGEO, &bdgeo);
+ if (!error) {
+ /*
+ * dm devices will pass through ioctls, which means
+ * we can't use SCSI VERIFY unless the start is 0.
+ * Most dm devices don't set geometry (unlike scsi
+ * and nvme) so use a zeroed out CHS to screen them
+ * out.
+ */
+ if (bdgeo.start != 0 &&
+ (unsigned long long)bdgeo.heads * bdgeo.sectors *
+ bdgeo.cylinders == 0)
+ suspicious_disk = true;
+ disk->d_start = bdgeo.start << BBSHIFT;
+ } else
+#endif
+ disk->d_start = 0;
} else {
disk->d_size = disk->d_sb.st_size;
disk->d_blksize = disk->d_sb.st_blksize;
disk->d_start = 0;
}
+ /* Can we issue SCSI VERIFY? */
+ if (!suspicious_disk && disk_can_scsi_verify(disk))
+ disk->d_flags |= DISK_FLAG_SCSI_VERIFY;
+
return disk;
out_close:
close(disk->d_fd);
@@ -155,6 +290,10 @@ disk_close(
return error;
}
+#define BTOLBAT(d, bytes) ((uint64_t)(bytes) >> (d)->d_lbalog)
+#define LBASIZE(d) (1ULL << (d)->d_lbalog)
+#define BTOLBA(d, bytes) (((uint64_t)(bytes) + LBASIZE(d) - 1) >> (d)->d_lbalog)
+
/* Read-verify an extent of a disk device. */
ssize_t
disk_read_verify(
@@ -163,5 +302,10 @@ disk_read_verify(
uint64_t start,
uint64_t length)
{
+ /* Convert to logical block size. */
+ if (disk->d_flags & DISK_FLAG_SCSI_VERIFY)
+ return disk_scsi_verify(disk, BTOLBAT(disk, start),
+ BTOLBA(disk, length));
+
return pread(disk->d_fd, buf, length, start);
}
diff --git a/scrub/disk.h b/scrub/disk.h
index 834678e..8a00144 100644
--- a/scrub/disk.h
+++ b/scrub/disk.h
@@ -20,6 +20,7 @@
#ifndef XFS_SCRUB_DISK_H_
#define XFS_SCRUB_DISK_H_
+#define DISK_FLAG_SCSI_VERIFY 0x1
struct disk {
struct stat d_sb;
int d_fd;
* [PATCH 23/27] xfs_scrub: check summary counters
From: Darrick J. Wong @ 2018-01-06 1:53 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Make sure the filesystem summary counters are somewhat close to what
we can find by scanning the filesystem.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
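To make "somewhat close" concrete, here is a standalone sketch of the
tolerance test as I read it: a value passes if it differs from the expected
value by less than an absolute threshold, or failing that, if it lies
within +/- n/d of the expected value. The name roughly_equal is made up for
this note; it mirrors the logic of within_range() below with the ctx and
descr arguments dropped. Like the original, it assumes desired * (d + n)
does not overflow an unsigned long long.

```c
#include <assert.h>
#include <stdbool.h>

/* Is value within abs_threshold, or within +/- n/d, of desired? */
static bool
roughly_equal(unsigned long long value, unsigned long long desired,
	      unsigned long long abs_threshold, unsigned int n,
	      unsigned int d)
{
	assert(n < d);

	/* Small absolute differences are always tolerated. */
	if (value < desired && desired - value < abs_threshold)
		return true;
	if (value > desired && value - desired < abs_threshold)
		return true;

	/* Otherwise the value must fall inside the relative window. */
	if (value < desired * (d - n) / d)
		return false;
	if (value > desired * (d + n) / d)
		return false;

	return true;
}
```

So with n/d = 1/10 and a 32MB absolute threshold, a count that is 20MB off
passes no matter how small the filesystem is, whereas 100MB found against
150MB expected fails both the absolute and the relative test.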
scrub/Makefile | 1
scrub/common.c | 28 ++++++
scrub/common.h | 4 +
scrub/phase7.c | 266 +++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/xfs_scrub.c | 5 -
scrub/xfs_scrub.h | 1
6 files changed, 302 insertions(+), 3 deletions(-)
create mode 100644 scrub/phase7.c
diff --git a/scrub/Makefile b/scrub/Makefile
index 1fb6e84..fd26624 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -43,6 +43,7 @@ phase2.c \
phase3.c \
phase5.c \
phase6.c \
+phase7.c \
read_verify.c \
scrub.c \
spacemap.c \
diff --git a/scrub/common.c b/scrub/common.c
index 10c4017..bcdb8c0 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -413,3 +413,31 @@ _("More than %u naming warnings, shutting up."),
return debug || verbose || res;
}
+
+/* Decide if a value is within +/- (n/d) of a desired value. */
+bool
+within_range(
+ struct scrub_ctx *ctx,
+ unsigned long long value,
+ unsigned long long desired,
+ unsigned long long abs_threshold,
+ unsigned int n,
+ unsigned int d,
+ const char *descr)
+{
+ assert(n < d);
+
+ /* Don't complain if difference does not exceed an absolute value. */
+ if (value < desired && desired - value < abs_threshold)
+ return true;
+ if (value > desired && value - desired < abs_threshold)
+ return true;
+
+ /* Complain if the difference exceeds a certain percentage. */
+ if (value < desired * (d - n) / d)
+ return false;
+ if (value > desired * (d + n) / d)
+ return false;
+
+ return true;
+}
diff --git a/scrub/common.h b/scrub/common.h
index dd2070e..bd67a17 100644
--- a/scrub/common.h
+++ b/scrub/common.h
@@ -80,4 +80,8 @@ char *string_escape(const char *in);
#define TOO_MANY_NAME_WARNINGS 10000
bool should_warn_about_name(struct scrub_ctx *ctx);
+bool within_range(struct scrub_ctx *ctx, unsigned long long value,
+ unsigned long long desired, unsigned long long abs_threshold,
+ unsigned int n, unsigned int d, const char *descr);
+
#endif /* XFS_SCRUB_COMMON_H_ */
diff --git a/scrub/phase7.c b/scrub/phase7.c
new file mode 100644
index 0000000..460ca8a
--- /dev/null
+++ b/scrub/phase7.c
@@ -0,0 +1,266 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <sys/statvfs.h>
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "path.h"
+#include "ptvar.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "fscounters.h"
+#include "spacemap.h"
+
+/* Phase 7: Check summary counters. */
+
+struct xfs_summary_counts {
+ unsigned long long dbytes; /* data dev bytes */
+ unsigned long long rbytes; /* rt dev bytes */
+ unsigned long long next_phys; /* next phys bytes we see? */
+ unsigned long long agbytes; /* freespace bytes */
+};
+
+/* Record block usage. */
+static bool
+xfs_record_block_summary(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ struct fsmap *fsmap,
+ void *arg)
+{
+ struct xfs_summary_counts *counts;
+ unsigned long long len;
+
+ counts = ptvar_get((struct ptvar *)arg);
+ if (fsmap->fmr_device == ctx->fsinfo.fs_logdev)
+ return true;
+ if ((fsmap->fmr_flags & FMR_OF_SPECIAL_OWNER) &&
+ fsmap->fmr_owner == XFS_FMR_OWN_FREE)
+ return true;
+
+ len = fsmap->fmr_length;
+
+ /* freesp btrees live in free space, need to adjust counters later. */
+ if ((fsmap->fmr_flags & FMR_OF_SPECIAL_OWNER) &&
+ fsmap->fmr_owner == XFS_FMR_OWN_AG) {
+ counts->agbytes += fsmap->fmr_length;
+ }
+ if (fsmap->fmr_device == ctx->fsinfo.fs_rtdev) {
+ /* Count realtime extents. */
+ counts->rbytes += len;
+ } else {
+ /* Count datadev extents. */
+ if (counts->next_phys >= fsmap->fmr_physical + len)
+ return true;
+ else if (counts->next_phys > fsmap->fmr_physical)
+ len = fsmap->fmr_physical + len - counts->next_phys;
+ counts->dbytes += len;
+ counts->next_phys = fsmap->fmr_physical + fsmap->fmr_length;
+ }
+
+ return true;
+}
+
+/* Add all the summaries in the per-thread counter */
+static bool
+xfs_add_summaries(
+ struct ptvar *ptv,
+ void *data,
+ void *arg)
+{
+ struct xfs_summary_counts *total = arg;
+ struct xfs_summary_counts *item = data;
+
+ total->dbytes += item->dbytes;
+ total->rbytes += item->rbytes;
+ total->agbytes += item->agbytes;
+ return true;
+}
+
+/*
+ * Count all inodes and blocks in the filesystem as told by GETFSMAP and
+ * BULKSTAT, and compare that to summary counters. Since this is a live
+ * filesystem we'll be content if the summary counts are within 10% of
+ * what we observed.
+ */
+bool
+xfs_scan_summary(
+ struct scrub_ctx *ctx)
+{
+ struct xfs_summary_counts totalcount = {0};
+ struct ptvar *ptvar;
+ unsigned long long used_data;
+ unsigned long long used_rt;
+ unsigned long long used_files;
+ unsigned long long stat_data;
+ unsigned long long stat_rt;
+ uint64_t counted_inodes = 0;
+ unsigned long long absdiff;
+ unsigned long long d_blocks;
+ unsigned long long d_bfree;
+ unsigned long long r_blocks;
+ unsigned long long r_bfree;
+ unsigned long long f_files;
+ unsigned long long f_free;
+ bool moveon;
+ bool complain;
+ int error;
+
+ /* Flush everything out to disk before we start counting. */
+ error = syncfs(ctx->mnt_fd);
+ if (error) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+
+ ptvar = ptvar_init(scrub_nproc(ctx), sizeof(struct xfs_summary_counts));
+ if (!ptvar) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+
+ /* Use fsmap to count blocks. */
+ moveon = xfs_scan_all_spacemaps(ctx, xfs_record_block_summary, ptvar);
+ if (!moveon)
+ goto out_free;
+ moveon = ptvar_foreach(ptvar, xfs_add_summaries, &totalcount);
+ if (!moveon)
+ goto out_free;
+ ptvar_free(ptvar);
+
+ /* Scan the whole fs. */
+ moveon = xfs_count_all_inodes(ctx, &counted_inodes);
+ if (!moveon)
+ goto out;
+
+ moveon = xfs_scan_estimate_blocks(ctx, &d_blocks, &d_bfree, &r_blocks,
+ &r_bfree, &f_files, &f_free);
+ if (!moveon)
+ return moveon;
+
+ /*
+ * If we counted blocks with fsmap, then dblocks includes
+ * blocks for the AGFL and the freespace/rmap btrees. The
+ * filesystem treats them as "free", but since we scanned
+ * them, we'll consider them used.
+ */
+ d_bfree -= totalcount.agbytes >> ctx->blocklog;
+
+ /* Report on what we found. */
+ used_data = (d_blocks - d_bfree) << ctx->blocklog;
+ used_rt = (r_blocks - r_bfree) << ctx->blocklog;
+ used_files = f_files - f_free;
+ stat_data = totalcount.dbytes;
+ stat_rt = totalcount.rbytes;
+
+ /*
+ * Complain if the counts are off by more than 10% unless
+ * the inaccuracy is less than 32MB worth of blocks or 100 inodes.
+ */
+ absdiff = 1ULL << 25;
+ complain = verbose;
+ complain |= !within_range(ctx, stat_data, used_data, absdiff, 1, 10,
+ _("data blocks"));
+ complain |= !within_range(ctx, stat_rt, used_rt, absdiff, 1, 10,
+ _("realtime blocks"));
+ complain |= !within_range(ctx, counted_inodes, used_files, 100, 1, 10,
+ _("inodes"));
+
+ if (complain) {
+ double d, r, i;
+ char *du, *ru, *iu;
+
+ if (used_rt || stat_rt) {
+ d = auto_space_units(used_data, &du);
+ r = auto_space_units(used_rt, &ru);
+ i = auto_units(used_files, &iu);
+ fprintf(stdout,
+_("%.1f%s data used; %.1f%s realtime data used; %.2f%s inodes used.\n"),
+ d, du, r, ru, i, iu);
+ d = auto_space_units(stat_data, &du);
+ r = auto_space_units(stat_rt, &ru);
+ i = auto_units(counted_inodes, &iu);
+ fprintf(stdout,
+_("%.1f%s data found; %.1f%s realtime data found; %.2f%s inodes found.\n"),
+ d, du, r, ru, i, iu);
+ } else {
+ d = auto_space_units(used_data, &du);
+ i = auto_units(used_files, &iu);
+ fprintf(stdout,
+_("%.1f%s data used; %.1f%s inodes used.\n"),
+ d, du, i, iu);
+ d = auto_space_units(stat_data, &du);
+ i = auto_units(counted_inodes, &iu);
+ fprintf(stdout,
+_("%.1f%s data found; %.1f%s inodes found.\n"),
+ d, du, i, iu);
+ }
+ fflush(stdout);
+ }
+
+ /*
+ * Complain if the checked inode counts are off, which
+ * implies an incomplete check.
+ */
+ if (verbose ||
+ !within_range(ctx, counted_inodes, ctx->inodes_checked, 100, 1, 10,
+ _("checked inodes"))) {
+ double i1, i2;
+ char *i1u, *i2u;
+
+ i1 = auto_units(counted_inodes, &i1u);
+ i2 = auto_units(ctx->inodes_checked, &i2u);
+ fprintf(stdout,
+_("%.1f%s inodes counted; %.1f%s inodes checked.\n"),
+ i1, i1u, i2, i2u);
+ fflush(stdout);
+ }
+
+ /*
+ * Complain if the checked block counts are off, which
+ * implies an incomplete check.
+ */
+ if (ctx->bytes_checked &&
+ (verbose ||
+ !within_range(ctx, used_data + used_rt,
+ ctx->bytes_checked, absdiff, 1, 10,
+ _("verified blocks")))) {
+ double b1, b2;
+ char *b1u, *b2u;
+
+ b1 = auto_space_units(used_data + used_rt, &b1u);
+ b2 = auto_space_units(ctx->bytes_checked, &b2u);
+ fprintf(stdout,
+_("%.1f%s data counted; %.1f%s data verified.\n"),
+ b1, b1u, b2, b2u);
+ fflush(stdout);
+ }
+
+ moveon = true;
+
+out:
+ return moveon;
+out_free:
+ ptvar_free(ptvar);
+ return moveon;
+}
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index fa1d089..bc40f3c 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -374,6 +374,8 @@ run_scrub_phases(
},
{
.descr = _("Check summary counters."),
+ .fn = xfs_scan_summary,
+ .must_run = true,
},
{
NULL
@@ -443,9 +445,6 @@ main(
static bool injected;
int ret = 0;
- fprintf(stderr, "XXX: This program is not complete!\n");
- return 4;
-
progname = basename(argv[0]);
setlocale(LC_ALL, "");
bindtextdomain(PACKAGE, LOCALEDIR);
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index 026631a..a5cdba8 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -107,5 +107,6 @@ bool xfs_scan_metadata(struct scrub_ctx *ctx);
bool xfs_scan_inodes(struct scrub_ctx *ctx);
bool xfs_scan_connections(struct scrub_ctx *ctx);
bool xfs_scan_blocks(struct scrub_ctx *ctx);
+bool xfs_scan_summary(struct scrub_ctx *ctx);
#endif /* XFS_SCRUB_XFS_SCRUB_H_ */
* [PATCH 24/27] xfs_scrub: fstrim the free areas if there are no errors on the filesystem
From: Darrick J. Wong @ 2018-01-06 1:54 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
If the filesystem scan comes out clean or fixes all the problems, call
fstrim to clean out the free areas (useful if the underlying storage
supports discard, e.g. an SSD or a thin-provisioned volume).
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
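A standalone sketch of the FITRIM call this patch wraps, in case it helps
review. The helper name trim_whole_fs() is made up for illustration and is
not part of the patch; it assumes a Linux system whose <linux/fs.h>
provides FITRIM and struct fstrim_range. As in the patch, a filesystem or
device without discard support (EOPNOTSUPP or ENOTTY) is not treated as a
failure.

```c
#include <errno.h>
#include <limits.h>
#include <sys/ioctl.h>
#include <linux/fs.h>	/* FITRIM, struct fstrim_range */

/*
 * Hypothetical helper (not part of the patch): ask the filesystem that
 * owns fd to discard all of its free space.  Returns 0 if the trim ran
 * or if trimming is simply unsupported, -1 with errno set otherwise.
 */
int
trim_whole_fs(int fd)
{
	struct fstrim_range range = {
		.start = 0,
		.len = ULLONG_MAX,	/* cover the entire filesystem */
		.minlen = 0,		/* no minimum extent size requested */
	};

	if (ioctl(fd, FITRIM, &range) == 0)
		return 0;
	/* Lack of discard support is not worth reporting. */
	if (errno == EOPNOTSUPP || errno == ENOTTY)
		return 0;
	return -1;
}
```

A caller would pass a descriptor opened on the mount point, much as the
patch passes ctx->mnt_fd. The call typically needs elevated privileges.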
scrub/Makefile | 1 +
scrub/phase4.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/vfs.c | 23 +++++++++++++++++++++++
scrub/vfs.h | 2 ++
scrub/xfs_scrub.c | 26 +++++++++++++++++++++++++-
scrub/xfs_scrub.h | 1 +
6 files changed, 104 insertions(+), 1 deletion(-)
create mode 100644 scrub/phase4.c
diff --git a/scrub/Makefile b/scrub/Makefile
index fd26624..91f99ff 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -41,6 +41,7 @@ inodes.c \
phase1.c \
phase2.c \
phase3.c \
+phase4.c \
phase5.c \
phase6.c \
phase7.c \
diff --git a/scrub/phase4.c b/scrub/phase4.c
new file mode 100644
index 0000000..dadf4de
--- /dev/null
+++ b/scrub/phase4.c
@@ -0,0 +1,52 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <dirent.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/statvfs.h>
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "list.h"
+#include "path.h"
+#include "workqueue.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "scrub.h"
+#include "vfs.h"
+
+/* Phase 4: Repair filesystem. */
+
+/* Fix everything that needs fixing. */
+bool
+xfs_repair_fs(
+ struct scrub_ctx *ctx)
+{
+ bool moveon = true;
+
+ pthread_mutex_lock(&ctx->lock);
+ if (moveon && ctx->errors_found == 0)
+ fstrim(ctx);
+ pthread_mutex_unlock(&ctx->lock);
+
+ return moveon;
+}
diff --git a/scrub/vfs.c b/scrub/vfs.c
index 6a51090..98d356f 100644
--- a/scrub/vfs.c
+++ b/scrub/vfs.c
@@ -219,3 +219,26 @@ _("Could not queue directory scan work."));
free(sftd);
return false;
}
+
+#ifndef FITRIM
+struct fstrim_range {
+ __u64 start;
+ __u64 len;
+ __u64 minlen;
+};
+#define FITRIM _IOWR('X', 121, struct fstrim_range) /* Trim */
+#endif
+
+/* Call FITRIM to trim all the unused space in a filesystem. */
+void
+fstrim(
+ struct scrub_ctx *ctx)
+{
+ struct fstrim_range range = {0};
+ int error;
+
+ range.len = ULLONG_MAX;
+ error = ioctl(ctx->mnt_fd, FITRIM, &range);
+ if (error && errno != EOPNOTSUPP && errno != ENOTTY)
+ perror(_("fstrim"));
+}
diff --git a/scrub/vfs.h b/scrub/vfs.h
index 100eb18..3305159 100644
--- a/scrub/vfs.h
+++ b/scrub/vfs.h
@@ -28,4 +28,6 @@ typedef bool (*scan_fs_tree_dirent_fn)(struct scrub_ctx *, const char *,
bool scan_fs_tree(struct scrub_ctx *ctx, scan_fs_tree_dir_fn dir_fn,
scan_fs_tree_dirent_fn dirent_fn, void *arg);
+void fstrim(struct scrub_ctx *ctx);
+
#endif /* XFS_SCRUB_VFS_H_ */
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index bc40f3c..7809431 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -340,6 +340,20 @@ _("%sI/O rate: %.1f%s/s in, %.1f%s/s out, %.1f%s/s tot\n"),
return true;
}
+/* Run the preening phase if there are no errors. */
+static bool
+preen(
+ struct scrub_ctx *ctx)
+{
+ if (ctx->errors_found) {
+ str_info(ctx, ctx->mntpoint,
+_("Errors found, please re-run with -y."));
+ return true;
+ }
+
+ return xfs_repair_fs(ctx);
+}
+
/* Run all the phases of the scrubber. */
static bool
run_scrub_phases(
@@ -393,8 +407,18 @@ run_scrub_phases(
/* Run all phases of the scrub tool. */
for (phase = 1, sp = phases; sp->fn; sp++, phase++) {
/* Turn on certain phases if user said to. */
- if (sp->fn == DATASCAN_DUMMY_FN && scrub_data)
+ if (sp->fn == DATASCAN_DUMMY_FN && scrub_data) {
sp->fn = xfs_scan_blocks;
+ } else if (sp->fn == REPAIR_DUMMY_FN) {
+ if (ctx->mode == SCRUB_MODE_PREEN) {
+ sp->descr = _("Preen filesystem.");
+ sp->fn = preen;
+ } else if (ctx->mode == SCRUB_MODE_REPAIR) {
+ sp->descr = _("Repair filesystem.");
+ sp->fn = xfs_repair_fs;
+ }
+ sp->must_run = true;
+ }
/* Skip certain phases unless they're turned on. */
if (sp->fn == REPAIR_DUMMY_FN ||
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index a5cdba8..4a383f1 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -108,5 +108,6 @@ bool xfs_scan_inodes(struct scrub_ctx *ctx);
bool xfs_scan_connections(struct scrub_ctx *ctx);
bool xfs_scan_blocks(struct scrub_ctx *ctx);
bool xfs_scan_summary(struct scrub_ctx *ctx);
+bool xfs_repair_fs(struct scrub_ctx *ctx);
#endif /* XFS_SCRUB_XFS_SCRUB_H_ */
* [PATCH 25/27] xfs_scrub: progress indicator
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (23 preceding siblings ...)
2018-01-06 1:54 ` [PATCH 24/27] xfs_scrub: fstrim the free areas if there are no errors on the filesystem Darrick J. Wong
@ 2018-01-06 1:54 ` Darrick J. Wong
2018-01-11 23:27 ` Eric Sandeen
2018-01-06 1:54 ` [PATCH 26/27] xfs_scrub: create a script to scrub all xfs filesystems Darrick J. Wong
` (5 subsequent siblings)
30 siblings, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:54 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
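Implement a progress indicator. Progress goes to stdout if it is a tty,
or to the fd given with the new -C option. A tty gets a progress bar;
anything else gets numeric status lines compatible with fsck -C.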
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
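To make the two output modes concrete, here is a simplified sketch of the
rendering step. It is not the patch's progress_report(): the function
names render_bar() and render_numeric() are invented for illustration, the
twiddle character and the terminal control bytes are left out, and the
caller supplies the buffer. The bar takes whatever width is left between
the "Phase N: |" label and the percentage.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical, simplified renderer: fill buf with
 * "Phase N: |=====     | (PP.P%)".  Returns the number of '='
 * characters drawn, or -1 if max is zero or buf is too small.
 */
int
render_bar(char *buf, size_t buflen, unsigned int phase,
	   uint64_t sum, uint64_t max)
{
	char	tail[32];
	int	tag_len, tail_len, bar_len, fill;

	if (max == 0)
		return -1;
	if (sum > max)
		sum = max;	/* estimates can undershoot; clamp to 100% */

	tail_len = snprintf(tail, sizeof(tail), "| (%.1f%%)",
			100.0 * sum / max);
	tag_len = snprintf(buf, buflen, "Phase %u: |", phase);
	if (tag_len < 0 || tail_len < 0 ||
	    (size_t)tag_len + tail_len + 1 >= buflen)
		return -1;

	/* Everything between the label and the tail belongs to the bar. */
	bar_len = buflen - 1 - tag_len - tail_len;
	fill = (int)((double)bar_len * sum / max);
	memset(buf + tag_len, '=', fill);
	memset(buf + tag_len + fill, ' ', bar_len - fill);
	memcpy(buf + tag_len + bar_len, tail, tail_len + 1);
	return fill;
}

/* Machine-readable line in the "phase cur max label" layout. */
int
render_numeric(char *buf, size_t buflen, unsigned int phase,
	       uint64_t sum, uint64_t max, const char *label)
{
	if (sum > max)
		sum = max;
	return snprintf(buf, buflen, "%u %" PRIu64 " %" PRIu64 " %s",
			phase, sum, max, label);
}
```

Clamping sum to max matters because the per-phase totals are only
estimates, so the counter can overshoot them.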
man/man8/xfs_scrub.8 | 11 ++
scrub/Makefile | 2
scrub/common.c | 23 ++++-
scrub/phase2.c | 14 +++
scrub/phase3.c | 16 ++++
scrub/phase4.c | 19 ++++
scrub/phase5.c | 2
scrub/phase6.c | 28 ++++++
scrub/progress.c | 222 ++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/progress.h | 33 +++++++
scrub/read_verify.c | 2
scrub/scrub.c | 28 ++++++
scrub/xfs_scrub.c | 59 +++++++++++++
scrub/xfs_scrub.h | 14 +++
14 files changed, 463 insertions(+), 10 deletions(-)
create mode 100644 scrub/progress.c
create mode 100644 scrub/progress.h
diff --git a/man/man8/xfs_scrub.8 b/man/man8/xfs_scrub.8
index 95f4fea..dee9076 100644
--- a/man/man8/xfs_scrub.8
+++ b/man/man8/xfs_scrub.8
@@ -4,7 +4,7 @@ xfs_scrub \- scrub the contents of an XFS filesystem
.SH SYNOPSIS
.B xfs_scrub
[
-.B \-abemnTvVxy
+.B \-abCemnTvVxy
]
.I mount-point
.br
@@ -47,6 +47,15 @@ time.
If given more than once, an artificial delay of 100us is added to each
scrub call to reduce CPU overhead even further.
.TP
+.BI \-C " fd"
+This option causes xfs_scrub to write progress information to the
+specified file descriptor so that the progress of the filesystem check
+can be monitored.
+If the file descriptor is a tty, a fancy progress bar is rendered.
+Otherwise, a simple numeric status dump compatible with the
+.B fsck -C
+format is output.
+.TP
.B \-e
Specifies what happens when errors are detected.
If
diff --git a/scrub/Makefile b/scrub/Makefile
index 91f99ff..7a80ff6 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -23,6 +23,7 @@ disk.h \
filemap.h \
fscounters.h \
inodes.h \
+progress.h \
read_verify.h \
scrub.h \
spacemap.h \
@@ -45,6 +46,7 @@ phase4.c \
phase5.c \
phase6.c \
phase7.c \
+progress.c \
read_verify.c \
scrub.c \
spacemap.c \
diff --git a/scrub/common.c b/scrub/common.c
index bcdb8c0..fa66ddc 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -27,6 +27,7 @@
#include "path.h"
#include "xfs_scrub.h"
#include "common.h"
+#include "progress.h"
/*
* Reporting Status to the Console
@@ -55,6 +56,18 @@ xfs_scrub_excessive_errors(
return ret;
}
+/* If stderr is a tty, clear to end of line to clean up progress bar. */
+static inline const char *stderr_start(void)
+{
+ return stderr_isatty ? CLEAR_EOL : "";
+}
+
+/* If stdout is a tty, clear to end of line to clean up progress bar. */
+static inline const char *stdout_start(void)
+{
+ return stdout_isatty ? CLEAR_EOL : "";
+}
+
/* Print an error string and whatever error is stored in errno. */
void
__str_errno(
@@ -66,7 +79,7 @@ __str_errno(
char buf[DESCR_BUFSZ];
pthread_mutex_lock(&ctx->lock);
- fprintf(stderr, _("Error: %s: %s."), descr,
+ fprintf(stderr, _("%sError: %s: %s."), stderr_start(), descr,
strerror_r(errno, buf, DESCR_BUFSZ));
if (debug)
fprintf(stderr, _(" (%s line %d)"), file, line);
@@ -86,7 +99,7 @@ __str_errno_warn(
char buf[DESCR_BUFSZ];
pthread_mutex_lock(&ctx->lock);
- fprintf(stderr, _("Warning: %s: %s."), descr,
+ fprintf(stderr, _("%sWarning: %s: %s."), stderr_start(), descr,
strerror_r(errno, buf, DESCR_BUFSZ));
if (debug)
fprintf(stderr, _(" (%s line %d)"), file, line);
@@ -108,7 +121,7 @@ __str_error(
va_list args;
pthread_mutex_lock(&ctx->lock);
- fprintf(stderr, _("Error: %s: "), descr);
+ fprintf(stderr, _("%sError: %s: "), stderr_start(), descr);
va_start(args, format);
vfprintf(stderr, format, args);
va_end(args);
@@ -132,7 +145,7 @@ __str_warn(
va_list args;
pthread_mutex_lock(&ctx->lock);
- fprintf(stderr, _("Warning: %s: "), descr);
+ fprintf(stderr, _("%sWarning: %s: "), stderr_start(), descr);
va_start(args, format);
vfprintf(stderr, format, args);
va_end(args);
@@ -156,7 +169,7 @@ __str_info(
va_list args;
pthread_mutex_lock(&ctx->lock);
- fprintf(stdout, _("Info: %s: "), descr);
+ fprintf(stdout, _("%sInfo: %s: "), stdout_start(), descr);
va_start(args, format);
vfprintf(stdout, format, args);
va_end(args);
diff --git a/scrub/phase2.c b/scrub/phase2.c
index 153ae02..e8eb1ca 100644
--- a/scrub/phase2.c
+++ b/scrub/phase2.c
@@ -131,3 +131,17 @@ _("Could not queue filesystem scrub work."));
workqueue_destroy(&wq);
return moveon;
}
+
+/* Estimate how much work we're going to do. */
+bool
+xfs_estimate_metadata_work(
+ struct scrub_ctx *ctx,
+ uint64_t *items,
+ unsigned int *nr_threads,
+ int *rshift)
+{
+ *items = xfs_scrub_estimate_ag_work(ctx);
+ *nr_threads = scrub_nproc(ctx);
+ *rshift = 0;
+ return true;
+}
diff --git a/scrub/phase3.c b/scrub/phase3.c
index b3fc510..43697c6 100644
--- a/scrub/phase3.c
+++ b/scrub/phase3.c
@@ -30,6 +30,7 @@
#include "common.h"
#include "counter.h"
#include "inodes.h"
+#include "progress.h"
#include "scrub.h"
/* Phase 3: Scan all inodes. */
@@ -116,6 +117,7 @@ xfs_scrub_inode(
out:
ptcounter_add(icount, 1);
+ progress_add(1);
if (fd >= 0)
close(fd);
if (!moveon)
@@ -150,3 +152,17 @@ xfs_scan_inodes(
ptcounter_free(ictx.icount);
return ictx.moveon;
}
+
+/* Estimate how much work we're going to do. */
+bool
+xfs_estimate_inodes_work(
+ struct scrub_ctx *ctx,
+ uint64_t *items,
+ unsigned int *nr_threads,
+ int *rshift)
+{
+ *items = ctx->mnt_sv.f_files - ctx->mnt_sv.f_ffree;
+ *nr_threads = scrub_nproc(ctx);
+ *rshift = 0;
+ return true;
+}
diff --git a/scrub/phase4.c b/scrub/phase4.c
index dadf4de..43a654a 100644
--- a/scrub/phase4.c
+++ b/scrub/phase4.c
@@ -31,6 +31,7 @@
#include "workqueue.h"
#include "xfs_scrub.h"
#include "common.h"
+#include "progress.h"
#include "scrub.h"
#include "vfs.h"
@@ -44,9 +45,25 @@ xfs_repair_fs(
bool moveon = true;
pthread_mutex_lock(&ctx->lock);
- if (moveon && ctx->errors_found == 0)
+ if (moveon && ctx->errors_found == 0) {
fstrim(ctx);
+ progress_add(1);
+ }
pthread_mutex_unlock(&ctx->lock);
return moveon;
}
+
+/* Estimate how much work we're going to do. */
+bool
+xfs_estimate_repair_work(
+ struct scrub_ctx *ctx,
+ uint64_t *items,
+ unsigned int *nr_threads,
+ int *rshift)
+{
+ *items = 1;
+ *nr_threads = 1;
+ *rshift = 0;
+ return true;
+}
diff --git a/scrub/phase5.c b/scrub/phase5.c
index 8b8aeed..1ec8313 100644
--- a/scrub/phase5.c
+++ b/scrub/phase5.c
@@ -34,6 +34,7 @@
#include "xfs_scrub.h"
#include "common.h"
#include "inodes.h"
+#include "progress.h"
#include "scrub.h"
#include "unicrash.h"
@@ -287,6 +288,7 @@ xfs_scrub_connections(
}
out:
+ progress_add(1);
if (fd >= 0)
close(fd);
if (!moveon)
diff --git a/scrub/phase6.c b/scrub/phase6.c
index 5ecb8dc..d349730 100644
--- a/scrub/phase6.c
+++ b/scrub/phase6.c
@@ -33,6 +33,7 @@
#include "bitmap.h"
#include "disk.h"
#include "filemap.h"
+#include "fscounters.h"
#include "inodes.h"
#include "read_verify.h"
#include "spacemap.h"
@@ -514,3 +515,30 @@ _("Could not create media verifier."));
ptvar_free(ve.rvstate);
return moveon;
}
+
+/* Estimate how much work we're going to do. */
+bool
+xfs_estimate_verify_work(
+ struct scrub_ctx *ctx,
+ uint64_t *items,
+ unsigned int *nr_threads,
+ int *rshift)
+{
+ unsigned long long d_blocks;
+ unsigned long long d_bfree;
+ unsigned long long r_blocks;
+ unsigned long long r_bfree;
+ unsigned long long f_files;
+ unsigned long long f_free;
+ bool moveon;
+
+ moveon = xfs_scan_estimate_blocks(ctx, &d_blocks, &d_bfree,
+ &r_blocks, &r_bfree, &f_files, &f_free);
+ if (!moveon)
+ return moveon;
+
+ *items = ((d_blocks - d_bfree) + (r_blocks - r_bfree)) << ctx->blocklog;
+ *nr_threads = disk_heads(ctx->datadev);
+ *rshift = 20;
+ return moveon;
+}
diff --git a/scrub/progress.c b/scrub/progress.c
new file mode 100644
index 0000000..30b2152
--- /dev/null
+++ b/scrub/progress.c
@@ -0,0 +1,222 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include "libxfs.h"
+#include <stdio.h>
+#include <dirent.h>
+#include <pthread.h>
+#include <sys/statvfs.h>
+#include "../repair/threads.h"
+#include "path.h"
+#include "disk.h"
+#include "read_verify.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "counter.h"
+#include "progress.h"
+
+/*
+ * Progress Tracking
+ *
+ * For scrub phases that expect to take a long time, this facility uses
+ * the threaded counter and some phase/state information to report the
+ * progress of a particular phase to stdout. Each phase that wants
+ * progress information needs to set up the tracker with an estimate of
+ * the work to be done and periodic updates when work items finish. In
+ * return, the progress tracker will print a pretty progress bar and
+ * twiddle to a tty, or a raw numeric output compatible with fsck -C.
+ */
+struct progress_tracker {
+ FILE *fp;
+ const char *tag;
+ struct ptcounter *ptc;
+ uint64_t max;
+ unsigned int phase;
+ int rshift;
+ int twiddle;
+ bool isatty;
+ bool terminate;
+ pthread_t thread;
+
+ /* static state */
+ pthread_mutex_t lock;
+ pthread_cond_t wakeup;
+};
+
+static struct progress_tracker pt = {
+ .lock = PTHREAD_MUTEX_INITIALIZER,
+ .wakeup = PTHREAD_COND_INITIALIZER,
+};
+
+/* Add some progress. */
+void
+progress_add(
+ uint64_t x)
+{
+ if (pt.fp)
+ ptcounter_add(pt.ptc, x);
+}
+
+static const char twiddles[] = "|/-\\";
+
+static void
+progress_report(
+ uint64_t sum)
+{
+ char buf[81];
+ int tag_len;
+ int num_len;
+ int pbar_len;
+ int plen;
+
+ if (!pt.fp)
+ return;
+
+ if (sum > pt.max)
+ sum = pt.max;
+
+ /* Emulate fsck machine-readable output (phase, cur, max, label) */
+ if (!pt.isatty) {
+ snprintf(buf, sizeof(buf), _("%u %"PRIu64" %"PRIu64" %s"),
+ pt.phase, sum, pt.max, pt.tag);
+ fprintf(pt.fp, "%s\n", buf);
+ fflush(pt.fp);
+ return;
+ }
+
+ /* Interactive twiddle progress bar. */
+ if (debug) {
+ num_len = snprintf(buf, sizeof(buf),
+ "%c %"PRIu64"/%"PRIu64" (%.1f%%)",
+ twiddles[pt.twiddle],
+ sum >> pt.rshift,
+ pt.max >> pt.rshift,
+ 100.0 * sum / pt.max);
+ } else {
+ num_len = snprintf(buf, sizeof(buf),
+ "%c (%.1f%%)",
+ twiddles[pt.twiddle],
+ 100.0 * sum / pt.max);
+ }
+ memmove(buf + sizeof(buf) - (num_len + 1), buf, num_len + 1);
+ tag_len = snprintf(buf, sizeof(buf), _("Phase %u: |"), pt.phase);
+ pbar_len = sizeof(buf) - (num_len + 1 + tag_len);
+ plen = (int)((double)pbar_len * sum / pt.max);
+ memset(buf + tag_len, '=', plen);
+ memset(buf + tag_len + plen, ' ', pbar_len - plen);
+ pt.twiddle = (pt.twiddle + 1) % 4;
+ fprintf(pt.fp, "%c%s\r%c", START_IGNORE, buf, END_IGNORE);
+ fflush(pt.fp);
+}
+
+#define NSEC_PER_SEC (1000000000)
+static void *
+progress_report_thread(void *arg)
+{
+ struct timespec abstime;
+ int ret;
+
+ pthread_mutex_lock(&pt.lock);
+ while (1) {
+ /* Every half second. */
+ ret = clock_gettime(CLOCK_REALTIME, &abstime);
+ if (ret)
+ break;
+ abstime.tv_nsec += NSEC_PER_SEC / 2;
+ if (abstime.tv_nsec >= NSEC_PER_SEC) {
+ abstime.tv_sec++;
+ abstime.tv_nsec -= NSEC_PER_SEC;
+ }
+ pthread_cond_timedwait(&pt.wakeup, &pt.lock, &abstime);
+ if (pt.terminate)
+ break;
+ progress_report(ptcounter_value(pt.ptc));
+ }
+ pthread_mutex_unlock(&pt.lock);
+ return NULL;
+}
+
+/* End a phase of progress reporting. */
+void
+progress_end_phase(void)
+{
+ if (!pt.fp)
+ return;
+
+ pthread_mutex_lock(&pt.lock);
+ pt.terminate = true;
+ pthread_mutex_unlock(&pt.lock);
+ pthread_cond_broadcast(&pt.wakeup);
+ pthread_join(pt.thread, NULL);
+
+ progress_report(pt.max);
+ ptcounter_free(pt.ptc);
+ pt.max = 0;
+ pt.ptc = NULL;
+ if (pt.fp) {
+ fprintf(pt.fp, CLEAR_EOL);
+ fflush(pt.fp);
+ }
+ pt.fp = NULL;
+}
+
+/* Set ourselves up to report progress. */
+bool
+progress_init_phase(
+ struct scrub_ctx *ctx,
+ FILE *fp,
+ unsigned int phase,
+ uint64_t max,
+ int rshift,
+ unsigned int nr_threads)
+{
+ int ret;
+
+ assert(pt.fp == NULL);
+ if (fp == NULL || max == 0) {
+ pt.fp = NULL;
+ return true;
+ }
+ pt.fp = fp;
+ pt.isatty = isatty(fileno(fp));
+ pt.tag = ctx->mntpoint;
+ pt.max = max;
+ pt.phase = phase;
+ pt.rshift = rshift;
+ pt.twiddle = 0;
+ pt.terminate = false;
+
+ pt.ptc = ptcounter_init(nr_threads);
+ if (!pt.ptc)
+ goto out_max;
+
+ ret = pthread_create(&pt.thread, NULL, progress_report_thread, NULL);
+ if (ret)
+ goto out_ptcounter;
+
+ return true;
+
+out_ptcounter:
+ ptcounter_free(pt.ptc);
+ pt.ptc = NULL;
+out_max:
+ pt.max = 0;
+ pt.fp = NULL;
+ return false;
+}
diff --git a/scrub/progress.h b/scrub/progress.h
new file mode 100644
index 0000000..29a3e83
--- /dev/null
+++ b/scrub/progress.h
@@ -0,0 +1,33 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_PROGRESS_H_
+#define XFS_SCRUB_PROGRESS_H_
+
+#define CLEAR_EOL "\033[K"
+#define START_IGNORE '\001'
+#define END_IGNORE '\002'
+
+bool progress_init_phase(struct scrub_ctx *ctx, FILE *progress_fp,
+ unsigned int phase, uint64_t max, int rshift,
+ unsigned int nr_threads);
+void progress_end_phase(void);
+void progress_add(uint64_t x);
+
+#endif /* XFS_SCRUB_PROGRESS_H_ */
diff --git a/scrub/read_verify.c b/scrub/read_verify.c
index 244626d..e816688 100644
--- a/scrub/read_verify.c
+++ b/scrub/read_verify.c
@@ -31,6 +31,7 @@
#include "counter.h"
#include "disk.h"
#include "read_verify.h"
+#include "progress.h"
/*
* Read Verify Pool
@@ -154,6 +155,7 @@ read_verify(
errno, rv->io_end_arg);
}
+ progress_add(len);
verified += len;
rv->io_start += len;
rv->io_length -= len;
diff --git a/scrub/scrub.c b/scrub/scrub.c
index 98e7e0d..bc4eab4 100644
--- a/scrub/scrub.c
+++ b/scrub/scrub.c
@@ -31,6 +31,7 @@
#include "path.h"
#include "xfs_scrub.h"
#include "common.h"
+#include "progress.h"
#include "scrub.h"
#include "xfs_errortag.h"
@@ -343,6 +344,7 @@ xfs_scrub_metadata(
/* Check the item. */
fix = xfs_check_metadata(ctx, ctx->mnt_fd, &meta, false);
+ progress_add(1);
switch (fix) {
case CHECK_ABORT:
return false;
@@ -416,6 +418,32 @@ xfs_scrub_fs_metadata(
return xfs_scrub_metadata(ctx, ST_FS, 0);
}
+/* How many items do we have to check? */
+unsigned int
+xfs_scrub_estimate_ag_work(
+ struct scrub_ctx *ctx)
+{
+ const struct scrub_descr *sc;
+ int type;
+ unsigned int estimate = 0;
+
+ sc = scrubbers;
+ for (type = 0; type < XFS_SCRUB_TYPE_NR; type++, sc++) {
+ switch (sc->type) {
+ case ST_AGHEADER:
+ case ST_PERAG:
+ estimate += ctx->geo.agcount;
+ break;
+ case ST_FS:
+ estimate++;
+ break;
+ default:
+ break;
+ }
+ }
+ return estimate;
+}
+
/* Scrub inode metadata. */
static bool
__xfs_scrub_file(
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index 7809431..5750108 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -32,6 +32,7 @@
#include "xfs_scrub.h"
#include "common.h"
#include "unicrash.h"
+#include "progress.h"
/*
* XFS Online Metadata Scrub (and Repair)
@@ -139,12 +140,17 @@ bool scrub_data;
/* Size of a memory page. */
long page_size;
+/* If stdout/stderr are ttys, we can use richer terminal control. */
+bool stderr_isatty;
+bool stdout_isatty;
+
static void __attribute__((noreturn))
usage(void)
{
fprintf(stderr, _("Usage: %s [OPTIONS] mountpoint\n"), progname);
fprintf(stderr, _("-a:\tStop after this many errors are found.\n"));
fprintf(stderr, _("-b:\tBackground mode.\n"));
+ fprintf(stderr, _("-C:\tPrint progress information to this fd.\n"));
fprintf(stderr, _("-e:\tWhat to do if errors are found.\n"));
fprintf(stderr, _("-m:\tPath to /etc/mtab.\n"));
fprintf(stderr, _("-n:\tDry run. Do not modify anything.\n"));
@@ -219,6 +225,8 @@ struct phase_rusage {
struct phase_ops {
char *descr;
bool (*fn)(struct scrub_ctx *);
+ bool (*estimate_work)(struct scrub_ctx *, uint64_t *,
+ unsigned int *, int *);
bool must_run;
};
@@ -357,7 +365,8 @@ _("Errors found, please re-run with -y."));
/* Run all the phases of the scrubber. */
static bool
run_scrub_phases(
- struct scrub_ctx *ctx)
+ struct scrub_ctx *ctx,
+ FILE *progress_fp)
{
struct phase_ops phases[] =
{
@@ -369,22 +378,27 @@ run_scrub_phases(
{
.descr = _("Check internal metadata."),
.fn = xfs_scan_metadata,
+ .estimate_work = xfs_estimate_metadata_work,
},
{
.descr = _("Scan all inodes."),
.fn = xfs_scan_inodes,
+ .estimate_work = xfs_estimate_inodes_work,
},
{
.descr = _("Defer filesystem repairs."),
.fn = REPAIR_DUMMY_FN,
+ .estimate_work = xfs_estimate_repair_work,
},
{
.descr = _("Check directory tree."),
.fn = xfs_scan_connections,
+ .estimate_work = xfs_estimate_inodes_work,
},
{
.descr = _("Verify data file integrity."),
.fn = DATASCAN_DUMMY_FN,
+ .estimate_work = xfs_estimate_verify_work,
},
{
.descr = _("Check summary counters."),
@@ -397,9 +411,12 @@ run_scrub_phases(
};
struct phase_rusage pi;
struct phase_ops *sp;
+ uint64_t max_work;
bool moveon = true;
unsigned int debug_phase = 0;
unsigned int phase;
+ unsigned int nr_threads;
+ int rshift;
if (debug && debug_tweak_on("XFS_SCRUB_PHASE"))
debug_phase = atoi(getenv("XFS_SCRUB_PHASE"));
@@ -433,6 +450,18 @@ run_scrub_phases(
moveon = phase_start(&pi, phase, sp->descr);
if (!moveon)
break;
+ if (sp->estimate_work) {
+ moveon = sp->estimate_work(ctx, &max_work, &nr_threads,
+ &rshift);
+ if (!moveon)
+ break;
+ moveon = progress_init_phase(ctx, progress_fp, phase,
+ max_work, rshift, nr_threads);
+ } else {
+ moveon = progress_init_phase(ctx, NULL, phase, 0, 0, 0);
+ }
+ if (!moveon)
+ break;
moveon = sp->fn(ctx);
if (!moveon) {
str_info(ctx, ctx->mntpoint,
@@ -440,6 +469,7 @@ _("Scrub aborted after phase %d."),
phase);
break;
}
+ progress_end_phase();
moveon = phase_end(&pi, phase);
if (!moveon)
break;
@@ -461,6 +491,7 @@ main(
int c;
char *mtab = NULL;
char *repairstr = "";
+ FILE *progress_fp = NULL;
struct scrub_ctx ctx = {0};
struct phase_rusage all_pi;
unsigned long long total_errors;
@@ -477,7 +508,7 @@ main(
pthread_mutex_init(&ctx.lock, NULL);
ctx.mode = SCRUB_MODE_DEFAULT;
ctx.error_action = ERRORS_CONTINUE;
- while ((c = getopt(argc, argv, "a:bde:m:nTvxVy")) != EOF) {
+ while ((c = getopt(argc, argv, "a:bC:de:m:nTvxVy")) != EOF) {
switch (c) {
case 'a':
ctx.max_errors = cvt_u64(optarg, 10);
@@ -490,6 +521,19 @@ main(
nr_threads = 1;
bg_mode++;
break;
+ case 'C':
+ errno = 0;
+ ret = cvt_u32(optarg, 10);
+ if (errno) {
+ perror(optarg);
+ usage();
+ }
+ progress_fp = fdopen(ret, "w");
+ if (!progress_fp) {
+ perror(optarg);
+ usage();
+ }
+ break;
case 'd':
debug++;
dumpcore = true;
@@ -560,6 +604,13 @@ _("Only one of the options -n or -y may be specified.\n"));
unicrash_setup();
ctx.mntpoint = strdup(argv[optind]);
+ stdout_isatty = isatty(STDOUT_FILENO);
+ stderr_isatty = isatty(STDERR_FILENO);
+
+ /* If interactive, start the progress bar. */
+ if (stdout_isatty && !progress_fp)
+ progress_fp = fdopen(1, "w+");
+
/* Find the mount record for the passed-in argument. */
if (stat(argv[optind], &ctx.mnt_sb) < 0) {
fprintf(stderr,
@@ -615,7 +666,7 @@ _("%s: Not a XFS mount point or block device.\n"),
}
/* Scrub a filesystem. */
- moveon = run_scrub_phases(&ctx);
+ moveon = run_scrub_phases(&ctx, progress_fp);
if (!moveon)
ret |= 4;
@@ -657,6 +708,8 @@ _("%s: %llu warnings found.\n"),
if (ctx.runtime_errors)
ret |= 4;
phase_end(&all_pi, 0);
+ if (progress_fp)
+ fclose(progress_fp);
free(ctx.blkdev);
free(ctx.mntpoint);
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index 4a383f1..cda290c 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -31,6 +31,8 @@ extern bool dumpcore;
extern bool verbose;
extern bool scrub_data;
extern long page_size;
+extern bool stderr_isatty;
+extern bool stdout_isatty;
enum scrub_mode {
SCRUB_MODE_DRY_RUN,
@@ -110,4 +112,16 @@ bool xfs_scan_blocks(struct scrub_ctx *ctx);
bool xfs_scan_summary(struct scrub_ctx *ctx);
bool xfs_repair_fs(struct scrub_ctx *ctx);
+/* Progress estimator functions */
+uint64_t xfs_estimate_inodes(struct scrub_ctx *ctx);
+unsigned int xfs_scrub_estimate_ag_work(struct scrub_ctx *ctx);
+bool xfs_estimate_metadata_work(struct scrub_ctx *ctx, uint64_t *items,
+ unsigned int *nr_threads, int *rshift);
+bool xfs_estimate_inodes_work(struct scrub_ctx *ctx, uint64_t *items,
+ unsigned int *nr_threads, int *rshift);
+bool xfs_estimate_repair_work(struct scrub_ctx *ctx, uint64_t *items,
+ unsigned int *nr_threads, int *rshift);
+bool xfs_estimate_verify_work(struct scrub_ctx *ctx, uint64_t *items,
+ unsigned int *nr_threads, int *rshift);
+
#endif /* XFS_SCRUB_XFS_SCRUB_H_ */
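A note on the half-second wakeup in progress_report_thread:
pthread_cond_timedwait() takes an absolute deadline, and POSIX lets it
fail with EINVAL when tv_nsec falls outside [0, 1000000000). A minimal
sketch of the carry step follows. The helper name timespec_add_ns() is
hypothetical, and it assumes the increment is less than one second. The
comparison has to be >=, because a tv_nsec of exactly one second is
already out of range.

```c
#include <time.h>

#define NSEC_PER_SEC	1000000000L

/*
 * Hypothetical helper: advance ts by delta_ns (0 <= delta_ns < 1s) and
 * renormalize so that tv_nsec stays within [0, NSEC_PER_SEC).
 */
void
timespec_add_ns(struct timespec *ts, long delta_ns)
{
	ts->tv_nsec += delta_ns;
	if (ts->tv_nsec >= NSEC_PER_SEC) {
		ts->tv_sec++;
		ts->tv_nsec -= NSEC_PER_SEC;
	}
}
```

With this, the loop body would reduce to clock_gettime() followed by
timespec_add_ns(&abstime, NSEC_PER_SEC / 2).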
* [PATCH 26/27] xfs_scrub: create a script to scrub all xfs filesystems
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (24 preceding siblings ...)
2018-01-06 1:54 ` [PATCH 25/27] xfs_scrub: progress indicator Darrick J. Wong
@ 2018-01-06 1:54 ` Darrick J. Wong
2018-01-06 1:54 ` [PATCH 27/27] xfs_scrub: integrate services with systemd Darrick J. Wong
` (4 subsequent siblings)
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:54 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Create an xfs_scrub_all command to find all XFS filesystems
and run an online scrub against them all.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
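The man page added here says scrubs run in parallel as long as no two
scrubbers touch the same device. A rough model of that rule, written in C
to match the rest of the series rather than the Python of the script
itself. This is not the script's actual algorithm; struct fs_entry,
pick_next_fs() and finish_fs() are invented names, and it assumes each
filesystem's devices fit in a 64-bit mask.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: each filesystem carries a bitmask of the physical
 * devices underneath it (bit i set means "uses device i").
 */
struct fs_entry {
	const char	*mountpoint;
	uint64_t	devices;
	bool		started;
};

/*
 * Return the index of the first unstarted filesystem that shares no
 * device with the busy set, marking it started and adding its devices
 * to *busy.  Returns -1 if every remaining filesystem would collide.
 */
int
pick_next_fs(struct fs_entry *fs, size_t nr, uint64_t *busy)
{
	size_t	i;

	for (i = 0; i < nr; i++) {
		if (fs[i].started || (fs[i].devices & *busy))
			continue;
		fs[i].started = true;
		*busy |= fs[i].devices;
		return (int)i;
	}
	return -1;
}

/* Release a finished filesystem's devices so others can be scheduled. */
void
finish_fs(const struct fs_entry *fs, uint64_t *busy)
{
	*busy &= ~fs->devices;
}
```

In this model, two filesystems on partitions of the same disk are scrubbed
one after the other, while a filesystem on a different disk can start
right away.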
debian/control | 3 +
debian/rules | 1
man/man8/xfs_scrub_all.8 | 32 ++++++++++
scrub/Makefile | 15 ++++
scrub/xfs_scrub_all.in | 154 ++++++++++++++++++++++++++++++++++++++++++++++
5 files changed, 201 insertions(+), 4 deletions(-)
create mode 100644 man/man8/xfs_scrub_all.8
create mode 100644 scrub/xfs_scrub_all.in
diff --git a/debian/control b/debian/control
index 36d1bd8..801744b 100644
--- a/debian/control
+++ b/debian/control
@@ -3,12 +3,13 @@ Section: admin
Priority: optional
Maintainer: XFS Development Team <linux-xfs@vger.kernel.org>
Uploaders: Nathan Scott <nathans@debian.org>, Anibal Monsalve Salazar <anibal@debian.org>
-Build-Depends: uuid-dev, dh-autoreconf, debhelper (>= 5), gettext, libtool, libreadline-gplv2-dev | libreadline5-dev, libblkid-dev (>= 2.17), linux-libc-dev, libdevmapper-dev, libattr1-dev, libunistring-dev
+Build-Depends: uuid-dev, dh-autoreconf, debhelper (>= 5), gettext, libtool, libreadline-gplv2-dev | libreadline5-dev, libblkid-dev (>= 2.17), linux-libc-dev, libdevmapper-dev, libattr1-dev, libunistring-dev, dh-python
Standards-Version: 3.9.1
Homepage: https://xfs.wiki.kernel.org/
Package: xfsprogs
Depends: ${shlibs:Depends}, ${misc:Depends}
+Recommends: ${python3:Depends}, util-linux
Provides: fsck-backend
Suggests: xfsdump, acl, attr, quota
Breaks: xfsdump (<< 3.0.0)
diff --git a/debian/rules b/debian/rules
index baefdba..abb794e 100755
--- a/debian/rules
+++ b/debian/rules
@@ -76,6 +76,7 @@ binary-arch: checkroot built
$(pkgdi) $(MAKE) -C debian install-d-i
$(pkgme) $(MAKE) dist
rmdir debian/xfslibs-dev/usr/share/doc/xfsprogs
+ dh_python3
dh_installdocs
dh_installchangelogs
dh_strip
diff --git a/man/man8/xfs_scrub_all.8 b/man/man8/xfs_scrub_all.8
new file mode 100644
index 0000000..5e1420b
--- /dev/null
+++ b/man/man8/xfs_scrub_all.8
@@ -0,0 +1,32 @@
+.TH xfs_scrub_all 8
+.SH NAME
+xfs_scrub_all \- scrub all mounted XFS filesystems
+.SH SYNOPSIS
+.B xfs_scrub_all
+.SH DESCRIPTION
+.B xfs_scrub_all
+attempts to read and check all the metadata on all mounted XFS filesystems.
+The online scrub is performed via the
+.B xfs_scrub
+tool, either by running it directly or by using systemd to start it
+in a restricted fashion.
+Mounted filesystems are mapped to physical storage devices so that scrub
+operations can be run in parallel so long as no two scrubbers access
+the same device simultaneously.
+.SH EXIT CODE
+The exit code returned by
+.B xfs_scrub_all
+is the sum of the following conditions:
+.br
+\ 0\ \-\ No errors
+.br
+\ 4\ \-\ File system errors left uncorrected
+.br
+\ 8\ \-\ Operational error
+.br
+\ 16\ \-\ Usage or syntax error
+.TP
+These are the same error codes returned by xfs_scrub.
+.br
+.SH SEE ALSO
+.BR xfs_scrub (8).
diff --git a/scrub/Makefile b/scrub/Makefile
index 7a80ff6..f709606 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -13,6 +13,8 @@ SCRUB_PREREQS=$(PKG_PLATFORM)$(HAVE_OPENAT)$(HAVE_FSTATAT)
ifeq ($(SCRUB_PREREQS),linuxyesyes)
LTCOMMAND = xfs_scrub
INSTALL_SCRUB = install-scrub
+XFS_SCRUB_ALL_PROG = xfs_scrub_all
+XFS_SCRUB_ARGS = -b -n
endif # scrub_prereqs
HFILES = \
@@ -82,17 +84,24 @@ ifeq ($(HAVE_HDIO_GETGEO),yes)
LCFLAGS += -DHAVE_HDIO_GETGEO
endif
-default: depend $(LTCOMMAND)
+default: depend $(LTCOMMAND) $(XFS_SCRUB_ALL_PROG)
+
+xfs_scrub_all: xfs_scrub_all.in
+ @echo " [SED] $@"
+ $(Q)$(SED) -e "s|@sbindir@|$(PKG_ROOT_SBIN_DIR)|g" \
+ -e "s|@scrub_args@|$(XFS_SCRUB_ARGS)|g" < $< > $@
+ $(Q)chmod a+x $@
phase5.o unicrash.o xfs.o: $(TOPDIR)/include/builddefs
include $(BUILDRULES)
-install: default $(INSTALL_SCRUB)
+install: $(INSTALL_SCRUB)
-install-scrub:
+install-scrub: default
$(INSTALL) -m 755 -d $(PKG_ROOT_SBIN_DIR)
$(LTINSTALL) -m 755 $(LTCOMMAND) $(PKG_ROOT_SBIN_DIR)
+ $(INSTALL) -m 755 $(XFS_SCRUB_ALL_PROG) $(PKG_ROOT_SBIN_DIR)
install-dev:
diff --git a/scrub/xfs_scrub_all.in b/scrub/xfs_scrub_all.in
new file mode 100644
index 0000000..7738644
--- /dev/null
+++ b/scrub/xfs_scrub_all.in
@@ -0,0 +1,154 @@
+#!/usr/bin/env python3
+
+# Run online scrubbers in parallel, but avoid thrashing.
+#
+# Copyright (C) 2018 Oracle. All rights reserved.
+#
+# Author: Darrick J. Wong <darrick.wong@oracle.com>
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; either version 2
+# of the License, or (at your option) any later version.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+
+import subprocess
+import json
+import threading
+import time
+import sys
+
+retcode = 0
+terminate = False
+
+def find_mounts():
+ '''Map mountpoints to physical disks.'''
+
+ fs = {}
+ cmd=['lsblk', '-o', 'KNAME,TYPE,FSTYPE,MOUNTPOINT', '-J']
+ result = subprocess.Popen(cmd, stdout=subprocess.PIPE)
+ result.wait()
+ if result.returncode != 0:
+ return fs
+ sarray = [x.decode('utf-8') for x in result.stdout.readlines()]
+ output = ' '.join(sarray)
+ bdevdata = json.loads(output)
+ # The lsblk output had better be in disks-then-partitions order
+ for bdev in bdevdata['blockdevices']:
+ if bdev['type'] in ('disk', 'loop'):
+ lastdisk = bdev['kname']
+ if bdev['fstype'] == 'xfs':
+ mnt = bdev['mountpoint']
+ if mnt is None:
+ continue
+ if mnt in fs:
+ fs[mnt].add(lastdisk)
+ else:
+ fs[mnt] = set([lastdisk])
+ return fs
+
+def run_killable(cmd, stdout, killfuncs, kill_fn):
+ '''Run a killable program. Returns program retcode or -1 if we can't start it.'''
+ try:
+ proc = subprocess.Popen(cmd, stdout = stdout)
+ real_kill_fn = lambda: kill_fn(proc)
+ killfuncs.add(real_kill_fn)
+ proc.wait()
+ try:
+ killfuncs.remove(real_kill_fn)
+ except:
+ pass
+ return proc.returncode
+ except:
+ return -1
+
+def run_scrub(mnt, cond, running_devs, mntdevs, killfuncs):
+ '''Run a scrub process.'''
+ global retcode, terminate
+
+ print("Scrubbing %s..." % mnt)
+ sys.stdout.flush()
+
+ try:
+ if terminate:
+ return
+
+ # Invoke xfs_scrub manually
+ cmd=['@sbindir@/xfs_scrub', '@scrub_args@', mnt]
+ ret = run_killable(cmd, None, killfuncs, \
+ lambda proc: proc.terminate())
+ if ret >= 0:
+ print("Scrubbing %s done, (err=%d)" % (mnt, ret))
+ sys.stdout.flush()
+ retcode |= ret
+ return
+
+ if terminate:
+ return
+
+ print("Unable to start scrub tool.")
+ sys.stdout.flush()
+ finally:
+ running_devs -= mntdevs
+ cond.acquire()
+ cond.notify()
+ cond.release()
+
+def main():
+ '''Find mounts, schedule scrub runs.'''
+ def thr(mnt, devs):
+ a = (mnt, cond, running_devs, devs, killfuncs)
+ thr = threading.Thread(target = run_scrub, args = a)
+ thr.start()
+ global retcode, terminate
+
+ fs = find_mounts()
+
+ # Schedule scrub jobs...
+ running_devs = set()
+ killfuncs = set()
+ cond = threading.Condition()
+ while len(fs) > 0:
+ if len(running_devs) == 0:
+ mnt, devs = fs.popitem()
+ running_devs.update(devs)
+ thr(mnt, devs)
+ poppers = set()
+ for mnt in fs:
+ devs = fs[mnt]
+ can_run = True
+ for dev in devs:
+ if dev in running_devs:
+ can_run = False
+ break
+ if can_run:
+ running_devs.update(devs)
+ poppers.add(mnt)
+ thr(mnt, devs)
+ for p in poppers:
+ fs.pop(p)
+ cond.acquire()
+ try:
+ cond.wait()
+ except KeyboardInterrupt:
+ terminate = True
+ print("Terminating...")
+ sys.stdout.flush()
+ while len(killfuncs) > 0:
+ fn = killfuncs.pop()
+ fn()
+ fs = []
+ cond.release()
+
+ sys.exit(retcode)
+
+if __name__ == '__main__':
+ main()
* [PATCH 27/27] xfs_scrub: integrate services with systemd
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (25 preceding siblings ...)
2018-01-06 1:54 ` [PATCH 26/27] xfs_scrub: create a script to scrub all xfs filesystems Darrick J. Wong
@ 2018-01-06 1:54 ` Darrick J. Wong
2018-01-06 3:50 ` [PATCH 07/27] xfs_scrub: find XFS filesystem geometry Darrick J. Wong
` (3 subsequent siblings)
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 1:54 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Create a systemd service unit so that we can run the online scrubber
under systemd with (somewhat) appropriate containment.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
.gitignore | 4 +++
configure.ac | 15 +++++++++++
include/builddefs.in | 3 ++
scrub/Makefile | 32 ++++++++++++++++++++++-
scrub/xfs_scrub.c | 25 ++++++++++++++++++
scrub/xfs_scrub@.service.in | 18 +++++++++++++
scrub/xfs_scrub_all.cron.in | 2 +
scrub/xfs_scrub_all.in | 53 ++++++++++++++++++++++++++++++++++++++
scrub/xfs_scrub_all.service.in | 8 ++++++
scrub/xfs_scrub_all.timer | 11 ++++++++
scrub/xfs_scrub_fail | 26 +++++++++++++++++++
scrub/xfs_scrub_fail@.service.in | 10 +++++++
12 files changed, 206 insertions(+), 1 deletion(-)
create mode 100644 scrub/xfs_scrub@.service.in
create mode 100644 scrub/xfs_scrub_all.cron.in
create mode 100644 scrub/xfs_scrub_all.service.in
create mode 100644 scrub/xfs_scrub_all.timer
create mode 100755 scrub/xfs_scrub_fail
create mode 100644 scrub/xfs_scrub_fail@.service.in
diff --git a/.gitignore b/.gitignore
index a3db640..d887451 100644
--- a/.gitignore
+++ b/.gitignore
@@ -69,6 +69,10 @@ cscope.*
/rtcp/xfs_rtcp
/spaceman/xfs_spaceman
/scrub/xfs_scrub
+/scrub/xfs_scrub@.service
+/scrub/xfs_scrub_all
+/scrub/xfs_scrub_all.service
+/scrub/xfs_scrub_fail@.service
# generated crc files
/libxfs/crc32selftest
diff --git a/configure.ac b/configure.ac
index bb032e5..f7840db 100644
--- a/configure.ac
+++ b/configure.ac
@@ -121,6 +121,21 @@ esac
AC_SUBST([root_sbindir])
AC_SUBST([root_libdir])
+# Where do systemd services go?
+pkg_systemdsystemunitdir="$(pkg-config --variable=systemdsystemunitdir systemd 2>/dev/null)"
+case "${pkg_systemdsystemunitdir}" in
+"")
+ systemdsystemunitdir=""
+ have_systemd=no
+ ;;
+*)
+ systemdsystemunitdir="${pkg_systemdsystemunitdir}"
+ have_systemd=yes
+ ;;
+esac
+AC_SUBST([have_systemd])
+AC_SUBST([systemdsystemunitdir])
+
# Find localized files. Don't descend into any "dot directories"
# (like .git or .pc from quilt). Strangely, the "-print" argument
# to "find" is required, to avoid including such directories in the
diff --git a/include/builddefs.in b/include/builddefs.in
index d44faf9..4b4bf41 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -128,6 +128,9 @@ HAVE_FSTATAT = @have_fstatat@
HAVE_SG_IO = @have_sg_io@
HAVE_HDIO_GETGEO = @have_hdio_getgeo@
+HAVE_SYSTEMD = @have_systemd@
+SYSTEMDSYSTEMUNITDIR = @systemdsystemunitdir@
+
GCCFLAGS = -funsigned-char -fno-strict-aliasing -Wall
# -Wbitwise -Wno-transparent-union -Wno-old-initializer -Wno-decl
diff --git a/scrub/Makefile b/scrub/Makefile
index f709606..3e6f690 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -15,6 +15,16 @@ LTCOMMAND = xfs_scrub
INSTALL_SCRUB = install-scrub
XFS_SCRUB_ALL_PROG = xfs_scrub_all
XFS_SCRUB_ARGS = -b -n
+ifeq ($(HAVE_SYSTEMD),yes)
+INSTALL_SCRUB += install-systemd
+SYSTEMDSERVICES = xfs_scrub@.service xfs_scrub_all.service xfs_scrub_all.timer xfs_scrub_fail@.service
+endif
+CRONSERVICES = xfs_scrub_all.cron
+CROND_DIR = /etc/cron.d
+
+# Disable all the crontabs for now
+CROND_DIR = $(PKG_LIB_DIR)/$(PKG_NAME)
+
endif # scrub_prereqs
HFILES = \
@@ -84,7 +94,8 @@ ifeq ($(HAVE_HDIO_GETGEO),yes)
LCFLAGS += -DHAVE_HDIO_GETGEO
endif
-default: depend $(LTCOMMAND) $(XFS_SCRUB_ALL_PROG)
+default: depend $(LTCOMMAND) $(XFS_SCRUB_ALL_PROG) $(SYSTEMDSERVICES) \
+ $(CRONSERVICES)
xfs_scrub_all: xfs_scrub_all.in
@echo " [SED] $@"
@@ -98,10 +109,29 @@ include $(BUILDRULES)
install: $(INSTALL_SCRUB)
+%.service: %.service.in
+ @echo " [SED] $@"
+ $(Q)$(SED) -e "s|@sbindir@|$(PKG_ROOT_SBIN_DIR)|g" \
+ -e "s|@scrub_args@|$(XFS_SCRUB_ARGS)|g" \
+ -e "s|@pkg_lib_dir@|$(PKG_LIB_DIR)|g" \
+ -e "s|@pkg_name@|$(PKG_NAME)|g" < $< > $@
+
+%.cron: %.cron.in
+ @echo " [SED] $@"
+ $(Q)$(SED) -e "s|@sbindir@|$(PKG_ROOT_SBIN_DIR)|g" < $< > $@
+
+install-systemd: default
+ $(INSTALL) -m 755 -d $(SYSTEMDSYSTEMUNITDIR)
+ $(INSTALL) -m 644 $(SYSTEMDSERVICES) $(SYSTEMDSYSTEMUNITDIR)
+ $(INSTALL) -m 755 -d $(PKG_LIB_DIR)/$(PKG_NAME)
+ $(INSTALL) -m 755 xfs_scrub_fail $(PKG_LIB_DIR)/$(PKG_NAME)
+
install-scrub: default
$(INSTALL) -m 755 -d $(PKG_ROOT_SBIN_DIR)
$(LTINSTALL) -m 755 $(LTCOMMAND) $(PKG_ROOT_SBIN_DIR)
$(INSTALL) -m 755 $(XFS_SCRUB_ALL_PROG) $(PKG_ROOT_SBIN_DIR)
+ $(INSTALL) -m 755 -d $(CROND_DIR)
+ $(INSTALL) -m 644 $(CRONSERVICES) $(CROND_DIR)
install-dev:
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index 5750108..66c64a4 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -144,6 +144,12 @@ long page_size;
bool stderr_isatty;
bool stdout_isatty;
+/*
+ * If we are running as a service, we need to be careful about what
+ * error codes we return to the calling process.
+ */
+bool is_service;
+
static void __attribute__((noreturn))
usage(void)
{
@@ -611,6 +617,9 @@ _("Only one of the options -n or -y may be specified.\n"));
if (stdout_isatty && !progress_fp)
progress_fp = fdopen(1, "w+");
+ if (getenv("SERVICE_MODE"))
+ is_service = true;
+
/* Find the mount record for the passed-in argument. */
if (stat(argv[optind], &ctx.mnt_sb) < 0) {
fprintf(stderr,
@@ -713,5 +722,21 @@ _("%s: %llu warnings found.\n"),
free(ctx.blkdev);
free(ctx.mntpoint);
+ /*
+ * If we're running as a service, bump return code up by 150 to
+ * avoid conflicting with (sysvinit) service return codes.
+ */
+ if (is_service) {
+ /*
+ * journald queries /proc as part of taking in log
+ * messages; it uses this information to associate the
+ * message with systemd units, etc. This races with
+ * process exit, so delay that a couple of seconds so
+ * that we capture the summary outputs in the job log.
+ */
+ sleep(2);
+ if (ret)
+ ret += 150;
+ }
return ret;
}
diff --git a/scrub/xfs_scrub@.service.in b/scrub/xfs_scrub@.service.in
new file mode 100644
index 0000000..6b6992d
--- /dev/null
+++ b/scrub/xfs_scrub@.service.in
@@ -0,0 +1,18 @@
+[Unit]
+Description=Online XFS Metadata Check for %I
+OnFailure=xfs_scrub_fail@%i.service
+
+[Service]
+Type=oneshot
+WorkingDirectory=%I
+PrivateNetwork=true
+ProtectSystem=full
+ProtectHome=read-only
+PrivateTmp=yes
+AmbientCapabilities=CAP_SYS_ADMIN CAP_FOWNER CAP_DAC_OVERRIDE CAP_DAC_READ_SEARCH CAP_SYS_RAWIO
+NoNewPrivileges=yes
+User=nobody
+IOSchedulingClass=idle
+CPUSchedulingPolicy=idle
+Environment=SERVICE_MODE=1
+ExecStart=@sbindir@/xfs_scrub @scrub_args@ %I
diff --git a/scrub/xfs_scrub_all.cron.in b/scrub/xfs_scrub_all.cron.in
new file mode 100644
index 0000000..ec82236
--- /dev/null
+++ b/scrub/xfs_scrub_all.cron.in
@@ -0,0 +1,2 @@
+SERVICE_MODE=1
+10 3 * * 0 root test -e /run/systemd/system || @sbindir@/xfs_scrub_all
diff --git a/scrub/xfs_scrub_all.in b/scrub/xfs_scrub_all.in
index 7738644..27cdc32 100644
--- a/scrub/xfs_scrub_all.in
+++ b/scrub/xfs_scrub_all.in
@@ -25,10 +25,19 @@ import json
import threading
import time
import sys
+import os
retcode = 0
terminate = False
+def DEVNULL():
+ '''Return /dev/null in subprocess writable format.'''
+ try:
+ from subprocess import DEVNULL
+ return DEVNULL
+ except ImportError:
+ return open(os.devnull, 'wb')
+
def find_mounts():
'''Map mountpoints to physical disks.'''
@@ -55,6 +64,13 @@ def find_mounts():
fs[mnt] = set([lastdisk])
return fs
+def kill_systemd(unit, proc):
+ '''Kill systemd unit.'''
+ proc.terminate()
+ cmd=['systemctl', 'stop', unit]
+ x = subprocess.Popen(cmd)
+ x.wait()
+
def run_killable(cmd, stdout, killfuncs, kill_fn):
'''Run a killable program. Returns program retcode or -1 if we can't start it.'''
try:
@@ -81,6 +97,19 @@ def run_scrub(mnt, cond, running_devs, mntdevs, killfuncs):
if terminate:
return
+ # Try it the systemd way
+ cmd=['systemctl', 'start', 'xfs_scrub@%s' % mnt]
+ ret = run_killable(cmd, DEVNULL(), killfuncs, \
+ lambda proc: kill_systemd('xfs_scrub@%s' % mnt, proc))
+ if ret == 0 or ret == 1:
+ print("Scrubbing %s done, (err=%d)" % (mnt, ret))
+ sys.stdout.flush()
+ retcode |= ret
+ return
+
+ if terminate:
+ return
+
# Invoke xfs_scrub manually
cmd=['@sbindir@/xfs_scrub', '@scrub_args@', mnt]
ret = run_killable(cmd, None, killfuncs, \
@@ -112,6 +141,17 @@ def main():
fs = find_mounts()
+ # Tail the journal if we ourselves aren't a service...
+ journalthread = None
+ if 'SERVICE_MODE' not in os.environ:
+ try:
+ cmd=['journalctl', '--no-pager', '-q', '-S', 'now', \
+ '-f', '-u', 'xfs_scrub@*', '-o', \
+ 'cat']
+ journalthread = subprocess.Popen(cmd)
+ except:
+ pass
+
# Schedule scrub jobs...
running_devs = set()
killfuncs = set()
@@ -148,6 +188,19 @@ def main():
fs = []
cond.release()
+ if journalthread is not None:
+ journalthread.terminate()
+
+ # journald queries /proc as part of taking in log
+ # messages; it uses this information to associate the
+ # message with systemd units, etc. This races with
+ # process exit, so delay that a couple of seconds so
+ # that we capture the summary outputs in the job log.
+ if 'SERVICE_MODE' in os.environ:
+ time.sleep(2)
+ if retcode:
+ retcode += 150
+
sys.exit(retcode)
if __name__ == '__main__':
diff --git a/scrub/xfs_scrub_all.service.in b/scrub/xfs_scrub_all.service.in
new file mode 100644
index 0000000..683804e
--- /dev/null
+++ b/scrub/xfs_scrub_all.service.in
@@ -0,0 +1,8 @@
+[Unit]
+Description=Online XFS Metadata Check for All Filesystems
+ConditionACPower=true
+
+[Service]
+Type=oneshot
+Environment=SERVICE_MODE=1
+ExecStart=@sbindir@/xfs_scrub_all
diff --git a/scrub/xfs_scrub_all.timer b/scrub/xfs_scrub_all.timer
new file mode 100644
index 0000000..2e4a33b
--- /dev/null
+++ b/scrub/xfs_scrub_all.timer
@@ -0,0 +1,11 @@
+[Unit]
+Description=Periodic XFS Online Metadata Check for All Filesystems
+
+[Timer]
+# Run on Sunday at 3:10am, to avoid running afoul of DST changes
+OnCalendar=Sun *-*-* 03:10:00
+RandomizedDelaySec=60
+Persistent=true
+
+[Install]
+WantedBy=timers.target
diff --git a/scrub/xfs_scrub_fail b/scrub/xfs_scrub_fail
new file mode 100755
index 0000000..36dd50e
--- /dev/null
+++ b/scrub/xfs_scrub_fail
@@ -0,0 +1,26 @@
+#!/bin/bash
+
+# Email logs of failed xfs_scrub unit runs
+
+mailer=/usr/sbin/sendmail
+recipient="$1"
+test -z "${recipient}" && exit 0
+mntpoint="$2"
+test -z "${mntpoint}" && exit 0
+hostname="$(hostname -f 2>/dev/null)"
+test -z "${hostname}" && hostname="${HOSTNAME}"
+if [ ! -x "${mailer}" ]; then
+ echo "${mailer}: Mailer program not found."
+ exit 1
+fi
+
+(cat << ENDL
+To: $1
+From: <xfs_scrub@${hostname}>
+Subject: xfs_scrub failure on ${mntpoint}
+
+So sorry, the automatic xfs_scrub of ${mntpoint} on ${hostname} failed.
+
+A log of what happened follows:
+ENDL
+systemctl status --full --lines 4294967295 "xfs_scrub@${mntpoint}") | "${mailer}" -t -i
diff --git a/scrub/xfs_scrub_fail@.service.in b/scrub/xfs_scrub_fail@.service.in
new file mode 100644
index 0000000..785f881
--- /dev/null
+++ b/scrub/xfs_scrub_fail@.service.in
@@ -0,0 +1,10 @@
+[Unit]
+Description=Online XFS Metadata Check Failure Reporting for %I
+
+[Service]
+Type=oneshot
+Environment=EMAIL_ADDR=root
+ExecStart=@pkg_lib_dir@/@pkg_name@/xfs_scrub_fail "${EMAIL_ADDR}" %I
+User=mail
+Group=mail
+SupplementaryGroups=systemd-journal
* [PATCH 07/27] xfs_scrub: find XFS filesystem geometry
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (26 preceding siblings ...)
2018-01-06 1:54 ` [PATCH 27/27] xfs_scrub: integrate services with systemd Darrick J. Wong
@ 2018-01-06 3:50 ` Darrick J. Wong
2018-01-12 4:17 ` [PATCH v11 00/27] xfsprogs: online scrub/repair support Eric Sandeen
` (2 subsequent siblings)
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-06 3:50 UTC (permalink / raw)
To: sandeen; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Discover the geometry of the XFS filesystem that we've been told to
scan, and set up some common functions that will be used by the
scrub phases.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
scrub/Makefile | 5 +
scrub/common.c | 72 +++++++++++++++++
scrub/common.h | 10 ++
scrub/disk.c | 3 +
scrub/phase1.c | 223 +++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/xfs_scrub.c | 35 ++++++++
scrub/xfs_scrub.h | 29 +++++++
7 files changed, 376 insertions(+), 1 deletion(-)
create mode 100644 scrub/phase1.c
diff --git a/scrub/Makefile b/scrub/Makefile
index c3a9986..5239dae 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -23,6 +23,7 @@ xfs_scrub.h
CFILES = \
common.c \
disk.c \
+phase1.c \
xfs_scrub.c
LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD)
@@ -33,6 +34,10 @@ ifeq ($(HAVE_MALLINFO),yes)
LCFLAGS += -DHAVE_MALLINFO
endif
+ifeq ($(HAVE_SYNCFS),yes)
+LCFLAGS += -DHAVE_SYNCFS
+endif
+
default: depend $(LTCOMMAND)
include $(BUILDRULES)
diff --git a/scrub/common.c b/scrub/common.c
index 75c6df5..252809d 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -20,8 +20,11 @@
#include <stdio.h>
#include <pthread.h>
#include <stdbool.h>
+#include <sys/statvfs.h>
#include "platform_defs.h"
#include "xfs.h"
+#include "xfs_fs.h"
+#include "path.h"
#include "xfs_scrub.h"
#include "common.h"
@@ -248,3 +251,72 @@ scrub_nproc_workqueue(
x = 0;
return x;
}
+
+/*
+ * Check if the argument is either the device name or mountpoint of a mounted
+ * filesystem.
+ */
+#define MNTTYPE_XFS "xfs"
+static bool
+find_mountpoint_check(
+ struct stat *sb,
+ struct mntent *t)
+{
+ struct stat ms;
+
+ if (S_ISDIR(sb->st_mode)) { /* mount point */
+ if (stat(t->mnt_dir, &ms) < 0)
+ return false;
+ if (sb->st_ino != ms.st_ino)
+ return false;
+ if (sb->st_dev != ms.st_dev)
+ return false;
+ if (strcmp(t->mnt_type, MNTTYPE_XFS) != 0)
+ return false;
+ } else { /* device */
+ if (stat(t->mnt_fsname, &ms) < 0)
+ return false;
+ if (sb->st_rdev != ms.st_rdev)
+ return false;
+ if (strcmp(t->mnt_type, MNTTYPE_XFS) != 0)
+ return false;
+ /*
+ * Make sure the mountpoint given by mtab is accessible
+ * before using it.
+ */
+ if (stat(t->mnt_dir, &ms) < 0)
+ return false;
+ }
+
+ return true;
+}
+
+/* Check that our alleged mountpoint is in mtab */
+bool
+find_mountpoint(
+ char *mtab,
+ struct scrub_ctx *ctx)
+{
+ struct mntent_cursor cursor;
+ struct mntent *t = NULL;
+ bool found = false;
+
+ if (platform_mntent_open(&cursor, mtab) != 0) {
+ fprintf(stderr, "Error: can't get mntent entries.\n");
+ exit(1);
+ }
+
+ while ((t = platform_mntent_next(&cursor)) != NULL) {
+ /*
+ * Keep jotting down matching mount details; newer mounts are
+ * towards the end of the file (hopefully).
+ */
+ if (find_mountpoint_check(&ctx->mnt_sb, t)) {
+ ctx->mntpoint = strdup(t->mnt_dir);
+ ctx->blkdev = strdup(t->mnt_fsname);
+ found = true;
+ }
+ }
+ platform_mntent_close(&cursor);
+ return found;
+}
diff --git a/scrub/common.h b/scrub/common.h
index 41b3ea7..fed95df 100644
--- a/scrub/common.h
+++ b/scrub/common.h
@@ -62,4 +62,14 @@ double auto_units(unsigned long long number, char **units);
unsigned int scrub_nproc(struct scrub_ctx *ctx);
unsigned int scrub_nproc_workqueue(struct scrub_ctx *ctx);
+#ifndef HAVE_SYNCFS
+static inline int syncfs(int fd)
+{
+ sync();
+ return 0;
+}
+#endif
+
+bool find_mountpoint(char *mtab, struct scrub_ctx *ctx);
+
#endif /* XFS_SCRUB_COMMON_H_ */
diff --git a/scrub/disk.c b/scrub/disk.c
index d4bf81f..546a06c 100644
--- a/scrub/disk.c
+++ b/scrub/disk.c
@@ -31,6 +31,9 @@
#include <linux/fs.h>
#include "platform_defs.h"
#include "libfrog.h"
+#include "xfs.h"
+#include "path.h"
+#include "xfs_fs.h"
#include "xfs_scrub.h"
#include "disk.h"
diff --git a/scrub/phase1.c b/scrub/phase1.c
new file mode 100644
index 0000000..65409d3
--- /dev/null
+++ b/scrub/phase1.c
@@ -0,0 +1,223 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <mntent.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <sys/statvfs.h>
+#include <sys/vfs.h>
+#include <fcntl.h>
+#include <dirent.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <pthread.h>
+#include <errno.h>
+#include <linux/fs.h>
+#include "libfrog.h"
+#include "workqueue.h"
+#include "input.h"
+#include "path.h"
+#include "handle.h"
+#include "bitops.h"
+#include "xfs_arch.h"
+#include "xfs_format.h"
+#include "avl64.h"
+#include "list.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "disk.h"
+
+/* Phase 1: Find filesystem geometry (and clean up after) */
+
+/* Shut down the filesystem. */
+void
+xfs_shutdown_fs(
+ struct scrub_ctx *ctx)
+{
+ int flag;
+
+ flag = XFS_FSOP_GOING_FLAGS_LOGFLUSH;
+ str_info(ctx, ctx->mntpoint, _("Shutting down filesystem!"));
+ if (ioctl(ctx->mnt_fd, XFS_IOC_GOINGDOWN, &flag))
+ str_errno(ctx, ctx->mntpoint);
+}
+
+/* Clean up the XFS-specific state data. */
+bool
+xfs_cleanup_fs(
+ struct scrub_ctx *ctx)
+{
+ if (ctx->fshandle)
+ free_handle(ctx->fshandle, ctx->fshandle_len);
+ if (ctx->rtdev)
+ disk_close(ctx->rtdev);
+ if (ctx->logdev)
+ disk_close(ctx->logdev);
+ if (ctx->datadev)
+ disk_close(ctx->datadev);
+ fshandle_destroy();
+ close(ctx->mnt_fd);
+ fs_table_destroy();
+
+ return true;
+}
+
+/*
+ * Bind to the mountpoint, read the XFS geometry, bind to the block devices.
+ * Anything we've already built will be cleaned up by xfs_cleanup_fs.
+ */
+bool
+xfs_setup_fs(
+ struct scrub_ctx *ctx)
+{
+ struct fs_path *fsp;
+ int error;
+
+ /*
+ * Open the directory with O_NOATIME. For mountpoints owned
+ * by root, this should be sufficient to ensure that we have
+ * CAP_SYS_ADMIN, which we probably need to do anything fancy
+ * with the (XFS driver) kernel.
+ */
+ ctx->mnt_fd = open(ctx->mntpoint, O_RDONLY | O_NOATIME | O_DIRECTORY);
+ if (ctx->mnt_fd < 0) {
+ if (errno == EPERM)
+ str_info(ctx, ctx->mntpoint,
+_("Must be root to run scrub."));
+ else
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+
+ error = fstat(ctx->mnt_fd, &ctx->mnt_sb);
+ if (error) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+ error = fstatvfs(ctx->mnt_fd, &ctx->mnt_sv);
+ if (error) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+ error = fstatfs(ctx->mnt_fd, &ctx->mnt_sf);
+ if (error) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+
+ ctx->nr_io_threads = nproc;
+ if (verbose) {
+ fprintf(stdout, _("%s: using %d threads to scrub.\n"),
+ ctx->mntpoint, scrub_nproc(ctx));
+ fflush(stdout);
+ }
+
+ if (!platform_test_xfs_fd(ctx->mnt_fd)) {
+ str_error(ctx, ctx->mntpoint,
+_("Does not appear to be an XFS filesystem!"));
+ return false;
+ }
+
+ /*
+ * Flush everything out to disk before we start checking.
+ * This seems to reduce the incidence of stale file handle
+ * errors when we open things by handle.
+ */
+ error = syncfs(ctx->mnt_fd);
+ if (error) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+
+ /* Retrieve XFS geometry. */
+ error = ioctl(ctx->mnt_fd, XFS_IOC_FSGEOMETRY, &ctx->geo);
+ if (error) {
+ str_errno(ctx, ctx->mntpoint);
+ return false;
+ }
+
+ ctx->agblklog = log2_roundup(ctx->geo.agblocks);
+ ctx->blocklog = highbit32(ctx->geo.blocksize);
+ ctx->inodelog = highbit32(ctx->geo.inodesize);
+ ctx->inopblog = ctx->blocklog - ctx->inodelog;
+
+ error = path_to_fshandle(ctx->mntpoint, &ctx->fshandle,
+ &ctx->fshandle_len);
+ if (error) {
+ perror(_("getting fshandle"));
+ return false;
+ }
+
+ /* Go find the XFS devices if we have a usable fsmap. */
+ fs_table_initialise(0, NULL, 0, NULL);
+ errno = 0;
+ fsp = fs_table_lookup(ctx->mntpoint, FS_MOUNT_POINT);
+ if (!fsp) {
+ str_error(ctx, ctx->mntpoint,
+_("Unable to find XFS information."));
+ return false;
+ }
+ memcpy(&ctx->fsinfo, fsp, sizeof(struct fs_path));
+
+ /* Did we find the log and rt devices, if they're present? */
+ if (ctx->geo.logstart == 0 && ctx->fsinfo.fs_log == NULL) {
+ str_error(ctx, ctx->mntpoint,
+_("Unable to find log device path."));
+ return false;
+ }
+ if (ctx->geo.rtblocks && ctx->fsinfo.fs_rt == NULL) {
+ str_error(ctx, ctx->mntpoint,
+_("Unable to find realtime device path."));
+ return false;
+ }
+
+ /* Open the raw devices. */
+ ctx->datadev = disk_open(ctx->fsinfo.fs_name);
+ if (!ctx->datadev) {
+ str_errno(ctx, ctx->fsinfo.fs_name);
+ return false;
+ }
+
+ if (ctx->fsinfo.fs_log) {
+ ctx->logdev = disk_open(ctx->fsinfo.fs_log);
+ if (!ctx->logdev) {
+ str_errno(ctx, ctx->fsinfo.fs_log);
+ return false;
+ }
+ }
+ if (ctx->fsinfo.fs_rt) {
+ ctx->rtdev = disk_open(ctx->fsinfo.fs_rt);
+ if (!ctx->rtdev) {
+ str_errno(ctx, ctx->fsinfo.fs_rt);
+ return false;
+ }
+ }
+
+ /*
+ * Everything's set up, which means any failures recorded after
+ * this point are most probably corruption errors (as opposed to
+ * purely setup errors).
+ */
+ ctx->need_repair = true;
+ return true;
+}
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index a9c185b..a733b8f 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -23,9 +23,12 @@
#include <stdlib.h>
#include <sys/time.h>
#include <sys/resource.h>
+#include <sys/statvfs.h>
#include "platform_defs.h"
#include "xfs.h"
+#include "xfs_fs.h"
#include "input.h"
+#include "path.h"
#include "xfs_scrub.h"
#include "common.h"
@@ -345,6 +348,8 @@ run_scrub_phases(
{
{
.descr = _("Find filesystem geometry."),
+ .fn = xfs_setup_fs,
+ .must_run = true,
},
{
.descr = _("Check internal metadata."),
@@ -426,6 +431,7 @@ main(
struct phase_rusage all_pi;
unsigned long long total_errors;
bool moveon = true;
+ bool ismnt;
static bool injected;
int ret = 0;
@@ -522,6 +528,15 @@ _("Only one of the options -n or -y may be specified.\n"));
ctx.mntpoint = strdup(argv[optind]);
+ /* Find the mount record for the passed-in argument. */
+ if (stat(argv[optind], &ctx.mnt_sb) < 0) {
+ fprintf(stderr,
+ _("%s: could not stat: %s: %s\n"),
+ progname, argv[optind], strerror(errno));
+ ret |= 8;
+ goto out;
+ }
+
/*
* If the user did not specify an explicit mount table, try to use
* /proc/mounts if it is available, else /etc/mtab. We prefer
@@ -541,6 +556,15 @@ _("Only one of the options -n or -y may be specified.\n"));
if (!moveon)
goto out;
+ ismnt = find_mountpoint(mtab, &ctx);
+ if (!ismnt) {
+ fprintf(stderr,
+_("%s: Not an XFS mount point or block device.\n"),
+ ctx.mntpoint);
+ ret |= 8;
+ goto out;
+ }
+
/* How many CPUs? */
nproc = sysconf(_SC_NPROCESSORS_ONLN);
if (nproc < 1)
@@ -569,6 +593,11 @@ _("Only one of the options -n or -y may be specified.\n"));
if (debug_tweak_on("XFS_SCRUB_FORCE_ERROR"))
str_error(&ctx, ctx.mntpoint, _("Injecting error."));
+ /* Clean up scan data. */
+ moveon = xfs_cleanup_fs(&ctx);
+ if (!moveon)
+ ret |= 8;
+
out:
total_errors = ctx.errors_found + ctx.runtime_errors;
if (ctx.need_repair)
@@ -586,13 +615,17 @@ _("%s: %llu errors found.%s\n"),
fprintf(stderr,
_("%s: %llu warnings found.\n"),
ctx.mntpoint, ctx.warnings_found);
- if (ctx.errors_found)
+ if (ctx.errors_found) {
+ if (ctx.error_action == ERRORS_SHUTDOWN)
+ xfs_shutdown_fs(&ctx);
ret |= 1;
+ }
if (ctx.warnings_found)
ret |= 2;
if (ctx.runtime_errors)
ret |= 4;
phase_end(&all_pi, 0);
+ free(ctx.blkdev);
free(ctx.mntpoint);
return ret;
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index 7f1dcb1..2be7c65 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -51,15 +51,38 @@ struct scrub_ctx {
char *mntpoint;
char *blkdev;
+ /* Mountpoint info */
+ struct stat mnt_sb;
+ struct statvfs mnt_sv;
+ struct statfs mnt_sf;
+
+ /* Open block devices */
+ struct disk *datadev;
+ struct disk *logdev;
+ struct disk *rtdev;
+
/* What does the user want us to do? */
enum scrub_mode mode;
/* How does the user want us to react to errors? */
enum error_action error_action;
+ /* fd to filesystem mount point */
+ int mnt_fd;
+
/* Number of threads for metadata scrubbing */
unsigned int nr_io_threads;
+ /* XFS specific geometry */
+ struct xfs_fsop_geom geo;
+ struct fs_path fsinfo;
+ unsigned int agblklog;
+ unsigned int blocklog;
+ unsigned int inodelog;
+ unsigned int inopblog;
+ void *fshandle;
+ size_t fshandle_len;
+
/* Mutable scrub state; use lock. */
pthread_mutex_t lock;
unsigned long long max_errors;
@@ -67,6 +90,12 @@ struct scrub_ctx {
unsigned long long errors_found;
unsigned long long warnings_found;
bool need_repair;
+ bool preen_triggers[XFS_SCRUB_TYPE_NR];
};
+/* Phase helper functions */
+void xfs_shutdown_fs(struct scrub_ctx *ctx);
+bool xfs_cleanup_fs(struct scrub_ctx *ctx);
+bool xfs_setup_fs(struct scrub_ctx *ctx);
+
#endif /* XFS_SCRUB_XFS_SCRUB_H_ */
* Re: [PATCH 12/27] xfs_scrub: wrap the scrub ioctl
2018-01-06 1:52 ` [PATCH 12/27] xfs_scrub: wrap the scrub ioctl Darrick J. Wong
@ 2018-01-11 23:12 ` Eric Sandeen
2018-01-12 0:28 ` Darrick J. Wong
0 siblings, 1 reply; 61+ messages in thread
From: Eric Sandeen @ 2018-01-11 23:12 UTC (permalink / raw)
To: Darrick J. Wong, sandeen; +Cc: linux-xfs
On 1/5/18 7:52 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Create some wrappers to call the scrub ioctls.
> +/*
> + * Sleep for 100ms * however many -b we got past the initial one.
> + * This is an (albeit clumsy) way to throttle scrub activity.
> + */
> +void
> +background_sleep(void)
> +{
> + unsigned long long time;
> + struct timespec tv;
> +
> + if (bg_mode < 2)
> + return;
> +
> + time = 100000 * (bg_mode - 1);
<coverity pass>
Probably want to widen the constant in case someone issues -b $HUGE, so
the multiplication can't wrap before it is assigned to time ... 100000ULL?
-Eric
* Re: [PATCH 10/27] xfs_scrub: add file space map iteration functions
2018-01-06 1:52 ` [PATCH 10/27] xfs_scrub: add file " Darrick J. Wong
@ 2018-01-11 23:19 ` Eric Sandeen
2018-01-12 0:24 ` Darrick J. Wong
0 siblings, 1 reply; 61+ messages in thread
From: Eric Sandeen @ 2018-01-11 23:19 UTC (permalink / raw)
To: Darrick J. Wong, sandeen; +Cc: linux-xfs
On 1/5/18 7:52 PM, Darrick J. Wong wrote:
> + * These routines provide a simple interface to query the block
> + * mappings of the fork of a given inode via GETBMAPX and call a
> + * function to iterate each mapping result.
> + */
> +
> +#define BMAP_NR 2048
> +
> +/* Iterate all the extent block mappings between the key and fork end. */
> +bool
> +xfs_iterate_filemaps(
> + struct scrub_ctx *ctx,
> + const char *descr,
> + int fd,
> + int whichfork,
> + struct xfs_bmap *key,
<coverity pass>
Ok key is an xfs_bmap:
/* inode fork block mapping */
struct xfs_bmap {
uint64_t bm_offset; /* file offset of segment in bytes */
uint64_t bm_physical; /* physical starting byte */
uint64_t bm_length; /* length of segment, bytes */
uint32_t bm_flags; /* output flags */
};
> + xfs_bmap_iter_fn fn,
> + void *arg)
> +{
> + struct fsxattr fsx;
> + struct getbmapx *map
map is a getbmapx ...
struct getbmapx {
__s64 bmv_offset; /* file offset of segment in blocks */
__s64 bmv_block; /* starting block (64-bit daddr_t) */
__s64 bmv_length; /* length of segment, blocks */
__s32 bmv_count; /* # of entries in array incl. 1st */
__s32 bmv_entries; /* # of entries filled in (output). */
__s32 bmv_iflags; /* input flags (1st structure) */
__s32 bmv_oflags; /* output flags (after 1st structure)*/
__s32 bmv_unused1; /* future use */
__s32 bmv_unused2; /* future use */
};
...
> +out:
> + memcpy(key, map, sizeof(struct getbmapx));
so I don't think that fits, right?
-Eric
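To make the mismatch concrete, here is a rough sketch with local mirrors of the two structures quoted above. If a copy from map to key is wanted at all, a field-by-field conversion avoids writing past the end of the smaller structure. The demo_ names are made up for this example, and the shift by 9 assumes GETBMAPX reports in 512-byte basic blocks. Holes (bmv_block == -1) are not handled here.

```c
#include <stdint.h>

/* Local mirrors of the two layouts quoted above, for illustration only. */
struct demo_getbmapx {
	int64_t		bmv_offset;	/* file offset of segment in blocks */
	int64_t		bmv_block;	/* starting block */
	int64_t		bmv_length;	/* length of segment, blocks */
	int32_t		bmv_count;
	int32_t		bmv_entries;
	int32_t		bmv_iflags;
	int32_t		bmv_oflags;
	int32_t		bmv_unused1;
	int32_t		bmv_unused2;
};

struct demo_xfs_bmap {
	uint64_t	bm_offset;	/* file offset of segment in bytes */
	uint64_t	bm_physical;	/* physical starting byte */
	uint64_t	bm_length;	/* length of segment, bytes */
	uint32_t	bm_flags;	/* output flags */
};

/* Assumption: block counts are in 512-byte units, so bytes = blocks << 9. */
#define DEMO_BBSHIFT	9

/* The destination is smaller than the source, so a raw memcpy overruns it. */
_Static_assert(sizeof(struct demo_xfs_bmap) < sizeof(struct demo_getbmapx),
	       "xfs_bmap cannot hold a whole getbmapx");

/* Convert one mapping record field by field instead of copying raw bytes. */
static void
demo_getbmapx_to_bmap(
	const struct demo_getbmapx	*map,
	struct demo_xfs_bmap		*bmap)
{
	bmap->bm_offset = (uint64_t)map->bmv_offset << DEMO_BBSHIFT;
	bmap->bm_physical = (uint64_t)map->bmv_block << DEMO_BBSHIFT;
	bmap->bm_length = (uint64_t)map->bmv_length << DEMO_BBSHIFT;
	bmap->bm_flags = (uint32_t)map->bmv_oflags;
}
```

The getbmapx mirror is 48 bytes, while the xfs_bmap mirror is at most 32 with padding, so the memcpy in the patch would write at least 16 bytes beyond key.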
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 06/27] xfs_scrub: create an abstraction for a block device
2018-01-06 1:52 ` [PATCH 06/27] xfs_scrub: create an abstraction for a block device Darrick J. Wong
@ 2018-01-11 23:24 ` Eric Sandeen
2018-01-11 23:59 ` Darrick J. Wong
0 siblings, 1 reply; 61+ messages in thread
From: Eric Sandeen @ 2018-01-11 23:24 UTC (permalink / raw)
To: Darrick J. Wong, sandeen; +Cc: linux-xfs
On 1/5/18 7:52 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
...
> +/*
> + * Disk Abstraction
> + *
> + * These routines help us to discover the geometry of a block device,
> + * estimate the amount of concurrent IOs that we can send to it, and
> + * abstract the process of performing read verification of disk blocks.
> + */
> +
> +/* Figure out how many disk heads are available. */
> +static unsigned int
> +__disk_heads(
> + struct disk *disk)
> +{
> + int iomin;
> + int ioopt;
> + unsigned short rot;
> + int error;
> +
> + /* If it's not a block device, throw all the CPUs at it. */
> + if (!S_ISBLK(disk->d_sb.st_mode))
> + return nproc;
> +
> + /* Non-rotational device? Throw all the CPUs. */
> + rot = 1;
> + error = ioctl(disk->d_fd, BLKROTATIONAL, &rot);
> + if (error == 0 && rot == 0)
> + return nproc;
I needed
+#ifndef BLKROTATIONAL
+#define BLKROTATIONAL _IO(0x12,126)
+#endif
to make this compile on my not /that/ ancient (?) rhel6 box ;)
-Eric
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 21/27] xfs_scrub: scrub file data blocks
2018-01-06 1:53 ` [PATCH 21/27] xfs_scrub: scrub file " Darrick J. Wong
@ 2018-01-11 23:25 ` Eric Sandeen
2018-01-12 0:29 ` Darrick J. Wong
0 siblings, 1 reply; 61+ messages in thread
From: Eric Sandeen @ 2018-01-11 23:25 UTC (permalink / raw)
To: Darrick J. Wong, sandeen; +Cc: linux-xfs
On 1/5/18 7:53 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
...
> + /* Get the stat info for this directory entry. */
> + error = fstatat(dir_fd, dirent->d_name, &sb,
> + AT_NO_AUTOMOUNT | AT_SYMLINK_NOFOLLOW);
> + if (error) {
> + str_errno(ctx, newpath);
> + continue;
I needed:
+#ifndef AT_NO_AUTOMOUNT
+#define AT_NO_AUTOMOUNT 0x800
+#endif
here
-Eric
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 25/27] xfs_scrub: progress indicator
2018-01-06 1:54 ` [PATCH 25/27] xfs_scrub: progress indicator Darrick J. Wong
@ 2018-01-11 23:27 ` Eric Sandeen
2018-01-12 0:32 ` Darrick J. Wong
0 siblings, 1 reply; 61+ messages in thread
From: Eric Sandeen @ 2018-01-11 23:27 UTC (permalink / raw)
To: Darrick J. Wong, sandeen; +Cc: linux-xfs
On 1/5/18 7:54 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> +#define NSEC_PER_SEC (1000000000)
> +static void *
> +progress_report_thread(void *arg)
> +{
> + struct timespec abstime;
> + int ret;
> +
> + pthread_mutex_lock(&pt.lock);
> + while (1) {
> + /* Every half second. */
> + ret = clock_gettime(CLOCK_REALTIME, &abstime);
My manpage says "link with -lrt" and to include <time.h>, this got me
going:
diff --git a/scrub/Makefile b/scrub/Makefile
index 3e6f690..0094d9d 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -67,7 +67,7 @@ xfs_scrub.c
LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD) $(LIBUNISTRING)
LTDEPENDENCIES += $(LIBHANDLE) $(LIBFROG) $(LIBUNISTRING)
-LLDFLAGS = -static
+LLDFLAGS = -static -lrt
ifeq ($(HAVE_MALLINFO),yes)
LCFLAGS += -DHAVE_MALLINFO
diff --git a/scrub/progress.c b/scrub/progress.c
index 30b2152..61b9c60 100644
--- a/scrub/progress.c
+++ b/scrub/progress.c
@@ -22,6 +22,7 @@
#include <dirent.h>
#include <pthread.h>
#include <sys/statvfs.h>
+#include <time.h>
#include "../repair/threads.h"
#include "path.h"
#include "disk.h"
^ permalink raw reply related [flat|nested] 61+ messages in thread
* Re: [PATCH 03/27] xfs_scrub: set up command line argument parsing
2018-01-06 1:51 ` [PATCH 03/27] xfs_scrub: set up command line argument parsing Darrick J. Wong
@ 2018-01-11 23:39 ` Eric Sandeen
2018-01-12 1:53 ` Darrick J. Wong
2018-01-12 1:30 ` Eric Sandeen
1 sibling, 1 reply; 61+ messages in thread
From: Eric Sandeen @ 2018-01-11 23:39 UTC (permalink / raw)
To: Darrick J. Wong, sandeen; +Cc: linux-xfs
On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Parse command line options in order to set up the context in which we
> will scrub the filesystem.
> +static void __attribute__((noreturn))
> +usage(void)
> +{
> + fprintf(stderr, _("Usage: %s [OPTIONS] mountpoint\n"), progname);
> + fprintf(stderr, _("-a:\tStop after this many errors are found.\n"));
> + fprintf(stderr, _("-b:\tBackground mode.\n"));
do you intentionally not document -d?
<same question for manpage>
> + fprintf(stderr, _("-e:\tWhat to do if errors are found.\n"));
> + fprintf(stderr, _("-m:\tPath to /etc/mtab.\n"));
> + fprintf(stderr, _("-n:\tDry run. Do not modify anything.\n"));
> + fprintf(stderr, _("-T:\tDisplay timing/usage information.\n"));
> + fprintf(stderr, _("-v:\tVerbose output.\n"));
> + fprintf(stderr, _("-V:\tPrint version.\n"));
> + fprintf(stderr, _("-x:\tScrub file data too.\n"));
> + fprintf(stderr, _("-y:\tRepair all errors.\n"));
> +
> + exit(16);
> +}
Could we make this more like xfs_repair usage() for consistency?
Usage: xfs_repair [options] device
Options:
-f The device is a file
-L Force log zeroing. Do this as a last resort.
-l logdev Specifies the device where the external log resides.
-m maxmem Maximum amount of memory to be used in megabytes.
-n No modify mode, just checks the filesystem for damage.
-P Disables prefetching.
-r rtdev Specifies the device where the realtime section resides.
-v Verbose output.
-c subopts Change filesystem parameters - use xfs_admin.
-o subopts Override default behaviour, refer to man page.
-t interval Reporting interval in seconds.
-d Repair dangerously.
-V Reports version and exits.
so maybe:
Usage: xfs_scrub [options] mountpoint
-a count Stop after this many errors are found.
-b Background mode.
-C fd Print progress information to this fd.
-e behavior What to do if errors are found. (shutdown|continue)
-m path Path to /etc/mtab.
-n Dry run. Do not modify anything.
-T Display timing/usage information.
-v Verbose output.
-V Reports version and exits.
-x Scrub file data too.
-y Repair all errors.
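A table-driven usage() along those lines could look roughly like this. It is only a sketch: demo_usage() and demo_opts[] are made-up names, the option list is the one suggested above, and the gettext _() wrapping and the exit() call from the patch are left out to keep it short.

```c
#include <stdio.h>

/* Option synopsis and help text, one row per line of output. */
static const struct {
	const char	*opt;
	const char	*help;
} demo_opts[] = {
	{ "-a count",	 "Stop after this many errors are found." },
	{ "-b",		 "Background mode." },
	{ "-C fd",	 "Print progress information to this fd." },
	{ "-e behavior", "What to do if errors are found. (shutdown|continue)" },
	{ "-m path",	 "Path to /etc/mtab." },
	{ "-n",		 "Dry run.  Do not modify anything." },
	{ "-T",		 "Display timing/usage information." },
	{ "-v",		 "Verbose output." },
	{ "-V",		 "Reports version and exits." },
	{ "-x",		 "Scrub file data too." },
	{ "-y",		 "Repair all errors." },
};

/* Print the usage text with the help strings aligned in one column. */
static void
demo_usage(
	FILE		*fp,
	const char	*progname)
{
	size_t		i;

	fprintf(fp, "Usage: %s [options] mountpoint\n", progname);
	for (i = 0; i < sizeof(demo_opts) / sizeof(demo_opts[0]); i++)
		fprintf(fp, "  %-12s %s\n", demo_opts[i].opt,
				demo_opts[i].help);
}
```

Keeping the option names and help strings in one table makes it harder for the columns to drift apart when options are added later.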
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 06/27] xfs_scrub: create an abstraction for a block device
2018-01-11 23:24 ` Eric Sandeen
@ 2018-01-11 23:59 ` Darrick J. Wong
2018-01-12 0:04 ` Eric Sandeen
0 siblings, 1 reply; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-11 23:59 UTC (permalink / raw)
To: Eric Sandeen; +Cc: sandeen, linux-xfs
On Thu, Jan 11, 2018 at 05:24:58PM -0600, Eric Sandeen wrote:
> On 1/5/18 7:52 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
>
> ...
>
> > +/*
> > + * Disk Abstraction
> > + *
> > + * These routines help us to discover the geometry of a block device,
> > + * estimate the amount of concurrent IOs that we can send to it, and
> > + * abstract the process of performing read verification of disk blocks.
> > + */
> > +
> > +/* Figure out how many disk heads are available. */
> > +static unsigned int
> > +__disk_heads(
> > + struct disk *disk)
> > +{
> > + int iomin;
> > + int ioopt;
> > + unsigned short rot;
> > + int error;
> > +
> > + /* If it's not a block device, throw all the CPUs at it. */
> > + if (!S_ISBLK(disk->d_sb.st_mode))
> > + return nproc;
> > +
> > + /* Non-rotational device? Throw all the CPUs. */
> > + rot = 1;
> > + error = ioctl(disk->d_fd, BLKROTATIONAL, &rot);
> > + if (error == 0 && rot == 0)
> > + return nproc;
>
> I needed
>
> +#ifndef BLKROTATIONAL
> +#define BLKROTATIONAL _IO(0x12,126)
> +#endif
>
> to make this compile on my not /that/ ancient (?) rhel6 box ;)
Hmm... well, since I don't see backporting xfs kernel scrub to 2.6.32
maybe xfsprogs' build system should just turn off xfs_scrub on old
systems?
In any case, I #ifdef BLKROTATIONAL'd out the entire clause.
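Roughly like the sketch below, which is not the actual patch: demo_is_nonrotational() is a made-up name, and when the header lacks BLKROTATIONAL the check simply compiles away and the device is treated as rotational.

```c
#include <sys/ioctl.h>
#include <linux/fs.h>

/*
 * Return 1 if the kernel reports the device as non-rotational, or 0 if
 * it is rotational, the ioctl fails, or the headers lack BLKROTATIONAL.
 */
static int
demo_is_nonrotational(
	int		fd)
{
#ifdef BLKROTATIONAL
	unsigned short	rot = 1;

	if (ioctl(fd, BLKROTATIONAL, &rot) == 0 && rot == 0)
		return 1;
#else
	(void)fd;	/* no way to ask on this build */
#endif
	return 0;
}
```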
--D
> -Eric
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 06/27] xfs_scrub: create an abstraction for a block device
2018-01-11 23:59 ` Darrick J. Wong
@ 2018-01-12 0:04 ` Eric Sandeen
2018-01-12 1:27 ` Darrick J. Wong
0 siblings, 1 reply; 61+ messages in thread
From: Eric Sandeen @ 2018-01-12 0:04 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: sandeen, linux-xfs
On 1/11/18 5:59 PM, Darrick J. Wong wrote:
> On Thu, Jan 11, 2018 at 05:24:58PM -0600, Eric Sandeen wrote:
...
>>> + /* Non-rotational device? Throw all the CPUs. */
>>> + rot = 1;
>>> + error = ioctl(disk->d_fd, BLKROTATIONAL, &rot);
>>> + if (error == 0 && rot == 0)
>>> + return nproc;
>>
>> I needed
>>
>> +#ifndef BLKROTATIONAL
>> +#define BLKROTATIONAL _IO(0x12,126)
>> +#endif
>>
>> to make this compile on my not /that/ ancient (?) rhel6 box ;)
>
> Hmm... well, since I don't see backporting xfs kernel scrub to 2.6.32
> maybe xfsprogs' build system should just turn off xfs_scrub on old
> systems?
>
> In any case, I #ifdef BLKROTATIONAL'd out the entire clause.
ok. well, other distros are making noise about using bleeding edge progs
w/ older distro kernels (hence the mkfs config file wishes) so it's probably
good to consider building against older environments.
Thanks,
-Eric
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 01/27] xfs_scrub: create online filesystem scrub program
2018-01-06 1:51 ` [PATCH 01/27] xfs_scrub: create online filesystem scrub program Darrick J. Wong
@ 2018-01-12 0:16 ` Eric Sandeen
2018-01-12 1:08 ` Darrick J. Wong
2018-01-12 1:07 ` Eric Sandeen
1 sibling, 1 reply; 61+ messages in thread
From: Eric Sandeen @ 2018-01-12 0:16 UTC (permalink / raw)
To: Darrick J. Wong, sandeen; +Cc: linux-xfs
On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
<man page nitpicking>
> diff --git a/man/man8/xfs_scrub.8 b/man/man8/xfs_scrub.8
> new file mode 100644
> index 0000000..95f4fea
> --- /dev/null
> +++ b/man/man8/xfs_scrub.8
> @@ -0,0 +1,117 @@
> +.TH xfs_scrub 8
> +.SH NAME
> +xfs_scrub \- scrub the contents of an XFS filesystem
> +.SH SYNOPSIS
> +.B xfs_scrub
> +[
> +.B \-abemnTvVxy
^
> +]
> +.I mount-point
or block device?
> +.br
> +.B xfs_scrub \-V
^
If V is special it probably shouldn't be in the first arg string?
Do you mean to hide the "-d" option?
> +.SH DESCRIPTION
> +.B xfs_scrub
> +attempts to check and repair all metadata in a mounted XFS filesystem.
> +.PP
> +.B xfs_scrub
> +asks the kernel to scrub all metadata objects in the filesystem.
> +Metadata records are scanned for obviously bad values and then
> +cross-referenced against other metadata.
> +The goal is to establish a threasonable confidence about the consistency
"reasonable"
> +of the overall filesystem by examining the consistency of individual
> +metadata records against the other metadata in the filesystem across the
> +entire filesystem.
Redundant, "examining the consistency of individual metadata records against
the other metadata in the filesystem." would suffice.
> +Damaged metadata can be rebuilt from other metadata if there is
> +sufficient redundancy (and no other corruption) in the metadata.
Again redundant, maybe just "if there is sufficient redundancy within
other intact metadata?"
> +.PP
> +This utility does not know how to correct all errors.
> +If the tool cannot fix the detected errors, you must unmount the
> +filesystem and run
> +.B xfs_repair
> +to fix the problems.
> +If this tool is not run with either of the
> +.B \-n
> +or
> +.B \-y
> +options, then it will optimize the filesystem when possible,
> +but it will not try to fix errors.
I think the manpage needs to describe what this optimization might
involve, at least at a high level. Will it fsr all my files? Will
it trim my free space? Will it compact my directories? Will it ...?
What exactly am I agreeing to here? :)
> +.SH OPTIONS
> +.TP
> +.BI \-a " errors"
> +Abort if more than this many errors are found on the filesystem.
> +.TP
> +.B \-b
> +Run in background mode.
> +If the option is specified once, only run a single scrubbing thread at a
> +time.
> +If given more than once, an artificial delay of 100us is added to each
> +scrub call to reduce CPU overhead even further.
I wonder, should it take a value instead of -bbbbbbbbb?
> +.TP
> +.B \-e
> +Specifies what happens when errors are detected.
> +If
> +.IR shutdown
> +is given, the filesystem will be taken offline if errors are found.
> +Not all backends can shut down a filesystem.
<user> what's a backend? </user>
> +If
> +.IR continue
> +is given, no action taken if errors are found.
> +This is the default.
<user> so how do I know what errors were found? </user>
> +.TP
> +.BI \-m " file"
> +Search this file for mounted filesystems instead of /etc/mtab.
> +.TP
> +.B \-n
> +Dry run, do not modify anything in the filesystem.
> +This disables all preening and optimization behaviors, and disables
> +calling FITRIM on the free space after a successful run.
what if I only want to disable FITRIM? (-k?)
Oh, and it runs FITRIM? Can you mention that more prominently
in the behavior description? (and should it, given that we
have a tool for that purpose?)
> +.TP
> +.BI \-T
> +Print timing and memory usage information for each phase.
> +.TP
> +.B \-v
> +Enable verbose mode, which prints periodic status updates.
> +.TP
> +.B \-V
> +Prints the version number and exits.
> +.TP
> +.B \-x
> +Scrub all file data too.
colloquial? maybe s/too/as well/
> +The block list will be sorted in disk order for better performance.
Cool, so when I'm done, my filesystem will have better performance if I use -x?
and none of my files will be corrupted! ;)
The read order is probably an implementation detail that doesn't need to be in
the manpage. It may be worth changing the description a bit to make it
clearer that the purpose is to determine readability of every file block?
I mean, that should probably be obvious, but ...
> +.B xfs_scrub
> +will issue O_DIRECT reads to the block device directly.
> +If the block device is a SCSI disk, it will issue READ VERIFY commands
> +directly to the disk.
+ These actions will confirm that all file data blocks can be read from storage.
or something?
> +.TP
> +.B \-y
> +Try to repair all filesystem errors.
> +If the errors cannot be fixed online, then the filesystem must be taken
> +offline for repair.
> +.SH EXIT CODE
> +The exit code returned by
> +.B xfs_scrub
> +is the sum of the following conditions:
> +.br
> +\ 0\ \-\ No errors
> +.br
> +\ 1\ \-\ File system errors left uncorrected
> +.br
> +\ 2\ \-\ File system optimizations possible
> +.br
> +\ 4\ \-\ Operational error
> +.br
> +\ 8\ \-\ Usage or syntax error
> +.br
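Since the exit code is a sum of distinct powers of two, a caller can test each condition as a bit. A small sketch of what a wrapper script's C equivalent might do is below. The DEMO_ names and demo_explain_exit() are made up for this example. Only the values 1, 2, 4 and 8 come from the list above.

```c
#include <stdio.h>

/* Condition bits as listed in the EXIT CODE section; names are made up. */
enum demo_scrub_ret {
	DEMO_RET_CORRUPT	= 1,	/* errors left uncorrected */
	DEMO_RET_UNOPTIMIZED	= 2,	/* optimizations possible */
	DEMO_RET_OPERROR	= 4,	/* operational error */
	DEMO_RET_SYNTAX		= 8,	/* usage or syntax error */
};

/* Print one line per condition set in code; return how many were set. */
static int
demo_explain_exit(
	int		code,
	FILE		*fp)
{
	static const struct {
		int		bit;
		const char	*what;
	} conds[] = {
		{ DEMO_RET_CORRUPT,	"File system errors left uncorrected" },
		{ DEMO_RET_UNOPTIMIZED,	"File system optimizations possible" },
		{ DEMO_RET_OPERROR,	"Operational error" },
		{ DEMO_RET_SYNTAX,	"Usage or syntax error" },
	};
	size_t		i;
	int		nr = 0;

	if (code == 0) {
		fprintf(fp, "No errors\n");
		return 0;
	}
	for (i = 0; i < sizeof(conds) / sizeof(conds[0]); i++) {
		if (code & conds[i].bit) {
			fprintf(fp, "%s\n", conds[i].what);
			nr++;
		}
	}
	return nr;
}
```

An exit code of 3, for example, would mean both uncorrected errors and possible optimizations.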
> +.SH CAVEATS
> +.B xfs_scrub
> +is an immature utility!
Might it damage my filesystem? ;)
> +This program takes advantage of in-kernel scrubbing to verify a given
> +data structure with locks held.
> +The kernel must support the BULKSTAT, FSGEOMETRY, FSCOUNTS, GET_RESBLKS,
> +GETBMAPX, GETFSMAP, INUMBERS, and SCRUB_METADATA ioctls.
Some of those ioctls are ancient and probably don't need to be specified...
Can you do anything at all without SCRUB_METADATA? If not,
is SCRUB_METADATA sufficient to determine that the kernel has the rest
of what it needs?
> +This can tie up the system for a while.
Maybe that's a statement to go right after "locks held"
> +.PP
> +If errors are found and cannot be repaired, the filesystem must be taken
> +offline and repaired.
"unmounted and repaired" might be more specific? *shrug*
> +.SH SEE ALSO
> +.BR xfs_repair (8).
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 10/27] xfs_scrub: add file space map iteration functions
2018-01-11 23:19 ` Eric Sandeen
@ 2018-01-12 0:24 ` Darrick J. Wong
0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-12 0:24 UTC (permalink / raw)
To: Eric Sandeen; +Cc: sandeen, linux-xfs
On Thu, Jan 11, 2018 at 05:19:22PM -0600, Eric Sandeen wrote:
> On 1/5/18 7:52 PM, Darrick J. Wong wrote:
>
>
> > + * These routines provide a simple interface to query the block
> > + * mappings of the fork of a given inode via GETBMAPX and call a
> > + * function to iterate each mapping result.
> > + */
> > +
> > +#define BMAP_NR 2048
> > +
> > +/* Iterate all the extent block mappings between the key and fork end. */
> > +bool
> > +xfs_iterate_filemaps(
> > + struct scrub_ctx *ctx,
> > + const char *descr,
> > + int fd,
> > + int whichfork,
> > + struct xfs_bmap *key,
>
> <coverity pass>
>
> Ok key is an xfs_bmap:
>
> /* inode fork block mapping */
> struct xfs_bmap {
> uint64_t bm_offset; /* file offset of segment in bytes */
> uint64_t bm_physical; /* physical starting byte */
> uint64_t bm_length; /* length of segment, bytes */
> uint32_t bm_flags; /* output flags */
> };
>
> > + xfs_bmap_iter_fn fn,
> > + void *arg)
> > +{
> > + struct fsxattr fsx;
> > + struct getbmapx *map
> map is a getbmapx ...
>
> struct getbmapx {
> __s64 bmv_offset; /* file offset of segment in blocks */
> __s64 bmv_block; /* starting block (64-bit daddr_t) */
> __s64 bmv_length; /* length of segment, blocks */
> __s32 bmv_count; /* # of entries in array incl. 1st */
> __s32 bmv_entries; /* # of entries filled in (output). */
> __s32 bmv_iflags; /* input flags (1st structure) */
> __s32 bmv_oflags; /* output flags (after 1st structure)*/
> __s32 bmv_unused1; /* future use */
> __s32 bmv_unused2; /* future use */
> };
>
> ...
>
> > +out:
> > + memcpy(key, map, sizeof(struct getbmapx));
>
> so I don't think that fits, right?
I can't remember why this line is even needed, so away it goes.
--D
>
>
> -Eric
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 12/27] xfs_scrub: wrap the scrub ioctl
2018-01-11 23:12 ` Eric Sandeen
@ 2018-01-12 0:28 ` Darrick J. Wong
0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-12 0:28 UTC (permalink / raw)
To: Eric Sandeen; +Cc: sandeen, linux-xfs
On Thu, Jan 11, 2018 at 05:12:49PM -0600, Eric Sandeen wrote:
> On 1/5/18 7:52 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > Create some wrappers to call the scrub ioctls.
>
> > +/*
> > + * Sleep for 100ms * however many -b we got past the initial one.
> > + * This is an (albeit clumsy) way to throttle scrub activity.
> > + */
> > +void
> > +background_sleep(void)
> > +{
> > + unsigned long long time;
> > + struct timespec tv;
> > +
> > + if (bg_mode < 2)
> > + return;
> > +
> > + time = 100000 * (bg_mode - 1);
>
> <coverity pass>
>
> Probably want to cast the constant(s) to something larger if someone
> issues -b $HUGE ... 100000ULL?
I suppose, though I doubt anyone will pass -b 42,950 times. 8-)
(-b doesn't take an argument)
Fixed.
--D
> -Eric
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 21/27] xfs_scrub: scrub file data blocks
2018-01-11 23:25 ` Eric Sandeen
@ 2018-01-12 0:29 ` Darrick J. Wong
0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-12 0:29 UTC (permalink / raw)
To: Eric Sandeen; +Cc: sandeen, linux-xfs
On Thu, Jan 11, 2018 at 05:25:58PM -0600, Eric Sandeen wrote:
>
>
> On 1/5/18 7:53 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
>
> ...
>
> > + /* Get the stat info for this directory entry. */
> > + error = fstatat(dir_fd, dirent->d_name, &sb,
> > + AT_NO_AUTOMOUNT | AT_SYMLINK_NOFOLLOW);
> > + if (error) {
> > + str_errno(ctx, newpath);
> > + continue;
>
> I needed:
>
> +#ifndef AT_NO_AUTOMOUNT
> +#define AT_NO_AUTOMOUNT 0x800
> +#endif
Fixed.
--D
> here
>
> -Eric
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 25/27] xfs_scrub: progress indicator
2018-01-11 23:27 ` Eric Sandeen
@ 2018-01-12 0:32 ` Darrick J. Wong
0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-12 0:32 UTC (permalink / raw)
To: Eric Sandeen; +Cc: sandeen, linux-xfs
On Thu, Jan 11, 2018 at 05:27:54PM -0600, Eric Sandeen wrote:
> On 1/5/18 7:54 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
>
>
> > +#define NSEC_PER_SEC (1000000000)
> > +static void *
> > +progress_report_thread(void *arg)
> > +{
> > + struct timespec abstime;
> > + int ret;
> > +
> > + pthread_mutex_lock(&pt.lock);
> > + while (1) {
> > + /* Every half second. */
> > + ret = clock_gettime(CLOCK_REALTIME, &abstime);
>
>
> My manpage says "link with -lrt" and to include <time.h>, this got me
> going:
>
> diff --git a/scrub/Makefile b/scrub/Makefile
> index 3e6f690..0094d9d 100644
> --- a/scrub/Makefile
> +++ b/scrub/Makefile
> @@ -67,7 +67,7 @@ xfs_scrub.c
>
> LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD) $(LIBUNISTRING)
> LTDEPENDENCIES += $(LIBHANDLE) $(LIBFROG) $(LIBUNISTRING)
> -LLDFLAGS = -static
> +LLDFLAGS = -static -lrt
I added $(LIBRT) to the end of LLDLIBS/LTDEPENDENCIES since we already
defined it elsewhere in the autoconf goo for benefit of the other
programs.
>
> ifeq ($(HAVE_MALLINFO),yes)
> LCFLAGS += -DHAVE_MALLINFO
> diff --git a/scrub/progress.c b/scrub/progress.c
> index 30b2152..61b9c60 100644
> --- a/scrub/progress.c
> +++ b/scrub/progress.c
> @@ -22,6 +22,7 @@
> #include <dirent.h>
> #include <pthread.h>
> #include <sys/statvfs.h>
> +#include <time.h>
Fixed.
--D
> #include "../repair/threads.h"
> #include "path.h"
> #include "disk.h"
>
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 01/27] xfs_scrub: create online filesystem scrub program
2018-01-06 1:51 ` [PATCH 01/27] xfs_scrub: create online filesystem scrub program Darrick J. Wong
2018-01-12 0:16 ` Eric Sandeen
@ 2018-01-12 1:07 ` Eric Sandeen
2018-01-12 1:10 ` Darrick J. Wong
1 sibling, 1 reply; 61+ messages in thread
From: Eric Sandeen @ 2018-01-12 1:07 UTC (permalink / raw)
To: Darrick J. Wong, sandeen; +Cc: linux-xfs
On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Create the foundations of a filesystem scrubbing tool that asks the
> kernel to inspect all metadata in the filesystem and (ultimately) to
> repair anything that's broken. Also create the man page for the
> utility.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
...
> +/*
> + * XFS Online Metadata Scrub (and Repair)
> + *
> + * The XFS scrubber uses custom XFS ioctls to probe more deeply into the
> + * internals of the filesystem. It takes advantage of scrubbing ioctls
> + * to check all the records stored in a metadata object and to
> + * cross-reference those records against the other filesystem metadata.
> + *
> + * After the program gathers command line arguments to figure out
> + * exactly what the user wants the program is going to do, scrub
* exactly what the user wants the program to do
or -
* exactly what the program is going to do
or -
* exactly what the user wants to do
:)
> + * execution is split up into several separate phases:
> + *
> + * The "find geometry" phase queries XFS for the filesystem geometry.
> + * The block devices for the data, realtime, and log devices are opened.
> + * Kernel ioctls are test-queried to see if they actually work (the scrub
> + * ioctl in particular), and any other filesystem-specific information
> + * is gathered.
> + *
> + * In the "check internal metadata" phase, we call the metadata scrub
> + * ioctl to check the filesystem's internal per-AG btrees. This
> + * includes the AG superblock, AGF, AGFL, and AGI headers, freespace
> + * btrees, the regular and free inode btrees, the reverse mapping
> + * btrees, and the reference counting btrees. If the realtime device is
> + * enabled, the realtime bitmap and reverse mapping btrees are enabled.
checked?
-Eric
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 01/27] xfs_scrub: create online filesystem scrub program
2018-01-12 0:16 ` Eric Sandeen
@ 2018-01-12 1:08 ` Darrick J. Wong
0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-12 1:08 UTC (permalink / raw)
To: Eric Sandeen; +Cc: sandeen, linux-xfs
On Thu, Jan 11, 2018 at 06:16:02PM -0600, Eric Sandeen wrote:
> On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
>
> <man page nitpicking>
>
> > diff --git a/man/man8/xfs_scrub.8 b/man/man8/xfs_scrub.8
> > new file mode 100644
> > index 0000000..95f4fea
> > --- /dev/null
> > +++ b/man/man8/xfs_scrub.8
> > @@ -0,0 +1,117 @@
> > +.TH xfs_scrub 8
> > +.SH NAME
> > +xfs_scrub \- scrub the contents of an XFS filesystem
> > +.SH SYNOPSIS
> > +.B xfs_scrub
> > +[
> > +.B \-abemnTvVxy
> ^
> > +]
> > +.I mount-point
>
> or block device?
>
> > +.br
> > +.B xfs_scrub \-V
> ^
>
> If V is special it probably shouldn't be in the first arg string?
Yes, fixed.
> Do you mean to hide the "-d" option?
-d turns on debug mode; I was going to keep that hidden from users.
>
> > +.SH DESCRIPTION
> > +.B xfs_scrub
> > +attempts to check and repair all metadata in a mounted XFS filesystem.
> > +.PP
> > +.B xfs_scrub
> > +asks the kernel to scrub all metadata objects in the filesystem.
> > +Metadata records are scanned for obviously bad values and then
> > +cross-referenced against other metadata.
> > +The goal is to establish a threasonable confidence about the consistency
>
> "reasonable"
Fixed.
> > +of the overall filesystem by examining the consistency of individual
> > +metadata records against the other metadata in the filesystem across the
> > +entire filesystem.
>
> Redundant, "examining the consistency of individual metadata records against
> the other metadata in the filesystem." would suffice.
Fixed.
> > +Damaged metadata can be rebuilt from other metadata if there is
> > +sufficient redundancy (and no other corruption) in the metadata.
>
> Again redundant, maybe just "if there is sufficient redundancy within
> other intact metadata?"
"Damaged metadata can be rebuilt from other metadata if there exist
redundant data structures that are intact."
?
> > +.PP
> > +This utility does not know how to correct all errors.
> > +If the tool cannot fix the detected errors, you must unmount the
> > +filesystem and run
> > +.B xfs_repair
> > +to fix the problems.
> > +If this tool is not run with either of the
> > +.B \-n
> > +or
> > +.B \-y
> > +options, then it will optimize the filesystem when possible,
> > +but it will not try to fix errors.
>
> I think the manpage needs to describe what this optimization might
> involve, at least at a high level. Will it fsr all my files? Will
> it trim my free space? Will it compact my directories? Will it ...?
> What exactly am I agreeing to here? :)
"Optimizations may include, but are not limited to, activities such as
compacting metadata or bypassing shared block write checks for files
that no longer share blocks."
> > +.SH OPTIONS
> > +.TP
> > +.BI \-a " errors"
> > +Abort if more than this many errors are found on the filesystem.
> > +.TP
> > +.B \-b
> > +Run in background mode.
> > +If the option is specified once, only run a single scrubbing thread at a
> > +time.
> > +If given more than once, an artificial delay of 100us is added to each
> > +scrub call to reduce CPU overhead even further.
>
> I wonder, should it take a value instead of -bbbbbbbbb?
More than ten -b and this program gets reallllly slow. There are
currently six global fs checks, ten per-AG checks, and seven per-file
checks. On my /home filesystem with 4M inodes and 32 AGs that adds up
to...
6 + (32 * 10) + (4M * 7) == ~28M scrub calls, or 324 days to perform
a scan.
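Spelling out that arithmetic as a sketch: this assumes 4M means 4,000,000 inodes and a total delay of about one second per scrub call (ten delay steps of 100ms, going by the comment in background_sleep()). The demo_ function names are made up.

```c
#include <stdint.h>

/* Total scrub calls: global checks + per-AG checks + per-file checks. */
static uint64_t
demo_scrub_calls(
	uint64_t	global_checks,
	uint64_t	ag_count,
	uint64_t	per_ag_checks,
	uint64_t	inodes,
	uint64_t	per_file_checks)
{
	return global_checks + ag_count * per_ag_checks +
			inodes * per_file_checks;
}

/* Days spent sleeping if every call is followed by delay_usecs of delay. */
static double
demo_scan_days(
	uint64_t	calls,
	uint64_t	delay_usecs)
{
	return (double)calls * (double)delay_usecs / 1e6 / 86400.0;
}
```

demo_scrub_calls(6, 32, 10, 4000000, 7) comes to 28,000,326 calls, and at one second per call that is a little over 324 days of sleeping alone.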
> > +.TP
> > +.B \-e
> > +Specifies what happens when errors are detected.
> > +If
> > +.IR shutdown
> > +is given, the filesystem will be taken offline if errors are found.
> > +Not all backends can shut down a filesystem.
>
> <user> what's a backend? </user>
Leftover remnant from the days when this was a frankentool that could be
used to walk filesystems via the standard interfaces. I removed this
sentence.
> > +If
> > +.IR continue
> > +is given, no action taken if errors are found.
> > +This is the default.
>
> <user> so how do I know what errors were found? </user>
"Filesystem corruption and optimization opportunities will be logged to
the standard error stream."
I'll put that at the top.
> > +.TP
> > +.BI \-m " file"
> > +Search this file for mounted filesystems instead of /etc/mtab.
> > +.TP
> > +.B \-n
> > +Dry run, do not modify anything in the filesystem.
> > +This disables all preening and optimization behaviors, and disables
> > +calling FITRIM on the free space after a successful run.
>
> what if I only want to disable FITRIM? (-k?)
Oh all right. :)
> Oh, and it runs FITRIM? Can you mention that more prominently
> in the behavior description?
I'll put it in the list of optimizations.
> (and should it, given that we have a tool for that purpose?)
Yes we have fstrim but I consider it too scary to run out of the
blue without checking the health of the free space info first.
> > +.TP
> > +.BI \-T
> > +Print timing and memory usage information for each phase.
> > +.TP
> > +.B \-v
> > +Enable verbose mode, which prints periodic status updates.
> > +.TP
> > +.B \-V
> > +Prints the version number and exits.
> > +.TP
> > +.B \-x
> > +Scrub all file data too.
>
> colloquial? maybe s/too/as well/
"Read all file data extents to look for disk errors."
> > +The block list will be sorted in disk order for better performance.
>
> Cool, so when I'm done, my filesystem will have better performance if I use -x?
> and none of my files will be corrupted! ;)
>
> The read order is probably an implementation detail that doesn't need to be in
> the manpage. It may be worth changing the description a bit to make it
> clearer that the purpose is to determine readability of every file block?
> I mean, that should probably be obvious, but ...
Eh, I'll just remove it.
> > +.B xfs_scrub
> > +will issue O_DIRECT reads to the block device directly.
> > +If the block device is a SCSI disk, it will issue READ VERIFY commands
> > +directly to the disk.
>
> + These actions will confirm that all file data blocks can be read from storage.
>
> or something?
Ok, added that verbatim.
> > +.TP
> > +.B \-y
> > +Try to repair all filesystem errors.
> > +If the errors cannot be fixed online, then the filesystem must be taken
> > +offline for repair.
> > +.SH EXIT CODE
> > +The exit code returned by
> > +.B xfs_scrub
> > +is the sum of the following conditions:
> > +.br
> > +\ 0\ \-\ No errors
> > +.br
> > +\ 1\ \-\ File system errors left uncorrected
> > +.br
> > +\ 2\ \-\ File system optimizations possible
> > +.br
> > +\ 4\ \-\ Operational error
> > +.br
> > +\ 8\ \-\ Usage or syntax error
> > +.br
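Since the exit code is described as a sum of the conditions above, a caller
can pull it apart with bit tests. A rough sketch of that, for illustration
only: the macro names and the helper are made up here and are not taken from
xfs_scrub; only the numeric values come from the list above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Values from the EXIT CODE list; the macro names are hypothetical. */
#define SCRUB_RET_UNCORRECTED	1	/* errors left uncorrected */
#define SCRUB_RET_OPTIMIZE	2	/* optimizations possible */
#define SCRUB_RET_OPERROR	4	/* operational error */
#define SCRUB_RET_SYNTAX	8	/* usage or syntax error */

/* Append msg to buf, separated by "; " if buf already has text. */
static void
append_msg(char *buf, size_t len, const char *msg)
{
	size_t	used = strlen(buf);

	assert(used < len);
	snprintf(buf + used, len - used, "%s%s", used ? "; " : "", msg);
}

/*
 * Hypothetical helper: describe a summed exit code in buf.
 * Returns the number of conditions that were set.
 */
static int
describe_scrub_exit(int code, char *buf, size_t len)
{
	int	nr = 0;

	assert(len > 0);
	buf[0] = '\0';
	if (code == 0) {
		append_msg(buf, len, "no errors");
		return 0;
	}
	if (code & SCRUB_RET_UNCORRECTED) {
		append_msg(buf, len, "errors left uncorrected");
		nr++;
	}
	if (code & SCRUB_RET_OPTIMIZE) {
		append_msg(buf, len, "optimizations possible");
		nr++;
	}
	if (code & SCRUB_RET_OPERROR) {
		append_msg(buf, len, "operational error");
		nr++;
	}
	if (code & SCRUB_RET_SYNTAX) {
		append_msg(buf, len, "usage or syntax error");
		nr++;
	}
	return nr;
}
```

With this, an exit status of 5 would read as uncorrected errors plus an
operational error, since 5 = 1 + 4.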
> > +.SH CAVEATS
> > +.B xfs_scrub
> > +is an immature utility!
>
> Might it damage my filesystem? ;)
It glides as softly as a piston!
...oh, are we not doing the monorail song?
> > +This program takes advantage of in-kernel scrubbing to verify a given
> > +data structure with locks held.
"This program takes advantage of in-kernel scrubbing to verify a given
data structure with locks held and can keep the filesystem busy for a
long time."
> > +The kernel must support the BULKSTAT, FSGEOMETRY, FSCOUNTS, GET_RESBLKS,
> > +GETBMAPX, GETFSMAP, INUMBERS, and SCRUB_METADATA ioctls.
>
> Some of those ioctls are ancient and probably don't need to be specified...
> Can you do anything at all without SCRUB_METADATA? If not,
> is SCRUB_METADATA sufficient to determine that the kernel has the rest
> of what it needs?
SCRUB_METADATA is enough, provided we don't get kernel-tinyfication'd.
> > +This can tie up the system for a while.
>
> Maybe that's a statement to go right after "locks held"
Ok.
> > +.PP
> > +If errors are found and cannot be repaired, the filesystem must be taken
> > +offline and repaired.
>
> "unmounted and repaired" might be more specific? *shrug*
Ok.
--D
> > +.SH SEE ALSO
> > +.BR xfs_repair (8).
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH 01/27] xfs_scrub: create online filesystem scrub program
2018-01-12 1:07 ` Eric Sandeen
@ 2018-01-12 1:10 ` Darrick J. Wong
0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-12 1:10 UTC (permalink / raw)
To: Eric Sandeen; +Cc: sandeen, linux-xfs
On Thu, Jan 11, 2018 at 07:07:43PM -0600, Eric Sandeen wrote:
>
>
> On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > Create the foundations of a filesystem scrubbing tool that asks the
> > kernel to inspect all metadata in the filesystem and (ultimately) to
> > repair anything that's broken. Also create the man page for the
> > utility.
> >
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
>
> ...
>
> > +/*
> > + * XFS Online Metadata Scrub (and Repair)
> > + *
> > + * The XFS scrubber uses custom XFS ioctls to probe more deeply into the
> > + * internals of the filesystem. It takes advantage of scrubbing ioctls
> > + * to check all the records stored in a metadata object and to
> > + * cross-reference those records against the other filesystem metadata.
> > + *
> > + * After the program gathers command line arguments to figure out
> > + * exactly what the user wants the program is going to do, scrub
>
> * exactly what the user wants the program to do
>
> or -
>
> * exactly what the program is going to do
>
> or -
>
> * exactly what the user wants to do
>
> :)
The second. The program can figure out what the program is going to do;
it has no idea what the user wants.
> > + * execution is split up into several separate phases:
> > + *
> > + * The "find geometry" phase queries XFS for the filesystem geometry.
> > + * The block devices for the data, realtime, and log devices are opened.
> > + * Kernel ioctls are test-queried to see if they actually work (the scrub
> > + * ioctl in particular), and any other filesystem-specific information
> > + * is gathered.
> > + *
> > + * In the "check internal metadata" phase, we call the metadata scrub
> > + * ioctl to check the filesystem's internal per-AG btrees. This
> > + * includes the AG superblock, AGF, AGFL, and AGI headers, freespace
> > + * btrees, the regular and free inode btrees, the reverse mapping
> > + * btrees, and the reference counting btrees. If the realtime device is
> > + * enabled, the realtime bitmap and reverse mapping btrees are enabled.
>
> checked?
Fixed.
--D
> -Eric
* Re: [PATCH 02/27] xfs_scrub: common error handling
2018-01-06 1:51 ` [PATCH 02/27] xfs_scrub: common error handling Darrick J. Wong
@ 2018-01-12 1:15 ` Eric Sandeen
2018-01-12 1:23 ` Darrick J. Wong
0 siblings, 1 reply; 61+ messages in thread
From: Eric Sandeen @ 2018-01-12 1:15 UTC (permalink / raw)
To: Darrick J. Wong, sandeen; +Cc: linux-xfs
On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Standardize how we record and report errors.
>
> +/*
> + * Reporting Status to the Console
> + *
> + * We aim for a roughly standard reporting format -- the severity of the
> + * status being reported, a textual description of the objecting being
object? (I mean, I suppose it might be objecting to your horribly
corrupted filesystem?) ;)
> + * reported, and whatever the status happens to be.
> + *
> + * Errors are the most severe and reflect filesystem corruption.
> + * Warnings indicate that something is amiss and needs the attention of
> + * the administrator, but does not constitute a corruption. Information
> + * is merely advisory.
> + */
> +
> /* Program name; needed for libxcmd error reports. */
> char *progname = "xfs_scrub";
>
> +/* Debug level; higher values mean more verbosity. */
> +unsigned int debug;
> +
> +/* Should we dump core if errors happen? */
> +bool dumpcore;
not independent of debug right, but ... *shrug*
-Eric
* Re: [PATCH 02/27] xfs_scrub: common error handling
2018-01-12 1:15 ` Eric Sandeen
@ 2018-01-12 1:23 ` Darrick J. Wong
0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-12 1:23 UTC (permalink / raw)
To: Eric Sandeen; +Cc: sandeen, linux-xfs
On Thu, Jan 11, 2018 at 07:15:52PM -0600, Eric Sandeen wrote:
> On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > Standardize how we record and report errors.
> >
>
>
> > +/*
> > + * Reporting Status to the Console
> > + *
> > + * We aim for a roughly standard reporting format -- the severity of the
> > + * status being reported, a textual description of the objecting being
>
> object? (I mean, I suppose it might be objecting to your horribly
> corrupted filesystem?) ;)
Fixed.
> > + * reported, and whatever the status happens to be.
> > + *
> > + * Errors are the most severe and reflect filesystem corruption.
> > + * Warnings indicate that something is amiss and needs the attention of
> > + * the administrator, but does not constitute a corruption. Information
> > + * is merely advisory.
> > + */
> > +
>
>
> > /* Program name; needed for libxcmd error reports. */
> > char *progname = "xfs_scrub";
> >
> > +/* Debug level; higher values mean more verbosity. */
> > +unsigned int debug;
> > +
> > +/* Should we dump core if errors happen? */
> > +bool dumpcore;
>
> not independent of debug right, but ... *shrug*
Wart from the old days. I'll gate core dumping on debug.
--D
> -Eric
* Re: [PATCH 06/27] xfs_scrub: create an abstraction for a block device
2018-01-12 0:04 ` Eric Sandeen
@ 2018-01-12 1:27 ` Darrick J. Wong
0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-12 1:27 UTC (permalink / raw)
To: Eric Sandeen; +Cc: sandeen, linux-xfs
On Thu, Jan 11, 2018 at 06:04:38PM -0600, Eric Sandeen wrote:
> On 1/11/18 5:59 PM, Darrick J. Wong wrote:
> > On Thu, Jan 11, 2018 at 05:24:58PM -0600, Eric Sandeen wrote:
> ...
>
> >>> + /* Non-rotational device? Throw all the CPUs. */
> >>> + rot = 1;
> >>> + error = ioctl(disk->d_fd, BLKROTATIONAL, &rot);
> >>> + if (error == 0 && rot == 0)
> >>> + return nproc;
> >>
> >> I needed
> >>
> >> +#ifndef BLKROTATIONAL
> >> +#define BLKROTATIONAL _IO(0x12,126)
> >> +#endif
> >>
> >> to make this compile on my not /that/ ancient (?) rhel6 box ;)
> >
> > Hmm... well, since I don't see backporting xfs kernel scrub to 2.6.32
> > maybe xfsprogs' build system should just turn off xfs_scrub on old
> > systems?
> >
> > In any case, I #ifdef BLKROTATIONAL'd out the entire clause.
>
> ok. well, other distros are making noise about using bleeding edge progs
> w/ older distro kernels (hence the mkfs config file wishes) so it's probably
> good to consider building against older environments.
<shrug> ok I can patch it in like that...
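For reference, a minimal sketch of what patching it in could look like. This
assumes a Linux build with <linux/fs.h> available; the helper name and the
single-thread fallback are placeholders for illustration, not what the real
code does for rotational disks.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

/* Older kernel headers (2.6.32 era) may not define this. */
#ifndef BLKROTATIONAL
#define BLKROTATIONAL	_IO(0x12, 126)
#endif

/*
 * Hypothetical helper: decide how many I/O threads to use for a device.
 * A non-rotational device gets all the CPUs.  Anything else, including
 * a device where the ioctl fails, gets one thread in this sketch.
 */
static unsigned int
pick_io_threads(int fd, unsigned int nproc)
{
	unsigned short	rot = 1;
	int		error;

	assert(nproc > 0);
	error = ioctl(fd, BLKROTATIONAL, &rot);
	if (error == 0 && rot == 0)
		return nproc;
	return 1;
}
```

On a kernel too old to know the ioctl, the call simply fails and the
conservative answer is used, so the fallback define costs nothing at runtime.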
--D
> Thanks,
> -Eric
* Re: [PATCH 03/27] xfs_scrub: set up command line argument parsing
2018-01-06 1:51 ` [PATCH 03/27] xfs_scrub: set up command line argument parsing Darrick J. Wong
2018-01-11 23:39 ` Eric Sandeen
@ 2018-01-12 1:30 ` Eric Sandeen
2018-01-12 2:03 ` Darrick J. Wong
1 sibling, 1 reply; 61+ messages in thread
From: Eric Sandeen @ 2018-01-12 1:30 UTC (permalink / raw)
To: Darrick J. Wong, sandeen; +Cc: linux-xfs
On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Parse command line options in order to set up the context in which we
> will scrub the filesystem.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> scrub/common.h | 8 ++
> scrub/xfs_scrub.c | 207 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> scrub/xfs_scrub.h | 34 +++++++++
> 3 files changed, 249 insertions(+)
>
>
> diff --git a/scrub/common.h b/scrub/common.h
> index f620620..15a59bd 100644
> --- a/scrub/common.h
> +++ b/scrub/common.h
> @@ -48,4 +48,12 @@ void __record_preen(struct scrub_ctx *ctx, const char *descr, const char *file,
> #define str_info(ctx, str, ...) __str_info(ctx, str, __FILE__, __LINE__, __VA_ARGS__)
> #define dbg_printf(fmt, ...) {if (debug > 1) {printf(fmt, __VA_ARGS__);}}
>
> +/* Is this debug tweak enabled? */
> +static inline bool
> +debug_tweak_on(
> + const char *name)
> +{
> + return debug && getenv(name) != NULL;
since it's debug anyway, I wonder if printing
"FOO_BAR_TWEAK is on" would be useful here.
> +}
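A rough sketch of that suggestion, for illustration: the message wording is
an assumption, and the local "debug" variable only stands in for the global
in xfs_scrub.c so that the fragment is self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Debug level; stand-in for the global defined in xfs_scrub.c. */
static unsigned int debug;

/* Is this debug tweak enabled?  If so, announce it on stderr. */
static bool
debug_tweak_on(
	const char	*name)
{
	assert(name != NULL);
	if (!debug || getenv(name) == NULL)
		return false;
	fprintf(stderr, "%s is on\n", name);
	return true;
}
```

Note that this prints on every call, so a tweak that is checked in a loop
would want the message emitted once, somewhere else.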
> +
> #endif /* XFS_SCRUB_COMMON_H_ */
> diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
> index 10116a8..9db3b41 100644
> --- a/scrub/xfs_scrub.c
> +++ b/scrub/xfs_scrub.c
> @@ -20,7 +20,12 @@
> #include <stdio.h>
> #include <pthread.h>
> #include <stdbool.h>
> +#include <stdlib.h>
> +#include "platform_defs.h"
> +#include "xfs.h"
> +#include "input.h"
> #include "xfs_scrub.h"
> +#include "common.h"
>
> /*
> * XFS Online Metadata Scrub (and Repair)
> @@ -107,11 +112,213 @@ unsigned int debug;
> /* Should we dump core if errors happen? */
> bool dumpcore;
>
> +/* Display resource usage at the end of each phase? */
> +bool display_rusage;
> +
> +/* Background mode; higher values insert more pauses between scrub calls. */
> +unsigned int bg_mode;
> +
> +/* Maximum number of processors available to us. */
> +int nproc;
> +
> +/* Number of threads we're allowed to use. */
> +unsigned int nr_threads;
> +
> +/* Verbosity; higher values print more information. */
> +bool verbose;
> +
> +/* Should we scrub the data blocks? */
> +bool scrub_data;
> +
> +/* Size of a memory page. */
> +long page_size;
> +
> +static void __attribute__((noreturn))
> +usage(void)
> +{
> + fprintf(stderr, _("Usage: %s [OPTIONS] mountpoint\n"), progname);
> + fprintf(stderr, _("-a:\tStop after this many errors are found.\n"));
> + fprintf(stderr, _("-b:\tBackground mode.\n"));
> + fprintf(stderr, _("-e:\tWhat to do if errors are found.\n"));
> + fprintf(stderr, _("-m:\tPath to /etc/mtab.\n"));
> + fprintf(stderr, _("-n:\tDry run. Do not modify anything.\n"));
> + fprintf(stderr, _("-T:\tDisplay timing/usage information.\n"));
> + fprintf(stderr, _("-v:\tVerbose output.\n"));
> + fprintf(stderr, _("-V:\tPrint version.\n"));
> + fprintf(stderr, _("-x:\tScrub file data too.\n"));
> + fprintf(stderr, _("-y:\tRepair all errors.\n"));
> +
> + exit(16);
> +}
> +
> int
> main(
> int argc,
> char **argv)
> {
> + int c;
> + char *mtab = NULL;
> + char *repairstr = "";
> + struct scrub_ctx ctx = {0};
> + unsigned long long total_errors;
> + bool moveon = true;
> + static bool injected;
> + int ret = 0;
> +
> fprintf(stderr, "XXX: This program is not complete!\n");
> return 4;
> +
> + progname = basename(argv[0]);
> + setlocale(LC_ALL, "");
> + bindtextdomain(PACKAGE, LOCALEDIR);
> + textdomain(PACKAGE);
> +
> + pthread_mutex_init(&ctx.lock, NULL);
> + ctx.mode = SCRUB_MODE_DEFAULT;
> + ctx.error_action = ERRORS_CONTINUE;
> + while ((c = getopt(argc, argv, "a:bde:m:nTvxVy")) != EOF) {
> + switch (c) {
> + case 'a':
> + ctx.max_errors = cvt_u64(optarg, 10);
> + if (errno) {
> + perror(optarg);
> + usage();
> + }
> + break;
> + case 'b':
> + nr_threads = 1;
> + bg_mode++;
> + break;
> + case 'd':
> + debug++;
> + dumpcore = true;
> + break;
> + case 'e':
> + if (!strcmp("continue", optarg))
> + ctx.error_action = ERRORS_CONTINUE;
> + else if (!strcmp("shutdown", optarg))
> + ctx.error_action = ERRORS_SHUTDOWN;
> + else
> + usage();
Nothing tells me what I did wrong here,
# scrub/xfs_scrub -e make_it_so /mnt/test
Usage: xfs_scrub [OPTIONS] mountpoint
-a: Stop after this many errors are found.
-b: Background mode.
-C: Print progress information to this fd.
-e: What to do if errors are found.
...
I told it what to do... what's wrong?
> + break;
> + case 'm':
> + mtab = optarg;
> + break;
> + case 'n':
> + if (ctx.mode != SCRUB_MODE_DEFAULT) {
> + fprintf(stderr,
> +_("Only one of the options -n or -y may be specified.\n"));
> + return 1;
> + }
> + ctx.mode = SCRUB_MODE_DRY_RUN;
> + break;
> + case 'T':
> + display_rusage = true;
> + break;
> + case 'v':
> + verbose = true;
> + break;
> + case 'V':
> + fprintf(stdout, _("%s version %s\n"), progname,
> + VERSION);
> + fflush(stdout);
> + exit(0);
> + case 'x':
> + scrub_data = true;
> + break;
> + case 'y':
> + if (ctx.mode != SCRUB_MODE_DEFAULT) {
> + fprintf(stderr,
> +_("Only one of the options -n or -y may be specified.\n"));
> + return 1;
> + }
> + ctx.mode = SCRUB_MODE_REPAIR;
> + break;
> + case '?':
'?' isn't in the getopt string ...
# scrub/xfs_scrub ?
xfs_scrub: could not stat: ?: No such file or directory
> + /* fall through */
> + default:
> + usage();
> + }
> + }
> +
> + /* Override thread count if debugger */
> + if (debug_tweak_on("XFS_SCRUB_THREADS")) {
can you document all these tweaks somewhere near the top in a comment?
> + unsigned int x;
> +
> + x = cvt_u32(getenv("XFS_SCRUB_THREADS"), 10);
> + if (errno) {
> + perror("nr_threads");
> + usage();
> + }
> + nr_threads = x;
> + }
> +
> + if (optind != argc - 1)
> + usage();
> +
> + ctx.mntpoint = strdup(argv[optind]);
> +
> + /*
> + * If the user did not specify an explicit mount table, try to use
> + * /proc/mounts if it is available, else /etc/mtab. We prefer
> + * /proc/mounts because it is kernel controlled, while /etc/mtab
> + * may contain garbage that userspace tools like pam_mounts wrote
> + * into it.
> + */
> + if (!mtab) {
> + if (access(_PATH_PROC_MOUNTS, R_OK) == 0)
> + mtab = _PATH_PROC_MOUNTS;
> + else
> + mtab = _PATH_MOUNTED;
> + }
> +
> + /* How many CPUs? */
> + nproc = sysconf(_SC_NPROCESSORS_ONLN);
> + if (nproc < 1)
> + nproc = 1;
> +
> + /* Set up a page-aligned buffer for read verification. */
> + page_size = sysconf(_SC_PAGESIZE);
> + if (page_size < 0) {
> + str_errno(&ctx, ctx.mntpoint);
> + goto out;
> + }
> +
> + if (debug_tweak_on("XFS_SCRUB_FORCE_REPAIR") && !injected) {
> + ctx.mode = SCRUB_MODE_REPAIR;
> + injected = true;
> + }
what is "injected" used for? How could it already be set?
> +
> + if (xfs_scrub_excessive_errors(&ctx))
> + str_info(&ctx, ctx.mntpoint, _("Too many errors; aborting."));
wait wut? oh right, you'll add $DO_STUFF above this in later patches ;)
Rest looks ok
-Eric
* Re: [PATCH 03/27] xfs_scrub: set up command line argument parsing
2018-01-11 23:39 ` Eric Sandeen
@ 2018-01-12 1:53 ` Darrick J. Wong
0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-12 1:53 UTC (permalink / raw)
To: Eric Sandeen; +Cc: sandeen, linux-xfs
On Thu, Jan 11, 2018 at 05:39:38PM -0600, Eric Sandeen wrote:
> On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > Parse command line options in order to set up the context in which we
> > will scrub the filesystem.
>
>
> > +static void __attribute__((noreturn))
> > +usage(void)
> > +{
> > + fprintf(stderr, _("Usage: %s [OPTIONS] mountpoint\n"), progname);
> > + fprintf(stderr, _("-a:\tStop after this many errors are found.\n"));
> > + fprintf(stderr, _("-b:\tBackground mode.\n"));
>
> do you intentionally not document -d?
> <same question for manpage>
Debug mode, so yes.
> > + fprintf(stderr, _("-e:\tWhat to do if errors are found.\n"));
> > + fprintf(stderr, _("-m:\tPath to /etc/mtab.\n"));
> > + fprintf(stderr, _("-n:\tDry run. Do not modify anything.\n"));
> > + fprintf(stderr, _("-T:\tDisplay timing/usage information.\n"));
> > + fprintf(stderr, _("-v:\tVerbose output.\n"));
> > + fprintf(stderr, _("-V:\tPrint version.\n"));
> > + fprintf(stderr, _("-x:\tScrub file data too.\n"));
> > + fprintf(stderr, _("-y:\tRepair all errors.\n"));
> > +
> > + exit(16);
> > +}
>
> Could we make this more like xfs_repair usage() for consistency?
>
> Usage: xfs_repair [options] device
>
> Options:
> -f The device is a file
> -L Force log zeroing. Do this as a last resort.
> -l logdev Specifies the device where the external log resides.
> -m maxmem Maximum amount of memory to be used in megabytes.
> -n No modify mode, just checks the filesystem for damage.
> -P Disables prefetching.
> -r rtdev Specifies the device where the realtime section resides.
> -v Verbose output.
> -c subopts Change filesystem parameters - use xfs_admin.
> -o subopts Override default behaviour, refer to man page.
> -t interval Reporting interval in seconds.
> -d Repair dangerously.
> -V Reports version and exits.
>
> so maybe:
>
> Usage: xfs_scrub [options] mountpoint
>
> -a count Stop after this many errors are found.
> -b Background mode.
> -C fd Print progress information to this fd.
> -e behavior What to do if errors are found. (shutdown|continue)
> -m path Path to /etc/mtab.
> -n Dry run. Do not modify anything.
> -T Display timing/usage information.
> -v Verbose output.
> -V Reports version and exits.
> -x Scrub file data too.
> -y Repair all errors.
Ok. Assuming you meant to indent everything by two spaces and make it
obvious which switches take parameters.
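As a sketch of that layout (not the final patch): a small table drives the
output so the parameter column stays aligned. The struct, the table name and
print_usage() are made up for illustration, the option text is taken from the
proposal quoted above, and the gettext _() wrappers are left out to keep the
fragment self-contained.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the option summary; arg is "" for plain switches. */
struct opt_help {
	char		flag;
	const char	*arg;
	const char	*text;
};

static const struct opt_help scrub_opts[] = {
	{ 'a', "count",    "Stop after this many errors are found." },
	{ 'b', "",         "Background mode." },
	{ 'C', "fd",       "Print progress information to this fd." },
	{ 'e', "behavior", "What to do if errors are found. (shutdown|continue)" },
	{ 'm', "path",     "Path to /etc/mtab." },
	{ 'n', "",         "Dry run.  Do not modify anything." },
	{ 'T', "",         "Display timing/usage information." },
	{ 'v', "",         "Verbose output." },
	{ 'V', "",         "Reports version and exits." },
	{ 'x', "",         "Scrub file data too." },
	{ 'y', "",         "Repair all errors." },
};

/* Print the usage text to fp; returns the number of option rows. */
static int
print_usage(FILE *fp, const char *progname)
{
	size_t	nr = sizeof(scrub_opts) / sizeof(scrub_opts[0]);
	size_t	i;

	assert(fp != NULL && progname != NULL);
	fprintf(fp, "Usage: %s [options] mountpoint\n\n", progname);
	for (i = 0; i < nr; i++)
		fprintf(fp, "  -%c %-12s %s\n", scrub_opts[i].flag,
			scrub_opts[i].arg, scrub_opts[i].text);
	return (int)nr;
}
```

The real usage() would still call exit() afterwards; that is omitted here.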
--D
* Re: [PATCH 03/27] xfs_scrub: set up command line argument parsing
2018-01-12 1:30 ` Eric Sandeen
@ 2018-01-12 2:03 ` Darrick J. Wong
0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-12 2:03 UTC (permalink / raw)
To: Eric Sandeen; +Cc: sandeen, linux-xfs
On Thu, Jan 11, 2018 at 07:30:11PM -0600, Eric Sandeen wrote:
> On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > Parse command line options in order to set up the context in which we
> > will scrub the filesystem.
> >
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> > scrub/common.h | 8 ++
> > scrub/xfs_scrub.c | 207 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> > scrub/xfs_scrub.h | 34 +++++++++
> > 3 files changed, 249 insertions(+)
> >
> >
> > diff --git a/scrub/common.h b/scrub/common.h
> > index f620620..15a59bd 100644
> > --- a/scrub/common.h
> > +++ b/scrub/common.h
> > @@ -48,4 +48,12 @@ void __record_preen(struct scrub_ctx *ctx, const char *descr, const char *file,
> > #define str_info(ctx, str, ...) __str_info(ctx, str, __FILE__, __LINE__, __VA_ARGS__)
> > #define dbg_printf(fmt, ...) {if (debug > 1) {printf(fmt, __VA_ARGS__);}}
> >
> > +/* Is this debug tweak enabled? */
> > +static inline bool
> > +debug_tweak_on(
> > + const char *name)
> > +{
> > + return debug && getenv(name) != NULL;
>
> since it's debug anyway, I wonder if printing
> "FOO_BAR_TWEAK is on" would be useful here.
>
> > +}
> > +
> > #endif /* XFS_SCRUB_COMMON_H_ */
> > diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
> > index 10116a8..9db3b41 100644
> > --- a/scrub/xfs_scrub.c
> > +++ b/scrub/xfs_scrub.c
> > @@ -20,7 +20,12 @@
> > #include <stdio.h>
> > #include <pthread.h>
> > #include <stdbool.h>
> > +#include <stdlib.h>
> > +#include "platform_defs.h"
> > +#include "xfs.h"
> > +#include "input.h"
> > #include "xfs_scrub.h"
> > +#include "common.h"
> >
> > /*
> > * XFS Online Metadata Scrub (and Repair)
> > @@ -107,11 +112,213 @@ unsigned int debug;
> > /* Should we dump core if errors happen? */
> > bool dumpcore;
> >
> > +/* Display resource usage at the end of each phase? */
> > +bool display_rusage;
> > +
> > +/* Background mode; higher values insert more pauses between scrub calls. */
> > +unsigned int bg_mode;
> > +
> > +/* Maximum number of processors available to us. */
> > +int nproc;
> > +
> > +/* Number of threads we're allowed to use. */
> > +unsigned int nr_threads;
> > +
> > +/* Verbosity; higher values print more information. */
> > +bool verbose;
> > +
> > +/* Should we scrub the data blocks? */
> > +bool scrub_data;
> > +
> > +/* Size of a memory page. */
> > +long page_size;
> > +
> > +static void __attribute__((noreturn))
> > +usage(void)
> > +{
> > + fprintf(stderr, _("Usage: %s [OPTIONS] mountpoint\n"), progname);
> > + fprintf(stderr, _("-a:\tStop after this many errors are found.\n"));
> > + fprintf(stderr, _("-b:\tBackground mode.\n"));
> > + fprintf(stderr, _("-e:\tWhat to do if errors are found.\n"));
> > + fprintf(stderr, _("-m:\tPath to /etc/mtab.\n"));
> > + fprintf(stderr, _("-n:\tDry run. Do not modify anything.\n"));
> > + fprintf(stderr, _("-T:\tDisplay timing/usage information.\n"));
> > + fprintf(stderr, _("-v:\tVerbose output.\n"));
> > + fprintf(stderr, _("-V:\tPrint version.\n"));
> > + fprintf(stderr, _("-x:\tScrub file data too.\n"));
> > + fprintf(stderr, _("-y:\tRepair all errors.\n"));
> > +
> > + exit(16);
> > +}
> > +
> > int
> > main(
> > int argc,
> > char **argv)
> > {
> > + int c;
> > + char *mtab = NULL;
> > + char *repairstr = "";
> > + struct scrub_ctx ctx = {0};
> > + unsigned long long total_errors;
> > + bool moveon = true;
> > + static bool injected;
> > + int ret = 0;
> > +
> > fprintf(stderr, "XXX: This program is not complete!\n");
> > return 4;
> > +
> > + progname = basename(argv[0]);
> > + setlocale(LC_ALL, "");
> > + bindtextdomain(PACKAGE, LOCALEDIR);
> > + textdomain(PACKAGE);
> > +
> > + pthread_mutex_init(&ctx.lock, NULL);
> > + ctx.mode = SCRUB_MODE_DEFAULT;
> > + ctx.error_action = ERRORS_CONTINUE;
> > + while ((c = getopt(argc, argv, "a:bde:m:nTvxVy")) != EOF) {
> > + switch (c) {
> > + case 'a':
> > + ctx.max_errors = cvt_u64(optarg, 10);
> > + if (errno) {
> > + perror(optarg);
> > + usage();
> > + }
> > + break;
> > + case 'b':
> > + nr_threads = 1;
> > + bg_mode++;
> > + break;
> > + case 'd':
> > + debug++;
> > + dumpcore = true;
> > + break;
> > + case 'e':
> > + if (!strcmp("continue", optarg))
> > + ctx.error_action = ERRORS_CONTINUE;
> > + else if (!strcmp("shutdown", optarg))
> > + ctx.error_action = ERRORS_SHUTDOWN;
> > + else
> > + usage();
>
> Nothing tells me what I did wrong here,
>
> # scrub/xfs_scrub -e make_it_so /mnt/test
> Usage: xfs_scrub [OPTIONS] mountpoint
> -a: Stop after this many errors are found.
> -b: Background mode.
> -C: Print progress information to this fd.
> -e: What to do if errors are found.
> ...
>
> I told it what to do... what's wrong?
How about printing: Unknown error behavior "$optarg". ?
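Roughly like the following sketch. The helper name and the enum are
stand-ins so the fragment compiles on its own; the real code keeps the
action in ctx.error_action and would wrap the message in _().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the error action values used by the scrub context. */
enum error_action {
	ERRORS_CONTINUE,
	ERRORS_SHUTDOWN,
};

/*
 * Hypothetical helper: parse the argument to -e.  Returns 0 and sets
 * *action on success; complains on stderr and returns -1 otherwise.
 */
static int
parse_error_action(const char *arg, enum error_action *action)
{
	assert(arg != NULL && action != NULL);
	if (!strcmp("continue", arg)) {
		*action = ERRORS_CONTINUE;
	} else if (!strcmp("shutdown", arg)) {
		*action = ERRORS_SHUTDOWN;
	} else {
		fprintf(stderr, "Unknown error behavior \"%s\".\n", arg);
		return -1;
	}
	return 0;
}
```

The caller can then print the usage text after the specific complaint, so
the user sees which argument was rejected before the option summary.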
> > + break;
> > + case 'm':
> > + mtab = optarg;
> > + break;
> > + case 'n':
> > + if (ctx.mode != SCRUB_MODE_DEFAULT) {
> > + fprintf(stderr,
> > +_("Only one of the options -n or -y may be specified.\n"));
> > + return 1;
> > + }
> > + ctx.mode = SCRUB_MODE_DRY_RUN;
> > + break;
> > + case 'T':
> > + display_rusage = true;
> > + break;
> > + case 'v':
> > + verbose = true;
> > + break;
> > + case 'V':
> > + fprintf(stdout, _("%s version %s\n"), progname,
> > + VERSION);
> > + fflush(stdout);
> > + exit(0);
> > + case 'x':
> > + scrub_data = true;
> > + break;
> > + case 'y':
> > + if (ctx.mode != SCRUB_MODE_DEFAULT) {
> > + fprintf(stderr,
> > +_("Only one of the options -n or -y may be specified.\n"));
> > + return 1;
> > + }
> > + ctx.mode = SCRUB_MODE_REPAIR;
> > + break;
> > + case '?':
>
> '?' isn't in the getopt string ...
The getopt manpage says it returns '?' for an unknown parameter, so I
provide the specific case here so that nobody can accidentally add a
second (case '?') statement.
IOWs, it's a defensive move.
> # scrub/xfs_scrub ?
> xfs_scrub: could not stat: ?: No such file or directory
>
>
> > + /* fall through */
> > + default:
> > + usage();
> > + }
> > + }
> > +
> > + /* Override thread count if debugger */
> > + if (debug_tweak_on("XFS_SCRUB_THREADS")) {
>
> can you document all these tweaks somewhere near the top in a comment?
/*
* Known debug tweaks (pass -d and set the environment variable):
* XFS_SCRUB_FORCE_ERROR -- pretend all metadata is corrupt
* XFS_SCRUB_FORCE_REPAIR -- repair all metadata even if it's ok
* XFS_SCRUB_NO_KERNEL -- pretend there is no kernel ioctl
* XFS_SCRUB_NO_SCSI_VERIFY -- disable SCSI VERIFY (if present)
* XFS_SCRUB_PHASE -- run only this scrub phase
* XFS_SCRUB_THREADS -- start exactly this number of threads
*/
> > + unsigned int x;
> > +
> > + x = cvt_u32(getenv("XFS_SCRUB_THREADS"), 10);
> > + if (errno) {
> > + perror("nr_threads");
> > + usage();
> > + }
> > + nr_threads = x;
> > + }
> > +
> > + if (optind != argc - 1)
> > + usage();
> > +
> > + ctx.mntpoint = strdup(argv[optind]);
> > +
> > + /*
> > + * If the user did not specify an explicit mount table, try to use
> > + * /proc/mounts if it is available, else /etc/mtab. We prefer
> > + * /proc/mounts because it is kernel controlled, while /etc/mtab
> > + * may contain garbage that userspace tools like pam_mounts wrote
> > + * into it.
> > + */
> > + if (!mtab) {
> > + if (access(_PATH_PROC_MOUNTS, R_OK) == 0)
> > + mtab = _PATH_PROC_MOUNTS;
> > + else
> > + mtab = _PATH_MOUNTED;
> > + }
> > +
> > + /* How many CPUs? */
> > + nproc = sysconf(_SC_NPROCESSORS_ONLN);
> > + if (nproc < 1)
> > + nproc = 1;
> > +
> > + /* Set up a page-aligned buffer for read verification. */
> > + page_size = sysconf(_SC_PAGESIZE);
> > + if (page_size < 0) {
> > + str_errno(&ctx, ctx.mntpoint);
> > + goto out;
> > + }
> > +
> > + if (debug_tweak_on("XFS_SCRUB_FORCE_REPAIR") && !injected) {
> > + ctx.mode = SCRUB_MODE_REPAIR;
> > + injected = true;
> > + }
>
> > what is "injected" used for? How could it already be set?
Not needed here, will remove.
> > +
> > + if (xfs_scrub_excessive_errors(&ctx))
> > + str_info(&ctx, ctx.mntpoint, _("Too many errors; aborting."));
>
> wait wut? oh right, you'll add $DO_STUFF above this in later patches ;)
>
>
> Rest looks ok
Ok.
--D
>
> -Eric
* Re: [PATCH v11 00/27] xfsprogs: online scrub/repair support
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (27 preceding siblings ...)
2018-01-06 3:50 ` [PATCH 07/27] xfs_scrub: find XFS filesystem geometry Darrick J. Wong
@ 2018-01-12 4:17 ` Eric Sandeen
2018-01-17 1:31 ` Darrick J. Wong
2018-01-16 19:21 ` [PATCH 28/27] xfs_scrub: wire up repair ioctl Darrick J. Wong
2018-01-16 19:21 ` [PATCH 29/27] xfs_scrub: schedule and manage repairs to the filesystem Darrick J. Wong
30 siblings, 1 reply; 61+ messages in thread
From: Eric Sandeen @ 2018-01-12 4:17 UTC (permalink / raw)
To: Darrick J. Wong, sandeen; +Cc: linux-xfs
On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> Hi all,
>
> This is the eleventh revision of a patchset that adds to XFS userland tools
> support for online metadata scrubbing and repair. Since v10 I've rebased
> to the latest for-next, fixed some wonky error messages, and fixed a few
> minor problems I found via code inspection. However, this patch series is
> more or less the same as v10.
General note, rather than hunting down the patches each of these came from ;)
These can be made static and in some cases removed from header files,
and/or ... hm, some aren't used at all.
'bitmap_dump' is unique to scrub/bitmap.o (function)
'bitmap_iterate' is unique to scrub/bitmap.o (function)
'do_error' is unique to scrub/common.o (function)
'display_rusage' is unique to scrub/xfs_scrub.o (global variable)
'is_service' is unique to scrub/xfs_scrub.o (global variable)
'progname' is unique to scrub/xfs_scrub.o (global variable)
'scrub_data' is unique to scrub/xfs_scrub.o (global variable)
'xfs_check_rmap_ioerr' is unique to scrub/phase6.o (function)
bitmap_dump (and so bitmap_iterate) are unused
do_error is unused as well?
-Eric
* [PATCH 28/27] xfs_scrub: wire up repair ioctl
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (28 preceding siblings ...)
2018-01-12 4:17 ` [PATCH v11 00/27] xfsprogs: online scrub/repair support Eric Sandeen
@ 2018-01-16 19:21 ` Darrick J. Wong
2018-01-16 19:21 ` [PATCH 29/27] xfs_scrub: schedule and manage repairs to the filesystem Darrick J. Wong
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-16 19:21 UTC (permalink / raw)
To: sandeen; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Create the mechanism we need to actually call the kernel's online repair
functionality. The interface will consume a repair description; the
descriptor management will follow in the next patch.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
scrub/common.c | 51 +++++++++++++++++++
scrub/common.h | 2 +
scrub/phase1.c | 15 ++++++
scrub/phase2.c | 1
scrub/phase3.c | 1
scrub/phase5.c | 1
scrub/scrub.c | 139 +++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/scrub.h | 20 ++++++++
scrub/xfs_scrub.h | 2 +
9 files changed, 232 insertions(+)
diff --git a/scrub/common.c b/scrub/common.c
index 48ee01c..bd3e939 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -180,6 +180,57 @@ __str_info(
pthread_mutex_unlock(&ctx->lock);
}
+/* Increment the repair count. */
+void
+__record_repair(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ const char *file,
+ int line,
+ const char *format,
+ ...)
+{
+ va_list args;
+
+ pthread_mutex_lock(&ctx->lock);
+ fprintf(stderr, _("Repaired: %s: "), descr);
+ va_start(args, format);
+ vfprintf(stderr, format, args);
+ va_end(args);
+ if (debug)
+ fprintf(stderr, _(" (%s line %d)"), file, line);
+ fprintf(stderr, "\n");
+ ctx->repairs++;
+ pthread_mutex_unlock(&ctx->lock);
+}
+
+/* Increment the optimization (preening) count. */
+void
+__record_preen(
+ struct scrub_ctx *ctx,
+ const char *descr,
+ const char *file,
+ int line,
+ const char *format,
+ ...)
+{
+ va_list args;
+
+ pthread_mutex_lock(&ctx->lock);
+ if (debug || verbose) {
+ fprintf(stdout, _("Optimized: %s: "), descr);
+ va_start(args, format);
+ vfprintf(stdout, format, args);
+ va_end(args);
+ if (debug)
+ fprintf(stdout, _(" (%s line %d)"), file, line);
+ fprintf(stdout, "\n");
+ fflush(stdout);
+ }
+ ctx->preens++;
+ pthread_mutex_unlock(&ctx->lock);
+}
+
/* Catch fatal errors from pieces we import from xfs_repair. */
void __attribute__((noreturn))
do_error(char const *msg, ...)
diff --git a/scrub/common.h b/scrub/common.h
index bd67a17..ea3bc3f 100644
--- a/scrub/common.h
+++ b/scrub/common.h
@@ -49,6 +49,8 @@ void __str_errno_warn(struct scrub_ctx *, const char *descr, const char *file,
#define str_warn(ctx, str, ...) __str_warn(ctx, str, __FILE__, __LINE__, __VA_ARGS__)
#define str_info(ctx, str, ...) __str_info(ctx, str, __FILE__, __LINE__, __VA_ARGS__)
#define str_errno_warn(ctx, str) __str_errno_warn(ctx, str, __FILE__, __LINE__)
+#define record_repair(ctx, str, ...) __record_repair(ctx, str, __FILE__, __LINE__, __VA_ARGS__)
+#define record_preen(ctx, str, ...) __record_preen(ctx, str, __FILE__, __LINE__, __VA_ARGS__)
#define dbg_printf(fmt, ...) {if (debug > 1) {printf(fmt, __VA_ARGS__);}}
/* Is this debug tweak enabled? */
diff --git a/scrub/phase1.c b/scrub/phase1.c
index d7a321f..3a2fbd7 100644
--- a/scrub/phase1.c
+++ b/scrub/phase1.c
@@ -176,6 +176,21 @@ _("Does not appear to be an XFS filesystem!"));
!xfs_can_scrub_parent(ctx))
return false;
+ /* Do we have kernel-assisted metadata repair? */
+ if (ctx->mode != SCRUB_MODE_DRY_RUN && !xfs_can_repair(ctx)) {
+ if (ctx->mode == SCRUB_MODE_PREEN) {
+ /* w/o repair, demote preen to dry run. */
+ if (debug || verbose)
+ str_info(ctx, ctx->mntpoint,
+_("Metadata repairing not supported; demoting to scan mode.")
+ );
+ ctx->mode = SCRUB_MODE_DRY_RUN;
+ } else {
+ /* Repair mode w/o repair; abort. */
+ return false;
+ }
+ }
+
/* Go find the XFS devices if we have a usable fsmap. */
fs_table_initialise(0, NULL, 0, NULL);
errno = 0;
diff --git a/scrub/phase2.c b/scrub/phase2.c
index e8eb1ca..32e2752 100644
--- a/scrub/phase2.c
+++ b/scrub/phase2.c
@@ -24,6 +24,7 @@
#include <sys/stat.h>
#include <sys/statvfs.h>
#include "xfs.h"
+#include "list.h"
#include "path.h"
#include "workqueue.h"
#include "xfs_scrub.h"
diff --git a/scrub/phase3.c b/scrub/phase3.c
index 43697c6..f4117b0 100644
--- a/scrub/phase3.c
+++ b/scrub/phase3.c
@@ -24,6 +24,7 @@
#include <sys/stat.h>
#include <sys/statvfs.h>
#include "xfs.h"
+#include "list.h"
#include "path.h"
#include "workqueue.h"
#include "xfs_scrub.h"
diff --git a/scrub/phase5.c b/scrub/phase5.c
index fc3308b..703b279 100644
--- a/scrub/phase5.c
+++ b/scrub/phase5.c
@@ -29,6 +29,7 @@
#endif
#include "xfs.h"
#include "handle.h"
+#include "list.h"
#include "path.h"
#include "workqueue.h"
#include "xfs_scrub.h"
diff --git a/scrub/scrub.c b/scrub/scrub.c
index bc4eab4..5729b9b 100644
--- a/scrub/scrub.c
+++ b/scrub/scrub.c
@@ -28,6 +28,7 @@
#include <sys/statvfs.h>
#include "xfs.h"
#include "xfs_fs.h"
+#include "list.h"
#include "path.h"
#include "xfs_scrub.h"
#include "common.h"
@@ -561,10 +562,20 @@ __xfs_scrub_test(
bool repair)
{
struct xfs_scrub_metadata meta = {0};
+ struct xfs_error_injection inject;
+ static bool injected;
int error;
if (debug_tweak_on("XFS_SCRUB_NO_KERNEL"))
return false;
+ if (debug_tweak_on("XFS_SCRUB_FORCE_REPAIR") && !injected) {
+ inject.fd = ctx->mnt_fd;
+ inject.errtag = XFS_ERRTAG_FORCE_SCRUB_REPAIR;
+ error = ioctl(ctx->mnt_fd,
+ XFS_IOC_ERROR_INJECTION, &inject);
+ if (error == 0)
+ injected = true;
+ }
meta.sm_type = type;
if (repair)
@@ -646,3 +657,131 @@ xfs_can_scrub_parent(
{
return __xfs_scrub_test(ctx, XFS_SCRUB_TYPE_PARENT, false);
}
+
+bool
+xfs_can_repair(
+ struct scrub_ctx *ctx)
+{
+ return __xfs_scrub_test(ctx, XFS_SCRUB_TYPE_PROBE, true);
+}
+
+/* General repair routines. */
+
+/* Repair some metadata. */
+enum check_outcome
+xfs_repair_metadata(
+ struct scrub_ctx *ctx,
+ int fd,
+ struct repair_item *ri,
+ unsigned int repair_flags)
+{
+ char buf[DESCR_BUFSZ];
+ struct xfs_scrub_metadata meta = { 0 };
+ struct xfs_scrub_metadata oldm;
+ int error;
+
+ assert(ri->type < XFS_SCRUB_TYPE_NR);
+ assert(!debug_tweak_on("XFS_SCRUB_NO_KERNEL"));
+ meta.sm_type = ri->type;
+ meta.sm_flags = ri->flags | XFS_SCRUB_IFLAG_REPAIR;
+ switch (scrubbers[ri->type].type) {
+ case ST_AGHEADER:
+ case ST_PERAG:
+ meta.sm_agno = ri->agno;
+ break;
+ case ST_INODE:
+ meta.sm_ino = ri->ino;
+ meta.sm_gen = ri->gen;
+ break;
+ default:
+ break;
+ }
+
+ /*
+ * If this is a preen operation but we're only repairing
+ * critical items, defer the preening until later.
+ */
+ if (!needs_repair(&meta) && (repair_flags & XRM_REPAIR_ONLY))
+ return CHECK_RETRY;
+
+ memcpy(&oldm, &meta, sizeof(oldm));
+ format_scrub_descr(buf, DESCR_BUFSZ, &meta, &scrubbers[meta.sm_type]);
+
+ if (needs_repair(&meta))
+ str_info(ctx, buf, _("Attempting repair."));
+ else if (debug || verbose)
+ str_info(ctx, buf, _("Attempting optimization."));
+
+ error = ioctl(fd, XFS_IOC_SCRUB_METADATA, &meta);
+ /*
+ * If the caller doesn't want us to complain, tell the caller to
+ * requeue the repair for later and don't say a thing.
+ */
+ if (!(repair_flags & XRM_NOFIX_COMPLAIN) &&
+ (error || needs_repair(&meta)))
+ return CHECK_RETRY;
+ if (error) {
+ switch (errno) {
+ case EDEADLOCK:
+ case EBUSY:
+ /* Filesystem is busy, try again later. */
+ if (debug || verbose)
+ str_info(ctx, buf,
+_("Filesystem is busy, deferring repair."));
+ return CHECK_RETRY;
+ case ESHUTDOWN:
+ /* Filesystem is already shut down, abort. */
+ str_error(ctx, buf,
+_("Filesystem is shut down, aborting."));
+ return CHECK_ABORT;
+ case ENOTTY:
+ case EOPNOTSUPP:
+ /*
+ * If we forced repairs, don't complain if kernel
+ * doesn't know how to fix.
+ */
+ if (debug_tweak_on("XFS_SCRUB_FORCE_REPAIR"))
+ return CHECK_DONE;
+ /* fall through */
+ case EINVAL:
+ /* Kernel doesn't know how to repair this? */
+ str_error(ctx, buf,
+_("Don't know how to fix; offline repair required."));
+ return CHECK_DONE;
+ case EROFS:
+ /* Read-only filesystem, can't fix. */
+ if (verbose || debug || needs_repair(&oldm))
+ str_info(ctx, buf,
+_("Read-only filesystem; cannot make changes."));
+ return CHECK_DONE;
+ case ENOENT:
+ /* Metadata not present, just skip it. */
+ return CHECK_DONE;
+ case ENOMEM:
+ case ENOSPC:
+ /* Don't care if preen fails due to low resources. */
+ if (is_unoptimized(&oldm) && !needs_repair(&oldm))
+ return CHECK_DONE;
+ /* fall through */
+ default:
+ /* Operational error. */
+ str_errno(ctx, buf);
+ return CHECK_DONE;
+ }
+ }
+ if (repair_flags & XRM_NOFIX_COMPLAIN)
+ xfs_scrub_warn_incomplete_scrub(ctx, buf, &meta);
+ if (needs_repair(&meta)) {
+ /* Still broken, try again or fix offline. */
+ if (repair_flags & XRM_NOFIX_COMPLAIN)
+ str_error(ctx, buf,
+_("Repair unsuccessful; offline repair required."));
+ } else {
+ /* Clean operation, no corruption detected. */
+ if (needs_repair(&oldm))
+ record_repair(ctx, buf, _("Repairs successful."));
+ else
+ record_preen(ctx, buf, _("Optimization successful."));
+ }
+ return CHECK_DONE;
+}
diff --git a/scrub/scrub.h b/scrub/scrub.h
index 0b454df..1c44fba 100644
--- a/scrub/scrub.h
+++ b/scrub/scrub.h
@@ -41,6 +41,7 @@ bool xfs_can_scrub_dir(struct scrub_ctx *ctx);
bool xfs_can_scrub_attr(struct scrub_ctx *ctx);
bool xfs_can_scrub_symlink(struct scrub_ctx *ctx);
bool xfs_can_scrub_parent(struct scrub_ctx *ctx);
+bool xfs_can_repair(struct scrub_ctx *ctx);
bool xfs_scrub_inode_fields(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
int fd);
@@ -59,4 +60,23 @@ bool xfs_scrub_symlink(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
bool xfs_scrub_parent(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
int fd);
+/* Repair parameters are the scrub inputs. */
+struct repair_item {
+ struct list_head list;
+ __u64 ino;
+ __u32 type;
+ __u32 flags;
+ __u32 gen;
+ __u32 agno;
+};
+
+/* Only perform repairs; leave optimization-only actions for later. */
+#define XRM_REPAIR_ONLY (1U << 0)
+
+/* Complain if still broken even after fix. */
+#define XRM_NOFIX_COMPLAIN (1U << 1)
+
+enum check_outcome xfs_repair_metadata(struct scrub_ctx *ctx, int fd,
+ struct repair_item *ri, unsigned int repair_flags);
+
#endif /* XFS_SCRUB_SCRUB_H_ */
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index 9b5e490..83b8ae2 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -97,6 +97,8 @@ struct scrub_ctx {
unsigned long long inodes_checked;
unsigned long long bytes_checked;
unsigned long long naming_warnings;
+ unsigned long long repairs;
+ unsigned long long preens;
bool need_repair;
bool preen_triggers[XFS_SCRUB_TYPE_NR];
};
* [PATCH 29/27] xfs_scrub: schedule and manage repairs to the filesystem
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
` (29 preceding siblings ...)
2018-01-16 19:21 ` [PATCH 28/27] xfs_scrub: wire up repair ioctl Darrick J. Wong
@ 2018-01-16 19:21 ` Darrick J. Wong
30 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-16 19:21 UTC (permalink / raw)
To: sandeen; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Teach xfs_scrub to remember scrub requests that failed (or indicated
that optimization is a possibility) as repair requests that can be
deferred until later. Add a new repair phase that deals with the
repair requests.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
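The phase 4 worker in this patch keeps re-running a per-AG repair list
until a pass stops shrinking it, then makes one last pass with complaints
enabled. A rough, self-contained model of that "repeat until no
progress" idea is below. Everything named sketch_* is hypothetical; in
particular the dep and fixable fields are an invented stand-in for the
kernel deciding whether a given repair can succeed yet, and the final
complaining pass is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical repair item; not the patch's struct repair_item. */
struct sketch_item {
	bool	pending;	/* still on the repair list */
	bool	fixable;	/* false models damage needing offline repair */
	int	dep;		/* index of an item to fix first, or -1 */
};

/* Count the items that are still waiting for a successful repair. */
size_t
sketch_pending(const struct sketch_item *items, size_t nr)
{
	size_t	i, count = 0;

	for (i = 0; i < nr; i++)
		if (items[i].pending)
			count++;
	return count;
}

/* One pass: try every pending item once, in list order. */
void
sketch_repair_pass(struct sketch_item *items, size_t nr)
{
	size_t	i;

	for (i = 0; i < nr; i++) {
		struct sketch_item	*it = &items[i];

		assert(it->dep < (int)nr);
		if (!it->pending || !it->fixable)
			continue;
		/* A repair fails while the thing it depends on is broken. */
		if (it->dep >= 0 && items[it->dep].pending)
			continue;
		it->pending = false;
	}
}

/* Repeat passes until one makes no progress; return what is left. */
size_t
sketch_repair_until_stuck(struct sketch_item *items, size_t nr)
{
	size_t	unfixed = sketch_pending(items, nr);
	size_t	new_unfixed;

	do {
		sketch_repair_pass(items, nr);
		new_unfixed = sketch_pending(items, nr);
		if (new_unfixed == unfixed)
			break;
		unfixed = new_unfixed;
	} while (unfixed > 0);

	return unfixed;
}
```

Stopping on "no progress" rather than on a fixed retry count means an
item that only becomes repairable after one of its neighbours is fixed
still gets its chance, while a list of hopeless items costs one extra
pass at most.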
man/man8/xfs_scrub.8 | 27 ++++-
scrub/Makefile | 2
scrub/phase1.c | 7 +
scrub/phase2.c | 59 +++++++++-
scrub/phase3.c | 42 +++++--
scrub/phase4.c | 76 ++++++++++++-
scrub/repair.c | 299 ++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/repair.h | 55 +++++++++
scrub/scrub.c | 107 +++++++++++++-----
scrub/scrub.h | 32 +++--
scrub/xfs_scrub.c | 22 ++++
scrub/xfs_scrub.h | 1
12 files changed, 667 insertions(+), 62 deletions(-)
create mode 100644 scrub/repair.c
create mode 100644 scrub/repair.h
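
The ordering that repair.c applies before walking a list (severity band
first, then position in the dependency chain) can be sketched like this.
Treat it as an illustration under assumptions: the SKETCH_* flag values
are made up rather than the real XFS_SCRUB_OFLAG_* bits, the raw type
number is used as the within-band order where the patch groups some
types together, anything without a corruption flag is treated as a preen
instead of aborting, and qsort() replaces the stable list_sort() used in
the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical severity flags; values are illustrative only. */
#define SKETCH_CORRUPT	(1U << 0)	/* fields are plainly bad */
#define SKETCH_XCORRUPT	(1U << 1)	/* cross-reference mismatch */
#define SKETCH_XFAIL	(1U << 2)	/* cross-referencing itself failed */
#define SKETCH_PREEN	(1U << 3)	/* optimization only */

/* Hypothetical repair descriptor. */
struct sketch_repair {
	unsigned int	type;	/* lower = earlier in the dependency chain */
	unsigned int	flags;
};

/* Lower return value means "repair sooner". */
int
sketch_priority(const struct sketch_repair *r)
{
	assert(r->type < 100);	/* keep each severity band disjoint */

	if (r->flags & SKETCH_CORRUPT)
		return r->type;
	if (r->flags & SKETCH_XCORRUPT)
		return 100 + r->type;
	if (r->flags & SKETCH_XFAIL)
		return 200 + r->type;
	return 300 + r->type;
}

/* qsort() comparator built on the priority value. */
int
sketch_compare(const void *a, const void *b)
{
	const struct sketch_repair	*ra = a;
	const struct sketch_repair	*rb = b;

	return sketch_priority(ra) - sketch_priority(rb);
}

/* Sort an array of descriptors into repair order. */
void
sketch_sort_repairs(struct sketch_repair *v, size_t nr)
{
	qsort(v, nr, sizeof(*v), sketch_compare);
}
```

Because the band offset is larger than any type number, a plainly
corrupt btree always sorts ahead of a header that merely failed a
cross-reference check, which is the behaviour the comment in repair.c
describes.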
diff --git a/man/man8/xfs_scrub.8 b/man/man8/xfs_scrub.8
index 4c394a5..ce5d876 100644
--- a/man/man8/xfs_scrub.8
+++ b/man/man8/xfs_scrub.8
@@ -114,9 +114,27 @@ Instructing the underlying storage to discard unused extents via the
.B FITRIM
ioctl.
.SH REPAIRS
-This program currently does not support making any repairs.
-Corruptions can only be fixed by unmounting the filesystem and running
-.BR xfs_repair (8).
+Repairs are performed by calling into the kernel.
+This limits the scope of repair activities to rebuilding primary data
+structures from secondary data structures, or secondary structures from
+primary structures.
+The existence of secondary data structures may require features that can
+only be turned on from
+.BR mkfs.xfs (8).
+If errors cannot be repaired, the filesystem must be
+unmounted and
+.BR xfs_repair (8)
+run.
+Repairs supported by the kernel include, but are not limited to:
+.IP \[bu] 2
+Reconstructing extent allocation data from the reverse mapping data.
+.IP \[bu]
+Reconstructing reverse mapping data from primary extent allocation data.
+.IP \[bu]
+Scheduling a quotacheck for the next mount.
+.PP
+If corrupt metadata is successfully repaired, this program will log that
+the repair succeeded instead of reporting the corruption.
.SH EXIT CODE
The exit code returned by
.B xfs_scrub
@@ -140,8 +158,5 @@ This program takes advantage of in-kernel scrubbing to verify a given
data structure with locks held and can keep the filesystem busy for a
long time.
The kernel must be new enough to support the SCRUB_METADATA ioctl.
-.PP
-If errors are found and cannot be repaired, the filesystem must be
-unmounted and repaired.
.SH SEE ALSO
.BR xfs_repair (8).
diff --git a/scrub/Makefile b/scrub/Makefile
index 597b2eb..7cdada2 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -37,6 +37,7 @@ fscounters.h \
inodes.h \
progress.h \
read_verify.h \
+repair.h \
scrub.h \
spacemap.h \
unicrash.h \
@@ -60,6 +61,7 @@ phase6.c \
phase7.c \
progress.c \
read_verify.c \
+repair.c \
scrub.c \
spacemap.c \
vfs.c \
diff --git a/scrub/phase1.c b/scrub/phase1.c
index 3a2fbd7..f7d01d1 100644
--- a/scrub/phase1.c
+++ b/scrub/phase1.c
@@ -47,6 +47,7 @@
#include "common.h"
#include "disk.h"
#include "scrub.h"
+#include "repair.h"
/* Phase 1: Find filesystem geometry (and clean up after) */
@@ -68,6 +69,7 @@ bool
xfs_cleanup_fs(
struct scrub_ctx *ctx)
{
+ xfs_repair_lists_free(&ctx->repair_lists);
if (ctx->fshandle)
free_handle(ctx->fshandle, ctx->fshandle_len);
if (ctx->rtdev)
@@ -157,6 +159,11 @@ _("Does not appear to be an XFS filesystem!"));
return false;
}
+ if (!xfs_repair_lists_alloc(ctx->geo.agcount, &ctx->repair_lists)) {
+ str_error(ctx, ctx->mntpoint, _("Not enough memory."));
+ return false;
+ }
+
ctx->agblklog = log2_roundup(ctx->geo.agblocks);
ctx->blocklog = highbit32(ctx->geo.blocksize);
ctx->inodelog = highbit32(ctx->geo.inodesize);
diff --git a/scrub/phase2.c b/scrub/phase2.c
index 32e2752..5669f0a 100644
--- a/scrub/phase2.c
+++ b/scrub/phase2.c
@@ -30,6 +30,7 @@
#include "xfs_scrub.h"
#include "common.h"
#include "scrub.h"
+#include "repair.h"
/* Phase 2: Check internal metadata. */
@@ -42,24 +43,65 @@ xfs_scan_ag_metadata(
{
struct scrub_ctx *ctx = (struct scrub_ctx *)wq->wq_ctx;
bool *pmoveon = arg;
+ struct xfs_repair_list repairs;
+ struct xfs_repair_list repair_now;
+ unsigned long long broken_primaries;
+ unsigned long long broken_secondaries;
bool moveon;
char descr[DESCR_BUFSZ];
+ xfs_repair_list_init(&repairs);
+ xfs_repair_list_init(&repair_now);
snprintf(descr, DESCR_BUFSZ, _("AG %u"), agno);
/*
* First we scrub and fix the AG headers, because we need
* them to work well enough to check the AG btrees.
*/
- moveon = xfs_scrub_ag_headers(ctx, agno);
+ moveon = xfs_scrub_ag_headers(ctx, agno, &repairs);
+ if (!moveon)
+ goto err;
+
+ /* Repair header damage. */
+ moveon = xfs_quick_repair(ctx, agno, &repairs);
if (!moveon)
goto err;
/* Now scrub the AG btrees. */
- moveon = xfs_scrub_ag_metadata(ctx, agno);
+ moveon = xfs_scrub_ag_metadata(ctx, agno, &repairs);
+ if (!moveon)
+ goto err;
+
+ /*
+ * Figure out if we need to perform early fixing. The only
+ * reason we need to do this is if the inobt is broken, which
+ * prevents phase 3 (inode scan) from running. We can rebuild
+ * the inobt from rmapbt data, but if the rmapbt is broken even
+ * at this early phase then we are sunk.
+ */
+ broken_secondaries = 0;
+ broken_primaries = 0;
+ xfs_repair_find_mustfix(&repairs, &repair_now,
+ &broken_primaries, &broken_secondaries);
+ if (broken_secondaries && !debug_tweak_on("XFS_SCRUB_FORCE_REPAIR")) {
+ if (broken_primaries)
+ str_info(ctx, descr,
+_("Corrupt primary and secondary block mapping metadata."));
+ else
+ str_info(ctx, descr,
+_("Corrupt secondary block mapping metadata."));
+ str_info(ctx, descr,
+_("Filesystem might not be repairable."));
+ }
+
+ /* Repair (inode) btree damage. */
+ moveon = xfs_quick_repair(ctx, agno, &repair_now);
if (!moveon)
goto err;
+ /* Everything else gets fixed during phase 4. */
+ xfs_defer_repairs(ctx, agno, &repairs);
+
return;
err:
*pmoveon = false;
@@ -74,11 +116,15 @@ xfs_scan_fs_metadata(
{
struct scrub_ctx *ctx = (struct scrub_ctx *)wq->wq_ctx;
bool *pmoveon = arg;
+ struct xfs_repair_list repairs;
bool moveon;
- moveon = xfs_scrub_fs_metadata(ctx);
+ xfs_repair_list_init(&repairs);
+ moveon = xfs_scrub_fs_metadata(ctx, &repairs);
if (!moveon)
*pmoveon = false;
+
+ xfs_defer_repairs(ctx, agno, &repairs);
}
/* Scan all filesystem metadata. */
@@ -86,6 +132,7 @@ bool
xfs_scan_metadata(
struct scrub_ctx *ctx)
{
+ struct xfs_repair_list repairs;
struct workqueue wq;
xfs_agnumber_t agno;
bool moveon = true;
@@ -103,7 +150,11 @@ xfs_scan_metadata(
* upgrades (followed by a full scrub), do that before we launch
* anything else.
*/
- moveon = xfs_scrub_primary_super(ctx);
+ xfs_repair_list_init(&repairs);
+ moveon = xfs_scrub_primary_super(ctx, &repairs);
+ if (!moveon)
+ return moveon;
+ moveon = xfs_quick_repair(ctx, 0, &repairs);
if (!moveon)
return moveon;
diff --git a/scrub/phase3.c b/scrub/phase3.c
index f4117b0..7fb0120 100644
--- a/scrub/phase3.c
+++ b/scrub/phase3.c
@@ -33,6 +33,7 @@
#include "inodes.h"
#include "progress.h"
#include "scrub.h"
+#include "repair.h"
/* Phase 3: Scan all inodes. */
@@ -45,10 +46,11 @@ static bool
xfs_scrub_fd(
struct scrub_ctx *ctx,
bool (*fn)(struct scrub_ctx *, uint64_t,
- uint32_t, int),
- struct xfs_bstat *bs)
+ uint32_t, int, struct xfs_repair_list *),
+ struct xfs_bstat *bs,
+ struct xfs_repair_list *rl)
{
- return fn(ctx, bs->bs_ino, bs->bs_gen, ctx->mnt_fd);
+ return fn(ctx, bs->bs_ino, bs->bs_gen, ctx->mnt_fd, rl);
}
struct scrub_inode_ctx {
@@ -64,11 +66,15 @@ xfs_scrub_inode(
struct xfs_bstat *bstat,
void *arg)
{
+ struct xfs_repair_list repairs;
struct scrub_inode_ctx *ictx = arg;
struct ptcounter *icount = ictx->icount;
+ xfs_agnumber_t agno;
bool moveon = true;
int fd = -1;
+ xfs_repair_list_init(&repairs);
+ agno = bstat->bs_ino / (1ULL << (ctx->inopblog + ctx->agblklog));
background_sleep();
/* Try to open the inode to pin it. */
@@ -80,45 +86,59 @@ xfs_scrub_inode(
}
/* Scrub the inode. */
- moveon = xfs_scrub_fd(ctx, xfs_scrub_inode_fields, bstat);
+ moveon = xfs_scrub_fd(ctx, xfs_scrub_inode_fields, bstat, &repairs);
+ if (!moveon)
+ goto out;
+
+ moveon = xfs_quick_repair(ctx, agno, &repairs);
if (!moveon)
goto out;
/* Scrub all block mappings. */
- moveon = xfs_scrub_fd(ctx, xfs_scrub_data_fork, bstat);
+ moveon = xfs_scrub_fd(ctx, xfs_scrub_data_fork, bstat, &repairs);
if (!moveon)
goto out;
- moveon = xfs_scrub_fd(ctx, xfs_scrub_attr_fork, bstat);
+ moveon = xfs_scrub_fd(ctx, xfs_scrub_attr_fork, bstat, &repairs);
if (!moveon)
goto out;
- moveon = xfs_scrub_fd(ctx, xfs_scrub_cow_fork, bstat);
+ moveon = xfs_scrub_fd(ctx, xfs_scrub_cow_fork, bstat, &repairs);
+ if (!moveon)
+ goto out;
+
+ moveon = xfs_quick_repair(ctx, agno, &repairs);
if (!moveon)
goto out;
if (S_ISLNK(bstat->bs_mode)) {
/* Check symlink contents. */
moveon = xfs_scrub_symlink(ctx, bstat->bs_ino,
- bstat->bs_gen, ctx->mnt_fd);
+ bstat->bs_gen, ctx->mnt_fd, &repairs);
} else if (S_ISDIR(bstat->bs_mode)) {
/* Check the directory entries. */
- moveon = xfs_scrub_fd(ctx, xfs_scrub_dir, bstat);
+ moveon = xfs_scrub_fd(ctx, xfs_scrub_dir, bstat, &repairs);
}
if (!moveon)
goto out;
/* Check all the extended attributes. */
- moveon = xfs_scrub_fd(ctx, xfs_scrub_attr, bstat);
+ moveon = xfs_scrub_fd(ctx, xfs_scrub_attr, bstat, &repairs);
if (!moveon)
goto out;
/* Check parent pointers. */
- moveon = xfs_scrub_fd(ctx, xfs_scrub_parent, bstat);
+ moveon = xfs_scrub_fd(ctx, xfs_scrub_parent, bstat, &repairs);
+ if (!moveon)
+ goto out;
+
+ /* Try to repair the file while it's open. */
+ moveon = xfs_quick_repair(ctx, agno, &repairs);
if (!moveon)
goto out;
out:
ptcounter_add(icount, 1);
progress_add(1);
+ xfs_defer_repairs(ctx, agno, &repairs);
if (fd >= 0)
close(fd);
if (!moveon)
diff --git a/scrub/phase4.c b/scrub/phase4.c
index 9c81069..b502238 100644
--- a/scrub/phase4.c
+++ b/scrub/phase4.c
@@ -33,16 +33,82 @@
#include "common.h"
#include "progress.h"
#include "scrub.h"
+#include "repair.h"
#include "vfs.h"
/* Phase 4: Repair filesystem. */
+/* Fix all the problems in our per-AG list. */
+static void
+xfs_repair_ag(
+ struct workqueue *wq,
+ xfs_agnumber_t agno,
+ void *priv)
+{
+ struct scrub_ctx *ctx = (struct scrub_ctx *)wq->wq_ctx;
+ bool *pmoveon = priv;
+ struct xfs_repair_list *repairs;
+ size_t unfixed;
+ size_t new_unfixed;
+ unsigned int flags = 0;
+ bool moveon;
+
+ repairs = &ctx->repair_lists[agno];
+ unfixed = xfs_repair_list_length(repairs);
+
+ /* Repair anything broken until we fail to make progress. */
+ do {
+ moveon = xfs_repair_list_now(ctx, ctx->mnt_fd, repairs, flags);
+ if (!moveon) {
+ *pmoveon = false;
+ return;
+ }
+ new_unfixed = xfs_repair_list_length(repairs);
+ if (new_unfixed == unfixed)
+ break;
+ unfixed = new_unfixed;
+ } while (unfixed > 0 && *pmoveon);
+
+ if (!*pmoveon)
+ return;
+
+ /* Try once more, but this time complain if we can't fix things. */
+ flags |= XRML_NOFIX_COMPLAIN;
+ moveon = xfs_repair_list_now(ctx, ctx->mnt_fd, repairs, flags);
+ if (!moveon)
+ *pmoveon = false;
+}
+
/* Fix everything that needs fixing. */
bool
xfs_repair_fs(
struct scrub_ctx *ctx)
{
+ struct workqueue wq;
+ xfs_agnumber_t agno;
bool moveon = true;
+ int ret;
+
+ ret = workqueue_create(&wq, (struct xfs_mount *)ctx,
+ scrub_nproc_workqueue(ctx));
+ if (ret) {
+ str_error(ctx, ctx->mntpoint, _("Could not create workqueue."));
+ return false;
+ }
+ for (agno = 0; agno < ctx->geo.agcount; agno++) {
+ if (xfs_repair_list_length(&ctx->repair_lists[agno]) > 0) {
+ ret = workqueue_add(&wq, xfs_repair_ag, agno, &moveon);
+ if (ret) {
+ moveon = false;
+ str_error(ctx, ctx->mntpoint,
+_("Could not queue repair work."));
+ break;
+ }
+ }
+ if (!moveon)
+ break;
+ }
+ workqueue_destroy(&wq);
pthread_mutex_lock(&ctx->lock);
if (moveon && ctx->errors_found == 0 && want_fstrim) {
@@ -62,8 +128,14 @@ xfs_estimate_repair_work(
unsigned int *nr_threads,
int *rshift)
{
- *items = 1;
- *nr_threads = 1;
+ xfs_agnumber_t agno;
+ size_t need_fixing = 0;
+
+ for (agno = 0; agno < ctx->geo.agcount; agno++)
+ need_fixing += xfs_repair_list_length(&ctx->repair_lists[agno]);
+ need_fixing++;
+ *items = need_fixing;
+ *nr_threads = scrub_nproc(ctx) + 1;
*rshift = 0;
return true;
}
diff --git a/scrub/repair.c b/scrub/repair.c
new file mode 100644
index 0000000..4a6d7b7
--- /dev/null
+++ b/scrub/repair.c
@@ -0,0 +1,299 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/statvfs.h>
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "list.h"
+#include "path.h"
+#include "xfs_scrub.h"
+#include "common.h"
+#include "scrub.h"
+#include "repair.h"
+
+/*
+ * Prioritize repair items in order of how long we can wait.
+ * 0 = do it now, 10000 = do it later.
+ *
+ * To minimize the amount of repair work, we want to prioritize metadata
+ * objects by perceived corruptness. If CORRUPT is set, the fields are
+ * just plain bad; try fixing that first. Otherwise if XCORRUPT is set,
+ * the fields could be bad, but the xref data could also be bad; we'll
+ * try fixing that next. Finally, if XFAIL is set, some other metadata
+ * structure failed validation during xref, so we'll recheck this
+ * metadata last since it was probably fine.
+ *
+ * For metadata that lie in the critical path of checking other metadata
+ * (superblock, AG{F,I,FL}, inobt) we scrub and fix those things before
+ * we even get to handling their dependencies, so things should progress
+ * in order.
+ */
+
+/* Sort repair items in severity order. */
+static int
+PRIO(
+ struct repair_item *ri,
+ int order)
+{
+ if (ri->flags & XFS_SCRUB_OFLAG_CORRUPT)
+ return order;
+ else if (ri->flags & XFS_SCRUB_OFLAG_XCORRUPT)
+ return 100 + order;
+ else if (ri->flags & XFS_SCRUB_OFLAG_XFAIL)
+ return 200 + order;
+ else if (ri->flags & XFS_SCRUB_OFLAG_PREEN)
+ return 300 + order;
+ abort();
+}
+
+/* Sort the repair items in dependency order. */
+static int
+xfs_repair_item_priority(
+ struct repair_item *ri)
+{
+ switch (ri->type) {
+ case XFS_SCRUB_TYPE_SB:
+ case XFS_SCRUB_TYPE_AGF:
+ case XFS_SCRUB_TYPE_AGFL:
+ case XFS_SCRUB_TYPE_AGI:
+ case XFS_SCRUB_TYPE_BNOBT:
+ case XFS_SCRUB_TYPE_CNTBT:
+ case XFS_SCRUB_TYPE_INOBT:
+ case XFS_SCRUB_TYPE_FINOBT:
+ case XFS_SCRUB_TYPE_REFCNTBT:
+ case XFS_SCRUB_TYPE_RMAPBT:
+ case XFS_SCRUB_TYPE_INODE:
+ case XFS_SCRUB_TYPE_BMBTD:
+ case XFS_SCRUB_TYPE_BMBTA:
+ case XFS_SCRUB_TYPE_BMBTC:
+ return PRIO(ri, ri->type - 1);
+ case XFS_SCRUB_TYPE_DIR:
+ case XFS_SCRUB_TYPE_XATTR:
+ case XFS_SCRUB_TYPE_SYMLINK:
+ case XFS_SCRUB_TYPE_PARENT:
+ return PRIO(ri, XFS_SCRUB_TYPE_DIR);
+ case XFS_SCRUB_TYPE_RTBITMAP:
+ case XFS_SCRUB_TYPE_RTSUM:
+ return PRIO(ri, XFS_SCRUB_TYPE_RTBITMAP);
+ case XFS_SCRUB_TYPE_UQUOTA:
+ case XFS_SCRUB_TYPE_GQUOTA:
+ case XFS_SCRUB_TYPE_PQUOTA:
+ return PRIO(ri, XFS_SCRUB_TYPE_UQUOTA);
+ }
+ abort();
+}
+
+/* Order repair items so that lower priority values are repaired first. */
+static int
+xfs_repair_item_compare(
+ void *priv,
+ struct list_head *a,
+ struct list_head *b)
+{
+ struct repair_item *ra;
+ struct repair_item *rb;
+
+ ra = container_of(a, struct repair_item, list);
+ rb = container_of(b, struct repair_item, list);
+
+ return xfs_repair_item_priority(ra) - xfs_repair_item_priority(rb);
+}
+
+/*
+ * Figure out which AG metadata must be fixed before we can move on
+ * to the inode scan.
+ */
+void
+xfs_repair_find_mustfix(
+ struct xfs_repair_list *repairs,
+ struct xfs_repair_list *repair_now,
+ unsigned long long *broken_primaries,
+ unsigned long long *broken_secondaries)
+{
+ struct repair_item *n;
+ struct repair_item *ri;
+
+ list_for_each_entry_safe(ri, n, &repairs->list, list) {
+ switch (ri->type) {
+ case XFS_SCRUB_TYPE_RMAPBT:
+ (*broken_secondaries)++;
+ break;
+ case XFS_SCRUB_TYPE_FINOBT:
+ case XFS_SCRUB_TYPE_INOBT:
+ repairs->nr--;
+ list_del(&ri->list);
+ list_add_tail(&ri->list, &repair_now->list);
+ repair_now->nr++;
+ /* fall through */
+ case XFS_SCRUB_TYPE_BNOBT:
+ case XFS_SCRUB_TYPE_CNTBT:
+ case XFS_SCRUB_TYPE_REFCNTBT:
+ (*broken_primaries)++;
+ break;
+ default:
+ abort();
+ break;
+ }
+ }
+}
+
+/* Allocate a certain number of repair lists for the scrub context. */
+bool
+xfs_repair_lists_alloc(
+ size_t nr,
+ struct xfs_repair_list **listsp)
+{
+ struct xfs_repair_list *lists;
+ xfs_agnumber_t agno;
+
+ lists = calloc(nr, sizeof(struct xfs_repair_list));
+ if (!lists)
+ return false;
+
+ for (agno = 0; agno < nr; agno++)
+ xfs_repair_list_init(&lists[agno]);
+ *listsp = lists;
+
+ return true;
+}
+
+/* Free the repair lists. */
+void
+xfs_repair_lists_free(
+ struct xfs_repair_list **listsp)
+{
+ free(*listsp);
+ *listsp = NULL;
+}
+
+/* Initialize repair list */
+void
+xfs_repair_list_init(
+ struct xfs_repair_list *rl)
+{
+ INIT_LIST_HEAD(&rl->list);
+ rl->nr = 0;
+ rl->sorted = false;
+}
+
+/* Number of repairs in this list. */
+size_t
+xfs_repair_list_length(
+ struct xfs_repair_list *rl)
+{
+ return rl->nr;
+}
+
+/* Add to the list of repairs. */
+void
+xfs_repair_list_add(
+ struct xfs_repair_list *rl,
+ struct repair_item *ri)
+{
+ list_add_tail(&ri->list, &rl->list);
+ rl->nr++;
+ rl->sorted = false;
+}
+
+/* Splice two repair lists. */
+void
+xfs_repair_list_splice(
+ struct xfs_repair_list *dest,
+ struct xfs_repair_list *src)
+{
+ if (src->nr == 0)
+ return;
+
+ list_splice_tail_init(&src->list, &dest->list);
+ dest->nr += src->nr;
+ src->nr = 0;
+ dest->sorted = false;
+}
+
+/* Repair everything on this list. */
+bool
+xfs_repair_list_now(
+ struct scrub_ctx *ctx,
+ int fd,
+ struct xfs_repair_list *rl,
+ unsigned int repair_flags)
+{
+ struct repair_item *ri;
+ struct repair_item *n;
+ enum check_outcome fix;
+
+ if (!rl->sorted) {
+ list_sort(NULL, &rl->list, xfs_repair_item_compare);
+ rl->sorted = true;
+ }
+
+ list_for_each_entry_safe(ri, n, &rl->list, list) {
+ fix = xfs_repair_metadata(ctx, fd, ri, repair_flags);
+ switch (fix) {
+ case CHECK_DONE:
+ rl->nr--;
+ list_del(&ri->list);
+ free(ri);
+ continue;
+ case CHECK_ABORT:
+ return false;
+ case CHECK_RETRY:
+ continue;
+ case CHECK_REPAIR:
+ abort();
+ }
+ }
+
+ return !xfs_scrub_excessive_errors(ctx);
+}
+
+/* Defer all the repairs until phase 4. */
+void
+xfs_defer_repairs(
+ struct scrub_ctx *ctx,
+ xfs_agnumber_t agno,
+ struct xfs_repair_list *rl)
+{
+ ASSERT(agno < ctx->geo.agcount);
+
+ xfs_repair_list_splice(&ctx->repair_lists[agno], rl);
+}
+
+/* Quickly try to repair AG metadata; broken things are remembered for later. */
+bool
+xfs_quick_repair(
+ struct scrub_ctx *ctx,
+ xfs_agnumber_t agno,
+ struct xfs_repair_list *rl)
+{
+ bool moveon;
+
+ moveon = xfs_repair_list_now(ctx, ctx->mnt_fd, rl, XRML_REPAIR_ONLY);
+ if (!moveon)
+ return moveon;
+
+ xfs_defer_repairs(ctx, agno, rl);
+ return true;
+}
diff --git a/scrub/repair.h b/scrub/repair.h
new file mode 100644
index 0000000..3ae15ef
--- /dev/null
+++ b/scrub/repair.h
@@ -0,0 +1,55 @@
+/*
+ * Copyright (C) 2018 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_REPAIR_H_
+#define XFS_SCRUB_REPAIR_H_
+
+struct xfs_repair_list {
+ struct list_head list;
+ size_t nr;
+ bool sorted;
+};
+
+bool xfs_repair_lists_alloc(size_t nr, struct xfs_repair_list **listsp);
+void xfs_repair_lists_free(struct xfs_repair_list **listsp);
+
+void xfs_repair_list_init(struct xfs_repair_list *rl);
+size_t xfs_repair_list_length(struct xfs_repair_list *rl);
+void xfs_repair_list_add(struct xfs_repair_list *dest,
+ struct repair_item *item);
+void xfs_repair_list_splice(struct xfs_repair_list *dest,
+ struct xfs_repair_list *src);
+
+void xfs_repair_find_mustfix(struct xfs_repair_list *repairs,
+ struct xfs_repair_list *repair_now,
+ unsigned long long *broken_primaries,
+ unsigned long long *broken_secondaries);
+
+/* Passed through to xfs_repair_metadata() */
+#define XRML_REPAIR_ONLY (XRM_REPAIR_ONLY)
+#define XRML_NOFIX_COMPLAIN (XRM_NOFIX_COMPLAIN)
+
+bool xfs_repair_list_now(struct scrub_ctx *ctx, int fd,
+ struct xfs_repair_list *repair_list, unsigned int repair_flags);
+void xfs_defer_repairs(struct scrub_ctx *ctx, xfs_agnumber_t agno,
+ struct xfs_repair_list *rl);
+bool xfs_quick_repair(struct scrub_ctx *ctx, xfs_agnumber_t agno,
+ struct xfs_repair_list *rl);
+
+#endif /* XFS_SCRUB_REPAIR_H_ */
diff --git a/scrub/scrub.c b/scrub/scrub.c
index 5729b9b..55e8b98 100644
--- a/scrub/scrub.c
+++ b/scrub/scrub.c
@@ -35,6 +35,7 @@
#include "progress.h"
#include "scrub.h"
#include "xfs_errortag.h"
+#include "repair.h"
/* Online scrub and repair wrappers. */
@@ -321,12 +322,47 @@ _("Optimizations of %s are possible."), scrubbers[i].name);
}
}
+/* Save a scrub context for later repairs. */
+bool
+xfs_scrub_save_repair(
+ struct scrub_ctx *ctx,
+ struct xfs_repair_list *rl,
+ struct xfs_scrub_metadata *meta)
+{
+ struct repair_item *ri;
+
+ /* Schedule this item for later repairs. */
+ ri = malloc(sizeof(struct repair_item));
+ if (!ri) {
+ str_errno(ctx, _("repair list"));
+ return false;
+ }
+ ri->type = meta->sm_type;
+ ri->flags = meta->sm_flags;
+ switch (scrubbers[meta->sm_type].type) {
+ case ST_AGHEADER:
+ case ST_PERAG:
+ ri->agno = meta->sm_agno;
+ break;
+ case ST_INODE:
+ ri->ino = meta->sm_ino;
+ ri->gen = meta->sm_gen;
+ break;
+ default:
+ break;
+ }
+
+ xfs_repair_list_add(rl, ri);
+ return true;
+}
+
/* Scrub metadata, saving corruption reports for later. */
static bool
xfs_scrub_metadata(
struct scrub_ctx *ctx,
enum scrub_type scrub_type,
- xfs_agnumber_t agno)
+ xfs_agnumber_t agno,
+ struct xfs_repair_list *rl)
{
struct xfs_scrub_metadata meta = {0};
const struct scrub_descr *sc;
@@ -350,6 +386,8 @@ xfs_scrub_metadata(
case CHECK_ABORT:
return false;
case CHECK_REPAIR:
+ if (!xfs_scrub_save_repair(ctx, rl, &meta))
+ return false;
/* fall through */
case CHECK_DONE:
continue;
@@ -369,7 +407,8 @@ xfs_scrub_metadata(
*/
bool
xfs_scrub_primary_super(
- struct scrub_ctx *ctx)
+ struct scrub_ctx *ctx,
+ struct xfs_repair_list *repair_list)
{
struct xfs_scrub_metadata meta = {
.sm_type = XFS_SCRUB_TYPE_SB,
@@ -382,6 +421,8 @@ xfs_scrub_primary_super(
case CHECK_ABORT:
return false;
case CHECK_REPAIR:
+ if (!xfs_scrub_save_repair(ctx, repair_list, &meta))
+ return false;
/* fall through */
case CHECK_DONE:
return true;
@@ -397,26 +438,29 @@ xfs_scrub_primary_super(
bool
xfs_scrub_ag_headers(
struct scrub_ctx *ctx,
- xfs_agnumber_t agno)
+ xfs_agnumber_t agno,
+ struct xfs_repair_list *rl)
{
- return xfs_scrub_metadata(ctx, ST_AGHEADER, agno);
+ return xfs_scrub_metadata(ctx, ST_AGHEADER, agno, rl);
}
/* Scrub each AG's metadata btrees. */
bool
xfs_scrub_ag_metadata(
struct scrub_ctx *ctx,
- xfs_agnumber_t agno)
+ xfs_agnumber_t agno,
+ struct xfs_repair_list *rl)
{
- return xfs_scrub_metadata(ctx, ST_PERAG, agno);
+ return xfs_scrub_metadata(ctx, ST_PERAG, agno, rl);
}
/* Scrub whole-FS metadata btrees. */
bool
xfs_scrub_fs_metadata(
- struct scrub_ctx *ctx)
+ struct scrub_ctx *ctx,
+ struct xfs_repair_list *rl)
{
- return xfs_scrub_metadata(ctx, ST_FS, 0);
+ return xfs_scrub_metadata(ctx, ST_FS, 0, rl);
}
/* How many items do we have to check? */
@@ -452,7 +496,8 @@ __xfs_scrub_file(
uint64_t ino,
uint32_t gen,
int fd,
- unsigned int type)
+ unsigned int type,
+ struct xfs_repair_list *rl)
{
struct xfs_scrub_metadata meta = {0};
enum check_outcome fix;
@@ -471,7 +516,7 @@ __xfs_scrub_file(
if (fix == CHECK_DONE)
return true;
- return true;
+ return xfs_scrub_save_repair(ctx, rl, &meta);
}
bool
@@ -479,9 +524,10 @@ xfs_scrub_inode_fields(
struct scrub_ctx *ctx,
uint64_t ino,
uint32_t gen,
- int fd)
+ int fd,
+ struct xfs_repair_list *rl)
{
- return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_INODE);
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_INODE, rl);
}
bool
@@ -489,9 +535,10 @@ xfs_scrub_data_fork(
struct scrub_ctx *ctx,
uint64_t ino,
uint32_t gen,
- int fd)
+ int fd,
+ struct xfs_repair_list *rl)
{
- return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_BMBTD);
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_BMBTD, rl);
}
bool
@@ -499,9 +546,10 @@ xfs_scrub_attr_fork(
struct scrub_ctx *ctx,
uint64_t ino,
uint32_t gen,
- int fd)
+ int fd,
+ struct xfs_repair_list *rl)
{
- return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_BMBTA);
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_BMBTA, rl);
}
bool
@@ -509,9 +557,10 @@ xfs_scrub_cow_fork(
struct scrub_ctx *ctx,
uint64_t ino,
uint32_t gen,
- int fd)
+ int fd,
+ struct xfs_repair_list *rl)
{
- return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_BMBTC);
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_BMBTC, rl);
}
bool
@@ -519,9 +568,10 @@ xfs_scrub_dir(
struct scrub_ctx *ctx,
uint64_t ino,
uint32_t gen,
- int fd)
+ int fd,
+ struct xfs_repair_list *rl)
{
- return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_DIR);
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_DIR, rl);
}
bool
@@ -529,9 +579,10 @@ xfs_scrub_attr(
struct scrub_ctx *ctx,
uint64_t ino,
uint32_t gen,
- int fd)
+ int fd,
+ struct xfs_repair_list *rl)
{
- return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_XATTR);
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_XATTR, rl);
}
bool
@@ -539,9 +590,10 @@ xfs_scrub_symlink(
struct scrub_ctx *ctx,
uint64_t ino,
uint32_t gen,
- int fd)
+ int fd,
+ struct xfs_repair_list *rl)
{
- return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_SYMLINK);
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_SYMLINK, rl);
}
bool
@@ -549,9 +601,10 @@ xfs_scrub_parent(
struct scrub_ctx *ctx,
uint64_t ino,
uint32_t gen,
- int fd)
+ int fd,
+ struct xfs_repair_list *rl)
{
- return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_PARENT);
+ return __xfs_scrub_file(ctx, ino, gen, fd, XFS_SCRUB_TYPE_PARENT, rl);
}
/* Test the availability of a kernel scrub command. */
@@ -773,7 +826,7 @@ _("Read-only filesystem; cannot make changes."));
xfs_scrub_warn_incomplete_scrub(ctx, buf, &meta);
if (needs_repair(&meta)) {
/* Still broken, try again or fix offline. */
- if (repair_flags & XRM_NOFIX_COMPLAIN)
+ if ((repair_flags & XRM_NOFIX_COMPLAIN) || debug)
str_error(ctx, buf,
_("Repair unsuccessful; offline repair required."));
} else {
diff --git a/scrub/scrub.h b/scrub/scrub.h
index 1c44fba..22ac89a 100644
--- a/scrub/scrub.h
+++ b/scrub/scrub.h
@@ -28,11 +28,19 @@ enum check_outcome {
CHECK_RETRY, /* repair failed, try again later */
};
+struct repair_item;
+
void xfs_scrub_report_preen_triggers(struct scrub_ctx *ctx);
-bool xfs_scrub_primary_super(struct scrub_ctx *ctx);
-bool xfs_scrub_ag_headers(struct scrub_ctx *ctx, xfs_agnumber_t agno);
-bool xfs_scrub_ag_metadata(struct scrub_ctx *ctx, xfs_agnumber_t agno);
-bool xfs_scrub_fs_metadata(struct scrub_ctx *ctx);
+bool xfs_scrub_primary_super(struct scrub_ctx *ctx,
+ struct xfs_repair_list *repair_list);
+bool xfs_scrub_ag_headers(struct scrub_ctx *ctx, xfs_agnumber_t agno,
+ struct xfs_repair_list *repair_list);
+bool xfs_scrub_ag_metadata(struct scrub_ctx *ctx, xfs_agnumber_t agno,
+ struct xfs_repair_list *repair_list);
+bool xfs_scrub_fs_metadata(struct scrub_ctx *ctx,
+ struct xfs_repair_list *repair_list);
+enum check_outcome xfs_repair_metadata(struct scrub_ctx *ctx, int fd,
+ struct repair_item *ri, unsigned int flags);
bool xfs_can_scrub_fs_metadata(struct scrub_ctx *ctx);
bool xfs_can_scrub_inode(struct scrub_ctx *ctx);
@@ -44,21 +52,21 @@ bool xfs_can_scrub_parent(struct scrub_ctx *ctx);
bool xfs_can_repair(struct scrub_ctx *ctx);
bool xfs_scrub_inode_fields(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
- int fd);
+ int fd, struct xfs_repair_list *repair_list);
bool xfs_scrub_data_fork(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
- int fd);
+ int fd, struct xfs_repair_list *repair_list);
bool xfs_scrub_attr_fork(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
- int fd);
+ int fd, struct xfs_repair_list *repair_list);
bool xfs_scrub_cow_fork(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
- int fd);
+ int fd, struct xfs_repair_list *repair_list);
bool xfs_scrub_dir(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
- int fd);
+ int fd, struct xfs_repair_list *repair_list);
bool xfs_scrub_attr(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
- int fd);
+ int fd, struct xfs_repair_list *repair_list);
bool xfs_scrub_symlink(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
- int fd);
+ int fd, struct xfs_repair_list *repair_list);
bool xfs_scrub_parent(struct scrub_ctx *ctx, uint64_t ino, uint32_t gen,
- int fd);
+ int fd, struct xfs_repair_list *repair_list);
/* Repair parameters are the scrub inputs and retry count. */
struct repair_item {
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index b5ce4c6..b9dd4d9 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -88,6 +88,15 @@
* the previous two phases are retried here; if there are uncorrectable
* errors, xfs_scrub stops here.
*
+ * To perform the actual repairs, we iterate all the items on the per-AG
+ * repair list and ask the kernel to repair them. Items which are
+ * successfully repaired are removed from the list. If an item is not
+ * repaired successfully (or the kernel asks us to try again), we retry
+ * the repairs until there is nothing left to fix or we fail to make
+ * forward progress. In that event, the unrepaired items are recorded
+ * as errors. If there are no errors at this point, we call FSTRIM on
+ * the filesystem.
+ *
* The next phase is the "check directory tree" phase. In this phase,
* every directory is opened (via file handle) to confirm that each
* directory is connected to the root. Directory entries are checked
@@ -707,6 +716,19 @@ _("%s: Not a XFS mount point or block device.\n"),
ret |= 8;
out:
+ if (ctx.repairs && ctx.preens)
+ fprintf(stdout,
+_("%s: %llu repairs and %llu optimizations made.\n"),
+ ctx.mntpoint, ctx.repairs, ctx.preens);
+ else if (ctx.repairs && ctx.preens == 0)
+ fprintf(stdout,
+_("%s: %llu repairs made.\n"),
+ ctx.mntpoint, ctx.repairs);
+ else if (ctx.repairs == 0 && ctx.preens)
+ fprintf(stdout,
+_("%s: %llu optimizations made.\n"),
+ ctx.mntpoint, ctx.preens);
+
total_errors = ctx.errors_found + ctx.runtime_errors;
if (ctx.need_repair)
repairstr = _(" Unmount and run xfs_repair.");
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index 83b8ae2..bd21642 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -90,6 +90,7 @@ struct scrub_ctx {
/* Mutable scrub state; use lock. */
pthread_mutex_t lock;
+ struct xfs_repair_list *repair_lists;
unsigned long long max_errors;
unsigned long long runtime_errors;
unsigned long long errors_found;
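
The retry behaviour described in the new xfs_scrub.c comment (repair
each item, drop the ones that succeed, and stop once a whole pass makes
no forward progress) can be sketched roughly as follows.  This is only
an illustration under stated assumptions: every sketch_ name is
hypothetical, a plain array stands in for the per-AG repair list, and
the attempts_needed field stands in for the kernel's answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome codes, loosely modelled on enum check_outcome. */
enum sketch_outcome { SKETCH_DONE, SKETCH_RETRY };

/* Hypothetical repair item; not the real struct repair_item. */
struct sketch_item {
	int	attempts_needed;	/* <= 0: can never be repaired */
	int	attempts;		/* repair calls made so far */
	bool	fixed;
};

/* Stand-in for asking the kernel to repair one item. */
static enum sketch_outcome
sketch_repair_one(struct sketch_item *it)
{
	it->attempts++;
	if (it->attempts_needed > 0 && it->attempts >= it->attempts_needed) {
		it->fixed = true;
		return SKETCH_DONE;
	}
	return SKETCH_RETRY;
}

/*
 * Walk the list repeatedly until everything is fixed or a full pass
 * fixes nothing.  Returns the number of items left unrepaired, which
 * the caller would then record as errors.
 */
static size_t
sketch_repair_list(struct sketch_item *items, size_t nr)
{
	size_t	unfixed = 0;
	size_t	fixed_this_pass;
	size_t	i;

	for (i = 0; i < nr; i++)
		if (!items[i].fixed)
			unfixed++;

	do {
		fixed_this_pass = 0;
		for (i = 0; i < nr; i++) {
			if (items[i].fixed)
				continue;
			if (sketch_repair_one(&items[i]) == SKETCH_DONE)
				fixed_this_pass++;
		}
		assert(fixed_this_pass <= unfixed);
		unfixed -= fixed_this_pass;
	} while (unfixed > 0 && fixed_this_pass > 0);

	return unfixed;
}
```

A nonzero return value corresponds to the "unrepaired items are recorded
as errors" case, in which FSTRIM would be skipped.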
* Re: [PATCH 24/27] xfs_scrub: fstrim the free areas if there are no errors on the filesystem
2018-01-06 1:54 ` [PATCH 24/27] xfs_scrub: fstrim the free areas if there are no errors on the filesystem Darrick J. Wong
@ 2018-01-16 22:07 ` Eric Sandeen
2018-01-16 22:23 ` Darrick J. Wong
0 siblings, 1 reply; 61+ messages in thread
From: Eric Sandeen @ 2018-01-16 22:07 UTC (permalink / raw)
To: Darrick J. Wong, sandeen; +Cc: linux-xfs
On 1/5/18 7:54 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> If the filesystem scan comes out clean or fixes all the problems, call
> fstrim to clean out the free areas (if it's an ssd/thinp/whatever).
Is this the right patch header for this patch?
Oh ok, this adds a "repair phase" which is really only implementing
preen for now, which is really only fstrimming at this point.
so:
preen()
if no errors
xfs_repair_fs() (IMHO odd to call "repair" on a clean filesystem?)
fstrim
So I guess what was confusing to me is that you do "preen" work under
"repair" functions. I get it that they might all be lumped together
in pending work now, but I'm still wrapping my head around what does
and doesn't happen in various modes, and how to recognize that in
the code...
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> scrub/Makefile | 1 +
> scrub/phase4.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> scrub/vfs.c | 23 +++++++++++++++++++++++
> scrub/vfs.h | 2 ++
> scrub/xfs_scrub.c | 26 +++++++++++++++++++++++++-
> scrub/xfs_scrub.h | 1 +
> 6 files changed, 104 insertions(+), 1 deletion(-)
> create mode 100644 scrub/phase4.c
>
>
> diff --git a/scrub/Makefile b/scrub/Makefile
> index fd26624..91f99ff 100644
> --- a/scrub/Makefile
> +++ b/scrub/Makefile
> @@ -41,6 +41,7 @@ inodes.c \
> phase1.c \
> phase2.c \
> phase3.c \
> +phase4.c \
> phase5.c \
> phase6.c \
> phase7.c \
> diff --git a/scrub/phase4.c b/scrub/phase4.c
> new file mode 100644
> index 0000000..dadf4de
> --- /dev/null
> +++ b/scrub/phase4.c
> @@ -0,0 +1,52 @@
> +/*
> + * Copyright (C) 2018 Oracle. All Rights Reserved.
> + *
> + * Author: Darrick J. Wong <darrick.wong@oracle.com>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it would be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write the Free Software Foundation,
> + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
> + */
> +#include <stdio.h>
> +#include <stdint.h>
> +#include <stdbool.h>
> +#include <dirent.h>
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <sys/statvfs.h>
> +#include "xfs.h"
> +#include "xfs_fs.h"
> +#include "list.h"
> +#include "path.h"
> +#include "workqueue.h"
> +#include "xfs_scrub.h"
> +#include "common.h"
> +#include "scrub.h"
> +#include "vfs.h"
> +
> +/* Phase 4: Repair filesystem. */
> +
> +/* Fix everything that needs fixing. */
> +bool
> +xfs_repair_fs(
> + struct scrub_ctx *ctx)
> +{
> + bool moveon = true;
> +
> + pthread_mutex_lock(&ctx->lock);
> + if (moveon && ctx->errors_found == 0)
> + fstrim(ctx);
> + pthread_mutex_unlock(&ctx->lock);
> +
> + return moveon;
> +}
> diff --git a/scrub/vfs.c b/scrub/vfs.c
> index 6a51090..98d356f 100644
> --- a/scrub/vfs.c
> +++ b/scrub/vfs.c
> @@ -219,3 +219,26 @@ _("Could not queue directory scan work."));
> free(sftd);
> return false;
> }
> +
> +#ifndef FITRIM
> +struct fstrim_range {
> + __u64 start;
> + __u64 len;
> + __u64 minlen;
> +};
> +#define FITRIM _IOWR('X', 121, struct fstrim_range) /* Trim */
> +#endif
(I wonder if we should move all these "if it ain't available define it"
stuff into a single header file at some point...)
> +
> +/* Call FITRIM to trim all the unused space in a filesystem. */
> +void
> +fstrim(
> + struct scrub_ctx *ctx)
> +{
> + struct fstrim_range range = {0};
> + int error;
> +
> + range.len = ULLONG_MAX;
> + error = ioctl(ctx->mnt_fd, FITRIM, &range);
> + if (error && errno != EOPNOTSUPP && errno != ENOTTY)
> + perror(_("fstrim"));
> +}
Still wondering if we should have an option to skip this, given some devices'
horrific performance under fstrim, and/or some other desire to keep an image
whole.
> diff --git a/scrub/vfs.h b/scrub/vfs.h
> index 100eb18..3305159 100644
> --- a/scrub/vfs.h
> +++ b/scrub/vfs.h
> @@ -28,4 +28,6 @@ typedef bool (*scan_fs_tree_dirent_fn)(struct scrub_ctx *, const char *,
> bool scan_fs_tree(struct scrub_ctx *ctx, scan_fs_tree_dir_fn dir_fn,
> scan_fs_tree_dirent_fn dirent_fn, void *arg);
>
> +void fstrim(struct scrub_ctx *ctx);
> +
> #endif /* XFS_SCRUB_VFS_H_ */
> diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
> index bc40f3c..7809431 100644
> --- a/scrub/xfs_scrub.c
> +++ b/scrub/xfs_scrub.c
> @@ -340,6 +340,20 @@ _("%sI/O rate: %.1f%s/s in, %.1f%s/s out, %.1f%s/s tot\n"),
> return true;
> }
>
> +/* Run the preening phase if there are no errors. */
> +static bool
> +preen(
> + struct scrub_ctx *ctx)
> +{
> + if (ctx->errors_found) {
> + str_info(ctx, ctx->mntpoint,
> +_("Errors found, please re-run with -y."));
> + return true;
> + }
> +
> + return xfs_repair_fs(ctx);
> +}
> +
> /* Run all the phases of the scrubber. */
> static bool
> run_scrub_phases(
> @@ -393,8 +407,18 @@ run_scrub_phases(
> /* Run all phases of the scrub tool. */
> for (phase = 1, sp = phases; sp->fn; sp++, phase++) {
> /* Turn on certain phases if user said to. */
> - if (sp->fn == DATASCAN_DUMMY_FN && scrub_data)
> + if (sp->fn == DATASCAN_DUMMY_FN && scrub_data) {
> sp->fn = xfs_scan_blocks;
> + } else if (sp->fn == REPAIR_DUMMY_FN) {
> + if (ctx->mode == SCRUB_MODE_PREEN) {
> + sp->descr = _("Preen filesystem.");
> + sp->fn = preen;
> + } else if (ctx->mode == SCRUB_MODE_REPAIR) {
> + sp->descr = _("Repair filesystem.");
> + sp->fn = xfs_repair_fs;
> + }
> + sp->must_run = true;
if must_run is always true here, should it just be initialized in
the structure along w/ the other must_run phases?
> + }
>
> /* Skip certain phases unless they're turned on. */
> if (sp->fn == REPAIR_DUMMY_FN ||
> diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
> index a5cdba8..4a383f1 100644
> --- a/scrub/xfs_scrub.h
> +++ b/scrub/xfs_scrub.h
> @@ -108,5 +108,6 @@ bool xfs_scan_inodes(struct scrub_ctx *ctx);
> bool xfs_scan_connections(struct scrub_ctx *ctx);
> bool xfs_scan_blocks(struct scrub_ctx *ctx);
> bool xfs_scan_summary(struct scrub_ctx *ctx);
> +bool xfs_repair_fs(struct scrub_ctx *ctx);
>
> #endif /* XFS_SCRUB_XFS_SCRUB_H_ */
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
* Re: [PATCH 24/27] xfs_scrub: fstrim the free areas if there are no errors on the filesystem
2018-01-16 22:07 ` Eric Sandeen
@ 2018-01-16 22:23 ` Darrick J. Wong
0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-16 22:23 UTC (permalink / raw)
To: Eric Sandeen; +Cc: sandeen, linux-xfs
On Tue, Jan 16, 2018 at 04:07:44PM -0600, Eric Sandeen wrote:
> On 1/5/18 7:54 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > If the filesystem scan comes out clean or fixes all the problems, call
> > fstrim to clean out the free areas (if it's an ssd/thinp/whatever).
>
> Is this the right patch header for this patch?
>
> Oh ok, this adds a "repair phase" which is really only implementing
> preen for now, which is really only fstrimming at this point.
>
> so:
>
> preen()
> if no errors
> xfs_repair_fs() (IMHO odd to call "repair" on a clean filesystem?)
> fstrim
>
> So I guess what was confusing to me is that you do "preen" work under
> "repair" functions. I get it that they might all be lumped together
> in pending work now, but I'm still wrapping my head around what does
> and doesn't happen in various modes, and how to recognize that in
> the code...
Based on our extended IRC conversations, I was planning to rename the
"repair list" to "action items" so that we could have an
xfs_process_action_items() that actually takes care of issuing the
repair calls. Then we could have two wrappers:
xfs_repair_fs() -> xfs_process_action_items(); fstrim();
xfs_preen_fs() -> if (!ctx->errors_found) xfs_process_action_items()
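
Taken literally, the two wrappers above might look something like the
following.  This is a minimal sketch, not code from the series: the
sketch_ names, the context struct, and the stub bodies (which only count
calls) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cut-down context; the real one is struct scrub_ctx. */
struct sketch_action_ctx {
	unsigned long long	errors_found;
	unsigned int		action_passes;	/* action list runs */
	unsigned int		trims;		/* FITRIM calls issued */
};

/* Stand-in for xfs_process_action_items(): just records the call. */
static bool
sketch_process_action_items(struct sketch_action_ctx *ctx)
{
	assert(ctx != NULL);
	ctx->action_passes++;
	return true;
}

/* Stand-in for fstrim(): just records the call. */
static void
sketch_fstrim(struct sketch_action_ctx *ctx)
{
	ctx->trims++;
}

/* Repair mode: process the action items, then trim. */
static bool
sketch_repair_fs(struct sketch_action_ctx *ctx)
{
	if (!sketch_process_action_items(ctx))
		return false;
	sketch_fstrim(ctx);
	return true;
}

/* Preen mode: only touch the filesystem if no errors were found. */
static bool
sketch_preen_fs(struct sketch_action_ctx *ctx)
{
	if (ctx->errors_found)
		return true;
	return sketch_process_action_items(ctx);
}
```

Whether the repair wrapper should also skip the trim when errors remain,
as the posted xfs_repair_fs() does, is left open here.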
>
>
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> > scrub/Makefile | 1 +
> > scrub/phase4.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > scrub/vfs.c | 23 +++++++++++++++++++++++
> > scrub/vfs.h | 2 ++
> > scrub/xfs_scrub.c | 26 +++++++++++++++++++++++++-
> > scrub/xfs_scrub.h | 1 +
> > 6 files changed, 104 insertions(+), 1 deletion(-)
> > create mode 100644 scrub/phase4.c
> >
> >
> > diff --git a/scrub/Makefile b/scrub/Makefile
> > index fd26624..91f99ff 100644
> > --- a/scrub/Makefile
> > +++ b/scrub/Makefile
> > @@ -41,6 +41,7 @@ inodes.c \
> > phase1.c \
> > phase2.c \
> > phase3.c \
> > +phase4.c \
> > phase5.c \
> > phase6.c \
> > phase7.c \
> > diff --git a/scrub/phase4.c b/scrub/phase4.c
> > new file mode 100644
> > index 0000000..dadf4de
> > --- /dev/null
> > +++ b/scrub/phase4.c
> > @@ -0,0 +1,52 @@
> > +/*
> > + * Copyright (C) 2018 Oracle. All Rights Reserved.
> > + *
> > + * Author: Darrick J. Wong <darrick.wong@oracle.com>
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License
> > + * as published by the Free Software Foundation; either version 2
> > + * of the License, or (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it would be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program; if not, write the Free Software Foundation,
> > + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
> > + */
> > +#include <stdio.h>
> > +#include <stdint.h>
> > +#include <stdbool.h>
> > +#include <dirent.h>
> > +#include <sys/types.h>
> > +#include <sys/stat.h>
> > +#include <sys/statvfs.h>
> > +#include "xfs.h"
> > +#include "xfs_fs.h"
> > +#include "list.h"
> > +#include "path.h"
> > +#include "workqueue.h"
> > +#include "xfs_scrub.h"
> > +#include "common.h"
> > +#include "scrub.h"
> > +#include "vfs.h"
> > +
> > +/* Phase 4: Repair filesystem. */
> > +
> > +/* Fix everything that needs fixing. */
> > +bool
> > +xfs_repair_fs(
> > + struct scrub_ctx *ctx)
> > +{
> > + bool moveon = true;
> > +
> > + pthread_mutex_lock(&ctx->lock);
> > + if (moveon && ctx->errors_found == 0)
> > + fstrim(ctx);
> > + pthread_mutex_unlock(&ctx->lock);
> > +
> > + return moveon;
> > +}
> > diff --git a/scrub/vfs.c b/scrub/vfs.c
> > index 6a51090..98d356f 100644
> > --- a/scrub/vfs.c
> > +++ b/scrub/vfs.c
> > @@ -219,3 +219,26 @@ _("Could not queue directory scan work."));
> > free(sftd);
> > return false;
> > }
> > +
> > +#ifndef FITRIM
> > +struct fstrim_range {
> > + __u64 start;
> > + __u64 len;
> > + __u64 minlen;
> > +};
> > +#define FITRIM _IOWR('X', 121, struct fstrim_range) /* Trim */
> > +#endif
>
> (I wonder if we should move all these "if it ain't available define it"
> stuff into a single header file at some point...)
Yeah, probably....
> > +
> > +/* Call FITRIM to trim all the unused space in a filesystem. */
> > +void
> > +fstrim(
> > + struct scrub_ctx *ctx)
> > +{
> > + struct fstrim_range range = {0};
> > + int error;
> > +
> > + range.len = ULLONG_MAX;
> > + error = ioctl(ctx->mnt_fd, FITRIM, &range);
> > + if (error && errno != EOPNOTSUPP && errno != ENOTTY)
> > + perror(_("fstrim"));
> > +}
>
> Still wondering if we should have an option to skip this, given some devices'
> horrific performance under fstrim, and/or some other desire to keep an image
> whole.
I already added it in my dev tree. -k turns off FITRIM.
> > diff --git a/scrub/vfs.h b/scrub/vfs.h
> > index 100eb18..3305159 100644
> > --- a/scrub/vfs.h
> > +++ b/scrub/vfs.h
> > @@ -28,4 +28,6 @@ typedef bool (*scan_fs_tree_dirent_fn)(struct scrub_ctx *, const char *,
> > bool scan_fs_tree(struct scrub_ctx *ctx, scan_fs_tree_dir_fn dir_fn,
> > scan_fs_tree_dirent_fn dirent_fn, void *arg);
> >
> > +void fstrim(struct scrub_ctx *ctx);
> > +
> > #endif /* XFS_SCRUB_VFS_H_ */
> > diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
> > index bc40f3c..7809431 100644
> > --- a/scrub/xfs_scrub.c
> > +++ b/scrub/xfs_scrub.c
> > @@ -340,6 +340,20 @@ _("%sI/O rate: %.1f%s/s in, %.1f%s/s out, %.1f%s/s tot\n"),
> > return true;
> > }
> >
> > +/* Run the preening phase if there are no errors. */
> > +static bool
> > +preen(
> > + struct scrub_ctx *ctx)
> > +{
> > + if (ctx->errors_found) {
> > + str_info(ctx, ctx->mntpoint,
> > +_("Errors found, please re-run with -y."));
> > + return true;
> > + }
> > +
> > + return xfs_repair_fs(ctx);
> > +}
> > +
> > /* Run all the phases of the scrubber. */
> > static bool
> > run_scrub_phases(
> > @@ -393,8 +407,18 @@ run_scrub_phases(
> > /* Run all phases of the scrub tool. */
> > for (phase = 1, sp = phases; sp->fn; sp++, phase++) {
> > /* Turn on certain phases if user said to. */
> > - if (sp->fn == DATASCAN_DUMMY_FN && scrub_data)
> > + if (sp->fn == DATASCAN_DUMMY_FN && scrub_data) {
> > sp->fn = xfs_scan_blocks;
> > + } else if (sp->fn == REPAIR_DUMMY_FN) {
> > + if (ctx->mode == SCRUB_MODE_PREEN) {
> > + sp->descr = _("Preen filesystem.");
> > + sp->fn = preen;
> > + } else if (ctx->mode == SCRUB_MODE_REPAIR) {
> > + sp->descr = _("Repair filesystem.");
> > + sp->fn = xfs_repair_fs;
> > + }
> > + sp->must_run = true;
>
> if must_run is always true here, should it just be initialized in
> the structure along w/ the other must_run phases?
Ok.
--D
> > + }
> >
> > /* Skip certain phases unless they're turned on. */
> > if (sp->fn == REPAIR_DUMMY_FN ||
> > diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
> > index a5cdba8..4a383f1 100644
> > --- a/scrub/xfs_scrub.h
> > +++ b/scrub/xfs_scrub.h
> > @@ -108,5 +108,6 @@ bool xfs_scan_inodes(struct scrub_ctx *ctx);
> > bool xfs_scan_connections(struct scrub_ctx *ctx);
> > bool xfs_scan_blocks(struct scrub_ctx *ctx);
> > bool xfs_scan_summary(struct scrub_ctx *ctx);
> > +bool xfs_repair_fs(struct scrub_ctx *ctx);
> >
> > #endif /* XFS_SCRUB_XFS_SCRUB_H_ */
> >
* Re: [PATCH 18/27] xfs_scrub: warn about normalized Unicode name collisions
2018-01-06 1:53 ` [PATCH 18/27] xfs_scrub: warn about normalized Unicode name collisions Darrick J. Wong
@ 2018-01-16 23:52 ` Eric Sandeen
2018-01-16 23:57 ` Eric Sandeen
2018-01-16 23:59 ` Darrick J. Wong
0 siblings, 2 replies; 61+ messages in thread
From: Eric Sandeen @ 2018-01-16 23:52 UTC (permalink / raw)
To: Darrick J. Wong, sandeen; +Cc: linux-xfs
> +/* Print a warning string and whatever error is stored in errno. */
> +void
> +__str_errno_warn(
> + struct scrub_ctx *ctx,
> + const char *descr,
> + const char *file,
> + int line)
> +{
> + char buf[DESCR_BUFSZ];
> +
> + pthread_mutex_lock(&ctx->lock);
> + fprintf(stderr, _("Warning: %s: %s."), descr,
> + strerror_r(errno, buf, DESCR_BUFSZ));
> + if (debug)
> + fprintf(stderr, _(" (%s line %d)"), file, line);
> + fprintf(stderr, "\n");
> + ctx->warnings_found++;
> + pthread_mutex_unlock(&ctx->lock);
> +}
> +
Oh hello, unused-new-6th-printing-variant! ;)
It took a lot of careful peering at, and scrolling around, to figure
out what all these different __str_ variants do.
Can we collapse all these str_foo_bar things down into a function
that makes logical choices based on what's passed in? Here's what
I was playing with, see if it actually implements what you want
and if it's any better, and yeah, long lines sorry.
common.h:
void __str_out(struct scrub_ctx *, const char *descr, int level, int error,
const char *file, int line, const char *format, ...);
#define S_ERROR 0
#define S_WARN 1
#define S_INFO 2
#define str_errno(ctx, str) __str_out(ctx, str, S_ERROR, errno, __FILE__, __LINE__, NULL)
#define str_error(ctx, str, ...) __str_out(ctx, str, S_ERROR, 0, __FILE__, __LINE__, __VA_ARGS__)
#define str_errno_warn(ctx, str) __str_out(ctx, str, S_WARN, errno, __FILE__, __LINE__, NULL)
#define str_warn(ctx, str, ...) __str_out(ctx, str, S_WARN, 0, __FILE__, __LINE__, __VA_ARGS__)
#define str_info(ctx, str, ...) __str_out(ctx, str, S_INFO, 0, __FILE__, __LINE__, __VA_ARGS__)
/* note, could rationalize those names a bit, maybe just str_errno -> str_errno_error? */
common.c:
/* If stdout/stderr is a tty, clear to end of line to clean up progress bar. */
static inline const char *str_start(FILE *stream)
{
if (stream == stderr)
return stderr_isatty ? CLEAR_EOL : "";
else
return stdout_isatty ? CLEAR_EOL : "";
}
static const char *err_str[] = {
"Error",
"Warning",
"Info",
};
/* Print a warning string and some warning text. */
void
__str_out(
struct scrub_ctx *ctx,
const char *descr,
int level,
int error,
const char *file,
int line,
const char *format,
...)
{
FILE *stream = stderr;
va_list args;
char buf[DESCR_BUFSZ];
/* print strerror or format of choice but not both */
if (error && format)
abort();
if (level >= S_INFO)
stream = stdout;
pthread_mutex_lock(&ctx->lock);
if (errno)
fprintf(stream, _("%s%s: %s: %s."),
str_start(stream), err_str[level], descr,
strerror_r(errno, buf, DESCR_BUFSZ));
else {
fprintf(stream, _("%s%s: %s: "),
str_start(stream), err_str[level], descr);
va_start(args, format);
vfprintf(stream, format, args);
va_end(args);
}
if (debug)
fprintf(stream, _(" (%s line %d)"), file, line);
fprintf(stream, "\n");
if (stream == stdout)
fflush(stream);
if (errno) /* A syscall failed */
ctx->runtime_errors++;
else if (level == S_ERROR)
ctx->errors_found++;
else if (level == S_WARN)
ctx->warnings_found++;
pthread_mutex_unlock(&ctx->lock);
}
* Re: [PATCH 18/27] xfs_scrub: warn about normalized Unicode name collisions
2018-01-16 23:52 ` Eric Sandeen
@ 2018-01-16 23:57 ` Eric Sandeen
2018-01-16 23:59 ` Darrick J. Wong
1 sibling, 0 replies; 61+ messages in thread
From: Eric Sandeen @ 2018-01-16 23:57 UTC (permalink / raw)
To: Eric Sandeen, Darrick J. Wong; +Cc: linux-xfs
On 1/16/18 5:52 PM, Eric Sandeen wrote:
...
> /* Print a warning string and some warning text. */
> void
> __str_out(
> struct scrub_ctx *ctx,
> const char *descr,
> int level,
> int error,
> const char *file,
> int line,
> const char *format,
> ...)
> {
> FILE *stream = stderr;
> va_list args;
> char buf[DESCR_BUFSZ];
>
> /* print strerror or format of choice but not both */
> if (error && format)
> abort();
>
> if (level >= S_INFO)
> stream = stdout;
>
> pthread_mutex_lock(&ctx->lock);
Oops, this and every other "errno" below should be "error", sorry:
> if (errno)
> fprintf(stream, _("%s%s: %s: %s."),
> str_start(stream), err_str[level], descr,
> strerror_r(errno, buf, DESCR_BUFSZ));
-Eric
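
With that correction applied, the core of the unified helper could look
roughly like the sketch below.  It is a cut-down illustration rather
than the proposed patch: the sketch_ and SK_ names are hypothetical, the
file/line and tty handling are dropped, and plain strerror() is used
instead of strerror_r() to sidestep the GNU/XSI signature difference.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

enum sketch_level { SK_ERROR = 0, SK_WARN = 1, SK_INFO = 2 };

/* Hypothetical cut-down context holding only the counters used here. */
struct sketch_msg_ctx {
	pthread_mutex_t		lock;
	unsigned long long	runtime_errors;
	unsigned long long	errors_found;
	unsigned long long	warnings_found;
};

static const char *sketch_level_str[] = { "Error", "Warning", "Info" };

/* Print one message and bump the matching counter. */
static void
sketch_str_out(struct sketch_msg_ctx *ctx, const char *descr, int level,
	       int error, const char *format, ...)
{
	FILE	*stream = (level >= SK_INFO) ? stdout : stderr;
	va_list	args;

	assert(level >= SK_ERROR && level <= SK_INFO);
	/* Print strerror text or a caller format, never both. */
	assert(!(error && format));

	pthread_mutex_lock(&ctx->lock);
	fprintf(stream, "%s: %s: ", sketch_level_str[level], descr);
	if (error) {
		/* Use the error passed in, not whatever errno holds now. */
		fprintf(stream, "%s.", strerror(error));
	} else if (format) {
		va_start(args, format);
		vfprintf(stream, format, args);
		va_end(args);
	}
	fprintf(stream, "\n");

	if (error)			/* a syscall failed */
		ctx->runtime_errors++;
	else if (level == SK_ERROR)
		ctx->errors_found++;
	else if (level == SK_WARN)
		ctx->warnings_found++;
	pthread_mutex_unlock(&ctx->lock);
}
```

Testing the error argument rather than the global errno matters because
errno can still hold a stale value from an earlier, unrelated call when
a plain warning or info message is printed.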
* Re: [PATCH 18/27] xfs_scrub: warn about normalized Unicode name collisions
2018-01-16 23:52 ` Eric Sandeen
2018-01-16 23:57 ` Eric Sandeen
@ 2018-01-16 23:59 ` Darrick J. Wong
1 sibling, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-16 23:59 UTC (permalink / raw)
To: Eric Sandeen; +Cc: sandeen, linux-xfs
On Tue, Jan 16, 2018 at 05:52:10PM -0600, Eric Sandeen wrote:
>
> > +/* Print a warning string and whatever error is stored in errno. */
> > +void
> > +__str_errno_warn(
> > + struct scrub_ctx *ctx,
> > + const char *descr,
> > + const char *file,
> > + int line)
> > +{
> > + char buf[DESCR_BUFSZ];
> > +
> > + pthread_mutex_lock(&ctx->lock);
> > + fprintf(stderr, _("Warning: %s: %s."), descr,
> > + strerror_r(errno, buf, DESCR_BUFSZ));
> > + if (debug)
> > + fprintf(stderr, _(" (%s line %d)"), file, line);
> > + fprintf(stderr, "\n");
> > + ctx->warnings_found++;
> > + pthread_mutex_unlock(&ctx->lock);
> > +}
> > +
>
> Oh hello, unused-new-6th-printing-variant! ;)
>
> It took a lot of careful peering at, and scrolling around, to figure
> out what all these different __str_ variants do.
>
> Can we collapse all these str_foo_bar things down into a function
> that makes logical choices based on what's passed in? Here's what
> I was playing with, see if it actually implements what you want
> and if it's any better, and yeah, long lines sorry.
>
> common.h:
>
> void __str_out(struct scrub_ctx *, const char *descr, int level, int error,
> const char *file, int line, const char *format, ...);
>
> #define S_ERROR 0
> #define S_WARN 1
> #define S_INFO 2
>
> #define str_errno(ctx, str) __str_out(ctx, str, S_ERROR, errno, __FILE__, __LINE__, NULL)
> #define str_error(ctx, str, ...) __str_out(ctx, str, S_ERROR, 0, __FILE__, __LINE__, __VA_ARGS__)
> #define str_errno_warn(ctx, str) __str_out(ctx, str, S_WARN, errno, __FILE__, __LINE__, NULL)
> #define str_warn(ctx, str, ...) __str_out(ctx, str, S_WARN, 0, __FILE__, __LINE__, __VA_ARGS__)
> #define str_info(ctx, str, ...) __str_out(ctx, str, S_INFO, 0, __FILE__, __LINE__, __VA_ARGS__)
>
> /* note, could rationalize those names a bit, maybe just str_errno -> str_errno_error? */
>
> common.c:
>
> /* If stdout/stderr is a tty, clear to end of line to clean up progress bar. */
> static inline const char *str_start(FILE *stream)
> {
> if (stream == stderr)
> return stderr_isatty ? CLEAR_EOL : "";
> else
> return stdout_isatty ? CLEAR_EOL : "";
> }
>
> static const char *err_str[] = {
> "Error",
> "Warning",
> "Info",
> };
>
> /* Print a warning string and some warning text. */
> void
> __str_out(
> struct scrub_ctx *ctx,
> const char *descr,
> int level,
> int error,
> const char *file,
> int line,
> const char *format,
> ...)
> {
> FILE *stream = stderr;
> va_list args;
> char buf[DESCR_BUFSZ];
>
> /* print strerror or format of choice but not both */
> if (error && format)
> abort();
>
> if (level >= S_INFO)
> stream = stdout;
>
> pthread_mutex_lock(&ctx->lock);
> if (error)
> fprintf(stream, _("%s%s: %s: %s."),
> str_start(stream), err_str[level], descr,
> strerror_r(error, buf, DESCR_BUFSZ));
> else {
> fprintf(stream, _("%s%s: %s: "),
> str_start(stream), err_str[level], descr);
>
> va_start(args, format);
> vfprintf(stream, format, args);
> va_end(args);
> }
>
> if (debug)
> fprintf(stream, _(" (%s line %d)"), file, line);
> fprintf(stream, "\n");
> if (stream == stdout)
> fflush(stream);
>
> if (error) /* A syscall failed */
> ctx->runtime_errors++;
> else if (level == S_ERROR)
> ctx->errors_found++;
> else if (level == S_WARN)
> ctx->warnings_found++;
>
> pthread_mutex_unlock(&ctx->lock);
Yes, the whole thing could get unified into a single helper like this.
--D
> }
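For anyone skimming the thread, here is a minimal standalone sketch of the "one helper, several levels" idea discussed above. It is not the code that went into xfs_scrub: the names msg_format, struct msg_counts and the MSG_* constants are made up for illustration, the output goes into a caller-supplied buffer instead of stdout/stderr so it is easy to check, locking is left out, and plain strerror() is used for brevity (it is not guaranteed thread-safe, so a real multithreaded tool would want strerror_r under its lock).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical severity levels, mirroring S_ERROR/S_WARN/S_INFO above. */
#define MSG_ERROR	0
#define MSG_WARN	1
#define MSG_INFO	2

/* Hypothetical stand-in for the counters kept in the scrub context. */
struct msg_counts {
	unsigned long	runtime_errors;
	unsigned long	errors_found;
	unsigned long	warnings_found;
};

static const char *const level_name[] = { "Error", "Warning", "Info" };

/*
 * Format one message into @out.  Pass either a nonzero @error (an errno
 * value) or a printf-style @format, never both.  Returns the snprintf-style
 * length, or a negative value on a formatting failure.
 */
int
msg_format(struct msg_counts *mc, char *out, size_t outsz,
	   const char *descr, int level, int error,
	   const char *format, ...)
{
	va_list		args;
	int		n, m;

	assert(level >= MSG_ERROR && level <= MSG_INFO);
	assert(!(error && format));

	if (error) {
		/* A syscall failed: report strerror text, count a runtime error. */
		mc->runtime_errors++;
		return snprintf(out, outsz, "%s: %s: %s.",
				level_name[level], descr, strerror(error));
	}

	n = snprintf(out, outsz, "%s: %s: ", level_name[level], descr);
	if (n >= 0 && (size_t)n < outsz && format) {
		va_start(args, format);
		m = vsnprintf(out + n, outsz - n, format, args);
		va_end(args);
		if (m < 0)
			return m;
		n += m;
	}

	if (level == MSG_ERROR)
		mc->errors_found++;
	else if (level == MSG_WARN)
		mc->warnings_found++;
	return n;
}
```

The point of the sketch is that the decision is keyed on the error argument that was passed in, not on the global errno, so a stale errno from an unrelated call cannot change which branch runs.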
* Re: [PATCH v11 00/27] xfsprogs: online scrub/repair support
2018-01-12 4:17 ` [PATCH v11 00/27] xfsprogs: online scrub/repair support Eric Sandeen
@ 2018-01-17 1:31 ` Darrick J. Wong
0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2018-01-17 1:31 UTC (permalink / raw)
To: Eric Sandeen; +Cc: sandeen, linux-xfs
On Thu, Jan 11, 2018 at 10:17:28PM -0600, Eric Sandeen wrote:
> On 1/5/18 7:51 PM, Darrick J. Wong wrote:
> > Hi all,
> >
> > This is the eleventh revision of a patchset that adds to XFS userland tools
> > support for online metadata scrubbing and repair. Since v10 I've rebased
> > to the latest for-next, fixed some wonky error messages, and fixed a few
> > minor problems I found via code inspection. However, this patch series is
> > more or less the same as v10.
>
> General note rather than finding the patches they came from ;)
>
> these can be made static and in some cases removed from header files,
> and/or ... hm, some aren't used at all.
>
> 'bitmap_dump' is unique to scrub/bitmap.o (function)
> 'bitmap_iterate' is unique to scrub/bitmap.o (function)
These only exist #ifdef DEBUG
> 'do_error' is unique to scrub/common.o (function)
Unused, removed.
> 'display_rusage' is unique to scrub/xfs_scrub.o (global variable)
> 'is_service' is unique to scrub/xfs_scrub.o (global variable)
> 'scrub_data' is unique to scrub/xfs_scrub.o (global variable)
> 'xfs_check_rmap_ioerr' is unique to scrub/phase6.o (function)
Ok, these have been made local to the file.
> 'progname' is unique to scrub/xfs_scrub.o (global variable)
I thought we needed to have this for libxfs?
Ah, right, we don't link against libxfs anymore. :)
--D
> bitmap_dump (and so bitmap_iterate) are unused
> do_error is unused as well?
>
> -Eric
* [PATCH 06/27] xfs_scrub: create an abstraction for a block device
2017-11-17 21:00 [PATCH v10 00/27] xfsprogs-4.15: online scrub/repair support Darrick J. Wong
@ 2017-11-17 21:00 ` Darrick J. Wong
0 siblings, 0 replies; 61+ messages in thread
From: Darrick J. Wong @ 2017-11-17 21:00 UTC (permalink / raw)
To: sandeen, darrick.wong; +Cc: linux-xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Create an abstraction to handle all of our low level disk operations.
We'll eventually use it to bind to a fs mount point and block device.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
scrub/Makefile | 2 +
scrub/disk.c | 164 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
scrub/disk.h | 39 +++++++++++++
3 files changed, 205 insertions(+)
create mode 100644 scrub/disk.c
create mode 100644 scrub/disk.h
diff --git a/scrub/Makefile b/scrub/Makefile
index ac0af94..f810790 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -17,10 +17,12 @@ endif # scrub_prereqs
HFILES = \
common.h \
+disk.h \
xfs_scrub.h
CFILES = \
common.c \
+disk.c \
xfs_scrub.c
LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD)
diff --git a/scrub/disk.c b/scrub/disk.c
new file mode 100644
index 0000000..fe91842
--- /dev/null
+++ b/scrub/disk.c
@@ -0,0 +1,164 @@
+/*
+ * Copyright (C) 2017 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/ioctl.h>
+#include <sys/statvfs.h>
+#include <sys/vfs.h>
+#include <linux/fs.h>
+#include "platform_defs.h"
+#include "libfrog.h"
+#include "xfs_scrub.h"
+#include "disk.h"
+
+/*
+ * Disk Abstraction
+ *
+ * These routines help us to discover the geometry of a block device,
+ * estimate the number of concurrent IOs that we can send to it, and
+ * abstract the process of performing read verification of disk blocks.
+ */
+
+/* Figure out how many disk heads are available. */
+static unsigned int
+__disk_heads(
+ struct disk *disk)
+{
+ int iomin;
+ int ioopt;
+ unsigned short rot;
+ int error;
+
+ /* If it's not a block device, throw all the CPUs at it. */
+ if (!S_ISBLK(disk->d_sb.st_mode))
+ return nproc;
+
+ /* Non-rotational device? Throw all the CPUs. */
+ rot = 1;
+ error = ioctl(disk->d_fd, BLKROTATIONAL, &rot);
+ if (error == 0 && rot == 0)
+ return nproc;
+
+ /*
+ * Sometimes we can infer the number of devices from the
+ * min/optimal IO sizes.
+ */
+ iomin = ioopt = 0;
+ if (ioctl(disk->d_fd, BLKIOMIN, &iomin) == 0 &&
+ ioctl(disk->d_fd, BLKIOOPT, &ioopt) == 0 &&
+ iomin > 0 && ioopt > 0) {
+ return min(nproc, max(1, ioopt / iomin));
+ }
+
+ /* Rotating device? I guess? */
+ return 2;
+}
+
+/* Use the nr_threads override if set; otherwise estimate the head count. */
+unsigned int
+disk_heads(
+ struct disk *disk)
+{
+ if (nr_threads)
+ return nr_threads;
+ return __disk_heads(disk);
+}
+
+/* Open a disk device and discover its geometry. */
+struct disk *
+disk_open(
+ const char *pathname)
+{
+ struct disk *disk;
+ int lba_sz;
+ int error;
+
+ disk = calloc(1, sizeof(struct disk));
+ if (!disk)
+ return NULL;
+
+ disk->d_fd = open(pathname, O_RDONLY | O_DIRECT | O_NOATIME);
+ if (disk->d_fd < 0)
+ goto out_free;
+
+ /* Try to get LBA size. */
+ error = ioctl(disk->d_fd, BLKSSZGET, &lba_sz);
+ if (error)
+ lba_sz = 512;
+ disk->d_lbalog = log2_roundup(lba_sz);
+
+ /* Obtain disk's stat info. */
+ error = fstat(disk->d_fd, &disk->d_sb);
+ if (error)
+ goto out_close;
+
+ /* Determine bdev size, block size, and offset. */
+ if (S_ISBLK(disk->d_sb.st_mode)) {
+ error = ioctl(disk->d_fd, BLKGETSIZE64, &disk->d_size);
+ if (error)
+ disk->d_size = 0;
+ error = ioctl(disk->d_fd, BLKBSZGET, &disk->d_blksize);
+ if (error)
+ disk->d_blksize = 0;
+ disk->d_start = 0;
+ } else {
+ disk->d_size = disk->d_sb.st_size;
+ disk->d_blksize = disk->d_sb.st_blksize;
+ disk->d_start = 0;
+ }
+
+ return disk;
+out_close:
+ close(disk->d_fd);
+out_free:
+ free(disk);
+ return NULL;
+}
+
+/* Close a disk device. */
+int
+disk_close(
+ struct disk *disk)
+{
+ int error = 0;
+
+ if (disk->d_fd >= 0)
+ error = close(disk->d_fd);
+ disk->d_fd = -1;
+ free(disk);
+ return error;
+}
+
+/* Read-verify an extent of a disk device. */
+ssize_t
+disk_read_verify(
+ struct disk *disk,
+ void *buf,
+ uint64_t start,
+ uint64_t length)
+{
+ return pread(disk->d_fd, buf, length, start);
+}
diff --git a/scrub/disk.h b/scrub/disk.h
new file mode 100644
index 0000000..4331300
--- /dev/null
+++ b/scrub/disk.h
@@ -0,0 +1,39 @@
+/*
+ * Copyright (C) 2017 Oracle. All Rights Reserved.
+ *
+ * Author: Darrick J. Wong <darrick.wong@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+#ifndef XFS_SCRUB_DISK_H_
+#define XFS_SCRUB_DISK_H_
+
+struct disk {
+ struct stat d_sb;
+ int d_fd;
+ int d_lbalog;
+ unsigned int d_flags;
+ unsigned int d_blksize; /* bytes */
+ uint64_t d_size; /* bytes */
+ uint64_t d_start; /* bytes */
+};
+
+unsigned int disk_heads(struct disk *disk);
+struct disk *disk_open(const char *pathname);
+int disk_close(struct disk *disk);
+ssize_t disk_read_verify(struct disk *disk, void *buf, uint64_t start,
+ uint64_t length);
+
+#endif /* XFS_SCRUB_DISK_H_ */
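The head-count heuristic in __disk_heads() above is easier to reason about when separated from the ioctl calls. Below is a rough sketch of the same decision order as a pure function. The struct dev_hints type and the estimate_heads name are hypothetical and not part of the patch; the fields stand in for what fstat(), BLKROTATIONAL, BLKIOMIN and BLKIOOPT would report, and nproc is passed in rather than read from a global.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of what the ioctls in __disk_heads() report. */
struct dev_hints {
	bool		is_blockdev;	/* S_ISBLK() on the stat mode */
	bool		rot_known;	/* BLKROTATIONAL ioctl succeeded */
	bool		rotational;	/* value it returned, if known */
	unsigned int	io_min;		/* BLKIOMIN result, 0 if unknown */
	unsigned int	io_opt;		/* BLKIOOPT result, 0 if unknown */
};

/* Estimate how many concurrent IOs are worth issuing to the device. */
unsigned int
estimate_heads(const struct dev_hints *h, unsigned int nproc)
{
	unsigned int	stripes;

	assert(nproc > 0);

	/* Not a block device: throw all the CPUs at it. */
	if (!h->is_blockdev)
		return nproc;

	/* Known non-rotational device: likewise. */
	if (h->rot_known && !h->rotational)
		return nproc;

	/* Infer a stripe width from optimal/minimum IO size, capped at nproc. */
	if (h->io_min > 0 && h->io_opt > 0) {
		stripes = h->io_opt / h->io_min;
		if (stripes < 1)
			stripes = 1;
		return stripes < nproc ? stripes : nproc;
	}

	/* Rotating device with no further hints: guess two. */
	return 2;
}
```

As an example, a rotational device reporting a 64k minimum and 256k optimal IO size on an 8-CPU machine would get 4 under this sketch, while a file image or a known SSD would get all 8.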
Thread overview: 61+ messages
2018-01-06 1:51 [PATCH v11 00/27] xfsprogs: online scrub/repair support Darrick J. Wong
2018-01-06 1:51 ` [PATCH 01/27] xfs_scrub: create online filesystem scrub program Darrick J. Wong
2018-01-12 0:16 ` Eric Sandeen
2018-01-12 1:08 ` Darrick J. Wong
2018-01-12 1:07 ` Eric Sandeen
2018-01-12 1:10 ` Darrick J. Wong
2018-01-06 1:51 ` [PATCH 02/27] xfs_scrub: common error handling Darrick J. Wong
2018-01-12 1:15 ` Eric Sandeen
2018-01-12 1:23 ` Darrick J. Wong
2018-01-06 1:51 ` [PATCH 03/27] xfs_scrub: set up command line argument parsing Darrick J. Wong
2018-01-11 23:39 ` Eric Sandeen
2018-01-12 1:53 ` Darrick J. Wong
2018-01-12 1:30 ` Eric Sandeen
2018-01-12 2:03 ` Darrick J. Wong
2018-01-06 1:51 ` [PATCH 04/27] xfs_scrub: dispatch the various phases of the scrub program Darrick J. Wong
2018-01-06 1:51 ` [PATCH 05/27] xfs_scrub: figure out how many threads we're going to need Darrick J. Wong
2018-01-06 1:52 ` [PATCH 06/27] xfs_scrub: create an abstraction for a block device Darrick J. Wong
2018-01-11 23:24 ` Eric Sandeen
2018-01-11 23:59 ` Darrick J. Wong
2018-01-12 0:04 ` Eric Sandeen
2018-01-12 1:27 ` Darrick J. Wong
2018-01-06 1:52 ` [PATCH 07/27] xfs_scrub: find XFS filesystem geometry Darrick J. Wong
2018-01-06 1:52 ` [PATCH 08/27] xfs_scrub: add inode iteration functions Darrick J. Wong
2018-01-06 1:52 ` [PATCH 09/27] xfs_scrub: add space map " Darrick J. Wong
2018-01-06 1:52 ` [PATCH 10/27] xfs_scrub: add file " Darrick J. Wong
2018-01-11 23:19 ` Eric Sandeen
2018-01-12 0:24 ` Darrick J. Wong
2018-01-06 1:52 ` [PATCH 11/27] xfs_scrub: filesystem counter collection functions Darrick J. Wong
2018-01-06 1:52 ` [PATCH 12/27] xfs_scrub: wrap the scrub ioctl Darrick J. Wong
2018-01-11 23:12 ` Eric Sandeen
2018-01-12 0:28 ` Darrick J. Wong
2018-01-06 1:52 ` [PATCH 13/27] xfs_scrub: scan filesystem and AG metadata Darrick J. Wong
2018-01-06 1:52 ` [PATCH 14/27] xfs_scrub: thread-safe stats counter Darrick J. Wong
2018-01-06 1:53 ` [PATCH 15/27] xfs_scrub: scan inodes Darrick J. Wong
2018-01-06 1:53 ` [PATCH 16/27] xfs_scrub: check directory connectivity Darrick J. Wong
2018-01-06 1:53 ` [PATCH 17/27] xfs_scrub: warn about suspicious characters in directory/xattr names Darrick J. Wong
2018-01-06 1:53 ` [PATCH 18/27] xfs_scrub: warn about normalized Unicode name collisions Darrick J. Wong
2018-01-16 23:52 ` Eric Sandeen
2018-01-16 23:57 ` Eric Sandeen
2018-01-16 23:59 ` Darrick J. Wong
2018-01-06 1:53 ` [PATCH 19/27] xfs_scrub: create a bitmap data structure Darrick J. Wong
2018-01-06 1:53 ` [PATCH 20/27] xfs_scrub: create infrastructure to read verify data blocks Darrick J. Wong
2018-01-06 1:53 ` [PATCH 21/27] xfs_scrub: scrub file " Darrick J. Wong
2018-01-11 23:25 ` Eric Sandeen
2018-01-12 0:29 ` Darrick J. Wong
2018-01-06 1:53 ` [PATCH 22/27] xfs_scrub: optionally use SCSI READ VERIFY commands to scrub data blocks on disk Darrick J. Wong
2018-01-06 1:53 ` [PATCH 23/27] xfs_scrub: check summary counters Darrick J. Wong
2018-01-06 1:54 ` [PATCH 24/27] xfs_scrub: fstrim the free areas if there are no errors on the filesystem Darrick J. Wong
2018-01-16 22:07 ` Eric Sandeen
2018-01-16 22:23 ` Darrick J. Wong
2018-01-06 1:54 ` [PATCH 25/27] xfs_scrub: progress indicator Darrick J. Wong
2018-01-11 23:27 ` Eric Sandeen
2018-01-12 0:32 ` Darrick J. Wong
2018-01-06 1:54 ` [PATCH 26/27] xfs_scrub: create a script to scrub all xfs filesystems Darrick J. Wong
2018-01-06 1:54 ` [PATCH 27/27] xfs_scrub: integrate services with systemd Darrick J. Wong
2018-01-06 3:50 ` [PATCH 07/27] xfs_scrub: find XFS filesystem geometry Darrick J. Wong
2018-01-12 4:17 ` [PATCH v11 00/27] xfsprogs: online scrub/repair support Eric Sandeen
2018-01-17 1:31 ` Darrick J. Wong
2018-01-16 19:21 ` [PATCH 28/27] xfs_scrub: wire up repair ioctl Darrick J. Wong
2018-01-16 19:21 ` [PATCH 29/27] xfs_scrub: schedule and manage repairs to the filesystem Darrick J. Wong
-- strict thread matches above, loose matches on Subject: below --
2017-11-17 21:00 [PATCH v10 00/27] xfsprogs-4.15: online scrub/repair support Darrick J. Wong
2017-11-17 21:00 ` [PATCH 06/27] xfs_scrub: create an abstraction for a block device Darrick J. Wong