All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/14] xfsprogs: online scrub fixes
@ 2018-03-21  3:39 Darrick J. Wong
  2018-03-21  3:39 ` [PATCH 01/14] xfs_scrub: avoid buffer overflow when scanning attributes Darrick J. Wong
                   ` (13 more replies)
  0 siblings, 14 replies; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-21  3:39 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

Hi all,

Here's a bunch of fixes for the xfs_scrub program that was introduced in
xfsprogs 4.15.

The first nine patches replace the weak name collision scanner of the
original xfs_scrub with a much more powerful one based on a well known
collision finding strategy.  The name scanner still checks all entries
in a directory and all extended attribute keys in a file for names
containing control characters and names with utf8 sequences that
normalize into the same unicode points, which are common methods that
spoofers use to create malicious names.  However, we add in more
powerful scanning capabilities -- first, we can now look for names that
are identical except for zero-width spaces; second, we look for names
with bidirectional overrides; and third, we can warn about long names
that contain well known confusable characters in the same place in both
names (think 'f0' vs 'fO').  As a result of this rewrite, a few buffer
underruns that caused truncated comparisons have been fixed.

The last five patches fix various minor bugs in the xfs_scrub_all
wrapper and service.

If you're going to start using this mess, you probably ought to just
pull from my git tree for xfsprogs[1].

Comments and questions are, as always, welcome.

--D

[1] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=djwong-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 01/14] xfs_scrub: avoid buffer overflow when scanning attributes
  2018-03-21  3:39 [PATCH 00/14] xfsprogs: online scrub fixes Darrick J. Wong
@ 2018-03-21  3:39 ` Darrick J. Wong
  2018-04-03 17:30   ` Eric Sandeen
  2018-04-11  0:27   ` [PATCH v2 " Darrick J. Wong
  2018-03-21  3:39 ` [PATCH 02/14] xfs_scrub: only run ascii name checks if unicode name checker Darrick J. Wong
                   ` (12 subsequent siblings)
  13 siblings, 2 replies; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-21  3:39 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Avoid a buffer overflow when we're formatting extended attribute names
for name checking.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/phase5.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)


diff --git a/scrub/phase5.c b/scrub/phase5.c
index 8e0a1be..36821d0 100644
--- a/scrub/phase5.c
+++ b/scrub/phase5.c
@@ -143,6 +143,8 @@ static const struct xfs_attr_ns attr_ns[] = {
 	{ATTR_SECURE,		"secure"},
 	{0, NULL},
 };
+/* Enough space to handle the prefix. */
+#define ATTR_NAME_MAX		(NAME_MAX + 8)
 
 /*
  * Check all the xattr names in a particular namespace of a file handle
@@ -158,7 +160,7 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
 {
 	struct attrlist_cursor		cur;
 	char				attrbuf[XFS_XATTR_LIST_MAX];
-	char				keybuf[NAME_MAX + 1];
+	char				keybuf[ATTR_NAME_MAX + 1];
 	struct attrlist			*attrlist = (struct attrlist *)attrbuf;
 	struct attrlist_ent		*ent;
 	struct unicrash			*uc;
@@ -172,14 +174,14 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
 
 	memset(attrbuf, 0, XFS_XATTR_LIST_MAX);
 	memset(&cur, 0, sizeof(cur));
-	memset(keybuf, 0, NAME_MAX + 1);
+	memset(keybuf, 0, ATTR_NAME_MAX + 1);
 	error = attr_list_by_handle(handle, sizeof(*handle), attrbuf,
 			XFS_XATTR_LIST_MAX, attr_ns->flags, &cur);
 	while (!error) {
 		/* Examine the xattrs. */
 		for (i = 0; i < attrlist->al_count; i++) {
 			ent = ATTR_ENTRY(attrlist, i);
-			snprintf(keybuf, NAME_MAX, "%s.%s", attr_ns->name,
+			snprintf(keybuf, ATTR_NAME_MAX, "%s.%s", attr_ns->name,
 					ent->a_name);
 			moveon = xfs_scrub_check_name(ctx, descr,
 					_("extended attribute"), keybuf);


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 02/14] xfs_scrub: only run ascii name checks if unicode name checker
  2018-03-21  3:39 [PATCH 00/14] xfsprogs: online scrub fixes Darrick J. Wong
  2018-03-21  3:39 ` [PATCH 01/14] xfs_scrub: avoid buffer overflow when scanning attributes Darrick J. Wong
@ 2018-03-21  3:39 ` Darrick J. Wong
  2018-04-03 17:49   ` Eric Sandeen
  2018-03-21  3:39 ` [PATCH 03/14] xfs_scrub: don't complain about different normalization Darrick J. Wong
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-21  3:39 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Skip the ASCII name checks if the Unicode name checker is going to run,
since the latter covers everything that the former does.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/phase5.c |   24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)


diff --git a/scrub/phase5.c b/scrub/phase5.c
index 36821d0..decda02 100644
--- a/scrub/phase5.c
+++ b/scrub/phase5.c
@@ -113,11 +113,11 @@ xfs_scrub_scan_dirents(
 
 	dentry = readdir(dir);
 	while (dentry) {
-		moveon = xfs_scrub_check_name(ctx, descr, _("directory"),
-				dentry->d_name);
-		if (!moveon)
-			break;
-		moveon = unicrash_check_dir_name(uc, descr, dentry);
+		if (uc)
+			moveon = unicrash_check_dir_name(uc, descr, dentry);
+		else
+			moveon = xfs_scrub_check_name(ctx, descr,
+					_("directory"), dentry->d_name);
 		if (!moveon)
 			break;
 		dentry = readdir(dir);
@@ -163,7 +163,7 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
 	char				keybuf[ATTR_NAME_MAX + 1];
 	struct attrlist			*attrlist = (struct attrlist *)attrbuf;
 	struct attrlist_ent		*ent;
-	struct unicrash			*uc;
+	struct unicrash			*uc = NULL;
 	bool				moveon = true;
 	int				i;
 	int				error;
@@ -183,11 +183,13 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
 			ent = ATTR_ENTRY(attrlist, i);
 			snprintf(keybuf, ATTR_NAME_MAX, "%s.%s", attr_ns->name,
 					ent->a_name);
-			moveon = xfs_scrub_check_name(ctx, descr,
-					_("extended attribute"), keybuf);
-			if (!moveon)
-				goto out;
-			moveon = unicrash_check_xattr_name(uc, descr, keybuf);
+			if (uc)
+				moveon = unicrash_check_xattr_name(uc, descr,
+						keybuf);
+			else
+				moveon = xfs_scrub_check_name(ctx, descr,
+						_("extended attribute"),
+						keybuf);
 			if (!moveon)
 				goto out;
 		}


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 03/14] xfs_scrub: don't complain about different normalization
  2018-03-21  3:39 [PATCH 00/14] xfsprogs: online scrub fixes Darrick J. Wong
  2018-03-21  3:39 ` [PATCH 01/14] xfs_scrub: avoid buffer overflow when scanning attributes Darrick J. Wong
  2018-03-21  3:39 ` [PATCH 02/14] xfs_scrub: only run ascii name checks if unicode name checker Darrick J. Wong
@ 2018-03-21  3:39 ` Darrick J. Wong
  2018-04-10 23:37   ` Eric Sandeen
  2018-03-21  3:40 ` [PATCH 04/14] xfs_scrub: communicate name problems via flagset instead of booleans Darrick J. Wong
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-21  3:39 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Since there are different ways to normalize utf8 names, don't complain
when we find a name that is normalized in a different way than the NFKC
that we use to find duplicate names.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/unicrash.c |   13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)


diff --git a/scrub/unicrash.c b/scrub/unicrash.c
index 0b5d1fa..10d7c14 100644
--- a/scrub/unicrash.c
+++ b/scrub/unicrash.c
@@ -256,7 +256,6 @@ unicrash_complain(
 	struct unicrash		*uc,
 	const char		*descr,
 	const char		*what,
-	bool			normal,
 	bool			unique,
 	const char		*name,
 	uint8_t			*uniname)
@@ -267,10 +266,6 @@ unicrash_complain(
 	bad1 = string_escape(name);
 	bad2 = string_escape((char *)uniname);
 
-	if (!normal && should_warn_about_name(uc->ctx))
-		str_info(uc->ctx, descr,
-_("Unicode name \"%s\" in %s should be normalized as \"%s\"."),
-				bad1, what, bad2);
 	if (!unique)
 		str_warn(uc->ctx, descr,
 _("Duplicate normalized Unicode name \"%s\" found in %s."),
@@ -342,20 +337,18 @@ __unicrash_check_name(
 {
 	uint8_t			uniname[(NAME_MAX * 2) + 1];
 	bool			moveon;
-	bool			normal;
 	bool			unique;
 
 	memset(uniname, 0, (NAME_MAX * 2) + 1);
-	normal = unicrash_normalize(name, uniname, NAME_MAX * 2);
+	unicrash_normalize(name, uniname, NAME_MAX * 2);
 	moveon = unicrash_add(uc, uniname, ino, &unique);
 	if (!moveon)
 		return false;
 
-	if (normal && unique)
+	if (unique)
 		return true;
 
-	unicrash_complain(uc, descr, namedescr, normal, unique, name,
-			uniname);
+	unicrash_complain(uc, descr, namedescr, unique, name, uniname);
 	return true;
 }
 


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 04/14] xfs_scrub: communicate name problems via flagset instead of booleans
  2018-03-21  3:39 [PATCH 00/14] xfsprogs: online scrub fixes Darrick J. Wong
                   ` (2 preceding siblings ...)
  2018-03-21  3:39 ` [PATCH 03/14] xfs_scrub: don't complain about different normalization Darrick J. Wong
@ 2018-03-21  3:40 ` Darrick J. Wong
  2018-04-10 23:46   ` Eric Sandeen
  2018-03-21  3:40 ` [PATCH 05/14] xfs_scrub: make name_entry a first class structure Darrick J. Wong
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-21  3:40 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Use an unsigned int to pass around name error flags instead of booleans.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/unicrash.c |   43 ++++++++++++++++++++++++++++++-------------
 1 file changed, 30 insertions(+), 13 deletions(-)


diff --git a/scrub/unicrash.c b/scrub/unicrash.c
index 10d7c14..3538a60 100644
--- a/scrub/unicrash.c
+++ b/scrub/unicrash.c
@@ -77,6 +77,14 @@ struct unicrash {
 #define UNICRASH_SZ(nr)		(sizeof(struct unicrash) + \
 				 (nr * sizeof(struct name_entry *)))
 
+/* Things to complain about in Unicode naming. */
+
+/*
+ * Multiple names resolve to the same normalized string and therefore render
+ * identically.
+ */
+#define UNICRASH_NOT_UNIQUE	(1 << 0)
+
 /*
  * We only care about validating utf8 collisions if the underlying
  * system configuration says we're using utf8.  If the language
@@ -256,7 +264,7 @@ unicrash_complain(
 	struct unicrash		*uc,
 	const char		*descr,
 	const char		*what,
-	bool			unique,
+	unsigned int		badflags,
 	const char		*name,
 	uint8_t			*uniname)
 {
@@ -266,11 +274,20 @@ unicrash_complain(
 	bad1 = string_escape(name);
 	bad2 = string_escape((char *)uniname);
 
-	if (!unique)
+	/*
+	 * Two names that normalize to the same string will render
+	 * identically even though the filesystem considers them unique
+	 * names.  "cafe\xcc\x81" and "caf\xc3\xa9" have different byte
+	 * sequences, but they both appear as "café".
+	 */
+	if (badflags & UNICRASH_NOT_UNIQUE) {
 		str_warn(uc->ctx, descr,
-_("Duplicate normalized Unicode name \"%s\" found in %s."),
-				bad1, what);
+_("Unicode name \"%s\" in %s renders identically to \"%s\"."),
+				bad1, what, bad2);
+		goto out;
+	}
 
+out:
 	free(bad1);
 	free(bad2);
 }
@@ -291,7 +308,7 @@ unicrash_add(
 	struct unicrash		*uc,
 	uint8_t			*uniname,
 	xfs_ino_t		ino,
-	bool			*unique)
+	unsigned int		*badflags)
 {
 	struct name_entry	*ne;
 	struct name_entry	*x;
@@ -304,8 +321,9 @@ unicrash_add(
 	hash = unicrash_hashname(uniname, uninamelen);
 	bucket = hash % uc->nr_buckets;
 	for (nep = &uc->buckets[bucket], ne = *nep; ne != NULL; ne = x) {
-		if (u8_strcmp(uniname, ne->uniname) == 0) {
-			*unique = uc->compare_ino ? ne->ino == ino : false;
+		if (u8_strcmp(uniname, ne->uniname) == 0 &&
+		    (uc->compare_ino ? ino != ne->ino : true)) {
+			*badflags |= UNICRASH_NOT_UNIQUE;
 			return true;
 		}
 		nep = &ne->next;
@@ -321,7 +339,6 @@ unicrash_add(
 	x->uninamelen = uninamelen;
 	memcpy(x->uniname, uniname, uninamelen + 1);
 	*nep = x;
-	*unique = true;
 
 	return true;
 }
@@ -336,19 +353,19 @@ __unicrash_check_name(
 	xfs_ino_t		ino)
 {
 	uint8_t			uniname[(NAME_MAX * 2) + 1];
+	unsigned int		badflags = 0;
 	bool			moveon;
-	bool			unique;
 
 	memset(uniname, 0, (NAME_MAX * 2) + 1);
 	unicrash_normalize(name, uniname, NAME_MAX * 2);
-	moveon = unicrash_add(uc, uniname, ino, &unique);
+	moveon = unicrash_add(uc, uniname, ino, &badflags);
 	if (!moveon)
 		return false;
 
-	if (unique)
-		return true;
+	if (badflags)
+		unicrash_complain(uc, descr, namedescr, badflags, name,
+				uniname);
 
-	unicrash_complain(uc, descr, namedescr, unique, name, uniname);
 	return true;
 }
 


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 05/14] xfs_scrub: make name_entry a first class structure
  2018-03-21  3:39 [PATCH 00/14] xfsprogs: online scrub fixes Darrick J. Wong
                   ` (3 preceding siblings ...)
  2018-03-21  3:40 ` [PATCH 04/14] xfs_scrub: communicate name problems via flagset instead of booleans Darrick J. Wong
@ 2018-03-21  3:40 ` Darrick J. Wong
  2018-03-21  3:40 ` [PATCH 06/14] xfs_scrub: transition from libunistring to libicu for Unicode processing Darrick J. Wong
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-21  3:40 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Instead of open-coding the construction and hashtable insertion of name
entries, make name_entry a first class object.  This means that we now
have name_entry_ prefix functions that take care of computing Unicode
normalized names as part of name_entry construction, and we pass around
the name_entries when we're looking for suspicious characters and
identically rendering names.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/unicrash.c |  264 ++++++++++++++++++++++++++++++++----------------------
 1 file changed, 157 insertions(+), 107 deletions(-)


diff --git a/scrub/unicrash.c b/scrub/unicrash.c
index 3538a60..51da32c 100644
--- a/scrub/unicrash.c
+++ b/scrub/unicrash.c
@@ -61,9 +61,16 @@
 
 struct name_entry {
 	struct name_entry	*next;
+
+	/* NFKC normalized name */
+	uint8_t			*normstr;
+	size_t			normstrlen;
+
 	xfs_ino_t		ino;
-	size_t			uninamelen;
-	uint8_t			uniname[0];
+
+	/* Raw UTF8 name */
+	size_t			namelen;
+	char			name[0];
 };
 #define NAME_ENTRY_SZ(nl)	(sizeof(struct name_entry) + 1 + \
 				 (nl * sizeof(uint8_t)))
@@ -119,6 +126,120 @@ is_utf8_locale(void)
 	return answer;
 }
 
+/*
+ * Generate normalized form of the name.
+ * If this fails, just forget everything; this is an advisory checker.
+ */
+static bool
+name_entry_compute_checknames(
+	struct unicrash		*uc,
+	struct name_entry	*entry)
+{
+	uint8_t			*normstr;
+	size_t			normstrlen;
+
+	normstrlen = (entry->namelen * 2) + 1;
+	normstr = calloc(normstrlen, sizeof(uint8_t));
+	if (!normstr)
+		return false;
+
+	if (!u8_normalize(UNINORM_NFKC, (const uint8_t *)entry->name,
+			entry->namelen, normstr, &normstrlen));
+		goto out_normstr;
+
+	entry->normstr = normstr;
+	entry->normstrlen = normstrlen;
+	return true;
+out_normstr:
+	free(normstr);
+	return false;
+}
+
+/* Create a new name entry, returns false if we could not succeed. */
+static bool
+name_entry_create(
+	struct unicrash		*uc,
+	const char		*name,
+	xfs_ino_t		ino,
+	struct name_entry	**entry)
+{
+	struct name_entry	*new_entry;
+	size_t			namelen = strlen(name);
+
+	/* Create new entry */
+	new_entry = calloc(NAME_ENTRY_SZ(namelen), 1);
+	if (!new_entry)
+		return false;
+	new_entry->next = NULL;
+	new_entry->ino = ino;
+	memcpy(new_entry->name, name, namelen);
+	new_entry->name[namelen] = 0;
+	new_entry->namelen = namelen;
+
+	/* Normalize name to find collisions. */
+	if (!name_entry_compute_checknames(uc, new_entry))
+		goto out;
+
+	*entry = new_entry;
+	return true;
+
+out:
+	free(new_entry);
+	return false;
+}
+
+/* Free a name entry */
+static void
+name_entry_free(
+	struct name_entry	*entry)
+{
+	free(entry->normstr);
+	free(entry);
+}
+
+/* Adapt the dirhash function from libxfs, avoid linking with libxfs. */
+
+#define rol32(x, y)		(((x) << (y)) | ((x) >> (32 - (y))))
+
+/*
+ * Implement a simple hash on a character string.
+ * Rotate the hash value by 7 bits, then XOR each character in.
+ * This is implemented with some source-level loop unrolling.
+ */
+static xfs_dahash_t
+name_entry_hash(
+	struct name_entry	*entry)
+{
+	uint8_t			*name;
+	size_t			namelen;
+	xfs_dahash_t		hash;
+
+	name = entry->normstr;
+	namelen = entry->normstrlen;
+
+	/*
+	 * Do four characters at a time as long as we can.
+	 */
+	for (hash = 0; namelen >= 4; namelen -= 4, name += 4)
+		hash = (name[0] << 21) ^ (name[1] << 14) ^ (name[2] << 7) ^
+		       (name[3] << 0) ^ rol32(hash, 7 * 4);
+
+	/*
+	 * Now do the rest of the characters.
+	 */
+	switch (namelen) {
+	case 3:
+		return (name[0] << 14) ^ (name[1] << 7) ^ (name[2] << 0) ^
+		       rol32(hash, 7 * 3);
+	case 2:
+		return (name[0] << 7) ^ (name[1] << 0) ^ rol32(hash, 7 * 2);
+	case 1:
+		return (name[0] << 0) ^ rol32(hash, 7 * 1);
+	default: /* case 0: */
+		return hash;
+	}
+}
+
 /* Initialize the collision detector. */
 static bool
 unicrash_init(
@@ -190,89 +311,28 @@ unicrash_free(
 	for (i = 0; i < uc->nr_buckets; i++) {
 		for (ne = uc->buckets[i]; ne != NULL; ne = x) {
 			x = ne->next;
-			free(ne);
+			name_entry_free(ne);
 		}
 	}
 	free(uc);
 }
 
-/* Steal the dirhash function from libxfs, avoid linking with libxfs. */
-
-#define rol32(x, y)		(((x) << (y)) | ((x) >> (32 - (y))))
-
-/*
- * Implement a simple hash on a character string.
- * Rotate the hash value by 7 bits, then XOR each character in.
- * This is implemented with some source-level loop unrolling.
- */
-static xfs_dahash_t
-unicrash_hashname(
-	const uint8_t		*name,
-	size_t			namelen)
-{
-	xfs_dahash_t		hash;
-
-	/*
-	 * Do four characters at a time as long as we can.
-	 */
-	for (hash = 0; namelen >= 4; namelen -= 4, name += 4)
-		hash = (name[0] << 21) ^ (name[1] << 14) ^ (name[2] << 7) ^
-		       (name[3] << 0) ^ rol32(hash, 7 * 4);
-
-	/*
-	 * Now do the rest of the characters.
-	 */
-	switch (namelen) {
-	case 3:
-		return (name[0] << 14) ^ (name[1] << 7) ^ (name[2] << 0) ^
-		       rol32(hash, 7 * 3);
-	case 2:
-		return (name[0] << 7) ^ (name[1] << 0) ^ rol32(hash, 7 * 2);
-	case 1:
-		return (name[0] << 0) ^ rol32(hash, 7 * 1);
-	default: /* case 0: */
-		return hash;
-	}
-}
-
-/*
- * Normalize a name according to Unicode NFKC normalization rules.
- * Returns true if the name was already normalized.
- */
-static bool
-unicrash_normalize(
-	const char		*in,
-	uint8_t			*out,
-	size_t			outlen)
-{
-	size_t			inlen = strlen(in);
-
-	assert(inlen <= outlen);
-	if (!u8_normalize(UNINORM_NFKC, (const uint8_t *)in, inlen,
-			out, &outlen)) {
-		/* Didn't normalize, just return the same buffer. */
-		memcpy(out, in, inlen + 1);
-		return true;
-	}
-	out[outlen] = 0;
-	return outlen == inlen ? memcmp(in, out, inlen) == 0 : false;
-}
-
 /* Complain about Unicode problems. */
 static void
 unicrash_complain(
 	struct unicrash		*uc,
 	const char		*descr,
 	const char		*what,
+	struct name_entry	*entry,
 	unsigned int		badflags,
-	const char		*name,
-	uint8_t			*uniname)
+	struct name_entry	*dup_entry)
 {
 	char			*bad1 = NULL;
 	char			*bad2 = NULL;
 
-	bad1 = string_escape(name);
-	bad2 = string_escape((char *)uniname);
+	bad1 = string_escape(entry->name);
+	if (dup_entry)
+		bad2 = string_escape(dup_entry->name);
 
 	/*
 	 * Two names that normalize to the same string will render
@@ -294,52 +354,39 @@ _("Unicode name \"%s\" in %s renders identically to \"%s\"."),
 
 /*
  * Try to add a name -> ino entry to the collision detector.  The name
- * must be normalized according to Unicode NFKC normalization rules to
- * detect byte-unique names that map to the same sequence of Unicode
- * code points.
- *
- * This function returns true either if there was no previous mapping or
- * there was a mapping that matched exactly.  It returns false if
- * there is already a record with that name pointing to a different
- * inode.
+ * must be normalized according to Unicode NFKC rules to detect names that
+ * could be confused with each other.
  */
 static bool
 unicrash_add(
 	struct unicrash		*uc,
-	uint8_t			*uniname,
-	xfs_ino_t		ino,
-	unsigned int		*badflags)
+	struct name_entry	*new_entry,
+	unsigned int		*badflags,
+	struct name_entry	**existing_entry)
 {
-	struct name_entry	*ne;
-	struct name_entry	*x;
-	struct name_entry	**nep;
-	size_t			uninamelen = u8_strlen(uniname);
+	struct name_entry	*entry;
 	size_t			bucket;
 	xfs_dahash_t		hash;
 
-	/* Do we already know about that name? */
-	hash = unicrash_hashname(uniname, uninamelen);
+	/* Store name in hashtable. */
+	hash = name_entry_hash(new_entry);
 	bucket = hash % uc->nr_buckets;
-	for (nep = &uc->buckets[bucket], ne = *nep; ne != NULL; ne = x) {
-		if (u8_strcmp(uniname, ne->uniname) == 0 &&
-		    (uc->compare_ino ? ino != ne->ino : true)) {
+	entry = uc->buckets[bucket];
+	new_entry->next = entry;
+	uc->buckets[bucket] = new_entry;
+
+	while (entry != NULL) {
+		/* Same normalization? */
+		if (new_entry->normstrlen == entry->normstrlen &&
+		    !u8_strcmp(new_entry->normstr, entry->normstr) &&
+		    (uc->compare_ino ? entry->ino != new_entry->ino : true)) {
 			*badflags |= UNICRASH_NOT_UNIQUE;
+			*existing_entry = entry;
 			return true;
 		}
-		nep = &ne->next;
-		x = ne->next;
+		entry = entry->next;
 	}
 
-	/* Remember that name. */
-	x = malloc(NAME_ENTRY_SZ(uninamelen));
-	if (!x)
-		return false;
-	x->next = NULL;
-	x->ino = ino;
-	x->uninamelen = uninamelen;
-	memcpy(x->uniname, uniname, uninamelen + 1);
-	*nep = x;
-
 	return true;
 }
 
@@ -352,19 +399,22 @@ __unicrash_check_name(
 	const char		*name,
 	xfs_ino_t		ino)
 {
-	uint8_t			uniname[(NAME_MAX * 2) + 1];
+	struct name_entry	*dup_entry = NULL;
+	struct name_entry	*new_entry;
 	unsigned int		badflags = 0;
 	bool			moveon;
 
-	memset(uniname, 0, (NAME_MAX * 2) + 1);
-	unicrash_normalize(name, uniname, NAME_MAX * 2);
-	moveon = unicrash_add(uc, uniname, ino, &badflags);
+	/* If we can't create entry data, just skip it. */
+	if (!name_entry_create(uc, name, ino, &new_entry))
+		return true;
+
+	moveon = unicrash_add(uc, new_entry, &badflags, &dup_entry);
 	if (!moveon)
 		return false;
 
 	if (badflags)
-		unicrash_complain(uc, descr, namedescr, badflags, name,
-				uniname);
+		unicrash_complain(uc, descr, namedescr, new_entry, badflags,
+				dup_entry);
 
 	return true;
 }


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 06/14] xfs_scrub: transition from libunistring to libicu for Unicode processing
  2018-03-21  3:39 [PATCH 00/14] xfsprogs: online scrub fixes Darrick J. Wong
                   ` (4 preceding siblings ...)
  2018-03-21  3:40 ` [PATCH 05/14] xfs_scrub: make name_entry a first class structure Darrick J. Wong
@ 2018-03-21  3:40 ` Darrick J. Wong
  2018-03-21  3:40 ` [PATCH 07/14] xfs_scrub: check name for suspicious characters Darrick J. Wong
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-21  3:40 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move off of libunistring and onto libicu for Unicode name scanning.
This will make it easy to warn about unicode code points that do not
belong in identifiers (directional overrides, zero width elements) and
warn about names that could render similarly enough to cause confusion.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 configure.ac            |   13 ++++++++--
 debian/control          |    2 +
 include/builddefs.in    |    6 +++-
 m4/Makefile             |    2 +
 m4/package_icu.m4       |    6 ++++
 m4/package_unistring.m4 |   19 --------------
 scrub/Makefile          |    8 +++---
 scrub/unicrash.c        |   64 ++++++++++++++++++++++++++++++++++++-----------
 scrub/unicrash.h        |    4 +--
 9 files changed, 78 insertions(+), 46 deletions(-)
 create mode 100644 m4/package_icu.m4
 delete mode 100644 m4/package_unistring.m4


diff --git a/configure.ac b/configure.ac
index 686bf78..1885c45 100644
--- a/configure.ac
+++ b/configure.ac
@@ -95,6 +95,11 @@ AC_ARG_ENABLE(lto,
 	enable_lto=probe)
 AC_SUBST(enable_lto)
 
+# Enable libicu for xfs_scrubbing of malicious unicode sequences in names
+AC_ARG_ENABLE(libicu,
+[ --enable-libicu=[yes/no]   Enable Unicode name scanning (libicu) [default=probe]],,
+	enable_libicu=probe)
+
 #
 # If the user specified a libdir ending in lib64 do not append another
 # 64 to the library names.
@@ -173,8 +178,12 @@ AC_HAVE_DEVMAPPER
 AC_HAVE_MALLINFO
 AC_PACKAGE_WANT_ATTRIBUTES_H
 AC_HAVE_LIBATTR
-AC_PACKAGE_WANT_UNINORM_H
-AC_HAVE_U8NORMALIZE
+if test "$enable_libicu" = "yes" || test "$enable_libicu" = "probe"; then
+	AC_HAVE_LIBICU
+fi
+if test "$enable_libicu" = "yes" && test "$have_libicu" != "yes"; then
+        AC_MSG_ERROR([libicu not found.])
+fi
 AC_HAVE_OPENAT
 AC_HAVE_FSTATAT
 AC_HAVE_SG_IO
diff --git a/debian/control b/debian/control
index 2937c99..f4f807b 100644
--- a/debian/control
+++ b/debian/control
@@ -3,7 +3,7 @@ Section: admin
 Priority: optional
 Maintainer: XFS Development Team <linux-xfs@vger.kernel.org>
 Uploaders: Nathan Scott <nathans@debian.org>, Anibal Monsalve Salazar <anibal@debian.org>
-Build-Depends: uuid-dev, dh-autoreconf, debhelper (>= 5), gettext, libtool, libreadline-gplv2-dev, libblkid-dev (>= 2.17), linux-libc-dev, libdevmapper-dev, libattr1-dev, libunistring-dev, dh-python, pkg-config
+Build-Depends: uuid-dev, dh-autoreconf, debhelper (>= 5), gettext, libtool, libreadline-gplv2-dev, libblkid-dev (>= 2.17), linux-libc-dev, libdevmapper-dev, libattr1-dev, libicu-dev, dh-python, pkg-config
 Standards-Version: 4.0.0
 Homepage: https://xfs.wiki.kernel.org/
 
diff --git a/include/builddefs.in b/include/builddefs.in
index 7a2a626..8aac06c 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -36,7 +36,6 @@ LIBEDITLINE = @libeditline@
 LIBREADLINE = @libreadline@
 LIBBLKID = @libblkid@
 LIBDEVMAPPER = @libdevmapper@
-LIBUNISTRING = @libunistring@
 LIBXFS = $(TOPDIR)/libxfs/libxfs.la
 LIBFROG = $(TOPDIR)/libfrog/libfrog.la
 LIBXCMD = $(TOPDIR)/libxcmd/libxcmd.la
@@ -122,7 +121,7 @@ HAVE_MAP_SYNC = @have_map_sync@
 HAVE_DEVMAPPER = @have_devmapper@
 HAVE_MALLINFO = @have_mallinfo@
 HAVE_LIBATTR = @have_libattr@
-HAVE_U8NORMALIZE = @have_u8normalize@
+HAVE_LIBICU = @have_libicu@
 HAVE_OPENAT = @have_openat@
 HAVE_FSTATAT = @have_fstatat@
 HAVE_SG_IO = @have_sg_io@
@@ -173,6 +172,9 @@ ifeq ($(HAVE_GETFSMAP),yes)
 PCFLAGS+= -DHAVE_GETFSMAP
 endif
 
+LIBICU_LIBS = @libicu_LIBS@
+LIBICU_CFLAGS = @libicu_CFLAGS@
+
 SANITIZER_CFLAGS += @addrsan_cflags@ @threadsan_cflags@ @ubsan_cflags@
 SANITIZER_LDFLAGS += @addrsan_ldflags@ @threadsan_ldflags@ @ubsan_ldflags@
 
diff --git a/m4/Makefile b/m4/Makefile
index a6d11e9..cf0ce60 100644
--- a/m4/Makefile
+++ b/m4/Makefile
@@ -23,7 +23,7 @@ LSRCFILES = \
 	package_sanitizer.m4 \
 	package_services.m4 \
 	package_types.m4 \
-	package_unistring.m4 \
+	package_icu.m4 \
 	package_utilies.m4 \
 	package_uuiddev.m4 \
 	multilib.m4 \
diff --git a/m4/package_icu.m4 b/m4/package_icu.m4
new file mode 100644
index 0000000..3ccbe0c
--- /dev/null
+++ b/m4/package_icu.m4
@@ -0,0 +1,6 @@
+AC_DEFUN([AC_HAVE_LIBICU],
+  [ PKG_CHECK_MODULES([libicu], [icu-i18n], [have_libicu=yes], [have_libicu=no])
+    AC_SUBST(have_libicu)
+    AC_SUBST(libicu_CFLAGS)
+    AC_SUBST(libicu_LIBS)
+  ])
diff --git a/m4/package_unistring.m4 b/m4/package_unistring.m4
deleted file mode 100644
index 9cbfcb0..0000000
--- a/m4/package_unistring.m4
+++ /dev/null
@@ -1,19 +0,0 @@
-AC_DEFUN([AC_PACKAGE_WANT_UNINORM_H],
-  [ AC_CHECK_HEADERS(uninorm.h)
-    if test $ac_cv_header_uninorm_h = no; then
-	AC_CHECK_HEADERS(uninorm.h,, [
-	echo
-	echo 'WARNING: could not find a valid uninorm.h header.'])
-    fi
-  ])
-
-AC_DEFUN([AC_HAVE_U8NORMALIZE],
-  [ AC_CHECK_LIB(unistring, u8_normalize,[
-	libunistring=-lunistring
-	have_u8normalize=yes
-    ],[
-	echo
-	echo 'WARNING: xfs_scrub will not be built with Unicode libraries.'])
-    AC_SUBST(libunistring)
-    AC_SUBST(have_u8normalize)
-  ])
diff --git a/scrub/Makefile b/scrub/Makefile
index 0632794..bcc05a0 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -68,8 +68,8 @@ spacemap.c \
 vfs.c \
 xfs_scrub.c
 
-LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD) $(LIBUNISTRING) $(LIBRT)
-LTDEPENDENCIES += $(LIBHANDLE) $(LIBFROG) $(LIBUNISTRING) $(LIBRT)
+LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD) $(LIBICU_LIBS) $(LIBRT)
+LTDEPENDENCIES += $(LIBHANDLE) $(LIBFROG)
 LLDFLAGS = -static
 
 ifeq ($(HAVE_MALLINFO),yes)
@@ -84,9 +84,9 @@ ifeq ($(HAVE_LIBATTR),yes)
 LCFLAGS += -DHAVE_LIBATTR
 endif
 
-ifeq ($(HAVE_U8NORMALIZE),yes)
+ifeq ($(HAVE_LIBICU),yes)
 CFILES += unicrash.c
-LCFLAGS += -DHAVE_U8NORMALIZE
+LCFLAGS += -DHAVE_LIBICU $(LIBICU_CFLAGS)
 endif
 
 ifeq ($(HAVE_SG_IO),yes)
diff --git a/scrub/unicrash.c b/scrub/unicrash.c
index 51da32c..06ccadf 100644
--- a/scrub/unicrash.c
+++ b/scrub/unicrash.c
@@ -23,8 +23,9 @@
 #include <dirent.h>
 #include <sys/types.h>
 #include <sys/statvfs.h>
-#include <unistr.h>
-#include <uninorm.h>
+#include <strings.h>
+#include <unicode/ustring.h>
+#include <unicode/unorm2.h>
 #include "path.h"
 #include "xfs_scrub.h"
 #include "common.h"
@@ -63,7 +64,7 @@ struct name_entry {
 	struct name_entry	*next;
 
 	/* NFKC normalized name */
-	uint8_t			*normstr;
+	UChar			*normstr;
 	size_t			normstrlen;
 
 	xfs_ino_t		ino;
@@ -77,6 +78,7 @@ struct name_entry {
 
 struct unicrash {
 	struct scrub_ctx	*ctx;
+	const UNormalizer2	*normalizer;
 	bool			compare_ino;
 	size_t			nr_buckets;
 	struct name_entry	*buckets[0];
@@ -135,23 +137,48 @@ name_entry_compute_checknames(
 	struct unicrash		*uc,
 	struct name_entry	*entry)
 {
-	uint8_t			*normstr;
-	size_t			normstrlen;
-
-	normstrlen = (entry->namelen * 2) + 1;
-	normstr = calloc(normstrlen, sizeof(uint8_t));
-	if (!normstr)
+	UChar			*normstr;
+	UChar			*unistr;
+	int32_t			normstrlen;
+	int32_t			unistrlen;
+	UErrorCode		uerr = U_ZERO_ERROR;
+
+	/* Convert bytestr to unistr for normalization */
+	u_strFromUTF8(NULL, 0, &unistrlen, entry->name, entry->namelen, &uerr);
+	if (uerr != U_BUFFER_OVERFLOW_ERROR)
 		return false;
-
-	if (!u8_normalize(UNINORM_NFKC, (const uint8_t *)entry->name,
-			entry->namelen, normstr, &normstrlen));
+	uerr = U_ZERO_ERROR;
+	unistr = calloc(unistrlen + 1, sizeof(UChar));
+	if (!unistr)
+		return false;
+	u_strFromUTF8(unistr, unistrlen, NULL, entry->name, entry->namelen,
+			&uerr);
+	if (U_FAILURE(uerr))
+		goto out_unistr;
+
+	/* Normalize the string. */
+	normstrlen = unorm2_normalize(uc->normalizer, unistr, unistrlen, NULL,
+			0, &uerr);
+	if (uerr != U_BUFFER_OVERFLOW_ERROR)
+		goto out_unistr;
+	uerr = U_ZERO_ERROR;
+	normstr = calloc(normstrlen + 1, sizeof(UChar));
+	if (!normstr)
+		goto out_unistr;
+	unorm2_normalize(uc->normalizer, unistr, unistrlen, normstr, normstrlen,
+			&uerr);
+	if (U_FAILURE(uerr))
 		goto out_normstr;
 
 	entry->normstr = normstr;
 	entry->normstrlen = normstrlen;
+	free(unistr);
 	return true;
+
 out_normstr:
 	free(normstr);
+out_unistr:
+	free(unistr);
 	return false;
 }
 
@@ -214,8 +241,8 @@ name_entry_hash(
 	size_t			namelen;
 	xfs_dahash_t		hash;
 
-	name = entry->normstr;
-	namelen = entry->normstrlen;
+	name = (uint8_t *)entry->normstr;
+	namelen = entry->normstrlen * sizeof(UChar);
 
 	/*
 	 * Do four characters at a time as long as we can.
@@ -249,6 +276,7 @@ unicrash_init(
 	size_t			nr_buckets)
 {
 	struct unicrash		*p;
+	UErrorCode		uerr = U_ZERO_ERROR;
 
 	if (!is_utf8_locale()) {
 		*ucp = NULL;
@@ -266,9 +294,15 @@ unicrash_init(
 	p->ctx = ctx;
 	p->nr_buckets = nr_buckets;
 	p->compare_ino = compare_ino;
+	p->normalizer = unorm2_getNFKCInstance(&uerr);
+	if (U_FAILURE(uerr))
+		goto out_free;
 	*ucp = p;
 
 	return true;
+out_free:
+	free(p);
+	return false;
 }
 
 /* Initialize the collision detector for a directory. */
@@ -378,7 +412,7 @@ unicrash_add(
 	while (entry != NULL) {
 		/* Same normalization? */
 		if (new_entry->normstrlen == entry->normstrlen &&
-		    !u8_strcmp(new_entry->normstr, entry->normstr) &&
+		    !u_strcmp(new_entry->normstr, entry->normstr) &&
 		    (uc->compare_ino ? entry->ino != new_entry->ino : true)) {
 			*badflags |= UNICRASH_NOT_UNIQUE;
 			*existing_entry = entry;
diff --git a/scrub/unicrash.h b/scrub/unicrash.h
index 3337319..67e70ae 100644
--- a/scrub/unicrash.h
+++ b/scrub/unicrash.h
@@ -23,7 +23,7 @@
 struct unicrash;
 
 /* Unicode name collision detection. */
-#ifdef HAVE_U8NORMALIZE
+#ifdef HAVE_LIBICU
 
 struct dirent;
 
@@ -42,6 +42,6 @@ bool unicrash_check_xattr_name(struct unicrash *uc, const char *descr,
 # define unicrash_free(u)			do {(u) = (u);} while (0)
 # define unicrash_check_dir_name(u, d, n)	(true)
 # define unicrash_check_xattr_name(u, d, n)	(true)
-#endif /* HAVE_U8NORMALIZE */
+#endif /* HAVE_LIBICU */
 
 #endif /* XFS_SCRUB_UNICRASH_H_ */


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 07/14] xfs_scrub: check name for suspicious characters
  2018-03-21  3:39 [PATCH 00/14] xfsprogs: online scrub fixes Darrick J. Wong
                   ` (5 preceding siblings ...)
  2018-03-21  3:40 ` [PATCH 06/14] xfs_scrub: transition from libunistring to libicu for Unicode processing Darrick J. Wong
@ 2018-03-21  3:40 ` Darrick J. Wong
  2018-03-21  3:40 ` [PATCH 08/14] xfs_scrub: use Unicode skeleton function to find confusing names Darrick J. Wong
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-21  3:40 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Look for suspicious characters in each name we process.  This includes
control characters, text direction overrides, zero-width code points,
and names that mix characters from different directionalities.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/unicrash.c |  110 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 110 insertions(+)


diff --git a/scrub/unicrash.c b/scrub/unicrash.c
index 06ccadf..3b5b46e 100644
--- a/scrub/unicrash.c
+++ b/scrub/unicrash.c
@@ -94,6 +94,18 @@ struct unicrash {
  */
 #define UNICRASH_NOT_UNIQUE	(1 << 0)
 
+/* Name contains directional overrides. */
+#define UNICRASH_BIDI_OVERRIDE	(1 << 1)
+
+/* Name mixes left-to-right and right-to-left characters. */
+#define UNICRASH_BIDI_MIXED	(1 << 2)
+
+/* Control characters in name. */
+#define UNICRASH_CONTROL_CHAR	(1 << 3)
+
+/* Invisible characters.  Only a problem if we have collisions. */
+#define UNICRASH_ZERO_WIDTH	(1 << 4)
+
 /*
  * We only care about validating utf8 collisions if the underlying
  * system configuration says we're using utf8.  If the language
@@ -267,6 +279,66 @@ name_entry_hash(
 	}
 }
 
+/*
+ * Check a name for suspicious elements that have appeared in filename
+ * spoofing attacks.  This includes names that mixed directions or contain
+ * direction overrides control characters, both of which have appeared in
+ * filename spoofing attacks.
+ */
+static void
+name_entry_examine(
+	struct name_entry	*entry,
+	unsigned int		*badflags)
+{
+	UChar32			uchr;
+	int32_t			i;
+	uint8_t			mask = 0;
+
+	for (i = 0; i < entry->normstrlen;) {
+		U16_NEXT_UNSAFE(entry->normstr, i, uchr);
+
+		/* zero width character sequences */
+		switch (uchr) {
+		case 0x200B:	/* zero width space */
+		case 0x200C:	/* zero width non-joiner */
+		case 0x200D:	/* zero width joiner */
+		case 0xFEFF:	/* zero width non breaking space */
+		case 0x2060:	/* word joiner */
+		case 0x2061:	/* function application */
+		case 0x2062:	/* invisible times (multiply) */
+		case 0x2063:	/* invisible separator (comma) */
+		case 0x2064:	/* invisible plus (addition) */
+			*badflags |= UNICRASH_ZERO_WIDTH;
+			break;
+		}
+
+		/* control characters */
+		if (u_iscntrl(uchr))
+			*badflags |= UNICRASH_CONTROL_CHAR;
+
+		switch (u_charDirection(uchr)) {
+		case U_LEFT_TO_RIGHT:
+			mask |= 0x01;
+			break;
+		case U_RIGHT_TO_LEFT:
+			mask |= 0x02;
+			break;
+		case U_RIGHT_TO_LEFT_OVERRIDE:
+			*badflags |= UNICRASH_BIDI_OVERRIDE;
+			break;
+		case U_LEFT_TO_RIGHT_OVERRIDE:
+			*badflags |= UNICRASH_BIDI_OVERRIDE;
+			break;
+		default:
+			break;
+		}
+	}
+
+	/* mixing left-to-right and right-to-left chars */
+	if (mask == 0x3)
+		*badflags |= UNICRASH_BIDI_MIXED;
+}
+
 /* Initialize the collision detector. */
 static bool
 unicrash_init(
@@ -369,6 +441,18 @@ unicrash_complain(
 		bad2 = string_escape(dup_entry->name);
 
 	/*
+	 * Most filechooser UIs do not look for bidirectional overrides when
+	 * they render names.  This can result in misleading name presentation
+	 * that makes "hig<rtl>gnp.sh" render like "highs.png".
+	 */
+	if (badflags & UNICRASH_BIDI_OVERRIDE) {
+		str_warn(uc->ctx, descr,
+_("Unicode name \"%s\" in %s contains suspicious text direction overrides."),
+				bad1, what);
+		goto out;
+	}
+
+	/*
 	 * Two names that normalize to the same string will render
 	 * identically even though the filesystem considers them unique
 	 * names.  "cafe\xcc\x81" and "caf\xc3\xa9" have different byte
@@ -381,6 +465,30 @@ _("Unicode name \"%s\" in %s renders identically to \"%s\"."),
 		goto out;
 	}
 
+	/*
+	 * Unfiltered control characters can mess up your terminal and render
+	 * invisibly in filechooser UIs.
+	 */
+	if (badflags & UNICRASH_CONTROL_CHAR) {
+		str_warn(uc->ctx, descr,
+_("Unicode name \"%s\" in %s contains control characters."),
+				bad1, what);
+		goto out;
+	}
+
+	/*
+	 * It's not considered good practice (says Unicode) to mix LTR
+	 * characters with RTL characters.  The mere presence of different
+	 * bidirectional characters isn't enough to trip up software, so don't
+	 * warn about this too loudly.
+	 */
+	if (badflags & UNICRASH_BIDI_MIXED) {
+		str_info(uc->ctx, descr,
+_("Unicode name \"%s\" in %s mixes bidirectional characters."),
+				bad1, what);
+		goto out;
+	}
+
 out:
 	free(bad1);
 	free(bad2);
@@ -442,6 +550,8 @@ __unicrash_check_name(
 	if (!name_entry_create(uc, name, ino, &new_entry))
 		return true;
 
+	name_entry_examine(new_entry, &badflags);
+
 	moveon = unicrash_add(uc, new_entry, &badflags, &dup_entry);
 	if (!moveon)
 		return false;


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 08/14] xfs_scrub: use Unicode skeleton function to find confusing names
  2018-03-21  3:39 [PATCH 00/14] xfsprogs: online scrub fixes Darrick J. Wong
                   ` (6 preceding siblings ...)
  2018-03-21  3:40 ` [PATCH 07/14] xfs_scrub: check name for suspicious characters Darrick J. Wong
@ 2018-03-21  3:40 ` Darrick J. Wong
  2018-03-26 19:58   ` [PATCH v2 " Darrick J. Wong
  2018-03-21  3:40 ` [PATCH 09/14] xfs_scrub: don't warn about confusing names if dir/file only writable by root Darrick J. Wong
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-21  3:40 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Drop the weak normalization-based Unicode name collision detection in
favor of the confusable name guidelines provided in Unicode TR36 & TR39.
This means that we transform the original name into its Unicode skeleton
in order to do hashing-based collision detection.

The Unicode skeleton is defined as nfd(translation(nfd(string))), which
is to say that it flattens sequences that render ambiguously into a
unambiguous format.  For example, 'l' and '1' can render identically in
some typefaces, so they're both squashed to 'l'.  From the skeletons we
can figure out if two names will look the same, and thereby complain
about them.  The unicode spoofing is provided by libicu, hence the
switch away from libunistring.

Note that potentially confusable names are only worth an informational
warning, since it's entirely possible that with the system typefaces in
use, two names will render distinctly enough that users can tell the
difference.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/unicrash.c |  147 ++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 120 insertions(+), 27 deletions(-)


diff --git a/scrub/unicrash.c b/scrub/unicrash.c
index 3b5b46e..f35b7bd 100644
--- a/scrub/unicrash.c
+++ b/scrub/unicrash.c
@@ -26,38 +26,45 @@
 #include <strings.h>
 #include <unicode/ustring.h>
 #include <unicode/unorm2.h>
+#include <unicode/uspoof.h>
 #include "path.h"
 #include "xfs_scrub.h"
 #include "common.h"
 
 /*
- * Detect collisions of Unicode-normalized names.
+ * Detect Unicode confusable names in directories and attributes.
  *
- * Record all the name->ino mappings in a directory/xattr, with a twist!
- * The twist is that we perform unicode normalization on every name we
- * see, so that we can warn about a directory containing more than one
- * directory entries that normalize to the same Unicode string.  These
- * entries are at best a sign of Unicode mishandling, or some sort of
- * weird name substitution attack if the entries do not point to the
- * same inode.  Warn if we see multiple dirents that do not all point to
- * the same inode.
+ * Record all the name->ino mappings in a directory/xattr, with a twist!  The
+ * twist is to record the Unicode skeleton and normalized version of every
+ * name we see so that we can check for a name space (directory, extended
+ * attribute set) containing names containing malicious characters or that
+ * could be confused for one another.  These entries are at best a sign of
+ * Unicode mishandling, or some sort of weird name substitution attack if the
+ * entries do not point to the same inode.  Warn if we see multiple dirents
+ * that do not all point to the same inode.
  *
  * For extended attributes we perform the same collision checks on the
  * attribute, though any collision is enough to trigger a warning.
  *
- * We flag these collisions as warnings and not errors because XFS
- * treats names as a sequence of arbitrary nonzero bytes.  While a
- * Unicode collision is not technically a filesystem corruption, we
- * ought to say something if there's a possibility for misleading a
- * user.
+ * We avoid flagging these problems as errors because XFS treats names as a
+ * sequence of arbitrary nonzero bytes.  While a Unicode collision is not
+ * technically a filesystem corruption, we ought to say something if there's a
+ * possibility for misleading a user.  Unquestionably bad things (direction
+ * overrides, control characters, names that normalize to the same string)
+ * produce warnings, whereas potentially confusable names produce
+ * informational messages.
  *
- * To normalize, we use Unicode NFKC.  We use the composing
- * normalization mode (e.g. "E WITH ACUTE" instead of "E" then "ACUTE")
- * because that's what W3C (and in general Linux) uses.  This enables us
- * to detect multiple object names that normalize to the same name and
- * could be confusing to users.  Furthermore, we use the compatibility
- * mode to detect names with compatible but different code points to
- * strengthen those checks.
+ * The skeleton algorithm is detailed in section 4 ("Confusable Detection") of
+ * the Unicode technical standard #39.  First we normalize the name, then we
+ * substitute code points according to the confusable code point table, then
+ * normalize again.
+ *
+ * We take the extra step of removing non-identifier code points such as
+ * formatting characters, control characters, zero width characters, etc.
+ * from the skeleton so that we can complain about names that are confusable
+ * due to invisible control characters.
+ *
+ * In other words, skel = remove_invisible(nfd(remap_confusables(nfd(name)))).
  */
 
 struct name_entry {
@@ -67,6 +74,10 @@ struct name_entry {
 	UChar			*normstr;
 	size_t			normstrlen;
 
+	/* Unicode skeletonized name */
+	UChar			*skelstr;
+	size_t			skelstrlen;
+
 	xfs_ino_t		ino;
 
 	/* Raw UTF8 name */
@@ -78,6 +89,7 @@ struct name_entry {
 
 struct unicrash {
 	struct scrub_ctx	*ctx;
+	USpoofChecker		*spoof;
 	const UNormalizer2	*normalizer;
 	bool			compare_ino;
 	size_t			nr_buckets;
@@ -106,6 +118,9 @@ struct unicrash {
 /* Invisible characters.  Only a problem if we have collisions. */
 #define UNICRASH_ZERO_WIDTH	(1 << 4)
 
+/* Multiple names resolve to the same skeleton string. */
+#define UNICRASH_CONFUSABLE	(1 << 5)
+
 /*
  * We only care about validating utf8 collisions if the underlying
  * system configuration says we're using utf8.  If the language
@@ -141,7 +156,7 @@ is_utf8_locale(void)
 }
 
 /*
- * Generate normalized form of the name.
+ * Generate normalized form and skeleton of the name.
  * If this fails, just forget everything; this is an advisory checker.
  */
 static bool
@@ -151,8 +166,13 @@ name_entry_compute_checknames(
 {
 	UChar			*normstr;
 	UChar			*unistr;
+	UChar			*skelstr;
 	int32_t			normstrlen;
 	int32_t			unistrlen;
+	int32_t			skelstrlen;
+	UChar32			uchr;
+	int32_t			i, j;
+
 	UErrorCode		uerr = U_ZERO_ERROR;
 
 	/* Convert bytestr to unistr for normalization */
@@ -182,11 +202,40 @@ name_entry_compute_checknames(
 	if (U_FAILURE(uerr))
 		goto out_normstr;
 
+	/* Compute skeleton. */
+	skelstrlen = uspoof_getSkeleton(uc->spoof, 0, unistr, unistrlen, NULL,
+			0, &uerr);
+	if (uerr != U_BUFFER_OVERFLOW_ERROR)
+		goto out_normstr;
+	uerr = U_ZERO_ERROR;
+	skelstr = calloc(skelstrlen + 1, sizeof(UChar));
+	if (!skelstr)
+		goto out_normstr;
+	uspoof_getSkeleton(uc->spoof, 0, unistr, unistrlen, skelstr, skelstrlen,
+			&uerr);
+	if (U_FAILURE(uerr))
+		goto out_skelstr;
+
+	/* Remove control/formatting characters from skeleton. */
+	for (i = 0, j = 0; i < skelstrlen; j = i) {
+		U16_NEXT_UNSAFE(skelstr, i, uchr);
+		if (!u_isIDIgnorable(uchr))
+			continue;
+		memmove(&skelstr[j], &skelstr[i],
+				(skelstrlen - i + 1) * sizeof(UChar));
+		skelstrlen -= (i - j);
+		i = j;
+	}
+
+	entry->skelstr = skelstr;
+	entry->skelstrlen = skelstrlen;
 	entry->normstr = normstr;
 	entry->normstrlen = normstrlen;
 	free(unistr);
 	return true;
 
+out_skelstr:
+	free(skelstr);
 out_normstr:
 	free(normstr);
 out_unistr:
@@ -215,7 +264,7 @@ name_entry_create(
 	new_entry->name[namelen] = 0;
 	new_entry->namelen = namelen;
 
-	/* Normalize name to find collisions. */
+	/* Normalize/skeletonize name to find collisions. */
 	if (!name_entry_compute_checknames(uc, new_entry))
 		goto out;
 
@@ -233,6 +282,7 @@ name_entry_free(
 	struct name_entry	*entry)
 {
 	free(entry->normstr);
+	free(entry->skelstr);
 	free(entry);
 }
 
@@ -253,8 +303,8 @@ name_entry_hash(
 	size_t			namelen;
 	xfs_dahash_t		hash;
 
-	name = (uint8_t *)entry->normstr;
-	namelen = entry->normstrlen * sizeof(UChar);
+	name = (uint8_t *)entry->skelstr;
+	namelen = entry->skelstrlen * sizeof(UChar);
 
 	/*
 	 * Do four characters at a time as long as we can.
@@ -369,9 +419,17 @@ unicrash_init(
 	p->normalizer = unorm2_getNFKCInstance(&uerr);
 	if (U_FAILURE(uerr))
 		goto out_free;
+	p->spoof = uspoof_open(&uerr);
+	if (U_FAILURE(uerr))
+		goto out_free;
+	uspoof_setChecks(p->spoof, USPOOF_ALL_CHECKS, &uerr);
+	if (U_FAILURE(uerr))
+		goto out_spoof;
 	*ucp = p;
 
 	return true;
+out_spoof:
+	uspoof_close(uc->spoof);
 out_free:
 	free(p);
 	return false;
@@ -414,6 +472,7 @@ unicrash_free(
 	if (!uc)
 		return;
 
+	uspoof_close(uc->spoof);
 	for (i = 0; i < uc->nr_buckets; i++) {
 		for (ne = uc->buckets[i]; ne != NULL; ne = x) {
 			x = ne->next;
@@ -466,6 +525,19 @@ _("Unicode name \"%s\" in %s renders identically to \"%s\"."),
 	}
 
 	/*
+	 * If a name contains invisible/nonprinting characters and can be
+	 * confused with another name as a result, we should complain.
+	 * "moo<zerowidthspace>cow" and "moocow" are misleading.
+	 */
+	if ((badflags & UNICRASH_ZERO_WIDTH) &&
+	    (badflags & UNICRASH_CONFUSABLE)) {
+		str_warn(uc->ctx, descr,
+_("Unicode name \"%s\" in %s could be confused with '%s' due to invisible characters."),
+				bad1, what, bad2);
+		goto out;
+	}
+
+	/*
 	 * Unfiltered control characters can mess up your terminal and render
 	 * invisibly in filechooser UIs.
 	 */
@@ -489,6 +561,18 @@ _("Unicode name \"%s\" in %s mixes bidirectional characters."),
 		goto out;
 	}
 
+	/*
+	 * We'll note if two names could be confusable with each other, but
+	 * whether or not the user will actually confuse them is dependent
+	 * on the rendering system and the typefaces in use.  Maybe "foo.1"
+	 * and "moo.l" look the same, maybe they do not.
+	 */
+	if (badflags & UNICRASH_CONFUSABLE) {
+		str_info(uc->ctx, descr,
+_("Unicode name \"%s\" in %s could be confused with \"%s\"."),
+				bad1, what, bad2);
+	}
+
 out:
 	free(bad1);
 	free(bad2);
@@ -496,8 +580,8 @@ _("Unicode name \"%s\" in %s mixes bidirectional characters."),
 
 /*
  * Try to add a name -> ino entry to the collision detector.  The name
- * must be normalized according to Unicode NFKC rules to detect names that
- * could be confused with each other.
+ * must be skeletonized according to Unicode TR39 to detect names that
+ * could be visually confused with each other.
  */
 static bool
 unicrash_add(
@@ -526,6 +610,15 @@ unicrash_add(
 			*existing_entry = entry;
 			return true;
 		}
+
+		/* Confusable? */
+		if (new_entry->skelstrlen == entry->skelstrlen &&
+		    !u_strcmp(new_entry->skelstr, entry->skelstr) &&
+		    (uc->compare_ino ? entry->ino != new_entry->ino : true)) {
+			*badflags |= UNICRASH_CONFUSABLE;
+			*existing_entry = entry;
+			return true;
+		}
 		entry = entry->next;
 	}
 


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 09/14] xfs_scrub: don't warn about confusing names if dir/file only writable by root
  2018-03-21  3:39 [PATCH 00/14] xfsprogs: online scrub fixes Darrick J. Wong
                   ` (7 preceding siblings ...)
  2018-03-21  3:40 ` [PATCH 08/14] xfs_scrub: use Unicode skeleton function to find confusing names Darrick J. Wong
@ 2018-03-21  3:40 ` Darrick J. Wong
  2018-03-26 19:59   ` [PATCH v2 " Darrick J. Wong
  2018-03-21  3:40 ` [PATCH 10/14] xfs_scrub: refactor mountpoint finding code to use libfrog path code Darrick J. Wong
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-21  3:40 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

If we are scanning the directory entries or attribute names of a
dir/file and the inode can only be written by root, don't warn about
Unicode confusable names by default because the system administrator
presumably made the system like that.  Also don't warn about really
short confusable names because of the high chance of collisions.  If
the caller really wants all the output, they can run in verbose mode.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/unicrash.c |   36 ++++++++++++++++++++++++++++++++----
 1 file changed, 32 insertions(+), 4 deletions(-)


diff --git a/scrub/unicrash.c b/scrub/unicrash.c
index f35b7bd..f60e07f 100644
--- a/scrub/unicrash.c
+++ b/scrub/unicrash.c
@@ -92,6 +92,7 @@ struct unicrash {
 	USpoofChecker		*spoof;
 	const UNormalizer2	*normalizer;
 	bool			compare_ino;
+	bool			is_only_root_writeable;
 	size_t			nr_buckets;
 	struct name_entry	*buckets[0];
 };
@@ -395,7 +396,8 @@ unicrash_init(
 	struct unicrash		**ucp,
 	struct scrub_ctx	*ctx,
 	bool			compare_ino,
-	size_t			nr_buckets)
+	size_t			nr_buckets,
+	bool			is_only_root_writeable)
 {
 	struct unicrash		*p;
 	UErrorCode		uerr = U_ZERO_ERROR;
@@ -425,16 +427,31 @@ unicrash_init(
 	uspoof_setChecks(p->spoof, USPOOF_ALL_CHECKS, &uerr);
 	if (U_FAILURE(uerr))
 		goto out_spoof;
+	p->is_only_root_writeable = is_only_root_writeable;
 	*ucp = p;
 
 	return true;
 out_spoof:
-	uspoof_close(uc->spoof);
+	uspoof_close(p->spoof);
 out_free:
 	free(p);
 	return false;
 }
 
+/*
+ * Is this inode owned by root and not writable by others?  If so, skip
+ * even the informational messages, because this was put in place by the
+ * administrator.
+ */
+static bool
+is_only_root_writable(
+	struct xfs_bstat	*bstat)
+{
+	if (bstat->bs_uid != 0 || bstat->bs_gid != 0)
+		return false;
+	return !(bstat->bs_mode & S_IWOTH);
+}
+
 /* Initialize the collision detector for a directory. */
 bool
 unicrash_dir_init(
@@ -446,7 +463,8 @@ unicrash_dir_init(
 	 * Assume 64 bytes per dentry, clamp buckets between 16 and 64k.
 	 * Same general idea as dir_hash_init in xfs_repair.
 	 */
-	return unicrash_init(ucp, ctx, true, bstat->bs_size / 64);
+	return unicrash_init(ucp, ctx, true, bstat->bs_size / 64,
+			is_only_root_writable(bstat));
 }
 
 /* Initialize the collision detector for an extended attribute. */
@@ -457,7 +475,8 @@ unicrash_xattr_init(
 	struct xfs_bstat	*bstat)
 {
 	/* Assume 16 attributes per extent for lack of a better idea. */
-	return unicrash_init(ucp, ctx, false, 16 * (1 + bstat->bs_aextents));
+	return unicrash_init(ucp, ctx, false, 16 * (1 + bstat->bs_aextents),
+			is_only_root_writable(bstat));
 }
 
 /* Free the crash detector. */
@@ -549,6 +568,15 @@ _("Unicode name \"%s\" in %s contains control characters."),
 	}
 
 	/*
+	 * Skip the informational messages if the inode owning the name is
+	 * only writeable by root, because those files were put there by the
+	 * sysadmin.  Also skip names less than four letters long because
+	 * there's a much higher chance of collisions with short names.
+	 */
+	if (!verbose && (uc->is_only_root_writeable || entry->namelen < 4))
+		goto out;
+
+	/*
 	 * It's not considered good practice (says Unicode) to mix LTR
 	 * characters with RTL characters.  The mere presence of different
 	 * bidirectional characters isn't enough to trip up software, so don't


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 10/14] xfs_scrub: refactor mountpoint finding code to use libfrog path code
  2018-03-21  3:39 [PATCH 00/14] xfsprogs: online scrub fixes Darrick J. Wong
                   ` (8 preceding siblings ...)
  2018-03-21  3:40 ` [PATCH 09/14] xfs_scrub: don't warn about confusing names if dir/file only writable by root Darrick J. Wong
@ 2018-03-21  3:40 ` Darrick J. Wong
  2018-04-11  1:48   ` Eric Sandeen
  2018-03-21  3:40 ` [PATCH 11/14] xfs_scrub_all: report version Darrick J. Wong
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-21  3:40 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Use the libfrog path finding code to determine if the argument being
passed in is a mountpoint, remove all mention of taking a block device
(we have never supported that) from the documentation, and fix some
potential memory leaks.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 man/man8/xfs_scrub.8 |    2 +
 scrub/common.c       |   69 --------------------------------------------------
 scrub/common.h       |    1 -
 scrub/phase1.c       |   12 ---------
 scrub/xfs_scrub.c    |   19 +++++++-------
 scrub/xfs_scrub.h    |    1 -
 6 files changed, 10 insertions(+), 94 deletions(-)


diff --git a/man/man8/xfs_scrub.8 b/man/man8/xfs_scrub.8
index 77fed92..680ef72 100644
--- a/man/man8/xfs_scrub.8
+++ b/man/man8/xfs_scrub.8
@@ -6,7 +6,7 @@ xfs_scrub \- check the contents of a mounted XFS filesystem
 [
 .B \-abCemnTvx
 ]
-.RI "[" mount-point " | " block-device "]"
+.I mount-point
 .br
 .B xfs_scrub \-V
 .SH DESCRIPTION
diff --git a/scrub/common.c b/scrub/common.c
index 5a37a98..722bf36 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -267,75 +267,6 @@ scrub_nproc_workqueue(
 }
 
 /*
- * Check if the argument is either the device name or mountpoint of a mounted
- * filesystem.
- */
-#define MNTTYPE_XFS	"xfs"
-static bool
-find_mountpoint_check(
-	struct stat		*sb,
-	struct mntent		*t)
-{
-	struct stat		ms;
-
-	if (S_ISDIR(sb->st_mode)) {		/* mount point */
-		if (stat(t->mnt_dir, &ms) < 0)
-			return false;
-		if (sb->st_ino != ms.st_ino)
-			return false;
-		if (sb->st_dev != ms.st_dev)
-			return false;
-		if (strcmp(t->mnt_type, MNTTYPE_XFS) != 0)
-			return NULL;
-	} else {				/* device */
-		if (stat(t->mnt_fsname, &ms) < 0)
-			return false;
-		if (sb->st_rdev != ms.st_rdev)
-			return false;
-		if (strcmp(t->mnt_type, MNTTYPE_XFS) != 0)
-			return NULL;
-		/*
-		 * Make sure the mountpoint given by mtab is accessible
-		 * before using it.
-		 */
-		if (stat(t->mnt_dir, &ms) < 0)
-			return false;
-	}
-
-	return true;
-}
-
-/* Check that our alleged mountpoint is in mtab */
-bool
-find_mountpoint(
-	char			*mtab,
-	struct scrub_ctx	*ctx)
-{
-	struct mntent_cursor	cursor;
-	struct mntent		*t = NULL;
-	bool			found = false;
-
-	if (platform_mntent_open(&cursor, mtab) != 0) {
-		fprintf(stderr, "Error: can't get mntent entries.\n");
-		exit(1);
-	}
-
-	while ((t = platform_mntent_next(&cursor)) != NULL) {
-		/*
-		 * Keep jotting down matching mount details; newer mounts are
-		 * towards the end of the file (hopefully).
-		 */
-		if (find_mountpoint_check(&ctx->mnt_sb, t)) {
-			ctx->mntpoint = strdup(t->mnt_dir);
-			ctx->blkdev = strdup(t->mnt_fsname);
-			found = true;
-		}
-	}
-	platform_mntent_close(&cursor);
-	return found;
-}
-
-/*
  * Sleep for 100ms * however many -b we got past the initial one.
  * This is an (albeit clumsy) way to throttle scrub activity.
  */
diff --git a/scrub/common.h b/scrub/common.h
index 4f1f0cd..bc83971 100644
--- a/scrub/common.h
+++ b/scrub/common.h
@@ -88,7 +88,6 @@ static inline int syncfs(int fd)
 }
 #endif
 
-bool find_mountpoint(char *mtab, struct scrub_ctx *ctx);
 void background_sleep(void);
 char *string_escape(const char *in);
 
diff --git a/scrub/phase1.c b/scrub/phase1.c
index 9926770..c2b9067 100644
--- a/scrub/phase1.c
+++ b/scrub/phase1.c
@@ -83,7 +83,6 @@ bool
 xfs_setup_fs(
 	struct scrub_ctx		*ctx)
 {
-	struct fs_path			*fsp;
 	int				error;
 
 	/*
@@ -178,17 +177,6 @@ _("Kernel metadata repair facility is not available.  Use -n to scrub."));
 		return false;
 	}
 
-	/* Go find the XFS devices if we have a usable fsmap. */
-	fs_table_initialise(0, NULL, 0, NULL);
-	errno = 0;
-	fsp = fs_table_lookup(ctx->mntpoint, FS_MOUNT_POINT);
-	if (!fsp) {
-		str_info(ctx, ctx->mntpoint,
-_("Unable to find XFS information."));
-		return false;
-	}
-	memcpy(&ctx->fsinfo, fsp, sizeof(struct fs_path));
-
 	/* Did we find the log and rt devices, if they're present? */
 	if (ctx->geo.logstart == 0 && ctx->fsinfo.fs_log == NULL) {
 		str_info(ctx, ctx->mntpoint,
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index 2d55283..a5b5cf2 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -170,7 +170,7 @@ bool				is_service;
 static void __attribute__((noreturn))
 usage(void)
 {
-	fprintf(stderr, _("Usage: %s [OPTIONS] mountpoint | device\n"), progname);
+	fprintf(stderr, _("Usage: %s [OPTIONS] mountpoint\n"), progname);
 	fprintf(stderr, "\n");
 	fprintf(stderr, _("Options:\n"));
 	fprintf(stderr, _("  -a count     Stop after this many errors are found.\n"));
@@ -538,8 +538,8 @@ main(
 	struct phase_rusage	all_pi;
 	char			*mtab = NULL;
 	FILE			*progress_fp = NULL;
+	struct fs_path		*fsp;
 	bool			moveon = true;
-	bool			ismnt;
 	int			c;
 	int			fd;
 	int			ret = SCRUB_RET_SUCCESS;
@@ -640,7 +640,7 @@ main(
 	if (optind != argc - 1)
 		usage();
 
-	ctx.mntpoint = strdup(argv[optind]);
+	ctx.mntpoint = argv[optind];
 
 	stdout_isatty = isatty(STDOUT_FILENO);
 	stderr_isatty = isatty(STDERR_FILENO);
@@ -680,14 +680,15 @@ main(
 			mtab = _PATH_MOUNTED;
 	}
 
-	ismnt = find_mountpoint(mtab, &ctx);
-	if (!ismnt) {
-		fprintf(stderr,
-_("%s: Not a XFS mount point or block device.\n"),
-			ctx.mntpoint);
+	fs_table_initialise(0, NULL, 0, NULL);
+	fsp = fs_table_lookup_mount(ctx.mntpoint);
+	if (!fsp) {
+		fprintf(stderr, _("%s: Not a XFS mount point.\n"),
+				ctx.mntpoint);
 		ret |= SCRUB_RET_SYNTAX;
 		goto out;
 	}
+	memcpy(&ctx.fsinfo, fsp, sizeof(struct fs_path));
 
 	/* How many CPUs? */
 	nproc = sysconf(_SC_NPROCESSORS_ONLN);
@@ -740,8 +741,6 @@ _("%s: Not a XFS mount point or block device.\n"),
 	phase_end(&all_pi, 0);
 	if (progress_fp)
 		fclose(progress_fp);
-	free(ctx.blkdev);
-	free(ctx.mntpoint);
 
 	/*
 	 * If we're being run as a service, the return code must fit the LSB
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index aa130a7..c9dbe8e 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -48,7 +48,6 @@ struct scrub_ctx {
 
 	/* Strings we need for presentation */
 	char			*mntpoint;
-	char			*blkdev;
 
 	/* Mountpoint info */
 	struct stat		mnt_sb;


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 11/14] xfs_scrub_all: report version
  2018-03-21  3:39 [PATCH 00/14] xfsprogs: online scrub fixes Darrick J. Wong
                   ` (9 preceding siblings ...)
  2018-03-21  3:40 ` [PATCH 10/14] xfs_scrub: refactor mountpoint finding code to use libfrog path code Darrick J. Wong
@ 2018-03-21  3:40 ` Darrick J. Wong
  2018-04-11  0:28   ` Eric Sandeen
  2018-03-21  3:40 ` [PATCH 12/14] xfs_scrub: disable private /tmp for scrub service Darrick J. Wong
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-21  3:40 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Make xfs_scrub_all -V report its version like the other xfs tools.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 man/man8/xfs_scrub_all.8 |   10 ++++++++++
 scrub/Makefile           |    1 +
 scrub/xfs_scrub_all.in   |   11 +++++++++++
 3 files changed, 22 insertions(+)


diff --git a/man/man8/xfs_scrub_all.8 b/man/man8/xfs_scrub_all.8
index 5e1420b..7454880 100644
--- a/man/man8/xfs_scrub_all.8
+++ b/man/man8/xfs_scrub_all.8
@@ -3,6 +3,9 @@
 xfs_scrub_all \- scrub all mounted XFS filesystems
 .SH SYNOPSIS
 .B xfs_scrub_all
+[
+.B \-hV
+]
 .SH DESCRIPTION
 .B xfs_scrub_all
 attempts to read and check all the metadata on all mounted XFS filesystems.
@@ -13,6 +16,13 @@ in a restricted fashion.
 Mounted filesystems are mapped to physical storage devices so that scrub
 operations can be run in parallel so long as no two scrubbers access
 the same device simultaneously.
+.SH OPTIONS
+.TP
+.B \-h
+Display help.
+.TP
+.B \-V
+Prints the version number and exits.
 .SH EXIT CODE
 The exit code returned by
 .B xfs_scrub_all
diff --git a/scrub/Makefile b/scrub/Makefile
index bcc05a0..482e8a7 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -102,6 +102,7 @@ default: depend $(LTCOMMAND) $(XFS_SCRUB_ALL_PROG) $(OPTIONAL_TARGETS)
 xfs_scrub_all: xfs_scrub_all.in
 	@echo "    [SED]    $@"
 	$(Q)$(SED) -e "s|@sbindir@|$(PKG_ROOT_SBIN_DIR)|g" \
+		   -e "s|@pkg_version@|$(PKG_VERSION)|g" \
 		   -e "s|@scrub_args@|$(XFS_SCRUB_ARGS)|g" < $< > $@
 	$(Q)chmod a+x $@
 
diff --git a/scrub/xfs_scrub_all.in b/scrub/xfs_scrub_all.in
index 80f07d5..aed66a1 100644
--- a/scrub/xfs_scrub_all.in
+++ b/scrub/xfs_scrub_all.in
@@ -26,6 +26,7 @@ import threading
 import time
 import sys
 import os
+import argparse
 
 retcode = 0
 terminate = False
@@ -139,6 +140,16 @@ def main():
 		thr.start()
 	global retcode, terminate
 
+	parser = argparse.ArgumentParser( \
+			description = "Scrub all mounted XFS filesystems.")
+	parser.add_argument("-V", help = "Report version and exit.", \
+			action = "store_true")
+	args = parser.parse_args()
+
+	if args.V:
+		print("xfs_scrub_all version @pkg_version@")
+		sys.exit(0)
+
 	fs = find_mounts()
 
 	# Tail the journal if we ourselves aren't a service...


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 12/14] xfs_scrub: disable private /tmp for scrub service
  2018-03-21  3:39 [PATCH 00/14] xfsprogs: online scrub fixes Darrick J. Wong
                   ` (10 preceding siblings ...)
  2018-03-21  3:40 ` [PATCH 11/14] xfs_scrub_all: report version Darrick J. Wong
@ 2018-03-21  3:40 ` Darrick J. Wong
  2018-04-11  1:45   ` Eric Sandeen
  2018-04-11  1:53   ` [PATCH v2 " Darrick J. Wong
  2018-03-21  3:41 ` [PATCH 13/14] xfs_scrub_all: escape paths being passed to systemd service instances Darrick J. Wong
  2018-03-21  3:41 ` [PATCH 14/14] xfs_scrub_all: use system encoding for lsblk output decoding Darrick J. Wong
  13 siblings, 2 replies; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-21  3:40 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Don't make /tmp private when invoking xfs_scrub as a service, because
/tmp might contain or itself be an xfs filesystem mountpoint.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/xfs_scrub@.service.in |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/scrub/xfs_scrub@.service.in b/scrub/xfs_scrub@.service.in
index c14f813..9e6206a 100644
--- a/scrub/xfs_scrub@.service.in
+++ b/scrub/xfs_scrub@.service.in
@@ -9,7 +9,7 @@ WorkingDirectory=%I
 PrivateNetwork=true
 ProtectSystem=full
 ProtectHome=read-only
-PrivateTmp=yes
+PrivateTmp=no
 AmbientCapabilities=CAP_SYS_ADMIN CAP_FOWNER CAP_DAC_OVERRIDE CAP_DAC_READ_SEARCH CAP_SYS_RAWIO
 NoNewPrivileges=yes
 User=nobody


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 13/14] xfs_scrub_all: escape paths being passed to systemd service instances
  2018-03-21  3:39 [PATCH 00/14] xfsprogs: online scrub fixes Darrick J. Wong
                   ` (11 preceding siblings ...)
  2018-03-21  3:40 ` [PATCH 12/14] xfs_scrub: disable private /tmp for scrub service Darrick J. Wong
@ 2018-03-21  3:41 ` Darrick J. Wong
  2018-04-11  1:31   ` Eric Sandeen
  2018-03-21  3:41 ` [PATCH 14/14] xfs_scrub_all: use system encoding for lsblk output decoding Darrick J. Wong
  13 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-21  3:41 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

systemd doesn't like unit instance names with slashes in them, so it
replaces them with dashes when it invokes the service.  However, it's
not smart enough to convert the dashes to something else, so when it
unescapes the instance name to feed to xfs_scrub, it turns all dashes
into slashes.  "/moo-cow" becomes "-moo-cow" becomes "/moo/cow", which
is wrong.  systemd actually /can/ escape the dashes correctly if it is
told that this is a path (and not a unit name), but it didn't do this
prior to January 2017, so fix this for them.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/xfs_scrub_all.in |   24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)


diff --git a/scrub/xfs_scrub_all.in b/scrub/xfs_scrub_all.in
index aed66a1..83c4e21 100644
--- a/scrub/xfs_scrub_all.in
+++ b/scrub/xfs_scrub_all.in
@@ -87,6 +87,28 @@ def run_killable(cmd, stdout, killfuncs, kill_fn):
 	except:
 		return -1
 
+# systemd doesn't like unit instance names with slashes in them, so it
+# replaces them with dashes when it invokes the service.  However, it's not
+# smart enough to convert the dashes to something else, so when it unescapes
+# the instance name to feed to xfs_scrub, it turns all dashes into slashes.
+# "/moo-cow" becomes "-moo-cow" becomes "/moo/cow", which is wrong.  systemd
+# actually /can/ escape the dashes correctly if it is told that this is a path
+# (and not a unit name), but it didn't do this prior to January 2017, so fix
+# this for them.
+def systemd_escape(path):
+	'''Escape a path to avoid mangled systemd mangling.'''
+
+	if '-' not in path:
+		return path
+	cmd = ['systemd-escape', '--path', path]
+	try:
+		proc = subprocess.Popen(cmd, stdout = subprocess.PIPE)
+		proc.wait()
+		for line in proc.stdout:
+			return '-' + line.decode(sys.stdout.encoding).strip()
+	except:
+		return path
+
 def run_scrub(mnt, cond, running_devs, mntdevs, killfuncs):
 	'''Run a scrub process.'''
 	global retcode, terminate
@@ -99,7 +121,7 @@ def run_scrub(mnt, cond, running_devs, mntdevs, killfuncs):
 			return
 
 		# Try it the systemd way
-		cmd=['systemctl', 'start', 'xfs_scrub@%s' % mnt]
+		cmd=['systemctl', 'start', 'xfs_scrub@%s' % systemd_escape(mnt)]
 		ret = run_killable(cmd, DEVNULL(), killfuncs, \
 				lambda proc: kill_systemd('xfs_scrub@%s' % mnt, proc))
 		if ret == 0 or ret == 1:


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 14/14] xfs_scrub_all: use system encoding for lsblk output decoding
  2018-03-21  3:39 [PATCH 00/14] xfsprogs: online scrub fixes Darrick J. Wong
                   ` (12 preceding siblings ...)
  2018-03-21  3:41 ` [PATCH 13/14] xfs_scrub_all: escape paths being passed to systemd service instances Darrick J. Wong
@ 2018-03-21  3:41 ` Darrick J. Wong
  2018-04-11  1:35   ` Eric Sandeen
  13 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-21  3:41 UTC (permalink / raw)
  To: sandeen, darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Don't hardcode utf-8 as the decoding scheme for lsblk output, since the
system could set it to anything else.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/xfs_scrub_all.in |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/scrub/xfs_scrub_all.in b/scrub/xfs_scrub_all.in
index 83c4e21..d374f92 100644
--- a/scrub/xfs_scrub_all.in
+++ b/scrub/xfs_scrub_all.in
@@ -48,7 +48,7 @@ def find_mounts():
 	result.wait()
 	if result.returncode != 0:
 		return fs
-	sarray = [x.decode('utf-8') for x in result.stdout.readlines()]
+	sarray = [x.decode(sys.stdout.encoding) for x in result.stdout.readlines()]
 	output = ' '.join(sarray)
 	bdevdata = json.loads(output)
 	# The lsblk output had better be in disks-then-partitions order


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 08/14] xfs_scrub: use Unicode skeleton function to find confusing names
  2018-03-21  3:40 ` [PATCH 08/14] xfs_scrub: use Unicode skeleton function to find confusing names Darrick J. Wong
@ 2018-03-26 19:58   ` Darrick J. Wong
  0 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-26 19:58 UTC (permalink / raw)
  To: sandeen; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Drop the weak normalization-based Unicode name collision detection in
favor of the confusable name guidelines provided in Unicode TR36 & TR39.
This means that we transform the original name into its Unicode skeleton
in order to do hashing-based collision detection.

The Unicode skeleton is defined as nfd(translation(nfd(string))), which
is to say that it flattens sequences that render ambiguously into a
unambiguous format.  For example, 'l' and '1' can render identically in
some typefaces, so they're both squashed to 'l'.  From the skeletons we
can figure out if two names will look the same, and thereby complain
about them.  The unicode spoofing is provided by libicu, hence the
switch away from libunistring.

Note that potentially confusable names are only worth an informational
warning, since it's entirely possible that with the system typefaces in
use, two names will render distinctly enough that users can tell the
difference.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v2: fix resource cleanup breakage in unicrash_init
---
 scrub/unicrash.c |  147 ++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 120 insertions(+), 27 deletions(-)

diff --git a/scrub/unicrash.c b/scrub/unicrash.c
index 3b5b46e..8b58269 100644
--- a/scrub/unicrash.c
+++ b/scrub/unicrash.c
@@ -26,38 +26,45 @@
 #include <strings.h>
 #include <unicode/ustring.h>
 #include <unicode/unorm2.h>
+#include <unicode/uspoof.h>
 #include "path.h"
 #include "xfs_scrub.h"
 #include "common.h"
 
 /*
- * Detect collisions of Unicode-normalized names.
+ * Detect Unicode confusable names in directories and attributes.
  *
- * Record all the name->ino mappings in a directory/xattr, with a twist!
- * The twist is that we perform unicode normalization on every name we
- * see, so that we can warn about a directory containing more than one
- * directory entries that normalize to the same Unicode string.  These
- * entries are at best a sign of Unicode mishandling, or some sort of
- * weird name substitution attack if the entries do not point to the
- * same inode.  Warn if we see multiple dirents that do not all point to
- * the same inode.
+ * Record all the name->ino mappings in a directory/xattr, with a twist!  The
+ * twist is to record the Unicode skeleton and normalized version of every
+ * name we see so that we can check for a name space (directory, extended
+ * attribute set) containing names containing malicious characters or that
+ * could be confused for one another.  These entries are at best a sign of
+ * Unicode mishandling, or some sort of weird name substitution attack if the
+ * entries do not point to the same inode.  Warn if we see multiple dirents
+ * that do not all point to the same inode.
  *
  * For extended attributes we perform the same collision checks on the
  * attribute, though any collision is enough to trigger a warning.
  *
- * We flag these collisions as warnings and not errors because XFS
- * treats names as a sequence of arbitrary nonzero bytes.  While a
- * Unicode collision is not technically a filesystem corruption, we
- * ought to say something if there's a possibility for misleading a
- * user.
+ * We avoid flagging these problems as errors because XFS treats names as a
+ * sequence of arbitrary nonzero bytes.  While a Unicode collision is not
+ * technically a filesystem corruption, we ought to say something if there's a
+ * possibility for misleading a user.  Unquestionably bad things (direction
+ * overrides, control characters, names that normalize to the same string)
+ * produce warnings, whereas potentially confusable names produce
+ * informational messages.
  *
- * To normalize, we use Unicode NFKC.  We use the composing
- * normalization mode (e.g. "E WITH ACUTE" instead of "E" then "ACUTE")
- * because that's what W3C (and in general Linux) uses.  This enables us
- * to detect multiple object names that normalize to the same name and
- * could be confusing to users.  Furthermore, we use the compatibility
- * mode to detect names with compatible but different code points to
- * strengthen those checks.
+ * The skeleton algorithm is detailed in section 4 ("Confusable Detection") of
+ * the Unicode technical standard #39.  First we normalize the name, then we
+ * substitute code points according to the confusable code point table, then
+ * normalize again.
+ *
+ * We take the extra step of removing non-identifier code points such as
+ * formatting characters, control characters, zero width characters, etc.
+ * from the skeleton so that we can complain about names that are confusable
+ * due to invisible control characters.
+ *
+ * In other words, skel = remove_invisible(nfd(remap_confusables(nfd(name)))).
  */
 
 struct name_entry {
@@ -67,6 +74,10 @@ struct name_entry {
 	UChar			*normstr;
 	size_t			normstrlen;
 
+	/* Unicode skeletonized name */
+	UChar			*skelstr;
+	size_t			skelstrlen;
+
 	xfs_ino_t		ino;
 
 	/* Raw UTF8 name */
@@ -78,6 +89,7 @@ struct name_entry {
 
 struct unicrash {
 	struct scrub_ctx	*ctx;
+	USpoofChecker		*spoof;
 	const UNormalizer2	*normalizer;
 	bool			compare_ino;
 	size_t			nr_buckets;
@@ -106,6 +118,9 @@ struct unicrash {
 /* Invisible characters.  Only a problem if we have collisions. */
 #define UNICRASH_ZERO_WIDTH	(1 << 4)
 
+/* Multiple names resolve to the same skeleton string. */
+#define UNICRASH_CONFUSABLE	(1 << 5)
+
 /*
  * We only care about validating utf8 collisions if the underlying
  * system configuration says we're using utf8.  If the language
@@ -141,7 +156,7 @@ is_utf8_locale(void)
 }
 
 /*
- * Generate normalized form of the name.
+ * Generate normalized form and skeleton of the name.
  * If this fails, just forget everything; this is an advisory checker.
  */
 static bool
@@ -151,8 +166,13 @@ name_entry_compute_checknames(
 {
 	UChar			*normstr;
 	UChar			*unistr;
+	UChar			*skelstr;
 	int32_t			normstrlen;
 	int32_t			unistrlen;
+	int32_t			skelstrlen;
+	UChar32			uchr;
+	int32_t			i, j;
+
 	UErrorCode		uerr = U_ZERO_ERROR;
 
 	/* Convert bytestr to unistr for normalization */
@@ -182,11 +202,40 @@ name_entry_compute_checknames(
 	if (U_FAILURE(uerr))
 		goto out_normstr;
 
+	/* Compute skeleton. */
+	skelstrlen = uspoof_getSkeleton(uc->spoof, 0, unistr, unistrlen, NULL,
+			0, &uerr);
+	if (uerr != U_BUFFER_OVERFLOW_ERROR)
+		goto out_normstr;
+	uerr = U_ZERO_ERROR;
+	skelstr = calloc(skelstrlen + 1, sizeof(UChar));
+	if (!skelstr)
+		goto out_normstr;
+	uspoof_getSkeleton(uc->spoof, 0, unistr, unistrlen, skelstr, skelstrlen,
+			&uerr);
+	if (U_FAILURE(uerr))
+		goto out_skelstr;
+
+	/* Remove control/formatting characters from skeleton. */
+	for (i = 0, j = 0; i < skelstrlen; j = i) {
+		U16_NEXT_UNSAFE(skelstr, i, uchr);
+		if (!u_isIDIgnorable(uchr))
+			continue;
+		memmove(&skelstr[j], &skelstr[i],
+				(skelstrlen - i + 1) * sizeof(UChar));
+		skelstrlen -= (i - j);
+		i = j;
+	}
+
+	entry->skelstr = skelstr;
+	entry->skelstrlen = skelstrlen;
 	entry->normstr = normstr;
 	entry->normstrlen = normstrlen;
 	free(unistr);
 	return true;
 
+out_skelstr:
+	free(skelstr);
 out_normstr:
 	free(normstr);
 out_unistr:
@@ -215,7 +264,7 @@ name_entry_create(
 	new_entry->name[namelen] = 0;
 	new_entry->namelen = namelen;
 
-	/* Normalize name to find collisions. */
+	/* Normalize/skeletonize name to find collisions. */
 	if (!name_entry_compute_checknames(uc, new_entry))
 		goto out;
 
@@ -233,6 +282,7 @@ name_entry_free(
 	struct name_entry	*entry)
 {
 	free(entry->normstr);
+	free(entry->skelstr);
 	free(entry);
 }
 
@@ -253,8 +303,8 @@ name_entry_hash(
 	size_t			namelen;
 	xfs_dahash_t		hash;
 
-	name = (uint8_t *)entry->normstr;
-	namelen = entry->normstrlen * sizeof(UChar);
+	name = (uint8_t *)entry->skelstr;
+	namelen = entry->skelstrlen * sizeof(UChar);
 
 	/*
 	 * Do four characters at a time as long as we can.
@@ -369,9 +419,17 @@ unicrash_init(
 	p->normalizer = unorm2_getNFKCInstance(&uerr);
 	if (U_FAILURE(uerr))
 		goto out_free;
+	p->spoof = uspoof_open(&uerr);
+	if (U_FAILURE(uerr))
+		goto out_free;
+	uspoof_setChecks(p->spoof, USPOOF_ALL_CHECKS, &uerr);
+	if (U_FAILURE(uerr))
+		goto out_spoof;
 	*ucp = p;
 
 	return true;
+out_spoof:
+	uspoof_close(p->spoof);
 out_free:
 	free(p);
 	return false;
@@ -414,6 +472,7 @@ unicrash_free(
 	if (!uc)
 		return;
 
+	uspoof_close(uc->spoof);
 	for (i = 0; i < uc->nr_buckets; i++) {
 		for (ne = uc->buckets[i]; ne != NULL; ne = x) {
 			x = ne->next;
@@ -466,6 +525,19 @@ _("Unicode name \"%s\" in %s renders identically to \"%s\"."),
 	}
 
 	/*
+	 * If a name contains invisible/nonprinting characters and can be
+	 * confused with another name as a result, we should complain.
+	 * "moo<zerowidthspace>cow" and "moocow" are misleading.
+	 */
+	if ((badflags & UNICRASH_ZERO_WIDTH) &&
+	    (badflags & UNICRASH_CONFUSABLE)) {
+		str_warn(uc->ctx, descr,
+_("Unicode name \"%s\" in %s could be confused with '%s' due to invisible characters."),
+				bad1, what, bad2);
+		goto out;
+	}
+
+	/*
 	 * Unfiltered control characters can mess up your terminal and render
 	 * invisibly in filechooser UIs.
 	 */
@@ -489,6 +561,18 @@ _("Unicode name \"%s\" in %s mixes bidirectional characters."),
 		goto out;
 	}
 
+	/*
+	 * We'll note if two names could be confusable with each other, but
+	 * whether or not the user will actually confuse them is dependent
+	 * on the rendering system and the typefaces in use.  Maybe "foo.1"
+	 * and "moo.l" look the same, maybe they do not.
+	 */
+	if (badflags & UNICRASH_CONFUSABLE) {
+		str_info(uc->ctx, descr,
+_("Unicode name \"%s\" in %s could be confused with \"%s\"."),
+				bad1, what, bad2);
+	}
+
 out:
 	free(bad1);
 	free(bad2);
@@ -496,8 +580,8 @@ _("Unicode name \"%s\" in %s mixes bidirectional characters."),
 
 /*
  * Try to add a name -> ino entry to the collision detector.  The name
- * must be normalized according to Unicode NFKC rules to detect names that
- * could be confused with each other.
+ * must be skeletonized according to Unicode TR39 to detect names that
+ * could be visually confused with each other.
  */
 static bool
 unicrash_add(
@@ -526,6 +610,15 @@ unicrash_add(
 			*existing_entry = entry;
 			return true;
 		}
+
+		/* Confusable? */
+		if (new_entry->skelstrlen == entry->skelstrlen &&
+		    !u_strcmp(new_entry->skelstr, entry->skelstr) &&
+		    (uc->compare_ino ? entry->ino != new_entry->ino : true)) {
+			*badflags |= UNICRASH_CONFUSABLE;
+			*existing_entry = entry;
+			return true;
+		}
 		entry = entry->next;
 	}
 

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 09/14] xfs_scrub: don't warn about confusing names if dir/file only writable by root
  2018-03-21  3:40 ` [PATCH 09/14] xfs_scrub: don't warn about confusing names if dir/file only writable by root Darrick J. Wong
@ 2018-03-26 19:59   ` Darrick J. Wong
  0 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2018-03-26 19:59 UTC (permalink / raw)
  To: sandeen; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

If we are scanning the directory entries or attribute names of a
dir/file and the inode can only be written by root, don't warn about
Unicode confusable names by default because the system administrator
presumably made the system like that.  Also don't warn about really
short confusable names because of the high chance of collisions.  If
the caller really wants all the output, they can run in verbose mode.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v2: fix resource cleanup breakage in unicrash_init
---
 scrub/unicrash.c |   34 +++++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/scrub/unicrash.c b/scrub/unicrash.c
index 8b58269..f60e07f 100644
--- a/scrub/unicrash.c
+++ b/scrub/unicrash.c
@@ -92,6 +92,7 @@ struct unicrash {
 	USpoofChecker		*spoof;
 	const UNormalizer2	*normalizer;
 	bool			compare_ino;
+	bool			is_only_root_writeable;
 	size_t			nr_buckets;
 	struct name_entry	*buckets[0];
 };
@@ -395,7 +396,8 @@ unicrash_init(
 	struct unicrash		**ucp,
 	struct scrub_ctx	*ctx,
 	bool			compare_ino,
-	size_t			nr_buckets)
+	size_t			nr_buckets,
+	bool			is_only_root_writeable)
 {
 	struct unicrash		*p;
 	UErrorCode		uerr = U_ZERO_ERROR;
@@ -425,6 +427,7 @@ unicrash_init(
 	uspoof_setChecks(p->spoof, USPOOF_ALL_CHECKS, &uerr);
 	if (U_FAILURE(uerr))
 		goto out_spoof;
+	p->is_only_root_writeable = is_only_root_writeable;
 	*ucp = p;
 
 	return true;
@@ -435,6 +438,20 @@ unicrash_init(
 	return false;
 }
 
+/*
+ * Is this inode owned by root and not writable by others?  If so, skip
+ * even the informational messages, because this was put in place by the
+ * administrator.
+ */
+static bool
+is_only_root_writable(
+	struct xfs_bstat	*bstat)
+{
+	if (bstat->bs_uid != 0 || bstat->bs_gid != 0)
+		return false;
+	return !(bstat->bs_mode & S_IWOTH);
+}
+
 /* Initialize the collision detector for a directory. */
 bool
 unicrash_dir_init(
@@ -446,7 +463,8 @@ unicrash_dir_init(
 	 * Assume 64 bytes per dentry, clamp buckets between 16 and 64k.
 	 * Same general idea as dir_hash_init in xfs_repair.
 	 */
-	return unicrash_init(ucp, ctx, true, bstat->bs_size / 64);
+	return unicrash_init(ucp, ctx, true, bstat->bs_size / 64,
+			is_only_root_writable(bstat));
 }
 
 /* Initialize the collision detector for an extended attribute. */
@@ -457,7 +475,8 @@ unicrash_xattr_init(
 	struct xfs_bstat	*bstat)
 {
 	/* Assume 16 attributes per extent for lack of a better idea. */
-	return unicrash_init(ucp, ctx, false, 16 * (1 + bstat->bs_aextents));
+	return unicrash_init(ucp, ctx, false, 16 * (1 + bstat->bs_aextents),
+			is_only_root_writable(bstat));
 }
 
 /* Free the crash detector. */
@@ -549,6 +568,15 @@ _("Unicode name \"%s\" in %s contains control characters."),
 	}
 
 	/*
+	 * Skip the informational messages if the inode owning the name is
+	 * only writeable by root, because those files were put there by the
+	 * sysadmin.  Also skip names less than four letters long because
+	 * there's a much higher chance of collisions with short names.
+	 */
+	if (!verbose && (uc->is_only_root_writeable || entry->namelen < 4))
+		goto out;
+
+	/*
 	 * It's not considered good practice (says Unicode) to mix LTR
 	 * characters with RTL characters.  The mere presence of different
 	 * bidirectional characters isn't enough to trip up software, so don't

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/14] xfs_scrub: avoid buffer overflow when scanning attributes
  2018-03-21  3:39 ` [PATCH 01/14] xfs_scrub: avoid buffer overflow when scanning attributes Darrick J. Wong
@ 2018-04-03 17:30   ` Eric Sandeen
  2018-04-05  3:57     ` Darrick J. Wong
  2018-04-11  0:20     ` Darrick J. Wong
  2018-04-11  0:27   ` [PATCH v2 " Darrick J. Wong
  1 sibling, 2 replies; 31+ messages in thread
From: Eric Sandeen @ 2018-04-03 17:30 UTC (permalink / raw)
  To: Darrick J. Wong, sandeen; +Cc: linux-xfs

On 3/20/18 10:39 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Avoid a buffer overflow when we're formatting extended attribute names
> for name checking.

This won't /actually/ overflow, right, because you are doing
snprintf(NAME_MAX) into a buffer of size [NAME_MAX + 1].

However, it might truncate the attribute string if too long.
(just trying to avoid security folk wig-outs).

> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  scrub/phase5.c |    8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> 
> diff --git a/scrub/phase5.c b/scrub/phase5.c
> index 8e0a1be..36821d0 100644
> --- a/scrub/phase5.c
> +++ b/scrub/phase5.c
> @@ -143,6 +143,8 @@ static const struct xfs_attr_ns attr_ns[] = {
>  	{ATTR_SECURE,		"secure"},
>  	{0, NULL},
>  };
> +/* Enough space to handle the prefix. */
> +#define ATTR_NAME_MAX		(NAME_MAX + 8)

Unrelated to this change, really, but should NAME_MAX be
XATTR_NAME_MAX for clarity & consistency w/ the xattr code?

(it's defined in /usr/include/linux/limits.h)

>  
>  /*
>   * Check all the xattr names in a particular namespace of a file handle
> @@ -158,7 +160,7 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
>  {
>  	struct attrlist_cursor		cur;
>  	char				attrbuf[XFS_XATTR_LIST_MAX];
> -	char				keybuf[NAME_MAX + 1];
> +	char				keybuf[ATTR_NAME_MAX + 1];
>  	struct attrlist			*attrlist = (struct attrlist *)attrbuf;
>  	struct attrlist_ent		*ent;
>  	struct unicrash			*uc;
> @@ -172,14 +174,14 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
>  
>  	memset(attrbuf, 0, XFS_XATTR_LIST_MAX);
>  	memset(&cur, 0, sizeof(cur));
> -	memset(keybuf, 0, NAME_MAX + 1);
> +	memset(keybuf, 0, ATTR_NAME_MAX + 1);
>  	error = attr_list_by_handle(handle, sizeof(*handle), attrbuf,
>  			XFS_XATTR_LIST_MAX, attr_ns->flags, &cur);
>  	while (!error) {
>  		/* Examine the xattrs. */
>  		for (i = 0; i < attrlist->al_count; i++) {
>  			ent = ATTR_ENTRY(attrlist, i);
> -			snprintf(keybuf, NAME_MAX, "%s.%s", attr_ns->name,

To future proof this rather than relying on a hardcoded 8 based on the
current namespaces above, how about:

#define XATTR_NS_MAX 8
#define ATTR_STRING_MAX		(XATTR_NS_MAX + XATTR_NAME_MAX + 1) /* for '.' */

char				keybuf[ATTR_STRING_MAX + 1];

			ASSERT(strlen(attr_ns->name) <= XATTR_NS_MAX);
			snprintf(keybuf, ATTR_STRING_MAX, "%s.%s", attr_ns->name,
					ent->a_name);

just in case someone adds a new 

	ATTR_SUPERDELUXE, "superdeluxe"

namespace some day?  Is that overkill?

> +			snprintf(keybuf, ATTR_NAME_MAX, "%s.%s", attr_ns->name,
>  					ent->a_name);
>  			moveon = xfs_scrub_check_name(ctx, descr,
>  					_("extended attribute"), keybuf);
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 02/14] xfs_scrub: only run ascii name checks if unicode name checker
  2018-03-21  3:39 ` [PATCH 02/14] xfs_scrub: only run ascii name checks if unicode name checker Darrick J. Wong
@ 2018-04-03 17:49   ` Eric Sandeen
  0 siblings, 0 replies; 31+ messages in thread
From: Eric Sandeen @ 2018-04-03 17:49 UTC (permalink / raw)
  To: Darrick J. Wong, sandeen; +Cc: linux-xfs

On 3/20/18 10:39 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Skip the ASCII name checks if the Unicode name checker is going to run,
> since the latter covers everything that the former does.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Seems fine.  Since you already have some layers of wrappers it might
be nice to just call i.e. xfs_scrub_check_xattr_name(uc, ctx, descr, keybuf)
and let it sort out the uc / not-uc cases, but unless you're super
excited about that idea, this is fine.

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

>  scrub/phase5.c |   24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)
> 
> 
> diff --git a/scrub/phase5.c b/scrub/phase5.c
> index 36821d0..decda02 100644
> --- a/scrub/phase5.c
> +++ b/scrub/phase5.c
> @@ -113,11 +113,11 @@ xfs_scrub_scan_dirents(
>  
>  	dentry = readdir(dir);
>  	while (dentry) {
> -		moveon = xfs_scrub_check_name(ctx, descr, _("directory"),
> -				dentry->d_name);
> -		if (!moveon)
> -			break;
> -		moveon = unicrash_check_dir_name(uc, descr, dentry);
> +		if (uc)
> +			moveon = unicrash_check_dir_name(uc, descr, dentry);
> +		else
> +			moveon = xfs_scrub_check_name(ctx, descr,
> +					_("directory"), dentry->d_name);
>  		if (!moveon)
>  			break;
>  		dentry = readdir(dir);
> @@ -163,7 +163,7 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
>  	char				keybuf[ATTR_NAME_MAX + 1];
>  	struct attrlist			*attrlist = (struct attrlist *)attrbuf;
>  	struct attrlist_ent		*ent;
> -	struct unicrash			*uc;
> +	struct unicrash			*uc = NULL;
>  	bool				moveon = true;
>  	int				i;
>  	int				error;
> @@ -183,11 +183,13 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
>  			ent = ATTR_ENTRY(attrlist, i);
>  			snprintf(keybuf, ATTR_NAME_MAX, "%s.%s", attr_ns->name,
>  					ent->a_name);
> -			moveon = xfs_scrub_check_name(ctx, descr,
> -					_("extended attribute"), keybuf);
> -			if (!moveon)
> -				goto out;
> -			moveon = unicrash_check_xattr_name(uc, descr, keybuf);
> +			if (uc)
> +				moveon = unicrash_check_xattr_name(uc, descr,
> +						keybuf);
> +			else
> +				moveon = xfs_scrub_check_name(ctx, descr,
> +						_("extended attribute"),
> +						keybuf);
>  			if (!moveon)
>  				goto out;
>  		}
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/14] xfs_scrub: avoid buffer overflow when scanning attributes
  2018-04-03 17:30   ` Eric Sandeen
@ 2018-04-05  3:57     ` Darrick J. Wong
  2018-04-11  0:20     ` Darrick J. Wong
  1 sibling, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2018-04-05  3:57 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: sandeen, linux-xfs

On Tue, Apr 03, 2018 at 12:30:57PM -0500, Eric Sandeen wrote:
> On 3/20/18 10:39 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Avoid a buffer overflow when we're formatting extended attribute names
> > for name checking.
> 
> This won't /actually/ overflow, right, because you are doing
> snprintf(NAME_MAX) into a buffer of size [NAME_MAX + 1].
> 
> However, it might truncate the attribute string if too long.
> (just trying to avoid security folk wig-outs).

Right.

> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  scrub/phase5.c |    8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > 
> > diff --git a/scrub/phase5.c b/scrub/phase5.c
> > index 8e0a1be..36821d0 100644
> > --- a/scrub/phase5.c
> > +++ b/scrub/phase5.c
> > @@ -143,6 +143,8 @@ static const struct xfs_attr_ns attr_ns[] = {
> >  	{ATTR_SECURE,		"secure"},
> >  	{0, NULL},
> >  };
> > +/* Enough space to handle the prefix. */
> > +#define ATTR_NAME_MAX		(NAME_MAX + 8)
> 
> Unrelated to this change, really, but should NAME_MAX be
> XATTR_NAME_MAX for clarity & consistency w/ the xattr code?
> 
> (it's defined in /usr/include/linux/limits.h)

Sure, but it's in a public header file, we're stuck with the name
forever.

> 
> >  
> >  /*
> >   * Check all the xattr names in a particular namespace of a file handle
> > @@ -158,7 +160,7 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
> >  {
> >  	struct attrlist_cursor		cur;
> >  	char				attrbuf[XFS_XATTR_LIST_MAX];
> > -	char				keybuf[NAME_MAX + 1];
> > +	char				keybuf[ATTR_NAME_MAX + 1];
> >  	struct attrlist			*attrlist = (struct attrlist *)attrbuf;
> >  	struct attrlist_ent		*ent;
> >  	struct unicrash			*uc;
> > @@ -172,14 +174,14 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
> >  
> >  	memset(attrbuf, 0, XFS_XATTR_LIST_MAX);
> >  	memset(&cur, 0, sizeof(cur));
> > -	memset(keybuf, 0, NAME_MAX + 1);
> > +	memset(keybuf, 0, ATTR_NAME_MAX + 1);
> >  	error = attr_list_by_handle(handle, sizeof(*handle), attrbuf,
> >  			XFS_XATTR_LIST_MAX, attr_ns->flags, &cur);
> >  	while (!error) {
> >  		/* Examine the xattrs. */
> >  		for (i = 0; i < attrlist->al_count; i++) {
> >  			ent = ATTR_ENTRY(attrlist, i);
> > -			snprintf(keybuf, NAME_MAX, "%s.%s", attr_ns->name,
> 
> To future proof this rather than relying on a hardcoded 8 based on the
> current namespaces above, how about:
> 
> #define XATTR_NS_MAX 8
> #define ATTR_STRING_MAX		(XATTR_NS_MAX + XATTR_NAME_MAX + 1) /* for '.' */
> 
> char				keybuf[ATTR_STRING_MAX + 1];
> 
> 			ASSERT(strlen(attr_ns->name) <= XATTR_NS_MAX);
> 			snprintf(keybuf, ATTR_STRING_MAX, "%s.%s", attr_ns->name,
> 					ent->a_name);
> 
> just in case someone adds a new 
> 
> 	ATTR_SUPERDELUXE, "superdeluxe"
> 
> namespace some day?  Is that overkill?

Probably not.  Will restructure this one and await the
"robofuzztestcallerhahahahahha." namespace. :)

--D

> > +			snprintf(keybuf, ATTR_NAME_MAX, "%s.%s", attr_ns->name,
> >  					ent->a_name);
> >  			moveon = xfs_scrub_check_name(ctx, descr,
> >  					_("extended attribute"), keybuf);
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 03/14] xfs_scrub: don't complain about different normalization
  2018-03-21  3:39 ` [PATCH 03/14] xfs_scrub: don't complain about different normalization Darrick J. Wong
@ 2018-04-10 23:37   ` Eric Sandeen
  0 siblings, 0 replies; 31+ messages in thread
From: Eric Sandeen @ 2018-04-10 23:37 UTC (permalink / raw)
  To: Darrick J. Wong, sandeen; +Cc: linux-xfs



On 3/20/18 10:39 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Since there are different ways to normalize utf8 names, don't complain
> when we find a name that is normalized in a different way than the NFKC
> that we use to find duplicate names.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

insofar as I understand any of this,

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

> ---
>  scrub/unicrash.c |   13 +++----------
>  1 file changed, 3 insertions(+), 10 deletions(-)
> 
> 
> diff --git a/scrub/unicrash.c b/scrub/unicrash.c
> index 0b5d1fa..10d7c14 100644
> --- a/scrub/unicrash.c
> +++ b/scrub/unicrash.c
> @@ -256,7 +256,6 @@ unicrash_complain(
>  	struct unicrash		*uc,
>  	const char		*descr,
>  	const char		*what,
> -	bool			normal,
>  	bool			unique,
>  	const char		*name,
>  	uint8_t			*uniname)
> @@ -267,10 +266,6 @@ unicrash_complain(
>  	bad1 = string_escape(name);
>  	bad2 = string_escape((char *)uniname);
>  
> -	if (!normal && should_warn_about_name(uc->ctx))
> -		str_info(uc->ctx, descr,
> -_("Unicode name \"%s\" in %s should be normalized as \"%s\"."),
> -				bad1, what, bad2);
>  	if (!unique)
>  		str_warn(uc->ctx, descr,
>  _("Duplicate normalized Unicode name \"%s\" found in %s."),
> @@ -342,20 +337,18 @@ __unicrash_check_name(
>  {
>  	uint8_t			uniname[(NAME_MAX * 2) + 1];
>  	bool			moveon;
> -	bool			normal;
>  	bool			unique;
>  
>  	memset(uniname, 0, (NAME_MAX * 2) + 1);
> -	normal = unicrash_normalize(name, uniname, NAME_MAX * 2);
> +	unicrash_normalize(name, uniname, NAME_MAX * 2);
>  	moveon = unicrash_add(uc, uniname, ino, &unique);
>  	if (!moveon)
>  		return false;
>  
> -	if (normal && unique)
> +	if (unique)
>  		return true;
>  
> -	unicrash_complain(uc, descr, namedescr, normal, unique, name,
> -			uniname);
> +	unicrash_complain(uc, descr, namedescr, unique, name, uniname);
>  	return true;
>  }
>  
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 04/14] xfs_scrub: communicate name problems via flagset instead of booleans
  2018-03-21  3:40 ` [PATCH 04/14] xfs_scrub: communicate name problems via flagset instead of booleans Darrick J. Wong
@ 2018-04-10 23:46   ` Eric Sandeen
  0 siblings, 0 replies; 31+ messages in thread
From: Eric Sandeen @ 2018-04-10 23:46 UTC (permalink / raw)
  To: Darrick J. Wong, sandeen; +Cc: linux-xfs



On 3/20/18 10:40 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Use an unsigned int to pass around name error flags instead of booleans.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

> ---
>  scrub/unicrash.c |   43 ++++++++++++++++++++++++++++++-------------
>  1 file changed, 30 insertions(+), 13 deletions(-)
> 
> 
> diff --git a/scrub/unicrash.c b/scrub/unicrash.c
> index 10d7c14..3538a60 100644
> --- a/scrub/unicrash.c
> +++ b/scrub/unicrash.c
> @@ -77,6 +77,14 @@ struct unicrash {
>  #define UNICRASH_SZ(nr)		(sizeof(struct unicrash) + \
>  				 (nr * sizeof(struct name_entry *)))
>  
> +/* Things to complain about in Unicode naming. */
> +
> +/*
> + * Multiple names resolve to the same normalized string and therefore render
> + * identically.
> + */
> +#define UNICRASH_NOT_UNIQUE	(1 << 0)
> +
>  /*
>   * We only care about validating utf8 collisions if the underlying
>   * system configuration says we're using utf8.  If the language
> @@ -256,7 +264,7 @@ unicrash_complain(
>  	struct unicrash		*uc,
>  	const char		*descr,
>  	const char		*what,
> -	bool			unique,
> +	unsigned int		badflags,
>  	const char		*name,
>  	uint8_t			*uniname)
>  {
> @@ -266,11 +274,20 @@ unicrash_complain(
>  	bad1 = string_escape(name);
>  	bad2 = string_escape((char *)uniname);
>  
> -	if (!unique)
> +	/*
> +	 * Two names that normalize to the same string will render
> +	 * identically even though the filesystem considers them unique
> +	 * names.  "cafe\xcc\x81" and "caf\xc3\xa9" have different byte
> +	 * sequences, but they both appear as "café".
> +	 */
> +	if (badflags & UNICRASH_NOT_UNIQUE) {
>  		str_warn(uc->ctx, descr,
> -_("Duplicate normalized Unicode name \"%s\" found in %s."),
> -				bad1, what);
> +_("Unicode name \"%s\" in %s renders identically to \"%s\"."),
> +				bad1, what, bad2);
> +		goto out;
> +	}
>  
> +out:
>  	free(bad1);
>  	free(bad2);
>  }
> @@ -291,7 +308,7 @@ unicrash_add(
>  	struct unicrash		*uc,
>  	uint8_t			*uniname,
>  	xfs_ino_t		ino,
> -	bool			*unique)
> +	unsigned int		*badflags)
>  {
>  	struct name_entry	*ne;
>  	struct name_entry	*x;
> @@ -304,8 +321,9 @@ unicrash_add(
>  	hash = unicrash_hashname(uniname, uninamelen);
>  	bucket = hash % uc->nr_buckets;
>  	for (nep = &uc->buckets[bucket], ne = *nep; ne != NULL; ne = x) {
> -		if (u8_strcmp(uniname, ne->uniname) == 0) {
> -			*unique = uc->compare_ino ? ne->ino == ino : false;
> +		if (u8_strcmp(uniname, ne->uniname) == 0 &&
> +		    (uc->compare_ino ? ino != ne->ino : true)) {
> +			*badflags |= UNICRASH_NOT_UNIQUE;
>  			return true;
>  		}
>  		nep = &ne->next;
> @@ -321,7 +339,6 @@ unicrash_add(
>  	x->uninamelen = uninamelen;
>  	memcpy(x->uniname, uniname, uninamelen + 1);
>  	*nep = x;
> -	*unique = true;
>  
>  	return true;
>  }
> @@ -336,19 +353,19 @@ __unicrash_check_name(
>  	xfs_ino_t		ino)
>  {
>  	uint8_t			uniname[(NAME_MAX * 2) + 1];
> +	unsigned int		badflags = 0;
>  	bool			moveon;
> -	bool			unique;
>  
>  	memset(uniname, 0, (NAME_MAX * 2) + 1);
>  	unicrash_normalize(name, uniname, NAME_MAX * 2);
> -	moveon = unicrash_add(uc, uniname, ino, &unique);
> +	moveon = unicrash_add(uc, uniname, ino, &badflags);
>  	if (!moveon)
>  		return false;
>  
> -	if (unique)
> -		return true;
> +	if (badflags)
> +		unicrash_complain(uc, descr, namedescr, badflags, name,
> +				uniname);
>  
> -	unicrash_complain(uc, descr, namedescr, unique, name, uniname);
>  	return true;
>  }
>  
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/14] xfs_scrub: avoid buffer overflow when scanning attributes
  2018-04-03 17:30   ` Eric Sandeen
  2018-04-05  3:57     ` Darrick J. Wong
@ 2018-04-11  0:20     ` Darrick J. Wong
  1 sibling, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2018-04-11  0:20 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: sandeen, linux-xfs

On Tue, Apr 03, 2018 at 12:30:57PM -0500, Eric Sandeen wrote:
> On 3/20/18 10:39 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Avoid a buffer overflow when we're formatting extended attribute names
> > for name checking.
> 
> This won't /actually/ overflow, right, because you are doing
> snprintf(NAME_MAX) into a buffer of size [NAME_MAX + 1].
> 
> However, it might truncate the attribute string if too long.
> (just trying to avoid security folk wig-outs).
> 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  scrub/phase5.c |    8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > 
> > diff --git a/scrub/phase5.c b/scrub/phase5.c
> > index 8e0a1be..36821d0 100644
> > --- a/scrub/phase5.c
> > +++ b/scrub/phase5.c
> > @@ -143,6 +143,8 @@ static const struct xfs_attr_ns attr_ns[] = {
> >  	{ATTR_SECURE,		"secure"},
> >  	{0, NULL},
> >  };
> > +/* Enough space to handle the prefix. */
> > +#define ATTR_NAME_MAX		(NAME_MAX + 8)
> 
> Unrelated to this change, really, but should NAME_MAX be
> XATTR_NAME_MAX for clarity & consistency w/ the xattr code?
> 
> (it's defined in /usr/include/linux/limits.h)

Hm, I got bogged down here trying to figure out if XATTR_NAME_MAX
applied to the entire string "security.foo" or just the ".foo" part.
The answer is (according to getxattr()) the first, so I'll fix this to
use XATTR_NAME_MAX directly...

> 
> >  
> >  /*
> >   * Check all the xattr names in a particular namespace of a file handle
> > @@ -158,7 +160,7 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
> >  {
> >  	struct attrlist_cursor		cur;
> >  	char				attrbuf[XFS_XATTR_LIST_MAX];
> > -	char				keybuf[NAME_MAX + 1];
> > +	char				keybuf[ATTR_NAME_MAX + 1];
> >  	struct attrlist			*attrlist = (struct attrlist *)attrbuf;
> >  	struct attrlist_ent		*ent;
> >  	struct unicrash			*uc;
> > @@ -172,14 +174,14 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
> >  
> >  	memset(attrbuf, 0, XFS_XATTR_LIST_MAX);
> >  	memset(&cur, 0, sizeof(cur));
> > -	memset(keybuf, 0, NAME_MAX + 1);
> > +	memset(keybuf, 0, ATTR_NAME_MAX + 1);
> >  	error = attr_list_by_handle(handle, sizeof(*handle), attrbuf,
> >  			XFS_XATTR_LIST_MAX, attr_ns->flags, &cur);
> >  	while (!error) {
> >  		/* Examine the xattrs. */
> >  		for (i = 0; i < attrlist->al_count; i++) {
> >  			ent = ATTR_ENTRY(attrlist, i);
> > -			snprintf(keybuf, NAME_MAX, "%s.%s", attr_ns->name,
> 
> To future proof this rather than relying on a hardcoded 8 based on the
> current namespaces above, how about:
> 
> #define XATTR_NS_MAX 8
> #define ATTR_STRING_MAX		(XATTR_NS_MAX + XATTR_NAME_MAX + 1) /* for '.' */
> 
> char				keybuf[ATTR_STRING_MAX + 1];

...and then we can use it directly here instead of all these weird
NAME_MAX variants.

> 
> 			ASSERT(strlen(attr_ns->name) <= XATTR_NS_MAX);
> 			snprintf(keybuf, ATTR_STRING_MAX, "%s.%s", attr_ns->name,
> 					ent->a_name);
> 
> just in case someone adds a new 
> 
> 	ATTR_SUPERDELUXE, "superdeluxe"
> 
> namespace some day?  Is that overkill?

No, not overkill.

--D

> 
> > +			snprintf(keybuf, ATTR_NAME_MAX, "%s.%s", attr_ns->name,
> >  					ent->a_name);
> >  			moveon = xfs_scrub_check_name(ctx, descr,
> >  					_("extended attribute"), keybuf);
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v2 01/14] xfs_scrub: avoid buffer overflow when scanning attributes
  2018-03-21  3:39 ` [PATCH 01/14] xfs_scrub: avoid buffer overflow when scanning attributes Darrick J. Wong
  2018-04-03 17:30   ` Eric Sandeen
@ 2018-04-11  0:27   ` Darrick J. Wong
  1 sibling, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2018-04-11  0:27 UTC (permalink / raw)
  To: sandeen; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Avoid a buffer overflow when we're formatting extended attribute names
for name checking.  The kernel headers provide us with XATTR_NAME_MAX,
so we can rely on that.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v2: use XATTR_NAME_MAX per Eric Sandeen's suggestion
---
 scrub/phase5.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/scrub/phase5.c b/scrub/phase5.c
index 5f2a1a7..e0e7e8c 100644
--- a/scrub/phase5.c
+++ b/scrub/phase5.c
@@ -158,7 +158,7 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
 {
 	struct attrlist_cursor		cur;
 	char				attrbuf[XFS_XATTR_LIST_MAX];
-	char				keybuf[NAME_MAX + 1];
+	char				keybuf[XATTR_NAME_MAX + 1];
 	struct attrlist			*attrlist = (struct attrlist *)attrbuf;
 	struct attrlist_ent		*ent;
 	struct unicrash			*uc;
@@ -172,14 +172,14 @@ xfs_scrub_scan_fhandle_namespace_xattrs(
 
 	memset(attrbuf, 0, XFS_XATTR_LIST_MAX);
 	memset(&cur, 0, sizeof(cur));
-	memset(keybuf, 0, NAME_MAX + 1);
+	memset(keybuf, 0, XATTR_NAME_MAX + 1);
 	error = attr_list_by_handle(handle, sizeof(*handle), attrbuf,
 			XFS_XATTR_LIST_MAX, attr_ns->flags, &cur);
 	while (!error) {
 		/* Examine the xattrs. */
 		for (i = 0; i < attrlist->al_count; i++) {
 			ent = ATTR_ENTRY(attrlist, i);
-			snprintf(keybuf, NAME_MAX, "%s.%s", attr_ns->name,
+			snprintf(keybuf, XATTR_NAME_MAX, "%s.%s", attr_ns->name,
 					ent->a_name);
 			moveon = xfs_scrub_check_name(ctx, descr,
 					_("extended attribute"), keybuf);

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 11/14] xfs_scrub_all: report version
  2018-03-21  3:40 ` [PATCH 11/14] xfs_scrub_all: report version Darrick J. Wong
@ 2018-04-11  0:28   ` Eric Sandeen
  0 siblings, 0 replies; 31+ messages in thread
From: Eric Sandeen @ 2018-04-11  0:28 UTC (permalink / raw)
  To: Darrick J. Wong, sandeen; +Cc: linux-xfs

On 3/20/18 10:40 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Make xfs_scrub_all -V report its version like the other xfs tools.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

> ---
>  man/man8/xfs_scrub_all.8 |   10 ++++++++++
>  scrub/Makefile           |    1 +
>  scrub/xfs_scrub_all.in   |   11 +++++++++++
>  3 files changed, 22 insertions(+)
> 
> 
> diff --git a/man/man8/xfs_scrub_all.8 b/man/man8/xfs_scrub_all.8
> index 5e1420b..7454880 100644
> --- a/man/man8/xfs_scrub_all.8
> +++ b/man/man8/xfs_scrub_all.8
> @@ -3,6 +3,9 @@
>  xfs_scrub_all \- scrub all mounted XFS filesystems
>  .SH SYNOPSIS
>  .B xfs_scrub_all
> +[
> +.B \-hV
> +]
>  .SH DESCRIPTION
>  .B xfs_scrub_all
>  attempts to read and check all the metadata on all mounted XFS filesystems.
> @@ -13,6 +16,13 @@ in a restricted fashion.
>  Mounted filesystems are mapped to physical storage devices so that scrub
>  operations can be run in parallel so long as no two scrubbers access
>  the same device simultaneously.
> +.SH OPTIONS
> +.TP
> +.B \-h
> +Display help.
> +.TP
> +.B \-V
> +Prints the version number and exits.
>  .SH EXIT CODE
>  The exit code returned by
>  .B xfs_scrub_all
> diff --git a/scrub/Makefile b/scrub/Makefile
> index bcc05a0..482e8a7 100644
> --- a/scrub/Makefile
> +++ b/scrub/Makefile
> @@ -102,6 +102,7 @@ default: depend $(LTCOMMAND) $(XFS_SCRUB_ALL_PROG) $(OPTIONAL_TARGETS)
>  xfs_scrub_all: xfs_scrub_all.in
>  	@echo "    [SED]    $@"
>  	$(Q)$(SED) -e "s|@sbindir@|$(PKG_ROOT_SBIN_DIR)|g" \
> +		   -e "s|@pkg_version@|$(PKG_VERSION)|g" \
>  		   -e "s|@scrub_args@|$(XFS_SCRUB_ARGS)|g" < $< > $@
>  	$(Q)chmod a+x $@
>  
> diff --git a/scrub/xfs_scrub_all.in b/scrub/xfs_scrub_all.in
> index 80f07d5..aed66a1 100644
> --- a/scrub/xfs_scrub_all.in
> +++ b/scrub/xfs_scrub_all.in
> @@ -26,6 +26,7 @@ import threading
>  import time
>  import sys
>  import os
> +import argparse
>  
>  retcode = 0
>  terminate = False
> @@ -139,6 +140,16 @@ def main():
>  		thr.start()
>  	global retcode, terminate
>  
> +	parser = argparse.ArgumentParser( \
> +			description = "Scrub all mounted XFS filesystems.")
> +	parser.add_argument("-V", help = "Report version and exit.", \
> +			action = "store_true")
> +	args = parser.parse_args()
> +
> +	if args.V:
> +		print("xfs_scrub_all version @pkg_version@")
> +		sys.exit(0)
> +
>  	fs = find_mounts()
>  
>  	# Tail the journal if we ourselves aren't a service...
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 13/14] xfs_scrub_all: escape paths being passed to systemd service instances
  2018-03-21  3:41 ` [PATCH 13/14] xfs_scrub_all: escape paths being passed to systemd service instances Darrick J. Wong
@ 2018-04-11  1:31   ` Eric Sandeen
  0 siblings, 0 replies; 31+ messages in thread
From: Eric Sandeen @ 2018-04-11  1:31 UTC (permalink / raw)
  To: Darrick J. Wong, sandeen; +Cc: linux-xfs



On 3/20/18 10:41 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> systemd doesn't like unit instance names with slashes in them, so it
> replaces them with dashes when it invokes the service.  However, it's
> not smart enough to convert the dashes to something else, so when it
> unescapes the instance name to feed to xfs_scrub, it turns all dashes
> into slashes.  "/moo-cow" becomes "-moo-cow" becomes "/moo/cow", which
> is wrong.  systemd actually /can/ escape the dashes correctly if it is
> told that this is a path (and not a unit name), but it didn't do this
> prior to January 2017, so fix this for them.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Oh-good-god-whatever-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>

> ---
>  scrub/xfs_scrub_all.in |   24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> 
> diff --git a/scrub/xfs_scrub_all.in b/scrub/xfs_scrub_all.in
> index aed66a1..83c4e21 100644
> --- a/scrub/xfs_scrub_all.in
> +++ b/scrub/xfs_scrub_all.in
> @@ -87,6 +87,28 @@ def run_killable(cmd, stdout, killfuncs, kill_fn):
>  	except:
>  		return -1
>  
> +# systemd doesn't like unit instance names with slashes in them, so it
> +# replaces them with dashes when it invokes the service.  However, it's not
> +# smart enough to convert the dashes to something else, so when it unescapes
> +# the instance name to feed to xfs_scrub, it turns all dashes into slashes.
> +# "/moo-cow" becomes "-moo-cow" becomes "/moo/cow", which is wrong.  systemd
> +# actually /can/ escape the dashes correctly if it is told that this is a path
> +# (and not a unit name), but it didn't do this prior to January 2017, so fix
> +# this for them.
> +def systemd_escape(path):
> +	'''Escape a path to avoid mangled systemd mangling.'''
> +
> +	if '-' not in path:
> +		return path
> +	cmd = ['systemd-escape', '--path', path]
> +	try:
> +		proc = subprocess.Popen(cmd, stdout = subprocess.PIPE)
> +		proc.wait()
> +		for line in proc.stdout:
> +			return '-' + line.decode(sys.stdout.encoding).strip()
> +	except:
> +		return path
> +
>  def run_scrub(mnt, cond, running_devs, mntdevs, killfuncs):
>  	'''Run a scrub process.'''
>  	global retcode, terminate
> @@ -99,7 +121,7 @@ def run_scrub(mnt, cond, running_devs, mntdevs, killfuncs):
>  			return
>  
>  		# Try it the systemd way
> -		cmd=['systemctl', 'start', 'xfs_scrub@%s' % mnt]
> +		cmd=['systemctl', 'start', 'xfs_scrub@%s' % systemd_escape(mnt)]
>  		ret = run_killable(cmd, DEVNULL(), killfuncs, \
>  				lambda proc: kill_systemd('xfs_scrub@%s' % mnt, proc))
>  		if ret == 0 or ret == 1:
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 14/14] xfs_scrub_all: use system encoding for lsblk output decoding
  2018-03-21  3:41 ` [PATCH 14/14] xfs_scrub_all: use system encoding for lsblk output decoding Darrick J. Wong
@ 2018-04-11  1:35   ` Eric Sandeen
  0 siblings, 0 replies; 31+ messages in thread
From: Eric Sandeen @ 2018-04-11  1:35 UTC (permalink / raw)
  To: Darrick J. Wong, sandeen; +Cc: linux-xfs



On 3/20/18 10:41 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Don't hardcode utf-8 as the decoding scheme for lsblk output, since the
> system could set it to anything else.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Its-all-python-to-me-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>

> ---
>  scrub/xfs_scrub_all.in |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> 
> diff --git a/scrub/xfs_scrub_all.in b/scrub/xfs_scrub_all.in
> index 83c4e21..d374f92 100644
> --- a/scrub/xfs_scrub_all.in
> +++ b/scrub/xfs_scrub_all.in
> @@ -48,7 +48,7 @@ def find_mounts():
>  	result.wait()
>  	if result.returncode != 0:
>  		return fs
> -	sarray = [x.decode('utf-8') for x in result.stdout.readlines()]
> +	sarray = [x.decode(sys.stdout.encoding) for x in result.stdout.readlines()]
>  	output = ' '.join(sarray)
>  	bdevdata = json.loads(output)
>  	# The lsblk output had better be in disks-then-partitions order
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 12/14] xfs_scrub: disable private /tmp for scrub service
  2018-03-21  3:40 ` [PATCH 12/14] xfs_scrub: disable private /tmp for scrub service Darrick J. Wong
@ 2018-04-11  1:45   ` Eric Sandeen
  2018-04-11  1:49     ` Darrick J. Wong
  2018-04-11  1:53   ` [PATCH v2 " Darrick J. Wong
  1 sibling, 1 reply; 31+ messages in thread
From: Eric Sandeen @ 2018-04-11  1:45 UTC (permalink / raw)
  To: Darrick J. Wong, sandeen; +Cc: linux-xfs



On 3/20/18 10:40 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Don't make /tmp private when invoking xfs_scrub as a service, because
> /tmp might contain or itself be an xfs filesystem mountpoint.

Could you please add a comment to this so that future security analysts
don't change it back?  :)

# xfs_scrub doesn't even use /tmp but <this is why we do this here>

> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  scrub/xfs_scrub@.service.in |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> 
> diff --git a/scrub/xfs_scrub@.service.in b/scrub/xfs_scrub@.service.in
> index c14f813..9e6206a 100644
> --- a/scrub/xfs_scrub@.service.in
> +++ b/scrub/xfs_scrub@.service.in
> @@ -9,7 +9,7 @@ WorkingDirectory=%I
>  PrivateNetwork=true
>  ProtectSystem=full
>  ProtectHome=read-only
> -PrivateTmp=yes
> +PrivateTmp=no
>  AmbientCapabilities=CAP_SYS_ADMIN CAP_FOWNER CAP_DAC_OVERRIDE CAP_DAC_READ_SEARCH CAP_SYS_RAWIO
>  NoNewPrivileges=yes
>  User=nobody
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 10/14] xfs_scrub: refactor mountpoint finding code to use libfrog path code
  2018-03-21  3:40 ` [PATCH 10/14] xfs_scrub: refactor mountpoint finding code to use libfrog path code Darrick J. Wong
@ 2018-04-11  1:48   ` Eric Sandeen
  0 siblings, 0 replies; 31+ messages in thread
From: Eric Sandeen @ 2018-04-11  1:48 UTC (permalink / raw)
  To: Darrick J. Wong, sandeen; +Cc: linux-xfs



On 3/20/18 10:40 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Use the libfrog path finding code to determine if the argument being
> passed in is a mountpoint, remove all mention of taking a block device
> (we have never supported that) from the documentation, and fix some
> potential memory leaks.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

now with 100% mount point!

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

> ---
>  man/man8/xfs_scrub.8 |    2 +
>  scrub/common.c       |   69 --------------------------------------------------
>  scrub/common.h       |    1 -
>  scrub/phase1.c       |   12 ---------
>  scrub/xfs_scrub.c    |   19 +++++++-------
>  scrub/xfs_scrub.h    |    1 -
>  6 files changed, 10 insertions(+), 94 deletions(-)
> 
> 
> diff --git a/man/man8/xfs_scrub.8 b/man/man8/xfs_scrub.8
> index 77fed92..680ef72 100644
> --- a/man/man8/xfs_scrub.8
> +++ b/man/man8/xfs_scrub.8
> @@ -6,7 +6,7 @@ xfs_scrub \- check the contents of a mounted XFS filesystem
>  [
>  .B \-abCemnTvx
>  ]
> -.RI "[" mount-point " | " block-device "]"
> +.I mount-point
>  .br
>  .B xfs_scrub \-V
>  .SH DESCRIPTION
> diff --git a/scrub/common.c b/scrub/common.c
> index 5a37a98..722bf36 100644
> --- a/scrub/common.c
> +++ b/scrub/common.c
> @@ -267,75 +267,6 @@ scrub_nproc_workqueue(
>  }
>  
>  /*
> - * Check if the argument is either the device name or mountpoint of a mounted
> - * filesystem.
> - */
> -#define MNTTYPE_XFS	"xfs"
> -static bool
> -find_mountpoint_check(
> -	struct stat		*sb,
> -	struct mntent		*t)
> -{
> -	struct stat		ms;
> -
> -	if (S_ISDIR(sb->st_mode)) {		/* mount point */
> -		if (stat(t->mnt_dir, &ms) < 0)
> -			return false;
> -		if (sb->st_ino != ms.st_ino)
> -			return false;
> -		if (sb->st_dev != ms.st_dev)
> -			return false;
> -		if (strcmp(t->mnt_type, MNTTYPE_XFS) != 0)
> -			return NULL;
> -	} else {				/* device */
> -		if (stat(t->mnt_fsname, &ms) < 0)
> -			return false;
> -		if (sb->st_rdev != ms.st_rdev)
> -			return false;
> -		if (strcmp(t->mnt_type, MNTTYPE_XFS) != 0)
> -			return NULL;
> -		/*
> -		 * Make sure the mountpoint given by mtab is accessible
> -		 * before using it.
> -		 */
> -		if (stat(t->mnt_dir, &ms) < 0)
> -			return false;
> -	}
> -
> -	return true;
> -}
> -
> -/* Check that our alleged mountpoint is in mtab */
> -bool
> -find_mountpoint(
> -	char			*mtab,
> -	struct scrub_ctx	*ctx)
> -{
> -	struct mntent_cursor	cursor;
> -	struct mntent		*t = NULL;
> -	bool			found = false;
> -
> -	if (platform_mntent_open(&cursor, mtab) != 0) {
> -		fprintf(stderr, "Error: can't get mntent entries.\n");
> -		exit(1);
> -	}
> -
> -	while ((t = platform_mntent_next(&cursor)) != NULL) {
> -		/*
> -		 * Keep jotting down matching mount details; newer mounts are
> -		 * towards the end of the file (hopefully).
> -		 */
> -		if (find_mountpoint_check(&ctx->mnt_sb, t)) {
> -			ctx->mntpoint = strdup(t->mnt_dir);
> -			ctx->blkdev = strdup(t->mnt_fsname);
> -			found = true;
> -		}
> -	}
> -	platform_mntent_close(&cursor);
> -	return found;
> -}
> -
> -/*
>   * Sleep for 100ms * however many -b we got past the initial one.
>   * This is an (albeit clumsy) way to throttle scrub activity.
>   */
> diff --git a/scrub/common.h b/scrub/common.h
> index 4f1f0cd..bc83971 100644
> --- a/scrub/common.h
> +++ b/scrub/common.h
> @@ -88,7 +88,6 @@ static inline int syncfs(int fd)
>  }
>  #endif
>  
> -bool find_mountpoint(char *mtab, struct scrub_ctx *ctx);
>  void background_sleep(void);
>  char *string_escape(const char *in);
>  
> diff --git a/scrub/phase1.c b/scrub/phase1.c
> index 9926770..c2b9067 100644
> --- a/scrub/phase1.c
> +++ b/scrub/phase1.c
> @@ -83,7 +83,6 @@ bool
>  xfs_setup_fs(
>  	struct scrub_ctx		*ctx)
>  {
> -	struct fs_path			*fsp;
>  	int				error;
>  
>  	/*
> @@ -178,17 +177,6 @@ _("Kernel metadata repair facility is not available.  Use -n to scrub."));
>  		return false;
>  	}
>  
> -	/* Go find the XFS devices if we have a usable fsmap. */
> -	fs_table_initialise(0, NULL, 0, NULL);
> -	errno = 0;
> -	fsp = fs_table_lookup(ctx->mntpoint, FS_MOUNT_POINT);
> -	if (!fsp) {
> -		str_info(ctx, ctx->mntpoint,
> -_("Unable to find XFS information."));
> -		return false;
> -	}
> -	memcpy(&ctx->fsinfo, fsp, sizeof(struct fs_path));
> -
>  	/* Did we find the log and rt devices, if they're present? */
>  	if (ctx->geo.logstart == 0 && ctx->fsinfo.fs_log == NULL) {
>  		str_info(ctx, ctx->mntpoint,
> diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
> index 2d55283..a5b5cf2 100644
> --- a/scrub/xfs_scrub.c
> +++ b/scrub/xfs_scrub.c
> @@ -170,7 +170,7 @@ bool				is_service;
>  static void __attribute__((noreturn))
>  usage(void)
>  {
> -	fprintf(stderr, _("Usage: %s [OPTIONS] mountpoint | device\n"), progname);
> +	fprintf(stderr, _("Usage: %s [OPTIONS] mountpoint\n"), progname);
>  	fprintf(stderr, "\n");
>  	fprintf(stderr, _("Options:\n"));
>  	fprintf(stderr, _("  -a count     Stop after this many errors are found.\n"));
> @@ -538,8 +538,8 @@ main(
>  	struct phase_rusage	all_pi;
>  	char			*mtab = NULL;
>  	FILE			*progress_fp = NULL;
> +	struct fs_path		*fsp;
>  	bool			moveon = true;
> -	bool			ismnt;
>  	int			c;
>  	int			fd;
>  	int			ret = SCRUB_RET_SUCCESS;
> @@ -640,7 +640,7 @@ main(
>  	if (optind != argc - 1)
>  		usage();
>  
> -	ctx.mntpoint = strdup(argv[optind]);
> +	ctx.mntpoint = argv[optind];
>  
>  	stdout_isatty = isatty(STDOUT_FILENO);
>  	stderr_isatty = isatty(STDERR_FILENO);
> @@ -680,14 +680,15 @@ main(
>  			mtab = _PATH_MOUNTED;
>  	}
>  
> -	ismnt = find_mountpoint(mtab, &ctx);
> -	if (!ismnt) {
> -		fprintf(stderr,
> -_("%s: Not a XFS mount point or block device.\n"),
> -			ctx.mntpoint);
> +	fs_table_initialise(0, NULL, 0, NULL);
> +	fsp = fs_table_lookup_mount(ctx.mntpoint);
> +	if (!fsp) {
> +		fprintf(stderr, _("%s: Not a XFS mount point.\n"),
> +				ctx.mntpoint);
>  		ret |= SCRUB_RET_SYNTAX;
>  		goto out;
>  	}
> +	memcpy(&ctx.fsinfo, fsp, sizeof(struct fs_path));
>  
>  	/* How many CPUs? */
>  	nproc = sysconf(_SC_NPROCESSORS_ONLN);
> @@ -740,8 +741,6 @@ _("%s: Not a XFS mount point or block device.\n"),
>  	phase_end(&all_pi, 0);
>  	if (progress_fp)
>  		fclose(progress_fp);
> -	free(ctx.blkdev);
> -	free(ctx.mntpoint);
>  
>  	/*
>  	 * If we're being run as a service, the return code must fit the LSB
> diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
> index aa130a7..c9dbe8e 100644
> --- a/scrub/xfs_scrub.h
> +++ b/scrub/xfs_scrub.h
> @@ -48,7 +48,6 @@ struct scrub_ctx {
>  
>  	/* Strings we need for presentation */
>  	char			*mntpoint;
> -	char			*blkdev;
>  
>  	/* Mountpoint info */
>  	struct stat		mnt_sb;
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 12/14] xfs_scrub: disable private /tmp for scrub service
  2018-04-11  1:45   ` Eric Sandeen
@ 2018-04-11  1:49     ` Darrick J. Wong
  0 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2018-04-11  1:49 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: sandeen, linux-xfs

On Tue, Apr 10, 2018 at 08:45:23PM -0500, Eric Sandeen wrote:
> 
> 
> On 3/20/18 10:40 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Don't make /tmp private when invoking xfs_scrub as a service, because
> > /tmp might contain or itself be an xfs filesystem mountpoint.
> 
> Could you please add a comment to this so that future security analysts
> don't change it back?  :)

# Disable private /tmp just in case %i is a path under /tmp.

--D

> # xfs_scrub doesn't even use /tmp but <this is why we do this here>
> 
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  scrub/xfs_scrub@.service.in |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > 
> > diff --git a/scrub/xfs_scrub@.service.in b/scrub/xfs_scrub@.service.in
> > index c14f813..9e6206a 100644
> > --- a/scrub/xfs_scrub@.service.in
> > +++ b/scrub/xfs_scrub@.service.in
> > @@ -9,7 +9,7 @@ WorkingDirectory=%I
> >  PrivateNetwork=true
> >  ProtectSystem=full
> >  ProtectHome=read-only
> > -PrivateTmp=yes
> > +PrivateTmp=no
> >  AmbientCapabilities=CAP_SYS_ADMIN CAP_FOWNER CAP_DAC_OVERRIDE CAP_DAC_READ_SEARCH CAP_SYS_RAWIO
> >  NoNewPrivileges=yes
> >  User=nobody
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v2 12/14] xfs_scrub: disable private /tmp for scrub service
  2018-03-21  3:40 ` [PATCH 12/14] xfs_scrub: disable private /tmp for scrub service Darrick J. Wong
  2018-04-11  1:45   ` Eric Sandeen
@ 2018-04-11  1:53   ` Darrick J. Wong
  1 sibling, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2018-04-11  1:53 UTC (permalink / raw)
  To: sandeen; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Don't make /tmp private when invoking xfs_scrub as a service, because
/tmp might contain or itself be an xfs filesystem mountpoint.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v2: add a comment explaining why we turned it off
---
 scrub/xfs_scrub@.service.in |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scrub/xfs_scrub@.service.in b/scrub/xfs_scrub@.service.in
index c14f813..56acea6 100644
--- a/scrub/xfs_scrub@.service.in
+++ b/scrub/xfs_scrub@.service.in
@@ -9,7 +9,8 @@ WorkingDirectory=%I
 PrivateNetwork=true
 ProtectSystem=full
 ProtectHome=read-only
-PrivateTmp=yes
+# Disable private /tmp just in case %i is a path under /tmp.
+PrivateTmp=no
 AmbientCapabilities=CAP_SYS_ADMIN CAP_FOWNER CAP_DAC_OVERRIDE CAP_DAC_READ_SEARCH CAP_SYS_RAWIO
 NoNewPrivileges=yes
 User=nobody

^ permalink raw reply related	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2018-04-11  1:53 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-21  3:39 [PATCH 00/14] xfsprogs: online scrub fixes Darrick J. Wong
2018-03-21  3:39 ` [PATCH 01/14] xfs_scrub: avoid buffer overflow when scanning attributes Darrick J. Wong
2018-04-03 17:30   ` Eric Sandeen
2018-04-05  3:57     ` Darrick J. Wong
2018-04-11  0:20     ` Darrick J. Wong
2018-04-11  0:27   ` [PATCH v2 " Darrick J. Wong
2018-03-21  3:39 ` [PATCH 02/14] xfs_scrub: only run ascii name checks if unicode name checker Darrick J. Wong
2018-04-03 17:49   ` Eric Sandeen
2018-03-21  3:39 ` [PATCH 03/14] xfs_scrub: don't complain about different normalization Darrick J. Wong
2018-04-10 23:37   ` Eric Sandeen
2018-03-21  3:40 ` [PATCH 04/14] xfs_scrub: communicate name problems via flagset instead of booleans Darrick J. Wong
2018-04-10 23:46   ` Eric Sandeen
2018-03-21  3:40 ` [PATCH 05/14] xfs_scrub: make name_entry a first class structure Darrick J. Wong
2018-03-21  3:40 ` [PATCH 06/14] xfs_scrub: transition from libunistring to libicu for Unicode processing Darrick J. Wong
2018-03-21  3:40 ` [PATCH 07/14] xfs_scrub: check name for suspicious characters Darrick J. Wong
2018-03-21  3:40 ` [PATCH 08/14] xfs_scrub: use Unicode skeleton function to find confusing names Darrick J. Wong
2018-03-26 19:58   ` [PATCH v2 " Darrick J. Wong
2018-03-21  3:40 ` [PATCH 09/14] xfs_scrub: don't warn about confusing names if dir/file only writable by root Darrick J. Wong
2018-03-26 19:59   ` [PATCH v2 " Darrick J. Wong
2018-03-21  3:40 ` [PATCH 10/14] xfs_scrub: refactor mountpoint finding code to use libfrog path code Darrick J. Wong
2018-04-11  1:48   ` Eric Sandeen
2018-03-21  3:40 ` [PATCH 11/14] xfs_scrub_all: report version Darrick J. Wong
2018-04-11  0:28   ` Eric Sandeen
2018-03-21  3:40 ` [PATCH 12/14] xfs_scrub: disable private /tmp for scrub service Darrick J. Wong
2018-04-11  1:45   ` Eric Sandeen
2018-04-11  1:49     ` Darrick J. Wong
2018-04-11  1:53   ` [PATCH v2 " Darrick J. Wong
2018-03-21  3:41 ` [PATCH 13/14] xfs_scrub_all: escape paths being passed to systemd service instances Darrick J. Wong
2018-04-11  1:31   ` Eric Sandeen
2018-03-21  3:41 ` [PATCH 14/14] xfs_scrub_all: use system encoding for lsblk output decoding Darrick J. Wong
2018-04-11  1:35   ` Eric Sandeen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.