All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET v3 0/5] xfsprogs: fix ascii-ci problems, then kill it
@ 2023-06-05 15:36 Darrick J. Wong
  2023-06-05 15:36 ` [PATCH 1/5] libxfs: test the ascii case-insensitive hash Darrick J. Wong
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Darrick J. Wong @ 2023-06-05 15:36 UTC (permalink / raw)
  To: djwong, cem; +Cc: Christoph Hellwig, linux-xfs, david, hch

Hi all,

Last week, I was fiddling around with the metadump name obfuscation code
while writing a debugger command to generate directories full of names
that all have the same hash name.  I had a few questions about how well
all that worked with ascii-ci mode, and discovered a nasty discrepancy
between the kernel and glibc's implementations of the tolower()
function.

I discovered that I could create a directory that is large enough to
require separate leaf index blocks.  The hashes stored in the dabtree
use the ascii-ci specific hash function, which uses a library function
to convert the name to lowercase before hashing.  If the kernel and C
library's versions of tolower do not behave exactly identically,
xfs_ascii_ci_hashname will not produce the same results for the same
inputs.  xfs_repair will deem the leaf information corrupt and rebuild
the directory.  After that, lookups in the kernel will fail because the
hash index doesn't work.

The kernel's tolower function will convert extended ascii uppercase
letters (e.g. A-with-umlaut) to extended ascii lowercase letters (e.g.
a-with-umlaut), whereas glibc's will only do that if you force LANG to
ascii.  Tiny embedded libc implementations just plain won't do it at
all, and the result is a mess.  Stabilize the behavior of the hash
function by encoding the name transformation function in libxfs, add it
to the selftest, and fix all the userspace tools, none of which handle
this transformation correctly.

The v1 series generated a /lot/ of discussion, in which several things
became very clear: (1) Linus is not enamored of case folding of any
kind; (2) Dave and Christoph don't seem to agree on whether the feature
is supposed to work for 7-bit ascii or latin1; (3) it trashes UTF8
encoded names if those happen to show up; and (4) I don't want to
maintain this mess any longer than I have to.  Kill it in 2030.

v3: rebase on 6.4
v2: rename the functions to make it clear we're moving away from the
letters t, o, l, o, w, e, and r; and deprecate the whole feature once
we've fixed the bugs and added tests.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fix-asciici-bugs

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=fix-asciici-bugs

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=fix-asciici-bugs
---
 db/metadump.c            |   79 +++++++++++++++--
 libfrog/dahashselftest.h |  208 ++++++++++++++++++++++++----------------------
 libxfs/libxfs_api_defs.h |    2 
 man/man8/mkfs.xfs.8.in   |   23 ++++-
 mkfs/xfs_mkfs.c          |   11 ++
 5 files changed, 210 insertions(+), 113 deletions(-)


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/5] libxfs: test the ascii case-insensitive hash
  2023-06-05 15:36 [PATCHSET v3 0/5] xfsprogs: fix ascii-ci problems, then kill it Darrick J. Wong
@ 2023-06-05 15:36 ` Darrick J. Wong
  2023-06-05 16:30   ` Eric Sandeen
  2023-06-14  7:44   ` Carlos Maiolino
  2023-06-05 15:36 ` [PATCH 2/5] xfs_db: move obfuscate_name assertion to callers Darrick J. Wong
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 15+ messages in thread
From: Darrick J. Wong @ 2023-06-05 15:36 UTC (permalink / raw)
  To: djwong, cem; +Cc: linux-xfs, david, hch

From: Darrick J. Wong <djwong@kernel.org>

Now that we've made kernel and userspace use the same tolower code for
computing directory index hashes, add that to the selftest code.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 libfrog/dahashselftest.h |  208 ++++++++++++++++++++++++----------------------
 libxfs/libxfs_api_defs.h |    2 
 2 files changed, 110 insertions(+), 100 deletions(-)


diff --git a/libfrog/dahashselftest.h b/libfrog/dahashselftest.h
index 7dda53037b8..ea9d925bf2a 100644
--- a/libfrog/dahashselftest.h
+++ b/libfrog/dahashselftest.h
@@ -13,108 +13,109 @@ static struct dahash_test {
 	uint16_t	start;	/* random 12 bit offset in buf */
 	uint16_t	length;	/* random 8 bit length of test */
 	xfs_dahash_t	dahash;	/* expected dahash result */
+	xfs_dahash_t	ascii_ci_dahash; /* expected ascii-ci dahash result */
 } dahash_tests[] =
 {
-	{0x0567, 0x0097, 0x96951389},
-	{0x0869, 0x0055, 0x6455ab4f},
-	{0x0c51, 0x00be, 0x8663afde},
-	{0x044a, 0x00fc, 0x98fbe432},
-	{0x0f29, 0x0079, 0x42371997},
-	{0x08ba, 0x0052, 0x942be4f7},
-	{0x01f2, 0x0013, 0x5262687e},
-	{0x09e3, 0x00e2, 0x8ffb0908},
-	{0x007c, 0x0051, 0xb3158491},
-	{0x0854, 0x001f, 0x83bb20d9},
-	{0x031b, 0x0008, 0x98970bdf},
-	{0x0de7, 0x0027, 0xbfbf6f6c},
-	{0x0f76, 0x0005, 0x906a7105},
-	{0x092e, 0x00d0, 0x86631850},
-	{0x0233, 0x0082, 0xdbdd914e},
-	{0x04c9, 0x0075, 0x5a400a9e},
-	{0x0b66, 0x0099, 0xae128b45},
-	{0x000d, 0x00ed, 0xe61c216a},
-	{0x0a31, 0x003d, 0xf69663b9},
-	{0x00a3, 0x0052, 0x643c39ae},
-	{0x0125, 0x00d5, 0x7c310b0d},
-	{0x0105, 0x004a, 0x06a77e74},
-	{0x0858, 0x008e, 0x265bc739},
-	{0x045e, 0x0095, 0x13d6b192},
-	{0x0dab, 0x003c, 0xc4498704},
-	{0x00cd, 0x00b5, 0x802a4e2d},
-	{0x069b, 0x008c, 0x5df60f71},
-	{0x0454, 0x006c, 0x5f03d8bb},
-	{0x040e, 0x0032, 0x0ce513b5},
-	{0x0874, 0x00e2, 0x6a811fb3},
-	{0x0521, 0x00b4, 0x93296833},
-	{0x0ddc, 0x00cf, 0xf9305338},
-	{0x0a70, 0x0023, 0x239549ea},
-	{0x083e, 0x0027, 0x2d88ba97},
-	{0x0241, 0x00a7, 0xfe0b32e1},
-	{0x0dfc, 0x0096, 0x1a11e815},
-	{0x023e, 0x001e, 0xebc9a1f3},
-	{0x067e, 0x0066, 0xb1067f81},
-	{0x09ea, 0x000e, 0x46fd7247},
-	{0x036b, 0x008c, 0x1a39acdf},
-	{0x078f, 0x0030, 0x964042ab},
-	{0x085c, 0x008f, 0x1829edab},
-	{0x02ec, 0x009f, 0x6aefa72d},
-	{0x043b, 0x00ce, 0x65642ff5},
-	{0x0a32, 0x00b8, 0xbd82759e},
-	{0x0d3c, 0x0087, 0xf4d66d54},
-	{0x09ec, 0x008a, 0x06bfa1ff},
-	{0x0902, 0x0015, 0x755025d2},
-	{0x08fe, 0x000e, 0xf690ce2d},
-	{0x00fb, 0x00dc, 0xe55f1528},
-	{0x0eaa, 0x003a, 0x0fe0a8d7},
-	{0x05fb, 0x0006, 0x86281cfb},
-	{0x0dd1, 0x00a7, 0x60ab51b4},
-	{0x0005, 0x001b, 0xf51d969b},
-	{0x077c, 0x00dd, 0xc2fed268},
-	{0x0575, 0x00f5, 0x432c0b1a},
-	{0x05be, 0x0088, 0x78baa04b},
-	{0x0c89, 0x0068, 0xeda9e428},
-	{0x0f5c, 0x0068, 0xec143c76},
-	{0x06a8, 0x0009, 0xd72651ce},
-	{0x060f, 0x008e, 0x765426cd},
-	{0x07b1, 0x0047, 0x2cfcfa0c},
-	{0x04f1, 0x0041, 0x55b172f9},
-	{0x0e05, 0x00ac, 0x61efde93},
-	{0x0bf7, 0x0097, 0x05b83eee},
-	{0x04e9, 0x00f3, 0x9928223a},
-	{0x023a, 0x0005, 0xdfada9bc},
-	{0x0acb, 0x000e, 0x2217cecd},
-	{0x0148, 0x0060, 0xbc3f7405},
-	{0x0764, 0x0059, 0xcbc201b1},
-	{0x021f, 0x0059, 0x5d6b2256},
-	{0x0f1e, 0x006c, 0xdefeeb45},
-	{0x071c, 0x00b9, 0xb9b59309},
-	{0x0564, 0x0063, 0xae064271},
-	{0x0b14, 0x0044, 0xdb867d9b},
-	{0x0e5a, 0x0055, 0xff06b685},
-	{0x015e, 0x00ba, 0x1115ccbc},
-	{0x0379, 0x00e6, 0x5f4e58dd},
-	{0x013b, 0x0067, 0x4897427e},
-	{0x0e64, 0x0071, 0x7af2b7a4},
-	{0x0a11, 0x0050, 0x92105726},
-	{0x0109, 0x0055, 0xd0d000f9},
-	{0x00aa, 0x0022, 0x815d229d},
-	{0x09ac, 0x004f, 0x02f9d985},
-	{0x0e1b, 0x00ce, 0x5cf92ab4},
-	{0x08af, 0x00d8, 0x17ca72d1},
-	{0x0e33, 0x000a, 0xda2dba6b},
-	{0x0ee3, 0x006a, 0xb00048e5},
-	{0x0648, 0x001a, 0x2364b8cb},
-	{0x0315, 0x0085, 0x0596fd0d},
-	{0x0fbb, 0x003e, 0x298230ca},
-	{0x0422, 0x006a, 0x78ada4ab},
-	{0x04ba, 0x0073, 0xced1fbc2},
-	{0x007d, 0x0061, 0x4b7ff236},
-	{0x070b, 0x00d0, 0x261cf0ae},
-	{0x0c1a, 0x0035, 0x8be92ee2},
-	{0x0af8, 0x0063, 0x824dcf03},
-	{0x08f8, 0x006d, 0xd289710c},
-	{0x021b, 0x00ee, 0x6ac1c41d},
-	{0x05b5, 0x00da, 0x8e52f0e2},
+	{0x0567, 0x0097, 0x96951389, 0xc153aa0d},
+	{0x0869, 0x0055, 0x6455ab4f, 0xd07f69bf},
+	{0x0c51, 0x00be, 0x8663afde, 0xf9add90c},
+	{0x044a, 0x00fc, 0x98fbe432, 0xbf2abb76},
+	{0x0f29, 0x0079, 0x42371997, 0x282588b3},
+	{0x08ba, 0x0052, 0x942be4f7, 0x2e023547},
+	{0x01f2, 0x0013, 0x5262687e, 0x5266287e},
+	{0x09e3, 0x00e2, 0x8ffb0908, 0x1da892f3},
+	{0x007c, 0x0051, 0xb3158491, 0xb67f9e63},
+	{0x0854, 0x001f, 0x83bb20d9, 0x22bb21db},
+	{0x031b, 0x0008, 0x98970bdf, 0x9cd70adf},
+	{0x0de7, 0x0027, 0xbfbf6f6c, 0xae3f296c},
+	{0x0f76, 0x0005, 0x906a7105, 0x906a7105},
+	{0x092e, 0x00d0, 0x86631850, 0xa3f6ac04},
+	{0x0233, 0x0082, 0xdbdd914e, 0x5d8c7aac},
+	{0x04c9, 0x0075, 0x5a400a9e, 0x12f60711},
+	{0x0b66, 0x0099, 0xae128b45, 0x7551310d},
+	{0x000d, 0x00ed, 0xe61c216a, 0xc22d3c4c},
+	{0x0a31, 0x003d, 0xf69663b9, 0x51960bf8},
+	{0x00a3, 0x0052, 0x643c39ae, 0xa93c73a8},
+	{0x0125, 0x00d5, 0x7c310b0d, 0xf221cbb3},
+	{0x0105, 0x004a, 0x06a77e74, 0xa4ef4561},
+	{0x0858, 0x008e, 0x265bc739, 0xd6c36d9b},
+	{0x045e, 0x0095, 0x13d6b192, 0x5f5c1d62},
+	{0x0dab, 0x003c, 0xc4498704, 0x10414654},
+	{0x00cd, 0x00b5, 0x802a4e2d, 0xfbd17c9d},
+	{0x069b, 0x008c, 0x5df60f71, 0x91ddca5f},
+	{0x0454, 0x006c, 0x5f03d8bb, 0x5c59fce0},
+	{0x040e, 0x0032, 0x0ce513b5, 0xa8cd99b1},
+	{0x0874, 0x00e2, 0x6a811fb3, 0xca028316},
+	{0x0521, 0x00b4, 0x93296833, 0x2c4d4880},
+	{0x0ddc, 0x00cf, 0xf9305338, 0x2c94210d},
+	{0x0a70, 0x0023, 0x239549ea, 0x22b561aa},
+	{0x083e, 0x0027, 0x2d88ba97, 0x5cd8bb9d},
+	{0x0241, 0x00a7, 0xfe0b32e1, 0x17b506b8},
+	{0x0dfc, 0x0096, 0x1a11e815, 0xee4141bd},
+	{0x023e, 0x001e, 0xebc9a1f3, 0x5689a1f3},
+	{0x067e, 0x0066, 0xb1067f81, 0xd9952571},
+	{0x09ea, 0x000e, 0x46fd7247, 0x42b57245},
+	{0x036b, 0x008c, 0x1a39acdf, 0x58bf1586},
+	{0x078f, 0x0030, 0x964042ab, 0xb04218b9},
+	{0x085c, 0x008f, 0x1829edab, 0x9ceca89c},
+	{0x02ec, 0x009f, 0x6aefa72d, 0x634cc2a7},
+	{0x043b, 0x00ce, 0x65642ff5, 0x6c8a584e},
+	{0x0a32, 0x00b8, 0xbd82759e, 0x0f96a34f},
+	{0x0d3c, 0x0087, 0xf4d66d54, 0xb71ba5f4},
+	{0x09ec, 0x008a, 0x06bfa1ff, 0x576ca80f},
+	{0x0902, 0x0015, 0x755025d2, 0x517225c2},
+	{0x08fe, 0x000e, 0xf690ce2d, 0xf690cf3d},
+	{0x00fb, 0x00dc, 0xe55f1528, 0x707d7d92},
+	{0x0eaa, 0x003a, 0x0fe0a8d7, 0x87638cc5},
+	{0x05fb, 0x0006, 0x86281cfb, 0x86281cf9},
+	{0x0dd1, 0x00a7, 0x60ab51b4, 0xe28ef00c},
+	{0x0005, 0x001b, 0xf51d969b, 0xe71dd6d3},
+	{0x077c, 0x00dd, 0xc2fed268, 0xdc30c555},
+	{0x0575, 0x00f5, 0x432c0b1a, 0x81dd7d16},
+	{0x05be, 0x0088, 0x78baa04b, 0xd69b433e},
+	{0x0c89, 0x0068, 0xeda9e428, 0xe9b4fa0a},
+	{0x0f5c, 0x0068, 0xec143c76, 0x9947067a},
+	{0x06a8, 0x0009, 0xd72651ce, 0xd72651ee},
+	{0x060f, 0x008e, 0x765426cd, 0x2099626f},
+	{0x07b1, 0x0047, 0x2cfcfa0c, 0x1a4baa07},
+	{0x04f1, 0x0041, 0x55b172f9, 0x15331a79},
+	{0x0e05, 0x00ac, 0x61efde93, 0x320568cc},
+	{0x0bf7, 0x0097, 0x05b83eee, 0xc72fb7a3},
+	{0x04e9, 0x00f3, 0x9928223a, 0xe8c77de2},
+	{0x023a, 0x0005, 0xdfada9bc, 0xdfadb9be},
+	{0x0acb, 0x000e, 0x2217cecd, 0x0017d6cd},
+	{0x0148, 0x0060, 0xbc3f7405, 0xf5fd6615},
+	{0x0764, 0x0059, 0xcbc201b1, 0xbb089bf4},
+	{0x021f, 0x0059, 0x5d6b2256, 0xa16a0a59},
+	{0x0f1e, 0x006c, 0xdefeeb45, 0xfc34f9d6},
+	{0x071c, 0x00b9, 0xb9b59309, 0xb645eae2},
+	{0x0564, 0x0063, 0xae064271, 0x954dc6d1},
+	{0x0b14, 0x0044, 0xdb867d9b, 0xdf432309},
+	{0x0e5a, 0x0055, 0xff06b685, 0xa65ff257},
+	{0x015e, 0x00ba, 0x1115ccbc, 0x11c365f4},
+	{0x0379, 0x00e6, 0x5f4e58dd, 0x2d176d31},
+	{0x013b, 0x0067, 0x4897427e, 0xc40532fe},
+	{0x0e64, 0x0071, 0x7af2b7a4, 0x1fb7bf43},
+	{0x0a11, 0x0050, 0x92105726, 0xb1185e51},
+	{0x0109, 0x0055, 0xd0d000f9, 0x60a60bfd},
+	{0x00aa, 0x0022, 0x815d229d, 0x215d379c},
+	{0x09ac, 0x004f, 0x02f9d985, 0x10b90b20},
+	{0x0e1b, 0x00ce, 0x5cf92ab4, 0x6a477573},
+	{0x08af, 0x00d8, 0x17ca72d1, 0x385af156},
+	{0x0e33, 0x000a, 0xda2dba6b, 0xda2dbb69},
+	{0x0ee3, 0x006a, 0xb00048e5, 0xa9a2decc},
+	{0x0648, 0x001a, 0x2364b8cb, 0x3364b1cb},
+	{0x0315, 0x0085, 0x0596fd0d, 0xa651740f},
+	{0x0fbb, 0x003e, 0x298230ca, 0x7fc617c7},
+	{0x0422, 0x006a, 0x78ada4ab, 0xc576ae2a},
+	{0x04ba, 0x0073, 0xced1fbc2, 0xaac8455b},
+	{0x007d, 0x0061, 0x4b7ff236, 0x347d5739},
+	{0x070b, 0x00d0, 0x261cf0ae, 0xc7fb1c10},
+	{0x0c1a, 0x0035, 0x8be92ee2, 0x8be9b4e1},
+	{0x0af8, 0x0063, 0x824dcf03, 0x53010388},
+	{0x08f8, 0x006d, 0xd289710c, 0x30418edd},
+	{0x021b, 0x00ee, 0x6ac1c41d, 0x2557e9a3},
+	{0x05b5, 0x00da, 0x8e52f0e2, 0x98531012},
 };
 
 /* Don't print anything to stdout. */
@@ -127,6 +128,7 @@ dahash_test(
 	int		i;
 	int		errors = 0;
 	int		bytes = 0;
+	struct xfs_name	xname = { };
 	struct timeval	start, stop;
 	uint64_t	usec;
 
@@ -150,6 +152,12 @@ dahash_test(
 				dahash_tests[i].length);
 		if (hash != dahash_tests[i].dahash)
 			errors++;
+
+		xname.name = randbytes_test_buf + dahash_tests[i].start;
+		xname.len = dahash_tests[i].length;
+		hash = libxfs_ascii_ci_hashname(&xname);
+		if (hash != dahash_tests[i].ascii_ci_dahash)
+			errors++;
 	}
 	gettimeofday(&stop, NULL);
 
diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h
index d973e300939..026aa510ca1 100644
--- a/libxfs/libxfs_api_defs.h
+++ b/libxfs/libxfs_api_defs.h
@@ -34,6 +34,8 @@
 #define xfs_alloc_read_agf		libxfs_alloc_read_agf
 #define xfs_alloc_vextent		libxfs_alloc_vextent
 
+#define xfs_ascii_ci_hashname		libxfs_ascii_ci_hashname
+
 #define xfs_attr_get			libxfs_attr_get
 #define xfs_attr_leaf_newentsize	libxfs_attr_leaf_newentsize
 #define xfs_attr_namecheck		libxfs_attr_namecheck


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/5] xfs_db: move obfuscate_name assertion to callers
  2023-06-05 15:36 [PATCHSET v3 0/5] xfsprogs: fix ascii-ci problems, then kill it Darrick J. Wong
  2023-06-05 15:36 ` [PATCH 1/5] libxfs: test the ascii case-insensitive hash Darrick J. Wong
@ 2023-06-05 15:36 ` Darrick J. Wong
  2023-06-05 15:36 ` [PATCH 3/5] xfs_db: fix metadump name obfuscation for ascii-ci filesystems Darrick J. Wong
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2023-06-05 15:36 UTC (permalink / raw)
  To: djwong, cem; +Cc: Christoph Hellwig, linux-xfs, david, hch

From: Darrick J. Wong <djwong@kernel.org>

Currently, obfuscate_name asserts that the hash of the new name is the
same as the old name.  To enable bug fixes in the next patch, move this
assertion to the callers.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 db/metadump.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)


diff --git a/db/metadump.c b/db/metadump.c
index 27d1df43279..317ff72802d 100644
--- a/db/metadump.c
+++ b/db/metadump.c
@@ -882,7 +882,6 @@ obfuscate_name(
 		*first ^= 0x10;
 		ASSERT(!is_invalid_char(*first));
 	}
-	ASSERT(libxfs_da_hashname(name, name_len) == hash);
 }
 
 /*
@@ -1208,6 +1207,7 @@ generate_obfuscated_name(
 
 	hash = libxfs_da_hashname(name, namelen);
 	obfuscate_name(hash, namelen, name);
+	ASSERT(hash == libxfs_da_hashname(name, namelen));
 
 	/*
 	 * Make sure the name is not something already seen.  If we
@@ -1321,6 +1321,7 @@ obfuscate_path_components(
 			namelen = strnlen((char *)comp, len);
 			hash = libxfs_da_hashname(comp, namelen);
 			obfuscate_name(hash, namelen, comp);
+			ASSERT(hash == libxfs_da_hashname(comp, namelen));
 			break;
 		}
 		namelen = slash - (char *)comp;
@@ -1332,6 +1333,7 @@ obfuscate_path_components(
 		}
 		hash = libxfs_da_hashname(comp, namelen);
 		obfuscate_name(hash, namelen, comp);
+		ASSERT(hash == libxfs_da_hashname(comp, namelen));
 		comp += namelen + 1;
 		len -= namelen + 1;
 	}


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/5] xfs_db: fix metadump name obfuscation for ascii-ci filesystems
  2023-06-05 15:36 [PATCHSET v3 0/5] xfsprogs: fix ascii-ci problems, then kill it Darrick J. Wong
  2023-06-05 15:36 ` [PATCH 1/5] libxfs: test the ascii case-insensitive hash Darrick J. Wong
  2023-06-05 15:36 ` [PATCH 2/5] xfs_db: move obfuscate_name assertion to callers Darrick J. Wong
@ 2023-06-05 15:36 ` Darrick J. Wong
  2023-06-05 16:59   ` Eric Sandeen
                     ` (3 more replies)
  2023-06-05 15:36 ` [PATCH 4/5] mkfs.xfs.8: warn about the version=ci feature Darrick J. Wong
  2023-06-05 15:36 ` [PATCH 5/5] mkfs: deprecate the ascii-ci feature Darrick J. Wong
  4 siblings, 4 replies; 15+ messages in thread
From: Darrick J. Wong @ 2023-06-05 15:36 UTC (permalink / raw)
  To: djwong, cem; +Cc: linux-xfs, david, hch

From: Darrick J. Wong <djwong@kernel.org>

Now that we've stabilized the dirent hash function for ascii-ci
filesystems, adapt the metadump name obfuscation code to detect when
it's obfuscating a directory entry name on an ascii-ci filesystem and
spit out names that actually have the same hash.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 db/metadump.c |   77 ++++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 68 insertions(+), 9 deletions(-)


diff --git a/db/metadump.c b/db/metadump.c
index 317ff72802d..4f8b3adb163 100644
--- a/db/metadump.c
+++ b/db/metadump.c
@@ -817,13 +817,17 @@ static void
 obfuscate_name(
 	xfs_dahash_t	hash,
 	size_t		name_len,
-	unsigned char	*name)
+	unsigned char	*name,
+	bool		is_dirent)
 {
-	unsigned char	*newp = name;
+	unsigned char	*oldname = NULL;
+	unsigned char	*newp;
 	int		i;
-	xfs_dahash_t	new_hash = 0;
+	xfs_dahash_t	new_hash;
 	unsigned char	*first;
 	unsigned char	high_bit;
+	int		tries = 0;
+	bool		is_ci_name = is_dirent && xfs_has_asciici(mp);
 	int		shift;
 
 	/*
@@ -836,6 +840,24 @@ obfuscate_name(
 	if (name_len < 5)
 		return;
 
+	if (is_ci_name) {
+		oldname = alloca(name_len);
+		memcpy(oldname, name, name_len);
+	}
+
+again:
+	newp = name;
+	new_hash = 0;
+
+	/*
+	 * If we cannot generate a ci-compatible obfuscated name after 1000
+	 * tries, don't bother obfuscating the name.
+	 */
+	if (tries++ > 1000) {
+		memcpy(name, oldname, name_len);
+		return;
+	}
+
 	/*
 	 * The beginning of the obfuscated name can be pretty much
 	 * anything, so fill it in with random characters.
@@ -843,7 +865,11 @@ obfuscate_name(
 	 */
 	for (i = 0; i < name_len - 5; i++) {
 		*newp = random_filename_char();
-		new_hash = *newp ^ rol32(new_hash, 7);
+		if (is_ci_name)
+			new_hash = xfs_ascii_ci_xfrm(*newp) ^
+							rol32(new_hash, 7);
+		else
+			new_hash = *newp ^ rol32(new_hash, 7);
 		newp++;
 	}
 
@@ -867,6 +893,17 @@ obfuscate_name(
 			high_bit = 0x80;
 		} else
 			high_bit = 0;
+
+		/*
+		 * If ascii-ci is enabled, uppercase characters are converted
+		 * to lowercase characters while computing the name hash.  If
+		 * any of the necessary correction bytes are uppercase, the
+		 * hash of the new name will not match.  Try again with a
+		 * different prefix.
+		 */
+		if (is_ci_name && xfs_ascii_ci_need_xfrm(*newp))
+			goto again;
+
 		ASSERT(!is_invalid_char(*newp));
 		newp++;
 	}
@@ -880,6 +917,10 @@ obfuscate_name(
 	 */
 	if (high_bit) {
 		*first ^= 0x10;
+
+		if (is_ci_name && xfs_ascii_ci_need_xfrm(*first))
+			goto again;
+
 		ASSERT(!is_invalid_char(*first));
 	}
 }
@@ -1177,6 +1218,24 @@ handle_duplicate_name(xfs_dahash_t hash, size_t name_len, unsigned char *name)
 	return 1;
 }
 
+static inline xfs_dahash_t
+dirattr_hashname(
+	bool		is_dirent,
+	const uint8_t	*name,
+	int		namelen)
+{
+	if (is_dirent) {
+		struct xfs_name	xname = {
+			.name	= name,
+			.len	= namelen,
+		};
+
+		return libxfs_dir2_hashname(mp, &xname);
+	}
+
+	return libxfs_da_hashname(name, namelen);
+}
+
 static void
 generate_obfuscated_name(
 	xfs_ino_t		ino,
@@ -1205,9 +1264,9 @@ generate_obfuscated_name(
 
 	/* Obfuscate the name (if possible) */
 
-	hash = libxfs_da_hashname(name, namelen);
-	obfuscate_name(hash, namelen, name);
-	ASSERT(hash == libxfs_da_hashname(name, namelen));
+	hash = dirattr_hashname(ino != 0, name, namelen);
+	obfuscate_name(hash, namelen, name, ino != 0);
+	ASSERT(hash == dirattr_hashname(ino != 0, name, namelen));
 
 	/*
 	 * Make sure the name is not something already seen.  If we
@@ -1320,7 +1379,7 @@ obfuscate_path_components(
 			/* last (or single) component */
 			namelen = strnlen((char *)comp, len);
 			hash = libxfs_da_hashname(comp, namelen);
-			obfuscate_name(hash, namelen, comp);
+			obfuscate_name(hash, namelen, comp, false);
 			ASSERT(hash == libxfs_da_hashname(comp, namelen));
 			break;
 		}
@@ -1332,7 +1391,7 @@ obfuscate_path_components(
 			continue;
 		}
 		hash = libxfs_da_hashname(comp, namelen);
-		obfuscate_name(hash, namelen, comp);
+		obfuscate_name(hash, namelen, comp, false);
 		ASSERT(hash == libxfs_da_hashname(comp, namelen));
 		comp += namelen + 1;
 		len -= namelen + 1;


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/5] mkfs.xfs.8: warn about the version=ci feature
  2023-06-05 15:36 [PATCHSET v3 0/5] xfsprogs: fix ascii-ci problems, then kill it Darrick J. Wong
                   ` (2 preceding siblings ...)
  2023-06-05 15:36 ` [PATCH 3/5] xfs_db: fix metadump name obfuscation for ascii-ci filesystems Darrick J. Wong
@ 2023-06-05 15:36 ` Darrick J. Wong
  2023-06-05 15:36 ` [PATCH 5/5] mkfs: deprecate the ascii-ci feature Darrick J. Wong
  4 siblings, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2023-06-05 15:36 UTC (permalink / raw)
  To: djwong, cem; +Cc: Christoph Hellwig, linux-xfs, david, hch

From: Darrick J. Wong <djwong@kernel.org>

Document the exact byte transformations that happen during directory
name lookup when the version=ci feature is enabled.  Warn that this is
not generally compatible, and that people should not use this feature.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 man/man8/mkfs.xfs.8.in |   22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)


diff --git a/man/man8/mkfs.xfs.8.in b/man/man8/mkfs.xfs.8.in
index 49e64d47ae4..6fc7708bc94 100644
--- a/man/man8/mkfs.xfs.8.in
+++ b/man/man8/mkfs.xfs.8.in
@@ -809,11 +809,25 @@ can be either 2 or 'ci', defaulting to 2 if unspecified.
 With version 2 directories, the directory block size can be
 any power of 2 size from the filesystem block size up to 65536.
 .IP
-The
+If the
 .B version=ci
-option enables ASCII only case-insensitive filename lookup and version
-2 directories. Filenames are case-preserving, that is, the names
-are stored in directories using the case they were created with.
+option is specified, the kernel will transform certain bytes in filenames
+before performing lookup-related operations.
+The byte sequence given to create a directory entry is persisted without
+alterations.
+The lookup transformations are defined as follows:
+
+    0x41-0x5a -> 0x61-0x7a
+
+    0xc0-0xd6 -> 0xe0-0xf6
+
+    0xd8-0xde -> 0xf8-0xfe
+
+This transformation roughly corresponds to case insensitivity in ISO
+8859-1.
+The transformations are not compatible with other encodings (e.g. UTF8).
+Do not enable this feature unless your entire environment has been coerced
+to ISO 8859-1.
 .IP
 Note: Version 1 directories are not supported.
 .TP


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/5] mkfs: deprecate the ascii-ci feature
  2023-06-05 15:36 [PATCHSET v3 0/5] xfsprogs: fix ascii-ci problems, then kill it Darrick J. Wong
                   ` (3 preceding siblings ...)
  2023-06-05 15:36 ` [PATCH 4/5] mkfs.xfs.8: warn about the version=ci feature Darrick J. Wong
@ 2023-06-05 15:36 ` Darrick J. Wong
  4 siblings, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2023-06-05 15:36 UTC (permalink / raw)
  To: djwong, cem; +Cc: Christoph Hellwig, linux-xfs, david, hch

From: Darrick J. Wong <djwong@kernel.org>

Deprecate this feature, since the feature is broken.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 man/man8/mkfs.xfs.8.in |    1 +
 mkfs/xfs_mkfs.c        |   11 +++++++++++
 2 files changed, 12 insertions(+)


diff --git a/man/man8/mkfs.xfs.8.in b/man/man8/mkfs.xfs.8.in
index 6fc7708bc94..01f9dc6e6e9 100644
--- a/man/man8/mkfs.xfs.8.in
+++ b/man/man8/mkfs.xfs.8.in
@@ -828,6 +828,7 @@ This transformation roughly corresponds to case insensitivity in ISO
 The transformations are not compatible with other encodings (e.g. UTF8).
 Do not enable this feature unless your entire environment has been coerced
 to ISO 8859-1.
+This feature is deprecated and will be removed in September 2030.
 .IP
 Note: Version 1 directories are not supported.
 .TP
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index 2f2995e13e3..d3a15cf44e0 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -2150,6 +2150,17 @@ validate_sb_features(
 	struct mkfs_params	*cfg,
 	struct cli_params	*cli)
 {
+	if (cli->sb_feat.nci) {
+		/*
+		 * The ascii-ci feature is deprecated in the upstream Linux
+		 * kernel.  In September 2025 it will be turned off by default
+		 * in the kernel and in September 2030 support will be removed
+		 * entirely.
+		 */
+		fprintf(stdout,
+_("ascii-ci filesystems are deprecated and will not be supported by future versions.\n"));
+	}
+
 	/*
 	 * Now we have blocks and sector sizes set up, check parameters that are
 	 * no longer optional for CRC enabled filesystems.  Catch them up front


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] libxfs: test the ascii case-insensitive hash
  2023-06-05 15:36 ` [PATCH 1/5] libxfs: test the ascii case-insensitive hash Darrick J. Wong
@ 2023-06-05 16:30   ` Eric Sandeen
  2023-06-14  7:44   ` Carlos Maiolino
  1 sibling, 0 replies; 15+ messages in thread
From: Eric Sandeen @ 2023-06-05 16:30 UTC (permalink / raw)
  To: Darrick J. Wong, cem; +Cc: linux-xfs, david, hch

On 6/5/23 10:36 AM, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Now that we've made kernel and userspace use the same tolower code for
> computing directory index hashes, add that to the selftest code.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

This looks fine to me.

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] xfs_db: fix metadump name obfuscation for ascii-ci filesystems
  2023-06-05 15:36 ` [PATCH 3/5] xfs_db: fix metadump name obfuscation for ascii-ci filesystems Darrick J. Wong
@ 2023-06-05 16:59   ` Eric Sandeen
  2023-06-05 17:09     ` Darrick J. Wong
  2023-06-05 17:19   ` Eric Sandeen
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Eric Sandeen @ 2023-06-05 16:59 UTC (permalink / raw)
  To: Darrick J. Wong, cem; +Cc: linux-xfs, david, hch

On 6/5/23 10:36 AM, Darrick J. Wong wrote:
> @@ -1205,9 +1264,9 @@ generate_obfuscated_name(
>   
>   	/* Obfuscate the name (if possible) */
>   
> -	hash = libxfs_da_hashname(name, namelen);
> -	obfuscate_name(hash, namelen, name);
> -	ASSERT(hash == libxfs_da_hashname(name, namelen));
> +	hash = dirattr_hashname(ino != 0, name, namelen);
> +	obfuscate_name(hash, namelen, name, ino != 0);
> +	ASSERT(hash == dirattr_hashname(ino != 0, name, namelen));

This makes sense to me - comments above here remind us that "inode == 0" 
means we're obfuscating an xattr value, not a filename or path name, but ...

>   	/*
>   	 * Make sure the name is not something already seen.  If we
> @@ -1320,7 +1379,7 @@ obfuscate_path_components(
>   			/* last (or single) component */
>   			namelen = strnlen((char *)comp, len);
>   			hash = libxfs_da_hashname(comp, namelen);
> -			obfuscate_name(hash, namelen, comp);
> +			obfuscate_name(hash, namelen, comp, false);
>   			ASSERT(hash == libxfs_da_hashname(comp, namelen));
>   			break;
>   		}
> @@ -1332,7 +1391,7 @@ obfuscate_path_components(
>   			continue;
>   		}
>   		hash = libxfs_da_hashname(comp, namelen);
> -		obfuscate_name(hash, namelen, comp);
> +		obfuscate_name(hash, namelen, comp, false);
>   		ASSERT(hash == libxfs_da_hashname(comp, namelen));
>   		comp += namelen + 1;
>   		len -= namelen + 1;
> 

here, why is "is_dirent" false? Shouldn't a symlink path component match 
the associated dirents, and be obsucated the same way?

-Eric

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] xfs_db: fix metadump name obfuscation for ascii-ci filesystems
  2023-06-05 16:59   ` Eric Sandeen
@ 2023-06-05 17:09     ` Darrick J. Wong
  2023-06-05 17:18       ` Eric Sandeen
  0 siblings, 1 reply; 15+ messages in thread
From: Darrick J. Wong @ 2023-06-05 17:09 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: cem, linux-xfs, david, hch

On Mon, Jun 05, 2023 at 11:59:45AM -0500, Eric Sandeen wrote:
> On 6/5/23 10:36 AM, Darrick J. Wong wrote:
> > @@ -1205,9 +1264,9 @@ generate_obfuscated_name(
> >   	/* Obfuscate the name (if possible) */
> > -	hash = libxfs_da_hashname(name, namelen);
> > -	obfuscate_name(hash, namelen, name);
> > -	ASSERT(hash == libxfs_da_hashname(name, namelen));
> > +	hash = dirattr_hashname(ino != 0, name, namelen);
> > +	obfuscate_name(hash, namelen, name, ino != 0);
> > +	ASSERT(hash == dirattr_hashname(ino != 0, name, namelen));
> 
> This makes sense to me - comments above here remind us that "inode == 0"
> means we're obfuscating an xattr value, not a filename or path name, but ...
> 
> >   	/*
> >   	 * Make sure the name is not something already seen.  If we
> > @@ -1320,7 +1379,7 @@ obfuscate_path_components(
> >   			/* last (or single) component */
> >   			namelen = strnlen((char *)comp, len);
> >   			hash = libxfs_da_hashname(comp, namelen);
> > -			obfuscate_name(hash, namelen, comp);
> > +			obfuscate_name(hash, namelen, comp, false);
> >   			ASSERT(hash == libxfs_da_hashname(comp, namelen));
> >   			break;
> >   		}
> > @@ -1332,7 +1391,7 @@ obfuscate_path_components(
> >   			continue;
> >   		}
> >   		hash = libxfs_da_hashname(comp, namelen);
> > -		obfuscate_name(hash, namelen, comp);
> > +		obfuscate_name(hash, namelen, comp, false);
> >   		ASSERT(hash == libxfs_da_hashname(comp, namelen));
> >   		comp += namelen + 1;
> >   		len -= namelen + 1;
> > 
> 
> here, why is "is_dirent" false? Shouldn't a symlink path component match the
> associated dirents, and be obsucated the same way?

Name obfuscation replaces every byte except for the last five bytes with
a random printable character, and then flips bits in those last five
bytes to make the hash match.  Chances are good that calling
obfuscate_name() twice on the same name will return different results,
which means that symlink targets won't point anywhere useful after the
obfuscation.

One could make metadump remember the (input -> output) pairs instead of
regenerating the names every time, but this comes at a cost of higher
memory consumption.  I actually did this for parent pointers so that
obfuscated dumped pptrs are still verifiable by xfs_repair.

However, symlink targets aren't required to point to a valid path, so
there doesn't seem to be much reason to add that overhead.

--D

> -Eric

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] xfs_db: fix metadump name obfuscation for ascii-ci filesystems
  2023-06-05 17:09     ` Darrick J. Wong
@ 2023-06-05 17:18       ` Eric Sandeen
  0 siblings, 0 replies; 15+ messages in thread
From: Eric Sandeen @ 2023-06-05 17:18 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: cem, linux-xfs, david, hch

On 6/5/23 12:09 PM, Darrick J. Wong wrote:
> On Mon, Jun 05, 2023 at 11:59:45AM -0500, Eric Sandeen wrote:
>> On 6/5/23 10:36 AM, Darrick J. Wong wrote:
>>> @@ -1205,9 +1264,9 @@ generate_obfuscated_name(
>>>    	/* Obfuscate the name (if possible) */
>>> -	hash = libxfs_da_hashname(name, namelen);
>>> -	obfuscate_name(hash, namelen, name);
>>> -	ASSERT(hash == libxfs_da_hashname(name, namelen));
>>> +	hash = dirattr_hashname(ino != 0, name, namelen);
>>> +	obfuscate_name(hash, namelen, name, ino != 0);
>>> +	ASSERT(hash == dirattr_hashname(ino != 0, name, namelen));
>>
>> This makes sense to me - comments above here remind us that "inode == 0"
>> means we're obfuscating an xattr value, not a filename or path name, but ...
>>
>>>    	/*
>>>    	 * Make sure the name is not something already seen.  If we
>>> @@ -1320,7 +1379,7 @@ obfuscate_path_components(
>>>    			/* last (or single) component */
>>>    			namelen = strnlen((char *)comp, len);
>>>    			hash = libxfs_da_hashname(comp, namelen);
>>> -			obfuscate_name(hash, namelen, comp);
>>> +			obfuscate_name(hash, namelen, comp, false);
>>>    			ASSERT(hash == libxfs_da_hashname(comp, namelen));
>>>    			break;
>>>    		}
>>> @@ -1332,7 +1391,7 @@ obfuscate_path_components(
>>>    			continue;
>>>    		}
>>>    		hash = libxfs_da_hashname(comp, namelen);
>>> -		obfuscate_name(hash, namelen, comp);
>>> +		obfuscate_name(hash, namelen, comp, false);
>>>    		ASSERT(hash == libxfs_da_hashname(comp, namelen));
>>>    		comp += namelen + 1;
>>>    		len -= namelen + 1;
>>>
>>
>> here, why is "is_dirent" false? Shouldn't a symlink path component match the
>> associated dirents, and be obsucated the same way?
> 
> Name obfuscation replaces every byte except for the last five bytes with
> a random printable character, and then flips bits in those last five
> bytes to make the hash match.  Chances are good that calling
> obfuscate_name() twice on the same name will return different results,
> which means that symlink targets won't point anywhere useful after the
> obfuscation.
> 
> One could make metadump remember the (input -> output) pairs instead of
> regenerating the names every time, but this comes at a cost of higher
> memory consumption.  I actually did this for parent pointers so that
> obfuscated dumped pptrs are still verifiable by xfs_repair.
> 
> However, symlink targets aren't required to point to a valid path, so
> there doesn't seem to be much reason to add that overhead.

Fair enough, thanks. If you wanted to throw in a comment it might be 
nice, but I'll leave that up to you, esp. if we're killing off ascii-ci 
in the long run anyway.

Thanks,
-Eric


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] xfs_db: fix metadump name obfuscation for ascii-ci filesystems
  2023-06-05 15:36 ` [PATCH 3/5] xfs_db: fix metadump name obfuscation for ascii-ci filesystems Darrick J. Wong
  2023-06-05 16:59   ` Eric Sandeen
@ 2023-06-05 17:19   ` Eric Sandeen
  2023-06-14 11:10   ` Carlos Maiolino
  2023-06-15 16:11   ` [PATCH v2 " Darrick J. Wong
  3 siblings, 0 replies; 15+ messages in thread
From: Eric Sandeen @ 2023-06-05 17:19 UTC (permalink / raw)
  To: Darrick J. Wong, cem; +Cc: linux-xfs, david, hch

On 6/5/23 10:36 AM, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Now that we've stabilized the dirent hash function for ascii-ci
> filesystems, adapt the metadump name obfuscation code to detect when
> it's obfuscating a directory entry name on an ascii-ci filesystem and
> spit out names that actually have the same hash.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

> ---
>   db/metadump.c |   77 ++++++++++++++++++++++++++++++++++++++++++++++++++-------
>   1 file changed, 68 insertions(+), 9 deletions(-)
> 
> 
> diff --git a/db/metadump.c b/db/metadump.c
> index 317ff72802d..4f8b3adb163 100644
> --- a/db/metadump.c
> +++ b/db/metadump.c
> @@ -817,13 +817,17 @@ static void
>   obfuscate_name(
>   	xfs_dahash_t	hash,
>   	size_t		name_len,
> -	unsigned char	*name)
> +	unsigned char	*name,
> +	bool		is_dirent)
>   {
> -	unsigned char	*newp = name;
> +	unsigned char	*oldname = NULL;
> +	unsigned char	*newp;
>   	int		i;
> -	xfs_dahash_t	new_hash = 0;
> +	xfs_dahash_t	new_hash;
>   	unsigned char	*first;
>   	unsigned char	high_bit;
> +	int		tries = 0;
> +	bool		is_ci_name = is_dirent && xfs_has_asciici(mp);
>   	int		shift;
>   
>   	/*
> @@ -836,6 +840,24 @@ obfuscate_name(
>   	if (name_len < 5)
>   		return;
>   
> +	if (is_ci_name) {
> +		oldname = alloca(name_len);
> +		memcpy(oldname, name, name_len);
> +	}
> +
> +again:
> +	newp = name;
> +	new_hash = 0;
> +
> +	/*
> +	 * If we cannot generate a ci-compatible obfuscated name after 1000
> +	 * tries, don't bother obfuscating the name.
> +	 */
> +	if (tries++ > 1000) {
> +		memcpy(name, oldname, name_len);
> +		return;
> +	}
> +
>   	/*
>   	 * The beginning of the obfuscated name can be pretty much
>   	 * anything, so fill it in with random characters.
> @@ -843,7 +865,11 @@ obfuscate_name(
>   	 */
>   	for (i = 0; i < name_len - 5; i++) {
>   		*newp = random_filename_char();
> -		new_hash = *newp ^ rol32(new_hash, 7);
> +		if (is_ci_name)
> +			new_hash = xfs_ascii_ci_xfrm(*newp) ^
> +							rol32(new_hash, 7);
> +		else
> +			new_hash = *newp ^ rol32(new_hash, 7);
>   		newp++;
>   	}
>   
> @@ -867,6 +893,17 @@ obfuscate_name(
>   			high_bit = 0x80;
>   		} else
>   			high_bit = 0;
> +
> +		/*
> +		 * If ascii-ci is enabled, uppercase characters are converted
> +		 * to lowercase characters while computing the name hash.  If
> +		 * any of the necessary correction bytes are uppercase, the
> +		 * hash of the new name will not match.  Try again with a
> +		 * different prefix.
> +		 */
> +		if (is_ci_name && xfs_ascii_ci_need_xfrm(*newp))
> +			goto again;
> +
>   		ASSERT(!is_invalid_char(*newp));
>   		newp++;
>   	}
> @@ -880,6 +917,10 @@ obfuscate_name(
>   	 */
>   	if (high_bit) {
>   		*first ^= 0x10;
> +
> +		if (is_ci_name && xfs_ascii_ci_need_xfrm(*first))
> +			goto again;
> +
>   		ASSERT(!is_invalid_char(*first));
>   	}
>   }
> @@ -1177,6 +1218,24 @@ handle_duplicate_name(xfs_dahash_t hash, size_t name_len, unsigned char *name)
>   	return 1;
>   }
>   
> +static inline xfs_dahash_t
> +dirattr_hashname(
> +	bool		is_dirent,
> +	const uint8_t	*name,
> +	int		namelen)
> +{
> +	if (is_dirent) {
> +		struct xfs_name	xname = {
> +			.name	= name,
> +			.len	= namelen,
> +		};
> +
> +		return libxfs_dir2_hashname(mp, &xname);
> +	}
> +
> +	return libxfs_da_hashname(name, namelen);
> +}
> +
>   static void
>   generate_obfuscated_name(
>   	xfs_ino_t		ino,
> @@ -1205,9 +1264,9 @@ generate_obfuscated_name(
>   
>   	/* Obfuscate the name (if possible) */
>   
> -	hash = libxfs_da_hashname(name, namelen);
> -	obfuscate_name(hash, namelen, name);
> -	ASSERT(hash == libxfs_da_hashname(name, namelen));
> +	hash = dirattr_hashname(ino != 0, name, namelen);
> +	obfuscate_name(hash, namelen, name, ino != 0);
> +	ASSERT(hash == dirattr_hashname(ino != 0, name, namelen));
>   
>   	/*
>   	 * Make sure the name is not something already seen.  If we
> @@ -1320,7 +1379,7 @@ obfuscate_path_components(
>   			/* last (or single) component */
>   			namelen = strnlen((char *)comp, len);
>   			hash = libxfs_da_hashname(comp, namelen);
> -			obfuscate_name(hash, namelen, comp);
> +			obfuscate_name(hash, namelen, comp, false);
>   			ASSERT(hash == libxfs_da_hashname(comp, namelen));
>   			break;
>   		}
> @@ -1332,7 +1391,7 @@ obfuscate_path_components(
>   			continue;
>   		}
>   		hash = libxfs_da_hashname(comp, namelen);
> -		obfuscate_name(hash, namelen, comp);
> +		obfuscate_name(hash, namelen, comp, false);
>   		ASSERT(hash == libxfs_da_hashname(comp, namelen));
>   		comp += namelen + 1;
>   		len -= namelen + 1;
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] libxfs: test the ascii case-insensitive hash
  2023-06-05 15:36 ` [PATCH 1/5] libxfs: test the ascii case-insensitive hash Darrick J. Wong
  2023-06-05 16:30   ` Eric Sandeen
@ 2023-06-14  7:44   ` Carlos Maiolino
  1 sibling, 0 replies; 15+ messages in thread
From: Carlos Maiolino @ 2023-06-14  7:44 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david, hch

On Mon, Jun 05, 2023 at 08:36:32AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Now that we've made kernel and userspace use the same tolower code for
> computing directory index hashes, add that to the selftest code.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  libfrog/dahashselftest.h |  208 ++++++++++++++++++++++++----------------------
>  libxfs/libxfs_api_defs.h |    2
>  2 files changed, 110 insertions(+), 100 deletions(-)

Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>

> 
> 
> diff --git a/libfrog/dahashselftest.h b/libfrog/dahashselftest.h
> index 7dda53037b8..ea9d925bf2a 100644
> --- a/libfrog/dahashselftest.h
> +++ b/libfrog/dahashselftest.h
> @@ -13,108 +13,109 @@ static struct dahash_test {
>  	uint16_t	start;	/* random 12 bit offset in buf */
>  	uint16_t	length;	/* random 8 bit length of test */
>  	xfs_dahash_t	dahash;	/* expected dahash result */
> +	xfs_dahash_t	ascii_ci_dahash; /* expected ascii-ci dahash result */
>  } dahash_tests[] =
>  {
> -	{0x0567, 0x0097, 0x96951389},
> -	{0x0869, 0x0055, 0x6455ab4f},
> -	{0x0c51, 0x00be, 0x8663afde},
> -	{0x044a, 0x00fc, 0x98fbe432},
> -	{0x0f29, 0x0079, 0x42371997},
> -	{0x08ba, 0x0052, 0x942be4f7},
> -	{0x01f2, 0x0013, 0x5262687e},
> -	{0x09e3, 0x00e2, 0x8ffb0908},
> -	{0x007c, 0x0051, 0xb3158491},
> -	{0x0854, 0x001f, 0x83bb20d9},
> -	{0x031b, 0x0008, 0x98970bdf},
> -	{0x0de7, 0x0027, 0xbfbf6f6c},
> -	{0x0f76, 0x0005, 0x906a7105},
> -	{0x092e, 0x00d0, 0x86631850},
> -	{0x0233, 0x0082, 0xdbdd914e},
> -	{0x04c9, 0x0075, 0x5a400a9e},
> -	{0x0b66, 0x0099, 0xae128b45},
> -	{0x000d, 0x00ed, 0xe61c216a},
> -	{0x0a31, 0x003d, 0xf69663b9},
> -	{0x00a3, 0x0052, 0x643c39ae},
> -	{0x0125, 0x00d5, 0x7c310b0d},
> -	{0x0105, 0x004a, 0x06a77e74},
> -	{0x0858, 0x008e, 0x265bc739},
> -	{0x045e, 0x0095, 0x13d6b192},
> -	{0x0dab, 0x003c, 0xc4498704},
> -	{0x00cd, 0x00b5, 0x802a4e2d},
> -	{0x069b, 0x008c, 0x5df60f71},
> -	{0x0454, 0x006c, 0x5f03d8bb},
> -	{0x040e, 0x0032, 0x0ce513b5},
> -	{0x0874, 0x00e2, 0x6a811fb3},
> -	{0x0521, 0x00b4, 0x93296833},
> -	{0x0ddc, 0x00cf, 0xf9305338},
> -	{0x0a70, 0x0023, 0x239549ea},
> -	{0x083e, 0x0027, 0x2d88ba97},
> -	{0x0241, 0x00a7, 0xfe0b32e1},
> -	{0x0dfc, 0x0096, 0x1a11e815},
> -	{0x023e, 0x001e, 0xebc9a1f3},
> -	{0x067e, 0x0066, 0xb1067f81},
> -	{0x09ea, 0x000e, 0x46fd7247},
> -	{0x036b, 0x008c, 0x1a39acdf},
> -	{0x078f, 0x0030, 0x964042ab},
> -	{0x085c, 0x008f, 0x1829edab},
> -	{0x02ec, 0x009f, 0x6aefa72d},
> -	{0x043b, 0x00ce, 0x65642ff5},
> -	{0x0a32, 0x00b8, 0xbd82759e},
> -	{0x0d3c, 0x0087, 0xf4d66d54},
> -	{0x09ec, 0x008a, 0x06bfa1ff},
> -	{0x0902, 0x0015, 0x755025d2},
> -	{0x08fe, 0x000e, 0xf690ce2d},
> -	{0x00fb, 0x00dc, 0xe55f1528},
> -	{0x0eaa, 0x003a, 0x0fe0a8d7},
> -	{0x05fb, 0x0006, 0x86281cfb},
> -	{0x0dd1, 0x00a7, 0x60ab51b4},
> -	{0x0005, 0x001b, 0xf51d969b},
> -	{0x077c, 0x00dd, 0xc2fed268},
> -	{0x0575, 0x00f5, 0x432c0b1a},
> -	{0x05be, 0x0088, 0x78baa04b},
> -	{0x0c89, 0x0068, 0xeda9e428},
> -	{0x0f5c, 0x0068, 0xec143c76},
> -	{0x06a8, 0x0009, 0xd72651ce},
> -	{0x060f, 0x008e, 0x765426cd},
> -	{0x07b1, 0x0047, 0x2cfcfa0c},
> -	{0x04f1, 0x0041, 0x55b172f9},
> -	{0x0e05, 0x00ac, 0x61efde93},
> -	{0x0bf7, 0x0097, 0x05b83eee},
> -	{0x04e9, 0x00f3, 0x9928223a},
> -	{0x023a, 0x0005, 0xdfada9bc},
> -	{0x0acb, 0x000e, 0x2217cecd},
> -	{0x0148, 0x0060, 0xbc3f7405},
> -	{0x0764, 0x0059, 0xcbc201b1},
> -	{0x021f, 0x0059, 0x5d6b2256},
> -	{0x0f1e, 0x006c, 0xdefeeb45},
> -	{0x071c, 0x00b9, 0xb9b59309},
> -	{0x0564, 0x0063, 0xae064271},
> -	{0x0b14, 0x0044, 0xdb867d9b},
> -	{0x0e5a, 0x0055, 0xff06b685},
> -	{0x015e, 0x00ba, 0x1115ccbc},
> -	{0x0379, 0x00e6, 0x5f4e58dd},
> -	{0x013b, 0x0067, 0x4897427e},
> -	{0x0e64, 0x0071, 0x7af2b7a4},
> -	{0x0a11, 0x0050, 0x92105726},
> -	{0x0109, 0x0055, 0xd0d000f9},
> -	{0x00aa, 0x0022, 0x815d229d},
> -	{0x09ac, 0x004f, 0x02f9d985},
> -	{0x0e1b, 0x00ce, 0x5cf92ab4},
> -	{0x08af, 0x00d8, 0x17ca72d1},
> -	{0x0e33, 0x000a, 0xda2dba6b},
> -	{0x0ee3, 0x006a, 0xb00048e5},
> -	{0x0648, 0x001a, 0x2364b8cb},
> -	{0x0315, 0x0085, 0x0596fd0d},
> -	{0x0fbb, 0x003e, 0x298230ca},
> -	{0x0422, 0x006a, 0x78ada4ab},
> -	{0x04ba, 0x0073, 0xced1fbc2},
> -	{0x007d, 0x0061, 0x4b7ff236},
> -	{0x070b, 0x00d0, 0x261cf0ae},
> -	{0x0c1a, 0x0035, 0x8be92ee2},
> -	{0x0af8, 0x0063, 0x824dcf03},
> -	{0x08f8, 0x006d, 0xd289710c},
> -	{0x021b, 0x00ee, 0x6ac1c41d},
> -	{0x05b5, 0x00da, 0x8e52f0e2},
> +	{0x0567, 0x0097, 0x96951389, 0xc153aa0d},
> +	{0x0869, 0x0055, 0x6455ab4f, 0xd07f69bf},
> +	{0x0c51, 0x00be, 0x8663afde, 0xf9add90c},
> +	{0x044a, 0x00fc, 0x98fbe432, 0xbf2abb76},
> +	{0x0f29, 0x0079, 0x42371997, 0x282588b3},
> +	{0x08ba, 0x0052, 0x942be4f7, 0x2e023547},
> +	{0x01f2, 0x0013, 0x5262687e, 0x5266287e},
> +	{0x09e3, 0x00e2, 0x8ffb0908, 0x1da892f3},
> +	{0x007c, 0x0051, 0xb3158491, 0xb67f9e63},
> +	{0x0854, 0x001f, 0x83bb20d9, 0x22bb21db},
> +	{0x031b, 0x0008, 0x98970bdf, 0x9cd70adf},
> +	{0x0de7, 0x0027, 0xbfbf6f6c, 0xae3f296c},
> +	{0x0f76, 0x0005, 0x906a7105, 0x906a7105},
> +	{0x092e, 0x00d0, 0x86631850, 0xa3f6ac04},
> +	{0x0233, 0x0082, 0xdbdd914e, 0x5d8c7aac},
> +	{0x04c9, 0x0075, 0x5a400a9e, 0x12f60711},
> +	{0x0b66, 0x0099, 0xae128b45, 0x7551310d},
> +	{0x000d, 0x00ed, 0xe61c216a, 0xc22d3c4c},
> +	{0x0a31, 0x003d, 0xf69663b9, 0x51960bf8},
> +	{0x00a3, 0x0052, 0x643c39ae, 0xa93c73a8},
> +	{0x0125, 0x00d5, 0x7c310b0d, 0xf221cbb3},
> +	{0x0105, 0x004a, 0x06a77e74, 0xa4ef4561},
> +	{0x0858, 0x008e, 0x265bc739, 0xd6c36d9b},
> +	{0x045e, 0x0095, 0x13d6b192, 0x5f5c1d62},
> +	{0x0dab, 0x003c, 0xc4498704, 0x10414654},
> +	{0x00cd, 0x00b5, 0x802a4e2d, 0xfbd17c9d},
> +	{0x069b, 0x008c, 0x5df60f71, 0x91ddca5f},
> +	{0x0454, 0x006c, 0x5f03d8bb, 0x5c59fce0},
> +	{0x040e, 0x0032, 0x0ce513b5, 0xa8cd99b1},
> +	{0x0874, 0x00e2, 0x6a811fb3, 0xca028316},
> +	{0x0521, 0x00b4, 0x93296833, 0x2c4d4880},
> +	{0x0ddc, 0x00cf, 0xf9305338, 0x2c94210d},
> +	{0x0a70, 0x0023, 0x239549ea, 0x22b561aa},
> +	{0x083e, 0x0027, 0x2d88ba97, 0x5cd8bb9d},
> +	{0x0241, 0x00a7, 0xfe0b32e1, 0x17b506b8},
> +	{0x0dfc, 0x0096, 0x1a11e815, 0xee4141bd},
> +	{0x023e, 0x001e, 0xebc9a1f3, 0x5689a1f3},
> +	{0x067e, 0x0066, 0xb1067f81, 0xd9952571},
> +	{0x09ea, 0x000e, 0x46fd7247, 0x42b57245},
> +	{0x036b, 0x008c, 0x1a39acdf, 0x58bf1586},
> +	{0x078f, 0x0030, 0x964042ab, 0xb04218b9},
> +	{0x085c, 0x008f, 0x1829edab, 0x9ceca89c},
> +	{0x02ec, 0x009f, 0x6aefa72d, 0x634cc2a7},
> +	{0x043b, 0x00ce, 0x65642ff5, 0x6c8a584e},
> +	{0x0a32, 0x00b8, 0xbd82759e, 0x0f96a34f},
> +	{0x0d3c, 0x0087, 0xf4d66d54, 0xb71ba5f4},
> +	{0x09ec, 0x008a, 0x06bfa1ff, 0x576ca80f},
> +	{0x0902, 0x0015, 0x755025d2, 0x517225c2},
> +	{0x08fe, 0x000e, 0xf690ce2d, 0xf690cf3d},
> +	{0x00fb, 0x00dc, 0xe55f1528, 0x707d7d92},
> +	{0x0eaa, 0x003a, 0x0fe0a8d7, 0x87638cc5},
> +	{0x05fb, 0x0006, 0x86281cfb, 0x86281cf9},
> +	{0x0dd1, 0x00a7, 0x60ab51b4, 0xe28ef00c},
> +	{0x0005, 0x001b, 0xf51d969b, 0xe71dd6d3},
> +	{0x077c, 0x00dd, 0xc2fed268, 0xdc30c555},
> +	{0x0575, 0x00f5, 0x432c0b1a, 0x81dd7d16},
> +	{0x05be, 0x0088, 0x78baa04b, 0xd69b433e},
> +	{0x0c89, 0x0068, 0xeda9e428, 0xe9b4fa0a},
> +	{0x0f5c, 0x0068, 0xec143c76, 0x9947067a},
> +	{0x06a8, 0x0009, 0xd72651ce, 0xd72651ee},
> +	{0x060f, 0x008e, 0x765426cd, 0x2099626f},
> +	{0x07b1, 0x0047, 0x2cfcfa0c, 0x1a4baa07},
> +	{0x04f1, 0x0041, 0x55b172f9, 0x15331a79},
> +	{0x0e05, 0x00ac, 0x61efde93, 0x320568cc},
> +	{0x0bf7, 0x0097, 0x05b83eee, 0xc72fb7a3},
> +	{0x04e9, 0x00f3, 0x9928223a, 0xe8c77de2},
> +	{0x023a, 0x0005, 0xdfada9bc, 0xdfadb9be},
> +	{0x0acb, 0x000e, 0x2217cecd, 0x0017d6cd},
> +	{0x0148, 0x0060, 0xbc3f7405, 0xf5fd6615},
> +	{0x0764, 0x0059, 0xcbc201b1, 0xbb089bf4},
> +	{0x021f, 0x0059, 0x5d6b2256, 0xa16a0a59},
> +	{0x0f1e, 0x006c, 0xdefeeb45, 0xfc34f9d6},
> +	{0x071c, 0x00b9, 0xb9b59309, 0xb645eae2},
> +	{0x0564, 0x0063, 0xae064271, 0x954dc6d1},
> +	{0x0b14, 0x0044, 0xdb867d9b, 0xdf432309},
> +	{0x0e5a, 0x0055, 0xff06b685, 0xa65ff257},
> +	{0x015e, 0x00ba, 0x1115ccbc, 0x11c365f4},
> +	{0x0379, 0x00e6, 0x5f4e58dd, 0x2d176d31},
> +	{0x013b, 0x0067, 0x4897427e, 0xc40532fe},
> +	{0x0e64, 0x0071, 0x7af2b7a4, 0x1fb7bf43},
> +	{0x0a11, 0x0050, 0x92105726, 0xb1185e51},
> +	{0x0109, 0x0055, 0xd0d000f9, 0x60a60bfd},
> +	{0x00aa, 0x0022, 0x815d229d, 0x215d379c},
> +	{0x09ac, 0x004f, 0x02f9d985, 0x10b90b20},
> +	{0x0e1b, 0x00ce, 0x5cf92ab4, 0x6a477573},
> +	{0x08af, 0x00d8, 0x17ca72d1, 0x385af156},
> +	{0x0e33, 0x000a, 0xda2dba6b, 0xda2dbb69},
> +	{0x0ee3, 0x006a, 0xb00048e5, 0xa9a2decc},
> +	{0x0648, 0x001a, 0x2364b8cb, 0x3364b1cb},
> +	{0x0315, 0x0085, 0x0596fd0d, 0xa651740f},
> +	{0x0fbb, 0x003e, 0x298230ca, 0x7fc617c7},
> +	{0x0422, 0x006a, 0x78ada4ab, 0xc576ae2a},
> +	{0x04ba, 0x0073, 0xced1fbc2, 0xaac8455b},
> +	{0x007d, 0x0061, 0x4b7ff236, 0x347d5739},
> +	{0x070b, 0x00d0, 0x261cf0ae, 0xc7fb1c10},
> +	{0x0c1a, 0x0035, 0x8be92ee2, 0x8be9b4e1},
> +	{0x0af8, 0x0063, 0x824dcf03, 0x53010388},
> +	{0x08f8, 0x006d, 0xd289710c, 0x30418edd},
> +	{0x021b, 0x00ee, 0x6ac1c41d, 0x2557e9a3},
> +	{0x05b5, 0x00da, 0x8e52f0e2, 0x98531012},
>  };
> 
>  /* Don't print anything to stdout. */
> @@ -127,6 +128,7 @@ dahash_test(
>  	int		i;
>  	int		errors = 0;
>  	int		bytes = 0;
> +	struct xfs_name	xname = { };
>  	struct timeval	start, stop;
>  	uint64_t	usec;
> 
> @@ -150,6 +152,12 @@ dahash_test(
>  				dahash_tests[i].length);
>  		if (hash != dahash_tests[i].dahash)
>  			errors++;
> +
> +		xname.name = randbytes_test_buf + dahash_tests[i].start;
> +		xname.len = dahash_tests[i].length;
> +		hash = libxfs_ascii_ci_hashname(&xname);
> +		if (hash != dahash_tests[i].ascii_ci_dahash)
> +			errors++;
>  	}
>  	gettimeofday(&stop, NULL);
> 
> diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h
> index d973e300939..026aa510ca1 100644
> --- a/libxfs/libxfs_api_defs.h
> +++ b/libxfs/libxfs_api_defs.h
> @@ -34,6 +34,8 @@
>  #define xfs_alloc_read_agf		libxfs_alloc_read_agf
>  #define xfs_alloc_vextent		libxfs_alloc_vextent
> 
> +#define xfs_ascii_ci_hashname		libxfs_ascii_ci_hashname
> +
>  #define xfs_attr_get			libxfs_attr_get
>  #define xfs_attr_leaf_newentsize	libxfs_attr_leaf_newentsize
>  #define xfs_attr_namecheck		libxfs_attr_namecheck
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] xfs_db: fix metadump name obfuscation for ascii-ci filesystems
  2023-06-05 15:36 ` [PATCH 3/5] xfs_db: fix metadump name obfuscation for ascii-ci filesystems Darrick J. Wong
  2023-06-05 16:59   ` Eric Sandeen
  2023-06-05 17:19   ` Eric Sandeen
@ 2023-06-14 11:10   ` Carlos Maiolino
  2023-06-15 16:11   ` [PATCH v2 " Darrick J. Wong
  3 siblings, 0 replies; 15+ messages in thread
From: Carlos Maiolino @ 2023-06-14 11:10 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david, hch

On Mon, Jun 05, 2023 at 08:36:44AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Now that we've stabilized the dirent hash function for ascii-ci
> filesystems, adapt the metadump name obfuscation code to detect when
> it's obfuscating a directory entry name on an ascii-ci filesystem and
> spit out names that actually have the same hash.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  db/metadump.c |   77 ++++++++++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 68 insertions(+), 9 deletions(-)
> 
> 
> diff --git a/db/metadump.c b/db/metadump.c
> index 317ff72802d..4f8b3adb163 100644
> --- a/db/metadump.c
> +++ b/db/metadump.c
> @@ -817,13 +817,17 @@ static void
>  obfuscate_name(
>  	xfs_dahash_t	hash,
>  	size_t		name_len,
> -	unsigned char	*name)
> +	unsigned char	*name,
> +	bool		is_dirent)
>  {
> -	unsigned char	*newp = name;
> +	unsigned char	*oldname = NULL;
> +	unsigned char	*newp;
>  	int		i;
> -	xfs_dahash_t	new_hash = 0;
> +	xfs_dahash_t	new_hash;
>  	unsigned char	*first;
>  	unsigned char	high_bit;
> +	int		tries = 0;
> +	bool		is_ci_name = is_dirent && xfs_has_asciici(mp);
>  	int		shift;
> 
>  	/*
> @@ -836,6 +840,24 @@ obfuscate_name(
>  	if (name_len < 5)
>  		return;
> 
> +	if (is_ci_name) {
> +		oldname = alloca(name_len);

			  ^^^ This is triggering a warning on build time because
			      the alloca() call is unbounded:

metadump.c: In function ‘obfuscate_name’:
metadump.c:844:13: warning: argument to ‘alloca’ may be too large
[-Walloca-larger-than=]
  844 |   oldname = alloca(name_len);
      |             ^~~~~~

Maybe just use malloc() here, instead of using stack space? We probably can use
MAXNAMELEN here to bound it, something like:

if (is_ci_name && name_len <= MAXNAMELEN) {
	oldname = alloca(name_len);
	memcpy(oldname, name, name_len);
} else {
	return;
}

-- 
Carlos
> +		memcpy(oldname, name, name_len);
> +	}
> +
> +again:
> +	newp = name;
> +	new_hash = 0;
> +
> +	/*
> +	 * If we cannot generate a ci-compatible obfuscated name after 1000
> +	 * tries, don't bother obfuscating the name.
> +	 */
> +	if (tries++ > 1000) {
> +		memcpy(name, oldname, name_len);
> +		return;
> +	}
> +
>  	/*
>  	 * The beginning of the obfuscated name can be pretty much
>  	 * anything, so fill it in with random characters.
> @@ -843,7 +865,11 @@ obfuscate_name(
>  	 */
>  	for (i = 0; i < name_len - 5; i++) {
>  		*newp = random_filename_char();
> -		new_hash = *newp ^ rol32(new_hash, 7);
> +		if (is_ci_name)
> +			new_hash = xfs_ascii_ci_xfrm(*newp) ^
> +							rol32(new_hash, 7);
> +		else
> +			new_hash = *newp ^ rol32(new_hash, 7);
>  		newp++;
>  	}
> 
> @@ -867,6 +893,17 @@ obfuscate_name(
>  			high_bit = 0x80;
>  		} else
>  			high_bit = 0;
> +
> +		/*
> +		 * If ascii-ci is enabled, uppercase characters are converted
> +		 * to lowercase characters while computing the name hash.  If
> +		 * any of the necessary correction bytes are uppercase, the
> +		 * hash of the new name will not match.  Try again with a
> +		 * different prefix.
> +		 */
> +		if (is_ci_name && xfs_ascii_ci_need_xfrm(*newp))
> +			goto again;
> +
>  		ASSERT(!is_invalid_char(*newp));
>  		newp++;
>  	}
> @@ -880,6 +917,10 @@ obfuscate_name(
>  	 */
>  	if (high_bit) {
>  		*first ^= 0x10;
> +
> +		if (is_ci_name && xfs_ascii_ci_need_xfrm(*first))
> +			goto again;
> +
>  		ASSERT(!is_invalid_char(*first));
>  	}
>  }
> @@ -1177,6 +1218,24 @@ handle_duplicate_name(xfs_dahash_t hash, size_t name_len, unsigned char *name)
>  	return 1;
>  }
> 
> +static inline xfs_dahash_t
> +dirattr_hashname(
> +	bool		is_dirent,
> +	const uint8_t	*name,
> +	int		namelen)
> +{
> +	if (is_dirent) {
> +		struct xfs_name	xname = {
> +			.name	= name,
> +			.len	= namelen,
> +		};
> +
> +		return libxfs_dir2_hashname(mp, &xname);
> +	}
> +
> +	return libxfs_da_hashname(name, namelen);
> +}
> +
>  static void
>  generate_obfuscated_name(
>  	xfs_ino_t		ino,
> @@ -1205,9 +1264,9 @@ generate_obfuscated_name(
> 
>  	/* Obfuscate the name (if possible) */
> 
> -	hash = libxfs_da_hashname(name, namelen);
> -	obfuscate_name(hash, namelen, name);
> -	ASSERT(hash == libxfs_da_hashname(name, namelen));
> +	hash = dirattr_hashname(ino != 0, name, namelen);
> +	obfuscate_name(hash, namelen, name, ino != 0);
> +	ASSERT(hash == dirattr_hashname(ino != 0, name, namelen));
> 
>  	/*
>  	 * Make sure the name is not something already seen.  If we
> @@ -1320,7 +1379,7 @@ obfuscate_path_components(
>  			/* last (or single) component */
>  			namelen = strnlen((char *)comp, len);
>  			hash = libxfs_da_hashname(comp, namelen);
> -			obfuscate_name(hash, namelen, comp);
> +			obfuscate_name(hash, namelen, comp, false);
>  			ASSERT(hash == libxfs_da_hashname(comp, namelen));
>  			break;
>  		}
> @@ -1332,7 +1391,7 @@ obfuscate_path_components(
>  			continue;
>  		}
>  		hash = libxfs_da_hashname(comp, namelen);
> -		obfuscate_name(hash, namelen, comp);
> +		obfuscate_name(hash, namelen, comp, false);
>  		ASSERT(hash == libxfs_da_hashname(comp, namelen));
>  		comp += namelen + 1;
>  		len -= namelen + 1;
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 3/5] xfs_db: fix metadump name obfuscation for ascii-ci filesystems
  2023-06-05 15:36 ` [PATCH 3/5] xfs_db: fix metadump name obfuscation for ascii-ci filesystems Darrick J. Wong
                     ` (2 preceding siblings ...)
  2023-06-14 11:10   ` Carlos Maiolino
@ 2023-06-15 16:11   ` Darrick J. Wong
  2023-06-22 11:45     ` Carlos Maiolino
  3 siblings, 1 reply; 15+ messages in thread
From: Darrick J. Wong @ 2023-06-15 16:11 UTC (permalink / raw)
  To: cem; +Cc: linux-xfs, david, hch

From: Darrick J. Wong <djwong@kernel.org>

Now that we've stabilized the dirent hash function for ascii-ci
filesystems, adapt the metadump name obfuscation code to detect when
it's obfuscating a directory entry name on an ascii-ci filesystem and
spit out names that actually have the same hash.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
v2: s/alloca/malloc/
---
 db/metadump.c |   82 +++++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 73 insertions(+), 9 deletions(-)

diff --git a/db/metadump.c b/db/metadump.c
index 317ff72802d..9ccee0b7ace 100644
--- a/db/metadump.c
+++ b/db/metadump.c
@@ -817,13 +817,17 @@ static void
 obfuscate_name(
 	xfs_dahash_t	hash,
 	size_t		name_len,
-	unsigned char	*name)
+	unsigned char	*name,
+	bool		is_dirent)
 {
-	unsigned char	*newp = name;
+	unsigned char	*oldname = NULL;
+	unsigned char	*newp;
 	int		i;
-	xfs_dahash_t	new_hash = 0;
+	xfs_dahash_t	new_hash;
 	unsigned char	*first;
 	unsigned char	high_bit;
+	int		tries = 0;
+	bool		is_ci_name = is_dirent && xfs_has_asciici(mp);
 	int		shift;
 
 	/*
@@ -836,6 +840,26 @@ obfuscate_name(
 	if (name_len < 5)
 		return;
 
+	if (is_ci_name) {
+		oldname = malloc(name_len);
+		if (!oldname)
+			return;
+		memcpy(oldname, name, name_len);
+	}
+
+again:
+	newp = name;
+	new_hash = 0;
+
+	/*
+	 * If we cannot generate a ci-compatible obfuscated name after 1000
+	 * tries, don't bother obfuscating the name.
+	 */
+	if (tries++ > 1000) {
+		memcpy(name, oldname, name_len);
+		goto out_free;
+	}
+
 	/*
 	 * The beginning of the obfuscated name can be pretty much
 	 * anything, so fill it in with random characters.
@@ -843,7 +867,11 @@ obfuscate_name(
 	 */
 	for (i = 0; i < name_len - 5; i++) {
 		*newp = random_filename_char();
-		new_hash = *newp ^ rol32(new_hash, 7);
+		if (is_ci_name)
+			new_hash = xfs_ascii_ci_xfrm(*newp) ^
+							rol32(new_hash, 7);
+		else
+			new_hash = *newp ^ rol32(new_hash, 7);
 		newp++;
 	}
 
@@ -867,6 +895,17 @@ obfuscate_name(
 			high_bit = 0x80;
 		} else
 			high_bit = 0;
+
+		/*
+		 * If ascii-ci is enabled, uppercase characters are converted
+		 * to lowercase characters while computing the name hash.  If
+		 * any of the necessary correction bytes are uppercase, the
+		 * hash of the new name will not match.  Try again with a
+		 * different prefix.
+		 */
+		if (is_ci_name && xfs_ascii_ci_need_xfrm(*newp))
+			goto again;
+
 		ASSERT(!is_invalid_char(*newp));
 		newp++;
 	}
@@ -880,8 +919,15 @@ obfuscate_name(
 	 */
 	if (high_bit) {
 		*first ^= 0x10;
+
+		if (is_ci_name && xfs_ascii_ci_need_xfrm(*first))
+			goto again;
+
 		ASSERT(!is_invalid_char(*first));
 	}
+
+out_free:
+	free(oldname);
 }
 
 /*
@@ -1177,6 +1223,24 @@ handle_duplicate_name(xfs_dahash_t hash, size_t name_len, unsigned char *name)
 	return 1;
 }
 
+static inline xfs_dahash_t
+dirattr_hashname(
+	bool		is_dirent,
+	const uint8_t	*name,
+	int		namelen)
+{
+	if (is_dirent) {
+		struct xfs_name	xname = {
+			.name	= name,
+			.len	= namelen,
+		};
+
+		return libxfs_dir2_hashname(mp, &xname);
+	}
+
+	return libxfs_da_hashname(name, namelen);
+}
+
 static void
 generate_obfuscated_name(
 	xfs_ino_t		ino,
@@ -1205,9 +1269,9 @@ generate_obfuscated_name(
 
 	/* Obfuscate the name (if possible) */
 
-	hash = libxfs_da_hashname(name, namelen);
-	obfuscate_name(hash, namelen, name);
-	ASSERT(hash == libxfs_da_hashname(name, namelen));
+	hash = dirattr_hashname(ino != 0, name, namelen);
+	obfuscate_name(hash, namelen, name, ino != 0);
+	ASSERT(hash == dirattr_hashname(ino != 0, name, namelen));
 
 	/*
 	 * Make sure the name is not something already seen.  If we
@@ -1320,7 +1384,7 @@ obfuscate_path_components(
 			/* last (or single) component */
 			namelen = strnlen((char *)comp, len);
 			hash = libxfs_da_hashname(comp, namelen);
-			obfuscate_name(hash, namelen, comp);
+			obfuscate_name(hash, namelen, comp, false);
 			ASSERT(hash == libxfs_da_hashname(comp, namelen));
 			break;
 		}
@@ -1332,7 +1396,7 @@ obfuscate_path_components(
 			continue;
 		}
 		hash = libxfs_da_hashname(comp, namelen);
-		obfuscate_name(hash, namelen, comp);
+		obfuscate_name(hash, namelen, comp, false);
 		ASSERT(hash == libxfs_da_hashname(comp, namelen));
 		comp += namelen + 1;
 		len -= namelen + 1;

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/5] xfs_db: fix metadump name obfuscation for ascii-ci filesystems
  2023-06-15 16:11   ` [PATCH v2 " Darrick J. Wong
@ 2023-06-22 11:45     ` Carlos Maiolino
  0 siblings, 0 replies; 15+ messages in thread
From: Carlos Maiolino @ 2023-06-22 11:45 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david, hch

On Thu, Jun 15, 2023 at 09:11:04AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Now that we've stabilized the dirent hash function for ascii-ci
> filesystems, adapt the metadump name obfuscation code to detect when
> it's obfuscating a directory entry name on an ascii-ci filesystem and
> spit out names that actually have the same hash.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
> v2: s/alloca/malloc/
> ---
>  db/metadump.c |   82 +++++++++++++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 73 insertions(+), 9 deletions(-)

Thanks for fixing it!

Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>

> 
> diff --git a/db/metadump.c b/db/metadump.c
> index 317ff72802d..9ccee0b7ace 100644
> --- a/db/metadump.c
> +++ b/db/metadump.c
> @@ -817,13 +817,17 @@ static void
>  obfuscate_name(
>  	xfs_dahash_t	hash,
>  	size_t		name_len,
> -	unsigned char	*name)
> +	unsigned char	*name,
> +	bool		is_dirent)
>  {
> -	unsigned char	*newp = name;
> +	unsigned char	*oldname = NULL;
> +	unsigned char	*newp;
>  	int		i;
> -	xfs_dahash_t	new_hash = 0;
> +	xfs_dahash_t	new_hash;
>  	unsigned char	*first;
>  	unsigned char	high_bit;
> +	int		tries = 0;
> +	bool		is_ci_name = is_dirent && xfs_has_asciici(mp);
>  	int		shift;
> 
>  	/*
> @@ -836,6 +840,26 @@ obfuscate_name(
>  	if (name_len < 5)
>  		return;
> 
> +	if (is_ci_name) {
> +		oldname = malloc(name_len);
> +		if (!oldname)
> +			return;
> +		memcpy(oldname, name, name_len);
> +	}
> +
> +again:
> +	newp = name;
> +	new_hash = 0;
> +
> +	/*
> +	 * If we cannot generate a ci-compatible obfuscated name after 1000
> +	 * tries, don't bother obfuscating the name.
> +	 */
> +	if (tries++ > 1000) {
> +		memcpy(name, oldname, name_len);
> +		goto out_free;
> +	}
> +
>  	/*
>  	 * The beginning of the obfuscated name can be pretty much
>  	 * anything, so fill it in with random characters.
> @@ -843,7 +867,11 @@ obfuscate_name(
>  	 */
>  	for (i = 0; i < name_len - 5; i++) {
>  		*newp = random_filename_char();
> -		new_hash = *newp ^ rol32(new_hash, 7);
> +		if (is_ci_name)
> +			new_hash = xfs_ascii_ci_xfrm(*newp) ^
> +							rol32(new_hash, 7);
> +		else
> +			new_hash = *newp ^ rol32(new_hash, 7);
>  		newp++;
>  	}
> 
> @@ -867,6 +895,17 @@ obfuscate_name(
>  			high_bit = 0x80;
>  		} else
>  			high_bit = 0;
> +
> +		/*
> +		 * If ascii-ci is enabled, uppercase characters are converted
> +		 * to lowercase characters while computing the name hash.  If
> +		 * any of the necessary correction bytes are uppercase, the
> +		 * hash of the new name will not match.  Try again with a
> +		 * different prefix.
> +		 */
> +		if (is_ci_name && xfs_ascii_ci_need_xfrm(*newp))
> +			goto again;
> +
>  		ASSERT(!is_invalid_char(*newp));
>  		newp++;
>  	}
> @@ -880,8 +919,15 @@ obfuscate_name(
>  	 */
>  	if (high_bit) {
>  		*first ^= 0x10;
> +
> +		if (is_ci_name && xfs_ascii_ci_need_xfrm(*first))
> +			goto again;
> +
>  		ASSERT(!is_invalid_char(*first));
>  	}
> +
> +out_free:
> +	free(oldname);
>  }
> 
>  /*
> @@ -1177,6 +1223,24 @@ handle_duplicate_name(xfs_dahash_t hash, size_t name_len, unsigned char *name)
>  	return 1;
>  }
> 
> +static inline xfs_dahash_t
> +dirattr_hashname(
> +	bool		is_dirent,
> +	const uint8_t	*name,
> +	int		namelen)
> +{
> +	if (is_dirent) {
> +		struct xfs_name	xname = {
> +			.name	= name,
> +			.len	= namelen,
> +		};
> +
> +		return libxfs_dir2_hashname(mp, &xname);
> +	}
> +
> +	return libxfs_da_hashname(name, namelen);
> +}
> +
>  static void
>  generate_obfuscated_name(
>  	xfs_ino_t		ino,
> @@ -1205,9 +1269,9 @@ generate_obfuscated_name(
> 
>  	/* Obfuscate the name (if possible) */
> 
> -	hash = libxfs_da_hashname(name, namelen);
> -	obfuscate_name(hash, namelen, name);
> -	ASSERT(hash == libxfs_da_hashname(name, namelen));
> +	hash = dirattr_hashname(ino != 0, name, namelen);
> +	obfuscate_name(hash, namelen, name, ino != 0);
> +	ASSERT(hash == dirattr_hashname(ino != 0, name, namelen));
> 
>  	/*
>  	 * Make sure the name is not something already seen.  If we
> @@ -1320,7 +1384,7 @@ obfuscate_path_components(
>  			/* last (or single) component */
>  			namelen = strnlen((char *)comp, len);
>  			hash = libxfs_da_hashname(comp, namelen);
> -			obfuscate_name(hash, namelen, comp);
> +			obfuscate_name(hash, namelen, comp, false);
>  			ASSERT(hash == libxfs_da_hashname(comp, namelen));
>  			break;
>  		}
> @@ -1332,7 +1396,7 @@ obfuscate_path_components(
>  			continue;
>  		}
>  		hash = libxfs_da_hashname(comp, namelen);
> -		obfuscate_name(hash, namelen, comp);
> +		obfuscate_name(hash, namelen, comp, false);
>  		ASSERT(hash == libxfs_da_hashname(comp, namelen));
>  		comp += namelen + 1;
>  		len -= namelen + 1;

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-06-22 11:46 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-05 15:36 [PATCHSET v3 0/5] xfsprogs: fix ascii-ci problems, then kill it Darrick J. Wong
2023-06-05 15:36 ` [PATCH 1/5] libxfs: test the ascii case-insensitive hash Darrick J. Wong
2023-06-05 16:30   ` Eric Sandeen
2023-06-14  7:44   ` Carlos Maiolino
2023-06-05 15:36 ` [PATCH 2/5] xfs_db: move obfuscate_name assertion to callers Darrick J. Wong
2023-06-05 15:36 ` [PATCH 3/5] xfs_db: fix metadump name obfuscation for ascii-ci filesystems Darrick J. Wong
2023-06-05 16:59   ` Eric Sandeen
2023-06-05 17:09     ` Darrick J. Wong
2023-06-05 17:18       ` Eric Sandeen
2023-06-05 17:19   ` Eric Sandeen
2023-06-14 11:10   ` Carlos Maiolino
2023-06-15 16:11   ` [PATCH v2 " Darrick J. Wong
2023-06-22 11:45     ` Carlos Maiolino
2023-06-05 15:36 ` [PATCH 4/5] mkfs.xfs.8: warn about the version=ci feature Darrick J. Wong
2023-06-05 15:36 ` [PATCH 5/5] mkfs: deprecate the ascii-ci feature Darrick J. Wong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.