* [PATCH] generic/453: Exclude filenames that are not supported by exfat
From: Shreeya Patel @ 2021-04-25 22:31 UTC
To: fstests; +Cc: linux-fsdevel, krisman, preichl, kernel, Shreeya Patel

exFAT filesystem does not support the following character codes
0x0000 - 0x001F ( Control Codes ), /, ?, :, ", \, *, <, |, >

Hence, exclude the filenames that create FAKESLASH and a BOX, since
they use character codes which are not supported by exfat.
Filename creating a BOX uses a control code '\x0a' which is
restricted by exfat.

Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
---
 tests/generic/453 | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/tests/generic/453 b/tests/generic/453
index d997736c..7fc73b4c 100755
--- a/tests/generic/453
+++ b/tests/generic/453
@@ -115,15 +115,9 @@ setf "greek_\xce\xa5\xcc\x81.txt" "GREEK UPSILON WITH ACUTE AND HOOK SYMBOL, NFK
 setf "arabic_\xef\xb7\xba.txt" "ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM, NFC"
 setf "arabic_\xd8\xb5\xd9\x84\xd9\x89\x20\xd8\xa7\xd9\x84\xd9\x84\xd9\x87\x20\xd8\xb9\xd9\x84\xd9\x8a\xd9\x87\x20\xd9\x88\xd8\xb3\xd9\x84\xd9\x85.txt" "ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM, NFKC"
 
-# Fake slash?
-setf "urk\xc0\xafmoo" "FAKESLASH"
-
 # Emoji: octopus butterfly owl giraffe
 setf "emoji_\xf0\x9f\xa6\x91\xf0\x9f\xa6\x8b\xf0\x9f\xa6\x89\xf0\x9f\xa6\x92.txt" "octopus butterfly owl giraffe emoji"
 
-# Line draw characters, because why not?
-setf "\x6c\x69\x6e\x65\x64\x72\x61\x77\x5f\x0a\xe2\x95\x94\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x97\x0a\xe2\x95\x91\x20\x6d\x65\x74\x61\x74\x61\x62\x6c\x65\x20\xe2\x95\x91\x0a\xe2\x95\x9f\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x95\xa2\x0a\xe2\x95\x91\x20\x5f\x5f\x69\x6e\x64\x65\x78\x20\x20\x20\xe2\x95\x91\x0a\xe2\x95\x9a\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x9d\x0a.txt" "ugly box because we can"
-
 # unicode rtl widgets too...
 setf "moo\xe2\x80\xaegnp.txt" "Well say hello,"
 setf "mootxt.png" "Harvey"
@@ -155,6 +149,16 @@ setf "zerojoin_moo\xe2\x80\x8dcow.txt" "zero width joiners"
 setf "combmark_\xe1\x80\x9c\xe1\x80\xad\xe1\x80\xaf.txt" "combining marks"
 setf "combmark_\xe1\x80\x9c\xe1\x80\xaf\xe1\x80\xad.txt" "combining marks"
 
+if [ "$FSTYP" != "exfat" ]; then
+
+	# Fake slash?
+	setf "urk\xc0\xafmoo" "FAKESLASH"
+
+	# Line draw characters, because why not?
+	setf "\x6c\x69\x6e\x65\x64\x72\x61\x77\x5f\x0a\xe2\x95\x94\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x97\x0a\xe2\x95\x91\x20\x6d\x65\x74\x61\x74\x61\x62\x6c\x65\x20\xe2\x95\x91\x0a\xe2\x95\x9f\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x95\xa2\x0a\xe2\x95\x91\x20\x5f\x5f\x69\x6e\x64\x65\x78\x20\x20\x20\xe2\x95\x91\x0a\xe2\x95\x9a\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x9d\x0a.txt" "ugly box because we can"
+
+fi
+
 # fake dotdot entry
 setd ".\xe2\x80\x8d" "zero width joiners in dot entry"
 setd "..\xe2\x80\x8d" "zero width joiners in dotdot entry"
@@ -176,12 +180,8 @@ testf "greek_\xce\xa5\xcc\x81.txt" "GREEK UPSILON WITH ACUTE AND HOOK SYMBOL, NF
 testf "arabic_\xef\xb7\xba.txt" "ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM, NFC"
 testf "arabic_\xd8\xb5\xd9\x84\xd9\x89\x20\xd8\xa7\xd9\x84\xd9\x84\xd9\x87\x20\xd8\xb9\xd9\x84\xd9\x8a\xd9\x87\x20\xd9\x88\xd8\xb3\xd9\x84\xd9\x85.txt" "ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM, NFKC"
 
-testf "urk\xc0\xafmoo" "FAKESLASH"
-
 testf "emoji_\xf0\x9f\xa6\x91\xf0\x9f\xa6\x8b\xf0\x9f\xa6\x89\xf0\x9f\xa6\x92.txt" "octopus butterfly owl giraffe emoji"
 
-testf "\x6c\x69\x6e\x65\x64\x72\x61\x77\x5f\x0a\xe2\x95\x94\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x97\x0a\xe2\x95\x91\x20\x6d\x65\x74\x61\x74\x61\x62\x6c\x65\x20\xe2\x95\x91\x0a\xe2\x95\x9f\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x95\xa2\x0a\xe2\x95\x91\x20\x5f\x5f\x69\x6e\x64\x65\x78\x20\x20\x20\xe2\x95\x91\x0a\xe2\x95\x9a\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x9d\x0a.txt" "ugly box because we can"
-
 testf "moo\xe2\x80\xaegnp.txt" "Well say hello,"
 testf "mootxt.png" "Harvey"
 
@@ -206,6 +206,14 @@ testf "zerojoin_moo\xe2\x80\x8dcow.txt" "zero width joiners"
 testf "combmark_\xe1\x80\x9c\xe1\x80\xad\xe1\x80\xaf.txt" "combining marks"
 testf "combmark_\xe1\x80\x9c\xe1\x80\xaf\xe1\x80\xad.txt" "combining marks"
 
+if [ "$FSTYP" != "exfat" ]; then
+
+	testf "urk\xc0\xafmoo" "FAKESLASH"
+
+	testf "\x6c\x69\x6e\x65\x64\x72\x61\x77\x5f\x0a\xe2\x95\x94\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x97\x0a\xe2\x95\x91\x20\x6d\x65\x74\x61\x74\x61\x62\x6c\x65\x20\xe2\x95\x91\x0a\xe2\x95\x9f\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x95\xa2\x0a\xe2\x95\x91\x20\x5f\x5f\x69\x6e\x64\x65\x78\x20\x20\x20\xe2\x95\x91\x0a\xe2\x95\x9a\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x9d\x0a.txt" "ugly box because we can"
+
+fi
+
 testd ".\xe2\x80\x8d" "zero width joiners in dot entry"
 testd "..\xe2\x80\x8d" "zero width joiners in dotdot entry"
 
-- 
2.31.1

* Re: [PATCH] generic/453: Exclude filenames that are not supported by exfat
From: Matthew Wilcox @ 2021-04-26 0:34 UTC
To: Shreeya Patel; +Cc: fstests, linux-fsdevel, krisman, preichl, kernel

On Mon, Apr 26, 2021 at 04:01:05AM +0530, Shreeya Patel wrote:
> exFAT filesystem does not support the following character codes
> 0x0000 - 0x001F ( Control Codes ), /, ?, :, ", \, *, <, |, >

ummm ...

> -# Fake slash?
> -setf "urk\xc0\xafmoo" "FAKESLASH"

That doesn't use any of the explained banned characters.  It uses 0xc0,
0xaf.

Now, in utf-8, that's a nonconforming sequence.  "The Unicode and UCS
standards require that producers of UTF-8 shall use the shortest form
possible, for example, producing a two-byte sequence with first byte 0xc0
is nonconforming.  Unicode 3.1 has added the requirement that conforming
programs must not accept non-shortest forms in their input."

So is it that exfat is rejecting nonconforming sequences?  Or is it
converting the nonconforming sequence from 0xc0 0xaf to the conforming
sequence 0x2f, and then rejecting it (because it's '/')?

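The non-shortest-form point above can be checked with plain shell arithmetic. The sketch below is not part of the patch or of fstests; the helper names `utf8_decode2` and `utf8_is_overlong2` are made up for illustration. It decodes a two-byte sequence the naive way (no shortest-form check) and shows that the FAKESLASH bytes 0xc0 0xaf collapse to 0x2f, the code for '/'.

```shell
# Naive two-byte UTF-8 decode: low 5 bits of the lead byte, then the
# low 6 bits of the continuation byte. No shortest-form check is done.
# (Hypothetical helper, for illustration only.)
utf8_decode2() {
    printf '0x%02x\n' $(( (($1 & 0x1f) << 6) | ($2 & 0x3f) ))
}

# A two-byte sequence is a non-shortest form if the decoded code point
# is below 0x80, i.e. it would have fit in a single byte.
utf8_is_overlong2() {
    [ $(( (($1 & 0x1f) << 6) | ($2 & 0x3f) )) -lt 128 ]
}

utf8_decode2 0xc0 0xaf    # the FAKESLASH bytes: prints 0x2f
utf8_decode2 0xc3 0xa9    # a conforming sequence (U+00E9): prints 0xe9

if utf8_is_overlong2 0xc0 0xaf; then
    echo "0xc0 0xaf is a non-shortest form of 0x2f"
fi
```

A strict decoder rejects the first sequence outright, while a lenient one would hand back '/'. Which of the two a given filesystem does is exactly the question raised above.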
* Re: [PATCH] generic/453: Exclude filenames that are not supported by exfat
From: Shreeya Patel @ 2021-04-26 11:57 UTC
To: Matthew Wilcox; +Cc: fstests, linux-fsdevel, krisman, preichl, kernel

On 26/04/21 6:04 am, Matthew Wilcox wrote:
> On Mon, Apr 26, 2021 at 04:01:05AM +0530, Shreeya Patel wrote:
>> exFAT filesystem does not support the following character codes
>> 0x0000 - 0x001F ( Control Codes ), /, ?, :, ", \, *, <, |, >
> ummm ...
>
>> -# Fake slash?
>> -setf "urk\xc0\xafmoo" "FAKESLASH"
> That doesn't use any of the explained banned characters.  It uses 0xc0,
> 0xaf.
>
> Now, in utf-8, that's a nonconforming sequence.  "The Unicode and UCS
> standards require that producers of UTF-8 shall use the shortest form
> possible, for example, producing a two-byte sequence with first byte 0xc0
> is nonconforming.  Unicode 3.1 has added the requirement that conforming
> programs must not accept non-shortest forms in their input."
>
> So is it that exfat is rejecting nonconforming sequences?  Or is it
> converting the nonconforming sequence from 0xc0 0xaf to the conforming
> sequence 0x2f, and then rejecting it (because it's '/')?

No, I don't think exfat is not converting nonconforming sequence from
0xc0 0xaf to the conforming sequence 0x2f.
Because I get different outputs when tried with both ways.
When I create a file with "urk\xc0\xafmoo", I get output as
"Operation not permitted" and when I create it as "urk\x2fmoo", it gives
"No such file or directory error" or you can consider this error as
"Invalid argument"
( because that's what I get when I try for other characters like |, :, ?, etc )

Box filename also fails with "Invalid argument" error.

* Re: [PATCH] generic/453: Exclude filenames that are not supported by exfat
From: Shreeya Patel @ 2021-04-26 12:03 UTC
To: Matthew Wilcox; +Cc: fstests, linux-fsdevel, krisman, preichl, kernel

On 26/04/21 5:27 pm, Shreeya Patel wrote:
> No, I don't think exfat is not converting nonconforming sequence from
> 0xc0 0xaf
> to the conforming sequence 0x2f.

Sorry, I meant "I don't think exfat is converting nonconforming sequence
from 0xc0 0xaf to the conforming sequence 0x2f." here.

> Because I get different outputs when tried with both ways.
> When I create a file with "urk\xc0\xafmoo", I get output as "Operation
> not permitted"
> and when I create it as "urk\x2fmoo", it gives "No such file or
> directory error" or
> you can consider this error as "Invalid argument"
> ( because that's what I get when I try for other characters like |, :,
> ?, etc )
>
> Box filename also fails with "Invalid argument" error.

* Re: [PATCH] generic/453: Exclude filenames that are not supported by exfat
From: Matthew Wilcox @ 2021-04-26 12:37 UTC
To: Shreeya Patel; +Cc: fstests, linux-fsdevel, krisman, preichl, kernel

On Mon, Apr 26, 2021 at 05:27:51PM +0530, Shreeya Patel wrote:
> No, I don't think exfat is not converting nonconforming sequence from 0xc0
> 0xaf
> to the conforming sequence 0x2f.
> Because I get different outputs when tried with both ways.
> When I create a file with "urk\xc0\xafmoo", I get output as "Operation not
> permitted"
> and when I create it as "urk\x2fmoo", it gives "No such file or directory
> error" or
> you can consider this error as "Invalid argument"
> ( because that's what I get when I try for other characters like |, :, ?,
> etc )

I think we need to understand this before skipping the test.  Does it
also fail, eg, on cifs, vfat, jfs or udf?

> Box filename also fails with "Invalid argument" error.

* Re: [PATCH] generic/453: Exclude filenames that are not supported by exfat
From: Shreeya Patel @ 2021-04-27 11:13 UTC
To: Matthew Wilcox; +Cc: fstests, linux-fsdevel, krisman, preichl, kernel

On 26/04/21 6:07 pm, Matthew Wilcox wrote:
> I think we need to understand this before skipping the test.  Does it
> also fail, eg, on cifs, vfat, jfs or udf?

I tested it for VFAT, UDF and JFS and following are the results.

1. VFAT ( as per wikipedia 0x00-0x1F 0x7F " * / : < > ? \ | are reserved
characters)

For \x2f - /var/mnt/scratch/test-453/urk/moo.txt: No such file or directory

For \xc0\xaf - /var/mnt/scratch/test-453/urk��moo.txt: Invalid argument

Also gives error for Box filename

( this is very much similar to exfat, the only difference is that I do not
get Operation not permitted when using \xc0\xaf, instead it gives
invalid argument.)

2. UDF ( as per wikipedia - only NULL cannot be used )

For \x2f - /var/mnt/scratch/test-453/urk/moo.txt: No such file or directory

For \xc0\xaf - creates filename something like this 'urk??moo.txt' and does
not throw any error.
( But this seems to be invalid and should have thrown some error)

Also gives error for dotdot entry.

I am not sure why UDF was giving error for / and dot dot entry but then
I read the following for UDF in one of the man pages which justifies the
above errors I think

"Invalid characters such as "NULL" and "/" and invalid file
names such as "." and ".." will be translated according to
the following rule:

Replace the invalid character with an "_," then append the
file name with # followed by a 4 digit hex representation of
the 16-bit CRC of the original FileIdentifier. For example,
the file name ".." will become "__#4C05" "

Source - http://www-it.desy.de/cgi-bin/man-cgi?udfs+7

3. JFS ( as per Wikipedia NULL cannot be used )

For \x2f - /var/mnt/scratch/test-453/urk/moo.txt: No such file or directory

For \xc0\xaf - Works fine

Again not sure why / is failing here. Did not find much resource about the
restricted filenames for JFS.

So as per above all the results, it seems like using \x2f fails for all but
\xc0\xaf does work for JFS.

* Re: [PATCH] generic/453: Exclude filenames that are not supported by exfat
From: Darrick J. Wong @ 2021-04-27 18:11 UTC
To: Shreeya Patel
Cc: Matthew Wilcox, fstests, linux-fsdevel, krisman, preichl, kernel

On Tue, Apr 27, 2021 at 04:43:05PM +0530, Shreeya Patel wrote:
> I tested it for VFAT, UDF and JFS and following are the results.
>
> 1. VFAT ( as per wikipedia 0x00-0x1F 0x7F " * / : < > ? \ | are reserved
> characters)
>
> For \x2f - /var/mnt/scratch/test-453/urk/moo.txt: No such file or directory
>
> For \xc0\xaf - /var/mnt/scratch/test-453/urk��moo.txt: Invalid argument
>
> Also gives error for Box filename
>
> ( this is very much similar to exfat, the only difference is that I do not
> get Operation not permitted when
> using \xc0\xaf, instead it gives invalid argument.)

vfat checks for those invalid characters, see msdos_format_name() and
vfat_is_used_badchars().

TBH I think these tests (g/453 and g/454) are probably only useful for
filesystems that allow unrestricted byte streams for names.

> 2. UDF ( as per wikipedia - only NULL cannot be used )
>
> For \x2f - /var/mnt/scratch/test-453/urk/moo.txt: No such file or directory
>
> For \xc0\xaf - creates filename something like this 'urk??moo.txt' and does
> not throw any error.
> ( But this seems to be invalid and should have thrown some error)
>
> Also gives error for dotdot entry.
>
> I am not sure why UDF was giving error for / and dot dot entry but then
> I read the following for UDF in one of the man pages which justifies the
> above errors I think
>
> "Invalid characters such as "NULL" and "/" and invalid file
> names such as "." and ".." will be translated according to
> the following rule:
>
> Replace the invalid character with an "_," then append the
> file name with # followed by a 4 digit hex representation of
> the 16-bit CRC of the original FileIdentifier. For example,
> the file name ".." will become "__#4C05" "
>
> Source - http://www-it.desy.de/cgi-bin/man-cgi?udfs+7

That's Solaris.

> 3. JFS ( as per Wikipedia NULL cannot be used )
>
> For \x2f - /var/mnt/scratch/test-453/urk/moo.txt: No such file or directory
>
> For \xc0\xaf - Works fine
>
> Again not sure why / is failing here. Did not find much resource about the
> restricted filenames for JFS.

"/" is a path separator, it should always return ENOENT (unless you
created $SCRATCH_MNT/test-453/urk/moo.txt).  0x2f is the ascii encoding
for a slash.

> So as per above all the results, it seems like using \x2f fails for all but
> \xc0\xaf does work for JFS.

<nod>

--D

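The ENOENT explanation above is easy to reproduce without any special filesystem. The following is a minimal sketch that assumes only a writable temporary directory; it uses `mktemp -d` in place of the real scratch mount, and the variable names are illustrative. Because 0x2f is an ordinary '/', the name is split during path lookup, so the create fails until a directory called "urk" exists.

```shell
# Work in a throwaway directory rather than a real scratch mount.
scratch=$(mktemp -d)

# \057 is the octal spelling of 0x2f, i.e. a plain '/'.
name=$(printf 'urk\057moo.txt')

# First attempt: the parent directory "urk" does not exist, so path
# resolution fails before any per-filesystem name check is reached.
if touch "$scratch/$name" 2>/dev/null; then before=created; else before=failed; fi

# Second attempt: once "urk" exists, the same string names a file in it.
mkdir "$scratch/urk"
if touch "$scratch/$name" 2>/dev/null; then after=created; else after=failed; fi

echo "before mkdir: $before, after mkdir: $after"
rm -rf "$scratch"
```

This prints `before mkdir: failed, after mkdir: created`, which is why the "\x2f" results reported for VFAT, UDF and JFS say nothing about how those filesystems validate names.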
* Re: [PATCH] generic/453: Exclude filenames that are not supported by exfat
From: Shreeya Patel @ 2021-04-27 21:00 UTC
To: Darrick J. Wong
Cc: Matthew Wilcox, fstests, linux-fsdevel, krisman, preichl, kernel

On 27/04/21 11:41 pm, Darrick J. Wong wrote:
> On Tue, Apr 27, 2021 at 04:43:05PM +0530, Shreeya Patel wrote:
>> 1. VFAT ( as per wikipedia 0x00-0x1F 0x7F " * / : < > ? \ | are reserved
>> characters)
>>
>> For \x2f - /var/mnt/scratch/test-453/urk/moo.txt: No such file or directory
>>
>> For \xc0\xaf - /var/mnt/scratch/test-453/urk��moo.txt: Invalid argument
>>
>> Also gives error for Box filename
>>
>> ( this is very much similar to exfat, the only difference is that I do not
>> get Operation not permitted when
>> using \xc0\xaf, instead it gives invalid argument.)
> vfat checks for those invalid characters, see msdos_format_name() and
> vfat_is_used_badchars().
>
> TBH I think these tests (g/453 and g/454) are probably only useful for
> filesystems that allow unrestricted byte streams for names.

So it means I should just not run this test for all the fs that have
some restricted characters. But what about the other filenames which
work fine? Don't we want to test them?

>> Source - http://www-it.desy.de/cgi-bin/man-cgi?udfs+7
> That's Solaris.

Sorry missed that.

>> 3. JFS ( as per Wikipedia NULL cannot be used )
>>
>> For \x2f - /var/mnt/scratch/test-453/urk/moo.txt: No such file or directory
>>
>> For \xc0\xaf - Works fine
>>
>> Again not sure why / is failing here. Did not find much resource about the
>> restricted filenames for JFS.
> "/" is a path separator, it should always return ENOENT (unless you
> created $SCRATCH_MNT/test-453/urk/moo.txt).  0x2f is the ascii encoding
> for a slash.

Hmmm, makes sense. Maybe that is why we are using \xc0\xaf instead of \x2f.

* Re: [PATCH] generic/453: Exclude filenames that are not supported by exfat
From: Theodore Ts'o @ 2021-04-28 13:50 UTC
To: Darrick J. Wong
Cc: Shreeya Patel, Matthew Wilcox, fstests, linux-fsdevel, krisman, preichl, kernel

On Tue, Apr 27, 2021 at 11:11:16AM -0700, Darrick J. Wong wrote:
>
> TBH I think these tests (g/453 and g/454) are probably only useful for
> filesystems that allow unrestricted byte streams for names.

I'm actually a little puzzled about why these tests should exist:

# Create a directory with multiple filenames that all appear the same
# (in unicode, anyway) but point to different inodes.  In theory all
# Linux filesystems should allow this (filenames are a sequence of
# arbitrary bytes) even if the user implications are horrifying.

Why do we care about testing this?  The assertion "In theory all
Linux filesystems should allow this" is clearly not true --- if you
enable unicode support for ext4 or f2fs, this will no longer be true,
and this is considered by some a _feature_ not a bug --- precisely
_because_ the user implications are horrifying.

So why do these tests exist?  Darrick, I see you added them in 2017
to test whether or not xfs_scrub will warn about confusable names, if
_check_xfs_scrub_does_unicode is true.  So we already understand that
it's possible for a file system checker to complain that these file
names are bad.

It's not at all clear to me that asserting that all Linux file systems
_must_ treat file names as "bag of bits" and not apply any kind of
unicode normalization or strict unicode validation is a valid thing to
test for in 2021.

					- Ted

* Re: [PATCH] generic/453: Exclude filenames that are not supported by exfat
From: Darrick J. Wong @ 2021-04-29 0:37 UTC
To: Theodore Ts'o
Cc: Shreeya Patel, Matthew Wilcox, fstests, linux-fsdevel, krisman, preichl, kernel

On Wed, Apr 28, 2021 at 09:50:56AM -0400, Theodore Ts'o wrote:
> I'm actually a little puzzled about why these tests should exist:
>
> # Create a directory with multiple filenames that all appear the same
> # (in unicode, anyway) but point to different inodes.  In theory all
> # Linux filesystems should allow this (filenames are a sequence of
> # arbitrary bytes) even if the user implications are horrifying.
>
> Why do we care about testing this?  The assertion "In theory all
> Linux filesystems should allow this" is clearly not true --- if you
> enable unicode support for ext4 or f2fs, this will no longer be true,
> and this is considered by some a _feature_ not a bug --- precisely
> _because_ the user implications are horrifying.
>
> So why do these tests exist?  Darrick, I see you added them in 2017
> to test whether or not xfs_scrub will warn about confusable names, if
> _check_xfs_scrub_does_unicode is true.  So we already understand that
> it's possible for a file system checker to complain that these file
> names are bad.

Yes, that's exactly why this test (and generic/454) were created -- as a
functional test for xfs_scrub's unicode checking.

> It's not at all clear to me that asserting that all Linux file systems
> _must_ treat file names as "bag of bits" and not apply any kind of
> unicode normalization or strict unicode validation is a valid thing to
> test for in 2021.

Perhaps not.  These two tests do have the interesting side effect of
catching filesystems that don't hew to the "names are bytestreams"
philosophy.  In 2017, fstests usage seemed like it pretty narrowly
included only the big three filesystems, so it amuses me to no end that
four years went by before this discussion started. :P

Nowadays with wider testing of other filesystems (thanks, Red Hat!) we
should hide these behind _require_names_are_bytes or move them to
tests/xfs/.

Question -- the unicode case folding doesn't apply to xattr names,
right?

--D

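For readers wondering what a guard like the proposed _require_names_are_bytes might look like, here is a rough, self-contained sketch. Everything in it is an assumption made for illustration: the function names `names_are_bytes` and `skip_test` are hypothetical stand-ins (they are not fstests helpers), and the filesystem list contains only the types reported in this thread as rejecting or mangling some byte sequences. A real helper would also have to account for ext4 or f2fs with unicode support enabled, as noted above.

```shell
# Hypothetical check: does this filesystem type treat names as bytes?
# The deny list is illustrative only, taken from results in this thread.
names_are_bytes() {
    case "$1" in
    exfat|vfat|udf)
        return 1 ;;
    *)
        return 0 ;;
    esac
}

# Stand-in for a test harness "skip" mechanism; here it only prints.
skip_test() {
    echo "skipped: $*"
}

# FSTYP defaults to exfat here purely so the sketch has something to do.
FSTYP=${FSTYP:-exfat}
if ! names_are_bytes "$FSTYP"; then
    skip_test "$FSTYP does not treat file names as arbitrary bytes"
fi
```

Compared with the per-name `if [ "$FSTYP" != "exfat" ]` blocks in the patch, a single guard at the top skips the whole test on restricted filesystems, which matches the suggestion that these tests are only meaningful where names are unrestricted byte streams.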
* Re: [PATCH] generic/453: Exclude filenames that are not supported by exfat
From: Gabriel Krisman Bertazi @ 2021-04-29 14:32 UTC
To: Darrick J. Wong
Cc: Theodore Ts'o, Shreeya Patel, Matthew Wilcox, fstests, linux-fsdevel, preichl, kernel

"Darrick J. Wong" <djwong@kernel.org> writes:

> Nowadays with wider testing of other filesystems (thanks, Red Hat!) we
> should hide these behind _require_names_are_bytes or move them to
> tests/xfs/.
>
> Question -- the unicode case folding doesn't apply to xattr names,
> right?

No, it doesn't apply to xattr names in ext4 and f2fs.

-- 
Gabriel Krisman Bertazi