linux-kernel.vger.kernel.org archive mirror
* CIFS: Rename bug on servers not supporting inode numbers
@ 2011-11-03 15:20 Anton Altaparmakov
  2011-11-03 15:42 ` Anton Altaparmakov
  2011-11-04 11:16 ` Björn JACKE
  0 siblings, 2 replies; 14+ messages in thread
From: Anton Altaparmakov @ 2011-11-03 15:20 UTC (permalink / raw)
  To: Steve French; +Cc: linux-cifs, samba-technical, LKML, Unix Support

Hi,

Our CIFS server problems seem to have no end…  The Novell CIFS server does not support server inode numbers (when I try the mount option, I get a message saying it is being turned off because the server does not support it).  As a result, each inode gets a different number each time it is accessed, and a different number again for each readdir call.

The fun happens with rename() when the rename source and target only differ in case, e.g.

	touch foo
	mv foo Foo

The result?  Because of the difference in inode numbers, the request gets through to the CIFS module which promptly does:

	cifs_unlink(target_dir, target_dentry)
	rc = cifs_do_rename(…)

And because the cifs_unlink() just removed the source of the rename (as it is the same as the target), "rc" comes back as -ENOENT.

And indeed the file is gone so we just lost the user's file for ever.  )-:
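
To make the failure mode concrete, here is a minimal user-space sketch. It is not the CIFS code: the toy_* names and the in-memory directory are invented for illustration, the case folding is ASCII-only, and the guarded variant is just one possible idea, not a proposed patch.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_FILES 8
#define TOY_NAME_LEN  32

/* Toy model of a case-insensitive, case-preserving server directory. */
struct toy_dir {
    char names[TOY_MAX_FILES][TOY_NAME_LEN];
    int  used[TOY_MAX_FILES];
};

/* ASCII-only case folding; a real server uses richer rules. */
int toy_name_eq(const char *a, const char *b)
{
    for (; *a && *b; a++, b++)
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
    return *a == *b;
}

/* Return the slot holding 'name' (compared case-insensitively), or -1. */
int toy_lookup(const struct toy_dir *d, const char *name)
{
    assert(d != NULL && name != NULL);
    for (int i = 0; i < TOY_MAX_FILES; i++)
        if (d->used[i] && toy_name_eq(d->names[i], name))
            return i;
    return -1;
}

int toy_create(struct toy_dir *d, const char *name)
{
    if (toy_lookup(d, name) >= 0)
        return -EEXIST;
    for (int i = 0; i < TOY_MAX_FILES; i++) {
        if (!d->used[i]) {
            snprintf(d->names[i], TOY_NAME_LEN, "%s", name);
            d->used[i] = 1;
            return 0;
        }
    }
    return -ENOSPC;
}

int toy_unlink(struct toy_dir *d, const char *name)
{
    int i = toy_lookup(d, name);
    if (i < 0)
        return -ENOENT;
    d->used[i] = 0;
    return 0;
}

/* Server-side rename: fails if the source is missing or another file owns 'to'. */
int toy_server_rename(struct toy_dir *d, const char *from, const char *to)
{
    int i = toy_lookup(d, from);
    int j = toy_lookup(d, to);
    if (i < 0)
        return -ENOENT;
    if (j >= 0 && j != i)
        return -EEXIST;
    snprintf(d->names[i], TOY_NAME_LEN, "%s", to);
    return 0;
}

/* The sequence described above: unlink the target first, then rename. */
int toy_rename_unlink_first(struct toy_dir *d, const char *from, const char *to)
{
    toy_unlink(d, to);            /* "Foo" resolves to "foo" on this server */
    return toy_server_rename(d, from, to);
}

/* One possible guard (an assumption, not a proposed patch): skip the unlink
 * when the two names differ only in case. */
int toy_rename_guarded(struct toy_dir *d, const char *from, const char *to)
{
    if (!toy_name_eq(from, to))
        toy_unlink(d, to);
    return toy_server_rename(d, from, to);
}
```

With a directory that contains only "foo", toy_rename_unlink_first(&d, "foo", "Foo") returns -ENOENT and leaves the directory empty, while toy_rename_guarded() keeps the file and stores it as "Foo".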

We are tossing around ideas on how to fix this, but we would be interested in your input on what you think the fix should be.

In any case this probably should be fixed in the standard kernel CIFS module, too, and not just for us locally as this presumably affects anyone who is using the CIFS module against case-insensitive, non-server-inode-number-supporting CIFS servers...

Best regards,

	Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/



* Re: CIFS: Rename bug on servers not supporting inode numbers
  2011-11-03 15:20 CIFS: Rename bug on servers not supporting inode numbers Anton Altaparmakov
@ 2011-11-03 15:42 ` Anton Altaparmakov
  2011-11-03 17:40   ` Jeff Layton
  2011-11-03 18:40   ` Shirish Pargaonkar
  2011-11-04 11:16 ` Björn JACKE
  1 sibling, 2 replies; 14+ messages in thread
From: Anton Altaparmakov @ 2011-11-03 15:42 UTC (permalink / raw)
  To: Steve French; +Cc: linux-cifs, samba-technical, LKML, Unix Support

Hi,

I should add that we are using the iocharset=utf8 mount option, which means that the dcache hash/compare functions in the cifs module do not work: they use nls_tolower() and nls_strnicmp(), both of which do nothing at all for the utf8 NLS in the kernel and so effectively behave case-sensitively!

Thus this bug/problem in all likelihood only affects utf8 iocharset users on a case-insensitive but case-preserving CIFS server that does not support server inode numbers.
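
As a rough illustration of why an identity case table makes the comparison case-sensitive, here is a small user-space sketch. The toy_* names and the struct are hypothetical stand-ins, not the kernel's NLS definitions, and the "folding" table only handles ASCII.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a per-charset lower-casing table. */
struct toy_nls_table {
    unsigned char charset2lower[256];
};

/* Identity table: every byte maps to itself, as described for utf8 above. */
void toy_fill_identity(struct toy_nls_table *t)
{
    for (int i = 0; i < 256; i++)
        t->charset2lower[i] = (unsigned char)i;
}

/* Table that folds ASCII A-Z to a-z, as a single-byte charset could. */
void toy_fill_ascii_lower(struct toy_nls_table *t)
{
    toy_fill_identity(t);
    for (int c = 'A'; c <= 'Z'; c++)
        t->charset2lower[c] = (unsigned char)(c - 'A' + 'a');
}

unsigned char toy_tolower(const struct toy_nls_table *t, unsigned char c)
{
    return t->charset2lower[c];
}

/* Returns 0 if the first 'len' bytes match after case folding, 1 otherwise. */
int toy_strnicmp(const struct toy_nls_table *t,
                 const unsigned char *s1, const unsigned char *s2, size_t len)
{
    assert(t != NULL);
    while (len--)
        if (toy_tolower(t, *s1++) != toy_tolower(t, *s2++))
            return 1;
    return 0;
}
```

With the identity table, "foo" and "Foo" compare as different, so a dcache built on such a comparison treats them as two separate names even though the server sees one file.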

That probably explains why it has not been noticed before!

We need utf8, so we still need to fix this issue.

Best regards,

	Anton

On 3 Nov 2011, at 15:20, Anton Altaparmakov wrote:

> Hi,
> 
> Our CIFS server problems seem to have no end…  The Novell CIFS server does not support server inode numbers (when I try the mount option I get the message it is being turned off as server does not support it) and thus each inode gets a different number each time it is accessed and it gets a different number again for each readdir call.
> 
> The fun happens with rename() when the rename source and target only differ in case, e.g.
> 
> 	touch foo
> 	mv foo Foo
> 
> The result?  Because of the difference in inode numbers, the request gets through to the CIFS module which promptly does:
> 
> 	cifs_unlink(target_dir, target_dentry)
> 	rc = cifs_do_rename(…)
> 
> And because the cifs_unlink() just removed the source of the rename (as it is the same as the target), "rc" comes back as -ENOENT.
> 
> And indeed the file is gone so we just lost the user's file for ever.  )-:
> 
> We are tossing around ideas how to fix this but we would be interested in your input as to what you think the fix should be.
> 
> In any case this probably should be fixed in the standard kernel CIFS module, too, and not just for us locally as this presumably affects anyone who is using the CIFS module against case-insensitive, non-server-inode-number-supporting CIFS servers...
> 
> Best regards,
> 
> 	Anton

-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/



* Re: CIFS: Rename bug on servers not supporting inode numbers
  2011-11-03 15:42 ` Anton Altaparmakov
@ 2011-11-03 17:40   ` Jeff Layton
  2011-11-03 23:25     ` Anton Altaparmakov
  2011-11-03 18:40   ` Shirish Pargaonkar
  1 sibling, 1 reply; 14+ messages in thread
From: Jeff Layton @ 2011-11-03 17:40 UTC (permalink / raw)
  To: Anton Altaparmakov
  Cc: Steve French, linux-cifs, samba-technical, LKML, Unix Support

On Thu, 3 Nov 2011 15:42:13 +0000
Anton Altaparmakov <aia21@cam.ac.uk> wrote:

> Hi,
> 
> I should add that we are using iocharset=utf8 mount option which means that the dcache hash/compare functions done in the cifs module do not work because it uses nls_tolower() and nls_strnicmp() both of which for utf8 NLS in the kernel do not do anything at all and effectively behave case sensitively!
> 
> Thus this bug/problem in all likelyhood only affects utf8 iocharset users on a case-insensitive but case-preserving CIFS server that does not support server inode numbers.
> 
> That probably explains why it has not been noticed before!
> 
> We need utf8 thus we still need to fix this issue.
> 
> Best regards,
> 
> 	Anton
> 

I'm confused...

If the filesystem being served out by the server is using utf8, then
how is it handling the case-insensitivity?

-- 
Jeff Layton <jlayton@redhat.com>


* Re: CIFS: Rename bug on servers not supporting inode numbers
  2011-11-03 15:42 ` Anton Altaparmakov
  2011-11-03 17:40   ` Jeff Layton
@ 2011-11-03 18:40   ` Shirish Pargaonkar
  1 sibling, 0 replies; 14+ messages in thread
From: Shirish Pargaonkar @ 2011-11-03 18:40 UTC (permalink / raw)
  To: Anton Altaparmakov
  Cc: Steve French, linux-cifs, samba-technical, LKML, Unix Support

On Thu, Nov 3, 2011 at 10:42 AM, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
> Hi,
>
> I should add that we are using iocharset=utf8 mount option which means that the dcache hash/compare functions done in the cifs module do not work because it uses nls_tolower() and nls_strnicmp() both of which for utf8 NLS in the kernel do not do anything at all and effectively behave case sensitively!
>
> Thus this bug/problem in all likelyhood only affects utf8 iocharset users on a case-insensitive but case-preserving CIFS server that does not support server inode numbers.

I tried iocharset=iso8859-1, and the nls_tolower/charset2lower and
nls_toupper/charset2upper functions work just fine.
But not for iocharset=utf8.

>
> That probably explains why it has not been noticed before!
>
> We need utf8 thus we still need to fix this issue.
>
> Best regards,
>
>        Anton
>
> On 3 Nov 2011, at 15:20, Anton Altaparmakov wrote:
>
>> Hi,
>>
>> Our CIFS server problems seem to have no end…  The Novell CIFS server does not support server inode numbers (when I try the mount option I get the message it is being turned off as server does not support it) and thus each inode gets a different number each time it is accessed and it gets a different number again for each readdir call.
>>
>> The fun happens with rename() when the rename source and target only differ in case, e.g.
>>
>>       touch foo
>>       mv foo Foo
>>
>> The result?  Because of the difference in inode numbers, the request gets through to the CIFS module which promptly does:
>>
>>       cifs_unlink(target_dir, target_dentry)
>>       rc = cifs_do_rename(…)
>>
>> And because the cifs_unlink() just removed the source of the rename (as it is the same as the target), "rc" comes back as -ENOENT.
>>
>> And indeed the file is gone so we just lost the user's file for ever.  )-:
>>
>> We are tossing around ideas how to fix this but we would be interested in your input as to what you think the fix should be.
>>
>> In any case this probably should be fixed in the standard kernel CIFS module, too, and not just for us locally as this presumably affects anyone who is using the CIFS module against case-insensitive, non-server-inode-number-supporting CIFS servers...
>>
>> Best regards,
>>
>>       Anton
>
> --
> Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
> Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
> Linux NTFS maintainer, http://www.linux-ntfs.org/
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


* Re: CIFS: Rename bug on servers not supporting inode numbers
  2011-11-03 17:40   ` Jeff Layton
@ 2011-11-03 23:25     ` Anton Altaparmakov
  2011-11-03 23:34       ` Steve French
  0 siblings, 1 reply; 14+ messages in thread
From: Anton Altaparmakov @ 2011-11-03 23:25 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Steve French, linux-cifs, samba-technical, LKML, Unix Support

Hi,

On 3 Nov 2011, at 17:40, Jeff Layton wrote:
> On Thu, 3 Nov 2011 15:42:13 +0000 Anton Altaparmakov <aia21@cam.ac.uk> wrote:
>> 
>> I should add that we are using iocharset=utf8 mount option which means that the dcache hash/compare functions done in the cifs module do not work because it uses nls_tolower() and nls_strnicmp() both of which for utf8 NLS in the kernel do not do anything at all and effectively behave case sensitively!
>> 
>> Thus this bug/problem in all likelyhood only affects utf8 iocharset users on a case-insensitive but case-preserving CIFS server that does not support server inode numbers.
>> 
>> That probably explains why it has not been noticed before!
>> 
>> We need utf8 thus we still need to fix this issue.

> I'm confused...
> 
> If the filesystem being served out by the server is using utf8, then
> how is it handling the case-insensitivity?


The file system being served is NSS (the Netware one, but now mounted on Open Enterprise Server with a Linux kernel rather than an actual Netware kernel).  I have no idea how it works, I am afraid.  It supports lots of different namespaces and is case-insensitive and case-preserving when using the LONG name space (which is the one now being served through CIFS).

If it were NTFS or exFAT I could tell you exactly how they work: each volume has an upcase table mapping the 65536 UCS-2 Unicode characters to their upper case equivalents, and each 16-bit character is upper-cased individually.  More recently Windows has switched to using UTF-16 instead of UCS-2, and the upcase table changed when that happened, though it remained the same size.  I think that for file system purposes the fact that there are surrogates in the above UCS-2 Unicode range is simply ignored...
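
The per-unit upcase comparison can be sketched as follows. This is only an illustration: the toy_* names are invented, and the table is filled with a small hand-picked subset of mappings (ASCII and Latin-1) rather than anything read from a real volume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per 16-bit code unit, 65536 entries in total. */
static uint16_t toy_upcase[65536];

/* Fill the table: identity everywhere, then a small hand-picked subset of
 * mappings (ASCII and Latin-1 only; this subset is an assumption). */
void toy_upcase_init(void)
{
    for (uint32_t i = 0; i < 65536; i++)
        toy_upcase[i] = (uint16_t)i;
    for (uint16_t c = 'a'; c <= 'z'; c++)
        toy_upcase[c] = (uint16_t)(c - 0x20);
    for (uint16_t c = 0x00E0; c <= 0x00FE; c++)
        if (c != 0x00F7)                 /* U+00F7 is the division sign */
            toy_upcase[c] = (uint16_t)(c - 0x20);
}

/* Compare two names one 16-bit unit at a time; surrogates get no special
 * treatment and simply map to themselves. */
int toy_names_equal(const uint16_t *a, size_t alen,
                    const uint16_t *b, size_t blen)
{
    assert(a != NULL && b != NULL);
    if (alen != blen)
        return 0;
    for (size_t i = 0; i < alen; i++)
        if (toy_upcase[a[i]] != toy_upcase[b[i]])
            return 0;
    return 1;
}
```

Because every unit is looked up on its own, the two halves of a surrogate pair are compared as two opaque 16-bit values, which matches the "simply ignored" behaviour guessed at above.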

Best regards,

	Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/



* Re: CIFS: Rename bug on servers not supporting inode numbers
  2011-11-03 23:25     ` Anton Altaparmakov
@ 2011-11-03 23:34       ` Steve French
  2011-11-03 23:37         ` NamJae Jeon
  0 siblings, 1 reply; 14+ messages in thread
From: Steve French @ 2011-11-03 23:34 UTC (permalink / raw)
  To: Anton Altaparmakov
  Cc: Jeff Layton, Steve French, linux-cifs, samba-technical, LKML,
	Unix Support

What is the actual sequence of events from the wire perspective (the
actual smb requests sent)?


On Thu, Nov 3, 2011 at 6:25 PM, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
> Hi,
>
> On 3 Nov 2011, at 17:40, Jeff Layton wrote:
>> On Thu, 3 Nov 2011 15:42:13 +0000 Anton Altaparmakov <aia21@cam.ac.uk> wrote:
>>>
>>> I should add that we are using iocharset=utf8 mount option which means that the dcache hash/compare functions done in the cifs module do not work because it uses nls_tolower() and nls_strnicmp() both of which for utf8 NLS in the kernel do not do anything at all and effectively behave case sensitively!
>>>
>>> Thus this bug/problem in all likelyhood only affects utf8 iocharset users on a case-insensitive but case-preserving CIFS server that does not support server inode numbers.
>>>
>>> That probably explains why it has not been noticed before!
>>>
>>> We need utf8 thus we still need to fix this issue.
>
>> I'm confused...
>>
>> If the filesystem being served out by the server is using utf8, then
>> how is it handling the case-insensitivity?
>
>
> The file system being served is NSS (the Netware one but now mounted on Open Enterprise Server with Linux kernel rather than actual Netware kernel).  No idea how it works I am afraid.  It supports lots of different namespaces as well as being case-insensitive and case preserving when using the LONG name space (which is now being served through CIFS).
>
> If it was NTFS or exFAT I could tell you exactly how they work (each volume has an upcase table mapping the 65536 UCS-2 Unicode characters to their upper case equivalents and each 16-bit character is upper-cased individually, more recently Windows has switched to using UTF-16 instead of UCS-2 and the upcase table changed when that happened though it remained the same size and I think for file system purposes the fact that there are surrogates in the above UCS-2   Unicode range is simply ignored)...
>
> Best regards,
>
>        Anton
> --
> Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
> Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
> Linux NTFS maintainer, http://www.linux-ntfs.org/
>
>



-- 
Thanks,

Steve


* Re: CIFS: Rename bug on servers not supporting inode numbers
  2011-11-03 23:34       ` Steve French
@ 2011-11-03 23:37         ` NamJae Jeon
  2011-11-23 10:34           ` NamJae Jeon
  0 siblings, 1 reply; 14+ messages in thread
From: NamJae Jeon @ 2011-11-03 23:37 UTC (permalink / raw)
  To: Steve French, akpm, Anton Altaparmakov
  Cc: Jeff Layton, Steve French, linux-cifs, samba-technical, LKML,
	Unix Support

2011/11/4 Steve French <smfrench@gmail.com>:
> What is the actual sequence of events from the wire perspective (the
> actual smb requests sent)?
>
>
> On Thu, Nov 3, 2011 at 6:25 PM, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
>> Hi,
>>
>> On 3 Nov 2011, at 17:40, Jeff Layton wrote:
>>> On Thu, 3 Nov 2011 15:42:13 +0000 Anton Altaparmakov <aia21@cam.ac.uk> wrote:
>>>>
>>>> I should add that we are using iocharset=utf8 mount option which means that the dcache hash/compare functions done in the cifs module do not work because it uses nls_tolower() and nls_strnicmp() both of which for utf8 NLS in the kernel do not do anything at all and effectively behave case sensitively!
>>>>
>>>> Thus this bug/problem in all likelyhood only affects utf8 iocharset users on a case-insensitive but case-preserving CIFS server that does not support server inode numbers.

Hi.
There is no upper/lower case table in the utf8 NLS, so if you use iocharset=utf8,
the filesystem will be case-sensitive.
We could add an upper/lower case table, as for the other charsets.
Also, surrogate pairs currently do not work in the utf8 NLS,
because it is limited by MAX_WCHAR_T.
I think that both the upper/lower case table and surrogate pair support should
be fixed in the utf8 NLS.
I would like to know Andrew's opinion on fixing these problems.
>>>>
>>>> That probably explains why it has not been noticed before!
>>>>
>>>> We need utf8 thus we still need to fix this issue.
>>
>>> I'm confused...
>>>
>>> If the filesystem being served out by the server is using utf8, then
>>> how is it handling the case-insensitivity?
>>
>>
>> The file system being served is NSS (the Netware one but now mounted on Open Enterprise Server with Linux kernel rather than actual Netware kernel).  No idea how it works I am afraid.  It supports lots of different namespaces as well as being case-insensitive and case preserving when using the LONG name space (which is now being served through CIFS).
>>
>> If it was NTFS or exFAT I could tell you exactly how they work (each volume has an upcase table mapping the 65536 UCS-2 Unicode characters to their upper case equivalents and each 16-bit character is upper-cased individually, more recently Windows has switched to using UTF-16 instead of UCS-2 and the upcase table changed when that happened though it remained the same size and I think for file system purposes the fact that there are surrogates in the above UCS-2   Unicode range is simply ignored)...
>>
>> Best regards,
>>
>>        Anton
>> --
>> Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
>> Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
>> Linux NTFS maintainer, http://www.linux-ntfs.org/
>>
>>
>
>
>
> --
> Thanks,
>
> Steve
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>


* Re: CIFS: Rename bug on servers not supporting inode numbers
  2011-11-03 15:20 CIFS: Rename bug on servers not supporting inode numbers Anton Altaparmakov
  2011-11-03 15:42 ` Anton Altaparmakov
@ 2011-11-04 11:16 ` Björn JACKE
  1 sibling, 0 replies; 14+ messages in thread
From: Björn JACKE @ 2011-11-04 11:16 UTC (permalink / raw)
  To: Anton Altaparmakov
  Cc: Steve French, linux-cifs, samba-technical, LKML, Unix Support,
	linux-fsdevel, xfs

On 2011-11-03 at 15:20 +0000 Anton Altaparmakov sent off:
> Hi,
> 
> Our CIFS server problems seem to have no end…  The Novell CIFS server does not support server inode numbers (when I try the mount option I get the message it is being turned off as server does not support it) and thus each inode gets a different number each time it is accessed and it gets a different number again for each readdir call.
> 
> The fun happens with rename() when the rename source and target only differ in case, e.g.
> 
> 	touch foo
> 	mv foo Foo

This seems somewhat related to https://bugzilla.kernel.org/show_bug.cgi?id=39512
(as long as the kernel bugzilla is dead, see
http://www.linux.sgi.com/archives/xfs-masters/2011-07/msg00022.html )

Case-insensitive filesystems seem to be a problem in general on Linux. I am not sure
how far the kernel and/or glibc are involved in the problem. As a workaround for
the mess, you need to do a temporary rename to a different name (not just a
case-equivalent name) and then rename to the final name.
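
A minimal sketch of that two-step workaround in user space, using only the standard rename() call. The function name and the ".rename-tmp" suffix are made up for illustration; the sketch is not atomic and would clobber an existing file that happens to have the temporary name.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define PATH_BUF 4096

/* Build "dir/<pre><name><suf>" into out; returns -1 if it does not fit. */
static int build_path(char *out, const char *dir, const char *pre,
                      const char *name, const char *suf)
{
    int n = snprintf(out, PATH_BUF, "%s/%s%s%s", dir, pre, name, suf);
    return (n < 0 || n >= PATH_BUF) ? -1 : 0;
}

/* Rename dir/from to dir/to by way of an unrelated temporary name. */
int rename_via_temp(const char *dir, const char *from, const char *to)
{
    char src[PATH_BUF], tmp[PATH_BUF], dst[PATH_BUF];

    assert(dir != NULL && from != NULL && to != NULL);
    if (build_path(src, dir, "", from, "") ||
        build_path(tmp, dir, ".", from, ".rename-tmp") ||
        build_path(dst, dir, "", to, "")) {
        errno = ENAMETOOLONG;
        return -1;
    }
    if (rename(src, tmp) != 0)      /* step 1: move to an unrelated name */
        return -1;
    if (rename(tmp, dst) != 0) {    /* step 2: move to the final name */
        int saved = errno;
        rename(tmp, src);           /* best-effort rollback */
        errno = saved;
        return -1;
    }
    return 0;
}
```

For the example from the first message this would be called as rename_via_temp(".", "foo", "Foo"); neither step ever asks for a rename whose source and target differ only in case.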

Björn
-- 
SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
phone: +49-551-370000-0, fax: +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen


* Re: CIFS: Rename bug on servers not supporting inode numbers
  2011-11-03 23:37         ` NamJae Jeon
@ 2011-11-23 10:34           ` NamJae Jeon
  2011-11-23 16:31             ` Alan Stern
  0 siblings, 1 reply; 14+ messages in thread
From: NamJae Jeon @ 2011-11-23 10:34 UTC (permalink / raw)
  To: stern
  Cc: Jeff Layton, Steve French, linux-cifs, samba-technical, LKML,
	Unix Support, Steve French, akpm, Anton Altaparmakov,
	ashishsangwan2

Hi Alan,
Do you know why there is no upper/lower case table in the utf8 NLS?
Surrogate pairs are currently not supported in the utf8 NLS either. Is
there a reason for that?

2011/11/4 NamJae Jeon <linkinjeon@gmail.com>:
> 2011/11/4 Steve French <smfrench@gmail.com>:
>> What is the actual sequence of events from the wire perspective (the
>> actual smb requests sent)?
>>
>>
>> On Thu, Nov 3, 2011 at 6:25 PM, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
>>> Hi,
>>>
>>> On 3 Nov 2011, at 17:40, Jeff Layton wrote:
>>>> On Thu, 3 Nov 2011 15:42:13 +0000 Anton Altaparmakov <aia21@cam.ac.uk> wrote:
>>>>>
>>>>> I should add that we are using iocharset=utf8 mount option which means that the dcache hash/compare functions done in the cifs module do not work because it uses nls_tolower() and nls_strnicmp() both of which for utf8 NLS in the kernel do not do anything at all and effectively behave case sensitively!
>>>>>
>>>>> Thus this bug/problem in all likelyhood only affects utf8 iocharset users on a case-insensitive but case-preserving CIFS server that does not support server inode numbers.
>
> Hi.
> There is no upper/lower case table on nls utf8. so If you use iocharset=utf8,
> filesystem will be case sensitive.
> so we can add upper/lower case table like other charset.
> And Currently surrogate pair is not working on nls utf8.
> because it is limited by MAX_WCHAR_T in nls utf8
> I think that upper/lower case table and surrogate pair support should
> be fixed on nls utf8.
> I should know Andrew's opinion to fix these problem.
>>>>>
>>>>> That probably explains why it has not been noticed before!
>>>>>
>>>>> We need utf8 thus we still need to fix this issue.
>>>
>>>> I'm confused...
>>>>
>>>> If the filesystem being served out by the server is using utf8, then
>>>> how is it handling the case-insensitivity?
>>>
>>>
>>> The file system being served is NSS (the Netware one but now mounted on Open Enterprise Server with Linux kernel rather than actual Netware kernel).  No idea how it works I am afraid.  It supports lots of different namespaces as well as being case-insensitive and case preserving when using the LONG name space (which is now being served through CIFS).
>>>
>>> If it was NTFS or exFAT I could tell you exactly how they work (each volume has an upcase table mapping the 65536 UCS-2 Unicode characters to their upper case equivalents and each 16-bit character is upper-cased individually, more recently Windows has switched to using UTF-16 instead of UCS-2 and the upcase table changed when that happened though it remained the same size and I think for file system purposes the fact that there are surrogates in the above UCS-2   Unicode range is simply ignored)...
>>>
>>> Best regards,
>>>
>>>        Anton
>>> --
>>> Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
>>> Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
>>> Linux NTFS maintainer, http://www.linux-ntfs.org/
>>>
>>>
>>
>>
>>
>> --
>> Thanks,
>>
>> Steve
>


* Re: CIFS: Rename bug on servers not supporting inode numbers
  2011-11-23 10:34           ` NamJae Jeon
@ 2011-11-23 16:31             ` Alan Stern
  2011-11-23 17:12               ` Alan Cox
  0 siblings, 1 reply; 14+ messages in thread
From: Alan Stern @ 2011-11-23 16:31 UTC (permalink / raw)
  To: NamJae Jeon
  Cc: Jeff Layton, Steve French, linux-cifs, samba-technical, LKML,
	Unix Support, Steve French, akpm, Anton Altaparmakov,
	ashishsangwan2

On Wed, 23 Nov 2011, NamJae Jeon wrote:

> Hi. Alan.
> Would you know why there is no upper/lower case table in nls utf8 ?
> And Currently Surrogate pair is not supported also in nls utf8. Is
> there the reason ?

I don't know.

Alan Stern



* Re: CIFS: Rename bug on servers not supporting inode numbers
  2011-11-23 16:31             ` Alan Stern
@ 2011-11-23 17:12               ` Alan Cox
  2011-11-23 18:00                 ` Amit Sahrawat
  0 siblings, 1 reply; 14+ messages in thread
From: Alan Cox @ 2011-11-23 17:12 UTC (permalink / raw)
  To: Alan Stern
  Cc: NamJae Jeon, Jeff Layton, Steve French, linux-cifs,
	samba-technical, LKML, Unix Support, Steve French, akpm,
	Anton Altaparmakov, ashishsangwan2

On Wed, 23 Nov 2011 11:31:47 -0500 (EST)
Alan Stern <stern@rowland.harvard.edu> wrote:

> On Wed, 23 Nov 2011, NamJae Jeon wrote:
> 
> > Hi. Alan.
> > Would you know why there is no upper/lower case table in nls utf8 ?
> > And Currently Surrogate pair is not supported also in nls utf8. Is
> > there the reason ?
> 
> I don't know.

For one, case translations are locale-specific and very, very complicated.
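
One well-known example of the locale dependence is the letter "i": in most locales its upper case is "I", but under Turkish rules it is U+0130 (capital I with dot above), and the lower case of "I" is U+0131 (dotless i). A tiny sketch, with invented function names and handling only ASCII letters plus this one special case:

```c
#include <assert.h>
#include <stdint.h>

/* Upper-case one code point. 'turkic' selects Turkish-style rules.
 * Only ASCII letters are handled; everything else is returned unchanged. */
uint32_t toy_cp_toupper(uint32_t cp, int turkic)
{
    assert(cp <= 0x10FFFF);
    if (turkic && cp == 'i')
        return 0x0130;          /* LATIN CAPITAL LETTER I WITH DOT ABOVE */
    if (cp >= 'a' && cp <= 'z')
        return cp - 0x20;
    return cp;
}

/* Lower-case one code point under the same simplifications. */
uint32_t toy_cp_tolower(uint32_t cp, int turkic)
{
    assert(cp <= 0x10FFFF);
    if (turkic && cp == 'I')
        return 0x0131;          /* LATIN SMALL LETTER DOTLESS I */
    if (cp >= 'A' && cp <= 'Z')
        return cp + 0x20;
    return cp;
}
```

A single fixed table cannot give both answers at once, and mappings such as German sharp s to "SS" do not even fit a one-character-to-one-character table.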


* Re: CIFS: Rename bug on servers not supporting inode numbers
  2011-11-23 17:12               ` Alan Cox
@ 2011-11-23 18:00                 ` Amit Sahrawat
  2011-11-24  5:08                   ` Günter Kukkukk
  0 siblings, 1 reply; 14+ messages in thread
From: Amit Sahrawat @ 2011-11-23 18:00 UTC (permalink / raw)
  To: Alan Cox
  Cc: Alan Stern, NamJae Jeon, Jeff Layton, Steve French, linux-cifs,
	samba-technical, LKML, Unix Support, Steve French, akpm,
	Anton Altaparmakov, ashishsangwan2

Hi Alan,
OK, case translations cannot be added easily. But do you have any idea why surrogate
pairs are not handled? I think handling for surrogate pairs can be
added by identifying the proper points (there are not many, I guess). Please
share your views.

Regards,
Amit Sahrawat

On Wed, Nov 23, 2011 at 10:42 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> On Wed, 23 Nov 2011 11:31:47 -0500 (EST)
> Alan Stern <stern@rowland.harvard.edu> wrote:
>
>> On Wed, 23 Nov 2011, NamJae Jeon wrote:
>>
>> > Hi. Alan.
>> > Would you know why there is no upper/lower case table in nls utf8 ?
>> > And Currently Surrogate pair is not supported also in nls utf8. Is
>> > there the reason ?
>>
>> I don't know.
>
> For one case translations are locale specific and very very complicated.


* Re: CIFS: Rename bug on servers not supporting inode numbers
  2011-11-23 18:00                 ` Amit Sahrawat
@ 2011-11-24  5:08                   ` Günter Kukkukk
  2011-11-28  7:56                     ` Ashish Sangwan
  0 siblings, 1 reply; 14+ messages in thread
From: Günter Kukkukk @ 2011-11-24  5:08 UTC (permalink / raw)
  To: samba-technical
  Cc: Amit Sahrawat, Alan Cox, linux-cifs, NamJae Jeon, Jeff Layton,
	LKML, Steve French, Steve French, Alan Stern, ashishsangwan2,
	akpm, Anton Altaparmakov, Unix Support

On Wednesday 23 November 2011 19:00:16 Amit Sahrawat wrote:
> Hi Alan,
> Ok, translations cannot be added easily. But any idea why surrogate
> pairs are not handled? I think handling for surrogate pairs can be
> added by identifying proper points(there are not many I guess). Please
> share your views.
> 
> Regards,
> Amit Sahrawat
> 
> On Wed, Nov 23, 2011 at 10:42 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> > On Wed, 23 Nov 2011 11:31:47 -0500 (EST)
> > 
> > Alan Stern <stern@rowland.harvard.edu> wrote:
> >> On Wed, 23 Nov 2011, NamJae Jeon wrote:
> >> > Hi. Alan.
> >> > Would you know why there is no upper/lower case table in nls utf8 ?
> >> > And Currently Surrogate pair is not supported also in nls utf8. Is
> >> > there the reason ?
> >> 
> >> I don't know.
> > 
> > For one case translations are locale specific and very very complicated.

"Surrogate pairs" had to be implemented to extend the former
16-bit limit of UCS-2/UTF-16.

Unicode has been limited to a maximum code point of 0x0010FFFF, which
would not fit in 16-bit UCS-2.

To extend UTF-16, the "surrogate range" between D800 and DFFF was "stolen"
from one of the previously named "Private Use Areas" of UCS-2.
-----

Do those "surrogate pairs" have any impact on _today's_ Linux file name conventions?

I think the easy answer is NO!

AFAIK - _no_ current operating system supports this!

We are talking here about "allowed dir/file name characters"!

The main reason behind "Surrogate pairs" was to allow "userland" (!)
applications to use worldwide special character glyphs!
---------

Anyway - in nls_base.c
.....
static const struct utf8_table utf8_table[] =
{
    {0x80,  0x00,   0*6,    0x7F,           0,         /* 1 byte sequence */},
    {0xE0,  0xC0,   1*6,    0x7FF,          0x80,      /* 2 byte sequence */},
    {0xF0,  0xE0,   2*6,    0xFFFF,         0x800,     /* 3 byte sequence */},
    {0xF8,  0xF0,   3*6,    0x1FFFFF,       0x10000,   /* 4 byte sequence */},
    {0xFC,  0xF8,   4*6,    0x3FFFFFF,      0x200000,  /* 5 byte sequence */},
    {0xFE,  0xFC,   5*6,    0x7FFFFFFF,     0x4000000, /* 6 byte sequence */},
    {0,						       /* end of table    */}
};
........
That configured range exceeds the maximum allowed Unicode range 0x0010FFFF
and _must_ be changed to:

static const struct utf8_table utf8_table[] =
{
    {0x80,  0x00,   0*6,    0x7F,           0,         /* 1 byte sequence */},
    {0xE0,  0xC0,   1*6,    0x7FF,          0x80,      /* 2 byte sequence */},
    {0xF0,  0xE0,   2*6,    0xFFFF,         0x800,     /* 3 byte sequence */},
    {0xF8,  0xF0,   3*6,    0x1FFFFF,       0x10000,   /* 4 byte sequence */},
    {0,						       /* end of table    */}
};

Cheers, Günter


* Re: CIFS: Rename bug on servers not supporting inode numbers
  2011-11-24  5:08                   ` Günter Kukkukk
@ 2011-11-28  7:56                     ` Ashish Sangwan
  0 siblings, 0 replies; 14+ messages in thread
From: Ashish Sangwan @ 2011-11-28  7:56 UTC (permalink / raw)
  To: Günter Kukkukk
  Cc: samba-technical, Amit Sahrawat, Alan Cox, linux-cifs,
	NamJae Jeon, Jeff Layton, LKML, Steve French, Steve French,
	Alan Stern, akpm, Anton Altaparmakov, Unix Support

2011/11/24 Günter Kukkukk <linux@kukkukk.com>:
> On Wednesday 23 November 2011 19:00:16 Amit Sahrawat wrote:
>> Hi Alan,
>> Ok, translations cannot be added easily. But any idea why surrogate
>> pairs are not handled? I think handling for surrogate pairs can be
>> added by identifying proper points(there are not many I guess). Please
>> share your views.
>>
>> Regards,
>> Amit Sahrawat
>>
>> On Wed, Nov 23, 2011 at 10:42 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>> > On Wed, 23 Nov 2011 11:31:47 -0500 (EST)
>> >
>> > Alan Stern <stern@rowland.harvard.edu> wrote:
>> >> On Wed, 23 Nov 2011, NamJae Jeon wrote:
>> >> > Hi. Alan.
>> >> > Would you know why there is no upper/lower case table in nls utf8 ?
>> >> > And Currently Surrogate pair is not supported also in nls utf8. Is
>> >> > there the reason ?
>> >>
>> >> I don't know.
>> >
>> > For one case translations are locale specific and very very complicated.
>
> "Surrogate pairs" had to been implemented to extend the former
> 16 bit limit of UCS-2/UTF-16.
>
> Unicode has been limited to max 0x0010FFFF glyphs - which
> would not fit in UCS-2/UTF-16.
>
> To extend UTF-16, the "surrogate range" between D800 and DFFF was "stolen"
> from the one of the previously named "Private Use Areas" of UCS-2.
> -----
>
> Have those "surrogate pairs" any impact on _todays_ linux file name conventions?
>
> I think the easy answer is NO !
>
> AFAIK - _no_ current operating system is supporting this!
>
> We are talking here about "allowed dir/file name characters"!

How about Chinese/Japanese/Korean characters?
Users won't be able to create new files with CJK/Han characters if there is
no surrogate pair support.
>
> The main reason behind "Surrogate pairs" was to allow "userland" (!)
> applications to use worldwide special character glyphs!
> ---------
>
> Anyway - in nls_base.c
> .....
> static const struct utf8_table utf8_table[] =
> {
>    {0x80,  0x00,   0*6,    0x7F,           0,         /* 1 byte sequence */},
>    {0xE0,  0xC0,   1*6,    0x7FF,          0x80,      /* 2 byte sequence */},
>    {0xF0,  0xE0,   2*6,    0xFFFF,         0x800,     /* 3 byte sequence */},
>    {0xF8,  0xF0,   3*6,    0x1FFFFF,       0x10000,   /* 4 byte sequence */},
>    {0xFC,  0xF8,   4*6,    0x3FFFFFF,      0x200000,  /* 5 byte sequence */},
>    {0xFE,  0xFC,   5*6,    0x7FFFFFFF,     0x4000000, /* 6 byte sequence */},
>    {0,                                                /* end of table    */}
> };
> ........
> that configured range exceeds the max. allowed unicode range 0x0010FFFF
> and _must_ be changed to:
>
> static const struct utf8_table utf8_table[] =
> {
>    {0x80,  0x00,   0*6,    0x7F,           0,         /* 1 byte sequence */},
>    {0xE0,  0xC0,   1*6,    0x7FF,          0x80,      /* 2 byte sequence */},
>    {0xF0,  0xE0,   2*6,    0xFFFF,         0x800,     /* 3 byte sequence */},
>    {0xF8,  0xF0,   3*6,    0x1FFFFF,       0x10000,   /* 4 byte sequence */},
>    {0,                                                /* end of table    */}
> };
>
> Cheers, Günter


end of thread, 2011-11-28  7:56 UTC

Thread overview: 14+ messages
2011-11-03 15:20 CIFS: Rename bug on servers not supporting inode numbers Anton Altaparmakov
2011-11-03 15:42 ` Anton Altaparmakov
2011-11-03 17:40   ` Jeff Layton
2011-11-03 23:25     ` Anton Altaparmakov
2011-11-03 23:34       ` Steve French
2011-11-03 23:37         ` NamJae Jeon
2011-11-23 10:34           ` NamJae Jeon
2011-11-23 16:31             ` Alan Stern
2011-11-23 17:12               ` Alan Cox
2011-11-23 18:00                 ` Amit Sahrawat
2011-11-24  5:08                   ` Günter Kukkukk
2011-11-28  7:56                     ` Ashish Sangwan
2011-11-03 18:40   ` Shirish Pargaonkar
2011-11-04 11:16 ` Björn JACKE
