It's UTF-8

Message ID 20060108203851.GA5864@mipter.zuzino.mipt.ru
State New, archived

Commit Message

Alexey Dobriyan Jan. 8, 2006, 8:38 p.m. UTC
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 Documentation/filesystems/isofs.txt |    4 ++--
 Documentation/filesystems/jfs.txt   |    2 +-
 Documentation/filesystems/vfat.txt  |    6 +++---
 fs/befs/linuxvfs.c                  |    2 +-
 fs/cifs/CHANGES                     |    2 +-
 fs/fat/dir.c                        |    2 +-
 fs/fat/inode.c                      |    2 +-
 fs/isofs/joliet.c                   |    2 +-
 fs/nls/Kconfig                      |    2 +-
 include/asm-mips/termbits.h         |    2 +-
 include/linux/msdos_fs.h            |    2 +-
 11 files changed, 14 insertions(+), 14 deletions(-)


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Comments

Jan Engelhardt Jan. 8, 2006, 9:46 p.m. UTC | #1
>Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>

I'd say ACK. However,

> iocharset=name	Character set to use for converting from Unicode to
> 		ASCII.  The default is to do no conversion.  Use
>-		iocharset=utf8 for UTF8 translations.  This requires
>+		iocharset=utf8 for UTF-8 translations.  This requires
> 		CONFIG_NLS_UTF8 to be set in the kernel .config file.

If you are really nitpicky about the "-", then it should also be
"iocharset=utf-8" (and wherever else). Or what's the real purpose of
adding the dashes in only half of the places, then?



Jan Engelhardt
Måns Rullgård Jan. 8, 2006, 10:09 p.m. UTC | #2
Jan Engelhardt <jengelh@linux01.gwdg.de> writes:

>>Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
>
> I'd say ACK. However,
>
>> iocharset=name	Character set to use for converting from Unicode to
>> 		ASCII.  The default is to do no conversion.  Use
>>-		iocharset=utf8 for UTF8 translations.  This requires
>>+		iocharset=utf8 for UTF-8 translations.  This requires
>> 		CONFIG_NLS_UTF8 to be set in the kernel .config file.
>
> If you are really nitpicky about the "-", then it should also be 
> "iocharset=utf-8" (and whereever else). Or what's the real purpose of 
> adding the dashes in only half of the places, then?

The patch only changes documentation/comments.  Changing other things
would break compatibility, and that's usually not a good idea for
cosmetic changes.
Alistair John Strachan Jan. 8, 2006, 10:10 p.m. UTC | #3
On Sunday 08 January 2006 21:46, Jan Engelhardt wrote:
> >Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
>
> I'd say ACK. However,
>
> > iocharset=name	Character set to use for converting from Unicode to
> > 		ASCII.  The default is to do no conversion.  Use
> >-		iocharset=utf8 for UTF8 translations.  This requires
> >+		iocharset=utf8 for UTF-8 translations.  This requires
> > 		CONFIG_NLS_UTF8 to be set in the kernel .config file.
>
> If you are really nitpicky about the "-", then it should also be
> "iocharset=utf-8" (and whereever else). Or what's the real purpose of
> adding the dashes in only half of the places, then?

Also, what's "Unicode 16", as used in several places in the kernel? Surely this
should be changed to UTF-16, which is the _encoding_ for the Unicode
character space.
Alexey Dobriyan Jan. 8, 2006, 10:25 p.m. UTC | #4
On Sun, Jan 08, 2006 at 10:46:22PM +0100, Jan Engelhardt wrote:
> > iocharset=name	Character set to use for converting from Unicode to
> > 		ASCII.  The default is to do no conversion.  Use
> >-		iocharset=utf8 for UTF8 translations.  This requires
> >+		iocharset=utf8 for UTF-8 translations.  This requires
> > 		CONFIG_NLS_UTF8 to be set in the kernel .config file.
>
> If you are really nitpicky about the "-", then it should also be
> "iocharset=utf-8" (and whereever else). Or what's the real purpose of
> adding the dashes in only half of the places, then?

I don't want to be shot by everyone who has "iocharset=utf8" in
/etc/fstab.

Alexander E. Patrakov Jan. 9, 2006, 8:28 a.m. UTC | #5
Alexey Dobriyan wrote:

>  	if (!strcmp(opts->iocharset, "utf8")) {
>  		printk(KERN_ERR "FAT: utf8 is not a recommended IO charset"
>  		       " for FAT filesystems, filesystem will be case sensitive!\n");

This warning would read better as:

FAT: this is not the recommended filesystem for use with UTF-8 filenames.

Reason: the utf8 IO charset is the only IO charset that displays 
filenames properly in UTF-8 locales. So the choice is really between 
case-sensitive filenames (iocharset=utf8) and completely unreadable 
filenames (everything else).
Vojtech Pavlik Jan. 9, 2006, 9:04 a.m. UTC | #6
On Sun, Jan 08, 2006 at 10:10:09PM +0000, Alistair John Strachan wrote:

> On Sunday 08 January 2006 21:46, Jan Engelhardt wrote:
> > >Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
> >
> > I'd say ACK. However,
> >
> > > iocharset=name	Character set to use for converting from Unicode to
> > > 		ASCII.  The default is to do no conversion.  Use
> > >-		iocharset=utf8 for UTF8 translations.  This requires
> > >+		iocharset=utf8 for UTF-8 translations.  This requires
> > > 		CONFIG_NLS_UTF8 to be set in the kernel .config file.
> >
> > If you are really nitpicky about the "-", then it should also be
> > "iocharset=utf-8" (and whereever else). Or what's the real purpose of
> > adding the dashes in only half of the places, then?
> 
> Also what's "Unicode 16" as used in several places in the kernel. Surely this 
> should be changed to UTF-16, which is the _encoding_ for the unicode 
> character space.
 
It might also be UCS-2 and not UTF-16 in some places. They do differ.
Krzysztof Halasa Jan. 9, 2006, 11:38 a.m. UTC | #7
"Alexander E. Patrakov" <patrakov@gmail.com> writes:

> Alexey Dobriyan wrote:
>
>>  	if (!strcmp(opts->iocharset, "utf8")) {
>>  		printk(KERN_ERR "FAT: utf8 is not a recommended IO charset"
>>  		       " for FAT filesystems, filesystem will be case sensitive!\n");
>
> This warning better reads in such a way:
>
> FAT: this is not the recommended filesystem for use with UTF-8 filenames.
>
> Reason: the utf8 IO charset is the only IO charset that displays
> filenames properly in UTF-8 locales. So the choice is really between
> case-sensitive filenames (iocharset=utf8) and completely unreadable
> filenames (everything else).

And a UTF-8 locale seems to be the only really sane one today. I'd kill
the whole warning.
Kalin KOZHUHAROV Jan. 9, 2006, 12:48 p.m. UTC | #8
Jan Engelhardt wrote:
>>Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
> 
> 
> I'd say ACK. However,
> 
> 
>>iocharset=name	Character set to use for converting from Unicode to
>>		ASCII.  The default is to do no conversion.  Use
>>-		iocharset=utf8 for UTF8 translations.  This requires
>>+		iocharset=utf8 for UTF-8 translations.  This requires
>>		CONFIG_NLS_UTF8 to be set in the kernel .config file.
> 
> 
> If you are really nitpicky about the "-", then it should also be 
> "iocharset=utf-8" (and whereever else). Or what's the real purpose of 
> adding the dashes in only half of the places, then?

glibc started it, AFAIR. So both utf8 and UTF-8 are generally accepted,
but utf-8 is not that widespread.

Kalin.
Xavier Bestel Jan. 9, 2006, 6:44 p.m. UTC | #9
On Monday, 9 January 2006 at 12:38 +0100, Krzysztof Halasa wrote:
> "Alexander E. Patrakov" <patrakov@gmail.com> writes:

> > FAT: this is not the recommended filesystem for use with UTF-8 filenames.
> >
> > Reason: the utf8 IO charset is the only IO charset that displays
> > filenames properly in UTF-8 locales. So the choice is really between
> > case-sensitive filenames (iocharset=utf8) and completely unreadable
> > filenames (everything else).
> 
> And UTF-8 locale seems to be the only really sane today. I'd kill the
> whole warning.

.. on unix. But FAT is a sort of lingua franca of filesystems, and is
the only one understandable by every (embedded) OS. So you'd better stay
compatible with everyone else.

	Xav


Krzysztof Halasa Jan. 10, 2006, 12:12 a.m. UTC | #10
Xavier Bestel <xavier.bestel@free.fr> writes:

>> And UTF-8 locale seems to be the only really sane today. I'd kill the
>> whole warning.
>
> .. on unix. But FAT is a sort of lingua franca of filesystems, and is
> the only one understandable by every (embedded) OS. So you'd better stay
> compatible with everyone else.

You stay compatible. And you can even read files with national
characters in names.

Patch

--- a/Documentation/filesystems/isofs.txt
+++ b/Documentation/filesystems/isofs.txt
@@ -9,9 +9,9 @@  when using discs encoded using Microsoft
   iocharset=name Character set to use for converting from Unicode to
 		ASCII.  Joliet filenames are stored in Unicode format, but
 		Unix for the most part doesn't know how to deal with Unicode.
-		There is also an option of doing UTF8 translations with the
+		There is also an option of doing UTF-8 translations with the
 		utf8 option.
-  utf8          Encode Unicode names in UTF8 format. Default is no.
+  utf8          Encode Unicode names in UTF-8 format. Default is no.
 
 Mount options unique to the isofs filesystem.
   block=512     Set the block size for the disk to 512 bytes
--- a/Documentation/filesystems/jfs.txt
+++ b/Documentation/filesystems/jfs.txt
@@ -6,7 +6,7 @@  The following mount options are supporte
 
 iocharset=name	Character set to use for converting from Unicode to
 		ASCII.  The default is to do no conversion.  Use
-		iocharset=utf8 for UTF8 translations.  This requires
+		iocharset=utf8 for UTF-8 translations.  This requires
 		CONFIG_NLS_UTF8 to be set in the kernel .config file.
 		iocharset=none specifies the default behavior explicitly.
 
--- a/Documentation/filesystems/vfat.txt
+++ b/Documentation/filesystems/vfat.txt
@@ -28,16 +28,16 @@  iocharset=name -- Character set to use f
 		 know how to deal with Unicode.
 		 By default, FAT_DEFAULT_IOCHARSET setting is used.
 
-		 There is also an option of doing UTF8 translations
+		 There is also an option of doing UTF-8 translations
 		 with the utf8 option.
 
 		 NOTE: "iocharset=utf8" is not recommended. If unsure,
 		 you should consider the following option instead.
 
-utf8=<bool>   -- UTF8 is the filesystem safe version of Unicode that
+utf8=<bool>   -- UTF-8 is the filesystem safe version of Unicode that
 		 is used by the console.  It can be be enabled for the
 		 filesystem with this option. If 'uni_xlate' gets set,
-		 UTF8 gets disabled.
+		 UTF-8 gets disabled.
 
 uni_xlate=<bool> -- Translate unhandled Unicode characters to special
 		 escaped sequences.  This would let you backup and
--- a/fs/befs/linuxvfs.c
+++ b/fs/befs/linuxvfs.c
@@ -561,7 +561,7 @@  befs_utf2nls(struct super_block *sb, con
  * @sb: Superblock
  * @src: Input string buffer in NLS format
  * @srclen: Length of input string in bytes
- * @dest: The output string in UTF8 format
+ * @dest: The output string in UTF-8 format
  * @destlen: Length of the output buffer
  * 
  * Converts input string @src, which is in the format of the loaded NLS map,
--- a/fs/cifs/CHANGES
+++ b/fs/cifs/CHANGES
@@ -150,7 +150,7 @@  improperly zeroed buffer in CIFS Unix ex
 Version 1.25
 ------------
 Fix internationalization problem in cifs readdir with filenames that map to 
-longer UTF8 strings than the string on the wire was in Unicode.  Add workaround
+longer UTF-8 strings than the string on the wire was in Unicode.  Add workaround
 for readdir to netapp servers. Fix search rewind (seek into readdir to return 
 non-consecutive entries).  Do not do readdir when server negotiates 
 buffer size to small to fit filename. Add support for reading POSIX ACLs from
--- a/fs/fat/dir.c
+++ b/fs/fat/dir.c
@@ -114,7 +114,7 @@  static inline int fat_get_entry(struct i
 }
 
 /*
- * Convert Unicode 16 to UTF8, translated Unicode, or ASCII.
+ * Convert Unicode 16 to UTF-8, translated Unicode, or ASCII.
  * If uni_xlate is enabled and we can't get a 1:1 conversion, use a
  * colon as an escape character since it is normally invalid on the vfat
  * filesystem. The following four characters are the hexadecimal digits
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -1016,7 +1016,7 @@  static int parse_options(char *options, 
 			return -EINVAL;
 		}
 	}
-	/* UTF8 doesn't provide FAT semantics */
+	/* UTF-8 doesn't provide FAT semantics */
 	if (!strcmp(opts->iocharset, "utf8")) {
 		printk(KERN_ERR "FAT: utf8 is not a recommended IO charset"
 		       " for FAT filesystems, filesystem will be case sensitive!\n");
--- a/fs/isofs/joliet.c
+++ b/fs/isofs/joliet.c
@@ -11,7 +11,7 @@ 
 #include "isofs.h"
 
 /*
- * Convert Unicode 16 to UTF8 or ASCII.
+ * Convert Unicode 16 to UTF-8 or ASCII.
  */
 static int
 uni16_to_x8(unsigned char *ascii, u16 *uni, int len, struct nls_table *nls)
--- a/fs/nls/Kconfig
+++ b/fs/nls/Kconfig
@@ -491,7 +491,7 @@  config NLS_KOI8_U
 	  (koi8-u) and Belarusian (koi8-ru) character sets.
 
 config NLS_UTF8
-	tristate "NLS UTF8"
+	tristate "NLS UTF-8"
 	depends on NLS
 	help
 	  If you want to display filenames with native language characters
--- a/include/asm-mips/termbits.h
+++ b/include/asm-mips/termbits.h
@@ -77,7 +77,7 @@  struct termios {
 #define IXANY	0004000		/* Any character will restart after stop.  */
 #define IXOFF	0010000		/* Enable start/stop input control.  */
 #define IMAXBEL	0020000		/* Ring bell when input queue is full.  */
-#define IUTF8	0040000		/* Input is UTF8 */
+#define IUTF8	0040000		/* Input is UTF-8 */
 
 /* c_oflag bits */
 #define OPOST	0000001		/* Perform output processing.  */
--- a/include/linux/msdos_fs.h
+++ b/include/linux/msdos_fs.h
@@ -199,7 +199,7 @@  struct fat_mount_options {
 		 sys_immutable:1, /* set = system files are immutable */
 		 dotsOK:1,        /* set = hidden and system files are named '.filename' */
 		 isvfat:1,        /* 0=no vfat long filename support, 1=vfat support */
-		 utf8:1,	  /* Use of UTF8 character set (Default) */
+		 utf8:1,	  /* Use of UTF-8 character set (Default) */
 		 unicode_xlate:1, /* create escape sequences for unhandled Unicode */
 		 numtail:1,       /* Does first alias have a numeric '~1' type tail? */
 		 atari:1,         /* Use Atari GEMDOS variation of MS-DOS fs */