linux-fsdevel.vger.kernel.org archive mirror
* [GIT PULL] UDF support for UTF-16
@ 2018-06-07 15:31 Jan Kara
  2018-06-07 16:50 ` Linus Torvalds
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2018-06-07 15:31 UTC
  To: Linus Torvalds; +Cc: linux-fsdevel

  Hello Linus,

  could you please pull from

git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git udf_for_v4.18-rc1

to get UDF support for UTF-16 characters in file names.

Top of the tree is 3be4aaf4e2d3. The full shortlog is:

Jan Kara (6):
      udf: Always require NLS support
      udf: Use UTF-32 <-> UTF-8 conversion functions from NLS
      udf: Convert ident strings to proper charset
      udf: Push sb argument to udf_name_[to|from]_CS0()
      udf: Add support for encoding UTF-16 characters
      udf: Add support for decoding UTF-16 characters

The diffstat is

 fs/udf/Kconfig   |   6 +-
 fs/udf/super.c   |  12 +--
 fs/udf/udfdecl.h |   3 +-
 fs/udf/unicode.c | 260 +++++++++++++++++++++++++++----------------------------
 4 files changed, 131 insertions(+), 150 deletions(-)

							Thanks
								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [GIT PULL] UDF support for UTF-16
  2018-06-07 15:31 [GIT PULL] UDF support for UTF-16 Jan Kara
@ 2018-06-07 16:50 ` Linus Torvalds
  2018-06-07 16:59   ` Matthew Wilcox
  0 siblings, 1 reply; 5+ messages in thread
From: Linus Torvalds @ 2018-06-07 16:50 UTC
  To: Jan Kara; +Cc: linux-fsdevel

On Thu, Jun 7, 2018 at 8:31 AM Jan Kara <jack@suse.cz> wrote:
>
> to get UDF support for UTF-16 characters in file names.

Who uses UTF-16 in this day and age?

It's a broken format.

It's not mentioned in the commit logs *why* this is done.

Honestly, I almost unpulled this one too when reading the commits. You
had completely insane BUG() cases there. They got removed in the end,
which is the only reason I'm taking this, but I'm not super-happy
about the insane half-way state or about the complete lack of
documentation about why things are done.

              Linus


* Re: [GIT PULL] UDF support for UTF-16
  2018-06-07 16:50 ` Linus Torvalds
@ 2018-06-07 16:59   ` Matthew Wilcox
  2018-06-07 17:15     ` Linus Torvalds
  0 siblings, 1 reply; 5+ messages in thread
From: Matthew Wilcox @ 2018-06-07 16:59 UTC
  To: Linus Torvalds; +Cc: Jan Kara, linux-fsdevel

On Thu, Jun 07, 2018 at 09:50:00AM -0700, Linus Torvalds wrote:
> On Thu, Jun 7, 2018 at 8:31 AM Jan Kara <jack@suse.cz> wrote:
> >
> > to get UDF support for UTF-16 characters in file names.
> 
> Who uses UTF-16 in this day and age?
> 
> It's a broken format.
> 
> It's not mentioned in the commit logs *why* this is done.

Here's some background:

https://www.spinics.net/lists/linux-fsdevel/msg124310.html

Yeah, it's a totally broken format, but we shouldn't be thinking that
filenames which come to us in UTF16 are actually in UCS2.
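To make the UCS-2 versus UTF-16 distinction concrete, here is a minimal sketch in C (not the code from fs/udf/unicode.c). A UCS-2 reader treats every 16-bit unit as a complete character, while a UTF-16 reader must combine a high surrogate (0xD800-0xDBFF) with the following low surrogate (0xDC00-0xDFFF) into one code point above U+FFFF. The function name utf16_decode() and its return convention are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode one code point from a buffer of UTF-16
 * code units (already converted to host byte order).
 *
 * Returns the number of units consumed (1 or 2), or 0 if the buffer is
 * empty or starts with an unpaired surrogate.
 */
static size_t utf16_decode(const uint16_t *in, size_t len, uint32_t *out)
{
    uint16_t hi, lo;

    if (len == 0)
        return 0;

    hi = in[0];
    if (hi < 0xD800 || hi > 0xDFFF) {
        /* Plain BMP character: identical in UCS-2 and UTF-16. */
        *out = hi;
        return 1;
    }
    if (hi >= 0xDC00)
        return 0;               /* low surrogate with no preceding high one */
    if (len < 2)
        return 0;               /* high surrogate at the end of the buffer */

    lo = in[1];
    if (lo < 0xDC00 || lo > 0xDFFF)
        return 0;               /* high surrogate not followed by a low one */

    /* Each surrogate carries 10 bits of the code point above U+10000. */
    *out = 0x10000 + ((uint32_t)(hi - 0xD800) << 10) + (uint32_t)(lo - 0xDC00);
    return 2;
}
```

With this sketch, the unit pair 0xD83D 0xDE00 decodes to the single code point U+1F600, whereas code that assumes UCS-2 would see two separate, meaningless characters.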


* Re: [GIT PULL] UDF support for UTF-16
  2018-06-07 16:59   ` Matthew Wilcox
@ 2018-06-07 17:15     ` Linus Torvalds
  2018-06-08 11:56       ` Jan Kara
  0 siblings, 1 reply; 5+ messages in thread
From: Linus Torvalds @ 2018-06-07 17:15 UTC
  To: Matthew Wilcox; +Cc: Jan Kara, linux-fsdevel

On Thu, Jun 7, 2018 at 9:59 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> Yeah, it's a totally broken format, but we shouldn't be thinking that
> filenames which come to us in UTF16 are actually in UCS2.

Ok, old fixed-2-byte UCS2 is certainly even worse than UTF16, so no
argument on that side.

I was more wondering who actually *does* this, but it sounds like it
was mostly just that we used to do the old-style UCS-2, and this is
extending it to the slightly less broken "extended UCS-2" aka UTF-16.

I'd just have liked to see some more background in the logs, because
this seemed to me such an odd change to do that it made me go "why
would anybody ever care?".

But I guess MS (and maybe even OSX) _still_ haven't gotten the memo
on utf-8 and actually use UTF-16..

Oh well.

                Linus


* Re: [GIT PULL] UDF support for UTF-16
  2018-06-07 17:15     ` Linus Torvalds
@ 2018-06-08 11:56       ` Jan Kara
  0 siblings, 0 replies; 5+ messages in thread
From: Jan Kara @ 2018-06-08 11:56 UTC
  To: Linus Torvalds; +Cc: Matthew Wilcox, Jan Kara, linux-fsdevel

On Thu 07-06-18 10:15:57, Linus Torvalds wrote:
> On Thu, Jun 7, 2018 at 9:59 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > Yeah, it's a totally broken format, but we shouldn't be thinking that
> > filenames which come to us in UTF16 are actually in UCS2.
> 
> Ok, old fixed-2-byte UCS2 is certainly even worse than UTF16, so no
> argument on that side.
> 
> I was more wondering who actually *does* this, but it sounds like it
> was mostly just that we used to do the old-style UCS-2, and this is
> extending it to the slightly less broken "extended UCS-2" aka UTF-16.
> 
> I'd just have liked to see some more background in the logs, because
> this seemed to me such an odd change to do that it made me go "why
> would anybody ever care?".
> 
> But I guess MS (and maybe even OSX) _still_ haven't gotten the memo
> on utf-8 and actually use UTF-16..
> 
> Oh well.

Yes, Windows apparently still uses UTF-16 (at least according to a user
report I've got which motivated this work) and the OSTA UDF standard
defines that filenames in UDF are in "OSTA Compressed Unicode" which may
actually be UTF-16 if the program creating the filesystem image chooses so.
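
As a rough illustration of what "OSTA Compressed Unicode" looks like on disk, the following C sketch expands such a buffer into 16-bit units. It assumes the layout described in the UDF specification: the first byte is a compression ID, 8 meaning one byte per character and 16 meaning two bytes per character, most significant byte first. The helper name cs0_expand() and its error convention are hypothetical, other compression IDs are simply rejected, and this is not the kernel implementation. Whether the resulting units are then read as UCS-2 or as UTF-16 is exactly the question this pull request addresses.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: expand an OSTA Compressed Unicode buffer into
 * 16-bit units in host byte order.
 *
 * Returns the number of units written, or -1 on malformed input or if
 * the output buffer is too small.
 */
static int cs0_expand(const uint8_t *cs0, size_t len,
                      uint16_t *out, size_t out_cap)
{
    size_t i, n = 0;
    uint8_t comp_id;

    if (len == 0)
        return -1;

    comp_id = cs0[0];           /* first byte selects the character width */

    if (comp_id == 8) {
        /* One byte per character: values 0x00-0xFF map directly. */
        for (i = 1; i < len; i++) {
            if (n == out_cap)
                return -1;
            out[n++] = cs0[i];
        }
    } else if (comp_id == 16) {
        /* Two bytes per character, big-endian; payload must be even. */
        if ((len - 1) % 2 != 0)
            return -1;
        for (i = 1; i + 1 < len; i += 2) {
            if (n == out_cap)
                return -1;
            out[n++] = (uint16_t)((cs0[i] << 8) | cs0[i + 1]);
        }
    } else {
        return -1;              /* compression ID not handled by this sketch */
    }

    return (int)n;
}
```

For example, the five bytes 0x10 0xD8 0x3D 0xDE 0x00 expand to the two units 0xD83D 0xDE00, which form a surrogate pair only if the reader interprets them as UTF-16.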

I agree that the changelogs should have mentioned this. My bad.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
