linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
@ 2017-11-17  8:20 Ting-Chang Hou
  2017-11-19  0:57 ` Ernesto A. Fernández
  0 siblings, 1 reply; 21+ messages in thread
From: Ting-Chang Hou @ 2017-11-17  8:20 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Ting-Chang Hou

The unicode of hangul from macOS is decomposed. There has a bug that
mistake decomposed unicode for composed when change unicode to ascii,
so it cannot recognize the hangul correctly.

Signed-off-by: Ting-Chang Hou <tchou@synology.com>
---
 fs/hfsplus/unicode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
index dfa90c2..2daf7b0 100644
--- a/fs/hfsplus/unicode.c
+++ b/fs/hfsplus/unicode.c
@@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
 	ustrlen = be16_to_cpu(ustr->length);
 	len = *len_p;
 	ce1 = NULL;
-	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
+	compose = test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
 
 	while (ustrlen > 0) {
 		c0 = be16_to_cpu(*ip++);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-17  8:20 [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name Ting-Chang Hou
@ 2017-11-19  0:57 ` Ernesto A. Fernández
  2017-11-23  3:57   ` tchou
  0 siblings, 1 reply; 21+ messages in thread
From: Ernesto A. Fernández @ 2017-11-19  0:57 UTC (permalink / raw)
  To: Ting-Chang Hou; +Cc: linux-fsdevel, Ernesto A. Fernández

On Fri, Nov 17, 2017 at 04:20:05PM +0800, Ting-Chang Hou wrote:
> The unicode of hangul from macOS is decomposed. There has a bug that
> mistake decomposed unicode for composed when change unicode to ascii,
> so it cannot recognize the hangul correctly.
> 
> Signed-off-by: Ting-Chang Hou <tchou@synology.com>
> ---
>  fs/hfsplus/unicode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> index dfa90c2..2daf7b0 100644
> --- a/fs/hfsplus/unicode.c
> +++ b/fs/hfsplus/unicode.c
> @@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
>  	ustrlen = be16_to_cpu(ustr->length);
>  	len = *len_p;
>  	ce1 = NULL;
> -	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> +	compose = test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);

I'm not sure this is a mistake. The developers probably wanted the
filenames to be recomposed before being presented in utf8. With your patch,
if you try the following (with the default mount options):

  touch Á
  ls | hexdump -C

the utf8 output filename will be using the combining accent (CC 81) instead
of the Á character (C3 81). This is a bit annoying because it won't print
correctly in my terminal anymore.

What is it exactly that you are trying to fix? You mention an issue with
hangul characters, but I failed to trigger it. Could you expand on that?

>  	while (ustrlen > 0) {
>  		c0 = be16_to_cpu(*ip++);
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-19  0:57 ` Ernesto A. Fernández
@ 2017-11-23  3:57   ` tchou
  2017-11-23  4:21     ` Viacheslav Dubeyko
  2017-11-23 11:32     ` Ernesto A. Fernández
  0 siblings, 2 replies; 21+ messages in thread
From: tchou @ 2017-11-23  3:57 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: linux-fsdevel, linux-fsdevel-owner, slava, htl10

> Could you please share the examples of incorrect and correct
> behaviour?

> What is it exactly that you are trying to fix? You mention an issue 
> with
> hangul characters, but I failed to trigger it. Could you expand on 
> that?
> 

Hi all,
There is an example.I use Mac mini to format my usb disk to HFS+ and 
touch
the file with filename "공유"(a Korean actor, https://goo.gl/VcBsrn) on 
it.
After it, I mount the usb disk on my ubuntu(Linux 4.14.0+) and get the
following error message when trying to ls and cp the file:

ls: cannot access 공유: No such file or directory
cp: cannot stat ‘공유’: No such file or directory

It seem's a problem for a long time(https://goo.gl/LiWGe5).
After applying my patch, I can ls and cp the file correctly.

> On Fri, Nov 17, 2017 at 04:20:05PM +0800, Ting-Chang Hou wrote:
>> The unicode of hangul from macOS is decomposed. There has a bug that
>> mistake decomposed unicode for composed when change unicode to ascii,
>> so it cannot recognize the hangul correctly.
>> 
>> Signed-off-by: Ting-Chang Hou <tchou@synology.com>
>> ---
>>  fs/hfsplus/unicode.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
>> index dfa90c2..2daf7b0 100644
>> --- a/fs/hfsplus/unicode.c
>> +++ b/fs/hfsplus/unicode.c
>> @@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
>>  	ustrlen = be16_to_cpu(ustr->length);
>>  	len = *len_p;
>>  	ce1 = NULL;
>> -	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
>> +	compose = test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> 
> I'm not sure this is a mistake. The developers probably wanted the
> filenames to be recomposed before being presented in utf8. With your 
> patch,
> if you try the following (with the default mount options):
> 
>   touch Á
>   ls | hexdump -C
> 
> the utf8 output filename will be using the combining accent (CC 81) 
> instead
> of the Á character (C3 81). This is a bit annoying because it won't 
> print
> correctly in my terminal anymore.

I'm not exatly know why combining accent cannot print correctly in 
terminal
and how to avoid it. Whether apply my patch or not, my terminal cannot
print the hangul charactor correctly.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23  3:57   ` tchou
@ 2017-11-23  4:21     ` Viacheslav Dubeyko
  2017-11-23  6:05       ` tchou
  2017-11-23 11:32     ` Ernesto A. Fernández
  1 sibling, 1 reply; 21+ messages in thread
From: Viacheslav Dubeyko @ 2017-11-23  4:21 UTC (permalink / raw)
  To: tchou, Ernesto A. Fernández
  Cc: linux-fsdevel, linux-fsdevel-owner, htl10

On Thu, 2017-11-23 at 11:57 +0800, tchou wrote:
> > 
> > Could you please share the examples of incorrect and correct
> > behaviour?
> > 
> > What is it exactly that you are trying to fix? You mention an
> > issue 
> > with
> > hangul characters, but I failed to trigger it. Could you expand on 
> > that?
> > 
> Hi all,
> There is an example.I use Mac mini to format my usb disk to HFS+ and 
> touch
> the file with filename "공유"(a Korean actor, https://goo.gl/VcBsrn)
> on 
> it.
> After it, I mount the usb disk on my ubuntu(Linux 4.14.0+) and get
> the
> following error message when trying to ls and cp the file:
> 


Could you share the all mount options that you used? It looks that you
simply don't use the proper mount options set under the Linux. Because
MacOS X will mount properly always.

Thanks,
Vyacheslav Dubeyko.



> ls: cannot access 공유: No such file or directory
> cp: cannot stat ‘공유’: No such file or directory
> 
> It seem's a problem for a long time(https://goo.gl/LiWGe5).
> After applying my patch, I can ls and cp the file correctly.
> 
> > 
> > On Fri, Nov 17, 2017 at 04:20:05PM +0800, Ting-Chang Hou wrote:
> > > 
> > > The unicode of hangul from macOS is decomposed. There has a bug
> > > that
> > > mistake decomposed unicode for composed when change unicode to
> > > ascii,
> > > so it cannot recognize the hangul correctly.
> > > 
> > > Signed-off-by: Ting-Chang Hou <tchou@synology.com>
> > > ---
> > >  fs/hfsplus/unicode.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> > > index dfa90c2..2daf7b0 100644
> > > --- a/fs/hfsplus/unicode.c
> > > +++ b/fs/hfsplus/unicode.c
> > > @@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
> > >  	ustrlen = be16_to_cpu(ustr->length);
> > >  	len = *len_p;
> > >  	ce1 = NULL;
> > > -	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE,
> > > &HFSPLUS_SB(sb)->flags);
> > > +	compose = test_bit(HFSPLUS_SB_NODECOMPOSE,
> > > &HFSPLUS_SB(sb)->flags);
> > I'm not sure this is a mistake. The developers probably wanted the
> > filenames to be recomposed before being presented in utf8. With
> > your 
> > patch,
> > if you try the following (with the default mount options):
> > 
> >   touch Á
> >   ls | hexdump -C
> > 
> > the utf8 output filename will be using the combining accent (CC
> > 81) 
> > instead
> > of the Á character (C3 81). This is a bit annoying because it
> > won't 
> > print
> > correctly in my terminal anymore.
> I'm not exatly know why combining accent cannot print correctly in 
> terminal
> and how to avoid it. Whether apply my patch or not, my terminal
> cannot
> print the hangul charactor correctly.
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23  4:21     ` Viacheslav Dubeyko
@ 2017-11-23  6:05       ` tchou
  2017-11-23  6:23         ` Viacheslav Dubeyko
  0 siblings, 1 reply; 21+ messages in thread
From: tchou @ 2017-11-23  6:05 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: "Ernesto A." Fernández, linux-fsdevel,
	linux-fsdevel-owner, htl10

Viacheslav Dubeyko 於 2017-11-23 12:21 寫到:
> On Thu, 2017-11-23 at 11:57 +0800, tchou wrote:
>> >
>> > Could you please share the examples of incorrect and correct
>> > behaviour?
>> >
>> > What is it exactly that you are trying to fix? You mention an
>> > issue 
>> > with
>> > hangul characters, but I failed to trigger it. Could you expand on 
>> > that?
>> >
>> Hi all,
>> There is an example.I use Mac mini to format my usb disk to HFS+ and 
>> touch
>> the file with filename "공유"(a Korean actor, https://goo.gl/VcBsrn)
>> on 
>> it.
>> After it, I mount the usb disk on my ubuntu(Linux 4.14.0+) and get
>> the
>> following error message when trying to ls and cp the file:
>> 
> 
> 
> Could you share the all mount options that you used? It looks that you
> simply don't use the proper mount options set under the Linux. Because
> MacOS X will mount properly always.
> 
> Thanks,
> Vyacheslav Dubeyko.
> 
Hi Vyacheslav,

I use the default mount option:

mount /dev/sdc1 /mnt/hfs
/dev/sdc1 on /mnt/hfs type hfsplus 
(rw,relatime,umask=22,uid=0,gid=0,nls=utf8)

Thanks,
TCHou
> 
> 
>> ls: cannot access 공유: No such file or directory
>> cp: cannot stat ‘공유’: No such file or directory
>> 
>> It seem's a problem for a long time(https://goo.gl/LiWGe5).
>> After applying my patch, I can ls and cp the file correctly.
>> 
>> >
>> > On Fri, Nov 17, 2017 at 04:20:05PM +0800, Ting-Chang Hou wrote:
>> > >
>> > > The unicode of hangul from macOS is decomposed. There has a bug
>> > > that
>> > > mistake decomposed unicode for composed when change unicode to
>> > > ascii,
>> > > so it cannot recognize the hangul correctly.
>> > >
>> > > Signed-off-by: Ting-Chang Hou <tchou@synology.com>
>> > > ---
>> > >  fs/hfsplus/unicode.c | 2 +-
>> > >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > >
>> > > diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
>> > > index dfa90c2..2daf7b0 100644
>> > > --- a/fs/hfsplus/unicode.c
>> > > +++ b/fs/hfsplus/unicode.c
>> > > @@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
>> > >  	ustrlen = be16_to_cpu(ustr->length);
>> > >  	len = *len_p;
>> > >  	ce1 = NULL;
>> > > -	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE,
>> > > &HFSPLUS_SB(sb)->flags);
>> > > +	compose = test_bit(HFSPLUS_SB_NODECOMPOSE,
>> > > &HFSPLUS_SB(sb)->flags);
>> > I'm not sure this is a mistake. The developers probably wanted the
>> > filenames to be recomposed before being presented in utf8. With
>> > your 
>> > patch,
>> > if you try the following (with the default mount options):
>> >
>> >   touch Á
>> >   ls | hexdump -C
>> >
>> > the utf8 output filename will be using the combining accent (CC
>> > 81) 
>> > instead
>> > of the Á character (C3 81). This is a bit annoying because it
>> > won't 
>> > print
>> > correctly in my terminal anymore.
>> I'm not exatly know why combining accent cannot print correctly in 
>> terminal
>> and how to avoid it. Whether apply my patch or not, my terminal
>> cannot
>> print the hangul charactor correctly.
>> 
>> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23  6:05       ` tchou
@ 2017-11-23  6:23         ` Viacheslav Dubeyko
  2017-11-23  6:34           ` tchou
  0 siblings, 1 reply; 21+ messages in thread
From: Viacheslav Dubeyko @ 2017-11-23  6:23 UTC (permalink / raw)
  To: tchou
  Cc: "Ernesto A." Fernández, linux-fsdevel,
	linux-fsdevel-owner, htl10

On Thu, 2017-11-23 at 14:05 +0800, tchou wrote:
> Viacheslav Dubeyko 於 2017-11-23 12:21 寫到:
> > 
> > On Thu, 2017-11-23 at 11:57 +0800, tchou wrote:
> > > 
> > > > 
> > > > 
> > > > Could you please share the examples of incorrect and correct
> > > > behaviour?
> > > > 
> > > > What is it exactly that you are trying to fix? You mention an
> > > > issue 
> > > > with
> > > > hangul characters, but I failed to trigger it. Could you expand
> > > > on 
> > > > that?
> > > > 
> > > Hi all,
> > > There is an example.I use Mac mini to format my usb disk to HFS+
> > > and 
> > > touch
> > > the file with filename "공유"(a Korean actor, https://goo.gl/VcBsrn
> > > )
> > > on 
> > > it.
> > > After it, I mount the usb disk on my ubuntu(Linux 4.14.0+) and
> > > get
> > > the
> > > following error message when trying to ls and cp the file:
> > > 
> > 
> > Could you share the all mount options that you used? It looks that
> > you
> > simply don't use the proper mount options set under the Linux.
> > Because
> > MacOS X will mount properly always.
> > 
> > Thanks,
> > Vyacheslav Dubeyko.
> > 
> Hi Vyacheslav,
> 
> I use the default mount option:
> 
> mount /dev/sdc1 /mnt/hfs
> /dev/sdc1 on /mnt/hfs type hfsplus 
> (rw,relatime,umask=22,uid=0,gid=0,nls=utf8)
> 


Please, take a look into mount options description [1]. Especially,
take a look into decompose/nodecompose options. You simply needs to use
the proper mount option for your case.

Thanks,
Vyacheslav Dubeyko.


[1] https://www.kernel.org/doc/Documentation/filesystems/hfsplus.txt

> Thanks,
> TCHou
> > 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23  6:23         ` Viacheslav Dubeyko
@ 2017-11-23  6:34           ` tchou
  0 siblings, 0 replies; 21+ messages in thread
From: tchou @ 2017-11-23  6:34 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: "\"Ernesto A.\"" Fernández, linux-fsdevel,
	linux-fsdevel-owner, htl10

Viacheslav Dubeyko 於 2017-11-23 14:23 寫到:
> On Thu, 2017-11-23 at 14:05 +0800, tchou wrote:
>> Viacheslav Dubeyko 於 2017-11-23 12:21 寫到:
>> >
>> > On Thu, 2017-11-23 at 11:57 +0800, tchou wrote:
>> > >
>> > > >
>> > > >
>> > > > Could you please share the examples of incorrect and correct
>> > > > behaviour?
>> > > >
>> > > > What is it exactly that you are trying to fix? You mention an
>> > > > issue 
>> > > > with
>> > > > hangul characters, but I failed to trigger it. Could you expand
>> > > > on 
>> > > > that?
>> > > >
>> > > Hi all,
>> > > There is an example.I use Mac mini to format my usb disk to HFS+
>> > > and 
>> > > touch
>> > > the file with filename "공유"(a Korean actor, https://goo.gl/VcBsrn
>> > > )
>> > > on 
>> > > it.
>> > > After it, I mount the usb disk on my ubuntu(Linux 4.14.0+) and
>> > > get
>> > > the
>> > > following error message when trying to ls and cp the file:
>> > >
>> >
>> > Could you share the all mount options that you used? It looks that
>> > you
>> > simply don't use the proper mount options set under the Linux.
>> > Because
>> > MacOS X will mount properly always.
>> >
>> > Thanks,
>> > Vyacheslav Dubeyko.
>> >
>> Hi Vyacheslav,
>> 
>> I use the default mount option:
>> 
>> mount /dev/sdc1 /mnt/hfs
>> /dev/sdc1 on /mnt/hfs type hfsplus 
>> (rw,relatime,umask=22,uid=0,gid=0,nls=utf8)
>> 
> 
> 
> Please, take a look into mount options description [1]. Especially,
> take a look into decompose/nodecompose options. You simply needs to use
> the proper mount option for your case.
> 
I got it. It's my mistake.
Thank you all for making me right.

Thanks,
TCHou

> Thanks,
> Vyacheslav Dubeyko.
> 
> 
> [1] https://www.kernel.org/doc/Documentation/filesystems/hfsplus.txt
> 
>> Thanks,
>> TCHou
>> >

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23  3:57   ` tchou
  2017-11-23  4:21     ` Viacheslav Dubeyko
@ 2017-11-23 11:32     ` Ernesto A. Fernández
  2017-11-23 18:36       ` Viacheslav Dubeyko
  2017-11-24  7:25       ` tchou
  1 sibling, 2 replies; 21+ messages in thread
From: Ernesto A. Fernández @ 2017-11-23 11:32 UTC (permalink / raw)
  To: tchou
  Cc: linux-fsdevel, linux-fsdevel-owner, slava, htl10,
	Ernesto A. Fernández

Hi:

your issue seems to be in the decomposition of hangul characters, not in
the recomposition before printing. The hfsplus module on linux is saving
the name of your actor as AC F5 C7 20, without performing any
decomposition at all.

The reason your patch hides the bug is because it causes linux to present
filenames as decomposed utf8, so it is not necessary to decompose again
before working with them. But the issue is still there, and you will most
likely run into trouble if you make a hangul filename in linux and try
to work with it in MacOS.

Reviewing the code it would seem that the developers completely forgot
the hangul characters had their own rules for decomposition. It's weird
because they did the composition part correctly.

I've made a quick draft of a patch, mostly by copying the code provided
in the unicode web. I don't think we can actually use it on a release,
but it should be enough to check if I'm right. It works fine on linux,
but I don't have a mac, so it would be great if you could test it for me.

Thanks,
Ernest

(By the way, there is no reason you should have to use the nodecompose
mount option, as the other reviewer suggested. Using that option will
have a similar effect to that of your patch. It will hide the problem,
but if you create a hangul filename on linux with that option you
probably won't be able to use it on a mac.)

---
diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
index dfa90c2..9006c61 100644
--- a/fs/hfsplus/unicode.c
+++ b/fs/hfsplus/unicode.c
@@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block *sb, const char *astr, int len,
 	return size;
 }
 
-/* Decomposes a single unicode character. */
+/* Decomposes a single non-Hangul unicode character. */
 static inline u16 *decompose_unichar(wchar_t uc, int *size)
 {
 	int off;
@@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int *size)
 	return hfsplus_decompose_table + (off / 4);
 }
 
+/* Decomposes a Hangul unicode character. */
+int decompose_hangul(wchar_t uc, u16 *result)
+{
+	int index;
+	int l, v, t;
+
+	index = uc - Hangul_SBase;
+	if (index < 0 || index >= Hangul_SCount)
+		return 0;
+
+	l = Hangul_LBase + index / Hangul_NCount;
+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
+	t = Hangul_TBase + index % Hangul_TCount;
+
+	result[0] = l;
+	result[1] = v;
+	if (t != Hangul_TBase) {
+		result[2] = t;
+		return 3;
+	}
+	return 2;
+}
+
 int hfsplus_asc2uni(struct super_block *sb,
 		    struct hfsplus_unistr *ustr, int max_unistr_len,
 		    const char *astr, int len)
@@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
 	int size, dsize, decompose;
 	u16 *dstr, outlen = 0;
 	wchar_t c;
+	u16 hangul_buf[3];
 
 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
 	while (outlen < max_unistr_len && len > 0) {
 		size = asc2unichar(sb, astr, len, &c);
 
-		if (decompose)
-			dstr = decompose_unichar(c, &dsize);
-		else
+		if (decompose) {
+			/* Hangul is handled separately */
+			dstr = &hangul_buf[0];
+			dsize = decompose_hangul(c, dstr);
+			if (dsize == 0)
+				/* not Hangul */
+				dstr = decompose_unichar(c, &dsize);
+		} else {
 			dstr = NULL;
+		}
+
 		if (dstr) {
 			if (outlen + dsize > max_unistr_len)
 				break;

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23 11:32     ` Ernesto A. Fernández
@ 2017-11-23 18:36       ` Viacheslav Dubeyko
  2017-11-23 22:20         ` Ernesto A. Fernández
  2017-11-24  7:25       ` tchou
  1 sibling, 1 reply; 21+ messages in thread
From: Viacheslav Dubeyko @ 2017-11-23 18:36 UTC (permalink / raw)
  To: Ernesto A. Fernández, tchou
  Cc: linux-fsdevel, linux-fsdevel-owner, htl10

On Thu, 2017-11-23 at 08:32 -0300, Ernesto A. Fernández wrote:
> Hi:
> 
> your issue seems to be in the decomposition of hangul characters, not
> in
> the recomposition before printing. The hfsplus module on linux is
> saving
> the name of your actor as AC F5 C7 20, without performing any
> decomposition at all.
> 
> The reason your patch hides the bug is because it causes linux to
> present
> filenames as decomposed utf8, so it is not necessary to decompose
> again
> before working with them. But the issue is still there, and you will
> most
> likely run into trouble if you make a hangul filename in linux and
> try
> to work with it in MacOS.
> 
> Reviewing the code it would seem that the developers completely
> forgot
> the hangul characters had their own rules for decomposition. It's
> weird
> because they did the composition part correctly.
> 
> I've made a quick draft of a patch, mostly by copying the code
> provided
> in the unicode web. I don't think we can actually use it on a 


Could you please share the link for "the unicode web"?

Thanks,
Vyacheslav Dubeyko.


> release,
> but it should be enough to check if I'm right. It works fine on
> linux,
> but I don't have a mac, so it would be great if you could test it for
> me.
> 
> Thanks,
> Ernest
> 
> (By the way, there is no reason you should have to use the
> nodecompose
> mount option, as the other reviewer suggested. Using that option will
> have a similar effect to that of your patch. It will hide the
> problem,
> but if you create a hangul filename on linux with that option you
> probably won't be able to use it on a mac.)
> 
> ---
> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> index dfa90c2..9006c61 100644
> --- a/fs/hfsplus/unicode.c
> +++ b/fs/hfsplus/unicode.c
> @@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
> *sb, const char *astr, int len,
>  	return size;
>  }
>  
> -/* Decomposes a single unicode character. */
> +/* Decomposes a single non-Hangul unicode character. */
>  static inline u16 *decompose_unichar(wchar_t uc, int *size)
>  {
>  	int off;
> @@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc,
> int *size)
>  	return hfsplus_decompose_table + (off / 4);
>  }
>  
> +/* Decomposes a Hangul unicode character. */
> +int decompose_hangul(wchar_t uc, u16 *result)
> +{
> +	int index;
> +	int l, v, t;
> +
> +	index = uc - Hangul_SBase;
> +	if (index < 0 || index >= Hangul_SCount)
> +		return 0;
> +
> +	l = Hangul_LBase + index / Hangul_NCount;
> +	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> +	t = Hangul_TBase + index % Hangul_TCount;
> +
> +	result[0] = l;
> +	result[1] = v;
> +	if (t != Hangul_TBase) {
> +		result[2] = t;
> +		return 3;
> +	}
> +	return 2;
> +}
> +
>  int hfsplus_asc2uni(struct super_block *sb,
>  		    struct hfsplus_unistr *ustr, int max_unistr_len,
>  		    const char *astr, int len)
> @@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
>  	int size, dsize, decompose;
>  	u16 *dstr, outlen = 0;
>  	wchar_t c;
> +	u16 hangul_buf[3];
>  
>  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE,
> &HFSPLUS_SB(sb)->flags);
>  	while (outlen < max_unistr_len && len > 0) {
>  		size = asc2unichar(sb, astr, len, &c);
>  
> -		if (decompose)
> -			dstr = decompose_unichar(c, &dsize);
> -		else
> +		if (decompose) {
> +			/* Hangul is handled separately */
> +			dstr = &hangul_buf[0];
> +			dsize = decompose_hangul(c, dstr);
> +			if (dsize == 0)
> +				/* not Hangul */
> +				dstr = decompose_unichar(c, &dsize);
> +		} else {
>  			dstr = NULL;
> +		}
> +
>  		if (dstr) {
>  			if (outlen + dsize > max_unistr_len)
>  				break;

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23 18:36       ` Viacheslav Dubeyko
@ 2017-11-23 22:20         ` Ernesto A. Fernández
  0 siblings, 0 replies; 21+ messages in thread
From: Ernesto A. Fernández @ 2017-11-23 22:20 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: Ernesto A. Fernández, tchou, linux-fsdevel,
	linux-fsdevel-owner, htl10

On Thu, Nov 23, 2017 at 10:36:37AM -0800, Viacheslav Dubeyko wrote:
> On Thu, 2017-11-23 at 08:32 -0300, Ernesto A. Fernández wrote:
> > Hi:
> > 
> > your issue seems to be in the decomposition of hangul characters, not
> > in
> > the recomposition before printing. The hfsplus module on linux is
> > saving
> > the name of your actor as AC F5 C7 20, without performing any
> > decomposition at all.
> > 
> > The reason your patch hides the bug is because it causes linux to
> > present
> > filenames as decomposed utf8, so it is not necessary to decompose
> > again
> > before working with them. But the issue is still there, and you will
> > most
> > likely run into trouble if you make a hangul filename in linux and
> > try
> > to work with it in MacOS.
> > 
> > Reviewing the code it would seem that the developers completely
> > forgot
> > the hangul characters had their own rules for decomposition. It's
> > weird
> > because they did the composition part correctly.
> > 
> > I've made a quick draft of a patch, mostly by copying the code
> > provided
> > in the unicode web. I don't think we can actually use it on a 
> 
> 
> Could you please share the link for "the unicode web"?
> 
> Thanks,
> Vyacheslav Dubeyko.

I'm not asking for any reviews yet, just testing because I don't have
a Mac. As long as that's clear, this is the latest version of Unicode:

www.unicode.org/versions/Unicode10.0.0/

You want section 3.12.

> 
> 
> > release,
> > but it should be enough to check if I'm right. It works fine on
> > linux,
> > but I don't have a mac, so it would be great if you could test it for
> > me.
> > 
> > Thanks,
> > Ernest
> > 
> > (By the way, there is no reason you should have to use the
> > nodecompose
> > mount option, as the other reviewer suggested. Using that option will
> > have a similar effect to that of your patch. It will hide the
> > problem,
> > but if you create a hangul filename on linux with that option you
> > probably won't be able to use it on a mac.)
> > 
> > ---
> > diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> > index dfa90c2..9006c61 100644
> > --- a/fs/hfsplus/unicode.c
> > +++ b/fs/hfsplus/unicode.c
> > @@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
> > *sb, const char *astr, int len,
> >  	return size;
> >  }
> >  
> > -/* Decomposes a single unicode character. */
> > +/* Decomposes a single non-Hangul unicode character. */
> >  static inline u16 *decompose_unichar(wchar_t uc, int *size)
> >  {
> >  	int off;
> > @@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc,
> > int *size)
> >  	return hfsplus_decompose_table + (off / 4);
> >  }
> >  
> > +/* Decomposes a Hangul unicode character. */
> > +int decompose_hangul(wchar_t uc, u16 *result)
> > +{
> > +	int index;
> > +	int l, v, t;
> > +
> > +	index = uc - Hangul_SBase;
> > +	if (index < 0 || index >= Hangul_SCount)
> > +		return 0;
> > +
> > +	l = Hangul_LBase + index / Hangul_NCount;
> > +	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> > +	t = Hangul_TBase + index % Hangul_TCount;
> > +
> > +	result[0] = l;
> > +	result[1] = v;
> > +	if (t != Hangul_TBase) {
> > +		result[2] = t;
> > +		return 3;
> > +	}
> > +	return 2;
> > +}
> > +
> >  int hfsplus_asc2uni(struct super_block *sb,
> >  		    struct hfsplus_unistr *ustr, int max_unistr_len,
> >  		    const char *astr, int len)
> > @@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
> >  	int size, dsize, decompose;
> >  	u16 *dstr, outlen = 0;
> >  	wchar_t c;
> > +	u16 hangul_buf[3];
> >  
> >  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE,
> > &HFSPLUS_SB(sb)->flags);
> >  	while (outlen < max_unistr_len && len > 0) {
> >  		size = asc2unichar(sb, astr, len, &c);
> >  
> > -		if (decompose)
> > -			dstr = decompose_unichar(c, &dsize);
> > -		else
> > +		if (decompose) {
> > +			/* Hangul is handled separately */
> > +			dstr = &hangul_buf[0];
> > +			dsize = decompose_hangul(c, dstr);
> > +			if (dsize == 0)
> > +				/* not Hangul */
> > +				dstr = decompose_unichar(c, &dsize);
> > +		} else {
> >  			dstr = NULL;
> > +		}
> > +
> >  		if (dstr) {
> >  			if (outlen + dsize > max_unistr_len)
> >  				break;

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23 11:32     ` Ernesto A. Fernández
  2017-11-23 18:36       ` Viacheslav Dubeyko
@ 2017-11-24  7:25       ` tchou
  2017-11-24 11:45         ` Ernesto A. Fernández
  1 sibling, 1 reply; 21+ messages in thread
From: tchou @ 2017-11-24  7:25 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: linux-fsdevel, linux-fsdevel-owner, slava, htl10

Ernesto A. Fernández 於 2017-11-23 19:32 寫到:
> Hi:
> 
> your issue seems to be in the decomposition of hangul characters, not 
> in
> the recomposition before printing. The hfsplus module on linux is 
> saving
> the name of your actor as AC F5 C7 20, without performing any
> decomposition at all.
> 
> The reason your patch hides the bug is because it causes linux to 
> present
> filenames as decomposed utf8, so it is not necessary to decompose again
> before working with them. But the issue is still there, and you will 
> most
> likely run into trouble if you make a hangul filename in linux and try
> to work with it in MacOS.
> 
> Reviewing the code it would seem that the developers completely forgot
> the hangul characters had their own rules for decomposition. It's weird
> because they did the composition part correctly.
> 
> I've made a quick draft of a patch, mostly by copying the code provided
> in the unicode web. I don't think we can actually use it on a release,
> but it should be enough to check if I'm right. It works fine on linux,
> but I don't have a mac, so it would be great if you could test it for 
> me.
> 
> Thanks,
> Ernest
> 
> (By the way, there is no reason you should have to use the nodecompose
> mount option, as the other reviewer suggested. Using that option will
> have a similar effect to that of your patch. It will hide the problem,
> but if you create a hangul filename on linux with that option you
> probably won't be able to use it on a mac.)
> 
> ---
> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> index dfa90c2..9006c61 100644
> --- a/fs/hfsplus/unicode.c
> +++ b/fs/hfsplus/unicode.c
> @@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
> *sb, const char *astr, int len,
>  	return size;
>  }
> 
> -/* Decomposes a single unicode character. */
> +/* Decomposes a single non-Hangul unicode character. */
>  static inline u16 *decompose_unichar(wchar_t uc, int *size)
>  {
>  	int off;
> @@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, 
> int *size)
>  	return hfsplus_decompose_table + (off / 4);
>  }
> 
> +/* Decomposes a Hangul unicode character. */
> +int decompose_hangul(wchar_t uc, u16 *result)
> +{
> +	int index;
> +	int l, v, t;
> +
> +	index = uc - Hangul_SBase;
> +	if (index < 0 || index >= Hangul_SCount)
> +		return 0;
> +
> +	l = Hangul_LBase + index / Hangul_NCount;
> +	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> +	t = Hangul_TBase + index % Hangul_TCount;
> +
> +	result[0] = l;
> +	result[1] = v;
> +	if (t != Hangul_TBase) {
> +		result[2] = t;
> +		return 3;
> +	}
> +	return 2;
> +}
> +
>  int hfsplus_asc2uni(struct super_block *sb,
>  		    struct hfsplus_unistr *ustr, int max_unistr_len,
>  		    const char *astr, int len)
> @@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
>  	int size, dsize, decompose;
>  	u16 *dstr, outlen = 0;
>  	wchar_t c;
> +	u16 hangul_buf[3];
> 
>  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, 
> &HFSPLUS_SB(sb)->flags);
>  	while (outlen < max_unistr_len && len > 0) {
>  		size = asc2unichar(sb, astr, len, &c);
> 
> -		if (decompose)
> -			dstr = decompose_unichar(c, &dsize);
> -		else
> +		if (decompose) {
> +			/* Hangul is handled separately */
> +			dstr = &hangul_buf[0];
> +			dsize = decompose_hangul(c, dstr);
> +			if (dsize == 0)
> +				/* not Hangul */
> +				dstr = decompose_unichar(c, &dsize);
> +		} else {
>  			dstr = NULL;
> +		}
> +
>  		if (dstr) {
>  			if (outlen + dsize > max_unistr_len)
>  				break;

Hi,

Thank you for your explanation and your draft.
I test four versions of hfsplus module:
--------------------------------
1. origin linux
2. apply my patch
3. mount with nodecompose option
4. apply your patch
--------------------------------

There are the results of the test:
---------------------------------------------------------------------
Linux can ls and cp the file from mac correctly with module 2,3,4.
Mac cannot cp the files correctly which touched by module 1,2,3.
Mac can cp the file correctly which touched by module with your patch.
Mac can ls the files which touch by all modules.
---------------------------------------------------------------------

It seems that everything goes well with your patch. The result is just
like you say, my patch or mount with nodecompose option cannot perform
normally.


By the way, are the decompose procedures of function 
hfsplus_hash_dentry()
and hfsplus_compare_dentry() need to modify as well?

Thanks,
TCHou

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-24  7:25       ` tchou
@ 2017-11-24 11:45         ` Ernesto A. Fernández
  2017-11-27  2:07           ` tchou
  0 siblings, 1 reply; 21+ messages in thread
From: Ernesto A. Fernández @ 2017-11-24 11:45 UTC (permalink / raw)
  To: tchou
  Cc: linux-fsdevel, linux-fsdevel-owner, slava, htl10,
	Ernesto A. Fernández

On Fri, Nov 24, 2017 at 03:25:40PM +0800, tchou wrote:
> Ernesto A. Fernández 於 2017-11-23 19:32 寫到:
> >Hi:
> >
> >your issue seems to be in the decomposition of hangul characters, not in
> >the recomposition before printing. The hfsplus module on linux is saving
> >the name of your actor as AC F5 C7 20, without performing any
> >decomposition at all.
> >
> >The reason your patch hides the bug is because it causes linux to present
> >filenames as decomposed utf8, so it is not necessary to decompose again
> >before working with them. But the issue is still there, and you will most
> >likely run into trouble if you make a hangul filename in linux and try
> >to work with it in MacOS.
> >
> >Reviewing the code it would seem that the developers completely forgot
> >the hangul characters had their own rules for decomposition. It's weird
> >because they did the composition part correctly.
> >
> >I've made a quick draft of a patch, mostly by copying the code provided
> >in the unicode web. I don't think we can actually use it on a release,
> >but it should be enough to check if I'm right. It works fine on linux,
> >but I don't have a mac, so it would be great if you could test it for me.
> >
> >Thanks,
> >Ernest
> >
> >(By the way, there is no reason you should have to use the nodecompose
> >mount option, as the other reviewer suggested. Using that option will
> >have a similar effect to that of your patch. It will hide the problem,
> >but if you create a hangul filename on linux with that option you
> >probably won't be able to use it on a mac.)
> >
> >---
> >diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> >index dfa90c2..9006c61 100644
> >--- a/fs/hfsplus/unicode.c
> >+++ b/fs/hfsplus/unicode.c
> >@@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
> >*sb, const char *astr, int len,
> > 	return size;
> > }
> >
> >-/* Decomposes a single unicode character. */
> >+/* Decomposes a single non-Hangul unicode character. */
> > static inline u16 *decompose_unichar(wchar_t uc, int *size)
> > {
> > 	int off;
> >@@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int
> >*size)
> > 	return hfsplus_decompose_table + (off / 4);
> > }
> >
> >+/* Decomposes a Hangul unicode character. */
> >+int decompose_hangul(wchar_t uc, u16 *result)
> >+{
> >+	int index;
> >+	int l, v, t;
> >+
> >+	index = uc - Hangul_SBase;
> >+	if (index < 0 || index >= Hangul_SCount)
> >+		return 0;
> >+
> >+	l = Hangul_LBase + index / Hangul_NCount;
> >+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> >+	t = Hangul_TBase + index % Hangul_TCount;
> >+
> >+	result[0] = l;
> >+	result[1] = v;
> >+	if (t != Hangul_TBase) {
> >+		result[2] = t;
> >+		return 3;
> >+	}
> >+	return 2;
> >+}
> >+
> > int hfsplus_asc2uni(struct super_block *sb,
> > 		    struct hfsplus_unistr *ustr, int max_unistr_len,
> > 		    const char *astr, int len)
> >@@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
> > 	int size, dsize, decompose;
> > 	u16 *dstr, outlen = 0;
> > 	wchar_t c;
> >+	u16 hangul_buf[3];
> >
> > 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> > 	while (outlen < max_unistr_len && len > 0) {
> > 		size = asc2unichar(sb, astr, len, &c);
> >
> >-		if (decompose)
> >-			dstr = decompose_unichar(c, &dsize);
> >-		else
> >+		if (decompose) {
> >+			/* Hangul is handled separately */
> >+			dstr = &hangul_buf[0];
> >+			dsize = decompose_hangul(c, dstr);
> >+			if (dsize == 0)
> >+				/* not Hangul */
> >+				dstr = decompose_unichar(c, &dsize);
> >+		} else {
> > 			dstr = NULL;
> >+		}
> >+
> > 		if (dstr) {
> > 			if (outlen + dsize > max_unistr_len)
> > 				break;
> 
> Hi,
> 
> Thank you for your explanation and your draft.
> I test four versions of hfsplus module:
> --------------------------------
> 1. origin linux
> 2. apply my patch
> 3. mount with nodecompose option
> 4. apply your patch
> --------------------------------

It seems you were very thorough, thank you.
 
> There are the results of the test:
> ---------------------------------------------------------------------
> Linux can ls and cp the file from mac correctly with module 2,3,4.
> Mac cannot cp the files correctly which touched by module 1,2,3.
> Mac can cp the file correctly which touched by module with your patch.
> Mac can ls the files which touch by all modules.
> ---------------------------------------------------------------------
> 
> It seems that everything goes well with your patch. The result is just
> like you say, my patch or mount with nodecompose option cannot perform
> normally.

That's good to hear. I will submit a patch as soon as I can figure out
the licensing issues. If you want I can credit you as reporter and tester.

> 
> By the way, are the decompose procedures of function hfsplus_hash_dentry()
> and hfsplus_compare_dentry() need to modify as well?

You are correct, good thing you noticed. I'll add that when I resend the
patch.

> Thanks,
> TCHou

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-24 11:45         ` Ernesto A. Fernández
@ 2017-11-27  2:07           ` tchou
  2017-11-27 19:36             ` [PATCH] hfsplus: fix decomposition of Hangul characters Ernesto A. Fernández
  0 siblings, 1 reply; 21+ messages in thread
From: tchou @ 2017-11-27  2:07 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: linux-fsdevel, linux-fsdevel-owner, slava, htl10

Ernesto A. Fernández 於 2017-11-24 19:45 寫到:
> On Fri, Nov 24, 2017 at 03:25:40PM +0800, tchou wrote:
>> Ernesto A. Fernández 於 2017-11-23 19:32 寫到:
>> >Hi:
>> >
>> >your issue seems to be in the decomposition of hangul characters, not in
>> >the recomposition before printing. The hfsplus module on linux is saving
>> >the name of your actor as AC F5 C7 20, without performing any
>> >decomposition at all.
>> >
>> >The reason your patch hides the bug is because it causes linux to present
>> >filenames as decomposed utf8, so it is not necessary to decompose again
>> >before working with them. But the issue is still there, and you will most
>> >likely run into trouble if you make a hangul filename in linux and try
>> >to work with it in MacOS.
>> >
>> >Reviewing the code it would seem that the developers completely forgot
>> >the hangul characters had their own rules for decomposition. It's weird
>> >because they did the composition part correctly.
>> >
>> >I've made a quick draft of a patch, mostly by copying the code provided
>> >in the unicode web. I don't think we can actually use it on a release,
>> >but it should be enough to check if I'm right. It works fine on linux,
>> >but I don't have a mac, so it would be great if you could test it for me.
>> >
>> >Thanks,
>> >Ernest
>> >
>> >(By the way, there is no reason you should have to use the nodecompose
>> >mount option, as the other reviewer suggested. Using that option will
>> >have a similar effect to that of your patch. It will hide the problem,
>> >but if you create a hangul filename on linux with that option you
>> >probably won't be able to use it on a mac.)
>> >
>> >---
>> >diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
>> >index dfa90c2..9006c61 100644
>> >--- a/fs/hfsplus/unicode.c
>> >+++ b/fs/hfsplus/unicode.c
>> >@@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
>> >*sb, const char *astr, int len,
>> > 	return size;
>> > }
>> >
>> >-/* Decomposes a single unicode character. */
>> >+/* Decomposes a single non-Hangul unicode character. */
>> > static inline u16 *decompose_unichar(wchar_t uc, int *size)
>> > {
>> > 	int off;
>> >@@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int
>> >*size)
>> > 	return hfsplus_decompose_table + (off / 4);
>> > }
>> >
>> >+/* Decomposes a Hangul unicode character. */
>> >+int decompose_hangul(wchar_t uc, u16 *result)
>> >+{
>> >+	int index;
>> >+	int l, v, t;
>> >+
>> >+	index = uc - Hangul_SBase;
>> >+	if (index < 0 || index >= Hangul_SCount)
>> >+		return 0;
>> >+
>> >+	l = Hangul_LBase + index / Hangul_NCount;
>> >+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
>> >+	t = Hangul_TBase + index % Hangul_TCount;
>> >+
>> >+	result[0] = l;
>> >+	result[1] = v;
>> >+	if (t != Hangul_TBase) {
>> >+		result[2] = t;
>> >+		return 3;
>> >+	}
>> >+	return 2;
>> >+}
>> >+
>> > int hfsplus_asc2uni(struct super_block *sb,
>> > 		    struct hfsplus_unistr *ustr, int max_unistr_len,
>> > 		    const char *astr, int len)
>> >@@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
>> > 	int size, dsize, decompose;
>> > 	u16 *dstr, outlen = 0;
>> > 	wchar_t c;
>> >+	u16 hangul_buf[3];
>> >
>> > 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
>> > 	while (outlen < max_unistr_len && len > 0) {
>> > 		size = asc2unichar(sb, astr, len, &c);
>> >
>> >-		if (decompose)
>> >-			dstr = decompose_unichar(c, &dsize);
>> >-		else
>> >+		if (decompose) {
>> >+			/* Hangul is handled separately */
>> >+			dstr = &hangul_buf[0];
>> >+			dsize = decompose_hangul(c, dstr);
>> >+			if (dsize == 0)
>> >+				/* not Hangul */
>> >+				dstr = decompose_unichar(c, &dsize);
>> >+		} else {
>> > 			dstr = NULL;
>> >+		}
>> >+
>> > 		if (dstr) {
>> > 			if (outlen + dsize > max_unistr_len)
>> > 				break;
>> 
>> Hi,
>> 
>> Thank you for your explanation and your draft.
>> I test four versions of hfsplus module:
>> --------------------------------
>> 1. origin linux
>> 2. apply my patch
>> 3. mount with nodecompose option
>> 4. apply your patch
>> --------------------------------
> 
> It seems you were very thorough, thank you.
> 
>> There are the results of the test:
>> ---------------------------------------------------------------------
>> Linux can ls and cp the file from mac correctly with module 2,3,4.
>> Mac cannot cp the files correctly which touched by module 1,2,3.
>> Mac can cp the file correctly which touched by module with your patch.
>> Mac can ls the files which touch by all modules.
>> ---------------------------------------------------------------------
>> 
>> It seems that everything goes well with your patch. The result is just
>> like you say, my patch or mount with nodecompose option cannot perform
>> normally.
> 
> That's good to hear. I will submit a patch as soon as I can figure out
> the licensing issues. If you want I can credit you as reporter and 
> tester.
> 
OK, thanks a lot.
>> 
>> By the way, are the decompose procedures of function 
>> hfsplus_hash_dentry()
>> and hfsplus_compare_dentry() need to modify as well?
> 
> You are correct, good thing you noticed. I'll add that when I resend 
> the
> patch.
> 
>> Thanks,
>> TCHou

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-27  2:07           ` tchou
@ 2017-11-27 19:36             ` Ernesto A. Fernández
  2017-11-27 22:40               ` Viacheslav Dubeyko
  2018-08-23 18:29               ` Ernesto A. Fernández
  0 siblings, 2 replies; 21+ messages in thread
From: Ernesto A. Fernández @ 2017-11-27 19:36 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: tchou, slava, htl10, Alexander Viro, Andrew Morton,
	Ernesto A. Fernández

It would be good to have some feedback on the use of code under the
Unicode License, I don't know if that can be a problem.

To Ting-Chang Hou: This patch is slightly different from the one you
tested with your mac. You are still tagged as Tested-by, but you may
want to check again that I have not broken anything.

Thanks,
Ernest

-- >8 --
Subject: [PATCH] hfsplus: fix decomposition of Hangul characters

Files created under macOS become essentially unusable under linux if
their names contain any Hangul, and vice versa.

This happens because the normalization of Hangul characters is a special
case: it can be done algorithmically, without need of a table. The
hfsplus module deals with Hangul correctly when composing, but
completely forgets about it when performing a decomposition.

Solve this by using the Hangul decomposition function provided in the
Unicode Standard. It's under the Unicode License, compatible with the
GPL.

This patch will cause trouble for Hangul filenames already created by
the module in the past. This shouldn't be a special concern because the
main purpose of the module was always sharing with macOS. If a user
actually needs to access such a file the nodecompose mount option should
be enough.

Reported-by: Ting-Chang Hou <tchou@synology.com>
Tested-by: Ting-Chang Hou <tchou@synology.com>
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
---
 fs/hfsplus/unicode.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 56 insertions(+), 6 deletions(-)

diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
index dfa90c2..7ceae91 100644
--- a/fs/hfsplus/unicode.c
+++ b/fs/hfsplus/unicode.c
@@ -272,8 +272,8 @@ static inline int asc2unichar(struct super_block *sb, const char *astr, int len,
 	return size;
 }
 
-/* Decomposes a single unicode character. */
-static inline u16 *decompose_unichar(wchar_t uc, int *size)
+/* Decomposes a non-Hangul unicode character. */
+static u16 *decompose_nonhangul(wchar_t uc, int *size)
 {
 	int off;
 
@@ -296,6 +296,51 @@ static inline u16 *decompose_unichar(wchar_t uc, int *size)
 	return hfsplus_decompose_table + (off / 4);
 }
 
+/*
+ * Decomposes a Hangul unicode character.
+ *
+ * This function was adapted from sample code in section 3.12 of the
+ * Unicode Standard, version 10.0.
+ *
+ * Copyright (C) 1991-2017 Unicode, Inc.  All rights reserved.  Distributed
+ * under the Terms of Use in http://www.unicode.org/copyright.html.
+ */
+static int decompose_hangul(wchar_t uc, u16 *result)
+{
+	int index;
+	int l, v, t;
+
+	index = uc - Hangul_SBase;
+	if (index < 0 || index >= Hangul_SCount)
+		return 0;
+
+	l = Hangul_LBase + index / Hangul_NCount;
+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
+	t = Hangul_TBase + index % Hangul_TCount;
+
+	result[0] = l;
+	result[1] = v;
+	if (t != Hangul_TBase) {
+		result[2] = t;
+		return 3;
+	}
+	return 2;
+}
+
+/* Decomposes a single unicode character. */
+static u16 *decompose_unichar(wchar_t uc, int *size, u16 *hangul_buffer)
+{
+	u16 *result;
+
+	/* Hangul is handled separately */
+	result = hangul_buffer;
+	*size = decompose_hangul(uc, result);
+	if (*size == 0)
+		/* not Hangul */
+		result = decompose_nonhangul(uc, size);
+	return result;
+}
+
 int hfsplus_asc2uni(struct super_block *sb,
 		    struct hfsplus_unistr *ustr, int max_unistr_len,
 		    const char *astr, int len)
@@ -303,13 +348,14 @@ int hfsplus_asc2uni(struct super_block *sb,
 	int size, dsize, decompose;
 	u16 *dstr, outlen = 0;
 	wchar_t c;
+	u16 dhangul[3];
 
 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
 	while (outlen < max_unistr_len && len > 0) {
 		size = asc2unichar(sb, astr, len, &c);
 
 		if (decompose)
-			dstr = decompose_unichar(c, &dsize);
+			dstr = decompose_unichar(c, &dsize, &dhangul[0]);
 		else
 			dstr = NULL;
 		if (dstr) {
@@ -344,6 +390,7 @@ int hfsplus_hash_dentry(const struct dentry *dentry, struct qstr *str)
 	unsigned long hash;
 	wchar_t c;
 	u16 c2;
+	u16 dhangul[3];
 
 	casefold = test_bit(HFSPLUS_SB_CASEFOLD, &HFSPLUS_SB(sb)->flags);
 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
@@ -357,7 +404,7 @@ int hfsplus_hash_dentry(const struct dentry *dentry, struct qstr *str)
 		len -= size;
 
 		if (decompose)
-			dstr = decompose_unichar(c, &dsize);
+			dstr = decompose_unichar(c, &dsize, &dhangul[0]);
 		else
 			dstr = NULL;
 		if (dstr) {
@@ -396,6 +443,7 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
 	const char *astr1, *astr2;
 	u16 c1, c2;
 	wchar_t c;
+	u16 dhangul_1[3], dhangul_2[3];
 
 	casefold = test_bit(HFSPLUS_SB_CASEFOLD, &HFSPLUS_SB(sb)->flags);
 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
@@ -413,7 +461,8 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
 			len1 -= size;
 
 			if (decompose)
-				dstr1 = decompose_unichar(c, &dsize1);
+				dstr1 = decompose_unichar(c, &dsize1,
+							  &dhangul_1[0]);
 			if (!decompose || !dstr1) {
 				c1 = c;
 				dstr1 = &c1;
@@ -427,7 +476,8 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
 			len2 -= size;
 
 			if (decompose)
-				dstr2 = decompose_unichar(c, &dsize2);
+				dstr2 = decompose_unichar(c, &dsize2,
+							  &dhangul_2[0]);
 			if (!decompose || !dstr2) {
 				c2 = c;
 				dstr2 = &c2;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-27 19:36             ` [PATCH] hfsplus: fix decomposition of Hangul characters Ernesto A. Fernández
@ 2017-11-27 22:40               ` Viacheslav Dubeyko
  2017-11-28 15:02                 ` Ernesto A. Fernández
  2018-08-23 18:29               ` Ernesto A. Fernández
  1 sibling, 1 reply; 21+ messages in thread
From: Viacheslav Dubeyko @ 2017-11-27 22:40 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: linux-fsdevel, tchou, htl10, Alexander Viro, Andrew Morton

On Mon, 2017-11-27 at 16:36 -0300, Ernesto A. Fernández wrote:
> It would be good to have some feedback on the use of code under the
> Unicode License, I don't know if that can be a problem.
> 
> To Ting-Chang Hou: This patch is slightly different from the one you
> tested with your mac. You are still tagged as Tested-by, but you may
> want to check again that I have not broken anything.
> 
> Thanks,
> Ernest
> 
> -- >8 --
> Subject: [PATCH] hfsplus: fix decomposition of Hangul characters
> 
> Files created under macOS become essentially unusable under linux if
> their names contain any Hangul, and vice versa.
> 
> This happens because the normalization of Hangul characters is a special
> case: it can be done algorithmically, without need of a table. The
> hfsplus module deals with Hangul correctly when composing, but
> completely forgets about it when performing a decomposition.
> 
> Solve this by using the Hangul decomposition function provided in the
> Unicode Standard. It's under the Unicode License, compatible with the
> GPL.
> 

I believe it makes sense to share more details about algorithm details.
I think that the Unicode standard contains some explanation how to
decompose correctly. It will be great to have some brief citation in the
patch description.

> This patch will cause trouble for Hangul filenames already created by
> the module in the past. This shouldn't be a special concern because the
> main purpose of the module was always sharing with macOS. If a user
> actually needs to access such a file the nodecompose mount option should
> be enough.
> 
> Reported-by: Ting-Chang Hou <tchou@synology.com>
> Tested-by: Ting-Chang Hou <tchou@synology.com>
> Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
> ---
>  fs/hfsplus/unicode.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 56 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> index dfa90c2..7ceae91 100644
> --- a/fs/hfsplus/unicode.c
> +++ b/fs/hfsplus/unicode.c
> @@ -272,8 +272,8 @@ static inline int asc2unichar(struct super_block *sb, const char *astr, int len,
>  	return size;
>  }
>  
> -/* Decomposes a single unicode character. */
> -static inline u16 *decompose_unichar(wchar_t uc, int *size)
> +/* Decomposes a non-Hangul unicode character. */
> +static u16 *decompose_nonhangul(wchar_t uc, int *size)
>  {
>  	int off;
>  
> @@ -296,6 +296,51 @@ static inline u16 *decompose_unichar(wchar_t uc, int *size)
>  	return hfsplus_decompose_table + (off / 4);
>  }
>  
> +/*
> + * Decomposes a Hangul unicode character.
> + *
> + * This function was adapted from sample code in section 3.12 of the
> + * Unicode Standard, version 10.0.
> + *
> + * Copyright (C) 1991-2017 Unicode, Inc.  All rights reserved.  Distributed
> + * under the Terms of Use in http://www.unicode.org/copyright.html.
> + */
> +static int decompose_hangul(wchar_t uc, u16 *result)
> +{
> +	int index;
> +	int l, v, t;
> +
> +	index = uc - Hangul_SBase;
> +	if (index < 0 || index >= Hangul_SCount)
> +		return 0;
> +
> +	l = Hangul_LBase + index / Hangul_NCount;
> +	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> +	t = Hangul_TBase + index % Hangul_TCount;
> +
> +	result[0] = l;
> +	result[1] = v;
> +	if (t != Hangul_TBase) {
> +		result[2] = t;
> +		return 3;
> +	}
> +	return 2;
> +}
> +
> +/* Decomposes a single unicode character. */
> +static u16 *decompose_unichar(wchar_t uc, int *size, u16 *hangul_buffer)
> +{
> +	u16 *result;
> +
> +	/* Hangul is handled separately */
> +	result = hangul_buffer;
> +	*size = decompose_hangul(uc, result);

Is it possible to distinguish the different cases beforehand and to
select one proper function for execution without the execution both ones
in the worst case?

> +	if (*size == 0)
> +		/* not Hangul */
> +		result = decompose_nonhangul(uc, size);

I think it makes sense to use the brackets here. If I remember
correctly, this is the code style's requirement.

Thanks,
Vyacheslav Dubeyko.

> +	return result;
> +}
> +
>  int hfsplus_asc2uni(struct super_block *sb,
>  		    struct hfsplus_unistr *ustr, int max_unistr_len,
>  		    const char *astr, int len)
> @@ -303,13 +348,14 @@ int hfsplus_asc2uni(struct super_block *sb,
>  	int size, dsize, decompose;
>  	u16 *dstr, outlen = 0;
>  	wchar_t c;
> +	u16 dhangul[3];
>  
>  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
>  	while (outlen < max_unistr_len && len > 0) {
>  		size = asc2unichar(sb, astr, len, &c);
>  
>  		if (decompose)
> -			dstr = decompose_unichar(c, &dsize);
> +			dstr = decompose_unichar(c, &dsize, &dhangul[0]);
>  		else
>  			dstr = NULL;
>  		if (dstr) {
> @@ -344,6 +390,7 @@ int hfsplus_hash_dentry(const struct dentry *dentry, struct qstr *str)
>  	unsigned long hash;
>  	wchar_t c;
>  	u16 c2;
> +	u16 dhangul[3];
>  
>  	casefold = test_bit(HFSPLUS_SB_CASEFOLD, &HFSPLUS_SB(sb)->flags);
>  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> @@ -357,7 +404,7 @@ int hfsplus_hash_dentry(const struct dentry *dentry, struct qstr *str)
>  		len -= size;
>  
>  		if (decompose)
> -			dstr = decompose_unichar(c, &dsize);
> +			dstr = decompose_unichar(c, &dsize, &dhangul[0]);
>  		else
>  			dstr = NULL;
>  		if (dstr) {
> @@ -396,6 +443,7 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
>  	const char *astr1, *astr2;
>  	u16 c1, c2;
>  	wchar_t c;
> +	u16 dhangul_1[3], dhangul_2[3];
>  
>  	casefold = test_bit(HFSPLUS_SB_CASEFOLD, &HFSPLUS_SB(sb)->flags);
>  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> @@ -413,7 +461,8 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
>  			len1 -= size;
>  
>  			if (decompose)
> -				dstr1 = decompose_unichar(c, &dsize1);
> +				dstr1 = decompose_unichar(c, &dsize1,
> +							  &dhangul_1[0]);
>  			if (!decompose || !dstr1) {
>  				c1 = c;
>  				dstr1 = &c1;
> @@ -427,7 +476,8 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
>  			len2 -= size;
>  
>  			if (decompose)
> -				dstr2 = decompose_unichar(c, &dsize2);
> +				dstr2 = decompose_unichar(c, &dsize2,
> +							  &dhangul_2[0]);
>  			if (!decompose || !dstr2) {
>  				c2 = c;
>  				dstr2 = &c2;

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-27 22:40               ` Viacheslav Dubeyko
@ 2017-11-28 15:02                 ` Ernesto A. Fernández
  2017-11-28 16:30                   ` Viacheslav Dubeyko
  0 siblings, 1 reply; 21+ messages in thread
From: Ernesto A. Fernández @ 2017-11-28 15:02 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: linux-fsdevel, tchou, htl10, Alexander Viro, Andrew Morton,
	Ernesto A. Fernández

On Mon, Nov 27, 2017 at 02:40:53PM -0800, Viacheslav Dubeyko wrote:
> On Mon, 2017-11-27 at 16:36 -0300, Ernesto A. Fernández wrote:
> > It would be good to have some feedback on the use of code under the
> > Unicode License, I don't know if that can be a problem.
> > 
> > To Ting-Chang Hou: This patch is slightly different from the one you
> > tested with your mac. You are still tagged as Tested-by, but you may
> > want to check again that I have not broken anything.
> > 
> > Thanks,
> > Ernest
> > 
> > -- >8 --
> > Subject: [PATCH] hfsplus: fix decomposition of Hangul characters
> > 
> > Files created under macOS become essentially unusable under linux if
> > their names contain any Hangul, and vice versa.
> > 
> > This happens because the normalization of Hangul characters is a special
> > case: it can be done algorithmically, without need of a table. The
> > hfsplus module deals with Hangul correctly when composing, but
> > completely forgets about it when performing a decomposition.
> > 
> > Solve this by using the Hangul decomposition function provided in the
> > Unicode Standard. It's under the Unicode License, compatible with the
> > GPL.
> > 
> 
> I believe it makes sense to share more details about algorithm details.
> I think that the Unicode standard contains some explanation how to
> decompose correctly. It will be great to have some brief citation in the
> patch description.

The algorithm is very simple, the best way to understand it is just
looking at the code. I don't know the first thing about Korean writing, so
I don't think I should attempt to explain why the decomposition is done
this way. If somebody else is interested in the details, they can follow
the citation in the header comment of the decompose_hangul function.

> > This patch will cause trouble for Hangul filenames already created by
> > the module in the past. This shouldn't be a special concern because the
> > main purpose of the module was always sharing with macOS. If a user
> > actually needs to access such a file the nodecompose mount option should
> > be enough.
> > 
> > Reported-by: Ting-Chang Hou <tchou@synology.com>
> > Tested-by: Ting-Chang Hou <tchou@synology.com>
> > Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
> > ---
> >  fs/hfsplus/unicode.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++-----
> >  1 file changed, 56 insertions(+), 6 deletions(-)
> > 
> > diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> > index dfa90c2..7ceae91 100644
> > --- a/fs/hfsplus/unicode.c
> > +++ b/fs/hfsplus/unicode.c
> > @@ -272,8 +272,8 @@ static inline int asc2unichar(struct super_block *sb, const char *astr, int len,
> >  	return size;
> >  }
> >  
> > -/* Decomposes a single unicode character. */
> > -static inline u16 *decompose_unichar(wchar_t uc, int *size)
> > +/* Decomposes a non-Hangul unicode character. */
> > +static u16 *decompose_nonhangul(wchar_t uc, int *size)
> >  {
> >  	int off;
> >  
> > @@ -296,6 +296,51 @@ static inline u16 *decompose_unichar(wchar_t uc, int *size)
> >  	return hfsplus_decompose_table + (off / 4);
> >  }
> >  
> > +/*
> > + * Decomposes a Hangul unicode character.
> > + *
> > + * This function was adapted from sample code in section 3.12 of the
> > + * Unicode Standard, version 10.0.
> > + *
> > + * Copyright (C) 1991-2017 Unicode, Inc.  All rights reserved.  Distributed
> > + * under the Terms of Use in http://www.unicode.org/copyright.html.
> > + */
> > +static int decompose_hangul(wchar_t uc, u16 *result)
> > +{
> > +	int index;
> > +	int l, v, t;
> > +
> > +	index = uc - Hangul_SBase;
> > +	if (index < 0 || index >= Hangul_SCount)
> > +		return 0;
> > +
> > +	l = Hangul_LBase + index / Hangul_NCount;
> > +	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> > +	t = Hangul_TBase + index % Hangul_TCount;
> > +
> > +	result[0] = l;
> > +	result[1] = v;
> > +	if (t != Hangul_TBase) {
> > +		result[2] = t;
> > +		return 3;
> > +	}
> > +	return 2;
> > +}
> > +
> > +/* Decomposes a single unicode character. */
> > +static u16 *decompose_unichar(wchar_t uc, int *size, u16 *hangul_buffer)
> > +{
> > +	u16 *result;
> > +
> > +	/* Hangul is handled separately */
> > +	result = hangul_buffer;
> > +	*size = decompose_hangul(uc, result);
> 
> Is it possible to distinguish the different cases beforehand and to
> select one proper function for execution without the execution both ones
> in the worst case?

The decompose_hangul and decompose_nonhangul functions are called only once,
so I would expect the compiler to inline them both. The result should be
similar to your suggestion.

> > +	if (*size == 0)
> > +		/* not Hangul */
> > +		result = decompose_nonhangul(uc, size);
> 
> I think it makes sense to use the brackets here. If I remember
> correctly, this is the code style's requirement.

I couldn't find such a requirement, but I did find some other places within
the hfsplus code where the comments are done my way. For example, in the
hfsplus_cat_write_inode() function of the inode.c file.

> Thanks,
> Vyacheslav Dubeyko.

Thanks,
Ernest
 
> > +	return result;
> > +}
> > +
> >  int hfsplus_asc2uni(struct super_block *sb,
> >  		    struct hfsplus_unistr *ustr, int max_unistr_len,
> >  		    const char *astr, int len)
> > @@ -303,13 +348,14 @@ int hfsplus_asc2uni(struct super_block *sb,
> >  	int size, dsize, decompose;
> >  	u16 *dstr, outlen = 0;
> >  	wchar_t c;
> > +	u16 dhangul[3];
> >  
> >  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> >  	while (outlen < max_unistr_len && len > 0) {
> >  		size = asc2unichar(sb, astr, len, &c);
> >  
> >  		if (decompose)
> > -			dstr = decompose_unichar(c, &dsize);
> > +			dstr = decompose_unichar(c, &dsize, &dhangul[0]);
> >  		else
> >  			dstr = NULL;
> >  		if (dstr) {
> > @@ -344,6 +390,7 @@ int hfsplus_hash_dentry(const struct dentry *dentry, struct qstr *str)
> >  	unsigned long hash;
> >  	wchar_t c;
> >  	u16 c2;
> > +	u16 dhangul[3];
> >  
> >  	casefold = test_bit(HFSPLUS_SB_CASEFOLD, &HFSPLUS_SB(sb)->flags);
> >  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> > @@ -357,7 +404,7 @@ int hfsplus_hash_dentry(const struct dentry *dentry, struct qstr *str)
> >  		len -= size;
> >  
> >  		if (decompose)
> > -			dstr = decompose_unichar(c, &dsize);
> > +			dstr = decompose_unichar(c, &dsize, &dhangul[0]);
> >  		else
> >  			dstr = NULL;
> >  		if (dstr) {
> > @@ -396,6 +443,7 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
> >  	const char *astr1, *astr2;
> >  	u16 c1, c2;
> >  	wchar_t c;
> > +	u16 dhangul_1[3], dhangul_2[3];
> >  
> >  	casefold = test_bit(HFSPLUS_SB_CASEFOLD, &HFSPLUS_SB(sb)->flags);
> >  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> > @@ -413,7 +461,8 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
> >  			len1 -= size;
> >  
> >  			if (decompose)
> > -				dstr1 = decompose_unichar(c, &dsize1);
> > +				dstr1 = decompose_unichar(c, &dsize1,
> > +							  &dhangul_1[0]);
> >  			if (!decompose || !dstr1) {
> >  				c1 = c;
> >  				dstr1 = &c1;
> > @@ -427,7 +476,8 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
> >  			len2 -= size;
> >  
> >  			if (decompose)
> > -				dstr2 = decompose_unichar(c, &dsize2);
> > +				dstr2 = decompose_unichar(c, &dsize2,
> > +							  &dhangul_2[0]);
> >  			if (!decompose || !dstr2) {
> >  				c2 = c;
> >  				dstr2 = &c2;
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-28 15:02                 ` Ernesto A. Fernández
@ 2017-11-28 16:30                   ` Viacheslav Dubeyko
  2017-11-28 18:15                     ` Ernesto A. Fernández
  0 siblings, 1 reply; 21+ messages in thread
From: Viacheslav Dubeyko @ 2017-11-28 16:30 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: linux-fsdevel, tchou, htl10, Alexander Viro, Andrew Morton

On Tue, 2017-11-28 at 12:02 -0300, Ernesto A. Fernández wrote:
> On Mon, Nov 27, 2017 at 02:40:53PM -0800, Viacheslav Dubeyko wrote:
> > 
> > On Mon, 2017-11-27 at 16:36 -0300, Ernesto A. Fernández wrote:
> > > 
> > > It would be good to have some feedback on the use of code under
> > > the
> > > Unicode License, I don't know if that can be a problem.
> > > 
> > > To Ting-Chang Hou: This patch is slightly different from the one
> > > you
> > > tested with your mac. You are still tagged as Tested-by, but you
> > > may
> > > want to check again that I have not broken anything.
> > > 
> > > Thanks,
> > > Ernest
> > > 
> > > -- >8 --
> > > Subject: [PATCH] hfsplus: fix decomposition of Hangul characters
> > > 
> > > Files created under macOS become essentially unusable under linux
> > > if
> > > their names contain any Hangul, and vice versa.
> > > 
> > > This happens because the normalization of Hangul characters is a
> > > special
> > > case: it can be done algorithmically, without need of a table.
> > > The
> > > hfsplus module deals with Hangul correctly when composing, but
> > > completely forgets about it when performing a decomposition.
> > > 
> > > Solve this by using the Hangul decomposition function provided in
> > > the
> > > Unicode Standard. It's under the Unicode License, compatible with
> > > the
> > > GPL.
> > > 
> > I believe it makes sense to share more details about algorithm
> > details.
> > I think that the Unicode standard contains some explanation how to
> > decompose correctly. It will be great to have some brief citation
> > in the
> > patch description.
> The algorithm is very simple, the best way to understand it is just
> looking at the code. I don't know the first thing about Korean
> writing, so
> I don't think I should attempt to explain why the decomposition is
> done
> this way. If somebody else is interested in the details, they can
> follow
> the citation in the header comment of the decompose_hangul function.
> 

You should share the link where it can be found. Finally, the comment
needs in more detailed explanation.


> > 
> > > 
> > > This patch will cause trouble for Hangul filenames already
> > > created by
> > > the module in the past. This shouldn't be a special concern
> > > because the
> > > main purpose of the module was always sharing with macOS. If a
> > > user
> > > actually needs to access such a file the nodecompose mount option
> > > should
> > > be enough.
> > > 
> > > Reported-by: Ting-Chang Hou <tchou@synology.com>
> > > Tested-by: Ting-Chang Hou <tchou@synology.com>
> > > Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.
> > > com>
> > > ---
> > >  fs/hfsplus/unicode.c | 62
> > > +++++++++++++++++++++++++++++++++++++++++++++++-----
> > >  1 file changed, 56 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> > > index dfa90c2..7ceae91 100644
> > > --- a/fs/hfsplus/unicode.c
> > > +++ b/fs/hfsplus/unicode.c
> > > @@ -272,8 +272,8 @@ static inline int asc2unichar(struct
> > > super_block *sb, const char *astr, int len,
> > >  	return size;
> > >  }
> > >  
> > > -/* Decomposes a single unicode character. */
> > > -static inline u16 *decompose_unichar(wchar_t uc, int *size)
> > > +/* Decomposes a non-Hangul unicode character. */
> > > +static u16 *decompose_nonhangul(wchar_t uc, int *size)
> > >  {
> > >  	int off;
> > >  
> > > @@ -296,6 +296,51 @@ static inline u16 *decompose_unichar(wchar_t
> > > uc, int *size)
> > >  	return hfsplus_decompose_table + (off / 4);
> > >  }
> > >  
> > > +/*
> > > + * Decomposes a Hangul unicode character.
> > > + *
> > > + * This function was adapted from sample code in section 3.12 of
> > > the
> > > + * Unicode Standard, version 10.0.
> > > + *
> > > + * Copyright (C) 1991-2017 Unicode, Inc.  All rights
> > > reserved.  Distributed
> > > + * under the Terms of Use in http://www.unicode.org/copyright.ht
> > > ml.
> > > + */
> > > +static int decompose_hangul(wchar_t uc, u16 *result)
> > > +{
> > > +	int index;
> > > +	int l, v, t;
> > > +
> > > +	index = uc - Hangul_SBase;
> > > +	if (index < 0 || index >= Hangul_SCount)
> > > +		return 0;
> > > +
> > > +	l = Hangul_LBase + index / Hangul_NCount;
> > > +	v = Hangul_VBase + (index % Hangul_NCount) /
> > > Hangul_TCount;
> > > +	t = Hangul_TBase + index % Hangul_TCount;
> > > +
> > > +	result[0] = l;
> > > +	result[1] = v;
> > > +	if (t != Hangul_TBase) {
> > > +		result[2] = t;
> > > +		return 3;
> > > +	}
> > > +	return 2;
> > > +}
> > > +
> > > +/* Decomposes a single unicode character. */
> > > +static u16 *decompose_unichar(wchar_t uc, int *size, u16
> > > *hangul_buffer)
> > > +{
> > > +	u16 *result;
> > > +
> > > +	/* Hangul is handled separately */
> > > +	result = hangul_buffer;
> > > +	*size = decompose_hangul(uc, result);
> > Is it possible to distinguish the different cases beforehand and to
> > select one proper function for execution without the execution both
> > ones
> > in the worst case?
> The decompose_hangul and decompose_nonhangul functions are called
> only once,
> so I would expect the compiler to inline them both. The result should
> be
> similar to your suggestion.
> 


It will be much better to have:

If (hangful) {
    <decompose_hangul>
} else {
    <decompose_nonhangul>
}

Currently, the code looks weird. Especially, decompose_hangul() is able
to distinguish two different cases. So, it's easy to extract the check.

> > 
> > > 
> > > +	if (*size == 0)
> > > +		/* not Hangul */
> > > +		result = decompose_nonhangul(uc, size);
> > I think it makes sense to use the brackets here. If I remember
> > correctly, this is the code style's requirement.
> I couldn't find such a requirement, but I did find some other places
> within
> the hfsplus code where the comments are done my way. For example, in
> the
> hfsplus_cat_write_inode() function of the inode.c file.
> 


The hfsplus code contains many awful places. It doesn't mean that it
needs to follow the bad style examples. If you have only one string:

if (condition)
    <do_something>

then it's OK not to use the brackets. Otherwise, you should include the
branch into the brackets.

Thanks,
Vyacheslav Dubeyko.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-28 16:30                   ` Viacheslav Dubeyko
@ 2017-11-28 18:15                     ` Ernesto A. Fernández
  0 siblings, 0 replies; 21+ messages in thread
From: Ernesto A. Fernández @ 2017-11-28 18:15 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: linux-fsdevel, tchou, htl10, Alexander Viro, Andrew Morton,
	Ernesto A. Fernández

On Tue, Nov 28, 2017 at 08:30:54AM -0800, Viacheslav Dubeyko wrote:
> On Tue, 2017-11-28 at 12:02 -0300, Ernesto A. Fernández wrote:
> > On Mon, Nov 27, 2017 at 02:40:53PM -0800, Viacheslav Dubeyko wrote:
> > > 
> > > On Mon, 2017-11-27 at 16:36 -0300, Ernesto A. Fernández wrote:
> > > > 
> > > > It would be good to have some feedback on the use of code under
> > > > the
> > > > Unicode License, I don't know if that can be a problem.
> > > > 
> > > > To Ting-Chang Hou: This patch is slightly different from the one
> > > > you
> > > > tested with your mac. You are still tagged as Tested-by, but you
> > > > may
> > > > want to check again that I have not broken anything.
> > > > 
> > > > Thanks,
> > > > Ernest
> > > > 
> > > > -- >8 --
> > > > Subject: [PATCH] hfsplus: fix decomposition of Hangul characters
> > > > 
> > > > Files created under macOS become essentially unusable under linux
> > > > if
> > > > their names contain any Hangul, and vice versa.
> > > > 
> > > > This happens because the normalization of Hangul characters is a
> > > > special
> > > > case: it can be done algorithmically, without need of a table.
> > > > The
> > > > hfsplus module deals with Hangul correctly when composing, but
> > > > completely forgets about it when performing a decomposition.
> > > > 
> > > > Solve this by using the Hangul decomposition function provided in
> > > > the
> > > > Unicode Standard. It's under the Unicode License, compatible with
> > > > the
> > > > GPL.
> > > > 
> > > I believe it makes sense to share more details about algorithm
> > > details.
> > > I think that the Unicode standard contains some explanation how to
> > > decompose correctly. It will be great to have some brief citation
> > > in the
> > > patch description.
> > The algorithm is very simple, the best way to understand it is just
> > looking at the code. I don't know the first thing about Korean
> > writing, so
> > I don't think I should attempt to explain why the decomposition is
> > done
> > this way. If somebody else is interested in the details, they can
> > follow
> > the citation in the header comment of the decompose_hangul function.
> > 
> 
> You should share the link where it can be found.

Is that necessary? I think people can find the Unicode Standard on their
own. Even if I did provide a link, it's a pdf, so people would still need
to look up the right sections. It's not going to save anybody much work.

> Finally, the comment
> needs in more detailed explanation.

Are you talking about the function header comment? I don't know if I
follow. Perhaps I could explain that the return value is the size, but
it seems pretty obvious from the use.

> > > 
> > > > 
> > > > This patch will cause trouble for Hangul filenames already
> > > > created by
> > > > the module in the past. This shouldn't be a special concern
> > > > because the
> > > > main purpose of the module was always sharing with macOS. If a
> > > > user
> > > > actually needs to access such a file the nodecompose mount option
> > > > should
> > > > be enough.
> > > > 
> > > > Reported-by: Ting-Chang Hou <tchou@synology.com>
> > > > Tested-by: Ting-Chang Hou <tchou@synology.com>
> > > > Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.
> > > > com>
> > > > ---
> > > >  fs/hfsplus/unicode.c | 62
> > > > +++++++++++++++++++++++++++++++++++++++++++++++-----
> > > >  1 file changed, 56 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> > > > index dfa90c2..7ceae91 100644
> > > > --- a/fs/hfsplus/unicode.c
> > > > +++ b/fs/hfsplus/unicode.c
> > > > @@ -272,8 +272,8 @@ static inline int asc2unichar(struct
> > > > super_block *sb, const char *astr, int len,
> > > >  	return size;
> > > >  }
> > > >  
> > > > -/* Decomposes a single unicode character. */
> > > > -static inline u16 *decompose_unichar(wchar_t uc, int *size)
> > > > +/* Decomposes a non-Hangul unicode character. */
> > > > +static u16 *decompose_nonhangul(wchar_t uc, int *size)
> > > >  {
> > > >  	int off;
> > > >  
> > > > @@ -296,6 +296,51 @@ static inline u16 *decompose_unichar(wchar_t
> > > > uc, int *size)
> > > >  	return hfsplus_decompose_table + (off / 4);
> > > >  }
> > > >  
> > > > +/*
> > > > + * Decomposes a Hangul unicode character.
> > > > + *
> > > > + * This function was adapted from sample code in section 3.12 of
> > > > the
> > > > + * Unicode Standard, version 10.0.
> > > > + *
> > > > + * Copyright (C) 1991-2017 Unicode, Inc.  All rights
> > > > reserved.  Distributed
> > > > + * under the Terms of Use in http://www.unicode.org/copyright.ht
> > > > ml.
> > > > + */
> > > > +static int decompose_hangul(wchar_t uc, u16 *result)
> > > > +{
> > > > +	int index;
> > > > +	int l, v, t;
> > > > +
> > > > +	index = uc - Hangul_SBase;
> > > > +	if (index < 0 || index >= Hangul_SCount)
> > > > +		return 0;
> > > > +
> > > > +	l = Hangul_LBase + index / Hangul_NCount;
> > > > +	v = Hangul_VBase + (index % Hangul_NCount) /
> > > > Hangul_TCount;
> > > > +	t = Hangul_TBase + index % Hangul_TCount;
> > > > +
> > > > +	result[0] = l;
> > > > +	result[1] = v;
> > > > +	if (t != Hangul_TBase) {
> > > > +		result[2] = t;
> > > > +		return 3;
> > > > +	}
> > > > +	return 2;
> > > > +}
> > > > +
> > > > +/* Decomposes a single unicode character. */
> > > > +static u16 *decompose_unichar(wchar_t uc, int *size, u16
> > > > *hangul_buffer)
> > > > +{
> > > > +	u16 *result;
> > > > +
> > > > +	/* Hangul is handled separately */
> > > > +	result = hangul_buffer;
> > > > +	*size = decompose_hangul(uc, result);
> > > Is it possible to distinguish the different cases beforehand and to
> > > select one proper function for execution without the execution both
> > > ones
> > > in the worst case?
> > The decompose_hangul and decompose_nonhangul functions are called
> > only once,
> > so I would expect the compiler to inline them both. The result should
> > be
> > similar to your suggestion.
> > 
> 
> 
> It will be much better to have:
> 
> If (hangful) {
>     <decompose_hangul>
> } else {
>     <decompose_nonhangul>
> }
> 
> Currently, the code looks weird. Especially, decompose_hangul() is able
> to distinguish two different cases. So, it's easy to extract the check.

OK, your previous objection seemed to be about efficiency, not style.

Moving the Hangul check outside the function would require a reader of
decompose_unichar() to figure out what Hangul_SBase and Hangul_SCount
are, when they may not be interested in the Hangul case at all. My
version may be weird but I think it's easier to understand.

Keep in mind that the old decompose_unichar function was also checking
if the character was Hangul, and doing nothing if that was the case. So
this is not a big departure from the existing code.

Finally, there is the issue of the license; I don't know if I should be
splitting decompose_hangul() since it would make it harder to give
attribution.

> > > 
> > > > 
> > > > +	if (*size == 0)
> > > > +		/* not Hangul */
> > > > +		result = decompose_nonhangul(uc, size);
> > > I think it makes sense to use the brackets here. If I remember
> > > correctly, this is the code style's requirement.
> > I couldn't find such a requirement, but I did find some other places
> > within
> > the hfsplus code where the comments are done my way. For example, in
> > the
> > hfsplus_cat_write_inode() function of the inode.c file.
> > 
> 
> 
> The hfsplus code contains many awful places.

I'd say most of the older code is decent and easy to understand.

> It doesn't mean that it
> needs to follow the bad style examples. If you have only one string:
> 
> if (condition)
>     <do_something>
> 
> then it's OK not to use the brackets. Otherwise, you should include the
> branch into the brackets.

I looked around outside of hfsplus, and it seems that both styles are
in use. The most charming example I found was in the rb_next_postorder()
function of lib/rbtree.c, where one branch of a conditional uses your
style and the other uses mine.

If you can actually find the place in the syle guide where this is
required then I will make the change. Otherwise I would prefer to leave
it as it is.

> Thanks,
> Vyacheslav Dubeyko.
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-27 19:36             ` [PATCH] hfsplus: fix decomposition of Hangul characters Ernesto A. Fernández
  2017-11-27 22:40               ` Viacheslav Dubeyko
@ 2018-08-23 18:29               ` Ernesto A. Fernández
  2018-08-24  1:20                 ` tchou
  1 sibling, 1 reply; 21+ messages in thread
From: Ernesto A. Fernández @ 2018-08-23 18:29 UTC (permalink / raw)
  To: tchou; +Cc: slava, linux-fsdevel, htl10

Hi again:

On Mon, Nov 27, 2017 at 04:36:54PM -0300, Ernesto A. Fernández wrote:
> It would be good to have some feedback on the use of code under the
> Unicode License, I don't know if that can be a problem.
> 
> To Ting-Chang Hou: This patch is slightly different from the one you
> tested with your mac. You are still tagged as Tested-by, but you may
> want to check again that I have not broken anything.
> 
> Thanks,
> Ernest
> 
> -- >8 --
> Subject: [PATCH] hfsplus: fix decomposition of Hangul characters
> 
> Files created under macOS become essentially unusable under linux if
> their names contain any Hangul, and vice versa.
> 
> This happens because the normalization of Hangul characters is a special
> case: it can be done algorithmically, without need of a table. The
> hfsplus module deals with Hangul correctly when composing, but
> completely forgets about it when performing a decomposition.
> 
> Solve this by using the Hangul decomposition function provided in the
> Unicode Standard. It's under the Unicode License, compatible with the
> GPL.
> 
> This patch will cause trouble for Hangul filenames already created by
> the module in the past. This shouldn't be a special concern because the
> main purpose of the module was always sharing with macOS. If a user
> actually needs to access such a file the nodecompose mount option should
> be enough.
> 
> Reported-by: Ting-Chang Hou <tchou@synology.com>
> Tested-by: Ting-Chang Hou <tchou@synology.com>
> Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
>

It took a while but this patch has been merged now. Thanks again for your
report.

Ernest

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2018-08-23 18:29               ` Ernesto A. Fernández
@ 2018-08-24  1:20                 ` tchou
  0 siblings, 0 replies; 21+ messages in thread
From: tchou @ 2018-08-24  1:20 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: slava, linux-fsdevel, htl10, linux-fsdevel-owner

Hi,

Ernesto A. Fernández 於 2018-08-24 02:29 寫到:
> Hi again:
> 
> On Mon, Nov 27, 2017 at 04:36:54PM -0300, Ernesto A. Fernández wrote:
>> It would be good to have some feedback on the use of code under the
>> Unicode License, I don't know if that can be a problem.
>> 
>> To Ting-Chang Hou: This patch is slightly different from the one you
>> tested with your mac. You are still tagged as Tested-by, but you may
>> want to check again that I have not broken anything.
>> 
>> Thanks,
>> Ernest
>> 
>> -- >8 --
>> Subject: [PATCH] hfsplus: fix decomposition of Hangul characters
>> 
>> Files created under macOS become essentially unusable under linux if
>> their names contain any Hangul, and vice versa.
>> 
>> This happens because the normalization of Hangul characters is a 
>> special
>> case: it can be done algorithmically, without need of a table. The
>> hfsplus module deals with Hangul correctly when composing, but
>> completely forgets about it when performing a decomposition.
>> 
>> Solve this by using the Hangul decomposition function provided in the
>> Unicode Standard. It's under the Unicode License, compatible with the
>> GPL.
>> 
>> This patch will cause trouble for Hangul filenames already created by
>> the module in the past. This shouldn't be a special concern because 
>> the
>> main purpose of the module was always sharing with macOS. If a user
>> actually needs to access such a file the nodecompose mount option 
>> should
>> be enough.
>> 
>> Reported-by: Ting-Chang Hou <tchou@synology.com>
>> Tested-by: Ting-Chang Hou <tchou@synology.com>
>> Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
>> 
> 
> It took a while but this patch has been merged now. Thanks again for 
> your
> report.

And thanks for your patch, too.

> 
> Ernest

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
@ 2017-11-17 19:33 Slava Dubeyko
  0 siblings, 0 replies; 21+ messages in thread
From: Slava Dubeyko @ 2017-11-17 19:33 UTC (permalink / raw)
  To: tchou, Hin-Tak Leung; +Cc: linux-fsdevel

Hi Ting-Chang,

Sorry, I need to reapet my e-mail.

Could you please share the examples of incorrect and corect behavior? Are you sure that we need in such fix because it is tne mount option.

Hi Hin-Tak,

Could you please test the fix?

Thanks,
Vyacheslav Dubeyko.

>> From: Ting-Chang Hou <tchou@synology.com>
>> Date: Fri, Nov 17, 2017 at 12:20 AM
>> Subject: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
>> To: linux-fsdevel@vger.kernel.org
>> Cc: Ting-Chang Hou <tchou@synology.com>
>>
>> The unicode of hangul from macOS is decomposed. There has a bug that
>> mistake decomposed unicode for composed when change unicode to ascii,
>> so it cannot recognize the hangul correctly.
>>
>> Signed-off-by: Ting-Chang Hou <tchou@synology.com>
>> ---
 >> fs/hfsplus/unicode.c | 2 +-
 >> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
>> index dfa90c2..2daf7b0 100644
>> --- a/fs/hfsplus/unicode.c
>> +++ b/fs/hfsplus/unicode.c
>> @@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
>>        ustrlen = be16_to_cpu(ustr->length);
>>        len = *len_p;
>>        ce1 = NULL;
>> -       compose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
>> +       compose = test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
>>
>>        while (ustrlen > 0) {
>>                c0 = be16_to_cpu(*ip++);
>> --
>> 2.7.4

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2018-08-24  4:53 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-17  8:20 [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name Ting-Chang Hou
2017-11-19  0:57 ` Ernesto A. Fernández
2017-11-23  3:57   ` tchou
2017-11-23  4:21     ` Viacheslav Dubeyko
2017-11-23  6:05       ` tchou
2017-11-23  6:23         ` Viacheslav Dubeyko
2017-11-23  6:34           ` tchou
2017-11-23 11:32     ` Ernesto A. Fernández
2017-11-23 18:36       ` Viacheslav Dubeyko
2017-11-23 22:20         ` Ernesto A. Fernández
2017-11-24  7:25       ` tchou
2017-11-24 11:45         ` Ernesto A. Fernández
2017-11-27  2:07           ` tchou
2017-11-27 19:36             ` [PATCH] hfsplus: fix decomposition of Hangul characters Ernesto A. Fernández
2017-11-27 22:40               ` Viacheslav Dubeyko
2017-11-28 15:02                 ` Ernesto A. Fernández
2017-11-28 16:30                   ` Viacheslav Dubeyko
2017-11-28 18:15                     ` Ernesto A. Fernández
2018-08-23 18:29               ` Ernesto A. Fernández
2018-08-24  1:20                 ` tchou
2017-11-17 19:33 [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name Slava Dubeyko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).