linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
@ 2017-11-17  8:20 Ting-Chang Hou
  2017-11-19  0:57 ` Ernesto A. Fernández
  0 siblings, 1 reply; 32+ messages in thread
From: Ting-Chang Hou @ 2017-11-17  8:20 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Ting-Chang Hou

The unicode of hangul from macOS is decomposed. There has a bug that
mistake decomposed unicode for composed when change unicode to ascii,
so it cannot recognize the hangul correctly.

Signed-off-by: Ting-Chang Hou <tchou@synology.com>
---
 fs/hfsplus/unicode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
index dfa90c2..2daf7b0 100644
--- a/fs/hfsplus/unicode.c
+++ b/fs/hfsplus/unicode.c
@@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
 	ustrlen = be16_to_cpu(ustr->length);
 	len = *len_p;
 	ce1 = NULL;
-	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
+	compose = test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
 
 	while (ustrlen > 0) {
 		c0 = be16_to_cpu(*ip++);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-17  8:20 [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name Ting-Chang Hou
@ 2017-11-19  0:57 ` Ernesto A. Fernández
  2017-11-23  3:57   ` tchou
  0 siblings, 1 reply; 32+ messages in thread
From: Ernesto A. Fernández @ 2017-11-19  0:57 UTC (permalink / raw)
  To: Ting-Chang Hou; +Cc: linux-fsdevel, Ernesto A. Fernández

On Fri, Nov 17, 2017 at 04:20:05PM +0800, Ting-Chang Hou wrote:
> The unicode of hangul from macOS is decomposed. There has a bug that
> mistake decomposed unicode for composed when change unicode to ascii,
> so it cannot recognize the hangul correctly.
> 
> Signed-off-by: Ting-Chang Hou <tchou@synology.com>
> ---
>  fs/hfsplus/unicode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> index dfa90c2..2daf7b0 100644
> --- a/fs/hfsplus/unicode.c
> +++ b/fs/hfsplus/unicode.c
> @@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
>  	ustrlen = be16_to_cpu(ustr->length);
>  	len = *len_p;
>  	ce1 = NULL;
> -	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> +	compose = test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);

I'm not sure this is a mistake. The developers probably wanted the
filenames to be recomposed before being presented in utf8. With your patch,
if you try the following (with the default mount options):

  touch Á
  ls | hexdump -C

the utf8 output filename will be using the combining accent (CC 81) instead
of the Á character (C3 81). This is a bit annoying because it won't print
correctly in my terminal anymore.

What is it exactly that you are trying to fix? You mention an issue with
hangul characters, but I failed to trigger it. Could you expand on that?

>  	while (ustrlen > 0) {
>  		c0 = be16_to_cpu(*ip++);
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-19  0:57 ` Ernesto A. Fernández
@ 2017-11-23  3:57   ` tchou
  2017-11-23  4:21     ` Viacheslav Dubeyko
  2017-11-23 11:32     ` Ernesto A. Fernández
  0 siblings, 2 replies; 32+ messages in thread
From: tchou @ 2017-11-23  3:57 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: linux-fsdevel, linux-fsdevel-owner, slava, htl10

> Could you please share the examples of incorrect and correct
> behaviour?

> What is it exactly that you are trying to fix? You mention an issue 
> with
> hangul characters, but I failed to trigger it. Could you expand on 
> that?
> 

Hi all,
There is an example.I use Mac mini to format my usb disk to HFS+ and 
touch
the file with filename "공유"(a Korean actor, https://goo.gl/VcBsrn) on 
it.
After it, I mount the usb disk on my ubuntu(Linux 4.14.0+) and get the
following error message when trying to ls and cp the file:

ls: cannot access 공유: No such file or directory
cp: cannot stat ‘공유’: No such file or directory

It seem's a problem for a long time(https://goo.gl/LiWGe5).
After applying my patch, I can ls and cp the file correctly.

> On Fri, Nov 17, 2017 at 04:20:05PM +0800, Ting-Chang Hou wrote:
>> The unicode of hangul from macOS is decomposed. There has a bug that
>> mistake decomposed unicode for composed when change unicode to ascii,
>> so it cannot recognize the hangul correctly.
>> 
>> Signed-off-by: Ting-Chang Hou <tchou@synology.com>
>> ---
>>  fs/hfsplus/unicode.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
>> index dfa90c2..2daf7b0 100644
>> --- a/fs/hfsplus/unicode.c
>> +++ b/fs/hfsplus/unicode.c
>> @@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
>>  	ustrlen = be16_to_cpu(ustr->length);
>>  	len = *len_p;
>>  	ce1 = NULL;
>> -	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
>> +	compose = test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> 
> I'm not sure this is a mistake. The developers probably wanted the
> filenames to be recomposed before being presented in utf8. With your 
> patch,
> if you try the following (with the default mount options):
> 
>   touch Á
>   ls | hexdump -C
> 
> the utf8 output filename will be using the combining accent (CC 81) 
> instead
> of the Á character (C3 81). This is a bit annoying because it won't 
> print
> correctly in my terminal anymore.

I'm not exatly know why combining accent cannot print correctly in 
terminal
and how to avoid it. Whether apply my patch or not, my terminal cannot
print the hangul charactor correctly.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23  3:57   ` tchou
@ 2017-11-23  4:21     ` Viacheslav Dubeyko
  2017-11-23  6:05       ` tchou
  2017-11-23 11:32     ` Ernesto A. Fernández
  1 sibling, 1 reply; 32+ messages in thread
From: Viacheslav Dubeyko @ 2017-11-23  4:21 UTC (permalink / raw)
  To: tchou, Ernesto A. Fernández
  Cc: linux-fsdevel, linux-fsdevel-owner, htl10

On Thu, 2017-11-23 at 11:57 +0800, tchou wrote:
> > 
> > Could you please share the examples of incorrect and correct
> > behaviour?
> > 
> > What is it exactly that you are trying to fix? You mention an
> > issue 
> > with
> > hangul characters, but I failed to trigger it. Could you expand on 
> > that?
> > 
> Hi all,
> There is an example.I use Mac mini to format my usb disk to HFS+ and 
> touch
> the file with filename "공유"(a Korean actor, https://goo.gl/VcBsrn)
> on 
> it.
> After it, I mount the usb disk on my ubuntu(Linux 4.14.0+) and get
> the
> following error message when trying to ls and cp the file:
> 


Could you share the all mount options that you used? It looks that you
simply don't use the proper mount options set under the Linux. Because
MacOS X will mount properly always.

Thanks,
Vyacheslav Dubeyko.



> ls: cannot access 공유: No such file or directory
> cp: cannot stat ‘공유’: No such file or directory
> 
> It seem's a problem for a long time(https://goo.gl/LiWGe5).
> After applying my patch, I can ls and cp the file correctly.
> 
> > 
> > On Fri, Nov 17, 2017 at 04:20:05PM +0800, Ting-Chang Hou wrote:
> > > 
> > > The unicode of hangul from macOS is decomposed. There has a bug
> > > that
> > > mistake decomposed unicode for composed when change unicode to
> > > ascii,
> > > so it cannot recognize the hangul correctly.
> > > 
> > > Signed-off-by: Ting-Chang Hou <tchou@synology.com>
> > > ---
> > >  fs/hfsplus/unicode.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> > > index dfa90c2..2daf7b0 100644
> > > --- a/fs/hfsplus/unicode.c
> > > +++ b/fs/hfsplus/unicode.c
> > > @@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
> > >  	ustrlen = be16_to_cpu(ustr->length);
> > >  	len = *len_p;
> > >  	ce1 = NULL;
> > > -	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE,
> > > &HFSPLUS_SB(sb)->flags);
> > > +	compose = test_bit(HFSPLUS_SB_NODECOMPOSE,
> > > &HFSPLUS_SB(sb)->flags);
> > I'm not sure this is a mistake. The developers probably wanted the
> > filenames to be recomposed before being presented in utf8. With
> > your 
> > patch,
> > if you try the following (with the default mount options):
> > 
> >   touch Á
> >   ls | hexdump -C
> > 
> > the utf8 output filename will be using the combining accent (CC
> > 81) 
> > instead
> > of the Á character (C3 81). This is a bit annoying because it
> > won't 
> > print
> > correctly in my terminal anymore.
> I'm not exatly know why combining accent cannot print correctly in 
> terminal
> and how to avoid it. Whether apply my patch or not, my terminal
> cannot
> print the hangul charactor correctly.
> 
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23  4:21     ` Viacheslav Dubeyko
@ 2017-11-23  6:05       ` tchou
  2017-11-23  6:23         ` Viacheslav Dubeyko
  0 siblings, 1 reply; 32+ messages in thread
From: tchou @ 2017-11-23  6:05 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: "Ernesto A." Fernández, linux-fsdevel,
	linux-fsdevel-owner, htl10

Viacheslav Dubeyko 於 2017-11-23 12:21 寫到:
> On Thu, 2017-11-23 at 11:57 +0800, tchou wrote:
>> >
>> > Could you please share the examples of incorrect and correct
>> > behaviour?
>> >
>> > What is it exactly that you are trying to fix? You mention an
>> > issue 
>> > with
>> > hangul characters, but I failed to trigger it. Could you expand on 
>> > that?
>> >
>> Hi all,
>> There is an example.I use Mac mini to format my usb disk to HFS+ and 
>> touch
>> the file with filename "공유"(a Korean actor, https://goo.gl/VcBsrn)
>> on 
>> it.
>> After it, I mount the usb disk on my ubuntu(Linux 4.14.0+) and get
>> the
>> following error message when trying to ls and cp the file:
>> 
> 
> 
> Could you share the all mount options that you used? It looks that you
> simply don't use the proper mount options set under the Linux. Because
> MacOS X will mount properly always.
> 
> Thanks,
> Vyacheslav Dubeyko.
> 
Hi Vyacheslav,

I use the default mount option:

mount /dev/sdc1 /mnt/hfs
/dev/sdc1 on /mnt/hfs type hfsplus 
(rw,relatime,umask=22,uid=0,gid=0,nls=utf8)

Thanks,
TCHou
> 
> 
>> ls: cannot access 공유: No such file or directory
>> cp: cannot stat ‘공유’: No such file or directory
>> 
>> It seem's a problem for a long time(https://goo.gl/LiWGe5).
>> After applying my patch, I can ls and cp the file correctly.
>> 
>> >
>> > On Fri, Nov 17, 2017 at 04:20:05PM +0800, Ting-Chang Hou wrote:
>> > >
>> > > The unicode of hangul from macOS is decomposed. There has a bug
>> > > that
>> > > mistake decomposed unicode for composed when change unicode to
>> > > ascii,
>> > > so it cannot recognize the hangul correctly.
>> > >
>> > > Signed-off-by: Ting-Chang Hou <tchou@synology.com>
>> > > ---
>> > >  fs/hfsplus/unicode.c | 2 +-
>> > >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > >
>> > > diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
>> > > index dfa90c2..2daf7b0 100644
>> > > --- a/fs/hfsplus/unicode.c
>> > > +++ b/fs/hfsplus/unicode.c
>> > > @@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
>> > >  	ustrlen = be16_to_cpu(ustr->length);
>> > >  	len = *len_p;
>> > >  	ce1 = NULL;
>> > > -	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE,
>> > > &HFSPLUS_SB(sb)->flags);
>> > > +	compose = test_bit(HFSPLUS_SB_NODECOMPOSE,
>> > > &HFSPLUS_SB(sb)->flags);
>> > I'm not sure this is a mistake. The developers probably wanted the
>> > filenames to be recomposed before being presented in utf8. With
>> > your 
>> > patch,
>> > if you try the following (with the default mount options):
>> >
>> >   touch Á
>> >   ls | hexdump -C
>> >
>> > the utf8 output filename will be using the combining accent (CC
>> > 81) 
>> > instead
>> > of the Á character (C3 81). This is a bit annoying because it
>> > won't 
>> > print
>> > correctly in my terminal anymore.
>> I'm not exatly know why combining accent cannot print correctly in 
>> terminal
>> and how to avoid it. Whether apply my patch or not, my terminal
>> cannot
>> print the hangul charactor correctly.
>> 
>> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23  6:05       ` tchou
@ 2017-11-23  6:23         ` Viacheslav Dubeyko
  2017-11-23  6:34           ` tchou
  0 siblings, 1 reply; 32+ messages in thread
From: Viacheslav Dubeyko @ 2017-11-23  6:23 UTC (permalink / raw)
  To: tchou
  Cc: "Ernesto A." Fernández, linux-fsdevel,
	linux-fsdevel-owner, htl10

On Thu, 2017-11-23 at 14:05 +0800, tchou wrote:
> Viacheslav Dubeyko 於 2017-11-23 12:21 寫到:
> > 
> > On Thu, 2017-11-23 at 11:57 +0800, tchou wrote:
> > > 
> > > > 
> > > > 
> > > > Could you please share the examples of incorrect and correct
> > > > behaviour?
> > > > 
> > > > What is it exactly that you are trying to fix? You mention an
> > > > issue 
> > > > with
> > > > hangul characters, but I failed to trigger it. Could you expand
> > > > on 
> > > > that?
> > > > 
> > > Hi all,
> > > There is an example.I use Mac mini to format my usb disk to HFS+
> > > and 
> > > touch
> > > the file with filename "공유"(a Korean actor, https://goo.gl/VcBsrn
> > > )
> > > on 
> > > it.
> > > After it, I mount the usb disk on my ubuntu(Linux 4.14.0+) and
> > > get
> > > the
> > > following error message when trying to ls and cp the file:
> > > 
> > 
> > Could you share the all mount options that you used? It looks that
> > you
> > simply don't use the proper mount options set under the Linux.
> > Because
> > MacOS X will mount properly always.
> > 
> > Thanks,
> > Vyacheslav Dubeyko.
> > 
> Hi Vyacheslav,
> 
> I use the default mount option:
> 
> mount /dev/sdc1 /mnt/hfs
> /dev/sdc1 on /mnt/hfs type hfsplus 
> (rw,relatime,umask=22,uid=0,gid=0,nls=utf8)
> 


Please, take a look into mount options description [1]. Especially,
take a look into decompose/nodecompose options. You simply needs to use
the proper mount option for your case.

Thanks,
Vyacheslav Dubeyko.


[1] https://www.kernel.org/doc/Documentation/filesystems/hfsplus.txt

> Thanks,
> TCHou
> > 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23  6:23         ` Viacheslav Dubeyko
@ 2017-11-23  6:34           ` tchou
  0 siblings, 0 replies; 32+ messages in thread
From: tchou @ 2017-11-23  6:34 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: "\"Ernesto A.\"" Fernández, linux-fsdevel,
	linux-fsdevel-owner, htl10

Viacheslav Dubeyko 於 2017-11-23 14:23 寫到:
> On Thu, 2017-11-23 at 14:05 +0800, tchou wrote:
>> Viacheslav Dubeyko 於 2017-11-23 12:21 寫到:
>> >
>> > On Thu, 2017-11-23 at 11:57 +0800, tchou wrote:
>> > >
>> > > >
>> > > >
>> > > > Could you please share the examples of incorrect and correct
>> > > > behaviour?
>> > > >
>> > > > What is it exactly that you are trying to fix? You mention an
>> > > > issue 
>> > > > with
>> > > > hangul characters, but I failed to trigger it. Could you expand
>> > > > on 
>> > > > that?
>> > > >
>> > > Hi all,
>> > > There is an example.I use Mac mini to format my usb disk to HFS+
>> > > and 
>> > > touch
>> > > the file with filename "공유"(a Korean actor, https://goo.gl/VcBsrn
>> > > )
>> > > on 
>> > > it.
>> > > After it, I mount the usb disk on my ubuntu(Linux 4.14.0+) and
>> > > get
>> > > the
>> > > following error message when trying to ls and cp the file:
>> > >
>> >
>> > Could you share the all mount options that you used? It looks that
>> > you
>> > simply don't use the proper mount options set under the Linux.
>> > Because
>> > MacOS X will mount properly always.
>> >
>> > Thanks,
>> > Vyacheslav Dubeyko.
>> >
>> Hi Vyacheslav,
>> 
>> I use the default mount option:
>> 
>> mount /dev/sdc1 /mnt/hfs
>> /dev/sdc1 on /mnt/hfs type hfsplus 
>> (rw,relatime,umask=22,uid=0,gid=0,nls=utf8)
>> 
> 
> 
> Please, take a look into mount options description [1]. Especially,
> take a look into decompose/nodecompose options. You simply needs to use
> the proper mount option for your case.
> 
I got it. It's my mistake.
Thank you all for making me right.

Thanks,
TCHou

> Thanks,
> Vyacheslav Dubeyko.
> 
> 
> [1] https://www.kernel.org/doc/Documentation/filesystems/hfsplus.txt
> 
>> Thanks,
>> TCHou
>> >

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23  3:57   ` tchou
  2017-11-23  4:21     ` Viacheslav Dubeyko
@ 2017-11-23 11:32     ` Ernesto A. Fernández
  2017-11-23 18:36       ` Viacheslav Dubeyko
  2017-11-24  7:25       ` tchou
  1 sibling, 2 replies; 32+ messages in thread
From: Ernesto A. Fernández @ 2017-11-23 11:32 UTC (permalink / raw)
  To: tchou
  Cc: linux-fsdevel, linux-fsdevel-owner, slava, htl10,
	Ernesto A. Fernández

Hi:

your issue seems to be in the decomposition of hangul characters, not in
the recomposition before printing. The hfsplus module on linux is saving
the name of your actor as AC F5 C7 20, without performing any
decomposition at all.

The reason your patch hides the bug is because it causes linux to present
filenames as decomposed utf8, so it is not necessary to decompose again
before working with them. But the issue is still there, and you will most
likely run into trouble if you make a hangul filename in linux and try
to work with it in MacOS.

Reviewing the code it would seem that the developers completely forgot
the hangul characters had their own rules for decomposition. It's weird
because they did the composition part correctly.

I've made a quick draft of a patch, mostly by copying the code provided
in the unicode web. I don't think we can actually use it on a release,
but it should be enough to check if I'm right. It works fine on linux,
but I don't have a mac, so it would be great if you could test it for me.

Thanks,
Ernest

(By the way, there is no reason you should have to use the nodecompose
mount option, as the other reviewer suggested. Using that option will
have a similar effect to that of your patch. It will hide the problem,
but if you create a hangul filename on linux with that option you
probably won't be able to use it on a mac.)

---
diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
index dfa90c2..9006c61 100644
--- a/fs/hfsplus/unicode.c
+++ b/fs/hfsplus/unicode.c
@@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block *sb, const char *astr, int len,
 	return size;
 }
 
-/* Decomposes a single unicode character. */
+/* Decomposes a single non-Hangul unicode character. */
 static inline u16 *decompose_unichar(wchar_t uc, int *size)
 {
 	int off;
@@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int *size)
 	return hfsplus_decompose_table + (off / 4);
 }
 
+/* Decomposes a Hangul unicode character. */
+int decompose_hangul(wchar_t uc, u16 *result)
+{
+	int index;
+	int l, v, t;
+
+	index = uc - Hangul_SBase;
+	if (index < 0 || index >= Hangul_SCount)
+		return 0;
+
+	l = Hangul_LBase + index / Hangul_NCount;
+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
+	t = Hangul_TBase + index % Hangul_TCount;
+
+	result[0] = l;
+	result[1] = v;
+	if (t != Hangul_TBase) {
+		result[2] = t;
+		return 3;
+	}
+	return 2;
+}
+
 int hfsplus_asc2uni(struct super_block *sb,
 		    struct hfsplus_unistr *ustr, int max_unistr_len,
 		    const char *astr, int len)
@@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
 	int size, dsize, decompose;
 	u16 *dstr, outlen = 0;
 	wchar_t c;
+	u16 hangul_buf[3];
 
 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
 	while (outlen < max_unistr_len && len > 0) {
 		size = asc2unichar(sb, astr, len, &c);
 
-		if (decompose)
-			dstr = decompose_unichar(c, &dsize);
-		else
+		if (decompose) {
+			/* Hangul is handled separately */
+			dstr = &hangul_buf[0];
+			dsize = decompose_hangul(c, dstr);
+			if (dsize == 0)
+				/* not Hangul */
+				dstr = decompose_unichar(c, &dsize);
+		} else {
 			dstr = NULL;
+		}
+
 		if (dstr) {
 			if (outlen + dsize > max_unistr_len)
 				break;

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23 11:32     ` Ernesto A. Fernández
@ 2017-11-23 18:36       ` Viacheslav Dubeyko
  2017-11-23 22:20         ` Ernesto A. Fernández
  2017-11-24  7:25       ` tchou
  1 sibling, 1 reply; 32+ messages in thread
From: Viacheslav Dubeyko @ 2017-11-23 18:36 UTC (permalink / raw)
  To: Ernesto A. Fernández, tchou
  Cc: linux-fsdevel, linux-fsdevel-owner, htl10

On Thu, 2017-11-23 at 08:32 -0300, Ernesto A. Fernández wrote:
> Hi:
> 
> your issue seems to be in the decomposition of hangul characters, not
> in
> the recomposition before printing. The hfsplus module on linux is
> saving
> the name of your actor as AC F5 C7 20, without performing any
> decomposition at all.
> 
> The reason your patch hides the bug is because it causes linux to
> present
> filenames as decomposed utf8, so it is not necessary to decompose
> again
> before working with them. But the issue is still there, and you will
> most
> likely run into trouble if you make a hangul filename in linux and
> try
> to work with it in MacOS.
> 
> Reviewing the code it would seem that the developers completely
> forgot
> the hangul characters had their own rules for decomposition. It's
> weird
> because they did the composition part correctly.
> 
> I've made a quick draft of a patch, mostly by copying the code
> provided
> in the unicode web. I don't think we can actually use it on a 


Could you please share the link for "the unicode web"?

Thanks,
Vyacheslav Dubeyko.


> release,
> but it should be enough to check if I'm right. It works fine on
> linux,
> but I don't have a mac, so it would be great if you could test it for
> me.
> 
> Thanks,
> Ernest
> 
> (By the way, there is no reason you should have to use the
> nodecompose
> mount option, as the other reviewer suggested. Using that option will
> have a similar effect to that of your patch. It will hide the
> problem,
> but if you create a hangul filename on linux with that option you
> probably won't be able to use it on a mac.)
> 
> ---
> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> index dfa90c2..9006c61 100644
> --- a/fs/hfsplus/unicode.c
> +++ b/fs/hfsplus/unicode.c
> @@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
> *sb, const char *astr, int len,
>  	return size;
>  }
>  
> -/* Decomposes a single unicode character. */
> +/* Decomposes a single non-Hangul unicode character. */
>  static inline u16 *decompose_unichar(wchar_t uc, int *size)
>  {
>  	int off;
> @@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc,
> int *size)
>  	return hfsplus_decompose_table + (off / 4);
>  }
>  
> +/* Decomposes a Hangul unicode character. */
> +int decompose_hangul(wchar_t uc, u16 *result)
> +{
> +	int index;
> +	int l, v, t;
> +
> +	index = uc - Hangul_SBase;
> +	if (index < 0 || index >= Hangul_SCount)
> +		return 0;
> +
> +	l = Hangul_LBase + index / Hangul_NCount;
> +	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> +	t = Hangul_TBase + index % Hangul_TCount;
> +
> +	result[0] = l;
> +	result[1] = v;
> +	if (t != Hangul_TBase) {
> +		result[2] = t;
> +		return 3;
> +	}
> +	return 2;
> +}
> +
>  int hfsplus_asc2uni(struct super_block *sb,
>  		    struct hfsplus_unistr *ustr, int max_unistr_len,
>  		    const char *astr, int len)
> @@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
>  	int size, dsize, decompose;
>  	u16 *dstr, outlen = 0;
>  	wchar_t c;
> +	u16 hangul_buf[3];
>  
>  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE,
> &HFSPLUS_SB(sb)->flags);
>  	while (outlen < max_unistr_len && len > 0) {
>  		size = asc2unichar(sb, astr, len, &c);
>  
> -		if (decompose)
> -			dstr = decompose_unichar(c, &dsize);
> -		else
> +		if (decompose) {
> +			/* Hangul is handled separately */
> +			dstr = &hangul_buf[0];
> +			dsize = decompose_hangul(c, dstr);
> +			if (dsize == 0)
> +				/* not Hangul */
> +				dstr = decompose_unichar(c, &dsize);
> +		} else {
>  			dstr = NULL;
> +		}
> +
>  		if (dstr) {
>  			if (outlen + dsize > max_unistr_len)
>  				break;

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23 18:36       ` Viacheslav Dubeyko
@ 2017-11-23 22:20         ` Ernesto A. Fernández
  0 siblings, 0 replies; 32+ messages in thread
From: Ernesto A. Fernández @ 2017-11-23 22:20 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: Ernesto A. Fernández, tchou, linux-fsdevel,
	linux-fsdevel-owner, htl10

On Thu, Nov 23, 2017 at 10:36:37AM -0800, Viacheslav Dubeyko wrote:
> On Thu, 2017-11-23 at 08:32 -0300, Ernesto A. Fernández wrote:
> > Hi:
> > 
> > your issue seems to be in the decomposition of hangul characters, not
> > in
> > the recomposition before printing. The hfsplus module on linux is
> > saving
> > the name of your actor as AC F5 C7 20, without performing any
> > decomposition at all.
> > 
> > The reason your patch hides the bug is because it causes linux to
> > present
> > filenames as decomposed utf8, so it is not necessary to decompose
> > again
> > before working with them. But the issue is still there, and you will
> > most
> > likely run into trouble if you make a hangul filename in linux and
> > try
> > to work with it in MacOS.
> > 
> > Reviewing the code it would seem that the developers completely
> > forgot
> > the hangul characters had their own rules for decomposition. It's
> > weird
> > because they did the composition part correctly.
> > 
> > I've made a quick draft of a patch, mostly by copying the code
> > provided
> > in the unicode web. I don't think we can actually use it on a 
> 
> 
> Could you please share the link for "the unicode web"?
> 
> Thanks,
> Vyacheslav Dubeyko.

I'm not asking for any reviews yet, just testing because I don't have
a Mac. As long as that's clear, this is the latest version of Unicode:

www.unicode.org/versions/Unicode10.0.0/

You want section 3.12.

> 
> 
> > release,
> > but it should be enough to check if I'm right. It works fine on
> > linux,
> > but I don't have a mac, so it would be great if you could test it for
> > me.
> > 
> > Thanks,
> > Ernest
> > 
> > (By the way, there is no reason you should have to use the
> > nodecompose
> > mount option, as the other reviewer suggested. Using that option will
> > have a similar effect to that of your patch. It will hide the
> > problem,
> > but if you create a hangul filename on linux with that option you
> > probably won't be able to use it on a mac.)
> > 
> > ---
> > diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> > index dfa90c2..9006c61 100644
> > --- a/fs/hfsplus/unicode.c
> > +++ b/fs/hfsplus/unicode.c
> > @@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
> > *sb, const char *astr, int len,
> >  	return size;
> >  }
> >  
> > -/* Decomposes a single unicode character. */
> > +/* Decomposes a single non-Hangul unicode character. */
> >  static inline u16 *decompose_unichar(wchar_t uc, int *size)
> >  {
> >  	int off;
> > @@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc,
> > int *size)
> >  	return hfsplus_decompose_table + (off / 4);
> >  }
> >  
> > +/* Decomposes a Hangul unicode character. */
> > +int decompose_hangul(wchar_t uc, u16 *result)
> > +{
> > +	int index;
> > +	int l, v, t;
> > +
> > +	index = uc - Hangul_SBase;
> > +	if (index < 0 || index >= Hangul_SCount)
> > +		return 0;
> > +
> > +	l = Hangul_LBase + index / Hangul_NCount;
> > +	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> > +	t = Hangul_TBase + index % Hangul_TCount;
> > +
> > +	result[0] = l;
> > +	result[1] = v;
> > +	if (t != Hangul_TBase) {
> > +		result[2] = t;
> > +		return 3;
> > +	}
> > +	return 2;
> > +}
> > +
> >  int hfsplus_asc2uni(struct super_block *sb,
> >  		    struct hfsplus_unistr *ustr, int max_unistr_len,
> >  		    const char *astr, int len)
> > @@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
> >  	int size, dsize, decompose;
> >  	u16 *dstr, outlen = 0;
> >  	wchar_t c;
> > +	u16 hangul_buf[3];
> >  
> >  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE,
> > &HFSPLUS_SB(sb)->flags);
> >  	while (outlen < max_unistr_len && len > 0) {
> >  		size = asc2unichar(sb, astr, len, &c);
> >  
> > -		if (decompose)
> > -			dstr = decompose_unichar(c, &dsize);
> > -		else
> > +		if (decompose) {
> > +			/* Hangul is handled separately */
> > +			dstr = &hangul_buf[0];
> > +			dsize = decompose_hangul(c, dstr);
> > +			if (dsize == 0)
> > +				/* not Hangul */
> > +				dstr = decompose_unichar(c, &dsize);
> > +		} else {
> >  			dstr = NULL;
> > +		}
> > +
> >  		if (dstr) {
> >  			if (outlen + dsize > max_unistr_len)
> >  				break;

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23 11:32     ` Ernesto A. Fernández
  2017-11-23 18:36       ` Viacheslav Dubeyko
@ 2017-11-24  7:25       ` tchou
  2017-11-24 11:45         ` Ernesto A. Fernández
  1 sibling, 1 reply; 32+ messages in thread
From: tchou @ 2017-11-24  7:25 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: linux-fsdevel, linux-fsdevel-owner, slava, htl10

Ernesto A. Fernández 於 2017-11-23 19:32 寫到:
> Hi:
> 
> your issue seems to be in the decomposition of hangul characters, not 
> in
> the recomposition before printing. The hfsplus module on linux is 
> saving
> the name of your actor as AC F5 C7 20, without performing any
> decomposition at all.
> 
> The reason your patch hides the bug is because it causes linux to 
> present
> filenames as decomposed utf8, so it is not necessary to decompose again
> before working with them. But the issue is still there, and you will 
> most
> likely run into trouble if you make a hangul filename in linux and try
> to work with it in MacOS.
> 
> Reviewing the code it would seem that the developers completely forgot
> the hangul characters had their own rules for decomposition. It's weird
> because they did the composition part correctly.
> 
> I've made a quick draft of a patch, mostly by copying the code provided
> in the unicode web. I don't think we can actually use it on a release,
> but it should be enough to check if I'm right. It works fine on linux,
> but I don't have a mac, so it would be great if you could test it for 
> me.
> 
> Thanks,
> Ernest
> 
> (By the way, there is no reason you should have to use the nodecompose
> mount option, as the other reviewer suggested. Using that option will
> have a similar effect to that of your patch. It will hide the problem,
> but if you create a hangul filename on linux with that option you
> probably won't be able to use it on a mac.)
> 
> ---
> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> index dfa90c2..9006c61 100644
> --- a/fs/hfsplus/unicode.c
> +++ b/fs/hfsplus/unicode.c
> @@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
> *sb, const char *astr, int len,
>  	return size;
>  }
> 
> -/* Decomposes a single unicode character. */
> +/* Decomposes a single non-Hangul unicode character. */
>  static inline u16 *decompose_unichar(wchar_t uc, int *size)
>  {
>  	int off;
> @@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, 
> int *size)
>  	return hfsplus_decompose_table + (off / 4);
>  }
> 
> +/* Decomposes a Hangul unicode character. */
> +int decompose_hangul(wchar_t uc, u16 *result)
> +{
> +	int index;
> +	int l, v, t;
> +
> +	index = uc - Hangul_SBase;
> +	if (index < 0 || index >= Hangul_SCount)
> +		return 0;
> +
> +	l = Hangul_LBase + index / Hangul_NCount;
> +	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> +	t = Hangul_TBase + index % Hangul_TCount;
> +
> +	result[0] = l;
> +	result[1] = v;
> +	if (t != Hangul_TBase) {
> +		result[2] = t;
> +		return 3;
> +	}
> +	return 2;
> +}
> +
>  int hfsplus_asc2uni(struct super_block *sb,
>  		    struct hfsplus_unistr *ustr, int max_unistr_len,
>  		    const char *astr, int len)
> @@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
>  	int size, dsize, decompose;
>  	u16 *dstr, outlen = 0;
>  	wchar_t c;
> +	u16 hangul_buf[3];
> 
>  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, 
> &HFSPLUS_SB(sb)->flags);
>  	while (outlen < max_unistr_len && len > 0) {
>  		size = asc2unichar(sb, astr, len, &c);
> 
> -		if (decompose)
> -			dstr = decompose_unichar(c, &dsize);
> -		else
> +		if (decompose) {
> +			/* Hangul is handled separately */
> +			dstr = &hangul_buf[0];
> +			dsize = decompose_hangul(c, dstr);
> +			if (dsize == 0)
> +				/* not Hangul */
> +				dstr = decompose_unichar(c, &dsize);
> +		} else {
>  			dstr = NULL;
> +		}
> +
>  		if (dstr) {
>  			if (outlen + dsize > max_unistr_len)
>  				break;

Hi,

Thank you for your explanation and your draft.
I test four versions of hfsplus module:
--------------------------------
1. origin linux
2. apply my patch
3. mount with nodecompose option
4. apply your patch
--------------------------------

There are the results of the test:
---------------------------------------------------------------------
Linux can ls and cp the file from mac correctly with module 2,3,4.
Mac cannot cp the files correctly which touched by module 1,2,3.
Mac can cp the file correctly which touched by module with your patch.
Mac can ls the files which touch by all modules.
---------------------------------------------------------------------

It seems that everything goes well with your patch. The result is just
like you say, my patch or mount with nodecompose option cannot perform
normally.


By the way, are the decompose procedures of function 
hfsplus_hash_dentry()
and hfsplus_compare_dentry() need to modify as well?

Thanks,
TCHou

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-24  7:25       ` tchou
@ 2017-11-24 11:45         ` Ernesto A. Fernández
  2017-11-27  2:07           ` tchou
  0 siblings, 1 reply; 32+ messages in thread
From: Ernesto A. Fernández @ 2017-11-24 11:45 UTC (permalink / raw)
  To: tchou
  Cc: linux-fsdevel, linux-fsdevel-owner, slava, htl10,
	Ernesto A. Fernández

On Fri, Nov 24, 2017 at 03:25:40PM +0800, tchou wrote:
> Ernesto A. Fernández 於 2017-11-23 19:32 寫到:
> >Hi:
> >
> >your issue seems to be in the decomposition of hangul characters, not in
> >the recomposition before printing. The hfsplus module on linux is saving
> >the name of your actor as AC F5 C7 20, without performing any
> >decomposition at all.
> >
> >The reason your patch hides the bug is because it causes linux to present
> >filenames as decomposed utf8, so it is not necessary to decompose again
> >before working with them. But the issue is still there, and you will most
> >likely run into trouble if you make a hangul filename in linux and try
> >to work with it in MacOS.
> >
> >Reviewing the code it would seem that the developers completely forgot
> >the hangul characters had their own rules for decomposition. It's weird
> >because they did the composition part correctly.
> >
> >I've made a quick draft of a patch, mostly by copying the code provided
> >in the unicode web. I don't think we can actually use it on a release,
> >but it should be enough to check if I'm right. It works fine on linux,
> >but I don't have a mac, so it would be great if you could test it for me.
> >
> >Thanks,
> >Ernest
> >
> >(By the way, there is no reason you should have to use the nodecompose
> >mount option, as the other reviewer suggested. Using that option will
> >have a similar effect to that of your patch. It will hide the problem,
> >but if you create a hangul filename on linux with that option you
> >probably won't be able to use it on a mac.)
> >
> >---
> >diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> >index dfa90c2..9006c61 100644
> >--- a/fs/hfsplus/unicode.c
> >+++ b/fs/hfsplus/unicode.c
> >@@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
> >*sb, const char *astr, int len,
> > 	return size;
> > }
> >
> >-/* Decomposes a single unicode character. */
> >+/* Decomposes a single non-Hangul unicode character. */
> > static inline u16 *decompose_unichar(wchar_t uc, int *size)
> > {
> > 	int off;
> >@@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int
> >*size)
> > 	return hfsplus_decompose_table + (off / 4);
> > }
> >
> >+/* Decomposes a Hangul unicode character. */
> >+int decompose_hangul(wchar_t uc, u16 *result)
> >+{
> >+	int index;
> >+	int l, v, t;
> >+
> >+	index = uc - Hangul_SBase;
> >+	if (index < 0 || index >= Hangul_SCount)
> >+		return 0;
> >+
> >+	l = Hangul_LBase + index / Hangul_NCount;
> >+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> >+	t = Hangul_TBase + index % Hangul_TCount;
> >+
> >+	result[0] = l;
> >+	result[1] = v;
> >+	if (t != Hangul_TBase) {
> >+		result[2] = t;
> >+		return 3;
> >+	}
> >+	return 2;
> >+}
> >+
> > int hfsplus_asc2uni(struct super_block *sb,
> > 		    struct hfsplus_unistr *ustr, int max_unistr_len,
> > 		    const char *astr, int len)
> >@@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
> > 	int size, dsize, decompose;
> > 	u16 *dstr, outlen = 0;
> > 	wchar_t c;
> >+	u16 hangul_buf[3];
> >
> > 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> > 	while (outlen < max_unistr_len && len > 0) {
> > 		size = asc2unichar(sb, astr, len, &c);
> >
> >-		if (decompose)
> >-			dstr = decompose_unichar(c, &dsize);
> >-		else
> >+		if (decompose) {
> >+			/* Hangul is handled separately */
> >+			dstr = &hangul_buf[0];
> >+			dsize = decompose_hangul(c, dstr);
> >+			if (dsize == 0)
> >+				/* not Hangul */
> >+				dstr = decompose_unichar(c, &dsize);
> >+		} else {
> > 			dstr = NULL;
> >+		}
> >+
> > 		if (dstr) {
> > 			if (outlen + dsize > max_unistr_len)
> > 				break;
> 
> Hi,
> 
> Thank you for your explanation and your draft.
> I test four versions of hfsplus module:
> --------------------------------
> 1. origin linux
> 2. apply my patch
> 3. mount with nodecompose option
> 4. apply your patch
> --------------------------------

It seems you were very thorough, thank you.
 
> There are the results of the test:
> ---------------------------------------------------------------------
> Linux can ls and cp the file from mac correctly with module 2,3,4.
> Mac cannot cp the files correctly which touched by module 1,2,3.
> Mac can cp the file correctly which touched by module with your patch.
> Mac can ls the files which touch by all modules.
> ---------------------------------------------------------------------
> 
> It seems that everything goes well with your patch. The result is just
> like you say, my patch or mount with nodecompose option cannot perform
> normally.

That's good to hear. I will submit a patch as soon as I can figure out
the licensing issues. If you want I can credit you as reporter and tester.

> 
> By the way, are the decompose procedures of function hfsplus_hash_dentry()
> and hfsplus_compare_dentry() need to modify as well?

You are correct, good thing you noticed. I'll add that when I resend the
patch.

> Thanks,
> TCHou

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-24 11:45         ` Ernesto A. Fernández
@ 2017-11-27  2:07           ` tchou
  2017-11-27 19:36             ` [PATCH] hfsplus: fix decomposition of Hangul characters Ernesto A. Fernández
  0 siblings, 1 reply; 32+ messages in thread
From: tchou @ 2017-11-27  2:07 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: linux-fsdevel, linux-fsdevel-owner, slava, htl10

Ernesto A. Fernández 於 2017-11-24 19:45 寫到:
> On Fri, Nov 24, 2017 at 03:25:40PM +0800, tchou wrote:
>> Ernesto A. Fernández 於 2017-11-23 19:32 寫到:
>> >Hi:
>> >
>> >your issue seems to be in the decomposition of hangul characters, not in
>> >the recomposition before printing. The hfsplus module on linux is saving
>> >the name of your actor as AC F5 C7 20, without performing any
>> >decomposition at all.
>> >
>> >The reason your patch hides the bug is because it causes linux to present
>> >filenames as decomposed utf8, so it is not necessary to decompose again
>> >before working with them. But the issue is still there, and you will most
>> >likely run into trouble if you make a hangul filename in linux and try
>> >to work with it in MacOS.
>> >
>> >Reviewing the code it would seem that the developers completely forgot
>> >the hangul characters had their own rules for decomposition. It's weird
>> >because they did the composition part correctly.
>> >
>> >I've made a quick draft of a patch, mostly by copying the code provided
>> >in the unicode web. I don't think we can actually use it on a release,
>> >but it should be enough to check if I'm right. It works fine on linux,
>> >but I don't have a mac, so it would be great if you could test it for me.
>> >
>> >Thanks,
>> >Ernest
>> >
>> >(By the way, there is no reason you should have to use the nodecompose
>> >mount option, as the other reviewer suggested. Using that option will
>> >have a similar effect to that of your patch. It will hide the problem,
>> >but if you create a hangul filename on linux with that option you
>> >probably won't be able to use it on a mac.)
>> >
>> >---
>> >diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
>> >index dfa90c2..9006c61 100644
>> >--- a/fs/hfsplus/unicode.c
>> >+++ b/fs/hfsplus/unicode.c
>> >@@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
>> >*sb, const char *astr, int len,
>> > 	return size;
>> > }
>> >
>> >-/* Decomposes a single unicode character. */
>> >+/* Decomposes a single non-Hangul unicode character. */
>> > static inline u16 *decompose_unichar(wchar_t uc, int *size)
>> > {
>> > 	int off;
>> >@@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int
>> >*size)
>> > 	return hfsplus_decompose_table + (off / 4);
>> > }
>> >
>> >+/* Decomposes a Hangul unicode character. */
>> >+int decompose_hangul(wchar_t uc, u16 *result)
>> >+{
>> >+	int index;
>> >+	int l, v, t;
>> >+
>> >+	index = uc - Hangul_SBase;
>> >+	if (index < 0 || index >= Hangul_SCount)
>> >+		return 0;
>> >+
>> >+	l = Hangul_LBase + index / Hangul_NCount;
>> >+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
>> >+	t = Hangul_TBase + index % Hangul_TCount;
>> >+
>> >+	result[0] = l;
>> >+	result[1] = v;
>> >+	if (t != Hangul_TBase) {
>> >+		result[2] = t;
>> >+		return 3;
>> >+	}
>> >+	return 2;
>> >+}
>> >+
>> > int hfsplus_asc2uni(struct super_block *sb,
>> > 		    struct hfsplus_unistr *ustr, int max_unistr_len,
>> > 		    const char *astr, int len)
>> >@@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
>> > 	int size, dsize, decompose;
>> > 	u16 *dstr, outlen = 0;
>> > 	wchar_t c;
>> >+	u16 hangul_buf[3];
>> >
>> > 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
>> > 	while (outlen < max_unistr_len && len > 0) {
>> > 		size = asc2unichar(sb, astr, len, &c);
>> >
>> >-		if (decompose)
>> >-			dstr = decompose_unichar(c, &dsize);
>> >-		else
>> >+		if (decompose) {
>> >+			/* Hangul is handled separately */
>> >+			dstr = &hangul_buf[0];
>> >+			dsize = decompose_hangul(c, dstr);
>> >+			if (dsize == 0)
>> >+				/* not Hangul */
>> >+				dstr = decompose_unichar(c, &dsize);
>> >+		} else {
>> > 			dstr = NULL;
>> >+		}
>> >+
>> > 		if (dstr) {
>> > 			if (outlen + dsize > max_unistr_len)
>> > 				break;
>> 
>> Hi,
>> 
>> Thank you for your explanation and your draft.
>> I test four versions of hfsplus module:
>> --------------------------------
>> 1. origin linux
>> 2. apply my patch
>> 3. mount with nodecompose option
>> 4. apply your patch
>> --------------------------------
> 
> It seems you were very thorough, thank you.
> 
>> There are the results of the test:
>> ---------------------------------------------------------------------
>> Linux can ls and cp the file from mac correctly with module 2,3,4.
>> Mac cannot cp the files correctly which touched by module 1,2,3.
>> Mac can cp the file correctly which touched by module with your patch.
>> Mac can ls the files which touch by all modules.
>> ---------------------------------------------------------------------
>> 
>> It seems that everything goes well with your patch. The result is just
>> like you say, my patch or mount with nodecompose option cannot perform
>> normally.
> 
> That's good to hear. I will submit a patch as soon as I can figure out
> the licensing issues. If you want I can credit you as reporter and 
> tester.
> 
OK, thanks a lot.
>> 
>> By the way, are the decompose procedures of function 
>> hfsplus_hash_dentry()
>> and hfsplus_compare_dentry() need to modify as well?
> 
> You are correct, good thing you noticed. I'll add that when I resend 
> the
> patch.
> 
>> Thanks,
>> TCHou

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-27  2:07           ` tchou
@ 2017-11-27 19:36             ` Ernesto A. Fernández
  2017-11-27 22:40               ` Viacheslav Dubeyko
  2018-08-23 18:29               ` Ernesto A. Fernández
  0 siblings, 2 replies; 32+ messages in thread
From: Ernesto A. Fernández @ 2017-11-27 19:36 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: tchou, slava, htl10, Alexander Viro, Andrew Morton,
	Ernesto A. Fernández

It would be good to have some feedback on the use of code under the
Unicode License, I don't know if that can be a problem.

To Ting-Chang Hou: This patch is slightly different from the one you
tested with your mac. You are still tagged as Tested-by, but you may
want to check again that I have not broken anything.

Thanks,
Ernest

-- >8 --
Subject: [PATCH] hfsplus: fix decomposition of Hangul characters

Files created under macOS become essentially unusable under linux if
their names contain any Hangul, and vice versa.

This happens because the normalization of Hangul characters is a special
case: it can be done algorithmically, without need of a table. The
hfsplus module deals with Hangul correctly when composing, but
completely forgets about it when performing a decomposition.

Solve this by using the Hangul decomposition function provided in the
Unicode Standard. It's under the Unicode License, compatible with the
GPL.

This patch will cause trouble for Hangul filenames already created by
the module in the past. This shouldn't be a special concern because the
main purpose of the module was always sharing with macOS. If a user
actually needs to access such a file the nodecompose mount option should
be enough.

Reported-by: Ting-Chang Hou <tchou@synology.com>
Tested-by: Ting-Chang Hou <tchou@synology.com>
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
---
 fs/hfsplus/unicode.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 56 insertions(+), 6 deletions(-)

diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
index dfa90c2..7ceae91 100644
--- a/fs/hfsplus/unicode.c
+++ b/fs/hfsplus/unicode.c
@@ -272,8 +272,8 @@ static inline int asc2unichar(struct super_block *sb, const char *astr, int len,
 	return size;
 }
 
-/* Decomposes a single unicode character. */
-static inline u16 *decompose_unichar(wchar_t uc, int *size)
+/* Decomposes a non-Hangul unicode character. */
+static u16 *decompose_nonhangul(wchar_t uc, int *size)
 {
 	int off;
 
@@ -296,6 +296,51 @@ static inline u16 *decompose_unichar(wchar_t uc, int *size)
 	return hfsplus_decompose_table + (off / 4);
 }
 
+/*
+ * Decomposes a Hangul unicode character.
+ *
+ * This function was adapted from sample code in section 3.12 of the
+ * Unicode Standard, version 10.0.
+ *
+ * Copyright (C) 1991-2017 Unicode, Inc.  All rights reserved.  Distributed
+ * under the Terms of Use in http://www.unicode.org/copyright.html.
+ */
+static int decompose_hangul(wchar_t uc, u16 *result)
+{
+	int index;
+	int l, v, t;
+
+	index = uc - Hangul_SBase;
+	if (index < 0 || index >= Hangul_SCount)
+		return 0;
+
+	l = Hangul_LBase + index / Hangul_NCount;
+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
+	t = Hangul_TBase + index % Hangul_TCount;
+
+	result[0] = l;
+	result[1] = v;
+	if (t != Hangul_TBase) {
+		result[2] = t;
+		return 3;
+	}
+	return 2;
+}
+
+/* Decomposes a single unicode character. */
+static u16 *decompose_unichar(wchar_t uc, int *size, u16 *hangul_buffer)
+{
+	u16 *result;
+
+	/* Hangul is handled separately */
+	result = hangul_buffer;
+	*size = decompose_hangul(uc, result);
+	if (*size == 0)
+		/* not Hangul */
+		result = decompose_nonhangul(uc, size);
+	return result;
+}
+
 int hfsplus_asc2uni(struct super_block *sb,
 		    struct hfsplus_unistr *ustr, int max_unistr_len,
 		    const char *astr, int len)
@@ -303,13 +348,14 @@ int hfsplus_asc2uni(struct super_block *sb,
 	int size, dsize, decompose;
 	u16 *dstr, outlen = 0;
 	wchar_t c;
+	u16 dhangul[3];
 
 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
 	while (outlen < max_unistr_len && len > 0) {
 		size = asc2unichar(sb, astr, len, &c);
 
 		if (decompose)
-			dstr = decompose_unichar(c, &dsize);
+			dstr = decompose_unichar(c, &dsize, &dhangul[0]);
 		else
 			dstr = NULL;
 		if (dstr) {
@@ -344,6 +390,7 @@ int hfsplus_hash_dentry(const struct dentry *dentry, struct qstr *str)
 	unsigned long hash;
 	wchar_t c;
 	u16 c2;
+	u16 dhangul[3];
 
 	casefold = test_bit(HFSPLUS_SB_CASEFOLD, &HFSPLUS_SB(sb)->flags);
 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
@@ -357,7 +404,7 @@ int hfsplus_hash_dentry(const struct dentry *dentry, struct qstr *str)
 		len -= size;
 
 		if (decompose)
-			dstr = decompose_unichar(c, &dsize);
+			dstr = decompose_unichar(c, &dsize, &dhangul[0]);
 		else
 			dstr = NULL;
 		if (dstr) {
@@ -396,6 +443,7 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
 	const char *astr1, *astr2;
 	u16 c1, c2;
 	wchar_t c;
+	u16 dhangul_1[3], dhangul_2[3];
 
 	casefold = test_bit(HFSPLUS_SB_CASEFOLD, &HFSPLUS_SB(sb)->flags);
 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
@@ -413,7 +461,8 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
 			len1 -= size;
 
 			if (decompose)
-				dstr1 = decompose_unichar(c, &dsize1);
+				dstr1 = decompose_unichar(c, &dsize1,
+							  &dhangul_1[0]);
 			if (!decompose || !dstr1) {
 				c1 = c;
 				dstr1 = &c1;
@@ -427,7 +476,8 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
 			len2 -= size;
 
 			if (decompose)
-				dstr2 = decompose_unichar(c, &dsize2);
+				dstr2 = decompose_unichar(c, &dsize2,
+							  &dhangul_2[0]);
 			if (!decompose || !dstr2) {
 				c2 = c;
 				dstr2 = &c2;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-27 19:36             ` [PATCH] hfsplus: fix decomposition of Hangul characters Ernesto A. Fernández
@ 2017-11-27 22:40               ` Viacheslav Dubeyko
  2017-11-28 15:02                 ` Ernesto A. Fernández
  2018-08-23 18:29               ` Ernesto A. Fernández
  1 sibling, 1 reply; 32+ messages in thread
From: Viacheslav Dubeyko @ 2017-11-27 22:40 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: linux-fsdevel, tchou, htl10, Alexander Viro, Andrew Morton

On Mon, 2017-11-27 at 16:36 -0300, Ernesto A. Fernández wrote:
> It would be good to have some feedback on the use of code under the
> Unicode License, I don't know if that can be a problem.
> 
> To Ting-Chang Hou: This patch is slightly different from the one you
> tested with your mac. You are still tagged as Tested-by, but you may
> want to check again that I have not broken anything.
> 
> Thanks,
> Ernest
> 
> -- >8 --
> Subject: [PATCH] hfsplus: fix decomposition of Hangul characters
> 
> Files created under macOS become essentially unusable under linux if
> their names contain any Hangul, and vice versa.
> 
> This happens because the normalization of Hangul characters is a special
> case: it can be done algorithmically, without need of a table. The
> hfsplus module deals with Hangul correctly when composing, but
> completely forgets about it when performing a decomposition.
> 
> Solve this by using the Hangul decomposition function provided in the
> Unicode Standard. It's under the Unicode License, compatible with the
> GPL.
> 

I believe it makes sense to share more details about algorithm details.
I think that the Unicode standard contains some explanation how to
decompose correctly. It will be great to have some brief citation in the
patch description.

> This patch will cause trouble for Hangul filenames already created by
> the module in the past. This shouldn't be a special concern because the
> main purpose of the module was always sharing with macOS. If a user
> actually needs to access such a file the nodecompose mount option should
> be enough.
> 
> Reported-by: Ting-Chang Hou <tchou@synology.com>
> Tested-by: Ting-Chang Hou <tchou@synology.com>
> Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
> ---
>  fs/hfsplus/unicode.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 56 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> index dfa90c2..7ceae91 100644
> --- a/fs/hfsplus/unicode.c
> +++ b/fs/hfsplus/unicode.c
> @@ -272,8 +272,8 @@ static inline int asc2unichar(struct super_block *sb, const char *astr, int len,
>  	return size;
>  }
>  
> -/* Decomposes a single unicode character. */
> -static inline u16 *decompose_unichar(wchar_t uc, int *size)
> +/* Decomposes a non-Hangul unicode character. */
> +static u16 *decompose_nonhangul(wchar_t uc, int *size)
>  {
>  	int off;
>  
> @@ -296,6 +296,51 @@ static inline u16 *decompose_unichar(wchar_t uc, int *size)
>  	return hfsplus_decompose_table + (off / 4);
>  }
>  
> +/*
> + * Decomposes a Hangul unicode character.
> + *
> + * This function was adapted from sample code in section 3.12 of the
> + * Unicode Standard, version 10.0.
> + *
> + * Copyright (C) 1991-2017 Unicode, Inc.  All rights reserved.  Distributed
> + * under the Terms of Use in http://www.unicode.org/copyright.html.
> + */
> +static int decompose_hangul(wchar_t uc, u16 *result)
> +{
> +	int index;
> +	int l, v, t;
> +
> +	index = uc - Hangul_SBase;
> +	if (index < 0 || index >= Hangul_SCount)
> +		return 0;
> +
> +	l = Hangul_LBase + index / Hangul_NCount;
> +	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> +	t = Hangul_TBase + index % Hangul_TCount;
> +
> +	result[0] = l;
> +	result[1] = v;
> +	if (t != Hangul_TBase) {
> +		result[2] = t;
> +		return 3;
> +	}
> +	return 2;
> +}
> +
> +/* Decomposes a single unicode character. */
> +static u16 *decompose_unichar(wchar_t uc, int *size, u16 *hangul_buffer)
> +{
> +	u16 *result;
> +
> +	/* Hangul is handled separately */
> +	result = hangul_buffer;
> +	*size = decompose_hangul(uc, result);

Is it possible to distinguish the different cases beforehand and to
select one proper function for execution without the execution both ones
in the worst case?

> +	if (*size == 0)
> +		/* not Hangul */
> +		result = decompose_nonhangul(uc, size);

I think it makes sense to use the brackets here. If I remember
correctly, this is the code style's requirement.

Thanks,
Vyacheslav Dubeyko.

> +	return result;
> +}
> +
>  int hfsplus_asc2uni(struct super_block *sb,
>  		    struct hfsplus_unistr *ustr, int max_unistr_len,
>  		    const char *astr, int len)
> @@ -303,13 +348,14 @@ int hfsplus_asc2uni(struct super_block *sb,
>  	int size, dsize, decompose;
>  	u16 *dstr, outlen = 0;
>  	wchar_t c;
> +	u16 dhangul[3];
>  
>  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
>  	while (outlen < max_unistr_len && len > 0) {
>  		size = asc2unichar(sb, astr, len, &c);
>  
>  		if (decompose)
> -			dstr = decompose_unichar(c, &dsize);
> +			dstr = decompose_unichar(c, &dsize, &dhangul[0]);
>  		else
>  			dstr = NULL;
>  		if (dstr) {
> @@ -344,6 +390,7 @@ int hfsplus_hash_dentry(const struct dentry *dentry, struct qstr *str)
>  	unsigned long hash;
>  	wchar_t c;
>  	u16 c2;
> +	u16 dhangul[3];
>  
>  	casefold = test_bit(HFSPLUS_SB_CASEFOLD, &HFSPLUS_SB(sb)->flags);
>  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> @@ -357,7 +404,7 @@ int hfsplus_hash_dentry(const struct dentry *dentry, struct qstr *str)
>  		len -= size;
>  
>  		if (decompose)
> -			dstr = decompose_unichar(c, &dsize);
> +			dstr = decompose_unichar(c, &dsize, &dhangul[0]);
>  		else
>  			dstr = NULL;
>  		if (dstr) {
> @@ -396,6 +443,7 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
>  	const char *astr1, *astr2;
>  	u16 c1, c2;
>  	wchar_t c;
> +	u16 dhangul_1[3], dhangul_2[3];
>  
>  	casefold = test_bit(HFSPLUS_SB_CASEFOLD, &HFSPLUS_SB(sb)->flags);
>  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> @@ -413,7 +461,8 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
>  			len1 -= size;
>  
>  			if (decompose)
> -				dstr1 = decompose_unichar(c, &dsize1);
> +				dstr1 = decompose_unichar(c, &dsize1,
> +							  &dhangul_1[0]);
>  			if (!decompose || !dstr1) {
>  				c1 = c;
>  				dstr1 = &c1;
> @@ -427,7 +476,8 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
>  			len2 -= size;
>  
>  			if (decompose)
> -				dstr2 = decompose_unichar(c, &dsize2);
> +				dstr2 = decompose_unichar(c, &dsize2,
> +							  &dhangul_2[0]);
>  			if (!decompose || !dstr2) {
>  				c2 = c;
>  				dstr2 = &c2;

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-27 22:40               ` Viacheslav Dubeyko
@ 2017-11-28 15:02                 ` Ernesto A. Fernández
  2017-11-28 16:30                   ` Viacheslav Dubeyko
  0 siblings, 1 reply; 32+ messages in thread
From: Ernesto A. Fernández @ 2017-11-28 15:02 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: linux-fsdevel, tchou, htl10, Alexander Viro, Andrew Morton,
	Ernesto A. Fernández

On Mon, Nov 27, 2017 at 02:40:53PM -0800, Viacheslav Dubeyko wrote:
> On Mon, 2017-11-27 at 16:36 -0300, Ernesto A. Fernández wrote:
> > It would be good to have some feedback on the use of code under the
> > Unicode License, I don't know if that can be a problem.
> > 
> > To Ting-Chang Hou: This patch is slightly different from the one you
> > tested with your mac. You are still tagged as Tested-by, but you may
> > want to check again that I have not broken anything.
> > 
> > Thanks,
> > Ernest
> > 
> > -- >8 --
> > Subject: [PATCH] hfsplus: fix decomposition of Hangul characters
> > 
> > Files created under macOS become essentially unusable under linux if
> > their names contain any Hangul, and vice versa.
> > 
> > This happens because the normalization of Hangul characters is a special
> > case: it can be done algorithmically, without need of a table. The
> > hfsplus module deals with Hangul correctly when composing, but
> > completely forgets about it when performing a decomposition.
> > 
> > Solve this by using the Hangul decomposition function provided in the
> > Unicode Standard. It's under the Unicode License, compatible with the
> > GPL.
> > 
> 
> I believe it makes sense to share more details about algorithm details.
> I think that the Unicode standard contains some explanation how to
> decompose correctly. It will be great to have some brief citation in the
> patch description.

The algorithm is very simple, the best way to understand it is just
looking at the code. I don't know the first thing about Korean writing, so
I don't think I should attempt to explain why the decomposition is done
this way. If somebody else is interested in the details, they can follow
the citation in the header comment of the decompose_hangul function.

> > This patch will cause trouble for Hangul filenames already created by
> > the module in the past. This shouldn't be a special concern because the
> > main purpose of the module was always sharing with macOS. If a user
> > actually needs to access such a file the nodecompose mount option should
> > be enough.
> > 
> > Reported-by: Ting-Chang Hou <tchou@synology.com>
> > Tested-by: Ting-Chang Hou <tchou@synology.com>
> > Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
> > ---
> >  fs/hfsplus/unicode.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++-----
> >  1 file changed, 56 insertions(+), 6 deletions(-)
> > 
> > diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> > index dfa90c2..7ceae91 100644
> > --- a/fs/hfsplus/unicode.c
> > +++ b/fs/hfsplus/unicode.c
> > @@ -272,8 +272,8 @@ static inline int asc2unichar(struct super_block *sb, const char *astr, int len,
> >  	return size;
> >  }
> >  
> > -/* Decomposes a single unicode character. */
> > -static inline u16 *decompose_unichar(wchar_t uc, int *size)
> > +/* Decomposes a non-Hangul unicode character. */
> > +static u16 *decompose_nonhangul(wchar_t uc, int *size)
> >  {
> >  	int off;
> >  
> > @@ -296,6 +296,51 @@ static inline u16 *decompose_unichar(wchar_t uc, int *size)
> >  	return hfsplus_decompose_table + (off / 4);
> >  }
> >  
> > +/*
> > + * Decomposes a Hangul unicode character.
> > + *
> > + * This function was adapted from sample code in section 3.12 of the
> > + * Unicode Standard, version 10.0.
> > + *
> > + * Copyright (C) 1991-2017 Unicode, Inc.  All rights reserved.  Distributed
> > + * under the Terms of Use in http://www.unicode.org/copyright.html.
> > + */
> > +static int decompose_hangul(wchar_t uc, u16 *result)
> > +{
> > +	int index;
> > +	int l, v, t;
> > +
> > +	index = uc - Hangul_SBase;
> > +	if (index < 0 || index >= Hangul_SCount)
> > +		return 0;
> > +
> > +	l = Hangul_LBase + index / Hangul_NCount;
> > +	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> > +	t = Hangul_TBase + index % Hangul_TCount;
> > +
> > +	result[0] = l;
> > +	result[1] = v;
> > +	if (t != Hangul_TBase) {
> > +		result[2] = t;
> > +		return 3;
> > +	}
> > +	return 2;
> > +}
> > +
> > +/* Decomposes a single unicode character. */
> > +static u16 *decompose_unichar(wchar_t uc, int *size, u16 *hangul_buffer)
> > +{
> > +	u16 *result;
> > +
> > +	/* Hangul is handled separately */
> > +	result = hangul_buffer;
> > +	*size = decompose_hangul(uc, result);
> 
> Is it possible to distinguish the different cases beforehand and to
> select one proper function for execution without the execution both ones
> in the worst case?

The decompose_hangul and decompose_nonhangul functions are called only once,
so I would expect the compiler to inline them both. The result should be
similar to your suggestion.

> > +	if (*size == 0)
> > +		/* not Hangul */
> > +		result = decompose_nonhangul(uc, size);
> 
> I think it makes sense to use the brackets here. If I remember
> correctly, this is the code style's requirement.

I couldn't find such a requirement, but I did find some other places within
the hfsplus code where the comments are done my way. For example, in the
hfsplus_cat_write_inode() function of the inode.c file.

> Thanks,
> Vyacheslav Dubeyko.

Thanks,
Ernest
 
> > +	return result;
> > +}
> > +
> >  int hfsplus_asc2uni(struct super_block *sb,
> >  		    struct hfsplus_unistr *ustr, int max_unistr_len,
> >  		    const char *astr, int len)
> > @@ -303,13 +348,14 @@ int hfsplus_asc2uni(struct super_block *sb,
> >  	int size, dsize, decompose;
> >  	u16 *dstr, outlen = 0;
> >  	wchar_t c;
> > +	u16 dhangul[3];
> >  
> >  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> >  	while (outlen < max_unistr_len && len > 0) {
> >  		size = asc2unichar(sb, astr, len, &c);
> >  
> >  		if (decompose)
> > -			dstr = decompose_unichar(c, &dsize);
> > +			dstr = decompose_unichar(c, &dsize, &dhangul[0]);
> >  		else
> >  			dstr = NULL;
> >  		if (dstr) {
> > @@ -344,6 +390,7 @@ int hfsplus_hash_dentry(const struct dentry *dentry, struct qstr *str)
> >  	unsigned long hash;
> >  	wchar_t c;
> >  	u16 c2;
> > +	u16 dhangul[3];
> >  
> >  	casefold = test_bit(HFSPLUS_SB_CASEFOLD, &HFSPLUS_SB(sb)->flags);
> >  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> > @@ -357,7 +404,7 @@ int hfsplus_hash_dentry(const struct dentry *dentry, struct qstr *str)
> >  		len -= size;
> >  
> >  		if (decompose)
> > -			dstr = decompose_unichar(c, &dsize);
> > +			dstr = decompose_unichar(c, &dsize, &dhangul[0]);
> >  		else
> >  			dstr = NULL;
> >  		if (dstr) {
> > @@ -396,6 +443,7 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
> >  	const char *astr1, *astr2;
> >  	u16 c1, c2;
> >  	wchar_t c;
> > +	u16 dhangul_1[3], dhangul_2[3];
> >  
> >  	casefold = test_bit(HFSPLUS_SB_CASEFOLD, &HFSPLUS_SB(sb)->flags);
> >  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> > @@ -413,7 +461,8 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
> >  			len1 -= size;
> >  
> >  			if (decompose)
> > -				dstr1 = decompose_unichar(c, &dsize1);
> > +				dstr1 = decompose_unichar(c, &dsize1,
> > +							  &dhangul_1[0]);
> >  			if (!decompose || !dstr1) {
> >  				c1 = c;
> >  				dstr1 = &c1;
> > @@ -427,7 +476,8 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
> >  			len2 -= size;
> >  
> >  			if (decompose)
> > -				dstr2 = decompose_unichar(c, &dsize2);
> > +				dstr2 = decompose_unichar(c, &dsize2,
> > +							  &dhangul_2[0]);
> >  			if (!decompose || !dstr2) {
> >  				c2 = c;
> >  				dstr2 = &c2;
> 
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-28 15:02                 ` Ernesto A. Fernández
@ 2017-11-28 16:30                   ` Viacheslav Dubeyko
  2017-11-28 18:15                     ` Ernesto A. Fernández
  0 siblings, 1 reply; 32+ messages in thread
From: Viacheslav Dubeyko @ 2017-11-28 16:30 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: linux-fsdevel, tchou, htl10, Alexander Viro, Andrew Morton

On Tue, 2017-11-28 at 12:02 -0300, Ernesto A. Fernández wrote:
> On Mon, Nov 27, 2017 at 02:40:53PM -0800, Viacheslav Dubeyko wrote:
> > 
> > On Mon, 2017-11-27 at 16:36 -0300, Ernesto A. Fernández wrote:
> > > 
> > > It would be good to have some feedback on the use of code under
> > > the
> > > Unicode License, I don't know if that can be a problem.
> > > 
> > > To Ting-Chang Hou: This patch is slightly different from the one
> > > you
> > > tested with your mac. You are still tagged as Tested-by, but you
> > > may
> > > want to check again that I have not broken anything.
> > > 
> > > Thanks,
> > > Ernest
> > > 
> > > -- >8 --
> > > Subject: [PATCH] hfsplus: fix decomposition of Hangul characters
> > > 
> > > Files created under macOS become essentially unusable under linux
> > > if
> > > their names contain any Hangul, and vice versa.
> > > 
> > > This happens because the normalization of Hangul characters is a
> > > special
> > > case: it can be done algorithmically, without need of a table.
> > > The
> > > hfsplus module deals with Hangul correctly when composing, but
> > > completely forgets about it when performing a decomposition.
> > > 
> > > Solve this by using the Hangul decomposition function provided in
> > > the
> > > Unicode Standard. It's under the Unicode License, compatible with
> > > the
> > > GPL.
> > > 
> > I believe it makes sense to share more details about algorithm
> > details.
> > I think that the Unicode standard contains some explanation how to
> > decompose correctly. It will be great to have some brief citation
> > in the
> > patch description.
> The algorithm is very simple, the best way to understand it is just
> looking at the code. I don't know the first thing about Korean
> writing, so
> I don't think I should attempt to explain why the decomposition is
> done
> this way. If somebody else is interested in the details, they can
> follow
> the citation in the header comment of the decompose_hangul function.
> 

You should share the link where it can be found. Finally, the comment
needs in more detailed explanation.


> > 
> > > 
> > > This patch will cause trouble for Hangul filenames already
> > > created by
> > > the module in the past. This shouldn't be a special concern
> > > because the
> > > main purpose of the module was always sharing with macOS. If a
> > > user
> > > actually needs to access such a file the nodecompose mount option
> > > should
> > > be enough.
> > > 
> > > Reported-by: Ting-Chang Hou <tchou@synology.com>
> > > Tested-by: Ting-Chang Hou <tchou@synology.com>
> > > Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.
> > > com>
> > > ---
> > >  fs/hfsplus/unicode.c | 62
> > > +++++++++++++++++++++++++++++++++++++++++++++++-----
> > >  1 file changed, 56 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> > > index dfa90c2..7ceae91 100644
> > > --- a/fs/hfsplus/unicode.c
> > > +++ b/fs/hfsplus/unicode.c
> > > @@ -272,8 +272,8 @@ static inline int asc2unichar(struct
> > > super_block *sb, const char *astr, int len,
> > >  	return size;
> > >  }
> > >  
> > > -/* Decomposes a single unicode character. */
> > > -static inline u16 *decompose_unichar(wchar_t uc, int *size)
> > > +/* Decomposes a non-Hangul unicode character. */
> > > +static u16 *decompose_nonhangul(wchar_t uc, int *size)
> > >  {
> > >  	int off;
> > >  
> > > @@ -296,6 +296,51 @@ static inline u16 *decompose_unichar(wchar_t
> > > uc, int *size)
> > >  	return hfsplus_decompose_table + (off / 4);
> > >  }
> > >  
> > > +/*
> > > + * Decomposes a Hangul unicode character.
> > > + *
> > > + * This function was adapted from sample code in section 3.12 of
> > > the
> > > + * Unicode Standard, version 10.0.
> > > + *
> > > + * Copyright (C) 1991-2017 Unicode, Inc.  All rights
> > > reserved.  Distributed
> > > + * under the Terms of Use in http://www.unicode.org/copyright.ht
> > > ml.
> > > + */
> > > +static int decompose_hangul(wchar_t uc, u16 *result)
> > > +{
> > > +	int index;
> > > +	int l, v, t;
> > > +
> > > +	index = uc - Hangul_SBase;
> > > +	if (index < 0 || index >= Hangul_SCount)
> > > +		return 0;
> > > +
> > > +	l = Hangul_LBase + index / Hangul_NCount;
> > > +	v = Hangul_VBase + (index % Hangul_NCount) /
> > > Hangul_TCount;
> > > +	t = Hangul_TBase + index % Hangul_TCount;
> > > +
> > > +	result[0] = l;
> > > +	result[1] = v;
> > > +	if (t != Hangul_TBase) {
> > > +		result[2] = t;
> > > +		return 3;
> > > +	}
> > > +	return 2;
> > > +}
> > > +
> > > +/* Decomposes a single unicode character. */
> > > +static u16 *decompose_unichar(wchar_t uc, int *size, u16
> > > *hangul_buffer)
> > > +{
> > > +	u16 *result;
> > > +
> > > +	/* Hangul is handled separately */
> > > +	result = hangul_buffer;
> > > +	*size = decompose_hangul(uc, result);
> > Is it possible to distinguish the different cases beforehand and to
> > select one proper function for execution without the execution both
> > ones
> > in the worst case?
> The decompose_hangul and decompose_nonhangul functions are called
> only once,
> so I would expect the compiler to inline them both. The result should
> be
> similar to your suggestion.
> 


It will be much better to have:

If (hangful) {
    <decompose_hangul>
} else {
    <decompose_nonhangul>
}

Currently, the code looks weird. Especially, decompose_hangul() is able
to distinguish two different cases. So, it's easy to extract the check.

> > 
> > > 
> > > +	if (*size == 0)
> > > +		/* not Hangul */
> > > +		result = decompose_nonhangul(uc, size);
> > I think it makes sense to use the brackets here. If I remember
> > correctly, this is the code style's requirement.
> I couldn't find such a requirement, but I did find some other places
> within
> the hfsplus code where the comments are done my way. For example, in
> the
> hfsplus_cat_write_inode() function of the inode.c file.
> 


The hfsplus code contains many awful places. It doesn't mean that it
needs to follow the bad style examples. If you have only one string:

if (condition)
    <do_something>

then it's OK not to use the brackets. Otherwise, you should include the
branch into the brackets.

Thanks,
Vyacheslav Dubeyko.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-28 16:30                   ` Viacheslav Dubeyko
@ 2017-11-28 18:15                     ` Ernesto A. Fernández
  0 siblings, 0 replies; 32+ messages in thread
From: Ernesto A. Fernández @ 2017-11-28 18:15 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: linux-fsdevel, tchou, htl10, Alexander Viro, Andrew Morton,
	Ernesto A. Fernández

On Tue, Nov 28, 2017 at 08:30:54AM -0800, Viacheslav Dubeyko wrote:
> On Tue, 2017-11-28 at 12:02 -0300, Ernesto A. Fernández wrote:
> > On Mon, Nov 27, 2017 at 02:40:53PM -0800, Viacheslav Dubeyko wrote:
> > > 
> > > On Mon, 2017-11-27 at 16:36 -0300, Ernesto A. Fernández wrote:
> > > > 
> > > > It would be good to have some feedback on the use of code under
> > > > the
> > > > Unicode License, I don't know if that can be a problem.
> > > > 
> > > > To Ting-Chang Hou: This patch is slightly different from the one
> > > > you
> > > > tested with your mac. You are still tagged as Tested-by, but you
> > > > may
> > > > want to check again that I have not broken anything.
> > > > 
> > > > Thanks,
> > > > Ernest
> > > > 
> > > > -- >8 --
> > > > Subject: [PATCH] hfsplus: fix decomposition of Hangul characters
> > > > 
> > > > Files created under macOS become essentially unusable under linux
> > > > if
> > > > their names contain any Hangul, and vice versa.
> > > > 
> > > > This happens because the normalization of Hangul characters is a
> > > > special
> > > > case: it can be done algorithmically, without need of a table.
> > > > The
> > > > hfsplus module deals with Hangul correctly when composing, but
> > > > completely forgets about it when performing a decomposition.
> > > > 
> > > > Solve this by using the Hangul decomposition function provided in
> > > > the
> > > > Unicode Standard. It's under the Unicode License, compatible with
> > > > the
> > > > GPL.
> > > > 
> > > I believe it makes sense to share more details about algorithm
> > > details.
> > > I think that the Unicode standard contains some explanation how to
> > > decompose correctly. It will be great to have some brief citation
> > > in the
> > > patch description.
> > The algorithm is very simple, the best way to understand it is just
> > looking at the code. I don't know the first thing about Korean
> > writing, so
> > I don't think I should attempt to explain why the decomposition is
> > done
> > this way. If somebody else is interested in the details, they can
> > follow
> > the citation in the header comment of the decompose_hangul function.
> > 
> 
> You should share the link where it can be found.

Is that necessary? I think people can find the Unicode Standard on their
own. Even if I did provide a link, it's a pdf, so people would still need
to look up the right sections. It's not going to save anybody much work.

> Finally, the comment
> needs in more detailed explanation.

Are you talking about the function header comment? I don't know if I
follow. Perhaps I could explain that the return value is the size, but
it seems pretty obvious from the use.

> > > 
> > > > 
> > > > This patch will cause trouble for Hangul filenames already
> > > > created by
> > > > the module in the past. This shouldn't be a special concern
> > > > because the
> > > > main purpose of the module was always sharing with macOS. If a
> > > > user
> > > > actually needs to access such a file the nodecompose mount option
> > > > should
> > > > be enough.
> > > > 
> > > > Reported-by: Ting-Chang Hou <tchou@synology.com>
> > > > Tested-by: Ting-Chang Hou <tchou@synology.com>
> > > > Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.
> > > > com>
> > > > ---
> > > >  fs/hfsplus/unicode.c | 62
> > > > +++++++++++++++++++++++++++++++++++++++++++++++-----
> > > >  1 file changed, 56 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> > > > index dfa90c2..7ceae91 100644
> > > > --- a/fs/hfsplus/unicode.c
> > > > +++ b/fs/hfsplus/unicode.c
> > > > @@ -272,8 +272,8 @@ static inline int asc2unichar(struct
> > > > super_block *sb, const char *astr, int len,
> > > >  	return size;
> > > >  }
> > > >  
> > > > -/* Decomposes a single unicode character. */
> > > > -static inline u16 *decompose_unichar(wchar_t uc, int *size)
> > > > +/* Decomposes a non-Hangul unicode character. */
> > > > +static u16 *decompose_nonhangul(wchar_t uc, int *size)
> > > >  {
> > > >  	int off;
> > > >  
> > > > @@ -296,6 +296,51 @@ static inline u16 *decompose_unichar(wchar_t
> > > > uc, int *size)
> > > >  	return hfsplus_decompose_table + (off / 4);
> > > >  }
> > > >  
> > > > +/*
> > > > + * Decomposes a Hangul unicode character.
> > > > + *
> > > > + * This function was adapted from sample code in section 3.12 of
> > > > the
> > > > + * Unicode Standard, version 10.0.
> > > > + *
> > > > + * Copyright (C) 1991-2017 Unicode, Inc.  All rights
> > > > reserved.  Distributed
> > > > + * under the Terms of Use in http://www.unicode.org/copyright.ht
> > > > ml.
> > > > + */
> > > > +static int decompose_hangul(wchar_t uc, u16 *result)
> > > > +{
> > > > +	int index;
> > > > +	int l, v, t;
> > > > +
> > > > +	index = uc - Hangul_SBase;
> > > > +	if (index < 0 || index >= Hangul_SCount)
> > > > +		return 0;
> > > > +
> > > > +	l = Hangul_LBase + index / Hangul_NCount;
> > > > +	v = Hangul_VBase + (index % Hangul_NCount) /
> > > > Hangul_TCount;
> > > > +	t = Hangul_TBase + index % Hangul_TCount;
> > > > +
> > > > +	result[0] = l;
> > > > +	result[1] = v;
> > > > +	if (t != Hangul_TBase) {
> > > > +		result[2] = t;
> > > > +		return 3;
> > > > +	}
> > > > +	return 2;
> > > > +}
> > > > +
> > > > +/* Decomposes a single unicode character. */
> > > > +static u16 *decompose_unichar(wchar_t uc, int *size, u16
> > > > *hangul_buffer)
> > > > +{
> > > > +	u16 *result;
> > > > +
> > > > +	/* Hangul is handled separately */
> > > > +	result = hangul_buffer;
> > > > +	*size = decompose_hangul(uc, result);
> > > Is it possible to distinguish the different cases beforehand and to
> > > select one proper function for execution without the execution both
> > > ones
> > > in the worst case?
> > The decompose_hangul and decompose_nonhangul functions are called
> > only once,
> > so I would expect the compiler to inline them both. The result should
> > be
> > similar to your suggestion.
> > 
> 
> 
> It will be much better to have:
> 
> If (hangful) {
>     <decompose_hangul>
> } else {
>     <decompose_nonhangul>
> }
> 
> Currently, the code looks weird. Especially, decompose_hangul() is able
> to distinguish two different cases. So, it's easy to extract the check.

OK, your previous objection seemed to be about efficiency, not style.

Moving the Hangul check outside the function would require a reader of
decompose_unichar() to figure out what Hangul_SBase and Hangul_SCount
are, when they may not be interested in the Hangul case at all. My
version may be weird but I think it's easier to understand.

Keep in mind that the old decompose_unichar function was also checking
if the character was Hangul, and doing nothing if that was the case. So
this is not a big departure from the existing code.

Finally, there is the issue of the license; I don't know if I should be
splitting decompose_hangul() since it would make it harder to give
attribution.

> > > 
> > > > 
> > > > +	if (*size == 0)
> > > > +		/* not Hangul */
> > > > +		result = decompose_nonhangul(uc, size);
> > > I think it makes sense to use the brackets here. If I remember
> > > correctly, this is the code style's requirement.
> > I couldn't find such a requirement, but I did find some other places
> > within
> > the hfsplus code where the comments are done my way. For example, in
> > the
> > hfsplus_cat_write_inode() function of the inode.c file.
> > 
> 
> 
> The hfsplus code contains many awful places.

I'd say most of the older code is decent and easy to understand.

> It doesn't mean that it
> needs to follow the bad style examples. If you have only one string:
> 
> if (condition)
>     <do_something>
> 
> then it's OK not to use the brackets. Otherwise, you should include the
> branch into the brackets.

I looked around outside of hfsplus, and it seems that both styles are
in use. The most charming example I found was in the rb_next_postorder()
function of lib/rbtree.c, where one branch of a conditional uses your
style and the other uses mine.

If you can actually find the place in the syle guide where this is
required then I will make the change. Otherwise I would prefer to leave
it as it is.

> Thanks,
> Vyacheslav Dubeyko.
> 
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-27 19:36             ` [PATCH] hfsplus: fix decomposition of Hangul characters Ernesto A. Fernández
  2017-11-27 22:40               ` Viacheslav Dubeyko
@ 2018-08-23 18:29               ` Ernesto A. Fernández
  2018-08-24  1:20                 ` tchou
  1 sibling, 1 reply; 32+ messages in thread
From: Ernesto A. Fernández @ 2018-08-23 18:29 UTC (permalink / raw)
  To: tchou; +Cc: slava, linux-fsdevel, htl10

Hi again:

On Mon, Nov 27, 2017 at 04:36:54PM -0300, Ernesto A. Fernández wrote:
> It would be good to have some feedback on the use of code under the
> Unicode License, I don't know if that can be a problem.
> 
> To Ting-Chang Hou: This patch is slightly different from the one you
> tested with your mac. You are still tagged as Tested-by, but you may
> want to check again that I have not broken anything.
> 
> Thanks,
> Ernest
> 
> -- >8 --
> Subject: [PATCH] hfsplus: fix decomposition of Hangul characters
> 
> Files created under macOS become essentially unusable under linux if
> their names contain any Hangul, and vice versa.
> 
> This happens because the normalization of Hangul characters is a special
> case: it can be done algorithmically, without need of a table. The
> hfsplus module deals with Hangul correctly when composing, but
> completely forgets about it when performing a decomposition.
> 
> Solve this by using the Hangul decomposition function provided in the
> Unicode Standard. It's under the Unicode License, compatible with the
> GPL.
> 
> This patch will cause trouble for Hangul filenames already created by
> the module in the past. This shouldn't be a special concern because the
> main purpose of the module was always sharing with macOS. If a user
> actually needs to access such a file the nodecompose mount option should
> be enough.
> 
> Reported-by: Ting-Chang Hou <tchou@synology.com>
> Tested-by: Ting-Chang Hou <tchou@synology.com>
> Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
>

It took a while but this patch has been merged now. Thanks again for your
report.

Ernest

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2018-08-23 18:29               ` Ernesto A. Fernández
@ 2018-08-24  1:20                 ` tchou
  0 siblings, 0 replies; 32+ messages in thread
From: tchou @ 2018-08-24  1:20 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: slava, linux-fsdevel, htl10, linux-fsdevel-owner

Hi,

Ernesto A. Fernández 於 2018-08-24 02:29 寫到:
> Hi again:
> 
> On Mon, Nov 27, 2017 at 04:36:54PM -0300, Ernesto A. Fernández wrote:
>> It would be good to have some feedback on the use of code under the
>> Unicode License, I don't know if that can be a problem.
>> 
>> To Ting-Chang Hou: This patch is slightly different from the one you
>> tested with your mac. You are still tagged as Tested-by, but you may
>> want to check again that I have not broken anything.
>> 
>> Thanks,
>> Ernest
>> 
>> -- >8 --
>> Subject: [PATCH] hfsplus: fix decomposition of Hangul characters
>> 
>> Files created under macOS become essentially unusable under linux if
>> their names contain any Hangul, and vice versa.
>> 
>> This happens because the normalization of Hangul characters is a 
>> special
>> case: it can be done algorithmically, without need of a table. The
>> hfsplus module deals with Hangul correctly when composing, but
>> completely forgets about it when performing a decomposition.
>> 
>> Solve this by using the Hangul decomposition function provided in the
>> Unicode Standard. It's under the Unicode License, compatible with the
>> GPL.
>> 
>> This patch will cause trouble for Hangul filenames already created by
>> the module in the past. This shouldn't be a special concern because 
>> the
>> main purpose of the module was always sharing with macOS. If a user
>> actually needs to access such a file the nodecompose mount option 
>> should
>> be enough.
>> 
>> Reported-by: Ting-Chang Hou <tchou@synology.com>
>> Tested-by: Ting-Chang Hou <tchou@synology.com>
>> Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
>> 
> 
> It took a while but this patch has been merged now. Thanks again for 
> your
> report.

And thanks for your patch, too.

> 
> Ernest

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] hfsplus: fix decomposition of Hangul characters
@ 2018-07-17 22:09 Ernesto A. Fernández
  0 siblings, 0 replies; 32+ messages in thread
From: Ernesto A. Fernández @ 2018-07-17 22:09 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Andrew Morton, Ting-Chang Hou

Files created under macOS cannot be opened under linux if their names
contain Korean characters, and vice versa.

The Korean alphabet is special because its normalization is done without
a table. The module deals with it correctly when composing, but forgets
about it for the decomposition.

Fix this using the Hangul decomposition function provided in the Unicode
Standard. The code fits a bit awkwardly because it requires a buffer,
while all the other normalizations are returned as pointers to the
decomposition table. This is actually also a bug because reordering may
still be needed, but for now leave it as it is.

The patch will cause trouble for Hangul filenames already created by the
module in the past. This shouldn't really be concern because its main
purpose was always sharing with macOS. If a user actually needs to
access such a file the nodecompose mount option should be enough.

Reported-and-tested-by: Ting-Chang Hou <tchou@synology.com>
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
---
 fs/hfsplus/unicode.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 56 insertions(+), 6 deletions(-)

diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
index dfa90c21948f..c8d1b2be7854 100644
--- a/fs/hfsplus/unicode.c
+++ b/fs/hfsplus/unicode.c
@@ -272,8 +272,8 @@ static inline int asc2unichar(struct super_block *sb, const char *astr, int len,
 	return size;
 }
 
-/* Decomposes a single unicode character. */
-static inline u16 *decompose_unichar(wchar_t uc, int *size)
+/* Decomposes a non-Hangul unicode character. */
+static u16 *hfsplus_decompose_nonhangul(wchar_t uc, int *size)
 {
 	int off;
 
@@ -296,6 +296,51 @@ static inline u16 *decompose_unichar(wchar_t uc, int *size)
 	return hfsplus_decompose_table + (off / 4);
 }
 
+/*
+ * Try to decompose a unicode character as Hangul. Return 0 if @uc is not
+ * precomposed Hangul, otherwise return the length of the decomposition.
+ *
+ * This function was adapted from sample code from the Unicode Standard
+ * Annex #15: Unicode Normalization Forms, version 3.2.0.
+ *
+ * Copyright (C) 1991-2018 Unicode, Inc.  All rights reserved.  Distributed
+ * under the Terms of Use in http://www.unicode.org/copyright.html.
+ */
+static int hfsplus_try_decompose_hangul(wchar_t uc, u16 *result)
+{
+	int index;
+	int l, v, t;
+
+	index = uc - Hangul_SBase;
+	if (index < 0 || index >= Hangul_SCount)
+		return 0;
+
+	l = Hangul_LBase + index / Hangul_NCount;
+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
+	t = Hangul_TBase + index % Hangul_TCount;
+
+	result[0] = l;
+	result[1] = v;
+	if (t != Hangul_TBase) {
+		result[2] = t;
+		return 3;
+	}
+	return 2;
+}
+
+/* Decomposes a single unicode character. */
+static u16 *decompose_unichar(wchar_t uc, int *size, u16 *hangul_buffer)
+{
+	u16 *result;
+
+	/* Hangul is handled separately */
+	result = hangul_buffer;
+	*size = hfsplus_try_decompose_hangul(uc, result);
+	if (*size == 0)
+		result = hfsplus_decompose_nonhangul(uc, size);
+	return result;
+}
+
 int hfsplus_asc2uni(struct super_block *sb,
 		    struct hfsplus_unistr *ustr, int max_unistr_len,
 		    const char *astr, int len)
@@ -303,13 +348,14 @@ int hfsplus_asc2uni(struct super_block *sb,
 	int size, dsize, decompose;
 	u16 *dstr, outlen = 0;
 	wchar_t c;
+	u16 dhangul[3];
 
 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
 	while (outlen < max_unistr_len && len > 0) {
 		size = asc2unichar(sb, astr, len, &c);
 
 		if (decompose)
-			dstr = decompose_unichar(c, &dsize);
+			dstr = decompose_unichar(c, &dsize, dhangul);
 		else
 			dstr = NULL;
 		if (dstr) {
@@ -344,6 +390,7 @@ int hfsplus_hash_dentry(const struct dentry *dentry, struct qstr *str)
 	unsigned long hash;
 	wchar_t c;
 	u16 c2;
+	u16 dhangul[3];
 
 	casefold = test_bit(HFSPLUS_SB_CASEFOLD, &HFSPLUS_SB(sb)->flags);
 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
@@ -357,7 +404,7 @@ int hfsplus_hash_dentry(const struct dentry *dentry, struct qstr *str)
 		len -= size;
 
 		if (decompose)
-			dstr = decompose_unichar(c, &dsize);
+			dstr = decompose_unichar(c, &dsize, dhangul);
 		else
 			dstr = NULL;
 		if (dstr) {
@@ -396,6 +443,7 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
 	const char *astr1, *astr2;
 	u16 c1, c2;
 	wchar_t c;
+	u16 dhangul_1[3], dhangul_2[3];
 
 	casefold = test_bit(HFSPLUS_SB_CASEFOLD, &HFSPLUS_SB(sb)->flags);
 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
@@ -413,7 +461,8 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
 			len1 -= size;
 
 			if (decompose)
-				dstr1 = decompose_unichar(c, &dsize1);
+				dstr1 = decompose_unichar(c, &dsize1,
+							  dhangul_1);
 			if (!decompose || !dstr1) {
 				c1 = c;
 				dstr1 = &c1;
@@ -427,7 +476,8 @@ int hfsplus_compare_dentry(const struct dentry *dentry,
 			len2 -= size;
 
 			if (decompose)
-				dstr2 = decompose_unichar(c, &dsize2);
+				dstr2 = decompose_unichar(c, &dsize2,
+							  dhangul_2);
 			if (!decompose || !dstr2) {
 				c2 = c;
 				dstr2 = &c2;
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-29 16:28     ` Viacheslav Dubeyko
@ 2017-11-30  4:03       ` Ernesto A. Fernández
  0 siblings, 0 replies; 32+ messages in thread
From: Ernesto A. Fernández @ 2017-11-30  4:03 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: linux-fsdevel, tchou, Alexander Viro, Andrew Morton,
	Ernesto A. Fernández, Hin-Tak Leung

On Wed, Nov 29, 2017 at 08:28:56AM -0800, Viacheslav Dubeyko wrote:
> On Wed, 2017-11-29 at 11:42 -0300, Ernesto A. Fernández wrote:
> > On Wed, Nov 29, 2017 at 08:57:46AM +0000, Hin-Tak Leung wrote:
> > > 
> > > 
> > > --------------------------------------------
> > > On Tue, 28/11/17, Ernesto A. Fernández <ernesto.mnd.fernandez@gmail
> > > .com> wrote:
> > > 
> > > > 
> > > > This does not link to the code
> > > > sample at all, in fact it links to a page
> > > > that links to a page that links to a page that
> > > > links to the pdf.
> > >  
> > > I think this process is called bike-shedding - i.e. spending too
> > > much time arguing about trivial matters. The amount of (wasted)
> > > effort put into arguing about not providing pointers to relevant
> > > information. Please just put the relevant information in the next
> > > revision of the patch and re-submit.
> > Of course you ignored me, but I must insist: I would be far more
> > inclined
> > to take you seriously if you could just get your email client to send
> > replies correctly. And while you are at it, keep the lines under 80
> > characters, will you?
> 
> You cannot insist. Everybody has the right to do the things in own way.

I beg to differ. From Documentation/process/email-clients.rst, lines
43-44:

    "Email clients should generate and maintain References: or
     In-Reply-To: headers so that mail threading is not broken."

> You are ignoring our opinions. Why do we need to change our opinions?

I have not ignored you at all, in fact I have taken the time to reply to
every single one of your opinions. I even asked you to elaborate on some
points, which you are yet to do.

The real problem here is that you expect me to take your advice blindly,
which I'm not really inclined to do given our past conversations.

> Why do we need to take into account your vision? 

What vision are you talking about? This is just a bugfix. The reason my
work should be considered is because I took the trouble of replying to the
reporter, tracking down the bug and fixing their problem.

I did all that while you kept stubbornly denying that there was a bug at
all, and you actually recommended using a mount option that would have
made their files unreadable on a mac.

> What's the point to
> consider you in a serious way if you treat us like the kids?

Getting the bug fixed? That's why we are here, right?

If you don't want me to bother you with my patches anymore just say so.
I only cc'd you in this last one because you were already part of the
conversation and I thought it would be rude to remove you, but I have no
problem if you want out.

> By the way, my remarks are still the same. I don't agree this current
> state of the patch. The patch needs to be improved.

This was to be expected. I've been dealing with you for a few months now
and I have never seen you consider that you could be wrong about anything.

> Thanks,
> Vyacheslav Dubeyko.
> 
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-29 14:42   ` Ernesto A. Fernández
@ 2017-11-29 16:28     ` Viacheslav Dubeyko
  2017-11-30  4:03       ` Ernesto A. Fernández
  0 siblings, 1 reply; 32+ messages in thread
From: Viacheslav Dubeyko @ 2017-11-29 16:28 UTC (permalink / raw)
  To: Ernesto A. Fernández, Hin-Tak Leung
  Cc: linux-fsdevel, tchou, Alexander Viro, Andrew Morton

On Wed, 2017-11-29 at 11:42 -0300, Ernesto A. Fernández wrote:
> On Wed, Nov 29, 2017 at 08:57:46AM +0000, Hin-Tak Leung wrote:
> > 
> > 
> > --------------------------------------------
> > On Tue, 28/11/17, Ernesto A. Fernández <ernesto.mnd.fernandez@gmail
> > .com> wrote:
> > 
> > > 
> > > This does not link to the code
> > > sample at all, in fact it links to a page
> > > that links to a page that links to a page that
> > > links to the pdf.
> >  
> > I think this process is called bike-shedding - i.e. spending too
> > much time arguing about trivial matters. The amount of (wasted)
> > effort put into arguing about not providing pointers to relevant
> > information. Please just put the relevant information in the next
> > revision of the patch and re-submit.
> Of course you ignored me, but I must insist: I would be far more
> inclined
> to take you seriously if you could just get your email client to send
> replies correctly. And while you are at it, keep the lines under 80
> characters, will you?

You cannot insist. Everybody has the right to do the things in own way.
You are ignoring our opinions. Why do we need to change our opinions?
Why do we need to take into account your vision? What's the point to
consider you in a serious way if you treat us like the kids?

By the way, my remarks are still the same. I don't agree this current
state of the patch. The patch needs to be improved.

Thanks,
Vyacheslav Dubeyko.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-29  8:57 ` Hin-Tak Leung
@ 2017-11-29 14:42   ` Ernesto A. Fernández
  2017-11-29 16:28     ` Viacheslav Dubeyko
  0 siblings, 1 reply; 32+ messages in thread
From: Ernesto A. Fernández @ 2017-11-29 14:42 UTC (permalink / raw)
  To: Hin-Tak Leung
  Cc: linux-fsdevel, tchou, Alexander Viro, Andrew Morton,
	Ernesto A. Fernández, Viacheslav Dubeyko

On Wed, Nov 29, 2017 at 08:57:46AM +0000, Hin-Tak Leung wrote:
> 
> --------------------------------------------
> On Tue, 28/11/17, Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> wrote:
> 
> > This does not link to the code
> > sample at all, in fact it links to a page
> > that links to a page that links to a page that
> > links to the pdf.
>  
> I think this process is called bike-shedding - i.e. spending too much time arguing about trivial matters. The amount of (wasted) effort put into arguing about not providing pointers to relevant information. Please just put the relevant information in the next revision of the patch and re-submit.

Of course you ignored me, but I must insist: I would be far more inclined
to take you seriously if you could just get your email client to send
replies correctly. And while you are at it, keep the lines under 80
characters, will you?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
       [not found] <748914583.7912009.1511948801051.ref@mail.yahoo.com>
@ 2017-11-29  9:46 ` Hin-Tak Leung
  0 siblings, 0 replies; 32+ messages in thread
From: Hin-Tak Leung @ 2017-11-29  9:46 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: linux-fsdevel, tchou, Alexander Viro, Andrew Morton,
	Ernesto A. Fernández, Viacheslav Dubeyko

Please include this (or other relevant information) in the commit message when re-submitting the patch:

http://www.unicode.org/versions/Unicode10.0.0/ch03.pdf

Page 143, Section 3.12 Conjourning Jamo Behavior,
page 149-151, "Sample Code for Hangul Algorithms" .

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
       [not found] <1691800505.7838608.1511945866988.ref@mail.yahoo.com>
@ 2017-11-29  8:57 ` Hin-Tak Leung
  2017-11-29 14:42   ` Ernesto A. Fernández
  0 siblings, 1 reply; 32+ messages in thread
From: Hin-Tak Leung @ 2017-11-29  8:57 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: linux-fsdevel, tchou, Alexander Viro, Andrew Morton,
	Ernesto A. Fernández, Viacheslav Dubeyko


--------------------------------------------
On Tue, 28/11/17, Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> wrote:

> This does not link to the code
> sample at all, in fact it links to a page
> that links to a page that links to a page that
> links to the pdf.
 
I think this process is called bike-shedding - i.e. spending too much time arguing about trivial matters. The amount of (wasted) effort put into arguing about not providing pointers to relevant information. Please just put the relevant information in the next revision of the patch and re-submit.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-28 21:48 ` Hin-Tak Leung
@ 2017-11-28 23:10   ` Ernesto A. Fernández
  0 siblings, 0 replies; 32+ messages in thread
From: Ernesto A. Fernández @ 2017-11-28 23:10 UTC (permalink / raw)
  To: Hin-Tak Leung
  Cc: linux-fsdevel, tchou, Alexander Viro, Andrew Morton,
	Ernesto A. Fernández, Viacheslav Dubeyko

On Tue, Nov 28, 2017 at 09:48:40PM +0000, Hin-Tak Leung wrote:
> --------------------------------------------
> On Tue, 28/11/17, Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> wrote:
> 
> > Is that necessary? I think people can find the
> > Unicode Standard on their
> > own. Even if I did
> > provide a link, it's a pdf, so people would still
> > need
> > to look up the right sections. It's
> > not going to save anybody much work.
>  
> Yes, it is necessary. No, it is not a pdf. The code sample is inlined in the older version of this technical report:
> 
> http://www.unicode.org/reports/tr15/tr15-18.html#Hangul

This is extremely old, clearly not the version of Unicode I'm referencing.

> The code samples, etc had been splitted off and referenced in the current version:
> 
> http://www.unicode.org/reports/tr15/#Hangul_Composition

This does not link to the code sample at all, in fact it links to a page
that links to a page that links to a page that links to the pdf.

> Please:
> 1. include the above urls when you re-submit the next revision of the patch.
> 
> 2. include a brief description of minimal example - e.g. touch "\xXX\xYY\xZZ..." to create a minimal empty file for testing,
> in the commit message.

The way to test this patch is by creating a Hangul filename under MacOS
and trying to open it under linux, as the reporter did. You will not notice
anything wrong if you only work under linux.

> "People could look it up" is not an excuse not to give precise pointers to relevant information.
>

I would say "Section 3.12 of the Unicode Standard, version 10.0" is a very
precise pointer.

By the way, your mail client is making a huge mess of this thread; I have
no idea what message you are replying to anymore.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
       [not found] <1047295440.7535049.1511905720846.ref@mail.yahoo.com>
@ 2017-11-28 21:48 ` Hin-Tak Leung
  2017-11-28 23:10   ` Ernesto A. Fernández
  0 siblings, 1 reply; 32+ messages in thread
From: Hin-Tak Leung @ 2017-11-28 21:48 UTC (permalink / raw)
  To: Viacheslav Dubeyko, Ernesto A. Fernández
  Cc: linux-fsdevel, tchou, Alexander Viro, Andrew Morton,
	Ernesto A. Fernández

--------------------------------------------
On Tue, 28/11/17, Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> wrote:

> Is that necessary? I think people can find the
> Unicode Standard on their
> own. Even if I did
> provide a link, it's a pdf, so people would still
> need
> to look up the right sections. It's
> not going to save anybody much work.
 
Yes, it is necessary. No, it is not a pdf. The code sample is inlined in the older version of this technical report:

http://www.unicode.org/reports/tr15/tr15-18.html#Hangul

The code samples, etc had been splitted off and referenced in the current version:

http://www.unicode.org/reports/tr15/#Hangul_Composition

Please:
1. include the above urls when you re-submit the next revision of the patch.

2. include a brief description of minimal example - e.g. touch "\xXX\xYY\xZZ..." to create a minimal empty file for testing,
in the commit message.

"People could look it up" is not an excuse not to give precise pointers to relevant information.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-28 16:15 ` Hin-Tak Leung
  2017-11-28 17:50   ` Viacheslav Dubeyko
@ 2017-11-28 18:51   ` Ernesto A. Fernández
  1 sibling, 0 replies; 32+ messages in thread
From: Ernesto A. Fernández @ 2017-11-28 18:51 UTC (permalink / raw)
  To: Hin-Tak Leung
  Cc: linux-fsdevel, tchou, Alexander Viro, Andrew Morton,
	Viacheslav Dubeyko, Ernesto A. Fernández

On Tue, Nov 28, 2017 at 04:15:18PM +0000, Hin-Tak Leung wrote:
> 
> --------------------------------------------
> On Tue, 28/11/17, Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> wrote:
> 
> > The algorithm is very simple, the best way to
> > understand it is just
> > looking at the code. I
> > don't know the first thing about Korean writing, so
> > I don't think I should attempt to explain
> > why the decomposition is done
> > this way. If
> > somebody else is interested in the details, they can
> > follow
> > the citation in the header comment of
> > the decompose_hangul function.
>  
> Apologies for coming into this a bit late.
> 
> A couple of points:
> 
> 1. Hangul canonical composition and decomposition is a separate topic from compositions of latin characters with accents. It is described in 
> 
>  http://www.unicode.org/reports/tr15/tr15-18.html#Hangul
> 
> among other sources.

I am aware of that, that's the reason this patch exists. If you check the
commit message, you will see it reads:

    "This happens because the normalization of Hangul characters is
    a special case"
 
> 2. I think the mount option is a bit of a red-herring. I think we should just do what Mac OS X does - I think in the tech note it says something about storing things always in the decomposed form or composed form. Ideally we should make the differing mount options no-ops. Mac OS X does not need extra mount options, we shouldn't either.

MacOS stores the filenames in the NFD form, that is, decomposed. The problem
here was that linux was forgetting to decompose the Hangul.

The mount option has nothing to do with this patch, other than the fact
that it could later be used to access Hangul filenames potentially
stored (mistakenly) without decomposition. It is a very unlikely situation.

As to why the mount option exists, I couldn't tell you. It was here before
the git tree. It is disabled by default anyway, so it's not bothering
anyone.

Thanks,
Ernest

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
  2017-11-28 16:15 ` Hin-Tak Leung
@ 2017-11-28 17:50   ` Viacheslav Dubeyko
  2017-11-28 18:51   ` Ernesto A. Fernández
  1 sibling, 0 replies; 32+ messages in thread
From: Viacheslav Dubeyko @ 2017-11-28 17:50 UTC (permalink / raw)
  To: Hin-Tak Leung, Ernesto A. Fernández
  Cc: linux-fsdevel, tchou, Alexander Viro, Andrew Morton

On Tue, 2017-11-28 at 16:15 +0000, Hin-Tak Leung wrote:
> --------------------------------------------
> On Tue, 28/11/17, Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.c
> om> wrote:
> 
> > 
> > The algorithm is very simple, the best way to
> > understand it is just
> > looking at the code. I
> > don't know the first thing about Korean writing, so
> > I don't think I should attempt to explain
> > why the decomposition is done
> > this way. If
> > somebody else is interested in the details, they can
> > follow
> > the citation in the header comment of
> > the decompose_hangul function.
>  
> Apologies for coming into this a bit late.
> 
> A couple of points:
> 
> 1. Hangul canonical composition and decomposition is a separate topic
> from compositions of latin characters with accents. It is described
> in 
> 
>  http://www.unicode.org/reports/tr15/tr15-18.html#Hangul
> 
> among other sources.
> 
> 2. I think the mount option is a bit of a red-herring. I think we
> should just do what Mac OS X does - I think in the tech note it says
> something about storing things always in the decomposed form or
> composed form. Ideally we should make the differing mount options no-
> ops. Mac OS X does not need extra mount options, we shouldn't either.

Do you mean the using of textEncoding field in the Volume Header or/and
textEncoding in the Ctalog Folder's records? But I am not completelely
sure what flag can be Hangul related. Do you have any hints?

Thanks,
Vyacheslav Dubeyko.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
       [not found] <114108759.7167494.1511886580033.ref@mail.yahoo.com>
@ 2017-11-28 16:29 ` Hin-Tak Leung
  0 siblings, 0 replies; 32+ messages in thread
From: Hin-Tak Leung @ 2017-11-28 16:29 UTC (permalink / raw)
  To: Viacheslav Dubeyko, Ernesto A. Fernández
  Cc: linux-fsdevel, tchou, Alexander Viro, Andrew Morton,
	Ernesto A. Fernández


--------------------------------------------
On Tue, 28/11/17, Hin-Tak Leung <htl10@users.sourceforge.net> wrote:

> 1. Hangul canonical composition and
> decomposition is a separate topic from compositions of latin
> characters with accents. It is described in 
 
>  http://www.unicode.org/reports/tr15/tr15-18.html#Hangul


More current version of the URL: 

http://www.unicode.org/reports/tr15/#Hangul_Composition

The code samples, etc had been splitted off to urls, etc.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] hfsplus: fix decomposition of Hangul characters
       [not found] <1999104027.7153972.1511885718712.ref@mail.yahoo.com>
@ 2017-11-28 16:15 ` Hin-Tak Leung
  2017-11-28 17:50   ` Viacheslav Dubeyko
  2017-11-28 18:51   ` Ernesto A. Fernández
  0 siblings, 2 replies; 32+ messages in thread
From: Hin-Tak Leung @ 2017-11-28 16:15 UTC (permalink / raw)
  To: Viacheslav Dubeyko, Ernesto A. Fernández
  Cc: linux-fsdevel, tchou, Alexander Viro, Andrew Morton,
	Ernesto A. Fernández


--------------------------------------------
On Tue, 28/11/17, Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> wrote:

> The algorithm is very simple, the best way to
> understand it is just
> looking at the code. I
> don't know the first thing about Korean writing, so
> I don't think I should attempt to explain
> why the decomposition is done
> this way. If
> somebody else is interested in the details, they can
> follow
> the citation in the header comment of
> the decompose_hangul function.
 
Apologies for coming into this a bit late.

A couple of points:

1. Hangul canonical composition and decomposition is a separate topic from compositions of latin characters with accents. It is described in 

 http://www.unicode.org/reports/tr15/tr15-18.html#Hangul

among other sources.

2. I think the mount option is a bit of a red-herring. I think we should just do what Mac OS X does - I think in the tech note it says something about storing things always in the decomposed form or composed form. Ideally we should make the differing mount options no-ops. Mac OS X does not need extra mount options, we shouldn't either.

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2018-08-24  4:53 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-17  8:20 [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name Ting-Chang Hou
2017-11-19  0:57 ` Ernesto A. Fernández
2017-11-23  3:57   ` tchou
2017-11-23  4:21     ` Viacheslav Dubeyko
2017-11-23  6:05       ` tchou
2017-11-23  6:23         ` Viacheslav Dubeyko
2017-11-23  6:34           ` tchou
2017-11-23 11:32     ` Ernesto A. Fernández
2017-11-23 18:36       ` Viacheslav Dubeyko
2017-11-23 22:20         ` Ernesto A. Fernández
2017-11-24  7:25       ` tchou
2017-11-24 11:45         ` Ernesto A. Fernández
2017-11-27  2:07           ` tchou
2017-11-27 19:36             ` [PATCH] hfsplus: fix decomposition of Hangul characters Ernesto A. Fernández
2017-11-27 22:40               ` Viacheslav Dubeyko
2017-11-28 15:02                 ` Ernesto A. Fernández
2017-11-28 16:30                   ` Viacheslav Dubeyko
2017-11-28 18:15                     ` Ernesto A. Fernández
2018-08-23 18:29               ` Ernesto A. Fernández
2018-08-24  1:20                 ` tchou
     [not found] <1999104027.7153972.1511885718712.ref@mail.yahoo.com>
2017-11-28 16:15 ` Hin-Tak Leung
2017-11-28 17:50   ` Viacheslav Dubeyko
2017-11-28 18:51   ` Ernesto A. Fernández
     [not found] <114108759.7167494.1511886580033.ref@mail.yahoo.com>
2017-11-28 16:29 ` Hin-Tak Leung
     [not found] <1047295440.7535049.1511905720846.ref@mail.yahoo.com>
2017-11-28 21:48 ` Hin-Tak Leung
2017-11-28 23:10   ` Ernesto A. Fernández
     [not found] <1691800505.7838608.1511945866988.ref@mail.yahoo.com>
2017-11-29  8:57 ` Hin-Tak Leung
2017-11-29 14:42   ` Ernesto A. Fernández
2017-11-29 16:28     ` Viacheslav Dubeyko
2017-11-30  4:03       ` Ernesto A. Fernández
     [not found] <748914583.7912009.1511948801051.ref@mail.yahoo.com>
2017-11-29  9:46 ` Hin-Tak Leung
2018-07-17 22:09 Ernesto A. Fernández

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).