linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* RE: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
@ 2017-11-17 19:33 Slava Dubeyko
  0 siblings, 0 replies; 14+ messages in thread
From: Slava Dubeyko @ 2017-11-17 19:33 UTC (permalink / raw)
  To: tchou, Hin-Tak Leung; +Cc: linux-fsdevel

Hi Ting-Chang,

Sorry, I need to reapet my e-mail.

Could you please share the examples of incorrect and corect behavior? Are you sure that we need in such fix because it is tne mount option.

Hi Hin-Tak,

Could you please test the fix?

Thanks,
Vyacheslav Dubeyko.

>> From: Ting-Chang Hou <tchou@synology.com>
>> Date: Fri, Nov 17, 2017 at 12:20 AM
>> Subject: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
>> To: linux-fsdevel@vger.kernel.org
>> Cc: Ting-Chang Hou <tchou@synology.com>
>>
>> The unicode of hangul from macOS is decomposed. There has a bug that
>> mistake decomposed unicode for composed when change unicode to ascii,
>> so it cannot recognize the hangul correctly.
>>
>> Signed-off-by: Ting-Chang Hou <tchou@synology.com>
>> ---
 >> fs/hfsplus/unicode.c | 2 +-
 >> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
>> index dfa90c2..2daf7b0 100644
>> --- a/fs/hfsplus/unicode.c
>> +++ b/fs/hfsplus/unicode.c
>> @@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
>>        ustrlen = be16_to_cpu(ustr->length);
>>        len = *len_p;
>>        ce1 = NULL;
>> -       compose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
>> +       compose = test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
>>
>>        while (ustrlen > 0) {
>>                c0 = be16_to_cpu(*ip++);
>> --
>> 2.7.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-24 11:45         ` Ernesto A. Fernández
@ 2017-11-27  2:07           ` tchou
  0 siblings, 0 replies; 14+ messages in thread
From: tchou @ 2017-11-27  2:07 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: linux-fsdevel, linux-fsdevel-owner, slava, htl10

Ernesto A. Fernández 於 2017-11-24 19:45 寫到:
> On Fri, Nov 24, 2017 at 03:25:40PM +0800, tchou wrote:
>> Ernesto A. Fernández 於 2017-11-23 19:32 寫到:
>> >Hi:
>> >
>> >your issue seems to be in the decomposition of hangul characters, not in
>> >the recomposition before printing. The hfsplus module on linux is saving
>> >the name of your actor as AC F5 C7 20, without performing any
>> >decomposition at all.
>> >
>> >The reason your patch hides the bug is because it causes linux to present
>> >filenames as decomposed utf8, so it is not necessary to decompose again
>> >before working with them. But the issue is still there, and you will most
>> >likely run into trouble if you make a hangul filename in linux and try
>> >to work with it in MacOS.
>> >
>> >Reviewing the code it would seem that the developers completely forgot
>> >the hangul characters had their own rules for decomposition. It's weird
>> >because they did the composition part correctly.
>> >
>> >I've made a quick draft of a patch, mostly by copying the code provided
>> >in the unicode web. I don't think we can actually use it on a release,
>> >but it should be enough to check if I'm right. It works fine on linux,
>> >but I don't have a mac, so it would be great if you could test it for me.
>> >
>> >Thanks,
>> >Ernest
>> >
>> >(By the way, there is no reason you should have to use the nodecompose
>> >mount option, as the other reviewer suggested. Using that option will
>> >have a similar effect to that of your patch. It will hide the problem,
>> >but if you create a hangul filename on linux with that option you
>> >probably won't be able to use it on a mac.)
>> >
>> >---
>> >diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
>> >index dfa90c2..9006c61 100644
>> >--- a/fs/hfsplus/unicode.c
>> >+++ b/fs/hfsplus/unicode.c
>> >@@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
>> >*sb, const char *astr, int len,
>> > 	return size;
>> > }
>> >
>> >-/* Decomposes a single unicode character. */
>> >+/* Decomposes a single non-Hangul unicode character. */
>> > static inline u16 *decompose_unichar(wchar_t uc, int *size)
>> > {
>> > 	int off;
>> >@@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int
>> >*size)
>> > 	return hfsplus_decompose_table + (off / 4);
>> > }
>> >
>> >+/* Decomposes a Hangul unicode character. */
>> >+int decompose_hangul(wchar_t uc, u16 *result)
>> >+{
>> >+	int index;
>> >+	int l, v, t;
>> >+
>> >+	index = uc - Hangul_SBase;
>> >+	if (index < 0 || index >= Hangul_SCount)
>> >+		return 0;
>> >+
>> >+	l = Hangul_LBase + index / Hangul_NCount;
>> >+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
>> >+	t = Hangul_TBase + index % Hangul_TCount;
>> >+
>> >+	result[0] = l;
>> >+	result[1] = v;
>> >+	if (t != Hangul_TBase) {
>> >+		result[2] = t;
>> >+		return 3;
>> >+	}
>> >+	return 2;
>> >+}
>> >+
>> > int hfsplus_asc2uni(struct super_block *sb,
>> > 		    struct hfsplus_unistr *ustr, int max_unistr_len,
>> > 		    const char *astr, int len)
>> >@@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
>> > 	int size, dsize, decompose;
>> > 	u16 *dstr, outlen = 0;
>> > 	wchar_t c;
>> >+	u16 hangul_buf[3];
>> >
>> > 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
>> > 	while (outlen < max_unistr_len && len > 0) {
>> > 		size = asc2unichar(sb, astr, len, &c);
>> >
>> >-		if (decompose)
>> >-			dstr = decompose_unichar(c, &dsize);
>> >-		else
>> >+		if (decompose) {
>> >+			/* Hangul is handled separately */
>> >+			dstr = &hangul_buf[0];
>> >+			dsize = decompose_hangul(c, dstr);
>> >+			if (dsize == 0)
>> >+				/* not Hangul */
>> >+				dstr = decompose_unichar(c, &dsize);
>> >+		} else {
>> > 			dstr = NULL;
>> >+		}
>> >+
>> > 		if (dstr) {
>> > 			if (outlen + dsize > max_unistr_len)
>> > 				break;
>> 
>> Hi,
>> 
>> Thank you for your explanation and your draft.
>> I test four versions of hfsplus module:
>> --------------------------------
>> 1. origin linux
>> 2. apply my patch
>> 3. mount with nodecompose option
>> 4. apply your patch
>> --------------------------------
> 
> It seems you were very thorough, thank you.
> 
>> There are the results of the test:
>> ---------------------------------------------------------------------
>> Linux can ls and cp the file from mac correctly with module 2,3,4.
>> Mac cannot cp the files correctly which touched by module 1,2,3.
>> Mac can cp the file correctly which touched by module with your patch.
>> Mac can ls the files which touch by all modules.
>> ---------------------------------------------------------------------
>> 
>> It seems that everything goes well with your patch. The result is just
>> like you say, my patch or mount with nodecompose option cannot perform
>> normally.
> 
> That's good to hear. I will submit a patch as soon as I can figure out
> the licensing issues. If you want I can credit you as reporter and 
> tester.
> 
OK, thanks a lot.
>> 
>> By the way, are the decompose procedures of function 
>> hfsplus_hash_dentry()
>> and hfsplus_compare_dentry() need to modify as well?
> 
> You are correct, good thing you noticed. I'll add that when I resend 
> the
> patch.
> 
>> Thanks,
>> TCHou

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-24  7:25       ` tchou
@ 2017-11-24 11:45         ` Ernesto A. Fernández
  2017-11-27  2:07           ` tchou
  0 siblings, 1 reply; 14+ messages in thread
From: Ernesto A. Fernández @ 2017-11-24 11:45 UTC (permalink / raw)
  To: tchou
  Cc: linux-fsdevel, linux-fsdevel-owner, slava, htl10,
	Ernesto A. Fernández

On Fri, Nov 24, 2017 at 03:25:40PM +0800, tchou wrote:
> Ernesto A. Fernández 於 2017-11-23 19:32 寫到:
> >Hi:
> >
> >your issue seems to be in the decomposition of hangul characters, not in
> >the recomposition before printing. The hfsplus module on linux is saving
> >the name of your actor as AC F5 C7 20, without performing any
> >decomposition at all.
> >
> >The reason your patch hides the bug is because it causes linux to present
> >filenames as decomposed utf8, so it is not necessary to decompose again
> >before working with them. But the issue is still there, and you will most
> >likely run into trouble if you make a hangul filename in linux and try
> >to work with it in MacOS.
> >
> >Reviewing the code it would seem that the developers completely forgot
> >the hangul characters had their own rules for decomposition. It's weird
> >because they did the composition part correctly.
> >
> >I've made a quick draft of a patch, mostly by copying the code provided
> >in the unicode web. I don't think we can actually use it on a release,
> >but it should be enough to check if I'm right. It works fine on linux,
> >but I don't have a mac, so it would be great if you could test it for me.
> >
> >Thanks,
> >Ernest
> >
> >(By the way, there is no reason you should have to use the nodecompose
> >mount option, as the other reviewer suggested. Using that option will
> >have a similar effect to that of your patch. It will hide the problem,
> >but if you create a hangul filename on linux with that option you
> >probably won't be able to use it on a mac.)
> >
> >---
> >diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> >index dfa90c2..9006c61 100644
> >--- a/fs/hfsplus/unicode.c
> >+++ b/fs/hfsplus/unicode.c
> >@@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
> >*sb, const char *astr, int len,
> > 	return size;
> > }
> >
> >-/* Decomposes a single unicode character. */
> >+/* Decomposes a single non-Hangul unicode character. */
> > static inline u16 *decompose_unichar(wchar_t uc, int *size)
> > {
> > 	int off;
> >@@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int
> >*size)
> > 	return hfsplus_decompose_table + (off / 4);
> > }
> >
> >+/* Decomposes a Hangul unicode character. */
> >+int decompose_hangul(wchar_t uc, u16 *result)
> >+{
> >+	int index;
> >+	int l, v, t;
> >+
> >+	index = uc - Hangul_SBase;
> >+	if (index < 0 || index >= Hangul_SCount)
> >+		return 0;
> >+
> >+	l = Hangul_LBase + index / Hangul_NCount;
> >+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> >+	t = Hangul_TBase + index % Hangul_TCount;
> >+
> >+	result[0] = l;
> >+	result[1] = v;
> >+	if (t != Hangul_TBase) {
> >+		result[2] = t;
> >+		return 3;
> >+	}
> >+	return 2;
> >+}
> >+
> > int hfsplus_asc2uni(struct super_block *sb,
> > 		    struct hfsplus_unistr *ustr, int max_unistr_len,
> > 		    const char *astr, int len)
> >@@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
> > 	int size, dsize, decompose;
> > 	u16 *dstr, outlen = 0;
> > 	wchar_t c;
> >+	u16 hangul_buf[3];
> >
> > 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> > 	while (outlen < max_unistr_len && len > 0) {
> > 		size = asc2unichar(sb, astr, len, &c);
> >
> >-		if (decompose)
> >-			dstr = decompose_unichar(c, &dsize);
> >-		else
> >+		if (decompose) {
> >+			/* Hangul is handled separately */
> >+			dstr = &hangul_buf[0];
> >+			dsize = decompose_hangul(c, dstr);
> >+			if (dsize == 0)
> >+				/* not Hangul */
> >+				dstr = decompose_unichar(c, &dsize);
> >+		} else {
> > 			dstr = NULL;
> >+		}
> >+
> > 		if (dstr) {
> > 			if (outlen + dsize > max_unistr_len)
> > 				break;
> 
> Hi,
> 
> Thank you for your explanation and your draft.
> I test four versions of hfsplus module:
> --------------------------------
> 1. origin linux
> 2. apply my patch
> 3. mount with nodecompose option
> 4. apply your patch
> --------------------------------

It seems you were very thorough, thank you.
 
> There are the results of the test:
> ---------------------------------------------------------------------
> Linux can ls and cp the file from mac correctly with module 2,3,4.
> Mac cannot cp the files correctly which touched by module 1,2,3.
> Mac can cp the file correctly which touched by module with your patch.
> Mac can ls the files which touch by all modules.
> ---------------------------------------------------------------------
> 
> It seems that everything goes well with your patch. The result is just
> like you say, my patch or mount with nodecompose option cannot perform
> normally.

That's good to hear. I will submit a patch as soon as I can figure out
the licensing issues. If you want I can credit you as reporter and tester.

> 
> By the way, are the decompose procedures of function hfsplus_hash_dentry()
> and hfsplus_compare_dentry() need to modify as well?

You are correct, good thing you noticed. I'll add that when I resend the
patch.

> Thanks,
> TCHou

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23 11:32     ` Ernesto A. Fernández
  2017-11-23 18:36       ` Viacheslav Dubeyko
@ 2017-11-24  7:25       ` tchou
  2017-11-24 11:45         ` Ernesto A. Fernández
  1 sibling, 1 reply; 14+ messages in thread
From: tchou @ 2017-11-24  7:25 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: linux-fsdevel, linux-fsdevel-owner, slava, htl10

Ernesto A. Fernández 於 2017-11-23 19:32 寫到:
> Hi:
> 
> your issue seems to be in the decomposition of hangul characters, not 
> in
> the recomposition before printing. The hfsplus module on linux is 
> saving
> the name of your actor as AC F5 C7 20, without performing any
> decomposition at all.
> 
> The reason your patch hides the bug is because it causes linux to 
> present
> filenames as decomposed utf8, so it is not necessary to decompose again
> before working with them. But the issue is still there, and you will 
> most
> likely run into trouble if you make a hangul filename in linux and try
> to work with it in MacOS.
> 
> Reviewing the code it would seem that the developers completely forgot
> the hangul characters had their own rules for decomposition. It's weird
> because they did the composition part correctly.
> 
> I've made a quick draft of a patch, mostly by copying the code provided
> in the unicode web. I don't think we can actually use it on a release,
> but it should be enough to check if I'm right. It works fine on linux,
> but I don't have a mac, so it would be great if you could test it for 
> me.
> 
> Thanks,
> Ernest
> 
> (By the way, there is no reason you should have to use the nodecompose
> mount option, as the other reviewer suggested. Using that option will
> have a similar effect to that of your patch. It will hide the problem,
> but if you create a hangul filename on linux with that option you
> probably won't be able to use it on a mac.)
> 
> ---
> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> index dfa90c2..9006c61 100644
> --- a/fs/hfsplus/unicode.c
> +++ b/fs/hfsplus/unicode.c
> @@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
> *sb, const char *astr, int len,
>  	return size;
>  }
> 
> -/* Decomposes a single unicode character. */
> +/* Decomposes a single non-Hangul unicode character. */
>  static inline u16 *decompose_unichar(wchar_t uc, int *size)
>  {
>  	int off;
> @@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, 
> int *size)
>  	return hfsplus_decompose_table + (off / 4);
>  }
> 
> +/* Decomposes a Hangul unicode character. */
> +int decompose_hangul(wchar_t uc, u16 *result)
> +{
> +	int index;
> +	int l, v, t;
> +
> +	index = uc - Hangul_SBase;
> +	if (index < 0 || index >= Hangul_SCount)
> +		return 0;
> +
> +	l = Hangul_LBase + index / Hangul_NCount;
> +	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> +	t = Hangul_TBase + index % Hangul_TCount;
> +
> +	result[0] = l;
> +	result[1] = v;
> +	if (t != Hangul_TBase) {
> +		result[2] = t;
> +		return 3;
> +	}
> +	return 2;
> +}
> +
>  int hfsplus_asc2uni(struct super_block *sb,
>  		    struct hfsplus_unistr *ustr, int max_unistr_len,
>  		    const char *astr, int len)
> @@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
>  	int size, dsize, decompose;
>  	u16 *dstr, outlen = 0;
>  	wchar_t c;
> +	u16 hangul_buf[3];
> 
>  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, 
> &HFSPLUS_SB(sb)->flags);
>  	while (outlen < max_unistr_len && len > 0) {
>  		size = asc2unichar(sb, astr, len, &c);
> 
> -		if (decompose)
> -			dstr = decompose_unichar(c, &dsize);
> -		else
> +		if (decompose) {
> +			/* Hangul is handled separately */
> +			dstr = &hangul_buf[0];
> +			dsize = decompose_hangul(c, dstr);
> +			if (dsize == 0)
> +				/* not Hangul */
> +				dstr = decompose_unichar(c, &dsize);
> +		} else {
>  			dstr = NULL;
> +		}
> +
>  		if (dstr) {
>  			if (outlen + dsize > max_unistr_len)
>  				break;

Hi,

Thank you for your explanation and your draft.
I test four versions of hfsplus module:
--------------------------------
1. origin linux
2. apply my patch
3. mount with nodecompose option
4. apply your patch
--------------------------------

There are the results of the test:
---------------------------------------------------------------------
Linux can ls and cp the file from mac correctly with module 2,3,4.
Mac cannot cp the files correctly which touched by module 1,2,3.
Mac can cp the file correctly which touched by module with your patch.
Mac can ls the files which touch by all modules.
---------------------------------------------------------------------

It seems that everything goes well with your patch. The result is just
like you say, my patch or mount with nodecompose option cannot perform
normally.


By the way, are the decompose procedures of function 
hfsplus_hash_dentry()
and hfsplus_compare_dentry() need to modify as well?

Thanks,
TCHou

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23 18:36       ` Viacheslav Dubeyko
@ 2017-11-23 22:20         ` Ernesto A. Fernández
  0 siblings, 0 replies; 14+ messages in thread
From: Ernesto A. Fernández @ 2017-11-23 22:20 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: Ernesto A. Fernández, tchou, linux-fsdevel,
	linux-fsdevel-owner, htl10

On Thu, Nov 23, 2017 at 10:36:37AM -0800, Viacheslav Dubeyko wrote:
> On Thu, 2017-11-23 at 08:32 -0300, Ernesto A. Fernández wrote:
> > Hi:
> > 
> > your issue seems to be in the decomposition of hangul characters, not
> > in
> > the recomposition before printing. The hfsplus module on linux is
> > saving
> > the name of your actor as AC F5 C7 20, without performing any
> > decomposition at all.
> > 
> > The reason your patch hides the bug is because it causes linux to
> > present
> > filenames as decomposed utf8, so it is not necessary to decompose
> > again
> > before working with them. But the issue is still there, and you will
> > most
> > likely run into trouble if you make a hangul filename in linux and
> > try
> > to work with it in MacOS.
> > 
> > Reviewing the code it would seem that the developers completely
> > forgot
> > the hangul characters had their own rules for decomposition. It's
> > weird
> > because they did the composition part correctly.
> > 
> > I've made a quick draft of a patch, mostly by copying the code
> > provided
> > in the unicode web. I don't think we can actually use it on a 
> 
> 
> Could you please share the link for "the unicode web"?
> 
> Thanks,
> Vyacheslav Dubeyko.

I'm not asking for any reviews yet, just testing because I don't have
a Mac. As long as that's clear, this is the latest version of Unicode:

www.unicode.org/versions/Unicode10.0.0/

You want section 3.12.

> 
> 
> > release,
> > but it should be enough to check if I'm right. It works fine on
> > linux,
> > but I don't have a mac, so it would be great if you could test it for
> > me.
> > 
> > Thanks,
> > Ernest
> > 
> > (By the way, there is no reason you should have to use the
> > nodecompose
> > mount option, as the other reviewer suggested. Using that option will
> > have a similar effect to that of your patch. It will hide the
> > problem,
> > but if you create a hangul filename on linux with that option you
> > probably won't be able to use it on a mac.)
> > 
> > ---
> > diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> > index dfa90c2..9006c61 100644
> > --- a/fs/hfsplus/unicode.c
> > +++ b/fs/hfsplus/unicode.c
> > @@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
> > *sb, const char *astr, int len,
> >  	return size;
> >  }
> >  
> > -/* Decomposes a single unicode character. */
> > +/* Decomposes a single non-Hangul unicode character. */
> >  static inline u16 *decompose_unichar(wchar_t uc, int *size)
> >  {
> >  	int off;
> > @@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc,
> > int *size)
> >  	return hfsplus_decompose_table + (off / 4);
> >  }
> >  
> > +/* Decomposes a Hangul unicode character. */
> > +int decompose_hangul(wchar_t uc, u16 *result)
> > +{
> > +	int index;
> > +	int l, v, t;
> > +
> > +	index = uc - Hangul_SBase;
> > +	if (index < 0 || index >= Hangul_SCount)
> > +		return 0;
> > +
> > +	l = Hangul_LBase + index / Hangul_NCount;
> > +	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> > +	t = Hangul_TBase + index % Hangul_TCount;
> > +
> > +	result[0] = l;
> > +	result[1] = v;
> > +	if (t != Hangul_TBase) {
> > +		result[2] = t;
> > +		return 3;
> > +	}
> > +	return 2;
> > +}
> > +
> >  int hfsplus_asc2uni(struct super_block *sb,
> >  		    struct hfsplus_unistr *ustr, int max_unistr_len,
> >  		    const char *astr, int len)
> > @@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
> >  	int size, dsize, decompose;
> >  	u16 *dstr, outlen = 0;
> >  	wchar_t c;
> > +	u16 hangul_buf[3];
> >  
> >  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE,
> > &HFSPLUS_SB(sb)->flags);
> >  	while (outlen < max_unistr_len && len > 0) {
> >  		size = asc2unichar(sb, astr, len, &c);
> >  
> > -		if (decompose)
> > -			dstr = decompose_unichar(c, &dsize);
> > -		else
> > +		if (decompose) {
> > +			/* Hangul is handled separately */
> > +			dstr = &hangul_buf[0];
> > +			dsize = decompose_hangul(c, dstr);
> > +			if (dsize == 0)
> > +				/* not Hangul */
> > +				dstr = decompose_unichar(c, &dsize);
> > +		} else {
> >  			dstr = NULL;
> > +		}
> > +
> >  		if (dstr) {
> >  			if (outlen + dsize > max_unistr_len)
> >  				break;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23 11:32     ` Ernesto A. Fernández
@ 2017-11-23 18:36       ` Viacheslav Dubeyko
  2017-11-23 22:20         ` Ernesto A. Fernández
  2017-11-24  7:25       ` tchou
  1 sibling, 1 reply; 14+ messages in thread
From: Viacheslav Dubeyko @ 2017-11-23 18:36 UTC (permalink / raw)
  To: Ernesto A. Fernández, tchou
  Cc: linux-fsdevel, linux-fsdevel-owner, htl10

On Thu, 2017-11-23 at 08:32 -0300, Ernesto A. Fernández wrote:
> Hi:
> 
> your issue seems to be in the decomposition of hangul characters, not
> in
> the recomposition before printing. The hfsplus module on linux is
> saving
> the name of your actor as AC F5 C7 20, without performing any
> decomposition at all.
> 
> The reason your patch hides the bug is because it causes linux to
> present
> filenames as decomposed utf8, so it is not necessary to decompose
> again
> before working with them. But the issue is still there, and you will
> most
> likely run into trouble if you make a hangul filename in linux and
> try
> to work with it in MacOS.
> 
> Reviewing the code it would seem that the developers completely
> forgot
> the hangul characters had their own rules for decomposition. It's
> weird
> because they did the composition part correctly.
> 
> I've made a quick draft of a patch, mostly by copying the code
> provided
> in the unicode web. I don't think we can actually use it on a 


Could you please share the link for "the unicode web"?

Thanks,
Vyacheslav Dubeyko.


> release,
> but it should be enough to check if I'm right. It works fine on
> linux,
> but I don't have a mac, so it would be great if you could test it for
> me.
> 
> Thanks,
> Ernest
> 
> (By the way, there is no reason you should have to use the
> nodecompose
> mount option, as the other reviewer suggested. Using that option will
> have a similar effect to that of your patch. It will hide the
> problem,
> but if you create a hangul filename on linux with that option you
> probably won't be able to use it on a mac.)
> 
> ---
> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> index dfa90c2..9006c61 100644
> --- a/fs/hfsplus/unicode.c
> +++ b/fs/hfsplus/unicode.c
> @@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
> *sb, const char *astr, int len,
>  	return size;
>  }
>  
> -/* Decomposes a single unicode character. */
> +/* Decomposes a single non-Hangul unicode character. */
>  static inline u16 *decompose_unichar(wchar_t uc, int *size)
>  {
>  	int off;
> @@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc,
> int *size)
>  	return hfsplus_decompose_table + (off / 4);
>  }
>  
> +/* Decomposes a Hangul unicode character. */
> +int decompose_hangul(wchar_t uc, u16 *result)
> +{
> +	int index;
> +	int l, v, t;
> +
> +	index = uc - Hangul_SBase;
> +	if (index < 0 || index >= Hangul_SCount)
> +		return 0;
> +
> +	l = Hangul_LBase + index / Hangul_NCount;
> +	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> +	t = Hangul_TBase + index % Hangul_TCount;
> +
> +	result[0] = l;
> +	result[1] = v;
> +	if (t != Hangul_TBase) {
> +		result[2] = t;
> +		return 3;
> +	}
> +	return 2;
> +}
> +
>  int hfsplus_asc2uni(struct super_block *sb,
>  		    struct hfsplus_unistr *ustr, int max_unistr_len,
>  		    const char *astr, int len)
> @@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
>  	int size, dsize, decompose;
>  	u16 *dstr, outlen = 0;
>  	wchar_t c;
> +	u16 hangul_buf[3];
>  
>  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE,
> &HFSPLUS_SB(sb)->flags);
>  	while (outlen < max_unistr_len && len > 0) {
>  		size = asc2unichar(sb, astr, len, &c);
>  
> -		if (decompose)
> -			dstr = decompose_unichar(c, &dsize);
> -		else
> +		if (decompose) {
> +			/* Hangul is handled separately */
> +			dstr = &hangul_buf[0];
> +			dsize = decompose_hangul(c, dstr);
> +			if (dsize == 0)
> +				/* not Hangul */
> +				dstr = decompose_unichar(c, &dsize);
> +		} else {
>  			dstr = NULL;
> +		}
> +
>  		if (dstr) {
>  			if (outlen + dsize > max_unistr_len)
>  				break;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23  3:57   ` tchou
  2017-11-23  4:21     ` Viacheslav Dubeyko
@ 2017-11-23 11:32     ` Ernesto A. Fernández
  2017-11-23 18:36       ` Viacheslav Dubeyko
  2017-11-24  7:25       ` tchou
  1 sibling, 2 replies; 14+ messages in thread
From: Ernesto A. Fernández @ 2017-11-23 11:32 UTC (permalink / raw)
  To: tchou
  Cc: linux-fsdevel, linux-fsdevel-owner, slava, htl10,
	Ernesto A. Fernández

Hi:

your issue seems to be in the decomposition of hangul characters, not in
the recomposition before printing. The hfsplus module on linux is saving
the name of your actor as AC F5 C7 20, without performing any
decomposition at all.

The reason your patch hides the bug is because it causes linux to present
filenames as decomposed utf8, so it is not necessary to decompose again
before working with them. But the issue is still there, and you will most
likely run into trouble if you make a hangul filename in linux and try
to work with it in MacOS.

Reviewing the code it would seem that the developers completely forgot
the hangul characters had their own rules for decomposition. It's weird
because they did the composition part correctly.

I've made a quick draft of a patch, mostly by copying the code provided
in the unicode web. I don't think we can actually use it on a release,
but it should be enough to check if I'm right. It works fine on linux,
but I don't have a mac, so it would be great if you could test it for me.

Thanks,
Ernest

(By the way, there is no reason you should have to use the nodecompose
mount option, as the other reviewer suggested. Using that option will
have a similar effect to that of your patch. It will hide the problem,
but if you create a hangul filename on linux with that option you
probably won't be able to use it on a mac.)

---
diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
index dfa90c2..9006c61 100644
--- a/fs/hfsplus/unicode.c
+++ b/fs/hfsplus/unicode.c
@@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block *sb, const char *astr, int len,
 	return size;
 }
 
-/* Decomposes a single unicode character. */
+/* Decomposes a single non-Hangul unicode character. */
 static inline u16 *decompose_unichar(wchar_t uc, int *size)
 {
 	int off;
@@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int *size)
 	return hfsplus_decompose_table + (off / 4);
 }
 
+/* Decomposes a Hangul unicode character. */
+int decompose_hangul(wchar_t uc, u16 *result)
+{
+	int index;
+	int l, v, t;
+
+	index = uc - Hangul_SBase;
+	if (index < 0 || index >= Hangul_SCount)
+		return 0;
+
+	l = Hangul_LBase + index / Hangul_NCount;
+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
+	t = Hangul_TBase + index % Hangul_TCount;
+
+	result[0] = l;
+	result[1] = v;
+	if (t != Hangul_TBase) {
+		result[2] = t;
+		return 3;
+	}
+	return 2;
+}
+
 int hfsplus_asc2uni(struct super_block *sb,
 		    struct hfsplus_unistr *ustr, int max_unistr_len,
 		    const char *astr, int len)
@@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
 	int size, dsize, decompose;
 	u16 *dstr, outlen = 0;
 	wchar_t c;
+	u16 hangul_buf[3];
 
 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
 	while (outlen < max_unistr_len && len > 0) {
 		size = asc2unichar(sb, astr, len, &c);
 
-		if (decompose)
-			dstr = decompose_unichar(c, &dsize);
-		else
+		if (decompose) {
+			/* Hangul is handled separately */
+			dstr = &hangul_buf[0];
+			dsize = decompose_hangul(c, dstr);
+			if (dsize == 0)
+				/* not Hangul */
+				dstr = decompose_unichar(c, &dsize);
+		} else {
 			dstr = NULL;
+		}
+
 		if (dstr) {
 			if (outlen + dsize > max_unistr_len)
 				break;

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23  6:23         ` Viacheslav Dubeyko
@ 2017-11-23  6:34           ` tchou
  0 siblings, 0 replies; 14+ messages in thread
From: tchou @ 2017-11-23  6:34 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: "\"Ernesto A.\"" Fernández, linux-fsdevel,
	linux-fsdevel-owner, htl10

Viacheslav Dubeyko 於 2017-11-23 14:23 寫到:
> On Thu, 2017-11-23 at 14:05 +0800, tchou wrote:
>> Viacheslav Dubeyko 於 2017-11-23 12:21 寫到:
>> >
>> > On Thu, 2017-11-23 at 11:57 +0800, tchou wrote:
>> > >
>> > > >
>> > > >
>> > > > Could you please share the examples of incorrect and correct
>> > > > behaviour?
>> > > >
>> > > > What is it exactly that you are trying to fix? You mention an
>> > > > issue 
>> > > > with
>> > > > hangul characters, but I failed to trigger it. Could you expand
>> > > > on 
>> > > > that?
>> > > >
>> > > Hi all,
>> > > There is an example.I use Mac mini to format my usb disk to HFS+
>> > > and 
>> > > touch
>> > > the file with filename "공유"(a Korean actor, https://goo.gl/VcBsrn
>> > > )
>> > > on 
>> > > it.
>> > > After it, I mount the usb disk on my ubuntu(Linux 4.14.0+) and
>> > > get
>> > > the
>> > > following error message when trying to ls and cp the file:
>> > >
>> >
>> > Could you share the all mount options that you used? It looks that
>> > you
>> > simply don't use the proper mount options set under the Linux.
>> > Because
>> > MacOS X will mount properly always.
>> >
>> > Thanks,
>> > Vyacheslav Dubeyko.
>> >
>> Hi Vyacheslav,
>> 
>> I use the default mount option:
>> 
>> mount /dev/sdc1 /mnt/hfs
>> /dev/sdc1 on /mnt/hfs type hfsplus 
>> (rw,relatime,umask=22,uid=0,gid=0,nls=utf8)
>> 
> 
> 
> Please, take a look into mount options description [1]. Especially,
> take a look into decompose/nodecompose options. You simply needs to use
> the proper mount option for your case.
> 
I got it. It's my mistake.
Thank you all for making me right.

Thanks,
TCHou

> Thanks,
> Vyacheslav Dubeyko.
> 
> 
> [1] https://www.kernel.org/doc/Documentation/filesystems/hfsplus.txt
> 
>> Thanks,
>> TCHou
>> >

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23  6:05       ` tchou
@ 2017-11-23  6:23         ` Viacheslav Dubeyko
  2017-11-23  6:34           ` tchou
  0 siblings, 1 reply; 14+ messages in thread
From: Viacheslav Dubeyko @ 2017-11-23  6:23 UTC (permalink / raw)
  To: tchou
  Cc: "Ernesto A." Fernández, linux-fsdevel,
	linux-fsdevel-owner, htl10

On Thu, 2017-11-23 at 14:05 +0800, tchou wrote:
> Viacheslav Dubeyko 於 2017-11-23 12:21 寫到:
> > 
> > On Thu, 2017-11-23 at 11:57 +0800, tchou wrote:
> > > 
> > > > 
> > > > 
> > > > Could you please share the examples of incorrect and correct
> > > > behaviour?
> > > > 
> > > > What is it exactly that you are trying to fix? You mention an
> > > > issue 
> > > > with
> > > > hangul characters, but I failed to trigger it. Could you expand
> > > > on 
> > > > that?
> > > > 
> > > Hi all,
> > > There is an example.I use Mac mini to format my usb disk to HFS+
> > > and 
> > > touch
> > > the file with filename "공유"(a Korean actor, https://goo.gl/VcBsrn
> > > )
> > > on 
> > > it.
> > > After it, I mount the usb disk on my ubuntu(Linux 4.14.0+) and
> > > get
> > > the
> > > following error message when trying to ls and cp the file:
> > > 
> > 
> > Could you share the all mount options that you used? It looks that
> > you
> > simply don't use the proper mount options set under the Linux.
> > Because
> > MacOS X will mount properly always.
> > 
> > Thanks,
> > Vyacheslav Dubeyko.
> > 
> Hi Vyacheslav,
> 
> I use the default mount option:
> 
> mount /dev/sdc1 /mnt/hfs
> /dev/sdc1 on /mnt/hfs type hfsplus 
> (rw,relatime,umask=22,uid=0,gid=0,nls=utf8)
> 


Please, take a look into mount options description [1]. Especially,
take a look into decompose/nodecompose options. You simply needs to use
the proper mount option for your case.

Thanks,
Vyacheslav Dubeyko.


[1] https://www.kernel.org/doc/Documentation/filesystems/hfsplus.txt

> Thanks,
> TCHou
> > 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23  4:21     ` Viacheslav Dubeyko
@ 2017-11-23  6:05       ` tchou
  2017-11-23  6:23         ` Viacheslav Dubeyko
  0 siblings, 1 reply; 14+ messages in thread
From: tchou @ 2017-11-23  6:05 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: "Ernesto A." Fernández, linux-fsdevel,
	linux-fsdevel-owner, htl10

Viacheslav Dubeyko 於 2017-11-23 12:21 寫到:
> On Thu, 2017-11-23 at 11:57 +0800, tchou wrote:
>> >
>> > Could you please share the examples of incorrect and correct
>> > behaviour?
>> >
>> > What is it exactly that you are trying to fix? You mention an
>> > issue 
>> > with
>> > hangul characters, but I failed to trigger it. Could you expand on 
>> > that?
>> >
>> Hi all,
>> There is an example.I use Mac mini to format my usb disk to HFS+ and 
>> touch
>> the file with filename "공유"(a Korean actor, https://goo.gl/VcBsrn)
>> on 
>> it.
>> After it, I mount the usb disk on my ubuntu(Linux 4.14.0+) and get
>> the
>> following error message when trying to ls and cp the file:
>> 
> 
> 
> Could you share the all mount options that you used? It looks that you
> simply don't use the proper mount options set under the Linux. Because
> MacOS X will mount properly always.
> 
> Thanks,
> Vyacheslav Dubeyko.
> 
Hi Vyacheslav,

I use the default mount option:

mount /dev/sdc1 /mnt/hfs
/dev/sdc1 on /mnt/hfs type hfsplus 
(rw,relatime,umask=22,uid=0,gid=0,nls=utf8)

Thanks,
TCHou
> 
> 
>> ls: cannot access 공유: No such file or directory
>> cp: cannot stat ‘공유’: No such file or directory
>> 
>> It seem's a problem for a long time(https://goo.gl/LiWGe5).
>> After applying my patch, I can ls and cp the file correctly.
>> 
>> >
>> > On Fri, Nov 17, 2017 at 04:20:05PM +0800, Ting-Chang Hou wrote:
>> > >
>> > > The unicode of hangul from macOS is decomposed. There has a bug
>> > > that
>> > > mistake decomposed unicode for composed when change unicode to
>> > > ascii,
>> > > so it cannot recognize the hangul correctly.
>> > >
>> > > Signed-off-by: Ting-Chang Hou <tchou@synology.com>
>> > > ---
>> > >  fs/hfsplus/unicode.c | 2 +-
>> > >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > >
>> > > diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
>> > > index dfa90c2..2daf7b0 100644
>> > > --- a/fs/hfsplus/unicode.c
>> > > +++ b/fs/hfsplus/unicode.c
>> > > @@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
>> > >  	ustrlen = be16_to_cpu(ustr->length);
>> > >  	len = *len_p;
>> > >  	ce1 = NULL;
>> > > -	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE,
>> > > &HFSPLUS_SB(sb)->flags);
>> > > +	compose = test_bit(HFSPLUS_SB_NODECOMPOSE,
>> > > &HFSPLUS_SB(sb)->flags);
>> > I'm not sure this is a mistake. The developers probably wanted the
>> > filenames to be recomposed before being presented in utf8. With
>> > your 
>> > patch,
>> > if you try the following (with the default mount options):
>> >
>> >   touch Á
>> >   ls | hexdump -C
>> >
>> > the utf8 output filename will be using the combining accent (CC
>> > 81) 
>> > instead
>> > of the Á character (C3 81). This is a bit annoying because it
>> > won't 
>> > print
>> > correctly in my terminal anymore.
>> I'm not exatly know why combining accent cannot print correctly in 
>> terminal
>> and how to avoid it. Whether apply my patch or not, my terminal
>> cannot
>> print the hangul charactor correctly.
>> 
>> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-23  3:57   ` tchou
@ 2017-11-23  4:21     ` Viacheslav Dubeyko
  2017-11-23  6:05       ` tchou
  2017-11-23 11:32     ` Ernesto A. Fernández
  1 sibling, 1 reply; 14+ messages in thread
From: Viacheslav Dubeyko @ 2017-11-23  4:21 UTC (permalink / raw)
  To: tchou, Ernesto A. Fernández
  Cc: linux-fsdevel, linux-fsdevel-owner, htl10

On Thu, 2017-11-23 at 11:57 +0800, tchou wrote:
> > 
> > Could you please share the examples of incorrect and correct
> > behaviour?
> > 
> > What is it exactly that you are trying to fix? You mention an
> > issue 
> > with
> > hangul characters, but I failed to trigger it. Could you expand on 
> > that?
> > 
> Hi all,
> There is an example.I use Mac mini to format my usb disk to HFS+ and 
> touch
> the file with filename "공유"(a Korean actor, https://goo.gl/VcBsrn)
> on 
> it.
> After it, I mount the usb disk on my ubuntu(Linux 4.14.0+) and get
> the
> following error message when trying to ls and cp the file:
> 


Could you share the all mount options that you used? It looks that you
simply don't use the proper mount options set under the Linux. Because
MacOS X will mount properly always.

Thanks,
Vyacheslav Dubeyko.



> ls: cannot access 공유: No such file or directory
> cp: cannot stat ‘공유’: No such file or directory
> 
> It seem's a problem for a long time(https://goo.gl/LiWGe5).
> After applying my patch, I can ls and cp the file correctly.
> 
> > 
> > On Fri, Nov 17, 2017 at 04:20:05PM +0800, Ting-Chang Hou wrote:
> > > 
> > > The unicode of hangul from macOS is decomposed. There has a bug
> > > that
> > > mistake decomposed unicode for composed when change unicode to
> > > ascii,
> > > so it cannot recognize the hangul correctly.
> > > 
> > > Signed-off-by: Ting-Chang Hou <tchou@synology.com>
> > > ---
> > >  fs/hfsplus/unicode.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> > > index dfa90c2..2daf7b0 100644
> > > --- a/fs/hfsplus/unicode.c
> > > +++ b/fs/hfsplus/unicode.c
> > > @@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
> > >  	ustrlen = be16_to_cpu(ustr->length);
> > >  	len = *len_p;
> > >  	ce1 = NULL;
> > > -	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE,
> > > &HFSPLUS_SB(sb)->flags);
> > > +	compose = test_bit(HFSPLUS_SB_NODECOMPOSE,
> > > &HFSPLUS_SB(sb)->flags);
> > I'm not sure this is a mistake. The developers probably wanted the
> > filenames to be recomposed before being presented in utf8. With
> > your 
> > patch,
> > if you try the following (with the default mount options):
> > 
> >   touch Á
> >   ls | hexdump -C
> > 
> > the utf8 output filename will be using the combining accent (CC
> > 81) 
> > instead
> > of the Á character (C3 81). This is a bit annoying because it
> > won't 
> > print
> > correctly in my terminal anymore.
> I'm not exatly know why combining accent cannot print correctly in 
> terminal
> and how to avoid it. Whether apply my patch or not, my terminal
> cannot
> print the hangul charactor correctly.
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-19  0:57 ` Ernesto A. Fernández
@ 2017-11-23  3:57   ` tchou
  2017-11-23  4:21     ` Viacheslav Dubeyko
  2017-11-23 11:32     ` Ernesto A. Fernández
  0 siblings, 2 replies; 14+ messages in thread
From: tchou @ 2017-11-23  3:57 UTC (permalink / raw)
  To: Ernesto A. Fernández
  Cc: linux-fsdevel, linux-fsdevel-owner, slava, htl10

> Could you please share the examples of incorrect and correct
> behaviour?

> What is it exactly that you are trying to fix? You mention an issue 
> with
> hangul characters, but I failed to trigger it. Could you expand on 
> that?
> 

Hi all,
There is an example.I use Mac mini to format my usb disk to HFS+ and 
touch
the file with filename "공유"(a Korean actor, https://goo.gl/VcBsrn) on 
it.
After it, I mount the usb disk on my ubuntu(Linux 4.14.0+) and get the
following error message when trying to ls and cp the file:

ls: cannot access 공유: No such file or directory
cp: cannot stat ‘공유’: No such file or directory

It seem's a problem for a long time(https://goo.gl/LiWGe5).
After applying my patch, I can ls and cp the file correctly.

> On Fri, Nov 17, 2017 at 04:20:05PM +0800, Ting-Chang Hou wrote:
>> The unicode of hangul from macOS is decomposed. There has a bug that
>> mistake decomposed unicode for composed when change unicode to ascii,
>> so it cannot recognize the hangul correctly.
>> 
>> Signed-off-by: Ting-Chang Hou <tchou@synology.com>
>> ---
>>  fs/hfsplus/unicode.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
>> index dfa90c2..2daf7b0 100644
>> --- a/fs/hfsplus/unicode.c
>> +++ b/fs/hfsplus/unicode.c
>> @@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
>>  	ustrlen = be16_to_cpu(ustr->length);
>>  	len = *len_p;
>>  	ce1 = NULL;
>> -	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
>> +	compose = test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> 
> I'm not sure this is a mistake. The developers probably wanted the
> filenames to be recomposed before being presented in utf8. With your 
> patch,
> if you try the following (with the default mount options):
> 
>   touch Á
>   ls | hexdump -C
> 
> the utf8 output filename will be using the combining accent (CC 81) 
> instead
> of the Á character (C3 81). This is a bit annoying because it won't 
> print
> correctly in my terminal anymore.

I'm not exatly know why combining accent cannot print correctly in 
terminal
and how to avoid it. Whether apply my patch or not, my terminal cannot
print the hangul charactor correctly.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
  2017-11-17  8:20 Ting-Chang Hou
@ 2017-11-19  0:57 ` Ernesto A. Fernández
  2017-11-23  3:57   ` tchou
  0 siblings, 1 reply; 14+ messages in thread
From: Ernesto A. Fernández @ 2017-11-19  0:57 UTC (permalink / raw)
  To: Ting-Chang Hou; +Cc: linux-fsdevel, Ernesto A. Fernández

On Fri, Nov 17, 2017 at 04:20:05PM +0800, Ting-Chang Hou wrote:
> The unicode of hangul from macOS is decomposed. There has a bug that
> mistake decomposed unicode for composed when change unicode to ascii,
> so it cannot recognize the hangul correctly.
> 
> Signed-off-by: Ting-Chang Hou <tchou@synology.com>
> ---
>  fs/hfsplus/unicode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> index dfa90c2..2daf7b0 100644
> --- a/fs/hfsplus/unicode.c
> +++ b/fs/hfsplus/unicode.c
> @@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
>  	ustrlen = be16_to_cpu(ustr->length);
>  	len = *len_p;
>  	ce1 = NULL;
> -	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> +	compose = test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);

I'm not sure this is a mistake. The developers probably wanted the
filenames to be recomposed before being presented in utf8. With your patch,
if you try the following (with the default mount options):

  touch Á
  ls | hexdump -C

the utf8 output filename will be using the combining accent (CC 81) instead
of the Á character (C3 81). This is a bit annoying because it won't print
correctly in my terminal anymore.

What is it exactly that you are trying to fix? You mention an issue with
hangul characters, but I failed to trigger it. Could you expand on that?

>  	while (ustrlen > 0) {
>  		c0 = be16_to_cpu(*ip++);
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
@ 2017-11-17  8:20 Ting-Chang Hou
  2017-11-19  0:57 ` Ernesto A. Fernández
  0 siblings, 1 reply; 14+ messages in thread
From: Ting-Chang Hou @ 2017-11-17  8:20 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Ting-Chang Hou

The unicode of hangul from macOS is decomposed. There has a bug that
mistake decomposed unicode for composed when change unicode to ascii,
so it cannot recognize the hangul correctly.

Signed-off-by: Ting-Chang Hou <tchou@synology.com>
---
 fs/hfsplus/unicode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
index dfa90c2..2daf7b0 100644
--- a/fs/hfsplus/unicode.c
+++ b/fs/hfsplus/unicode.c
@@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
 	ustrlen = be16_to_cpu(ustr->length);
 	len = *len_p;
 	ce1 = NULL;
-	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
+	compose = test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
 
 	while (ustrlen > 0) {
 		c0 = be16_to_cpu(*ip++);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-11-27  2:07 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-17 19:33 [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name Slava Dubeyko
  -- strict thread matches above, loose matches on Subject: below --
2017-11-17  8:20 Ting-Chang Hou
2017-11-19  0:57 ` Ernesto A. Fernández
2017-11-23  3:57   ` tchou
2017-11-23  4:21     ` Viacheslav Dubeyko
2017-11-23  6:05       ` tchou
2017-11-23  6:23         ` Viacheslav Dubeyko
2017-11-23  6:34           ` tchou
2017-11-23 11:32     ` Ernesto A. Fernández
2017-11-23 18:36       ` Viacheslav Dubeyko
2017-11-23 22:20         ` Ernesto A. Fernández
2017-11-24  7:25       ` tchou
2017-11-24 11:45         ` Ernesto A. Fernández
2017-11-27  2:07           ` tchou

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).