linux-fsdevel.vger.kernel.org archive mirror
From: tchou <tchou@synology.com>
To: "Ernesto A. Fernández" <ernesto.mnd.fernandez@gmail.com>
Cc: linux-fsdevel@vger.kernel.org,
	linux-fsdevel-owner@vger.kernel.org, slava@dubeyko.com,
	htl10@users.sourceforge.net
Subject: Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
Date: Mon, 27 Nov 2017 10:07:00 +0800
Message-ID: <510d6873cbcd4d93f0902c62e3a2c22d@synology.com>
In-Reply-To: <20171124114525.GA3265@debian.home>

On 2017-11-24 19:45, Ernesto A. Fernández wrote:
> On Fri, Nov 24, 2017 at 03:25:40PM +0800, tchou wrote:
>> On 2017-11-23 19:32, Ernesto A. Fernández wrote:
>> >Hi:
>> >
>> >your issue seems to be in the decomposition of hangul characters, not in
>> >the recomposition before printing. The hfsplus module on linux is saving
>> >the name of your actor as AC F5 C7 20, without performing any
>> >decomposition at all.
>> >
>> >The reason your patch hides the bug is because it causes linux to present
>> >filenames as decomposed utf8, so it is not necessary to decompose again
>> >before working with them. But the issue is still there, and you will most
>> >likely run into trouble if you make a hangul filename in linux and try
>> >to work with it in MacOS.
>> >
>> >Reviewing the code it would seem that the developers completely forgot
>> >the hangul characters had their own rules for decomposition. It's weird
>> >because they did the composition part correctly.
>> >
>> >I've made a quick draft of a patch, mostly by copying the code provided
>> >on the Unicode website. I don't think we can actually use it in a release,
>> >but it should be enough to check if I'm right. It works fine on linux,
>> >but I don't have a mac, so it would be great if you could test it for me.
>> >
>> >Thanks,
>> >Ernest
>> >
>> >(By the way, there is no reason you should have to use the nodecompose
>> >mount option, as the other reviewer suggested. Using that option will
>> >have a similar effect to that of your patch. It will hide the problem,
>> >but if you create a hangul filename on linux with that option you
>> >probably won't be able to use it on a mac.)
>> >
>> >---
>> >diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
>> >index dfa90c2..9006c61 100644
>> >--- a/fs/hfsplus/unicode.c
>> >+++ b/fs/hfsplus/unicode.c
>> >@@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block *sb, const char *astr, int len,
>> > 	return size;
>> > }
>> >
>> >-/* Decomposes a single unicode character. */
>> >+/* Decomposes a single non-Hangul unicode character. */
>> > static inline u16 *decompose_unichar(wchar_t uc, int *size)
>> > {
>> > 	int off;
>> >@@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int *size)
>> > 	return hfsplus_decompose_table + (off / 4);
>> > }
>> >
>> >+/* Decomposes a Hangul unicode character. */
>> >+int decompose_hangul(wchar_t uc, u16 *result)
>> >+{
>> >+	int index;
>> >+	int l, v, t;
>> >+
>> >+	index = uc - Hangul_SBase;
>> >+	if (index < 0 || index >= Hangul_SCount)
>> >+		return 0;
>> >+
>> >+	l = Hangul_LBase + index / Hangul_NCount;
>> >+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
>> >+	t = Hangul_TBase + index % Hangul_TCount;
>> >+
>> >+	result[0] = l;
>> >+	result[1] = v;
>> >+	if (t != Hangul_TBase) {
>> >+		result[2] = t;
>> >+		return 3;
>> >+	}
>> >+	return 2;
>> >+}
>> >+
>> > int hfsplus_asc2uni(struct super_block *sb,
>> > 		    struct hfsplus_unistr *ustr, int max_unistr_len,
>> > 		    const char *astr, int len)
>> >@@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
>> > 	int size, dsize, decompose;
>> > 	u16 *dstr, outlen = 0;
>> > 	wchar_t c;
>> >+	u16 hangul_buf[3];
>> >
>> > 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
>> > 	while (outlen < max_unistr_len && len > 0) {
>> > 		size = asc2unichar(sb, astr, len, &c);
>> >
>> >-		if (decompose)
>> >-			dstr = decompose_unichar(c, &dsize);
>> >-		else
>> >+		if (decompose) {
>> >+			/* Hangul is handled separately */
>> >+			dstr = &hangul_buf[0];
>> >+			dsize = decompose_hangul(c, dstr);
>> >+			if (dsize == 0)
>> >+				/* not Hangul */
>> >+				dstr = decompose_unichar(c, &dsize);
>> >+		} else {
>> > 			dstr = NULL;
>> >+		}
>> >+
>> > 		if (dstr) {
>> > 			if (outlen + dsize > max_unistr_len)
>> > 				break;
>> 
>> Hi,
>> 
>> Thank you for your explanation and your draft.
>> I tested four versions of the hfsplus module:
>> --------------------------------
>> 1. the original Linux module
>> 2. with my patch applied
>> 3. mounted with the nodecompose option
>> 4. with your patch applied
>> --------------------------------
> 
> It seems you were very thorough, thank you.
> 
>> Here are the results of the test:
>> ---------------------------------------------------------------------
>> Linux can ls and cp files from the Mac correctly with modules 2, 3 and 4.
>> Mac cannot cp files correctly that were touched by modules 1, 2 or 3.
>> Mac can cp files correctly that were touched by the module with your patch.
>> Mac can ls files touched by any of the modules.
>> ---------------------------------------------------------------------
>> 
>> It seems that everything works well with your patch. The result is just
>> as you said: neither my patch nor mounting with the nodecompose option
>> works correctly.
> 
> That's good to hear. I will submit a patch as soon as I can figure out
> the licensing issues. If you want, I can credit you as reporter and
> tester.
> 
OK, thanks a lot.
>> 
>> By the way, do the decomposition procedures in hfsplus_hash_dentry()
>> and hfsplus_compare_dentry() need to be modified as well?
> 
> You are correct, good thing you noticed. I'll add that when I resend
> the patch.
> 
>> Thanks,
>> TCHou
