From: "Ernesto A. Fernández" <ernesto.mnd.fernandez@gmail.com>
To: tchou <tchou@synology.com>
Cc: linux-fsdevel@vger.kernel.org,
linux-fsdevel-owner@vger.kernel.org, slava@dubeyko.com,
htl10@users.sourceforge.net,
"Ernesto A. Fernández" <ernesto.mnd.fernandez@gmail.com>
Subject: Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
Date: Fri, 24 Nov 2017 08:45:26 -0300
Message-ID: <20171124114525.GA3265@debian.home>
In-Reply-To: <1028889c01cdd513c2cf8cabf6c1762e@synology.com>
On Fri, Nov 24, 2017 at 03:25:40PM +0800, tchou wrote:
> Ernesto A. Fernández wrote on 2017-11-23 19:32:
> >Hi:
> >
> >your issue seems to be in the decomposition of hangul characters, not in
> >the recomposition before printing. The hfsplus module on linux is saving
> >the name of your actor as AC F5 C7 20, without performing any
> >decomposition at all.
> >
> >The reason your patch hides the bug is because it causes linux to present
> >filenames as decomposed utf8, so it is not necessary to decompose again
> >before working with them. But the issue is still there, and you will most
> >likely run into trouble if you make a hangul filename in linux and try
> >to work with it in MacOS.
> >
> >Reviewing the code it would seem that the developers completely forgot
> >the hangul characters had their own rules for decomposition. It's weird
> >because they did the composition part correctly.
> >
> >I've made a quick draft of a patch, mostly by copying the sample code
> >published on the Unicode website. I don't think we can actually use it in
> >a release, but it should be enough to check if I'm right. It works fine on
> >linux, but I don't have a mac, so it would be great if you could test it
> >for me.
> >
> >Thanks,
> >Ernest
> >
> >(By the way, there is no reason you should have to use the nodecompose
> >mount option, as the other reviewer suggested. Using that option will
> >have a similar effect to that of your patch. It will hide the problem,
> >but if you create a hangul filename on linux with that option you
> >probably won't be able to use it on a mac.)
> >
> >---
> >diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> >index dfa90c2..9006c61 100644
> >--- a/fs/hfsplus/unicode.c
> >+++ b/fs/hfsplus/unicode.c
> >@@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block *sb, const char *astr, int len,
> > 	return size;
> > }
> >
> >-/* Decomposes a single unicode character. */
> >+/* Decomposes a single non-Hangul unicode character. */
> > static inline u16 *decompose_unichar(wchar_t uc, int *size)
> > {
> > 	int off;
> >@@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int *size)
> > 	return hfsplus_decompose_table + (off / 4);
> > }
> >
> >+/* Decomposes a Hangul unicode character. */
> >+int decompose_hangul(wchar_t uc, u16 *result)
> >+{
> >+	int index;
> >+	int l, v, t;
> >+
> >+	index = uc - Hangul_SBase;
> >+	if (index < 0 || index >= Hangul_SCount)
> >+		return 0;
> >+
> >+	l = Hangul_LBase + index / Hangul_NCount;
> >+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> >+	t = Hangul_TBase + index % Hangul_TCount;
> >+
> >+	result[0] = l;
> >+	result[1] = v;
> >+	if (t != Hangul_TBase) {
> >+		result[2] = t;
> >+		return 3;
> >+	}
> >+	return 2;
> >+}
> >+
> > int hfsplus_asc2uni(struct super_block *sb,
> > 		    struct hfsplus_unistr *ustr, int max_unistr_len,
> > 		    const char *astr, int len)
> >@@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
> > 	int size, dsize, decompose;
> > 	u16 *dstr, outlen = 0;
> > 	wchar_t c;
> >+	u16 hangul_buf[3];
> >
> > 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> > 	while (outlen < max_unistr_len && len > 0) {
> > 		size = asc2unichar(sb, astr, len, &c);
> >
> >-		if (decompose)
> >-			dstr = decompose_unichar(c, &dsize);
> >-		else
> >+		if (decompose) {
> >+			/* Hangul is handled separately */
> >+			dstr = &hangul_buf[0];
> >+			dsize = decompose_hangul(c, dstr);
> >+			if (dsize == 0)
> >+				/* not Hangul */
> >+				dstr = decompose_unichar(c, &dsize);
> >+		} else {
> > 			dstr = NULL;
> >+		}
> >+
> > 		if (dstr) {
> > 			if (outlen + dsize > max_unistr_len)
> > 				break;
>
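
For anyone who wants to check the arithmetic of decompose_hangul() outside
the kernel tree, here is a minimal user-space sketch. It is not the patch
itself: it assumes the Hangul_* constants have the values given by the
Unicode Standard's Hangul syllable decomposition algorithm, and the
hangul_decompose() name, the HANGUL_* enum and the uint16_t/uint32_t types
are only illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Values from the Unicode Standard's Hangul syllable algorithm; assumed
 * to match the Hangul_* constants that the draft patch refers to.
 */
enum {
	HANGUL_SBASE  = 0xAC00,	/* first precomposed syllable */
	HANGUL_LBASE  = 0x1100,	/* first leading consonant */
	HANGUL_VBASE  = 0x1161,	/* first vowel */
	HANGUL_TBASE  = 0x11A7,	/* one before the first trailing consonant */
	HANGUL_VCOUNT = 21,
	HANGUL_TCOUNT = 28,
	HANGUL_NCOUNT = HANGUL_VCOUNT * HANGUL_TCOUNT,	/* 588 */
	HANGUL_SCOUNT = 19 * HANGUL_NCOUNT		/* 11172 */
};

/*
 * Illustrative user-space version of the draft's logic. Writes 2 or 3
 * jamo to out[] and returns how many were written; returns 0 if uc is
 * not a precomposed Hangul syllable.
 */
static int hangul_decompose(uint32_t uc, uint16_t out[3])
{
	int index, t;

	assert(out != NULL);
	if (uc < HANGUL_SBASE || uc >= HANGUL_SBASE + HANGUL_SCOUNT)
		return 0;

	index = (int)(uc - HANGUL_SBASE);
	out[0] = (uint16_t)(HANGUL_LBASE + index / HANGUL_NCOUNT);
	out[1] = (uint16_t)(HANGUL_VBASE + (index % HANGUL_NCOUNT) / HANGUL_TCOUNT);
	t = index % HANGUL_TCOUNT;
	if (t == 0)
		return 2;	/* syllable has no trailing consonant */
	out[2] = (uint16_t)(HANGUL_TBASE + t);
	return 3;
}
```

With the name from the report, AC F5 should decompose to 1100 1169 11BC and
C7 20 to 110B 1172, so the two syllables would be stored as five UTF-16
units.
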
> Hi,
>
> Thank you for your explanation and your draft.
> I tested four versions of the hfsplus module:
> --------------------------------
> 1. original linux
> 2. with my patch applied
> 3. mounted with the nodecompose option
> 4. with your patch applied
> --------------------------------

It seems you were very thorough. Thank you.

> Here are the results of the test:
> ---------------------------------------------------------------------
> Linux can ls and cp files from the Mac correctly with modules 2, 3 and 4.
> Mac cannot correctly cp files that were touched by modules 1, 2 or 3.
> Mac can correctly cp files that were touched by the module with your patch.
> Mac can ls files touched by any of the modules.
> ---------------------------------------------------------------------
>
> It seems that everything works well with your patch. The results are just
> as you said: neither my patch nor mounting with the nodecompose option
> works correctly.

That's good to hear. I will submit a patch as soon as I can figure out
the licensing issues. If you want, I can credit you as reporter and tester.

>
> By the way, do the decomposition procedures in hfsplus_hash_dentry()
> and hfsplus_compare_dentry() need to be modified as well?

You are correct; good thing you noticed. I'll add that when I resend the
patch.
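
To illustrate why the lookup paths would need the same treatment: if the
name on disk is stored decomposed, a hash or comparison of the name typed
by the user can only match when it decomposes Hangul in the same way. The
sketch below is a user-space illustration under stated assumptions, not
kernel code. It uses a toy hash rather than the kernel's, toy_hash_step()
and toy_name_hash() are hypothetical names, and the constants are the
Unicode Standard's Hangul values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hangul constants as given by the Unicode Standard (assumed values). */
#define SBASE  0xAC00u
#define LBASE  0x1100u
#define VBASE  0x1161u
#define TBASE  0x11A7u
#define TCOUNT 28u
#define NCOUNT 588u	/* 21 vowels * 28 trailing slots */
#define SCOUNT 11172u	/* 19 leading consonants * NCOUNT */

/* Hypothetical helper: feed one UTF-16 unit into a toy hash. */
static uint32_t toy_hash_step(uint32_t h, uint16_t unit)
{
	return h * 31u + unit;
}

/*
 * Hypothetical stand-in for a name hash. When decompose_hangul is
 * nonzero, precomposed syllables are hashed as their jamo sequence;
 * otherwise the raw units are hashed.
 */
static uint32_t toy_name_hash(const uint16_t *name, size_t len,
			      int decompose_hangul)
{
	uint32_t h = 0;
	size_t i;

	assert(name != NULL || len == 0);
	for (i = 0; i < len; i++) {
		uint32_t index = (uint32_t)name[i] - SBASE;

		if (decompose_hangul && name[i] >= SBASE && index < SCOUNT) {
			uint32_t t = index % TCOUNT;

			h = toy_hash_step(h, (uint16_t)(LBASE + index / NCOUNT));
			h = toy_hash_step(h, (uint16_t)(VBASE + (index % NCOUNT) / TCOUNT));
			if (t != 0)
				h = toy_hash_step(h, (uint16_t)(TBASE + t));
		} else {
			h = toy_hash_step(h, name[i]);
		}
	}
	return h;
}
```

Hashing { 0xACF5, 0xC720 } and { 0x1100, 0x1169, 0x11BC, 0x110B, 0x1172 }
gives the same value only when decompose_hangul is set, which is the
property the dentry hash and compare functions would need.
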
> Thanks,
> TCHou