From: Viacheslav Dubeyko <slava@dubeyko.com>
To: "Ernesto A. Fernández" <ernesto.mnd.fernandez@gmail.com>,
tchou <tchou@synology.com>
Cc: linux-fsdevel@vger.kernel.org,
linux-fsdevel-owner@vger.kernel.org, htl10@users.sourceforge.net
Subject: Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name
Date: Thu, 23 Nov 2017 10:36:37 -0800
Message-ID: <1511462197.2541.24.camel@dubeyko.com>
In-Reply-To: <20171123113230.GA5581@debian.home>
On Thu, 2017-11-23 at 08:32 -0300, Ernesto A. Fernández wrote:
> Hi:
>
> your issue seems to be in the decomposition of hangul characters, not in
> the recomposition before printing. The hfsplus module on linux is saving
> the name of your actor as AC F5 C7 20, without performing any
> decomposition at all.
>
> The reason your patch hides the bug is because it causes linux to present
> filenames as decomposed utf8, so it is not necessary to decompose again
> before working with them. But the issue is still there, and you will most
> likely run into trouble if you make a hangul filename in linux and try
> to work with it in MacOS.
>
> Reviewing the code it would seem that the developers completely forgot
> the hangul characters had their own rules for decomposition. It's weird
> because they did the composition part correctly.
>
> I've made a quick draft of a patch, mostly by copying the code provided
> in the unicode web. I don't think we can actually use it on a

Could you please share the link for "the unicode web"?

Thanks,
Vyacheslav Dubeyko.

> release, but it should be enough to check if I'm right. It works fine
> on linux, but I don't have a mac, so it would be great if you could
> test it for me.
>
> Thanks,
> Ernest
>
> (By the way, there is no reason you should have to use the nodecompose
> mount option, as the other reviewer suggested. Using that option will
> have a similar effect to that of your patch. It will hide the problem,
> but if you create a hangul filename on linux with that option you
> probably won't be able to use it on a mac.)
>
> ---
> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> index dfa90c2..9006c61 100644
> --- a/fs/hfsplus/unicode.c
> +++ b/fs/hfsplus/unicode.c
> @@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block *sb, const char *astr, int len,
>  	return size;
>  }
>  
> -/* Decomposes a single unicode character. */
> +/* Decomposes a single non-Hangul unicode character. */
>  static inline u16 *decompose_unichar(wchar_t uc, int *size)
>  {
>  	int off;
> @@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int *size)
>  	return hfsplus_decompose_table + (off / 4);
>  }
>  
> +/* Decomposes a Hangul unicode character. */
> +int decompose_hangul(wchar_t uc, u16 *result)
> +{
> +	int index;
> +	int l, v, t;
> +
> +	index = uc - Hangul_SBase;
> +	if (index < 0 || index >= Hangul_SCount)
> +		return 0;
> +
> +	l = Hangul_LBase + index / Hangul_NCount;
> +	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> +	t = Hangul_TBase + index % Hangul_TCount;
> +
> +	result[0] = l;
> +	result[1] = v;
> +	if (t != Hangul_TBase) {
> +		result[2] = t;
> +		return 3;
> +	}
> +	return 2;
> +}
> +
>  int hfsplus_asc2uni(struct super_block *sb,
>  		struct hfsplus_unistr *ustr, int max_unistr_len,
>  		const char *astr, int len)
> @@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
>  	int size, dsize, decompose;
>  	u16 *dstr, outlen = 0;
>  	wchar_t c;
> +	u16 hangul_buf[3];
>  
>  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
>  	while (outlen < max_unistr_len && len > 0) {
>  		size = asc2unichar(sb, astr, len, &c);
>  
> -		if (decompose)
> -			dstr = decompose_unichar(c, &dsize);
> -		else
> +		if (decompose) {
> +			/* Hangul is handled separately */
> +			dstr = &hangul_buf[0];
> +			dsize = decompose_hangul(c, dstr);
> +			if (dsize == 0)
> +				/* not Hangul */
> +				dstr = decompose_unichar(c, &dsize);
> +		} else {
>  			dstr = NULL;
> +		}
> +
>  		if (dstr) {
>  			if (outlen + dsize > max_unistr_len)
>  				break;
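
For reference, the arithmetic in decompose_hangul() from the quoted patch can be tried outside the kernel. What follows is a minimal userspace sketch, not the patch itself. The constant values are the ones given by the Unicode Standard's Hangul syllable decomposition algorithm, and it is only an assumption that they match the Hangul_* macros the patch refers to, since the quoted hunks do not show those definitions. The name hangul_decompose_sketch() is hypothetical.

```c
#include <stdint.h>

/*
 * Constants from the Unicode Standard's arithmetic decomposition of
 * precomposed Hangul syllables. ASSUMPTION: these correspond to the
 * Hangul_* macros used by the quoted patch, which are not shown there.
 */
enum {
	SBASE  = 0xAC00,		/* first precomposed syllable */
	LBASE  = 0x1100,		/* first leading consonant jamo */
	VBASE  = 0x1161,		/* first vowel jamo */
	TBASE  = 0x11A7,		/* one before the first trailing jamo */
	LCOUNT = 19,
	VCOUNT = 21,
	TCOUNT = 28,
	NCOUNT = VCOUNT * TCOUNT,	/* 588 */
	SCOUNT = LCOUNT * NCOUNT	/* 11172 */
};

/*
 * Hypothetical userspace helper (not the kernel function).
 * Writes the jamo for code point cp into out[] and returns how many
 * code units were written: 0 if cp is not a precomposed Hangul
 * syllable, 2 for leading + vowel, 3 when a trailing consonant exists.
 */
static int hangul_decompose_sketch(uint32_t cp, uint16_t out[3])
{
	uint32_t index;

	if (cp < SBASE || cp >= SBASE + SCOUNT)
		return 0;

	index = cp - SBASE;
	out[0] = (uint16_t)(LBASE + index / NCOUNT);
	out[1] = (uint16_t)(VBASE + (index % NCOUNT) / TCOUNT);
	if (index % TCOUNT == 0)
		return 2;	/* no trailing consonant */

	out[2] = (uint16_t)(TBASE + index % TCOUNT);
	return 3;
}
```

Applied to the two code units mentioned in the quoted mail, U+ACF5 yields three units (1100 1169 11BC) and U+C720 yields two (110B 1172), while a non-Hangul code point such as U+0041 returns 0 and would fall through to the table lookup.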