From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from synology.com ([59.124.61.242]:58638 "EHLO synology.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752227AbdK0CHI (ORCPT ); Sun, 26 Nov 2017 21:07:08 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Date: Mon, 27 Nov 2017 10:07:00 +0800 From: tchou To: =?UTF-8?Q?Ernesto_A=2E_Fern=C3=A1ndez?= Cc: linux-fsdevel@vger.kernel.org, linux-fsdevel-owner@vger.kernel.org, slava@dubeyko.com, htl10@users.sourceforge.net Subject: Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name In-Reply-To: <20171124114525.GA3265@debian.home> References: <1510906805-2142-1-git-send-email-tchou@synology.com> <20171119005704.GA3495@debian.home> <080024a85dc413b72c181c6e75bdc736@synology.com> <20171123113230.GA5581@debian.home> <1028889c01cdd513c2cf8cabf6c1762e@synology.com> <20171124114525.GA3265@debian.home> Message-ID: <510d6873cbcd4d93f0902c62e3a2c22d@synology.com> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: Ernesto A. Fernández 於 2017-11-24 19:45 寫到: > On Fri, Nov 24, 2017 at 03:25:40PM +0800, tchou wrote: >> Ernesto A. Fernández 於 2017-11-23 19:32 寫到: >> >Hi: >> > >> >your issue seems to be in the decomposition of hangul characters, not in >> >the recomposition before printing. The hfsplus module on linux is saving >> >the name of your actor as AC F5 C7 20, without performing any >> >decomposition at all. >> > >> >The reason your patch hides the bug is because it causes linux to present >> >filenames as decomposed utf8, so it is not necessary to decompose again >> >before working with them. But the issue is still there, and you will most >> >likely run into trouble if you make a hangul filename in linux and try >> >to work with it in MacOS. >> > >> >Reviewing the code it would seem that the developers completely forgot >> >the hangul characters had their own rules for decomposition. It's weird >> >because they did the composition part correctly. >> > >> >I've made a quick draft of a patch, mostly by copying the code provided >> >in the unicode web. I don't think we can actually use it on a release, >> >but it should be enough to check if I'm right. It works fine on linux, >> >but I don't have a mac, so it would be great if you could test it for me. >> > >> >Thanks, >> >Ernest >> > >> >(By the way, there is no reason you should have to use the nodecompose >> >mount option, as the other reviewer suggested. Using that option will >> >have a similar effect to that of your patch. It will hide the problem, >> >but if you create a hangul filename on linux with that option you >> >probably won't be able to use it on a mac.) >> > >> >--- >> >diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c >> >index dfa90c2..9006c61 100644 >> >--- a/fs/hfsplus/unicode.c >> >+++ b/fs/hfsplus/unicode.c >> >@@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block >> >*sb, const char *astr, int len, >> > return size; >> > } >> > >> >-/* Decomposes a single unicode character. */ >> >+/* Decomposes a single non-Hangul unicode character. */ >> > static inline u16 *decompose_unichar(wchar_t uc, int *size) >> > { >> > int off; >> >@@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int >> >*size) >> > return hfsplus_decompose_table + (off / 4); >> > } >> > >> >+/* Decomposes a Hangul unicode character. */ >> >+int decompose_hangul(wchar_t uc, u16 *result) >> >+{ >> >+ int index; >> >+ int l, v, t; >> >+ >> >+ index = uc - Hangul_SBase; >> >+ if (index < 0 || index >= Hangul_SCount) >> >+ return 0; >> >+ >> >+ l = Hangul_LBase + index / Hangul_NCount; >> >+ v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount; >> >+ t = Hangul_TBase + index % Hangul_TCount; >> >+ >> >+ result[0] = l; >> >+ result[1] = v; >> >+ if (t != Hangul_TBase) { >> >+ result[2] = t; >> >+ return 3; >> >+ } >> >+ return 2; >> >+} >> >+ >> > int hfsplus_asc2uni(struct super_block *sb, >> > struct hfsplus_unistr *ustr, int max_unistr_len, >> > const char *astr, int len) >> >@@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb, >> > int size, dsize, decompose; >> > u16 *dstr, outlen = 0; >> > wchar_t c; >> >+ u16 hangul_buf[3]; >> > >> > decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags); >> > while (outlen < max_unistr_len && len > 0) { >> > size = asc2unichar(sb, astr, len, &c); >> > >> >- if (decompose) >> >- dstr = decompose_unichar(c, &dsize); >> >- else >> >+ if (decompose) { >> >+ /* Hangul is handled separately */ >> >+ dstr = &hangul_buf[0]; >> >+ dsize = decompose_hangul(c, dstr); >> >+ if (dsize == 0) >> >+ /* not Hangul */ >> >+ dstr = decompose_unichar(c, &dsize); >> >+ } else { >> > dstr = NULL; >> >+ } >> >+ >> > if (dstr) { >> > if (outlen + dsize > max_unistr_len) >> > break; >> >> Hi, >> >> Thank you for your explanation and your draft. >> I test four versions of hfsplus module: >> -------------------------------- >> 1. origin linux >> 2. apply my patch >> 3. mount with nodecompose option >> 4. apply your patch >> -------------------------------- > > It seems you were very thorough, thank you. > >> There are the results of the test: >> --------------------------------------------------------------------- >> Linux can ls and cp the file from mac correctly with module 2,3,4. >> Mac cannot cp the files correctly which touched by module 1,2,3. >> Mac can cp the file correctly which touched by module with your patch. >> Mac can ls the files which touch by all modules. >> --------------------------------------------------------------------- >> >> It seems that everything goes well with your patch. The result is just >> like you say, my patch or mount with nodecompose option cannot perform >> normally. > > That's good to hear. I will submit a patch as soon as I can figure out > the licensing issues. If you want I can credit you as reporter and > tester. > OK, thanks a lot. >> >> By the way, are the decompose procedures of function >> hfsplus_hash_dentry() >> and hfsplus_compare_dentry() need to modify as well? > > You are correct, good thing you noticed. I'll add that when I resend > the > patch. > >> Thanks, >> TCHou