From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753416AbcAVQls (ORCPT ); Fri, 22 Jan 2016 11:41:48 -0500 Received: from mx2.suse.de ([195.135.220.15]:42102 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750940AbcAVQlp (ORCPT ); Fri, 22 Jan 2016 11:41:45 -0500 Date: Fri, 22 Jan 2016 17:41:55 +0100 From: Jan Kara To: Andrew Gabbasov Cc: Jan Kara , linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 0/7] udf: rework name conversions to fix multi-bytes characters support Message-ID: <20160122164155.GQ16898@quack.suse.cz> References: <1452847463-16125-1-git-send-email-andrew_gabbasov@mentor.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1452847463-16125-1-git-send-email-andrew_gabbasov@mentor.com> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri 15-01-16 02:44:18, Andrew Gabbasov wrote: > V3: > > Patches 1 and 2 skipped from sending since they are already accepted > by the maintainer (patch 2 with some changes comparing to V2). > > Patches 3 - 5 rebased on top of updated patch 2. > > Patch 6: Fixed a mistake in passing parameters to translate_to_linux(): > the third buffer and length, used for CRC calculation, should be > passed without leading encoding character. Thanks! For now I've taken patches 3-6 into my tree. I'll have a look at patch 7 next week since I need a fresh mind for that. Honza > > Patch 7: Main part of body of converting loops extracted to a separate > helper function. Also, some other modifications addressing maintainer's > comments to V2. > > V2: > > The single patch was split into several commits for separate logical > steps. Also, some minor fixes were done in the code of the patches. > > V1: > > Current implementation has several issues in unicode.c, mostly related > to handling multi-bytes characters in file names: > > - loop ending conditions in udf_CS0toUTF8 and udf_CS0toNLS functions do not > properly catch the end of output buffer in case of multi-bytes characters, > allowing out-of-bounds writing and memory corruption; > > - udf_UTF8toCS0 and udf_NLStoCS0 do not check the right boundary of output > buffer at all, also allowing out-of-bounds writing and memory corruption; > > - udf_translate_to_linux does not take into account multi-bytes characters > at all (although it is called after converting to UTF8 or NLS): maximal > length of extension is counted as 5 bytes, that may be incorrect with > multi-bytes characters; when inserting CRC and extension for long names > (near the end of the buffer), they are inserted at fixed place at the end, > that can break into the middle of the multi-bytes character; > > - when being converted from CS0 to UTF8 (or NLS), the name can be truncated > (even if the sizes in bytes of input and output buffers are the same), > but the following translating function does not know about it and does not > insert CRC, as it is assumed by the specs. > > Because of the last item above, it looks like all the checks and > conversions (re-coding and possible CRC insertions) should be done > simultaneously in the single function. This means that the listed > issues can not be fixed independently and separately. So, the whole > conversion and translation support should be reworked. > > The proposed implementation below fixes the listed issues, and also has > some additional features: > > - it gets rid of "struct ustr", since it actually just makes an unneeded > extra copying of the buffer and does not have any other significant > advantage; > > - it unifies UTF8 and NLS conversions support, since there is no much > sense to separate these cases; > > - UDF_NAME_LEN constant adjusted to better reflect actual restrictions. > > > Andrew Gabbasov (7): > udf: Prevent buffer overrun with multi-byte characters > udf: Check output buffer length when converting name to CS0 > udf: Parameterize output length in udf_put_filename > udf: Join functions for UTF8 and NLS conversions > udf: Adjust UDF_NAME_LEN to better reflect actual restrictions > udf: Remove struct ustr as non-needed intermediate storage > udf: Merge linux specific translation into CS0 conversion function > > fs/udf/namei.c | 16 +- > fs/udf/super.c | 38 ++-- > fs/udf/udfdecl.h | 21 +- > fs/udf/unicode.c | 620 ++++++++++++++++++++++--------------------------------- > 4 files changed, 281 insertions(+), 414 deletions(-) > > -- > 2.1.0 > > -- Jan Kara SUSE Labs, CR