All of lore.kernel.org
 help / color / mirror / Atom feed
From: Anton Altaparmakov <aia21@cam.ac.uk>
To: Hin-Tak Leung <hintak.leung@gmail.com>
Cc: linux-fsdevel@vger.kernel.org,
	Hin-Tak Leung <htl10@users.sourceforge.net>,
	Vyacheslav Dubeyko <slava@dubeyko.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH] hfsplus: fixes worst-case unicode to char conversion of file names
Date: Fri, 4 Apr 2014 22:24:57 +0100	[thread overview]
Message-ID: <9E4A22A4-627B-4AE4-AC90-87E6D2D6CEC9@cam.ac.uk> (raw)
In-Reply-To: <1396640768-28353-1-git-send-email-HinTak.Leung@gmail.com>

Hi,

On 4 Apr 2014, at 20:46, Hin-Tak Leung <hintak.leung@gmail.com> wrote:
> From: Hin-Tak Leung <htl10@users.sourceforge.net>
> 
> The HFS Plus Volume Format specification (TN1150) states that
> file names are stored internally as a maximum of 255 unicode
> characters, as defined by The Unicode Standard, Version 2.0
> [Unicode, Inc. ISBN 0-201-48345-9]. File names are converted by
> the NLS system on Linux before presented to the user.
> 
> Though it is rare, the worst-case is 255 CJK characters converting
> to UTF-8 with 1 unicode character to 3 bytes. Surrogate pairs are
> no worse. The receiver buffer needs to be 255 x 3 bytes,
> not 255 bytes as the code has always been.

You are correct that that buffer is too small.  However:

1) The correct size for the buffer is NLS_MAX_CHARSET_SIZE * HFSPLUS_MAX_STRLEN + 1 and not using a magic constant "3" (which is actually not big enough in case the string is storing UTF-16 rather than UCS-2 Unicode which I have observed happen on NTFS written to by asian versions of Windows but I see no reason why it could not happen on OS X, too, especially on a HFS+ volume that has been written to by a Windows HFS+ driver - even if native OS X driver would not normally do it - I have not looked at it I admit).  That reliable source of information Wikipedia suggests Mac OS X also uses UTF-16 as of OS X 10.3 at least in userspace so chances are it either also uses it in the kernel or if not yet it might well do in future:

	http://en.wikipedia.org/wiki/Comparison_of_Unicode_encodings

2) You are now allocating a huge buffer on the stack.  This is not a good thing to do in the kernel (think 4k stack kernel config - that single variable is consuming about a quarter of available stack).  You need to allocate the buffer dynamically.  As you only need to do the allocation on entry to hfsplus_readdir() and deallocate it on exit it is not a problem as it could be if you had to allocate/free for every filename.

Best regards,

	Anton

> Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
> CC: Vyacheslav Dubeyko <slava@dubeyko.com>
> CC: Al Viro <viro@zeniv.linux.org.uk>
> CC: Christoph Hellwig <hch@infradead.org>
> ---
> fs/hfsplus/dir.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/hfsplus/dir.c b/fs/hfsplus/dir.c
> index bdec665..381c668 100644
> --- a/fs/hfsplus/dir.c
> +++ b/fs/hfsplus/dir.c
> @@ -127,7 +127,7 @@ static int hfsplus_readdir(struct file *file, struct dir_context *ctx)
> 	struct inode *inode = file_inode(file);
> 	struct super_block *sb = inode->i_sb;
> 	int len, err;
> -	char strbuf[HFSPLUS_MAX_STRLEN + 1];
> +	char strbuf[3 * HFSPLUS_MAX_STRLEN + 1];
> 	hfsplus_cat_entry entry;
> 	struct hfs_find_data fd;
> 	struct hfsplus_readdir_data *rd;
> @@ -193,7 +193,7 @@ static int hfsplus_readdir(struct file *file, struct dir_context *ctx)
> 		hfs_bnode_read(fd.bnode, &entry, fd.entryoffset,
> 			fd.entrylength);
> 		type = be16_to_cpu(entry.type);
> -		len = HFSPLUS_MAX_STRLEN;
> +		len = 3 * HFSPLUS_MAX_STRLEN;
> 		err = hfsplus_uni2asc(sb, &fd.key->cat.name, strbuf, &len);
> 		if (err)
> 			goto out;
> -- 
> 1.9.0

-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge
J.J. Thomson Avenue, Cambridge, CB3 0RB, UK


  reply	other threads:[~2014-04-04 21:43 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-04 19:46 [PATCH] hfsplus: fixes worst-case unicode to char conversion of file names Hin-Tak Leung
2014-04-04 21:24 ` Anton Altaparmakov [this message]
2014-04-04 22:11 Hin-Tak Leung
2014-04-05  9:07 ` Vyacheslav Dubeyko
2014-04-05 20:37 ` Anton Altaparmakov
2014-04-05 22:09 Hin-Tak Leung

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9E4A22A4-627B-4AE4-AC90-87E6D2D6CEC9@cam.ac.uk \
    --to=aia21@cam.ac.uk \
    --cc=hch@infradead.org \
    --cc=hintak.leung@gmail.com \
    --cc=htl10@users.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=slava@dubeyko.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.