All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V3] hfsplus: fixes worst-case unicode to char conversion of file names and attributes
@ 2014-04-09 21:53 Hin-Tak Leung
  2014-04-10  7:30 ` Vyacheslav Dubeyko
  0 siblings, 1 reply; 5+ messages in thread
From: Hin-Tak Leung @ 2014-04-09 21:53 UTC (permalink / raw)
  To: linux-fsdevel, Andrew Morton
  Cc: Anton Altaparmakov, Hin-Tak Leung, Vyacheslav Dubeyko, Al Viro,
	Christoph Hellwig

From: Hin-Tak Leung <htl10@users.sourceforge.net>

The HFS Plus Volume Format specification (TN1150) states that
file names are stored internally as a maximum of 255 unicode
characters, as defined by The Unicode Standard, Version 2.0
[Unicode, Inc. ISBN 0-201-48345-9]. File names are converted by
the NLS system on Linux before presented to the user.

255 CJK characters converts to UTF-8 with 1 unicode character
to up to 3 bytes, and to GB18030 with 1 unicode character to
up to 4 bytes. Thus, trying in a UTF-8 locale to list files
with names of more than 85 CJK characters results in:

    $ ls /mnt
    ls: reading directory /mnt: File name too long

The receiving buffer to hfsplus_uni2asc() needs to be
255 x NLS_MAX_CHARSET_SIZE bytes, not 255 bytes as the code
has always been.

Similar consideration applies to attributes, which are stored
internally as a maximum of 127 UTF-16BE units. See XNU source for
an up-to-date reference on attributes.

Strictly speaking, the maximum value of NLS_MAX_CHARSET_SIZE = 6
is not attainable in the case of conversion to UTF-8, as going
beyond 3 bytes requires the use of surrogate pairs, i.e. consuming
two input units.

Thanks Anton Altaparmakov for reviewing an earlier version of
this change.

This patch fixes all callers of hfsplus_uni2asc(), and also enables
the use of long non-English file names in HFS+. The getting and
setting, and general usage of long non-English attributes
requires further forthcoming work.

Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
CC: Vyacheslav Dubeyko <slava@dubeyko.com>
CC: Al Viro <viro@zeniv.linux.org.uk>
CC: Christoph Hellwig <hch@infradead.org>
---
 fs/hfsplus/dir.c   | 11 +++++++++--
 fs/hfsplus/xattr.c | 14 +++++++++++---
 2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/fs/hfsplus/dir.c b/fs/hfsplus/dir.c
index bdec665..fb07d26 100644
--- a/fs/hfsplus/dir.c
+++ b/fs/hfsplus/dir.c
@@ -12,6 +12,7 @@
 #include <linux/fs.h>
 #include <linux/slab.h>
 #include <linux/random.h>
+#include <linux/nls.h>
 
 #include "hfsplus_fs.h"
 #include "hfsplus_raw.h"
@@ -127,7 +128,7 @@ static int hfsplus_readdir(struct file *file, struct dir_context *ctx)
 	struct inode *inode = file_inode(file);
 	struct super_block *sb = inode->i_sb;
 	int len, err;
-	char strbuf[HFSPLUS_MAX_STRLEN + 1];
+	char *strbuf;
 	hfsplus_cat_entry entry;
 	struct hfs_find_data fd;
 	struct hfsplus_readdir_data *rd;
@@ -139,6 +140,11 @@ static int hfsplus_readdir(struct file *file, struct dir_context *ctx)
 	err = hfs_find_init(HFSPLUS_SB(sb)->cat_tree, &fd);
 	if (err)
 		return err;
+	strbuf = kmalloc(NLS_MAX_CHARSET_SIZE * HFSPLUS_MAX_STRLEN + 1, GFP_KERNEL);
+	if (!strbuf) {
+		err = -ENOMEM;
+		goto out;
+	}
 	hfsplus_cat_build_key(sb, fd.search_key, inode->i_ino, NULL);
 	err = hfs_brec_find(&fd, hfs_find_rec_by_key);
 	if (err)
@@ -193,7 +199,7 @@ static int hfsplus_readdir(struct file *file, struct dir_context *ctx)
 		hfs_bnode_read(fd.bnode, &entry, fd.entryoffset,
 			fd.entrylength);
 		type = be16_to_cpu(entry.type);
-		len = HFSPLUS_MAX_STRLEN;
+		len = NLS_MAX_CHARSET_SIZE * HFSPLUS_MAX_STRLEN;
 		err = hfsplus_uni2asc(sb, &fd.key->cat.name, strbuf, &len);
 		if (err)
 			goto out;
@@ -246,6 +252,7 @@ next:
 	}
 	memcpy(&rd->key, fd.key, sizeof(struct hfsplus_cat_key));
 out:
+	kfree(strbuf);
 	hfs_find_exit(&fd);
 	return err;
 }
diff --git a/fs/hfsplus/xattr.c b/fs/hfsplus/xattr.c
index 4e27edc..3034ce6 100644
--- a/fs/hfsplus/xattr.c
+++ b/fs/hfsplus/xattr.c
@@ -8,6 +8,7 @@
 
 #include "hfsplus_fs.h"
 #include <linux/posix_acl_xattr.h>
+#include <linux/nls.h>
 #include "xattr.h"
 #include "acl.h"
 
@@ -645,8 +646,7 @@ ssize_t hfsplus_listxattr(struct dentry *dentry, char *buffer, size_t size)
 	struct hfs_find_data fd;
 	u16 key_len = 0;
 	struct hfsplus_attr_key attr_key;
-	char strbuf[HFSPLUS_ATTR_MAX_STRLEN +
-			XATTR_MAC_OSX_PREFIX_LEN + 1] = {0};
+	char *strbuf;
 	int xattr_name_len;
 
 	if ((!S_ISREG(inode->i_mode) &&
@@ -666,6 +666,13 @@ ssize_t hfsplus_listxattr(struct dentry *dentry, char *buffer, size_t size)
 		return err;
 	}
 
+	strbuf = kzalloc(NLS_MAX_CHARSET_SIZE * HFSPLUS_ATTR_MAX_STRLEN +
+			XATTR_MAC_OSX_PREFIX_LEN + 1, GFP_KERNEL);
+	if (!strbuf) {
+		res = -ENOMEM;
+		goto out;
+	}
+
 	err = hfsplus_find_attr(inode->i_sb, inode->i_ino, NULL, &fd);
 	if (err) {
 		if (err == -ENOENT) {
@@ -692,7 +699,7 @@ ssize_t hfsplus_listxattr(struct dentry *dentry, char *buffer, size_t size)
 		if (be32_to_cpu(attr_key.cnid) != inode->i_ino)
 			goto end_listxattr;
 
-		xattr_name_len = HFSPLUS_ATTR_MAX_STRLEN;
+		xattr_name_len = NLS_MAX_CHARSET_SIZE * HFSPLUS_ATTR_MAX_STRLEN;
 		if (hfsplus_uni2asc(inode->i_sb,
 			(const struct hfsplus_unistr *)&fd.key->attr.key_name,
 					strbuf, &xattr_name_len)) {
@@ -718,6 +725,7 @@ ssize_t hfsplus_listxattr(struct dentry *dentry, char *buffer, size_t size)
 	}
 
 end_listxattr:
+	kfree(strbuf);
 	hfs_find_exit(&fd);
 	return res;
 }
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH V3] hfsplus: fixes worst-case unicode to char conversion of file names and attributes
  2014-04-09 21:53 [PATCH V3] hfsplus: fixes worst-case unicode to char conversion of file names and attributes Hin-Tak Leung
@ 2014-04-10  7:30 ` Vyacheslav Dubeyko
  2014-04-10  9:33   ` Anton Altaparmakov
  0 siblings, 1 reply; 5+ messages in thread
From: Vyacheslav Dubeyko @ 2014-04-10  7:30 UTC (permalink / raw)
  To: Hin-Tak Leung
  Cc: linux-fsdevel, Andrew Morton, Anton Altaparmakov, Hin-Tak Leung,
	Al Viro, Christoph Hellwig

On Wed, 2014-04-09 at 22:53 +0100, Hin-Tak Leung wrote:

[snip]
> @@ -127,7 +128,7 @@ static int hfsplus_readdir(struct file *file, struct dir_context *ctx)
>  	struct inode *inode = file_inode(file);
>  	struct super_block *sb = inode->i_sb;
>  	int len, err;
> -	char strbuf[HFSPLUS_MAX_STRLEN + 1];
> +	char *strbuf;
>  	hfsplus_cat_entry entry;
>  	struct hfs_find_data fd;
>  	struct hfsplus_readdir_data *rd;
> @@ -139,6 +140,11 @@ static int hfsplus_readdir(struct file *file, struct dir_context *ctx)
>  	err = hfs_find_init(HFSPLUS_SB(sb)->cat_tree, &fd);
>  	if (err)
>  		return err;
> +	strbuf = kmalloc(NLS_MAX_CHARSET_SIZE * HFSPLUS_MAX_STRLEN + 1, GFP_KERNEL);

I think that if you use kzalloc for second case then it makes sense to
use kzalloc and here. Anyway, kzalloc can be much more safe way, from my
viewpoint.

[snip]

> diff --git a/fs/hfsplus/xattr.c b/fs/hfsplus/xattr.c
> index 4e27edc..3034ce6 100644
> --- a/fs/hfsplus/xattr.c
> +++ b/fs/hfsplus/xattr.c
> @@ -8,6 +8,7 @@
>  

>  
> +	strbuf = kzalloc(NLS_MAX_CHARSET_SIZE * HFSPLUS_ATTR_MAX_STRLEN +
> +			XATTR_MAC_OSX_PREFIX_LEN + 1, GFP_KERNEL);
> +	if (!strbuf) {
> +		res = -ENOMEM;
> +		goto out;
> +	}
> +

Thanks,
Vyacheslav Dubeyko.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH V3] hfsplus: fixes worst-case unicode to char conversion of file names and attributes
  2014-04-10  7:30 ` Vyacheslav Dubeyko
@ 2014-04-10  9:33   ` Anton Altaparmakov
  0 siblings, 0 replies; 5+ messages in thread
From: Anton Altaparmakov @ 2014-04-10  9:33 UTC (permalink / raw)
  To: Vyacheslav Dubeyko
  Cc: Hin-Tak Leung, linux-fsdevel, Andrew Morton, Hin-Tak Leung,
	Al Viro, Christoph Hellwig

Hi,

On 10 Apr 2014, at 08:30, Vyacheslav Dubeyko <slava@dubeyko.com> wrote:
> On Wed, 2014-04-09 at 22:53 +0100, Hin-Tak Leung wrote:
> [snip]
>> @@ -127,7 +128,7 @@ static int hfsplus_readdir(struct file *file, struct dir_context *ctx)
>> 	struct inode *inode = file_inode(file);
>> 	struct super_block *sb = inode->i_sb;
>> 	int len, err;
>> -	char strbuf[HFSPLUS_MAX_STRLEN + 1];
>> +	char *strbuf;
>> 	hfsplus_cat_entry entry;
>> 	struct hfs_find_data fd;
>> 	struct hfsplus_readdir_data *rd;
>> @@ -139,6 +140,11 @@ static int hfsplus_readdir(struct file *file, struct dir_context *ctx)
>> 	err = hfs_find_init(HFSPLUS_SB(sb)->cat_tree, &fd);
>> 	if (err)
>> 		return err;
>> +	strbuf = kmalloc(NLS_MAX_CHARSET_SIZE * HFSPLUS_MAX_STRLEN + 1, GFP_KERNEL);
> 
> I think that if you use kzalloc for second case then it makes sense to
> use kzalloc and here. Anyway, kzalloc can be much more safe way, from my
> viewpoint.

Why are you wanting to throw away CPU time on zeroing buffers when it is not needed?  And anyway, what about the fact that in readdir the buffer is used several times in a loop thus after the first use it will no longer be zeroed!  So you are now saying the buffer should be zeroed out between each use, too?  Honestly, if your programming abilities cannot cope with non-zeroed buffers you should not be touching kernel code!!!

If anything the xattr name buffer does not need kzalloc() and should just have the first byte set to zero instead which is much more CPU effective (if that is even needed - depends on the use case of the buffer).  1 byte vs high-hundreds of bytes being set can make a huge performance difference on embedded with slow RAM...  But at least there is a justification why you might want a zeroed buffer where there is none at all for readdir case where the buffer keeps getting reused without reinitialization thus it by definition must cope with a non-initialized buffer already.

Best regards,

	Anton

> [snip]
> 
>> diff --git a/fs/hfsplus/xattr.c b/fs/hfsplus/xattr.c
>> index 4e27edc..3034ce6 100644
>> --- a/fs/hfsplus/xattr.c
>> +++ b/fs/hfsplus/xattr.c
>> @@ -8,6 +8,7 @@
>> 
> 
>> 
>> +	strbuf = kzalloc(NLS_MAX_CHARSET_SIZE * HFSPLUS_ATTR_MAX_STRLEN +
>> +			XATTR_MAC_OSX_PREFIX_LEN + 1, GFP_KERNEL);
>> +	if (!strbuf) {
>> +		res = -ENOMEM;
>> +		goto out;
>> +	}
>> +
> 
> Thanks,
> Vyacheslav Dubeyko.

-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge
J.J. Thomson Avenue, Cambridge, CB3 0RB, UK


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH V3] hfsplus: fixes worst-case unicode to char conversion of file names and attributes
@ 2014-04-10 23:32 Hin-Tak Leung
  0 siblings, 0 replies; 5+ messages in thread
From: Hin-Tak Leung @ 2014-04-10 23:32 UTC (permalink / raw)
  To: akpm; +Cc: linux-fsdevel, aia21, viro, hch, slava

Hi Andrew, (and others)

I think I want to pull back this, and submit a V4 later.

After looking at the code carefully, and continue on fixing the rest
of the issues with setting/getting non-English attributes, I think there is
no need to do kzalloc(), and kmalloc is sufficient - the buffer is
always used with its length.

The original stack allocation, doing null termination on init (and other similiar uses
elsewhere in the extended attribute code) is too conservative.
Many of the uses overwrites it with a static string, so the null termination on 
init doesn't do anything useful.

It looks a bit obvious that's the case when switching to dynamic allocation - and
allocate it as late as possible, just before use.

Hin-Tak

------------------------------
On Thu, Apr 10, 2014 2:39 PM BST Hin-Tak Leung wrote:

>------------------------------
>On Thu, Apr 10, 2014 8:30 AM BST Vyacheslav Dubeyko wrote:
>
>>On Wed, 2014-04-09 at 22:53 +0100, Hin-Tak Leung wrote:
>>
>>[snip]
>> @@ -127,7 +128,7 @@ static int hfsplus_readdir(struct file *file, struct dir_context *ctx)
>>      struct inode *inode = file_inode(file);
>>      struct super_block *sb = inode->i_sb;
>>      int len, err;
>> -    char strbuf[HFSPLUS_MAX_STRLEN + 1];
>> +    char *strbuf;
>>      hfsplus_cat_entry entry;
>>      struct hfs_find_data fd;
>>      struct hfsplus_readdir_data *rd;
>> @@ -139,6 +140,11 @@ static int hfsplus_readdir(struct file *file, struct dir_context *ctx)
>>      err = hfs_find_init(HFSPLUS_SB(sb)->cat_tree, &fd);
>>      if (err)
>>          return err;
>> +    strbuf = kmalloc(NLS_MAX_CHARSET_SIZE * HFSPLUS_MAX_STRLEN + 1, GFP_KERNEL);
>>
>>I think that if you use kzalloc for second case then it makes sense to
>>use kzalloc and here. Anyway, kzalloc can be much more safe way, from my
>>viewpoint.
>>
>
>The difference is simply to avoid unforseeable regression - the stack buffer for names
>wasn't null terminated before the switch to dynamic allocation, therefore
>it is switch to plain kmalloc; but the xattr one is null-terminated at initialization, so
>I used kzalloc when switching, just to be on the safe side. It is really just about
>"changing like to like", avoiding unforseeable regression.
>
>I can also see in the latter case, it is being used subsequently in strcpy,
>so null-terminate is probably important. Whereas the name is being passed
>around with its length.
>
>I think Anton has a point there - it might be more efficient to do kmalloc, and if
>necessary, a single assignment to the first byte only, rather than calling kzalloc
>(which we are doing, BTW, 127 x 6 ~ over 750 bytes). I am hoping kzalloc
>is cleverer and does say, blocks of 4 bytes or even 64 at one go, which won't suck too
>much, compared to just one byte assignment. I wish there is an intermediate
>between kmalloc and kzalloc where it zero's only the first byte.
>
>I hope this answer the question. Perhaps kzalloc is not needed, but
>since the code started with "strbuf[size] = {0};", it is safer to switch to kzalloc
>without making the change too ugly.
>
>I mean "safer" in the sense of "without causing
>unforseeable regressions", *not* safer in the general philosophical sense of 
>"zero'ed buffers always better than non-zero'ed".
>
>Hin-Tak
>
>>[snip]
>>
>> diff --git a/fs/hfsplus/xattr.c b/fs/hfsplus/xattr.c
>> index 4e27edc..3034ce6 100644
>> --- a/fs/hfsplus/xattr.c
>> +++ b/fs/hfsplus/xattr.c
>> @@ -8,6 +8,7 @@
>>  
>>
>>  
>> +    strbuf = kzalloc(NLS_MAX_CHARSET_SIZE * HFSPLUS_ATTR_MAX_STRLEN +
>> +            XATTR_MAC_OSX_PREFIX_LEN + 1, GFP_KERNEL);
>> +    if (!strbuf) {
>> +        res = -ENOMEM;
>> +        goto out;
>> +    }
>> +
>>
>>Thanks,
>>Vyacheslav Dubeyko.
>>
>>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH V3] hfsplus: fixes worst-case unicode to char conversion of file names and attributes
@ 2014-04-10 13:39 Hin-Tak Leung
  0 siblings, 0 replies; 5+ messages in thread
From: Hin-Tak Leung @ 2014-04-10 13:39 UTC (permalink / raw)
  To: slava; +Cc: linux-fsdevel, akpm, aia21, viro, hch

------------------------------
On Thu, Apr 10, 2014 8:30 AM BST Vyacheslav Dubeyko wrote:

>On Wed, 2014-04-09 at 22:53 +0100, Hin-Tak Leung wrote:
>
>[snip]
>> @@ -127,7 +128,7 @@ static int hfsplus_readdir(struct file *file, struct dir_context *ctx)
>>      struct inode *inode = file_inode(file);
>>      struct super_block *sb = inode->i_sb;
>>      int len, err;
>> -    char strbuf[HFSPLUS_MAX_STRLEN + 1];
>> +    char *strbuf;
>>      hfsplus_cat_entry entry;
>>      struct hfs_find_data fd;
>>      struct hfsplus_readdir_data *rd;
>> @@ -139,6 +140,11 @@ static int hfsplus_readdir(struct file *file, struct dir_context *ctx)
>>      err = hfs_find_init(HFSPLUS_SB(sb)->cat_tree, &fd);
>>      if (err)
>>          return err;
>> +    strbuf = kmalloc(NLS_MAX_CHARSET_SIZE * HFSPLUS_MAX_STRLEN + 1, GFP_KERNEL);
>
>I think that if you use kzalloc for second case then it makes sense to
>use kzalloc and here. Anyway, kzalloc can be much more safe way, from my
>viewpoint.
>

The difference is simply to avoid unforseeable regression - the stack buffer for names
wasn't null terminated before the switch to dynamic allocation, therefore
it is switch to plain kmalloc; but the xattr one is null-terminated at initialization, so
I used kzalloc when switching, just to be on the safe side. It is really just about
"changing like to like", avoiding unforseeable regression.

I can also see in the latter case, it is being used subsequently in strcpy,
so null-terminate is probably important. Whereas the name is being passed
around with its length.

I think Anton has a point there - it might be more efficient to do kmalloc, and if
necessary, a single assignment to the first byte only, rather than calling kzalloc
(which we are doing, BTW, 127 x 6 ~ over 750 bytes). I am hoping kzalloc
is cleverer and does say, blocks of 4 bytes or even 64 at one go, which won't suck too
much, compared to just one byte assignment. I wish there is an intermediate
between kmalloc and kzalloc where it zero's only the first byte.

I hope this answer the question. Perhaps kzalloc is not needed, but
since the code started with "strbuf[size] = {0};", it is safer to switch to kzalloc
without making the change too ugly.

I mean "safer" in the sense of "without causing
unforseeable regressions", *not* safer in the general philosophical sense of 
"zero'ed buffers always better than non-zero'ed".

Hin-Tak

>[snip]
>
>> diff --git a/fs/hfsplus/xattr.c b/fs/hfsplus/xattr.c
>> index 4e27edc..3034ce6 100644
>> --- a/fs/hfsplus/xattr.c
>> +++ b/fs/hfsplus/xattr.c
>> @@ -8,6 +8,7 @@
>>  
>
>>  
>> +    strbuf = kzalloc(NLS_MAX_CHARSET_SIZE * HFSPLUS_ATTR_MAX_STRLEN +
>> +            XATTR_MAC_OSX_PREFIX_LEN + 1, GFP_KERNEL);
>> +    if (!strbuf) {
>> +        res = -ENOMEM;
>> +        goto out;
>> +    }
>> +
>
>Thanks,
>Vyacheslav Dubeyko.
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-04-10 23:32 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-09 21:53 [PATCH V3] hfsplus: fixes worst-case unicode to char conversion of file names and attributes Hin-Tak Leung
2014-04-10  7:30 ` Vyacheslav Dubeyko
2014-04-10  9:33   ` Anton Altaparmakov
2014-04-10 13:39 Hin-Tak Leung
2014-04-10 23:32 Hin-Tak Leung

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.