linux-kernel.vger.kernel.org archive mirror
* [PATCH] smbfs codepage fixes for 2.4.18
@ 2002-03-01 23:41 Urban Widmark
  2002-03-02  7:38 ` Christian Bornträger
  0 siblings, 1 reply; 3+ messages in thread
From: Urban Widmark @ 2002-03-01 23:41 UTC (permalink / raw)
  To: Cyrille Chepelov, Christian Bornträger, linux-kernel; +Cc: Alexander Viro

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2213 bytes --]


Ok, I think I've got something regarding the 2.4.18 oopses. (oopsen?)

There are two errors in the changes to smbfs in 2.4.18-rc3 (and 2.5.5):
1. SMB_MAXNAMELEN (max length of a single path component) was used where
   SMB_MAXPATHLEN (max total path length) should have been used.
2. The charset conversion routine was modified to return errors as
   negative values but not all callers were changed to handle this. When an
   "illegal" character was hit the length of the string was set to 
   0xffffffff and when computing the hash value it read outside the kernel 
   memory.
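
A minimal userspace sketch of error 2, not the kernel code itself. The
struct name_sketch, convert_sketch() and both decode_*() helpers are
made-up names for illustration; the only point is what happens when a
negative int return is stored in an unsigned length, and how checking
the result first avoids it. The sketch converter returns -1 so that the
stored length matches the 0xffffffff mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the name struct; only the unsigned
 * length field matters here. */
struct name_sketch {
	const char *name;
	unsigned int len;
};

/* Hypothetical converter: copies plain ASCII and returns a negative
 * value as soon as it meets a byte it cannot convert. */
static int convert_sketch(char *out, int olen, const char *in, int ilen)
{
	int i;

	assert(out != NULL && in != NULL);
	if (ilen > olen)
		return -1;
	for (i = 0; i < ilen; i++) {
		if ((unsigned char)in[i] >= 0x80)
			return -1;	/* "illegal" character */
		out[i] = in[i];
	}
	return ilen;
}

/* Unchecked caller: the int result goes straight into the unsigned
 * length, so -1 turns into a huge value. */
static void decode_unchecked(struct name_sketch *q, char *buf, int buflen,
			     const char *in, int ilen)
{
	q->len = convert_sketch(buf, buflen, in, ilen);
	q->name = buf;
}

/* Checked caller, same shape as the patch: start from len 0 and only
 * accept a positive result. */
static void decode_checked(struct name_sketch *q, char *buf, int buflen,
			   const char *in, int ilen)
{
	int n;

	q->len = 0;
	n = convert_sketch(buf, buflen, in, ilen);
	if (n > 0) {
		q->len = n;
		q->name = buf;
	}
}
```

With the checked variant a caller can treat len == 0 as "conversion
failed" and skip the entry, which is what the readdir loops in the
attached patch do.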

Attached is a patch vs 2.4.18 that fixes these issues for me. Please test
and let me know.

If I select a codepage/charset combination that doesn't match I now get a
somewhat cryptic message instead of an oops (just a temporary thing).
    "smbfs: filename charset conversion failed"

The file is then hidden, which is bad. Conversion errors should map to '?'
as they used to or do some translation into ":####" strings. I'll do
something about that.
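
One possible shape for the ":####" idea, as a rough sketch only. The
function name escape_unconvertible() and the choice of four lowercase
hex digits are assumptions for illustration, not what smbfs does or
will do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write a 16-bit character value that could not
 * be converted as ':' followed by four hex digits. Returns the number
 * of bytes written (always 5) or -1 if the output buffer is too small. */
static int escape_unconvertible(char *out, int olen, unsigned int ch)
{
	static const char hex[] = "0123456789abcdef";

	assert(out != NULL);
	if (olen < 5)
		return -1;
	out[0] = ':';
	out[1] = hex[(ch >> 12) & 0xf];
	out[2] = hex[(ch >> 8) & 0xf];
	out[3] = hex[(ch >> 4) & 0xf];
	out[4] = hex[ch & 0xf];
	return 5;
}
```

A conversion loop could call this instead of bailing out, so the file
stays visible under a name that shows which character failed.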


Some comments on what some of you have been doing:

The smbfs remote codepage can never be utf8 since there are no smb servers
that talk utf8. It can be one of the dos codepages, it can be blank or
with additional patches it can be a 2 byte little endian unicode format.

Furthermore, the local charset must be one that matches the chars used in
the remote set. Otherwise you get conversion errors. A few known good
combinations are:

cp850 <-> iso8859-1
cp866 <-> koi8-r
cp932 <-> euc-jp
(the right is the local = linux side)

See also the smb.conf manpage.

But even with these it seems to be possible to create chars that do not
match, and I think it is caused by windows trying to map unicode to a
codepage and not finding a matching char to use.

Local utf8 always matches the remote and is preferred if your system is
set up to handle it.

I would explain the reported
    smb_proc_readdir_long: name=<directory> result=-2, rcls=1, err=2
as a name conversion problem. If the conversion failed one way it used to
be truncated and would then fail when sent back to the server. The error
is ERRDOS - ERRbadfile (File not found).

Check the config and the nls maps used.

/Urban

[-- Attachment #2: Type: TEXT/PLAIN, Size: 4670 bytes --]

diff -urN -X exclude linux-2.4.18-orig/fs/smbfs/cache.c linux-2.4.18-smbfs/fs/smbfs/cache.c
--- linux-2.4.18-orig/fs/smbfs/cache.c	Sat Jan 12 16:55:58 2002
+++ linux-2.4.18-smbfs/fs/smbfs/cache.c	Fri Mar  1 21:46:36 2002
@@ -84,7 +84,7 @@
 	struct list_head *next;
 
 	if (d_validate(dent, parent)) {
-		if (dent->d_name.len <= SMB_MAXPATHLEN &&
+		if (dent->d_name.len <= SMB_MAXNAMELEN &&
 		    (unsigned long)dent->d_fsdata == fpos) {
 			if (!dent->d_inode) {
 				dput(dent);
diff -urN -X exclude linux-2.4.18-orig/fs/smbfs/proc.c linux-2.4.18-smbfs/fs/smbfs/proc.c
--- linux-2.4.18-orig/fs/smbfs/proc.c	Fri Mar  1 20:23:38 2002
+++ linux-2.4.18-smbfs/fs/smbfs/proc.c	Sat Mar  2 00:04:22 2002
@@ -119,11 +119,6 @@
 	int n;
 	wchar_t ch;
 
-	if (!nls_from || !nls_to) {
-		PARANOIA("nls_from=%p, nls_to=%p\n", nls_from, nls_to);
-		return convert_memcpy(output, olen, input, ilen, NULL, NULL);
-	}
-
 	while (ilen > 0) {
 		/* convert by changing to unicode and back to the new cp */
 		n = nls_from->char2uni((unsigned char *)input, ilen, &ch);
@@ -141,6 +136,10 @@
 		len += n;
 	}
 	return len;
+
+	/* FIXME: these error returns will simply make the files disappear
+	   if there is a codepage error. uni_xlate? Or treat different errors
+	   differently? */
 fail:
 	return n;
 }
@@ -226,8 +225,8 @@
 	if (maxlen < 2)
 		return -ENAMETOOLONG;
 
-	if (maxlen > SMB_MAXNAMELEN + 1)
-		maxlen = SMB_MAXNAMELEN + 1;
+	if (maxlen > SMB_MAXPATHLEN + 1)
+		maxlen = SMB_MAXPATHLEN + 1;
 
 	if (entry == NULL)
 		goto test_name_and_out;
@@ -1579,12 +1578,16 @@
 	}
 #endif
 
-	qname->len = server->convert(server->name_buf, SMB_MAXNAMELEN,
-				     qname->name, len,
-				     server->remote_nls, server->local_nls);
-	qname->name = server->name_buf;
+	qname->len = 0;
+	len = server->convert(server->name_buf, SMB_MAXNAMELEN,
+			    qname->name, len,
+			    server->remote_nls, server->local_nls);
+	if (len > 0) {
+		qname->len = len;
+		qname->name = server->name_buf;
+		DEBUG1("len=%d, name=%.*s\n",qname->len,qname->len,qname->name);
+	}
 
-	DEBUG1("len=%d, name=%.*s\n", qname->len, qname->len, qname->name);
 	return p + 22;
 }
 
@@ -1700,6 +1703,10 @@
 		for (i = 0; i < count; i++) {
 			p = smb_decode_short_dirent(server, p, 
 						    &qname, &fattr);
+			if (qname.len == 0) {
+				printk(KERN_ERR "smbfs: filename charset conversion failed\n");
+				continue;
+			}
 
 			if (entries_seen == 2 && qname.name[0] == '.') {
 				if (qname.len == 1)
@@ -1737,6 +1744,7 @@
 {
 	char *result;
 	unsigned int len = 0;
+	int n;
 	__u16 date, time;
 
 	/*
@@ -1812,10 +1820,14 @@
 	}
 #endif
 
-	qname->len = server->convert(server->name_buf, SMB_MAXNAMELEN,
-				     qname->name, len,
-				     server->remote_nls, server->local_nls);
-	qname->name = server->name_buf;
+	qname->len = 0;
+	n = server->convert(server->name_buf, SMB_MAXNAMELEN,
+			    qname->name, len,
+			    server->remote_nls, server->local_nls);
+	if (n > 0) {
+		qname->len = n;
+		qname->name = server->name_buf;
+	}
 
 out:
 	return result;
@@ -1881,7 +1893,7 @@
 	 */
 	mask = param + 12;
 
-	mask_len = smb_encode_path(server, mask, SMB_MAXNAMELEN+1, dir, &star);
+	mask_len = smb_encode_path(server, mask, SMB_MAXPATHLEN+1, dir, &star);
 	if (mask_len < 0) {
 		result = mask_len;
 		goto unlock_return;
@@ -2030,6 +2042,10 @@
 
 			p = smb_decode_long_dirent(server, p, info_level,
 						   &qname, &fattr);
+			if (qname.len == 0) {
+				printk(KERN_ERR "smbfs: filename charset conversion failed\n");
+				continue;
+			}
 
 			/* ignore . and .. from the server */
 			if (entries_seen == 2 && qname.name[0] == '.') {
@@ -2088,7 +2104,7 @@
 	int mask_len, result;
 
 retry:
-	mask_len = smb_encode_path(server, mask, SMB_MAXNAMELEN+1, dentry, NULL);
+	mask_len = smb_encode_path(server, mask, SMB_MAXPATHLEN+1, dentry, NULL);
 	if (mask_len < 0) {
 		result = mask_len;
 		goto out;
@@ -2214,7 +2230,7 @@
       retry:
 	WSET(param, 0, 1);	/* Info level SMB_INFO_STANDARD */
 	DSET(param, 2, 0);
-	result = smb_encode_path(server, param+6, SMB_MAXNAMELEN+1, dir, NULL);
+	result = smb_encode_path(server, param+6, SMB_MAXPATHLEN+1, dir, NULL);
 	if (result < 0)
 		goto out;
 	p = param + 6 + result;
@@ -2464,7 +2480,7 @@
       retry:
 	WSET(param, 0, 1);	/* Info level SMB_INFO_STANDARD */
 	DSET(param, 2, 0);
-	result = smb_encode_path(server, param+6, SMB_MAXNAMELEN+1, dir, NULL);
+	result = smb_encode_path(server, param+6, SMB_MAXPATHLEN+1, dir, NULL);
 	if (result < 0)
 		goto out;
 	p = param + 6 + result;


* Re: [PATCH] smbfs codepage fixes for 2.4.18
  2002-03-01 23:41 [PATCH] smbfs codepage fixes for 2.4.18 Urban Widmark
@ 2002-03-02  7:38 ` Christian Bornträger
  2002-03-03 13:21   ` Urban Widmark
  0 siblings, 1 reply; 3+ messages in thread
From: Christian Bornträger @ 2002-03-02  7:38 UTC (permalink / raw)
  To: Urban Widmark, Cyrille Chepelov, linux-kernel; +Cc: Alexander Viro

Urban Widmark wrote:
> Attached is a patch vs 2.4.18 that fixes these issues for me. Please test
> and let me know.

There is no OOPS, which is good.

> If I select a codepage/charset combination that doesn't match I now get a
> somewhat cryptic message instead of an oops (just a temporary thing).
>     "smbfs: filename charset conversion failed"

I see a lot of them.

my smb.conf:
character set = ISO8859-1
client code page = 850


But I think that my local code page is actually 8859-15 (I have Euro support,
so it has to be 15).
Is that a problem? AFAIK the only difference between 1 and 15 is the
Euro sign.


> The smbfs remote codepage can never be utf8 since there are no smb servers
> that talk utf8. It can be one of the dos codepages, it can be blank or
> with additional patches it can be a 2 byte little endian unicode format.
>
> Furthermore, the local charset must be one that matches the chars used in
> the remote set. Otherwise you get conversion errors. A few known good
> combinations are:
>
> cp850 <-> iso8859-1
> cp866 <-> koi8-r
> cp932 <-> euc-jp
> (the right is the local = linux side)

> But even with these it seems to be possible to create chars that do not
> match, and I think it is caused by windows trying to map unicode to a
> codepage and not finding a matching char to use.

The computer I mount has samba 2.0.7. But I don't know which code page it is 
running. If it is of interest I will ask.


> Local utf8 always matches the remote and is preferred if your system is
> set up to handle it.

I will try that someday. If I had the choice I would introduce 4-byte Unicode
for everything and forbid everything else...

>     smb_proc_readdir_long: name=<directory> result=-2, rcls=1, err=2

If I run a find over all shares I still get some rare:

smb_proc_readdir_long: name=directory\*, result=-13, rcls=1, err=5

and

smb_proc_readdir_long: name=directory\*, result=-2, rcls=1, err=2

messages. 
These directories are empty, as you posted above.


greetings

Christian


* Re: [PATCH] smbfs codepage fixes for 2.4.18
  2002-03-02  7:38 ` Christian Bornträger
@ 2002-03-03 13:21   ` Urban Widmark
  0 siblings, 0 replies; 3+ messages in thread
From: Urban Widmark @ 2002-03-03 13:21 UTC (permalink / raw)
  To: Christian Bornträger; +Cc: Cyrille Chepelov, linux-kernel

On Sat, 2 Mar 2002, Christian Bornträger wrote:

> my smb.conf:
> character set = ISO8859-1
> client code page = 850

smbmount does not use smb.conf for these values.

What matters is whether the server has a "client code page" set, and to
what: your client must use matching settings, either through the
codepage/iocharset mount options or through the defaults chosen in
make *config.


> But I think, that my local code page is actually 8859-15 (I have euro-support 
> so it has to be 15)
> Is that a problem? AFAIK the only difference between 1 and 15 is the 
> Euro-sign.

I count 8 differences in the codepage->unicode mapping in the kernel.
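
For reference, a small table of those 8 positions. This is a sketch
built from the published ISO 8859-1 and ISO 8859-15 mapping tables, not
extracted from the kernel nls sources; the struct and function names
are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One byte value and what it means in each charset (as Unicode). */
struct charset_diff {
	unsigned char byte;
	unsigned short latin1;	/* ISO 8859-1 */
	unsigned short latin9;	/* ISO 8859-15 */
};

static const struct charset_diff latin1_vs_latin9[] = {
	{ 0xa4, 0x00a4, 0x20ac },	/* CURRENCY SIGN vs EURO SIGN */
	{ 0xa6, 0x00a6, 0x0160 },	/* BROKEN BAR vs S WITH CARON */
	{ 0xa8, 0x00a8, 0x0161 },	/* DIAERESIS vs s with caron */
	{ 0xb4, 0x00b4, 0x017d },	/* ACUTE ACCENT vs Z WITH CARON */
	{ 0xb8, 0x00b8, 0x017e },	/* CEDILLA vs z with caron */
	{ 0xbc, 0x00bc, 0x0152 },	/* ONE QUARTER vs LIGATURE OE */
	{ 0xbd, 0x00bd, 0x0153 },	/* ONE HALF vs ligature oe */
	{ 0xbe, 0x00be, 0x0178 },	/* THREE QUARTERS vs Y WITH DIAERESIS */
};

/* Number of byte values where the two charsets disagree. */
static int latin1_latin9_diff_count(void)
{
	return (int)(sizeof(latin1_vs_latin9) / sizeof(latin1_vs_latin9[0]));
}

/* Unicode value of an ISO 8859-15 byte: identical to ISO 8859-1
 * (byte value == code point) except for the entries in the table. */
static unsigned int latin9_to_unicode(unsigned char b)
{
	size_t i;

	for (i = 0; i < sizeof(latin1_vs_latin9) / sizeof(latin1_vs_latin9[0]); i++)
		if (latin1_vs_latin9[i].byte == b)
			return latin1_vs_latin9[i].latin9;
	return b;
}
```

So any filename using one of these bytes converts differently depending
on which of the two local charsets is selected.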

Even if you pick the correct mappings you can get errors where the server
is failing to convert its unicode chars into codepage chars.

http://marc.theaimsgroup.com/?t=96709071500001&r=1&w=2
http://marc.theaimsgroup.com/?l=samba&m=96835905219782&w=2

What might be 0x00a8 (DIAERESIS) or 0x0308 (COMBINING DIAERESIS) is mapped
into 0x22 by the server, smbfs sees 0x22 and uses that for open requests
and others. The server then complains because 0x22 doesn't match 0x00a8
(or whatever that char is on the server side).
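
A toy model of that round trip, under the assumption that the server
does a lossy best-fit mapping as described above. Both mapping
functions are hypothetical and only cover the characters mentioned in
this mail; they are not taken from samba or smbfs.

```c
#include <assert.h>

/* Toy server side: unicode -> codepage byte. ASCII passes through;
 * the two diaeresis characters are squashed into 0x22, anything else
 * becomes '?'. This is assumed behaviour, for illustration only. */
static unsigned char server_to_codepage_sketch(unsigned int uni)
{
	if (uni < 0x80)
		return (unsigned char)uni;
	if (uni == 0x00a8 || uni == 0x0308)
		return 0x22;
	return '?';
}

/* Toy reverse direction: only the ASCII range is modelled, where the
 * byte value equals the unicode code point. */
static unsigned int codepage_to_unicode_sketch(unsigned char b)
{
	return b;
}

/* Returns 1 if a character sent by the server still names the same
 * character when the client sends the received byte back. */
static int name_char_round_trips(unsigned int server_char)
{
	unsigned char wire = server_to_codepage_sketch(server_char);

	return codepage_to_unicode_sketch(wire) == server_char;
}
```

For 'a' the round trip holds; for 0x00a8 it does not, because the byte
0x22 that comes back no longer identifies the original character.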

I have added 3 patches for 2.4.18 to my smbfs page:
    http://www.hojdpunkten.ac.se/054/samba/index.html

00 - fixes the oopses on failed codepages, now maps failed conversions
     into :## strings for debugging.
01 - adds LFS
02 - adds Unicode support

For 01 and 02 you need to patch samba and add some extra options when 
mounting to activate them. Details on the page.


> If I run a find over all shares I still get some rare:
> 
> smb_proc_readdir_long: name=directory\*, result=-13, rcls=1, err=5
> smb_proc_readdir_long: name=directory\*, result=-2, rcls=1, err=2

err=5 (result=-13): access denied, permissions on the server?
err=2 (result=-2): file not found, probably a char translation error.
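
How the two log lines decode, as a small sketch. The constant values
(ERRDOS = 1, ERRbadfile = 2, ERRnoaccess = 5) are the usual SMB error
class and code numbers; the function name smb_errno_sketch() and the
-EIO fallback are assumptions for illustration and not the smbfs
implementation.

```c
#include <assert.h>
#include <errno.h>

/* SMB error class and codes, defined locally for this sketch. */
#define ERRDOS		1	/* rcls=1 */
#define ERRbadfile	2	/* file not found */
#define ERRnoaccess	5	/* access denied */

/* Hypothetical mapping from (rcls, err) to a negative errno value,
 * covering only the two cases seen in the log lines above. */
static int smb_errno_sketch(int rcls, int err)
{
	if (rcls == ERRDOS) {
		switch (err) {
		case ERRbadfile:
			return -ENOENT;	/* result=-2 on Linux */
		case ERRnoaccess:
			return -EACCES;	/* result=-13 on Linux */
		}
	}
	return -EIO;	/* assumed fallback for everything else */
}
```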

/Urban


