linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* aoe fails on sparc64
@ 2005-08-31 13:30 Jim MacBaine
  2005-08-31 15:50 ` Ed L Cashin
  2005-09-16 13:36 ` Ed L Cashin
  0 siblings, 2 replies; 17+ messages in thread
From: Jim MacBaine @ 2005-08-31 13:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: ecashin

Hello,

Using aoe on a sparc64 system gives strange results:

sunny:/dev/etherd# echo >discover
sunny:/dev/etherd# mke2fs e0.0
mke2fs 1.37 (21-Mar-2005)
mke2fs: File too large while trying to determine filesystem size
sunny:/dev/etherd# blockdev --getsz e0.0
-4503599627370496

The log says:

Aug 31 15:18:49 sunny kernel: devfs_mk_dir: invalid argument.<6>
etherd/e0.0: unknown partition table
Aug 31 15:18:49 sunny kernel: aoe: 0011d8xxxxxx e0.0 v4000 has
67553994410557440
sectors

The system is an Sun Ultra 5, running 2.6.12.5/sparc64 compiled with
gcc-3.4.2.  e0.0 is exported on a x86 system using vblade-5, and has a
size of 30 MB.

Regards,
Jim

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-08-31 13:30 aoe fails on sparc64 Jim MacBaine
@ 2005-08-31 15:50 ` Ed L Cashin
  2005-09-01  6:24   ` David S. Miller
  2005-09-16 13:36 ` Ed L Cashin
  1 sibling, 1 reply; 17+ messages in thread
From: Ed L Cashin @ 2005-08-31 15:50 UTC (permalink / raw)
  To: linux-kernel

Jim MacBaine <jmacbaine@gmail.com> writes:

> Hello,
>
> Using aoe on a sparc64 system gives strange results:
>
...
> The log says:
>
> Aug 31 15:18:49 sunny kernel: devfs_mk_dir: invalid argument.<6>
> etherd/e0.0: unknown partition table
> Aug 31 15:18:49 sunny kernel: aoe: 0011d8xxxxxx e0.0 v4000 has
> 67553994410557440
> sectors

OK.  67553994410557440 is 61440 byte swapped in 64 bits, and 30MB is
61440 sectors, so this should be a simple byte order fix.

> The system is an Sun Ultra 5, running 2.6.12.5/sparc64 compiled with
> gcc-3.4.2.  e0.0 is exported on a x86 system using vblade-5, and has a
> size of 30 MB.

Thanks for the report.

-- 
  Ed L Cashin <ecashin@coraid.com>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-08-31 15:50 ` Ed L Cashin
@ 2005-09-01  6:24   ` David S. Miller
  2005-09-01 19:13     ` Ed L Cashin
  0 siblings, 1 reply; 17+ messages in thread
From: David S. Miller @ 2005-09-01  6:24 UTC (permalink / raw)
  To: ecashin; +Cc: linux-kernel

From: Ed L Cashin <ecashin@coraid.com>
Date: Wed, 31 Aug 2005 11:50:55 -0400

> Jim MacBaine <jmacbaine@gmail.com> writes:
> 
> > Aug 31 15:18:49 sunny kernel: devfs_mk_dir: invalid argument.<6>
> > etherd/e0.0: unknown partition table
> > Aug 31 15:18:49 sunny kernel: aoe: 0011d8xxxxxx e0.0 v4000 has
> > 67553994410557440
> > sectors
> 
> OK.  67553994410557440 is 61440 byte swapped in 64 bits, and 30MB is
> 61440 sectors, so this should be a simple byte order fix.

More strangely, the upper and lower 32-bit words are swapped.
The bytes within each 32-bit word are swapped correctly.

So the calculation maybe should be something like:

	__le32 *p = (__le32 *) &id[100 << 1];
	u32 high32 = le32_to_cpup(p);
	u32 low32 = le32_to_cpup(p + 1);

	ssize = (((u64)high32 << 32) | (u64) low32);

But that doesn't make any sense, and even ide_fix_driveid() in
drivers/ide/ide-iops.c does a le64_to_cpu() for this value:

	id->lba_capacity_2 = __le64_to_cpu(id->lba_capacity_2);

I wonder if this is some artifact of how AOE devices encode
this field when sending it to the client.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-09-01  6:24   ` David S. Miller
@ 2005-09-01 19:13     ` Ed L Cashin
  2005-09-01 19:45       ` David S. Miller
  2005-09-03 16:06       ` Jim MacBaine
  0 siblings, 2 replies; 17+ messages in thread
From: Ed L Cashin @ 2005-09-01 19:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: David S. Miller, Jim MacBaine

"David S. Miller" <davem@davemloft.net> writes:

> From: Ed L Cashin <ecashin@coraid.com>
...
>> OK.  67553994410557440 is 61440 byte swapped in 64 bits, and 30MB is
>> 61440 sectors, so this should be a simple byte order fix.
>
> More strangely, the upper and lower 32-bit words are swapped.
> The bytes within each 32-bit word are swapped correctly.
>
> So the calculation maybe should be something like:
>
> 	__le32 *p = (__le32 *) &id[100 << 1];
> 	u32 high32 = le32_to_cpup(p);
> 	u32 low32 = le32_to_cpup(p + 1);
>
> 	ssize = (((u64)high32 << 32) | (u64) low32);
>
> But that doesn't make any sense, and even ide_fix_driveid() in
> drivers/ide/ide-iops.c does a le64_to_cpu() for this value:
>
> 	id->lba_capacity_2 = __le64_to_cpu(id->lba_capacity_2);
>
> I wonder if this is some artifact of how AOE devices encode
> this field when sending it to the client.

Well, an EtherDrive blade just copies the ATA identify response data
into a network packet without looking at it.  The vblade, though, has
to set the lba_capacity and lba_capacity_2 fields itself.

The aoe driver looks OK, but it turns out there's a byte swapping bug
in the vblade that could be related if he's running the vblade on a
big endian host (even though he said it was an x86 host), but I
haven't heard back from the original poster yet.

The vblade bug was the omission of swapping the bytes in each short.
The fix below shows what I mean:

diff -urNp a-exp/ata.c b-exp/ata.c
--- a-exp/ata.c	2005-09-01 10:19:11.000000000 -0400
+++ b-exp/ata.c	2005-09-01 10:19:12.000000000 -0400
@@ -55,24 +55,29 @@ setfld(ushort *a, int idx, int len, char
 }
 
 static void
-setlba28(ushort *p, vlong lba)
+setlba28(ushort *ident, vlong lba)
 {
-	p += 60;
-	*p++ = lba & 0xffff;
-	*p = lba >> 16 & 0x0fffffff;
+	uchar *cp;
+
+	cp = (uchar *) &ident[60];
+	*cp++ = lba;
+	*cp++ = lba >>= 8;
+	*cp++ = lba >>= 8;
+	*cp++ = (lba >>= 8) & 0xf;
 }
 
 static void
-setlba48(ushort *p, vlong lba)
+setlba48(ushort *ident, vlong lba)
 {
-	p += 100;
-	*p++ = lba;
-	lba >>= 16;
-	*p++ = lba;
-	lba >>= 16;
-	*p++ = lba;
-	lba >>= 16;
-	*p = lba;
+	uchar *cp;
+
+	cp = (uchar *) &ident[100];
+	*cp++ = lba;
+	*cp++ = lba >>= 8;
+	*cp++ = lba >>= 8;
+	*cp++ = lba >>= 8;
+	*cp++ = lba >>= 8;
+	*cp++ = lba >>= 8;
 }
 		
 void


-- 
  Ed L Cashin <ecashin@coraid.com>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-09-01 19:13     ` Ed L Cashin
@ 2005-09-01 19:45       ` David S. Miller
  2005-09-03 16:06       ` Jim MacBaine
  1 sibling, 0 replies; 17+ messages in thread
From: David S. Miller @ 2005-09-01 19:45 UTC (permalink / raw)
  To: ecashin; +Cc: linux-kernel, jmacbaine

From: Ed L Cashin <ecashin@coraid.com>
Date: Thu, 01 Sep 2005 15:13:52 -0400

> The aoe driver looks OK, but it turns out there's a byte swapping bug
> in the vblade that could be related if he's running the vblade on a
> big endian host (even though he said it was an x86 host), but I
> haven't heard back from the original poster yet.

I see, thanks for looking into this.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-09-01 19:13     ` Ed L Cashin
  2005-09-01 19:45       ` David S. Miller
@ 2005-09-03 16:06       ` Jim MacBaine
  2005-09-06 20:31         ` Ed L Cashin
  1 sibling, 1 reply; 17+ messages in thread
From: Jim MacBaine @ 2005-09-03 16:06 UTC (permalink / raw)
  To: Ed L Cashin; +Cc: linux-kernel, David S. Miller

On 9/1/05, Ed L Cashin <ecashin@coraid.com> wrote:

> The aoe driver looks OK, but it turns out there's a byte swapping bug
> in the vblade that could be related if he's running the vblade on a
> big endian host (even though he said it was an x86 host), but I
> haven't heard back from the original poster yet.

It is in fact a x86_64 kernel, but with a mostly x86 userland. Vblade
is pure x86 code.

> The vblade bug was the omission of swapping the bytes in each short.
> The fix below shows what I mean:

Unfortunately it doesn't fix anything here. The client still reports
the same wrong size as before.  The dmesg output is identical, too.

Regards,
Jim

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-09-03 16:06       ` Jim MacBaine
@ 2005-09-06 20:31         ` Ed L Cashin
  2005-09-09 14:06           ` Ed L Cashin
  0 siblings, 1 reply; 17+ messages in thread
From: Ed L Cashin @ 2005-09-06 20:31 UTC (permalink / raw)
  To: Jim MacBaine; +Cc: David S. Miller, Linux Kernel Mailing List

Jim MacBaine <jmacbaine@gmail.com> writes:

> On 9/1/05, Ed L Cashin <ecashin@coraid.com> wrote:
>
>> The aoe driver looks OK, but it turns out there's a byte swapping bug
>> in the vblade that could be related if he's running the vblade on a
>> big endian host (even though he said it was an x86 host), but I
>> haven't heard back from the original poster yet.
>
> It is in fact a x86_64 kernel, but with a mostly x86 userland. Vblade
> is pure x86 code.
>
>> The vblade bug was the omission of swapping the bytes in each short.
>> The fix below shows what I mean:
>
> Unfortunately it doesn't fix anything here. The client still reports
> the same wrong size as before.  The dmesg output is identical, too.

Let's take this discussion off the lkml, because I doubt there's a
problem with the aoe driver in the kernel, and I can easily follow up
to the lkml with a synopsis if it turns out I'm wrong.

Jim MacBaine, I'm going to ask for more details in a separate email.

-- 
  Ed L Cashin <ecashin@coraid.com>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-09-06 20:31         ` Ed L Cashin
@ 2005-09-09 14:06           ` Ed L Cashin
  0 siblings, 0 replies; 17+ messages in thread
From: Ed L Cashin @ 2005-09-09 14:06 UTC (permalink / raw)
  To: linux-kernel; +Cc: David S. Miller, Jim MacBaine

Ed L Cashin <ecashin@coraid.com> writes:

...
> Let's take this discussion off the lkml, because I doubt there's a
> problem with the aoe driver in the kernel, and I can easily follow up
> to the lkml with a synopsis if it turns out I'm wrong.

It looks like I was probably wrong.  I need to do some debugging, but
the only sparc64 machine here at hand is in use.

If anybody would be up for running 2.6.13 on a sparc64 host and
running tests with a patched aoe driver, please let me know.  A test
would look something like this, using an x86 host and a sparc64 host
on the same LAN.

  x86$ dd if=/dev/zero of=/tmp/0x1234567 bs=1k count=1 seek=19088742
  x86$ vblade 0 1 eth0 /tmp/0x1234567

  sparc64$ rmmod aoe
  sparc64$ cd ~/linux-2.6.13
  sparc64$ patch -p1 < aoe.diff
  sparc64$ make && make modules_install
  sparc64$ modprobe aoe

I'd email you patches, and you'd email me the printk messages that
show up in the logs.  Such help would be much appreciated.

-- 
  Ed L Cashin <ecashin@coraid.com>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-08-31 13:30 aoe fails on sparc64 Jim MacBaine
  2005-08-31 15:50 ` Ed L Cashin
@ 2005-09-16 13:36 ` Ed L Cashin
  2005-09-16 20:34   ` David S. Miller
  2005-09-16 23:35   ` David S. Miller
  1 sibling, 2 replies; 17+ messages in thread
From: Ed L Cashin @ 2005-09-16 13:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jim MacBaine, David S. Miller

[-- Attachment #1: Type: text/plain, Size: 901 bytes --]

Jim MacBaine <jmacbaine@gmail.com> writes:

> Hello,
>
> Using aoe on a sparc64 system gives strange results:
>
> sunny:/dev/etherd# echo >discover
> sunny:/dev/etherd# mke2fs e0.0
> mke2fs 1.37 (21-Mar-2005)
> mke2fs: File too large while trying to determine filesystem size
> sunny:/dev/etherd# blockdev --getsz e0.0
> -4503599627370496
>
> The log says:
>
> Aug 31 15:18:49 sunny kernel: devfs_mk_dir: invalid argument.<6>
> etherd/e0.0: unknown partition table
> Aug 31 15:18:49 sunny kernel: aoe: 0011d8xxxxxx e0.0 v4000 has
> 67553994410557440
> sectors
>
> The system is an Sun Ultra 5, running 2.6.12.5/sparc64 compiled with
> gcc-3.4.2.  e0.0 is exported on a x86 system using vblade-5, and has a
> size of 30 MB.

I've been working with Jim MacBaine, and he reports that the patch
below gets rid of the problem.  I don't know why.  When I test
le64_to_cpup by itself, it works as expected.


[-- Attachment #2: diff --]
[-- Type: text/plain, Size: 566 bytes --]

Index: linux-2.6.13/drivers/block/aoe/aoecmd.c
===================================================================
--- linux-2.6.13.orig/drivers/block/aoe/aoecmd.c	2005-08-31 17:03:52.000000000 -0400
+++ linux-2.6.13/drivers/block/aoe/aoecmd.c	2005-09-15 15:44:41.000000000 -0400
@@ -320,7 +320,8 @@
 		d->flags |= DEVFL_EXT;
 
 		/* word 100: number lba48 sectors */
-		ssize = le64_to_cpup((__le64 *) &id[100<<1]);
+		ssize = *((u64 *) &id[100<<1]);
+		ssize = le64_to_cpu(ssize);
 
 		/* set as in ide-disk.c:init_idedisk_capacity */
 		d->geo.cylinders = ssize;

[-- Attachment #3: Type: text/plain, Size: 41 bytes --]



-- 
  Ed L Cashin <ecashin@coraid.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-09-16 13:36 ` Ed L Cashin
@ 2005-09-16 20:34   ` David S. Miller
  2005-09-16 23:35   ` David S. Miller
  1 sibling, 0 replies; 17+ messages in thread
From: David S. Miller @ 2005-09-16 20:34 UTC (permalink / raw)
  To: ecashin; +Cc: linux-kernel, jmacbaine

From: Ed L Cashin <ecashin@coraid.com>
Date: Fri, 16 Sep 2005 09:36:51 -0400

> I've been working with Jim MacBaine, and he reports that the patch
> below gets rid of the problem.  I don't know why.  When I test
> le64_to_cpup by itself, it works as expected.

This reminds me of a known UltraSPARC chip bug.  There is a bug
wherein if you do a 64-bit endianness swapped load instruction, it
only endian swaps within each 32-bit word, not throughout the entire
64-bit word as it should.

But this is weird, because the errata description for this chip bug
says that it only applies to the deprecated 32-bit mode "ldd/std"
instructions, whereas the compiler emits the 64-bit "ldx/stx"
instructions for sparc64 kernel builds.

So, first, let's sanity check the ___arch__swab64p() implementation:

#include <stdlib.h>
#include <stdio.h>

#define ASI_PL			0x88 /* Primary, implicit, l-endian	*/

typedef unsigned long __u64;

static __inline__ __u64 ___arch__swab64p(const __u64 *addr)
{
	__u64 ret;

	__asm__ __volatile__ ("ldxa [%1] %2, %0"
			      : "=r" (ret)
			      : "r" (addr), "i" (ASI_PL));
	return ret;
}

int main(void)
{
	unsigned long tval = 0x123456789abcdef0;
	unsigned long *p, v;

	p = &tval;
	v = ___arch__swab64p(p);
	printf("%016lx --> %016lx\n",
	       tval, v);
	exit(0);
}

Running this on my Ultra-IIIi workstation (this chip doesn't have
said errata) gives the desired result:

davem@sunset:~/src/GIT/sparc-2.6$ gcc -m64 -O2 -o x x.c
davem@sunset:~/src/GIT/sparc-2.6$ ./x
123456789abcdef0 --> f0debc9a78563412
davem@sunset:~/src/GIT/sparc-2.6$

So it looks sane.  Let's try this on a chip that has the errata:

? ./x
123456789abcdef0 --> f0debc9a78563412
? 

That's fine too, so this isn't what is causing the problems.

Wait, I think I see the problem.  Is that "&id[100<<1]" pointer
aligned properly on a 64-bit boundary?  I bet it isn't, and the
sparc64 unaligned load/store trap handler doesn't check whether one of
the endian swapping load/store attributes are being used.

Looking at the assembler generated for aoecmd.s, it isn't aligned:

	add	%l1, 236, %g2
 ...
	ldxa [%g2] 136, %g7

That's aligned on a 4-byte boundary, but not an 8-byte one.

So this is what the bug is.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-09-16 13:36 ` Ed L Cashin
  2005-09-16 20:34   ` David S. Miller
@ 2005-09-16 23:35   ` David S. Miller
  2005-09-17 10:10     ` Jim MacBaine
  2005-09-19 14:24     ` Ed L Cashin
  1 sibling, 2 replies; 17+ messages in thread
From: David S. Miller @ 2005-09-16 23:35 UTC (permalink / raw)
  To: ecashin; +Cc: linux-kernel, jmacbaine

From: Ed L Cashin <ecashin@coraid.com>
Date: Fri, 16 Sep 2005 09:36:51 -0400

> I've been working with Jim MacBaine, and he reports that the patch
> below gets rid of the problem.  I don't know why.  When I test
> le64_to_cpup by itself, it works as expected.

This patch should fix the bug.

diff --git a/arch/sparc64/kernel/una_asm.S b/arch/sparc64/kernel/una_asm.S
--- a/arch/sparc64/kernel/una_asm.S
+++ b/arch/sparc64/kernel/una_asm.S
@@ -17,7 +17,7 @@ kernel_unaligned_trap_fault:
 __do_int_store:
 	rd	%asi, %o4
 	wr	%o3, 0, %asi
-	ldx	[%o2], %g3
+	mov	%o2, %g3
 	cmp	%o1, 2
 	be,pn	%icc, 2f
 	 cmp	%o1, 4
diff --git a/arch/sparc64/kernel/unaligned.c b/arch/sparc64/kernel/unaligned.c
--- a/arch/sparc64/kernel/unaligned.c
+++ b/arch/sparc64/kernel/unaligned.c
@@ -184,13 +184,14 @@ extern void do_int_load(unsigned long *d
 			unsigned long *saddr, int is_signed, int asi);
 	
 extern void __do_int_store(unsigned long *dst_addr, int size,
-			   unsigned long *src_val, int asi);
+			   unsigned long src_val, int asi);
 
 static inline void do_int_store(int reg_num, int size, unsigned long *dst_addr,
-				struct pt_regs *regs, int asi)
+				struct pt_regs *regs, int asi, int orig_asi)
 {
 	unsigned long zero = 0;
-	unsigned long *src_val = &zero;
+	unsigned long *src_val_p = &zero;
+	unsigned long src_val;
 
 	if (size == 16) {
 		size = 8;
@@ -198,7 +199,25 @@ static inline void do_int_store(int reg_
 		        (unsigned)fetch_reg(reg_num, regs) : 0)) << 32) |
 			(unsigned)fetch_reg(reg_num + 1, regs);
 	} else if (reg_num) {
-		src_val = fetch_reg_addr(reg_num, regs);
+		src_val_p = fetch_reg_addr(reg_num, regs);
+	}
+	src_val = *src_val_p;
+	if (unlikely(asi != orig_asi)) {
+		switch (size) {
+		case 2:
+			src_val = swab16(src_val);
+			break;
+		case 4:
+			src_val = swab32(src_val);
+			break;
+		case 8:
+			src_val = swab64(src_val);
+			break;
+		case 16:
+		default:
+			BUG();
+			break;
+		};
 	}
 	__do_int_store(dst_addr, size, src_val, asi);
 }
@@ -276,6 +295,7 @@ asmlinkage void kernel_unaligned_trap(st
 		kernel_mna_trap_fault();
 	} else {
 		unsigned long addr;
+		int orig_asi, asi;
 
 		addr = compute_effective_address(regs, insn,
 						 ((insn >> 25) & 0x1f));
@@ -285,18 +305,48 @@ asmlinkage void kernel_unaligned_trap(st
 		       regs->tpc, dirstrings[dir], addr, size,
 		       regs->u_regs[UREG_RETPC]);
 #endif
+		orig_asi = asi = decode_asi(insn, regs);
+		switch (asi) {
+		case ASI_NL:
+		case ASI_AIUPL:
+		case ASI_AIUSL:
+		case ASI_PL:
+		case ASI_SL:
+		case ASI_PNFL:
+		case ASI_SNFL:
+			asi &= ~0x08;
+			break;
+		};
 		switch (dir) {
 		case load:
 			do_int_load(fetch_reg_addr(((insn>>25)&0x1f), regs),
 				    size, (unsigned long *) addr,
-				    decode_signedness(insn),
-				    decode_asi(insn, regs));
+				    decode_signedness(insn), asi);
+			if (unlikely(asi != orig_asi)) {
+				unsigned long val_in = *(unsigned long *) addr;
+				switch (size) {
+				case 2:
+					val_in = swab16(val_in);
+					break;
+				case 4:
+					val_in = swab32(val_in);
+					break;
+				case 8:
+					val_in = swab64(val_in);
+					break;
+				case 16:
+				default:
+					BUG();
+					break;
+				};
+				*(unsigned long *) addr = val_in;
+			}
 			break;
 
 		case store:
 			do_int_store(((insn>>25)&0x1f), size,
 				     (unsigned long *) addr, regs,
-				     decode_asi(insn, regs));
+				     asi, orig_asi);
 			break;
 
 		default:

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-09-16 23:35   ` David S. Miller
@ 2005-09-17 10:10     ` Jim MacBaine
  2005-09-18  6:12       ` David S. Miller
  2005-09-19 14:24     ` Ed L Cashin
  1 sibling, 1 reply; 17+ messages in thread
From: Jim MacBaine @ 2005-09-17 10:10 UTC (permalink / raw)
  To: David S. Miller; +Cc: ecashin, linux-kernel

On 9/17/05, David S. Miller <davem@davemloft.net> wrote:

> This patch should fix the bug.
> 
> diff --git a/arch/sparc64/kernel/una_asm.S b/arch/sparc64/kernel/una_asm.S
[..]
> diff --git a/arch/sparc64/kernel/unaligned.c b/arch/sparc64/kernel/unaligned.c
[..]

Was this patch meant to be applied to a fresh 2.6.13 kernel without
any of Ed's patches? If so, I cannot confirm that this patch works.
The aoe driver still reports a wrong size:

sunny:~# modprobe aoe
aoe: aoe_init: AoE v2.6-10 initialised.
 etherd/e0.0: unknown partition table
aoe: 0011xxxxxxxx e0.0 v4000 has 7441392446501552128 sectors

The exported file has got a size of 19088743 sectors.

Regards,
Jim

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-09-17 10:10     ` Jim MacBaine
@ 2005-09-18  6:12       ` David S. Miller
  0 siblings, 0 replies; 17+ messages in thread
From: David S. Miller @ 2005-09-18  6:12 UTC (permalink / raw)
  To: jmacbaine; +Cc: ecashin, linux-kernel

From: Jim MacBaine <jmacbaine@gmail.com>
Date: Sat, 17 Sep 2005 12:10:17 +0200

> Was this patch meant to be applied to a fresh 2.6.13 kernel without
> any of Ed's patches? If so, I cannot confirm that this patch works.
> The aoe driver still reports a wrong size:

Please double check that you really ran with the patch
applied.  I even wrote a test kernel module that verified
that the bug was fixed by doing various unaligned 64-bit
loads and stores, both with and without endianness swapping.
It definitely failed before the patch, and definitely worked
with the patch.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-09-16 23:35   ` David S. Miller
  2005-09-17 10:10     ` Jim MacBaine
@ 2005-09-19 14:24     ` Ed L Cashin
  2005-09-19 18:21       ` David S. Miller
  1 sibling, 1 reply; 17+ messages in thread
From: Ed L Cashin @ 2005-09-19 14:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, jmacbaine

"David S. Miller" <davem@davemloft.net> writes:

> From: Ed L Cashin <ecashin@coraid.com>
> Date: Fri, 16 Sep 2005 09:36:51 -0400
>
>> I've been working with Jim MacBaine, and he reports that the patch
>> below gets rid of the problem.  I don't know why.  When I test
>> le64_to_cpup by itself, it works as expected.
>
> This patch should fix the bug.
>
> diff --git a/arch/sparc64/kernel/una_asm.S b/arch/sparc64/kernel/una_asm.S
> --- a/arch/sparc64/kernel/una_asm.S
> +++ b/arch/sparc64/kernel/una_asm.S

So it's OK to use the "...._to_cpup" macros with unaligned pointers?
I'm asking whether ...

  1) Passing le64_to_cpup an unaligned pointer is "OK" and within the
     intended use of the function.  I'm having trouble finding whether
     this is documented somewhere.

  2) These new changes to the sparc64 unaligned access fault handling
     will make it OK to leave the aoe driver the way it is in the
     mainline kernel.

...
> diff --git a/arch/sparc64/kernel/unaligned.c b/arch/sparc64/kernel/unaligned.c


-- 
  Ed L Cashin <ecashin@coraid.com>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-09-19 14:24     ` Ed L Cashin
@ 2005-09-19 18:21       ` David S. Miller
  2005-09-19 18:29         ` Roland Dreier
  2005-09-19 18:38         ` Ed L Cashin
  0 siblings, 2 replies; 17+ messages in thread
From: David S. Miller @ 2005-09-19 18:21 UTC (permalink / raw)
  To: ecashin; +Cc: linux-kernel, jmacbaine

From: Ed L Cashin <ecashin@coraid.com>
Date: Mon, 19 Sep 2005 10:24:00 -0400

>   1) Passing le64_to_cpup an unaligned pointer is "OK" and within the
>      intended use of the function.  I'm having trouble finding whether
>      this is documented somewhere.
> 
>   2) These new changes to the sparc64 unaligned access fault handling
>      will make it OK to leave the aoe driver the way it is in the
>      mainline kernel.

Both #1 and #2 are true.

Although it's very much discouraged to dereference unaligned pointers,
especially in performance critical code (which this AOE case is not,
thankfully), because performance will be really bad as the trap
handler has to fix up the access on RISC platforms.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-09-19 18:21       ` David S. Miller
@ 2005-09-19 18:29         ` Roland Dreier
  2005-09-19 18:38         ` Ed L Cashin
  1 sibling, 0 replies; 17+ messages in thread
From: Roland Dreier @ 2005-09-19 18:29 UTC (permalink / raw)
  To: David S. Miller; +Cc: ecashin, linux-kernel, jmacbaine

    David> Although it's very much discouraged to dereference
    David> unaligned pointers, especially in performance critical code
    David> (which this AOE case is not, thankfully), because
    David> performance will be really bad as the trap handler has to
    David> fix up the access on RISC platforms. 

Also, ia64 has a tendency to print an ugly message in the kernel log
for unaligned accesses.  Has anyone tried AoE on ia64?

It might be better to change the AoE code to use get_unaligned(), just
to document what's going on.  Although clearly the sparc64 patch is
correct as well -- we should never silently return the wrong data.

 - R.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aoe fails on sparc64
  2005-09-19 18:21       ` David S. Miller
  2005-09-19 18:29         ` Roland Dreier
@ 2005-09-19 18:38         ` Ed L Cashin
  1 sibling, 0 replies; 17+ messages in thread
From: Ed L Cashin @ 2005-09-19 18:38 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, jmacbaine

"David S. Miller" <davem@davemloft.net> writes:

> From: Ed L Cashin <ecashin@coraid.com>
> Date: Mon, 19 Sep 2005 10:24:00 -0400
>
>>   1) Passing le64_to_cpup an unaligned pointer is "OK" and within the
>>      intended use of the function.  I'm having trouble finding whether
>>      this is documented somewhere.
>> 
>>   2) These new changes to the sparc64 unaligned access fault handling
>>      will make it OK to leave the aoe driver the way it is in the
>>      mainline kernel.
>
> Both #1 and #2 are true.

That's interesting.  I think I'll send a patch documenting #1.

> Although it's very much discouraged to dereference unaligned pointers,
> especially in performance critical code (which this AOE case is not,
> thankfully), because performance will be really bad as the trap
> handler has to fix up the access on RISC platforms.

Yes, this only happens when per AoE device when the AoE device is
discovered.  Still, I might submit a patch that reverts the aoe driver
to getting the ATA identify values byte by byte as it used to do.

-- 
  Ed L Cashin <ecashin@coraid.com>


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2005-09-19 19:04 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-08-31 13:30 aoe fails on sparc64 Jim MacBaine
2005-08-31 15:50 ` Ed L Cashin
2005-09-01  6:24   ` David S. Miller
2005-09-01 19:13     ` Ed L Cashin
2005-09-01 19:45       ` David S. Miller
2005-09-03 16:06       ` Jim MacBaine
2005-09-06 20:31         ` Ed L Cashin
2005-09-09 14:06           ` Ed L Cashin
2005-09-16 13:36 ` Ed L Cashin
2005-09-16 20:34   ` David S. Miller
2005-09-16 23:35   ` David S. Miller
2005-09-17 10:10     ` Jim MacBaine
2005-09-18  6:12       ` David S. Miller
2005-09-19 14:24     ` Ed L Cashin
2005-09-19 18:21       ` David S. Miller
2005-09-19 18:29         ` Roland Dreier
2005-09-19 18:38         ` Ed L Cashin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).