linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* NS83820 2.6.0-test5 driver seems unstable on IA64
@ 2003-09-19  4:16 Peter Chubb
  2003-09-19  4:38 ` Grant Grundler
  0 siblings, 1 reply; 81+ messages in thread
From: Peter Chubb @ 2003-09-19  4:16 UTC (permalink / raw)
  To: linux-ns83820; +Cc: linux-ia64, linux-kernel


Hi,
	I'm having some problems with the NS83820 driver on Itanium-1.

1.  I see many many `unaligned accesses' when the IP header is
accessed (ip_input.c:410 and 423)
It looks as if the IP header is two byte aligned, when it ought to be
4-byte aligned.

The error message is, e.g., 
 kernel unaligned access to 0xe000000128d8881e, ip=0xe0000000047ba2d1

0xe0000000047ba2d1 is             
ip_input:410
		   if (iph->ihl < 5 || iph->version != 4)
		                goto inhdr_error; 

0xe000000128d8881e is what's calculated as &iph->ihl

The obvious approach of realigning the SKB by 2 bytes seems not to
work.

Any ideas?

--
Dr Peter Chubb     http://www.gelato.unsw.edu.au  peterc@gelato.unsw.edu.au
You are lost in a maze of BitKeeper repositories,   all slightly different.


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-19  4:16 NS83820 2.6.0-test5 driver seems unstable on IA64 Peter Chubb
@ 2003-09-19  4:38 ` Grant Grundler
  2003-09-19  4:43   ` Andi Kleen
  0 siblings, 1 reply; 81+ messages in thread
From: Grant Grundler @ 2003-09-19  4:38 UTC (permalink / raw)
  To: Peter Chubb; +Cc: linux-ns83820, linux-ia64, linux-kernel

On Fri, Sep 19, 2003 at 02:16:29PM +1000, Peter Chubb wrote:
> The obvious approach of realigning the SKB by 2 bytes seems not to
> work.

Could you be more detailed about the "obvious approach"?
ie show the diff of what you changed.

Several other NIC driver "alias" (as davidm describes it) the buffer
by reserving two bytes at the beginning of the recieve buffer
where header and payload data are DMAd on inbound traffic.

I look to my favorite driver, tulip, for an example and promptly
get confused. interrupt.c:tulip_rx() calls skb_reserve(xx,2)
to force alignment when replenishing RX buffers while a note in
tulip_core.c:tulip_init_ring() clearly says not to. I haven't
looked further to figure out the difference.

hth,
grant

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-19  4:38 ` Grant Grundler
@ 2003-09-19  4:43   ` Andi Kleen
  2003-09-19  5:01     ` Peter Chubb
  2003-09-19  5:04     ` Grant Grundler
  0 siblings, 2 replies; 81+ messages in thread
From: Andi Kleen @ 2003-09-19  4:43 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Peter Chubb, linux-ns83820, linux-ia64, linux-kernel

On Thu, Sep 18, 2003 at 09:38:47PM -0700, Grant Grundler wrote:
> On Fri, Sep 19, 2003 at 02:16:29PM +1000, Peter Chubb wrote:
> > The obvious approach of realigning the SKB by 2 bytes seems not to
> > work.
> 
> Could you be more detailed about the "obvious approach"?
> ie show the diff of what you changed.
> 
> Several other NIC driver "alias" (as davidm describes it) the buffer
> by reserving two bytes at the beginning of the recieve buffer
> where header and payload data are DMAd on inbound traffic.

It is a mixed blessing, because the result is a non cache line 
aligned buffer. Some NIC chipsets don't like this because they have
to do a read-modify-write cycle for the first cache line and cannot
burst the full packet.

-Andi

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-19  4:43   ` Andi Kleen
@ 2003-09-19  5:01     ` Peter Chubb
  2003-09-19  5:53       ` Andi Kleen
  2003-09-19  5:04     ` Grant Grundler
  1 sibling, 1 reply; 81+ messages in thread
From: Peter Chubb @ 2003-09-19  5:01 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Grant Grundler, Peter Chubb, linux-ns83820, linux-ia64, linux-kernel


 On Thu, Sep 18, 2003 at 09:38:47PM -0700, Grant Grundler wrote:
> On Fri, Sep 19, 2003 at 02:16:29PM +1000, Peter Chubb wrote: 
>> The
>> obvious approach of realigning the SKB by 2 bytes seems not to 
>> work.

> Could you be more detailed about the "obvious approach"?  ie show
> the diff of what you changed.


It doesn't work as in no error messages, no pings, no interrupts to the
driver.  And the kernel hangs after a short while.

This is what I changed:

===== drivers/net/ns83820.c 1.19 vs edited =====
--- 1.19/drivers/net/ns83820.c  Thu Jun  5 13:50:00 2003
+++ edited/drivers/net/ns83820.c        Fri Sep 19 13:49:23 2003
@@ -567,7 +567,7 @@
                res = (long)skb->tail & 0xf;
                res = 0x10 - res;
                res &= 0xf;
-               skb_reserve(skb, res);
+               skb_reserve(skb, res+2);
 
                skb->dev = &dev->net_dev;
                if (gfp != GFP_ATOMIC)


--
Dr Peter Chubb  http://www.gelato.unsw.edu.au  peterc AT gelato.unsw.edu.au
You are lost in a maze of BitKeeper repositories,   all slightly different.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-19  4:43   ` Andi Kleen
  2003-09-19  5:01     ` Peter Chubb
@ 2003-09-19  5:04     ` Grant Grundler
  2003-09-19  5:11       ` Peter Chubb
  1 sibling, 1 reply; 81+ messages in thread
From: Grant Grundler @ 2003-09-19  5:04 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Peter Chubb, linux-ns83820, linux-ia64, linux-kernel

On Fri, Sep 19, 2003 at 06:43:15AM +0200, Andi Kleen wrote:
> It is a mixed blessing, because the result is a non cache line 
> aligned buffer. Some NIC chipsets don't like this because they have
> to do a read-modify-write cycle for the first cache line and cannot
> burst the full packet.

yeah, that reminds me...tulip can only DMA to word aligned addresses.
I looked it up again in DEC 21143 HWREF (page 113 of the pdf):
        Table 4-3. RDES2 Bit Field Description
    Field      Description
    31:0       Buffer Address 1
               Indicates the physical address of buffer 1. 
               The buffer must be longword aligned (RDES2<1:0> = 00).

Same is true for TX/RX descriptor addresses.
Behavior is undefined if the addresses for DMA are not 4-byte aligned.

Anyone know if that's true for NS83820?
I couldn't find which driver controls that chip/NIC via a quick grep.

grant

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-19  5:04     ` Grant Grundler
@ 2003-09-19  5:11       ` Peter Chubb
  0 siblings, 0 replies; 81+ messages in thread
From: Peter Chubb @ 2003-09-19  5:11 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Andi Kleen, Peter Chubb, linux-ns83820, linux-ia64, linux-kernel

>>>>> "Grant" == Grant Grundler <iod00d@hp.com> writes:

Grant> On Fri, Sep 19, 2003 at 06:43:15AM +0200, Andi Kleen wrote:

Grant> Same is true for TX/RX descriptor addresses.  Behavior is
Grant> undefined if the addresses for DMA are not 4-byte aligned.

Grant> Anyone know if that's true for NS83820?  I couldn't find which
Grant> driver controls that chip/NIC via a quick grep.

I believe it is, based on the experiments I've done here, and on
looking at the BSD driver.

Driver is drivers/net/ns83820.c

Peter C

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-19  5:01     ` Peter Chubb
@ 2003-09-19  5:53       ` Andi Kleen
  2003-09-19 10:49         ` Benjamin LaHaise
  0 siblings, 1 reply; 81+ messages in thread
From: Andi Kleen @ 2003-09-19  5:53 UTC (permalink / raw)
  To: Peter Chubb
  Cc: Andi Kleen, Grant Grundler, Peter Chubb, linux-ns83820,
	linux-ia64, linux-kernel

> This is what I changed:
> 
> ===== drivers/net/ns83820.c 1.19 vs edited =====
> --- 1.19/drivers/net/ns83820.c  Thu Jun  5 13:50:00 2003
> +++ edited/drivers/net/ns83820.c        Fri Sep 19 13:49:23 2003
> @@ -567,7 +567,7 @@
>                 res = (long)skb->tail & 0xf;
>                 res = 0x10 - res;
>                 res &= 0xf;
> -               skb_reserve(skb, res);
> +               skb_reserve(skb, res+2);

You must add the 2 bytes in dev_alloc_skb() too.

(it also add 16 bytes by itself, but they are used for other things)

-Andi

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-19  5:53       ` Andi Kleen
@ 2003-09-19 10:49         ` Benjamin LaHaise
  2003-09-23  0:34           ` Peter Chubb
  0 siblings, 1 reply; 81+ messages in thread
From: Benjamin LaHaise @ 2003-09-19 10:49 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Peter Chubb, Grant Grundler, Peter Chubb, linux-ns83820,
	linux-ia64, linux-kernel

On Fri, Sep 19, 2003 at 07:53:04AM +0200, Andi Kleen wrote:
> You must add the 2 bytes in dev_alloc_skb() too.
> 
> (it also add 16 bytes by itself, but they are used for other things)

Sorry, the chipset does not support packet rx into unaligned buffers.  
stoopid.  Fwiw, the copy code got removed by davem -- isn't the core 
dealubg wuth unakugned packets now?

		-ben
-- 
"The software industry today survives only through an unstated agreement 
not to stir things up too much.  We must hope this lawsuit [by SCO] isn't 
the big stirring spoon." -- Ian Lance Taylor, Linux Journal, June 2003

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-19 10:49         ` Benjamin LaHaise
@ 2003-09-23  0:34           ` Peter Chubb
  2003-09-23  0:36             ` Benjamin LaHaise
  0 siblings, 1 reply; 81+ messages in thread
From: Peter Chubb @ 2003-09-23  0:34 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Andi Kleen, Peter Chubb, Grant Grundler, Peter Chubb,
	linux-ns83820, linux-ia64, linux-kernel


OK, a patch for the driver in 2.6.0-test5 is appended.

I suspect that there are other architectures that don't like unaligned
accesses...  feel free to add them to the #ifdef.
This is based on code I found in the revision history.  It seems to
work.

Without this patch, the console messages saying `unaligned access'
would come out fast enough and often enough to delay and or miss
interrupts, leading to an eventual machine hangup on I2000.

===== drivers/net/ns83820.c 1.30 vs edited =====
--- 1.30/drivers/net/ns83820.c	Thu Sep 11 09:46:45 2003
+++ edited/drivers/net/ns83820.c	Mon Sep 22 12:49:18 2003
@@ -793,6 +793,25 @@
 	}
 }
 
+#if defined(__ia64__)
+static inline struct sk_buff *skb_realign_iphdr(struct sk_buff *skb, int len, struct ns83820 *dev)
+{
+	struct sk_buff *tmp = __dev_alloc_skb(len+2, GFP_ATOMIC);
+	if (!tmp)
+		return NULL;
+	tmp->dev = &dev->net_dev;
+	skb_reserve(tmp, 2);
+	memcpy(skb_put(tmp, len), skb->data, len);
+	kfree_skb(skb);
+	return tmp;
+}
+#else
+static inline  struct sk_buff *skb_realign_iphdr(struct sk_buff *skb, int len, struct ns83820 *dev)
+{
+	return skb;
+}
+#endif
+
 static void FASTCALL(ns83820_rx_kick(struct ns83820 *dev));
 static void ns83820_rx_kick(struct ns83820 *dev)
 {
@@ -862,6 +881,7 @@
 		if (likely(CMDSTS_OK & cmdsts)) {
 			int len = cmdsts & 0xffff;
 			skb_put(skb, len);
+			skb = skb_realign_iphdr(skb, len, dev);
 			if (unlikely(!skb))
 				goto netdev_mangle_me_harder_failed;
 			if (cmdsts & CMDSTS_DEST_MULTI)

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23  0:34           ` Peter Chubb
@ 2003-09-23  0:36             ` Benjamin LaHaise
  2003-09-23  6:22               ` David S. Miller
  0 siblings, 1 reply; 81+ messages in thread
From: Benjamin LaHaise @ 2003-09-23  0:36 UTC (permalink / raw)
  To: Peter Chubb, davem
  Cc: Andi Kleen, Grant Grundler, Peter Chubb, linux-ns83820,
	linux-ia64, linux-kernel

Denied.  Dave, please explain.

		-ben

On Tue, Sep 23, 2003 at 10:34:18AM +1000, Peter Chubb wrote:
> 
> OK, a patch for the driver in 2.6.0-test5 is appended.
> 
> I suspect that there are other architectures that don't like unaligned
> accesses...  feel free to add them to the #ifdef.
> This is based on code I found in the revision history.  It seems to
> work.
> 
> Without this patch, the console messages saying `unaligned access'
> would come out fast enough and often enough to delay and or miss
> interrupts, leading to an eventual machine hangup on I2000.
> 
> ===== drivers/net/ns83820.c 1.30 vs edited =====
> --- 1.30/drivers/net/ns83820.c	Thu Sep 11 09:46:45 2003
> +++ edited/drivers/net/ns83820.c	Mon Sep 22 12:49:18 2003
> @@ -793,6 +793,25 @@
>  	}
>  }
>  
> +#if defined(__ia64__)
> +static inline struct sk_buff *skb_realign_iphdr(struct sk_buff *skb, int len, struct ns83820 *dev)
> +{
> +	struct sk_buff *tmp = __dev_alloc_skb(len+2, GFP_ATOMIC);
> +	if (!tmp)
> +		return NULL;
> +	tmp->dev = &dev->net_dev;
> +	skb_reserve(tmp, 2);
> +	memcpy(skb_put(tmp, len), skb->data, len);
> +	kfree_skb(skb);
> +	return tmp;
> +}
> +#else
> +static inline  struct sk_buff *skb_realign_iphdr(struct sk_buff *skb, int len, struct ns83820 *dev)
> +{
> +	return skb;
> +}
> +#endif
> +
>  static void FASTCALL(ns83820_rx_kick(struct ns83820 *dev));
>  static void ns83820_rx_kick(struct ns83820 *dev)
>  {
> @@ -862,6 +881,7 @@
>  		if (likely(CMDSTS_OK & cmdsts)) {
>  			int len = cmdsts & 0xffff;
>  			skb_put(skb, len);
> +			skb = skb_realign_iphdr(skb, len, dev);
>  			if (unlikely(!skb))
>  				goto netdev_mangle_me_harder_failed;
>  			if (cmdsts & CMDSTS_DEST_MULTI)

-- 
"The software industry today survives only through an unstated agreement 
not to stir things up too much.  We must hope this lawsuit [by SCO] isn't 
the big stirring spoon." -- Ian Lance Taylor, Linux Journal, June 2003

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23  0:36             ` Benjamin LaHaise
@ 2003-09-23  6:22               ` David S. Miller
  2003-09-23 10:40                 ` Peter Chubb
  0 siblings, 1 reply; 81+ messages in thread
From: David S. Miller @ 2003-09-23  6:22 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: peter, ak, iod00d, peterc, linux-ns83820, linux-ia64, linux-kernel

On Mon, 22 Sep 2003 20:36:29 -0400
Benjamin LaHaise <bcrl@kvack.org> wrote:

> Denied.  Dave, please explain.

Why should I have anything to explain? :-)

The fact that ia64 is doing a printk for an unaligned kernel
load or store is what you should be asking questions about :)

It's one thing if ia64 keeps track of unaligned accesses as
a counter statistic, but emitting a printk for everyone is
pretty anti-social.

Unaligned accesses in the kernel are perfectly normal, and are
absolutely going to happen in various kinds of networking setups.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23  6:22               ` David S. Miller
@ 2003-09-23 10:40                 ` Peter Chubb
  2003-09-23 10:51                   ` David S. Miller
  0 siblings, 1 reply; 81+ messages in thread
From: Peter Chubb @ 2003-09-23 10:40 UTC (permalink / raw)
  To: David S. Miller
  Cc: Benjamin LaHaise, peter, ak, iod00d, peterc, linux-ns83820,
	linux-ia64, linux-kernel

>>>>> "David" == David S Miller <davem@redhat.com> writes:

David> On Mon, 22 Sep 2003 20:36:29 -0400 Benjamin LaHaise
David> <bcrl@kvack.org> wrote:

>> Denied.  Dave, please explain.

David> Why should I have anything to explain? :-)

David> The fact that ia64 is doing a printk for an unaligned kernel
David> load or store is what you should be asking questions about :)

How expensive is it to take the trap and do a fix up, compared to
making an aligned copy?  As it involves raising and handling a fault
disassembling the instruction that caused the fault, etc., I'd be
surprised if it's much less than 1000 cycles, even without the printk,
although I haven't measured it yet, and can't find enough info in the
architecture manuals to know what it is.

Even if it's only 500 cycles, you can copy and realign a large packet
in that time. 

--
Dr Peter Chubb  http://www.gelato.unsw.edu.au  peterc AT gelato.unsw.edu.au
You are lost in a maze of BitKeeper repositories,   all slightly different.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 10:40                 ` Peter Chubb
@ 2003-09-23 10:51                   ` David S. Miller
  2003-09-23 14:59                     ` David Mosberger
  0 siblings, 1 reply; 81+ messages in thread
From: David S. Miller @ 2003-09-23 10:51 UTC (permalink / raw)
  To: Peter Chubb
  Cc: bcrl, peter, ak, iod00d, peterc, linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 20:40:05 +1000
Peter Chubb <peter@chubb.wattle.id.au> wrote:

> How expensive is it to take the trap and do a fix up, compared to
> making an aligned copy?  As it involves raising and handling a fault
> disassembling the instruction that caused the fault, etc., I'd be
> surprised if it's much less than 1000 cycles, even without the printk,
> although I haven't measured it yet, and can't find enough info in the
> architecture manuals to know what it is.

A cache miss can cause 100 or so cycles. :)

And unlike the fixup trap, the printk wakes up a process and
causes disk activity as syslogd writes to the kernel message
log file.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 10:51                   ` David S. Miller
@ 2003-09-23 14:59                     ` David Mosberger
  2003-09-23 17:27                       ` David S. Miller
  0 siblings, 1 reply; 81+ messages in thread
From: David Mosberger @ 2003-09-23 14:59 UTC (permalink / raw)
  To: David S. Miller
  Cc: Peter Chubb, bcrl, ak, iod00d, peterc, linux-ns83820, linux-ia64,
	linux-kernel

>>>>> On Tue, 23 Sep 2003 03:51:18 -0700, "David S. Miller" <davem@redhat.com> said:

  David> On Tue, 23 Sep 2003 20:40:05 +1000 Peter Chubb
  David> <peter@chubb.wattle.id.au> wrote:

  >> How expensive is it to take the trap and do a fix up, compared to
  >> making an aligned copy?  As it involves raising and handling a
  >> fault disassembling the instruction that caused the fault, etc.,
  >> I'd be surprised if it's much less than 1000 cycles, even without
  >> the printk, although I haven't measured it yet, and can't find
  >> enough info in the architecture manuals to know what it is.

  David> A cache miss can cause 100 or so cycles. :)

  David> And unlike the fixup trap, the printk wakes up a process and
  David> causes disk activity as syslogd writes to the kernel message
  David> log file.

The printk() is rate-controlled and doesn't happen for every unaligned
access.  It's average cost can be made as low as we want to, by adjusting
the rate.

	--david


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 14:59                     ` David Mosberger
@ 2003-09-23 17:27                       ` David S. Miller
  2003-09-23 17:56                         ` David Mosberger
  2003-09-23 23:04                         ` Ian Wienand
  0 siblings, 2 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-23 17:27 UTC (permalink / raw)
  To: davidm
  Cc: davidm, peter, bcrl, ak, iod00d, peterc, linux-ns83820,
	linux-ia64, linux-kernel

On Tue, 23 Sep 2003 07:59:11 -0700
David Mosberger <davidm@napali.hpl.hp.com> wrote:

> The printk() is rate-controlled and doesn't happen for every unaligned
> access.  It's average cost can be made as low as we want to, by adjusting
> the rate.

But if the event is normal, you shouldn't be logging it as if
it weren't.

Anyone who tries to use IP over appletalk or certain protocols
over PPP are going to see your silly messages.

As I understand it, you even do this stupid printk for user apps
as well, that makes it more than rediculious.  I'd be surprised
if anyone can find any useful kernel messages on an ia64 system
in the dmesg output with all the unaligned access crap there.


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 17:27                       ` David S. Miller
@ 2003-09-23 17:56                         ` David Mosberger
  2003-09-24  2:55                           ` David S. Miller
  2003-09-23 23:04                         ` Ian Wienand
  1 sibling, 1 reply; 81+ messages in thread
From: David Mosberger @ 2003-09-23 17:56 UTC (permalink / raw)
  To: David S. Miller
  Cc: davidm, davidm, peter, bcrl, ak, iod00d, peterc, linux-ns83820,
	linux-ia64, linux-kernel

>>>>> On Tue, 23 Sep 2003 10:27:35 -0700, "David S. Miller" <davem@redhat.com> said:

  DaveM> On Tue, 23 Sep 2003 07:59:11 -0700
  DaveM> David Mosberger <davidm@napali.hpl.hp.com> wrote:

  >> The printk() is rate-controlled and doesn't happen for every unaligned
  >> access.  It's average cost can be made as low as we want to, by adjusting
  >> the rate.

  DaveM> But if the event is normal, you shouldn't be logging it as if
  DaveM> it weren't.

An event that causes a slow-down of 500 times or so is not "normal".
On Alpha, we did have just a counter.  Guess what, nobody ever noticed
when things ran much slower than they should have.

  DaveM> Anyone who tries to use IP over appletalk or certain protocols
  DaveM> over PPP are going to see your silly messages.

  DaveM> As I understand it, you even do this stupid printk for user apps
  DaveM> as well, that makes it more than rediculious.  I'd be surprised
  DaveM> if anyone can find any useful kernel messages on an ia64 system
  DaveM> in the dmesg output with all the unaligned access crap there.

Ever heard of prctl --unalign=silent?
I don't normaly do that and I still get very few unaligned warning messages.

	--david

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 17:27                       ` David S. Miller
  2003-09-23 17:56                         ` David Mosberger
@ 2003-09-23 23:04                         ` Ian Wienand
  2003-09-23 23:35                           ` David S. Miller
  1 sibling, 1 reply; 81+ messages in thread
From: Ian Wienand @ 2003-09-23 23:04 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-ia64, linux-kernel

On Tue, Sep 23, 2003 at 10:27:35AM -0700, David S. Miller wrote:
> As I understand it, you even do this stupid printk for user apps
> as well, that makes it more than rediculious.
                                                                                                                                                      
Just as a point of interest, as an application programmer I think it's
the type of thing you want to know about quite loudly.  The only
benchmark style figure on this I've ever seen is in this paper
                                                                                                                                                      
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dv_vstechart/html/vcconWindowsDataAlignmentOnIPFX86X86-64.asp
                                                                                                                                                      
(forgive the long url) which, if you scroll down to the little graph,
shows OS fixup is about 450 times slower that an aligned access.  That
sucks, and if I didn't realise I'd done it, it would suck even more.

Oh, and I think fixing it up automatically & warning is much more
useful than sending a (by default) terminating signal to the program;
though I'm sure others might disagree.
                                                                                                                                                      
-i
ianw@gelato.unsw.edu.au
http://www.gelato.unsw.edu.au

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 23:04                         ` Ian Wienand
@ 2003-09-23 23:35                           ` David S. Miller
  0 siblings, 0 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-23 23:35 UTC (permalink / raw)
  To: Ian Wienand; +Cc: linux-ia64, linux-kernel

On Wed, 24 Sep 2003 09:04:12 +1000
Ian Wienand <ianw@gelato.unsw.edu.au> wrote:

> Just as a point of interest, as an application programmer I think it's
> the type of thing you want to know about quite loudly.

Sure, and I agree that you should have the choice to see
these messages at the verbosity you desire, and ia64 has
the capability to control things in this way.

What I disagree with is spitting out log messages about such things
happening in the kernel.  Because that is remotely triggerable.  As
Alan mentioned, if I can send packets to your ia64 box I can make it
spew system log messages.


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 17:56                         ` David Mosberger
@ 2003-09-24  2:55                           ` David S. Miller
  0 siblings, 0 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-24  2:55 UTC (permalink / raw)
  To: davidm
  Cc: davidm, peter, bcrl, ak, iod00d, peterc, linux-ns83820,
	linux-ia64, linux-kernel

On Tue, 23 Sep 2003 10:56:33 -0700
David Mosberger <davidm@napali.hpl.hp.com> wrote:

> An event that causes a slow-down of 500 times or so is not "normal".

For the slow path of IP option processing it is.  When using IP
embedded in appletalk, it is.

It's all slow path stuff.

Why is this so hard to understand?

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-27  4:26           ` Andi Kleen
  2003-09-27  4:34             ` David Mosberger
@ 2003-09-27 23:36             ` David S. Miller
  1 sibling, 0 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-27 23:36 UTC (permalink / raw)
  To: Andi Kleen; +Cc: davidm, ak, linux-kernel

On Sat, 27 Sep 2003 06:26:40 +0200
Andi Kleen <ak@muc.de> wrote:

> You handle misalignment->misalignment copies with zero or small cost -
> when both source and destination have the same misalignment. I guess
> you do that by just aligning the pointer at the beginning of the 
> function. That works as long as both source and destination have
> the same misalignment.

Nope, different alignments are fully optimized as well.

Sparc64 VIS has special instructions so that you can handle
all of these different src/dst alignment cases and still use
the 64-byte-at-a-time cache bypassing loads and stores.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-27  4:36         ` Andi Kleen
@ 2003-09-27  4:52           ` David Mosberger
  0 siblings, 0 replies; 81+ messages in thread
From: David Mosberger @ 2003-09-27  4:52 UTC (permalink / raw)
  To: Andi Kleen; +Cc: davidm, David S. Miller, linux-kernel

>>>>> On Sat, 27 Sep 2003 06:36:31 +0200, Andi Kleen <ak@colin2.muc.de> said:

  Andi> Use one receive descriptor for the MAC header and another for
  Andi> the rest (IP/TCP/payload) In the rest everything is aligned
  Andi> and there are no problems with misaligned data types (ignoring
  Andi> exceptional cases like broken timestamp options which can be
  Andi> handled slowly).

  >>  That's what I proposed.

  Andi> Hmm, must have missed that sorry. I saw it in Ivan's post
  Andi> only.

Hmmh, I wonder whether I accidentally replied via the newsgroup.  I
see my post in the newsgroup, but not on the linux-kernel archive.
Anyhow, I attached the original mail below.

  Andi> Fixing the stack to hadle separate MAC headers should not be
  Andi> that much work, the code is fairly limited. Drawback is that
  Andi> it requires bigger changes to the network drivers and maybe
  Andi> some special case code for non standard MAC headers.

  >>  I suspect you'd be better off just copying the 14 bytes of the
  >> Ethernet header so that the entire packet is contiguous.

  Andi> Good point. So it can be even handled transparently in drivers
  Andi> without any core stack changes.

  Andi> Only drawback is that more RX descriptors also usually require
  Andi> more PCI writes, which can be also very slow. Needs to be
  Andi> benchmarked carefully if it's a worthy trade-off.

Yes, I totally agree on this one.  Descriptor fetching is usually
quite expensive, though depending on how smart the PCI controller is,
some of the cost could be hidden.

	--david

---------------------------------------------------------------------
From: David Mosberger <davidm@napali.hpl.hp.com>
Subject: Re: NS83820 2.6.0-test5 driver seems unstable on IA64
Newsgroups: linux.kernel
Date: Fri, 26 Sep 2003 08:50:09 +0200
Reply-To: davidm@hpl.hp.com
Organization: linux.* mail to news gateway

>>>>> On Thu, 25 Sep 2003 23:07:02 -0700, "David S. Miller" <davem@redhat.com> said:

  David> On Fri, 26 Sep 2003 08:16:36 +0200 Manfred Spraul
  David> <manfred@colorfullife.com> wrote:

  >> Is that really the right solution? Add a full-packet copy to
  >> every driver?

  David> In the short term, yes it is.

I know nothing about the ns83820, but page 22 of the data sheet
(http://www.national.com/pf/DP/DP83820.html) suggests that it would be
possible to setup _two_ descriptors for each incoming packet: the
first one would cover the Ethernet header (14 bytes), the second the
rest.  That way, IP-header would be 8-byte aligned (since the
ns83820's DMA engine seems to require 8-byte alignment for incoming
data).

If so, this would let you get the IP-header and on aligned at the cost
of extra descriptor fetching.  The cost of fetching the extra
descriptor will be big enough that you'd only want to do this when
unaligned accesses are expensive, but hopefully it would be cheaper
than copying the entire packet.

	--david

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-26 17:47       ` David Mosberger
  2003-09-27  3:57         ` David S. Miller
@ 2003-09-27  4:36         ` Andi Kleen
  2003-09-27  4:52           ` David Mosberger
  1 sibling, 1 reply; 81+ messages in thread
From: Andi Kleen @ 2003-09-27  4:36 UTC (permalink / raw)
  To: davidm; +Cc: David S. Miller, linux-kernel

On Fri, Sep 26, 2003 at 10:47:12AM -0700, David Mosberger wrote:
> >>>>> On Fri, 26 Sep 2003 19:04:30 +0200, Andi Kleen <ak@muc.de> said:
> 
>   Andi> David Mosberger <davidm@napali.hpl.hp.com> writes:
>   >> Why would that be?  Slower, yes, but very slow?
> 
>   Andi> It depends on your architecture I guess. On K8/x86-64 it isn't
>   Andi> that big a deal (one cycle penalty for unaligned accesses),
>   Andi> but if you take an exception for each unaligned read it will
>   Andi> be incredible slow.
> 
> What are you talking about?  Arches that don't like/support unaligned

They don't in this case ;-)

See my reply to David Miller for details.

> 
>   Andi> Use one receive descriptor for the MAC header and another for
>   Andi> the rest (IP/TCP/payload) In the rest everything is aligned
>   Andi> and there are no problems with misaligned data types (ignoring
>   Andi> exceptional cases like broken timestamp options which can be
>   Andi> handled slowly).
> 
> That's what I proposed.

Hmm, must have missed that sorry. I saw it in Ivan's post only.

>   Andi> Fixing the stack to hadle separate MAC headers should not be
>   Andi> that much work, the code is fairly limited. Drawback is that
>   Andi> it requires bigger changes to the network drivers and maybe
>   Andi> some special case code for non standard MAC headers.
> 
> I suspect you'd be better off just copying the 14 bytes of the
> Ethernet header so that the entire packet is contiguous.

Good point. So it can be even handled transparently in drivers without
any core stack changes.

Only drawback is that more RX descriptors also usually require 
more PCI writes, which can be also very slow. Needs to be benchmarked
carefully if it's a worthy trade-off.

-Andi

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-27  4:26           ` Andi Kleen
@ 2003-09-27  4:34             ` David Mosberger
  2003-09-27 23:36             ` David S. Miller
  1 sibling, 0 replies; 81+ messages in thread
From: David Mosberger @ 2003-09-27  4:34 UTC (permalink / raw)
  To: Andi Kleen; +Cc: David S. Miller, davidm, linux-kernel

>>>>> On Sat, 27 Sep 2003 06:26:40 +0200, Andi Kleen <ak@muc.de> said:

  Andi> You handle misalignment->misalignment copies with zero or
  Andi> small cost - when both source and destination have the same
  Andi> misalignment. I guess you do that by just aligning the pointer
  Andi> at the beginning of the function. That works as long as both
  Andi> source and destination have the same misalignment.

  Andi> But that is not what happens here. The copy is a
  Andi> misaligned->aligned copy and that cannot be handled at zero
  Andi> cost (unless you have zero misalignment penalty in
  Andi> load/store). Either the load or the store in the copy loop
  Andi> will be always misaligned, no matter what tricks you play. You
  Andi> cannot avoid this by aligning the pointers.

Geez, Andi, get a clue.  You've been working with x86 for too long.

You really can't imagine that you could combine two source words into
a single destination word?  Look ma, no unaligned accesses...

	--david

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
       [not found]         ` <AepM.5Zb.41@gated-at.bofh.it>
@ 2003-09-27  4:26           ` Andi Kleen
  2003-09-27  4:34             ` David Mosberger
  2003-09-27 23:36             ` David S. Miller
  0 siblings, 2 replies; 81+ messages in thread
From: Andi Kleen @ 2003-09-27  4:26 UTC (permalink / raw)
  To: David S. Miller; +Cc: davidm, ak, linux-kernel

"David S. Miller" <davem@redhat.com> writes:

> On Fri, 26 Sep 2003 10:47:12 -0700
> David Mosberger <davidm@napali.hpl.hp.com> wrote:
>
>> Arches that don't like/support unaligned accesses will certainly
>> have a copy_user() which handles misalignment just fine.  For
>> example, on ia64, the copy will run slightly slower because of an
>> additional shift in the loop, but that penalty only shows on fully
>> cached data.
>
> Exactly correct, and on many platforms (sparc64 happens to be one)
> there is _ZERO_ performance penalty for misalignment or source or
> destination buffer during a memcpy/memmove.

You handle misalignment->misalignment copies with zero or small cost -
when both source and destination have the same misalignment. I guess
you do that by just aligning the pointer at the beginning of the 
function. That works as long as both source and destination have
the same misalignment.

But that is not what happens here. The copy is a misaligned->aligned
copy and that cannot be handled at zero cost (unless you have zero
misalignment penalty in load/store). Either the load or the store in the 
copy loop will be always misaligned, no matter what tricks you play. You 
cannot avoid this by aligning the pointers.

The user buffers tend to be aligned, so when the kernel data is misaligned
the copy ends up requiring an misaligned load in the inner copy loop.

Of course you could try to teach the user to use intentionally misaligned
buffers in his programs to avoid this, but it would be likely a hard 
sell and be a bad idea on less crippled NICs.

The copy inside the device driver has the same problem: misaligned 
data -> aligned skb => the load in the copy will be misaligned, the store 
aligned. You avoid a few unaligned loads in the core stack, at the cost
of hundreds in the copy. Seems like a bad trade-off to me.

-Andi

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-26 17:47       ` David Mosberger
@ 2003-09-27  3:57         ` David S. Miller
  2003-09-27  4:36         ` Andi Kleen
  1 sibling, 0 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-27  3:57 UTC (permalink / raw)
  To: davidm; +Cc: davidm, ak, linux-kernel

On Fri, 26 Sep 2003 10:47:12 -0700
David Mosberger <davidm@napali.hpl.hp.com> wrote:

> Arches that don't like/support unaligned accesses will certainly
> have a copy_user() which handles misalignment just fine.  For
> example, on ia64, the copy will run slightly slower because of an
> additional shift in the loop, but that penalty only shows on fully
> cached data.

Exactly correct, and on many platforms (sparc64 happens to be one)
there is _ZERO_ performance penalty for misalignment or source or
destination buffer during a memcpy/memmove.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-26 14:38   ` Ivan Kokshaysky
@ 2003-09-27  3:12     ` David S. Miller
  0 siblings, 0 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-27  3:12 UTC (permalink / raw)
  To: Ivan Kokshaysky; +Cc: manfred, linux-kernel

On Fri, 26 Sep 2003 18:38:27 +0400
Ivan Kokshaysky <ink@jurassic.park.msu.ru> wrote:

> What about aligning the packet directly in the rx buffer (by memmoving
> the entire packet to buf+2) instead of copying to another skb?
> This appears to be a) more than 2 times faster b) easy to implement.

That's another possibility.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-26 17:04     ` Andi Kleen
  2003-09-26 17:22       ` Chris Friesen
@ 2003-09-26 17:47       ` David Mosberger
  2003-09-27  3:57         ` David S. Miller
  2003-09-27  4:36         ` Andi Kleen
  1 sibling, 2 replies; 81+ messages in thread
From: David Mosberger @ 2003-09-26 17:47 UTC (permalink / raw)
  To: Andi Kleen; +Cc: davidm, linux-kernel

>>>>> On Fri, 26 Sep 2003 19:04:30 +0200, Andi Kleen <ak@muc.de> said:

  Andi> David Mosberger <davidm@napali.hpl.hp.com> writes:
  >> Why would that be?  Slower, yes, but very slow?

  Andi> It depends on your architecture I guess. On K8/x86-64 it isn't
  Andi> that big a deal (one cycle penalty for unaligned accesses),
  Andi> but if you take an exception for each unaligned read it will
  Andi> be incredible slow.

What are you talking about?  Arches that don't like/support unaligned
accesses will certainly have a copy_user() which handles misalignment
just fine.  For example, on ia64, the copy will run slightly slower
because of an additional shift in the loop, but that penalty only
shows on fully cached data.  If the data is not cached, the copy speed
for aligned vs. misaligned data is virtually the same.

  Andi> Use one receive descriptor for the MAC header and another for
  Andi> the rest (IP/TCP/payload) In the rest everything is aligned
  Andi> and there are no problems with misaligned data types (ignoring
  Andi> exceptional cases like broken timestamp options which can be
  Andi> handled slowly).

That's what I proposed.

  Andi> Fixing the stack to handle separate MAC headers should not be
  Andi> that much work, the code is fairly limited. Drawback is that
  Andi> it requires bigger changes to the network drivers and maybe
  Andi> some special case code for non standard MAC headers.

I suspect you'd be better off just copying the 14 bytes of the
Ethernet header so that the entire packet is contiguous.

	--david

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-26 17:04     ` Andi Kleen
@ 2003-09-26 17:22       ` Chris Friesen
  2003-09-26 17:47       ` David Mosberger
  1 sibling, 0 replies; 81+ messages in thread
From: Chris Friesen @ 2003-09-26 17:22 UTC (permalink / raw)
  To: Andi Kleen; +Cc: davidm, linux-kernel

Andi Kleen wrote:

> The solution proposed by Ivan sounds much better. The basic problem
> is that the Ethernet header is not a multiple of 4 and that misaligns
> everything after it.

At least some hardware will offset the incoming packet by two bytes to 
align everything.  Whatever mechanism we end up using, it would be nice 
if it could make use of the hardware that is capable of this.

Chris

-- 
Chris Friesen                    | MailStop: 043/33/F10
Nortel Networks                  | work: (613) 765-0557
3500 Carling Avenue              | fax:  (613) 765-2986
Nepean, ON K2H 8E9 Canada        | email: cfriesen@nortelnetworks.com


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
       [not found]   ` <A317.6GH.7@gated-at.bofh.it>
@ 2003-09-26 17:04     ` Andi Kleen
  2003-09-26 17:22       ` Chris Friesen
  2003-09-26 17:47       ` David Mosberger
  0 siblings, 2 replies; 81+ messages in thread
From: Andi Kleen @ 2003-09-26 17:04 UTC (permalink / raw)
  To: davidm; +Cc: linux-kernel

David Mosberger <davidm@napali.hpl.hp.com> writes:

> Why would that be?  Slower, yes, but very slow?

It depends on your architecture I guess. On K8/x86-64 it isn't that big
a deal (one cycle penalty for unaligned accesses), but if you take an
exception for each unaligned read it will be incredible slow. 

Of course the copy in the driver has the same problem, so it's not
much better. 

My point was basically that the header access are peanuts compared to
the unaligned copy. It would be much better to optimize the copy than
the header access. The proposal of just copying the header does not
address this. And copying the packet in the driver has the same problem,
so IMHO it's useless.

The solution proposed by Ivan sounds much better. The basic problem
is that the Ethernet header is not a multiple of 4 and that misaligns
everything after it. Use one receive descriptor for the MAC header and 
another for the rest (IP/TCP/payload) In the rest everything is aligned and 
there are no problems with misaligned data types (ignoring exceptional cases 
like broken timestamp options which can be handled slowly). Fixing the 
stack to handle separate MAC headers should not be that much work, the 
code is fairly limited. Drawback is that it requires bigger changes to the 
network drivers and maybe some special case code for non standard MAC
headers.

-Andi

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-26 15:14 ` Andi Kleen
@ 2003-09-26 15:41   ` David Mosberger
  0 siblings, 0 replies; 81+ messages in thread
From: David Mosberger @ 2003-09-26 15:41 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Manfred Spraul, linux-kernel

>>>>> On Fri, 26 Sep 2003 17:14:57 +0200, Andi Kleen <ak@muc.de> said:

  Andi> Manfred Spraul <manfred@colorfullife.com> writes:
  >> David wrote:

  >>> Fine, then we should have something like an rx_copybreak scheme
  >>> in the ns83820 driver too.

  >> Is that really the right solution? Add a full-packet copy to
  >> every driver?  IMHO the fastest solution would be to copy only
  >> the ip & tcp headers, and keep the rest as it is. And preferable
  >> in the network core, to avoid having to copy&paste that into
  >> every driver.

  Andi> One problem is that you still have an unaligned->aligned copy
  Andi> to user space in recvmsg (the user buffer is usually aligned
  Andi> and the network payload will be unaligned). And that will be
  Andi> very slow.

Why would that be?  Slower, yes, but very slow?

	--david

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
       [not found] <zU7D.2Ji.27@gated-at.bofh.it>
@ 2003-09-26 15:14 ` Andi Kleen
  2003-09-26 15:41   ` David Mosberger
  0 siblings, 1 reply; 81+ messages in thread
From: Andi Kleen @ 2003-09-26 15:14 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel

Manfred Spraul <manfred@colorfullife.com> writes:

> David wrote:
>
>>Fine, then we should have something like an rx_copybreak scheme in
>>the ns83820 driver too.
>>
> Is that really the right solution? Add a full-packet copy to every driver?
> IMHO the fastest solution would be to copy only the ip & tcp headers,
> and keep the rest as it is. And preferable in the network core, to
> avoid having to copy&paste that into every driver.

One problem is that you still have an unaligned->aligned copy to user space
in recvmsg (the user buffer is usually aligned and the network payload 
will be unaligned). And that will be very slow.

-Andi

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-26  6:07 ` David S. Miller
  2003-09-26  6:41   ` David Mosberger
@ 2003-09-26 14:38   ` Ivan Kokshaysky
  2003-09-27  3:12     ` David S. Miller
  1 sibling, 1 reply; 81+ messages in thread
From: Ivan Kokshaysky @ 2003-09-26 14:38 UTC (permalink / raw)
  To: David S. Miller; +Cc: Manfred Spraul, linux-kernel

On Thu, Sep 25, 2003 at 11:07:02PM -0700, David S. Miller wrote:
> On Fri, 26 Sep 2003 08:16:36 +0200
> Manfred Spraul <manfred@colorfullife.com> wrote:
> 
> > Is that really the right solution? Add a full-packet copy to every driver?
> 
> In the short term, yes it is.

What about aligning the packet directly in the rx buffer (by memmoving
the entire packet to buf+2) instead of copying to another skb?
This appears to be a) more than 2 times faster b) easy to implement.

Ivan.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-26  6:07 ` David S. Miller
@ 2003-09-26  6:41   ` David Mosberger
  2003-09-26 14:38   ` Ivan Kokshaysky
  1 sibling, 0 replies; 81+ messages in thread
From: David Mosberger @ 2003-09-26  6:41 UTC (permalink / raw)
  To: David S. Miller; +Cc: Manfred Spraul, linux-kernel

>>>>> On Thu, 25 Sep 2003 23:07:02 -0700, "David S. Miller" <davem@redhat.com> said:

  David> On Fri, 26 Sep 2003 08:16:36 +0200 Manfred Spraul
  David> <manfred@colorfullife.com> wrote:

  >> Is that really the right solution? Add a full-packet copy to
  >> every driver?

  David> In the short term, yes it is.

I know nothing about the ns83820, but page 22 of the data sheet
(http://www.national.com/pf/DP/DP83820.html) suggests that it would be
possible to setup _two_ descriptors for each incoming packet: the
first one would cover the Ethernet header (14 bytes), the second the
rest.  That way, IP-header would be 8-byte aligned (since the
ns83820's DMA engine seems to require 8-byte alignment for incoming
data).

If so, this would let you get the IP-header and on aligned at the cost
of extra descriptor fetching.  The cost of fetching the extra
descriptor will be big enough that you'd only want to do this when
unaligned accesses are expensive, but hopefully it would be cheaper
than copying the entire packet.

	--david

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
@ 2003-09-26  6:16 Manfred Spraul
  2003-09-26  6:07 ` David S. Miller
  0 siblings, 1 reply; 81+ messages in thread
From: Manfred Spraul @ 2003-09-26  6:16 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel

David wrote:

>Fine, then we should have something like an rx_copybreak scheme in
>the ns83820 driver too.
>  
>
Is that really the right solution? Add a full-packet copy to every driver?
IMHO the fastest solution would be to copy only the ip & tcp headers, 
and keep the rest as it is. And preferable in the network core, to avoid 
having to copy&paste that into every driver.

--
    Manfred


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-26  6:16 Manfred Spraul
@ 2003-09-26  6:07 ` David S. Miller
  2003-09-26  6:41   ` David Mosberger
  2003-09-26 14:38   ` Ivan Kokshaysky
  0 siblings, 2 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-26  6:07 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel

On Fri, 26 Sep 2003 08:16:36 +0200
Manfred Spraul <manfred@colorfullife.com> wrote:

> Is that really the right solution? Add a full-packet copy to every driver?

In the short term, yes it is.

> IMHO the fastest solution would be to copy only the ip & tcp headers, 
> and keep the rest as it is. And preferable in the network core, to avoid 
> having to copy&paste that into every driver.

You're absolutely correct, but implementing this is far from trivial.
There are assumptions all over the place that the protocol data
immediately follows the protocol headers.

If you're willing to do all of the auditing and work, have at it.
:-)

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-25 16:11               ` Ivan Kokshaysky
@ 2003-09-26  1:12                 ` David S. Miller
  0 siblings, 0 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-26  1:12 UTC (permalink / raw)
  To: Ivan Kokshaysky
  Cc: davidm, davidm, kevin.vanmaren, peter, bcrl, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Thu, 25 Sep 2003 20:11:27 +0400
Ivan Kokshaysky <ink@jurassic.park.msu.ru> wrote:

> On Alpha it is - I did some experimentation yesterday with
> the tulip driver and "rx_copybreak=0" vs default "rx_copybreak=1518".
> That's true, copying the large packet is very expensive due to constant
> cache misses. On UP1500 it costs more than 2500 cycles per 1.5Kb
> copy on large ftp transfers.
> OTOH, minimal time to handle unaligned load is ~200 cycles (ev6 cpu),
> but only when the trap handler itself and its data are in L1 cache
> (like doing unaligned load in a tight loop). In the real life average
> time is ~400 cycles due to cache misses.
> Further, with "rx_copybreak=0" I've got 20 (!) unaligned traps per TCP
> packet, i.e. penalty was 8000 vs 2500 cycles...

Fine, then we should have something like an rx_copybreak scheme in
the ns83820 driver too.

Thanks for your data.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-24  3:08             ` David S. Miller
@ 2003-09-25 16:11               ` Ivan Kokshaysky
  2003-09-26  1:12                 ` David S. Miller
  0 siblings, 1 reply; 81+ messages in thread
From: Ivan Kokshaysky @ 2003-09-25 16:11 UTC (permalink / raw)
  To: David S. Miller
  Cc: davidm, davidm, kevin.vanmaren, peter, bcrl, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, Sep 23, 2003 at 08:08:32PM -0700, David S. Miller wrote:
> It's not even decided that the copying version is faster.

On Alpha it is - I did some experimentation yesterday with
the tulip driver and "rx_copybreak=0" vs default "rx_copybreak=1518".
That's true, copying the large packet is very expensive due to constant
cache misses. On UP1500 it costs more than 2500 cycles per 1.5Kb
copy on large ftp transfers.
OTOH, minimal time to handle unaligned load is ~200 cycles (ev6 cpu),
but only when the trap handler itself and its data are in L1 cache
(like doing unaligned load in a tight loop). In the real life average
time is ~400 cycles due to cache misses.
Further, with "rx_copybreak=0" I've got 20 (!) unaligned traps per TCP
packet, i.e. penalty was 8000 vs 2500 cycles...
Note that my network setup is as simple as possible - no routing/filtering
etc., just a networked workstation. I guess one with a more complex
routing/firewalling configuration would have a LOT more unaligned
traps per packet.

Ivan.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 22:58 Luck, Tony
  2003-09-23 23:32 ` David S. Miller
@ 2003-09-24  5:18 ` Valdis.Kletnieks
  1 sibling, 0 replies; 81+ messages in thread
From: Valdis.Kletnieks @ 2003-09-24  5:18 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Alan Cox, David S. Miller, davidm, davidm, peter, bcrl, ak,
	iod00d, peterc, linux-ns83820, linux-ia64,
	Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 781 bytes --]

On Tue, 23 Sep 2003 15:58:29 PDT, "Luck, Tony" said:
> Alan Cox wrote:
> > On Maw, 2003-09-23 at 19:21, Luck, Tony wrote:
> > > a) the programmer is playing fast and loose with types and/or casts.
> > > b) the end-user is going to be disappointed with the performance.
> > 
> > c) the programmer is being clever and knows the unaligned access is
> > cheaper on average than the cost of making sure it cant happen
> 
> Which is great until the "cleverly written" program is fed a data set
> that pushes into the unaligned case far more frequently than the
> programmer anticipated.

Didn't we recently fix a DoS attack on the TCP stack that was basically this
sort of thing with hashes?  And crafting an attack against this would be a lot
simpler than the hash-function attack.....

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 20:00           ` David Mosberger
@ 2003-09-24  3:08             ` David S. Miller
  2003-09-25 16:11               ` Ivan Kokshaysky
  0 siblings, 1 reply; 81+ messages in thread
From: David S. Miller @ 2003-09-24  3:08 UTC (permalink / raw)
  To: davidm
  Cc: davidm, kevin.vanmaren, peter, bcrl, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 13:00:03 -0700
David Mosberger <davidm@napali.hpl.hp.com> wrote:

> But they _are_ useful.  Peter certainly noticed very quickly that
> there was a problem with the ns83820 driver.  Do you think it would
> have been better for the kernel to run silently at greatly degraded
> performance?  I think not.

It's not even decided that the copying version is faster.

I bet for bulk data transfers it isn't faster at all, because instead
of spinning in the unaligned trap handler your spinning swiping the
L2 cache clean with the copies.  I'll take some traps over twice the
data footprint any day, because I can optimize the traps but I can't
make the memory overhead go away.

Also, how much have you optimized your unaligned trap handler?  If it's
a thousand cycles now, do you think you could cut that overhead say
in half?  I bet you could, and then some.  I bet a clever handler could
be done on ia64 or sparc64 in a hundred cycles or so, maybe even faster.

Furthermore, like I said, if you want performance you aren't going
to stick a NS83820 in your box.  Even the author of the driver admits
this and it's not like this chip is the only ~$30USD gigabit chip in
town anymore :)

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 23:35             ` David S. Miller
  2003-09-24  0:13               ` Grant Grundler
@ 2003-09-24  1:58               ` William Lee Irwin III
  1 sibling, 0 replies; 81+ messages in thread
From: William Lee Irwin III @ 2003-09-24  1:58 UTC (permalink / raw)
  To: David S. Miller
  Cc: Grant Grundler, bcrl, tony.luck, davidm, davidm, peter, ak,
	peterc, linux-ns83820, linux-ia64, linux-kernel

On Tue, Sep 23, 2003 at 04:35:42PM -0700, David S. Miller wrote:
> That's a amusing coincidence since at least some people think ia64
> will end up the same way the i860 did :-)
> In the past I did always advocate things the way you are right now,
> but these days I think I've been wrong the whole time and Intel on x86
> is doing the right thing.
> They do everything in hardware and this makes the software so much
> simpler.  Sure, there's a lot of architectually inherited complexity
> in the x86 family, but their engineering priorities mean there is so
> much other stuff you simply never have to think about as a programmer.

Several of the x86 "hardware assists" need some rather hefty hacks
codewise to cope with their concomitant data structure proliferations
under industrial workloads, and generally have me begging for RISC's
system-level features instead (which, of course, require various
undoings of Linux' x86 crossdressings to exploit).

Given the reactions in prior threads, this message clearly needs to
wait a long while before it will ever be heard.


-- wli

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 21:16   ` David Mosberger
@ 2003-09-24  1:47     ` David S. Miller
  0 siblings, 0 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-24  1:47 UTC (permalink / raw)
  To: davidm
  Cc: davidm, alan, tony.luck, peter, bcrl, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 14:16:36 -0700
David Mosberger <davidm@napali.hpl.hp.com> wrote:

> Red Hat users won't be bothered by unaligned messages.

Let's hope they don't need to see anything useful in the dmesg output.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 23:37             ` David S. Miller
@ 2003-09-24  0:14               ` Matthew Wilcox
  2003-09-24  0:06                 ` David S. Miller
  0 siblings, 1 reply; 81+ messages in thread
From: Matthew Wilcox @ 2003-09-24  0:14 UTC (permalink / raw)
  To: David S. Miller
  Cc: Andrew Morton, willy, schwab, bcrl, tony.luck, davidm, davidm,
	peter, ak, iod00d, peterc, linux-ns83820, linux-ia64,
	linux-kernel

On Tue, Sep 23, 2003 at 04:37:44PM -0700, David S. Miller wrote:
> > >  Not only sparc would be effected by this.  Using {get,put}_unaligned()
> > >  all over the networking would incur a penalty for many platforms, not
> > >  just sparc.
> > 
> > Really?  I'd have thought that get/put_unaligned would be a simple
> > load/store for architectures which wish to implement it in that manner.
> 
> Only on systems that have the "load upper/lower-unaligned"
> instructions.  On others it's a memmove().

It is at the moment, but why should it be?  Why can't it be implemented
as load-and-trap if that's the fastest way to do it?

(I can see this descending into get_unaligned_likely() and
get_aligned_unlikely() which i'd rather avoid ...)

-- 
"It's not Hollywood.  War is real, war is primarily not about defeat or
victory, it is about death.  I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 23:35             ` David S. Miller
@ 2003-09-24  0:13               ` Grant Grundler
  2003-09-24  0:04                 ` David S. Miller
  2003-09-24  1:58               ` William Lee Irwin III
  1 sibling, 1 reply; 81+ messages in thread
From: Grant Grundler @ 2003-09-24  0:13 UTC (permalink / raw)
  To: David S. Miller
  Cc: bcrl, tony.luck, davidm, davidm, peter, ak, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, Sep 23, 2003 at 04:35:42PM -0700, David S. Miller wrote:
> On Tue, 23 Sep 2003 15:35:40 -0700
> Grant Grundler <iod00d@hp.com> wrote:
> 
> > And someone at Intel obviously agrees the newer architectures
> > should support misaligned access in SW since ever RISC chip
> > they've built (starting with i860, ~1989) does it that way.
> 
> That's a amusing coincidence since at least some people think ia64
> will end up the same way the i860 did :-)

It might. I be happy to share what I know about i860/i960 over pizza.
I worked on ATT SVR4 port to i860 in 1989/90 and things were quite
different then...

> In the past I did always advocate things the way you are right now,
> but these days I think I've been wrong the whole time and Intel on x86
> is doing the right thing.

I'm not so interested in "right" or "wrong". I'd just like to see other
arches besides x86 work well (ia64, parisc in particular) and that includes
how linux deals with unaligned accesses. If x86 is the gold standard
for "the right way", we'd be using bounce buffers for DMA to highmem
(well, PAE support would get added somehow) and 64-bit kernels wouldn't
have happened...but linux so far seems to accomodate other arches *when
reasonable*.  I'll follow Andrew Morton's comments...

> They do everything in hardware and this makes the software so much
> simpler.  Sure, there's a lot of architectually inherited complexity
> in the x86 family, but their engineering priorities mean there is so
> much other stuff you simply never have to think about as a programmer.

Yes, true.  But I think we are digressing from the original thread here...
(/me works his way around another really big rat hole :^)

thanks,
grant

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-24  0:14               ` Matthew Wilcox
@ 2003-09-24  0:06                 ` David S. Miller
  0 siblings, 0 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-24  0:06 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: akpm, willy, schwab, bcrl, tony.luck, davidm, davidm, peter, ak,
	iod00d, peterc, linux-ns83820, linux-ia64, linux-kernel

On Wed, 24 Sep 2003 01:14:56 +0100
Matthew Wilcox <willy@debian.org> wrote:

> (I can see this descending into get_unaligned_likely() and
> get_aligned_unlikely() which i'd rather avoid ...)

get_unaligned_unlikely() is exactly what just loading the 32-bit value
as if it were aligned is, ie. what we're doing right now.

People who want top performance should not put a networking card into
their machine that can only DMA packets to 32-byte or whatever
boundaries.  That's exactly the limitation NS83820 has and therefore
why it should be avoided like the plague by performance conscious
folks.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-24  0:13               ` Grant Grundler
@ 2003-09-24  0:04                 ` David S. Miller
  0 siblings, 0 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-24  0:04 UTC (permalink / raw)
  To: Grant Grundler
  Cc: bcrl, tony.luck, davidm, davidm, peter, ak, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 17:13:33 -0700
Grant Grundler <iod00d@hp.com> wrote:

> It might. I be happy to share what I know about i860/i960 over pizza.
> I worked on ATT SVR4 port to i860 in 1989/90 and things were quite
> different then...

Q: How does an OS context switch work on the i860?
A: The user sneezes and the kernel cleans up after it.

I've hung out with i860 hackers already in my past :)

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 23:15           ` Andrew Morton
@ 2003-09-23 23:37             ` David S. Miller
  2003-09-24  0:14               ` Matthew Wilcox
  0 siblings, 1 reply; 81+ messages in thread
From: David S. Miller @ 2003-09-23 23:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: willy, schwab, bcrl, tony.luck, davidm, davidm, peter, ak,
	iod00d, peterc, linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 16:15:29 -0700
Andrew Morton <akpm@osdl.org> wrote:

> "David S. Miller" <davem@redhat.com> wrote:
> >
> > > That's not true; they could be avoided by using get_unaligned() and
> >  > put_unaligned().  You just don't want to because they'd make sparc suck.
> > 
> >  Not only sparc would be effected by this.  Using {get,put}_unaligned()
> >  all over the networking would incur a penalty for many platforms, not
> >  just sparc.
> 
> Really?  I'd have thought that get/put_unaligned would be a simple
> load/store for architectures which wish to implement it in that manner.

Only on systems that have the "load upper/lower-unaligned"
instructions.  On others it's a memmove().

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 22:35           ` Grant Grundler
@ 2003-09-23 23:35             ` David S. Miller
  2003-09-24  0:13               ` Grant Grundler
  2003-09-24  1:58               ` William Lee Irwin III
  0 siblings, 2 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-23 23:35 UTC (permalink / raw)
  To: Grant Grundler
  Cc: bcrl, tony.luck, davidm, davidm, peter, ak, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 15:35:40 -0700
Grant Grundler <iod00d@hp.com> wrote:

> And someone at Intel obviously agrees the newer architectures
> should support misaligned access in SW since ever RISC chip
> they've built (starting with i860, ~1989) does it that way.

That's a amusing coincidence since at least some people think ia64
will end up the same way the i860 did :-)

In the past I did always advocate things the way you are right now,
but these days I think I've been wrong the whole time and Intel on x86
is doing the right thing.

They do everything in hardware and this makes the software so much
simpler.  Sure, there's a lot of architectually inherited complexity
in the x86 family, but their engineering priorities mean there is so
much other stuff you simply never have to think about as a programmer.


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 22:58 Luck, Tony
@ 2003-09-23 23:32 ` David S. Miller
  2003-09-24  5:18 ` Valdis.Kletnieks
  1 sibling, 0 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-23 23:32 UTC (permalink / raw)
  To: Luck, Tony
  Cc: alan, davidm, davidm, peter, bcrl, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 15:58:29 -0700
"Luck, Tony" <tony.luck@intel.com> wrote:

> Which is great until the "cleverly written" program is fed a data set
> that pushes into the unaligned case far more frequently than the
> programmer anticipated.

Which is why the people who work on the networking are well
aware of the issues and will make sure the common case never
triggers these unaligned accesses.

People writing protocol stacks _don't_ feed these data unaligned
cases out onto the wire, because like us they want the networking
to go fast.  Why in the world do you think they specify in the
very RFCs that define the protocols that one should use NOP options
in the TCP option area in order to align TCP timestamps on a 32-bit
boundry?

Do you think they say this so people can go ahead and use memmove()'s
and byte loads all over the place anyways?

No, rather, they specify things so that unless you do something
absolutely stupid all the shit is aligned properly.

It is absurdly stupid to do byte loads of TCP and IP header
bits just because one tenth of one hundredths of one percent
of systems have some configuration where word and half-word
loads of these things will be unaligned _AND_ be slow on that
cpu.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 19:22         ` David S. Miller
@ 2003-09-23 23:15           ` Andrew Morton
  2003-09-23 23:37             ` David S. Miller
  0 siblings, 1 reply; 81+ messages in thread
From: Andrew Morton @ 2003-09-23 23:15 UTC (permalink / raw)
  To: David S. Miller
  Cc: willy, schwab, bcrl, tony.luck, davidm, davidm, peter, ak,
	iod00d, peterc, linux-ns83820, linux-ia64, linux-kernel

"David S. Miller" <davem@redhat.com> wrote:
>
> > That's not true; they could be avoided by using get_unaligned() and
>  > put_unaligned().  You just don't want to because they'd make sparc suck.
> 
>  Not only sparc would be effected by this.  Using {get,put}_unaligned()
>  all over the networking would incur a penalty for many platforms, not
>  just sparc.

Really?  I'd have thought that get/put_unaligned would be a simple
load/store for architectures which wish to implement it in that manner.

Other architectures could take it as an optimisation hint, to avoid taking
a trap.  They'd probably still need to implement the fixup, but if a few of
these hints could reduce the trap frequency significantly then it may be
worth doing?

I guess it depends on how many of these hints would be needed at the source
level to avoid "most" of the traps.


^ permalink raw reply	[flat|nested] 81+ messages in thread

* RE: NS83820 2.6.0-test5 driver seems unstable on IA64
@ 2003-09-23 22:58 Luck, Tony
  2003-09-23 23:32 ` David S. Miller
  2003-09-24  5:18 ` Valdis.Kletnieks
  0 siblings, 2 replies; 81+ messages in thread
From: Luck, Tony @ 2003-09-23 22:58 UTC (permalink / raw)
  To: Alan Cox
  Cc: David S. Miller, davidm, davidm, peter, bcrl, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, Linux Kernel Mailing List

Alan Cox wrote:
> On Maw, 2003-09-23 at 19:21, Luck, Tony wrote:
> > a) the programmer is playing fast and loose with types and/or casts.
> > b) the end-user is going to be disappointed with the performance.
> 
> c) the programmer is being clever and knows the unaligned access is
> cheaper on average than the cost of making sure it cant happen

Which is great until the "cleverly written" program is fed a data set
that pushes into the unaligned case far more frequently than the
programmer anticipated.

> > Looking at a couple of ia64 build servers here I see zero unaligned
> > access messages in the logs.
> 
> Anyone who can deliver network traffic to your box can soon 
> fix that...

See answer above :-)

-Tony Luck

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 20:45         ` David S. Miller
@ 2003-09-23 22:35           ` Grant Grundler
  2003-09-23 23:35             ` David S. Miller
  0 siblings, 1 reply; 81+ messages in thread
From: Grant Grundler @ 2003-09-23 22:35 UTC (permalink / raw)
  To: David S. Miller
  Cc: bcrl, tony.luck, davidm, davidm, peter, ak, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, Sep 23, 2003 at 01:45:29PM -0700, David S. Miller wrote:
> Intel actually optimizes this on the P4, what is your
> response to that?  Is Intel wasting they time? :-)

nono...but Intel doesn't have a choice on x86.
They have to optimize for the binaries that are out there.
Compatibility is everything in that market space.

And someone at Intel obviously agrees the newer architectures
should support misaligned access in SW since ever RISC chip
they've built (starting with i860, ~1989) does it that way.

> It's needed on every access to every TCP and IP header portion
> for the case we're talking about in this thread, where the network
> device driver gives the networking a packet that ends up with
> unaligned IP and TCP headers.

Yeah, I don't use most LAN features (PPPoE, VLAN, Appletalk, etc).
I naively thought there must be a subset everyone uses...but defining
that subset sounds like a rat hole I shouldn't go near.

> I once considered adding some get_unaligned() uses to the TCP option
> parsing code, guess who rejected that patch?  It wasn't me, it was
> Linus himself and I came to learn that he's right on this one.

I'm not totally comfortable with that. The NICs I care about seem to
"bias the buffer address" to compensate for some "common case".
Seems like those cases would be cheaper (and more portable) to add
the get_unaligned() calls in the networking stack....I don't know
though really.

thanks,
grant

^ permalink raw reply	[flat|nested] 81+ messages in thread

* RE: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 21:00 ` Alan Cox
@ 2003-09-23 21:16   ` David Mosberger
  2003-09-24  1:47     ` David S. Miller
  0 siblings, 1 reply; 81+ messages in thread
From: David Mosberger @ 2003-09-23 21:16 UTC (permalink / raw)
  To: Alan Cox
  Cc: Luck, Tony, David S. Miller, davidm, davidm, peter, bcrl, ak,
	iod00d, peterc, linux-ns83820, linux-ia64,
	Linux Kernel Mailing List

>>>>> On Tue, 23 Sep 2003 22:00:34 +0100, Alan Cox <alan@lxorguk.ukuu.org.uk> said:

  >> Looking at a couple of ia64 build servers here I see zero unaligned
  >> access messages in the logs.

  Alan> Anyone who can deliver network traffic to your box can soon fix that...

Not if he's running Red Hat.  This is on a Red Hat 9 machine (x86,
just so you can't argue it's ia64-specific...):

 $ fgrep LOGLEVEL /etc/rc.sysinit
 /bin/dmesg -n $LOGLEVEL
 $ fgrep LOGLEVEL /etc/sysconfig/*
 /etc/sysconfig/init:LOGLEVEL=3

Red Hat users won't be bothered by unaligned messages.

	--david

^ permalink raw reply	[flat|nested] 81+ messages in thread

* RE: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 18:21 Luck, Tony
  2003-09-23 18:29 ` Benjamin LaHaise
@ 2003-09-23 21:00 ` Alan Cox
  2003-09-23 21:16   ` David Mosberger
  1 sibling, 1 reply; 81+ messages in thread
From: Alan Cox @ 2003-09-23 21:00 UTC (permalink / raw)
  To: Luck, Tony
  Cc: David S. Miller, davidm, davidm, peter, bcrl, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, Linux Kernel Mailing List

On Maw, 2003-09-23 at 19:21, Luck, Tony wrote:
> a) the programmer is playing fast and loose with types and/or casts.
> b) the end-user is going to be disappointed with the performance.

c) the programmer is being clever and knows the unaligned access is
cheaper on average than the cost of making sure it cant happen

> Looking at a couple of ia64 build servers here I see zero unaligned
> access messages in the logs.

Anyone who can deliver network traffic to your box can soon fix that...



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 20:38       ` Grant Grundler
@ 2003-09-23 20:45         ` David S. Miller
  2003-09-23 22:35           ` Grant Grundler
  0 siblings, 1 reply; 81+ messages in thread
From: David S. Miller @ 2003-09-23 20:45 UTC (permalink / raw)
  To: Grant Grundler
  Cc: bcrl, tony.luck, davidm, davidm, peter, ak, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 13:38:19 -0700
Grant Grundler <iod00d@hp.com> wrote:

> Given misaligned accesses are infrequent enough to affect
> performance, it makes sense to do this in SW because
> it reduces cost of the HW design/test/mfg cycles.

Intel actually optimizes this on the P4, what is your
response to that?  Is Intel wasting they time? :-)

> Ok. If the kernel networking stack used get_unaligned() in the one place
> Peter originally found, x86/sparc64?/et al wouldn't see a difference.
> It would avoid traps on ia64 and parisc.  Bad idea?
> Any other arches it might help/hurt on?

It's needed on every access to every TCP and IP header portion
for the case we're talking about in this thread, where the network
device driver gives the networking a packet that ends up with
unaligned IP and TCP headers.

I once considered adding some get_unaligned() uses to the TCP option
parsing code, guess who rejected that patch?  It wasn't me, it was
Linus himself and I came to learn that he's right on this one.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 18:51     ` David S. Miller
@ 2003-09-23 20:38       ` Grant Grundler
  2003-09-23 20:45         ` David S. Miller
  0 siblings, 1 reply; 81+ messages in thread
From: Grant Grundler @ 2003-09-23 20:38 UTC (permalink / raw)
  To: David S. Miller
  Cc: bcrl, tony.luck, davidm, davidm, peter, ak, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, Sep 23, 2003 at 11:51:22AM -0700, David S. Miller wrote:
> > Even x86 pays at least a one cycle penalty for every misaligned access.
> 
> Yes, one cycle, and it's completely lost in the noise when it happens.

Depends on the app - for the networking stack, I agree.

To revisit Ben's comment: if we know something is likely to be misaligned,
a RISC processor can efficiently load both parts and merge them (one cycle
penalty vs a regular aligned load). Given misaligned accesses are infrequent
enough to affect performance, it makes sense to do this in SW because
it reduces cost of the HW design/test/mfg cycles.

...
> It is an unavoidable axoim in the kernel networking.  Unaligned accesses
> will happen, and they aren't a bug and therefore not worthy of mention
> in the kernel logs any more than "page was freed" :-)

Ok. If the kernel networking stack used get_unaligned() in the one place
Peter originally found, x86/sparc64?/et al wouldn't see a difference.
It would avoid traps on ia64 and parisc.  Bad idea?
Any other arches it might help/hurt on?

grant

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 19:10         ` David S. Miller
@ 2003-09-23 20:00           ` David Mosberger
  2003-09-24  3:08             ` David S. Miller
  0 siblings, 1 reply; 81+ messages in thread
From: David Mosberger @ 2003-09-23 20:00 UTC (permalink / raw)
  To: David S. Miller
  Cc: davidm, davidm, kevin.vanmaren, peter, bcrl, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

>>>>> On Tue, 23 Sep 2003 12:10:44 -0700, "David S. Miller" <davem@redhat.com> said:

  >> Look, this may be difficult for you to understand, but different
  >> people find different policies useful.

  David> You, sure.  But you are not the only user of the ia64 port just
  David> as I am not the only user of the sparc64 port and if sparc64 generated
  David> diagnostic messages not useful to people other than me I'd have to
  David> quiet them by default.

But they _are_ useful.  Peter certainly noticed very quickly that
there was a problem with the ns83820 driver.  Do you think it would
have been better for the kernel to run silently at greatly degraded
performance?  I think not.

The optimistic assumption that the network stack is making about
headers being aligned works as long as unaligned headers are
relatively rare.  If unaligned headers are common, we'll have to do
something about that for ia64 (and Alpha, SPARC, PA-RISC, MIPS, etc.),
because performance will otherwise suck.

BTW: Peter, I'm not quite sure I understand why your machine got into
real problems because of the printks.  The messages are supposed to be
rated limited to at most 5 per 5 second interval.  Each message is
about 80 bytes in size, so we're talking about 80 bytes/s.  Do you
know what really went wrong here?  Did the rate limiting not work as
expected or is there something strange about your setup?

	--david

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 18:52     ` David S. Miller
  2003-09-23 19:09       ` Andreas Schwab
@ 2003-09-23 19:28       ` Matthew Wilcox
  2003-09-23 19:22         ` David S. Miller
  1 sibling, 1 reply; 81+ messages in thread
From: Matthew Wilcox @ 2003-09-23 19:28 UTC (permalink / raw)
  To: David S. Miller
  Cc: Andreas Schwab, bcrl, tony.luck, davidm, davidm, peter, ak,
	iod00d, peterc, linux-ns83820, linux-ia64, linux-kernel

On Tue, Sep 23, 2003 at 11:52:00AM -0700, David S. Miller wrote:
> On Tue, 23 Sep 2003 20:54:50 +0200
> Andreas Schwab <schwab@suse.de> wrote:
> 
> > Unaligned access are a BUG.
> 
> Wrong, they've been allowed in the kernel networking from day
> one and there is nothing that can be done to avoid the cases
> for which they occur.

That's not true; they could be avoided by using get_unaligned() and
put_unaligned().  You just don't want to because they'd make sparc suck.

-- 
"It's not Hollywood.  War is real, war is primarily not about defeat or
victory, it is about death.  I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 19:11     ` Benjamin LaHaise
@ 2003-09-23 19:22       ` Jeff Garzik
  0 siblings, 0 replies; 81+ messages in thread
From: Jeff Garzik @ 2003-09-23 19:22 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Luck, Tony, David S. Miller, davidm, davidm, peter, ak, iod00d,
	peterc, linux-ns83820, linux-ia64, linux-kernel

On Tue, Sep 23, 2003 at 03:11:28PM -0400, Benjamin LaHaise wrote:
> On Tue, Sep 23, 2003 at 03:10:11PM -0400, Jeff Garzik wrote:
> > Traditionally, if the NIC driver is accessing unaligned data, it should
> > be using get/put_unaligned().
> 
> The driver isn't, the networking stack is (it has to).  For more examples 
> where this behaviour is required: PPPoE, VLAN...

Oh, no argument....  agreed 100%

	Jeff




^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 19:28       ` Matthew Wilcox
@ 2003-09-23 19:22         ` David S. Miller
  2003-09-23 23:15           ` Andrew Morton
  0 siblings, 1 reply; 81+ messages in thread
From: David S. Miller @ 2003-09-23 19:22 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: schwab, bcrl, tony.luck, davidm, davidm, peter, ak, iod00d,
	peterc, linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 20:28:04 +0100
Matthew Wilcox <willy@debian.org> wrote:

> That's not true; they could be avoided by using get_unaligned() and
> put_unaligned().  You just don't want to because they'd make sparc suck.

Not only sparc would be effected by this.  Using {get,put}_unaligned()
all over the networking would incur a penalty for many platforms, not
just sparc.


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 18:47     ` David S. Miller
@ 2003-09-23 19:17       ` David Mosberger
  2003-09-23 19:10         ` David S. Miller
  0 siblings, 1 reply; 81+ messages in thread
From: David Mosberger @ 2003-09-23 19:17 UTC (permalink / raw)
  To: David S. Miller
  Cc: davidm, davidm, kevin.vanmaren, peter, bcrl, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

>>>>> On Tue, 23 Sep 2003 11:47:44 -0700, "David S. Miller" <davem@redhat.com> said:

  DaveM> No other port is so obstinate about printing out unaligned
  DaveM> kernel access messages, why does ia64 have to be so
  DaveM> different?

Look, this may be difficult for you to understand, but different
people find different policies useful.  I absolutely want to see when
unaligned accesses happen, because it's almost always a performance
issue, if not an outright bug.  If you don't like the current
mechanisms to control the policy for handling unaligned accesses, make
them better.  Don't try to tell me that the policy I want is
"wrong"---that will get you nowhere.

	--david

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 19:01         ` David S. Miller
@ 2003-09-23 19:16           ` Andreas Schwab
  2003-09-23 19:08             ` David S. Miller
  0 siblings, 1 reply; 81+ messages in thread
From: Andreas Schwab @ 2003-09-23 19:16 UTC (permalink / raw)
  To: David S. Miller
  Cc: bcrl, tony.luck, davidm, davidm, peter, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

"David S. Miller" <davem@redhat.com> writes:

> On Tue, 23 Sep 2003 21:09:47 +0200
> Andreas Schwab <schwab@suse.de> wrote:
>
>> The compiler is allowed to take advantage that there are no unaligned
>> accesses.  You need to use compiler extensions (like attribute packed) to
>> stop it from doing this.
>
> That's correct, and if the address is misaligned the cpu "traps"
> and the kernel fixes up the load/store access to fix it up.

Or the compiler generates code to take advantage of the fact that the
lower address bits are zero.

> That's what we're talking about here.

Of course, the kernel language is not ISO C, and never will be.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 19:10   ` Jeff Garzik
@ 2003-09-23 19:11     ` Benjamin LaHaise
  2003-09-23 19:22       ` Jeff Garzik
  0 siblings, 1 reply; 81+ messages in thread
From: Benjamin LaHaise @ 2003-09-23 19:11 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Luck, Tony, David S. Miller, davidm, davidm, peter, ak, iod00d,
	peterc, linux-ns83820, linux-ia64, linux-kernel

On Tue, Sep 23, 2003 at 03:10:11PM -0400, Jeff Garzik wrote:
> Traditionally, if the NIC driver is accessing unaligned data, it should
> be using get/put_unaligned().

The driver isn't, the networking stack is (it has to).  For more examples 
where this behaviour is required: PPPoE, VLAN...

		-ben
-- 
"The software industry today survives only through an unstated agreement 
not to stir things up too much.  We must hope this lawsuit [by SCO] isn't 
the big stirring spoon." -- Ian Lance Taylor, Linux Journal, June 2003

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 18:06 ` David Mosberger
@ 2003-09-23 19:11   ` David S. Miller
  0 siblings, 0 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-23 19:11 UTC (permalink / raw)
  To: davidm
  Cc: davidm, kevin.vanmaren, peter, bcrl, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 11:06:04 -0700
David Mosberger <davidm@napali.hpl.hp.com> wrote:

> So what's wrong with doign prctl --fpemul=silent in the init process?
> The flags are inherited across fork().

Great, can we get such a setting for the kernel bits too?

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 19:17       ` David Mosberger
@ 2003-09-23 19:10         ` David S. Miller
  2003-09-23 20:00           ` David Mosberger
  0 siblings, 1 reply; 81+ messages in thread
From: David S. Miller @ 2003-09-23 19:10 UTC (permalink / raw)
  To: davidm
  Cc: davidm, kevin.vanmaren, peter, bcrl, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 12:17:21 -0700
David Mosberger <davidm@napali.hpl.hp.com> wrote:

> Look, this may be difficult for you to understand, but different
> people find different policies useful.

You, sure.  But you are not the only user of the ia64 port just
as I am not the only user of the sparc64 port and if sparc64 generated
diagnostic messages not useful to people other than me I'd have to
quiet them by default.

All I'm suggesting is provide a way to control the kernel unaligned
accesses message and default it to off.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 18:29 ` Benjamin LaHaise
                     ` (2 preceding siblings ...)
  2003-09-23 18:54   ` Andreas Schwab
@ 2003-09-23 19:10   ` Jeff Garzik
  2003-09-23 19:11     ` Benjamin LaHaise
  3 siblings, 1 reply; 81+ messages in thread
From: Jeff Garzik @ 2003-09-23 19:10 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Luck, Tony, David S. Miller, davidm, davidm, peter, ak, iod00d,
	peterc, linux-ns83820, linux-ia64, linux-kernel

Traditionally, if the NIC driver is accessing unaligned data, it should
be using get/put_unaligned().

	Jeff




^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 18:52     ` David S. Miller
@ 2003-09-23 19:09       ` Andreas Schwab
  2003-09-23 19:01         ` David S. Miller
  2003-09-23 19:28       ` Matthew Wilcox
  1 sibling, 1 reply; 81+ messages in thread
From: Andreas Schwab @ 2003-09-23 19:09 UTC (permalink / raw)
  To: David S. Miller
  Cc: bcrl, tony.luck, davidm, davidm, peter, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

"David S. Miller" <davem@redhat.com> writes:

> On Tue, 23 Sep 2003 20:54:50 +0200
> Andreas Schwab <schwab@suse.de> wrote:
>
>> Unaligned access are a BUG.
>
> Wrong, they've been allowed in the kernel networking from day
> one and there is nothing that can be done to avoid the cases
> for which they occur.

The compiler is allowed to take advantage that there are no unaligned
accesses.  You need to use compiler extensions (like attribute packed) to
stop it from doing this.

> Stop spreading fud.

Nothing like this.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 19:16           ` Andreas Schwab
@ 2003-09-23 19:08             ` David S. Miller
  0 siblings, 0 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-23 19:08 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: bcrl, tony.luck, davidm, davidm, peter, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 21:16:33 +0200
Andreas Schwab <schwab@suse.de> wrote:

> Or the compiler generates code to take advantage of the fact that the
> lower address bits are zero.

The only place where I can se it doing this legally is for structure
offsets.  For example where a "load 4 byte word" instruction takes an
offsetable address composed of a reg and an integer offset where the
integer offset must be a multiple of 4.

This rule we do abide by in the kernel, because PARISC requires this.

Anything more is asking for trouble, I wouldn't want to use such a compiler
in the real world :)

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 18:51   ` Grant Grundler
  2003-09-23 18:51     ` David S. Miller
@ 2003-09-23 19:04     ` Benjamin LaHaise
  1 sibling, 0 replies; 81+ messages in thread
From: Benjamin LaHaise @ 2003-09-23 19:04 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Luck, Tony, David S. Miller, davidm, davidm, peter, ak, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, Sep 23, 2003 at 11:51:04AM -0700, Grant Grundler wrote:
> On Tue, Sep 23, 2003 at 02:29:25PM -0400, Benjamin LaHaise wrote:
> ...
> > (yet another hardware issue pushed off on software for no significant gain).
> 
> Even x86 pays at least a one cycle penalty for every misaligned access.
> In general, open source code has no excuse for using misaligned fields.
> It's (mostly) avoidable.  TCP/IP headers are the historical exception.

The other point to keep in mind is that apparently the .09u rev of the P4 
includes additional hardware for handling unaligned accesses because they 
are so common (gzip is a good example of an app where it is faster to do 
an unaligned access for the benefit of fetching the data in one instruction 
instead of 3+, and there is no way to improve on it).

> One could make the same arguement that a modern NIC should not require
> 16 byte alignment for DMA. It's a tradeoff one way or the other.
> Just a matter of perspective.

I consider the 83820 buggy in this regard, too.  That said, the fix does 
not belong in the driver layer, as having it duplicated in a dozen drivers 
is more stupid than fixing up the arch code which is required anyways.

		-ben
-- 
"The software industry today survives only through an unstated agreement 
not to stir things up too much.  We must hope this lawsuit [by SCO] isn't 
the big stirring spoon." -- Ian Lance Taylor, Linux Journal, June 2003

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 19:09       ` Andreas Schwab
@ 2003-09-23 19:01         ` David S. Miller
  2003-09-23 19:16           ` Andreas Schwab
  0 siblings, 1 reply; 81+ messages in thread
From: David S. Miller @ 2003-09-23 19:01 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: bcrl, tony.luck, davidm, davidm, peter, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 21:09:47 +0200
Andreas Schwab <schwab@suse.de> wrote:

> The compiler is allowed to take advantage that there are no unaligned
> accesses.  You need to use compiler extensions (like attribute packed) to
> stop it from doing this.

That's correct, and if the address is misaligned the cpu "traps"
and the kernel fixes up the load/store access to fix it up.

This unaligned trap handling is required for a port of Linux to
a given cpu architecture.

That's what we're talking about here.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 18:29 ` Benjamin LaHaise
  2003-09-23 18:45   ` David S. Miller
  2003-09-23 18:51   ` Grant Grundler
@ 2003-09-23 18:54   ` Andreas Schwab
  2003-09-23 18:52     ` David S. Miller
  2003-09-23 19:10   ` Jeff Garzik
  3 siblings, 1 reply; 81+ messages in thread
From: Andreas Schwab @ 2003-09-23 18:54 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Luck, Tony, David S. Miller, davidm, davidm, peter, ak, iod00d,
	peterc, linux-ns83820, linux-ia64, linux-kernel

Benjamin LaHaise <bcrl@kvack.org> writes:

> On Tue, Sep 23, 2003 at 11:21:35AM -0700, Luck, Tony wrote:
>> Looking at a couple of ia64 build servers here I see zero unaligned
>> access messages in the logs.
>
> ip options can still be an odd number of bytes.  Having itanic spew bogus 
> log messages is just plain wrong (yet another hardware issue pushed off on 
> software for no significant gain).

Unaligned access are a BUG.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 18:54   ` Andreas Schwab
@ 2003-09-23 18:52     ` David S. Miller
  2003-09-23 19:09       ` Andreas Schwab
  2003-09-23 19:28       ` Matthew Wilcox
  0 siblings, 2 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-23 18:52 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: bcrl, tony.luck, davidm, davidm, peter, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 20:54:50 +0200
Andreas Schwab <schwab@suse.de> wrote:

> Unaligned access are a BUG.

Wrong, they've been allowed in the kernel networking from day
one and there is nothing that can be done to avoid the cases
for which they occur.

Stop spreading fud.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 18:51   ` Grant Grundler
@ 2003-09-23 18:51     ` David S. Miller
  2003-09-23 20:38       ` Grant Grundler
  2003-09-23 19:04     ` Benjamin LaHaise
  1 sibling, 1 reply; 81+ messages in thread
From: David S. Miller @ 2003-09-23 18:51 UTC (permalink / raw)
  To: Grant Grundler
  Cc: bcrl, tony.luck, davidm, davidm, peter, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 11:51:04 -0700
Grant Grundler <iod00d@hp.com> wrote:

> Even x86 pays at least a one cycle penalty for every misaligned access.

Yes, one cycle, and it's completely lost in the noise when it
happens.

> In general, open source code has no excuse for using misaligned fields.
> It's (mostly) avoidable.  TCP/IP headers are the historical exception.

It's not the TCP/IP headers intrinsically, it's what they are embedded
inside of.

For example, if the ethernet driver (as nearly all of our's do which
can) optimized for an ethernet header followed by an IP header, guess
what?  That causes ethernet header followed by appletalk followed
by IP to do unaligned accesses.

It is an unavoidable axoim in the kernel networking.  Unaligned accesses
will happen, and they aren't a bug and therefore not worthy of mention
in the kernel logs any more than "page was freed" :-)

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 18:29 ` Benjamin LaHaise
  2003-09-23 18:45   ` David S. Miller
@ 2003-09-23 18:51   ` Grant Grundler
  2003-09-23 18:51     ` David S. Miller
  2003-09-23 19:04     ` Benjamin LaHaise
  2003-09-23 18:54   ` Andreas Schwab
  2003-09-23 19:10   ` Jeff Garzik
  3 siblings, 2 replies; 81+ messages in thread
From: Grant Grundler @ 2003-09-23 18:51 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Luck, Tony, David S. Miller, davidm, davidm, peter, ak, iod00d,
	peterc, linux-ns83820, linux-ia64, linux-kernel

On Tue, Sep 23, 2003 at 02:29:25PM -0400, Benjamin LaHaise wrote:
...
> (yet another hardware issue pushed off on software for no significant gain).

Even x86 pays at least a one cycle penalty for every misaligned access.
In general, open source code has no excuse for using misaligned fields.
It's (mostly) avoidable.  TCP/IP headers are the historical exception.

One could make the same arguement that a modern NIC should not require
16 byte alignment for DMA. It's a tradeoff one way or the other.
Just a matter of perspective.

grant

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 18:27   ` David Mosberger
@ 2003-09-23 18:47     ` David S. Miller
  2003-09-23 19:17       ` David Mosberger
  0 siblings, 1 reply; 81+ messages in thread
From: David S. Miller @ 2003-09-23 18:47 UTC (permalink / raw)
  To: davidm
  Cc: davidm, kevin.vanmaren, peter, bcrl, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 11:27:13 -0700
David Mosberger <davidm@napali.hpl.hp.com> wrote:

>  - use the dmesg command to lower the printing threshold below KERN_WARNING
>    ("dmesg -n4", IIRC)

Not terribly useful if all the unaligned messages made the one I'm
interested in get kicked out of the logs.

>  - use prctl --unalign=silent to turn off unaligned printing for a
>    particular task and its children

What about for the kernel?

No other port is so obstinate about printing out unaligned kernel
access messages, why does ia64 have to be so different?

Unaligned accesses are not only perfectly fine in the kernel, they
are in fact expected in certain circumstances.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 18:29 ` Benjamin LaHaise
@ 2003-09-23 18:45   ` David S. Miller
  2003-09-23 18:51   ` Grant Grundler
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 81+ messages in thread
From: David S. Miller @ 2003-09-23 18:45 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: tony.luck, davidm, davidm, peter, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, 23 Sep 2003 14:29:25 -0400
Benjamin LaHaise <bcrl@kvack.org> wrote:

> On Tue, Sep 23, 2003 at 11:21:35AM -0700, Luck, Tony wrote:
> > Looking at a couple of ia64 build servers here I see zero unaligned
> > access messages in the logs.
> 
> ip options can still be an odd number of bytes.  Having itanic spew bogus 
> log messages is just plain wrong (yet another hardware issue pushed off on 
> software for no significant gain).

Also, some TCP implementations sometimes don't align the timestamp
options in the headers as well.

Also, as I mentioned, try IP over appletalk.

Unaligned accesses are perfectly normal, _especially_ in the kernel.
This is an axiom of the kernel networking decided a long time ago, and
printing out a silly message when it happens doesn't make that any
less true.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 18:21 Luck, Tony
@ 2003-09-23 18:29 ` Benjamin LaHaise
  2003-09-23 18:45   ` David S. Miller
                     ` (3 more replies)
  2003-09-23 21:00 ` Alan Cox
  1 sibling, 4 replies; 81+ messages in thread
From: Benjamin LaHaise @ 2003-09-23 18:29 UTC (permalink / raw)
  To: Luck, Tony
  Cc: David S. Miller, davidm, davidm, peter, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

On Tue, Sep 23, 2003 at 11:21:35AM -0700, Luck, Tony wrote:
> Looking at a couple of ia64 build servers here I see zero unaligned
> access messages in the logs.

ip options can still be an odd number of bytes.  Having itanic spew bogus 
log messages is just plain wrong (yet another hardware issue pushed off on 
software for no significant gain).

		-ben
-- 
"The software industry today survives only through an unstated agreement 
not to stir things up too much.  We must hope this lawsuit [by SCO] isn't 
the big stirring spoon." -- Ian Lance Taylor, Linux Journal, June 2003

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 17:57 ` David S. Miller
@ 2003-09-23 18:27   ` David Mosberger
  2003-09-23 18:47     ` David S. Miller
  0 siblings, 1 reply; 81+ messages in thread
From: David Mosberger @ 2003-09-23 18:27 UTC (permalink / raw)
  To: David S. Miller
  Cc: Van Maren, Kevin, davidm, davidm, peter, bcrl, ak, iod00d,
	peterc, linux-ns83820, linux-ia64, linux-kernel

>>>>> On Tue, 23 Sep 2003 10:57:12 -0700, "David S. Miller" <davem@redhat.com> said:

  DaveM> On Tue, 23 Sep 2003 12:58:59 -0500
  DaveM> "Van Maren, Kevin" <kevin.vanmaren@unisys.com> wrote:

  >> Rate-limited unaligned loads in user space make a lot more sense, since
  >> they _may_ point out issues in the code.

  DaveM> That's what generates the bulk of the noise in people's ia64
  DaveM> dmesg output.

At the moment, there are two ways to control the unaligned message printing:

 - use the dmesg command to lower the printing threshold below KERN_WARNING
   ("dmesg -n4", IIRC)

 - use prctl --unalign=silent to turn off unaligned printing for a
   particular task and its children

If these mechanisms are not sufficient, there is nothing to stop
anyone from cooking up a patch.  For my puposes, what's there works
nicely so I myself have no plans to make changes in this area.

	--david

^ permalink raw reply	[flat|nested] 81+ messages in thread

* RE: NS83820 2.6.0-test5 driver seems unstable on IA64
@ 2003-09-23 18:21 Luck, Tony
  2003-09-23 18:29 ` Benjamin LaHaise
  2003-09-23 21:00 ` Alan Cox
  0 siblings, 2 replies; 81+ messages in thread
From: Luck, Tony @ 2003-09-23 18:21 UTC (permalink / raw)
  To: David S. Miller, davidm
  Cc: davidm, peter, bcrl, ak, iod00d, peterc, linux-ns83820,
	linux-ia64, linux-kernel

> As I understand it, you even do this stupid printk for user apps
> as well, that makes it more than rediculious.  I'd be surprised
> if anyone can find any useful kernel messages on an ia64 system
> in the dmesg output with all the unaligned access crap there.

I don't think that it is "normal" for applications to do unaligned
memory access.  It means that:

a) the programmer is playing fast and loose with types and/or casts.
b) the end-user is going to be disappointed with the performance.

Looking at a couple of ia64 build servers here I see zero unaligned
access messages in the logs.

-Tony

^ permalink raw reply	[flat|nested] 81+ messages in thread

* RE: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 17:58 Van Maren, Kevin
  2003-09-23 17:57 ` David S. Miller
@ 2003-09-23 18:06 ` David Mosberger
  2003-09-23 19:11   ` David S. Miller
  1 sibling, 1 reply; 81+ messages in thread
From: David Mosberger @ 2003-09-23 18:06 UTC (permalink / raw)
  To: Van Maren, Kevin
  Cc: David S. Miller, davidm, davidm, peter, bcrl, ak, iod00d, peterc,
	linux-ns83820, linux-ia64, linux-kernel

>>>>> On Tue, 23 Sep 2003 12:58:59 -0500, "Van Maren, Kevin" <kevin.vanmaren@unisys.com> said:

  Van> That's my view on the fpswa printk's (handle_fpu_swa): they are
  Van> normal, expected, and there is absolutely nothing that can be
  Van> done about them -- so why print a "warning" about them (even if
  Van> it is only 5 per second)?  If nothing else, toggle the meaning
  Van> for IA64_THREAD_FPEMU_NOPRINT: turn it ON for special apps.

So what's wrong with doign prctl --fpemul=silent in the init process?
The flags are inherited across fork().

	--david

^ permalink raw reply	[flat|nested] 81+ messages in thread

* RE: NS83820 2.6.0-test5 driver seems unstable on IA64
@ 2003-09-23 17:58 Van Maren, Kevin
  2003-09-23 17:57 ` David S. Miller
  2003-09-23 18:06 ` David Mosberger
  0 siblings, 2 replies; 81+ messages in thread
From: Van Maren, Kevin @ 2003-09-23 17:58 UTC (permalink / raw)
  To: David S. Miller, davidm
  Cc: davidm, peter, bcrl, ak, iod00d, peterc, linux-ns83820,
	linux-ia64, linux-kernel

> > The printk() is rate-controlled and doesn't happen for every unaligned
> > access.  It's average cost can be made as low as we want to, by adjusting
> > the rate.
> 
> But if the event is normal, you shouldn't be logging it as if
> it weren't.

That's my view on the fpswa printk's (handle_fpu_swa): they are normal,
expected, and there is absolutely nothing that can be done about them --
so why print a "warning" about them (even if it is only 5 per second)?
If nothing else, toggle the meaning for IA64_THREAD_FPEMU_NOPRINT: turn it
ON for special apps.

Rate-limited unaligned loads in user space make a lot more sense, since
they _may_ point out issues in the code.

Kevin

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: NS83820 2.6.0-test5 driver seems unstable on IA64
  2003-09-23 17:58 Van Maren, Kevin
@ 2003-09-23 17:57 ` David S. Miller
  2003-09-23 18:27   ` David Mosberger
  2003-09-23 18:06 ` David Mosberger
  1 sibling, 1 reply; 81+ messages in thread
From: David S. Miller @ 2003-09-23 17:57 UTC (permalink / raw)
  To: Van Maren, Kevin
  Cc: davidm, davidm, peter, bcrl, ak, iod00d, peterc, linux-ns83820,
	linux-ia64, linux-kernel

On Tue, 23 Sep 2003 12:58:59 -0500
"Van Maren, Kevin" <kevin.vanmaren@unisys.com> wrote:

> Rate-limited unaligned loads in user space make a lot more sense, since
> they _may_ point out issues in the code.

That's what generates the bulk of the noise in people's ia64
dmesg output.

^ permalink raw reply	[flat|nested] 81+ messages in thread

end of thread, other threads:[~2003-09-27 23:50 UTC | newest]

Thread overview: 81+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-09-19  4:16 NS83820 2.6.0-test5 driver seems unstable on IA64 Peter Chubb
2003-09-19  4:38 ` Grant Grundler
2003-09-19  4:43   ` Andi Kleen
2003-09-19  5:01     ` Peter Chubb
2003-09-19  5:53       ` Andi Kleen
2003-09-19 10:49         ` Benjamin LaHaise
2003-09-23  0:34           ` Peter Chubb
2003-09-23  0:36             ` Benjamin LaHaise
2003-09-23  6:22               ` David S. Miller
2003-09-23 10:40                 ` Peter Chubb
2003-09-23 10:51                   ` David S. Miller
2003-09-23 14:59                     ` David Mosberger
2003-09-23 17:27                       ` David S. Miller
2003-09-23 17:56                         ` David Mosberger
2003-09-24  2:55                           ` David S. Miller
2003-09-23 23:04                         ` Ian Wienand
2003-09-23 23:35                           ` David S. Miller
2003-09-19  5:04     ` Grant Grundler
2003-09-19  5:11       ` Peter Chubb
2003-09-23 17:58 Van Maren, Kevin
2003-09-23 17:57 ` David S. Miller
2003-09-23 18:27   ` David Mosberger
2003-09-23 18:47     ` David S. Miller
2003-09-23 19:17       ` David Mosberger
2003-09-23 19:10         ` David S. Miller
2003-09-23 20:00           ` David Mosberger
2003-09-24  3:08             ` David S. Miller
2003-09-25 16:11               ` Ivan Kokshaysky
2003-09-26  1:12                 ` David S. Miller
2003-09-23 18:06 ` David Mosberger
2003-09-23 19:11   ` David S. Miller
2003-09-23 18:21 Luck, Tony
2003-09-23 18:29 ` Benjamin LaHaise
2003-09-23 18:45   ` David S. Miller
2003-09-23 18:51   ` Grant Grundler
2003-09-23 18:51     ` David S. Miller
2003-09-23 20:38       ` Grant Grundler
2003-09-23 20:45         ` David S. Miller
2003-09-23 22:35           ` Grant Grundler
2003-09-23 23:35             ` David S. Miller
2003-09-24  0:13               ` Grant Grundler
2003-09-24  0:04                 ` David S. Miller
2003-09-24  1:58               ` William Lee Irwin III
2003-09-23 19:04     ` Benjamin LaHaise
2003-09-23 18:54   ` Andreas Schwab
2003-09-23 18:52     ` David S. Miller
2003-09-23 19:09       ` Andreas Schwab
2003-09-23 19:01         ` David S. Miller
2003-09-23 19:16           ` Andreas Schwab
2003-09-23 19:08             ` David S. Miller
2003-09-23 19:28       ` Matthew Wilcox
2003-09-23 19:22         ` David S. Miller
2003-09-23 23:15           ` Andrew Morton
2003-09-23 23:37             ` David S. Miller
2003-09-24  0:14               ` Matthew Wilcox
2003-09-24  0:06                 ` David S. Miller
2003-09-23 19:10   ` Jeff Garzik
2003-09-23 19:11     ` Benjamin LaHaise
2003-09-23 19:22       ` Jeff Garzik
2003-09-23 21:00 ` Alan Cox
2003-09-23 21:16   ` David Mosberger
2003-09-24  1:47     ` David S. Miller
2003-09-23 22:58 Luck, Tony
2003-09-23 23:32 ` David S. Miller
2003-09-24  5:18 ` Valdis.Kletnieks
2003-09-26  6:16 Manfred Spraul
2003-09-26  6:07 ` David S. Miller
2003-09-26  6:41   ` David Mosberger
2003-09-26 14:38   ` Ivan Kokshaysky
2003-09-27  3:12     ` David S. Miller
     [not found] <zU7D.2Ji.27@gated-at.bofh.it>
2003-09-26 15:14 ` Andi Kleen
2003-09-26 15:41   ` David Mosberger
     [not found] <A2yd.64p.31@gated-at.bofh.it>
     [not found] ` <A2yd.64p.29@gated-at.bofh.it>
     [not found]   ` <A317.6GH.7@gated-at.bofh.it>
2003-09-26 17:04     ` Andi Kleen
2003-09-26 17:22       ` Chris Friesen
2003-09-26 17:47       ` David Mosberger
2003-09-27  3:57         ` David S. Miller
2003-09-27  4:36         ` Andi Kleen
2003-09-27  4:52           ` David Mosberger
     [not found] <A4gy.8wi.13@gated-at.bofh.it>
     [not found] ` <A4gy.8wi.15@gated-at.bofh.it>
     [not found]   ` <A4gy.8wi.17@gated-at.bofh.it>
     [not found]     ` <A4gy.8wi.11@gated-at.bofh.it>
     [not found]       ` <A4Tv.Ud.37@gated-at.bofh.it>
     [not found]         ` <AepM.5Zb.41@gated-at.bofh.it>
2003-09-27  4:26           ` Andi Kleen
2003-09-27  4:34             ` David Mosberger
2003-09-27 23:36             ` David S. Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).