From: Eric Nelson <eric@nelint.com>
To: Russell King - ARM Linux <linux@armlinux.org.uk>
Cc: Eric Dumazet <edumazet@google.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Fugang Duan <fugang.duan@nxp.com>,
	Troy Kisky <troy.kisky@boundarydevices.com>,
	Otavio Salvador <otavio@ossystems.com.br>,
	Simone <cjb.sw.nospam@gmail.com>
Subject: Re: Alignment issues with freescale FEC driver
Date: Fri, 23 Sep 2016 11:26:18 -0700
Message-ID: <d0d6f333-c6fc-6572-0633-d7c2c29b8b3f@nelint.com>
In-Reply-To: <20160923173751.GA1041@n2100.armlinux.org.uk>

Thanks Russell,

On 09/23/2016 10:37 AM, Russell King - ARM Linux wrote:
> On Fri, Sep 23, 2016 at 10:19:50AM -0700, Eric Nelson wrote:
>> Oddly, it does prevent the vast majority (90%+) of the alignment errors.
>>
>> I believe this is because the compiler is generating an ldm instruction
>> when the ntohl() call is used, but I'm stumped about why these aren't
>> generating faults:

After looking at it, I have to think that the code that reads iph->id
is just hit more frequently than the other code in this routine.

> 
> ldm generates alignment faults when the address is not aligned to a
> 32-bit boundary.  ldr on ARMv6+ does not.
> 
>> I don't think that's the case.
>>
>> # CONFIG_IPV6_GRE is not set
>>
>> Hmm... Instrumenting the kernel, it seems that iphdr **is** aligned on
>> a 4-byte boundary.
>>
>> Does the ldm instruction require 8-byte alignment?
>>
>> There's definitely a compiler-version dependency involved here,
>> since using gcc 4.9 also reduced the number of faults dramatically.
> 
> Well, I don't think it's that gcc related:
> 

I can only say that I noticed a dramatic drop in the number of faults, and
didn't see inet_gro_receive reported in /proc/cpu/alignment with gcc 4.9
when trying to identify the issue.

> User:           0
> System:         312855 (ip6_route_input+0x6c/0x1e0)
> Skipped:        0
> Half:           0
> Word:           0
> DWord:          2
> Multi:          312853
> 
> c06d8998 <ip6_route_input>:
> c06d89ac:       e1a04000        mov     r4, r0
> c06d89b0:       e1d489b4        ldrh    r8, [r4, #148]  ; 0x94
> c06d89b8:       e594a0a0        ldr     sl, [r4, #160]  ; 0xa0
> c06d89cc:       e08ac008        add     ip, sl, r8
> c06d89d4:       e28c3018        add     r3, ip, #24
> c06d89dc:       e28c7008        add     r7, ip, #8
> c06d89e4:       e893000f        ldm     r3, {r0, r1, r2, r3}
> c06d89ec:       e24be044        sub     lr, fp, #68     ; 0x44
> c06d89f4:       e24b5054        sub     r5, fp, #84     ; 0x54
> c06d89fc:       e885000f        stm     r5, {r0, r1, r2, r3}
> c06d8a04:       e897000f        ldm     r7, {r0, r1, r2, r3}
> c06d8a10:       e88e000f        stm     lr, {r0, r1, r2, r3}
> 
> This is from:
> 
>         struct flowi6 fl6 = {
>                 .flowi6_iif = l3mdev_fib_oif(skb->dev),
>                 .daddr = iph->daddr,
>                 .saddr = iph->saddr,
>                 .flowlabel = ip6_flowinfo(iph),
>                 .flowi6_mark = skb->mark,
>                 .flowi6_proto = iph->nexthdr,
>         };
> 
> specifically, I suspect, the saddr and daddr initialisations.
> 
> There's not much to get away from this - the FEC on iMX requires a
> 16-byte alignment for DMA addresses, which violates the network
> stack's requirement for the ethernet packet to be received with a
> two byte offset.  So the IP header (and IPv6 headers) will always
> be mis-aligned in memory, which leads to a huge number of alignment
> faults.
> 
> There's not much getting away from this - the problem is not in the
> networking stack, but the FEC hardware/network driver.  See:
> 
>         struct  fec_enet_private *fep = netdev_priv(ndev);
>         int off;
> 
>         off = ((unsigned long)skb->data) & fep->rx_align;
>         if (off)
>                 skb_reserve(skb, fep->rx_align + 1 - off);
> 
>         bdp->cbd_bufaddr = cpu_to_fec32(dma_map_single(&fep->pdev->dev, skb->data, FEC_ENET_RX_FRSIZE - fep->rx_align, DMA_FROM_DEVICE));
> 
> in fec_enet_new_rxbdp().
> 
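
To make the arithmetic above concrete, here is a small userspace sketch
(not driver code). align_rx_buffer() and ip_header_misalignment() are
hypothetical helper names; align_mask stands in for fep->rx_align, assumed
to be 0xf for the 16-byte DMA requirement described above.

```c
#include <assert.h>
#include <stdint.h>

#define ETH_HLEN 14u	/* Ethernet header length in bytes */

/*
 * Round addr up to the next multiple of (align_mask + 1), mirroring the
 * skb_reserve() adjustment in the quoted snippet. align_mask plays the
 * role of fep->rx_align (assumed 0xf here).
 */
static uintptr_t align_rx_buffer(uintptr_t addr, uintptr_t align_mask)
{
	uintptr_t off = addr & align_mask;

	/* The mask must be of the form 2^n - 1. */
	assert(((align_mask + 1) & align_mask) == 0);
	if (off)
		addr += align_mask + 1 - off;
	return addr;
}

/*
 * Distance of the IP header from the previous 4-byte boundary when the
 * Ethernet frame starts at frame_start. 0 means word aligned.
 */
static unsigned int ip_header_misalignment(uintptr_t frame_start)
{
	return (unsigned int)((frame_start + ETH_HLEN) & 3u);
}
```

With a frame starting on a 16-byte boundary, the IP header sits 14 bytes in,
so ip_header_misalignment() returns 2. Starting the frame 2 bytes past such
a boundary, which is the offset the stack expects, would return 0.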

So the question is: should we just live with this and accept the
performance penalty of the misalignment, or do something about it?

I'm not sure of the cost (or the details) of Eric's proposed fix of
allocating and copying the header to another skb.
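
I haven't seen the details of that proposal, so the following is only a
userspace sketch of the general idea of copying header bytes out of a
misaligned buffer into aligned storage before touching them as words. It is
not the proposed kernel change. struct addr128 and load_addr128() are
hypothetical names, with addr128 standing in for a 16-byte IPv6 address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte address type standing in for an IPv6 address. */
struct addr128 {
	uint32_t word[4];
};

/*
 * Copy a 16-byte address from a possibly misaligned location in a packet
 * buffer. Because src is a byte pointer, the compiler may not assume any
 * word alignment for it, unlike a direct struct assignment through a
 * pointer to a word-aligned type.
 */
static struct addr128 load_addr128(const unsigned char *src)
{
	struct addr128 out;

	memcpy(&out, src, sizeof(out));
	return out;
}
```

The returned struct lives in aligned storage, so later word accesses to it
are aligned regardless of where src pointed. The cost is the extra copy.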

The original report was of bad network performance, but I haven't
been able to see an impact doing some simple tests using wget
and SSH.
