netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Linux 5.12-rc7
       [not found] ` <20210412051445.GA47322@roeck-us.net>
@ 2021-04-12 16:28   ` Linus Torvalds
  2021-04-12 16:31     ` Michael S. Tsirkin
  2021-04-12 16:31     ` Eric Dumazet
  0 siblings, 2 replies; 18+ messages in thread
From: Linus Torvalds @ 2021-04-12 16:28 UTC (permalink / raw)
  To: Guenter Roeck, Xuan Zhuo, Eric Dumazet, Michael S. Tsirkin
  Cc: Linux Kernel Mailing List, Netdev

On Sun, Apr 11, 2021 at 10:14 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> Qemu test results:
>         total: 460 pass: 459 fail: 1
> Failed tests:
>         sh:rts7751r2dplus_defconfig:ata:net,virtio-net:rootfs
>
> The failure bisects to commit 0f6925b3e8da ("virtio_net: Do not pull payload in
> skb->head"). It is a spurious problem - the test passes roughly every other
> time. When the failure is seen, udhcpc fails to get an IP address and aborts
> with SIGTERM. So far I have only seen this with the "sh" architecture.

Hmm. Let's add in some more of the people involved in that commit, and
also netdev.

Nothing in there looks like it should have any interaction with
architecture, so that "it happens on sh" sounds odd, but maybe it's
some particular interaction with the qemu environment.

             Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-12 16:28   ` Linux 5.12-rc7 Linus Torvalds
@ 2021-04-12 16:31     ` Michael S. Tsirkin
  2021-04-12 16:31     ` Eric Dumazet
  1 sibling, 0 replies; 18+ messages in thread
From: Michael S. Tsirkin @ 2021-04-12 16:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, Xuan Zhuo, Eric Dumazet,
	Linux Kernel Mailing List, Netdev

On Mon, Apr 12, 2021 at 09:28:28AM -0700, Linus Torvalds wrote:
> On Sun, Apr 11, 2021 at 10:14 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > Qemu test results:
> >         total: 460 pass: 459 fail: 1
> > Failed tests:
> >         sh:rts7751r2dplus_defconfig:ata:net,virtio-net:rootfs
> >
> > The failure bisects to commit 0f6925b3e8da ("virtio_net: Do not pull payload in
> > skb->head"). It is a spurious problem - the test passes roughly every other
> > time. When the failure is seen, udhcpc fails to get an IP address and aborts
> > with SIGTERM. So far I have only seen this with the "sh" architecture.
> 
> Hmm. Let's add in some more of the people involved in that commit, and
> also netdev.
> 
> Nothing in there looks like it should have any interaction with
> architecture, so that "it happens on sh" sounds odd, but maybe it's
> some particular interaction with the qemu environment.
> 
>              Linus

Yea Eric's been trying to debug this already. Let's give him a bit more time...

-- 
MST


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-12 16:28   ` Linux 5.12-rc7 Linus Torvalds
  2021-04-12 16:31     ` Michael S. Tsirkin
@ 2021-04-12 16:31     ` Eric Dumazet
  2021-04-12 16:47       ` Eric Dumazet
  2021-04-12 17:31       ` Guenter Roeck
  1 sibling, 2 replies; 18+ messages in thread
From: Eric Dumazet @ 2021-04-12 16:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, Xuan Zhuo, Michael S. Tsirkin,
	Linux Kernel Mailing List, Netdev

On Mon, Apr 12, 2021 at 6:28 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sun, Apr 11, 2021 at 10:14 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > Qemu test results:
> >         total: 460 pass: 459 fail: 1
> > Failed tests:
> >         sh:rts7751r2dplus_defconfig:ata:net,virtio-net:rootfs
> >
> > The failure bisects to commit 0f6925b3e8da ("virtio_net: Do not pull payload in
> > skb->head"). It is a spurious problem - the test passes roughly every other
> > time. When the failure is seen, udhcpc fails to get an IP address and aborts
> > with SIGTERM. So far I have only seen this with the "sh" architecture.
>
> Hmm. Let's add in some more of the people involved in that commit, and
> also netdev.
>
> Nothing in there looks like it should have any interaction with
> architecture, so that "it happens on sh" sounds odd, but maybe it's
> some particular interaction with the qemu environment.

Yes, maybe.

I spent few hours on this, and suspect a buggy memcpy() implementation
on SH, but this was not conclusive.

By pulling one extra byte, the problem goes away.

Strange thing is that the udhcpc process does not go past sendto().

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-12 16:31     ` Eric Dumazet
@ 2021-04-12 16:47       ` Eric Dumazet
  2021-04-13 12:57         ` Michael S. Tsirkin
  2021-04-12 17:31       ` Guenter Roeck
  1 sibling, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2021-04-12 16:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, Xuan Zhuo, Michael S. Tsirkin,
	Linux Kernel Mailing List, Netdev

On Mon, Apr 12, 2021 at 6:31 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Apr 12, 2021 at 6:28 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > On Sun, Apr 11, 2021 at 10:14 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > >
> > > Qemu test results:
> > >         total: 460 pass: 459 fail: 1
> > > Failed tests:
> > >         sh:rts7751r2dplus_defconfig:ata:net,virtio-net:rootfs
> > >
> > > The failure bisects to commit 0f6925b3e8da ("virtio_net: Do not pull payload in
> > > skb->head"). It is a spurious problem - the test passes roughly every other
> > > time. When the failure is seen, udhcpc fails to get an IP address and aborts
> > > with SIGTERM. So far I have only seen this with the "sh" architecture.
> >
> > Hmm. Let's add in some more of the people involved in that commit, and
> > also netdev.
> >
> > Nothing in there looks like it should have any interaction with
> > architecture, so that "it happens on sh" sounds odd, but maybe it's
> > some particular interaction with the qemu environment.
>
> Yes, maybe.
>
> I spent few hours on this, and suspect a buggy memcpy() implementation
> on SH, but this was not conclusive.
>
> By pulling one extra byte, the problem goes away.
>
> Strange thing is that the udhcpc process does not go past sendto().

This is the patch working around the issue. Unfortunately I was not
able to root-cause it (I really suspect something on SH)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 0824e6999e49957f7aaf7c990f6259792d42f32b..fd890a951beea03bdf24406809042666eb972655
100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -408,11 +408,17 @@ static struct sk_buff *page_to_skb(struct
virtnet_info *vi,

        /* Copy all frame if it fits skb->head, otherwise
         * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+        *
+        * Apparently, pulling only the Ethernet Header triggers a bug
on qemu-system-sh4.
+        * Since GRO aggregation really cares of IPv4/IPv6, pull 20 bytes
+        * more to work around this bug : These 20 bytes can not belong
+        * to UDP/TCP payload.
+        * As a bonus, this makes GRO slightly faster for IPv4 (one less copy).
         */
        if (len <= skb_tailroom(skb))
                copy = len;
        else
-               copy = ETH_HLEN + metasize;
+               copy = ETH_HLEN + sizeof(struct iphdr) + metasize;
        skb_put_data(skb, p, copy);

        if (metasize) {

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-12 16:31     ` Eric Dumazet
  2021-04-12 16:47       ` Eric Dumazet
@ 2021-04-12 17:31       ` Guenter Roeck
  2021-04-12 17:38         ` Eric Dumazet
  1 sibling, 1 reply; 18+ messages in thread
From: Guenter Roeck @ 2021-04-12 17:31 UTC (permalink / raw)
  To: Eric Dumazet, Linus Torvalds
  Cc: Xuan Zhuo, Michael S. Tsirkin, Linux Kernel Mailing List, Netdev

On 4/12/21 9:31 AM, Eric Dumazet wrote:
> On Mon, Apr 12, 2021 at 6:28 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> On Sun, Apr 11, 2021 at 10:14 PM Guenter Roeck <linux@roeck-us.net> wrote:
>>>
>>> Qemu test results:
>>>         total: 460 pass: 459 fail: 1
>>> Failed tests:
>>>         sh:rts7751r2dplus_defconfig:ata:net,virtio-net:rootfs
>>>
>>> The failure bisects to commit 0f6925b3e8da ("virtio_net: Do not pull payload in
>>> skb->head"). It is a spurious problem - the test passes roughly every other
>>> time. When the failure is seen, udhcpc fails to get an IP address and aborts
>>> with SIGTERM. So far I have only seen this with the "sh" architecture.
>>
>> Hmm. Let's add in some more of the people involved in that commit, and
>> also netdev.
>>
>> Nothing in there looks like it should have any interaction with
>> architecture, so that "it happens on sh" sounds odd, but maybe it's
>> some particular interaction with the qemu environment.
> 
> Yes, maybe.
> 
> I spent few hours on this, and suspect a buggy memcpy() implementation
> on SH, but this was not conclusive.
> 

I replaced all memcpy() calls in skbuff.h with calls to

static inline void __my_memcpy(unsigned char *to, const unsigned char *from,
                               unsigned int len)
{
       while (len--)
               *to++ = *from++;
}

That made no difference, so unless you have some other memcpy() in mind that
seems to be unlikely.

> By pulling one extra byte, the problem goes away.
> 
> Strange thing is that the udhcpc process does not go past sendto().
> 

I have been trying to debug that one. Unfortunately gdb doesn't work with sh,
so I can't use it to debug the problem. I'll spend some more time on this today.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-12 17:31       ` Guenter Roeck
@ 2021-04-12 17:38         ` Eric Dumazet
  2021-04-12 20:05           ` Guenter Roeck
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2021-04-12 17:38 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Linus Torvalds, Xuan Zhuo, Michael S. Tsirkin,
	Linux Kernel Mailing List, Netdev

On Mon, Apr 12, 2021 at 7:31 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 4/12/21 9:31 AM, Eric Dumazet wrote:
> > On Mon, Apr 12, 2021 at 6:28 PM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> >>
> >> On Sun, Apr 11, 2021 at 10:14 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >>>
> >>> Qemu test results:
> >>>         total: 460 pass: 459 fail: 1
> >>> Failed tests:
> >>>         sh:rts7751r2dplus_defconfig:ata:net,virtio-net:rootfs
> >>>
> >>> The failure bisects to commit 0f6925b3e8da ("virtio_net: Do not pull payload in
> >>> skb->head"). It is a spurious problem - the test passes roughly every other
> >>> time. When the failure is seen, udhcpc fails to get an IP address and aborts
> >>> with SIGTERM. So far I have only seen this with the "sh" architecture.
> >>
> >> Hmm. Let's add in some more of the people involved in that commit, and
> >> also netdev.
> >>
> >> Nothing in there looks like it should have any interaction with
> >> architecture, so that "it happens on sh" sounds odd, but maybe it's
> >> some particular interaction with the qemu environment.
> >
> > Yes, maybe.
> >
> > I spent few hours on this, and suspect a buggy memcpy() implementation
> > on SH, but this was not conclusive.
> >
>
> I replaced all memcpy() calls in skbuff.h with calls to
>
> static inline void __my_memcpy(unsigned char *to, const unsigned char *from,
>                                unsigned int len)
> {
>        while (len--)
>                *to++ = *from++;
> }
>
> That made no difference, so unless you have some other memcpy() in mind that
> seems to be unlikely.


Sure, note I also had :

diff --git a/net/core/dev.c b/net/core/dev.c
index af8c1ea040b9364b076e2d72f04dc3de2d7e2f11..4e05a32542dd606aaaaee8038017fea949939c0e
100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5938,7 +5938,7 @@ static void gro_pull_from_frag0(struct sk_buff
*skb, int grow)

        BUG_ON(skb->end - skb->tail < grow);

-       memcpy(skb_tail_pointer(skb), NAPI_GRO_CB(skb)->frag0, grow);
+       memmove(skb_tail_pointer(skb), NAPI_GRO_CB(skb)->frag0, grow);

        skb->data_len -= grow;
        skb->tail += grow;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index c421c8f809256f7b13a8b5a1331108449353ee2d..41796dedfa9034f2333cf249a0d81b7250e14d1f
100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2278,7 +2278,7 @@ int skb_copy_bits(const struct sk_buff *skb, int
offset, void *to, int len)
                                              skb_frag_off(f) + offset - start,
                                              copy, p, p_off, p_len, copied) {
                                vaddr = kmap_atomic(p);
-                               memcpy(to + copied, vaddr + p_off, p_len);
+                               memmove(to + copied, vaddr + p_off, p_len);
                                kunmap_atomic(vaddr);
                        }


>
> > By pulling one extra byte, the problem goes away.
> >
> > Strange thing is that the udhcpc process does not go past sendto().
> >
>
> I have been trying to debug that one. Unfortunately gdb doesn't work with sh,
> so I can't use it to debug the problem. I'll spend some more time on this today.

Yes, I think this is the real issue here. This smells like some memory
corruption.

In my traces, packet is correctly received in AF_PACKET queue.

I have checked the skb is well formed.

But the user space seems to never call poll() and recvmsg() on this
af_packet socket.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-12 17:38         ` Eric Dumazet
@ 2021-04-12 20:05           ` Guenter Roeck
  2021-04-13  9:24             ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Guenter Roeck @ 2021-04-12 20:05 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Linus Torvalds, Xuan Zhuo, Michael S. Tsirkin,
	Linux Kernel Mailing List, Netdev

On 4/12/21 10:38 AM, Eric Dumazet wrote:
[ ... ]

> Yes, I think this is the real issue here. This smells like some memory
> corruption.
> 
> In my traces, packet is correctly received in AF_PACKET queue.
> 
> I have checked the skb is well formed.
> 
> But the user space seems to never call poll() and recvmsg() on this
> af_packet socket.
> 

After sprinkling the kernel with debug messages:

424   00:01:33.674181 sendto(6, "E\0\1H\0\0\0\0@\21y\246\0\0\0\0\377\377\377\377\0D\0C\00148\346\1\1\6\0\246\336\333\v\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0RT\0\
424   00:01:33.693873 close(6)          = 0
424   00:01:33.694652 fcntl64(5, F_SETFD, FD_CLOEXEC) = 0
424   00:01:33.695213 clock_gettime64(CLOCK_MONOTONIC, 0x7be18a18) = -1 EFAULT (Bad address)
424   00:01:33.695889 write(2, "udhcpc: clock_gettime(MONOTONIC) failed\n", 40) = -1 EFAULT (Bad address)
424   00:01:33.697311 exit_group(1)     = ?
424   00:01:33.698346 +++ exited with 1 +++

I only see that after adding debug messages in the kernel, so I guess there must be
a heisenbug somehere.

Anyway, indeed, I see (another kernel debug message):

__do_sys_clock_gettime: Returning -EFAULT on address 0x7bacc9a8

So udhcpc doesn't even try to read the reply because it crashes after sendto()
when trying to read the current time. Unless I am missing something, that means
that the problem happens somewhere on the send side.

To make things even more interesting, it looks like the failing system call
isn't always clock_gettime().

Guenter

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-12 20:05           ` Guenter Roeck
@ 2021-04-13  9:24             ` Eric Dumazet
  2021-04-13 10:43               ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2021-04-13  9:24 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Linus Torvalds, Xuan Zhuo, Michael S. Tsirkin,
	Linux Kernel Mailing List, Netdev

On Mon, Apr 12, 2021 at 10:05 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 4/12/21 10:38 AM, Eric Dumazet wrote:
> [ ... ]
>
> > Yes, I think this is the real issue here. This smells like some memory
> > corruption.
> >
> > In my traces, packet is correctly received in AF_PACKET queue.
> >
> > I have checked the skb is well formed.
> >
> > But the user space seems to never call poll() and recvmsg() on this
> > af_packet socket.
> >
>
> After sprinkling the kernel with debug messages:
>
> 424   00:01:33.674181 sendto(6, "E\0\1H\0\0\0\0@\21y\246\0\0\0\0\377\377\377\377\0D\0C\00148\346\1\1\6\0\246\336\333\v\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0RT\0\
> 424   00:01:33.693873 close(6)          = 0
> 424   00:01:33.694652 fcntl64(5, F_SETFD, FD_CLOEXEC) = 0
> 424   00:01:33.695213 clock_gettime64(CLOCK_MONOTONIC, 0x7be18a18) = -1 EFAULT (Bad address)
> 424   00:01:33.695889 write(2, "udhcpc: clock_gettime(MONOTONIC) failed\n", 40) = -1 EFAULT (Bad address)
> 424   00:01:33.697311 exit_group(1)     = ?
> 424   00:01:33.698346 +++ exited with 1 +++
>
> I only see that after adding debug messages in the kernel, so I guess there must be
> a heisenbug somehere.
>
> Anyway, indeed, I see (another kernel debug message):
>
> __do_sys_clock_gettime: Returning -EFAULT on address 0x7bacc9a8
>
> So udhcpc doesn't even try to read the reply because it crashes after sendto()
> when trying to read the current time. Unless I am missing something, that means
> that the problem happens somewhere on the send side.
>
> To make things even more interesting, it looks like the failing system call
> isn't always clock_gettime().
>
> Guenter


I think GRO fast path has never worked on SUPERH. Probably SUPERH has
never used a fast NIC (10Gbit+)

The following hack fixes the issue.


diff --git a/net/core/dev.c b/net/core/dev.c
index af8c1ea040b9364b076e2d72f04dc3de2d7e2f11..91ba89a645ff91d4cd4f3d8dc8a009bcb67da344
100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5916,13 +5916,16 @@ static struct list_head
*gro_list_prepare(struct napi_struct *napi,

 static void skb_gro_reset_offset(struct sk_buff *skb)
 {
+#if !defined(CONFIG_SUPERH)
        const struct skb_shared_info *pinfo = skb_shinfo(skb);
        const skb_frag_t *frag0 = &pinfo->frags[0];
+#endif

        NAPI_GRO_CB(skb)->data_offset = 0;
        NAPI_GRO_CB(skb)->frag0 = NULL;
        NAPI_GRO_CB(skb)->frag0_len = 0;

+#if !defined(CONFIG_SUPERH)
        if (!skb_headlen(skb) && pinfo->nr_frags &&
            !PageHighMem(skb_frag_page(frag0))) {
                NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0);
@@ -5930,6 +5933,7 @@ static void skb_gro_reset_offset(struct sk_buff *skb)
                                                    skb_frag_size(frag0),
                                                    skb->end - skb->tail);
        }
+#endif
 }

 static void gro_pull_from_frag0(struct sk_buff *skb, int grow)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-13  9:24             ` Eric Dumazet
@ 2021-04-13 10:43               ` Eric Dumazet
  2021-04-13 12:45                 ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2021-04-13 10:43 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Linus Torvalds, Xuan Zhuo, Michael S. Tsirkin,
	Linux Kernel Mailing List, Netdev

On Tue, Apr 13, 2021 at 11:24 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Apr 12, 2021 at 10:05 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > On 4/12/21 10:38 AM, Eric Dumazet wrote:
> > [ ... ]
> >
> > > Yes, I think this is the real issue here. This smells like some memory
> > > corruption.
> > >
> > > In my traces, packet is correctly received in AF_PACKET queue.
> > >
> > > I have checked the skb is well formed.
> > >
> > > But the user space seems to never call poll() and recvmsg() on this
> > > af_packet socket.
> > >
> >
> > After sprinkling the kernel with debug messages:
> >
> > 424   00:01:33.674181 sendto(6, "E\0\1H\0\0\0\0@\21y\246\0\0\0\0\377\377\377\377\0D\0C\00148\346\1\1\6\0\246\336\333\v\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0RT\0\
> > 424   00:01:33.693873 close(6)          = 0
> > 424   00:01:33.694652 fcntl64(5, F_SETFD, FD_CLOEXEC) = 0
> > 424   00:01:33.695213 clock_gettime64(CLOCK_MONOTONIC, 0x7be18a18) = -1 EFAULT (Bad address)
> > 424   00:01:33.695889 write(2, "udhcpc: clock_gettime(MONOTONIC) failed\n", 40) = -1 EFAULT (Bad address)
> > 424   00:01:33.697311 exit_group(1)     = ?
> > 424   00:01:33.698346 +++ exited with 1 +++
> >
> > I only see that after adding debug messages in the kernel, so I guess there must be
> > a heisenbug somehere.
> >
> > Anyway, indeed, I see (another kernel debug message):
> >
> > __do_sys_clock_gettime: Returning -EFAULT on address 0x7bacc9a8
> >
> > So udhcpc doesn't even try to read the reply because it crashes after sendto()
> > when trying to read the current time. Unless I am missing something, that means
> > that the problem happens somewhere on the send side.
> >
> > To make things even more interesting, it looks like the failing system call
> > isn't always clock_gettime().
> >
> > Guenter
>
>
> I think GRO fast path has never worked on SUPERH. Probably SUPERH has
> never used a fast NIC (10Gbit+)
>
> The following hack fixes the issue.
>
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index af8c1ea040b9364b076e2d72f04dc3de2d7e2f11..91ba89a645ff91d4cd4f3d8dc8a009bcb67da344
> 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5916,13 +5916,16 @@ static struct list_head
> *gro_list_prepare(struct napi_struct *napi,
>
>  static void skb_gro_reset_offset(struct sk_buff *skb)
>  {
> +#if !defined(CONFIG_SUPERH)
>         const struct skb_shared_info *pinfo = skb_shinfo(skb);
>         const skb_frag_t *frag0 = &pinfo->frags[0];
> +#endif
>
>         NAPI_GRO_CB(skb)->data_offset = 0;
>         NAPI_GRO_CB(skb)->frag0 = NULL;
>         NAPI_GRO_CB(skb)->frag0_len = 0;
>
> +#if !defined(CONFIG_SUPERH)
>         if (!skb_headlen(skb) && pinfo->nr_frags &&
>             !PageHighMem(skb_frag_page(frag0))) {
>                 NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0);
> @@ -5930,6 +5933,7 @@ static void skb_gro_reset_offset(struct sk_buff *skb)
>                                                     skb_frag_size(frag0),
>                                                     skb->end - skb->tail);
>         }
> +#endif
>  }
>
>  static void gro_pull_from_frag0(struct sk_buff *skb, int grow)

OK ... more sh debugging :

diff --git a/arch/sh/mm/alignment.c b/arch/sh/mm/alignment.c
index fb517b82a87b1065cf38c06cb3c178ce86587b00..5d18f9f792991105a8aa05cc6231b7d4532d72c9
100644
--- a/arch/sh/mm/alignment.c
+++ b/arch/sh/mm/alignment.c
@@ -27,7 +27,7 @@ static unsigned long se_multi;
    valid! */
 static int se_usermode = UM_WARN | UM_FIXUP;
 /* 0: no warning 1: print a warning message, disabled by default */
-static int se_kernmode_warn;
+static int se_kernmode_warn = 1;

 core_param(alignment, se_usermode, int, 0600);

@@ -103,7 +103,7 @@ void unaligned_fixups_notify(struct task_struct
*tsk, insn_size_t insn,
                          (void *)instruction_pointer(regs), insn);
        else if (se_kernmode_warn)
                pr_notice_ratelimited("Fixing up unaligned kernel access "
-                         "in \"%s\" pid=%d pc=0x%p ins=0x%04hx\n",
+                         "in \"%s\" pid=%d pc=%px ins=0x%04hx\n",
                          tsk->comm, task_pid_nr(tsk),
                          (void *)instruction_pointer(regs), insn);
 }

I now see something of interest :
Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc2e ins=0x6236
Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc2e ins=0x6236
Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc30 ins=0x6636
Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc30 ins=0x6636
Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636

So basically the frag0 idea only works if drivers respect NET_IP_ALIGN
(So that IP header is 4-byte aligned)

It seems either virtio_net or qemu does not respect the contract.

A possible generic fix  would then be :


diff --git a/net/core/dev.c b/net/core/dev.c
index af8c1ea040b9364b076e2d72f04dc3de2d7e2f11..1f79b9aa9a3f2392fddd1401f95ad098b5e03204
100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5924,7 +5924,8 @@ static void skb_gro_reset_offset(struct sk_buff *skb)
        NAPI_GRO_CB(skb)->frag0_len = 0;

        if (!skb_headlen(skb) && pinfo->nr_frags &&
-           !PageHighMem(skb_frag_page(frag0))) {
+           !PageHighMem(skb_frag_page(frag0)) &&
+           (!NET_IP_ALIGN || !(skb_frag_off(frag0) & 3))) {
                NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0);
                NAPI_GRO_CB(skb)->frag0_len = min_t(unsigned int,
                                                    skb_frag_size(frag0),

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-13 10:43               ` Eric Dumazet
@ 2021-04-13 12:45                 ` Eric Dumazet
  2021-04-13 12:52                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2021-04-13 12:45 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Linus Torvalds, Xuan Zhuo, Michael S. Tsirkin,
	Linux Kernel Mailing List, Netdev

On Tue, Apr 13, 2021 at 12:43 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Apr 13, 2021 at 11:24 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 10:05 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > >
> > > On 4/12/21 10:38 AM, Eric Dumazet wrote:
> > > [ ... ]
> > >
> > > > Yes, I think this is the real issue here. This smells like some memory
> > > > corruption.
> > > >
> > > > In my traces, packet is correctly received in AF_PACKET queue.
> > > >
> > > > I have checked the skb is well formed.
> > > >
> > > > But the user space seems to never call poll() and recvmsg() on this
> > > > af_packet socket.
> > > >
> > >
> > > After sprinkling the kernel with debug messages:
> > >
> > > 424   00:01:33.674181 sendto(6, "E\0\1H\0\0\0\0@\21y\246\0\0\0\0\377\377\377\377\0D\0C\00148\346\1\1\6\0\246\336\333\v\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0RT\0\
> > > 424   00:01:33.693873 close(6)          = 0
> > > 424   00:01:33.694652 fcntl64(5, F_SETFD, FD_CLOEXEC) = 0
> > > 424   00:01:33.695213 clock_gettime64(CLOCK_MONOTONIC, 0x7be18a18) = -1 EFAULT (Bad address)
> > > 424   00:01:33.695889 write(2, "udhcpc: clock_gettime(MONOTONIC) failed\n", 40) = -1 EFAULT (Bad address)
> > > 424   00:01:33.697311 exit_group(1)     = ?
> > > 424   00:01:33.698346 +++ exited with 1 +++
> > >
> > > I only see that after adding debug messages in the kernel, so I guess there must be
> > > a heisenbug somehere.
> > >
> > > Anyway, indeed, I see (another kernel debug message):
> > >
> > > __do_sys_clock_gettime: Returning -EFAULT on address 0x7bacc9a8
> > >
> > > So udhcpc doesn't even try to read the reply because it crashes after sendto()
> > > when trying to read the current time. Unless I am missing something, that means
> > > that the problem happens somewhere on the send side.
> > >
> > > To make things even more interesting, it looks like the failing system call
> > > isn't always clock_gettime().
> > >
> > > Guenter
> >
> >
> > I think GRO fast path has never worked on SUPERH. Probably SUPERH has
> > never used a fast NIC (10Gbit+)
> >
> > The following hack fixes the issue.
> >
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index af8c1ea040b9364b076e2d72f04dc3de2d7e2f11..91ba89a645ff91d4cd4f3d8dc8a009bcb67da344
> > 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -5916,13 +5916,16 @@ static struct list_head
> > *gro_list_prepare(struct napi_struct *napi,
> >
> >  static void skb_gro_reset_offset(struct sk_buff *skb)
> >  {
> > +#if !defined(CONFIG_SUPERH)
> >         const struct skb_shared_info *pinfo = skb_shinfo(skb);
> >         const skb_frag_t *frag0 = &pinfo->frags[0];
> > +#endif
> >
> >         NAPI_GRO_CB(skb)->data_offset = 0;
> >         NAPI_GRO_CB(skb)->frag0 = NULL;
> >         NAPI_GRO_CB(skb)->frag0_len = 0;
> >
> > +#if !defined(CONFIG_SUPERH)
> >         if (!skb_headlen(skb) && pinfo->nr_frags &&
> >             !PageHighMem(skb_frag_page(frag0))) {
> >                 NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0);
> > @@ -5930,6 +5933,7 @@ static void skb_gro_reset_offset(struct sk_buff *skb)
> >                                                     skb_frag_size(frag0),
> >                                                     skb->end - skb->tail);
> >         }
> > +#endif
> >  }
> >
> >  static void gro_pull_from_frag0(struct sk_buff *skb, int grow)
>
> OK ... more sh debugging :
>
> diff --git a/arch/sh/mm/alignment.c b/arch/sh/mm/alignment.c
> index fb517b82a87b1065cf38c06cb3c178ce86587b00..5d18f9f792991105a8aa05cc6231b7d4532d72c9
> 100644
> --- a/arch/sh/mm/alignment.c
> +++ b/arch/sh/mm/alignment.c
> @@ -27,7 +27,7 @@ static unsigned long se_multi;
>     valid! */
>  static int se_usermode = UM_WARN | UM_FIXUP;
>  /* 0: no warning 1: print a warning message, disabled by default */
> -static int se_kernmode_warn;
> +static int se_kernmode_warn = 1;
>
>  core_param(alignment, se_usermode, int, 0600);
>
> @@ -103,7 +103,7 @@ void unaligned_fixups_notify(struct task_struct
> *tsk, insn_size_t insn,
>                           (void *)instruction_pointer(regs), insn);
>         else if (se_kernmode_warn)
>                 pr_notice_ratelimited("Fixing up unaligned kernel access "
> -                         "in \"%s\" pid=%d pc=0x%p ins=0x%04hx\n",
> +                         "in \"%s\" pid=%d pc=%px ins=0x%04hx\n",
>                           tsk->comm, task_pid_nr(tsk),
>                           (void *)instruction_pointer(regs), insn);
>  }
>
> I now see something of interest :
> Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc2e ins=0x6236
> Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc2e ins=0x6236
> Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc30 ins=0x6636
> Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc30 ins=0x6636
> Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
>
> So basically the frag0 idea only works if drivers respect NET_IP_ALIGN
> (So that IP header is 4-byte aligned)
>
> It seems either virtio_net or qemu does not respect the contract.
>
> A possible generic fix  would then be :
>
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index af8c1ea040b9364b076e2d72f04dc3de2d7e2f11..1f79b9aa9a3f2392fddd1401f95ad098b5e03204
> 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5924,7 +5924,8 @@ static void skb_gro_reset_offset(struct sk_buff *skb)
>         NAPI_GRO_CB(skb)->frag0_len = 0;
>
>         if (!skb_headlen(skb) && pinfo->nr_frags &&
> -           !PageHighMem(skb_frag_page(frag0))) {
> +           !PageHighMem(skb_frag_page(frag0)) &&
> +           (!NET_IP_ALIGN || !(skb_frag_off(frag0) & 3))) {
>                 NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0);
>                 NAPI_GRO_CB(skb)->frag0_len = min_t(unsigned int,
>                                                     skb_frag_size(frag0),


Official submission :

https://patchwork.kernel.org/project/netdevbpf/patch/20210413124136.2750358-1-eric.dumazet@gmail.com/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-13 12:45                 ` Eric Dumazet
@ 2021-04-13 12:52                   ` Michael S. Tsirkin
  0 siblings, 0 replies; 18+ messages in thread
From: Michael S. Tsirkin @ 2021-04-13 12:52 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Guenter Roeck, Linus Torvalds, Xuan Zhuo,
	Linux Kernel Mailing List, Netdev

On Tue, Apr 13, 2021 at 02:45:46PM +0200, Eric Dumazet wrote:
> On Tue, Apr 13, 2021 at 12:43 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Tue, Apr 13, 2021 at 11:24 AM Eric Dumazet <edumazet@google.com> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 10:05 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > >
> > > > On 4/12/21 10:38 AM, Eric Dumazet wrote:
> > > > [ ... ]
> > > >
> > > > > Yes, I think this is the real issue here. This smells like some memory
> > > > > corruption.
> > > > >
> > > > > In my traces, packet is correctly received in AF_PACKET queue.
> > > > >
> > > > > I have checked the skb is well formed.
> > > > >
> > > > > But the user space seems to never call poll() and recvmsg() on this
> > > > > af_packet socket.
> > > > >
> > > >
> > > > After sprinkling the kernel with debug messages:
> > > >
> > > > 424   00:01:33.674181 sendto(6, "E\0\1H\0\0\0\0@\21y\246\0\0\0\0\377\377\377\377\0D\0C\00148\346\1\1\6\0\246\336\333\v\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0RT\0\
> > > > 424   00:01:33.693873 close(6)          = 0
> > > > 424   00:01:33.694652 fcntl64(5, F_SETFD, FD_CLOEXEC) = 0
> > > > 424   00:01:33.695213 clock_gettime64(CLOCK_MONOTONIC, 0x7be18a18) = -1 EFAULT (Bad address)
> > > > 424   00:01:33.695889 write(2, "udhcpc: clock_gettime(MONOTONIC) failed\n", 40) = -1 EFAULT (Bad address)
> > > > 424   00:01:33.697311 exit_group(1)     = ?
> > > > 424   00:01:33.698346 +++ exited with 1 +++
> > > >
> > > > I only see that after adding debug messages in the kernel, so I guess there must be
> > > > a heisenbug somehere.
> > > >
> > > > Anyway, indeed, I see (another kernel debug message):
> > > >
> > > > __do_sys_clock_gettime: Returning -EFAULT on address 0x7bacc9a8
> > > >
> > > > So udhcpc doesn't even try to read the reply because it crashes after sendto()
> > > > when trying to read the current time. Unless I am missing something, that means
> > > > that the problem happens somewhere on the send side.
> > > >
> > > > To make things even more interesting, it looks like the failing system call
> > > > isn't always clock_gettime().
> > > >
> > > > Guenter
> > >
> > >
> > > I think GRO fast path has never worked on SUPERH. Probably SUPERH has
> > > never used a fast NIC (10Gbit+)
> > >
> > > The following hack fixes the issue.
> > >
> > >
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index af8c1ea040b9364b076e2d72f04dc3de2d7e2f11..91ba89a645ff91d4cd4f3d8dc8a009bcb67da344
> > > 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -5916,13 +5916,16 @@ static struct list_head
> > > *gro_list_prepare(struct napi_struct *napi,
> > >
> > >  static void skb_gro_reset_offset(struct sk_buff *skb)
> > >  {
> > > +#if !defined(CONFIG_SUPERH)
> > >         const struct skb_shared_info *pinfo = skb_shinfo(skb);
> > >         const skb_frag_t *frag0 = &pinfo->frags[0];
> > > +#endif
> > >
> > >         NAPI_GRO_CB(skb)->data_offset = 0;
> > >         NAPI_GRO_CB(skb)->frag0 = NULL;
> > >         NAPI_GRO_CB(skb)->frag0_len = 0;
> > >
> > > +#if !defined(CONFIG_SUPERH)
> > >         if (!skb_headlen(skb) && pinfo->nr_frags &&
> > >             !PageHighMem(skb_frag_page(frag0))) {
> > >                 NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0);
> > > @@ -5930,6 +5933,7 @@ static void skb_gro_reset_offset(struct sk_buff *skb)
> > >                                                     skb_frag_size(frag0),
> > >                                                     skb->end - skb->tail);
> > >         }
> > > +#endif
> > >  }
> > >
> > >  static void gro_pull_from_frag0(struct sk_buff *skb, int grow)
> >
> > OK ... more sh debugging :
> >
> > diff --git a/arch/sh/mm/alignment.c b/arch/sh/mm/alignment.c
> > index fb517b82a87b1065cf38c06cb3c178ce86587b00..5d18f9f792991105a8aa05cc6231b7d4532d72c9
> > 100644
> > --- a/arch/sh/mm/alignment.c
> > +++ b/arch/sh/mm/alignment.c
> > @@ -27,7 +27,7 @@ static unsigned long se_multi;
> >     valid! */
> >  static int se_usermode = UM_WARN | UM_FIXUP;
> >  /* 0: no warning 1: print a warning message, disabled by default */
> > -static int se_kernmode_warn;
> > +static int se_kernmode_warn = 1;
> >
> >  core_param(alignment, se_usermode, int, 0600);
> >
> > @@ -103,7 +103,7 @@ void unaligned_fixups_notify(struct task_struct
> > *tsk, insn_size_t insn,
> >                           (void *)instruction_pointer(regs), insn);
> >         else if (se_kernmode_warn)
> >                 pr_notice_ratelimited("Fixing up unaligned kernel access "
> > -                         "in \"%s\" pid=%d pc=0x%p ins=0x%04hx\n",
> > +                         "in \"%s\" pid=%d pc=%px ins=0x%04hx\n",
> >                           tsk->comm, task_pid_nr(tsk),
> >                           (void *)instruction_pointer(regs), insn);
> >  }
> >
> > I now see something of interest :
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc2e ins=0x6236
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc2e ins=0x6236
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc30 ins=0x6636
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc30 ins=0x6636
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> >
> > So basically the frag0 idea only works if drivers respect NET_IP_ALIGN
> > (So that IP header is 4-byte aligned)
> >
> > It seems either virtio_net or qemu does not respect the contract.
> >
> > A possible generic fix  would then be :
> >
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index af8c1ea040b9364b076e2d72f04dc3de2d7e2f11..1f79b9aa9a3f2392fddd1401f95ad098b5e03204
> > 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -5924,7 +5924,8 @@ static void skb_gro_reset_offset(struct sk_buff *skb)
> >         NAPI_GRO_CB(skb)->frag0_len = 0;
> >
> >         if (!skb_headlen(skb) && pinfo->nr_frags &&
> > -           !PageHighMem(skb_frag_page(frag0))) {
> > +           !PageHighMem(skb_frag_page(frag0)) &&
> > +           (!NET_IP_ALIGN || !(skb_frag_off(frag0) & 3))) {
> >                 NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0);
> >                 NAPI_GRO_CB(skb)->frag0_len = min_t(unsigned int,
> >                                                     skb_frag_size(frag0),
> 
> 
> Official submission :
> 
> https://patchwork.kernel.org/project/netdevbpf/patch/20210413124136.2750358-1-eric.dumazet@gmail.com/

Thanks a lot Eric!

-- 
MST


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-12 16:47       ` Eric Dumazet
@ 2021-04-13 12:57         ` Michael S. Tsirkin
  2021-04-13 13:27           ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Michael S. Tsirkin @ 2021-04-13 12:57 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Linus Torvalds, Guenter Roeck, Xuan Zhuo,
	Linux Kernel Mailing List, Netdev

On Mon, Apr 12, 2021 at 06:47:07PM +0200, Eric Dumazet wrote:
> On Mon, Apr 12, 2021 at 6:31 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 6:28 PM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> > >
> > > On Sun, Apr 11, 2021 at 10:14 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > >
> > > > Qemu test results:
> > > >         total: 460 pass: 459 fail: 1
> > > > Failed tests:
> > > >         sh:rts7751r2dplus_defconfig:ata:net,virtio-net:rootfs
> > > >
> > > > The failure bisects to commit 0f6925b3e8da ("virtio_net: Do not pull payload in
> > > > skb->head"). It is a spurious problem - the test passes roughly every other
> > > > time. When the failure is seen, udhcpc fails to get an IP address and aborts
> > > > with SIGTERM. So far I have only seen this with the "sh" architecture.
> > >
> > > Hmm. Let's add in some more of the people involved in that commit, and
> > > also netdev.
> > >
> > > Nothing in there looks like it should have any interaction with
> > > architecture, so that "it happens on sh" sounds odd, but maybe it's
> > > some particular interaction with the qemu environment.
> >
> > Yes, maybe.
> >
> > I spent few hours on this, and suspect a buggy memcpy() implementation
> > on SH, but this was not conclusive.
> >
> > By pulling one extra byte, the problem goes away.
> >
> > Strange thing is that the udhcpc process does not go past sendto().
> 
> This is the patch working around the issue. Unfortunately I was not
> able to root-cause it (I really suspect something on SH)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 0824e6999e49957f7aaf7c990f6259792d42f32b..fd890a951beea03bdf24406809042666eb972655
> 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -408,11 +408,17 @@ static struct sk_buff *page_to_skb(struct
> virtnet_info *vi,
> 
>         /* Copy all frame if it fits skb->head, otherwise
>          * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
> +        *
> +        * Apparently, pulling only the Ethernet Header triggers a bug
> on qemu-system-sh4.
> +        * Since GRO aggregation really cares of IPv4/IPv6, pull 20 bytes
> +        * more to work around this bug : These 20 bytes can not belong
> +        * to UDP/TCP payload.
> +        * As a bonus, this makes GRO slightly faster for IPv4 (one less copy).
>          */

Question: do we still want to do this for performance reasons?
We also have the hdr_len coming from the device which is
just skb_headlen on the host.

>         if (len <= skb_tailroom(skb))
>                 copy = len;
>         else
> -               copy = ETH_HLEN + metasize;
> +               copy = ETH_HLEN + sizeof(struct iphdr) + metasize;
>         skb_put_data(skb, p, copy);
> 
>         if (metasize) {


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-13 12:57         ` Michael S. Tsirkin
@ 2021-04-13 13:27           ` Eric Dumazet
  2021-04-13 13:33             ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2021-04-13 13:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Linus Torvalds, Guenter Roeck, Xuan Zhuo,
	Linux Kernel Mailing List, Netdev

On Tue, Apr 13, 2021 at 2:57 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Apr 12, 2021 at 06:47:07PM +0200, Eric Dumazet wrote:
> > On Mon, Apr 12, 2021 at 6:31 PM Eric Dumazet <edumazet@google.com> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 6:28 PM Linus Torvalds
> > > <torvalds@linux-foundation.org> wrote:
> > > >
> > > > On Sun, Apr 11, 2021 at 10:14 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > >
> > > > > Qemu test results:
> > > > >         total: 460 pass: 459 fail: 1
> > > > > Failed tests:
> > > > >         sh:rts7751r2dplus_defconfig:ata:net,virtio-net:rootfs
> > > > >
> > > > > The failure bisects to commit 0f6925b3e8da ("virtio_net: Do not pull payload in
> > > > > skb->head"). It is a spurious problem - the test passes roughly every other
> > > > > time. When the failure is seen, udhcpc fails to get an IP address and aborts
> > > > > with SIGTERM. So far I have only seen this with the "sh" architecture.
> > > >
> > > > Hmm. Let's add in some more of the people involved in that commit, and
> > > > also netdev.
> > > >
> > > > Nothing in there looks like it should have any interaction with
> > > > architecture, so that "it happens on sh" sounds odd, but maybe it's
> > > > some particular interaction with the qemu environment.
> > >
> > > Yes, maybe.
> > >
> > > I spent few hours on this, and suspect a buggy memcpy() implementation
> > > on SH, but this was not conclusive.
> > >
> > > By pulling one extra byte, the problem goes away.
> > >
> > > Strange thing is that the udhcpc process does not go past sendto().
> >
> > This is the patch working around the issue. Unfortunately I was not
> > able to root-cause it (I really suspect something on SH)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 0824e6999e49957f7aaf7c990f6259792d42f32b..fd890a951beea03bdf24406809042666eb972655
> > 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -408,11 +408,17 @@ static struct sk_buff *page_to_skb(struct
> > virtnet_info *vi,
> >
> >         /* Copy all frame if it fits skb->head, otherwise
> >          * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
> > +        *
> > +        * Apparently, pulling only the Ethernet Header triggers a bug
> > on qemu-system-sh4.
> > +        * Since GRO aggregation really cares of IPv4/IPv6, pull 20 bytes
> > +        * more to work around this bug : These 20 bytes can not belong
> > +        * to UDP/TCP payload.
> > +        * As a bonus, this makes GRO slightly faster for IPv4 (one less copy).
> >          */
>
> Question: do we still want to do this for performance reasons?
> We also have the hdr_len coming from the device which is
> just skb_headlen on the host.

Well, putting 20 bytes in skb->head will disable frag0 optimization.

The change would only benefit to sh architecture :)

About hdr_len, I suppose we could try it, with appropriate safety checks.

>
> >         if (len <= skb_tailroom(skb))
> >                 copy = len;
> >         else
> > -               copy = ETH_HLEN + metasize;
> > +               copy = ETH_HLEN + sizeof(struct iphdr) + metasize;
> >         skb_put_data(skb, p, copy);
> >
> >         if (metasize) {
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-13 13:27           ` Eric Dumazet
@ 2021-04-13 13:33             ` Eric Dumazet
  2021-04-13 13:37               ` Michael S. Tsirkin
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2021-04-13 13:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Linus Torvalds, Guenter Roeck, Xuan Zhuo,
	Linux Kernel Mailing List, Netdev

On Tue, Apr 13, 2021 at 3:27 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Apr 13, 2021 at 2:57 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 06:47:07PM +0200, Eric Dumazet wrote:
> > > On Mon, Apr 12, 2021 at 6:31 PM Eric Dumazet <edumazet@google.com> wrote:
> > > >
> > > > On Mon, Apr 12, 2021 at 6:28 PM Linus Torvalds
> > > > <torvalds@linux-foundation.org> wrote:
> > > > >
> > > > > On Sun, Apr 11, 2021 at 10:14 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > >
> > > > > > Qemu test results:
> > > > > >         total: 460 pass: 459 fail: 1
> > > > > > Failed tests:
> > > > > >         sh:rts7751r2dplus_defconfig:ata:net,virtio-net:rootfs
> > > > > >
> > > > > > The failure bisects to commit 0f6925b3e8da ("virtio_net: Do not pull payload in
> > > > > > skb->head"). It is a spurious problem - the test passes roughly every other
> > > > > > time. When the failure is seen, udhcpc fails to get an IP address and aborts
> > > > > > with SIGTERM. So far I have only seen this with the "sh" architecture.
> > > > >
> > > > > Hmm. Let's add in some more of the people involved in that commit, and
> > > > > also netdev.
> > > > >
> > > > > Nothing in there looks like it should have any interaction with
> > > > > architecture, so that "it happens on sh" sounds odd, but maybe it's
> > > > > some particular interaction with the qemu environment.
> > > >
> > > > Yes, maybe.
> > > >
> > > > I spent few hours on this, and suspect a buggy memcpy() implementation
> > > > on SH, but this was not conclusive.
> > > >
> > > > By pulling one extra byte, the problem goes away.
> > > >
> > > > Strange thing is that the udhcpc process does not go past sendto().
> > >
> > > This is the patch working around the issue. Unfortunately I was not
> > > able to root-cause it (I really suspect something on SH)
> > >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index 0824e6999e49957f7aaf7c990f6259792d42f32b..fd890a951beea03bdf24406809042666eb972655
> > > 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -408,11 +408,17 @@ static struct sk_buff *page_to_skb(struct
> > > virtnet_info *vi,
> > >
> > >         /* Copy all frame if it fits skb->head, otherwise
> > >          * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
> > > +        *
> > > +        * Apparently, pulling only the Ethernet Header triggers a bug
> > > on qemu-system-sh4.
> > > +        * Since GRO aggregation really cares of IPv4/IPv6, pull 20 bytes
> > > +        * more to work around this bug : These 20 bytes can not belong
> > > +        * to UDP/TCP payload.
> > > +        * As a bonus, this makes GRO slightly faster for IPv4 (one less copy).
> > >          */
> >
> > Question: do we still want to do this for performance reasons?
> > We also have the hdr_len coming from the device which is
> > just skb_headlen on the host.
>
> Well, putting 20 bytes in skb->head will disable frag0 optimization.
>
> The change would only benefit to sh architecture :)
>
> About hdr_len, I suppose we could try it, with appropriate safety checks.

I have added traces, hdr_len seems to be 0 with the qemu-system-sh4 I am using.

Have I understood you correctly ?

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 0824e6999e49957f7aaf7c990f6259792d42f32b..f024860f7dc260d4efbc35a3b8ffd358bd0da894
100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -399,9 +399,10 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
                hdr_padded_len = sizeof(struct padded_vnet_hdr);

        /* hdr_valid means no XDP, so we can copy the vnet header */
-       if (hdr_valid)
+       if (hdr_valid) {
                memcpy(hdr, p, hdr_len);
-
+               pr_err("hdr->hdr_len=%u\n", hdr->hdr.hdr_len);
+       }
        len -= hdr_len;
        offset += hdr_padded_len;
        p += hdr_padded_len;

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-13 13:33             ` Eric Dumazet
@ 2021-04-13 13:37               ` Michael S. Tsirkin
  2021-04-13 13:42                 ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Michael S. Tsirkin @ 2021-04-13 13:37 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Linus Torvalds, Guenter Roeck, Xuan Zhuo,
	Linux Kernel Mailing List, Netdev

On Tue, Apr 13, 2021 at 03:33:40PM +0200, Eric Dumazet wrote:
> On Tue, Apr 13, 2021 at 3:27 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Tue, Apr 13, 2021 at 2:57 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 06:47:07PM +0200, Eric Dumazet wrote:
> > > > On Mon, Apr 12, 2021 at 6:31 PM Eric Dumazet <edumazet@google.com> wrote:
> > > > >
> > > > > On Mon, Apr 12, 2021 at 6:28 PM Linus Torvalds
> > > > > <torvalds@linux-foundation.org> wrote:
> > > > > >
> > > > > > On Sun, Apr 11, 2021 at 10:14 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > >
> > > > > > > Qemu test results:
> > > > > > >         total: 460 pass: 459 fail: 1
> > > > > > > Failed tests:
> > > > > > >         sh:rts7751r2dplus_defconfig:ata:net,virtio-net:rootfs
> > > > > > >
> > > > > > > The failure bisects to commit 0f6925b3e8da ("virtio_net: Do not pull payload in
> > > > > > > skb->head"). It is a spurious problem - the test passes roughly every other
> > > > > > > time. When the failure is seen, udhcpc fails to get an IP address and aborts
> > > > > > > with SIGTERM. So far I have only seen this with the "sh" architecture.
> > > > > >
> > > > > > Hmm. Let's add in some more of the people involved in that commit, and
> > > > > > also netdev.
> > > > > >
> > > > > > Nothing in there looks like it should have any interaction with
> > > > > > architecture, so that "it happens on sh" sounds odd, but maybe it's
> > > > > > some particular interaction with the qemu environment.
> > > > >
> > > > > Yes, maybe.
> > > > >
> > > > > I spent few hours on this, and suspect a buggy memcpy() implementation
> > > > > on SH, but this was not conclusive.
> > > > >
> > > > > By pulling one extra byte, the problem goes away.
> > > > >
> > > > > Strange thing is that the udhcpc process does not go past sendto().
> > > >
> > > > This is the patch working around the issue. Unfortunately I was not
> > > > able to root-cause it (I really suspect something on SH)
> > > >
> > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > index 0824e6999e49957f7aaf7c990f6259792d42f32b..fd890a951beea03bdf24406809042666eb972655
> > > > 100644
> > > > --- a/drivers/net/virtio_net.c
> > > > +++ b/drivers/net/virtio_net.c
> > > > @@ -408,11 +408,17 @@ static struct sk_buff *page_to_skb(struct
> > > > virtnet_info *vi,
> > > >
> > > >         /* Copy all frame if it fits skb->head, otherwise
> > > >          * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
> > > > +        *
> > > > +        * Apparently, pulling only the Ethernet Header triggers a bug
> > > > on qemu-system-sh4.
> > > > +        * Since GRO aggregation really cares of IPv4/IPv6, pull 20 bytes
> > > > +        * more to work around this bug : These 20 bytes can not belong
> > > > +        * to UDP/TCP payload.
> > > > +        * As a bonus, this makes GRO slightly faster for IPv4 (one less copy).
> > > >          */
> > >
> > > Question: do we still want to do this for performance reasons?
> > > We also have the hdr_len coming from the device which is
> > > just skb_headlen on the host.
> >
> > Well, putting 20 bytes in skb->head will disable frag0 optimization.
> >
> > The change would only benefit to sh architecture :)
> >
> > About hdr_len, I suppose we could try it, with appropriate safety checks.
> 
> I have added traces, hdr_len seems to be 0 with the qemu-system-sh4 I am using.
> 
> Have I understood you correctly ?
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 0824e6999e49957f7aaf7c990f6259792d42f32b..f024860f7dc260d4efbc35a3b8ffd358bd0da894
> 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -399,9 +399,10 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
>                 hdr_padded_len = sizeof(struct padded_vnet_hdr);
> 
>         /* hdr_valid means no XDP, so we can copy the vnet header */
> -       if (hdr_valid)
> +       if (hdr_valid) {
>                 memcpy(hdr, p, hdr_len);
> -
> +               pr_err("hdr->hdr_len=%u\n", hdr->hdr.hdr_len);
> +       }
>         len -= hdr_len;
>         offset += hdr_padded_len;
>         p += hdr_padded_len;


Depends on how you connect qemu on the host. It's filled by host tap,
see virtio_net_hdr_from_skb. If you are using slirp that just zero-fills
it.


-- 
MST


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-13 13:37               ` Michael S. Tsirkin
@ 2021-04-13 13:42                 ` Eric Dumazet
  2021-04-13 13:46                   ` Michael S. Tsirkin
  2021-04-13 13:58                   ` Michael S. Tsirkin
  0 siblings, 2 replies; 18+ messages in thread
From: Eric Dumazet @ 2021-04-13 13:42 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Linus Torvalds, Guenter Roeck, Xuan Zhuo,
	Linux Kernel Mailing List, Netdev

On Tue, Apr 13, 2021 at 3:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Apr 13, 2021 at 03:33:40PM +0200, Eric Dumazet wrote:
> > On Tue, Apr 13, 2021 at 3:27 PM Eric Dumazet <edumazet@google.com> wrote:
> > >
> > > On Tue, Apr 13, 2021 at 2:57 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, Apr 12, 2021 at 06:47:07PM +0200, Eric Dumazet wrote:
> > > > > On Mon, Apr 12, 2021 at 6:31 PM Eric Dumazet <edumazet@google.com> wrote:
> > > > > >
> > > > > > On Mon, Apr 12, 2021 at 6:28 PM Linus Torvalds
> > > > > > <torvalds@linux-foundation.org> wrote:
> > > > > > >
> > > > > > > On Sun, Apr 11, 2021 at 10:14 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > >
> > > > > > > > Qemu test results:
> > > > > > > >         total: 460 pass: 459 fail: 1
> > > > > > > > Failed tests:
> > > > > > > >         sh:rts7751r2dplus_defconfig:ata:net,virtio-net:rootfs
> > > > > > > >
> > > > > > > > The failure bisects to commit 0f6925b3e8da ("virtio_net: Do not pull payload in
> > > > > > > > skb->head"). It is a spurious problem - the test passes roughly every other
> > > > > > > > time. When the failure is seen, udhcpc fails to get an IP address and aborts
> > > > > > > > with SIGTERM. So far I have only seen this with the "sh" architecture.
> > > > > > >
> > > > > > > Hmm. Let's add in some more of the people involved in that commit, and
> > > > > > > also netdev.
> > > > > > >
> > > > > > > Nothing in there looks like it should have any interaction with
> > > > > > > architecture, so that "it happens on sh" sounds odd, but maybe it's
> > > > > > > some particular interaction with the qemu environment.
> > > > > >
> > > > > > Yes, maybe.
> > > > > >
> > > > > > I spent few hours on this, and suspect a buggy memcpy() implementation
> > > > > > on SH, but this was not conclusive.
> > > > > >
> > > > > > By pulling one extra byte, the problem goes away.
> > > > > >
> > > > > > Strange thing is that the udhcpc process does not go past sendto().
> > > > >
> > > > > This is the patch working around the issue. Unfortunately I was not
> > > > > able to root-cause it (I really suspect something on SH)
> > > > >
> > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > index 0824e6999e49957f7aaf7c990f6259792d42f32b..fd890a951beea03bdf24406809042666eb972655
> > > > > 100644
> > > > > --- a/drivers/net/virtio_net.c
> > > > > +++ b/drivers/net/virtio_net.c
> > > > > @@ -408,11 +408,17 @@ static struct sk_buff *page_to_skb(struct
> > > > > virtnet_info *vi,
> > > > >
> > > > >         /* Copy all frame if it fits skb->head, otherwise
> > > > >          * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
> > > > > +        *
> > > > > +        * Apparently, pulling only the Ethernet Header triggers a bug
> > > > > on qemu-system-sh4.
> > > > > +        * Since GRO aggregation really cares of IPv4/IPv6, pull 20 bytes
> > > > > +        * more to work around this bug : These 20 bytes can not belong
> > > > > +        * to UDP/TCP payload.
> > > > > +        * As a bonus, this makes GRO slightly faster for IPv4 (one less copy).
> > > > >          */
> > > >
> > > > Question: do we still want to do this for performance reasons?
> > > > We also have the hdr_len coming from the device which is
> > > > just skb_headlen on the host.
> > >
> > > Well, putting 20 bytes in skb->head will disable frag0 optimization.
> > >
> > > The change would only benefit to sh architecture :)
> > >
> > > About hdr_len, I suppose we could try it, with appropriate safety checks.
> >
> > I have added traces, hdr_len seems to be 0 with the qemu-system-sh4 I am using.
> >
> > Have I understood you correctly ?
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 0824e6999e49957f7aaf7c990f6259792d42f32b..f024860f7dc260d4efbc35a3b8ffd358bd0da894
> > 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -399,9 +399,10 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
> >                 hdr_padded_len = sizeof(struct padded_vnet_hdr);
> >
> >         /* hdr_valid means no XDP, so we can copy the vnet header */
> > -       if (hdr_valid)
> > +       if (hdr_valid) {
> >                 memcpy(hdr, p, hdr_len);
> > -
> > +               pr_err("hdr->hdr_len=%u\n", hdr->hdr.hdr_len);
> > +       }
> >         len -= hdr_len;
> >         offset += hdr_padded_len;
> >         p += hdr_padded_len;
>
>
> Depends on how you connect qemu on the host. It's filled by host tap,
> see virtio_net_hdr_from_skb. If you are using slirp that just zero-fills
> it.

Guenter provided :

qemu-system-sh4 -M r2d -kernel ./arch/sh/boot/zImage -no-reboot \
        -snapshot \
        -drive file=rootfs.ext2,format=raw,if=ide \
        -device virtio-net,netdev=net0 -netdev user,id=net0 \
        -append "root=/dev/sda console=ttySC1,115200
earlycon=scif,mmio16,0xffe80000 noiotrap" \
        -serial null -serial stdio -nographic -monitor null

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-13 13:42                 ` Eric Dumazet
@ 2021-04-13 13:46                   ` Michael S. Tsirkin
  2021-04-13 13:58                   ` Michael S. Tsirkin
  1 sibling, 0 replies; 18+ messages in thread
From: Michael S. Tsirkin @ 2021-04-13 13:46 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Linus Torvalds, Guenter Roeck, Xuan Zhuo,
	Linux Kernel Mailing List, Netdev

On Tue, Apr 13, 2021 at 03:42:24PM +0200, Eric Dumazet wrote:
> On Tue, Apr 13, 2021 at 3:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Apr 13, 2021 at 03:33:40PM +0200, Eric Dumazet wrote:
> > > On Tue, Apr 13, 2021 at 3:27 PM Eric Dumazet <edumazet@google.com> wrote:
> > > >
> > > > On Tue, Apr 13, 2021 at 2:57 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Apr 12, 2021 at 06:47:07PM +0200, Eric Dumazet wrote:
> > > > > > On Mon, Apr 12, 2021 at 6:31 PM Eric Dumazet <edumazet@google.com> wrote:
> > > > > > >
> > > > > > > On Mon, Apr 12, 2021 at 6:28 PM Linus Torvalds
> > > > > > > <torvalds@linux-foundation.org> wrote:
> > > > > > > >
> > > > > > > > On Sun, Apr 11, 2021 at 10:14 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > > >
> > > > > > > > > Qemu test results:
> > > > > > > > >         total: 460 pass: 459 fail: 1
> > > > > > > > > Failed tests:
> > > > > > > > >         sh:rts7751r2dplus_defconfig:ata:net,virtio-net:rootfs
> > > > > > > > >
> > > > > > > > > The failure bisects to commit 0f6925b3e8da ("virtio_net: Do not pull payload in
> > > > > > > > > skb->head"). It is a spurious problem - the test passes roughly every other
> > > > > > > > > time. When the failure is seen, udhcpc fails to get an IP address and aborts
> > > > > > > > > with SIGTERM. So far I have only seen this with the "sh" architecture.
> > > > > > > >
> > > > > > > > Hmm. Let's add in some more of the people involved in that commit, and
> > > > > > > > also netdev.
> > > > > > > >
> > > > > > > > Nothing in there looks like it should have any interaction with
> > > > > > > > architecture, so that "it happens on sh" sounds odd, but maybe it's
> > > > > > > > some particular interaction with the qemu environment.
> > > > > > >
> > > > > > > Yes, maybe.
> > > > > > >
> > > > > > > I spent few hours on this, and suspect a buggy memcpy() implementation
> > > > > > > on SH, but this was not conclusive.
> > > > > > >
> > > > > > > By pulling one extra byte, the problem goes away.
> > > > > > >
> > > > > > > Strange thing is that the udhcpc process does not go past sendto().
> > > > > >
> > > > > > This is the patch working around the issue. Unfortunately I was not
> > > > > > able to root-cause it (I really suspect something on SH)
> > > > > >
> > > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > > index 0824e6999e49957f7aaf7c990f6259792d42f32b..fd890a951beea03bdf24406809042666eb972655
> > > > > > 100644
> > > > > > --- a/drivers/net/virtio_net.c
> > > > > > +++ b/drivers/net/virtio_net.c
> > > > > > @@ -408,11 +408,17 @@ static struct sk_buff *page_to_skb(struct
> > > > > > virtnet_info *vi,
> > > > > >
> > > > > >         /* Copy all frame if it fits skb->head, otherwise
> > > > > >          * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
> > > > > > +        *
> > > > > > +        * Apparently, pulling only the Ethernet Header triggers a bug
> > > > > > on qemu-system-sh4.
> > > > > > +        * Since GRO aggregation really cares of IPv4/IPv6, pull 20 bytes
> > > > > > +        * more to work around this bug : These 20 bytes can not belong
> > > > > > +        * to UDP/TCP payload.
> > > > > > +        * As a bonus, this makes GRO slightly faster for IPv4 (one less copy).
> > > > > >          */
> > > > >
> > > > > Question: do we still want to do this for performance reasons?
> > > > > We also have the hdr_len coming from the device which is
> > > > > just skb_headlen on the host.
> > > >
> > > > Well, putting 20 bytes in skb->head will disable frag0 optimization.
> > > >
> > > > The change would only benefit to sh architecture :)
> > > >
> > > > About hdr_len, I suppose we could try it, with appropriate safety checks.
> > >
> > > I have added traces, hdr_len seems to be 0 with the qemu-system-sh4 I am using.
> > >
> > > Have I understood you correctly ?
> > >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index 0824e6999e49957f7aaf7c990f6259792d42f32b..f024860f7dc260d4efbc35a3b8ffd358bd0da894
> > > 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -399,9 +399,10 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
> > >                 hdr_padded_len = sizeof(struct padded_vnet_hdr);
> > >
> > >         /* hdr_valid means no XDP, so we can copy the vnet header */
> > > -       if (hdr_valid)
> > > +       if (hdr_valid) {
> > >                 memcpy(hdr, p, hdr_len);
> > > -
> > > +               pr_err("hdr->hdr_len=%u\n", hdr->hdr.hdr_len);
> > > +       }
> > >         len -= hdr_len;
> > >         offset += hdr_padded_len;
> > >         p += hdr_padded_len;
> >
> >
> > Depends on how you connect qemu on the host. It's filled by host tap,
> > see virtio_net_hdr_from_skb. If you are using slirp that just zero-fills
> > it.
> 
> Guenter provided :
> 
> qemu-system-sh4 -M r2d -kernel ./arch/sh/boot/zImage -no-reboot \
>         -snapshot \
>         -drive file=rootfs.ext2,format=raw,if=ide \
>         -device virtio-net,netdev=net0 -netdev user,id=net0 \
>         -append "root=/dev/sda console=ttySC1,115200
> earlycon=scif,mmio16,0xffe80000 noiotrap" \
>         -serial null -serial stdio -nographic -monitor null

That's slirp, sure enough. It generates packets in userspace thus no
skbs thus no skb_headlen.

-- 
MST


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Linux 5.12-rc7
  2021-04-13 13:42                 ` Eric Dumazet
  2021-04-13 13:46                   ` Michael S. Tsirkin
@ 2021-04-13 13:58                   ` Michael S. Tsirkin
  1 sibling, 0 replies; 18+ messages in thread
From: Michael S. Tsirkin @ 2021-04-13 13:58 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Linus Torvalds, Guenter Roeck, Xuan Zhuo,
	Linux Kernel Mailing List, Netdev

On Tue, Apr 13, 2021 at 03:42:24PM +0200, Eric Dumazet wrote:
> On Tue, Apr 13, 2021 at 3:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Apr 13, 2021 at 03:33:40PM +0200, Eric Dumazet wrote:
> > > On Tue, Apr 13, 2021 at 3:27 PM Eric Dumazet <edumazet@google.com> wrote:
> > > >
> > > > On Tue, Apr 13, 2021 at 2:57 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Apr 12, 2021 at 06:47:07PM +0200, Eric Dumazet wrote:
> > > > > > On Mon, Apr 12, 2021 at 6:31 PM Eric Dumazet <edumazet@google.com> wrote:
> > > > > > >
> > > > > > > On Mon, Apr 12, 2021 at 6:28 PM Linus Torvalds
> > > > > > > <torvalds@linux-foundation.org> wrote:
> > > > > > > >
> > > > > > > > On Sun, Apr 11, 2021 at 10:14 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > > > > >
> > > > > > > > > Qemu test results:
> > > > > > > > >         total: 460 pass: 459 fail: 1
> > > > > > > > > Failed tests:
> > > > > > > > >         sh:rts7751r2dplus_defconfig:ata:net,virtio-net:rootfs
> > > > > > > > >
> > > > > > > > > The failure bisects to commit 0f6925b3e8da ("virtio_net: Do not pull payload in
> > > > > > > > > skb->head"). It is a spurious problem - the test passes roughly every other
> > > > > > > > > time. When the failure is seen, udhcpc fails to get an IP address and aborts
> > > > > > > > > with SIGTERM. So far I have only seen this with the "sh" architecture.
> > > > > > > >
> > > > > > > > Hmm. Let's add in some more of the people involved in that commit, and
> > > > > > > > also netdev.
> > > > > > > >
> > > > > > > > Nothing in there looks like it should have any interaction with
> > > > > > > > architecture, so that "it happens on sh" sounds odd, but maybe it's
> > > > > > > > some particular interaction with the qemu environment.
> > > > > > >
> > > > > > > Yes, maybe.
> > > > > > >
> > > > > > > I spent few hours on this, and suspect a buggy memcpy() implementation
> > > > > > > on SH, but this was not conclusive.
> > > > > > >
> > > > > > > By pulling one extra byte, the problem goes away.
> > > > > > >
> > > > > > > Strange thing is that the udhcpc process does not go past sendto().
> > > > > >
> > > > > > This is the patch working around the issue. Unfortunately I was not
> > > > > > able to root-cause it (I really suspect something on SH)
> > > > > >
> > > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > > index 0824e6999e49957f7aaf7c990f6259792d42f32b..fd890a951beea03bdf24406809042666eb972655
> > > > > > 100644
> > > > > > --- a/drivers/net/virtio_net.c
> > > > > > +++ b/drivers/net/virtio_net.c
> > > > > > @@ -408,11 +408,17 @@ static struct sk_buff *page_to_skb(struct
> > > > > > virtnet_info *vi,
> > > > > >
> > > > > >         /* Copy all frame if it fits skb->head, otherwise
> > > > > >          * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
> > > > > > +        *
> > > > > > +        * Apparently, pulling only the Ethernet Header triggers a bug
> > > > > > on qemu-system-sh4.
> > > > > > +        * Since GRO aggregation really cares of IPv4/IPv6, pull 20 bytes
> > > > > > +        * more to work around this bug : These 20 bytes can not belong
> > > > > > +        * to UDP/TCP payload.
> > > > > > +        * As a bonus, this makes GRO slightly faster for IPv4 (one less copy).
> > > > > >          */
> > > > >
> > > > > Question: do we still want to do this for performance reasons?
> > > > > We also have the hdr_len coming from the device which is
> > > > > just skb_headlen on the host.
> > > >
> > > > Well, putting 20 bytes in skb->head will disable frag0 optimization.
> > > >
> > > > The change would only benefit to sh architecture :)
> > > >
> > > > About hdr_len, I suppose we could try it, with appropriate safety checks.
> > >
> > > I have added traces, hdr_len seems to be 0 with the qemu-system-sh4 I am using.
> > >
> > > Have I understood you correctly ?
> > >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index 0824e6999e49957f7aaf7c990f6259792d42f32b..f024860f7dc260d4efbc35a3b8ffd358bd0da894
> > > 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -399,9 +399,10 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
> > >                 hdr_padded_len = sizeof(struct padded_vnet_hdr);
> > >
> > >         /* hdr_valid means no XDP, so we can copy the vnet header */
> > > -       if (hdr_valid)
> > > +       if (hdr_valid) {
> > >                 memcpy(hdr, p, hdr_len);
> > > -
> > > +               pr_err("hdr->hdr_len=%u\n", hdr->hdr.hdr_len);
> > > +       }
> > >         len -= hdr_len;
> > >         offset += hdr_padded_len;
> > >         p += hdr_padded_len;
> >
> >
> > Depends on how you connect qemu on the host. It's filled by host tap,
> > see virtio_net_hdr_from_skb. If you are using slirp that just zero-fills
> > it.
> 
> Guenter provided :
> 
> qemu-system-sh4 -M r2d -kernel ./arch/sh/boot/zImage -no-reboot \
>         -snapshot \
>         -drive file=rootfs.ext2,format=raw,if=ide \
>         -device virtio-net,netdev=net0 -netdev user,id=net0 \
>         -append "root=/dev/sda console=ttySC1,115200
> earlycon=scif,mmio16,0xffe80000 noiotrap" \
>         -serial null -serial stdio -nographic -monitor null


I do something like this (macvtap):

sudo ip link del macvtap0
sudo ip link add link enp0s25 name macvtap0 type macvtap mode bridge
#sudo ip link add link wlp3s0 name macvtap0 type macvtap mode bridge
sudo ip link set macvtap0 address 52:54:00:12:34:56 up
index=`cat /sys/class/net/macvtap0/ifindex`
sudo chgrp mst /dev/tap$index
sudo chmod g+rw /dev/tap$index

and then

-netdev tap,fds=6,id=net0,vhost=on -device virtio-net,netdev=net0 \
    6<>/dev/tap$index



^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2021-04-13 13:58 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CAHk-=wiHGchP=V=a4DbDN+imjGEc=2nvuLQVoeNXNxjpU1T8pg@mail.gmail.com>
     [not found] ` <20210412051445.GA47322@roeck-us.net>
2021-04-12 16:28   ` Linux 5.12-rc7 Linus Torvalds
2021-04-12 16:31     ` Michael S. Tsirkin
2021-04-12 16:31     ` Eric Dumazet
2021-04-12 16:47       ` Eric Dumazet
2021-04-13 12:57         ` Michael S. Tsirkin
2021-04-13 13:27           ` Eric Dumazet
2021-04-13 13:33             ` Eric Dumazet
2021-04-13 13:37               ` Michael S. Tsirkin
2021-04-13 13:42                 ` Eric Dumazet
2021-04-13 13:46                   ` Michael S. Tsirkin
2021-04-13 13:58                   ` Michael S. Tsirkin
2021-04-12 17:31       ` Guenter Roeck
2021-04-12 17:38         ` Eric Dumazet
2021-04-12 20:05           ` Guenter Roeck
2021-04-13  9:24             ` Eric Dumazet
2021-04-13 10:43               ` Eric Dumazet
2021-04-13 12:45                 ` Eric Dumazet
2021-04-13 12:52                   ` Michael S. Tsirkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).