* [Qemu-devel] [PATCH] rtl8139: flush queued packets when RxBufPtr is written
@ 2013-05-22 12:50 Stefan Hajnoczi
  2013-05-22 12:53 ` Andreas Färber
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2013-05-22 12:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: oliver.francke, Stefan Hajnoczi

Net queues support efficient "receive disable".  For example, tap's file
descriptor will not be polled while its peer has receive disabled.  This
saves CPU cycles for needlessly copying and then dropping packets which
the peer cannot receive.

rtl8139 is missing the qemu_flush_queued_packets() call that wakes the
queue up when receive becomes possible again.

As a result, the Windows 7 guest driver reaches a state where the
rtl8139 cannot receive packets.  The driver has actually refilled the
receive buffer but we never resume reception.

The bug can be reproduced by running a large FTP 'get' inside a Windows
7 guest:

  $ qemu -netdev tap,id=tap0,...
         -device rtl8139,netdev=tap0

The Linux guest driver does not trigger the bug, probably due to a
different buffer management strategy.

Reported-by: Oliver Francke <oliver.francke@filoo.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/net/rtl8139.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index 9369507..7993f9f 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -2575,6 +2575,9 @@ static void rtl8139_RxBufPtr_write(RTL8139State *s, uint32_t val)
     /* this value is off by 16 */
     s->RxBufPtr = MOD2(val + 0x10, s->RxBufferSize);
 
+    /* more buffer space may be available so try to receive */
+    qemu_flush_queued_packets(qemu_get_queue(s->nic));
+
     DPRINTF(" CAPR write: rx buffer length %d head 0x%04x read 0x%04x\n",
         s->RxBufferSize, s->RxBufAddr, s->RxBufPtr);
 }
-- 
1.8.1.4
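
The mechanism described in the commit message can be sketched as a toy
model in C.  This is only an illustration under stated assumptions: every
name below (toy_nic, toy_send, toy_flush, toy_guest_consume, run_scenario)
is hypothetical and is not a QEMU API, and the ring is counted in whole
packet slots instead of bytes.  It shows why a refill of the receive
buffer alone does not resume reception once receive has been disabled,
and why an explicit flush on the read-pointer write does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in QEMU. */
enum { RING_SLOTS = 4, QUEUE_MAX = 8 };

struct toy_nic {
    size_t used;            /* ring slots filled but not yet consumed by the guest */
    bool receive_disabled;  /* set once a delivery attempt found the ring full */
    size_t nqueued;         /* packets parked while the NIC cannot receive */
    size_t delivered;       /* packets that actually reached the ring */
};

/* Backend hands one packet to the NIC: true if delivered, false if parked. */
static bool toy_send(struct toy_nic *n)
{
    if (n->receive_disabled || n->used >= RING_SLOTS) {
        n->receive_disabled = true;     /* stays set until someone flushes */
        if (n->nqueued < QUEUE_MAX) {
            n->nqueued++;
        }
        return false;
    }
    n->used++;
    n->delivered++;
    return true;
}

/* Re-enable receive and deliver parked packets while the ring has room. */
static void toy_flush(struct toy_nic *n)
{
    n->receive_disabled = false;
    while (n->nqueued > 0 && n->used < RING_SLOTS) {
        n->nqueued--;
        n->used++;
        n->delivered++;
    }
    if (n->nqueued > 0) {
        n->receive_disabled = true;     /* ring filled up again */
    }
}

/* Guest frees ring slots; stands in for the driver's read-pointer write. */
static void toy_guest_consume(struct toy_nic *n, size_t slots, bool flush)
{
    n->used = slots > n->used ? 0 : n->used - slots;
    if (flush) {
        toy_flush(n);                   /* the step the patch adds */
    }
}

/* Overfill the ring, let the guest refill it, then send one more packet. */
static size_t run_scenario(bool flush_on_refill)
{
    struct toy_nic n = {0};
    for (int i = 0; i < 6; i++) {
        toy_send(&n);
    }
    toy_guest_consume(&n, RING_SLOTS, flush_on_refill);
    toy_send(&n);
    return n.delivered;
}
```

Calling run_scenario(false) models the stuck state: the ring is empty
again, but the packet count stops growing because nothing clears the
disabled flag.  run_scenario(true) delivers the parked packets and the
new one.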


* Re: [Qemu-devel] [PATCH] rtl8139: flush queued packets when RxBufPtr is written
  2013-05-22 12:50 [Qemu-devel] [PATCH] rtl8139: flush queued packets when RxBufPtr is written Stefan Hajnoczi
@ 2013-05-22 12:53 ` Andreas Färber
  2013-05-22 13:33   ` Stefan Hajnoczi
  2013-05-24 14:34 ` Stefan Hajnoczi
  2013-05-27  6:15 ` Peter Lieven
  2 siblings, 1 reply; 11+ messages in thread
From: Andreas Färber @ 2013-05-22 12:53 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: oliver.francke, qemu-devel, qemu-stable

On 22.05.2013 14:50, Stefan Hajnoczi wrote:
> Net queues support efficient "receive disable".  For example, tap's file
> descriptor will not be polled while its peer has receive disabled.  This
> saves CPU cycles for needlessly copying and then dropping packets which
> the peer cannot receive.
> 
> rtl8139 is missing the qemu_flush_queued_packets() call that wakes the
> queue up when receive becomes possible again.
> 
> As a result, the Windows 7 guest driver reaches a state where the
> rtl8139 cannot receive packets.  The driver has actually refilled the
> receive buffer but we never resume reception.
> 
> The bug can be reproduced by running a large FTP 'get' inside a Windows
> 7 guest:
> 
>   $ qemu -netdev tap,id=tap0,...
>          -device rtl8139,netdev=tap0
> 
> The Linux guest driver does not trigger the bug, probably due to a
> different buffer management strategy.
> 
> Reported-by: Oliver Francke <oliver.francke@filoo.de>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Sounds as if we should

Cc: qemu-stable@nongnu.org

Andreas

> ---
>  hw/net/rtl8139.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
> index 9369507..7993f9f 100644
> --- a/hw/net/rtl8139.c
> +++ b/hw/net/rtl8139.c
> @@ -2575,6 +2575,9 @@ static void rtl8139_RxBufPtr_write(RTL8139State *s, uint32_t val)
>      /* this value is off by 16 */
>      s->RxBufPtr = MOD2(val + 0x10, s->RxBufferSize);
>  
> +    /* more buffer space may be available so try to receive */
> +    qemu_flush_queued_packets(qemu_get_queue(s->nic));
> +
>      DPRINTF(" CAPR write: rx buffer length %d head 0x%04x read 0x%04x\n",
>          s->RxBufferSize, s->RxBufAddr, s->RxBufPtr);
>  }
> 

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg


* Re: [Qemu-devel] [PATCH] rtl8139: flush queued packets when RxBufPtr is written
  2013-05-22 12:53 ` Andreas Färber
@ 2013-05-22 13:33   ` Stefan Hajnoczi
  0 siblings, 0 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2013-05-22 13:33 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Oliver Francke, qemu-devel, Stefan Hajnoczi, qemu-stable

On Wed, May 22, 2013 at 2:53 PM, Andreas Färber <afaerber@suse.de> wrote:
> On 22.05.2013 14:50, Stefan Hajnoczi wrote:
>> Net queues support efficient "receive disable".  For example, tap's file
>> descriptor will not be polled while its peer has receive disabled.  This
>> saves CPU cycles for needlessly copying and then dropping packets which
>> the peer cannot receive.
>>
>> rtl8139 is missing the qemu_flush_queued_packets() call that wakes the
>> queue up when receive becomes possible again.
>>
>> As a result, the Windows 7 guest driver reaches a state where the
>> rtl8139 cannot receive packets.  The driver has actually refilled the
>> receive buffer but we never resume reception.
>>
>> The bug can be reproduced by running a large FTP 'get' inside a Windows
>> 7 guest:
>>
>>   $ qemu -netdev tap,id=tap0,...
>>          -device rtl8139,netdev=tap0
>>
>> The Linux guest driver does not trigger the bug, probably due to a
>> different buffer management strategy.
>>
>> Reported-by: Oliver Francke <oliver.francke@filoo.de>
>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>
> Sounds as if we should
>
> Cc: qemu-stable@nongnu.org

Yes, please.  Oliver just confirmed on IRC that it fixes the issue for
him, so this is good for QEMU 1.5.1.

Stefan


* Re: [Qemu-devel] [PATCH] rtl8139: flush queued packets when RxBufPtr is written
  2013-05-22 12:50 [Qemu-devel] [PATCH] rtl8139: flush queued packets when RxBufPtr is written Stefan Hajnoczi
  2013-05-22 12:53 ` Andreas Färber
@ 2013-05-24 14:34 ` Stefan Hajnoczi
  2013-05-27  6:15 ` Peter Lieven
  2 siblings, 0 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2013-05-24 14:34 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: oliver.francke, qemu-devel

On Wed, May 22, 2013 at 02:50:18PM +0200, Stefan Hajnoczi wrote:
> Net queues support efficient "receive disable".  For example, tap's file
> descriptor will not be polled while its peer has receive disabled.  This
> saves CPU cycles for needlessly copying and then dropping packets which
> the peer cannot receive.
> 
> rtl8139 is missing the qemu_flush_queued_packets() call that wakes the
> queue up when receive becomes possible again.
> 
> As a result, the Windows 7 guest driver reaches a state where the
> rtl8139 cannot receive packets.  The driver has actually refilled the
> receive buffer but we never resume reception.
> 
> The bug can be reproduced by running a large FTP 'get' inside a Windows
> 7 guest:
> 
>   $ qemu -netdev tap,id=tap0,...
>          -device rtl8139,netdev=tap0
> 
> The Linux guest driver does not trigger the bug, probably due to a
> different buffer management strategy.
> 
> Reported-by: Oliver Francke <oliver.francke@filoo.de>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  hw/net/rtl8139.c | 3 +++
>  1 file changed, 3 insertions(+)

Applied to my net tree:
https://github.com/stefanha/qemu/commits/net

Stefan


* Re: [Qemu-devel] [PATCH] rtl8139: flush queued packets when RxBufPtr is written
  2013-05-22 12:50 [Qemu-devel] [PATCH] rtl8139: flush queued packets when RxBufPtr is written Stefan Hajnoczi
  2013-05-22 12:53 ` Andreas Färber
  2013-05-24 14:34 ` Stefan Hajnoczi
@ 2013-05-27  6:15 ` Peter Lieven
  2013-05-27  8:32   ` Stefan Hajnoczi
  2013-05-27 14:07   ` Oliver Francke
  2 siblings, 2 replies; 11+ messages in thread
From: Peter Lieven @ 2013-05-27  6:15 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: oliver.francke, qemu-devel

Hi all,

I have occasionally seen a probably related problem in the past. It mainly happened with rtl8139 under
WinXP, where we most likely use rtl8139 due to the lack of shipped e1000 drivers.

My question is: do you see an increasing number of dropped packets on the tap device when this problem occurs?

tap36     Link encap:Ethernet  HWaddr b2:84:23:c0:e2:c0
           inet6 addr: fe80::b084:23ff:fec0:e2c0/64 Scope:Link
           UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
           RX packets:5816096 errors:0 dropped:0 overruns:0 frame:0
           TX packets:3878744 errors:0 dropped:13775 overruns:0 carrier:0
           collisions:0 txqueuelen:500
           RX bytes:5161769434 (5.1 GB)  TX bytes:380415916 (380.4 MB)

In my case as well, the only way to recover without shutting down the whole vServer is live migration
to another node.

However, I also see this problem under qemu-kvm-1.2.0 while Oliver reported it does not happen there.

Thank you,
Peter

On 22.05.2013 14:50, Stefan Hajnoczi wrote:
> Net queues support efficient "receive disable".  For example, tap's file
> descriptor will not be polled while its peer has receive disabled.  This
> saves CPU cycles for needlessly copying and then dropping packets which
> the peer cannot receive.
>
> rtl8139 is missing the qemu_flush_queued_packets() call that wakes the
> queue up when receive becomes possible again.
>
> As a result, the Windows 7 guest driver reaches a state where the
> rtl8139 cannot receive packets.  The driver has actually refilled the
> receive buffer but we never resume reception.
>
> The bug can be reproduced by running a large FTP 'get' inside a Windows
> 7 guest:
>
>    $ qemu -netdev tap,id=tap0,...
>           -device rtl8139,netdev=tap0
>
> The Linux guest driver does not trigger the bug, probably due to a
> different buffer management strategy.
>
> Reported-by: Oliver Francke <oliver.francke@filoo.de>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>   hw/net/rtl8139.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
> index 9369507..7993f9f 100644
> --- a/hw/net/rtl8139.c
> +++ b/hw/net/rtl8139.c
> @@ -2575,6 +2575,9 @@ static void rtl8139_RxBufPtr_write(RTL8139State *s, uint32_t val)
>       /* this value is off by 16 */
>       s->RxBufPtr = MOD2(val + 0x10, s->RxBufferSize);
>   
> +    /* more buffer space may be available so try to receive */
> +    qemu_flush_queued_packets(qemu_get_queue(s->nic));
> +
>       DPRINTF(" CAPR write: rx buffer length %d head 0x%04x read 0x%04x\n",
>           s->RxBufferSize, s->RxBufAddr, s->RxBufPtr);
>   }


* Re: [Qemu-devel] [PATCH] rtl8139: flush queued packets when RxBufPtr is written
  2013-05-27  6:15 ` Peter Lieven
@ 2013-05-27  8:32   ` Stefan Hajnoczi
  2013-05-27 10:19     ` Peter Lieven
  2013-05-27 14:07   ` Oliver Francke
  1 sibling, 1 reply; 11+ messages in thread
From: Stefan Hajnoczi @ 2013-05-27  8:32 UTC (permalink / raw)
  To: Peter Lieven; +Cc: oliver.francke, qemu-devel

On Mon, May 27, 2013 at 08:15:42AM +0200, Peter Lieven wrote:
> I have occasionally seen a probably related problem in the past. It mainly happened with rtl8139 under
> WinXP, where we most likely use rtl8139 due to the lack of shipped e1000 drivers.
> 
> My question is: do you see an increasing number of dropped packets on the tap device when this problem occurs?
> 
> tap36     Link encap:Ethernet  HWaddr b2:84:23:c0:e2:c0
>           inet6 addr: fe80::b084:23ff:fec0:e2c0/64 Scope:Link
>           UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
>           RX packets:5816096 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:3878744 errors:0 dropped:13775 overruns:0 carrier:0
>           collisions:0 txqueuelen:500
>           RX bytes:5161769434 (5.1 GB)  TX bytes:380415916 (380.4 MB)

My reading of the tun code is that you will see TX dropped increase.  This
is because tun keeps a finite size queue of tx packets.  Since QEMU
userspace is not monitoring the tap fd anymore we'll never drain the
queue and soon enough the TX dropped counter will begin incrementing.
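
The effect can be sketched with a toy bounded queue in C.  This is an
illustration only and not the kernel's tun code: toy_txq,
toy_txq_enqueue and toy_txq_simulate are hypothetical names, and the
assumption is simply that a full queue drops the new packet and bumps a
counter.  The limit of 500 used in the note below mirrors the
txqueuelen shown in the ifconfig output above.

```c
#include <stddef.h>

/* Hypothetical model of a finite transmit queue; not the kernel's tun code. */
struct toy_txq {
    size_t limit;              /* capacity, comparable to txqueuelen */
    size_t len;                /* packets waiting for the userspace reader */
    unsigned long tx_dropped;  /* packets discarded because the queue was full */
};

/* Network stack side: queue one packet, or drop it when the queue is full. */
static void toy_txq_enqueue(struct toy_txq *q)
{
    if (q->len >= q->limit) {
        q->tx_dropped++;
        return;
    }
    q->len++;
}

/* Reader side: take one packet off the queue if there is one. */
static void toy_txq_read_one(struct toy_txq *q)
{
    if (q->len > 0) {
        q->len--;
    }
}

/*
 * Push `packets` packets through a queue of size `limit`.  When
 * reader_active is zero nobody drains the queue, which models a reader
 * that has stopped polling the file descriptor.
 */
static unsigned long toy_txq_simulate(size_t limit, size_t packets,
                                      int reader_active)
{
    struct toy_txq q = { limit, 0, 0 };
    for (size_t i = 0; i < packets; i++) {
        toy_txq_enqueue(&q);
        if (reader_active) {
            toy_txq_read_one(&q);
        }
    }
    return q.tx_dropped;
}
```

With a limit of 500 and no reader, the first 500 packets fill the queue
and every later packet is counted as dropped; with an active reader the
counter stays at zero.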

> In my case as well the only option to recover without shutting down the whole vServer is Live Migration
> to another Node.
> 
> However, I also see this problem under qemu-kvm-1.2.0 while Oliver reported it does not happen there.

Yes, the patch that exposes this problem was only merged in 1.2.1.

Can you still reproduce the problem now that the patch has been merged
into qemu.git/master?

Stefan


* Re: [Qemu-devel] [PATCH] rtl8139: flush queued packets when RxBufPtr is written
  2013-05-27  8:32   ` Stefan Hajnoczi
@ 2013-05-27 10:19     ` Peter Lieven
  0 siblings, 0 replies; 11+ messages in thread
From: Peter Lieven @ 2013-05-27 10:19 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: oliver.francke, qemu-devel

On 27.05.2013 10:32, Stefan Hajnoczi wrote:
> On Mon, May 27, 2013 at 08:15:42AM +0200, Peter Lieven wrote:
>> I have occasionally seen a probably related problem in the past. It mainly happened with rtl8139 under
>> WinXP, where we most likely use rtl8139 due to the lack of shipped e1000 drivers.
>>
>> My question is: do you see an increasing number of dropped packets on the tap device when this problem occurs?
>>
>> tap36     Link encap:Ethernet  HWaddr b2:84:23:c0:e2:c0
>>            inet6 addr: fe80::b084:23ff:fec0:e2c0/64 Scope:Link
>>            UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
>>            RX packets:5816096 errors:0 dropped:0 overruns:0 frame:0
>>            TX packets:3878744 errors:0 dropped:13775 overruns:0 carrier:0
>>            collisions:0 txqueuelen:500
>>            RX bytes:5161769434 (5.1 GB)  TX bytes:380415916 (380.4 MB)
> My reading of the tun code is that you will see TX dropped increase.  This
> is because tun keeps a finite size queue of tx packets.  Since QEMU
> userspace is not monitoring the tap fd anymore we'll never drain the
> queue and soon enough the TX dropped counter will begin incrementing.
Ok, so this would fit.

>
>> In my case as well the only option to recover without shutting down the whole vServer is Live Migration
>> to another Node.
>>
>> However, I also see this problem under qemu-kvm-1.2.0 while Oliver reported it does not happen there.
> Yes, the patch that exposes this problem was only merged in 1.2.1.
Can you say which patch exactly? I cherry-picked some patches by hand.
>
> Can you still reproduce the problem now that the patch has been merged
> into qemu.git/master?
Unfortunately, I have no reliable way of reproducing the issue. It only happens
from time to time.

Peter


* Re: [Qemu-devel] [PATCH] rtl8139: flush queued packets when RxBufPtr is written
  2013-05-27  6:15 ` Peter Lieven
  2013-05-27  8:32   ` Stefan Hajnoczi
@ 2013-05-27 14:07   ` Oliver Francke
  2013-05-27 14:24     ` Peter Lieven
  1 sibling, 1 reply; 11+ messages in thread
From: Oliver Francke @ 2013-05-27 14:07 UTC (permalink / raw)
  To: Peter Lieven; +Cc: qemu-devel, Stefan Hajnoczi

Well,

On 27.05.2013 at 08:15, Peter Lieven <lieven-lists@dlhnet.de> wrote:

> Hi all,
> 
> I have occasionally seen a probably related problem in the past. It mainly happened with rtl8139 under
> WinXP, where we most likely use rtl8139 due to the lack of shipped e1000 drivers.
> 
> My question is: do you see an increasing number of dropped packets on the tap device when this problem occurs?
> 
> tap36     Link encap:Ethernet  HWaddr b2:84:23:c0:e2:c0
>          inet6 addr: fe80::b084:23ff:fec0:e2c0/64 Scope:Link
>          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
>          RX packets:5816096 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:3878744 errors:0 dropped:13775 overruns:0 carrier:0
>          collisions:0 txqueuelen:500
>          RX bytes:5161769434 (5.1 GB)  TX bytes:380415916 (380.4 MB)
> 
> In my case as well the only option to recover without shutting down the whole vServer is Live Migration
> to another Node.
> 

ACK, tried it; presumably every network device gets re-created in a defined state, qemu-wise.

> However, I also see this problem under qemu-kvm-1.2.0 while Oliver reported it does not happen there.
> 

Neither I nor any affected customers have ever seen such failures in qemu-1.2.0, so this was my last-known-good ;)

Oliver.

> Thank you,
> Peter
> 
> On 22.05.2013 14:50, Stefan Hajnoczi wrote:
>> Net queues support efficient "receive disable".  For example, tap's file
>> descriptor will not be polled while its peer has receive disabled.  This
>> saves CPU cycles for needlessly copying and then dropping packets which
>> the peer cannot receive.
>> 
>> rtl8139 is missing the qemu_flush_queued_packets() call that wakes the
>> queue up when receive becomes possible again.
>> 
>> As a result, the Windows 7 guest driver reaches a state where the
>> rtl8139 cannot receive packets.  The driver has actually refilled the
>> receive buffer but we never resume reception.
>> 
>> The bug can be reproduced by running a large FTP 'get' inside a Windows
>> 7 guest:
>> 
>>   $ qemu -netdev tap,id=tap0,...
>>          -device rtl8139,netdev=tap0
>> 
>> The Linux guest driver does not trigger the bug, probably due to a
>> different buffer management strategy.
>> 
>> Reported-by: Oliver Francke <oliver.francke@filoo.de>
>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>> ---
>>  hw/net/rtl8139.c | 3 +++
>>  1 file changed, 3 insertions(+)
>> 
>> diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
>> index 9369507..7993f9f 100644
>> --- a/hw/net/rtl8139.c
>> +++ b/hw/net/rtl8139.c
>> @@ -2575,6 +2575,9 @@ static void rtl8139_RxBufPtr_write(RTL8139State *s, uint32_t val)
>>      /* this value is off by 16 */
>>      s->RxBufPtr = MOD2(val + 0x10, s->RxBufferSize);
>>  
>> +    /* more buffer space may be available so try to receive */
>> +    qemu_flush_queued_packets(qemu_get_queue(s->nic));
>> +
>>      DPRINTF(" CAPR write: rx buffer length %d head 0x%04x read 0x%04x\n",
>>          s->RxBufferSize, s->RxBufAddr, s->RxBufPtr);
>>  }
> 


* Re: [Qemu-devel] [PATCH] rtl8139: flush queued packets when RxBufPtr is written
  2013-05-27 14:07   ` Oliver Francke
@ 2013-05-27 14:24     ` Peter Lieven
  2013-05-27 15:29       ` Stefan Hajnoczi
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Lieven @ 2013-05-27 14:24 UTC (permalink / raw)
  To: Oliver Francke; +Cc: qemu-devel, Stefan Hajnoczi

On 27.05.2013 16:07, Oliver Francke wrote:
> Well,
>
> On 27.05.2013 at 08:15, Peter Lieven <lieven-lists@dlhnet.de> wrote:
>
>> Hi all,
>>
>> I have occasionally seen a probably related problem in the past. It mainly happened with rtl8139 under
>> WinXP, where we most likely use rtl8139 due to the lack of shipped e1000 drivers.
>>
>> My question is: do you see an increasing number of dropped packets on the tap device when this problem occurs?
>>
>> tap36     Link encap:Ethernet  HWaddr b2:84:23:c0:e2:c0
>>           inet6 addr: fe80::b084:23ff:fec0:e2c0/64 Scope:Link
>>           UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
>>           RX packets:5816096 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:3878744 errors:0 dropped:13775 overruns:0 carrier:0
>>           collisions:0 txqueuelen:500
>>           RX bytes:5161769434 (5.1 GB)  TX bytes:380415916 (380.4 MB)
>>
>> In my case as well the only option to recover without shutting down the whole vServer is Live Migration
>> to another Node.
>>
> ACK, tried it; presumably every network device gets re-created in a defined state, qemu-wise.
>
>> However, I also see this problem under qemu-kvm-1.2.0 while Oliver reported it does not happen there.
>>
> Neither I nor any affected customers have ever seen such failures in qemu-1.2.0, so this was my last-known-good ;)
I cherry-picked

net: add receive_disabled logic to iov delivery path

to my qemu-1.2.0 build. I think this might be why I see this.

Have you tried patching qemu-1.2.0 with something like this?

--- a/hw/rtl8139.c
+++ b/hw/rtl8139.c
@@ -2575,6 +2575,9 @@ static void rtl8139_RxBufPtr_write(RTL8139State *s, uint32_t val)
      /* this value is off by 16 */
      s->RxBufPtr = MOD2(val + 0x10, s->RxBufferSize);

+    /* more buffer space may be available so try to receive */
+    qemu_flush_queued_packets(&s->nic->nc);
+
      DPRINTF(" CAPR write: rx buffer length %d head 0x%04x read 0x%04x\n",
          s->RxBufferSize, s->RxBufAddr, s->RxBufPtr);
  }


Peter

>
> Oliver.
>
>> Thank you,
>> Peter
>>
>> On 22.05.2013 14:50, Stefan Hajnoczi wrote:
>>> Net queues support efficient "receive disable".  For example, tap's file
>>> descriptor will not be polled while its peer has receive disabled.  This
>>> saves CPU cycles for needlessly copying and then dropping packets which
>>> the peer cannot receive.
>>>
>>> rtl8139 is missing the qemu_flush_queued_packets() call that wakes the
>>> queue up when receive becomes possible again.
>>>
>>> As a result, the Windows 7 guest driver reaches a state where the
>>> rtl8139 cannot receive packets.  The driver has actually refilled the
>>> receive buffer but we never resume reception.
>>>
>>> The bug can be reproduced by running a large FTP 'get' inside a Windows
>>> 7 guest:
>>>
>>>    $ qemu -netdev tap,id=tap0,...
>>>           -device rtl8139,netdev=tap0
>>>
>>> The Linux guest driver does not trigger the bug, probably due to a
>>> different buffer management strategy.
>>>
>>> Reported-by: Oliver Francke <oliver.francke@filoo.de>
>>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>>> ---
>>>   hw/net/rtl8139.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
>>> index 9369507..7993f9f 100644
>>> --- a/hw/net/rtl8139.c
>>> +++ b/hw/net/rtl8139.c
>>> @@ -2575,6 +2575,9 @@ static void rtl8139_RxBufPtr_write(RTL8139State *s, uint32_t val)
>>>       /* this value is off by 16 */
>>>       s->RxBufPtr = MOD2(val + 0x10, s->RxBufferSize);
>>>   
>>> +    /* more buffer space may be available so try to receive */
>>> +    qemu_flush_queued_packets(qemu_get_queue(s->nic));
>>> +
>>>       DPRINTF(" CAPR write: rx buffer length %d head 0x%04x read 0x%04x\n",
>>>           s->RxBufferSize, s->RxBufAddr, s->RxBufPtr);
>>>   }


* Re: [Qemu-devel] [PATCH] rtl8139: flush queued packets when RxBufPtr is written
  2013-05-27 14:24     ` Peter Lieven
@ 2013-05-27 15:29       ` Stefan Hajnoczi
  2013-05-28  6:27         ` Peter Lieven
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Hajnoczi @ 2013-05-27 15:29 UTC (permalink / raw)
  To: Peter Lieven; +Cc: Oliver Francke, qemu-devel

On Mon, May 27, 2013 at 04:24:59PM +0200, Peter Lieven wrote:
> On 27.05.2013 16:07, Oliver Francke wrote:
> >Well,
> >
> >On 27.05.2013 at 08:15, Peter Lieven <lieven-lists@dlhnet.de> wrote:
> >
> >>Hi all,
> >>
> >>I have occasionally seen a probably related problem in the past. It mainly happened with rtl8139 under
> >>WinXP, where we most likely use rtl8139 due to the lack of shipped e1000 drivers.
> >>
> >>My question is: do you see an increasing number of dropped packets on the tap device when this problem occurs?
> >>
> >>tap36     Link encap:Ethernet  HWaddr b2:84:23:c0:e2:c0
> >>          inet6 addr: fe80::b084:23ff:fec0:e2c0/64 Scope:Link
> >>          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
> >>          RX packets:5816096 errors:0 dropped:0 overruns:0 frame:0
> >>          TX packets:3878744 errors:0 dropped:13775 overruns:0 carrier:0
> >>          collisions:0 txqueuelen:500
> >>          RX bytes:5161769434 (5.1 GB)  TX bytes:380415916 (380.4 MB)
> >>
> >>In my case as well the only option to recover without shutting down the whole vServer is Live Migration
> >>to another Node.
> >>
> >ACK, tried it; presumably every network device gets re-created in a defined state, qemu-wise.
> >
> >>However, I also see this problem under qemu-kvm-1.2.0 while Oliver reported it does not happen there.
> >>
> >Neither I nor any affected customers have ever seen such failures in qemu-1.2.0, so this was my last-known-good ;)
> I cherry-picked
> 
> net: add receive_disabled logic to iov delivery path

This one exposes the bug that Oliver reported:

commit a9d8f7b1c41a8a346f4cf5a0c6963a79fbd1249e
Author: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Date:   Mon Aug 20 13:35:23 2012 +0100

    net: do not report queued packets as sent


* Re: [Qemu-devel] [PATCH] rtl8139: flush queued packets when RxBufPtr is written
  2013-05-27 15:29       ` Stefan Hajnoczi
@ 2013-05-28  6:27         ` Peter Lieven
  0 siblings, 0 replies; 11+ messages in thread
From: Peter Lieven @ 2013-05-28  6:27 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Oliver Francke, qemu-devel

On 27.05.2013 17:29, Stefan Hajnoczi wrote:
> On Mon, May 27, 2013 at 04:24:59PM +0200, Peter Lieven wrote:
>> On 27.05.2013 16:07, Oliver Francke wrote:
>>> Well,
>>>
>>> On 27.05.2013 at 08:15, Peter Lieven <lieven-lists@dlhnet.de> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I have occasionally seen a probably related problem in the past. It mainly happened with rtl8139 under
>>>> WinXP, where we most likely use rtl8139 due to the lack of shipped e1000 drivers.
>>>>
>>>> My question is: do you see an increasing number of dropped packets on the tap device when this problem occurs?
>>>>
>>>> tap36     Link encap:Ethernet  HWaddr b2:84:23:c0:e2:c0
>>>>           inet6 addr: fe80::b084:23ff:fec0:e2c0/64 Scope:Link
>>>>           UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
>>>>           RX packets:5816096 errors:0 dropped:0 overruns:0 frame:0
>>>>           TX packets:3878744 errors:0 dropped:13775 overruns:0 carrier:0
>>>>           collisions:0 txqueuelen:500
>>>>           RX bytes:5161769434 (5.1 GB)  TX bytes:380415916 (380.4 MB)
>>>>
>>>> In my case as well the only option to recover without shutting down the whole vServer is Live Migration
>>>> to another Node.
>>>>
>>> ACK, tried it; presumably every network device gets re-created in a defined state, qemu-wise.
>>>
>>>> However, I also see this problem under qemu-kvm-1.2.0 while Oliver reported it does not happen there.
>>>>
>>> Neither I nor any affected customers have ever seen such failures in qemu-1.2.0, so this was my last-known-good ;)
>> I cherry-picked
>>
>> net: add receive_disabled logic to iov delivery path
> This one exposes the bug that Oliver reported:
>
> commit a9d8f7b1c41a8a346f4cf5a0c6963a79fbd1249e
> Author: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> Date:   Mon Aug 20 13:35:23 2012 +0100
>
>      net: do not report queued packets as sent
This was also in the series I cherry-picked for my 1.2.0 build. So it's likely I hit the same bug.

Thank you,
Peter
