All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/2] e1000: fixes for Phar Lap ETS
@ 2014-12-11  5:23 Richard Tollerton
  2014-12-11  5:23 ` [Qemu-devel] [PATCH v2 1/2] e1000: Clear MDIC register when PHY addr is invalid Richard Tollerton
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Richard Tollerton @ 2014-12-11  5:23 UTC (permalink / raw)
  To: qemu-devel; +Cc: Richard Tollerton, kraxel, mst

The 8254x driver in certain versions of Phar Lap ETS hasn't been
initializing the e1000 device properly in qemu. It looks like the driver
is relying on two specific pieces of behavior which (anecdotally) exist
in hardware, although I can't cite any datasheets on the matter; in any
case, these two patches adjust this behavior. Tested lightly on both ETS
and Win7.

This is a resend; there are no changes compared to v1.

Richard Tollerton (2):
  e1000: Clear MDIC register when PHY addr is invalid
  e1000: decrement RDT if equal to RDH

 hw/net/e1000.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

-- 
2.1.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 1/2] e1000: Clear MDIC register when PHY addr is invalid
  2014-12-11  5:23 [Qemu-devel] [PATCH v2 0/2] e1000: fixes for Phar Lap ETS Richard Tollerton
@ 2014-12-11  5:23 ` Richard Tollerton
  2014-12-18  5:08   ` Jason Wang
  2014-12-11  5:23 ` [Qemu-devel] [PATCH v2 2/2] e1000: decrement RDT if equal to RDH Richard Tollerton
  2015-03-24  9:37 ` [Qemu-devel] [PATCH v2 0/2] e1000: fixes for Phar Lap ETS Stefan Hajnoczi
  2 siblings, 1 reply; 12+ messages in thread
From: Richard Tollerton @ 2014-12-11  5:23 UTC (permalink / raw)
  To: qemu-devel; +Cc: Richard Tollerton, kraxel, Jeff Westfahl, mst

Some drivers probe PHY addresses beyond the first one, and
also (unfortunately) don't check for MDIC_ERROR on reads. It appears
that the driver assumes that the data reads will return zero in this
case (invalid PHY address). Anecdotally, hardware is believed to do
this; but qemu wasn't, and instead was reusing the previous value of the
MDIC register. This screwed up driver init.

To fix, don't reuse the current value of the MDIC register when an
invalid PHY address is referenced; let the value be set to precisely
MDIC_ERROR|MDIC_READY.

Signed-off-by: Richard Tollerton <rich.tollerton@ni.com>
Signed-off-by: Jeff Westfahl <jeff.westfahl@ni.com>
---
 hw/net/e1000.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index e33a4da..44ae3a8 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -457,7 +457,7 @@ set_mdic(E1000State *s, int index, uint32_t val)
     uint32_t addr = ((val & E1000_MDIC_REG_MASK) >> E1000_MDIC_REG_SHIFT);
 
     if ((val & E1000_MDIC_PHY_MASK) >> E1000_MDIC_PHY_SHIFT != 1) // phy #
-        val = s->mac_reg[MDIC] | E1000_MDIC_ERROR;
+        val = E1000_MDIC_ERROR;
     else if (val & E1000_MDIC_OP_READ) {
         DBGOUT(MDIC, "MDIC read reg 0x%x\n", addr);
         if (!(phy_regcap[addr] & PHY_R)) {
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 2/2] e1000: decrement RDT if equal to RDH
  2014-12-11  5:23 [Qemu-devel] [PATCH v2 0/2] e1000: fixes for Phar Lap ETS Richard Tollerton
  2014-12-11  5:23 ` [Qemu-devel] [PATCH v2 1/2] e1000: Clear MDIC register when PHY addr is invalid Richard Tollerton
@ 2014-12-11  5:23 ` Richard Tollerton
  2014-12-18  5:01   ` Jason Wang
  2015-03-24  9:37 ` [Qemu-devel] [PATCH v2 0/2] e1000: fixes for Phar Lap ETS Stefan Hajnoczi
  2 siblings, 1 reply; 12+ messages in thread
From: Richard Tollerton @ 2014-12-11  5:23 UTC (permalink / raw)
  To: qemu-devel; +Cc: Richard Tollerton, kraxel, Jeff Westfahl, mst

Some drivers set RDT=RDH. Oddly, this works on real hardware. To work
around this, autodecrement RDT when this happens.

Signed-off-by: Richard Tollerton <rich.tollerton@ni.com>
Signed-off-by: Jeff Westfahl <jeff.westfahl@ni.com>
---
 hw/net/e1000.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 44ae3a8..b8cbfc1 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -1152,6 +1152,12 @@ mac_writereg(E1000State *s, int index, uint32_t val)
 static void
 set_rdt(E1000State *s, int index, uint32_t val)
 {
+    if (val == s->mac_reg[RDH]) {     /* Decrement RDT if it's too big */
+        if (val == 0) {
+            val = s->mac_reg[RDLEN] / sizeof(struct e1000_rx_desc);
+        }
+        val--;
+    }
     s->mac_reg[index] = val & 0xffff;
     if (e1000_has_rxbufs(s, 1)) {
         qemu_flush_queued_packets(qemu_get_queue(s->nic));
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/2] e1000: decrement RDT if equal to RDH
  2014-12-11  5:23 ` [Qemu-devel] [PATCH v2 2/2] e1000: decrement RDT if equal to RDH Richard Tollerton
@ 2014-12-18  5:01   ` Jason Wang
  2015-01-12 10:24     ` Michael S. Tsirkin
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Wang @ 2014-12-18  5:01 UTC (permalink / raw)
  To: Richard Tollerton; +Cc: mst, qemu-devel, Jeff Westfahl, kraxel



----- Original Message -----
> Some drivers set RDT=RDH. Oddly, this works on real hardware. To work
> around this, autodecrement RDT when this happens.
> 
> Signed-off-by: Richard Tollerton <rich.tollerton@ni.com>
> Signed-off-by: Jeff Westfahl <jeff.westfahl@ni.com>
> ---
>  hw/net/e1000.c | 6 ++++++
>  1 file changed, 6 insertions(+)

Please describe more details on the issue.

The spec 3.2.6 said:

"
When the head pointer is equal to the tail pointer, the ring is empty. 
"

So RDT=RDH in fact empty the ring. No?
> 
> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> index 44ae3a8..b8cbfc1 100644
> --- a/hw/net/e1000.c
> +++ b/hw/net/e1000.c
> @@ -1152,6 +1152,12 @@ mac_writereg(E1000State *s, int index, uint32_t val)
>  static void
>  set_rdt(E1000State *s, int index, uint32_t val)
>  {
> +    if (val == s->mac_reg[RDH]) {     /* Decrement RDT if it's too big */
> +        if (val == 0) {
> +            val = s->mac_reg[RDLEN] / sizeof(struct e1000_rx_desc);
> +        }
> +        val--;
> +    }
>      s->mac_reg[index] = val & 0xffff;
>      if (e1000_has_rxbufs(s, 1)) {
>          qemu_flush_queued_packets(qemu_get_queue(s->nic));
> --
> 2.1.3
> 
> 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/2] e1000: Clear MDIC register when PHY addr is invalid
  2014-12-11  5:23 ` [Qemu-devel] [PATCH v2 1/2] e1000: Clear MDIC register when PHY addr is invalid Richard Tollerton
@ 2014-12-18  5:08   ` Jason Wang
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Wang @ 2014-12-18  5:08 UTC (permalink / raw)
  To: Richard Tollerton; +Cc: mst, qemu-devel, Jeff Westfahl, kraxel



----- Original Message -----
> Some drivers probe PHY addresses beyond the first one, and
> also (unfortunately) don't check for MDIC_ERROR on reads. It appears
> that the driver assumes that the data reads will return zero in this
> case (invalid PHY address). Anecdotally, hardware is believed to do
> this; but qemu wasn't, and instead was reusing the previous value of the
> MDIC register. This screwed up driver init.
> 
> To fix, don't reuse the current value of the MDIC register when an
> invalid PHY address is referenced; let the value be set to precisely
> MDIC_ERROR|MDIC_READY.
> 
> Signed-off-by: Richard Tollerton <rich.tollerton@ni.com>
> Signed-off-by: Jeff Westfahl <jeff.westfahl@ni.com>
> ---

Looks correct.

Reviewed-by: Jason Wang <jasowang@redhat.com>
>  hw/net/e1000.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> index e33a4da..44ae3a8 100644
> --- a/hw/net/e1000.c
> +++ b/hw/net/e1000.c
> @@ -457,7 +457,7 @@ set_mdic(E1000State *s, int index, uint32_t val)
>      uint32_t addr = ((val & E1000_MDIC_REG_MASK) >> E1000_MDIC_REG_SHIFT);
>  
>      if ((val & E1000_MDIC_PHY_MASK) >> E1000_MDIC_PHY_SHIFT != 1) // phy #
> -        val = s->mac_reg[MDIC] | E1000_MDIC_ERROR;
> +        val = E1000_MDIC_ERROR;
>      else if (val & E1000_MDIC_OP_READ) {
>          DBGOUT(MDIC, "MDIC read reg 0x%x\n", addr);
>          if (!(phy_regcap[addr] & PHY_R)) {
> --
> 2.1.3
> 
> 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/2] e1000: decrement RDT if equal to RDH
  2014-12-18  5:01   ` Jason Wang
@ 2015-01-12 10:24     ` Michael S. Tsirkin
  2015-01-12 19:12       ` Richard Tollerton
  0 siblings, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2015-01-12 10:24 UTC (permalink / raw)
  To: Jason Wang; +Cc: Richard Tollerton, qemu-devel, Jeff Westfahl, kraxel

On Thu, Dec 18, 2014 at 12:01:48AM -0500, Jason Wang wrote:
> 
> 
> ----- Original Message -----
> > Some drivers set RDT=RDH. Oddly, this works on real hardware. To work
> > around this, autodecrement RDT when this happens.
> > 
> > Signed-off-by: Richard Tollerton <rich.tollerton@ni.com>
> > Signed-off-by: Jeff Westfahl <jeff.westfahl@ni.com>
> > ---
> >  hw/net/e1000.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> 
> Please describe more details on the issue.
> 
> The spec 3.2.6 said:
> 
> "
> When the head pointer is equal to the tail pointer, the ring is empty. 
> "
> 
> So RDT=RDH in fact empty the ring. No?

Richard, can you respond please?
I'd like to see this clarified in code comment or
commit message before applying this patchset.

> > 
> > diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> > index 44ae3a8..b8cbfc1 100644
> > --- a/hw/net/e1000.c
> > +++ b/hw/net/e1000.c
> > @@ -1152,6 +1152,12 @@ mac_writereg(E1000State *s, int index, uint32_t val)
> >  static void
> >  set_rdt(E1000State *s, int index, uint32_t val)
> >  {
> > +    if (val == s->mac_reg[RDH]) {     /* Decrement RDT if it's too big */
> > +        if (val == 0) {
> > +            val = s->mac_reg[RDLEN] / sizeof(struct e1000_rx_desc);
> > +        }
> > +        val--;
> > +    }
> >      s->mac_reg[index] = val & 0xffff;
> >      if (e1000_has_rxbufs(s, 1)) {
> >          qemu_flush_queued_packets(qemu_get_queue(s->nic));
> > --
> > 2.1.3
> > 
> > 
> > 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/2] e1000: decrement RDT if equal to RDH
  2015-01-12 10:24     ` Michael S. Tsirkin
@ 2015-01-12 19:12       ` Richard Tollerton
  2015-01-13  6:48         ` Jason Wang
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Tollerton @ 2015-01-12 19:12 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang; +Cc: qemu-devel, Jeff Westfahl, kraxel

"Michael S. Tsirkin" <mst@redhat.com> writes:

> Richard, can you respond please?
> I'd like to see this clarified in code comment or
> commit message before applying this patchset.

Apologies, and thanks for reminding me.

On Thu, Dec 18, 2014 at 12:01:48AM -0500, Jason Wang wrote:

> > Some drivers set RDT=RDH. Oddly, this works on real hardware. To work
> > around this, autodecrement RDT when this happens.
> 
> Please describe more details on the issue. The spec 3.2.6 said: "When
> the head pointer is equal to the tail pointer, the ring is empty." So
> RDT=RDH in fact empty the ring. No?

That is incorrect; the spec explicitly states that RDT=RDH means the
ring is full. The linux e1000 driver more or less implies the same thing.

You forgot to include the sentence after that in section 3.2.6 :)

"When the head pointer is equal to the tail pointer, the ring is empty.
Hardware stops storing packets in system memory until software advances
the tail pointer, making more receive buffers available."

Yeah, this seems really poorly worded to me too. :( You appear to be
interpreting "ring is empty" in the usual sense, i.e. "all N elements of
the ring buffer are available for use by hardware". In fact, the correct
interpretation [1] is the exact opposite, "none of the elements are
available for use by hardware". The last sentence in the quote makes
this explicit. See also linux e1000 driver sources at [2] [3] [4].

See also [5] which implies that hardware DMA is kicked off by setting
tail != head at initialization. I'm *guessing* (?) that the DMA engine
isn't correspondingly stopped when software sets RDT=RDH, so that once
packets start getting received, the hardware can more or less ignore it.
In this context, my patch makes sense.

(Yes, this is totally an ex-post-facto justification for the patch; it
arrived to me secondhand, and I had not been familiar with the driver
source before now.)

[1] http://sourceforge.net/p/e1000/mailman/message/29280078/
[2] http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000_main.c#L398
[3] http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000.h#L215
[4] http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000_main.c#L4302
[5] http://sourceforge.net/p/e1000/mailman/message/29969887/

>> > diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>> > index 44ae3a8..b8cbfc1 100644
>> > --- a/hw/net/e1000.c
>> > +++ b/hw/net/e1000.c
>> > @@ -1152,6 +1152,12 @@ mac_writereg(E1000State *s, int index, uint32_t val)
>> >  static void
>> >  set_rdt(E1000State *s, int index, uint32_t val)
>> >  {
>> > +    if (val == s->mac_reg[RDH]) {     /* Decrement RDT if it's too big */
>> > +        if (val == 0) {
>> > +            val = s->mac_reg[RDLEN] / sizeof(struct e1000_rx_desc);
>> > +        }
>> > +        val--;
>> > +    }
>> >      s->mac_reg[index] = val & 0xffff;
>> >      if (e1000_has_rxbufs(s, 1)) {
>> >          qemu_flush_queued_packets(qemu_get_queue(s->nic));
>> > --
>> > 2.1.3
>> > 
>> > 
>> > 

-- 
Richard Tollerton <rich.tollerton@ni.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/2] e1000: decrement RDT if equal to RDH
  2015-01-12 19:12       ` Richard Tollerton
@ 2015-01-13  6:48         ` Jason Wang
  2015-01-13 21:06           ` Richard Tollerton
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Wang @ 2015-01-13  6:48 UTC (permalink / raw)
  To: Richard Tollerton; +Cc: kraxel, qemu-devel, Jeff Westfahl, Michael S. Tsirkin



On Tue, Jan 13, 2015 at 3:12 AM, Richard Tollerton 
<rich.tollerton@ni.com> wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> 
>>  Richard, can you respond please?
>>  I'd like to see this clarified in code comment or
>>  commit message before applying this patchset.
> 
> Apologies, and thanks for reminding me.
> 
> On Thu, Dec 18, 2014 at 12:01:48AM -0500, Jason Wang wrote:
> 
>>  > Some drivers set RDT=RDH. Oddly, this works on real hardware. To 
>> work
>>  > around this, autodecrement RDT when this happens.
>>  
>>  Please describe more details on the issue. The spec 3.2.6 said: 
>> "When
>>  the head pointer is equal to the tail pointer, the ring is empty." 
>> So
>>  RDT=RDH in fact empty the ring. No?
> 
> That is incorrect; the spec explicitly states that RDT=RDH means the
> ring is full. The linux e1000 driver more or less implies the same 
> thing.
> 
> You forgot to include the sentence after that in section 3.2.6 :)
> 
> "When the head pointer is equal to the tail pointer, the ring is 
> empty.
> Hardware stops storing packets in system memory until software 
> advances
> the tail pointer, making more receive buffers available."
> 
> Yeah, this seems really poorly worded to me too. :( You appear to be
> interpreting "ring is empty" in the usual sense, i.e. "all N elements 
> of
> the ring buffer are available for use by hardware". In fact, the 
> correct
> interpretation [1] is the exact opposite, "none of the elements are
> available for use by hardware". 

Yes, I do think 'empty' means no available buffer for device to receive 
:)
> The last sentence in the quote makes
> this explicit. See also linux e1000 driver sources at [2] [3] [4].

Btw, [2],[3],[4] are all codes that deal with driver's internal 
variable, not the one that the hardware use.
> 
> 
> See also [5] which implies that hardware DMA is kicked off by setting
> tail != head at initialization. 

Yes, and we trigger receiving in set_rdt().
> I'm *guessing* (?) that the DMA engine
> isn't correspondingly stopped when software sets RDT=RDH, so that once
> packets start getting received, 

Do you mean in qemu? I/O are single threaded, so looks like we are safe.
> the hardware can more or less ignore it.
> In this context, my patch makes sense.
> 
> (Yes, this is totally an ex-post-facto justification for the patch; it
> arrived to me secondhand, and I had not been familiar with the driver
> source before now.)

True, we've found many undocumented behavior in the past (some even 
conflicts with spec). I don't have a 82540EM in my hand, but I think 
the best thing is to check this behavior in real hardware to prevent 
this patch from breaking many existing drivers. 
> 
> [1] http://sourceforge.net/p/e1000/mailman/message/29280078/

This issue mentioned in the thread seems solved.

Current e1000_has_rxbufs() will return false if RDT==RDH.
> 
> [2] 
> http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000_main.c#L398
> [3] 
> http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000.h#L215
> [4] 
> http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000_main.c#L4302
> [5] http://sourceforge.net/p/e1000/mailman/message/29969887/

Looks like what mentioned in this thread was also solved.

We check both RCTL and e1000_has_rxbufs() in e1000_can_receive().
And flush queued packets in set_rx_control().
> 
>>>  > diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>>>  > index 44ae3a8..b8cbfc1 100644
>>>  > --- a/hw/net/e1000.c
>>>  > +++ b/hw/net/e1000.c
>>>  > @@ -1152,6 +1152,12 @@ mac_writereg(E1000State *s, int index, 
>>> uint32_t val)
>>>  >  static void
>>>  >  set_rdt(E1000State *s, int index, uint32_t val)
>>>  >  {
>>>  > +    if (val == s->mac_reg[RDH]) {     /* Decrement RDT if it's 
>>> too big */
>>>  > +        if (val == 0) {
>>>  > +            val = s->mac_reg[RDLEN] / sizeof(struct 
>>> e1000_rx_desc);
>>>  > +        }
>>>  > +        val--;
>>>  > +    }
>>>  >      s->mac_reg[index] = val & 0xffff;
>>>  >      if (e1000_has_rxbufs(s, 1)) {
>>>  >          qemu_flush_queued_packets(qemu_get_queue(s->nic));
>>>  > --
>>>  > 2.1.3
>>>  > 
>>>  > 
>>>  > 
> 
> -- 
> Richard Tollerton <rich.tollerton@ni.com>
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/2] e1000: decrement RDT if equal to RDH
  2015-01-13  6:48         ` Jason Wang
@ 2015-01-13 21:06           ` Richard Tollerton
  2015-01-14  9:48             ` Jason Wang
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Tollerton @ 2015-01-13 21:06 UTC (permalink / raw)
  To: Jason Wang; +Cc: kraxel, qemu-devel, Jeff Westfahl, Michael S. Tsirkin

Jason Wang <jasowang@redhat.com> writes:

> On Tue, Jan 13, 2015 at 3:12 AM, Richard Tollerton <rich.tollerton@ni.com> wrote:
>> On Thu, Dec 18, 2014 at 12:01:48AM -0500, Jason Wang wrote:
>> 
>>>  > Some drivers set RDT=RDH. Oddly, this works on real hardware. To
>>>  > work around this, autodecrement RDT when this happens.
>>>  
>>>  Please describe more details on the issue. The spec 3.2.6 said:
>>> "When the head pointer is equal to the tail pointer, the ring is
>>> empty." So RDT=RDH in fact empty the ring. No?
>> 
>> That is incorrect; the spec explicitly states that RDT=RDH means the
>> ring is full. The linux e1000 driver more or less implies the same
>> thing.
>> 
>> You forgot to include the sentence after that in section 3.2.6 :)
>> 
>> "When the head pointer is equal to the tail pointer, the ring is
>> empty. Hardware stops storing packets in system memory until software
>> advances the tail pointer, making more receive buffers available."
>> 
>> Yeah, this seems really poorly worded to me too. :( You appear to be
>> interpreting "ring is empty" in the usual sense, i.e. "all N elements
>> of the ring buffer are available for use by hardware". In fact, the
>> correct interpretation [1] is the exact opposite, "none of the
>> elements are available for use by hardware".
>
> Yes, I do think 'empty' means no available buffer for device to receive 
> :)

Ah, ok. But if you're concerned about breaking drivers with this... what
legitimate reason could a driver possibly have to set RDT=RDH? (Besides a
mistaken attempt to free the ring for hardware use, as I posit?)

The only reason I can think of is that maybe a driver is trying to
implement some sort of crude flow control. But if that actually worked,
then major packet loss would be observed under load, as any packets
received by hardware but not yet processed by software would get
dropped.

I'm going to go out (further) on a limb and assert that no driver ever
sets RDT=RDH to stop receives, because no hardware implements that
behavior. The driver I'm trying to get working appears to have been
setting RDT=RDH at the end of *every* iteration of the receive loop,
since it was originally written in 2003. If I am to trust the comments,
it's been ported/supported on 28 different chipset variants, and it's
definitely been tested for performance and packet loss under load for a
good number of those, including under polling modes where the ring is
only checked every few milliseconds. If RDT=RDH ever did anything except
free all buffers for hardware use, surely catastrophic packet loss would
have been observed?

>> The last sentence in the quote makes this explicit. See also linux
>> e1000 driver sources at [2] [3] [4].
>
> Btw, [2],[3],[4] are all codes that deal with driver's internal
> variable, not the one that the hardware use.

No, it's directly used by hardware -- current_count increments RDT. See
e1000_alloc_rx_buffers().

>> I'm *guessing* (?) that the DMA engine isn't correspondingly stopped
>> when software sets RDT=RDH, so that once packets start getting
>> received,
>
> Do you mean in qemu? I/O are single threaded, so looks like we are
> safe.

I'm referring to the possibility that physical hardware is doing this.

Interesting side note: while this driver sets RDT=RDH on every
iteration, it *initializes* RDT=0 and RDH=1...

>> the hardware can more or less ignore it. In this context, my patch
>> makes sense.
>> 
>> (Yes, this is totally an ex-post-facto justification for the patch; it
>> arrived to me secondhand, and I had not been familiar with the driver
>> source before now.)
>
> True, we've found many undocumented behavior in the past (some even 
> conflicts with spec). I don't have a 82540EM in my hand, but I think 
> the best thing is to check this behavior in real hardware to prevent 
> this patch from breaking many existing drivers.

Can you be more specific in regards to what information you're
requesting? Are you wanting additional confirmation (i.e. via
instrumented code in addition to code inspection) that setting RDT=RDH
does not stop packet receive once it has started?

>> [1] http://sourceforge.net/p/e1000/mailman/message/29280078/
>
> This issue mentioned in the thread seems solved.

Indeed; I was citing this thread (and the other thread) for the
discussions, not the issues themselves.

>> [2] 
>> http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000_main.c#L398
>> [3] 
>> http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000.h#L215
>> [4] 
>> http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000_main.c#L4302
>> [5] http://sourceforge.net/p/e1000/mailman/message/29969887/
>
> Looks like what mentioned in this thread was also solved.
>
> We check both RCTL and e1000_has_rxbufs() in e1000_can_receive().
> And flush queued packets in set_rx_control().
>> 
>>>>  > diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>>>>  > index 44ae3a8..b8cbfc1 100644
>>>>  > --- a/hw/net/e1000.c
>>>>  > +++ b/hw/net/e1000.c
>>>>  > @@ -1152,6 +1152,12 @@ mac_writereg(E1000State *s, int index, 
>>>> uint32_t val)
>>>>  >  static void
>>>>  >  set_rdt(E1000State *s, int index, uint32_t val)
>>>>  >  {
>>>>  > +    if (val == s->mac_reg[RDH]) {     /* Decrement RDT if it's 
>>>> too big */
>>>>  > +        if (val == 0) {
>>>>  > +            val = s->mac_reg[RDLEN] / sizeof(struct 
>>>> e1000_rx_desc);
>>>>  > +        }
>>>>  > +        val--;
>>>>  > +    }
>>>>  >      s->mac_reg[index] = val & 0xffff;
>>>>  >      if (e1000_has_rxbufs(s, 1)) {
>>>>  >          qemu_flush_queued_packets(qemu_get_queue(s->nic));
>>>>  > --
>>>>  > 2.1.3
>>>>  > 
>>>>  > 
>>>>  > 
>> 
>> -- 
>> Richard Tollerton <rich.tollerton@ni.com>
>> 
>

-- 
Richard Tollerton <rich.tollerton@ni.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/2] e1000: decrement RDT if equal to RDH
  2015-01-13 21:06           ` Richard Tollerton
@ 2015-01-14  9:48             ` Jason Wang
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Wang @ 2015-01-14  9:48 UTC (permalink / raw)
  To: Richard Tollerton; +Cc: kraxel, qemu-devel, Jeff Westfahl, Michael S. Tsirkin


On 01/14/2015 05:06 AM, Richard Tollerton wrote:
> Jason Wang <jasowang@redhat.com> writes:
>
>> On Tue, Jan 13, 2015 at 3:12 AM, Richard Tollerton <rich.tollerton@ni.com> wrote:
>>> On Thu, Dec 18, 2014 at 12:01:48AM -0500, Jason Wang wrote:
>>>
>>>>  > Some drivers set RDT=RDH. Oddly, this works on real hardware. To
>>>>  > work around this, autodecrement RDT when this happens.
>>>>  
>>>>  Please describe more details on the issue. The spec 3.2.6 said:
>>>> "When the head pointer is equal to the tail pointer, the ring is
>>>> empty." So RDT=RDH in fact empty the ring. No?
>>> That is incorrect; the spec explicitly states that RDT=RDH means the
>>> ring is full. The linux e1000 driver more or less implies the same
>>> thing.
>>>
>>> You forgot to include the sentence after that in section 3.2.6 :)
>>>
>>> "When the head pointer is equal to the tail pointer, the ring is
>>> empty. Hardware stops storing packets in system memory until software
>>> advances the tail pointer, making more receive buffers available."
>>>
>>> Yeah, this seems really poorly worded to me too. :( You appear to be
>>> interpreting "ring is empty" in the usual sense, i.e. "all N elements
>>> of the ring buffer are available for use by hardware". In fact, the
>>> correct interpretation [1] is the exact opposite, "none of the
>>> elements are available for use by hardware".
>> Yes, I do think 'empty' means no available buffer for device to receive 
>> :)
> Ah, ok. But if you're concerned about breaking drivers with this... what
> legitimate reason could a driver possibly have to set RDT=RDH? (Besides a
> mistaken attempt to free the ring for hardware use, as I posit?)

One example is initialization in Linux (e1000_configure_rx()). What it
does is:

    /* disable receives while setting up the descriptors */
    rctl = er32(RCTL);
    ew32(RCTL, rctl & ~E1000_RCTL_EN);
...
        ew32(RDT, 0);
        ew32(RDH, 0);
...
    /* Enable Receives */
    ew32(RCTL, rctl | E1000_RCTL_EN);

And the rx buffer allocations were done later. So with your patch, after
rx was enabled but before  rx buffer were allocated. Since RDT was set
to RDLEN, e1000_has_rxbuf() will return true. Qemu will try to receive
packet to uninitialized buffers. This seems wrong.
>
> The only reason I can think of is that maybe a driver is trying to
> implement some sort of crude flow control. But if that actually worked,
> then major packet loss would be observed under load, as any packets
> received by hardware but not yet processed by software would get
> dropped.
>
> I'm going to go out (further) on a limb and assert that no driver ever
> sets RDT=RDH to stop receives, because no hardware implements that
> behavior. The driver I'm trying to get working appears to have been
> setting RDT=RDH at the end of *every* iteration of the receive loop,
> since it was originally written in 2003. If I am to trust the comments,
> it's been ported/supported on 28 different chipset variants, and it's
> definitely been tested for performance and packet loss under load for a
> good number of those, including under polling modes where the ring is
> only checked every few milliseconds. If RDT=RDH ever did anything except
> free all buffers for hardware use, surely catastrophic packet loss would
> have been observed?

>From device's point of view. It just need stop receiving when RDT=RDH.
It does not care whether the ring was full of received packets or empty.

I probably miss something but could you please show the (pseudo) code of
your driver to see why current qemu does not work?
>
>>> The last sentence in the quote makes this explicit. See also linux
>>> e1000 driver sources at [2] [3] [4].
>> Btw, [2],[3],[4] are all codes that deal with driver's internal
>> variable, not the one that the hardware use.
> No, it's directly used by hardware -- current_count increments RDT. See
> e1000_alloc_rx_buffers().

Do you mean cleaned_count? (Didn't find current_count).
>
>>> I'm *guessing* (?) that the DMA engine isn't correspondingly stopped
>>> when software sets RDT=RDH, so that once packets start getting
>>> received,
>> Do you mean in qemu? I/O are single threaded, so looks like we are
>> safe.
> I'm referring to the possibility that physical hardware is doing this.
>
> Interesting side note: while this driver sets RDT=RDH on every
> iteration, it *initializes* RDT=0 and RDH=1...

I want to know more about this driver.

RDT=0 and RDH=1 means all buffers were available for device to receive.
If no rx buffer refilling happens, RDH will be finally advanced by
device and finally equal to RDT and then receiving is stopped. When will
driver set RDT=RDH?
>>> the hardware can more or less ignore it. In this context, my patch
>>> makes sense.
>>>
>>> (Yes, this is totally an ex-post-facto justification for the patch; it
>>> arrived to me secondhand, and I had not been familiar with the driver
>>> source before now.)
>> True, we've found many undocumented behavior in the past (some even 
>> conflicts with spec). I don't have a 82540EM in my hand, but I think 
>> the best thing is to check this behavior in real hardware to prevent 
>> this patch from breaking many existing drivers.
> Can you be more specific in regards to what information you're
> requesting? Are you wanting additional confirmation (i.e. via
> instrumented code in addition to code inspection) that setting RDT=RDH
> does not stop packet receive once it has started?
>

Just a test to see if a real 82540 card behaves like this patch. E.g.

write(RDH, 0)
write(RDT, 0)
read(RDT)

To see if RDT is zero or RDLEN.
>>> [1] http://sourceforge.net/p/e1000/mailman/message/29280078/
>> This issue mentioned in the thread seems solved.
> Indeed; I was citing this thread (and the other thread) for the
> discussions, not the issues themselves.
>
>>> [2] 
>>> http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000_main.c#L398
>>> [3] 
>>> http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000.h#L215
>>> [4] 
>>> http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000_main.c#L4302
>>> [5] http://sourceforge.net/p/e1000/mailman/message/29969887/
>> Looks like what mentioned in this thread was also solved.
>>
>> We check both RCTL and e1000_has_rxbufs() in e1000_can_receive().
>> And flush queued packets in set_rx_control().
>>>>>  > diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>>>>>  > index 44ae3a8..b8cbfc1 100644
>>>>>  > --- a/hw/net/e1000.c
>>>>>  > +++ b/hw/net/e1000.c
>>>>>  > @@ -1152,6 +1152,12 @@ mac_writereg(E1000State *s, int index, 
>>>>> uint32_t val)
>>>>>  >  static void
>>>>>  >  set_rdt(E1000State *s, int index, uint32_t val)
>>>>>  >  {
>>>>>  > +    if (val == s->mac_reg[RDH]) {     /* Decrement RDT if it's 
>>>>> too big */
>>>>>  > +        if (val == 0) {
>>>>>  > +            val = s->mac_reg[RDLEN] / sizeof(struct 
>>>>> e1000_rx_desc);
>>>>>  > +        }
>>>>>  > +        val--;
>>>>>  > +    }
>>>>>  >      s->mac_reg[index] = val & 0xffff;
>>>>>  >      if (e1000_has_rxbufs(s, 1)) {
>>>>>  >          qemu_flush_queued_packets(qemu_get_queue(s->nic));
>>>>>  > --
>>>>>  > 2.1.3
>>>>>  > 
>>>>>  > 
>>>>>  > 
>>> -- 
>>> Richard Tollerton <rich.tollerton@ni.com>
>>>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/2] e1000: fixes for Phar Lap ETS
  2014-12-11  5:23 [Qemu-devel] [PATCH v2 0/2] e1000: fixes for Phar Lap ETS Richard Tollerton
  2014-12-11  5:23 ` [Qemu-devel] [PATCH v2 1/2] e1000: Clear MDIC register when PHY addr is invalid Richard Tollerton
  2014-12-11  5:23 ` [Qemu-devel] [PATCH v2 2/2] e1000: decrement RDT if equal to RDH Richard Tollerton
@ 2015-03-24  9:37 ` Stefan Hajnoczi
  2015-03-25  6:02   ` Richard Tollerton
  2 siblings, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2015-03-24  9:37 UTC (permalink / raw)
  To: Richard Tollerton; +Cc: mst, qemu-devel, kraxel

[-- Attachment #1: Type: text/plain, Size: 832 bytes --]

On Wed, Dec 10, 2014 at 11:23:45PM -0600, Richard Tollerton wrote:
> The 8254x driver in certain versions of Phar Lap ETS hasn't been
> initializing the e1000 device properly in qemu. It looks like the driver
> is relying on two specific pieces of behavior which (anecdotally) exist
> in hardware, although I can't cite any datasheets on the matter; in any
> case, these two patches adjust this behavior. Tested lightly on both ETS
> and Win7.
> 
> This is a resend; there are no changes compared to v1.
> 
> Richard Tollerton (2):
>   e1000: Clear MDIC register when PHY addr is invalid
>   e1000: decrement RDT if equal to RDH
> 
>  hw/net/e1000.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)

Hi,
Just working through unread patches in my mailbox.

Has this patch series stalled?

Stefan

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/2] e1000: fixes for Phar Lap ETS
  2015-03-24  9:37 ` [Qemu-devel] [PATCH v2 0/2] e1000: fixes for Phar Lap ETS Stefan Hajnoczi
@ 2015-03-25  6:02   ` Richard Tollerton
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Tollerton @ 2015-03-25  6:02 UTC (permalink / raw)
  To: Stefan Hajnoczi, jasowang; +Cc: mst, qemu-devel, kraxel

Stefan Hajnoczi <stefanha@gmail.com> writes:

> On Wed, Dec 10, 2014 at 11:23:45PM -0600, Richard Tollerton wrote:
>> The 8254x driver in certain versions of Phar Lap ETS hasn't been
>> initializing the e1000 device properly in qemu. It looks like the driver
>> is relying on two specific pieces of behavior which (anecdotally) exist
>> in hardware, although I can't cite any datasheets on the matter; in any
>> case, these two patches adjust this behavior. Tested lightly on both ETS
>> and Win7.
>> 
>> This is a resend; there are no changes compared to v1.
>> 
>> Richard Tollerton (2):
>>   e1000: Clear MDIC register when PHY addr is invalid
>>   e1000: decrement RDT if equal to RDH
>> 
>>  hw/net/e1000.c | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> Hi,
> Just working through unread patches in my mailbox.
>
> Has this patch series stalled?

Apologies, yes -- Jason raised some legitimate concerns, and I haven't
found the time to finish investigating our e1000 driver since then; I
have been negligent in not responding sooner. I probably won't have time
to respond properly until (at least) early May.

> Stefan

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2015-03-25  6:02 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-11  5:23 [Qemu-devel] [PATCH v2 0/2] e1000: fixes for Phar Lap ETS Richard Tollerton
2014-12-11  5:23 ` [Qemu-devel] [PATCH v2 1/2] e1000: Clear MDIC register when PHY addr is invalid Richard Tollerton
2014-12-18  5:08   ` Jason Wang
2014-12-11  5:23 ` [Qemu-devel] [PATCH v2 2/2] e1000: decrement RDT if equal to RDH Richard Tollerton
2014-12-18  5:01   ` Jason Wang
2015-01-12 10:24     ` Michael S. Tsirkin
2015-01-12 19:12       ` Richard Tollerton
2015-01-13  6:48         ` Jason Wang
2015-01-13 21:06           ` Richard Tollerton
2015-01-14  9:48             ` Jason Wang
2015-03-24  9:37 ` [Qemu-devel] [PATCH v2 0/2] e1000: fixes for Phar Lap ETS Stefan Hajnoczi
2015-03-25  6:02   ` Richard Tollerton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.