All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
@ 2021-07-02  9:24 Bin Meng
  2021-07-02  9:24 ` [PATCH v2 2/3] hw/net: e1000e: " Bin Meng
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Bin Meng @ 2021-07-02  9:24 UTC (permalink / raw)
  To: Jason Wang, qemu-devel; +Cc: Bin Meng, Christina Wang, Markus Carlstedt

From: Christina Wang <christina.wang@windriver.com>

The initial value of VLAN Ether Type (VET) register is 0x8100, as per
the manual and real hardware.

While Linux e1000 driver always writes VET register to 0x8100, it is
not always the case for everyone. Drivers relying on the reset value
of VET won't be able to transmit and receive VLAN frames in QEMU.

Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
Signed-off-by: Christina Wang <christina.wang@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---

(no changes since v1)

 hw/net/e1000.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 4f75b44cfc..20cbba6411 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -29,6 +29,7 @@
 #include "hw/pci/pci.h"
 #include "hw/qdev-properties.h"
 #include "migration/vmstate.h"
+#include "net/eth.h"
 #include "net/net.h"
 #include "net/checksum.h"
 #include "sysemu/sysemu.h"
@@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
     [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
                 E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
                 E1000_MANC_RMCP_EN,
+    [VET]     = ETH_P_VLAN,
 };
 
 /* Helper function, *curr == 0 means the value is not set */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 2/3] hw/net: e1000e: Correct the initial value of VET register
  2021-07-02  9:24 [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register Bin Meng
@ 2021-07-02  9:24 ` Bin Meng
  2021-07-02  9:24 ` [PATCH v2 3/3] hw/net: e1000e: Don't zero out the VLAN tag in the legacy RX descriptor Bin Meng
  2021-07-05  4:21 ` [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register Jason Wang
  2 siblings, 0 replies; 20+ messages in thread
From: Bin Meng @ 2021-07-02  9:24 UTC (permalink / raw)
  To: Jason Wang, qemu-devel; +Cc: Bin Meng, Christina Wang, Markus Carlstedt

From: Christina Wang <christina.wang@windriver.com>

The initial value of VLAN Ether Type (VET) register is 0x8100, as per
the manual and real hardware.

While Linux e1000e driver always writes VET register to 0x8100, it is
not always the case for everyone. Drivers relying on the reset value
of VET won't be able to transmit and receive VLAN frames in QEMU.

Unlike e1000 in QEMU, e1000e uses a field 'vet' in "struct E1000Core"
to cache the value of VET register, but the cache only gets updated
when VET register is written. To always get a consistent VET value
no matter VET is written or remains its reset value, drop the 'vet'
field and use 'core->mac[VET]' directly.

Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
Signed-off-by: Christina Wang <christina.wang@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---

Changes in v2:
- keep the 'vet' field in "struct E1000Core" for migration compatibility

 hw/net/e1000e_core.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index b75f2ab8fc..38b3e3b784 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -35,6 +35,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/log.h"
+#include "net/eth.h"
 #include "net/net.h"
 #include "net/tap.h"
 #include "hw/pci/msi.h"
@@ -731,7 +732,7 @@ e1000e_process_tx_desc(E1000ECore *core,
             if (e1000x_vlan_enabled(core->mac) &&
                 e1000x_is_vlan_txd(txd_lower)) {
                 net_tx_pkt_setup_vlan_header_ex(tx->tx_pkt,
-                    le16_to_cpu(dp->upper.fields.special), core->vet);
+                    le16_to_cpu(dp->upper.fields.special), core->mac[VET]);
             }
             if (e1000e_tx_pkt_send(core, tx, queue_index)) {
                 e1000e_on_tx_done_update_stats(core, tx->tx_pkt);
@@ -1012,7 +1013,7 @@ e1000e_receive_filter(E1000ECore *core, const uint8_t *buf, int size)
 {
     uint32_t rctl = core->mac[RCTL];
 
-    if (e1000x_is_vlan_packet(buf, core->vet) &&
+    if (e1000x_is_vlan_packet(buf, core->mac[VET]) &&
         e1000x_vlan_rx_filter_enabled(core->mac)) {
         uint16_t vid = lduw_be_p(buf + 14);
         uint32_t vfta = ldl_le_p((uint32_t *)(core->mac + VFTA) +
@@ -1686,7 +1687,7 @@ e1000e_receive_iov(E1000ECore *core, const struct iovec *iov, int iovcnt)
     }
 
     net_rx_pkt_attach_iovec_ex(core->rx_pkt, iov, iovcnt, iov_ofs,
-                               e1000x_vlan_enabled(core->mac), core->vet);
+                               e1000x_vlan_enabled(core->mac), core->mac[VET]);
 
     e1000e_rss_parse_packet(core, core->rx_pkt, &rss_info);
     e1000e_rx_ring_init(core, &rxr, rss_info.queue);
@@ -2397,8 +2398,7 @@ static void
 e1000e_set_vet(E1000ECore *core, int index, uint32_t val)
 {
     core->mac[VET] = val & 0xffff;
-    core->vet = le16_to_cpu(core->mac[VET]);
-    trace_e1000e_vlan_vet(core->vet);
+    trace_e1000e_vlan_vet(core->mac[VET]);
 }
 
 static void
@@ -3442,6 +3442,7 @@ static const uint32_t e1000e_mac_reg_init[] = {
     [RXCSUM]        = E1000_RXCSUM_IPOFLD | E1000_RXCSUM_TUOFLD,
     [ITR]           = E1000E_MIN_XITR,
     [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = E1000E_MIN_XITR,
+    [VET]           = ETH_P_VLAN,
 };
 
 void
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 3/3] hw/net: e1000e: Don't zero out the VLAN tag in the legacy RX descriptor
  2021-07-02  9:24 [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register Bin Meng
  2021-07-02  9:24 ` [PATCH v2 2/3] hw/net: e1000e: " Bin Meng
@ 2021-07-02  9:24 ` Bin Meng
  2021-07-05  4:21 ` [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register Jason Wang
  2 siblings, 0 replies; 20+ messages in thread
From: Bin Meng @ 2021-07-02  9:24 UTC (permalink / raw)
  To: Jason Wang, qemu-devel; +Cc: Bin Meng, Christina Wang, Markus Carlstedt

From: Christina Wang <christina.wang@windriver.com>

In the legacy RX descriptor mode, VLAN tag was saved to d->special
by e1000e_build_rx_metadata() in e1000e_write_lgcy_rx_descr(), but
it was then zeroed out again at the end of the call, which is wrong.

Fixes: c89d416a2b0f ("e1000e: Don't zero out buffer address in rx descriptor")
Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
Signed-off-by: Christina Wang <christina.wang@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---

(no changes since v1)

 hw/net/e1000e_core.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 38b3e3b784..738c7169e4 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1286,7 +1286,6 @@ e1000e_write_lgcy_rx_descr(E1000ECore *core, uint8_t *desc,
                              &d->special);
     d->errors = (uint8_t) (le32_to_cpu(status_flags) >> 24);
     d->status = (uint8_t) le32_to_cpu(status_flags);
-    d->special = 0;
 }
 
 static inline void
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-02  9:24 [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register Bin Meng
  2021-07-02  9:24 ` [PATCH v2 2/3] hw/net: e1000e: " Bin Meng
  2021-07-02  9:24 ` [PATCH v2 3/3] hw/net: e1000e: Don't zero out the VLAN tag in the legacy RX descriptor Bin Meng
@ 2021-07-05  4:21 ` Jason Wang
  2021-07-05  5:57   ` Bin Meng
  2 siblings, 1 reply; 20+ messages in thread
From: Jason Wang @ 2021-07-05  4:21 UTC (permalink / raw)
  To: Bin Meng, qemu-devel; +Cc: Bin Meng, Christina Wang, Markus Carlstedt


在 2021/7/2 下午5:24, Bin Meng 写道:
> From: Christina Wang <christina.wang@windriver.com>
>
> The initial value of VLAN Ether Type (VET) register is 0x8100, as per
> the manual and real hardware.
>
> While Linux e1000 driver always writes VET register to 0x8100, it is
> not always the case for everyone. Drivers relying on the reset value
> of VET won't be able to transmit and receive VLAN frames in QEMU.
>
> Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
> Signed-off-by: Christina Wang <christina.wang@windriver.com>
> Signed-off-by: Bin Meng <bin.meng@windriver.com>
> ---
>
> (no changes since v1)
>
>   hw/net/e1000.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> index 4f75b44cfc..20cbba6411 100644
> --- a/hw/net/e1000.c
> +++ b/hw/net/e1000.c
> @@ -29,6 +29,7 @@
>   #include "hw/pci/pci.h"
>   #include "hw/qdev-properties.h"
>   #include "migration/vmstate.h"
> +#include "net/eth.h"
>   #include "net/net.h"
>   #include "net/checksum.h"
>   #include "sysemu/sysemu.h"
> @@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
>       [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
>                   E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
>                   E1000_MANC_RMCP_EN,
> +    [VET]     = ETH_P_VLAN,


I wonder if we need a compat flag for this, since we change the behavior.

(See e1000_properties[])

Thanks


>   };
>   
>   /* Helper function, *curr == 0 means the value is not set */



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-05  4:21 ` [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register Jason Wang
@ 2021-07-05  5:57   ` Bin Meng
  2021-07-12 23:06     ` Bin Meng
  0 siblings, 1 reply; 20+ messages in thread
From: Bin Meng @ 2021-07-05  5:57 UTC (permalink / raw)
  To: Jason Wang
  Cc: Bin Meng, Christina Wang, qemu-devel@nongnu.org Developers,
	Markus Carlstedt

On Mon, Jul 5, 2021 at 12:21 PM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2021/7/2 下午5:24, Bin Meng 写道:
> > From: Christina Wang <christina.wang@windriver.com>
> >
> > The initial value of VLAN Ether Type (VET) register is 0x8100, as per
> > the manual and real hardware.
> >
> > While Linux e1000 driver always writes VET register to 0x8100, it is
> > not always the case for everyone. Drivers relying on the reset value
> > of VET won't be able to transmit and receive VLAN frames in QEMU.
> >
> > Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
> > Signed-off-by: Christina Wang <christina.wang@windriver.com>
> > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > ---
> >
> > (no changes since v1)
> >
> >   hw/net/e1000.c | 2 ++
> >   1 file changed, 2 insertions(+)
> >
> > diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> > index 4f75b44cfc..20cbba6411 100644
> > --- a/hw/net/e1000.c
> > +++ b/hw/net/e1000.c
> > @@ -29,6 +29,7 @@
> >   #include "hw/pci/pci.h"
> >   #include "hw/qdev-properties.h"
> >   #include "migration/vmstate.h"
> > +#include "net/eth.h"
> >   #include "net/net.h"
> >   #include "net/checksum.h"
> >   #include "sysemu/sysemu.h"
> > @@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
> >       [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
> >                   E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
> >                   E1000_MANC_RMCP_EN,
> > +    [VET]     = ETH_P_VLAN,
>
>
> I wonder if we need a compat flag for this, since we change the behavior.
>
> (See e1000_properties[])
>

No we don't need to since it does not break migration.

Regards,
Bin


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-05  5:57   ` Bin Meng
@ 2021-07-12 23:06     ` Bin Meng
  2021-07-13  7:03       ` Jason Wang
  0 siblings, 1 reply; 20+ messages in thread
From: Bin Meng @ 2021-07-12 23:06 UTC (permalink / raw)
  To: Jason Wang
  Cc: Bin Meng, Christina Wang, qemu-devel@nongnu.org Developers,
	Markus Carlstedt

On Mon, Jul 5, 2021 at 1:57 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>
> On Mon, Jul 5, 2021 at 12:21 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> >
> > 在 2021/7/2 下午5:24, Bin Meng 写道:
> > > From: Christina Wang <christina.wang@windriver.com>
> > >
> > > The initial value of VLAN Ether Type (VET) register is 0x8100, as per
> > > the manual and real hardware.
> > >
> > > While Linux e1000 driver always writes VET register to 0x8100, it is
> > > not always the case for everyone. Drivers relying on the reset value
> > > of VET won't be able to transmit and receive VLAN frames in QEMU.
> > >
> > > Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
> > > Signed-off-by: Christina Wang <christina.wang@windriver.com>
> > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > ---
> > >
> > > (no changes since v1)
> > >
> > >   hw/net/e1000.c | 2 ++
> > >   1 file changed, 2 insertions(+)
> > >
> > > diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> > > index 4f75b44cfc..20cbba6411 100644
> > > --- a/hw/net/e1000.c
> > > +++ b/hw/net/e1000.c
> > > @@ -29,6 +29,7 @@
> > >   #include "hw/pci/pci.h"
> > >   #include "hw/qdev-properties.h"
> > >   #include "migration/vmstate.h"
> > > +#include "net/eth.h"
> > >   #include "net/net.h"
> > >   #include "net/checksum.h"
> > >   #include "sysemu/sysemu.h"
> > > @@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
> > >       [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
> > >                   E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
> > >                   E1000_MANC_RMCP_EN,
> > > +    [VET]     = ETH_P_VLAN,
> >
> >
> > I wonder if we need a compat flag for this, since we change the behavior.
> >
> > (See e1000_properties[])
> >
>
> No we don't need to since it does not break migration.

Ping?


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-12 23:06     ` Bin Meng
@ 2021-07-13  7:03       ` Jason Wang
  2021-07-13  8:36         ` Bin Meng
  0 siblings, 1 reply; 20+ messages in thread
From: Jason Wang @ 2021-07-13  7:03 UTC (permalink / raw)
  To: Bin Meng
  Cc: Bin Meng, Christina Wang, qemu-devel@nongnu.org Developers,
	Markus Carlstedt


在 2021/7/13 上午7:06, Bin Meng 写道:
> On Mon, Jul 5, 2021 at 1:57 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>> On Mon, Jul 5, 2021 at 12:21 PM Jason Wang <jasowang@redhat.com> wrote:
>>>
>>> 在 2021/7/2 下午5:24, Bin Meng 写道:
>>>> From: Christina Wang <christina.wang@windriver.com>
>>>>
>>>> The initial value of VLAN Ether Type (VET) register is 0x8100, as per
>>>> the manual and real hardware.
>>>>
>>>> While Linux e1000 driver always writes VET register to 0x8100, it is
>>>> not always the case for everyone. Drivers relying on the reset value
>>>> of VET won't be able to transmit and receive VLAN frames in QEMU.
>>>>
>>>> Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
>>>> Signed-off-by: Christina Wang <christina.wang@windriver.com>
>>>> Signed-off-by: Bin Meng <bin.meng@windriver.com>
>>>> ---
>>>>
>>>> (no changes since v1)
>>>>
>>>>    hw/net/e1000.c | 2 ++
>>>>    1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>>>> index 4f75b44cfc..20cbba6411 100644
>>>> --- a/hw/net/e1000.c
>>>> +++ b/hw/net/e1000.c
>>>> @@ -29,6 +29,7 @@
>>>>    #include "hw/pci/pci.h"
>>>>    #include "hw/qdev-properties.h"
>>>>    #include "migration/vmstate.h"
>>>> +#include "net/eth.h"
>>>>    #include "net/net.h"
>>>>    #include "net/checksum.h"
>>>>    #include "sysemu/sysemu.h"
>>>> @@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
>>>>        [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
>>>>                    E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
>>>>                    E1000_MANC_RMCP_EN,
>>>> +    [VET]     = ETH_P_VLAN,
>>>
>>> I wonder if we need a compat flag for this, since we change the behavior.
>>>
>>> (See e1000_properties[])
>>>
>> No we don't need to since it does not break migration.
> Ping?


I admit migration "works" but it doesn't mean it's not broken. It 
changes the guest visible default value of VET register, so it may break 
things silently for the guest.

For old machine types, we should stick the value to the one without this 
fix.

Thanks


>



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-13  7:03       ` Jason Wang
@ 2021-07-13  8:36         ` Bin Meng
  2021-07-13  9:02           ` Jason Wang
  0 siblings, 1 reply; 20+ messages in thread
From: Bin Meng @ 2021-07-13  8:36 UTC (permalink / raw)
  To: Jason Wang, Peter Maydell
  Cc: Bin Meng, Christina Wang, qemu-devel@nongnu.org Developers,
	Markus Carlstedt

On Tue, Jul 13, 2021 at 3:03 PM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2021/7/13 上午7:06, Bin Meng 写道:
> > On Mon, Jul 5, 2021 at 1:57 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> >> On Mon, Jul 5, 2021 at 12:21 PM Jason Wang <jasowang@redhat.com> wrote:
> >>>
> >>> 在 2021/7/2 下午5:24, Bin Meng 写道:
> >>>> From: Christina Wang <christina.wang@windriver.com>
> >>>>
> >>>> The initial value of VLAN Ether Type (VET) register is 0x8100, as per
> >>>> the manual and real hardware.
> >>>>
> >>>> While Linux e1000 driver always writes VET register to 0x8100, it is
> >>>> not always the case for everyone. Drivers relying on the reset value
> >>>> of VET won't be able to transmit and receive VLAN frames in QEMU.
> >>>>
> >>>> Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
> >>>> Signed-off-by: Christina Wang <christina.wang@windriver.com>
> >>>> Signed-off-by: Bin Meng <bin.meng@windriver.com>
> >>>> ---
> >>>>
> >>>> (no changes since v1)
> >>>>
> >>>>    hw/net/e1000.c | 2 ++
> >>>>    1 file changed, 2 insertions(+)
> >>>>
> >>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> >>>> index 4f75b44cfc..20cbba6411 100644
> >>>> --- a/hw/net/e1000.c
> >>>> +++ b/hw/net/e1000.c
> >>>> @@ -29,6 +29,7 @@
> >>>>    #include "hw/pci/pci.h"
> >>>>    #include "hw/qdev-properties.h"
> >>>>    #include "migration/vmstate.h"
> >>>> +#include "net/eth.h"
> >>>>    #include "net/net.h"
> >>>>    #include "net/checksum.h"
> >>>>    #include "sysemu/sysemu.h"
> >>>> @@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
> >>>>        [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
> >>>>                    E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
> >>>>                    E1000_MANC_RMCP_EN,
> >>>> +    [VET]     = ETH_P_VLAN,
> >>>
> >>> I wonder if we need a compat flag for this, since we change the behavior.
> >>>
> >>> (See e1000_properties[])
> >>>
> >> No we don't need to since it does not break migration.
> > Ping?
>
>
> I admit migration "works" but it doesn't mean it's not broken. It
> changes the guest visible default value of VET register, so it may break
> things silently for the guest.
>
> For old machine types, we should stick the value to the one without this
> fix.

Could you please propose a solution on how to handle such a scenario
in a generic way in QEMU? (+Peter)

The POR reset value is wrong in QEMU and has carried forward the wrong
value for years, and correcting it to its right value needs to do
what?

Regards,
Bin


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-13  8:36         ` Bin Meng
@ 2021-07-13  9:02           ` Jason Wang
  2021-07-13  9:11             ` Bin Meng
  0 siblings, 1 reply; 20+ messages in thread
From: Jason Wang @ 2021-07-13  9:02 UTC (permalink / raw)
  To: Bin Meng, Peter Maydell
  Cc: Bin Meng, Christina Wang, qemu-devel@nongnu.org Developers,
	Markus Carlstedt


在 2021/7/13 下午4:36, Bin Meng 写道:
> On Tue, Jul 13, 2021 at 3:03 PM Jason Wang <jasowang@redhat.com> wrote:
>>
>> 在 2021/7/13 上午7:06, Bin Meng 写道:
>>> On Mon, Jul 5, 2021 at 1:57 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>>>> On Mon, Jul 5, 2021 at 12:21 PM Jason Wang <jasowang@redhat.com> wrote:
>>>>> 在 2021/7/2 下午5:24, Bin Meng 写道:
>>>>>> From: Christina Wang <christina.wang@windriver.com>
>>>>>>
>>>>>> The initial value of VLAN Ether Type (VET) register is 0x8100, as per
>>>>>> the manual and real hardware.
>>>>>>
>>>>>> While Linux e1000 driver always writes VET register to 0x8100, it is
>>>>>> not always the case for everyone. Drivers relying on the reset value
>>>>>> of VET won't be able to transmit and receive VLAN frames in QEMU.
>>>>>>
>>>>>> Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
>>>>>> Signed-off-by: Christina Wang <christina.wang@windriver.com>
>>>>>> Signed-off-by: Bin Meng <bin.meng@windriver.com>
>>>>>> ---
>>>>>>
>>>>>> (no changes since v1)
>>>>>>
>>>>>>     hw/net/e1000.c | 2 ++
>>>>>>     1 file changed, 2 insertions(+)
>>>>>>
>>>>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>>>>>> index 4f75b44cfc..20cbba6411 100644
>>>>>> --- a/hw/net/e1000.c
>>>>>> +++ b/hw/net/e1000.c
>>>>>> @@ -29,6 +29,7 @@
>>>>>>     #include "hw/pci/pci.h"
>>>>>>     #include "hw/qdev-properties.h"
>>>>>>     #include "migration/vmstate.h"
>>>>>> +#include "net/eth.h"
>>>>>>     #include "net/net.h"
>>>>>>     #include "net/checksum.h"
>>>>>>     #include "sysemu/sysemu.h"
>>>>>> @@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
>>>>>>         [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
>>>>>>                     E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
>>>>>>                     E1000_MANC_RMCP_EN,
>>>>>> +    [VET]     = ETH_P_VLAN,
>>>>> I wonder if we need a compat flag for this, since we change the behavior.
>>>>>
>>>>> (See e1000_properties[])
>>>>>
>>>> No we don't need to since it does not break migration.
>>> Ping?
>>
>> I admit migration "works" but it doesn't mean it's not broken. It
>> changes the guest visible default value of VET register, so it may break
>> things silently for the guest.
>>
>> For old machine types, we should stick the value to the one without this
>> fix.
> Could you please propose a solution on how to handle such a scenario
> in a generic way in QEMU? (+Peter)


Well, I think I've suggested you to have a look at how things is done in 
for handling such compatibility in e1000_properties.


>
> The POR reset value is wrong in QEMU and has carried forward the wrong
> value for years, and correcting it to its right value needs to do
> what?


We should stick to the wrong behavior for old machine types.

That's all.


>
> Regards,
> Bin
>



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-13  9:02           ` Jason Wang
@ 2021-07-13  9:11             ` Bin Meng
  2021-07-14  3:10               ` Jason Wang
  0 siblings, 1 reply; 20+ messages in thread
From: Bin Meng @ 2021-07-13  9:11 UTC (permalink / raw)
  To: Jason Wang, Philippe Mathieu-Daudé
  Cc: Bin Meng, Peter Maydell, Christina Wang,
	qemu-devel@nongnu.org Developers, Markus Carlstedt

On Tue, Jul 13, 2021 at 5:02 PM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2021/7/13 下午4:36, Bin Meng 写道:
> > On Tue, Jul 13, 2021 at 3:03 PM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> 在 2021/7/13 上午7:06, Bin Meng 写道:
> >>> On Mon, Jul 5, 2021 at 1:57 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> >>>> On Mon, Jul 5, 2021 at 12:21 PM Jason Wang <jasowang@redhat.com> wrote:
> >>>>> 在 2021/7/2 下午5:24, Bin Meng 写道:
> >>>>>> From: Christina Wang <christina.wang@windriver.com>
> >>>>>>
> >>>>>> The initial value of VLAN Ether Type (VET) register is 0x8100, as per
> >>>>>> the manual and real hardware.
> >>>>>>
> >>>>>> While Linux e1000 driver always writes VET register to 0x8100, it is
> >>>>>> not always the case for everyone. Drivers relying on the reset value
> >>>>>> of VET won't be able to transmit and receive VLAN frames in QEMU.
> >>>>>>
> >>>>>> Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
> >>>>>> Signed-off-by: Christina Wang <christina.wang@windriver.com>
> >>>>>> Signed-off-by: Bin Meng <bin.meng@windriver.com>
> >>>>>> ---
> >>>>>>
> >>>>>> (no changes since v1)
> >>>>>>
> >>>>>>     hw/net/e1000.c | 2 ++
> >>>>>>     1 file changed, 2 insertions(+)
> >>>>>>
> >>>>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> >>>>>> index 4f75b44cfc..20cbba6411 100644
> >>>>>> --- a/hw/net/e1000.c
> >>>>>> +++ b/hw/net/e1000.c
> >>>>>> @@ -29,6 +29,7 @@
> >>>>>>     #include "hw/pci/pci.h"
> >>>>>>     #include "hw/qdev-properties.h"
> >>>>>>     #include "migration/vmstate.h"
> >>>>>> +#include "net/eth.h"
> >>>>>>     #include "net/net.h"
> >>>>>>     #include "net/checksum.h"
> >>>>>>     #include "sysemu/sysemu.h"
> >>>>>> @@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
> >>>>>>         [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
> >>>>>>                     E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
> >>>>>>                     E1000_MANC_RMCP_EN,
> >>>>>> +    [VET]     = ETH_P_VLAN,
> >>>>> I wonder if we need a compat flag for this, since we change the behavior.
> >>>>>
> >>>>> (See e1000_properties[])
> >>>>>
> >>>> No we don't need to since it does not break migration.
> >>> Ping?
> >>
> >> I admit migration "works" but it doesn't mean it's not broken. It
> >> changes the guest visible default value of VET register, so it may break
> >> things silently for the guest.
> >>
> >> For old machine types, we should stick the value to the one without this
> >> fix.
> > Could you please propose a solution on how to handle such a scenario
> > in a generic way in QEMU? (+Peter)
>
>
> Well, I think I've suggested you to have a look at how things is done in
> for handling such compatibility in e1000_properties.
>
>
> >
> > The POR reset value is wrong in QEMU and has carried forward the wrong
> > value for years, and correcting it to its right value needs to do
> > what?
>
>
> We should stick to the wrong behavior for old machine types.
>
> That's all.

So that means the following SD patch is also wrong (+Philippe) which
changes the default value of capability register.
http://patchwork.ozlabs.org/project/qemu-devel/patch/20210623185921.24113-1-joannekoong@gmail.com/

Can we get some agreement among maintainers?

Regards,
Bin


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-13  9:11             ` Bin Meng
@ 2021-07-14  3:10               ` Jason Wang
  2021-07-14  3:42                 ` Bin Meng
  0 siblings, 1 reply; 20+ messages in thread
From: Jason Wang @ 2021-07-14  3:10 UTC (permalink / raw)
  To: Bin Meng, Philippe Mathieu-Daudé
  Cc: Bin Meng, Peter Maydell, Christina Wang,
	qemu-devel@nongnu.org Developers, Markus Carlstedt


在 2021/7/13 下午5:11, Bin Meng 写道:
> On Tue, Jul 13, 2021 at 5:02 PM Jason Wang <jasowang@redhat.com> wrote:
>>
>> 在 2021/7/13 下午4:36, Bin Meng 写道:
>>> On Tue, Jul 13, 2021 at 3:03 PM Jason Wang <jasowang@redhat.com> wrote:
>>>> 在 2021/7/13 上午7:06, Bin Meng 写道:
>>>>> On Mon, Jul 5, 2021 at 1:57 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>>>>>> On Mon, Jul 5, 2021 at 12:21 PM Jason Wang <jasowang@redhat.com> wrote:
>>>>>>> 在 2021/7/2 下午5:24, Bin Meng 写道:
>>>>>>>> From: Christina Wang <christina.wang@windriver.com>
>>>>>>>>
>>>>>>>> The initial value of VLAN Ether Type (VET) register is 0x8100, as per
>>>>>>>> the manual and real hardware.
>>>>>>>>
>>>>>>>> While Linux e1000 driver always writes VET register to 0x8100, it is
>>>>>>>> not always the case for everyone. Drivers relying on the reset value
>>>>>>>> of VET won't be able to transmit and receive VLAN frames in QEMU.
>>>>>>>>
>>>>>>>> Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
>>>>>>>> Signed-off-by: Christina Wang <christina.wang@windriver.com>
>>>>>>>> Signed-off-by: Bin Meng <bin.meng@windriver.com>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> (no changes since v1)
>>>>>>>>
>>>>>>>>      hw/net/e1000.c | 2 ++
>>>>>>>>      1 file changed, 2 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>>>>>>>> index 4f75b44cfc..20cbba6411 100644
>>>>>>>> --- a/hw/net/e1000.c
>>>>>>>> +++ b/hw/net/e1000.c
>>>>>>>> @@ -29,6 +29,7 @@
>>>>>>>>      #include "hw/pci/pci.h"
>>>>>>>>      #include "hw/qdev-properties.h"
>>>>>>>>      #include "migration/vmstate.h"
>>>>>>>> +#include "net/eth.h"
>>>>>>>>      #include "net/net.h"
>>>>>>>>      #include "net/checksum.h"
>>>>>>>>      #include "sysemu/sysemu.h"
>>>>>>>> @@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
>>>>>>>>          [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
>>>>>>>>                      E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
>>>>>>>>                      E1000_MANC_RMCP_EN,
>>>>>>>> +    [VET]     = ETH_P_VLAN,
>>>>>>> I wonder if we need a compat flag for this, since we change the behavior.
>>>>>>>
>>>>>>> (See e1000_properties[])
>>>>>>>
>>>>>> No we don't need to since it does not break migration.
>>>>> Ping?
>>>> I admit migration "works" but it doesn't mean it's not broken. It
>>>> changes the guest visible default value of VET register, so it may break
>>>> things silently for the guest.
>>>>
>>>> For old machine types, we should stick the value to the one without this
>>>> fix.
>>> Could you please propose a solution on how to handle such a scenario
>>> in a generic way in QEMU? (+Peter)
>>
>> Well, I think I've suggested you to have a look at how things is done in
>> for handling such compatibility in e1000_properties.
>>
>>
>>> The POR reset value is wrong in QEMU and has carried forward the wrong
>>> value for years, and correcting it to its right value needs to do
>>> what?
>>
>> We should stick to the wrong behavior for old machine types.
>>
>> That's all.
> So that means the following SD patch is also wrong (+Philippe) which
> changes the default value of capability register.
> http://patchwork.ozlabs.org/project/qemu-devel/patch/20210623185921.24113-1-joannekoong@gmail.com/


It should compat capareg for the old value for old machine types.


>
> Can we get some agreement among maintainers?


It's not about the agreement but about to have a stable ABI. I don't 
know the case for sd but e1000 is used in various  and we work hard to 
unbreak the migration compatibility among downstream versions. Git log 
on e1000.c will tell you more.

Thanks


>
> Regards,
> Bin
>



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-14  3:10               ` Jason Wang
@ 2021-07-14  3:42                 ` Bin Meng
  2021-07-14  4:53                   ` Jason Wang
  2021-07-14  9:00                   ` Peter Maydell
  0 siblings, 2 replies; 20+ messages in thread
From: Bin Meng @ 2021-07-14  3:42 UTC (permalink / raw)
  To: Jason Wang
  Cc: Peter Maydell, Markus Carlstedt, Bin Meng,
	Philippe Mathieu-Daudé,
	qemu-devel@nongnu.org Developers, Christina Wang

On Wed, Jul 14, 2021 at 11:10 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2021/7/13 下午5:11, Bin Meng 写道:
> > On Tue, Jul 13, 2021 at 5:02 PM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> 在 2021/7/13 下午4:36, Bin Meng 写道:
> >>> On Tue, Jul 13, 2021 at 3:03 PM Jason Wang <jasowang@redhat.com> wrote:
> >>>> 在 2021/7/13 上午7:06, Bin Meng 写道:
> >>>>> On Mon, Jul 5, 2021 at 1:57 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> >>>>>> On Mon, Jul 5, 2021 at 12:21 PM Jason Wang <jasowang@redhat.com> wrote:
> >>>>>>> 在 2021/7/2 下午5:24, Bin Meng 写道:
> >>>>>>>> From: Christina Wang <christina.wang@windriver.com>
> >>>>>>>>
> >>>>>>>> The initial value of VLAN Ether Type (VET) register is 0x8100, as per
> >>>>>>>> the manual and real hardware.
> >>>>>>>>
> >>>>>>>> While Linux e1000 driver always writes VET register to 0x8100, it is
> >>>>>>>> not always the case for everyone. Drivers relying on the reset value
> >>>>>>>> of VET won't be able to transmit and receive VLAN frames in QEMU.
> >>>>>>>>
> >>>>>>>> Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
> >>>>>>>> Signed-off-by: Christina Wang <christina.wang@windriver.com>
> >>>>>>>> Signed-off-by: Bin Meng <bin.meng@windriver.com>
> >>>>>>>> ---
> >>>>>>>>
> >>>>>>>> (no changes since v1)
> >>>>>>>>
> >>>>>>>>      hw/net/e1000.c | 2 ++
> >>>>>>>>      1 file changed, 2 insertions(+)
> >>>>>>>>
> >>>>>>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> >>>>>>>> index 4f75b44cfc..20cbba6411 100644
> >>>>>>>> --- a/hw/net/e1000.c
> >>>>>>>> +++ b/hw/net/e1000.c
> >>>>>>>> @@ -29,6 +29,7 @@
> >>>>>>>>      #include "hw/pci/pci.h"
> >>>>>>>>      #include "hw/qdev-properties.h"
> >>>>>>>>      #include "migration/vmstate.h"
> >>>>>>>> +#include "net/eth.h"
> >>>>>>>>      #include "net/net.h"
> >>>>>>>>      #include "net/checksum.h"
> >>>>>>>>      #include "sysemu/sysemu.h"
> >>>>>>>> @@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
> >>>>>>>>          [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
> >>>>>>>>                      E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
> >>>>>>>>                      E1000_MANC_RMCP_EN,
> >>>>>>>> +    [VET]     = ETH_P_VLAN,
> >>>>>>> I wonder if we need a compat flag for this, since we change the behavior.
> >>>>>>>
> >>>>>>> (See e1000_properties[])
> >>>>>>>
> >>>>>> No we don't need to since it does not break migration.
> >>>>> Ping?
> >>>> I admit migration "works" but it doesn't mean it's not broken. It
> >>>> changes the guest visible default value of VET register, so it may break
> >>>> things silently for the guest.
> >>>>
> >>>> For old machine types, we should stick the value to the one without this
> >>>> fix.
> >>> Could you please propose a solution on how to handle such a scenario
> >>> in a generic way in QEMU? (+Peter)
> >>
> >> Well, I think I've suggested you to have a look at how things is done in
> >> for handling such compatibility in e1000_properties.
> >>
> >>
> >>> The POR reset value is wrong in QEMU and has carried forward the wrong
> >>> value for years, and correcting it to its right value needs to do
> >>> what?
> >>
> >> We should stick to the wrong behavior for old machine types.
> >>
> >> That's all.
> > So that means the following SD patch is also wrong (+Philippe) which
> > changes the default value of capability register.
> > http://patchwork.ozlabs.org/project/qemu-devel/patch/20210623185921.24113-1-joannekoong@gmail.com/
>
>
> It should compat capareg for the old value for old machine types.

Yeah, it's already a property for the SD controller model but someone
views it as a bug because the model implements 64-bit but not
reporting it in the capability register.

>
>
> >
> > Can we get some agreement among maintainers?
>
>
> It's not about the agreement but about to have a stable ABI. I don't
> know the case for sd but e1000 is used in various  and we work hard to
> unbreak the migration compatibility among downstream versions. Git log
> on e1000.c will tell you more.

Agreement or stable ABI, whatever we call, but we should be in some consistency.

IMHO maintainers should reach an agreement to some extent on how
compatibility should be achieved. I just found silly to add a property
to fix a real bug in the model, and we preserve the bug all over
releases.

I can find plenty of examples in the current QEMU tree that were
accepted that changed the bugous register behavior, but it was not
asked to add new properties to keep the bugos behavior.

e.g.: commit ce8e43760e8e ("hw/net: fsl_etsec: Reverse the RCTRL.RSF logic")

Regards,
Bin


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-14  3:42                 ` Bin Meng
@ 2021-07-14  4:53                   ` Jason Wang
  2021-07-14  6:04                     ` Bin Meng
  2021-07-14  9:00                   ` Peter Maydell
  1 sibling, 1 reply; 20+ messages in thread
From: Jason Wang @ 2021-07-14  4:53 UTC (permalink / raw)
  To: Bin Meng
  Cc: Peter Maydell, Markus Carlstedt, Bin Meng,
	Philippe Mathieu-Daudé,
	qemu-devel@nongnu.org Developers, Christina Wang


在 2021/7/14 上午11:42, Bin Meng 写道:
> On Wed, Jul 14, 2021 at 11:10 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> 在 2021/7/13 下午5:11, Bin Meng 写道:
>>> On Tue, Jul 13, 2021 at 5:02 PM Jason Wang <jasowang@redhat.com> wrote:
>>>> 在 2021/7/13 下午4:36, Bin Meng 写道:
>>>>> On Tue, Jul 13, 2021 at 3:03 PM Jason Wang <jasowang@redhat.com> wrote:
>>>>>> 在 2021/7/13 上午7:06, Bin Meng 写道:
>>>>>>> On Mon, Jul 5, 2021 at 1:57 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>>>>>>>> On Mon, Jul 5, 2021 at 12:21 PM Jason Wang <jasowang@redhat.com> wrote:
>>>>>>>>> 在 2021/7/2 下午5:24, Bin Meng 写道:
>>>>>>>>>> From: Christina Wang <christina.wang@windriver.com>
>>>>>>>>>>
>>>>>>>>>> The initial value of VLAN Ether Type (VET) register is 0x8100, as per
>>>>>>>>>> the manual and real hardware.
>>>>>>>>>>
>>>>>>>>>> While Linux e1000 driver always writes VET register to 0x8100, it is
>>>>>>>>>> not always the case for everyone. Drivers relying on the reset value
>>>>>>>>>> of VET won't be able to transmit and receive VLAN frames in QEMU.
>>>>>>>>>>
>>>>>>>>>> Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
>>>>>>>>>> Signed-off-by: Christina Wang <christina.wang@windriver.com>
>>>>>>>>>> Signed-off-by: Bin Meng <bin.meng@windriver.com>
>>>>>>>>>> ---
>>>>>>>>>>
>>>>>>>>>> (no changes since v1)
>>>>>>>>>>
>>>>>>>>>>       hw/net/e1000.c | 2 ++
>>>>>>>>>>       1 file changed, 2 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>>>>>>>>>> index 4f75b44cfc..20cbba6411 100644
>>>>>>>>>> --- a/hw/net/e1000.c
>>>>>>>>>> +++ b/hw/net/e1000.c
>>>>>>>>>> @@ -29,6 +29,7 @@
>>>>>>>>>>       #include "hw/pci/pci.h"
>>>>>>>>>>       #include "hw/qdev-properties.h"
>>>>>>>>>>       #include "migration/vmstate.h"
>>>>>>>>>> +#include "net/eth.h"
>>>>>>>>>>       #include "net/net.h"
>>>>>>>>>>       #include "net/checksum.h"
>>>>>>>>>>       #include "sysemu/sysemu.h"
>>>>>>>>>> @@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
>>>>>>>>>>           [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
>>>>>>>>>>                       E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
>>>>>>>>>>                       E1000_MANC_RMCP_EN,
>>>>>>>>>> +    [VET]     = ETH_P_VLAN,
>>>>>>>>> I wonder if we need a compat flag for this, since we change the behavior.
>>>>>>>>>
>>>>>>>>> (See e1000_properties[])
>>>>>>>>>
>>>>>>>> No we don't need to since it does not break migration.
>>>>>>> Ping?
>>>>>> I admit migration "works" but it doesn't mean it's not broken. It
>>>>>> changes the guest visible default value of VET register, so it may break
>>>>>> things silently for the guest.
>>>>>>
>>>>>> For old machine types, we should stick the value to the one without this
>>>>>> fix.
>>>>> Could you please propose a solution on how to handle such a scenario
>>>>> in a generic way in QEMU? (+Peter)
>>>> Well, I think I've suggested you to have a look at how things is done in
>>>> for handling such compatibility in e1000_properties.
>>>>
>>>>
>>>>> The POR reset value is wrong in QEMU and has carried forward the wrong
>>>>> value for years, and correcting it to its right value needs to do
>>>>> what?
>>>> We should stick to the wrong behavior for old machine types.
>>>>
>>>> That's all.
>>> So that means the following SD patch is also wrong (+Philippe) which
>>> changes the default value of capability register.
>>> http://patchwork.ozlabs.org/project/qemu-devel/patch/20210623185921.24113-1-joannekoong@gmail.com/
>>
>> It should compat capareg for the old value for old machine types.
> Yeah, it's already a property for the SD controller model but someone
> views it as a bug because the model implements 64-bit but not
> reporting it in the capability register.
>
>>
>>> Can we get some agreement among maintainers?
>>
>> It's not about the agreement but about to have a stable ABI. I don't
>> know the case for sd but e1000 is used in various  and we work hard to
>> unbreak the migration compatibility among downstream versions. Git log
>> on e1000.c will tell you more.
> Agreement or stable ABI, whatever we call, but we should be in some consistency.
>
> IMHO maintainers should reach an agreement to some extent on how
> compatibility should be achieved. I just found silly to add a property
> to fix a real bug in the model, and we preserve the bug all over
> releases.


That's the price for the stable ABI. See one of my recent fix - 
d83f46d189 virtio-pci: compat page aligned ATS. It keeps the "buggy" 
behavior to unbreak the migration.


>
> I can find plenty of examples in the current QEMU tree that were
> accepted that changed the bugous register behavior, but it was not
> asked to add new properties to keep the bugos behavior.
>
> e.g.: commit ce8e43760e8e ("hw/net: fsl_etsec: Reverse the RCTRL.RSF logic")


I guess it's simply because fsl_etsec is not used in any 
distributions/production environments or the maintainer may just not 
notice things like this.

But for e1000(e), we should stick to a stable ABI for consistency. 
Otherwise it would be very tricky to fix them after we saw real issues. 
We had learnt a lot during the past decade.

Thanks


>
> Regards,
> Bin
>



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-14  4:53                   ` Jason Wang
@ 2021-07-14  6:04                     ` Bin Meng
  2021-07-14  8:40                       ` Jason Wang
  0 siblings, 1 reply; 20+ messages in thread
From: Bin Meng @ 2021-07-14  6:04 UTC (permalink / raw)
  To: Jason Wang
  Cc: Peter Maydell, Markus Carlstedt, Bin Meng,
	Philippe Mathieu-Daudé,
	qemu-devel@nongnu.org Developers, Christina Wang

On Wed, Jul 14, 2021 at 12:53 PM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2021/7/14 上午11:42, Bin Meng 写道:
> > On Wed, Jul 14, 2021 at 11:10 AM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> 在 2021/7/13 下午5:11, Bin Meng 写道:
> >>> On Tue, Jul 13, 2021 at 5:02 PM Jason Wang <jasowang@redhat.com> wrote:
> >>>> 在 2021/7/13 下午4:36, Bin Meng 写道:
> >>>>> On Tue, Jul 13, 2021 at 3:03 PM Jason Wang <jasowang@redhat.com> wrote:
> >>>>>> 在 2021/7/13 上午7:06, Bin Meng 写道:
> >>>>>>> On Mon, Jul 5, 2021 at 1:57 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> >>>>>>>> On Mon, Jul 5, 2021 at 12:21 PM Jason Wang <jasowang@redhat.com> wrote:
> >>>>>>>>> 在 2021/7/2 下午5:24, Bin Meng 写道:
> >>>>>>>>>> From: Christina Wang <christina.wang@windriver.com>
> >>>>>>>>>>
> >>>>>>>>>> The initial value of VLAN Ether Type (VET) register is 0x8100, as per
> >>>>>>>>>> the manual and real hardware.
> >>>>>>>>>>
> >>>>>>>>>> While Linux e1000 driver always writes VET register to 0x8100, it is
> >>>>>>>>>> not always the case for everyone. Drivers relying on the reset value
> >>>>>>>>>> of VET won't be able to transmit and receive VLAN frames in QEMU.
> >>>>>>>>>>
> >>>>>>>>>> Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
> >>>>>>>>>> Signed-off-by: Christina Wang <christina.wang@windriver.com>
> >>>>>>>>>> Signed-off-by: Bin Meng <bin.meng@windriver.com>
> >>>>>>>>>> ---
> >>>>>>>>>>
> >>>>>>>>>> (no changes since v1)
> >>>>>>>>>>
> >>>>>>>>>>       hw/net/e1000.c | 2 ++
> >>>>>>>>>>       1 file changed, 2 insertions(+)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> >>>>>>>>>> index 4f75b44cfc..20cbba6411 100644
> >>>>>>>>>> --- a/hw/net/e1000.c
> >>>>>>>>>> +++ b/hw/net/e1000.c
> >>>>>>>>>> @@ -29,6 +29,7 @@
> >>>>>>>>>>       #include "hw/pci/pci.h"
> >>>>>>>>>>       #include "hw/qdev-properties.h"
> >>>>>>>>>>       #include "migration/vmstate.h"
> >>>>>>>>>> +#include "net/eth.h"
> >>>>>>>>>>       #include "net/net.h"
> >>>>>>>>>>       #include "net/checksum.h"
> >>>>>>>>>>       #include "sysemu/sysemu.h"
> >>>>>>>>>> @@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
> >>>>>>>>>>           [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
> >>>>>>>>>>                       E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
> >>>>>>>>>>                       E1000_MANC_RMCP_EN,
> >>>>>>>>>> +    [VET]     = ETH_P_VLAN,
> >>>>>>>>> I wonder if we need a compat flag for this, since we change the behavior.
> >>>>>>>>>
> >>>>>>>>> (See e1000_properties[])
> >>>>>>>>>
> >>>>>>>> No we don't need to since it does not break migration.
> >>>>>>> Ping?
> >>>>>> I admit migration "works" but it doesn't mean it's not broken. It
> >>>>>> changes the guest visible default value of VET register, so it may break
> >>>>>> things silently for the guest.
> >>>>>>
> >>>>>> For old machine types, we should stick the value to the one without this
> >>>>>> fix.
> >>>>> Could you please propose a solution on how to handle such a scenario
> >>>>> in a generic way in QEMU? (+Peter)
> >>>> Well, I think I've suggested you to have a look at how things is done in
> >>>> for handling such compatibility in e1000_properties.
> >>>>
> >>>>
> >>>>> The POR reset value is wrong in QEMU and has carried forward the wrong
> >>>>> value for years, and correcting it to its right value needs to do
> >>>>> what?
> >>>> We should stick to the wrong behavior for old machine types.
> >>>>
> >>>> That's all.
> >>> So that means the following SD patch is also wrong (+Philippe) which
> >>> changes the default value of capability register.
> >>> http://patchwork.ozlabs.org/project/qemu-devel/patch/20210623185921.24113-1-joannekoong@gmail.com/
> >>
> >> It should compat capareg for the old value for old machine types.
> > Yeah, it's already a property for the SD controller model but someone
> > views it as a bug because the model implements 64-bit but not
> > reporting it in the capability register.
> >
> >>
> >>> Can we get some agreement among maintainers?
> >>
> >> It's not about the agreement but about to have a stable ABI. I don't
> >> know the case for sd but e1000 is used in various  and we work hard to
> >> unbreak the migration compatibility among downstream versions. Git log
> >> on e1000.c will tell you more.
> > Agreement or stable ABI, whatever we call, but we should be in some consistency.
> >
> > IMHO maintainers should reach an agreement to some extent on how
> > compatibility should be achieved. I just found silly to add a property
> > to fix a real bug in the model, and we preserve the bug all over
> > releases.
>
>
> That's the price for the stable ABI. See one of my recent fix -
> d83f46d189 virtio-pci: compat page aligned ATS. It keeps the "buggy"
> behavior to unbreak the migration.
>

But this series does not break the migration, as we discussed in the
previous thread.

> >
> > I can find plenty of examples in the current QEMU tree that were
> > accepted that changed the bugous register behavior, but it was not
> > asked to add new properties to keep the bugos behavior.
> >
> > e.g.: commit ce8e43760e8e ("hw/net: fsl_etsec: Reverse the RCTRL.RSF logic")
>
>
> I guess it's simply because fsl_etsec is not used in any
> distributions/production environments or the maintainer may just not
> notice things like this.
>
> But for e1000(e), we should stick to a stable ABI for consistency.
> Otherwise it would be very tricky to fix them after we saw real issues.
> We had learnt a lot during the past decade.
>

Okay, do we have such a kind of widely used device model list? And we
should document such a process that we should keep compatibility on
these devices as well.

Regarding this VET register, do you know what guest relies on the POR
value which is zero? Zero is not a valid ethernet VLAN type. I don't
think changing this will break any guests. The commit message says
Linux e1000(e) driver just rewrites this register by itself. Given
that's probably the most commonly used OS, it's not strange this bug
gets unnoticed for years.

Regards,
Bin


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-14  6:04                     ` Bin Meng
@ 2021-07-14  8:40                       ` Jason Wang
  2021-07-14  9:05                         ` Bin Meng
  0 siblings, 1 reply; 20+ messages in thread
From: Jason Wang @ 2021-07-14  8:40 UTC (permalink / raw)
  To: Bin Meng
  Cc: Peter Maydell, Markus Carlstedt, Bin Meng,
	Philippe Mathieu-Daudé,
	qemu-devel@nongnu.org Developers, Christina Wang


在 2021/7/14 下午2:04, Bin Meng 写道:
> On Wed, Jul 14, 2021 at 12:53 PM Jason Wang <jasowang@redhat.com> wrote:
>>
>> 在 2021/7/14 上午11:42, Bin Meng 写道:
>>> On Wed, Jul 14, 2021 at 11:10 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> 在 2021/7/13 下午5:11, Bin Meng 写道:
>>>>> On Tue, Jul 13, 2021 at 5:02 PM Jason Wang <jasowang@redhat.com> wrote:
>>>>>> 在 2021/7/13 下午4:36, Bin Meng 写道:
>>>>>>> On Tue, Jul 13, 2021 at 3:03 PM Jason Wang <jasowang@redhat.com> wrote:
>>>>>>>> 在 2021/7/13 上午7:06, Bin Meng 写道:
>>>>>>>>> On Mon, Jul 5, 2021 at 1:57 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>>>>>>>>>> On Mon, Jul 5, 2021 at 12:21 PM Jason Wang <jasowang@redhat.com> wrote:
>>>>>>>>>>> 在 2021/7/2 下午5:24, Bin Meng 写道:
>>>>>>>>>>>> From: Christina Wang <christina.wang@windriver.com>
>>>>>>>>>>>>
>>>>>>>>>>>> The initial value of VLAN Ether Type (VET) register is 0x8100, as per
>>>>>>>>>>>> the manual and real hardware.
>>>>>>>>>>>>
>>>>>>>>>>>> While Linux e1000 driver always writes VET register to 0x8100, it is
>>>>>>>>>>>> not always the case for everyone. Drivers relying on the reset value
>>>>>>>>>>>> of VET won't be able to transmit and receive VLAN frames in QEMU.
>>>>>>>>>>>>
>>>>>>>>>>>> Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
>>>>>>>>>>>> Signed-off-by: Christina Wang <christina.wang@windriver.com>
>>>>>>>>>>>> Signed-off-by: Bin Meng <bin.meng@windriver.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>>
>>>>>>>>>>>> (no changes since v1)
>>>>>>>>>>>>
>>>>>>>>>>>>        hw/net/e1000.c | 2 ++
>>>>>>>>>>>>        1 file changed, 2 insertions(+)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>>>>>>>>>>>> index 4f75b44cfc..20cbba6411 100644
>>>>>>>>>>>> --- a/hw/net/e1000.c
>>>>>>>>>>>> +++ b/hw/net/e1000.c
>>>>>>>>>>>> @@ -29,6 +29,7 @@
>>>>>>>>>>>>        #include "hw/pci/pci.h"
>>>>>>>>>>>>        #include "hw/qdev-properties.h"
>>>>>>>>>>>>        #include "migration/vmstate.h"
>>>>>>>>>>>> +#include "net/eth.h"
>>>>>>>>>>>>        #include "net/net.h"
>>>>>>>>>>>>        #include "net/checksum.h"
>>>>>>>>>>>>        #include "sysemu/sysemu.h"
>>>>>>>>>>>> @@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
>>>>>>>>>>>>            [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
>>>>>>>>>>>>                        E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
>>>>>>>>>>>>                        E1000_MANC_RMCP_EN,
>>>>>>>>>>>> +    [VET]     = ETH_P_VLAN,
>>>>>>>>>>> I wonder if we need a compat flag for this, since we change the behavior.
>>>>>>>>>>>
>>>>>>>>>>> (See e1000_properties[])
>>>>>>>>>>>
>>>>>>>>>> No we don't need to since it does not break migration.
>>>>>>>>> Ping?
>>>>>>>> I admit migration "works" but it doesn't mean it's not broken. It
>>>>>>>> changes the guest visible default value of VET register, so it may break
>>>>>>>> things silently for the guest.
>>>>>>>>
>>>>>>>> For old machine types, we should stick the value to the one without this
>>>>>>>> fix.
>>>>>>> Could you please propose a solution on how to handle such a scenario
>>>>>>> in a generic way in QEMU? (+Peter)
>>>>>> Well, I think I've suggested you to have a look at how things is done in
>>>>>> for handling such compatibility in e1000_properties.
>>>>>>
>>>>>>
>>>>>>> The POR reset value is wrong in QEMU and has carried forward the wrong
>>>>>>> value for years, and correcting it to its right value needs to do
>>>>>>> what?
>>>>>> We should stick to the wrong behavior for old machine types.
>>>>>>
>>>>>> That's all.
>>>>> So that means the following SD patch is also wrong (+Philippe) which
>>>>> changes the default value of capability register.
>>>>> http://patchwork.ozlabs.org/project/qemu-devel/patch/20210623185921.24113-1-joannekoong@gmail.com/
>>>> It should compat capareg for the old value for old machine types.
>>> Yeah, it's already a property for the SD controller model but someone
>>> views it as a bug because the model implements 64-bit but not
>>> reporting it in the capability register.
>>>
>>>>> Can we get some agreement among maintainers?
>>>> It's not about the agreement but about to have a stable ABI. I don't
>>>> know the case for sd but e1000 is used in various  and we work hard to
>>>> unbreak the migration compatibility among downstream versions. Git log
>>>> on e1000.c will tell you more.
>>> Agreement or stable ABI, whatever we call, but we should be in some consistency.
>>>
>>> IMHO maintainers should reach an agreement to some extent on how
>>> compatibility should be achieved. I just found silly to add a property
>>> to fix a real bug in the model, and we preserve the bug all over
>>> releases.
>>
>> That's the price for the stable ABI. See one of my recent fix -
>> d83f46d189 virtio-pci: compat page aligned ATS. It keeps the "buggy"
>> behavior to unbreak the migration.
>>
> But this series does not break the migration, as we discussed in the
> previous thread.


It actually did,

(qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x104 read:
     0 device: 20 cmask: ff wmask: 0 w1cmask:0

Since the register is RO.


>
>>> I can find plenty of examples in the current QEMU tree that were
>>> accepted that changed the bugous register behavior, but it was not
>>> asked to add new properties to keep the bugos behavior.
>>>
>>> e.g.: commit ce8e43760e8e ("hw/net: fsl_etsec: Reverse the RCTRL.RSF logic")
>>
>> I guess it's simply because fsl_etsec is not used in any
>> distributions/production environments or the maintainer may just not
>> notice things like this.
>>
>> But for e1000(e), we should stick to a stable ABI for consistency.
>> Otherwise it would be very tricky to fix them after we saw real issues.
>> We had learnt a lot during the past decade.
>>
> Okay, do we have such a kind of widely used device model list? And we
> should document such a process that we should keep compatibility on
> these devices as well.


I can only say for networking devices:

- e1000(e)
- rtl8139
- virtio-net


>
> Regarding this VET register, do you know what guest relies on the POR
> value which is zero?


I don't know.


> Zero is not a valid ethernet VLAN type. I don't
> think changing this will break any guests.


You might be right here, but it would be late if we find it breaks any one.

If it's not a lot of work, I tend to bother with compat stuffs for this.

Thanks


> The commit message says
> Linux e1000(e) driver just rewrites this register by itself. Given
> that's probably the most commonly used OS, it's not strange this bug
> gets unnoticed for years.
>
> Regards,
> Bin
>



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-14  3:42                 ` Bin Meng
  2021-07-14  4:53                   ` Jason Wang
@ 2021-07-14  9:00                   ` Peter Maydell
  2021-07-14  9:14                     ` Bin Meng
  1 sibling, 1 reply; 20+ messages in thread
From: Peter Maydell @ 2021-07-14  9:00 UTC (permalink / raw)
  To: Bin Meng
  Cc: Markus Carlstedt, Jason Wang, Bin Meng,
	Philippe Mathieu-Daudé,
	qemu-devel@nongnu.org Developers, Christina Wang

On Wed, 14 Jul 2021 at 04:42, Bin Meng <bmeng.cn@gmail.com> wrote:
>
> On Wed, Jul 14, 2021 at 11:10 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> >
> > 在 2021/7/13 下午5:11, Bin Meng 写道:
> > > Can we get some agreement among maintainers?
> >
> >
> > It's not about the agreement but about to have a stable ABI. I don't
> > know the case for sd but e1000 is used in various  and we work hard to
> > unbreak the migration compatibility among downstream versions. Git log
> > on e1000.c will tell you more.
>
> Agreement or stable ABI, whatever we call, but we should be in some consistency.
>
> IMHO maintainers should reach an agreement to some extent on how
> compatibility should be achieved. I just found silly to add a property
> to fix a real bug in the model, and we preserve the bug all over
> releases.
>
> I can find plenty of examples in the current QEMU tree that were
> accepted that changed the bugous register behavior, but it was not
> asked to add new properties to keep the bugos behavior.
>
> e.g.: commit ce8e43760e8e ("hw/net: fsl_etsec: Reverse the RCTRL.RSF logic")

There is basically a judgement call going on here about whether the
device is "significant enough" that it's worth the effort of
preserving back-compatibility of bugs.

There is at least one clear rule: if the device isn't used by any
machine with a versioned machine type, then there is no need to
provide back-compatibility of old buggy behaviour. (There would
be no way for the user to select the old behaviour by choosing a
-4.2 machine type.) This is why the fsl_etsec device doesn't need
to handle that.

For PCI devices it's a bit fuzzier because in theory you can plug
any PCI card into a versioned x86 PC machine.

We don't want to preserve bug-compatibility for absolutely
everything, because the codebase would quickly clog up with weird
behaviour that we never test and which is not of any use to users
either. So you have to look at:
 * how easy is providing the back-compat?
 * how commonly used in production is the device?
 * how likely is it that guests might care?
 * would the behaviour change cause odd behaviour across
   a cross-version migration from the old QEMU?

Migration compat is similar, but not quite the same because the
compatibility handling tends to be less invasive, so we take the
"provide compat" choice more often. For non-versioned machine types,
again, we're OK with breaking back-compat as long as we bump a
migration version number so the user gets an error message rather
than weird behaviour.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-14  8:40                       ` Jason Wang
@ 2021-07-14  9:05                         ` Bin Meng
  2021-07-14  9:17                           ` Jason Wang
  0 siblings, 1 reply; 20+ messages in thread
From: Bin Meng @ 2021-07-14  9:05 UTC (permalink / raw)
  To: Jason Wang
  Cc: Peter Maydell, Markus Carlstedt, Bin Meng,
	Philippe Mathieu-Daudé,
	qemu-devel@nongnu.org Developers, Christina Wang

On Wed, Jul 14, 2021 at 4:40 PM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2021/7/14 下午2:04, Bin Meng 写道:
> > On Wed, Jul 14, 2021 at 12:53 PM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> 在 2021/7/14 上午11:42, Bin Meng 写道:
> >>> On Wed, Jul 14, 2021 at 11:10 AM Jason Wang <jasowang@redhat.com> wrote:
> >>>> 在 2021/7/13 下午5:11, Bin Meng 写道:
> >>>>> On Tue, Jul 13, 2021 at 5:02 PM Jason Wang <jasowang@redhat.com> wrote:
> >>>>>> 在 2021/7/13 下午4:36, Bin Meng 写道:
> >>>>>>> On Tue, Jul 13, 2021 at 3:03 PM Jason Wang <jasowang@redhat.com> wrote:
> >>>>>>>> 在 2021/7/13 上午7:06, Bin Meng 写道:
> >>>>>>>>> On Mon, Jul 5, 2021 at 1:57 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> >>>>>>>>>> On Mon, Jul 5, 2021 at 12:21 PM Jason Wang <jasowang@redhat.com> wrote:
> >>>>>>>>>>> 在 2021/7/2 下午5:24, Bin Meng 写道:
> >>>>>>>>>>>> From: Christina Wang <christina.wang@windriver.com>
> >>>>>>>>>>>>
> >>>>>>>>>>>> The initial value of VLAN Ether Type (VET) register is 0x8100, as per
> >>>>>>>>>>>> the manual and real hardware.
> >>>>>>>>>>>>
> >>>>>>>>>>>> While Linux e1000 driver always writes VET register to 0x8100, it is
> >>>>>>>>>>>> not always the case for everyone. Drivers relying on the reset value
> >>>>>>>>>>>> of VET won't be able to transmit and receive VLAN frames in QEMU.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
> >>>>>>>>>>>> Signed-off-by: Christina Wang <christina.wang@windriver.com>
> >>>>>>>>>>>> Signed-off-by: Bin Meng <bin.meng@windriver.com>
> >>>>>>>>>>>> ---
> >>>>>>>>>>>>
> >>>>>>>>>>>> (no changes since v1)
> >>>>>>>>>>>>
> >>>>>>>>>>>>        hw/net/e1000.c | 2 ++
> >>>>>>>>>>>>        1 file changed, 2 insertions(+)
> >>>>>>>>>>>>
> >>>>>>>>>>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> >>>>>>>>>>>> index 4f75b44cfc..20cbba6411 100644
> >>>>>>>>>>>> --- a/hw/net/e1000.c
> >>>>>>>>>>>> +++ b/hw/net/e1000.c
> >>>>>>>>>>>> @@ -29,6 +29,7 @@
> >>>>>>>>>>>>        #include "hw/pci/pci.h"
> >>>>>>>>>>>>        #include "hw/qdev-properties.h"
> >>>>>>>>>>>>        #include "migration/vmstate.h"
> >>>>>>>>>>>> +#include "net/eth.h"
> >>>>>>>>>>>>        #include "net/net.h"
> >>>>>>>>>>>>        #include "net/checksum.h"
> >>>>>>>>>>>>        #include "sysemu/sysemu.h"
> >>>>>>>>>>>> @@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
> >>>>>>>>>>>>            [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
> >>>>>>>>>>>>                        E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
> >>>>>>>>>>>>                        E1000_MANC_RMCP_EN,
> >>>>>>>>>>>> +    [VET]     = ETH_P_VLAN,
> >>>>>>>>>>> I wonder if we need a compat flag for this, since we change the behavior.
> >>>>>>>>>>>
> >>>>>>>>>>> (See e1000_properties[])
> >>>>>>>>>>>
> >>>>>>>>>> No we don't need to since it does not break migration.
> >>>>>>>>> Ping?
> >>>>>>>> I admit migration "works" but it doesn't mean it's not broken. It
> >>>>>>>> changes the guest visible default value of VET register, so it may break
> >>>>>>>> things silently for the guest.
> >>>>>>>>
> >>>>>>>> For old machine types, we should stick the value to the one without this
> >>>>>>>> fix.
> >>>>>>> Could you please propose a solution on how to handle such a scenario
> >>>>>>> in a generic way in QEMU? (+Peter)
> >>>>>> Well, I think I've suggested you to have a look at how things is done in
> >>>>>> for handling such compatibility in e1000_properties.
> >>>>>>
> >>>>>>
> >>>>>>> The POR reset value is wrong in QEMU and has carried forward the wrong
> >>>>>>> value for years, and correcting it to its right value needs to do
> >>>>>>> what?
> >>>>>> We should stick to the wrong behavior for old machine types.
> >>>>>>
> >>>>>> That's all.
> >>>>> So that means the following SD patch is also wrong (+Philippe) which
> >>>>> changes the default value of capability register.
> >>>>> http://patchwork.ozlabs.org/project/qemu-devel/patch/20210623185921.24113-1-joannekoong@gmail.com/
> >>>> It should compat capareg for the old value for old machine types.
> >>> Yeah, it's already a property for the SD controller model but someone
> >>> views it as a bug because the model implements 64-bit but not
> >>> reporting it in the capability register.
> >>>
> >>>>> Can we get some agreement among maintainers?
> >>>> It's not about the agreement but about to have a stable ABI. I don't
> >>>> know the case for sd but e1000 is used in various  and we work hard to
> >>>> unbreak the migration compatibility among downstream versions. Git log
> >>>> on e1000.c will tell you more.
> >>> Agreement or stable ABI, whatever we call, but we should be in some consistency.
> >>>
> >>> IMHO maintainers should reach an agreement to some extent on how
> >>> compatibility should be achieved. I just found silly to add a property
> >>> to fix a real bug in the model, and we preserve the bug all over
> >>> releases.
> >>
> >> That's the price for the stable ABI. See one of my recent fix -
> >> d83f46d189 virtio-pci: compat page aligned ATS. It keeps the "buggy"
> >> behavior to unbreak the migration.
> >>
> > But this series does not break the migration, as we discussed in the
> > previous thread.
>
>
> It actually did,
>
> (qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x104 read:
>      0 device: 20 cmask: ff wmask: 0 w1cmask:0
>
> Since the register is RO.

Did you mean the VET register is read-only? I don't understand this as
the manual says it's RW and Linux driver programs it to 0x8100.

>
>
> >
> >>> I can find plenty of examples in the current QEMU tree that were
> >>> accepted that changed the bugous register behavior, but it was not
> >>> asked to add new properties to keep the bugos behavior.
> >>>
> >>> e.g.: commit ce8e43760e8e ("hw/net: fsl_etsec: Reverse the RCTRL.RSF logic")
> >>
> >> I guess it's simply because fsl_etsec is not used in any
> >> distributions/production environments or the maintainer may just not
> >> notice things like this.
> >>
> >> But for e1000(e), we should stick to a stable ABI for consistency.
> >> Otherwise it would be very tricky to fix them after we saw real issues.
> >> We had learnt a lot during the past decade.
> >>
> > Okay, do we have such a kind of widely used device model list? And we
> > should document such a process that we should keep compatibility on
> > these devices as well.
>
>
> I can only say for networking devices:
>
> - e1000(e)
> - rtl8139
> - virtio-net
>

Please, at least have these documented as a development process.

> >
> > Regarding this VET register, do you know what guest relies on the POR
> > value which is zero?
>
>
> I don't know.
>
>
> > Zero is not a valid ethernet VLAN type. I don't
> > think changing this will break any guests.
>
>
> You might be right here, but it would be late if we find it breaks any one.
>
> If it's not a lot of work, I tend to bother with compat stuffs for this.
>

Regards,
Bin


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-14  9:00                   ` Peter Maydell
@ 2021-07-14  9:14                     ` Bin Meng
  2021-07-14  9:24                       ` Jason Wang
  0 siblings, 1 reply; 20+ messages in thread
From: Bin Meng @ 2021-07-14  9:14 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Markus Carlstedt, Jason Wang, Bin Meng,
	Philippe Mathieu-Daudé,
	qemu-devel@nongnu.org Developers, Christina Wang

On Wed, Jul 14, 2021 at 5:01 PM Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Wed, 14 Jul 2021 at 04:42, Bin Meng <bmeng.cn@gmail.com> wrote:
> >
> > On Wed, Jul 14, 2021 at 11:10 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > >
> > > 在 2021/7/13 下午5:11, Bin Meng 写道:
> > > > Can we get some agreement among maintainers?
> > >
> > >
> > > It's not about the agreement but about to have a stable ABI. I don't
> > > know the case for sd but e1000 is used in various  and we work hard to
> > > unbreak the migration compatibility among downstream versions. Git log
> > > on e1000.c will tell you more.
> >
> > Agreement or stable ABI, whatever we call, but we should be in some consistency.
> >
> > IMHO maintainers should reach an agreement to some extent on how
> > compatibility should be achieved. I just found silly to add a property
> > to fix a real bug in the model, and we preserve the bug all over
> > releases.
> >
> > I can find plenty of examples in the current QEMU tree that were
> > accepted that changed the bugous register behavior, but it was not
> > asked to add new properties to keep the bugos behavior.
> >
> > e.g.: commit ce8e43760e8e ("hw/net: fsl_etsec: Reverse the RCTRL.RSF logic")
>
> There is basically a judgement call going on here about whether the
> device is "significant enough" that it's worth the effort of
> preserving back-compatibility of bugs.
>
> There is at least one clear rule: if the device isn't used by any
> machine with a versioned machine type, then there is no need to
> provide back-compatibility of old buggy behaviour. (There would
> be no way for the user to select the old behaviour by choosing a
> -4.2 machine type.) This is why the fsl_etsec device doesn't need
> to handle that.
>
> For PCI devices it's a bit fuzzier because in theory you can plug
> any PCI card into a versioned x86 PC machine.
>
> We don't want to preserve bug-compatibility for absolutely
> everything, because the codebase would quickly clog up with weird
> behaviour that we never test and which is not of any use to users
> either. So you have to look at:
>  * how easy is providing the back-compat?
>  * how commonly used in production is the device?
>  * how likely is it that guests might care?
>  * would the behaviour change cause odd behaviour across
>    a cross-version migration from the old QEMU?
>
> Migration compat is similar, but not quite the same because the
> compatibility handling tends to be less invasive, so we take the
> "provide compat" choice more often. For non-versioned machine types,
> again, we're OK with breaking back-compat as long as we bump a
> migration version number so the user gets an error message rather
> than weird behaviour.

Thanks Peter. I think the above information can be put in a doc in
docs/devel, and with some real examples to help developers on how to
achieve backward compatibility.

Regarding this series, as I mentioned in an earlier thread, I believe
the possibility of breaking a guest is very low. Adding a back-compat
check is not that hard either. Just not sure which factor weighs more.

Regards,
Bin


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-14  9:05                         ` Bin Meng
@ 2021-07-14  9:17                           ` Jason Wang
  0 siblings, 0 replies; 20+ messages in thread
From: Jason Wang @ 2021-07-14  9:17 UTC (permalink / raw)
  To: Bin Meng
  Cc: Peter Maydell, Markus Carlstedt, Bin Meng,
	Philippe Mathieu-Daudé,
	qemu-devel@nongnu.org Developers, Christina Wang


在 2021/7/14 下午5:05, Bin Meng 写道:
> On Wed, Jul 14, 2021 at 4:40 PM Jason Wang <jasowang@redhat.com> wrote:
>>
>> 在 2021/7/14 下午2:04, Bin Meng 写道:
>>> On Wed, Jul 14, 2021 at 12:53 PM Jason Wang <jasowang@redhat.com> wrote:
>>>> 在 2021/7/14 上午11:42, Bin Meng 写道:
>>>>> On Wed, Jul 14, 2021 at 11:10 AM Jason Wang <jasowang@redhat.com> wrote:
>>>>>> 在 2021/7/13 下午5:11, Bin Meng 写道:
>>>>>>> On Tue, Jul 13, 2021 at 5:02 PM Jason Wang <jasowang@redhat.com> wrote:
>>>>>>>> 在 2021/7/13 下午4:36, Bin Meng 写道:
>>>>>>>>> On Tue, Jul 13, 2021 at 3:03 PM Jason Wang <jasowang@redhat.com> wrote:
>>>>>>>>>> 在 2021/7/13 上午7:06, Bin Meng 写道:
>>>>>>>>>>> On Mon, Jul 5, 2021 at 1:57 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>>>>>>>>>>>> On Mon, Jul 5, 2021 at 12:21 PM Jason Wang <jasowang@redhat.com> wrote:
>>>>>>>>>>>>> 在 2021/7/2 下午5:24, Bin Meng 写道:
>>>>>>>>>>>>>> From: Christina Wang <christina.wang@windriver.com>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The initial value of VLAN Ether Type (VET) register is 0x8100, as per
>>>>>>>>>>>>>> the manual and real hardware.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> While Linux e1000 driver always writes VET register to 0x8100, it is
>>>>>>>>>>>>>> not always the case for everyone. Drivers relying on the reset value
>>>>>>>>>>>>>> of VET won't be able to transmit and receive VLAN frames in QEMU.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
>>>>>>>>>>>>>> Signed-off-by: Christina Wang <christina.wang@windriver.com>
>>>>>>>>>>>>>> Signed-off-by: Bin Meng <bin.meng@windriver.com>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (no changes since v1)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         hw/net/e1000.c | 2 ++
>>>>>>>>>>>>>>         1 file changed, 2 insertions(+)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>>>>>>>>>>>>>> index 4f75b44cfc..20cbba6411 100644
>>>>>>>>>>>>>> --- a/hw/net/e1000.c
>>>>>>>>>>>>>> +++ b/hw/net/e1000.c
>>>>>>>>>>>>>> @@ -29,6 +29,7 @@
>>>>>>>>>>>>>>         #include "hw/pci/pci.h"
>>>>>>>>>>>>>>         #include "hw/qdev-properties.h"
>>>>>>>>>>>>>>         #include "migration/vmstate.h"
>>>>>>>>>>>>>> +#include "net/eth.h"
>>>>>>>>>>>>>>         #include "net/net.h"
>>>>>>>>>>>>>>         #include "net/checksum.h"
>>>>>>>>>>>>>>         #include "sysemu/sysemu.h"
>>>>>>>>>>>>>> @@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
>>>>>>>>>>>>>>             [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
>>>>>>>>>>>>>>                         E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
>>>>>>>>>>>>>>                         E1000_MANC_RMCP_EN,
>>>>>>>>>>>>>> +    [VET]     = ETH_P_VLAN,
>>>>>>>>>>>>> I wonder if we need a compat flag for this, since we change the behavior.
>>>>>>>>>>>>>
>>>>>>>>>>>>> (See e1000_properties[])
>>>>>>>>>>>>>
>>>>>>>>>>>> No we don't need to since it does not break migration.
>>>>>>>>>>> Ping?
>>>>>>>>>> I admit migration "works" but it doesn't mean it's not broken. It
>>>>>>>>>> changes the guest visible default value of VET register, so it may break
>>>>>>>>>> things silently for the guest.
>>>>>>>>>>
>>>>>>>>>> For old machine types, we should stick the value to the one without this
>>>>>>>>>> fix.
>>>>>>>>> Could you please propose a solution on how to handle such a scenario
>>>>>>>>> in a generic way in QEMU? (+Peter)
>>>>>>>> Well, I think I've suggested you to have a look at how things is done in
>>>>>>>> for handling such compatibility in e1000_properties.
>>>>>>>>
>>>>>>>>
>>>>>>>>> The POR reset value is wrong in QEMU and has carried forward the wrong
>>>>>>>>> value for years, and correcting it to its right value needs to do
>>>>>>>>> what?
>>>>>>>> We should stick to the wrong behavior for old machine types.
>>>>>>>>
>>>>>>>> That's all.
>>>>>>> So that means the following SD patch is also wrong (+Philippe) which
>>>>>>> changes the default value of capability register.
>>>>>>> http://patchwork.ozlabs.org/project/qemu-devel/patch/20210623185921.24113-1-joannekoong@gmail.com/
>>>>>> It should compat capareg for the old value for old machine types.
>>>>> Yeah, it's already a property for the SD controller model but someone
>>>>> views it as a bug because the model implements 64-bit but not
>>>>> reporting it in the capability register.
>>>>>
>>>>>>> Can we get some agreement among maintainers?
>>>>>> It's not about the agreement but about to have a stable ABI. I don't
>>>>>> know the case for sd but e1000 is used in various  and we work hard to
>>>>>> unbreak the migration compatibility among downstream versions. Git log
>>>>>> on e1000.c will tell you more.
>>>>> Agreement or stable ABI, whatever we call, but we should be in some consistency.
>>>>>
>>>>> IMHO maintainers should reach an agreement to some extent on how
>>>>> compatibility should be achieved. I just found silly to add a property
>>>>> to fix a real bug in the model, and we preserve the bug all over
>>>>> releases.
>>>> That's the price for the stable ABI. See one of my recent fix -
>>>> d83f46d189 virtio-pci: compat page aligned ATS. It keeps the "buggy"
>>>> behavior to unbreak the migration.
>>>>
>>> But this series does not break the migration, as we discussed in the
>>> previous thread.
>>
>> It actually did,
>>
>> (qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x104 read:
>>       0 device: 20 cmask: ff wmask: 0 w1cmask:0
>>
>> Since the register is RO.
> Did you mean the VET register is read-only?


No, I meant for ATS.


> I don't understand this as
> the manual says it's RW and Linux driver programs it to 0x8100.


So ATS cap is read only so it can be validated by qemu.

VET is RW so qemu can't do similar checks.


>
>>
>>>>> I can find plenty of examples in the current QEMU tree that were
>>>>> accepted that changed the bugous register behavior, but it was not
>>>>> asked to add new properties to keep the bugos behavior.
>>>>>
>>>>> e.g.: commit ce8e43760e8e ("hw/net: fsl_etsec: Reverse the RCTRL.RSF logic")
>>>> I guess it's simply because fsl_etsec is not used in any
>>>> distributions/production environments or the maintainer may just not
>>>> notice things like this.
>>>>
>>>> But for e1000(e), we should stick to a stable ABI for consistency.
>>>> Otherwise it would be very tricky to fix them after we saw real issues.
>>>> We had learnt a lot during the past decade.
>>>>
>>> Okay, do we have such a kind of widely used device model list? And we
>>> should document such a process that we should keep compatibility on
>>> these devices as well.
>>
>> I can only say for networking devices:
>>
>> - e1000(e)
>> - rtl8139
>> - virtio-net
>>
> Please, at least have these documented as a development process.


See Peter's reply. We should take care of the devices that is involved 
in at least one versioned machine types.

If you wish, you can post a patch to clarify things in the doc.

Thanks


>
>>> Regarding this VET register, do you know what guest relies on the POR
>>> value which is zero?
>>
>> I don't know.
>>
>>
>>> Zero is not a valid ethernet VLAN type. I don't
>>> think changing this will break any guests.
>>
>> You might be right here, but it would be late if we find it breaks any one.
>>
>> If it's not a lot of work, I tend to bother with compat stuffs for this.
>>
> Regards,
> Bin
>



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
  2021-07-14  9:14                     ` Bin Meng
@ 2021-07-14  9:24                       ` Jason Wang
  0 siblings, 0 replies; 20+ messages in thread
From: Jason Wang @ 2021-07-14  9:24 UTC (permalink / raw)
  To: Bin Meng, Peter Maydell
  Cc: Bin Meng, Christina Wang, Philippe Mathieu-Daudé,
	Markus Carlstedt, qemu-devel@nongnu.org Developers


在 2021/7/14 下午5:14, Bin Meng 写道:
> On Wed, Jul 14, 2021 at 5:01 PM Peter Maydell <peter.maydell@linaro.org> wrote:
>> On Wed, 14 Jul 2021 at 04:42, Bin Meng <bmeng.cn@gmail.com> wrote:
>>> On Wed, Jul 14, 2021 at 11:10 AM Jason Wang <jasowang@redhat.com> wrote:
>>>>
>>>> 在 2021/7/13 下午5:11, Bin Meng 写道:
>>>>> Can we get some agreement among maintainers?
>>>>
>>>> It's not about the agreement but about to have a stable ABI. I don't
>>>> know the case for sd but e1000 is used in various  and we work hard to
>>>> unbreak the migration compatibility among downstream versions. Git log
>>>> on e1000.c will tell you more.
>>> Agreement or stable ABI, whatever we call, but we should be in some consistency.
>>>
>>> IMHO maintainers should reach an agreement to some extent on how
>>> compatibility should be achieved. I just found silly to add a property
>>> to fix a real bug in the model, and we preserve the bug all over
>>> releases.
>>>
>>> I can find plenty of examples in the current QEMU tree that were
>>> accepted that changed the bugous register behavior, but it was not
>>> asked to add new properties to keep the bugos behavior.
>>>
>>> e.g.: commit ce8e43760e8e ("hw/net: fsl_etsec: Reverse the RCTRL.RSF logic")
>> There is basically a judgement call going on here about whether the
>> device is "significant enough" that it's worth the effort of
>> preserving back-compatibility of bugs.
>>
>> There is at least one clear rule: if the device isn't used by any
>> machine with a versioned machine type, then there is no need to
>> provide back-compatibility of old buggy behaviour. (There would
>> be no way for the user to select the old behaviour by choosing a
>> -4.2 machine type.) This is why the fsl_etsec device doesn't need
>> to handle that.
>>
>> For PCI devices it's a bit fuzzier because in theory you can plug
>> any PCI card into a versioned x86 PC machine.
>>
>> We don't want to preserve bug-compatibility for absolutely
>> everything, because the codebase would quickly clog up with weird
>> behaviour that we never test and which is not of any use to users
>> either. So you have to look at:
>>   * how easy is providing the back-compat?
>>   * how commonly used in production is the device?
>>   * how likely is it that guests might care?
>>   * would the behaviour change cause odd behaviour across
>>     a cross-version migration from the old QEMU?
>>
>> Migration compat is similar, but not quite the same because the
>> compatibility handling tends to be less invasive, so we take the
>> "provide compat" choice more often. For non-versioned machine types,
>> again, we're OK with breaking back-compat as long as we bump a
>> migration version number so the user gets an error message rather
>> than weird behaviour.
> Thanks Peter. I think the above information can be put in a doc in
> docs/devel, and with some real examples to help developers on how to
> achieve backward compatibility.
>
> Regarding this series, as I mentioned in an earlier thread, I believe
> the possibility of breaking a guest is very low. Adding a back-compat
> check is not that hard either. Just not sure which factor weighs more.


Just few lines of code for the compatibility. Let's do that.

Thanks


>
> Regards,
> Bin
>



^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-07-14  9:26 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-02  9:24 [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register Bin Meng
2021-07-02  9:24 ` [PATCH v2 2/3] hw/net: e1000e: " Bin Meng
2021-07-02  9:24 ` [PATCH v2 3/3] hw/net: e1000e: Don't zero out the VLAN tag in the legacy RX descriptor Bin Meng
2021-07-05  4:21 ` [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register Jason Wang
2021-07-05  5:57   ` Bin Meng
2021-07-12 23:06     ` Bin Meng
2021-07-13  7:03       ` Jason Wang
2021-07-13  8:36         ` Bin Meng
2021-07-13  9:02           ` Jason Wang
2021-07-13  9:11             ` Bin Meng
2021-07-14  3:10               ` Jason Wang
2021-07-14  3:42                 ` Bin Meng
2021-07-14  4:53                   ` Jason Wang
2021-07-14  6:04                     ` Bin Meng
2021-07-14  8:40                       ` Jason Wang
2021-07-14  9:05                         ` Bin Meng
2021-07-14  9:17                           ` Jason Wang
2021-07-14  9:00                   ` Peter Maydell
2021-07-14  9:14                     ` Bin Meng
2021-07-14  9:24                       ` Jason Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.