* [PATCH 00/40] igb: Fix for DPDK
@ 2023-04-14 11:36 Akihiko Odaki
  2023-04-14 11:36 ` [PATCH 01/40] hw/net/net_tx_pkt: Decouple from PCI Akihiko Odaki
                   ` (39 more replies)
  0 siblings, 40 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:36 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

This series contains the fixes and feature additions needed for igb to
pass the DPDK Test Suite. It also includes a few minor networking-related
changes.

Patches 01-09 are bug fixes.
Patches 10-13 delete code that is unnecessary and is affected by later changes.
Patches 14-28 are minor changes.
Patches 29-38 implement new features.
Patches 39-40 update documentation.

Although this series contains many patches, they do not all need to land
at once. For example, the bug fix patches alone may be applied first.

Akihiko Odaki (40):
  hw/net/net_tx_pkt: Decouple from PCI
  e1000x: Fix BPRC and MPRC
  igb: Fix Rx packet type encoding
  igb: Include the second VLAN tag in the buffer
  igb: Do not require CTRL.VME for tx VLAN tagging
  net/net_rx_pkt: Use iovec for net_rx_pkt_set_protocols()
  e1000e: Always copy ethernet header
  igb: Always copy ethernet header
  Fix references to igb Avocado test
  tests/avocado: Remove unused imports
  tests/avocado: Remove test_igb_nomsi_kvm
  hw/net/net_tx_pkt: Remove net_rx_pkt_get_l4_info
  net/eth: Rename eth_setup_vlan_headers_ex
  e1000x: Share more Rx filtering logic
  e1000x: Take CRC into consideration for size check
  e1000e: Always log status after building rx metadata
  igb: Always log status after building rx metadata
  igb: Remove goto
  igb: Read DCMD.VLE of the first Tx descriptor
  e1000e: Reset packet state after emptying Tx queue
  vmxnet3: Reset packet state after emptying Tx queue
  igb: Add more definitions for Tx descriptor
  igb: Share common VF constants
  igb: Fix igb_mac_reg_init alignment
  net/eth: Use void pointers
  net/eth: Always add VLAN tag
  hw/net/net_rx_pkt: Enforce alignment for eth_header
  tests/qtest/libqos/igb: Set GPIE.Multiple_MSIX
  igb: Implement MSI-X single vector mode
  igb: Implement igb-specific oversize check
  igb: Use UDP for RSS hash
  igb: Implement Rx SCTP CSO
  igb: Implement Tx SCTP CSO
  igb: Strip the second VLAN tag for extended VLAN
  igb: Filter with the second VLAN tag for extended VLAN
  igb: Implement Rx PTP2 timestamp
  igb: Implement Tx timestamp
  vmxnet3: Do not depend on PC
  MAINTAINERS: Add a reviewer for network packet abstractions
  docs/system/devices/igb: Note igb is tested for DPDK

 MAINTAINERS                                   |   3 +-
 docs/system/devices/igb.rst                   |  14 +-
 hw/net/Kconfig                                |   2 +-
 hw/net/e1000.c                                |  41 +-
 hw/net/e1000e_core.c                          | 103 +---
 hw/net/e1000x_common.c                        |  73 ++-
 hw/net/e1000x_common.h                        |   9 +-
 hw/net/igb.c                                  |  10 +-
 hw/net/igb_common.h                           |  24 +-
 hw/net/igb_core.c                             | 471 +++++++++++-------
 hw/net/igb_regs.h                             |  61 ++-
 hw/net/igbvf.c                                |   7 -
 hw/net/net_rx_pkt.c                           | 107 ++--
 hw/net/net_rx_pkt.h                           |  38 +-
 hw/net/net_tx_pkt.c                           | 101 ++--
 hw/net/net_tx_pkt.h                           |  46 +-
 hw/net/trace-events                           |   4 +-
 hw/net/virtio-net.c                           |   7 +-
 hw/net/vmxnet3.c                              |  22 +-
 include/net/eth.h                             |  27 +-
 include/qemu/crc32c.h                         |   1 +
 net/eth.c                                     | 100 ++--
 .../org.centos/stream/8/x86_64/test-avocado   |   2 +-
 tests/avocado/netdev-ethtool.py               |  13 +-
 tests/qtest/libqos/igb.c                      |   1 +
 util/crc32c.c                                 |   8 +
 26 files changed, 747 insertions(+), 548 deletions(-)

-- 
2.40.0




* [PATCH 01/40] hw/net/net_tx_pkt: Decouple from PCI
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
@ 2023-04-14 11:36 ` Akihiko Odaki
  2023-04-14 14:23   ` Philippe Mathieu-Daudé
  2023-04-14 11:36 ` [PATCH 02/40] e1000x: Fix BPRC and MPRC Akihiko Odaki
                   ` (38 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:36 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

This also fixes the leak of memory mapping when the specified memory is
partially mapped.

Fixes: e263cd49c7 ("Packet abstraction for VMWARE network devices")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
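As a reading aid for the commit message above, here is a minimal,
self-contained sketch of the pattern this patch moves to: the packet
object only stores already mapped buffers, a bus-specific wrapper does the
mapping, and reset releases fragments through a caller-supplied callback.
Every name in it (struct Pkt, struct Bus, bus_map, bus_unmap and so on) is
a hypothetical stand-in, not a QEMU API. The toy bus can return a
shortened mapping, and the wrapper unmaps it before failing, which is the
case the commit message describes as a leak.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not QEMU APIs. */
#define MAX_FRAGS 4

typedef void (*FreeFrag)(void *context, void *base, size_t len);

struct Frag {
    void *base;
    size_t len;
};

struct Pkt {
    struct Frag frags[MAX_FRAGS];
    size_t nfrags;
};

/* Toy bus: only the first `limit` bytes of `mem` can be mapped. */
struct Bus {
    char mem[64];
    size_t limit;
    int live_maps;
};

/* Map up to *len bytes at addr; may shrink *len (a partial mapping). */
static void *bus_map(struct Bus *bus, size_t addr, size_t *len)
{
    if (addr >= bus->limit) {
        return NULL;
    }
    if (*len > bus->limit - addr) {
        *len = bus->limit - addr;
    }
    bus->live_maps++;
    return bus->mem + addr;
}

/* Has the FreeFrag signature so it can be handed to pkt_reset(). */
static void bus_unmap(void *context, void *base, size_t len)
{
    struct Bus *bus = context;

    (void)base;
    (void)len;
    bus->live_maps--;
}

/* Bus-agnostic: records a buffer that the caller has already mapped. */
static bool pkt_add_frag(struct Pkt *pkt, void *base, size_t len)
{
    assert(pkt);
    if (pkt->nfrags >= MAX_FRAGS) {
        return false;
    }
    pkt->frags[pkt->nfrags].base = base;
    pkt->frags[pkt->nfrags].len = len;
    pkt->nfrags++;
    return true;
}

/* Bus-specific wrapper: unmaps again if the mapping is short or rejected. */
static bool pkt_add_frag_bus(struct Pkt *pkt, struct Bus *bus,
                             size_t addr, size_t len)
{
    size_t mapped = len;
    void *base = bus_map(bus, addr, &mapped);

    if (!base) {
        return false;
    }
    if (mapped != len || !pkt_add_frag(pkt, base, len)) {
        bus_unmap(bus, base, mapped);
        return false;
    }
    return true;
}

/* Release every fragment through the caller-supplied callback. */
static void pkt_reset(struct Pkt *pkt, FreeFrag free_frag, void *context)
{
    for (size_t i = 0; i < pkt->nfrags; i++) {
        free_frag(context, pkt->frags[i].base, pkt->frags[i].len);
    }
    pkt->nfrags = 0;
}
```

A caller on this toy bus would use pkt_add_frag_bus(&pkt, &bus, addr, len)
per descriptor and pkt_reset(&pkt, bus_unmap, &bus) between packets. A
device on a different bus would supply its own wrapper and callback while
reusing pkt_add_frag() and pkt_reset() unchanged.
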
 hw/net/e1000e_core.c | 13 +++++----
 hw/net/igb_core.c    | 13 ++++-----
 hw/net/net_tx_pkt.c  | 65 +++++++++++++++++++++++---------------------
 hw/net/net_tx_pkt.h  | 38 +++++++++++++++++++-------
 hw/net/vmxnet3.c     | 14 +++++-----
 5 files changed, 83 insertions(+), 60 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index cfa3f55e96..15821a75e0 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -746,7 +746,8 @@ e1000e_process_tx_desc(E1000ECore *core,
     addr = le64_to_cpu(dp->buffer_addr);
 
     if (!tx->skip_cp) {
-        if (!net_tx_pkt_add_raw_fragment(tx->tx_pkt, addr, split_size)) {
+        if (!net_tx_pkt_add_raw_fragment_pci(tx->tx_pkt, core->owner,
+                                             addr, split_size)) {
             tx->skip_cp = true;
         }
     }
@@ -764,7 +765,7 @@ e1000e_process_tx_desc(E1000ECore *core,
         }
 
         tx->skip_cp = false;
-        net_tx_pkt_reset(tx->tx_pkt, core->owner);
+        net_tx_pkt_reset(tx->tx_pkt, net_tx_pkt_unmap_frag_pci, core->owner);
 
         tx->sum_needed = 0;
         tx->cptse = 0;
@@ -3421,7 +3422,7 @@ e1000e_core_pci_realize(E1000ECore     *core,
         qemu_add_vm_change_state_handler(e1000e_vm_state_change, core);
 
     for (i = 0; i < E1000E_NUM_QUEUES; i++) {
-        net_tx_pkt_init(&core->tx[i].tx_pkt, core->owner, E1000E_MAX_TX_FRAGS);
+        net_tx_pkt_init(&core->tx[i].tx_pkt, E1000E_MAX_TX_FRAGS);
     }
 
     net_rx_pkt_init(&core->rx_pkt);
@@ -3446,7 +3447,8 @@ e1000e_core_pci_uninit(E1000ECore *core)
     qemu_del_vm_change_state_handler(core->vmstate);
 
     for (i = 0; i < E1000E_NUM_QUEUES; i++) {
-        net_tx_pkt_reset(core->tx[i].tx_pkt, core->owner);
+        net_tx_pkt_reset(core->tx[i].tx_pkt,
+                         net_tx_pkt_unmap_frag_pci, core->owner);
         net_tx_pkt_uninit(core->tx[i].tx_pkt);
     }
 
@@ -3571,7 +3573,8 @@ static void e1000e_reset(E1000ECore *core, bool sw)
     e1000x_reset_mac_addr(core->owner_nic, core->mac, core->permanent_mac);
 
     for (i = 0; i < ARRAY_SIZE(core->tx); i++) {
-        net_tx_pkt_reset(core->tx[i].tx_pkt, core->owner);
+        net_tx_pkt_reset(core->tx[i].tx_pkt,
+                         net_tx_pkt_unmap_frag_pci, core->owner);
         memset(&core->tx[i].props, 0, sizeof(core->tx[i].props));
         core->tx[i].skip_cp = false;
     }
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 826e7a6cf1..abfdce9aaf 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -597,7 +597,8 @@ igb_process_tx_desc(IGBCore *core,
     length = cmd_type_len & 0xFFFF;
 
     if (!tx->skip_cp) {
-        if (!net_tx_pkt_add_raw_fragment(tx->tx_pkt, buffer_addr, length)) {
+        if (!net_tx_pkt_add_raw_fragment_pci(tx->tx_pkt, dev,
+                                             buffer_addr, length)) {
             tx->skip_cp = true;
         }
     }
@@ -616,7 +617,7 @@ igb_process_tx_desc(IGBCore *core,
 
         tx->first = true;
         tx->skip_cp = false;
-        net_tx_pkt_reset(tx->tx_pkt, dev);
+        net_tx_pkt_reset(tx->tx_pkt, net_tx_pkt_unmap_frag_pci, dev);
     }
 }
 
@@ -842,8 +843,6 @@ igb_start_xmit(IGBCore *core, const IGB_TxRing *txr)
         d = core->owner;
     }
 
-    net_tx_pkt_reset(txr->tx->tx_pkt, d);
-
     while (!igb_ring_empty(core, txi)) {
         base = igb_ring_head_descr(core, txi);
 
@@ -861,6 +860,8 @@ igb_start_xmit(IGBCore *core, const IGB_TxRing *txr)
         core->mac[EICR] |= eic;
         igb_set_interrupt_cause(core, E1000_ICR_TXDW);
     }
+
+    net_tx_pkt_reset(txr->tx->tx_pkt, net_tx_pkt_unmap_frag_pci, d);
 }
 
 static uint32_t
@@ -3954,7 +3955,7 @@ igb_core_pci_realize(IGBCore        *core,
     core->vmstate = qemu_add_vm_change_state_handler(igb_vm_state_change, core);
 
     for (i = 0; i < IGB_NUM_QUEUES; i++) {
-        net_tx_pkt_init(&core->tx[i].tx_pkt, NULL, E1000E_MAX_TX_FRAGS);
+        net_tx_pkt_init(&core->tx[i].tx_pkt, E1000E_MAX_TX_FRAGS);
     }
 
     net_rx_pkt_init(&core->rx_pkt);
@@ -3979,7 +3980,6 @@ igb_core_pci_uninit(IGBCore *core)
     qemu_del_vm_change_state_handler(core->vmstate);
 
     for (i = 0; i < IGB_NUM_QUEUES; i++) {
-        net_tx_pkt_reset(core->tx[i].tx_pkt, NULL);
         net_tx_pkt_uninit(core->tx[i].tx_pkt);
     }
 
@@ -4158,7 +4158,6 @@ static void igb_reset(IGBCore *core, bool sw)
 
     for (i = 0; i < ARRAY_SIZE(core->tx); i++) {
         tx = &core->tx[i];
-        net_tx_pkt_reset(tx->tx_pkt, NULL);
         memset(tx->ctx, 0, sizeof(tx->ctx));
         tx->first = true;
         tx->skip_cp = false;
diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 8dc8568ba2..cc36750c9b 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -16,12 +16,12 @@
  */
 
 #include "qemu/osdep.h"
-#include "net_tx_pkt.h"
 #include "net/eth.h"
 #include "net/checksum.h"
 #include "net/tap.h"
 #include "net/net.h"
 #include "hw/pci/pci_device.h"
+#include "net_tx_pkt.h"
 
 enum {
     NET_TX_PKT_VHDR_FRAG = 0,
@@ -32,8 +32,6 @@ enum {
 
 /* TX packet private context */
 struct NetTxPkt {
-    PCIDevice *pci_dev;
-
     struct virtio_net_hdr virt_hdr;
 
     struct iovec *raw;
@@ -59,13 +57,10 @@ struct NetTxPkt {
     uint8_t l4proto;
 };
 
-void net_tx_pkt_init(struct NetTxPkt **pkt, PCIDevice *pci_dev,
-    uint32_t max_frags)
+void net_tx_pkt_init(struct NetTxPkt **pkt, uint32_t max_frags)
 {
     struct NetTxPkt *p = g_malloc0(sizeof *p);
 
-    p->pci_dev = pci_dev;
-
     p->vec = g_new(struct iovec, max_frags + NET_TX_PKT_PL_START_FRAG);
 
     p->raw = g_new(struct iovec, max_frags);
@@ -384,10 +379,8 @@ void net_tx_pkt_setup_vlan_header_ex(struct NetTxPkt *pkt,
     }
 }
 
-bool net_tx_pkt_add_raw_fragment(struct NetTxPkt *pkt, hwaddr pa,
-    size_t len)
+bool net_tx_pkt_add_raw_fragment(struct NetTxPkt *pkt, void *base, size_t len)
 {
-    hwaddr mapped_len = 0;
     struct iovec *ventry;
     assert(pkt);
 
@@ -395,23 +388,12 @@ bool net_tx_pkt_add_raw_fragment(struct NetTxPkt *pkt, hwaddr pa,
         return false;
     }
 
-    if (!len) {
-        return true;
-     }
-
     ventry = &pkt->raw[pkt->raw_frags];
-    mapped_len = len;
-
-    ventry->iov_base = pci_dma_map(pkt->pci_dev, pa,
-                                   &mapped_len, DMA_DIRECTION_TO_DEVICE);
+    ventry->iov_base = base;
+    ventry->iov_len = len;
+    pkt->raw_frags++;
 
-    if ((ventry->iov_base != NULL) && (len == mapped_len)) {
-        ventry->iov_len = mapped_len;
-        pkt->raw_frags++;
-        return true;
-    } else {
-        return false;
-    }
+    return true;
 }
 
 bool net_tx_pkt_has_fragments(struct NetTxPkt *pkt)
@@ -445,7 +427,8 @@ void net_tx_pkt_dump(struct NetTxPkt *pkt)
 #endif
 }
 
-void net_tx_pkt_reset(struct NetTxPkt *pkt, PCIDevice *pci_dev)
+void net_tx_pkt_reset(struct NetTxPkt *pkt,
+                      NetTxPktFreeFrag callback, void *context)
 {
     int i;
 
@@ -465,17 +448,37 @@ void net_tx_pkt_reset(struct NetTxPkt *pkt, PCIDevice *pci_dev)
         assert(pkt->raw);
         for (i = 0; i < pkt->raw_frags; i++) {
             assert(pkt->raw[i].iov_base);
-            pci_dma_unmap(pkt->pci_dev, pkt->raw[i].iov_base,
-                          pkt->raw[i].iov_len, DMA_DIRECTION_TO_DEVICE, 0);
+            callback(context, pkt->raw[i].iov_base, pkt->raw[i].iov_len);
         }
     }
-    pkt->pci_dev = pci_dev;
     pkt->raw_frags = 0;
 
     pkt->hdr_len = 0;
     pkt->l4proto = 0;
 }
 
+void net_tx_pkt_unmap_frag_pci(void *context, void *base, size_t len)
+{
+    pci_dma_unmap(context, base, len, DMA_DIRECTION_TO_DEVICE, 0);
+}
+
+bool net_tx_pkt_add_raw_fragment_pci(struct NetTxPkt *pkt, PCIDevice *pci_dev,
+                                     dma_addr_t pa, size_t len)
+{
+    dma_addr_t mapped_len = len;
+    void *base = pci_dma_map(pci_dev, pa, &mapped_len, DMA_DIRECTION_TO_DEVICE);
+    if (!base) {
+        return false;
+    }
+
+    if (mapped_len != len || !net_tx_pkt_add_raw_fragment(pkt, base, len)) {
+        net_tx_pkt_unmap_frag_pci(pci_dev, base, mapped_len);
+        return false;
+    }
+
+    return true;
+}
+
 static void net_tx_pkt_do_sw_csum(struct NetTxPkt *pkt,
                                   struct iovec *iov, uint32_t iov_len,
                                   uint16_t csl)
@@ -697,7 +700,7 @@ static void net_tx_pkt_udp_fragment_fix(struct NetTxPkt *pkt,
 }
 
 static bool net_tx_pkt_do_sw_fragmentation(struct NetTxPkt *pkt,
-                                           NetTxPktCallback callback,
+                                           NetTxPktSend callback,
                                            void *context)
 {
     uint8_t gso_type = pkt->virt_hdr.gso_type & ~VIRTIO_NET_HDR_GSO_ECN;
@@ -794,7 +797,7 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, NetClientState *nc)
 }
 
 bool net_tx_pkt_send_custom(struct NetTxPkt *pkt, bool offload,
-                            NetTxPktCallback callback, void *context)
+                            NetTxPktSend callback, void *context)
 {
     assert(pkt);
 
diff --git a/hw/net/net_tx_pkt.h b/hw/net/net_tx_pkt.h
index e5ce6f20bc..f5cd44da6f 100644
--- a/hw/net/net_tx_pkt.h
+++ b/hw/net/net_tx_pkt.h
@@ -26,17 +26,16 @@
 
 struct NetTxPkt;
 
-typedef void (* NetTxPktCallback)(void *, const struct iovec *, int, const struct iovec *, int);
+typedef void (*NetTxPktFreeFrag)(void *, void *, size_t);
+typedef void (*NetTxPktSend)(void *, const struct iovec *, int, const struct iovec *, int);
 
 /**
  * Init function for tx packet functionality
  *
  * @pkt:            packet pointer
- * @pci_dev:        PCI device processing this packet
  * @max_frags:      max tx ip fragments
  */
-void net_tx_pkt_init(struct NetTxPkt **pkt, PCIDevice *pci_dev,
-    uint32_t max_frags);
+void net_tx_pkt_init(struct NetTxPkt **pkt, uint32_t max_frags);
 
 /**
  * Clean all tx packet resources.
@@ -95,12 +94,11 @@ net_tx_pkt_setup_vlan_header(struct NetTxPkt *pkt, uint16_t vlan)
  * populate data fragment into pkt context.
  *
  * @pkt:            packet
- * @pa:             physical address of fragment
+ * @base:           pointer to fragment
  * @len:            length of fragment
  *
  */
-bool net_tx_pkt_add_raw_fragment(struct NetTxPkt *pkt, hwaddr pa,
-    size_t len);
+bool net_tx_pkt_add_raw_fragment(struct NetTxPkt *pkt, void *base, size_t len);
 
 /**
  * Fix ip header fields and calculate IP header and pseudo header checksums.
@@ -148,10 +146,30 @@ void net_tx_pkt_dump(struct NetTxPkt *pkt);
  * reset tx packet private context (needed to be called between packets)
  *
  * @pkt:            packet
- * @dev:            PCI device processing the next packet
+ * @callback:       function to free the fragments
+ * @context:        pointer to be passed to the callback
+ */
+void net_tx_pkt_reset(struct NetTxPkt *pkt,
+                      NetTxPktFreeFrag callback, void *context);
+
+/**
+ * Unmap a fragment mapped from a PCI device.
  *
+ * @context:        PCI device owning fragment
+ * @base:           pointer to fragment
+ * @len:            length of fragment
+ */
+void net_tx_pkt_unmap_frag_pci(void *context, void *base, size_t len);
+
+/**
+ * map data fragment from PCI device and populate it into pkt context.
+ *
+ * @pci_dev:        PCI device owning fragment
+ * @pa:             physical address of fragment
+ * @len:            length of fragment
  */
-void net_tx_pkt_reset(struct NetTxPkt *pkt, PCIDevice *dev);
+bool net_tx_pkt_add_raw_fragment_pci(struct NetTxPkt *pkt, PCIDevice *pci_dev,
+                                     dma_addr_t pa, size_t len);
 
 /**
  * Send packet to qemu. handles sw offloads if vhdr is not supported.
@@ -173,7 +191,7 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, NetClientState *nc);
  * @ret:            operation result
  */
 bool net_tx_pkt_send_custom(struct NetTxPkt *pkt, bool offload,
-                            NetTxPktCallback callback, void *context);
+                            NetTxPktSend callback, void *context);
 
 /**
  * parse raw packet data and analyze offload requirements.
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index f7b874c139..9acff310e7 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -651,9 +651,8 @@ static void vmxnet3_process_tx_queue(VMXNET3State *s, int qidx)
             data_len = (txd.len > 0) ? txd.len : VMXNET3_MAX_TX_BUF_SIZE;
             data_pa = txd.addr;
 
-            if (!net_tx_pkt_add_raw_fragment(s->tx_pkt,
-                                                data_pa,
-                                                data_len)) {
+            if (!net_tx_pkt_add_raw_fragment_pci(s->tx_pkt, PCI_DEVICE(s),
+                                                 data_pa, data_len)) {
                 s->skip_current_tx_pkt = true;
             }
         }
@@ -678,7 +677,8 @@ static void vmxnet3_process_tx_queue(VMXNET3State *s, int qidx)
             vmxnet3_complete_packet(s, qidx, txd_idx);
             s->tx_sop = true;
             s->skip_current_tx_pkt = false;
-            net_tx_pkt_reset(s->tx_pkt, PCI_DEVICE(s));
+            net_tx_pkt_reset(s->tx_pkt,
+                             net_tx_pkt_unmap_frag_pci, PCI_DEVICE(s));
         }
     }
 }
@@ -1159,7 +1159,7 @@ static void vmxnet3_deactivate_device(VMXNET3State *s)
 {
     if (s->device_active) {
         VMW_CBPRN("Deactivating vmxnet3...");
-        net_tx_pkt_reset(s->tx_pkt, PCI_DEVICE(s));
+        net_tx_pkt_reset(s->tx_pkt, net_tx_pkt_unmap_frag_pci, PCI_DEVICE(s));
         net_tx_pkt_uninit(s->tx_pkt);
         net_rx_pkt_uninit(s->rx_pkt);
         s->device_active = false;
@@ -1519,7 +1519,7 @@ static void vmxnet3_activate_device(VMXNET3State *s)
 
     /* Preallocate TX packet wrapper */
     VMW_CFPRN("Max TX fragments is %u", s->max_tx_frags);
-    net_tx_pkt_init(&s->tx_pkt, PCI_DEVICE(s), s->max_tx_frags);
+    net_tx_pkt_init(&s->tx_pkt, s->max_tx_frags);
     net_rx_pkt_init(&s->rx_pkt);
 
     /* Read rings memory locations for RX queues */
@@ -2399,7 +2399,7 @@ static int vmxnet3_post_load(void *opaque, int version_id)
 {
     VMXNET3State *s = opaque;
 
-    net_tx_pkt_init(&s->tx_pkt, PCI_DEVICE(s), s->max_tx_frags);
+    net_tx_pkt_init(&s->tx_pkt, s->max_tx_frags);
     net_rx_pkt_init(&s->rx_pkt);
 
     if (s->msix_used) {
-- 
2.40.0




* [PATCH 02/40] e1000x: Fix BPRC and MPRC
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
  2023-04-14 11:36 ` [PATCH 01/40] hw/net/net_tx_pkt: Decouple from PCI Akihiko Odaki
@ 2023-04-14 11:36 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 03/40] igb: Fix Rx packet type encoding Akihiko Odaki
                   ` (37 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:36 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Before this change, e1000 and the common code updated BPRC and MPRC
depending on the matched filter, but e1000e and igb decided to update
those counters by deriving the packet type independently. This
inconsistency caused a multicast packet to be counted twice.

Updating BPRC and MPRC depending on the matched filter is fundamentally
flawed anyway, as a filter can be used for different types of packets.
For example, it is possible to filter broadcast packets with MTA.

Always determine what counters to update by inspecting the packets.

Fixes: 3b27430177 ("e1000: Implementing various counters")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
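To illustrate the approach described above, the following sketch picks
the counter from the frame's destination address alone, so it does not
matter which filter accepted the frame. The names (struct RxCounters,
classify_dst, update_rx_counters) are hypothetical; the patch itself
passes an eth_pkt_types_e value into e1000x_update_rx_total_stats(). The
saturating increment is an assumption about what a helper named
"inc_reg_if_not_full" does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; the patch uses eth_pkt_types_e and mac[] registers. */
enum PktType { PKT_UCAST, PKT_MCAST, PKT_BCAST };

struct RxCounters {
    uint32_t tpr;   /* total packets received */
    uint32_t bprc;  /* broadcast packets received */
    uint32_t mprc;  /* multicast packets received */
};

/* Classify a frame by its 6-byte destination MAC address. */
static enum PktType classify_dst(const uint8_t *dst)
{
    static const uint8_t bcast[6] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

    if (memcmp(dst, bcast, sizeof(bcast)) == 0) {
        return PKT_BCAST;
    }
    /* The least significant bit of the first octet marks a group address. */
    return (dst[0] & 1) ? PKT_MCAST : PKT_UCAST;
}

/* Assumed behaviour: the counter sticks at its maximum instead of wrapping. */
static void inc_if_not_full(uint32_t *reg)
{
    if (*reg != UINT32_MAX) {
        (*reg)++;
    }
}

/* Single place where the per-type counters are updated. */
static void update_rx_counters(struct RxCounters *c, const uint8_t *frame)
{
    inc_if_not_full(&c->tpr);

    switch (classify_dst(frame)) {
    case PKT_BCAST:
        inc_if_not_full(&c->bprc);
        break;

    case PKT_MCAST:
        inc_if_not_full(&c->mprc);
        break;

    default:
        break;
    }
}
```

Because the filters no longer touch the counters, a multicast frame is
counted once regardless of how many filter paths it could match.
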
 hw/net/e1000.c         |  6 +++---
 hw/net/e1000e_core.c   | 20 +++-----------------
 hw/net/e1000x_common.c | 25 +++++++++++++++++++------
 hw/net/e1000x_common.h |  5 +++--
 hw/net/igb_core.c      | 22 +++++-----------------
 5 files changed, 33 insertions(+), 45 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 59bacb5d3b..18eb6d8876 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -826,12 +826,10 @@ receive_filter(E1000State *s, const uint8_t *buf, int size)
     }
 
     if (ismcast && (rctl & E1000_RCTL_MPE)) {          /* promiscuous mcast */
-        e1000x_inc_reg_if_not_full(s->mac_reg, MPRC);
         return 1;
     }
 
     if (isbcast && (rctl & E1000_RCTL_BAM)) {          /* broadcast enabled */
-        e1000x_inc_reg_if_not_full(s->mac_reg, BPRC);
         return 1;
     }
 
@@ -922,6 +920,7 @@ e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt)
     size_t desc_offset;
     size_t desc_size;
     size_t total_size;
+    eth_pkt_types_e pkt_type;
 
     if (!e1000x_hw_rx_enabled(s->mac_reg)) {
         return -1;
@@ -971,6 +970,7 @@ e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt)
         size -= 4;
     }
 
+    pkt_type = get_eth_packet_type(PKT_GET_ETH_HDR(filter_buf));
     rdh_start = s->mac_reg[RDH];
     desc_offset = 0;
     total_size = size + e1000x_fcs_len(s->mac_reg);
@@ -1036,7 +1036,7 @@ e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt)
         }
     } while (desc_offset < total_size);
 
-    e1000x_update_rx_total_stats(s->mac_reg, size, total_size);
+    e1000x_update_rx_total_stats(s->mac_reg, pkt_type, size, total_size);
 
     n = E1000_ICS_RXT0;
     if ((rdt = s->mac_reg[RDT]) < s->mac_reg[RDH])
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 15821a75e0..c2d864a504 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1488,24 +1488,10 @@ e1000e_write_to_rx_buffers(E1000ECore *core,
 }
 
 static void
-e1000e_update_rx_stats(E1000ECore *core,
-                       size_t data_size,
-                       size_t data_fcs_size)
+e1000e_update_rx_stats(E1000ECore *core, size_t pkt_size, size_t pkt_fcs_size)
 {
-    e1000x_update_rx_total_stats(core->mac, data_size, data_fcs_size);
-
-    switch (net_rx_pkt_get_packet_type(core->rx_pkt)) {
-    case ETH_PKT_BCAST:
-        e1000x_inc_reg_if_not_full(core->mac, BPRC);
-        break;
-
-    case ETH_PKT_MCAST:
-        e1000x_inc_reg_if_not_full(core->mac, MPRC);
-        break;
-
-    default:
-        break;
-    }
+    eth_pkt_types_e pkt_type = net_rx_pkt_get_packet_type(core->rx_pkt);
+    e1000x_update_rx_total_stats(core->mac, pkt_type, pkt_size, pkt_fcs_size);
 }
 
 static inline bool
diff --git a/hw/net/e1000x_common.c b/hw/net/e1000x_common.c
index 4c8e7dcf70..7694673bcc 100644
--- a/hw/net/e1000x_common.c
+++ b/hw/net/e1000x_common.c
@@ -80,7 +80,6 @@ bool e1000x_rx_group_filter(uint32_t *mac, const uint8_t *buf)
     f = mta_shift[(rctl >> E1000_RCTL_MO_SHIFT) & 3];
     f = (((buf[5] << 8) | buf[4]) >> f) & 0xfff;
     if (mac[MTA + (f >> 5)] & (1 << (f & 0x1f))) {
-        e1000x_inc_reg_if_not_full(mac, MPRC);
         return true;
     }
 
@@ -212,13 +211,14 @@ e1000x_rxbufsize(uint32_t rctl)
 
 void
 e1000x_update_rx_total_stats(uint32_t *mac,
-                             size_t data_size,
-                             size_t data_fcs_size)
+                             eth_pkt_types_e pkt_type,
+                             size_t pkt_size,
+                             size_t pkt_fcs_size)
 {
     static const int PRCregs[6] = { PRC64, PRC127, PRC255, PRC511,
                                     PRC1023, PRC1522 };
 
-    e1000x_increase_size_stats(mac, PRCregs, data_fcs_size);
+    e1000x_increase_size_stats(mac, PRCregs, pkt_fcs_size);
     e1000x_inc_reg_if_not_full(mac, TPR);
     e1000x_inc_reg_if_not_full(mac, GPRC);
     /* TOR - Total Octets Received:
@@ -226,8 +226,21 @@ e1000x_update_rx_total_stats(uint32_t *mac,
     * Address> field through the <CRC> field, inclusively.
     * Always include FCS length (4) in size.
     */
-    e1000x_grow_8reg_if_not_full(mac, TORL, data_size + 4);
-    e1000x_grow_8reg_if_not_full(mac, GORCL, data_size + 4);
+    e1000x_grow_8reg_if_not_full(mac, TORL, pkt_size + 4);
+    e1000x_grow_8reg_if_not_full(mac, GORCL, pkt_size + 4);
+
+    switch (pkt_type) {
+    case ETH_PKT_BCAST:
+        e1000x_inc_reg_if_not_full(mac, BPRC);
+        break;
+
+    case ETH_PKT_MCAST:
+        e1000x_inc_reg_if_not_full(mac, MPRC);
+        break;
+
+    default:
+        break;
+    }
 }
 
 void
diff --git a/hw/net/e1000x_common.h b/hw/net/e1000x_common.h
index 911abd8a90..0298e06283 100644
--- a/hw/net/e1000x_common.h
+++ b/hw/net/e1000x_common.h
@@ -91,8 +91,9 @@ e1000x_update_regs_on_link_up(uint32_t *mac, uint16_t *phy)
 }
 
 void e1000x_update_rx_total_stats(uint32_t *mac,
-                                  size_t data_size,
-                                  size_t data_fcs_size);
+                                  eth_pkt_types_e pkt_type,
+                                  size_t pkt_size,
+                                  size_t pkt_fcs_size);
 
 void e1000x_core_prepare_eeprom(uint16_t       *eeprom,
                                 const uint16_t *templ,
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index abfdce9aaf..464a41d0aa 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1438,29 +1438,17 @@ igb_write_to_rx_buffers(IGBCore *core,
 
 static void
 igb_update_rx_stats(IGBCore *core, const E1000E_RingInfo *rxi,
-                    size_t data_size, size_t data_fcs_size)
+                    size_t pkt_size, size_t pkt_fcs_size)
 {
-    e1000x_update_rx_total_stats(core->mac, data_size, data_fcs_size);
-
-    switch (net_rx_pkt_get_packet_type(core->rx_pkt)) {
-    case ETH_PKT_BCAST:
-        e1000x_inc_reg_if_not_full(core->mac, BPRC);
-        break;
-
-    case ETH_PKT_MCAST:
-        e1000x_inc_reg_if_not_full(core->mac, MPRC);
-        break;
-
-    default:
-        break;
-    }
+    eth_pkt_types_e pkt_type = net_rx_pkt_get_packet_type(core->rx_pkt);
+    e1000x_update_rx_total_stats(core->mac, pkt_type, pkt_size, pkt_fcs_size);
 
     if (core->mac[MRQC] & 1) {
         uint16_t pool = rxi->idx % IGB_NUM_VM_POOLS;
 
-        core->mac[PVFGORC0 + (pool * 64)] += data_size + 4;
+        core->mac[PVFGORC0 + (pool * 64)] += pkt_size + 4;
         core->mac[PVFGPRC0 + (pool * 64)]++;
-        if (net_rx_pkt_get_packet_type(core->rx_pkt) == ETH_PKT_MCAST) {
+        if (pkt_type == ETH_PKT_MCAST) {
             core->mac[PVFMPRC0 + (pool * 64)]++;
         }
     }
-- 
2.40.0




* [PATCH 03/40] igb: Fix Rx packet type encoding
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
  2023-04-14 11:36 ` [PATCH 01/40] hw/net/net_tx_pkt: Decouple from PCI Akihiko Odaki
  2023-04-14 11:36 ` [PATCH 02/40] e1000x: Fix BPRC and MPRC Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-15 19:08   ` Sriram Yagnaraman
  2023-04-14 11:37 ` [PATCH 04/40] igb: Include the second VLAN tag in the buffer Akihiko Odaki
                   ` (36 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

igb's advanced descriptor uses a packet type encoding different from
the one used in e1000e's extended descriptor. Fix the logic to encode
the Rx packet type accordingly.

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
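For readers skimming the diff below, this sketch restates the new
encoding as a standalone function. The bit positions (4 for IPv4, 6 for
IPv6, 8 for TCP, 9 for UDP, with the RSS type in the low bits) are copied
from the hunk in this patch and have not been checked against the
datasheet here. The function name, enum and parameter list are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Hypothetical stand-in for the L4 protocol enum used by the patch. */
enum L4Proto { L4_NONE, L4_TCP, L4_UDP };

/*
 * Build the packet info field as independent flag bits rather than as a
 * single enumerated packet type shifted into place.
 */
static uint32_t encode_pkt_info(bool rss_enabled, uint32_t rss_type,
                                bool hasip4, bool hasip6, enum L4Proto l4)
{
    uint32_t info = rss_enabled ? rss_type : 0;

    if (hasip4) {
        info |= BIT(4);
    }

    if (hasip6) {
        info |= BIT(6);
    }

    switch (l4) {
    case L4_TCP:
        info |= BIT(8);
        break;

    case L4_UDP:
        info |= BIT(9);
        break;

    default:
        break;
    }

    return info;
}
```

With this layout an IPv4/TCP packet without RSS yields 0x110, whereas the
removed code shifted one enumerated value left by four bits.
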
 hw/net/igb_core.c | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 464a41d0aa..55de212447 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1227,7 +1227,6 @@ igb_build_rx_metadata(IGBCore *core,
     struct virtio_net_hdr *vhdr;
     bool hasip4, hasip6;
     EthL4HdrProto l4hdr_proto;
-    uint32_t pkt_type;
 
     *status_flags = E1000_RXD_STAT_DD;
 
@@ -1266,28 +1265,29 @@ igb_build_rx_metadata(IGBCore *core,
         trace_e1000e_rx_metadata_ack();
     }
 
-    if (hasip6 && (core->mac[RFCTL] & E1000_RFCTL_IPV6_DIS)) {
-        trace_e1000e_rx_metadata_ipv6_filtering_disabled();
-        pkt_type = E1000_RXD_PKT_MAC;
-    } else if (l4hdr_proto == ETH_L4_HDR_PROTO_TCP ||
-               l4hdr_proto == ETH_L4_HDR_PROTO_UDP) {
-        pkt_type = hasip4 ? E1000_RXD_PKT_IP4_XDP : E1000_RXD_PKT_IP6_XDP;
-    } else if (hasip4 || hasip6) {
-        pkt_type = hasip4 ? E1000_RXD_PKT_IP4 : E1000_RXD_PKT_IP6;
-    } else {
-        pkt_type = E1000_RXD_PKT_MAC;
-    }
+    if (pkt_info) {
+        *pkt_info = rss_info->enabled ? rss_info->type : 0;
 
-    trace_e1000e_rx_metadata_pkt_type(pkt_type);
+        if (hasip4) {
+            *pkt_info |= BIT(4);
+        }
 
-    if (pkt_info) {
-        if (rss_info->enabled) {
-            *pkt_info = rss_info->type;
+        if (hasip6) {
+            *pkt_info |= BIT(6);
         }
 
-        *pkt_info |= (pkt_type << 4);
-    } else {
-        *status_flags |= E1000_RXD_PKT_TYPE(pkt_type);
+        switch (l4hdr_proto) {
+        case ETH_L4_HDR_PROTO_TCP:
+            *pkt_info |= BIT(8);
+            break;
+
+        case ETH_L4_HDR_PROTO_UDP:
+            *pkt_info |= BIT(9);
+            break;
+
+        default:
+            break;
+        }
     }
 
     if (hdr_info) {
-- 
2.40.0




* [PATCH 04/40] igb: Include the second VLAN tag in the buffer
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (2 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 03/40] igb: Fix Rx packet type encoding Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 14:28   ` Philippe Mathieu-Daudé
  2023-04-14 11:37 ` [PATCH 05/40] igb: Do not require CTRL.VME for tx VLAN tagging Akihiko Odaki
                   ` (35 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/igb_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 55de212447..f725ab97ae 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1590,7 +1590,7 @@ static ssize_t
 igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
                      bool has_vnet, bool *external_tx)
 {
-    static const int maximum_ethernet_hdr_len = (ETH_HLEN + 4);
+    static const int maximum_ethernet_hdr_len = (ETH_HLEN + 8);
 
     uint16_t queues = 0;
     uint32_t n = 0;
-- 
2.40.0




* [PATCH 05/40] igb: Do not require CTRL.VME for tx VLAN tagging
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (3 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 04/40] igb: Include the second VLAN tag in the buffer Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-15 19:08   ` Sriram Yagnaraman
  2023-04-14 11:37 ` [PATCH 06/40] net/net_rx_pkt: Use iovec for net_rx_pkt_set_protocols() Akihiko Odaki
                   ` (34 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

While the e1000e datasheet says the device checks CTRL.VME for Tx VLAN
tagging, igb's datasheet has no such statement. It also says for
"CTRL.VLE":
> This register only affects the VLAN Strip in Rx it does not have any
> influence in the Tx path in the 82576.
(Appendix A. Changes from the 82575)

There is no "CTRL.VLE", so it is most likely a typo for CTRL.VME.

Fixes: fba7c3b788 ("igb: respect VMVIR and VMOLR for VLAN")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/igb_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index f725ab97ae..5d4884b834 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -402,7 +402,7 @@ igb_tx_insert_vlan(IGBCore *core, uint16_t qn, struct igb_tx *tx,
         }
     }
 
-    if (insert_vlan && e1000x_vlan_enabled(core->mac)) {
+    if (insert_vlan) {
         net_tx_pkt_setup_vlan_header_ex(tx->tx_pkt, vlan,
             core->mac[VET] & 0xffff);
     }
-- 
2.40.0




* [PATCH 06/40] net/net_rx_pkt: Use iovec for net_rx_pkt_set_protocols()
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (4 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 05/40] igb: Do not require CTRL.VME for tx VLAN tagging Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-15 19:09   ` Sriram Yagnaraman
  2023-04-14 11:37 ` [PATCH 07/40] e1000e: Always copy ethernet header Akihiko Odaki
                   ` (33 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

igb does not properly ensure the buffer passed to
net_rx_pkt_set_protocols() is contiguous for the entire L2/L3/L4 header.
Allow it to pass scattered data to net_rx_pkt_set_protocols().

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
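
To illustrate what passing scattered data with a start offset means, here is a minimal gather-copy sketch. example_iov_copy() is a hypothetical helper written for this note; it is loosely modelled on the (iov, iovcnt, offset, buf, len) shape of the iov_to_buf() calls in the diff below, but it is not QEMU's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h> /* struct iovec */

/*
 * Hypothetical helper: copy up to `len` bytes, starting `offset` bytes
 * into the scatter-gather list, into `buf`. Returns the number of bytes
 * actually copied, which is less than `len` if the list is too short.
 */
static size_t example_iov_copy(const struct iovec *iov, size_t iovcnt,
                               size_t offset, void *buf, size_t len)
{
    unsigned char *out = buf;
    size_t done = 0;

    assert(iov != NULL || iovcnt == 0);

    for (size_t i = 0; i < iovcnt && done < len; i++) {
        size_t n;

        if (offset >= iov[i].iov_len) {
            /* The requested range starts after this element. */
            offset -= iov[i].iov_len;
            continue;
        }

        n = iov[i].iov_len - offset;
        if (n > len - done) {
            n = len - done;
        }

        memcpy(out + done,
               (const unsigned char *)iov[i].iov_base + offset, n);
        done += n;
        offset = 0; /* later elements are read from their start */
    }

    return done;
}
```

A header that straddles two elements is reassembled transparently, which is why the callers no longer need to guarantee a contiguous buffer.
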
 hw/net/igb_core.c   |  2 +-
 hw/net/net_rx_pkt.c | 14 +++++---------
 hw/net/net_rx_pkt.h | 10 ++++++----
 hw/net/virtio-net.c |  7 +++++--
 hw/net/vmxnet3.c    |  7 ++++++-
 include/net/eth.h   |  6 +++---
 net/eth.c           | 18 ++++++++----------
 7 files changed, 34 insertions(+), 30 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 5d4884b834..53f60fc3d3 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1650,7 +1650,7 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
 
     ehdr = PKT_GET_ETH_HDR(filter_buf);
     net_rx_pkt_set_packet_type(core->rx_pkt, get_eth_packet_type(ehdr));
-    net_rx_pkt_set_protocols(core->rx_pkt, filter_buf, size);
+    net_rx_pkt_set_protocols(core->rx_pkt, iov, iovcnt, iov_ofs);
 
     queues = igb_receive_assign(core, ehdr, size, &rss_info, external_tx);
     if (!queues) {
diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c
index 39cdea06de..63be6e05ad 100644
--- a/hw/net/net_rx_pkt.c
+++ b/hw/net/net_rx_pkt.c
@@ -103,7 +103,7 @@ net_rx_pkt_pull_data(struct NetRxPkt *pkt,
                                 iov, iovcnt, ploff, pkt->tot_len);
     }
 
-    eth_get_protocols(pkt->vec, pkt->vec_len, &pkt->hasip4, &pkt->hasip6,
+    eth_get_protocols(pkt->vec, pkt->vec_len, 0, &pkt->hasip4, &pkt->hasip6,
                       &pkt->l3hdr_off, &pkt->l4hdr_off, &pkt->l5hdr_off,
                       &pkt->ip6hdr_info, &pkt->ip4hdr_info, &pkt->l4hdr_info);
 
@@ -186,17 +186,13 @@ size_t net_rx_pkt_get_total_len(struct NetRxPkt *pkt)
     return pkt->tot_len;
 }
 
-void net_rx_pkt_set_protocols(struct NetRxPkt *pkt, const void *data,
-                              size_t len)
+void net_rx_pkt_set_protocols(struct NetRxPkt *pkt,
+                              const struct iovec *iov, size_t iovcnt,
+                              size_t iovoff)
 {
-    const struct iovec iov = {
-        .iov_base = (void *)data,
-        .iov_len = len
-    };
-
     assert(pkt);
 
-    eth_get_protocols(&iov, 1, &pkt->hasip4, &pkt->hasip6,
+    eth_get_protocols(iov, iovcnt, iovoff, &pkt->hasip4, &pkt->hasip6,
                       &pkt->l3hdr_off, &pkt->l4hdr_off, &pkt->l5hdr_off,
                       &pkt->ip6hdr_info, &pkt->ip4hdr_info, &pkt->l4hdr_info);
 }
diff --git a/hw/net/net_rx_pkt.h b/hw/net/net_rx_pkt.h
index d00b484900..a06f5c2675 100644
--- a/hw/net/net_rx_pkt.h
+++ b/hw/net/net_rx_pkt.h
@@ -55,12 +55,14 @@ size_t net_rx_pkt_get_total_len(struct NetRxPkt *pkt);
  * parse and set packet analysis results
  *
  * @pkt:            packet
- * @data:           pointer to the data buffer to be parsed
- * @len:            data length
+ * @iov:            received data scatter-gather list
+ * @iovcnt:         number of elements in iov
+ * @iovoff:         data start offset in the iov
  *
  */
-void net_rx_pkt_set_protocols(struct NetRxPkt *pkt, const void *data,
-                              size_t len);
+void net_rx_pkt_set_protocols(struct NetRxPkt *pkt,
+                              const struct iovec *iov, size_t iovcnt,
+                              size_t iovoff);
 
 /**
  * fetches packet analysis results
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 53e1c32643..37551fd854 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1835,9 +1835,12 @@ static int virtio_net_process_rss(NetClientState *nc, const uint8_t *buf,
         VIRTIO_NET_HASH_REPORT_UDPv6,
         VIRTIO_NET_HASH_REPORT_UDPv6_EX
     };
+    struct iovec iov = {
+        .iov_base = (void *)buf,
+        .iov_len = size
+    };
 
-    net_rx_pkt_set_protocols(pkt, buf + n->host_hdr_len,
-                             size - n->host_hdr_len);
+    net_rx_pkt_set_protocols(pkt, &iov, 1, n->host_hdr_len);
     net_rx_pkt_get_protocols(pkt, &hasip4, &hasip6, &l4hdr_proto);
     net_hash_type = virtio_net_get_hash_type(hasip4, hasip6, l4hdr_proto,
                                              n->rss_data.hash_types);
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 9acff310e7..05f41b6dfa 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -2001,7 +2001,12 @@ vmxnet3_receive(NetClientState *nc, const uint8_t *buf, size_t size)
         get_eth_packet_type(PKT_GET_ETH_HDR(buf)));
 
     if (vmxnet3_rx_filter_may_indicate(s, buf, size)) {
-        net_rx_pkt_set_protocols(s->rx_pkt, buf, size);
+        struct iovec iov = {
+            .iov_base = (void *)buf,
+            .iov_len = size
+        };
+
+        net_rx_pkt_set_protocols(s->rx_pkt, &iov, 1, 0);
         vmxnet3_rx_need_csum_calculate(s->rx_pkt, buf, size);
         net_rx_pkt_attach_data(s->rx_pkt, buf, size, s->rx_vlan_stripping);
         bytes_indicated = vmxnet3_indicate_packet(s) ? size : -1;
diff --git a/include/net/eth.h b/include/net/eth.h
index c5ae4493b4..9f19c3a695 100644
--- a/include/net/eth.h
+++ b/include/net/eth.h
@@ -312,10 +312,10 @@ eth_get_l2_hdr_length(const void *p)
 }
 
 static inline uint32_t
-eth_get_l2_hdr_length_iov(const struct iovec *iov, int iovcnt)
+eth_get_l2_hdr_length_iov(const struct iovec *iov, size_t iovcnt, size_t iovoff)
 {
     uint8_t p[sizeof(struct eth_header) + sizeof(struct vlan_header)];
-    size_t copied = iov_to_buf(iov, iovcnt, 0, p, ARRAY_SIZE(p));
+    size_t copied = iov_to_buf(iov, iovcnt, iovoff, p, ARRAY_SIZE(p));
 
     if (copied < ARRAY_SIZE(p)) {
         return copied;
@@ -397,7 +397,7 @@ typedef struct eth_l4_hdr_info_st {
     bool has_tcp_data;
 } eth_l4_hdr_info;
 
-void eth_get_protocols(const struct iovec *iov, int iovcnt,
+void eth_get_protocols(const struct iovec *iov, size_t iovcnt, size_t iovoff,
                        bool *hasip4, bool *hasip6,
                        size_t *l3hdr_off,
                        size_t *l4hdr_off,
diff --git a/net/eth.c b/net/eth.c
index 70bcd8e355..d7b30df79f 100644
--- a/net/eth.c
+++ b/net/eth.c
@@ -136,7 +136,7 @@ _eth_tcp_has_data(bool is_ip4,
     return l4len > TCP_HEADER_DATA_OFFSET(tcp);
 }
 
-void eth_get_protocols(const struct iovec *iov, int iovcnt,
+void eth_get_protocols(const struct iovec *iov, size_t iovcnt, size_t iovoff,
                        bool *hasip4, bool *hasip6,
                        size_t *l3hdr_off,
                        size_t *l4hdr_off,
@@ -147,26 +147,24 @@ void eth_get_protocols(const struct iovec *iov, int iovcnt,
 {
     int proto;
     bool fragment = false;
-    size_t l2hdr_len = eth_get_l2_hdr_length_iov(iov, iovcnt);
     size_t input_size = iov_size(iov, iovcnt);
     size_t copied;
     uint8_t ip_p;
 
     *hasip4 = *hasip6 = false;
+    *l3hdr_off = iovoff + eth_get_l2_hdr_length_iov(iov, iovcnt, iovoff);
     l4hdr_info->proto = ETH_L4_HDR_PROTO_INVALID;
 
-    proto = eth_get_l3_proto(iov, iovcnt, l2hdr_len);
-
-    *l3hdr_off = l2hdr_len;
+    proto = eth_get_l3_proto(iov, iovcnt, *l3hdr_off);
 
     if (proto == ETH_P_IP) {
         struct ip_header *iphdr = &ip4hdr_info->ip4_hdr;
 
-        if (input_size < l2hdr_len) {
+        if (input_size < *l3hdr_off) {
             return;
         }
 
-        copied = iov_to_buf(iov, iovcnt, l2hdr_len, iphdr, sizeof(*iphdr));
+        copied = iov_to_buf(iov, iovcnt, *l3hdr_off, iphdr, sizeof(*iphdr));
         if (copied < sizeof(*iphdr) ||
             IP_HEADER_VERSION(iphdr) != IP_HEADER_VERSION_4) {
             return;
@@ -175,17 +173,17 @@ void eth_get_protocols(const struct iovec *iov, int iovcnt,
         *hasip4 = true;
         ip_p = iphdr->ip_p;
         ip4hdr_info->fragment = IP4_IS_FRAGMENT(iphdr);
-        *l4hdr_off = l2hdr_len + IP_HDR_GET_LEN(iphdr);
+        *l4hdr_off = *l3hdr_off + IP_HDR_GET_LEN(iphdr);
 
         fragment = ip4hdr_info->fragment;
     } else if (proto == ETH_P_IPV6) {
-        if (!eth_parse_ipv6_hdr(iov, iovcnt, l2hdr_len, ip6hdr_info)) {
+        if (!eth_parse_ipv6_hdr(iov, iovcnt, *l3hdr_off, ip6hdr_info)) {
             return;
         }
 
         *hasip6 = true;
         ip_p = ip6hdr_info->l4proto;
-        *l4hdr_off = l2hdr_len + ip6hdr_info->full_hdr_len;
+        *l4hdr_off = *l3hdr_off + ip6hdr_info->full_hdr_len;
         fragment = ip6hdr_info->fragment;
     } else {
         return;
-- 
2.40.0




* [PATCH 07/40] e1000e: Always copy ethernet header
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (5 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 06/40] net/net_rx_pkt: Use iovec for net_rx_pkt_set_protocols() Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 08/40] igb: " Akihiko Odaki
                   ` (32 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

e1000e_receive_internal() used to check the iov length to determine
whether to copy the iovs to a contiguous buffer, but the check is flawed
in two ways:
- It does not ensure that iovcnt > 0.
- It does not take the virtio-net header into consideration.

The size of this copy is just 18 octets, which can be even less than
the code size required for the checks. This (wrong) optimization is
probably not worth it, so just remove it.

Fixes: 6f3fbe4ed0 ("net: Introduce e1000e device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
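
The two flaws listed above can be shown with a small sketch. Both functions are hypothetical and written only for this note: the first mirrors the shape of the removed condition, the second shows what a complete check would have to test. The patch itself avoids the question by always copying.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/uio.h> /* struct iovec */

/*
 * Shape of the removed check: it looks only at the length of the first
 * element. It reads iov[0] even when iovcnt is 0, and it ignores any
 * offset introduced by a leading virtio-net header.
 */
static bool example_old_check(const struct iovec *iov, size_t hdr_len)
{
    assert(iov != NULL);

    return iov->iov_len >= hdr_len;
}

/*
 * What a complete check would need: at least one element, and hdr_len
 * contiguous bytes available in it after skipping `ofs` bytes.
 */
static bool example_full_check(const struct iovec *iov, size_t iovcnt,
                               size_t ofs, size_t hdr_len)
{
    return iovcnt > 0 &&
           ofs <= iov[0].iov_len &&
           iov[0].iov_len - ofs >= hdr_len;
}
```

For a 24-octet first element preceded by a 10-octet offset, the old check reports an 18-octet header as contiguous although only 14 octets remain in that element.
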
 hw/net/e1000e_core.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index c2d864a504..f3335194d8 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1686,12 +1686,9 @@ static ssize_t
 e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
                         bool has_vnet)
 {
-    static const int maximum_ethernet_hdr_len = (ETH_HLEN + 4);
-
     uint32_t n = 0;
     uint8_t min_buf[ETH_ZLEN];
     struct iovec min_iov;
-    uint8_t *filter_buf;
     size_t size, orig_size;
     size_t iov_ofs = 0;
     E1000E_RxRing rxr;
@@ -1714,7 +1711,6 @@ e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
         net_rx_pkt_unset_vhdr(core->rx_pkt);
     }
 
-    filter_buf = iov->iov_base + iov_ofs;
     orig_size = iov_size(iov, iovcnt);
     size = orig_size - iov_ofs;
 
@@ -1723,15 +1719,13 @@ e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
         iov_to_buf(iov, iovcnt, iov_ofs, min_buf, size);
         memset(&min_buf[size], 0, sizeof(min_buf) - size);
         e1000x_inc_reg_if_not_full(core->mac, RUC);
-        min_iov.iov_base = filter_buf = min_buf;
+        min_iov.iov_base = min_buf;
         min_iov.iov_len = size = sizeof(min_buf);
         iovcnt = 1;
         iov = &min_iov;
         iov_ofs = 0;
-    } else if (iov->iov_len < maximum_ethernet_hdr_len) {
-        /* This is very unlikely, but may happen. */
-        iov_to_buf(iov, iovcnt, iov_ofs, min_buf, maximum_ethernet_hdr_len);
-        filter_buf = min_buf;
+    } else {
+        iov_to_buf(iov, iovcnt, iov_ofs, min_buf, ETH_HLEN + 4);
     }
 
     /* Discard oversized packets if !LPE and !SBP. */
@@ -1740,9 +1734,9 @@ e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
     }
 
     net_rx_pkt_set_packet_type(core->rx_pkt,
-        get_eth_packet_type(PKT_GET_ETH_HDR(filter_buf)));
+        get_eth_packet_type(PKT_GET_ETH_HDR(min_buf)));
 
-    if (!e1000e_receive_filter(core, filter_buf, size)) {
+    if (!e1000e_receive_filter(core, min_buf, size)) {
         trace_e1000e_rx_flt_dropped();
         return orig_size;
     }
-- 
2.40.0




* [PATCH 08/40] igb: Always copy ethernet header
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (6 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 07/40] e1000e: Always copy ethernet header Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 14:46   ` Philippe Mathieu-Daudé
  2023-04-14 11:37 ` [PATCH 09/40] Fix references to igb Avocado test Akihiko Odaki
                   ` (31 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

igb_receive_internal() used to check the iov length to determine
whether to copy the iovs to a contiguous buffer, but the check is flawed
in two ways:
- It does not ensure that iovcnt > 0.
- It does not take the virtio-net header into consideration.

The size of this copy is just 22 octets, which can be even less than
the code size required for the checks. This (wrong) optimization is
probably not worth it, so just remove it. Removing this also allows igb
to assume aligned accesses for the ethernet header.

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
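
The aligned-access point can be sketched with a union like the one introduced below. All type and field names here are simplified stand-ins, and the 60-octet buffer size is an assumption about the minimum frame length; none of them are the QEMU definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the header layouts; names are illustrative. */
struct example_eth_header {
    uint8_t  dst[6];
    uint8_t  src[6];
    uint16_t proto;  /* big-endian on the wire */
};

struct example_vlan_header {
    uint16_t tci;    /* big-endian on the wire */
    uint16_t proto;  /* big-endian on the wire */
};

struct example_l2_header {
    struct example_eth_header eth;
    struct example_vlan_header vlan[2];
};

/* 14 octets of Ethernet header plus two 4-octet VLAN tags. */
_Static_assert(sizeof(struct example_l2_header) == 22,
               "unexpected padding in the L2 header sketch");

/* The union gives the byte buffer the alignment of the structured view. */
union example_min_buf {
    struct example_l2_header l2_header;
    uint8_t octets[60]; /* assumed minimum frame length without the FCS */
};

/* Extract the 12-bit VLAN ID of the outer tag, independent of host order. */
static inline uint16_t example_vlan_id(const struct example_l2_header *h)
{
    const uint8_t *p;

    assert(h != NULL);

    p = (const uint8_t *)&h->vlan[0].tci;
    return (uint16_t)((p[0] << 8) | p[1]) & 0x0fff;
}
```

Because the header is always copied into such a buffer, its fields can be read through the struct view directly instead of through unaligned-safe accessors.
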
 hw/net/igb_core.c | 39 +++++++++++++++++++++------------------
 1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 53f60fc3d3..1d188b526c 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -67,6 +67,11 @@ typedef struct IGBTxPktVmdqCallbackContext {
     NetClientState *nc;
 } IGBTxPktVmdqCallbackContext;
 
+typedef struct L2Header {
+    struct eth_header eth;
+    struct vlan_header vlan[2];
+} L2Header;
+
 static ssize_t
 igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
                      bool has_vnet, bool *external_tx);
@@ -961,15 +966,16 @@ igb_rx_is_oversized(IGBCore *core, uint16_t qn, size_t size)
     return size > (lpe ? max_ethernet_lpe_size : max_ethernet_vlan_size);
 }
 
-static uint16_t igb_receive_assign(IGBCore *core, const struct eth_header *ehdr,
+static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
                                    size_t size, E1000E_RSSInfo *rss_info,
                                    bool *external_tx)
 {
     static const int ta_shift[] = { 4, 3, 2, 0 };
+    const struct eth_header *ehdr = &l2_header->eth;
     uint32_t f, ra[2], *macp, rctl = core->mac[RCTL];
     uint16_t queues = 0;
     uint16_t oversized = 0;
-    uint16_t vid = lduw_be_p(&PKT_GET_VLAN_HDR(ehdr)->h_tci) & VLAN_VID_MASK;
+    uint16_t vid = be16_to_cpu(l2_header->vlan[0].h_tci) & VLAN_VID_MASK;
     bool accepted = false;
     int i;
 
@@ -1590,14 +1596,13 @@ static ssize_t
 igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
                      bool has_vnet, bool *external_tx)
 {
-    static const int maximum_ethernet_hdr_len = (ETH_HLEN + 8);
-
     uint16_t queues = 0;
     uint32_t n = 0;
-    uint8_t min_buf[ETH_ZLEN];
+    union {
+        L2Header l2_header;
+        uint8_t octets[ETH_ZLEN];
+    } min_buf;
     struct iovec min_iov;
-    struct eth_header *ehdr;
-    uint8_t *filter_buf;
     size_t size, orig_size;
     size_t iov_ofs = 0;
     E1000E_RxRing rxr;
@@ -1623,24 +1628,21 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
         net_rx_pkt_unset_vhdr(core->rx_pkt);
     }
 
-    filter_buf = iov->iov_base + iov_ofs;
     orig_size = iov_size(iov, iovcnt);
     size = orig_size - iov_ofs;
 
     /* Pad to minimum Ethernet frame length */
     if (size < sizeof(min_buf)) {
-        iov_to_buf(iov, iovcnt, iov_ofs, min_buf, size);
-        memset(&min_buf[size], 0, sizeof(min_buf) - size);
+        iov_to_buf(iov, iovcnt, iov_ofs, &min_buf, size);
+        memset(&min_buf.octets[size], 0, sizeof(min_buf) - size);
         e1000x_inc_reg_if_not_full(core->mac, RUC);
-        min_iov.iov_base = filter_buf = min_buf;
+        min_iov.iov_base = &min_buf;
         min_iov.iov_len = size = sizeof(min_buf);
         iovcnt = 1;
         iov = &min_iov;
         iov_ofs = 0;
-    } else if (iov->iov_len < maximum_ethernet_hdr_len) {
-        /* This is very unlikely, but may happen. */
-        iov_to_buf(iov, iovcnt, iov_ofs, min_buf, maximum_ethernet_hdr_len);
-        filter_buf = min_buf;
+    } else {
+        iov_to_buf(iov, iovcnt, iov_ofs, &min_buf, sizeof(min_buf.l2_header));
     }
 
     /* Discard oversized packets if !LPE and !SBP. */
@@ -1648,11 +1650,12 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
         return orig_size;
     }
 
-    ehdr = PKT_GET_ETH_HDR(filter_buf);
-    net_rx_pkt_set_packet_type(core->rx_pkt, get_eth_packet_type(ehdr));
+    net_rx_pkt_set_packet_type(core->rx_pkt,
+                               get_eth_packet_type(&min_buf.l2_header.eth));
     net_rx_pkt_set_protocols(core->rx_pkt, iov, iovcnt, iov_ofs);
 
-    queues = igb_receive_assign(core, ehdr, size, &rss_info, external_tx);
+    queues = igb_receive_assign(core, &min_buf.l2_header, size,
+                                &rss_info, external_tx);
     if (!queues) {
         trace_e1000e_rx_flt_dropped();
         return orig_size;
-- 
2.40.0




* [PATCH 09/40] Fix references to igb Avocado test
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (7 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 08/40] igb: " Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 14:47   ` Philippe Mathieu-Daudé
  2023-04-14 11:37 ` [PATCH 10/40] tests/avocado: Remove unused imports Akihiko Odaki
                   ` (30 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Fixes: 9f95111474 ("tests/avocado: re-factor igb test to avoid timeouts")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 MAINTAINERS                                        | 2 +-
 docs/system/devices/igb.rst                        | 2 +-
 scripts/ci/org.centos/stream/8/x86_64/test-avocado | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index ef45b5e71e..c31d2279ab 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2256,7 +2256,7 @@ R: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
 S: Maintained
 F: docs/system/devices/igb.rst
 F: hw/net/igb*
-F: tests/avocado/igb.py
+F: tests/avocado/netdev-ethtool.py
 F: tests/qtest/igb-test.c
 F: tests/qtest/libqos/igb.c
 
diff --git a/docs/system/devices/igb.rst b/docs/system/devices/igb.rst
index 70edadd574..afe036dad2 100644
--- a/docs/system/devices/igb.rst
+++ b/docs/system/devices/igb.rst
@@ -60,7 +60,7 @@ Avocado test and can be ran with the following command:
 
 .. code:: shell
 
-  make check-avocado AVOCADO_TESTS=tests/avocado/igb.py
+  make check-avocado AVOCADO_TESTS=tests/avocado/netdev-ethtool.py
 
 References
 ==========
diff --git a/scripts/ci/org.centos/stream/8/x86_64/test-avocado b/scripts/ci/org.centos/stream/8/x86_64/test-avocado
index d2c0e5fb4c..a1aa601ee3 100755
--- a/scripts/ci/org.centos/stream/8/x86_64/test-avocado
+++ b/scripts/ci/org.centos/stream/8/x86_64/test-avocado
@@ -30,7 +30,7 @@ make get-vm-images
     tests/avocado/cpu_queries.py:QueryCPUModelExpansion.test \
     tests/avocado/empty_cpu_model.py:EmptyCPUModel.test \
     tests/avocado/hotplug_cpu.py:HotPlugCPU.test \
-    tests/avocado/igb.py:IGB.test \
+    tests/avocado/netdev-ethtool.py:NetDevEthtool.test_igb_nomsi \
     tests/avocado/info_usernet.py:InfoUsernet.test_hostfwd \
     tests/avocado/intel_iommu.py:IntelIOMMU.test_intel_iommu \
     tests/avocado/intel_iommu.py:IntelIOMMU.test_intel_iommu_pt \
-- 
2.40.0




* [PATCH 10/40] tests/avocado: Remove unused imports
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (8 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 09/40] Fix references to igb Avocado test Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 11/40] tests/avocado: Remove test_igb_nomsi_kvm Akihiko Odaki
                   ` (29 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 tests/avocado/netdev-ethtool.py | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tests/avocado/netdev-ethtool.py b/tests/avocado/netdev-ethtool.py
index f7e9464184..8de118e313 100644
--- a/tests/avocado/netdev-ethtool.py
+++ b/tests/avocado/netdev-ethtool.py
@@ -7,7 +7,6 @@
 
 from avocado import skip
 from avocado_qemu import QemuSystemTest
-from avocado_qemu import exec_command, exec_command_and_wait_for_pattern
 from avocado_qemu import wait_for_console_pattern
 
 class NetDevEthtool(QemuSystemTest):
-- 
2.40.0




* [PATCH 11/40] tests/avocado: Remove test_igb_nomsi_kvm
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (9 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 10/40] tests/avocado: Remove unused imports Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 12/40] hw/net/net_tx_pkt: Remove net_rx_pkt_get_l4_info Akihiko Odaki
                   ` (28 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Running the test with KVM is unlikely to find more bugs, so remove
test_igb_nomsi_kvm to save test time.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 tests/avocado/netdev-ethtool.py | 12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/tests/avocado/netdev-ethtool.py b/tests/avocado/netdev-ethtool.py
index 8de118e313..6da800f62b 100644
--- a/tests/avocado/netdev-ethtool.py
+++ b/tests/avocado/netdev-ethtool.py
@@ -29,7 +29,7 @@ def get_asset(self, name, sha1):
         # URL into a unique one
         return self.fetch_asset(name=name, locations=(url), asset_hash=sha1)
 
-    def common_test_code(self, netdev, extra_args=None, kvm=False):
+    def common_test_code(self, netdev, extra_args=None):
 
         # This custom kernel has drivers for all the supported network
         # devices we can emulate in QEMU
@@ -57,9 +57,6 @@ def common_test_code(self, netdev, extra_args=None, kvm=False):
                          '-drive', drive,
                          '-device', netdev)
 
-        if kvm:
-            self.vm.add_args('-accel', 'kvm')
-
         self.vm.set_console(console_index=0)
         self.vm.launch()
 
@@ -86,13 +83,6 @@ def test_igb_nomsi(self):
         """
         self.common_test_code("igb", "pci=nomsi")
 
-    def test_igb_nomsi_kvm(self):
-        """
-        :avocado: tags=device:igb
-        """
-        self.require_accelerator('kvm')
-        self.common_test_code("igb", "pci=nomsi", True)
-
     # It seems the other popular cards we model in QEMU currently fail
     # the pattern test with:
     #
-- 
2.40.0




* [PATCH 12/40] hw/net/net_tx_pkt: Remove net_rx_pkt_get_l4_info
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (10 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 11/40] tests/avocado: Remove test_igb_nomsi_kvm Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 13/40] net/eth: Rename eth_setup_vlan_headers_ex Akihiko Odaki
                   ` (27 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

This function is not used.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/net_rx_pkt.c | 5 -----
 hw/net/net_rx_pkt.h | 9 ---------
 2 files changed, 14 deletions(-)

diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c
index 63be6e05ad..6125a063d7 100644
--- a/hw/net/net_rx_pkt.c
+++ b/hw/net/net_rx_pkt.c
@@ -236,11 +236,6 @@ eth_ip4_hdr_info *net_rx_pkt_get_ip4_info(struct NetRxPkt *pkt)
     return &pkt->ip4hdr_info;
 }
 
-eth_l4_hdr_info *net_rx_pkt_get_l4_info(struct NetRxPkt *pkt)
-{
-    return &pkt->l4hdr_info;
-}
-
 static inline void
 _net_rx_rss_add_chunk(uint8_t *rss_input, size_t *bytes_written,
                       void *ptr, size_t size)
diff --git a/hw/net/net_rx_pkt.h b/hw/net/net_rx_pkt.h
index a06f5c2675..ce8dbdb284 100644
--- a/hw/net/net_rx_pkt.h
+++ b/hw/net/net_rx_pkt.h
@@ -119,15 +119,6 @@ eth_ip6_hdr_info *net_rx_pkt_get_ip6_info(struct NetRxPkt *pkt);
  */
 eth_ip4_hdr_info *net_rx_pkt_get_ip4_info(struct NetRxPkt *pkt);
 
-/**
- * fetches L4 header analysis results
- *
- * Return:  pointer to analysis results structure which is stored in internal
- *          packet area.
- *
- */
-eth_l4_hdr_info *net_rx_pkt_get_l4_info(struct NetRxPkt *pkt);
-
 typedef enum {
     NetPktRssIpV4,
     NetPktRssIpV4Tcp,
-- 
2.40.0




* [PATCH 13/40] net/eth: Rename eth_setup_vlan_headers_ex
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (11 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 12/40] hw/net/net_tx_pkt: Remove net_rx_pkt_get_l4_info Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 14/40] e1000x: Share more Rx filtering logic Akihiko Odaki
                   ` (26 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

The old eth_setup_vlan_headers() has no users, so remove it and rename
eth_setup_vlan_headers_ex() to eth_setup_vlan_headers().

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/net_tx_pkt.c | 2 +-
 include/net/eth.h   | 9 +--------
 net/eth.c           | 2 +-
 3 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index cc36750c9b..ce6b102391 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -368,7 +368,7 @@ void net_tx_pkt_setup_vlan_header_ex(struct NetTxPkt *pkt,
     bool is_new;
     assert(pkt);
 
-    eth_setup_vlan_headers_ex(pkt->vec[NET_TX_PKT_L2HDR_FRAG].iov_base,
+    eth_setup_vlan_headers(pkt->vec[NET_TX_PKT_L2HDR_FRAG].iov_base,
         vlan, vlan_ethtype, &is_new);
 
     /* update l2hdrlen */
diff --git a/include/net/eth.h b/include/net/eth.h
index 9f19c3a695..e8af5742be 100644
--- a/include/net/eth.h
+++ b/include/net/eth.h
@@ -351,16 +351,9 @@ eth_strip_vlan_ex(const struct iovec *iov, int iovcnt, size_t iovoff,
 uint16_t
 eth_get_l3_proto(const struct iovec *l2hdr_iov, int iovcnt, size_t l2hdr_len);
 
-void eth_setup_vlan_headers_ex(struct eth_header *ehdr, uint16_t vlan_tag,
+void eth_setup_vlan_headers(struct eth_header *ehdr, uint16_t vlan_tag,
     uint16_t vlan_ethtype, bool *is_new);
 
-static inline void
-eth_setup_vlan_headers(struct eth_header *ehdr, uint16_t vlan_tag,
-    bool *is_new)
-{
-    eth_setup_vlan_headers_ex(ehdr, vlan_tag, ETH_P_VLAN, is_new);
-}
-
 
 uint8_t eth_get_gso_type(uint16_t l3_proto, uint8_t *l3_hdr, uint8_t l4proto);
 
diff --git a/net/eth.c b/net/eth.c
index d7b30df79f..b6ff89c460 100644
--- a/net/eth.c
+++ b/net/eth.c
@@ -21,7 +21,7 @@
 #include "net/checksum.h"
 #include "net/tap.h"
 
-void eth_setup_vlan_headers_ex(struct eth_header *ehdr, uint16_t vlan_tag,
+void eth_setup_vlan_headers(struct eth_header *ehdr, uint16_t vlan_tag,
     uint16_t vlan_ethtype, bool *is_new)
 {
     struct vlan_header *vhdr = PKT_GET_VLAN_HDR(ehdr);
-- 
2.40.0




* [PATCH 14/40] e1000x: Share more Rx filtering logic
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (12 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 13/40] net/eth: Rename eth_setup_vlan_headers_ex Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-15 19:10   ` Sriram Yagnaraman
  2023-04-14 11:37 ` [PATCH 15/40] e1000x: Take CRC into consideration for size check Akihiko Odaki
                   ` (25 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

This saves some code and enables tracepoints for e1000's VLAN filtering.
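
For readers following the shared filter, here is a minimal standalone sketch of the VLAN filter table lookup and of how the two filters combine. It is not the QEMU code: the function names, the plain array standing in for the VFTA registers, and the macro values (assumed to mirror the E1000_VFTA_* macros used in the diff) are illustrative assumptions, and the check for whether VLAN filtering is enabled is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, chosen to mirror the E1000_VFTA_* macros in the diff. */
#define VFTA_ENTRY_SHIFT          5
#define VFTA_ENTRY_MASK           0x7f
#define VFTA_ENTRY_BIT_SHIFT_MASK 0x1f
#define VFTA_WORDS                128   /* 128 x 32 bits = 4096 VLAN IDs */

/* Hypothetical helper: mark a VLAN ID as allowed in the table. */
static void vlan_filter_allow(uint32_t vfta[VFTA_WORDS], uint16_t vid)
{
    assert(vid < 4096);
    vfta[(vid >> VFTA_ENTRY_SHIFT) & VFTA_ENTRY_MASK] |=
        UINT32_C(1) << (vid & VFTA_ENTRY_BIT_SHIFT_MASK);
}

/*
 * Hypothetical helper: look up the bit selected by the tag control
 * information. The masks keep only the 12 VLAN ID bits, so the priority
 * bits in the upper part of the TCI do not affect the result.
 */
static bool vlan_filter_pass(const uint32_t vfta[VFTA_WORDS], uint16_t tci)
{
    uint32_t word = vfta[(tci >> VFTA_ENTRY_SHIFT) & VFTA_ENTRY_MASK];

    return (word & (UINT32_C(1) << (tci & VFTA_ENTRY_BIT_SHIFT_MASK))) != 0;
}

/*
 * Combined decision, in the shape used by the new receive_filter():
 * untagged packets skip the VLAN filter, and every packet must still
 * pass the address (group) filter.
 */
static bool rx_accept(bool is_vlan, bool vlan_ok, bool group_ok)
{
    return (!is_vlan || vlan_ok) && group_ok;
}
```

With an all-zero table, allowing VID 100 makes vlan_filter_pass() succeed for 100 (with or without priority bits set in the TCI) and fail for 101.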

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/e1000.c         | 35 +++++--------------------------
 hw/net/e1000e_core.c   | 47 +++++-------------------------------------
 hw/net/e1000x_common.c | 44 +++++++++++++++++++++++++++++++++------
 hw/net/e1000x_common.h |  4 +++-
 hw/net/igb_core.c      | 41 +++---------------------------------
 hw/net/trace-events    |  4 ++--
 6 files changed, 56 insertions(+), 119 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 18eb6d8876..aae5f0bdc0 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -804,36 +804,11 @@ start_xmit(E1000State *s)
 }
 
 static int
-receive_filter(E1000State *s, const uint8_t *buf, int size)
+receive_filter(E1000State *s, const void *buf)
 {
-    uint32_t rctl = s->mac_reg[RCTL];
-    int isbcast = is_broadcast_ether_addr(buf);
-    int ismcast = is_multicast_ether_addr(buf);
-
-    if (e1000x_is_vlan_packet(buf, le16_to_cpu(s->mac_reg[VET])) &&
-        e1000x_vlan_rx_filter_enabled(s->mac_reg)) {
-        uint16_t vid = lduw_be_p(&PKT_GET_VLAN_HDR(buf)->h_tci);
-        uint32_t vfta =
-            ldl_le_p((uint32_t *)(s->mac_reg + VFTA) +
-                     ((vid >> E1000_VFTA_ENTRY_SHIFT) & E1000_VFTA_ENTRY_MASK));
-        if ((vfta & (1 << (vid & E1000_VFTA_ENTRY_BIT_SHIFT_MASK))) == 0) {
-            return 0;
-        }
-    }
-
-    if (!isbcast && !ismcast && (rctl & E1000_RCTL_UPE)) { /* promiscuous ucast */
-        return 1;
-    }
-
-    if (ismcast && (rctl & E1000_RCTL_MPE)) {          /* promiscuous mcast */
-        return 1;
-    }
-
-    if (isbcast && (rctl & E1000_RCTL_BAM)) {          /* broadcast enabled */
-        return 1;
-    }
-
-    return e1000x_rx_group_filter(s->mac_reg, buf);
+    return (!e1000x_is_vlan_packet(buf, s->mac_reg[VET]) ||
+            e1000x_rx_vlan_filter(s->mac_reg, PKT_GET_VLAN_HDR(buf))) &&
+           e1000x_rx_group_filter(s->mac_reg, buf);
 }
 
 static void
@@ -949,7 +924,7 @@ e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt)
         return size;
     }
 
-    if (!receive_filter(s, filter_buf, size)) {
+    if (!receive_filter(s, filter_buf)) {
         return size;
     }
 
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index f3335194d8..743b36ddfb 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1034,48 +1034,11 @@ e1000e_rx_l4_cso_enabled(E1000ECore *core)
 }
 
 static bool
-e1000e_receive_filter(E1000ECore *core, const uint8_t *buf, int size)
+e1000e_receive_filter(E1000ECore *core, const void *buf)
 {
-    uint32_t rctl = core->mac[RCTL];
-
-    if (e1000x_is_vlan_packet(buf, core->mac[VET]) &&
-        e1000x_vlan_rx_filter_enabled(core->mac)) {
-        uint16_t vid = lduw_be_p(&PKT_GET_VLAN_HDR(buf)->h_tci);
-        uint32_t vfta =
-            ldl_le_p((uint32_t *)(core->mac + VFTA) +
-                     ((vid >> E1000_VFTA_ENTRY_SHIFT) & E1000_VFTA_ENTRY_MASK));
-        if ((vfta & (1 << (vid & E1000_VFTA_ENTRY_BIT_SHIFT_MASK))) == 0) {
-            trace_e1000e_rx_flt_vlan_mismatch(vid);
-            return false;
-        } else {
-            trace_e1000e_rx_flt_vlan_match(vid);
-        }
-    }
-
-    switch (net_rx_pkt_get_packet_type(core->rx_pkt)) {
-    case ETH_PKT_UCAST:
-        if (rctl & E1000_RCTL_UPE) {
-            return true; /* promiscuous ucast */
-        }
-        break;
-
-    case ETH_PKT_BCAST:
-        if (rctl & E1000_RCTL_BAM) {
-            return true; /* broadcast enabled */
-        }
-        break;
-
-    case ETH_PKT_MCAST:
-        if (rctl & E1000_RCTL_MPE) {
-            return true; /* promiscuous mcast */
-        }
-        break;
-
-    default:
-        g_assert_not_reached();
-    }
-
-    return e1000x_rx_group_filter(core->mac, buf);
+    return (!e1000x_is_vlan_packet(buf, core->mac[VET]) ||
+            e1000x_rx_vlan_filter(core->mac, PKT_GET_VLAN_HDR(buf))) &&
+           e1000x_rx_group_filter(core->mac, buf);
 }
 
 static inline void
@@ -1736,7 +1699,7 @@ e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
     net_rx_pkt_set_packet_type(core->rx_pkt,
         get_eth_packet_type(PKT_GET_ETH_HDR(min_buf)));
 
-    if (!e1000e_receive_filter(core, min_buf, size)) {
+    if (!e1000e_receive_filter(core, min_buf)) {
         trace_e1000e_rx_flt_dropped();
         return orig_size;
     }
diff --git a/hw/net/e1000x_common.c b/hw/net/e1000x_common.c
index 7694673bcc..6cc23138a8 100644
--- a/hw/net/e1000x_common.c
+++ b/hw/net/e1000x_common.c
@@ -58,32 +58,64 @@ bool e1000x_is_vlan_packet(const void *buf, uint16_t vet)
     return res;
 }
 
-bool e1000x_rx_group_filter(uint32_t *mac, const uint8_t *buf)
+bool e1000x_rx_vlan_filter(uint32_t *mac, const struct vlan_header *vhdr)
+{
+    if (e1000x_vlan_rx_filter_enabled(mac)) {
+        uint16_t vid = lduw_be_p(&vhdr->h_tci);
+        uint32_t vfta =
+            ldl_le_p((uint32_t *)(mac + VFTA) +
+                     ((vid >> E1000_VFTA_ENTRY_SHIFT) & E1000_VFTA_ENTRY_MASK));
+        if ((vfta & (1 << (vid & E1000_VFTA_ENTRY_BIT_SHIFT_MASK))) == 0) {
+            trace_e1000x_rx_flt_vlan_mismatch(vid);
+            return false;
+        }
+
+        trace_e1000x_rx_flt_vlan_match(vid);
+    }
+
+    return true;
+}
+
+bool e1000x_rx_group_filter(uint32_t *mac, const struct eth_header *ehdr)
 {
     static const int mta_shift[] = { 4, 3, 2, 0 };
     uint32_t f, ra[2], *rp, rctl = mac[RCTL];
 
+    if (is_broadcast_ether_addr(ehdr->h_dest)) {
+        if (rctl & E1000_RCTL_BAM) {
+            return true;
+        }
+    } else if (is_multicast_ether_addr(ehdr->h_dest)) {
+        if (rctl & E1000_RCTL_MPE) {
+            return true;
+        }
+    } else {
+        if (rctl & E1000_RCTL_UPE) {
+            return true;
+        }
+    }
+
     for (rp = mac + RA; rp < mac + RA + 32; rp += 2) {
         if (!(rp[1] & E1000_RAH_AV)) {
             continue;
         }
         ra[0] = cpu_to_le32(rp[0]);
         ra[1] = cpu_to_le32(rp[1]);
-        if (!memcmp(buf, (uint8_t *)ra, ETH_ALEN)) {
+        if (!memcmp(ehdr->h_dest, (uint8_t *)ra, ETH_ALEN)) {
             trace_e1000x_rx_flt_ucast_match((int)(rp - mac - RA) / 2,
-                                            MAC_ARG(buf));
+                                            MAC_ARG(ehdr->h_dest));
             return true;
         }
     }
-    trace_e1000x_rx_flt_ucast_mismatch(MAC_ARG(buf));
+    trace_e1000x_rx_flt_ucast_mismatch(MAC_ARG(ehdr->h_dest));
 
     f = mta_shift[(rctl >> E1000_RCTL_MO_SHIFT) & 3];
-    f = (((buf[5] << 8) | buf[4]) >> f) & 0xfff;
+    f = (((ehdr->h_dest[5] << 8) | ehdr->h_dest[4]) >> f) & 0xfff;
     if (mac[MTA + (f >> 5)] & (1 << (f & 0x1f))) {
         return true;
     }
 
-    trace_e1000x_rx_flt_inexact_mismatch(MAC_ARG(buf),
+    trace_e1000x_rx_flt_inexact_mismatch(MAC_ARG(ehdr->h_dest),
                                          (rctl >> E1000_RCTL_MO_SHIFT) & 3,
                                          f >> 5,
                                          mac[MTA + (f >> 5)]);
diff --git a/hw/net/e1000x_common.h b/hw/net/e1000x_common.h
index 0298e06283..be291684de 100644
--- a/hw/net/e1000x_common.h
+++ b/hw/net/e1000x_common.h
@@ -107,7 +107,9 @@ bool e1000x_rx_ready(PCIDevice *d, uint32_t *mac);
 
 bool e1000x_is_vlan_packet(const void *buf, uint16_t vet);
 
-bool e1000x_rx_group_filter(uint32_t *mac, const uint8_t *buf);
+bool e1000x_rx_vlan_filter(uint32_t *mac, const struct vlan_header *vhdr);
+
+bool e1000x_rx_group_filter(uint32_t *mac, const struct eth_header *ehdr);
 
 bool e1000x_hw_rx_enabled(uint32_t *mac);
 
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 1d188b526c..5fdc8bc42d 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -976,7 +976,6 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
     uint16_t queues = 0;
     uint16_t oversized = 0;
     uint16_t vid = be16_to_cpu(l2_header->vlan[0].h_tci) & VLAN_VID_MASK;
-    bool accepted = false;
     int i;
 
     memset(rss_info, 0, sizeof(E1000E_RSSInfo));
@@ -986,16 +985,8 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
     }
 
     if (e1000x_is_vlan_packet(ehdr, core->mac[VET] & 0xffff) &&
-        e1000x_vlan_rx_filter_enabled(core->mac)) {
-        uint32_t vfta =
-            ldl_le_p((uint32_t *)(core->mac + VFTA) +
-                     ((vid >> E1000_VFTA_ENTRY_SHIFT) & E1000_VFTA_ENTRY_MASK));
-        if ((vfta & (1 << (vid & E1000_VFTA_ENTRY_BIT_SHIFT_MASK))) == 0) {
-            trace_e1000e_rx_flt_vlan_mismatch(vid);
-            return queues;
-        } else {
-            trace_e1000e_rx_flt_vlan_match(vid);
-        }
+        !e1000x_rx_vlan_filter(core->mac, PKT_GET_VLAN_HDR(ehdr))) {
+        return queues;
     }
 
     if (core->mac[MRQC] & 1) {
@@ -1103,33 +1094,7 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
             }
         }
     } else {
-        switch (net_rx_pkt_get_packet_type(core->rx_pkt)) {
-        case ETH_PKT_UCAST:
-            if (rctl & E1000_RCTL_UPE) {
-                accepted = true; /* promiscuous ucast */
-            }
-            break;
-
-        case ETH_PKT_BCAST:
-            if (rctl & E1000_RCTL_BAM) {
-                accepted = true; /* broadcast enabled */
-            }
-            break;
-
-        case ETH_PKT_MCAST:
-            if (rctl & E1000_RCTL_MPE) {
-                accepted = true; /* promiscuous mcast */
-            }
-            break;
-
-        default:
-            g_assert_not_reached();
-        }
-
-        if (!accepted) {
-            accepted = e1000x_rx_group_filter(core->mac, ehdr->h_dest);
-        }
-
+        bool accepted = e1000x_rx_group_filter(core->mac, ehdr);
         if (!accepted) {
             for (macp = core->mac + RA2; macp < core->mac + RA2 + 16; macp += 2) {
                 if (!(macp[1] & E1000_RAH_AV)) {
diff --git a/hw/net/trace-events b/hw/net/trace-events
index d35554fce8..a34d196ff7 100644
--- a/hw/net/trace-events
+++ b/hw/net/trace-events
@@ -106,6 +106,8 @@ e1000_receiver_overrun(size_t s, uint32_t rdh, uint32_t rdt) "Receiver overrun:
 # e1000x_common.c
 e1000x_rx_can_recv_disabled(bool link_up, bool rx_enabled, bool pci_master) "link_up: %d, rx_enabled %d, pci_master %d"
 e1000x_vlan_is_vlan_pkt(bool is_vlan_pkt, uint16_t eth_proto, uint16_t vet) "Is VLAN packet: %d, ETH proto: 0x%X, VET: 0x%X"
+e1000x_rx_flt_vlan_mismatch(uint16_t vid) "VID mismatch: 0x%X"
+e1000x_rx_flt_vlan_match(uint16_t vid) "VID match: 0x%X"
 e1000x_rx_flt_ucast_match(uint32_t idx, uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3, uint8_t b4, uint8_t b5) "unicast match[%d]: %02x:%02x:%02x:%02x:%02x:%02x"
 e1000x_rx_flt_ucast_mismatch(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3, uint8_t b4, uint8_t b5) "unicast mismatch: %02x:%02x:%02x:%02x:%02x:%02x"
 e1000x_rx_flt_inexact_mismatch(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3, uint8_t b4, uint8_t b5, uint32_t mo, uint32_t mta, uint32_t mta_val) "inexact mismatch: %02x:%02x:%02x:%02x:%02x:%02x MO %d MTA[%d] 0x%x"
@@ -154,8 +156,6 @@ e1000e_rx_can_recv_rings_full(void) "Cannot receive: all rings are full"
 e1000e_rx_can_recv(void) "Can receive"
 e1000e_rx_has_buffers(int ridx, uint32_t free_desc, size_t total_size, uint32_t desc_buf_size) "ring #%d: free descr: %u, packet size %zu, descr buffer size %u"
 e1000e_rx_null_descriptor(void) "Null RX descriptor!!"
-e1000e_rx_flt_vlan_mismatch(uint16_t vid) "VID mismatch: 0x%X"
-e1000e_rx_flt_vlan_match(uint16_t vid) "VID match: 0x%X"
 e1000e_rx_desc_ps_read(uint64_t a0, uint64_t a1, uint64_t a2, uint64_t a3) "buffers: [0x%"PRIx64", 0x%"PRIx64", 0x%"PRIx64", 0x%"PRIx64"]"
 e1000e_rx_desc_ps_write(uint16_t a0, uint16_t a1, uint16_t a2, uint16_t a3) "bytes written: [%u, %u, %u, %u]"
 e1000e_rx_desc_buff_sizes(uint32_t b0, uint32_t b1, uint32_t b2, uint32_t b3) "buffer sizes: [%u, %u, %u, %u]"
-- 
2.40.0




* [PATCH 15/40] e1000x: Take CRC into consideration for size check
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (13 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 14/40] e1000x: Share more Rx filtering logic Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 15:03   ` Philippe Mathieu-Daudé
  2023-04-14 11:37 ` [PATCH 16/40] e1000e: Always log status after building rx metadata Akihiko Odaki
                   ` (24 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Section 13.7.15 Receive Length Error Count says:
>  Packets over 1522 bytes are oversized if LongPacketEnable is 0b
> (RCTL.LPE). If LongPacketEnable (LPE) is 1b, then an incoming packet
> is considered oversized if it exceeds 16384 bytes.

> These lengths are based on bytes in the received packet from
> <Destination Address> through <CRC>, inclusively.

As QEMU processes packets without CRC, the number of bytes for CRC
needs to be subtracted.
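
The arithmetic can be sketched as follows. This is a simplified standalone illustration rather than the body of e1000x_is_oversized(): the function and macro names are hypothetical, and any other conditions the real function applies are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ETH_FCS_LEN 4   /* length of the Ethernet CRC (FCS) in bytes */

/* Limits quoted from the datasheet, counted from DA through CRC. */
#define MAX_FRAME_WITH_CRC      1522
#define MAX_LPE_FRAME_WITH_CRC  (16 * 1024)

/*
 * size_without_crc is the packet length as the emulator sees it, i.e.
 * without the trailing CRC, so the CRC length is subtracted from the
 * datasheet limit before comparing.
 */
static bool frame_is_oversized(size_t size_without_crc, bool lpe)
{
    size_t limit_with_crc = lpe ? MAX_LPE_FRAME_WITH_CRC : MAX_FRAME_WITH_CRC;

    assert(limit_with_crc > ETH_FCS_LEN);
    return size_without_crc > limit_with_crc - ETH_FCS_LEN;
}
```

Under these assumptions a 1518-byte packet (1522 bytes on the wire) is still accepted with LPE cleared, while a 1519-byte packet is treated as oversized.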

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/e1000x_common.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/net/e1000x_common.c b/hw/net/e1000x_common.c
index 6cc23138a8..b4dfc74b66 100644
--- a/hw/net/e1000x_common.c
+++ b/hw/net/e1000x_common.c
@@ -142,10 +142,10 @@ bool e1000x_is_oversized(uint32_t *mac, size_t size)
 {
     /* this is the size past which hardware will
        drop packets when setting LPE=0 */
-    static const int maximum_ethernet_vlan_size = 1522;
+    static const int maximum_ethernet_vlan_size = 1522 - 4;
     /* this is the size past which hardware will
        drop packets when setting LPE=1 */
-    static const int maximum_ethernet_lpe_size = 16 * KiB;
+    static const int maximum_ethernet_lpe_size = 16 * KiB - 4;
 
     if ((size > maximum_ethernet_lpe_size ||
         (size > maximum_ethernet_vlan_size
-- 
2.40.0




* [PATCH 16/40] e1000e: Always log status after building rx metadata
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (14 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 15/40] e1000x: Take CRC into consideration for size check Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 15:04   ` Philippe Mathieu-Daudé
  2023-04-14 11:37 ` [PATCH 17/40] igb: " Akihiko Odaki
                   ` (23 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Without this change, the status flags may not be traced, e.g. if checksum
offloading is disabled.
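
The effect of moving the trace call below the label can be shown with a small standalone sketch. The names here are hypothetical stand-ins (a counter replaces the real tracepoint) and the status values are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static unsigned trace_count;   /* hypothetical stand-in for a tracepoint */

static void trace_status(uint32_t status)
{
    (void)status;
    trace_count++;
}

static void build_metadata(bool early_exit, uint32_t *status)
{
    *status = 0x1;

    if (early_exit) {
        goto func_exit;        /* skips everything up to the label */
    }

    *status |= 0x2;

func_exit:
    /* Placed after the label, so it runs on the early-exit path too. */
    trace_status(*status);
}
```

Had trace_status() been placed just before the label, the early-exit path would return without tracing anything.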

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/e1000e_core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 743b36ddfb..dfa896adef 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1244,9 +1244,8 @@ e1000e_build_rx_metadata(E1000ECore *core,
         trace_e1000e_rx_metadata_l4_cso_disabled();
     }
 
-    trace_e1000e_rx_metadata_status_flags(*status_flags);
-
 func_exit:
+    trace_e1000e_rx_metadata_status_flags(*status_flags);
     *status_flags = cpu_to_le32(*status_flags);
 }
 
-- 
2.40.0




* [PATCH 17/40] igb: Always log status after building rx metadata
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (15 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 16/40] e1000e: Always log status after building rx metadata Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 15:07   ` Philippe Mathieu-Daudé
  2023-04-14 11:37 ` [PATCH 18/40] igb: Remove goto Akihiko Odaki
                   ` (22 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Without this change, the status flags may not be traced, e.g. if checksum
offloading is disabled.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/igb_core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 5fdc8bc42d..ccc5a626b4 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1303,9 +1303,8 @@ igb_build_rx_metadata(IGBCore *core,
         trace_e1000e_rx_metadata_l4_cso_disabled();
     }
 
-    trace_e1000e_rx_metadata_status_flags(*status_flags);
-
 func_exit:
+    trace_e1000e_rx_metadata_status_flags(*status_flags);
     *status_flags = cpu_to_le32(*status_flags);
 }
 
-- 
2.40.0




* [PATCH 18/40] igb: Remove goto
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (16 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 17/40] igb: " Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-15 19:08   ` Sriram Yagnaraman
  2023-04-14 11:37 ` [PATCH 19/40] igb: Read DCMD.VLE of the first Tx descriptor Akihiko Odaki
                   ` (21 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

The goto is a bit confusing as it changes the control flow only if the
L4 protocol is not recognized. It also differs from e1000e, which adds
noise when comparing e1000e and igb.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/igb_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index ccc5a626b4..cca71611fe 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1297,7 +1297,7 @@ igb_build_rx_metadata(IGBCore *core,
             break;
 
         default:
-            goto func_exit;
+            break;
         }
     } else {
         trace_e1000e_rx_metadata_l4_cso_disabled();
-- 
2.40.0




* [PATCH 19/40] igb: Read DCMD.VLE of the first Tx descriptor
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (17 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 18/40] igb: Remove goto Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-15 19:08   ` Sriram Yagnaraman
  2023-04-14 11:37 ` [PATCH 20/40] e1000e: Reset packet state after emptying Tx queue Akihiko Odaki
                   ` (20 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Section 7.2.2.3 Advanced Transmit Data Descriptor says:
> For frames that spans multiple descriptors, all fields apart from
> DCMD.EOP, DCMD.RS, DCMD.DEXT, DTALEN, Address and DTYP are valid only
> in the first descriptors and are ignored in the subsequent ones.
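
A standalone sketch of the rule quoted above, assuming a simplified per-queue state: the command field of the first descriptor is latched, and the VLE bit is read from that latched value when the frame ends. The struct and function names are hypothetical; the two bit values match the DCMD.EOP and DCMD.VLE definitions elsewhere in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TXD_CMD_EOP UINT32_C(0x01000000)   /* End of Packet */
#define TXD_CMD_VLE UINT32_C(0x40000000)   /* VLAN packet enable */

struct tx_state {
    bool first;                    /* next descriptor starts a new frame */
    uint32_t first_cmd_type_len;   /* latched from the first descriptor */
};

/*
 * Process one data descriptor. Returns true when the frame is complete
 * and stores in *insert_vlan whether VLAN insertion was requested by
 * the FIRST descriptor of the frame; VLE in later descriptors is ignored.
 */
static bool process_desc(struct tx_state *tx, uint32_t cmd_type_len,
                         bool *insert_vlan)
{
    if (tx->first) {
        tx->first_cmd_type_len = cmd_type_len;
        tx->first = false;
    }

    if (!(cmd_type_len & TXD_CMD_EOP)) {
        return false;
    }

    *insert_vlan = (tx->first_cmd_type_len & TXD_CMD_VLE) != 0;
    tx->first = true;
    return true;
}
```

For a two-descriptor frame where only the first descriptor sets VLE, the sketch reports that the tag should be inserted; testing the last descriptor's field instead would miss it.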

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/igb_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index cca71611fe..e5a7021c0e 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -613,7 +613,7 @@ igb_process_tx_desc(IGBCore *core,
             idx = (tx->first_olinfo_status >> 4) & 1;
             igb_tx_insert_vlan(core, queue_index, tx,
                 tx->ctx[idx].vlan_macip_lens >> 16,
-                !!(cmd_type_len & E1000_TXD_CMD_VLE));
+                !!(tx->first_cmd_type_len & E1000_TXD_CMD_VLE));
 
             if (igb_tx_pkt_send(core, tx, queue_index)) {
                 igb_on_tx_done_update_stats(core, tx->tx_pkt, queue_index);
-- 
2.40.0




* [PATCH 20/40] e1000e: Reset packet state after emptying Tx queue
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (18 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 19/40] igb: Read DCMD.VLE of the first Tx descriptor Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 21/40] vmxnet3: " Akihiko Odaki
                   ` (19 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Keeping Tx packet state after the transmit queue is emptied has some
problems:
- The datasheet says the descriptors can be reused after the transmit
  queue is emptied, but the Tx packet state may keep references to them.
- The Tx packet state cannot be migrated, so it can be reset whenever a
  migration happens.

Always reset the Tx packet state after the queue is emptied.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/e1000e_core.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index dfa896adef..33ffc36c67 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -959,6 +959,8 @@ e1000e_start_xmit(E1000ECore *core, const E1000E_TxRing *txr)
     if (!ide || !e1000e_intrmgr_delay_tx_causes(core, &cause)) {
         e1000e_set_interrupt_cause(core, cause);
     }
+
+    net_tx_pkt_reset(txr->tx->tx_pkt, net_tx_pkt_unmap_frag_pci, core->owner);
 }
 
 static bool
@@ -3389,8 +3391,6 @@ e1000e_core_pci_uninit(E1000ECore *core)
     qemu_del_vm_change_state_handler(core->vmstate);
 
     for (i = 0; i < E1000E_NUM_QUEUES; i++) {
-        net_tx_pkt_reset(core->tx[i].tx_pkt,
-                         net_tx_pkt_unmap_frag_pci, core->owner);
         net_tx_pkt_uninit(core->tx[i].tx_pkt);
     }
 
@@ -3515,8 +3515,6 @@ static void e1000e_reset(E1000ECore *core, bool sw)
     e1000x_reset_mac_addr(core->owner_nic, core->mac, core->permanent_mac);
 
     for (i = 0; i < ARRAY_SIZE(core->tx); i++) {
-        net_tx_pkt_reset(core->tx[i].tx_pkt,
-                         net_tx_pkt_unmap_frag_pci, core->owner);
         memset(&core->tx[i].props, 0, sizeof(core->tx[i].props));
         core->tx[i].skip_cp = false;
     }
-- 
2.40.0




* [PATCH 21/40] vmxnet3: Reset packet state after emptying Tx queue
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (19 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 20/40] e1000e: Reset packet state after emptying Tx queue Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 22/40] igb: Add more definitions for Tx descriptor Akihiko Odaki
                   ` (18 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Keeping the Tx packet state after the transmit queue is emptied is
unreliable, as the state can be reset whenever a migration happens.

Always reset the Tx packet state after the queue is emptied.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/vmxnet3.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 05f41b6dfa..18b9edfdb2 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -681,6 +681,8 @@ static void vmxnet3_process_tx_queue(VMXNET3State *s, int qidx)
                              net_tx_pkt_unmap_frag_pci, PCI_DEVICE(s));
         }
     }
+
+    net_tx_pkt_reset(s->tx_pkt, net_tx_pkt_unmap_frag_pci, PCI_DEVICE(s));
 }
 
 static inline void
@@ -1159,7 +1161,6 @@ static void vmxnet3_deactivate_device(VMXNET3State *s)
 {
     if (s->device_active) {
         VMW_CBPRN("Deactivating vmxnet3...");
-        net_tx_pkt_reset(s->tx_pkt, net_tx_pkt_unmap_frag_pci, PCI_DEVICE(s));
         net_tx_pkt_uninit(s->tx_pkt);
         net_rx_pkt_uninit(s->rx_pkt);
         s->device_active = false;
-- 
2.40.0




* [PATCH 22/40] igb: Add more definitions for Tx descriptor
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (20 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 21/40] vmxnet3: " Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-15 19:08   ` Sriram Yagnaraman
  2023-04-14 11:37 ` [PATCH 23/40] igb: Share common VF constants Akihiko Odaki
                   ` (17 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/igb_core.c |  2 +-
 hw/net/igb_regs.h | 32 +++++++++++++++++++++++++++-----
 2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index e5a7021c0e..350462c40c 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -418,7 +418,7 @@ igb_setup_tx_offloads(IGBCore *core, struct igb_tx *tx)
 {
     if (tx->first_cmd_type_len & E1000_ADVTXD_DCMD_TSE) {
         uint32_t idx = (tx->first_olinfo_status >> 4) & 1;
-        uint32_t mss = tx->ctx[idx].mss_l4len_idx >> 16;
+        uint32_t mss = tx->ctx[idx].mss_l4len_idx >> E1000_ADVTXD_MSS_SHIFT;
         if (!net_tx_pkt_build_vheader(tx->tx_pkt, true, true, mss)) {
             return false;
         }
diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index c5c5b3c3b8..22ce909173 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -42,11 +42,6 @@ union e1000_adv_tx_desc {
     } wb;
 };
 
-#define E1000_ADVTXD_DTYP_CTXT  0x00200000 /* Advanced Context Descriptor */
-#define E1000_ADVTXD_DTYP_DATA  0x00300000 /* Advanced Data Descriptor */
-#define E1000_ADVTXD_DCMD_DEXT  0x20000000 /* Descriptor Extension (1=Adv) */
-#define E1000_ADVTXD_DCMD_TSE   0x80000000 /* TCP/UDP Segmentation Enable */
-
 #define E1000_ADVTXD_POTS_IXSM  0x00000100 /* Insert TCP/UDP Checksum */
 #define E1000_ADVTXD_POTS_TXSM  0x00000200 /* Insert TCP/UDP Checksum */
 
@@ -151,6 +146,10 @@ union e1000_adv_rx_desc {
 #define IGB_82576_VF_DEV_ID        0x10CA
 #define IGB_I350_VF_DEV_ID         0x1520
 
+/* VLAN info */
+#define IGB_TX_FLAGS_VLAN_MASK     0xffff0000
+#define IGB_TX_FLAGS_VLAN_SHIFT    16
+
 /* from igb/e1000_82575.h */
 
 #define E1000_MRQC_ENABLE_RSS_MQ            0x00000002
@@ -160,6 +159,29 @@ union e1000_adv_rx_desc {
 #define E1000_MRQC_RSS_FIELD_IPV6_UDP       0x00800000
 #define E1000_MRQC_RSS_FIELD_IPV6_UDP_EX    0x01000000
 
+/* Adv Transmit Descriptor Config Masks */
+#define E1000_ADVTXD_MAC_TSTAMP   0x00080000 /* IEEE1588 Timestamp packet */
+#define E1000_ADVTXD_DTYP_CTXT    0x00200000 /* Advanced Context Descriptor */
+#define E1000_ADVTXD_DTYP_DATA    0x00300000 /* Advanced Data Descriptor */
+#define E1000_ADVTXD_DCMD_EOP     0x01000000 /* End of Packet */
+#define E1000_ADVTXD_DCMD_IFCS    0x02000000 /* Insert FCS (Ethernet CRC) */
+#define E1000_ADVTXD_DCMD_RS      0x08000000 /* Report Status */
+#define E1000_ADVTXD_DCMD_DEXT    0x20000000 /* Descriptor extension (1=Adv) */
+#define E1000_ADVTXD_DCMD_VLE     0x40000000 /* VLAN pkt enable */
+#define E1000_ADVTXD_DCMD_TSE     0x80000000 /* TCP Seg enable */
+#define E1000_ADVTXD_PAYLEN_SHIFT    14 /* Adv desc PAYLEN shift */
+
+#define E1000_ADVTXD_MACLEN_SHIFT    9  /* Adv ctxt desc mac len shift */
+#define E1000_ADVTXD_TUCMD_L4T_UDP 0x00000000  /* L4 Packet TYPE of UDP */
+#define E1000_ADVTXD_TUCMD_IPV4    0x00000400  /* IP Packet Type: 1=IPv4 */
+#define E1000_ADVTXD_TUCMD_L4T_TCP 0x00000800  /* L4 Packet TYPE of TCP */
+#define E1000_ADVTXD_TUCMD_L4T_SCTP 0x00001000 /* L4 packet TYPE of SCTP */
+/* IPSec Encrypt Enable for ESP */
+#define E1000_ADVTXD_L4LEN_SHIFT     8  /* Adv ctxt L4LEN shift */
+#define E1000_ADVTXD_MSS_SHIFT      16  /* Adv ctxt MSS shift */
+/* Adv ctxt IPSec SA IDX mask */
+/* Adv ctxt IPSec ESP len mask */
+
 /* Additional Transmit Descriptor Control definitions */
 #define E1000_TXDCTL_QUEUE_ENABLE  0x02000000 /* Enable specific Tx Queue */
 
-- 
2.40.0




* [PATCH 23/40] igb: Share common VF constants
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (21 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 22/40] igb: Add more definitions for Tx descriptor Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 15:08   ` Philippe Mathieu-Daudé
  2023-04-14 11:37 ` [PATCH 24/40] igb: Fix igb_mac_reg_init alignment Akihiko Odaki
                   ` (16 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

The constants need to be consistent between the PF and VF.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/igb.c        | 10 +++++-----
 hw/net/igb_common.h |  8 ++++++++
 hw/net/igbvf.c      |  7 -------
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/hw/net/igb.c b/hw/net/igb.c
index 51a7e9133e..1c989d7677 100644
--- a/hw/net/igb.c
+++ b/hw/net/igb.c
@@ -433,16 +433,16 @@ static void igb_pci_realize(PCIDevice *pci_dev, Error **errp)
 
     pcie_ari_init(pci_dev, 0x150, 1);
 
-    pcie_sriov_pf_init(pci_dev, IGB_CAP_SRIOV_OFFSET, "igbvf",
+    pcie_sriov_pf_init(pci_dev, IGB_CAP_SRIOV_OFFSET, TYPE_IGBVF,
         IGB_82576_VF_DEV_ID, IGB_MAX_VF_FUNCTIONS, IGB_MAX_VF_FUNCTIONS,
         IGB_VF_OFFSET, IGB_VF_STRIDE);
 
-    pcie_sriov_pf_init_vf_bar(pci_dev, 0,
+    pcie_sriov_pf_init_vf_bar(pci_dev, IGBVF_MMIO_BAR_IDX,
         PCI_BASE_ADDRESS_MEM_TYPE_64 | PCI_BASE_ADDRESS_MEM_PREFETCH,
-        16 * KiB);
-    pcie_sriov_pf_init_vf_bar(pci_dev, 3,
+        IGBVF_MMIO_SIZE);
+    pcie_sriov_pf_init_vf_bar(pci_dev, IGBVF_MSIX_BAR_IDX,
         PCI_BASE_ADDRESS_MEM_TYPE_64 | PCI_BASE_ADDRESS_MEM_PREFETCH,
-        16 * KiB);
+        IGBVF_MSIX_SIZE);
 
     igb_init_net_peer(s, pci_dev, macaddr);
 
diff --git a/hw/net/igb_common.h b/hw/net/igb_common.h
index 69ac490f75..f2a9065791 100644
--- a/hw/net/igb_common.h
+++ b/hw/net/igb_common.h
@@ -28,6 +28,14 @@
 
 #include "igb_regs.h"
 
+#define TYPE_IGBVF "igbvf"
+
+#define IGBVF_MMIO_BAR_IDX  (0)
+#define IGBVF_MSIX_BAR_IDX  (3)
+
+#define IGBVF_MMIO_SIZE     (16 * 1024)
+#define IGBVF_MSIX_SIZE     (16 * 1024)
+
 #define defreg(x) x = (E1000_##x >> 2)
 #define defreg_indexed(x, i) x##i = (E1000_##x(i) >> 2)
 #define defreg_indexeda(x, i) x##i##_A = (E1000_##x##_A(i) >> 2)
diff --git a/hw/net/igbvf.c b/hw/net/igbvf.c
index 70beb7af50..284ea61184 100644
--- a/hw/net/igbvf.c
+++ b/hw/net/igbvf.c
@@ -50,15 +50,8 @@
 #include "trace.h"
 #include "qapi/error.h"
 
-#define TYPE_IGBVF "igbvf"
 OBJECT_DECLARE_SIMPLE_TYPE(IgbVfState, IGBVF)
 
-#define IGBVF_MMIO_BAR_IDX  (0)
-#define IGBVF_MSIX_BAR_IDX  (3)
-
-#define IGBVF_MMIO_SIZE     (16 * 1024)
-#define IGBVF_MSIX_SIZE     (16 * 1024)
-
 struct IgbVfState {
     PCIDevice parent_obj;
 
-- 
2.40.0




* [PATCH 24/40] igb: Fix igb_mac_reg_init alignment
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (22 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 23/40] igb: Share common VF constants Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 15:09   ` Philippe Mathieu-Daudé
  2023-04-14 11:37 ` [PATCH 25/40] net/eth: Use void pointers Akihiko Odaki
                   ` (15 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/igb_core.c | 96 +++++++++++++++++++++++------------------------
 1 file changed, 48 insertions(+), 48 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 350462c40c..429b0ebc03 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -4027,54 +4027,54 @@ static const uint32_t igb_mac_reg_init[] = {
     [VMOLR0 ... VMOLR0 + 7] = 0x2600 | E1000_VMOLR_STRCRC,
     [RPLOLR]        = E1000_RPLOLR_STRCRC,
     [RLPML]         = 0x2600,
-    [TXCTL0]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
-    [TXCTL1]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
-    [TXCTL2]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
-    [TXCTL3]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
-    [TXCTL4]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
-    [TXCTL5]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
-    [TXCTL6]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
-    [TXCTL7]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
-    [TXCTL8]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
-    [TXCTL9]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
-    [TXCTL10]      = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
-    [TXCTL11]      = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
-    [TXCTL12]      = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
-    [TXCTL13]      = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
-    [TXCTL14]      = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
-    [TXCTL15]      = E1000_DCA_TXCTRL_DATA_RRO_EN |
-                     E1000_DCA_TXCTRL_TX_WB_RO_EN |
-                     E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL0]        = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL1]        = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL2]        = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL3]        = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL4]        = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL5]        = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL6]        = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL7]        = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL8]        = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL9]        = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL10]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL11]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL12]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL13]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL14]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
+    [TXCTL15]       = E1000_DCA_TXCTRL_DATA_RRO_EN |
+                      E1000_DCA_TXCTRL_TX_WB_RO_EN |
+                      E1000_DCA_TXCTRL_DESC_RRO_EN,
 };
 
 static void igb_reset(IGBCore *core, bool sw)
-- 
2.40.0




* [PATCH 25/40] net/eth: Use void pointers
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (23 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 24/40] igb: Fix igb_mac_reg_init alignment Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 15:10   ` Philippe Mathieu-Daudé
  2023-04-14 11:37 ` [PATCH 26/40] net/eth: Always add VLAN tag Akihiko Odaki
                   ` (14 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

The uses of uint8_t pointers were misleading, as the buffers are never
accessed as arrays of octets, and accessing them as struct eth_header
even requires stricter alignment.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 include/net/eth.h | 4 ++--
 net/eth.c         | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/net/eth.h b/include/net/eth.h
index e8af5742be..2f87a72170 100644
--- a/include/net/eth.h
+++ b/include/net/eth.h
@@ -340,12 +340,12 @@ eth_get_pkt_tci(const void *p)
 
 size_t
 eth_strip_vlan(const struct iovec *iov, int iovcnt, size_t iovoff,
-               uint8_t *new_ehdr_buf,
+               void *new_ehdr_buf,
                uint16_t *payload_offset, uint16_t *tci);
 
 size_t
 eth_strip_vlan_ex(const struct iovec *iov, int iovcnt, size_t iovoff,
-                  uint16_t vet, uint8_t *new_ehdr_buf,
+                  uint16_t vet, void *new_ehdr_buf,
                   uint16_t *payload_offset, uint16_t *tci);
 
 uint16_t
diff --git a/net/eth.c b/net/eth.c
index b6ff89c460..f7ffbda600 100644
--- a/net/eth.c
+++ b/net/eth.c
@@ -226,11 +226,11 @@ void eth_get_protocols(const struct iovec *iov, size_t iovcnt, size_t iovoff,
 
 size_t
 eth_strip_vlan(const struct iovec *iov, int iovcnt, size_t iovoff,
-               uint8_t *new_ehdr_buf,
+               void *new_ehdr_buf,
                uint16_t *payload_offset, uint16_t *tci)
 {
     struct vlan_header vlan_hdr;
-    struct eth_header *new_ehdr = (struct eth_header *) new_ehdr_buf;
+    struct eth_header *new_ehdr = new_ehdr_buf;
 
     size_t copied = iov_to_buf(iov, iovcnt, iovoff,
                                new_ehdr, sizeof(*new_ehdr));
@@ -276,7 +276,7 @@ eth_strip_vlan(const struct iovec *iov, int iovcnt, size_t iovoff,
 
 size_t
 eth_strip_vlan_ex(const struct iovec *iov, int iovcnt, size_t iovoff,
-                  uint16_t vet, uint8_t *new_ehdr_buf,
+                  uint16_t vet, void *new_ehdr_buf,
                   uint16_t *payload_offset, uint16_t *tci)
 {
     struct vlan_header vlan_hdr;
-- 
2.40.0




* [PATCH 26/40] net/eth: Always add VLAN tag
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (24 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 25/40] net/eth: Use void pointers Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 27/40] hw/net/net_rx_pkt: Enforce alignment for eth_header Akihiko Odaki
                   ` (13 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

It is possible to have another VLAN tag even if the packet is already
tagged.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/net_tx_pkt.c | 16 +++++++---------
 include/net/eth.h   |  4 ++--
 net/eth.c           | 22 ++++++----------------
 3 files changed, 15 insertions(+), 27 deletions(-)
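
Editor's note: the sketch below illustrates the "always add a tag" behaviour
this patch introduces, outside of QEMU. It assumes simplified stand-ins for
struct eth_header and struct vlan_header, and the names l2_hdr, to_be16 and
push_vlan_tag are hypothetical helpers, not part of the patch. The caller is
assumed to guarantee that the buffer has room for one more tag.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6
#define ETH_HLEN 14

/* Simplified stand-ins for the QEMU structures (assumption). */
struct eth_header {
    uint8_t  h_dest[ETH_ALEN];
    uint8_t  h_source[ETH_ALEN];
    uint16_t h_proto;            /* stored in network byte order */
};

struct vlan_header {
    uint16_t h_tci;              /* stored in network byte order */
    uint16_t h_proto;            /* stored in network byte order */
};

/* Hypothetical L2 header buffer with room for up to three tags. */
struct l2_hdr {
    struct eth_header eth;
    struct vlan_header vlan[3];
};

/* Convert a host-order value to network byte order, portably. */
static uint16_t to_be16(uint16_t v)
{
    uint8_t b[2] = { (uint8_t)(v >> 8), (uint8_t)(v & 0xff) };
    uint16_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

/*
 * Insert a new outermost VLAN tag unconditionally, even if the header
 * already carries one. *hdr_size is the current L2 header length and is
 * grown by one tag.
 */
static void push_vlan_tag(struct l2_hdr *hdr, size_t *hdr_size,
                          uint16_t tci, uint16_t vlan_ethtype)
{
    struct vlan_header *vhdr = &hdr->vlan[0];

    assert(*hdr_size >= ETH_HLEN);
    assert(*hdr_size + sizeof(*vhdr) <= sizeof(*hdr));

    /* Shift the existing tags one slot toward the payload. */
    memmove(vhdr + 1, vhdr, *hdr_size - ETH_HLEN);
    vhdr->h_tci = to_be16(tci);
    vhdr->h_proto = hdr->eth.h_proto;   /* the old EtherType moves inward */
    hdr->eth.h_proto = to_be16(vlan_ethtype);
    *hdr_size += sizeof(*vhdr);
}
```

Calling push_vlan_tag() twice on an untagged header yields a double-tagged
header, with the most recently pushed tag outermost.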

diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index ce6b102391..af8f77a3f0 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -40,7 +40,10 @@ struct NetTxPkt {
 
     struct iovec *vec;
 
-    uint8_t l2_hdr[ETH_MAX_L2_HDR_LEN];
+    struct {
+        struct eth_header eth;
+        struct vlan_header vlan[3];
+    } l2_hdr;
     union {
         struct ip_header ip;
         struct ip6_header ip6;
@@ -365,18 +368,13 @@ bool net_tx_pkt_build_vheader(struct NetTxPkt *pkt, bool tso_enable,
 void net_tx_pkt_setup_vlan_header_ex(struct NetTxPkt *pkt,
     uint16_t vlan, uint16_t vlan_ethtype)
 {
-    bool is_new;
     assert(pkt);
 
     eth_setup_vlan_headers(pkt->vec[NET_TX_PKT_L2HDR_FRAG].iov_base,
-        vlan, vlan_ethtype, &is_new);
+                           &pkt->vec[NET_TX_PKT_L2HDR_FRAG].iov_len,
+                           vlan, vlan_ethtype);
 
-    /* update l2hdrlen */
-    if (is_new) {
-        pkt->hdr_len += sizeof(struct vlan_header);
-        pkt->vec[NET_TX_PKT_L2HDR_FRAG].iov_len +=
-            sizeof(struct vlan_header);
-    }
+    pkt->hdr_len += sizeof(struct vlan_header);
 }
 
 bool net_tx_pkt_add_raw_fragment(struct NetTxPkt *pkt, void *base, size_t len)
diff --git a/include/net/eth.h b/include/net/eth.h
index 2f87a72170..2bbd04ec3b 100644
--- a/include/net/eth.h
+++ b/include/net/eth.h
@@ -351,8 +351,8 @@ eth_strip_vlan_ex(const struct iovec *iov, int iovcnt, size_t iovoff,
 uint16_t
 eth_get_l3_proto(const struct iovec *l2hdr_iov, int iovcnt, size_t l2hdr_len);
 
-void eth_setup_vlan_headers(struct eth_header *ehdr, uint16_t vlan_tag,
-    uint16_t vlan_ethtype, bool *is_new);
+void eth_setup_vlan_headers(struct eth_header *ehdr, size_t *ehdr_size,
+                            uint16_t vlan_tag, uint16_t vlan_ethtype);
 
 
 uint8_t eth_get_gso_type(uint16_t l3_proto, uint8_t *l3_hdr, uint8_t l4proto);
diff --git a/net/eth.c b/net/eth.c
index f7ffbda600..5307978486 100644
--- a/net/eth.c
+++ b/net/eth.c
@@ -21,26 +21,16 @@
 #include "net/checksum.h"
 #include "net/tap.h"
 
-void eth_setup_vlan_headers(struct eth_header *ehdr, uint16_t vlan_tag,
-    uint16_t vlan_ethtype, bool *is_new)
+void eth_setup_vlan_headers(struct eth_header *ehdr, size_t *ehdr_size,
+                            uint16_t vlan_tag, uint16_t vlan_ethtype)
 {
     struct vlan_header *vhdr = PKT_GET_VLAN_HDR(ehdr);
 
-    switch (be16_to_cpu(ehdr->h_proto)) {
-    case ETH_P_VLAN:
-    case ETH_P_DVLAN:
-        /* vlan hdr exists */
-        *is_new = false;
-        break;
-
-    default:
-        /* No VLAN header, put a new one */
-        vhdr->h_proto = ehdr->h_proto;
-        ehdr->h_proto = cpu_to_be16(vlan_ethtype);
-        *is_new = true;
-        break;
-    }
+    memmove(vhdr + 1, vhdr, *ehdr_size - ETH_HLEN);
     vhdr->h_tci = cpu_to_be16(vlan_tag);
+    vhdr->h_proto = ehdr->h_proto;
+    ehdr->h_proto = cpu_to_be16(vlan_ethtype);
+    *ehdr_size += sizeof(*vhdr);
 }
 
 uint8_t
-- 
2.40.0




* [PATCH 27/40] hw/net/net_rx_pkt: Enforce alignment for eth_header
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (25 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 26/40] net/eth: Always add VLAN tag Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 28/40] tests/qtest/libqos/igb: Set GPIE.Multiple_MSIX Akihiko Odaki
                   ` (12 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

eth_strip_vlan and eth_strip_vlan_ex refer to ehdr_buf as struct
eth_header. Enforce alignment for the structure.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/net_rx_pkt.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)
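
Editor's note: the sketch below shows why replacing the byte array with a
struct matters. It uses simplified stand-ins for struct eth_header and
struct vlan_header, and the names ehdr_buf_bytes, ehdr_buf_aligned and
ehdr_buf_alignment are hypothetical. It assumes a mainstream ABI on which
uint16_t has an alignment of 2.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the QEMU structures (assumption). */
struct eth_header {
    uint8_t  h_dest[6];
    uint8_t  h_source[6];
    uint16_t h_proto;
};

struct vlan_header {
    uint16_t h_tci;
    uint16_t h_proto;
};

/* Old layout: a plain byte array, which only guarantees alignment 1. */
typedef uint8_t ehdr_buf_bytes[sizeof(struct eth_header) +
                               sizeof(struct vlan_header)];

/* New layout: the compiler aligns the buffer for its strictest member. */
struct ehdr_buf_aligned {
    struct eth_header eth;
    struct vlan_header vlan;
};

/* A byte array carries no alignment guarantee beyond a single byte. */
static_assert(alignof(ehdr_buf_bytes) == 1,
              "byte array is only byte-aligned");

/* The struct is at least as aligned as the uint16_t fields inside it. */
static_assert(alignof(struct ehdr_buf_aligned) >= alignof(uint16_t),
              "struct buffer is aligned for h_proto and h_tci");

/* Report the alignment the compiler guarantees for the struct buffer. */
static size_t ehdr_buf_alignment(void)
{
    return alignof(struct ehdr_buf_aligned);
}
```

Under the stated ABI assumption both layouts occupy the same number of
bytes, so the change affects the alignment guarantee only.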

diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c
index 6125a063d7..1de42b4f51 100644
--- a/hw/net/net_rx_pkt.c
+++ b/hw/net/net_rx_pkt.c
@@ -23,7 +23,10 @@
 
 struct NetRxPkt {
     struct virtio_net_hdr virt_hdr;
-    uint8_t ehdr_buf[sizeof(struct eth_header) + sizeof(struct vlan_header)];
+    struct {
+        struct eth_header eth;
+        struct vlan_header vlan;
+    } ehdr_buf;
     struct iovec *vec;
     uint16_t vec_len_total;
     uint16_t vec_len;
@@ -89,7 +92,7 @@ net_rx_pkt_pull_data(struct NetRxPkt *pkt,
     if (pkt->ehdr_buf_len) {
         net_rx_pkt_iovec_realloc(pkt, iovcnt + 1);
 
-        pkt->vec[0].iov_base = pkt->ehdr_buf;
+        pkt->vec[0].iov_base = &pkt->ehdr_buf;
         pkt->vec[0].iov_len = pkt->ehdr_buf_len;
 
         pkt->tot_len = pllen + pkt->ehdr_buf_len;
@@ -120,7 +123,7 @@ void net_rx_pkt_attach_iovec(struct NetRxPkt *pkt,
     assert(pkt);
 
     if (strip_vlan) {
-        pkt->ehdr_buf_len = eth_strip_vlan(iov, iovcnt, iovoff, pkt->ehdr_buf,
+        pkt->ehdr_buf_len = eth_strip_vlan(iov, iovcnt, iovoff, &pkt->ehdr_buf,
                                            &ploff, &tci);
     } else {
         pkt->ehdr_buf_len = 0;
@@ -142,7 +145,7 @@ void net_rx_pkt_attach_iovec_ex(struct NetRxPkt *pkt,
 
     if (strip_vlan) {
         pkt->ehdr_buf_len = eth_strip_vlan_ex(iov, iovcnt, iovoff, vet,
-                                              pkt->ehdr_buf,
+                                              &pkt->ehdr_buf,
                                               &ploff, &tci);
     } else {
         pkt->ehdr_buf_len = 0;
-- 
2.40.0




* [PATCH 28/40] tests/qtest/libqos/igb: Set GPIE.Multiple_MSIX
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (26 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 27/40] hw/net/net_rx_pkt: Enforce alignment for eth_header Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 29/40] igb: Implement MSI-X single vector mode Akihiko Odaki
                   ` (11 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

GPIE.Multiple_MSIX is not set by default, and needs to be set to get
interrupts from multiple MSI-X vectors.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 tests/qtest/libqos/igb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/qtest/libqos/igb.c b/tests/qtest/libqos/igb.c
index 12fb531bf0..a603468beb 100644
--- a/tests/qtest/libqos/igb.c
+++ b/tests/qtest/libqos/igb.c
@@ -114,6 +114,7 @@ static void igb_pci_start_hw(QOSGraphObject *obj)
     e1000e_macreg_write(&d->e1000e, E1000_RCTL, E1000_RCTL_EN);
 
     /* Enable all interrupts */
+    e1000e_macreg_write(&d->e1000e, E1000_GPIE,  E1000_GPIE_MSIX_MODE);
     e1000e_macreg_write(&d->e1000e, E1000_IMS,  0xFFFFFFFF);
     e1000e_macreg_write(&d->e1000e, E1000_EIMS, 0xFFFFFFFF);
 
-- 
2.40.0




* [PATCH 29/40] igb: Implement MSI-X single vector mode
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (27 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 28/40] tests/qtest/libqos/igb: Set GPIE.Multiple_MSIX Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-15 19:12   ` Sriram Yagnaraman
  2023-04-14 11:37 ` [PATCH 30/40] igb: Implement igb-specific oversize check Akihiko Odaki
                   ` (10 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/igb_core.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 429b0ebc03..2013a9a53d 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1870,7 +1870,7 @@ igb_update_interrupt_state(IGBCore *core)
 
     icr = core->mac[ICR] & core->mac[IMS];
 
-    if (msix_enabled(core->owner)) {
+    if (core->mac[GPIE] & E1000_GPIE_MSIX_MODE) {
         if (icr) {
             causes = 0;
             if (icr & E1000_ICR_DRSTA) {
@@ -1905,7 +1905,12 @@ igb_update_interrupt_state(IGBCore *core)
         trace_e1000e_irq_pending_interrupts(core->mac[ICR] & core->mac[IMS],
                                             core->mac[ICR], core->mac[IMS]);
 
-        if (msi_enabled(core->owner)) {
+        if (msix_enabled(core->owner)) {
+            if (icr) {
+                trace_e1000e_irq_msix_notify_vec(0);
+                msix_notify(core->owner, 0);
+            }
+        } else if (msi_enabled(core->owner)) {
             if (icr) {
                 msi_notify(core->owner, 0);
             }
-- 
2.40.0




* [PATCH 30/40] igb: Implement igb-specific oversize check
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (28 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 29/40] igb: Implement MSI-X single vector mode Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-16 11:22   ` Sriram Yagnaraman
  2023-04-14 11:37 ` [PATCH 31/40] igb: Use UDP for RSS hash Akihiko Odaki
                   ` (9 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

igb has a configurable size limit for LPE, and uses different limits
depending on whether the packet is treated as a VLAN packet.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/igb_core.c | 41 +++++++++++++++++++++++++++--------------
 1 file changed, 27 insertions(+), 14 deletions(-)
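
Editor's note: the size rule described above can be sketched in isolation as
follows. The function name rx_is_oversized and its treated_as_vlan parameter
are hypothetical simplifications. In the patch itself the VLAN decision
comes from e1000x_is_vlan_packet() and e1000x_vlan_rx_filter_enabled(), and
lpe and rlpml are read from RCTL/RLPML or from the per-pool VMOLR register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_FCS_LEN        4     /* CRC is not included in the buffer size */
#define MAX_FRAME_SIZE     1518  /* untagged frame including CRC */
#define MAX_VLAN_FRAME     1522  /* one VLAN tag (4 bytes) more */

/*
 * Decide whether a received frame exceeds the applicable limit.
 * size_without_crc: frame length as seen by the device model
 * lpe:              long packet enable bit
 * rlpml:            configured long packet maximum length
 * treated_as_vlan:  the packet is handled as a VLAN packet
 */
static bool rx_is_oversized(size_t size_without_crc, bool lpe,
                            uint16_t rlpml, bool treated_as_vlan)
{
    size_t size = size_without_crc + ETH_FCS_LEN;

    if (lpe) {
        /* With LPE set, only the configurable limit applies. */
        return size > rlpml;
    }

    return size > (size_t)(treated_as_vlan ? MAX_VLAN_FRAME : MAX_FRAME_SIZE);
}
```

With LPE clear, a 1514-byte untagged frame (1518 with CRC) is accepted and a
1515-byte one is not. A frame treated as VLAN gets four more bytes of
headroom.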

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 2013a9a53d..569897fb99 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -954,16 +954,21 @@ igb_rx_l4_cso_enabled(IGBCore *core)
     return !!(core->mac[RXCSUM] & E1000_RXCSUM_TUOFLD);
 }
 
-static bool
-igb_rx_is_oversized(IGBCore *core, uint16_t qn, size_t size)
+static bool igb_rx_is_oversized(IGBCore *core, const struct eth_header *ehdr,
+                                size_t size, bool lpe, uint16_t rlpml)
 {
-    uint16_t pool = qn % IGB_NUM_VM_POOLS;
-    bool lpe = !!(core->mac[VMOLR0 + pool] & E1000_VMOLR_LPE);
-    int max_ethernet_lpe_size =
-        core->mac[VMOLR0 + pool] & E1000_VMOLR_RLPML_MASK;
-    int max_ethernet_vlan_size = 1522;
+    size += 4;
+
+    if (lpe) {
+        return size > rlpml;
+    }
+
+    if (e1000x_is_vlan_packet(ehdr, core->mac[VET] & 0xffff) &&
+        e1000x_vlan_rx_filter_enabled(core->mac)) {
+        return size > 1522;
+    }
 
-    return size > (lpe ? max_ethernet_lpe_size : max_ethernet_vlan_size);
+    return size > 1518;
 }
 
 static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
@@ -976,6 +981,8 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
     uint16_t queues = 0;
     uint16_t oversized = 0;
     uint16_t vid = be16_to_cpu(l2_header->vlan[0].h_tci) & VLAN_VID_MASK;
+    bool lpe;
+    uint16_t rlpml;
     int i;
 
     memset(rss_info, 0, sizeof(E1000E_RSSInfo));
@@ -984,6 +991,14 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
         *external_tx = true;
     }
 
+    lpe = !!(core->mac[RCTL] & E1000_RCTL_LPE);
+    rlpml = core->mac[RLPML];
+    if (!(core->mac[RCTL] & E1000_RCTL_SBP) &&
+        igb_rx_is_oversized(core, ehdr, size, lpe, rlpml)) {
+        trace_e1000x_rx_oversized(size);
+        return queues;
+    }
+
     if (e1000x_is_vlan_packet(ehdr, core->mac[VET] & 0xffff) &&
         !e1000x_rx_vlan_filter(core->mac, PKT_GET_VLAN_HDR(ehdr))) {
         return queues;
@@ -1067,7 +1082,10 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
         queues &= core->mac[VFRE];
         if (queues) {
             for (i = 0; i < IGB_NUM_VM_POOLS; i++) {
-                if ((queues & BIT(i)) && igb_rx_is_oversized(core, i, size)) {
+                lpe = !!(core->mac[VMOLR0 + i] & E1000_VMOLR_LPE);
+                rlpml = core->mac[VMOLR0 + i] & E1000_VMOLR_RLPML_MASK;
+                if ((queues & BIT(i)) &&
+                    igb_rx_is_oversized(core, ehdr, size, lpe, rlpml)) {
                     oversized |= BIT(i);
                 }
             }
@@ -1609,11 +1627,6 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
         iov_to_buf(iov, iovcnt, iov_ofs, &min_buf, sizeof(min_buf.l2_header));
     }
 
-    /* Discard oversized packets if !LPE and !SBP. */
-    if (e1000x_is_oversized(core->mac, size)) {
-        return orig_size;
-    }
-
     net_rx_pkt_set_packet_type(core->rx_pkt,
                                get_eth_packet_type(&min_buf.l2_header.eth));
     net_rx_pkt_set_protocols(core->rx_pkt, iov, iovcnt, iov_ofs);
-- 
2.40.0




* [PATCH 31/40] igb: Use UDP for RSS hash
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (29 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 30/40] igb: Implement igb-specific oversize check Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-15 19:45   ` Sriram Yagnaraman
  2023-04-14 11:37 ` [PATCH 32/40] igb: Implement Rx SCTP CSO Akihiko Odaki
                   ` (8 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

e1000e does not support using UDP for RSS hash, but igb does.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/igb_core.c | 16 ++++++++++++++++
 hw/net/igb_regs.h |  3 +++
 2 files changed, 19 insertions(+)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 569897fb99..3ad81b15d0 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -279,6 +279,11 @@ igb_rss_get_hash_type(IGBCore *core, struct NetRxPkt *pkt)
             return E1000_MRQ_RSS_TYPE_IPV4TCP;
         }
 
+        if (l4hdr_proto == ETH_L4_HDR_PROTO_UDP &&
+            (core->mac[MRQC] & E1000_MRQC_RSS_FIELD_IPV4_UDP)) {
+            return E1000_MRQ_RSS_TYPE_IPV4UDP;
+        }
+
         if (E1000_MRQC_EN_IPV4(core->mac[MRQC])) {
             return E1000_MRQ_RSS_TYPE_IPV4;
         }
@@ -314,6 +319,11 @@ igb_rss_get_hash_type(IGBCore *core, struct NetRxPkt *pkt)
                 return E1000_MRQ_RSS_TYPE_IPV6TCP;
             }
 
+            if (l4hdr_proto == ETH_L4_HDR_PROTO_UDP &&
+                (core->mac[MRQC] & E1000_MRQC_RSS_FIELD_IPV6_UDP)) {
+                return E1000_MRQ_RSS_TYPE_IPV6UDP;
+            }
+
             if (E1000_MRQC_EN_IPV6EX(core->mac[MRQC])) {
                 return E1000_MRQ_RSS_TYPE_IPV6EX;
             }
@@ -352,6 +362,12 @@ igb_rss_calc_hash(IGBCore *core, struct NetRxPkt *pkt, E1000E_RSSInfo *info)
     case E1000_MRQ_RSS_TYPE_IPV6EX:
         type = NetPktRssIpV6Ex;
         break;
+    case E1000_MRQ_RSS_TYPE_IPV4UDP:
+        type = NetPktRssIpV4Udp;
+        break;
+    case E1000_MRQ_RSS_TYPE_IPV6UDP:
+        type = NetPktRssIpV6Udp;
+        break;
     default:
         assert(false);
         return 0;
diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index 22ce909173..03486edb2e 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -659,6 +659,9 @@ union e1000_adv_rx_desc {
 
 #define E1000_RSS_QUEUE(reta, hash) (E1000_RETA_VAL(reta, hash) & 0x0F)
 
+#define E1000_MRQ_RSS_TYPE_IPV4UDP 7
+#define E1000_MRQ_RSS_TYPE_IPV6UDP 8
+
 #define E1000_STATUS_IOV_MODE 0x00040000
 
 #define E1000_STATUS_NUM_VFS_SHIFT 14
-- 
2.40.0




* [PATCH 32/40] igb: Implement Rx SCTP CSO
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (30 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 31/40] igb: Use UDP for RSS hash Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 33/40] igb: Implement Tx " Akihiko Odaki
                   ` (7 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/e1000e_core.c  |  5 ++++
 hw/net/igb_core.c     | 15 +++++++++-
 hw/net/net_rx_pkt.c   | 64 +++++++++++++++++++++++++++++++++++--------
 include/net/eth.h     |  4 ++-
 include/qemu/crc32c.h |  1 +
 net/eth.c             |  4 +++
 util/crc32c.c         |  8 ++++++
 7 files changed, 88 insertions(+), 13 deletions(-)
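
Editor's note: this patch has no commit message, so the sketch below is only
an illustration of the general SCTP checksum check it appears to implement:
read the stored checksum at offset 8 of the SCTP common header, zero the
field, compute CRC32c over the whole SCTP packet, and compare. It works on a
flat buffer rather than an iovec. The names crc32c_sketch, sctp_csum_fill
and sctp_csum_valid are hypothetical, the bitwise CRC is a simple
replacement for QEMU's table-driven crc32c helpers, and storing the checksum
as little-endian bytes is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SCTP_CSUM_OFFSET 8   /* checksum field within the SCTP common header */
#define SCTP_CSUM_LEN    4

/* Bitwise CRC32c (Castagnoli), reflected polynomial 0x82f63b78. */
static uint32_t crc32c_sketch(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xffffffffu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Apply the polynomial only when the low bit is set. */
            crc = (crc >> 1) ^ (0x82f63b78u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Compute the checksum with the field zeroed and store it (little-endian). */
static void sctp_csum_fill(uint8_t *pkt, size_t len)
{
    uint32_t crc;

    assert(len >= SCTP_CSUM_OFFSET + SCTP_CSUM_LEN);
    memset(pkt + SCTP_CSUM_OFFSET, 0, SCTP_CSUM_LEN);
    crc = crc32c_sketch(pkt, len);
    for (int i = 0; i < SCTP_CSUM_LEN; i++) {
        pkt[SCTP_CSUM_OFFSET + i] = (uint8_t)(crc >> (8 * i));
    }
}

/* Validate the stored checksum, restoring the packet before returning. */
static bool sctp_csum_valid(uint8_t *pkt, size_t len)
{
    uint8_t saved[SCTP_CSUM_LEN];
    uint32_t original = 0;
    uint32_t calculated;

    if (len < SCTP_CSUM_OFFSET + SCTP_CSUM_LEN) {
        return false;
    }

    memcpy(saved, pkt + SCTP_CSUM_OFFSET, SCTP_CSUM_LEN);
    for (int i = 0; i < SCTP_CSUM_LEN; i++) {
        original |= (uint32_t)saved[i] << (8 * i);
    }

    /* The checksum is defined over the packet with the field set to zero. */
    memset(pkt + SCTP_CSUM_OFFSET, 0, SCTP_CSUM_LEN);
    calculated = crc32c_sketch(pkt, len);
    memcpy(pkt + SCTP_CSUM_OFFSET, saved, SCTP_CSUM_LEN);

    return calculated == original;
}
```

A packet filled by sctp_csum_fill() validates, and flipping any payload bit
afterwards makes sctp_csum_valid() return false.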

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 33ffc36c67..9dc8b718c0 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1114,6 +1114,11 @@ e1000e_verify_csum_in_sw(E1000ECore *core,
         return;
     }
 
+    if (l4hdr_proto != ETH_L4_HDR_PROTO_TCP &&
+        l4hdr_proto != ETH_L4_HDR_PROTO_UDP) {
+        return;
+    }
+
     if (!net_rx_pkt_validate_l4_csum(pkt, &csum_valid)) {
         trace_e1000e_rx_metadata_l4_csum_validation_failed();
         return;
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 3ad81b15d0..0e1b681613 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1230,7 +1230,7 @@ igb_build_rx_metadata(IGBCore *core,
                       uint16_t *vlan_tag)
 {
     struct virtio_net_hdr *vhdr;
-    bool hasip4, hasip6;
+    bool hasip4, hasip6, csum_valid;
     EthL4HdrProto l4hdr_proto;
 
     *status_flags = E1000_RXD_STAT_DD;
@@ -1290,6 +1290,10 @@ igb_build_rx_metadata(IGBCore *core,
             *pkt_info |= BIT(9);
             break;
 
+        case ETH_L4_HDR_PROTO_SCTP:
+            *pkt_info |= BIT(10);
+            break;
+
         default:
             break;
         }
@@ -1322,6 +1326,15 @@ igb_build_rx_metadata(IGBCore *core,
 
     if (igb_rx_l4_cso_enabled(core)) {
         switch (l4hdr_proto) {
+        case ETH_L4_HDR_PROTO_SCTP:
+            if (!net_rx_pkt_validate_l4_csum(pkt, &csum_valid)) {
+                trace_e1000e_rx_metadata_l4_csum_validation_failed();
+                goto func_exit;
+            }
+            if (!csum_valid) {
+                *status_flags |= E1000_RXDEXT_STATERR_TCPE;
+            }
+            /* fall through */
         case ETH_L4_HDR_PROTO_TCP:
             *status_flags |= E1000_RXD_STAT_TCPCS;
             break;
diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c
index 1de42b4f51..3575c8b9f9 100644
--- a/hw/net/net_rx_pkt.c
+++ b/hw/net/net_rx_pkt.c
@@ -16,6 +16,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/crc32c.h"
 #include "trace.h"
 #include "net_rx_pkt.h"
 #include "net/checksum.h"
@@ -554,32 +555,73 @@ _net_rx_pkt_calc_l4_csum(struct NetRxPkt *pkt)
     return csum;
 }
 
-bool net_rx_pkt_validate_l4_csum(struct NetRxPkt *pkt, bool *csum_valid)
+static bool
+_net_rx_pkt_validate_sctp_sum(struct NetRxPkt *pkt)
 {
-    uint16_t csum;
+    size_t csum_off;
+    size_t off = pkt->l4hdr_off;
+    size_t vec_len = pkt->vec_len;
+    struct iovec *vec;
+    uint32_t calculated = 0;
+    uint32_t original;
+    bool valid;
 
-    trace_net_rx_pkt_l4_csum_validate_entry();
+    for (vec = pkt->vec; vec->iov_len < off; vec++) {
+        off -= vec->iov_len;
+        vec_len--;
+    }
 
-    if (pkt->l4hdr_info.proto != ETH_L4_HDR_PROTO_TCP &&
-        pkt->l4hdr_info.proto != ETH_L4_HDR_PROTO_UDP) {
-        trace_net_rx_pkt_l4_csum_validate_not_xxp();
+    csum_off = off + 8;
+
+    if (!iov_to_buf(vec, vec_len, csum_off, &original, sizeof(original))) {
         return false;
     }
 
-    if (pkt->l4hdr_info.proto == ETH_L4_HDR_PROTO_UDP &&
-        pkt->l4hdr_info.hdr.udp.uh_sum == 0) {
-        trace_net_rx_pkt_l4_csum_validate_udp_with_no_checksum();
+    if (!iov_from_buf(vec, vec_len, csum_off,
+                      &calculated, sizeof(calculated))) {
         return false;
     }
 
+    calculated = crc32c(0xffffffff,
+                        (uint8_t *)vec->iov_base + off, vec->iov_len - off);
+    calculated = iov_crc32c(calculated ^ 0xffffffff, vec + 1, vec_len - 1);
+    valid = calculated == le32_to_cpu(original);
+    iov_from_buf(vec, vec_len, csum_off, &original, sizeof(original));
+
+    return valid;
+}
+
+bool net_rx_pkt_validate_l4_csum(struct NetRxPkt *pkt, bool *csum_valid)
+{
+    uint32_t csum;
+
+    trace_net_rx_pkt_l4_csum_validate_entry();
+
     if (pkt->hasip4 && pkt->ip4hdr_info.fragment) {
         trace_net_rx_pkt_l4_csum_validate_ip4_fragment();
         return false;
     }
 
-    csum = _net_rx_pkt_calc_l4_csum(pkt);
+    switch (pkt->l4hdr_info.proto) {
+    case ETH_L4_HDR_PROTO_UDP:
+        if (pkt->l4hdr_info.hdr.udp.uh_sum == 0) {
+            trace_net_rx_pkt_l4_csum_validate_udp_with_no_checksum();
+            return false;
+        }
+        /* fall through */
+    case ETH_L4_HDR_PROTO_TCP:
+        csum = _net_rx_pkt_calc_l4_csum(pkt);
+        *csum_valid = ((csum == 0) || (csum == 0xFFFF));
+        break;
+
+    case ETH_L4_HDR_PROTO_SCTP:
+        *csum_valid = _net_rx_pkt_validate_sctp_sum(pkt);
+        break;
 
-    *csum_valid = ((csum == 0) || (csum == 0xFFFF));
+    default:
+        trace_net_rx_pkt_l4_csum_validate_not_xxp();
+        return false;
+    }
 
     trace_net_rx_pkt_l4_csum_validate_csum(*csum_valid);
 
diff --git a/include/net/eth.h b/include/net/eth.h
index 2bbd04ec3b..6d65b7e2cb 100644
--- a/include/net/eth.h
+++ b/include/net/eth.h
@@ -222,6 +222,7 @@ struct tcp_hdr {
 #define IP_HEADER_VERSION_6       (6)
 #define IP_PROTO_TCP              (6)
 #define IP_PROTO_UDP              (17)
+#define IP_PROTO_SCTP             (132)
 #define IPTOS_ECN_MASK            0x03
 #define IPTOS_ECN(x)              ((x) & IPTOS_ECN_MASK)
 #define IPTOS_ECN_CE              0x03
@@ -377,7 +378,8 @@ typedef struct eth_ip4_hdr_info_st {
 typedef enum EthL4HdrProto {
     ETH_L4_HDR_PROTO_INVALID,
     ETH_L4_HDR_PROTO_TCP,
-    ETH_L4_HDR_PROTO_UDP
+    ETH_L4_HDR_PROTO_UDP,
+    ETH_L4_HDR_PROTO_SCTP
 } EthL4HdrProto;
 
 typedef struct eth_l4_hdr_info_st {
diff --git a/include/qemu/crc32c.h b/include/qemu/crc32c.h
index 5b78884c38..88b4d2b3b3 100644
--- a/include/qemu/crc32c.h
+++ b/include/qemu/crc32c.h
@@ -30,5 +30,6 @@
 
 
 uint32_t crc32c(uint32_t crc, const uint8_t *data, unsigned int length);
+uint32_t iov_crc32c(uint32_t crc, const struct iovec *iov, size_t iov_cnt);
 
 #endif
diff --git a/net/eth.c b/net/eth.c
index 5307978486..7f02aea010 100644
--- a/net/eth.c
+++ b/net/eth.c
@@ -211,6 +211,10 @@ void eth_get_protocols(const struct iovec *iov, size_t iovcnt, size_t iovoff,
             *l5hdr_off = *l4hdr_off + sizeof(l4hdr_info->hdr.udp);
         }
         break;
+
+    case IP_PROTO_SCTP:
+        l4hdr_info->proto = ETH_L4_HDR_PROTO_SCTP;
+        break;
     }
 }
 
diff --git a/util/crc32c.c b/util/crc32c.c
index 762657d853..ea7f345de8 100644
--- a/util/crc32c.c
+++ b/util/crc32c.c
@@ -113,3 +113,11 @@ uint32_t crc32c(uint32_t crc, const uint8_t *data, unsigned int length)
     return crc^0xffffffff;
 }
 
+uint32_t iov_crc32c(uint32_t crc, const struct iovec *iov, size_t iov_cnt)
+{
+    while (iov_cnt--) {
+        crc = crc32c(crc, iov->iov_base, iov->iov_len) ^ 0xffffffff;
+        iov++;
+    }
+    return crc ^ 0xffffffff;
+}
-- 
2.40.0




* [PATCH 33/40] igb: Implement Tx SCTP CSO
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (31 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 32/40] igb: Implement Rx SCTP CSO Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 34/40] igb: Strip the second VLAN tag for extended VLAN Akihiko Odaki
                   ` (6 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
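As background for net_tx_pkt_update_sctp_checksum(): the SCTP checksum is
a CRC32c over the whole SCTP packet with the 4-byte checksum field (byte
offset 8 of the common header) set to zero during the calculation. The
sketch below shows the zero, compute, store sequence on a flat buffer,
plus the matching receive-side check. It is only an illustration under
stated assumptions: every demo_* name is hypothetical, the buffer is
contiguous rather than an iovec, and the little-endian store follows the
cpu_to_le32() call in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Checksum field follows the ports (4 bytes) and verification tag (4 bytes). */
#define DEMO_SCTP_CSUM_OFF 8

/* Bitwise CRC32c (reflected polynomial 0x82F63B78), seed and final XOR ~0. */
static uint32_t demo_crc32c(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xffffffffu;

    while (len--) {
        crc ^= *data++;
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xffffffffu;
}

static void demo_put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

static uint32_t demo_get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Tx side: zero the field, compute, store. Returns 0 if the packet is short. */
static int demo_sctp_fill_csum(uint8_t *sctp, size_t len)
{
    assert(sctp != NULL);
    if (len < DEMO_SCTP_CSUM_OFF + 4) {
        return 0;
    }
    demo_put_le32(sctp + DEMO_SCTP_CSUM_OFF, 0);
    demo_put_le32(sctp + DEMO_SCTP_CSUM_OFF, demo_crc32c(sctp, len));
    return 1;
}

/* Rx side: save the field, zero it, recompute, restore, compare. */
static int demo_sctp_csum_valid(uint8_t *sctp, size_t len)
{
    uint32_t original, calculated;

    assert(sctp != NULL);
    if (len < DEMO_SCTP_CSUM_OFF + 4) {
        return 0;
    }
    original = demo_get_le32(sctp + DEMO_SCTP_CSUM_OFF);
    demo_put_le32(sctp + DEMO_SCTP_CSUM_OFF, 0);
    calculated = demo_crc32c(sctp, len);
    demo_put_le32(sctp + DEMO_SCTP_CSUM_OFF, original);
    return calculated == original;
}
```

A packet filled by demo_sctp_fill_csum() should pass
demo_sctp_csum_valid(), and flipping any single bit afterwards should
make the check fail.
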
 hw/net/igb_core.c   | 12 +++++++-----
 hw/net/net_tx_pkt.c | 18 ++++++++++++++++++
 hw/net/net_tx_pkt.h |  8 ++++++++
 3 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 0e1b681613..955db1b1dc 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -432,8 +432,9 @@ igb_tx_insert_vlan(IGBCore *core, uint16_t qn, struct igb_tx *tx,
 static bool
 igb_setup_tx_offloads(IGBCore *core, struct igb_tx *tx)
 {
+    uint32_t idx = (tx->first_olinfo_status >> 4) & 1;
+
     if (tx->first_cmd_type_len & E1000_ADVTXD_DCMD_TSE) {
-        uint32_t idx = (tx->first_olinfo_status >> 4) & 1;
         uint32_t mss = tx->ctx[idx].mss_l4len_idx >> E1000_ADVTXD_MSS_SHIFT;
         if (!net_tx_pkt_build_vheader(tx->tx_pkt, true, true, mss)) {
             return false;
@@ -444,10 +445,11 @@ igb_setup_tx_offloads(IGBCore *core, struct igb_tx *tx)
         return true;
     }
 
-    if (tx->first_olinfo_status & E1000_ADVTXD_POTS_TXSM) {
-        if (!net_tx_pkt_build_vheader(tx->tx_pkt, false, true, 0)) {
-            return false;
-        }
+    if ((tx->first_olinfo_status & E1000_ADVTXD_POTS_TXSM) &&
+        !((tx->ctx[idx].type_tucmd_mlhl & E1000_ADVTXD_TUCMD_L4T_SCTP) ?
+          net_tx_pkt_update_sctp_checksum(tx->tx_pkt) :
+          net_tx_pkt_build_vheader(tx->tx_pkt, false, true, 0))) {
+        return false;
     }
 
     if (tx->first_olinfo_status & E1000_ADVTXD_POTS_IXSM) {
diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index af8f77a3f0..2e5f58b3c9 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -16,6 +16,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/crc32c.h"
 #include "net/eth.h"
 #include "net/checksum.h"
 #include "net/tap.h"
@@ -135,6 +136,23 @@ void net_tx_pkt_update_ip_checksums(struct NetTxPkt *pkt)
                  pkt->virt_hdr.csum_offset, &csum, sizeof(csum));
 }
 
+bool net_tx_pkt_update_sctp_checksum(struct NetTxPkt *pkt)
+{
+    uint32_t csum = 0;
+    struct iovec *pl_start_frag = pkt->vec + NET_TX_PKT_PL_START_FRAG;
+
+    if (iov_from_buf(pl_start_frag, pkt->payload_frags, 8, &csum, sizeof(csum)) < sizeof(csum)) {
+        return false;
+    }
+
+    csum = cpu_to_le32(iov_crc32c(0xffffffff, pl_start_frag, pkt->payload_frags));
+    if (iov_from_buf(pl_start_frag, pkt->payload_frags, 8, &csum, sizeof(csum)) < sizeof(csum)) {
+        return false;
+    }
+
+    return true;
+}
+
 static void net_tx_pkt_calculate_hdr_len(struct NetTxPkt *pkt)
 {
     pkt->hdr_len = pkt->vec[NET_TX_PKT_L2HDR_FRAG].iov_len +
diff --git a/hw/net/net_tx_pkt.h b/hw/net/net_tx_pkt.h
index f5cd44da6f..fc00d7941d 100644
--- a/hw/net/net_tx_pkt.h
+++ b/hw/net/net_tx_pkt.h
@@ -116,6 +116,14 @@ void net_tx_pkt_update_ip_checksums(struct NetTxPkt *pkt);
  */
 void net_tx_pkt_update_ip_hdr_checksum(struct NetTxPkt *pkt);
 
+/**
+ * Calculate the SCTP checksum.
+ *
+ * @pkt:            packet
+ *
+ */
+bool net_tx_pkt_update_sctp_checksum(struct NetTxPkt *pkt);
+
 /**
  * get length of all populated data.
  *
-- 
2.40.0




* [PATCH 34/40] igb: Strip the second VLAN tag for extended VLAN
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (32 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 33/40] igb: Implement Tx " Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 35/40] igb: Filter with " Akihiko Odaki
                   ` (5 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
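To illustrate the new index argument of eth_strip_vlan_ex(): index 0
strips the tag that directly follows the MAC addresses, index 1 keeps the
outer tag and strips the one after it, and a negative index strips
nothing. The sketch below reproduces that logic on a flat frame buffer.
It is an approximation for illustration only: the sk_* names and SK_*
constants are hypothetical, and it works on a contiguous buffer instead
of an iovec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_ETH_HLEN  14   /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define SK_VLAN_HLEN 4    /* TCI (2) + encapsulated EtherType (2) */

static uint16_t sk_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Build a new L2 header in new_hdr (at least 18 bytes) with the tag at
 * `index` removed. Returns the size of the new header, or 0 if nothing
 * was stripped. On success, *payload_off is where the rest of the frame
 * continues and *tci holds the stripped tag's TCI.
 */
static size_t sk_strip_vlan(const uint8_t *frame, size_t len, int index,
                            uint16_t vet, uint16_t vet_ext,
                            uint8_t *new_hdr, size_t *payload_off,
                            uint16_t *tci)
{
    size_t hdr_size;

    assert(frame && new_hdr && payload_off && tci);

    switch (index) {
    case 0:
        hdr_size = SK_ETH_HLEN;
        break;
    case 1:
        hdr_size = SK_ETH_HLEN + SK_VLAN_HLEN;
        break;
    default:
        return 0;
    }

    if (len < hdr_size + SK_VLAN_HLEN) {
        return 0;
    }
    /* For index 1 the outer tag must be announced by vet_ext. */
    if (index == 1 && sk_be16(frame + 12) != vet_ext) {
        return 0;
    }
    /* The last EtherType of the kept header must announce the tag to strip. */
    if (sk_be16(frame + hdr_size - 2) != vet) {
        return 0;
    }

    memcpy(new_hdr, frame, hdr_size);
    *tci = sk_be16(frame + hdr_size);
    /* Replace the announcing EtherType with the encapsulated one. */
    memcpy(new_hdr + hdr_size - 2, frame + hdr_size + 2, 2);
    *payload_off = hdr_size + SK_VLAN_HLEN;
    return hdr_size;
}
```

For a double-tagged frame (0x88a8 outer, 0x8100 inner), index 1 yields an
18-byte header that still carries the outer tag, while index 0 with vet
set to 0x88a8 removes the outer tag and leaves the inner one in the
payload.
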
 hw/net/e1000e_core.c |  3 ++-
 hw/net/igb_core.c    | 14 ++++++++++--
 hw/net/net_rx_pkt.c  | 15 +++++--------
 hw/net/net_rx_pkt.h  | 19 ++++++++--------
 include/net/eth.h    |  4 ++--
 net/eth.c            | 52 ++++++++++++++++++++++++++++----------------
 6 files changed, 65 insertions(+), 42 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 9dc8b718c0..56a46b1897 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1711,7 +1711,8 @@ e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
     }
 
     net_rx_pkt_attach_iovec_ex(core->rx_pkt, iov, iovcnt, iov_ofs,
-                               e1000x_vlan_enabled(core->mac), core->mac[VET]);
+                               e1000x_vlan_enabled(core->mac) ? 0 : -1,
+                               core->mac[VET], 0);
 
     e1000e_rss_parse_packet(core, core->rx_pkt, &rss_info);
     e1000e_rx_ring_init(core, &rxr, rss_info.queue);
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 955db1b1dc..6e8de9d878 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1621,6 +1621,7 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
     E1000E_RxRing rxr;
     E1000E_RSSInfo rss_info;
     size_t total_size;
+    int strip_vlan_index;
     int i;
 
     trace_e1000e_rx_receive_iov(iovcnt);
@@ -1677,9 +1678,18 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
 
         igb_rx_ring_init(core, &rxr, i);
 
+        if (!igb_rx_strip_vlan(core, rxr.i)) {
+            strip_vlan_index = -1;
+        } else if (core->mac[CTRL_EXT] & BIT(26)) {
+            strip_vlan_index = 1;
+        } else {
+            strip_vlan_index = 0;
+        }
+
         net_rx_pkt_attach_iovec_ex(core->rx_pkt, iov, iovcnt, iov_ofs,
-                                   igb_rx_strip_vlan(core, rxr.i),
-                                   core->mac[VET] & 0xffff);
+                                   strip_vlan_index,
+                                   core->mac[VET] & 0xffff,
+                                   core->mac[VET] >> 16);
 
         total_size = net_rx_pkt_get_total_len(core->rx_pkt) +
             e1000x_fcs_len(core->mac);
diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c
index 3575c8b9f9..32e5f3f9cf 100644
--- a/hw/net/net_rx_pkt.c
+++ b/hw/net/net_rx_pkt.c
@@ -137,20 +137,17 @@ void net_rx_pkt_attach_iovec(struct NetRxPkt *pkt,
 
 void net_rx_pkt_attach_iovec_ex(struct NetRxPkt *pkt,
                                 const struct iovec *iov, int iovcnt,
-                                size_t iovoff, bool strip_vlan,
-                                uint16_t vet)
+                                size_t iovoff, int strip_vlan_index,
+                                uint16_t vet, uint16_t vet_ext)
 {
     uint16_t tci = 0;
     uint16_t ploff = iovoff;
     assert(pkt);
 
-    if (strip_vlan) {
-        pkt->ehdr_buf_len = eth_strip_vlan_ex(iov, iovcnt, iovoff, vet,
-                                              &pkt->ehdr_buf,
-                                              &ploff, &tci);
-    } else {
-        pkt->ehdr_buf_len = 0;
-    }
+    pkt->ehdr_buf_len = eth_strip_vlan_ex(iov, iovcnt, iovoff,
+                                          strip_vlan_index, vet, vet_ext,
+                                          &pkt->ehdr_buf,
+                                          &ploff, &tci);
 
     pkt->tci = tci;
 
diff --git a/hw/net/net_rx_pkt.h b/hw/net/net_rx_pkt.h
index ce8dbdb284..55ec67a1a7 100644
--- a/hw/net/net_rx_pkt.h
+++ b/hw/net/net_rx_pkt.h
@@ -223,18 +223,19 @@ void net_rx_pkt_attach_iovec(struct NetRxPkt *pkt,
 /**
 * attach scatter-gather data to rx packet
 *
-* @pkt:            packet
-* @iov:            received data scatter-gather list
-* @iovcnt          number of elements in iov
-* @iovoff          data start offset in the iov
-* @strip_vlan:     should the module strip vlan from data
-* @vet:            VLAN tag Ethernet type
+* @pkt:              packet
+* @iov:              received data scatter-gather list
+* @iovcnt:           number of elements in iov
+* @iovoff:           data start offset in the iov
+* @strip_vlan_index: index of the Q tag to strip, or negative to strip none
+* @vet:              VLAN tag Ethernet type
+* @vet_ext:          outer VLAN tag Ethernet type
 *
 */
 void net_rx_pkt_attach_iovec_ex(struct NetRxPkt *pkt,
-                                   const struct iovec *iov, int iovcnt,
-                                   size_t iovoff, bool strip_vlan,
-                                   uint16_t vet);
+                                const struct iovec *iov, int iovcnt,
+                                size_t iovoff, int strip_vlan_index,
+                                uint16_t vet, uint16_t vet_ext);
 
 /**
  * attach data to rx packet
diff --git a/include/net/eth.h b/include/net/eth.h
index 6d65b7e2cb..7e76a4b139 100644
--- a/include/net/eth.h
+++ b/include/net/eth.h
@@ -345,8 +345,8 @@ eth_strip_vlan(const struct iovec *iov, int iovcnt, size_t iovoff,
                uint16_t *payload_offset, uint16_t *tci);
 
 size_t
-eth_strip_vlan_ex(const struct iovec *iov, int iovcnt, size_t iovoff,
-                  uint16_t vet, void *new_ehdr_buf,
+eth_strip_vlan_ex(const struct iovec *iov, int iovcnt, size_t iovoff, int index,
+                  uint16_t vet, uint16_t vet_ext, void *new_ehdr_buf,
                   uint16_t *payload_offset, uint16_t *tci);
 
 uint16_t
diff --git a/net/eth.c b/net/eth.c
index 7f02aea010..649e66bb1f 100644
--- a/net/eth.c
+++ b/net/eth.c
@@ -269,36 +269,50 @@ eth_strip_vlan(const struct iovec *iov, int iovcnt, size_t iovoff,
 }
 
 size_t
-eth_strip_vlan_ex(const struct iovec *iov, int iovcnt, size_t iovoff,
-                  uint16_t vet, void *new_ehdr_buf,
+eth_strip_vlan_ex(const struct iovec *iov, int iovcnt, size_t iovoff, int index,
+                  uint16_t vet, uint16_t vet_ext, void *new_ehdr_buf,
                   uint16_t *payload_offset, uint16_t *tci)
 {
     struct vlan_header vlan_hdr;
-    struct eth_header *new_ehdr = (struct eth_header *) new_ehdr_buf;
-
-    size_t copied = iov_to_buf(iov, iovcnt, iovoff,
-                               new_ehdr, sizeof(*new_ehdr));
-
-    if (copied < sizeof(*new_ehdr)) {
-        return 0;
-    }
+    uint16_t *new_ehdr_proto;
+    size_t new_ehdr_size;
+    size_t copied;
 
-    if (be16_to_cpu(new_ehdr->h_proto) == vet) {
-        copied = iov_to_buf(iov, iovcnt, iovoff + sizeof(*new_ehdr),
-                            &vlan_hdr, sizeof(vlan_hdr));
+    switch (index) {
+    case 0:
+        new_ehdr_proto = &PKT_GET_ETH_HDR(new_ehdr_buf)->h_proto;
+        new_ehdr_size = sizeof(struct eth_header);
+        copied = iov_to_buf(iov, iovcnt, iovoff, new_ehdr_buf, new_ehdr_size);
+        break;
 
-        if (copied < sizeof(vlan_hdr)) {
+    case 1:
+        new_ehdr_proto = &PKT_GET_VLAN_HDR(new_ehdr_buf)->h_proto;
+        new_ehdr_size = sizeof(struct eth_header) + sizeof(struct vlan_header);
+        copied = iov_to_buf(iov, iovcnt, iovoff, new_ehdr_buf, new_ehdr_size);
+        if (be16_to_cpu(PKT_GET_ETH_HDR(new_ehdr_buf)->h_proto) != vet_ext) {
             return 0;
         }
+        break;
 
-        new_ehdr->h_proto = vlan_hdr.h_proto;
+    default:
+        return 0;
+    }
 
-        *tci = be16_to_cpu(vlan_hdr.h_tci);
-        *payload_offset = iovoff + sizeof(*new_ehdr) + sizeof(vlan_hdr);
-        return sizeof(struct eth_header);
+    if (copied < new_ehdr_size || be16_to_cpu(*new_ehdr_proto) != vet) {
+        return 0;
+    }
+
+    copied = iov_to_buf(iov, iovcnt, iovoff + new_ehdr_size,
+                        &vlan_hdr, sizeof(vlan_hdr));
+    if (copied < sizeof(vlan_hdr)) {
+        return 0;
     }
 
-    return 0;
+    *new_ehdr_proto = vlan_hdr.h_proto;
+    *payload_offset = iovoff + new_ehdr_size + sizeof(vlan_hdr);
+    *tci = be16_to_cpu(vlan_hdr.h_tci);
+
+    return new_ehdr_size;
 }
 
 void
-- 
2.40.0




* [PATCH 35/40] igb: Filter with the second VLAN tag for extended VLAN
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (33 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 34/40] igb: Strip the second VLAN tag for extended VLAN Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 36/40] igb: Implement Rx PTP2 timestamp Akihiko Odaki
                   ` (4 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/igb_core.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 6e8de9d878..70acc86834 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1017,9 +1017,17 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
         return queues;
     }
 
-    if (e1000x_is_vlan_packet(ehdr, core->mac[VET] & 0xffff) &&
-        !e1000x_rx_vlan_filter(core->mac, PKT_GET_VLAN_HDR(ehdr))) {
-        return queues;
+    if (core->mac[CTRL_EXT] & BIT(26)) {
+        if (be16_to_cpu(ehdr->h_proto) == core->mac[VET] >> 16 &&
+            be16_to_cpu(l2_header->vlan[0].h_proto) == (core->mac[VET] & 0xffff) &&
+            !e1000x_rx_vlan_filter(core->mac, l2_header->vlan + 1)) {
+            return queues;
+        }
+    } else {
+        if (be16_to_cpu(ehdr->h_proto) == (core->mac[VET] & 0xffff) &&
+            !e1000x_rx_vlan_filter(core->mac, l2_header->vlan)) {
+            return queues;
+        }
     }
 
     if (core->mac[MRQC] & 1) {
-- 
2.40.0




* [PATCH 36/40] igb: Implement Rx PTP2 timestamp
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (34 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 35/40] igb: Filter with " Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 37/40] igb: Implement Tx timestamp Akihiko Odaki
                   ` (3 subsequent siblings)
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
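A rough sketch of the EtherType filter lookup this patch adds to
igb_receive_assign() and of how the matched index is reported in
pkt_info. The first enabled ETQF entry whose EtherType equals the
frame's wins; an index of 8 means no filter matched. When a filter
matched, the patch sets BIT(11) and places the index at bit 4 instead of
the IPv4/IPv6/TCP/UDP/SCTP bits. The sk_* names are hypothetical, the
register bit positions (enable at bit 26, EtherType in the low 16 bits)
are assumptions made for this example since igb_regs.h is not quoted
here, and the RSS type bits are left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_ETQF_COUNT          8u
#define SK_ETQF_FILTER_ENABLE  (1u << 26)  /* assumed bit position */
#define SK_ETQF_ETYPE_MASK     0xffffu     /* assumed field layout */

/* Return the index of the first matching filter, or SK_ETQF_COUNT if none. */
static unsigned sk_etqf_match(const uint32_t *etqf_regs, uint16_t ethertype)
{
    unsigned i;

    assert(etqf_regs != NULL);

    for (i = 0; i < SK_ETQF_COUNT; i++) {
        if ((etqf_regs[i] & SK_ETQF_FILTER_ENABLE) &&
            ethertype == (etqf_regs[i] & SK_ETQF_ETYPE_MASK)) {
            break;
        }
    }
    return i;
}

/*
 * Encode the packet type: BIT(11) plus the filter index when a filter
 * matched, otherwise the protocol bits computed by the caller.
 */
static uint16_t sk_pkt_info(unsigned etqf, uint16_t proto_bits)
{
    if (etqf < SK_ETQF_COUNT) {
        return (uint16_t)((1u << 11) | (etqf << 4));
    }
    return proto_bits;
}
```

With only filter 2 enabled for EtherType 0x88f7, a PTP-over-Ethernet
frame maps to index 2, while an IPv4 frame falls through to the protocol
bits.
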
 hw/net/igb_common.h |  16 +++---
 hw/net/igb_core.c   | 129 ++++++++++++++++++++++++++++++++------------
 hw/net/igb_regs.h   |  23 ++++++++
 3 files changed, 127 insertions(+), 41 deletions(-)

diff --git a/hw/net/igb_common.h b/hw/net/igb_common.h
index f2a9065791..5c261ba9d3 100644
--- a/hw/net/igb_common.h
+++ b/hw/net/igb_common.h
@@ -51,7 +51,7 @@
                    defreg_indexeda(x, 0), defreg_indexeda(x, 1), \
                    defreg_indexeda(x, 2), defreg_indexeda(x, 3)
 
-#define defregv(x) defreg_indexed(x, 0), defreg_indexed(x, 1),   \
+#define defreg8(x) defreg_indexed(x, 0), defreg_indexed(x, 1),   \
                    defreg_indexed(x, 2), defreg_indexed(x, 3),   \
                    defreg_indexed(x, 4), defreg_indexed(x, 5),   \
                    defreg_indexed(x, 6), defreg_indexed(x, 7)
@@ -122,6 +122,8 @@ enum {
     defreg(EICS),        defreg(EIMS),        defreg(EIMC),       defreg(EIAM),
     defreg(EICR),        defreg(IVAR_MISC),   defreg(GPIE),
 
+    defreg(TSYNCRXCFG), defreg8(ETQF),
+
     defreg(RXPBS),      defregd(RDBAL),       defregd(RDBAH),     defregd(RDLEN),
     defregd(SRRCTL),    defregd(RDH),         defregd(RDT),
     defregd(RXDCTL),    defregd(RXCTL),       defregd(RQDPC),     defreg(RA2),
@@ -133,15 +135,15 @@ enum {
 
     defreg(VT_CTL),
 
-    defregv(P2VMAILBOX), defregv(V2PMAILBOX), defreg(MBVFICR),    defreg(MBVFIMR),
+    defreg8(P2VMAILBOX), defreg8(V2PMAILBOX), defreg(MBVFICR),    defreg(MBVFIMR),
     defreg(VFLRE),       defreg(VFRE),        defreg(VFTE),       defreg(WVBR),
     defreg(QDE),         defreg(DTXSWC),      defreg_indexed(VLVF, 0),
-    defregv(VMOLR),      defreg(RPLOLR),      defregv(VMBMEM),    defregv(VMVIR),
+    defreg8(VMOLR),      defreg(RPLOLR),      defreg8(VMBMEM),    defreg8(VMVIR),
 
-    defregv(PVTCTRL),    defregv(PVTEICS),    defregv(PVTEIMS),   defregv(PVTEIMC),
-    defregv(PVTEIAC),    defregv(PVTEIAM),    defregv(PVTEICR),   defregv(PVFGPRC),
-    defregv(PVFGPTC),    defregv(PVFGORC),    defregv(PVFGOTC),   defregv(PVFMPRC),
-    defregv(PVFGPRLBC),  defregv(PVFGPTLBC),  defregv(PVFGORLBC), defregv(PVFGOTLBC),
+    defreg8(PVTCTRL),    defreg8(PVTEICS),    defreg8(PVTEIMS),   defreg8(PVTEIMC),
+    defreg8(PVTEIAC),    defreg8(PVTEIAM),    defreg8(PVTEICR),   defreg8(PVFGPRC),
+    defreg8(PVFGPTC),    defreg8(PVFGORC),    defreg8(PVFGOTC),   defreg8(PVFMPRC),
+    defreg8(PVFGPRLBC),  defreg8(PVFGPTLBC),  defreg8(PVFGORLBC), defreg8(PVFGOTLBC),
 
     defreg(MTA_A),
 
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 70acc86834..c716f400fd 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -72,6 +72,24 @@ typedef struct L2Header {
     struct vlan_header vlan[2];
 } L2Header;
 
+typedef struct PTP2 {
+    uint8_t message_id_transport_specific;
+    uint8_t version_ptp;
+    uint16_t message_length;
+    uint8_t subdomain_number;
+    uint8_t reserved0;
+    uint16_t flags;
+    uint64_t correction;
+    uint8_t reserved1[5];
+    uint8_t source_communication_technology;
+    uint32_t source_uuid_lo;
+    uint16_t source_uuid_hi;
+    uint16_t source_port_id;
+    uint16_t sequence_id;
+    uint8_t control;
+    uint8_t log_message_period;
+} PTP2;
+
 static ssize_t
 igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
                      bool has_vnet, bool *external_tx);
@@ -989,9 +1007,11 @@ static bool igb_rx_is_oversized(IGBCore *core, const struct eth_header *ehdr,
     return size > 1518;
 }
 
-static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
-                                   size_t size, E1000E_RSSInfo *rss_info,
-                                   bool *external_tx)
+static uint16_t igb_receive_assign(IGBCore *core, const struct iovec *iov,
+                                   size_t iovcnt, size_t iov_ofs,
+                                   const L2Header *l2_header, size_t size,
+                                   E1000E_RSSInfo *rss_info,
+                                   uint16_t *etqf, bool *ts, bool *external_tx)
 {
     static const int ta_shift[] = { 4, 3, 2, 0 };
     const struct eth_header *ehdr = &l2_header->eth;
@@ -999,11 +1019,13 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
     uint16_t queues = 0;
     uint16_t oversized = 0;
     uint16_t vid = be16_to_cpu(l2_header->vlan[0].h_tci) & VLAN_VID_MASK;
+    PTP2 ptp2;
     bool lpe;
     uint16_t rlpml;
     int i;
 
     memset(rss_info, 0, sizeof(E1000E_RSSInfo));
+    *ts = false;
 
     if (external_tx) {
         *external_tx = true;
@@ -1017,6 +1039,26 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
         return queues;
     }
 
+    for (*etqf = 0; *etqf < 8; (*etqf)++) {
+        if ((core->mac[ETQF0 + *etqf] & E1000_ETQF_FILTER_ENABLE) &&
+            be16_to_cpu(ehdr->h_proto) == (core->mac[ETQF0 + *etqf] & E1000_ETQF_ETYPE_MASK)) {
+            if ((core->mac[ETQF0 + *etqf] & E1000_ETQF_1588) &&
+                (core->mac[TSYNCRXCTL] & E1000_TSYNCRXCTL_ENABLED) &&
+                !(core->mac[TSYNCRXCTL] & E1000_TSYNCRXCTL_VALID) &&
+                iov_to_buf(iov, iovcnt, iov_ofs + ETH_HLEN, &ptp2, sizeof(ptp2)) >= sizeof(ptp2) &&
+                (ptp2.version_ptp & 15) == 2 &&
+                ptp2.message_id_transport_specific == ((core->mac[TSYNCRXCFG] >> 8) & 255)) {
+                e1000x_timestamp(core->mac, core->timadj, RXSTMPL, RXSTMPH);
+                *ts = true;
+                core->mac[TSYNCRXCTL] |= E1000_TSYNCRXCTL_VALID;
+                core->mac[RXSATRL] = le32_to_cpu(ptp2.source_uuid_lo);
+                core->mac[RXSATRH] = le16_to_cpu(ptp2.source_uuid_hi) |
+                                     (le16_to_cpu(ptp2.sequence_id) << 16);
+            }
+            break;
+        }
+    }
+
     if (core->mac[CTRL_EXT] & BIT(26)) {
         if (be16_to_cpu(ehdr->h_proto) == core->mac[VET] >> 16 &&
             be16_to_cpu(l2_header->vlan[0].h_proto) == (core->mac[VET] & 0xffff) &&
@@ -1232,7 +1274,7 @@ static void
 igb_build_rx_metadata(IGBCore *core,
                       struct NetRxPkt *pkt,
                       bool is_eop,
-                      const E1000E_RSSInfo *rss_info,
+                      const E1000E_RSSInfo *rss_info, uint16_t etqf, bool ts,
                       uint16_t *pkt_info, uint16_t *hdr_info,
                       uint32_t *rss,
                       uint32_t *status_flags,
@@ -1283,29 +1325,33 @@ igb_build_rx_metadata(IGBCore *core,
     if (pkt_info) {
         *pkt_info = rss_info->enabled ? rss_info->type : 0;
 
-        if (hasip4) {
-            *pkt_info |= BIT(4);
-        }
+        if (etqf < 8) {
+            *pkt_info |= BIT(11) | (etqf << 4);
+        } else {
+            if (hasip4) {
+                *pkt_info |= BIT(4);
+            }
 
-        if (hasip6) {
-            *pkt_info |= BIT(6);
-        }
+            if (hasip6) {
+                *pkt_info |= BIT(6);
+            }
 
-        switch (l4hdr_proto) {
-        case ETH_L4_HDR_PROTO_TCP:
-            *pkt_info |= BIT(8);
-            break;
+            switch (l4hdr_proto) {
+            case ETH_L4_HDR_PROTO_TCP:
+                *pkt_info |= BIT(8);
+                break;
 
-        case ETH_L4_HDR_PROTO_UDP:
-            *pkt_info |= BIT(9);
-            break;
+            case ETH_L4_HDR_PROTO_UDP:
+                *pkt_info |= BIT(9);
+                break;
 
-        case ETH_L4_HDR_PROTO_SCTP:
-            *pkt_info |= BIT(10);
-            break;
+            case ETH_L4_HDR_PROTO_SCTP:
+                *pkt_info |= BIT(10);
+                break;
 
-        default:
-            break;
+            default:
+                break;
+            }
         }
     }
 
@@ -1313,6 +1359,10 @@ igb_build_rx_metadata(IGBCore *core,
         *hdr_info = 0;
     }
 
+    if (ts) {
+        *status_flags |= BIT(16);
+    }
+
     /* RX CSO information */
     if (hasip6 && (core->mac[RFCTL] & E1000_RFCTL_IPV6_XSUM_DIS)) {
         trace_e1000e_rx_metadata_ipv6_sum_disabled();
@@ -1368,7 +1418,7 @@ func_exit:
 static inline void
 igb_write_lgcy_rx_descr(IGBCore *core, struct e1000_rx_desc *desc,
                         struct NetRxPkt *pkt,
-                        const E1000E_RSSInfo *rss_info,
+                        const E1000E_RSSInfo *rss_info, uint16_t etqf, bool ts,
                         uint16_t length)
 {
     uint32_t status_flags, rss;
@@ -1379,7 +1429,7 @@ igb_write_lgcy_rx_descr(IGBCore *core, struct e1000_rx_desc *desc,
     desc->csum = 0;
 
     igb_build_rx_metadata(core, pkt, pkt != NULL,
-                          rss_info,
+                          rss_info, etqf, ts,
                           NULL, NULL, &rss,
                           &status_flags, &ip_id,
                           &desc->special);
@@ -1390,7 +1440,7 @@ igb_write_lgcy_rx_descr(IGBCore *core, struct e1000_rx_desc *desc,
 static inline void
 igb_write_adv_rx_descr(IGBCore *core, union e1000_adv_rx_desc *desc,
                        struct NetRxPkt *pkt,
-                       const E1000E_RSSInfo *rss_info,
+                       const E1000E_RSSInfo *rss_info, uint16_t etqf, bool ts,
                        uint16_t length)
 {
     memset(&desc->wb, 0, sizeof(desc->wb));
@@ -1398,7 +1448,7 @@ igb_write_adv_rx_descr(IGBCore *core, union e1000_adv_rx_desc *desc,
     desc->wb.upper.length = cpu_to_le16(length);
 
     igb_build_rx_metadata(core, pkt, pkt != NULL,
-                          rss_info,
+                          rss_info, etqf, ts,
                           &desc->wb.lower.lo_dword.pkt_info,
                           &desc->wb.lower.lo_dword.hdr_info,
                           &desc->wb.lower.hi_dword.rss,
@@ -1409,12 +1459,15 @@ igb_write_adv_rx_descr(IGBCore *core, union e1000_adv_rx_desc *desc,
 
 static inline void
 igb_write_rx_descr(IGBCore *core, union e1000_rx_desc_union *desc,
-struct NetRxPkt *pkt, const E1000E_RSSInfo *rss_info, uint16_t length)
+                   struct NetRxPkt *pkt, const E1000E_RSSInfo *rss_info,
+                   uint16_t etqf, bool ts, uint16_t length)
 {
     if (igb_rx_use_legacy_descriptor(core)) {
-        igb_write_lgcy_rx_descr(core, &desc->legacy, pkt, rss_info, length);
+        igb_write_lgcy_rx_descr(core, &desc->legacy, pkt, rss_info,
+                                etqf, ts, length);
     } else {
-        igb_write_adv_rx_descr(core, &desc->adv, pkt, rss_info, length);
+        igb_write_adv_rx_descr(core, &desc->adv, pkt, rss_info,
+                               etqf, ts, length);
     }
 }
 
@@ -1491,7 +1544,8 @@ igb_rx_descr_threshold_hit(IGBCore *core, const E1000E_RingInfo *rxi)
 static void
 igb_write_packet_to_guest(IGBCore *core, struct NetRxPkt *pkt,
                           const E1000E_RxRing *rxr,
-                          const E1000E_RSSInfo *rss_info)
+                          const E1000E_RSSInfo *rss_info,
+                          uint16_t etqf, bool ts)
 {
     PCIDevice *d;
     dma_addr_t base;
@@ -1573,7 +1627,7 @@ igb_write_packet_to_guest(IGBCore *core, struct NetRxPkt *pkt,
         }
 
         igb_write_rx_descr(core, &desc, is_last ? core->rx_pkt : NULL,
-                           rss_info, written);
+                           rss_info, etqf, ts, written);
         igb_pci_dma_write_rx_desc(core, d, base, &desc, core->rx_desc_len);
 
         igb_ring_advance(core, rxi, core->rx_desc_len / E1000_MIN_RX_DESC_LEN);
@@ -1628,6 +1682,8 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
     size_t iov_ofs = 0;
     E1000E_RxRing rxr;
     E1000E_RSSInfo rss_info;
+    uint16_t etqf;
+    bool ts;
     size_t total_size;
     int strip_vlan_index;
     int i;
@@ -1671,8 +1727,9 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
                                get_eth_packet_type(&min_buf.l2_header.eth));
     net_rx_pkt_set_protocols(core->rx_pkt, iov, iovcnt, iov_ofs);
 
-    queues = igb_receive_assign(core, &min_buf.l2_header, size,
-                                &rss_info, external_tx);
+    queues = igb_receive_assign(core, iov, iovcnt, iov_ofs,
+                                &min_buf.l2_header, size,
+                                &rss_info, &etqf, &ts, external_tx);
     if (!queues) {
         trace_e1000e_rx_flt_dropped();
         return orig_size;
@@ -1711,7 +1768,7 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
         n |= E1000_ICR_RXDW;
 
         igb_rx_fix_l4_csum(core, core->rx_pkt);
-        igb_write_packet_to_guest(core, core->rx_pkt, &rxr, &rss_info);
+        igb_write_packet_to_guest(core, core->rx_pkt, &rxr, &rss_info, etqf, ts);
 
         /* Check if receive descriptor minimum threshold hit */
         if (igb_rx_descr_threshold_hit(core, rxr.i)) {
@@ -3304,6 +3361,8 @@ static const readops igb_macreg_readops[] = {
     [EIAM]       = igb_mac_readreg,
     [IVAR0 ... IVAR0 + 7] = igb_mac_readreg,
     igb_getreg(IVAR_MISC),
+    igb_getreg(TSYNCRXCFG),
+    [ETQF0 ... ETQF0 + 7] = igb_mac_readreg,
     igb_getreg(VT_CTL),
     [P2VMAILBOX0 ... P2VMAILBOX7] = igb_mac_readreg,
     [V2PMAILBOX0 ... V2PMAILBOX7] = igb_mac_vfmailbox_read,
@@ -3711,6 +3770,8 @@ static const writeops igb_macreg_writeops[] = {
     [EIMS] = igb_set_eims,
     [IVAR0 ... IVAR0 + 7] = igb_mac_writereg,
     igb_putreg(IVAR_MISC),
+    igb_putreg(TSYNCRXCFG),
+    [ETQF0 ... ETQF0 + 7] = igb_mac_writereg,
     igb_putreg(VT_CTL),
     [P2VMAILBOX0 ... P2VMAILBOX7] = igb_set_pfmailbox,
     [V2PMAILBOX0 ... V2PMAILBOX7] = igb_set_vfmailbox,
diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index 03486edb2e..b88dc9f1f1 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -210,6 +210,15 @@ union e1000_adv_rx_desc {
 #define E1000_DCA_TXCTRL_CPUID_SHIFT 24 /* Tx CPUID now in the last byte */
 #define E1000_DCA_RXCTRL_CPUID_SHIFT 24 /* Rx CPUID now in the last byte */
 
+/* ETQF register bit definitions */
+#define E1000_ETQF_FILTER_ENABLE   BIT(26)
+#define E1000_ETQF_1588            BIT(30)
+#define E1000_ETQF_IMM_INT         BIT(29)
+#define E1000_ETQF_QUEUE_ENABLE    BIT(31)
+#define E1000_ETQF_QUEUE_SHIFT     16
+#define E1000_ETQF_QUEUE_MASK      0x00070000
+#define E1000_ETQF_ETYPE_MASK      0x0000FFFF
+
 #define E1000_DTXSWC_MAC_SPOOF_MASK   0x000000FF /* Per VF MAC spoof control */
 #define E1000_DTXSWC_VLAN_SPOOF_MASK  0x0000FF00 /* Per VF VLAN spoof control */
 #define E1000_DTXSWC_LLE_MASK         0x00FF0000 /* Per VF Local LB enables */
@@ -384,6 +393,20 @@ union e1000_adv_rx_desc {
 #define E1000_FRTIMER   0x01048  /* Free Running Timer - RW */
 #define E1000_FCRTV     0x02460  /* Flow Control Refresh Timer Value - RW */
 
+#define E1000_TSYNCRXCFG 0x05F50 /* Time Sync Rx Configuration - RW */
+
+/* Filtering Registers */
+#define E1000_SAQF(_n) (0x5980 + 4 * (_n))
+#define E1000_DAQF(_n) (0x59A0 + 4 * (_n))
+#define E1000_SPQF(_n) (0x59C0 + 4 * (_n))
+#define E1000_FTQF(_n) (0x59E0 + 4 * (_n))
+#define E1000_SAQF0 E1000_SAQF(0)
+#define E1000_DAQF0 E1000_DAQF(0)
+#define E1000_SPQF0 E1000_SPQF(0)
+#define E1000_FTQF0 E1000_FTQF(0)
+#define E1000_SYNQF(_n) (0x055FC + (4 * (_n))) /* SYN Packet Queue Fltr */
+#define E1000_ETQF(_n)  (0x05CB0 + (4 * (_n))) /* EType Queue Fltr */
+
 #define E1000_RQDPC(_n) (0x0C030 + ((_n) * 0x40))
 
 #define E1000_RXPBS 0x02404  /* Rx Packet Buffer Size - RW */
-- 
2.40.0
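
For readers following the ETQF bit definitions added in the igb_regs.h hunk
above, here is a minimal sketch of how a single ETQF entry could be decoded
with those masks. The mask values are copied from the patch; the BIT() macro
is a local stand-in, and `struct etqf_match` and `etqf_decode()` are
hypothetical names used only for illustration, not functions in
hw/net/igb_core.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the BIT() helper used by the patch. */
#define BIT(n)                     (1u << (n))

/* Values copied from the igb_regs.h hunk above. */
#define E1000_ETQF_FILTER_ENABLE   BIT(26)
#define E1000_ETQF_1588            BIT(30)
#define E1000_ETQF_QUEUE_ENABLE    BIT(31)
#define E1000_ETQF_QUEUE_SHIFT     16
#define E1000_ETQF_QUEUE_MASK      0x00070000
#define E1000_ETQF_ETYPE_MASK      0x0000FFFF

/* The queue field is three bits wide once shifted down. */
static_assert((E1000_ETQF_QUEUE_MASK >> E1000_ETQF_QUEUE_SHIFT) == 0x7,
              "queue index is a 3-bit field");

/* Hypothetical result type; not part of the patch. */
struct etqf_match {
    bool matched;     /* filter enabled and EtherType equal */
    bool timestamp;   /* 1588 bit set for this filter */
    bool has_queue;   /* queue field is meaningful */
    uint8_t queue;    /* queue index taken from bits 18:16 */
};

/* Decode one ETQF register value against a packet's EtherType. */
static struct etqf_match etqf_decode(uint32_t etqf, uint16_t ether_type)
{
    struct etqf_match m = { false, false, false, 0 };

    if (!(etqf & E1000_ETQF_FILTER_ENABLE) ||
        (etqf & E1000_ETQF_ETYPE_MASK) != ether_type) {
        return m;
    }

    m.matched = true;
    m.timestamp = (etqf & E1000_ETQF_1588) != 0;
    m.has_queue = (etqf & E1000_ETQF_QUEUE_ENABLE) != 0;
    m.queue = (uint8_t)((etqf & E1000_ETQF_QUEUE_MASK) >>
                        E1000_ETQF_QUEUE_SHIFT);
    return m;
}
```

A caller would typically try each of the eight ETQF entries in turn and stop
at the first one that reports `matched`.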




* [PATCH 37/40] igb: Implement Tx timestamp
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (35 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 36/40] igb: Implement Rx PTP2 timestamp Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-15 20:13   ` Sriram Yagnaraman
  2023-04-14 11:37 ` [PATCH 38/40] vmxnet3: Do not depend on PC Akihiko Odaki
                   ` (2 subsequent siblings)
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/igb_core.c | 7 +++++++
 hw/net/igb_regs.h | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index c716f400fd..38b53676d4 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -614,6 +614,13 @@ igb_process_tx_desc(IGBCore *core,
                 tx->first_olinfo_status = le32_to_cpu(tx_desc->read.olinfo_status);
                 tx->first = false;
             }
+
+            if ((cmd_type_len & E1000_ADVTXD_MAC_TSTAMP) &&
+                (core->mac[TSYNCTXCTL] & E1000_TSYNCTXCTL_ENABLED) &&
+                !(core->mac[TSYNCTXCTL] & E1000_TSYNCTXCTL_VALID)) {
+                core->mac[TSYNCTXCTL] |= E1000_TSYNCTXCTL_VALID;
+                e1000x_timestamp(core->mac, core->timadj, TXSTMPL, TXSTMPH);
+            }
         } else if ((cmd_type_len & E1000_ADVTXD_DTYP_CTXT) ==
                    E1000_ADVTXD_DTYP_CTXT) {
             /* advanced transmit context descriptor */
diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index b88dc9f1f1..808b587a36 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -322,6 +322,9 @@ union e1000_adv_rx_desc {
 /* E1000_EITR_CNT_IGNR is only for 82576 and newer */
 #define E1000_EITR_CNT_IGNR     0x80000000 /* Don't reset counters on write */
 
+#define E1000_TSYNCTXCTL_VALID    0x00000001 /* tx timestamp valid */
+#define E1000_TSYNCTXCTL_ENABLED  0x00000010 /* enable tx timestamping */
+
 /* PCI Express Control */
 #define E1000_GCR_CMPL_TMOUT_MASK       0x0000F000
 #define E1000_GCR_CMPL_TMOUT_10ms       0x00001000
-- 
2.40.0
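
The condition added to igb_process_tx_desc() above behaves like a one-shot
latch: a timestamp is captured only when the descriptor asks for it,
timestamping is enabled, and no earlier timestamp is still pending. A minimal
sketch of that logic follows. The two bit values are copied from the patch;
`struct tx_tstamp_state` and `tx_tstamp_latch()` are hypothetical
simplifications, and how the VALID bit gets cleared again is outside the
scope of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the igb_regs.h hunk above. */
#define E1000_TSYNCTXCTL_VALID    0x00000001
#define E1000_TSYNCTXCTL_ENABLED  0x00000010

/* Hypothetical, simplified device state; not the real IGBCore. */
struct tx_tstamp_state {
    uint32_t tsynctxctl;  /* models core->mac[TSYNCTXCTL] */
    uint64_t txstmp;      /* models the TXSTMPL/TXSTMPH pair */
};

/*
 * Latch a Tx timestamp if the descriptor requests one, timestamping is
 * enabled, and no previously latched value is still marked valid.
 * Returns true when a new timestamp was stored.
 */
static bool tx_tstamp_latch(struct tx_tstamp_state *s,
                            bool desc_requests_ts, uint64_t now)
{
    assert(s != NULL);

    if (!desc_requests_ts ||
        !(s->tsynctxctl & E1000_TSYNCTXCTL_ENABLED) ||
        (s->tsynctxctl & E1000_TSYNCTXCTL_VALID)) {
        return false;
    }

    s->tsynctxctl |= E1000_TSYNCTXCTL_VALID;
    s->txstmp = now;
    return true;
}
```

The point of the VALID check is that a second timestamped packet does not
overwrite a value that has not been consumed yet.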




* [PATCH 38/40] vmxnet3: Do not depend on PC
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (36 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 37/40] igb: Implement Tx timestamp Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 15:13   ` Philippe Mathieu-Daudé
  2023-04-14 11:37 ` [PATCH 39/40] MAINTAINERS: Add a reviewer for network packet abstractions Akihiko Odaki
  2023-04-14 11:37 ` [PATCH 40/40] docs/system/devices/igb: Note igb is tested for DPDK Akihiko Odaki
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

vmxnet3 has no dependency on PC, and VMware Fusion actually makes it
available on Apple Silicon according to:
https://kb.vmware.com/s/article/90364

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/net/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/Kconfig b/hw/net/Kconfig
index 18c7851efe..98e00be4f9 100644
--- a/hw/net/Kconfig
+++ b/hw/net/Kconfig
@@ -56,7 +56,7 @@ config RTL8139_PCI
 
 config VMXNET3_PCI
     bool
-    default y if PCI_DEVICES && PC_PCI
+    default y if PCI_DEVICES
     depends on PCI
 
 config SMC91C111
-- 
2.40.0




* [PATCH 39/40] MAINTAINERS: Add a reviewer for network packet abstractions
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (37 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 38/40] vmxnet3: Do not depend on PC Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  2023-04-14 15:13   ` Philippe Mathieu-Daudé
  2023-04-14 11:37 ` [PATCH 40/40] docs/system/devices/igb: Note igb is tested for DPDK Akihiko Odaki
  39 siblings, 1 reply; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

I have made significant changes for network packet abstractions so add
me as a reviewer.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index c31d2279ab..8b2ef5943c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2214,6 +2214,7 @@ F: tests/qtest/fuzz-megasas-test.c
 
 Network packet abstractions
 M: Dmitry Fleytman <dmitry.fleytman@gmail.com>
+R: Akihiko Odaki <akihiko.odaki@daynix.com>
 S: Maintained
 F: include/net/eth.h
 F: net/eth.c
-- 
2.40.0




* [PATCH 40/40] docs/system/devices/igb: Note igb is tested for DPDK
  2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
                   ` (38 preceding siblings ...)
  2023-04-14 11:37 ` [PATCH 39/40] MAINTAINERS: Add a reviewer for network packet abstractions Akihiko Odaki
@ 2023-04-14 11:37 ` Akihiko Odaki
  39 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-14 11:37 UTC (permalink / raw)
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel,
	Akihiko Odaki

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 docs/system/devices/igb.rst | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/docs/system/devices/igb.rst b/docs/system/devices/igb.rst
index afe036dad2..60c10bf7c7 100644
--- a/docs/system/devices/igb.rst
+++ b/docs/system/devices/igb.rst
@@ -14,7 +14,8 @@ Limitations
 ===========
 
 This igb implementation was tested with Linux Test Project [2]_ and Windows HLK
-[3]_ during the initial development. The command used when testing with LTP is:
+[3]_ during the initial development. Later it was also tested with DPDK Test
+Suite [4]_. The command used when testing with LTP is:
 
 .. code-block:: shell
 
@@ -22,8 +23,8 @@ This igb implementation was tested with Linux Test Project [2]_ and Windows HLK
 
 Be aware that this implementation lacks many functionalities available with the
 actual hardware, and you may experience various failures if you try to use it
-with a different operating system other than Linux and Windows or if you try
-functionalities not covered by the tests.
+with an environment other than DPDK, Linux, and Windows or if you
+try functionalities not covered by the tests.
 
 Using igb
 =========
@@ -32,7 +33,7 @@ Using igb should be nothing different from using another network device. See
 :ref:`pcsys_005fnetwork` in general.
 
 However, you may also need to perform additional steps to activate SR-IOV
-feature on your guest. For Linux, refer to [4]_.
+feature on your guest. For Linux, refer to [5]_.
 
 Developing igb
 ==============
@@ -68,4 +69,5 @@ References
 .. [1] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82576eb-gigabit-ethernet-controller-datasheet.pdf
 .. [2] https://github.com/linux-test-project/ltp
 .. [3] https://learn.microsoft.com/en-us/windows-hardware/test/hlk/
-.. [4] https://docs.kernel.org/PCI/pci-iov-howto.html
+.. [4] https://doc.dpdk.org/dts/gsg/
+.. [5] https://docs.kernel.org/PCI/pci-iov-howto.html
-- 
2.40.0




* Re: [PATCH 01/40] hw/net/net_tx_pkt: Decouple from PCI
  2023-04-14 11:36 ` [PATCH 01/40] hw/net/net_tx_pkt: Decouple from PCI Akihiko Odaki
@ 2023-04-14 14:23   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 69+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-14 14:23 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Thomas Huth,
	Wainer dos Santos Moschetta, Beraldo Leal, Cleber Rosa,
	Laurent Vivier, Paolo Bonzini, qemu-devel

On 14/4/23 13:36, Akihiko Odaki wrote:
> This also fixes the leak of memory mapping when the specified memory is
> partially mapped.
> 
> Fixes: e263cd49c7 ("Packet abstraction for VMWARE network devices")
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>   hw/net/net_tx_pkt.c  | 65 +++++++++++++++++++++++---------------------
>   hw/net/net_tx_pkt.h  | 38 +++++++++++++++++++-------

Preferably split the patch in at least 2, first the back-end,
then the front-ends.

Also consider installing scripts/git.orderfile when posting
API changes, as this eases email review workflow (no need to
scroll up/down frenetically to follow).

>   hw/net/e1000e_core.c | 13 +++++----
>   hw/net/igb_core.c    | 13 ++++-----

>   hw/net/vmxnet3.c     | 14 +++++-----
>   5 files changed, 83 insertions(+), 60 deletions(-)




* Re: [PATCH 04/40] igb: Include the second VLAN tag in the buffer
  2023-04-14 11:37 ` [PATCH 04/40] igb: Include the second VLAN tag in the buffer Akihiko Odaki
@ 2023-04-14 14:28   ` Philippe Mathieu-Daudé
  2023-04-14 14:32     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 69+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-14 14:28 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Thomas Huth,
	Wainer dos Santos Moschetta, Beraldo Leal, Cleber Rosa,
	Laurent Vivier, Paolo Bonzini, qemu-devel

On 14/4/23 13:37, Akihiko Odaki wrote:
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>   hw/net/igb_core.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 55de212447..f725ab97ae 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -1590,7 +1590,7 @@ static ssize_t
>   igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
>                        bool has_vnet, bool *external_tx)
>   {
> -    static const int maximum_ethernet_hdr_len = (ETH_HLEN + 4);
> +    static const int maximum_ethernet_hdr_len = (ETH_HLEN + 8);

Aren't VLAN tags 16-bit wide? Could you convert this magic value
to some verbose-but-obvious definitions?

Is it worth adding a vlan_tag_t typedef in include/net/eth.h?

>       uint16_t queues = 0;
>       uint32_t n = 0;




* Re: [PATCH 04/40] igb: Include the second VLAN tag in the buffer
  2023-04-14 14:28   ` Philippe Mathieu-Daudé
@ 2023-04-14 14:32     ` Philippe Mathieu-Daudé
  2023-04-14 14:35       ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 69+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-14 14:32 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Thomas Huth,
	Wainer dos Santos Moschetta, Beraldo Leal, Cleber Rosa,
	Laurent Vivier, Paolo Bonzini, qemu-devel

On 14/4/23 16:28, Philippe Mathieu-Daudé wrote:
> On 14/4/23 13:37, Akihiko Odaki wrote:
>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>> ---
>>   hw/net/igb_core.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
>> index 55de212447..f725ab97ae 100644
>> --- a/hw/net/igb_core.c
>> +++ b/hw/net/igb_core.c
>> @@ -1590,7 +1590,7 @@ static ssize_t
>>   igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
>>                        bool has_vnet, bool *external_tx)
>>   {
>> -    static const int maximum_ethernet_hdr_len = (ETH_HLEN + 4);
>> +    static const int maximum_ethernet_hdr_len = (ETH_HLEN + 8);
> 
> Aren't VLAN tags 16-bit wide? Could you convert this magic value
> to some verbose-but-obvious definitions?

Digging a bit more, is this struct vlan_header?

> Is it worth adding a vlan_tag_t typedef in include/net/eth.h?
> 
>>       uint16_t queues = 0;
>>       uint32_t n = 0;
> 




* Re: [PATCH 04/40] igb: Include the second VLAN tag in the buffer
  2023-04-14 14:32     ` Philippe Mathieu-Daudé
@ 2023-04-14 14:35       ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 69+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-14 14:35 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Thomas Huth,
	Wainer dos Santos Moschetta, Beraldo Leal, Cleber Rosa,
	Laurent Vivier, Paolo Bonzini, qemu-devel

On 14/4/23 16:32, Philippe Mathieu-Daudé wrote:
> On 14/4/23 16:28, Philippe Mathieu-Daudé wrote:
>> On 14/4/23 13:37, Akihiko Odaki wrote:
>>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>>> ---
>>>   hw/net/igb_core.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
>>> index 55de212447..f725ab97ae 100644
>>> --- a/hw/net/igb_core.c
>>> +++ b/hw/net/igb_core.c
>>> @@ -1590,7 +1590,7 @@ static ssize_t
>>>   igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
>>>                        bool has_vnet, bool *external_tx)
>>>   {
>>> -    static const int maximum_ethernet_hdr_len = (ETH_HLEN + 4);
>>> +    static const int maximum_ethernet_hdr_len = (ETH_HLEN + 8);
>>
>> Aren't VLAN tags 16-bit wide? Could you convert this magic value
>> to some verbose-but-obvious definitions?
> 
> Digging a bit more, is this struct vlan_header?

And now I see in patch #08 "igb: Always copy ethernet header":

   +typedef struct L2Header {
   +    struct eth_header eth;
   +    struct vlan_header vlan[2];
   +} L2Header;

Maybe add it first, and use sizeof(L2Header) here directly?
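
To make the arithmetic behind this suggestion concrete: an 802.1Q tag is four
octets (a 16-bit TPID plus a 16-bit TCI), so two tags add 8 octets to the
14-octet Ethernet header. The sketch below checks that sizeof(L2Header)
gives the same 22 octets. The struct definitions are stand-ins written for
this example and are assumed to mirror the layouts in include/net/eth.h;
only L2Header itself is quoted from patch #08.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN 6

/* Stand-in for QEMU's struct eth_header (assumed layout). */
struct eth_header {
    uint8_t  h_dest[ETH_ALEN];    /* destination MAC address */
    uint8_t  h_source[ETH_ALEN];  /* source MAC address */
    uint16_t h_proto;             /* EtherType or outer TPID */
};

/* Stand-in for QEMU's struct vlan_header (assumed layout). */
struct vlan_header {
    uint16_t h_tci;    /* priority, DEI and VLAN ID */
    uint16_t h_proto;  /* encapsulated EtherType */
};

/* As quoted from patch #08. */
typedef struct L2Header {
    struct eth_header eth;
    struct vlan_header vlan[2];
} L2Header;

/* 14 octets of Ethernet header plus two 4-octet VLAN tags. */
static_assert(sizeof(struct eth_header) == 14, "ETH_HLEN");
static_assert(sizeof(struct vlan_header) == 4, "one 802.1Q tag");
static_assert(sizeof(L2Header) == 14 + 8, "header with two VLAN tags");
```

With a definition like this in place, the magic `ETH_HLEN + 8` could be
replaced by `sizeof(L2Header)` without changing the value.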



* Re: [PATCH 08/40] igb: Always copy ethernet header
  2023-04-14 11:37 ` [PATCH 08/40] igb: " Akihiko Odaki
@ 2023-04-14 14:46   ` Philippe Mathieu-Daudé
  2023-04-21 12:18     ` Akihiko Odaki
  0 siblings, 1 reply; 69+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-14 14:46 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Thomas Huth,
	Wainer dos Santos Moschetta, Beraldo Leal, Cleber Rosa,
	Laurent Vivier, Paolo Bonzini, qemu-devel

On 14/4/23 13:37, Akihiko Odaki wrote:
> igb_receive_internal() used to check the iov length to determine
> whether to copy the iovs to a contiguous buffer, but the check is flawed
> in two ways:
> - It does not ensure that iovcnt > 0.
> - It does not take virtio-net header into consideration.
> 
> The size of this copy is just 22 octets, which can be even less than
> the code size required for checks. This (wrong) optimization is probably
> not worth it, so just remove it. Removing this also allows igb to assume
> aligned accesses for the ethernet header.
> 
> Fixes: 3a977deebe ("Intrdocue igb device emulation")
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>   hw/net/igb_core.c | 39 +++++++++++++++++++++------------------
>   1 file changed, 21 insertions(+), 18 deletions(-)
> 
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 53f60fc3d3..1d188b526c 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c


> -static uint16_t igb_receive_assign(IGBCore *core, const struct eth_header *ehdr,
> +static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
>                                      size_t size, E1000E_RSSInfo *rss_info,
>                                      bool *external_tx)
>   {
>       static const int ta_shift[] = { 4, 3, 2, 0 };
> +    const struct eth_header *ehdr = &l2_header->eth;
>       uint32_t f, ra[2], *macp, rctl = core->mac[RCTL];
>       uint16_t queues = 0;
>       uint16_t oversized = 0;
> -    uint16_t vid = lduw_be_p(&PKT_GET_VLAN_HDR(ehdr)->h_tci) & VLAN_VID_MASK;
> +    uint16_t vid = be16_to_cpu(l2_header->vlan[0].h_tci) & VLAN_VID_MASK;

Why this API change? Are we certain tci is aligned in host memory?
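
The alignment concern can be illustrated with a small sketch. Reading a
16-bit field straight out of a raw packet buffer needs a byte-wise load,
which is what helpers in the style of lduw_be_p() are for, whereas a plain
member access followed by a byte swap is only safe when the header has first
been copied into a properly aligned object. All function names below are
hypothetical stand-ins written for this example, and the struct layout is
assumed to mirror struct vlan_header in include/net/eth.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VLAN_VID_MASK 0x0fff

/* Stand-in for QEMU's struct vlan_header (assumed layout). */
struct vlan_header {
    uint16_t h_tci;
    uint16_t h_proto;
};

/* Byte-wise big-endian load: valid at any address, aligned or not. */
static uint16_t load_be16_unaligned(const void *p)
{
    uint8_t b[2];

    memcpy(b, p, sizeof(b));
    return (uint16_t)((b[0] << 8) | b[1]);
}

/* Convert a value holding big-endian bytes to host order, portably. */
static uint16_t be16_to_host(uint16_t v)
{
    const uint8_t *b = (const uint8_t *)&v;

    return (uint16_t)((b[0] << 8) | b[1]);
}

/* Variant 1: read the TCI directly from the (possibly unaligned) packet. */
static uint16_t vid_from_raw(const uint8_t *pkt, size_t tag_offset)
{
    return load_be16_unaligned(pkt + tag_offset) & VLAN_VID_MASK;
}

/* Variant 2: copy the tag into an aligned local, then access the member. */
static uint16_t vid_from_copy(const uint8_t *pkt, size_t tag_offset)
{
    struct vlan_header tag;  /* naturally aligned local object */

    memcpy(&tag, pkt + tag_offset, sizeof(tag));
    return be16_to_host(tag.h_tci) & VLAN_VID_MASK;
}
```

Both variants return the same VLAN ID; the second one depends on the copy
step, which is the property the patch relies on when it always copies the
header into a local buffer.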



* Re: [PATCH 09/40] Fix references to igb Avocado test
  2023-04-14 11:37 ` [PATCH 09/40] Fix references to igb Avocado test Akihiko Odaki
@ 2023-04-14 14:47   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 69+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-14 14:47 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Thomas Huth,
	Wainer dos Santos Moschetta, Beraldo Leal, Cleber Rosa,
	Laurent Vivier, Paolo Bonzini, qemu-devel

On 14/4/23 13:37, Akihiko Odaki wrote:
> Fixes: 9f95111474 ("tests/avocado: re-factor igb test to avoid timeouts")
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>   MAINTAINERS                                        | 2 +-
>   docs/system/devices/igb.rst                        | 2 +-
>   scripts/ci/org.centos/stream/8/x86_64/test-avocado | 2 +-
>   3 files changed, 3 insertions(+), 3 deletions(-)

Oops.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




* Re: [PATCH 15/40] e1000x: Take CRC into consideration for size check
  2023-04-14 11:37 ` [PATCH 15/40] e1000x: Take CRC into consideration for size check Akihiko Odaki
@ 2023-04-14 15:03   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 69+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-14 15:03 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Thomas Huth,
	Wainer dos Santos Moschetta, Beraldo Leal, Cleber Rosa,
	Laurent Vivier, Paolo Bonzini, qemu-devel

On 14/4/23 13:37, Akihiko Odaki wrote:
> Section 13.7.15 Receive Length Error Count says:
>>   Packets over 1522 bytes are oversized if LongPacketEnable is 0b
>> (RCTL.LPE). If LongPacketEnable (LPE) is 1b, then an incoming packet
>> is considered oversized if it exceeds 16384 bytes.
> 
>> These lengths are based on bytes in the received packet from
>> <Destination Address> through <CRC>, inclusively.
> 
> As QEMU processes packets without CRC, the number of bytes for CRC
> needs to be subtracted.
> 
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>   hw/net/e1000x_common.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/net/e1000x_common.c b/hw/net/e1000x_common.c
> index 6cc23138a8..b4dfc74b66 100644
> --- a/hw/net/e1000x_common.c
> +++ b/hw/net/e1000x_common.c
> @@ -142,10 +142,10 @@ bool e1000x_is_oversized(uint32_t *mac, size_t size)
>   {
>       /* this is the size past which hardware will
>          drop packets when setting LPE=0 */
> -    static const int maximum_ethernet_vlan_size = 1522;
> +    static const int maximum_ethernet_vlan_size = 1522 - 4;
>       /* this is the size past which hardware will
>          drop packets when setting LPE=1 */
> -    static const int maximum_ethernet_lpe_size = 16 * KiB;
> +    static const int maximum_ethernet_lpe_size = 16 * KiB - 4;
>   
>       if ((size > maximum_ethernet_lpe_size ||
>           (size > maximum_ethernet_vlan_size

IMHO this function could be simplified. Something like:

   bool long_packet_enabled = mac[RCTL] & E1000_RCTL_LPE;
   size_t oversize = long_packet_enabled ? 16 * KiB : ETH_VLAN_MAXSIZE;
   size_t crc32_size = sizeof(uint32_t);

   if (mac[RCTL] & E1000_RCTL_SBP) {
     return false;
   }

   if (size + crc32_size > oversize) {
     ...
     return true;
   }

   return false;
}
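
A self-contained version of the simplification outlined above might look like
the following sketch. It leaves out the counter update and tracing that the
`...` stands for. The RCTL bit values are assumptions that should be checked
against the e1000 register headers, and ETH_VLAN_MAXSIZE, LPE_MAXSIZE and
`is_oversized()` are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed RCTL bit values; verify against the e1000 register headers. */
#define RCTL_SBP          0x00000004u  /* store bad packets */
#define RCTL_LPE          0x00000020u  /* long packet enable */

/* Limits quoted from the datasheet text, counted including the CRC. */
#define ETH_VLAN_MAXSIZE  1522u
#define LPE_MAXSIZE       (16u * 1024u)

/* QEMU hands over frames without the 4-octet CRC, so add it back. */
#define CRC32_SIZE        sizeof(uint32_t)

/* Return true if a frame of the given size (CRC excluded) is oversized. */
static bool is_oversized(uint32_t rctl, size_t size_without_crc)
{
    size_t limit = (rctl & RCTL_LPE) ? LPE_MAXSIZE : ETH_VLAN_MAXSIZE;

    if (rctl & RCTL_SBP) {
        return false;
    }

    return size_without_crc + CRC32_SIZE > limit;
}
```

Adding the CRC size to the packet size is equivalent to the patch's approach
of subtracting 4 from each limit; it keeps the datasheet numbers visible in
the code.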



* Re: [PATCH 16/40] e1000e: Always log status after building rx metadata
  2023-04-14 11:37 ` [PATCH 16/40] e1000e: Always log status after building rx metadata Akihiko Odaki
@ 2023-04-14 15:04   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 69+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-14 15:04 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Thomas Huth,
	Wainer dos Santos Moschetta, Beraldo Leal, Cleber Rosa,
	Laurent Vivier, Paolo Bonzini, qemu-devel

On 14/4/23 13:37, Akihiko Odaki wrote:
> Without this change, the status flags may not be traced e.g. if checksum
> offloading is disabled.
> 
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>   hw/net/e1000e_core.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




* Re: [PATCH 17/40] igb: Always log status after building rx metadata
  2023-04-14 11:37 ` [PATCH 17/40] igb: " Akihiko Odaki
@ 2023-04-14 15:07   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 69+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-14 15:07 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Thomas Huth,
	Wainer dos Santos Moschetta, Beraldo Leal, Cleber Rosa,
	Laurent Vivier, Paolo Bonzini, qemu-devel

On 14/4/23 13:37, Akihiko Odaki wrote:
> Without this change, the status flags may not be traced e.g. if checksum
> offloading is disabled.
> 
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>   hw/net/igb_core.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 5fdc8bc42d..ccc5a626b4 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -1303,9 +1303,8 @@ igb_build_rx_metadata(IGBCore *core,
>           trace_e1000e_rx_metadata_l4_cso_disabled();
>       }
>   
> -    trace_e1000e_rx_metadata_status_flags(*status_flags);
> -
>   func_exit:
> +    trace_e1000e_rx_metadata_status_flags(*status_flags);
>       *status_flags = cpu_to_le32(*status_flags);
>   }
>   

So igb_build_rx_metadata() is very similar to
e1000e_build_rx_metadata()...



* Re: [PATCH 23/40] igb: Share common VF constants
  2023-04-14 11:37 ` [PATCH 23/40] igb: Share common VF constants Akihiko Odaki
@ 2023-04-14 15:08   ` Philippe Mathieu-Daudé
  2023-04-15 19:08     ` Sriram Yagnaraman
  0 siblings, 1 reply; 69+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-14 15:08 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Thomas Huth,
	Wainer dos Santos Moschetta, Beraldo Leal, Cleber Rosa,
	Laurent Vivier, Paolo Bonzini, qemu-devel

On 14/4/23 13:37, Akihiko Odaki wrote:
> The constants need to be consistent between the PF and VF.
> 
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>   hw/net/igb.c        | 10 +++++-----
>   hw/net/igb_common.h |  8 ++++++++
>   hw/net/igbvf.c      |  7 -------
>   3 files changed, 13 insertions(+), 12 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




* Re: [PATCH 24/40] igb: Fix igb_mac_reg_init alignment
  2023-04-14 11:37 ` [PATCH 24/40] igb: Fix igb_mac_reg_init alignment Akihiko Odaki
@ 2023-04-14 15:09   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 69+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-14 15:09 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Thomas Huth,
	Wainer dos Santos Moschetta, Beraldo Leal, Cleber Rosa,
	Laurent Vivier, Paolo Bonzini, qemu-devel

On 14/4/23 13:37, Akihiko Odaki wrote:
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>   hw/net/igb_core.c | 96 +++++++++++++++++++++++------------------------
>   1 file changed, 48 insertions(+), 48 deletions(-)

"Fix igb_mac_reg_init() coding style alignment" to clarify
this isn't about data alignment.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




* Re: [PATCH 25/40] net/eth: Use void pointers
  2023-04-14 11:37 ` [PATCH 25/40] net/eth: Use void pointers Akihiko Odaki
@ 2023-04-14 15:10   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 69+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-14 15:10 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Thomas Huth,
	Wainer dos Santos Moschetta, Beraldo Leal, Cleber Rosa,
	Laurent Vivier, Paolo Bonzini, qemu-devel

On 14/4/23 13:37, Akihiko Odaki wrote:
> The uses of uint8_t pointers were misleading as they are never accessed
> as an array of octets, and accessing them as struct eth_header even
> requires stricter alignment.
> 
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>   include/net/eth.h | 4 ++--
>   net/eth.c         | 6 +++---
>   2 files changed, 5 insertions(+), 5 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




* Re: [PATCH 38/40] vmxnet3: Do not depend on PC
  2023-04-14 11:37 ` [PATCH 38/40] vmxnet3: Do not depend on PC Akihiko Odaki
@ 2023-04-14 15:13   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 69+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-14 15:13 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Thomas Huth,
	Wainer dos Santos Moschetta, Beraldo Leal, Cleber Rosa,
	Laurent Vivier, Paolo Bonzini, qemu-devel

On 14/4/23 13:37, Akihiko Odaki wrote:
> vmxnet3 has no dependency on PC, and VMware Fusion actually makes it
> available on Apple Silicon according to:
> https://kb.vmware.com/s/article/90364
> 
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>   hw/net/Kconfig | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




* Re: [PATCH 39/40] MAINTAINERS: Add a reviewer for network packet abstractions
  2023-04-14 11:37 ` [PATCH 39/40] MAINTAINERS: Add a reviewer for network packet abstractions Akihiko Odaki
@ 2023-04-14 15:13   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 69+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-14 15:13 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Thomas Huth,
	Wainer dos Santos Moschetta, Beraldo Leal, Cleber Rosa,
	Laurent Vivier, Paolo Bonzini, qemu-devel

On 14/4/23 13:37, Akihiko Odaki wrote:
> I have made significant changes for network packet abstractions so add
> me as a reviewer.
> 
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>   MAINTAINERS | 1 +
>   1 file changed, 1 insertion(+)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




* RE: [PATCH 05/40] igb: Do not require CTRL.VME for tx VLAN tagging
  2023-04-14 11:37 ` [PATCH 05/40] igb: Do not require CTRL.VME for tx VLAN tagging Akihiko Odaki
@ 2023-04-15 19:08   ` Sriram Yagnaraman
  0 siblings, 0 replies; 69+ messages in thread
From: Sriram Yagnaraman @ 2023-04-15 19:08 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Jason Wang, Dmitry Fleytman, Michael S. Tsirkin,
	Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel


> -----Original Message-----
> From: Akihiko Odaki <akihiko.odaki@daynix.com>
> Sent: Friday, 14 April 2023 13:37
> Cc: Sriram Yagnaraman <sriram.yagnaraman@est.tech>; Jason Wang
> <jasowang@redhat.com>; Dmitry Fleytman <dmitry.fleytman@gmail.com>;
> Michael S. Tsirkin <mst@redhat.com>; Alex Bennée <alex.bennee@linaro.org>;
> Philippe Mathieu-Daudé <philmd@linaro.org>; Thomas Huth
> <thuth@redhat.com>; Wainer dos Santos Moschetta
> <wainersm@redhat.com>; Beraldo Leal <bleal@redhat.com>; Cleber Rosa
> <crosa@redhat.com>; Laurent Vivier <lvivier@redhat.com>; Paolo Bonzini
> <pbonzini@redhat.com>; qemu-devel@nongnu.org; Akihiko Odaki
> <akihiko.odaki@daynix.com>
> Subject: [PATCH 05/40] igb: Do not require CTRL.VME for tx VLAN tagging
> 
> While the datasheet of e1000e says it checks CTRL.VME for tx VLAN tagging,
> igb's datasheet has no such statement. It also says for
> "CTRL.VLE":
> > This register only affects the VLAN Strip in Rx it does not have any
> > influence in the Tx path in the 82576.
> (Appendix A. Changes from the 82575)
> 
> There is no "CTRL.VLE", so it is most likely a typo for CTRL.VME.
> 
> Fixes: fba7c3b788 ("igb: respect VMVIR and VMOLR for VLAN")
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>  hw/net/igb_core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
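
For readers following along, the difference described in the commit message can be sketched as below. This is only an illustration under assumptions: the two helper names are hypothetical, the igb side ignores per-VF settings such as VMVIR, and the real change is a one-line edit in hw/net/igb_core.c that is not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

#define E1000_CTRL_VME 0x40000000 /* VLAN mode enable, value from the e1000 register headers */

/* e1000e (per its datasheet): the descriptor's VLE bit only takes effect when CTRL.VME is set. */
static bool e1000e_tx_vlan_tagging(uint32_t ctrl, bool desc_vle)
{
    return desc_vle && (ctrl & E1000_CTRL_VME);
}

/* igb (after this patch): CTRL.VME is not consulted on the Tx path. */
static bool igb_tx_vlan_tagging(uint32_t ctrl, bool desc_vle)
{
    (void)ctrl; /* deliberately unused */
    return desc_vle;
}
```

With CTRL.VME clear and VLE set in the descriptor, the e1000e helper returns false while the igb helper returns true, which is the behaviour change the patch is after.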


* RE: [PATCH 03/40] igb: Fix Rx packet type encoding
  2023-04-14 11:37 ` [PATCH 03/40] igb: Fix Rx packet type encoding Akihiko Odaki
@ 2023-04-15 19:08   ` Sriram Yagnaraman
  0 siblings, 0 replies; 69+ messages in thread
From: Sriram Yagnaraman @ 2023-04-15 19:08 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Jason Wang, Dmitry Fleytman, Michael S. Tsirkin,
	Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel

> -----Original Message-----
> From: Akihiko Odaki <akihiko.odaki@daynix.com>
> Sent: Friday, 14 April 2023 13:37
> Cc: Sriram Yagnaraman <sriram.yagnaraman@est.tech>; Jason Wang
> <jasowang@redhat.com>; Dmitry Fleytman <dmitry.fleytman@gmail.com>;
> Michael S. Tsirkin <mst@redhat.com>; Alex Bennée <alex.bennee@linaro.org>;
> Philippe Mathieu-Daudé <philmd@linaro.org>; Thomas Huth
> <thuth@redhat.com>; Wainer dos Santos Moschetta
> <wainersm@redhat.com>; Beraldo Leal <bleal@redhat.com>; Cleber Rosa
> <crosa@redhat.com>; Laurent Vivier <lvivier@redhat.com>; Paolo Bonzini
> <pbonzini@redhat.com>; qemu-devel@nongnu.org; Akihiko Odaki
> <akihiko.odaki@daynix.com>
> Subject: [PATCH 03/40] igb: Fix Rx packet type encoding
> 
> igb's advanced descriptor uses a packet type encoding different from one used
> in e1000e's extended descriptor. Fix the logic to encode Rx packet type
> accordingly.
> 
> Fixes: 3a977deebe ("Intrdocue igb device emulation")
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>  hw/net/igb_core.c | 38 +++++++++++++++++++-------------------
>  1 file changed, 19 insertions(+), 19 deletions(-)
> 
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 464a41d0aa..55de212447 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -1227,7 +1227,6 @@ igb_build_rx_metadata(IGBCore *core,
>      struct virtio_net_hdr *vhdr;
>      bool hasip4, hasip6;
>      EthL4HdrProto l4hdr_proto;
> -    uint32_t pkt_type;
> 
>      *status_flags = E1000_RXD_STAT_DD;
> 
> @@ -1266,28 +1265,29 @@ igb_build_rx_metadata(IGBCore *core,
>          trace_e1000e_rx_metadata_ack();
>      }
> 
> -    if (hasip6 && (core->mac[RFCTL] & E1000_RFCTL_IPV6_DIS)) {
> -        trace_e1000e_rx_metadata_ipv6_filtering_disabled();
> -        pkt_type = E1000_RXD_PKT_MAC;
> -    } else if (l4hdr_proto == ETH_L4_HDR_PROTO_TCP ||
> -               l4hdr_proto == ETH_L4_HDR_PROTO_UDP) {
> -        pkt_type = hasip4 ? E1000_RXD_PKT_IP4_XDP : E1000_RXD_PKT_IP6_XDP;
> -    } else if (hasip4 || hasip6) {
> -        pkt_type = hasip4 ? E1000_RXD_PKT_IP4 : E1000_RXD_PKT_IP6;
> -    } else {
> -        pkt_type = E1000_RXD_PKT_MAC;
> -    }
> +    if (pkt_info) {
> +        *pkt_info = rss_info->enabled ? rss_info->type : 0;
> 
> -    trace_e1000e_rx_metadata_pkt_type(pkt_type);
> +        if (hasip4) {
> +            *pkt_info |= BIT(4);

DPDK seems to care about the packet type. 😊
Would it make sense to introduce a new set of macros similar to E1000_RXD_PKT* for igb instead of these magic numbers?
In any case, 
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>

> +        }
> 
> -    if (pkt_info) {
> -        if (rss_info->enabled) {
> -            *pkt_info = rss_info->type;
> +        if (hasip6) {
> +            *pkt_info |= BIT(6);
>          }
> 
> -        *pkt_info |= (pkt_type << 4);
> -    } else {
> -        *status_flags |= E1000_RXD_PKT_TYPE(pkt_type);
> +        switch (l4hdr_proto) {
> +        case ETH_L4_HDR_PROTO_TCP:
> +            *pkt_info |= BIT(8);
> +            break;
> +
> +        case ETH_L4_HDR_PROTO_UDP:
> +            *pkt_info |= BIT(9);
> +            break;
> +
> +        default:
> +            break;
> +        }
>      }
> 
>      if (hdr_info) {
> --
> 2.40.0
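
For illustration, the macro suggestion above could look roughly like the sketch below. The IGB_RXD_PKT_* names, the L4Proto enum and the helper are hypothetical and not part of the patch; only the bit positions (4, 6, 8 and 9) come from the diff. The assertion that the RSS type fits in the low four bits is an assumption drawn from the fact that the packet type bits start at bit 4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Hypothetical names; the bit positions are the ones used in the patch. */
#define IGB_RXD_PKT_IP4  BIT(4)
#define IGB_RXD_PKT_IP6  BIT(6)
#define IGB_RXD_PKT_TCP  BIT(8)
#define IGB_RXD_PKT_UDP  BIT(9)

/* Stand-in for QEMU's EthL4HdrProto, reduced to what the sketch needs. */
typedef enum {
    L4_PROTO_INVALID,
    L4_PROTO_TCP,
    L4_PROTO_UDP,
} L4Proto;

/* Build pkt_info: RSS type in the low bits, packet type flags above it. */
static uint16_t igb_rx_pkt_info(bool rss_enabled, uint8_t rss_type,
                                bool hasip4, bool hasip6, L4Proto l4)
{
    uint16_t pkt_info;

    assert(rss_type < BIT(4)); /* assumed: must not overlap the type bits */
    pkt_info = rss_enabled ? rss_type : 0;

    if (hasip4) {
        pkt_info |= IGB_RXD_PKT_IP4;
    }
    if (hasip6) {
        pkt_info |= IGB_RXD_PKT_IP6;
    }

    switch (l4) {
    case L4_PROTO_TCP:
        pkt_info |= IGB_RXD_PKT_TCP;
        break;
    case L4_PROTO_UDP:
        pkt_info |= IGB_RXD_PKT_UDP;
        break;
    default:
        break;
    }

    return pkt_info;
}
```

Named constants like these would also make it easier to compare the emulation against the packet type table a guest driver such as DPDK decodes.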



* RE: [PATCH 18/40] igb: Remove goto
  2023-04-14 11:37 ` [PATCH 18/40] igb: Remove goto Akihiko Odaki
@ 2023-04-15 19:08   ` Sriram Yagnaraman
  0 siblings, 0 replies; 69+ messages in thread
From: Sriram Yagnaraman @ 2023-04-15 19:08 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Jason Wang, Dmitry Fleytman, Michael S. Tsirkin,
	Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel



> -----Original Message-----
> From: Akihiko Odaki <akihiko.odaki@daynix.com>
> Sent: Friday, 14 April 2023 13:37
> Cc: Sriram Yagnaraman <sriram.yagnaraman@est.tech>; Jason Wang
> <jasowang@redhat.com>; Dmitry Fleytman <dmitry.fleytman@gmail.com>;
> Michael S. Tsirkin <mst@redhat.com>; Alex Bennée <alex.bennee@linaro.org>;
> Philippe Mathieu-Daudé <philmd@linaro.org>; Thomas Huth
> <thuth@redhat.com>; Wainer dos Santos Moschetta
> <wainersm@redhat.com>; Beraldo Leal <bleal@redhat.com>; Cleber Rosa
> <crosa@redhat.com>; Laurent Vivier <lvivier@redhat.com>; Paolo Bonzini
> <pbonzini@redhat.com>; qemu-devel@nongnu.org; Akihiko Odaki
> <akihiko.odaki@daynix.com>
> Subject: [PATCH 18/40] igb: Remove goto
> 
> The goto is a bit confusing as it changes the control flow only if L4 protocol is
> not recognized. It is also different from e1000e, and noisy when comparing
> e1000e and igb.
> 
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>

Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>



* RE: [PATCH 22/40] igb: Add more definitions for Tx descriptor
  2023-04-14 11:37 ` [PATCH 22/40] igb: Add more definitions for Tx descriptor Akihiko Odaki
@ 2023-04-15 19:08   ` Sriram Yagnaraman
  0 siblings, 0 replies; 69+ messages in thread
From: Sriram Yagnaraman @ 2023-04-15 19:08 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Jason Wang, Dmitry Fleytman, Michael S. Tsirkin,
	Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel



> -----Original Message-----
> From: Akihiko Odaki <akihiko.odaki@daynix.com>
> Sent: Friday, 14 April 2023 13:37
> Cc: Sriram Yagnaraman <sriram.yagnaraman@est.tech>; Jason Wang
> <jasowang@redhat.com>; Dmitry Fleytman <dmitry.fleytman@gmail.com>;
> Michael S. Tsirkin <mst@redhat.com>; Alex Bennée <alex.bennee@linaro.org>;
> Philippe Mathieu-Daudé <philmd@linaro.org>; Thomas Huth
> <thuth@redhat.com>; Wainer dos Santos Moschetta
> <wainersm@redhat.com>; Beraldo Leal <bleal@redhat.com>; Cleber Rosa
> <crosa@redhat.com>; Laurent Vivier <lvivier@redhat.com>; Paolo Bonzini
> <pbonzini@redhat.com>; qemu-devel@nongnu.org; Akihiko Odaki
> <akihiko.odaki@daynix.com>
> Subject: [PATCH 22/40] igb: Add more definitions for Tx descriptor
> 
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>  hw/net/igb_core.c |  2 +-
>  hw/net/igb_regs.h | 32 +++++++++++++++++++++++++++-----
>  2 files changed, 28 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index e5a7021c0e..350462c40c 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -418,7 +418,7 @@ igb_setup_tx_offloads(IGBCore *core, struct igb_tx *tx)
>  {
>      if (tx->first_cmd_type_len & E1000_ADVTXD_DCMD_TSE) {
>          uint32_t idx = (tx->first_olinfo_status >> 4) & 1;
> -        uint32_t mss = tx->ctx[idx].mss_l4len_idx >> 16;
> +        uint32_t mss = tx->ctx[idx].mss_l4len_idx >> E1000_ADVTXD_MSS_SHIFT;
>          if (!net_tx_pkt_build_vheader(tx->tx_pkt, true, true, mss)) {
>              return false;
>          }
> diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
> index c5c5b3c3b8..22ce909173 100644
> --- a/hw/net/igb_regs.h
> +++ b/hw/net/igb_regs.h
> @@ -42,11 +42,6 @@ union e1000_adv_tx_desc {
>      } wb;
>  };
> 
> -#define E1000_ADVTXD_DTYP_CTXT  0x00200000 /* Advanced Context Descriptor */
> -#define E1000_ADVTXD_DTYP_DATA  0x00300000 /* Advanced Data Descriptor */
> -#define E1000_ADVTXD_DCMD_DEXT  0x20000000 /* Descriptor Extension (1=Adv) */
> -#define E1000_ADVTXD_DCMD_TSE   0x80000000 /* TCP/UDP Segmentation Enable */
> -
>  #define E1000_ADVTXD_POTS_IXSM  0x00000100 /* Insert TCP/UDP Checksum */
>  #define E1000_ADVTXD_POTS_TXSM  0x00000200 /* Insert TCP/UDP Checksum */
> 
> @@ -151,6 +146,10 @@ union e1000_adv_rx_desc {
>  #define IGB_82576_VF_DEV_ID        0x10CA
>  #define IGB_I350_VF_DEV_ID         0x1520
> 
> +/* VLAN info */
> +#define IGB_TX_FLAGS_VLAN_MASK     0xffff0000
> +#define IGB_TX_FLAGS_VLAN_SHIFT    16
> +

Doesn't seem to be used anywhere, added by mistake? 

>  /* from igb/e1000_82575.h */
> 
>  #define E1000_MRQC_ENABLE_RSS_MQ            0x00000002
> @@ -160,6 +159,29 @@ union e1000_adv_rx_desc {
>  #define E1000_MRQC_RSS_FIELD_IPV6_UDP       0x00800000
>  #define E1000_MRQC_RSS_FIELD_IPV6_UDP_EX    0x01000000
> 
> +/* Adv Transmit Descriptor Config Masks */
> +#define E1000_ADVTXD_MAC_TSTAMP   0x00080000 /* IEEE1588 Timestamp packet */
> +#define E1000_ADVTXD_DTYP_CTXT    0x00200000 /* Advanced Context Descriptor */
> +#define E1000_ADVTXD_DTYP_DATA    0x00300000 /* Advanced Data Descriptor */
> +#define E1000_ADVTXD_DCMD_EOP     0x01000000 /* End of Packet */
> +#define E1000_ADVTXD_DCMD_IFCS    0x02000000 /* Insert FCS (Ethernet CRC) */
> +#define E1000_ADVTXD_DCMD_RS      0x08000000 /* Report Status */
> +#define E1000_ADVTXD_DCMD_DEXT    0x20000000 /* Descriptor extension (1=Adv) */
> +#define E1000_ADVTXD_DCMD_VLE     0x40000000 /* VLAN pkt enable */

nit: could you use the above definition instead of E1000_TXD_CMD_VLE in igb_tx_insert_vlan()?

> +#define E1000_ADVTXD_DCMD_TSE     0x80000000 /* TCP Seg enable */
> +#define E1000_ADVTXD_PAYLEN_SHIFT    14 /* Adv desc PAYLEN shift */
> +
> +#define E1000_ADVTXD_MACLEN_SHIFT    9  /* Adv ctxt desc mac len shift */
> +#define E1000_ADVTXD_TUCMD_L4T_UDP 0x00000000  /* L4 Packet TYPE of UDP */
> +#define E1000_ADVTXD_TUCMD_IPV4    0x00000400  /* IP Packet Type: 1=IPv4 */
> +#define E1000_ADVTXD_TUCMD_L4T_TCP 0x00000800  /* L4 Packet TYPE of TCP */
> +#define E1000_ADVTXD_TUCMD_L4T_SCTP 0x00001000 /* L4 packet TYPE of SCTP */
> +/* IPSec Encrypt Enable for ESP */
> +#define E1000_ADVTXD_L4LEN_SHIFT     8  /* Adv ctxt L4LEN shift */
> +#define E1000_ADVTXD_MSS_SHIFT      16  /* Adv ctxt MSS shift */
> +/* Adv ctxt IPSec SA IDX mask */
> +/* Adv ctxt IPSec ESP len mask */
> +
>  /* Additional Transmit Descriptor Control definitions */
>  #define E1000_TXDCTL_QUEUE_ENABLE  0x02000000 /* Enable specific Tx Queue */
> 
> --
> 2.40.0
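
As a small sketch of the nit above: the legacy and advanced VLE definitions name the same bit, so swapping one for the other would not change behaviour. The helper names below are hypothetical. The 0xff mask for L4LEN is an assumption inferred from the field sitting between the L4LEN shift (8) and the MSS shift (16); the patch itself only defines the shifts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define E1000_TXD_CMD_VLE        0x40000000 /* legacy descriptor: add VLAN tag */
#define E1000_ADVTXD_DCMD_VLE    0x40000000 /* advanced descriptor: VLAN pkt enable */
#define E1000_ADVTXD_L4LEN_SHIFT 8          /* Adv ctxt L4LEN shift */
#define E1000_ADVTXD_MSS_SHIFT   16         /* Adv ctxt MSS shift */

/* Both names refer to the same bit, so either can be used for the test. */
static_assert(E1000_TXD_CMD_VLE == E1000_ADVTXD_DCMD_VLE,
              "legacy and advanced VLE bits are expected to match");

/* Hypothetical helper: is VLAN insertion requested by this cmd_type_len? */
static bool adv_tx_vlan_enabled(uint32_t cmd_type_len)
{
    return !!(cmd_type_len & E1000_ADVTXD_DCMD_VLE);
}

/* Hypothetical helper: MSS lives in the upper 16 bits of mss_l4len_idx. */
static uint32_t adv_tx_mss(uint32_t mss_l4len_idx)
{
    return mss_l4len_idx >> E1000_ADVTXD_MSS_SHIFT;
}

/* Hypothetical helper: L4LEN, assuming an 8-bit field (see lead-in). */
static uint32_t adv_tx_l4len(uint32_t mss_l4len_idx)
{
    return (mss_l4len_idx >> E1000_ADVTXD_L4LEN_SHIFT) & 0xff;
}
```

For example, a context word of 0x05B41400 decodes to an MSS of 1460 and an L4 header length of 20 under these assumptions.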



* RE: [PATCH 19/40] igb: Read DCMD.VLE of the first Tx descriptor
  2023-04-14 11:37 ` [PATCH 19/40] igb: Read DCMD.VLE of the first Tx descriptor Akihiko Odaki
@ 2023-04-15 19:08   ` Sriram Yagnaraman
  0 siblings, 0 replies; 69+ messages in thread
From: Sriram Yagnaraman @ 2023-04-15 19:08 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Jason Wang, Dmitry Fleytman, Michael S. Tsirkin,
	Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel


> -----Original Message-----
> From: Akihiko Odaki <akihiko.odaki@daynix.com>
> Sent: Friday, 14 April 2023 13:37
> Cc: Sriram Yagnaraman <sriram.yagnaraman@est.tech>; Jason Wang
> <jasowang@redhat.com>; Dmitry Fleytman <dmitry.fleytman@gmail.com>;
> Michael S. Tsirkin <mst@redhat.com>; Alex Bennée <alex.bennee@linaro.org>;
> Philippe Mathieu-Daudé <philmd@linaro.org>; Thomas Huth
> <thuth@redhat.com>; Wainer dos Santos Moschetta
> <wainersm@redhat.com>; Beraldo Leal <bleal@redhat.com>; Cleber Rosa
> <crosa@redhat.com>; Laurent Vivier <lvivier@redhat.com>; Paolo Bonzini
> <pbonzini@redhat.com>; qemu-devel@nongnu.org; Akihiko Odaki
> <akihiko.odaki@daynix.com>
> Subject: [PATCH 19/40] igb: Read DCMD.VLE of the first Tx descriptor
> 
> Section 7.2.2.3 Advanced Transmit Data Descriptor says:
> > For frames that spans multiple descriptors, all fields apart from
> > DCMD.EOP, DCMD.RS, DCMD.DEXT, DTALEN, Address and DTYP are valid only
> > in the first descriptors and are ignored in the subsequent ones.
> 
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>  hw/net/igb_core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index cca71611fe..e5a7021c0e 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -613,7 +613,7 @@ igb_process_tx_desc(IGBCore *core,
>              idx = (tx->first_olinfo_status >> 4) & 1;
>              igb_tx_insert_vlan(core, queue_index, tx,
>                  tx->ctx[idx].vlan_macip_lens >> 16,
> -                !!(cmd_type_len & E1000_TXD_CMD_VLE));
> +                !!(tx->first_cmd_type_len & E1000_TXD_CMD_VLE));
> 
>              if (igb_tx_pkt_send(core, tx, queue_index)) {
>                  igb_on_tx_done_update_stats(core, tx->tx_pkt, queue_index);
> --
> 2.40.0

Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
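
The datasheet rule quoted in the commit message can be sketched as a small state machine: latch cmd_type_len from the first descriptor of a frame and consult that copy when the frame ends. The struct and function below are hypothetical and only loosely modelled on struct igb_tx; they are not the code in igb_core.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define E1000_ADVTXD_DCMD_EOP 0x01000000 /* End of Packet */
#define E1000_ADVTXD_DCMD_VLE 0x40000000 /* VLAN pkt enable */

/* Hypothetical per-queue state. */
struct tx_state {
    bool first;                  /* next descriptor starts a new frame */
    uint32_t first_cmd_type_len; /* latched from the first descriptor */
};

/*
 * Feed one data descriptor's cmd_type_len. Returns true when the frame is
 * complete; in that case *insert_vlan reflects DCMD.VLE of the frame's
 * first descriptor, not of the descriptor carrying EOP.
 */
static bool tx_process_desc(struct tx_state *tx, uint32_t cmd_type_len,
                            bool *insert_vlan)
{
    assert(tx && insert_vlan);

    if (tx->first) {
        tx->first_cmd_type_len = cmd_type_len;
        tx->first = false;
    }

    if (!(cmd_type_len & E1000_ADVTXD_DCMD_EOP)) {
        return false; /* more descriptors belong to this frame */
    }

    *insert_vlan = !!(tx->first_cmd_type_len & E1000_ADVTXD_DCMD_VLE);
    tx->first = true; /* the next descriptor begins a new frame */
    return true;
}
```

A frame whose first descriptor sets VLE and whose last descriptor sets only EOP is tagged here, whereas reading the bit from the last descriptor would miss it.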



* RE: [PATCH 23/40] igb: Share common VF constants
  2023-04-14 15:08   ` Philippe Mathieu-Daudé
@ 2023-04-15 19:08     ` Sriram Yagnaraman
  0 siblings, 0 replies; 69+ messages in thread
From: Sriram Yagnaraman @ 2023-04-15 19:08 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, Akihiko Odaki
  Cc: Jason Wang, Dmitry Fleytman, Michael S. Tsirkin,
	Alex Bennée, Thomas Huth, Wainer dos Santos Moschetta,
	Beraldo Leal, Cleber Rosa, Laurent Vivier, Paolo Bonzini,
	qemu-devel


> -----Original Message-----
> From: Philippe Mathieu-Daudé <philmd@linaro.org>
> Sent: Friday, 14 April 2023 17:09
> To: Akihiko Odaki <akihiko.odaki@daynix.com>
> Cc: Sriram Yagnaraman <sriram.yagnaraman@est.tech>; Jason Wang
> <jasowang@redhat.com>; Dmitry Fleytman <dmitry.fleytman@gmail.com>;
> Michael S. Tsirkin <mst@redhat.com>; Alex Bennée <alex.bennee@linaro.org>;
> Thomas Huth <thuth@redhat.com>; Wainer dos Santos Moschetta
> <wainersm@redhat.com>; Beraldo Leal <bleal@redhat.com>; Cleber Rosa
> <crosa@redhat.com>; Laurent Vivier <lvivier@redhat.com>; Paolo Bonzini
> <pbonzini@redhat.com>; qemu-devel@nongnu.org
> Subject: Re: [PATCH 23/40] igb: Share common VF constants
> 
> On 14/4/23 13:37, Akihiko Odaki wrote:
> > The constants need to be consistent between the PF and VF.
> >
> > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> > ---
> >   hw/net/igb.c        | 10 +++++-----
> >   hw/net/igb_common.h |  8 ++++++++
> >   hw/net/igbvf.c      |  7 -------
> >   3 files changed, 13 insertions(+), 12 deletions(-)
> 
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>


* RE: [PATCH 06/40] net/net_rx_pkt: Use iovec for net_rx_pkt_set_protocols()
  2023-04-14 11:37 ` [PATCH 06/40] net/net_rx_pkt: Use iovec for net_rx_pkt_set_protocols() Akihiko Odaki
@ 2023-04-15 19:09   ` Sriram Yagnaraman
  0 siblings, 0 replies; 69+ messages in thread
From: Sriram Yagnaraman @ 2023-04-15 19:09 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Jason Wang, Dmitry Fleytman, Michael S. Tsirkin,
	Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel


> -----Original Message-----
> From: Akihiko Odaki <akihiko.odaki@daynix.com>
> Sent: Friday, 14 April 2023 13:37
> Cc: Sriram Yagnaraman <sriram.yagnaraman@est.tech>; Jason Wang
> <jasowang@redhat.com>; Dmitry Fleytman <dmitry.fleytman@gmail.com>;
> Michael S. Tsirkin <mst@redhat.com>; Alex Bennée <alex.bennee@linaro.org>;
> Philippe Mathieu-Daudé <philmd@linaro.org>; Thomas Huth
> <thuth@redhat.com>; Wainer dos Santos Moschetta
> <wainersm@redhat.com>; Beraldo Leal <bleal@redhat.com>; Cleber Rosa
> <crosa@redhat.com>; Laurent Vivier <lvivier@redhat.com>; Paolo Bonzini
> <pbonzini@redhat.com>; qemu-devel@nongnu.org; Akihiko Odaki
> <akihiko.odaki@daynix.com>
> Subject: [PATCH 06/40] net/net_rx_pkt: Use iovec for
> net_rx_pkt_set_protocols()
> 
> igb does not properly ensure the buffer passed to
> net_rx_pkt_set_protocols() is contiguous for the entire L2/L3/L4 header.
> Allow it to pass scattered data to net_rx_pkt_set_protocols().
> 
> Fixes: 3a977deebe ("Intrdocue igb device emulation")
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>  hw/net/igb_core.c   |  2 +-
>  hw/net/net_rx_pkt.c | 14 +++++---------
>  hw/net/net_rx_pkt.h | 10 ++++++----
>  hw/net/virtio-net.c |  7 +++++--
>  hw/net/vmxnet3.c    |  7 ++++++-
>  include/net/eth.h   |  6 +++---
>  net/eth.c           | 18 ++++++++----------
>  7 files changed, 34 insertions(+), 30 deletions(-)
> 

Very nice. 
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>

> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 5d4884b834..53f60fc3d3 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -1650,7 +1650,7 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
> 
>      ehdr = PKT_GET_ETH_HDR(filter_buf);
>      net_rx_pkt_set_packet_type(core->rx_pkt, get_eth_packet_type(ehdr));
> -    net_rx_pkt_set_protocols(core->rx_pkt, filter_buf, size);
> +    net_rx_pkt_set_protocols(core->rx_pkt, iov, iovcnt, iov_ofs);
> 
>      queues = igb_receive_assign(core, ehdr, size, &rss_info, external_tx);
>      if (!queues) {
> diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c
> index 39cdea06de..63be6e05ad 100644
> --- a/hw/net/net_rx_pkt.c
> +++ b/hw/net/net_rx_pkt.c
> @@ -103,7 +103,7 @@ net_rx_pkt_pull_data(struct NetRxPkt *pkt,
>                                  iov, iovcnt, ploff, pkt->tot_len);
>      }
> 
> -    eth_get_protocols(pkt->vec, pkt->vec_len, &pkt->hasip4, &pkt->hasip6,
> +    eth_get_protocols(pkt->vec, pkt->vec_len, 0, &pkt->hasip4, &pkt->hasip6,
>                        &pkt->l3hdr_off, &pkt->l4hdr_off, &pkt->l5hdr_off,
>                        &pkt->ip6hdr_info, &pkt->ip4hdr_info, &pkt->l4hdr_info);
> 
> @@ -186,17 +186,13 @@ size_t net_rx_pkt_get_total_len(struct NetRxPkt *pkt)
>      return pkt->tot_len;
>  }
> 
> -void net_rx_pkt_set_protocols(struct NetRxPkt *pkt, const void *data,
> -                              size_t len)
> +void net_rx_pkt_set_protocols(struct NetRxPkt *pkt,
> +                              const struct iovec *iov, size_t iovcnt,
> +                              size_t iovoff)
>  {
> -    const struct iovec iov = {
> -        .iov_base = (void *)data,
> -        .iov_len = len
> -    };
> -
>      assert(pkt);
> 
> -    eth_get_protocols(&iov, 1, &pkt->hasip4, &pkt->hasip6,
> +    eth_get_protocols(iov, iovcnt, iovoff, &pkt->hasip4, &pkt->hasip6,
>                        &pkt->l3hdr_off, &pkt->l4hdr_off, &pkt->l5hdr_off,
>                        &pkt->ip6hdr_info, &pkt->ip4hdr_info, &pkt->l4hdr_info);
>  }
> diff --git a/hw/net/net_rx_pkt.h b/hw/net/net_rx_pkt.h
> index d00b484900..a06f5c2675 100644
> --- a/hw/net/net_rx_pkt.h
> +++ b/hw/net/net_rx_pkt.h
> @@ -55,12 +55,14 @@ size_t net_rx_pkt_get_total_len(struct NetRxPkt *pkt);
>   * parse and set packet analysis results
>   *
>   * @pkt:            packet
> - * @data:           pointer to the data buffer to be parsed
> - * @len:            data length
> + * @iov:            received data scatter-gather list
> + * @iovcnt:         number of elements in iov
> + * @iovoff:         data start offset in the iov
>   *
>   */
> -void net_rx_pkt_set_protocols(struct NetRxPkt *pkt, const void *data,
> -                              size_t len);
> +void net_rx_pkt_set_protocols(struct NetRxPkt *pkt,
> +                              const struct iovec *iov, size_t iovcnt,
> +                              size_t iovoff);
> 
>  /**
>   * fetches packet analysis results
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 53e1c32643..37551fd854 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -1835,9 +1835,12 @@ static int virtio_net_process_rss(NetClientState *nc, const uint8_t *buf,
>          VIRTIO_NET_HASH_REPORT_UDPv6,
>          VIRTIO_NET_HASH_REPORT_UDPv6_EX
>      };
> +    struct iovec iov = {
> +        .iov_base = (void *)buf,
> +        .iov_len = size
> +    };
> 
> -    net_rx_pkt_set_protocols(pkt, buf + n->host_hdr_len,
> -                             size - n->host_hdr_len);
> +    net_rx_pkt_set_protocols(pkt, &iov, 1, n->host_hdr_len);
>      net_rx_pkt_get_protocols(pkt, &hasip4, &hasip6, &l4hdr_proto);
>      net_hash_type = virtio_net_get_hash_type(hasip4, hasip6, l4hdr_proto,
>                                               n->rss_data.hash_types);
> diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
> index 9acff310e7..05f41b6dfa 100644
> --- a/hw/net/vmxnet3.c
> +++ b/hw/net/vmxnet3.c
> @@ -2001,7 +2001,12 @@ vmxnet3_receive(NetClientState *nc, const uint8_t *buf, size_t size)
>          get_eth_packet_type(PKT_GET_ETH_HDR(buf)));
> 
>      if (vmxnet3_rx_filter_may_indicate(s, buf, size)) {
> -        net_rx_pkt_set_protocols(s->rx_pkt, buf, size);
> +        struct iovec iov = {
> +            .iov_base = (void *)buf,
> +            .iov_len = size
> +        };
> +
> +        net_rx_pkt_set_protocols(s->rx_pkt, &iov, 1, 0);
>          vmxnet3_rx_need_csum_calculate(s->rx_pkt, buf, size);
>          net_rx_pkt_attach_data(s->rx_pkt, buf, size, s->rx_vlan_stripping);
>          bytes_indicated = vmxnet3_indicate_packet(s) ? size : -1;
> diff --git a/include/net/eth.h b/include/net/eth.h
> index c5ae4493b4..9f19c3a695 100644
> --- a/include/net/eth.h
> +++ b/include/net/eth.h
> @@ -312,10 +312,10 @@ eth_get_l2_hdr_length(const void *p)
>  }
> 
>  static inline uint32_t
> -eth_get_l2_hdr_length_iov(const struct iovec *iov, int iovcnt)
> +eth_get_l2_hdr_length_iov(const struct iovec *iov, size_t iovcnt, size_t iovoff)
>  {
>      uint8_t p[sizeof(struct eth_header) + sizeof(struct vlan_header)];
> -    size_t copied = iov_to_buf(iov, iovcnt, 0, p, ARRAY_SIZE(p));
> +    size_t copied = iov_to_buf(iov, iovcnt, iovoff, p, ARRAY_SIZE(p));
> 
>      if (copied < ARRAY_SIZE(p)) {
>          return copied;
> @@ -397,7 +397,7 @@ typedef struct eth_l4_hdr_info_st {
>      bool has_tcp_data;
>  } eth_l4_hdr_info;
> 
> -void eth_get_protocols(const struct iovec *iov, int iovcnt,
> +void eth_get_protocols(const struct iovec *iov, size_t iovcnt, size_t iovoff,
>                         bool *hasip4, bool *hasip6,
>                         size_t *l3hdr_off,
>                         size_t *l4hdr_off,
> diff --git a/net/eth.c b/net/eth.c
> index 70bcd8e355..d7b30df79f 100644
> --- a/net/eth.c
> +++ b/net/eth.c
> @@ -136,7 +136,7 @@ _eth_tcp_has_data(bool is_ip4,
>      return l4len > TCP_HEADER_DATA_OFFSET(tcp);
>  }
> 
> -void eth_get_protocols(const struct iovec *iov, int iovcnt,
> +void eth_get_protocols(const struct iovec *iov, size_t iovcnt, size_t iovoff,
>                         bool *hasip4, bool *hasip6,
>                         size_t *l3hdr_off,
>                         size_t *l4hdr_off,
> @@ -147,26 +147,24 @@ void eth_get_protocols(const struct iovec *iov, int iovcnt,
>  {
>      int proto;
>      bool fragment = false;
> -    size_t l2hdr_len = eth_get_l2_hdr_length_iov(iov, iovcnt);
>      size_t input_size = iov_size(iov, iovcnt);
>      size_t copied;
>      uint8_t ip_p;
> 
>      *hasip4 = *hasip6 = false;
> +    *l3hdr_off = iovoff + eth_get_l2_hdr_length_iov(iov, iovcnt, iovoff);
>      l4hdr_info->proto = ETH_L4_HDR_PROTO_INVALID;
> 
> -    proto = eth_get_l3_proto(iov, iovcnt, l2hdr_len);
> -
> -    *l3hdr_off = l2hdr_len;
> +    proto = eth_get_l3_proto(iov, iovcnt, *l3hdr_off);
> 
>      if (proto == ETH_P_IP) {
>          struct ip_header *iphdr = &ip4hdr_info->ip4_hdr;
> 
> -        if (input_size < l2hdr_len) {
> +        if (input_size < *l3hdr_off) {
>              return;
>          }
> 
> -        copied = iov_to_buf(iov, iovcnt, l2hdr_len, iphdr, sizeof(*iphdr));
> +        copied = iov_to_buf(iov, iovcnt, *l3hdr_off, iphdr, sizeof(*iphdr));
>          if (copied < sizeof(*iphdr) ||
>              IP_HEADER_VERSION(iphdr) != IP_HEADER_VERSION_4) {
>              return;
> @@ -175,17 +173,17 @@ void eth_get_protocols(const struct iovec *iov, int iovcnt,
>          *hasip4 = true;
>          ip_p = iphdr->ip_p;
>          ip4hdr_info->fragment = IP4_IS_FRAGMENT(iphdr);
> -        *l4hdr_off = l2hdr_len + IP_HDR_GET_LEN(iphdr);
> +        *l4hdr_off = *l3hdr_off + IP_HDR_GET_LEN(iphdr);
> 
>          fragment = ip4hdr_info->fragment;
>      } else if (proto == ETH_P_IPV6) {
> -        if (!eth_parse_ipv6_hdr(iov, iovcnt, l2hdr_len, ip6hdr_info)) {
> +        if (!eth_parse_ipv6_hdr(iov, iovcnt, *l3hdr_off, ip6hdr_info)) {
>              return;
>          }
> 
>          *hasip6 = true;
>          ip_p = ip6hdr_info->l4proto;
> -        *l4hdr_off = l2hdr_len + ip6hdr_info->full_hdr_len;
> +        *l4hdr_off = *l3hdr_off + ip6hdr_info->full_hdr_len;
>          fragment = ip6hdr_info->fragment;
>      } else {
>          return;
> --
> 2.40.0
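
To make the scatter-gather point concrete, here is a self-contained sketch of reading a header field that straddles two iovec elements, starting at an offset into the list. sketch_iov_to_buf() is a simplified stand-in written for this note, not QEMU's iov_to_buf(), and sketch_get_ethertype() is likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Simplified stand-in for QEMU's iov_to_buf(): copy up to `bytes` bytes
 * starting `offset` bytes into the scatter-gather list. Returns the number
 * of bytes actually copied, which is short if the list runs out.
 */
static size_t sketch_iov_to_buf(const struct iovec *iov, size_t iovcnt,
                                size_t offset, void *buf, size_t bytes)
{
    uint8_t *out = buf;
    size_t done = 0;

    assert(iov || iovcnt == 0);

    for (size_t i = 0; i < iovcnt && done < bytes; i++) {
        size_t len = iov[i].iov_len;

        if (offset >= len) {
            offset -= len; /* skip elements that lie before the offset */
            continue;
        }

        size_t n = len - offset;
        if (n > bytes - done) {
            n = bytes - done;
        }
        memcpy(out + done, (const uint8_t *)iov[i].iov_base + offset, n);
        done += n;
        offset = 0; /* later elements are read from their start */
    }

    return done;
}

/* Read the EtherType of a frame that starts iovoff bytes into the list. */
static int sketch_get_ethertype(const struct iovec *iov, size_t iovcnt,
                                size_t iovoff)
{
    uint8_t raw[2];

    /* 12 = destination MAC (6 bytes) + source MAC (6 bytes) */
    if (sketch_iov_to_buf(iov, iovcnt, iovoff + 12, raw, sizeof(raw)) <
        sizeof(raw)) {
        return -1; /* truncated frame */
    }

    return (raw[0] << 8) | raw[1];
}
```

Passing the offset through, as the patch does with iovoff, lets a caller like virtio-net skip its host header without building a second contiguous buffer.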



* RE: [PATCH 14/40] e1000x: Share more Rx filtering logic
  2023-04-14 11:37 ` [PATCH 14/40] e1000x: Share more Rx filtering logic Akihiko Odaki
@ 2023-04-15 19:10   ` Sriram Yagnaraman
  0 siblings, 0 replies; 69+ messages in thread
From: Sriram Yagnaraman @ 2023-04-15 19:10 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Jason Wang, Dmitry Fleytman, Michael S. Tsirkin,
	Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel


> -----Original Message-----
> From: Akihiko Odaki <akihiko.odaki@daynix.com>
> Sent: Friday, 14 April 2023 13:37
> Cc: Sriram Yagnaraman <sriram.yagnaraman@est.tech>; Jason Wang
> <jasowang@redhat.com>; Dmitry Fleytman <dmitry.fleytman@gmail.com>;
> Michael S. Tsirkin <mst@redhat.com>; Alex Bennée <alex.bennee@linaro.org>;
> Philippe Mathieu-Daudé <philmd@linaro.org>; Thomas Huth
> <thuth@redhat.com>; Wainer dos Santos Moschetta
> <wainersm@redhat.com>; Beraldo Leal <bleal@redhat.com>; Cleber Rosa
> <crosa@redhat.com>; Laurent Vivier <lvivier@redhat.com>; Paolo Bonzini
> <pbonzini@redhat.com>; qemu-devel@nongnu.org; Akihiko Odaki
> <akihiko.odaki@daynix.com>
> Subject: [PATCH 14/40] e1000x: Share more Rx filtering logic
> 
> This saves some code and enables tracepoint for e1000's VLAN filtering.
> 
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>  hw/net/e1000.c         | 35 +++++--------------------------
>  hw/net/e1000e_core.c   | 47 +++++-------------------------------------
>  hw/net/e1000x_common.c | 44 +++++++++++++++++++++++++++++++++------
>  hw/net/e1000x_common.h |  4 +++-
>  hw/net/igb_core.c      | 41 +++---------------------------------
>  hw/net/trace-events    |  4 ++--
>  6 files changed, 56 insertions(+), 119 deletions(-)

Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>


* RE: [PATCH 29/40] igb: Implement MSI-X single vector mode
  2023-04-14 11:37 ` [PATCH 29/40] igb: Implement MSI-X single vector mode Akihiko Odaki
@ 2023-04-15 19:12   ` Sriram Yagnaraman
  0 siblings, 0 replies; 69+ messages in thread
From: Sriram Yagnaraman @ 2023-04-15 19:12 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Jason Wang, Dmitry Fleytman, Michael S. Tsirkin,
	Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel


> -----Original Message-----
> From: Akihiko Odaki <akihiko.odaki@daynix.com>
> Sent: Friday, 14 April 2023 13:37
> Cc: Sriram Yagnaraman <sriram.yagnaraman@est.tech>; Jason Wang
> <jasowang@redhat.com>; Dmitry Fleytman <dmitry.fleytman@gmail.com>;
> Michael S. Tsirkin <mst@redhat.com>; Alex Bennée <alex.bennee@linaro.org>;
> Philippe Mathieu-Daudé <philmd@linaro.org>; Thomas Huth
> <thuth@redhat.com>; Wainer dos Santos Moschetta
> <wainersm@redhat.com>; Beraldo Leal <bleal@redhat.com>; Cleber Rosa
> <crosa@redhat.com>; Laurent Vivier <lvivier@redhat.com>; Paolo Bonzini
> <pbonzini@redhat.com>; qemu-devel@nongnu.org; Akihiko Odaki
> <akihiko.odaki@daynix.com>
> Subject: [PATCH 29/40] igb: Implement MSI-X single vector mode
> 
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>  hw/net/igb_core.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 429b0ebc03..2013a9a53d 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -1870,7 +1870,7 @@ igb_update_interrupt_state(IGBCore *core)
> 
>      icr = core->mac[ICR] & core->mac[IMS];
> 
> -    if (msix_enabled(core->owner)) {
> +    if (core->mac[GPIE] & E1000_GPIE_MSIX_MODE) {
>          if (icr) {
>              causes = 0;
>              if (icr & E1000_ICR_DRSTA) {
> @@ -1905,7 +1905,12 @@ igb_update_interrupt_state(IGBCore *core)
>          trace_e1000e_irq_pending_interrupts(core->mac[ICR] & core->mac[IMS],
>                                              core->mac[ICR], core->mac[IMS]);
> 
> -        if (msi_enabled(core->owner)) {
> +        if (msix_enabled(core->owner)) {
> +            if (icr) {
> +                trace_e1000e_irq_msix_notify_vec(0);
> +                msix_notify(core->owner, 0);
> +            }
> +        } else if (msi_enabled(core->owner)) {
>              if (icr) {
>                  msi_notify(core->owner, 0);
>              }
> --
> 2.40.0

Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
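
Since this patch carries no commit message, the resulting decision order may be easier to see as a standalone sketch. The enum and function are hypothetical. The E1000_GPIE_MSIX_MODE value is the one I remember from the Linux igb headers and should be treated as an assumption, as should the legacy fallback, which lies outside the quoted hunk.

```c
#include <stdbool.h>
#include <stdint.h>

#define E1000_GPIE_MSIX_MODE 0x00000010 /* assumed value, see lead-in */

/* Hypothetical labels for the delivery paths in igb_update_interrupt_state(). */
enum irq_delivery {
    IRQ_MSIX_MULTI,  /* GPIE.MSIX_MODE set: causes are mapped to vectors */
    IRQ_MSIX_SINGLE, /* MSI-X enabled, MSIX_MODE clear: always vector 0 */
    IRQ_MSI,         /* plain MSI */
    IRQ_LEGACY,      /* assumed fallback: INTx pin */
};

static enum irq_delivery igb_irq_delivery(uint32_t gpie, bool msix_enabled,
                                          bool msi_enabled)
{
    if (gpie & E1000_GPIE_MSIX_MODE) {
        return IRQ_MSIX_MULTI;
    }
    if (msix_enabled) {
        return IRQ_MSIX_SINGLE; /* the case added by this patch */
    }
    if (msi_enabled) {
        return IRQ_MSI;
    }
    return IRQ_LEGACY;
}
```

The point of the change is the first test: the multi-vector path is now selected by the guest-programmed GPIE bit rather than by whether MSI-X is enabled on the PCI function.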



* RE: [PATCH 31/40] igb: Use UDP for RSS hash
  2023-04-14 11:37 ` [PATCH 31/40] igb: Use UDP for RSS hash Akihiko Odaki
@ 2023-04-15 19:45   ` Sriram Yagnaraman
  0 siblings, 0 replies; 69+ messages in thread
From: Sriram Yagnaraman @ 2023-04-15 19:45 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Jason Wang, Dmitry Fleytman, Michael S. Tsirkin,
	Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel



> -----Original Message-----
> From: Akihiko Odaki <akihiko.odaki@daynix.com>
> Sent: Friday, 14 April 2023 13:37
> Cc: Sriram Yagnaraman <sriram.yagnaraman@est.tech>; Jason Wang
> <jasowang@redhat.com>; Dmitry Fleytman <dmitry.fleytman@gmail.com>;
> Michael S. Tsirkin <mst@redhat.com>; Alex Bennée <alex.bennee@linaro.org>;
> Philippe Mathieu-Daudé <philmd@linaro.org>; Thomas Huth
> <thuth@redhat.com>; Wainer dos Santos Moschetta
> <wainersm@redhat.com>; Beraldo Leal <bleal@redhat.com>; Cleber Rosa
> <crosa@redhat.com>; Laurent Vivier <lvivier@redhat.com>; Paolo Bonzini
> <pbonzini@redhat.com>; qemu-devel@nongnu.org; Akihiko Odaki
> <akihiko.odaki@daynix.com>
> Subject: [PATCH 31/40] igb: Use UDP for RSS hash
> 
> e1000e does not support using UDP for RSS hash, but igb does.
> 
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>  hw/net/igb_core.c | 16 ++++++++++++++++
>  hw/net/igb_regs.h |  3 +++
>  2 files changed, 19 insertions(+)

Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>

UDP hash types look good to me, but while reviewing this patch I realized MRQC bit 18 is different between igb and e1000e.
igb: MRQC BIT(18) -> TcpIPv6Ex
igb: MRQC BIT(21) -> TcpIPv6
e1000e: MRQC BIT(18) -> TcpIPv6


> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 569897fb99..3ad81b15d0 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -279,6 +279,11 @@ igb_rss_get_hash_type(IGBCore *core, struct NetRxPkt *pkt)
>              return E1000_MRQ_RSS_TYPE_IPV4TCP;
>          }
> 
> +        if (l4hdr_proto == ETH_L4_HDR_PROTO_UDP &&
> +            (core->mac[MRQC] & E1000_MRQC_RSS_FIELD_IPV4_UDP)) {
> +            return E1000_MRQ_RSS_TYPE_IPV4UDP;
> +        }
> +
>          if (E1000_MRQC_EN_IPV4(core->mac[MRQC])) {
>              return E1000_MRQ_RSS_TYPE_IPV4;
>          }
> @@ -314,6 +319,11 @@ igb_rss_get_hash_type(IGBCore *core, struct NetRxPkt *pkt)
>                  return E1000_MRQ_RSS_TYPE_IPV6TCP;
>              }
> 
> +            if (l4hdr_proto == ETH_L4_HDR_PROTO_UDP &&
> +                (core->mac[MRQC] & E1000_MRQC_RSS_FIELD_IPV6_UDP)) {
> +                return E1000_MRQ_RSS_TYPE_IPV6UDP;
> +            }
> +
>              if (E1000_MRQC_EN_IPV6EX(core->mac[MRQC])) {
>                  return E1000_MRQ_RSS_TYPE_IPV6EX;
>              }
> @@ -352,6 +362,12 @@ igb_rss_calc_hash(IGBCore *core, struct NetRxPkt *pkt, E1000E_RSSInfo *info)
>      case E1000_MRQ_RSS_TYPE_IPV6EX:
>          type = NetPktRssIpV6Ex;
>          break;
> +    case E1000_MRQ_RSS_TYPE_IPV4UDP:
> +        type = NetPktRssIpV4Udp;
> +        break;
> +    case E1000_MRQ_RSS_TYPE_IPV6UDP:
> +        type = NetPktRssIpV6Udp;
> +        break;
>      default:
>          assert(false);
>          return 0;
> diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
> index 22ce909173..03486edb2e 100644
> --- a/hw/net/igb_regs.h
> +++ b/hw/net/igb_regs.h
> @@ -659,6 +659,9 @@ union e1000_adv_rx_desc {
> 
>  #define E1000_RSS_QUEUE(reta, hash) (E1000_RETA_VAL(reta, hash) & 0x0F)
> 
> +#define E1000_MRQ_RSS_TYPE_IPV4UDP 7
> +#define E1000_MRQ_RSS_TYPE_IPV6UDP 8
> +
>  #define E1000_STATUS_IOV_MODE 0x00040000
> 
>  #define E1000_STATUS_NUM_VFS_SHIFT 14
> --
> 2.40.0



* RE: [PATCH 37/40] igb: Implement Tx timestamp
  2023-04-14 11:37 ` [PATCH 37/40] igb: Implement Tx timestamp Akihiko Odaki
@ 2023-04-15 20:13   ` Sriram Yagnaraman
  0 siblings, 0 replies; 69+ messages in thread
From: Sriram Yagnaraman @ 2023-04-15 20:13 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Jason Wang, Dmitry Fleytman, Michael S. Tsirkin,
	Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel



> -----Original Message-----
> From: Akihiko Odaki <akihiko.odaki@daynix.com>
> Sent: Friday, 14 April 2023 13:38
> Cc: Sriram Yagnaraman <sriram.yagnaraman@est.tech>; Jason Wang
> <jasowang@redhat.com>; Dmitry Fleytman <dmitry.fleytman@gmail.com>;
> Michael S. Tsirkin <mst@redhat.com>; Alex Bennée <alex.bennee@linaro.org>;
> Philippe Mathieu-Daudé <philmd@linaro.org>; Thomas Huth
> <thuth@redhat.com>; Wainer dos Santos Moschetta
> <wainersm@redhat.com>; Beraldo Leal <bleal@redhat.com>; Cleber Rosa
> <crosa@redhat.com>; Laurent Vivier <lvivier@redhat.com>; Paolo Bonzini
> <pbonzini@redhat.com>; qemu-devel@nongnu.org; Akihiko Odaki
> <akihiko.odaki@daynix.com>
> Subject: [PATCH 37/40] igb: Implement Tx timestamp
> 
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>  hw/net/igb_core.c | 7 +++++++
>  hw/net/igb_regs.h | 3 +++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index c716f400fd..38b53676d4 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -614,6 +614,13 @@ igb_process_tx_desc(IGBCore *core,
>                  tx->first_olinfo_status = le32_to_cpu(tx_desc->read.olinfo_status);
>                  tx->first = false;
>              }
> +
> +            if ((cmd_type_len & E1000_ADVTXD_MAC_TSTAMP) &&

Should ^ be tx->first_cmd_type_len?
Otherwise, Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>

> +                (core->mac[TSYNCTXCTL] & E1000_TSYNCTXCTL_ENABLED) &&
> +                !(core->mac[TSYNCTXCTL] & E1000_TSYNCTXCTL_VALID)) {
> +                core->mac[TSYNCTXCTL] |= E1000_TSYNCTXCTL_VALID;
> +                e1000x_timestamp(core->mac, core->timadj, TXSTMPL, TXSTMPH);
> +            }
>          } else if ((cmd_type_len & E1000_ADVTXD_DTYP_CTXT) ==
>                     E1000_ADVTXD_DTYP_CTXT) {
>              /* advanced transmit context descriptor */
> diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
> index b88dc9f1f1..808b587a36 100644
> --- a/hw/net/igb_regs.h
> +++ b/hw/net/igb_regs.h
> @@ -322,6 +322,9 @@ union e1000_adv_rx_desc {
>  /* E1000_EITR_CNT_IGNR is only for 82576 and newer */
>  #define E1000_EITR_CNT_IGNR     0x80000000 /* Don't reset counters on write */
> 
> +#define E1000_TSYNCTXCTL_VALID    0x00000001 /* tx timestamp valid */
> +#define E1000_TSYNCTXCTL_ENABLED  0x00000010 /* enable tx timestampping */
> +
>  /* PCI Express Control */
>  #define E1000_GCR_CMPL_TMOUT_MASK       0x0000F000
>  #define E1000_GCR_CMPL_TMOUT_10ms       0x00001000
> --
> 2.40.0
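
To make the question about tx->first_cmd_type_len concrete, here is a toy
sketch of the two variants for a packet spread over several data
descriptors. Everything in it is hypothetical (struct, names and the flag
value are placeholders, not QEMU's definitions); it only shows how the two
checks can disagree and does not say which one the hardware implements.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag bit; not the real E1000_ADVTXD_MAC_TSTAMP value. */
#define SKETCH_TXD_TSTAMP (1u << 19)

struct sketch_tx {
    bool first;                  /* true until the first data descriptor */
    uint32_t first_cmd_type_len; /* cmd_type_len latched from that descriptor */
};

/* Latch the first descriptor's flags, as the quoted code does for olinfo_status. */
static inline void sketch_process_desc(struct sketch_tx *tx, uint32_t cmd_type_len)
{
    if (tx->first) {
        tx->first_cmd_type_len = cmd_type_len;
        tx->first = false;
    }
}

/* Variant A: decide from the descriptor currently being processed. */
static inline bool sketch_tstamp_current(uint32_t cmd_type_len)
{
    return (cmd_type_len & SKETCH_TXD_TSTAMP) != 0;
}

/* Variant B: decide from the flags latched from the first descriptor. */
static inline bool sketch_tstamp_first(const struct sketch_tx *tx)
{
    return (tx->first_cmd_type_len & SKETCH_TXD_TSTAMP) != 0;
}
```

If only the first descriptor of a packet sets the flag, variant A sees it
once (while that descriptor is processed) and variant B sees it for every
descriptor of the packet.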



* RE: [PATCH 30/40] igb: Implement igb-specific oversize check
  2023-04-14 11:37 ` [PATCH 30/40] igb: Implement igb-specific oversize check Akihiko Odaki
@ 2023-04-16 11:22   ` Sriram Yagnaraman
  2023-04-22  5:45     ` Akihiko Odaki
  0 siblings, 1 reply; 69+ messages in thread
From: Sriram Yagnaraman @ 2023-04-16 11:22 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Jason Wang, Dmitry Fleytman, Michael S. Tsirkin,
	Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel



> -----Original Message-----
> From: Akihiko Odaki <akihiko.odaki@daynix.com>
> Sent: Friday, 14 April 2023 13:37
> Cc: Sriram Yagnaraman <sriram.yagnaraman@est.tech>; Jason Wang
> <jasowang@redhat.com>; Dmitry Fleytman <dmitry.fleytman@gmail.com>;
> Michael S. Tsirkin <mst@redhat.com>; Alex Bennée <alex.bennee@linaro.org>;
> Philippe Mathieu-Daudé <philmd@linaro.org>; Thomas Huth
> <thuth@redhat.com>; Wainer dos Santos Moschetta
> <wainersm@redhat.com>; Beraldo Leal <bleal@redhat.com>; Cleber Rosa
> <crosa@redhat.com>; Laurent Vivier <lvivier@redhat.com>; Paolo Bonzini
> <pbonzini@redhat.com>; qemu-devel@nongnu.org; Akihiko Odaki
> <akihiko.odaki@daynix.com>
> Subject: [PATCH 30/40] igb: Implement igb-specific oversize check
> 
> igb has a configurable size limit for LPE, and uses different limits depending on
> whether the packet is treated as a VLAN packet.
> 
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>  hw/net/igb_core.c | 41 +++++++++++++++++++++++++++--------------
>  1 file changed, 27 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 2013a9a53d..569897fb99 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -954,16 +954,21 @@ igb_rx_l4_cso_enabled(IGBCore *core)
>      return !!(core->mac[RXCSUM] & E1000_RXCSUM_TUOFLD);
>  }
> 
> -static bool

The convention here seems to be to put the return type on the first line and the function name on the next line.

> -igb_rx_is_oversized(IGBCore *core, uint16_t qn, size_t size)
> +static bool igb_rx_is_oversized(IGBCore *core, const struct eth_header *ehdr,
> +                                size_t size, bool lpe, uint16_t rlpml)
>  {
> -    uint16_t pool = qn % IGB_NUM_VM_POOLS;
> -    bool lpe = !!(core->mac[VMOLR0 + pool] & E1000_VMOLR_LPE);
> -    int max_ethernet_lpe_size =
> -        core->mac[VMOLR0 + pool] & E1000_VMOLR_RLPML_MASK;
> -    int max_ethernet_vlan_size = 1522;
> +    size += 4;

Is the above 4 CRC bytes?

> +
> +    if (lpe) {
> +        return size > rlpml;
> +    }
> +
> +    if (e1000x_is_vlan_packet(ehdr, core->mac[VET] & 0xffff) &&
> +        e1000x_vlan_rx_filter_enabled(core->mac)) {
> +        return size > 1522;
> +    }

Should a check for 1526 bytes if extended VLAN is present be added?
Maybe in "igb: Strip the second VLAN tag for extended VLAN"?

> 
> -    return size > (lpe ? max_ethernet_lpe_size : max_ethernet_vlan_size);
> +    return size > 1518;
>  }
> 
>  static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
> @@ -976,6 +981,8 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
>      uint16_t queues = 0;
>      uint16_t oversized = 0;
>      uint16_t vid = be16_to_cpu(l2_header->vlan[0].h_tci) & VLAN_VID_MASK;
> +    bool lpe;
> +    uint16_t rlpml;
>      int i;
> 
>      memset(rss_info, 0, sizeof(E1000E_RSSInfo));
> @@ -984,6 +991,14 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
>          *external_tx = true;
>      }
> 
> +    lpe = !!(core->mac[RCTL] & E1000_RCTL_LPE);
> +    rlpml = core->mac[RLPML];
> +    if (!(core->mac[RCTL] & E1000_RCTL_SBP) &&
> +        igb_rx_is_oversized(core, ehdr, size, lpe, rlpml)) {
> +        trace_e1000x_rx_oversized(size);
> +        return queues;
> +    }
> +
>      if (e1000x_is_vlan_packet(ehdr, core->mac[VET] & 0xffff) &&
>          !e1000x_rx_vlan_filter(core->mac, PKT_GET_VLAN_HDR(ehdr))) {
>          return queues;
> @@ -1067,7 +1082,10 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
>          queues &= core->mac[VFRE];
>          if (queues) {
>              for (i = 0; i < IGB_NUM_VM_POOLS; i++) {
> -                if ((queues & BIT(i)) && igb_rx_is_oversized(core, i, size)) {
> +                lpe = !!(core->mac[VMOLR0 + i] & E1000_VMOLR_LPE);
> +                rlpml = core->mac[VMOLR0 + i] & E1000_VMOLR_RLPML_MASK;
> +                if ((queues & BIT(i)) &&
> +                    igb_rx_is_oversized(core, ehdr, size, lpe, rlpml)) {
>                      oversized |= BIT(i);
>                  }
>              }
> @@ -1609,11 +1627,6 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
>          iov_to_buf(iov, iovcnt, iov_ofs, &min_buf, sizeof(min_buf.l2_header));
>      }
> 
> -    /* Discard oversized packets if !LPE and !SBP. */
> -    if (e1000x_is_oversized(core->mac, size)) {
> -        return orig_size;
> -    }
> -
>      net_rx_pkt_set_packet_type(core->rx_pkt,
>                                 get_eth_packet_type(&min_buf.l2_header.eth));
>      net_rx_pkt_set_protocols(core->rx_pkt, iov, iovcnt, iov_ofs);
> --
> 2.40.0



* Re: [PATCH 08/40] igb: Always copy ethernet header
  2023-04-14 14:46   ` Philippe Mathieu-Daudé
@ 2023-04-21 12:18     ` Akihiko Odaki
  0 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-21 12:18 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Sriram Yagnaraman, Jason Wang, Dmitry Fleytman,
	Michael S. Tsirkin, Alex Bennée, Thomas Huth,
	Wainer dos Santos Moschetta, Beraldo Leal, Cleber Rosa,
	Laurent Vivier, Paolo Bonzini, qemu-devel

On 2023/04/14 23:46, Philippe Mathieu-Daudé wrote:
> On 14/4/23 13:37, Akihiko Odaki wrote:
>> igb_receive_internal() used to check the iov length to determine whether
>> to copy the iovs to a contiguous buffer, but the check is flawed in two
>> ways:
>> - It does not ensure that iovcnt > 0.
>> - It does not take virtio-net header into consideration.
>>
>> The size of this copy is just 22 octets, which can be even less than
>> the code size required for checks. This (wrong) optimization is probably
>> not worth it, so just remove it. Removing this also allows igb to assume
>> aligned accesses for the ethernet header.
>>
>> Fixes: 3a977deebe ("Intrdocue igb device emulation")
>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>> ---
>>   hw/net/igb_core.c | 39 +++++++++++++++++++++------------------
>>   1 file changed, 21 insertions(+), 18 deletions(-)
>>
>> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
>> index 53f60fc3d3..1d188b526c 100644
>> --- a/hw/net/igb_core.c
>> +++ b/hw/net/igb_core.c
> 
> 
>> -static uint16_t igb_receive_assign(IGBCore *core, const struct eth_header *ehdr,
>> +static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
>>                                      size_t size, E1000E_RSSInfo *rss_info,
>>                                      bool *external_tx)
>>   {
>>       static const int ta_shift[] = { 4, 3, 2, 0 };
>> +    const struct eth_header *ehdr = &l2_header->eth;
>>       uint32_t f, ra[2], *macp, rctl = core->mac[RCTL];
>>       uint16_t queues = 0;
>>       uint16_t oversized = 0;
>> -    uint16_t vid = lduw_be_p(&PKT_GET_VLAN_HDR(ehdr)->h_tci) & VLAN_VID_MASK;
>> +    uint16_t vid = be16_to_cpu(l2_header->vlan[0].h_tci) & VLAN_VID_MASK;
> 
> Why this API change? Are we certain tci is aligned in host memory?

With this change the VLAN tag is always copied into a buffer in host
memory, which ensures that tci is aligned.
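
As a rough illustration of the idea (not the code in this series), the
sketch below copies a header from an arbitrary byte offset in a packet into
a local, naturally aligned struct and only then reads the field. The struct
and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a VLAN header: TCI and encapsulated EtherType. */
struct sketch_vlan_hdr {
    uint16_t h_tci;   /* big-endian on the wire */
    uint16_t h_proto; /* big-endian on the wire */
};

/* Convert a big-endian 16-bit value, held as raw bytes, to host order. */
static inline uint16_t sketch_be16_to_host(uint16_t raw)
{
    uint8_t b[2];

    memcpy(b, &raw, sizeof(b));
    return (uint16_t)((b[0] << 8) | b[1]);
}

/*
 * Copy the header out of the packet at any byte offset into an aligned
 * local struct, then access the field directly.
 */
static inline uint16_t sketch_read_vid(const uint8_t *pkt, size_t ofs)
{
    struct sketch_vlan_hdr hdr;

    memcpy(&hdr, pkt + ofs, sizeof(hdr));
    return sketch_be16_to_host(hdr.h_tci) & 0x0fff; /* VLAN ID is 12 bits */
}
```

Because the field is read from the local copy, the alignment of the
original packet buffer no longer matters.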



* Re: [PATCH 30/40] igb: Implement igb-specific oversize check
  2023-04-16 11:22   ` Sriram Yagnaraman
@ 2023-04-22  5:45     ` Akihiko Odaki
  0 siblings, 0 replies; 69+ messages in thread
From: Akihiko Odaki @ 2023-04-22  5:45 UTC (permalink / raw)
  To: Sriram Yagnaraman
  Cc: Jason Wang, Dmitry Fleytman, Michael S. Tsirkin,
	Alex Bennée, Philippe Mathieu-Daudé,
	Thomas Huth, Wainer dos Santos Moschetta, Beraldo Leal,
	Cleber Rosa, Laurent Vivier, Paolo Bonzini, qemu-devel

On 2023/04/16 20:22, Sriram Yagnaraman wrote:
> 
> 
>> -----Original Message-----
>> From: Akihiko Odaki <akihiko.odaki@daynix.com>
>> Sent: Friday, 14 April 2023 13:37
>> Cc: Sriram Yagnaraman <sriram.yagnaraman@est.tech>; Jason Wang
>> <jasowang@redhat.com>; Dmitry Fleytman <dmitry.fleytman@gmail.com>;
>> Michael S. Tsirkin <mst@redhat.com>; Alex Bennée <alex.bennee@linaro.org>;
>> Philippe Mathieu-Daudé <philmd@linaro.org>; Thomas Huth
>> <thuth@redhat.com>; Wainer dos Santos Moschetta
>> <wainersm@redhat.com>; Beraldo Leal <bleal@redhat.com>; Cleber Rosa
>> <crosa@redhat.com>; Laurent Vivier <lvivier@redhat.com>; Paolo Bonzini
>> <pbonzini@redhat.com>; qemu-devel@nongnu.org; Akihiko Odaki
>> <akihiko.odaki@daynix.com>
>> Subject: [PATCH 30/40] igb: Implement igb-specific oversize check
>>
>> igb has a configurable size limit for LPE, and uses different limits depending on
>> whether the packet is treated as a VLAN packet.
>>
>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>> ---
>>   hw/net/igb_core.c | 41 +++++++++++++++++++++++++++--------------
>>   1 file changed, 27 insertions(+), 14 deletions(-)
>>
>> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
>> index 2013a9a53d..569897fb99 100644
>> --- a/hw/net/igb_core.c
>> +++ b/hw/net/igb_core.c
>> @@ -954,16 +954,21 @@ igb_rx_l4_cso_enabled(IGBCore *core)
>>       return !!(core->mac[RXCSUM] & E1000_RXCSUM_TUOFLD);
>>   }
>>
>> -static bool
> 
> The convention here seems to be to put the return type on the first line and the function name on the next line.

There are already functions that do not follow that convention, and it is
the exception rather than the rule across the QEMU code base. This patch
prioritizes QEMU's common practice over e1000e's old convention.

> 
>> -igb_rx_is_oversized(IGBCore *core, uint16_t qn, size_t size)
>> +static bool igb_rx_is_oversized(IGBCore *core, const struct eth_header *ehdr,
>> +                                size_t size, bool lpe, uint16_t rlpml)
>>   {
>> -    uint16_t pool = qn % IGB_NUM_VM_POOLS;
>> -    bool lpe = !!(core->mac[VMOLR0 + pool] & E1000_VMOLR_LPE);
>> -    int max_ethernet_lpe_size =
>> -        core->mac[VMOLR0 + pool] & E1000_VMOLR_RLPML_MASK;
>> -    int max_ethernet_vlan_size = 1522;
>> +    size += 4;
> 
> Is the above 4 CRC bytes?

Yes. In v2, a new constant ETH_FCS_LEN is used to explicitly state that.

> 
>> +
>> +    if (lpe) {
>> +        return size > rlpml;
>> +    }
>> +
>> +    if (e1000x_is_vlan_packet(ehdr, core->mac[VET] & 0xffff) &&
>> +        e1000x_vlan_rx_filter_enabled(core->mac)) {
>> +        return size > 1522;
>> +    }
> 
> Should a check for 1526 bytes if extended VLAN is present be added?
> Maybe in "igb: Strip the second VLAN tag for extended VLAN"?

In v2, I placed "igb: Strip the second VLAN tag for extended VLAN" 
earlier than this patch, and this patch is rewritten so it can handle 
the second VLAN tag too.
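
For readers following the size arithmetic, here is a rough sketch of the
limits discussed in this thread. It is not the v2 code: the names are
hypothetical, and treating a second tag as another 4 bytes (1526 in total)
is an assumption based on the review question above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FCS_LEN      4    /* CRC bytes appended on the wire */
#define SKETCH_MAX_FRAME    1518 /* untagged limit used in the patch */
#define SKETCH_VLAN_TAG_LEN 4    /* one VLAN tag */

/*
 * size:      frame length without the CRC.
 * lpe/rlpml: long packet enable and its configured limit.
 * vlan_tags: number of VLAN tags counted towards the limit (0, 1 or 2);
 *            the value 2 is an assumption about the extended VLAN case.
 */
static inline bool sketch_rx_is_oversized(size_t size, bool lpe,
                                          uint16_t rlpml, unsigned vlan_tags)
{
    size += SKETCH_FCS_LEN;

    if (lpe) {
        return size > rlpml;
    }

    return size > SKETCH_MAX_FRAME + vlan_tags * SKETCH_VLAN_TAG_LEN;
}
```

With no tag the limit is 1518, with one tag 1522, and under the stated
assumption 1526 with two tags; when lpe is set only rlpml matters.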

> 
>>
>> -    return size > (lpe ? max_ethernet_lpe_size : max_ethernet_vlan_size);
>> +    return size > 1518;
>>   }
>>
>>   static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
>> @@ -976,6 +981,8 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
>>       uint16_t queues = 0;
>>       uint16_t oversized = 0;
>>       uint16_t vid = be16_to_cpu(l2_header->vlan[0].h_tci) & VLAN_VID_MASK;
>> +    bool lpe;
>> +    uint16_t rlpml;
>>       int i;
>>
>>       memset(rss_info, 0, sizeof(E1000E_RSSInfo));
>> @@ -984,6 +991,14 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
>>           *external_tx = true;
>>       }
>>
>> +    lpe = !!(core->mac[RCTL] & E1000_RCTL_LPE);
>> +    rlpml = core->mac[RLPML];
>> +    if (!(core->mac[RCTL] & E1000_RCTL_SBP) &&
>> +        igb_rx_is_oversized(core, ehdr, size, lpe, rlpml)) {
>> +        trace_e1000x_rx_oversized(size);
>> +        return queues;
>> +    }
>> +
>>       if (e1000x_is_vlan_packet(ehdr, core->mac[VET] & 0xffff) &&
>>           !e1000x_rx_vlan_filter(core->mac, PKT_GET_VLAN_HDR(ehdr))) {
>>           return queues;
>> @@ -1067,7 +1082,10 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
>>           queues &= core->mac[VFRE];
>>           if (queues) {
>>               for (i = 0; i < IGB_NUM_VM_POOLS; i++) {
>> -                if ((queues & BIT(i)) && igb_rx_is_oversized(core, i, size)) {
>> +                lpe = !!(core->mac[VMOLR0 + i] & E1000_VMOLR_LPE);
>> +                rlpml = core->mac[VMOLR0 + i] & E1000_VMOLR_RLPML_MASK;
>> +                if ((queues & BIT(i)) &&
>> +                    igb_rx_is_oversized(core, ehdr, size, lpe, rlpml)) {
>>                       oversized |= BIT(i);
>>                   }
>>               }
>> @@ -1609,11 +1627,6 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
>>           iov_to_buf(iov, iovcnt, iov_ofs, &min_buf, sizeof(min_buf.l2_header));
>>       }
>>
>> -    /* Discard oversized packets if !LPE and !SBP. */
>> -    if (e1000x_is_oversized(core->mac, size)) {
>> -        return orig_size;
>> -    }
>> -
>>       net_rx_pkt_set_packet_type(core->rx_pkt,
>>                                  get_eth_packet_type(&min_buf.l2_header.eth));
>>       net_rx_pkt_set_protocols(core->rx_pkt, iov, iovcnt, iov_ofs);
>> --
>> 2.40.0
> 



Thread overview: 69+ messages
2023-04-14 11:36 [PATCH 00/40] igb: Fix for DPDK Akihiko Odaki
2023-04-14 11:36 ` [PATCH 01/40] hw/net/net_tx_pkt: Decouple from PCI Akihiko Odaki
2023-04-14 14:23   ` Philippe Mathieu-Daudé
2023-04-14 11:36 ` [PATCH 02/40] e1000x: Fix BPRC and MPRC Akihiko Odaki
2023-04-14 11:37 ` [PATCH 03/40] igb: Fix Rx packet type encoding Akihiko Odaki
2023-04-15 19:08   ` Sriram Yagnaraman
2023-04-14 11:37 ` [PATCH 04/40] igb: Include the second VLAN tag in the buffer Akihiko Odaki
2023-04-14 14:28   ` Philippe Mathieu-Daudé
2023-04-14 14:32     ` Philippe Mathieu-Daudé
2023-04-14 14:35       ` Philippe Mathieu-Daudé
2023-04-14 11:37 ` [PATCH 05/40] igb: Do not require CTRL.VME for tx VLAN tagging Akihiko Odaki
2023-04-15 19:08   ` Sriram Yagnaraman
2023-04-14 11:37 ` [PATCH 06/40] net/net_rx_pkt: Use iovec for net_rx_pkt_set_protocols() Akihiko Odaki
2023-04-15 19:09   ` Sriram Yagnaraman
2023-04-14 11:37 ` [PATCH 07/40] e1000e: Always copy ethernet header Akihiko Odaki
2023-04-14 11:37 ` [PATCH 08/40] igb: " Akihiko Odaki
2023-04-14 14:46   ` Philippe Mathieu-Daudé
2023-04-21 12:18     ` Akihiko Odaki
2023-04-14 11:37 ` [PATCH 09/40] Fix references to igb Avocado test Akihiko Odaki
2023-04-14 14:47   ` Philippe Mathieu-Daudé
2023-04-14 11:37 ` [PATCH 10/40] tests/avocado: Remove unused imports Akihiko Odaki
2023-04-14 11:37 ` [PATCH 11/40] tests/avocado: Remove test_igb_nomsi_kvm Akihiko Odaki
2023-04-14 11:37 ` [PATCH 12/40] hw/net/net_tx_pkt: Remove net_rx_pkt_get_l4_info Akihiko Odaki
2023-04-14 11:37 ` [PATCH 13/40] net/eth: Rename eth_setup_vlan_headers_ex Akihiko Odaki
2023-04-14 11:37 ` [PATCH 14/40] e1000x: Share more Rx filtering logic Akihiko Odaki
2023-04-15 19:10   ` Sriram Yagnaraman
2023-04-14 11:37 ` [PATCH 15/40] e1000x: Take CRC into consideration for size check Akihiko Odaki
2023-04-14 15:03   ` Philippe Mathieu-Daudé
2023-04-14 11:37 ` [PATCH 16/40] e1000e: Always log status after building rx metadata Akihiko Odaki
2023-04-14 15:04   ` Philippe Mathieu-Daudé
2023-04-14 11:37 ` [PATCH 17/40] igb: " Akihiko Odaki
2023-04-14 15:07   ` Philippe Mathieu-Daudé
2023-04-14 11:37 ` [PATCH 18/40] igb: Remove goto Akihiko Odaki
2023-04-15 19:08   ` Sriram Yagnaraman
2023-04-14 11:37 ` [PATCH 19/40] igb: Read DCMD.VLE of the first Tx descriptor Akihiko Odaki
2023-04-15 19:08   ` Sriram Yagnaraman
2023-04-14 11:37 ` [PATCH 20/40] e1000e: Reset packet state after emptying Tx queue Akihiko Odaki
2023-04-14 11:37 ` [PATCH 21/40] vmxnet3: " Akihiko Odaki
2023-04-14 11:37 ` [PATCH 22/40] igb: Add more definitions for Tx descriptor Akihiko Odaki
2023-04-15 19:08   ` Sriram Yagnaraman
2023-04-14 11:37 ` [PATCH 23/40] igb: Share common VF constants Akihiko Odaki
2023-04-14 15:08   ` Philippe Mathieu-Daudé
2023-04-15 19:08     ` Sriram Yagnaraman
2023-04-14 11:37 ` [PATCH 24/40] igb: Fix igb_mac_reg_init alignment Akihiko Odaki
2023-04-14 15:09   ` Philippe Mathieu-Daudé
2023-04-14 11:37 ` [PATCH 25/40] net/eth: Use void pointers Akihiko Odaki
2023-04-14 15:10   ` Philippe Mathieu-Daudé
2023-04-14 11:37 ` [PATCH 26/40] net/eth: Always add VLAN tag Akihiko Odaki
2023-04-14 11:37 ` [PATCH 27/40] hw/net/net_rx_pkt: Enforce alignment for eth_header Akihiko Odaki
2023-04-14 11:37 ` [PATCH 28/40] tests/qtest/libqos/igb: Set GPIE.Multiple_MSIX Akihiko Odaki
2023-04-14 11:37 ` [PATCH 29/40] igb: Implement MSI-X single vector mode Akihiko Odaki
2023-04-15 19:12   ` Sriram Yagnaraman
2023-04-14 11:37 ` [PATCH 30/40] igb: Implement igb-specific oversize check Akihiko Odaki
2023-04-16 11:22   ` Sriram Yagnaraman
2023-04-22  5:45     ` Akihiko Odaki
2023-04-14 11:37 ` [PATCH 31/40] igb: Use UDP for RSS hash Akihiko Odaki
2023-04-15 19:45   ` Sriram Yagnaraman
2023-04-14 11:37 ` [PATCH 32/40] igb: Implement Rx SCTP CSO Akihiko Odaki
2023-04-14 11:37 ` [PATCH 33/40] igb: Implement Tx " Akihiko Odaki
2023-04-14 11:37 ` [PATCH 34/40] igb: Strip the second VLAN tag for extended VLAN Akihiko Odaki
2023-04-14 11:37 ` [PATCH 35/40] igb: Filter with " Akihiko Odaki
2023-04-14 11:37 ` [PATCH 36/40] igb: Implement Rx PTP2 timestamp Akihiko Odaki
2023-04-14 11:37 ` [PATCH 37/40] igb: Implement Tx timestamp Akihiko Odaki
2023-04-15 20:13   ` Sriram Yagnaraman
2023-04-14 11:37 ` [PATCH 38/40] vmxnet3: Do not depend on PC Akihiko Odaki
2023-04-14 15:13   ` Philippe Mathieu-Daudé
2023-04-14 11:37 ` [PATCH 39/40] MAINTAINERS: Add a reviewer for network packet abstractions Akihiko Odaki
2023-04-14 15:13   ` Philippe Mathieu-Daudé
2023-04-14 11:37 ` [PATCH 40/40] docs/system/devices/igb: Note igb is tested for DPDK Akihiko Odaki
