On Wed, Nov 27, 2019 at 03:39:03PM +0300, ASM wrote:
> When the packet is received, e1000 writes it to memory directly
> without any RCU.
> The address of memory for writing is set by the driver from dpdk driver.
> Driver writes to RDBA (RDBAH,RDBAL) base address of ring.
>
> It turns out that MMIO RCU (mentioned from e1000_mmio_setup) does not
> protect, and can't protect the ring descriptors.
> The area for protection may be any area of operational memory. And it
> becomes famous when writing to registers RDBA by driver.
> (see datasheet 82574 GbE Controller "7.1.8 Receive Descriptor Queue Structure")
>
> How can this memory be protected? As I understand it, the e1000 should
> track the record in RDBA and enable memory protection in this region.
> But how to do it right?

I misunderstood the issue and you can probably ignore my comments about
coalesced MMIO. You quoted descriptor DMA code below so coalesced MMIO
shouldn't be relevant since desc->status isn't an MMIO register.

>
> Source code qemu:
> hw/net/e1000.c:954 (version master)
>
> 954         base = rx_desc_base(s) + sizeof(desc) * s->mac_reg[RDH];
> where rx_desc_base -- address RDBAH regs. It address no have RCU protect.
> ...
> 955         pci_dma_read(d, base, &desc, sizeof(desc));
> ...
> 957         desc.status |= (vlan_status | E1000_RXD_STAT_DD);
> ...
> 990         pci_dma_write(d, base, &desc, sizeof(desc));
> ->
> exec.c:
> 3111 static MemTxResult flatview_write_continue(FlatView *fv, hwaddr addr,
> 3112                                            MemTxAttrs attrs,
> 3113                                            const uint8_t *buf,
> 3114                                            hwaddr len, hwaddr addr1,
> 3115                                            hwaddr l, MemoryRegion *mr)
> 3116 {
> ...
> 3123     if (!memory_access_is_direct(mr, true)) {
> (false)
> 3131     } else {
> 3132         /* RAM case */
> 3133         ptr = qemu_ram_ptr_length(mr->ram_block, addr1, &l, false);
> 3134         memcpy(ptr, buf, l);
>
> where I be seeing weird behavior with KVM due to MMIO write coalescing
>
> 3135         invalidate_and_set_dirty(mr, addr1, l);
> 3136     }
> 3137
>
> Source code dpdk(e1000): (version dpdk-stable-17.11.9)
> drivers/net/e1000/em_rxtx.c:
>
> 699 uint16_t
> 700 eth_em_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
> 701                  uint16_t nb_pkts)
> ...
> 718     rxq = rx_queue
> ...
> 722     rx_id = rxq->rx_tail;
> 723     rx_ring = rxq->rx_ring
> ...
> 734         rxdp = &rx_ring[rx_id];
> 735         status = rxdp->status;
> 736         if (! (status & E1000_RXD_STAT_DD))
> 737             break;
> ...
> 807         rxdp->buffer_addr = dma_addr;
> 808         rxdp->status = 0;
> where I be seeing weird behavior with KVM due to MMIO write
> coalescing

It could be a bug in QEMU's e1000 emulation - maybe it's not doing
things in the correct order and causes a race condition with the DPDK
polling driver - or it could be a bug in the DPDK e1000 driver regarding
the order in which the descriptor ring and RX Head/Tail MMIO registers
are updated.

Did you find the root cause?

> P.S.
> > Also, is DPDK accessing the e1000 device from more than 1 vCPU?
> All tests on single virtual CPU.
>
> I created github project for quick reproduction of this error:
> https://github.com/BASM/qemu_dpdk_e1000_test
>
> ---
> Best regards,
> Leonid Myravjev
>
> On Thu, 21 Nov 2019 at 17:05, Stefan Hajnoczi wrote:
> >
> > On Wed, Nov 20, 2019 at 08:36:32PM +0300, ASM wrote:
> > > I trying solve the problem, with packets stopping (e1000,tap,kvm).
> > > My studies led to the following:
> > > 1. From flatview_write_continue() I see, what e1000 writes the number
> > > "7" to the STAT register.
> > > 2. The driver from target OS reads STAT register with number "7" and
> > > writes to the register the number "0".
> > > 3. From flatview_write_continue() (I make edits):
> > >     memcpy(ptr, buf, l);
> > >     new1=ptr[0xc];
> > >     usleep(100);
> > >     new2=ptr[0xc];
> > >     invalidate_and_set_dirty(mr, addr1, l);
> > >     new3=ptr[0xc];
> > >     printf("Old: %i, new1, %i, new2: %i, new3: %i\n", old,new1,new2,new3);
> > >
> > > I see what memory in first printf is "7", but after usleep() is "0".
> > > Do I understand correctly that this should not be? Or RCU lock
> > > suggests the ability to the multiple writers?
> > >
> > > The problem is that qemu(e1000) writes the number 7, after which
> > > target(dpdk driver) reads 7, on the basis of this it writes the number
> > > 0, but as a result (extremely rarely), the value STATUS still remains
> > > 7. Therefore, packet processing is interrupted. This behavior is
> > > observed only on kvm (it is not observed on tcg).
> > >
> > > Please help with advice or ideas.
> >
> > Hi Leonid,
> > Could you be seeing weird behavior with KVM due to MMIO write
> > coalescing?
> >
> > static void e1000_mmio_setup(E1000State *d)
> > {
> >     int i;
> >     const uint32_t excluded_regs[] = {
> >         E1000_MDIC, E1000_ICR, E1000_ICS, E1000_IMS,
> >         E1000_IMC, E1000_TCTL, E1000_TDT, PNPMMIO_SIZE
> >     };
> >
> >     memory_region_init_io(&d->mmio, OBJECT(d), &e1000_mmio_ops, d,
> >                           "e1000-mmio", PNPMMIO_SIZE);
> >     memory_region_add_coalescing(&d->mmio, 0, excluded_regs[0]);
> >     for (i = 0; excluded_regs[i] != PNPMMIO_SIZE; i++)
> >         memory_region_add_coalescing(&d->mmio, excluded_regs[i] + 4,
> >                                      excluded_regs[i+1] - excluded_regs[i] - 4);
> >     memory_region_init_io(&d->io, OBJECT(d), &e1000_io_ops, d, "e1000-io", IOPORT_SIZE);
> > }
> >
> > MMIO write coalescing means that QEMU doesn't see the register writes
> > immediately. Instead kvm.ko records them into a ring buffer and QEMU
> > processes the ring when the next ioctl(KVM_RUN) exit occurs.
> >
> > See Linux Documentation/virt/kvm/api.txt "4.116
> > KVM_(UN)REGISTER_COALESCED_MMIO" for more details.
> >
> > I don't really understand your printf debugging explanation. It would
> > help to see the DPDK code and the exact printf() output.
> >
> > Also, is DPDK accessing the e1000 device from more than 1 vCPU?
> >
> > Stefan
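
To make the ordering question in my reply concrete, here is a minimal
sketch. It is not QEMU or DPDK code: struct demo_rx_desc is a simplified
stand-in for an RX descriptor, demo_device_complete() and
demo_driver_poll() are hypothetical helpers, and C11 atomics stand in for
whatever ordering the real device model and driver rely on. It only shows
the property a polling driver depends on: the "done" bit becomes visible
after the rest of the descriptor, and the driver clears it only after it
has read the fields it needs.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "descriptor done" bit for this sketch. */
#define DEMO_STAT_DD 0x01

/* Simplified descriptor; NOT the real e1000 descriptor layout. */
struct demo_rx_desc {
    uint64_t buffer_addr;
    uint16_t length;
    _Atomic uint8_t status;
};

/* Device side: write the payload fields first, publish DD last. */
static void demo_device_complete(struct demo_rx_desc *d, uint16_t len,
                                 uint8_t flags)
{
    d->length = len;
    /* Release: the length store above is visible before DD is seen. */
    atomic_store_explicit(&d->status, (uint8_t)(flags | DEMO_STAT_DD),
                          memory_order_release);
}

/* Driver side: poll DD, consume the descriptor, then recycle it. */
static bool demo_driver_poll(struct demo_rx_desc *d, uint64_t new_buf,
                             uint16_t *len_out)
{
    uint8_t st = atomic_load_explicit(&d->status, memory_order_acquire);

    if (!(st & DEMO_STAT_DD))
        return false;               /* nothing received yet */

    *len_out = d->length;           /* read fields before recycling */
    d->buffer_addr = new_buf;       /* hand a fresh buffer to the device */
    /* Clear status last so the device never sees a half-recycled slot. */
    atomic_store_explicit(&d->status, 0, memory_order_release);
    return true;
}
```

If the device side instead wrote the whole descriptor in one non-atomic
copy, or wrote it again after the driver had already cleared the status
byte, the driver could find a stale "done" status in a slot it had
already recycled. That would look like the symptom described above, but
this is only an illustration of the kind of race I mean, not a diagnosis.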