mbuf changes

From: Morten Brørup @ 2016-10-24 15:49 UTC
  To: dev; +Cc: Olivier Matz

First of all: Thanks for a great DPDK Userspace 2016!

Continuing the Userspace discussion about Olivier Matz’s proposed mbuf changes...

1.

Stephen Hemminger had a noteworthy general comment about keeping NIC metadata in the appropriate section of the mbuf: metadata generated by the NIC’s RX handler belongs in the first cache line, and metadata required by the NIC’s TX handler belongs in the second cache line. It follows that touching the second cache line on ingress should be avoided if possible. Bruce Richardson mentioned that this is the reason m->next was zeroed on free().
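To make the cache line split concrete, here is a minimal sketch. It is not the real struct rte_mbuf layout: the struct name, the selection of fields and the 64-byte cache line size are all assumptions chosen for illustration. It only shows how RX-written fields can be kept in the first cache line and TX-read fields pushed to the second, with compile-time checks on the offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE 64 /* assumed cache line size */

/* Hypothetical, simplified mbuf; NOT the real struct rte_mbuf. */
struct sketch_layout_mbuf {
    /* First cache line: fields written by the NIC's RX handler. */
    void     *buf_addr;
    uint16_t  data_off;
    uint16_t  data_len;
    uint32_t  pkt_len;
    uint32_t  packet_type;
    uint32_t  rss_hash;

    /* Second cache line: fields needed by the NIC's TX handler.
     * The alignment specifier forces 'next' onto a new cache line. */
    _Alignas(SKETCH_CACHE_LINE) struct sketch_layout_mbuf *next;
    uint64_t  tx_offload;
};

/* Compile-time checks: the RX path stays within the first cache line. */
static_assert(offsetof(struct sketch_layout_mbuf, rss_hash) < SKETCH_CACHE_LINE,
              "RX metadata must stay in the first cache line");
static_assert(offsetof(struct sketch_layout_mbuf, next) == SKETCH_CACHE_LINE,
              "TX metadata must start at the second cache line");

/* Returns the index of the cache line that holds the given byte offset. */
static inline size_t sketch_cache_line_of(size_t offset)
{
    return offset / SKETCH_CACHE_LINE;
}
```

With such a layout, an RX handler that only fills in the first group of fields never has to touch the second cache line.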

2.

There seemed to be consensus that the size of m->refcnt should match the size of m->port because a packet could be duplicated on all physical ports for L3 multicast and L2 flooding.

Furthermore, although a single physical machine (i.e. a single server) with 255 physical ports probably doesn’t exist, it might contain more than 255 virtual machines with a virtual port each, so it makes sense to extend these mbuf fields from 8 to 16 bits.
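The width argument can be illustrated with a small sketch. The helper names are hypothetical, and it assumes that flooding a packet to N ports takes N references on the same mbuf. Under that assumption, an 8-bit counter wraps around once there are 256 ports, while a 16-bit counter does not.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: value of an 8-bit counter, starting at 0,
 * after taking n references. Unsigned arithmetic wraps modulo 256. */
static inline uint8_t sketch_ref8_after(unsigned n)
{
    uint8_t r = 0;
    while (n--)
        r++;
    return r;
}

/* Same as above with a 16-bit counter, which wraps modulo 65536. */
static inline uint16_t sketch_ref16_after(unsigned n)
{
    uint16_t r = 0;
    while (n--)
        r++;
    return r;
}
```

For 255 references both counters agree. At 256 references the 8-bit counter reads 0 again, which would look like an unreferenced mbuf.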

3.

Someone (Bruce Richardson?) suggested moving m->refcnt and m->port to the second cache line, which then generated questions from the audience about the real-life purpose of m->port, and whether m->port could be removed from the mbuf structure.

4.

I suggested using an offset of -1 for m->refcnt (i.e. storing the reference count minus one), so that m->refcnt becomes 0 on first allocation. This is based on the assumption that other mbuf fields must be zeroed at alloc()/free() anyway, so zeroing m->refcnt is cheaper than setting it to 1.

Furthermore (regardless of m->refcnt offset), I suggested that it is not required to modify m->refcnt when allocating and freeing the mbuf, thus saving one write operation on both alloc() and free(). However, this assumes that m->refcnt debugging, e.g. underrun detection, is not required.
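Both suggestions in this item can be sketched as follows. This is a minimal single-threaded illustration, not DPDK code: the struct, the field name refcnt_m1 and the helper functions are hypothetical, and a real implementation would need atomic operations. The stored value is the logical reference count minus one, so an all-zero mbuf already has one reference, and dropping the last reference leaves the stored value untouched.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mbuf fragment; only the refcnt idea matters here. */
struct sketch_rc_mbuf {
    uint16_t refcnt_m1; /* stores (logical reference count - 1) */
    uint16_t port;
    void    *next;
};

/* Zeroing the mbuf also sets the logical reference count to 1. */
static inline void sketch_reset(struct sketch_rc_mbuf *m)
{
    memset(m, 0, sizeof(*m));
}

/* Logical reference count as seen by the application. */
static inline unsigned sketch_refcnt(const struct sketch_rc_mbuf *m)
{
    return (unsigned)m->refcnt_m1 + 1u;
}

/* Takes one more reference (not atomic in this sketch). */
static inline void sketch_ref(struct sketch_rc_mbuf *m)
{
    m->refcnt_m1++;
}

/* Drops one reference. Returns 1 if it was the last one, i.e. the mbuf
 * can go back to the pool. In that case the stored value is already 0
 * and is not written, so neither free() nor the next alloc() touches it. */
static inline int sketch_unref(struct sketch_rc_mbuf *m)
{
    if (m->refcnt_m1 == 0)
        return 1;
    m->refcnt_m1--;
    return 0;
}
```

The trade-off mentioned above is visible here: with no write on the final unref, a double free cannot be detected from the counter alone.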

5.

And here’s something new to think about:

m->next already reveals whether a packet has more segments. What purpose does m->nb_segs serve that is not already covered by m->next?
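To make the question concrete, here is a minimal sketch with a hypothetical segment struct (not the real mbuf; the field width is an assumption). It shows that the segment count can be derived by walking m->next, at the cost of visiting every segment, whereas a stored nb_segs is a single read from the first segment. Whether that difference justifies the field is exactly the open question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment descriptor with only the two fields discussed. */
struct sketch_seg {
    struct sketch_seg *next;    /* next segment, or NULL for the last one */
    uint8_t            nb_segs; /* segment count, valid in the first segment */
};

/* Derives the segment count from the next pointers alone.
 * Cost is linear in the number of segments. */
static inline unsigned sketch_count_segs(const struct sketch_seg *m)
{
    unsigned n = 0;
    for (; m != NULL; m = m->next)
        n++;
    return n;
}
```

For a correctly maintained three-segment chain, sketch_count_segs() on the first segment returns the same value that the first segment stores in nb_segs.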

Med venlig hilsen / kind regards

Morten Brørup
CTO
SmartShare Systems A/S
Tonsbakken 16-18
DK-2740 Skovlunde
Denmark

Office  +45 70 20 00 93
Direct  +45 89 93 50 22
Mobile  +45 25 40 82 12

mb@smartsharesystems.com
www.smartsharesystems.com
