All of lore.kernel.org
* ixgbe: prefetch packet headers in vector PMD receive function
@ 2015-09-01 19:17 Zoltan Kiss
  2015-09-07 12:25 ` [PATCH] " Zoltan Kiss
  0 siblings, 1 reply; 14+ messages in thread
From: Zoltan Kiss @ 2015-09-01 19:17 UTC (permalink / raw)
  To: dev

The lack of this prefetch causes a significant performance drop in
OVS-DPDK: 13.3 Mpps instead of 14 when forwarding 64 byte packets. Even
though OVS prefetches the next packet's header before it starts processing
the current one, it doesn't get there fast enough. This aligns with the
behaviour of other receive functions.

Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org>
---
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
index cf25a53..51299fa 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
@@ -502,6 +502,15 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
                _mm_storeu_si128((void *)&rx_pkts[pos]->rx_descriptor_fields1,
                                pkt_mb1);
 
+               rte_packet_prefetch((char*)(rx_pkts[pos]->buf_addr) +
+                                   RTE_PKTMBUF_HEADROOM);
+               rte_packet_prefetch((char*)(rx_pkts[pos + 1]->buf_addr) +
+                                   RTE_PKTMBUF_HEADROOM);
+               rte_packet_prefetch((char*)(rx_pkts[pos + 2]->buf_addr) +
+                                   RTE_PKTMBUF_HEADROOM);
+               rte_packet_prefetch((char*)(rx_pkts[pos + 3]->buf_addr) +
+                                   RTE_PKTMBUF_HEADROOM);
+
                /* C.4 calc avaialbe number of desc */
                var = __builtin_popcountll(_mm_cvtsi128_si64(staterr));
                nb_pkts_recd += var;

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
  2015-09-01 19:17 ixgbe: prefetch packet headers in vector PMD receive function Zoltan Kiss
@ 2015-09-07 12:25 ` Zoltan Kiss
  2015-09-07 12:57   ` Richardson, Bruce
  0 siblings, 1 reply; 14+ messages in thread
From: Zoltan Kiss @ 2015-09-07 12:25 UTC (permalink / raw)
  To: dev

Hi,

I just realized I missed the "[PATCH]" tag in the subject. Has anyone 
had time to review this?

Regards,

Zoltan

On 01/09/15 20:17, Zoltan Kiss wrote:
> The lack of this prefetch causes a significant performance drop in
> OVS-DPDK: 13.3 Mpps instead of 14 when forwarding 64 byte packets. Even
> though OVS prefetches the next packet's header before it starts processing
> the current one, it doesn't get there fast enough. This aligns with the
> behaviour of other receive functions.
>
> Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org>
> ---
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> index cf25a53..51299fa 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> @@ -502,6 +502,15 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>                  _mm_storeu_si128((void *)&rx_pkts[pos]->rx_descriptor_fields1,
>                                  pkt_mb1);
>
> +               rte_packet_prefetch((char*)(rx_pkts[pos]->buf_addr) +
> +                                   RTE_PKTMBUF_HEADROOM);
> +               rte_packet_prefetch((char*)(rx_pkts[pos + 1]->buf_addr) +
> +                                   RTE_PKTMBUF_HEADROOM);
> +               rte_packet_prefetch((char*)(rx_pkts[pos + 2]->buf_addr) +
> +                                   RTE_PKTMBUF_HEADROOM);
> +               rte_packet_prefetch((char*)(rx_pkts[pos + 3]->buf_addr) +
> +                                   RTE_PKTMBUF_HEADROOM);
> +
>                  /* C.4 calc avaialbe number of desc */
>                  var = __builtin_popcountll(_mm_cvtsi128_si64(staterr));
>                  nb_pkts_recd += var;
>


* Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
  2015-09-07 12:25 ` [PATCH] " Zoltan Kiss
@ 2015-09-07 12:57   ` Richardson, Bruce
  2015-09-07 14:15     ` Zoltan Kiss
  0 siblings, 1 reply; 14+ messages in thread
From: Richardson, Bruce @ 2015-09-07 12:57 UTC (permalink / raw)
  To: Zoltan Kiss, dev



> -----Original Message-----
> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> Sent: Monday, September 7, 2015 1:26 PM
> To: dev@dpdk.org
> Cc: Ananyev, Konstantin; Richardson, Bruce
> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive
> function
> 
> Hi,
> 
> I just realized I've missed the "[PATCH]" tag from the subject. Did anyone
> had time to review this?
> 

Hi Zoltan,

the big thing that concerns me with this is the addition of new instructions for
each packet in the fast path. Ideally, this prefetching would be better handled
in the application itself, as for some apps, e.g. those using pipelining, the
core doing the RX from the NIC may not touch the packet data at all, and the
prefetches will instead cause a performance slowdown.

Is it possible to get the same performance increase - or something close to it -
by making changes in OVS?

Regards,
/Bruce

> Regards,
> 
> Zoltan
> 
> On 01/09/15 20:17, Zoltan Kiss wrote:
> > The lack of this prefetch causes a significant performance drop in
> > OVS-DPDK: 13.3 Mpps instead of 14 when forwarding 64 byte packets.
> > Even though OVS prefetches the next packet's header before it starts
> > processing the current one, it doesn't get there fast enough. This
> > aligns with the behaviour of other receive functions.
> >
> > Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org>
> > ---
> > diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> > b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> > index cf25a53..51299fa 100644
> > --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> > +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> > @@ -502,6 +502,15 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq,
> struct rte_mbuf **rx_pkts,
> >                  _mm_storeu_si128((void *)&rx_pkts[pos]-
> >rx_descriptor_fields1,
> >                                  pkt_mb1);
> >
> > +               rte_packet_prefetch((char*)(rx_pkts[pos]->buf_addr) +
> > +                                   RTE_PKTMBUF_HEADROOM);
> > +               rte_packet_prefetch((char*)(rx_pkts[pos + 1]->buf_addr)
> +
> > +                                   RTE_PKTMBUF_HEADROOM);
> > +               rte_packet_prefetch((char*)(rx_pkts[pos + 2]->buf_addr)
> +
> > +                                   RTE_PKTMBUF_HEADROOM);
> > +               rte_packet_prefetch((char*)(rx_pkts[pos + 3]->buf_addr)
> +
> > +                                   RTE_PKTMBUF_HEADROOM);
> > +
> >                  /* C.4 calc avaialbe number of desc */
> >                  var = __builtin_popcountll(_mm_cvtsi128_si64(staterr));
> >                  nb_pkts_recd += var;
> >


* Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
  2015-09-07 12:57   ` Richardson, Bruce
@ 2015-09-07 14:15     ` Zoltan Kiss
  2015-09-07 14:41       ` Richardson, Bruce
  0 siblings, 1 reply; 14+ messages in thread
From: Zoltan Kiss @ 2015-09-07 14:15 UTC (permalink / raw)
  To: Richardson, Bruce, dev



On 07/09/15 13:57, Richardson, Bruce wrote:
>
>
>> -----Original Message-----
>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>> Sent: Monday, September 7, 2015 1:26 PM
>> To: dev@dpdk.org
>> Cc: Ananyev, Konstantin; Richardson, Bruce
>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive
>> function
>>
>> Hi,
>>
>> I just realized I've missed the "[PATCH]" tag from the subject. Did anyone
>> had time to review this?
>>
>
> Hi Zoltan,
>
> the big thing that concerns me with this is the addition of new instructions for
> each packet in the fast path. Ideally, this prefetching would be better handled
> in the application itself, as for some apps, e.g. those using pipelining, the
> core doing the RX from the NIC may not touch the packet data at all, and the
> prefetches will instead cause a performance slowdown.
>
> Is it possible to get the same performance increase - or something close to it -
> by making changes in OVS?

OVS already does a prefetch when it's processing the previous packet, 
but apparently that's not early enough, at least in my test scenario, 
where I'm forwarding UDP packets with the least possible overhead. I 
guess in tests where OVS does more complex processing it should be fine.
I'll try to move the prefetch earlier in the OVS codebase, but I'm not 
sure if it'll help.
Also, I've checked the PMD receive functions, and it's quite mixed 
whether they prefetch the header or not. The other three ixgbe receive 
functions all do, for example, as do the following drivers:

bnx2x
e1000
fm10k (scattered)
i40e
igb
virtio

While these drivers don't do that:

cxgbe
enic
fm10k (non-scattered)
mlx4

I think it would be better to add rte_packet_prefetch() everywhere, 
because then applications can turn that off with 
CONFIG_RTE_PMD_PACKET_PREFETCH.

>
> Regards,
> /Bruce
>
>> Regards,
>>
>> Zoltan
>>
>> On 01/09/15 20:17, Zoltan Kiss wrote:
>>> The lack of this prefetch causes a significant performance drop in
>>> OVS-DPDK: 13.3 Mpps instead of 14 when forwarding 64 byte packets.
>>> Even though OVS prefetches the next packet's header before it starts
>>> processing the current one, it doesn't get there fast enough. This
>>> aligns with the behaviour of other receive functions.
>>>
>>> Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org>
>>> ---
>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>> b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>> index cf25a53..51299fa 100644
>>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>> @@ -502,6 +502,15 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq,
>> struct rte_mbuf **rx_pkts,
>>>                   _mm_storeu_si128((void *)&rx_pkts[pos]-
>>> rx_descriptor_fields1,
>>>                                   pkt_mb1);
>>>
>>> +               rte_packet_prefetch((char*)(rx_pkts[pos]->buf_addr) +
>>> +                                   RTE_PKTMBUF_HEADROOM);
>>> +               rte_packet_prefetch((char*)(rx_pkts[pos + 1]->buf_addr)
>> +
>>> +                                   RTE_PKTMBUF_HEADROOM);
>>> +               rte_packet_prefetch((char*)(rx_pkts[pos + 2]->buf_addr)
>> +
>>> +                                   RTE_PKTMBUF_HEADROOM);
>>> +               rte_packet_prefetch((char*)(rx_pkts[pos + 3]->buf_addr)
>> +
>>> +                                   RTE_PKTMBUF_HEADROOM);
>>> +
>>>                   /* C.4 calc avaialbe number of desc */
>>>                   var = __builtin_popcountll(_mm_cvtsi128_si64(staterr));
>>>                   nb_pkts_recd += var;
>>>


* Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
  2015-09-07 14:15     ` Zoltan Kiss
@ 2015-09-07 14:41       ` Richardson, Bruce
  2015-09-25 18:28         ` Zoltan Kiss
  0 siblings, 1 reply; 14+ messages in thread
From: Richardson, Bruce @ 2015-09-07 14:41 UTC (permalink / raw)
  To: Zoltan Kiss, dev



> -----Original Message-----
> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> Sent: Monday, September 7, 2015 3:15 PM
> To: Richardson, Bruce; dev@dpdk.org
> Cc: Ananyev, Konstantin
> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive
> function
> 
> 
> 
> On 07/09/15 13:57, Richardson, Bruce wrote:
> >
> >
> >> -----Original Message-----
> >> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> >> Sent: Monday, September 7, 2015 1:26 PM
> >> To: dev@dpdk.org
> >> Cc: Ananyev, Konstantin; Richardson, Bruce
> >> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD
> >> receive function
> >>
> >> Hi,
> >>
> >> I just realized I've missed the "[PATCH]" tag from the subject. Did
> >> anyone had time to review this?
> >>
> >
> > Hi Zoltan,
> >
> > the big thing that concerns me with this is the addition of new
> > instructions for each packet in the fast path. Ideally, this
> > prefetching would be better handled in the application itself, as for
> > some apps, e.g. those using pipelining, the core doing the RX from the
> > NIC may not touch the packet data at all, and the prefetches will
> instead cause a performance slowdown.
> >
> > Is it possible to get the same performance increase - or something
> > close to it - by making changes in OVS?
> 
> OVS already does a prefetch when it's processing the previous packet, but
> apparently it's not early enough. At least for my test scenario, where I'm
> forwarding UDP packets with the least possible overhead. I guess in tests
> where OVS does more complex processing it should be fine.
> I'll try to move the prefetch earlier in OVS codebase, but I'm not sure if
> it'll help.

I would suggest trying to prefetch more than one packet ahead. Prefetching 4 or
8 ahead might work better, depending on the processing being done.

/Bruce


* Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
  2015-09-07 14:41       ` Richardson, Bruce
@ 2015-09-25 18:28         ` Zoltan Kiss
  2015-09-27 23:19           ` Ananyev, Konstantin
  0 siblings, 1 reply; 14+ messages in thread
From: Zoltan Kiss @ 2015-09-25 18:28 UTC (permalink / raw)
  To: Richardson, Bruce, dev

On 07/09/15 07:41, Richardson, Bruce wrote:
>
>> -----Original Message-----
>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>> Sent: Monday, September 7, 2015 3:15 PM
>> To: Richardson, Bruce; dev@dpdk.org
>> Cc: Ananyev, Konstantin
>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive
>> function
>>
>>
>>
>> On 07/09/15 13:57, Richardson, Bruce wrote:
>>>
>>>> -----Original Message-----
>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>>>> Sent: Monday, September 7, 2015 1:26 PM
>>>> To: dev@dpdk.org
>>>> Cc: Ananyev, Konstantin; Richardson, Bruce
>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD
>>>> receive function
>>>>
>>>> Hi,
>>>>
>>>> I just realized I've missed the "[PATCH]" tag from the subject. Did
>>>> anyone had time to review this?
>>>>
>>> Hi Zoltan,
>>>
>>> the big thing that concerns me with this is the addition of new
>>> instructions for each packet in the fast path. Ideally, this
>>> prefetching would be better handled in the application itself, as for
>>> some apps, e.g. those using pipelining, the core doing the RX from the
>>> NIC may not touch the packet data at all, and the prefetches will
>> instead cause a performance slowdown.
>>> Is it possible to get the same performance increase - or something
>>> close to it - by making changes in OVS?
>> OVS already does a prefetch when it's processing the previous packet, but
>> apparently it's not early enough. At least for my test scenario, where I'm
>> forwarding UDP packets with the least possible overhead. I guess in tests
>> where OVS does more complex processing it should be fine.
>> I'll try to move the prefetch earlier in OVS codebase, but I'm not sure if
>> it'll help.
> I would suggest trying to prefetch more than one packet ahead. Prefetching 4 or
> 8 ahead might work better, depending on the processing being done.

I've moved the prefetch earlier, and it seems to work:

https://patchwork.ozlabs.org/patch/519017/

However, it raises the question: should we remove the header prefetch 
from all the other drivers, along with the 
CONFIG_RTE_PMD_PACKET_PREFETCH option?

>
> /Bruce


* Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
  2015-09-25 18:28         ` Zoltan Kiss
@ 2015-09-27 23:19           ` Ananyev, Konstantin
  2015-10-14 16:10             ` Zoltan Kiss
  0 siblings, 1 reply; 14+ messages in thread
From: Ananyev, Konstantin @ 2015-09-27 23:19 UTC (permalink / raw)
  To: Zoltan Kiss, Richardson, Bruce, dev



> -----Original Message-----
> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> Sent: Friday, September 25, 2015 7:29 PM
> To: Richardson, Bruce; dev@dpdk.org
> Cc: Ananyev, Konstantin
> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
> 
> On 07/09/15 07:41, Richardson, Bruce wrote:
> >
> >> -----Original Message-----
> >> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> >> Sent: Monday, September 7, 2015 3:15 PM
> >> To: Richardson, Bruce; dev@dpdk.org
> >> Cc: Ananyev, Konstantin
> >> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive
> >> function
> >>
> >>
> >>
> >> On 07/09/15 13:57, Richardson, Bruce wrote:
> >>>
> >>>> -----Original Message-----
> >>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> >>>> Sent: Monday, September 7, 2015 1:26 PM
> >>>> To: dev@dpdk.org
> >>>> Cc: Ananyev, Konstantin; Richardson, Bruce
> >>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD
> >>>> receive function
> >>>>
> >>>> Hi,
> >>>>
> >>>> I just realized I've missed the "[PATCH]" tag from the subject. Did
> >>>> anyone had time to review this?
> >>>>
> >>> Hi Zoltan,
> >>>
> >>> the big thing that concerns me with this is the addition of new
> >>> instructions for each packet in the fast path. Ideally, this
> >>> prefetching would be better handled in the application itself, as for
> >>> some apps, e.g. those using pipelining, the core doing the RX from the
> >>> NIC may not touch the packet data at all, and the prefetches will
> >> instead cause a performance slowdown.
> >>> Is it possible to get the same performance increase - or something
> >>> close to it - by making changes in OVS?
> >> OVS already does a prefetch when it's processing the previous packet, but
> >> apparently it's not early enough. At least for my test scenario, where I'm
> >> forwarding UDP packets with the least possible overhead. I guess in tests
> >> where OVS does more complex processing it should be fine.
> >> I'll try to move the prefetch earlier in OVS codebase, but I'm not sure if
> >> it'll help.
> > I would suggest trying to prefetch more than one packet ahead. Prefetching 4 or
> > 8 ahead might work better, depending on the processing being done.
> 
> I've moved the prefetch earlier, and it seems to work:
> 
> https://patchwork.ozlabs.org/patch/519017/
> 
> However it raises the question: should we remove header prefetch from
> all the other drivers, and the CONFIG_RTE_PMD_PACKET_PREFETCH option?

My vote would be for that.
Konstantin


> 
> >
> > /Bruce


* Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
  2015-09-27 23:19           ` Ananyev, Konstantin
@ 2015-10-14 16:10             ` Zoltan Kiss
  2015-10-14 23:23               ` Ananyev, Konstantin
  0 siblings, 1 reply; 14+ messages in thread
From: Zoltan Kiss @ 2015-10-14 16:10 UTC (permalink / raw)
  To: Ananyev, Konstantin, Richardson, Bruce, dev



On 28/09/15 00:19, Ananyev, Konstantin wrote:
>
>
>> -----Original Message-----
>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>> Sent: Friday, September 25, 2015 7:29 PM
>> To: Richardson, Bruce; dev@dpdk.org
>> Cc: Ananyev, Konstantin
>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
>>
>> On 07/09/15 07:41, Richardson, Bruce wrote:
>>>
>>>> -----Original Message-----
>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>>>> Sent: Monday, September 7, 2015 3:15 PM
>>>> To: Richardson, Bruce; dev@dpdk.org
>>>> Cc: Ananyev, Konstantin
>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive
>>>> function
>>>>
>>>>
>>>>
>>>> On 07/09/15 13:57, Richardson, Bruce wrote:
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>>>>>> Sent: Monday, September 7, 2015 1:26 PM
>>>>>> To: dev@dpdk.org
>>>>>> Cc: Ananyev, Konstantin; Richardson, Bruce
>>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD
>>>>>> receive function
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I just realized I've missed the "[PATCH]" tag from the subject. Did
>>>>>> anyone had time to review this?
>>>>>>
>>>>> Hi Zoltan,
>>>>>
>>>>> the big thing that concerns me with this is the addition of new
>>>>> instructions for each packet in the fast path. Ideally, this
>>>>> prefetching would be better handled in the application itself, as for
>>>>> some apps, e.g. those using pipelining, the core doing the RX from the
>>>>> NIC may not touch the packet data at all, and the prefetches will
>>>> instead cause a performance slowdown.
>>>>> Is it possible to get the same performance increase - or something
>>>>> close to it - by making changes in OVS?
>>>> OVS already does a prefetch when it's processing the previous packet, but
>>>> apparently it's not early enough. At least for my test scenario, where I'm
>>>> forwarding UDP packets with the least possible overhead. I guess in tests
>>>> where OVS does more complex processing it should be fine.
>>>> I'll try to move the prefetch earlier in OVS codebase, but I'm not sure if
>>>> it'll help.
>>> I would suggest trying to prefetch more than one packet ahead. Prefetching 4 or
>>> 8 ahead might work better, depending on the processing being done.
>>
>> I've moved the prefetch earlier, and it seems to work:
>>
>> https://patchwork.ozlabs.org/patch/519017/
>>
>> However it raises the question: should we remove header prefetch from
>> all the other drivers, and the CONFIG_RTE_PMD_PACKET_PREFETCH option?
>
> My vote would be for that.
> Konstantin

After some further thinking, I would rather keep the 
rte_packet_prefetch() macro (prefetch the header in the driver, and 
configure it through CONFIG_RTE_PMD_PACKET_PREFETCH):

- the prefetch can happen earlier, so if an application needs the header 
right away, this is the fastest option
- if the application doesn't need the header prefetch, it can turn it 
off at compile time, same as if we didn't have this option
- if the application has mixed needs (sometimes it needs the header 
right away, sometimes it doesn't), it can turn it off and do what it 
needs, same as if we didn't have this option

A harder decision would be whether it should be turned on or off by 
default. Currently it's on, and half of the receive functions don't use 
it.

>
>
>>
>>>
>>> /Bruce
>


* Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
  2015-10-14 16:10             ` Zoltan Kiss
@ 2015-10-14 23:23               ` Ananyev, Konstantin
  2015-10-15 10:32                 ` Zoltan Kiss
  0 siblings, 1 reply; 14+ messages in thread
From: Ananyev, Konstantin @ 2015-10-14 23:23 UTC (permalink / raw)
  To: Zoltan Kiss, Richardson, Bruce, dev



> -----Original Message-----
> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> Sent: Wednesday, October 14, 2015 5:11 PM
> To: Ananyev, Konstantin; Richardson, Bruce; dev@dpdk.org
> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
> 
> 
> 
> On 28/09/15 00:19, Ananyev, Konstantin wrote:
> >
> >
> >> -----Original Message-----
> >> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> >> Sent: Friday, September 25, 2015 7:29 PM
> >> To: Richardson, Bruce; dev@dpdk.org
> >> Cc: Ananyev, Konstantin
> >> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
> >>
> >> On 07/09/15 07:41, Richardson, Bruce wrote:
> >>>
> >>>> -----Original Message-----
> >>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> >>>> Sent: Monday, September 7, 2015 3:15 PM
> >>>> To: Richardson, Bruce; dev@dpdk.org
> >>>> Cc: Ananyev, Konstantin
> >>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive
> >>>> function
> >>>>
> >>>>
> >>>>
> >>>> On 07/09/15 13:57, Richardson, Bruce wrote:
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> >>>>>> Sent: Monday, September 7, 2015 1:26 PM
> >>>>>> To: dev@dpdk.org
> >>>>>> Cc: Ananyev, Konstantin; Richardson, Bruce
> >>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD
> >>>>>> receive function
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I just realized I've missed the "[PATCH]" tag from the subject. Did
> >>>>>> anyone had time to review this?
> >>>>>>
> >>>>> Hi Zoltan,
> >>>>>
> >>>>> the big thing that concerns me with this is the addition of new
> >>>>> instructions for each packet in the fast path. Ideally, this
> >>>>> prefetching would be better handled in the application itself, as for
> >>>>> some apps, e.g. those using pipelining, the core doing the RX from the
> >>>>> NIC may not touch the packet data at all, and the prefetches will
> >>>> instead cause a performance slowdown.
> >>>>> Is it possible to get the same performance increase - or something
> >>>>> close to it - by making changes in OVS?
> >>>> OVS already does a prefetch when it's processing the previous packet, but
> >>>> apparently it's not early enough. At least for my test scenario, where I'm
> >>>> forwarding UDP packets with the least possible overhead. I guess in tests
> >>>> where OVS does more complex processing it should be fine.
> >>>> I'll try to move the prefetch earlier in OVS codebase, but I'm not sure if
> >>>> it'll help.
> >>> I would suggest trying to prefetch more than one packet ahead. Prefetching 4 or
> >>> 8 ahead might work better, depending on the processing being done.
> >>
> >> I've moved the prefetch earlier, and it seems to work:
> >>
> >> https://patchwork.ozlabs.org/patch/519017/
> >>
> >> However it raises the question: should we remove header prefetch from
> >> all the other drivers, and the CONFIG_RTE_PMD_PACKET_PREFETCH option?
> >
> > My vote would be for that.
> > Konstantin
> 
> After some further thinking I would rather support the
> rte_packet_prefetch() macro (prefetch the header in the driver, and
> configure it through CONFIG_RTE_PMD_PACKET_PREFETCH)
> 
> - the prefetch can happen earlier, so if an application needs the header
> right away, this is the fastest
> - if the application doesn't need header prefetch, it can turn it off
> compile time. Same as if we wouldn't have this option.
> - if the application has mixed needs (sometimes it needs the header
> right away, sometimes it doesn't), it can turn it off and do what it
> needs. Same as if we wouldn't have this option.
> 
> A harder decision would be whether it should be turned on or off by
> default. Currently it's on, and half of the receive functions don't use it.

Yep, it is sort of a mess right now.
Another question, if we'd like to keep it and standardise it, is at what 
moment to prefetch: as soon as we realise that the HW is done with that 
buffer, or as late inside rx_burst() as possible?
I suppose there is no clear answer for that.
That's why my thought was to just get rid of it.
Though if it were implemented in some standardised way and disabled by 
default, that's probably ok too.
BTW, couldn't that be implemented just via the rte_ethdev rx callbacks 
mechanism? Then we could have the code in one place and wouldn't need a 
compile option at all - it could be enabled/disabled dynamically on a 
per-NIC basis.
Or would the overhead be too high for that?
Konstantin
 


> 
> >
> >
> >>
> >>>
> >>> /Bruce
> >


* Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
  2015-10-14 23:23               ` Ananyev, Konstantin
@ 2015-10-15 10:32                 ` Zoltan Kiss
  2015-10-15 15:43                   ` Ananyev, Konstantin
  0 siblings, 1 reply; 14+ messages in thread
From: Zoltan Kiss @ 2015-10-15 10:32 UTC (permalink / raw)
  To: Ananyev, Konstantin, Richardson, Bruce, dev



On 15/10/15 00:23, Ananyev, Konstantin wrote:
>
>
>> -----Original Message-----
>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>> Sent: Wednesday, October 14, 2015 5:11 PM
>> To: Ananyev, Konstantin; Richardson, Bruce; dev@dpdk.org
>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
>>
>>
>>
>> On 28/09/15 00:19, Ananyev, Konstantin wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>>>> Sent: Friday, September 25, 2015 7:29 PM
>>>> To: Richardson, Bruce; dev@dpdk.org
>>>> Cc: Ananyev, Konstantin
>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
>>>>
>>>> On 07/09/15 07:41, Richardson, Bruce wrote:
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>>>>>> Sent: Monday, September 7, 2015 3:15 PM
>>>>>> To: Richardson, Bruce; dev@dpdk.org
>>>>>> Cc: Ananyev, Konstantin
>>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive
>>>>>> function
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 07/09/15 13:57, Richardson, Bruce wrote:
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>>>>>>>> Sent: Monday, September 7, 2015 1:26 PM
>>>>>>>> To: dev@dpdk.org
>>>>>>>> Cc: Ananyev, Konstantin; Richardson, Bruce
>>>>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD
>>>>>>>> receive function
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I just realized I've missed the "[PATCH]" tag from the subject. Did
>>>>>>>> anyone had time to review this?
>>>>>>>>
>>>>>>> Hi Zoltan,
>>>>>>>
>>>>>>> the big thing that concerns me with this is the addition of new
>>>>>>> instructions for each packet in the fast path. Ideally, this
>>>>>>> prefetching would be better handled in the application itself, as for
>>>>>>> some apps, e.g. those using pipelining, the core doing the RX from the
>>>>>>> NIC may not touch the packet data at all, and the prefetches will
>>>>>> instead cause a performance slowdown.
>>>>>>> Is it possible to get the same performance increase - or something
>>>>>>> close to it - by making changes in OVS?
>>>>>> OVS already does a prefetch when it's processing the previous packet, but
>>>>>> apparently it's not early enough. At least for my test scenario, where I'm
>>>>>> forwarding UDP packets with the least possible overhead. I guess in tests
>>>>>> where OVS does more complex processing it should be fine.
>>>>>> I'll try to move the prefetch earlier in OVS codebase, but I'm not sure if
>>>>>> it'll help.
>>>>> I would suggest trying to prefetch more than one packet ahead. Prefetching 4 or
>>>>> 8 ahead might work better, depending on the processing being done.
>>>>
>>>> I've moved the prefetch earlier, and it seems to work:
>>>>
>>>> https://patchwork.ozlabs.org/patch/519017/
>>>>
>>>> However it raises the question: should we remove header prefetch from
>>>> all the other drivers, and the CONFIG_RTE_PMD_PACKET_PREFETCH option?
>>>
>>> My vote would be for that.
>>> Konstantin
>>
>> After some further thinking I would rather support the
>> rte_packet_prefetch() macro (prefetch the header in the driver, and
>> configure it through CONFIG_RTE_PMD_PACKET_PREFETCH)
>>
>> - the prefetch can happen earlier, so if an application needs the header
>> right away, this is the fastest
>> - if the application doesn't need header prefetch, it can turn it off
>> compile time. Same as if we wouldn't have this option.
>> - if the application has mixed needs (sometimes it needs the header
>> right away, sometimes it doesn't), it can turn it off and do what it
>> needs. Same as if we wouldn't have this option.
>>
>> A harder decision would be whether it should be turned on or off by
>> default. Currently it's on, and half of the receive functions don't use it.
>
> Yep,  it is sort of a mess right now.
> Another question if we'd like to keep it and standardise it:
> at what moment to prefetch: as soon as we realize that HW is done with that buffer,
> or as late inside rx_burst() as possible?
> I suppose there is no clear answer for that.
I think if the application needs the driver to start the prefetching, it 
does it because it's already too late when rte_eth_rx_burst() returns. 
So I think it's best to do it as soon as we learn that the HW is done.
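To make the "prefetch as soon as the HW is done" idea concrete, here is a minimal, self-contained sketch of the pattern being discussed. It is an illustration only, not driver code: the mbuf struct and HEADROOM are simplified stand-ins for struct rte_mbuf and RTE_PKTMBUF_HEADROOM, and __builtin_prefetch stands in for rte_packet_prefetch().

```c
#include <assert.h>
#include <stdint.h>

#define HEADROOM 128  /* stand-in for RTE_PKTMBUF_HEADROOM */

/* Simplified stand-in for struct rte_mbuf: only the buffer pointer. */
struct mbuf {
	uint8_t *buf_addr;
};

/* Prefetch the packet headers of a batch of descriptors as soon as the
 * buffer pointers are known, i.e. before anyone touches the data. */
static void prefetch_headers(struct mbuf **pkts, int n)
{
	for (int i = 0; i < n; i++)
		__builtin_prefetch(pkts[i]->buf_addr + HEADROOM);
}

/* Toy "processing" step: read the first header byte of each packet.
 * By the time this runs, the prefetches above have had time to pull
 * the header cache lines in. */
static int process_headers(struct mbuf **pkts, int n)
{
	int sum = 0;
	for (int i = 0; i < n; i++)
		sum += pkts[i]->buf_addr[HEADROOM];
	return sum;
}
```

The point of the ordering is simply that the prefetch is issued several steps before the first load, which is exactly what doing it inside the RX function buys over doing it after rte_eth_rx_burst() returns.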

> That's why my thought was to just get rid of it.
> Though if it would be implemented in some standardized way and disabled by default -
> that's probably ok too.
> BTW, couldn't that be implemented just via the rte_ethdev rx callbacks mechanism?
> Then we can have the code in one place and not need a compile option at all -
> could be enabled/disabled dynamically on a per-NIC basis.
> Or would it be too high overhead for that?

I think if we go that way, it's better to pass an option to 
rte_eth_rx_burst() telling whether you want the header prefetched by the 
driver. That would be the most flexible.

> Konstantin
>
>
>
>>
>>>
>>>
>>>>
>>>>>
>>>>> /Bruce
>>>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
  2015-10-15 10:32                 ` Zoltan Kiss
@ 2015-10-15 15:43                   ` Ananyev, Konstantin
  2015-10-19 16:30                     ` Zoltan Kiss
  0 siblings, 1 reply; 14+ messages in thread
From: Ananyev, Konstantin @ 2015-10-15 15:43 UTC (permalink / raw)
  To: Zoltan Kiss, Richardson, Bruce, dev



> -----Original Message-----
> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> Sent: Thursday, October 15, 2015 11:33 AM
> To: Ananyev, Konstantin; Richardson, Bruce; dev@dpdk.org
> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
> 
> 
> 
> On 15/10/15 00:23, Ananyev, Konstantin wrote:
> >
> >
> >> -----Original Message-----
> >> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> >> Sent: Wednesday, October 14, 2015 5:11 PM
> >> To: Ananyev, Konstantin; Richardson, Bruce; dev@dpdk.org
> >> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
> >>
> >>
> >>
> >> On 28/09/15 00:19, Ananyev, Konstantin wrote:
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> >>>> Sent: Friday, September 25, 2015 7:29 PM
> >>>> To: Richardson, Bruce; dev@dpdk.org
> >>>> Cc: Ananyev, Konstantin
> >>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
> >>>>
> >>>> On 07/09/15 07:41, Richardson, Bruce wrote:
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> >>>>>> Sent: Monday, September 7, 2015 3:15 PM
> >>>>>> To: Richardson, Bruce; dev@dpdk.org
> >>>>>> Cc: Ananyev, Konstantin
> >>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive
> >>>>>> function
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 07/09/15 13:57, Richardson, Bruce wrote:
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> >>>>>>>> Sent: Monday, September 7, 2015 1:26 PM
> >>>>>>>> To: dev@dpdk.org
> >>>>>>>> Cc: Ananyev, Konstantin; Richardson, Bruce
> >>>>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD
> >>>>>>>> receive function
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I just realized I've missed the "[PATCH]" tag from the subject. Did
> >>>>>>>> anyone have time to review this?
> >>>>>>>>
> >>>>>>> Hi Zoltan,
> >>>>>>>
> >>>>>>> the big thing that concerns me with this is the addition of new
> >>>>>>> instructions for each packet in the fast path. Ideally, this
> >>>>>>> prefetching would be better handled in the application itself, as for
> >>>>>>> some apps, e.g. those using pipelining, the core doing the RX from the
> >>>>>>> NIC may not touch the packet data at all, and the prefetches will
> >>>>>> instead cause a performance slowdown.
> >>>>>>> Is it possible to get the same performance increase - or something
> >>>>>>> close to it - by making changes in OVS?
> >>>>>> OVS already does a prefetch when it's processing the previous packet, but
> >>>>>> apparently it's not early enough. At least for my test scenario, where I'm
> >>>>>> forwarding UDP packets with the least possible overhead. I guess in tests
> >>>>>> where OVS does more complex processing it should be fine.
> >>>>>> I'll try to move the prefetch earlier in OVS codebase, but I'm not sure if
> >>>>>> it'll help.
> >>>>> I would suggest trying to prefetch more than one packet ahead. Prefetching 4 or
> >>>>> 8 ahead might work better, depending on the processing being done.
> >>>>
> >>>> I've moved the prefetch earlier, and it seems to work:
> >>>>
> >>>> https://patchwork.ozlabs.org/patch/519017/
> >>>>
> >>>> However it raises the question: should we remove header prefetch from
> >>>> all the other drivers, and the CONFIG_RTE_PMD_PACKET_PREFETCH option?
> >>>
> >>> My vote would be for that.
> >>> Konstantin
> >>
> >> After some further thinking I would rather support the
> >> rte_packet_prefetch() macro (prefetch the header in the driver, and
> >> configure it through CONFIG_RTE_PMD_PACKET_PREFETCH)
> >>
> >> - the prefetch can happen earlier, so if an application needs the header
> >> right away, this is the fastest
> >> - if the application doesn't need header prefetch, it can turn it off
> >> compile time. Same as if we wouldn't have this option.
> >> - if the application has mixed needs (sometimes it needs the header
> >> right away, sometimes it doesn't), it can turn it off and do what it
> >> needs. Same as if we wouldn't have this option.
> >>
> >> A harder decision would be whether it should be turned on or off by
> >> default. Currently it's on, and half of the receive functions don't use it.
> >
> > Yep,  it is sort of a mess right now.
> > Another question if we'd like to keep it and standardise it:
> > at what moment to prefetch: as soon as we realize that HW is done with that buffer,
> > or as late inside rx_burst() as possible?
> > I suppose there is no clear answer for that.
> I think if the application needs the driver to start the prefetching, it
> does it because it's already too late when rte_eth_rx_burst() returns.
> So I think it's best to do it as soon as we learn that the HW is done.

Probably, but there might be situations when it would make more sense to do it later too.
Again, it might depend on the particular HW and your memory access pattern.

> 
> > That's why my thought was to just get rid of it.
> > Though if it would be implemented in some standardized way and disabled by default -
> > that's probably ok too.
> > BTW, couldn't that be implemented just via the rte_ethdev rx callbacks mechanism?
> > Then we can have the code in one place and not need a compile option at all -
> > could be enabled/disabled dynamically on a per-NIC basis.
> > Or would it be too high overhead for that?
> 
> I think if we go that way, it's better to pass an option to
> rte_eth_rx_burst() telling whether you want the header prefetched by the
> driver. That would be the most flexible.

That would mean ABI breakage for rx_burst()...
Probably better then to add a new parameter in rx_conf or something.
Though it still means that each PMD has to implement it on its own.
And again there might be PMDs that would just ignore that parameter.
While with a callback it would all be in one known place.
But again, I didn't measure it, so I'm not sure what the impact of the callback itself would be.
Maybe it's not worth it.
Konstantin
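For illustration, the callback route mentioned above could look roughly like the sketch below. It is a self-contained model, not real DPDK code: the mbuf struct and HEADROOM are simplified stand-ins, and the callback signature merely mirrors the shape of DPDK's rte_rx_callback_fn.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEADROOM 128  /* stand-in for RTE_PKTMBUF_HEADROOM */

struct mbuf { uint8_t *buf_addr; };  /* simplified rte_mbuf stand-in */

/* Same general shape as DPDK's rte_rx_callback_fn (types simplified). */
typedef uint16_t (*rx_callback_fn)(uint16_t port, uint16_t queue,
				   struct mbuf **pkts, uint16_t nb_rx,
				   uint16_t max_pkts, void *param);

/* A post-RX callback that prefetches each packet's header and passes
 * the burst through unchanged - roughly what a dynamically enabled,
 * per-queue header prefetch could look like. */
static uint16_t prefetch_cb(uint16_t port, uint16_t queue,
			    struct mbuf **pkts, uint16_t nb_rx,
			    uint16_t max_pkts, void *param)
{
	(void)port; (void)queue; (void)max_pkts; (void)param;
	for (uint16_t i = 0; i < nb_rx; i++)
		__builtin_prefetch(pkts[i]->buf_addr + HEADROOM);
	return nb_rx;  /* callbacks may also filter or trim the burst */
}
```

In real DPDK such a callback would be registered per port/queue (rte_eth_add_rx_callback() exists for this), which is what makes it switchable at runtime instead of at compile time - the trade-off being the extra indirect call per burst discussed above.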

> 
> > Konstantin
> >
> >
> >
> >>
> >>>
> >>>
> >>>>
> >>>>>
> >>>>> /Bruce
> >>>


* Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
  2015-10-15 15:43                   ` Ananyev, Konstantin
@ 2015-10-19 16:30                     ` Zoltan Kiss
  2015-10-19 18:57                       ` Ananyev, Konstantin
  0 siblings, 1 reply; 14+ messages in thread
From: Zoltan Kiss @ 2015-10-19 16:30 UTC (permalink / raw)
  To: Ananyev, Konstantin, Richardson, Bruce, dev



On 15/10/15 16:43, Ananyev, Konstantin wrote:
>
>
>> -----Original Message-----
>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>> Sent: Thursday, October 15, 2015 11:33 AM
>> To: Ananyev, Konstantin; Richardson, Bruce; dev@dpdk.org
>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
>>
>>
>>
>> On 15/10/15 00:23, Ananyev, Konstantin wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>>>> Sent: Wednesday, October 14, 2015 5:11 PM
>>>> To: Ananyev, Konstantin; Richardson, Bruce; dev@dpdk.org
>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
>>>>
>>>>
>>>>
>>>> On 28/09/15 00:19, Ananyev, Konstantin wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>>>>>> Sent: Friday, September 25, 2015 7:29 PM
>>>>>> To: Richardson, Bruce; dev@dpdk.org
>>>>>> Cc: Ananyev, Konstantin
>>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
>>>>>>
>>>>>> On 07/09/15 07:41, Richardson, Bruce wrote:
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>>>>>>>> Sent: Monday, September 7, 2015 3:15 PM
>>>>>>>> To: Richardson, Bruce; dev@dpdk.org
>>>>>>>> Cc: Ananyev, Konstantin
>>>>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive
>>>>>>>> function
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 07/09/15 13:57, Richardson, Bruce wrote:
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>>>>>>>>>> Sent: Monday, September 7, 2015 1:26 PM
>>>>>>>>>> To: dev@dpdk.org
>>>>>>>>>> Cc: Ananyev, Konstantin; Richardson, Bruce
>>>>>>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD
>>>>>>>>>> receive function
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I just realized I've missed the "[PATCH]" tag from the subject. Did
>>>>>>>>>> anyone have time to review this?
>>>>>>>>>>
>>>>>>>>> Hi Zoltan,
>>>>>>>>>
>>>>>>>>> the big thing that concerns me with this is the addition of new
>>>>>>>>> instructions for each packet in the fast path. Ideally, this
>>>>>>>>> prefetching would be better handled in the application itself, as for
>>>>>>>>> some apps, e.g. those using pipelining, the core doing the RX from the
>>>>>>>>> NIC may not touch the packet data at all, and the prefetches will
>>>>>>>> instead cause a performance slowdown.
>>>>>>>>> Is it possible to get the same performance increase - or something
>>>>>>>>> close to it - by making changes in OVS?
>>>>>>>> OVS already does a prefetch when it's processing the previous packet, but
>>>>>>>> apparently it's not early enough. At least for my test scenario, where I'm
>>>>>>>> forwarding UDP packets with the least possible overhead. I guess in tests
>>>>>>>> where OVS does more complex processing it should be fine.
>>>>>>>> I'll try to move the prefetch earlier in OVS codebase, but I'm not sure if
>>>>>>>> it'll help.
>>>>>>> I would suggest trying to prefetch more than one packet ahead. Prefetching 4 or
>>>>>>> 8 ahead might work better, depending on the processing being done.
>>>>>>
>>>>>> I've moved the prefetch earlier, and it seems to work:
>>>>>>
>>>>>> https://patchwork.ozlabs.org/patch/519017/
>>>>>>
>>>>>> However it raises the question: should we remove header prefetch from
>>>>>> all the other drivers, and the CONFIG_RTE_PMD_PACKET_PREFETCH option?
>>>>>
>>>>> My vote would be for that.
>>>>> Konstantin
>>>>
>>>> After some further thinking I would rather support the
>>>> rte_packet_prefetch() macro (prefetch the header in the driver, and
>>>> configure it through CONFIG_RTE_PMD_PACKET_PREFETCH)
>>>>
>>>> - the prefetch can happen earlier, so if an application needs the header
>>>> right away, this is the fastest
>>>> - if the application doesn't need header prefetch, it can turn it off
>>>> compile time. Same as if we wouldn't have this option.
>>>> - if the application has mixed needs (sometimes it needs the header
>>>> right away, sometimes it doesn't), it can turn it off and do what it
>>>> needs. Same as if we wouldn't have this option.
>>>>
>>>> A harder decision would be whether it should be turned on or off by
>>>> default. Currently it's on, and half of the receive functions don't use it.
>>>
>>> Yep,  it is sort of a mess right now.
>>> Another question if we'd like to keep it and standardise it:
>>> at what moment to prefetch: as soon as we realize that HW is done with that buffer,
>>> or as late inside rx_burst() as possible?
>>> I suppose there is no clear answer for that.
>> I think if the application needs the driver to start the prefetching, it
>> does it because it's already too late when rte_eth_rx_burst() returns.
>> So I think it's best to do it as soon as we learn that the HW is done.
>
> Probably, but there might be situations when it would make more sense to do it later too.
Could you elaborate?
If the application wants the prefetch to start earlier than it could do 
itself (right after rte_eth_rx_burst() returns), earlier is better: that 
gives it the best chance of having the data in cache.
If it needs the data later, it can do the prefetch itself.

> Again, it might depend on the particular HW and your memory access pattern.
>
>>
>>> That's why my thought was to just get rid of it.
>>> Though if it would be implemented in some standardized way and disabled by default -
>>> that's probably ok too.
>>> BTW, couldn't that be implemented just via the rte_ethdev rx callbacks mechanism?
>>> Then we can have the code in one place and not need a compile option at all -
>>> could be enabled/disabled dynamically on a per-NIC basis.
>>> Or would it be too high overhead for that?
>>
>> I think if we go that way, it's better to pass an option to
>> rte_eth_rx_burst() telling whether you want the header prefetched by the
>> driver. That would be the most flexible.
>
> That would mean ABI breakage for rx_burst()...
> Probably better then to add a new parameter in rx_conf or something.
> Though it still means that each PMD has to implement it on its own.
That would be the case anyway, as we are talking about prefetching in 
the receive function.

> And again there might be PMDs that would just ignore that parameter.
If the PMD has a good reason to do that (e.g. prefetch has undesirable 
effects), I think it's fine.

> While with a callback it would all be in one known place.
Who would call that callback, and when? If the application called it 
after rte_eth_rx_burst() returned, it wouldn't be of much use; the 
application could just do the prefetch itself.

> But again, I didn't measure it, so I'm not sure what the impact of the callback itself would be.
> Maybe it's not worth it.
> Konstantin
>
>>
>>> Konstantin
>>>
>>>
>>>
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> /Bruce
>>>>>


* Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
  2015-10-19 16:30                     ` Zoltan Kiss
@ 2015-10-19 18:57                       ` Ananyev, Konstantin
  2015-10-20  9:58                         ` Zoltan Kiss
  0 siblings, 1 reply; 14+ messages in thread
From: Ananyev, Konstantin @ 2015-10-19 18:57 UTC (permalink / raw)
  To: Zoltan Kiss, Richardson, Bruce, dev



> -----Original Message-----
> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> Sent: Monday, October 19, 2015 5:30 PM
> To: Ananyev, Konstantin; Richardson, Bruce; dev@dpdk.org
> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
> 
> 
> 
> On 15/10/15 16:43, Ananyev, Konstantin wrote:
> >
> >
> >> -----Original Message-----
> >> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> >> Sent: Thursday, October 15, 2015 11:33 AM
> >> To: Ananyev, Konstantin; Richardson, Bruce; dev@dpdk.org
> >> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
> >>
> >>
> >>
> >> On 15/10/15 00:23, Ananyev, Konstantin wrote:
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> >>>> Sent: Wednesday, October 14, 2015 5:11 PM
> >>>> To: Ananyev, Konstantin; Richardson, Bruce; dev@dpdk.org
> >>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
> >>>>
> >>>>
> >>>>
> >>>> On 28/09/15 00:19, Ananyev, Konstantin wrote:
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> >>>>>> Sent: Friday, September 25, 2015 7:29 PM
> >>>>>> To: Richardson, Bruce; dev@dpdk.org
> >>>>>> Cc: Ananyev, Konstantin
> >>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
> >>>>>>
> >>>>>> On 07/09/15 07:41, Richardson, Bruce wrote:
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> >>>>>>>> Sent: Monday, September 7, 2015 3:15 PM
> >>>>>>>> To: Richardson, Bruce; dev@dpdk.org
> >>>>>>>> Cc: Ananyev, Konstantin
> >>>>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive
> >>>>>>>> function
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 07/09/15 13:57, Richardson, Bruce wrote:
> >>>>>>>>>
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> >>>>>>>>>> Sent: Monday, September 7, 2015 1:26 PM
> >>>>>>>>>> To: dev@dpdk.org
> >>>>>>>>>> Cc: Ananyev, Konstantin; Richardson, Bruce
> >>>>>>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD
> >>>>>>>>>> receive function
> >>>>>>>>>>
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> I just realized I've missed the "[PATCH]" tag from the subject. Did
> >>>>>>>>>> anyone have time to review this?
> >>>>>>>>>>
> >>>>>>>>> Hi Zoltan,
> >>>>>>>>>
> >>>>>>>>> the big thing that concerns me with this is the addition of new
> >>>>>>>>> instructions for each packet in the fast path. Ideally, this
> >>>>>>>>> prefetching would be better handled in the application itself, as for
> >>>>>>>>> some apps, e.g. those using pipelining, the core doing the RX from the
> >>>>>>>>> NIC may not touch the packet data at all, and the prefetches will
> >>>>>>>> instead cause a performance slowdown.
> >>>>>>>>> Is it possible to get the same performance increase - or something
> >>>>>>>>> close to it - by making changes in OVS?
> >>>>>>>> OVS already does a prefetch when it's processing the previous packet, but
> >>>>>>>> apparently it's not early enough. At least for my test scenario, where I'm
> >>>>>>>> forwarding UDP packets with the least possible overhead. I guess in tests
> >>>>>>>> where OVS does more complex processing it should be fine.
> >>>>>>>> I'll try to move the prefetch earlier in OVS codebase, but I'm not sure if
> >>>>>>>> it'll help.
> >>>>>>> I would suggest trying to prefetch more than one packet ahead. Prefetching 4 or
> >>>>>>> 8 ahead might work better, depending on the processing being done.
> >>>>>>
> >>>>>> I've moved the prefetch earlier, and it seems to work:
> >>>>>>
> >>>>>> https://patchwork.ozlabs.org/patch/519017/
> >>>>>>
> >>>>>> However it raises the question: should we remove header prefetch from
> >>>>>> all the other drivers, and the CONFIG_RTE_PMD_PACKET_PREFETCH option?
> >>>>>
> >>>>> My vote would be for that.
> >>>>> Konstantin
> >>>>
> >>>> After some further thinking I would rather support the
> >>>> rte_packet_prefetch() macro (prefetch the header in the driver, and
> >>>> configure it through CONFIG_RTE_PMD_PACKET_PREFETCH)
> >>>>
> >>>> - the prefetch can happen earlier, so if an application needs the header
> >>>> right away, this is the fastest
> >>>> - if the application doesn't need header prefetch, it can turn it off
> >>>> compile time. Same as if we wouldn't have this option.
> >>>> - if the application has mixed needs (sometimes it needs the header
> >>>> right away, sometimes it doesn't), it can turn it off and do what it
> >>>> needs. Same as if we wouldn't have this option.
> >>>>
> >>>> A harder decision would be whether it should be turned on or off by
> >>>> default. Currently it's on, and half of the receive functions don't use it.
> >>>
> >>> Yep,  it is sort of a mess right now.
> >>> Another question if we'd like to keep it and standardise it:
> >>> at what moment to prefetch: as soon as we realize that HW is done with that buffer,
> >>> or as late inside rx_burst() as possible?
> >>> I suppose there is no clear answer for that.
> >> I think if the application needs the driver to start the prefetching, it
> >> does it because it's already too late when rte_eth_rx_burst() returns.
> >> So I think it's best to do it as soon as we learn that the HW is done.
> >
> > Probably, but there might be situations when it would make more sense to do it later too.
> Could you elaborate?
> If the application wants the prefetch to start earlier than it could do
> itself (right after rte_eth_rx_burst() returns), earlier is better: that
> gives it the best chance of having the data in cache.
> If it needs the data later, it can do the prefetch itself.
> 
> > Again, it might depend on the particular HW and your memory access pattern.
> >
> >>
> >>> That's why my thought was to just get rid of it.
> >>> Though if it would be implemented in some standardized way and disabled by default -
> >>> that's probably ok too.
> >>>>> BTW, couldn't that be implemented just via the rte_ethdev rx callbacks mechanism?
> >>>>> Then we can have the code in one place and not need a compile option at all -
> >>>>> could be enabled/disabled dynamically on a per-NIC basis.
> >>> Or would it be too high overhead for that?
> >>
> >> I think if we go that way, it's better to pass an option to
> >> rte_eth_rx_burst() telling whether you want the header prefetched by the
> >> driver. That would be the most flexible.
> >
> > That would mean ABI breakage for rx_burst()...
> > Probably better then to add a new parameter in rx_conf or something.
> > Though it still means that each PMD has to implement it on its own.
> That would be the case anyway, as we are talking about prefetching in
> the receive function.
> 
> > And again there might be PMDs that would just ignore that parameter.
> If the PMD has a good reason to do that (e.g. prefetch has undesirable
> effects), I think it's fine.
> 
> > While with a callback it would all be in one known place.
> Who would call that callback, and when? If the application called it
> after rte_eth_rx_burst() returned, it wouldn't be of much use; the
> application could just do the prefetch itself.

I am talking about the callbacks called by rx_burst() itself:

static inline uint16_t
rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
                 struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
{
        struct rte_eth_dev *dev;

        dev = &rte_eth_devices[port_id];

        int16_t nb_rx = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
                        rx_pkts, nb_pkts);

#ifdef RTE_ETHDEV_RXTX_CALLBACKS
        struct rte_eth_rxtx_callback *cb = dev->post_rx_burst_cbs[queue_id];

        if (unlikely(cb != NULL)) {
                do {
                        nb_rx = cb->fn.rx(port_id, queue_id, rx_pkts, nb_rx,
                                                nb_pkts, cb->param);
                        cb = cb->next;
                } while (cb != NULL);
        }
#endif

        return nb_rx;
}

Konstantin

> 
> > But again, I didn't measure it, so I'm not sure what the impact of the callback itself would be.
> > Maybe it's not worth it.
> > Konstantin
> >
> >>
> >>> Konstantin
> >>>
> >>>
> >>>
> >>>>
> >>>>>
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> /Bruce
> >>>>>


* Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
  2015-10-19 18:57                       ` Ananyev, Konstantin
@ 2015-10-20  9:58                         ` Zoltan Kiss
  0 siblings, 0 replies; 14+ messages in thread
From: Zoltan Kiss @ 2015-10-20  9:58 UTC (permalink / raw)
  To: Ananyev, Konstantin, Richardson, Bruce, dev



On 19/10/15 19:57, Ananyev, Konstantin wrote:
>
>
>> -----Original Message-----
>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>> Sent: Monday, October 19, 2015 5:30 PM
>> To: Ananyev, Konstantin; Richardson, Bruce; dev@dpdk.org
>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
>>
>>
>>
>> On 15/10/15 16:43, Ananyev, Konstantin wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>>>> Sent: Thursday, October 15, 2015 11:33 AM
>>>> To: Ananyev, Konstantin; Richardson, Bruce; dev@dpdk.org
>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
>>>>
>>>>
>>>>
>>>> On 15/10/15 00:23, Ananyev, Konstantin wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>>>>>> Sent: Wednesday, October 14, 2015 5:11 PM
>>>>>> To: Ananyev, Konstantin; Richardson, Bruce; dev@dpdk.org
>>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 28/09/15 00:19, Ananyev, Konstantin wrote:
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>>>>>>>> Sent: Friday, September 25, 2015 7:29 PM
>>>>>>>> To: Richardson, Bruce; dev@dpdk.org
>>>>>>>> Cc: Ananyev, Konstantin
>>>>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
>>>>>>>>
>>>>>>>> On 07/09/15 07:41, Richardson, Bruce wrote:
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>>>>>>>>>> Sent: Monday, September 7, 2015 3:15 PM
>>>>>>>>>> To: Richardson, Bruce; dev@dpdk.org
>>>>>>>>>> Cc: Ananyev, Konstantin
>>>>>>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive
>>>>>>>>>> function
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 07/09/15 13:57, Richardson, Bruce wrote:
>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>>>>>>>>>>>> Sent: Monday, September 7, 2015 1:26 PM
>>>>>>>>>>>> To: dev@dpdk.org
>>>>>>>>>>>> Cc: Ananyev, Konstantin; Richardson, Bruce
>>>>>>>>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD
>>>>>>>>>>>> receive function
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I just realized I've missed the "[PATCH]" tag from the subject. Did
>>>>>>>>>>>> anyone have time to review this?
>>>>>>>>>>>>
>>>>>>>>>>> Hi Zoltan,
>>>>>>>>>>>
>>>>>>>>>>> the big thing that concerns me with this is the addition of new
>>>>>>>>>>> instructions for each packet in the fast path. Ideally, this
>>>>>>>>>>> prefetching would be better handled in the application itself, as for
>>>>>>>>>>> some apps, e.g. those using pipelining, the core doing the RX from the
>>>>>>>>>>> NIC may not touch the packet data at all, and the prefetches will
>>>>>>>>>> instead cause a performance slowdown.
>>>>>>>>>>> Is it possible to get the same performance increase - or something
>>>>>>>>>>> close to it - by making changes in OVS?
>>>>>>>>>> OVS already does a prefetch when it's processing the previous packet, but
>>>>>>>>>> apparently it's not early enough. At least for my test scenario, where I'm
>>>>>>>>>> forwarding UDP packets with the least possible overhead. I guess in tests
>>>>>>>>>> where OVS does more complex processing it should be fine.
>>>>>>>>>> I'll try to move the prefetch earlier in OVS codebase, but I'm not sure if
>>>>>>>>>> it'll help.
>>>>>>>>> I would suggest trying to prefetch more than one packet ahead. Prefetching 4 or
>>>>>>>>> 8 ahead might work better, depending on the processing being done.
>>>>>>>>
>>>>>>>> I've moved the prefetch earlier, and it seems to work:
>>>>>>>>
>>>>>>>> https://patchwork.ozlabs.org/patch/519017/
>>>>>>>>
>>>>>>>> However it raises the question: should we remove header prefetch from
>>>>>>>> all the other drivers, and the CONFIG_RTE_PMD_PACKET_PREFETCH option?
>>>>>>>
>>>>>>> My vote would be for that.
>>>>>>> Konstantin
>>>>>>
>>>>>> After some further thinking I would rather support the
>>>>>> rte_packet_prefetch() macro (prefetch the header in the driver, and
>>>>>> configure it through CONFIG_RTE_PMD_PACKET_PREFETCH)
>>>>>>
>>>>>> - the prefetch can happen earlier, so if an application needs the header
>>>>>> right away, this is the fastest
>>>>>> - if the application doesn't need header prefetch, it can turn it off
>>>>>> compile time. Same as if we wouldn't have this option.
>>>>>> - if the application has mixed needs (sometimes it needs the header
>>>>>> right away, sometimes it doesn't), it can turn it off and do what it
>>>>>> needs. Same as if we wouldn't have this option.
>>>>>>
>>>>>> A harder decision would be whether it should be turned on or off by
>>>>>> default. Currently it's on, and half of the receive functions don't use it.
>>>>>
>>>>> Yep,  it is sort of a mess right now.
>>>>> Another question if we'd like to keep it and standardise it:
>>>>> at what moment to prefetch: as soon as we realize that HW is done with that buffer,
>>>>> or as late inside rx_burst() as possible?
>>>>> I suppose there is no clear answer for that.
>>>> I think if the application needs the driver to start the prefetching, it
>>>> does it because it's already too late when rte_eth_rx_burst() returns.
>>>> So I think it's best to do it as soon as we learn that the HW is done.
>>>
>>> Probably, but there might be situations when it would make more sense to do it later too.
>> Could you elaborate?
>> If the application wants the prefetch to start earlier than it could do
>> itself (right after rte_eth_rx_burst() returns), earlier is better: that
>> gives it the best chance of having the data in cache.
>> If it needs the data later, it can do the prefetch itself.
>>
>>> Again, it might depend on the particular HW and your memory access pattern.
>>>
>>>>
>>>>> That's why my thought was to just get rid of it.
>>>>> Though if it would be implemented in some standardized way and disabled by default -
>>>>> that's probably ok too.
>>>>> BTW, couldn't that be implemented just via the rte_ethdev rx callbacks mechanism?
>>>>> Then we can have the code in one place and not need a compile option at all -
>>>>> could be enabled/disabled dynamically on a per-NIC basis.
>>>>> Or would it be too high overhead for that?
>>>>
>>>> I think if we go that way, it's better to pass an option to
>>>> rte_eth_rx_burst() telling whether you want the header prefetched by the
>>>> driver. That would be the most flexible.
>>>
>>> That would mean ABI breakage for rx_burst()...
>>> Probably better then to add a new parameter in rx_conf or something.
>>> Though it still means that each PMD has to implement it on its own.
>> That would be the case anyway, as we are talking about prefetching in
>> the receive function.
>>
>>> And again there might be PMDs that would just ignore that parameter.
>> If the PMD has a good reason to do that (e.g. prefetch has undesirable
>> effects), I think it's fine.
>>
>>> While with a callback it would all be in one known place.
>> Who would call that callback, and when? If the application called it
>> after rte_eth_rx_burst() returned, it wouldn't be of much use; the
>> application could just do the prefetch itself.
>
> I am talking about the callbacks called by rx_burst() itself:

I don't think it would give us any advantage: the app can just prefetch 
the headers after rte_eth_rx_burst(). If you look at the original patch 
I've sent in about this topic:

http://dpdk.org/ml/archives/dev/2015-September/023144.html

It will start prefetching immediately when an iteration of packets is 
finished. If RTE_IXGBE_DESCS_PER_LOOP is 4 and the burst is 32, by the 
time the receive function returns, the first packet's headers could be 
almost there in the cache. And you can actually do it even earlier than 
in that patch: right when the pointers become available at "B.2 copy 2 
mbuf point into rx_pkts".
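Schematically, the ordering described above looks like the sketch below. This is a self-contained illustration, not the actual ixgbe vector code: the SSE details are stripped out, DESCS_PER_LOOP and HEADROOM are stand-ins for RTE_IXGBE_DESCS_PER_LOOP and RTE_PKTMBUF_HEADROOM, and the status-check/rearm logic is elided.

```c
#include <assert.h>
#include <stdint.h>

#define DESCS_PER_LOOP 4   /* stand-in for RTE_IXGBE_DESCS_PER_LOOP */
#define HEADROOM 128       /* stand-in for RTE_PKTMBUF_HEADROOM */

struct mbuf { uint8_t *buf_addr; };  /* simplified rte_mbuf stand-in */

/* Schematic RX loop: as soon as the mbuf pointers for a group of
 * descriptors are copied out (step "B.2" in the vector PMD), their
 * headers are prefetched - several loop iterations before the burst
 * is returned to the application. */
static uint16_t recv_burst_sketch(struct mbuf **sw_ring,
				  struct mbuf **rx_pkts, uint16_t nb)
{
	uint16_t pos;

	for (pos = 0; pos + DESCS_PER_LOOP <= nb; pos += DESCS_PER_LOOP) {
		/* B.2: copy mbuf pointers into rx_pkts (vectorized in ixgbe) */
		for (int i = 0; i < DESCS_PER_LOOP; i++)
			rx_pkts[pos + i] = sw_ring[pos + i];

		/* earliest point at which the header prefetch can be issued */
		for (int i = 0; i < DESCS_PER_LOOP; i++)
			__builtin_prefetch(rx_pkts[pos + i]->buf_addr + HEADROOM);

		/* ... descriptor status checks, rearm, etc. elided ... */
	}
	return pos;
}
```

With a burst of 32 and groups of 4, the first group's prefetches are issued seven iterations before the function returns, which is the head start being argued for.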


>
> static inline uint16_t
> rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
>                   struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
> {
>          struct rte_eth_dev *dev;
>
>          dev = &rte_eth_devices[port_id];
>
>          int16_t nb_rx = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
>                          rx_pkts, nb_pkts);
>
> #ifdef RTE_ETHDEV_RXTX_CALLBACKS
>          struct rte_eth_rxtx_callback *cb = dev->post_rx_burst_cbs[queue_id];
>
>          if (unlikely(cb != NULL)) {
>                  do {
>                          nb_rx = cb->fn.rx(port_id, queue_id, rx_pkts, nb_rx,
>                                                  nb_pkts, cb->param);
>                          cb = cb->next;
>                  } while (cb != NULL);
>                  }
> #endif
>
>          return nb_rx;
> }
>
> Konstantin
>
>>
>>> But again, I didn't measure it, so not sure what will be an impact of the callback itself.
>>> Might be it not worth it.
>>> Konstantin
>>>
>>>>
>>>>> Konstantin
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> /Bruce
>>>>>>>


end of thread, other threads:[~2015-10-20  9:58 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-01 19:17 ixgbe: prefetch packet headers in vector PMD receive function Zoltan Kiss
2015-09-07 12:25 ` [PATCH] " Zoltan Kiss
2015-09-07 12:57   ` Richardson, Bruce
2015-09-07 14:15     ` Zoltan Kiss
2015-09-07 14:41       ` Richardson, Bruce
2015-09-25 18:28         ` Zoltan Kiss
2015-09-27 23:19           ` Ananyev, Konstantin
2015-10-14 16:10             ` Zoltan Kiss
2015-10-14 23:23               ` Ananyev, Konstantin
2015-10-15 10:32                 ` Zoltan Kiss
2015-10-15 15:43                   ` Ananyev, Konstantin
2015-10-19 16:30                     ` Zoltan Kiss
2015-10-19 18:57                       ` Ananyev, Konstantin
2015-10-20  9:58                         ` Zoltan Kiss
