* [Intel-wired-lan] [RFC PATCH] ixgbe: delay tail write to every 'n' packets
@ 2017-03-13 16:16 John Fastabend
2017-03-13 17:18 ` Alexander Duyck
0 siblings, 1 reply; 6+ messages in thread
From: John Fastabend @ 2017-03-13 16:16 UTC (permalink / raw)
To: intel-wired-lan
Current XDP implementation hits the tail on every XDP_TX return
code. This patch changes driver behavior to only hit the tail after
packet processing is complete.
RFC for now as I test this, it looks promising on my dev box but
want to do some more tests before official submission.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index bef4e24..2c244b6 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -2282,6 +2282,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
unsigned int mss = 0;
#endif /* IXGBE_FCOE */
u16 cleaned_count = ixgbe_desc_unused(rx_ring);
+ bool xdp_xmit = false;
while (likely(total_rx_packets < budget)) {
union ixgbe_adv_rx_desc *rx_desc;
@@ -2321,10 +2322,12 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
}
if (IS_ERR(skb)) {
- if (PTR_ERR(skb) == -IXGBE_XDP_TX)
+ if (PTR_ERR(skb) == -IXGBE_XDP_TX) {
+ xdp_xmit = true;
ixgbe_rx_buffer_flip(rx_ring, rx_buffer, size);
- else
+ } else {
rx_buffer->pagecnt_bias++;
+ }
total_rx_packets++;
total_rx_bytes += size;
} else if (skb) {
@@ -2392,6 +2395,12 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
total_rx_packets++;
}
+ if (xdp_xmit) {
+ struct ixgbe_ring *ring = adapter->xdp_ring[smp_processor_id()];
+
+ writel(ring->next_to_use, ring->tail);
+ }
+
u64_stats_update_begin(&rx_ring->syncp);
rx_ring->stats.packets += total_rx_packets;
rx_ring->stats.bytes += total_rx_bytes;
@@ -8251,7 +8260,6 @@ static int ixgbe_xmit_xdp_ring(struct ixgbe_adapter *adapter,
tx_buffer->next_to_watch = tx_desc;
ring->next_to_use = i;
- writel(i, ring->tail);
return IXGBE_XDP_TX;
}
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [Intel-wired-lan] [RFC PATCH] ixgbe: delay tail write to every 'n' packets
2017-03-13 16:16 [Intel-wired-lan] [RFC PATCH] ixgbe: delay tail write to every 'n' packets John Fastabend
@ 2017-03-13 17:18 ` Alexander Duyck
2017-03-14 23:50 ` John Fastabend
0 siblings, 1 reply; 6+ messages in thread
From: Alexander Duyck @ 2017-03-13 17:18 UTC (permalink / raw)
To: intel-wired-lan
On Mon, Mar 13, 2017 at 9:16 AM, John Fastabend
<john.fastabend@gmail.com> wrote:
> Current XDP implementation hits the tail on every XDP_TX return
> code. This patch changes driver behavior to only hit the tail after
> packet processing is complete.
>
> RFC for now as I test this, it looks promising on my dev box but
> want to do some more tests before official submission.
>
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> ---
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index bef4e24..2c244b6 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -2282,6 +2282,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
> unsigned int mss = 0;
> #endif /* IXGBE_FCOE */
> u16 cleaned_count = ixgbe_desc_unused(rx_ring);
> + bool xdp_xmit = false;
>
> while (likely(total_rx_packets < budget)) {
> union ixgbe_adv_rx_desc *rx_desc;
> @@ -2321,10 +2322,12 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
> }
>
> if (IS_ERR(skb)) {
> - if (PTR_ERR(skb) == -IXGBE_XDP_TX)
> + if (PTR_ERR(skb) == -IXGBE_XDP_TX) {
> + xdp_xmit = true;
> ixgbe_rx_buffer_flip(rx_ring, rx_buffer, size);
> - else
> + } else {
> rx_buffer->pagecnt_bias++;
> + }
> total_rx_packets++;
> total_rx_bytes += size;
> } else if (skb) {
> @@ -2392,6 +2395,12 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
> total_rx_packets++;
> }
>
> + if (xdp_xmit) {
> + struct ixgbe_ring *ring = adapter->xdp_ring[smp_processor_id()];
> +
> + writel(ring->next_to_use, ring->tail);
> + }
> +
We will need a wmb here.
> u64_stats_update_begin(&rx_ring->syncp);
> rx_ring->stats.packets += total_rx_packets;
> rx_ring->stats.bytes += total_rx_bytes;
> @@ -8251,7 +8260,6 @@ static int ixgbe_xmit_xdp_ring(struct ixgbe_adapter *adapter,
> tx_buffer->next_to_watch = tx_desc;
> ring->next_to_use = i;
>
> - writel(i, ring->tail);
So you might want to change the barrier setup for all this to use
smp_wmb instead. We need the wmb to be paired with the writel. That
should give you a slight performance boost since smp_wmb breaks down
to just a barrier on x86 systems.
> return IXGBE_XDP_TX;
> }
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Intel-wired-lan] [RFC PATCH] ixgbe: delay tail write to every 'n' packets
2017-03-13 17:18 ` Alexander Duyck
@ 2017-03-14 23:50 ` John Fastabend
2017-03-15 1:14 ` Alexander Duyck
0 siblings, 1 reply; 6+ messages in thread
From: John Fastabend @ 2017-03-14 23:50 UTC (permalink / raw)
To: intel-wired-lan
On 17-03-13 10:18 AM, Alexander Duyck wrote:
> On Mon, Mar 13, 2017 at 9:16 AM, John Fastabend
> <john.fastabend@gmail.com> wrote:
>> Current XDP implementation hits the tail on every XDP_TX return
>> code. This patch changes driver behavior to only hit the tail after
>> packet processing is complete.
>>
>> RFC for now as I test this, it looks promising on my dev box but
>> want to do some more tests before official submission.
>>
>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>> ---
>> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 14 +++++++++++---
>> 1 file changed, 11 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index bef4e24..2c244b6 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -2282,6 +2282,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
>> unsigned int mss = 0;
>> #endif /* IXGBE_FCOE */
>> u16 cleaned_count = ixgbe_desc_unused(rx_ring);
>> + bool xdp_xmit = false;
>>
>> while (likely(total_rx_packets < budget)) {
>> union ixgbe_adv_rx_desc *rx_desc;
>> @@ -2321,10 +2322,12 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
>> }
>>
>> if (IS_ERR(skb)) {
>> - if (PTR_ERR(skb) == -IXGBE_XDP_TX)
>> + if (PTR_ERR(skb) == -IXGBE_XDP_TX) {
>> + xdp_xmit = true;
>> ixgbe_rx_buffer_flip(rx_ring, rx_buffer, size);
>> - else
>> + } else {
>> rx_buffer->pagecnt_bias++;
>> + }
>> total_rx_packets++;
>> total_rx_bytes += size;
>> } else if (skb) {
>> @@ -2392,6 +2395,12 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
>> total_rx_packets++;
>> }
>>
>> + if (xdp_xmit) {
>> + struct ixgbe_ring *ring = adapter->xdp_ring[smp_processor_id()];
>> +
>> + writel(ring->next_to_use, ring->tail);
>> + }
>> +
>
> We will need a wmb here.
>
>> u64_stats_update_begin(&rx_ring->syncp);
>> rx_ring->stats.packets += total_rx_packets;
>> rx_ring->stats.bytes += total_rx_bytes;
>> @@ -8251,7 +8260,6 @@ static int ixgbe_xmit_xdp_ring(struct ixgbe_adapter *adapter,
>> tx_buffer->next_to_watch = tx_desc;
>> ring->next_to_use = i;
>>
>> - writel(i, ring->tail);
>
> So you might want to change the barrier setup for all this to use
> smp_wmb instead. We need the wmb to be paired with the writel. That
> should give you a slight performance boost since smp_wmb breaks down
> to just a barrier on x86 systems.
>
Not sure I grok this description entirely, but I think you are just saying
replace,
if (xdp_xmit) {
...
writel(...)
}
with
if (xdp_xmit) {
...
smp_rmb()
writel()
}
Correct? Did you have some other change in mind ... "for all this"?
Thanks,
John
>> return IXGBE_XDP_TX;
>> }
>>
>>
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Intel-wired-lan] [RFC PATCH] ixgbe: delay tail write to every 'n' packets
2017-03-14 23:50 ` John Fastabend
@ 2017-03-15 1:14 ` Alexander Duyck
2017-03-27 19:28 ` Fastabend, John R
0 siblings, 1 reply; 6+ messages in thread
From: Alexander Duyck @ 2017-03-15 1:14 UTC (permalink / raw)
To: intel-wired-lan
On Tue, Mar 14, 2017 at 4:50 PM, John Fastabend
<john.fastabend@gmail.com> wrote:
> On 17-03-13 10:18 AM, Alexander Duyck wrote:
>> On Mon, Mar 13, 2017 at 9:16 AM, John Fastabend
>> <john.fastabend@gmail.com> wrote:
>>> Current XDP implementation hits the tail on every XDP_TX return
>>> code. This patch changes driver behavior to only hit the tail after
>>> packet processing is complete.
>>>
>>> RFC for now as I test this, it looks promising on my dev box but
>>> want to do some more tests before official submission.
>>>
>>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>>> ---
>>> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 14 +++++++++++---
>>> 1 file changed, 11 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> index bef4e24..2c244b6 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> @@ -2282,6 +2282,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
>>> unsigned int mss = 0;
>>> #endif /* IXGBE_FCOE */
>>> u16 cleaned_count = ixgbe_desc_unused(rx_ring);
>>> + bool xdp_xmit = false;
>>>
>>> while (likely(total_rx_packets < budget)) {
>>> union ixgbe_adv_rx_desc *rx_desc;
>>> @@ -2321,10 +2322,12 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
>>> }
>>>
>>> if (IS_ERR(skb)) {
>>> - if (PTR_ERR(skb) == -IXGBE_XDP_TX)
>>> + if (PTR_ERR(skb) == -IXGBE_XDP_TX) {
>>> + xdp_xmit = true;
>>> ixgbe_rx_buffer_flip(rx_ring, rx_buffer, size);
>>> - else
>>> + } else {
>>> rx_buffer->pagecnt_bias++;
>>> + }
>>> total_rx_packets++;
>>> total_rx_bytes += size;
>>> } else if (skb) {
>>> @@ -2392,6 +2395,12 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
>>> total_rx_packets++;
>>> }
>>>
>>> + if (xdp_xmit) {
>>> + struct ixgbe_ring *ring = adapter->xdp_ring[smp_processor_id()];
>>> +
>>> + writel(ring->next_to_use, ring->tail);
>>> + }
>>> +
>>
>> We will need a wmb here.
>>
>>> u64_stats_update_begin(&rx_ring->syncp);
>>> rx_ring->stats.packets += total_rx_packets;
>>> rx_ring->stats.bytes += total_rx_bytes;
>>> @@ -8251,7 +8260,6 @@ static int ixgbe_xmit_xdp_ring(struct ixgbe_adapter *adapter,
>>> tx_buffer->next_to_watch = tx_desc;
>>> ring->next_to_use = i;
>>>
>>> - writel(i, ring->tail);
>>
>> So you might want to change the barrier setup for all this to use
>> smp_wmb instead. We need the wmb to be paired with the writel. That
>> should give you a slight performance boost since smp_wmb breaks down
>> to just a barrier on x86 systems.
>>
>
> Not sure I grok this description entirely, but I think you are just saying
> replace,
>
> if (xdp_xmit) {
> ...
> writel(...)
> }
>
> with
>
> if (xdp_xmit) {
> ...
> smp_rmb()
> writel()
> }
>
> Correct? Did you have some other change in mind ... "for all this"?
>
> Thanks,
> John
No. Basically what it should be is find and replace wmb with
smp_wmb() in ixgbe_xmit_xdp_ring. That way you won't have to worry
about a race between Tx and clean-up.
Then the if statement should be:
if (xdp_xmit) {
...
/* wmb required to flush writes to coherent memory before writing
to non-coherent memory*/
wmb();
/* writel is a write to non-coherent memory mapped I/O */
writel();
}
>>> return IXGBE_XDP_TX;
>>> }
>>>
>>>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Intel-wired-lan] [RFC PATCH] ixgbe: delay tail write to every 'n' packets
2017-03-15 1:14 ` Alexander Duyck
@ 2017-03-27 19:28 ` Fastabend, John R
2017-03-27 22:42 ` Duyck, Alexander H
0 siblings, 1 reply; 6+ messages in thread
From: Fastabend, John R @ 2017-03-27 19:28 UTC (permalink / raw)
To: intel-wired-lan
FWIW with a revised version of this patch I see 14.6Mpps on RX drop tests and 13.5Mpps on TX test case.
Alex, maybe we can sync up and squeeze the last mpps out of the design.
Thanks,
John
________________________________________
From: Intel-wired-lan [intel-wired-lan-bounces at lists.osuosl.org] on behalf of Alexander Duyck [alexander.duyck at gmail.com]
Sent: Tuesday, March 14, 2017 6:14 PM
To: John Fastabend
Cc: intel-wired-lan; Alexei Starovoitov; Daniel Borkmann; William Tu
Subject: Re: [Intel-wired-lan] [RFC PATCH] ixgbe: delay tail write to every 'n' packets
On Tue, Mar 14, 2017 at 4:50 PM, John Fastabend
<john.fastabend@gmail.com> wrote:
> On 17-03-13 10:18 AM, Alexander Duyck wrote:
>> On Mon, Mar 13, 2017 at 9:16 AM, John Fastabend
>> <john.fastabend@gmail.com> wrote:
>>> Current XDP implementation hits the tail on every XDP_TX return
>>> code. This patch changes driver behavior to only hit the tail after
>>> packet processing is complete.
>>>
>>> RFC for now as I test this, it looks promising on my dev box but
>>> want to do some more tests before official submission.
>>>
>>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>>> ---
>>> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 14 +++++++++++---
>>> 1 file changed, 11 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> index bef4e24..2c244b6 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> @@ -2282,6 +2282,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
>>> unsigned int mss = 0;
>>> #endif /* IXGBE_FCOE */
>>> u16 cleaned_count = ixgbe_desc_unused(rx_ring);
>>> + bool xdp_xmit = false;
>>>
>>> while (likely(total_rx_packets < budget)) {
>>> union ixgbe_adv_rx_desc *rx_desc;
>>> @@ -2321,10 +2322,12 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
>>> }
>>>
>>> if (IS_ERR(skb)) {
>>> - if (PTR_ERR(skb) == -IXGBE_XDP_TX)
>>> + if (PTR_ERR(skb) == -IXGBE_XDP_TX) {
>>> + xdp_xmit = true;
>>> ixgbe_rx_buffer_flip(rx_ring, rx_buffer, size);
>>> - else
>>> + } else {
>>> rx_buffer->pagecnt_bias++;
>>> + }
>>> total_rx_packets++;
>>> total_rx_bytes += size;
>>> } else if (skb) {
>>> @@ -2392,6 +2395,12 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
>>> total_rx_packets++;
>>> }
>>>
>>> + if (xdp_xmit) {
>>> + struct ixgbe_ring *ring = adapter->xdp_ring[smp_processor_id()];
>>> +
>>> + writel(ring->next_to_use, ring->tail);
>>> + }
>>> +
>>
>> We will need a wmb here.
>>
>>> u64_stats_update_begin(&rx_ring->syncp);
>>> rx_ring->stats.packets += total_rx_packets;
>>> rx_ring->stats.bytes += total_rx_bytes;
>>> @@ -8251,7 +8260,6 @@ static int ixgbe_xmit_xdp_ring(struct ixgbe_adapter *adapter,
>>> tx_buffer->next_to_watch = tx_desc;
>>> ring->next_to_use = i;
>>>
>>> - writel(i, ring->tail);
>>
>> So you might want to change the barrier setup for all this to use
>> smp_wmb instead. We need the wmb to be paired with the writel. That
>> should give you a slight performance boost since smp_wmb breaks down
>> to just a barrier on x86 systems.
>>
>
> Not sure I grok this description entirely, but I think you are just saying
> replace,
>
> if (xdp_xmit) {
> ...
> writel(...)
> }
>
> with
>
> if (xdp_xmit) {
> ...
> smp_rmb()
> writel()
> }
>
> Correct? Did you have some other change in mind ... "for all this"?
>
> Thanks,
> John
No. Basically what it should be is find and replace wmb with
smp_wmb() in ixgbe_xmit_xdp_ring. That way you won't have to worry
about a race between Tx and clean-up.
Then the if statement should be:
if (xdp_xmit) {
...
/* wmb required to flush writes to coherent memory before writing
to non-coherent memory*/
wmb();
/* writel is a write to non-coherent memory mapped I/O */
writel();
}
>>> return IXGBE_XDP_TX;
>>> }
>>>
>>>
>
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan at lists.osuosl.org
http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Intel-wired-lan] [RFC PATCH] ixgbe: delay tail write to every 'n' packets
2017-03-27 19:28 ` Fastabend, John R
@ 2017-03-27 22:42 ` Duyck, Alexander H
0 siblings, 0 replies; 6+ messages in thread
From: Duyck, Alexander H @ 2017-03-27 22:42 UTC (permalink / raw)
To: intel-wired-lan
Sounds good. ?We can probably discuss it tomorrow at our 1pm meeting.
Thanks.
- Alex
On Mon, 2017-03-27 at 19:28 +0000, Fastabend, John R wrote:
> FWIW with a revised version of this patch I see 14.6Mpps on RX drop tests and 13.5Mpps on TX test case.
>
> Alex, maybe we can sync up and squeeze the last mpps out of the design.
>
> Thanks,
> John
> ________________________________________
> From: Intel-wired-lan [intel-wired-lan-bounces at lists.osuosl.org] on behalf of Alexander Duyck [alexander.duyck at gmail.com]
> Sent: Tuesday, March 14, 2017 6:14 PM
> To: John Fastabend
> Cc: intel-wired-lan; Alexei Starovoitov; Daniel Borkmann; William Tu
> Subject: Re: [Intel-wired-lan] [RFC PATCH] ixgbe: delay tail write to every 'n' packets
>
> On Tue, Mar 14, 2017 at 4:50 PM, John Fastabend
> <john.fastabend@gmail.com> wrote:
> >
> > On 17-03-13 10:18 AM, Alexander Duyck wrote:
> > >
> > > On Mon, Mar 13, 2017 at 9:16 AM, John Fastabend
> > > <john.fastabend@gmail.com> wrote:
> > > >
> > > > Current XDP implementation hits the tail on every XDP_TX return
> > > > code. This patch changes driver behavior to only hit the tail after
> > > > packet processing is complete.
> > > >
> > > > RFC for now as I test this, it looks promising on my dev box but
> > > > want to do some more tests before official submission.
> > > >
> > > > Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> > > > ---
> > > > drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 14 +++++++++++---
> > > > 1 file changed, 11 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > > > index bef4e24..2c244b6 100644
> > > > --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > > > +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > > > @@ -2282,6 +2282,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
> > > > unsigned int mss = 0;
> > > > #endif /* IXGBE_FCOE */
> > > > u16 cleaned_count = ixgbe_desc_unused(rx_ring);
> > > > + bool xdp_xmit = false;
> > > >
> > > > while (likely(total_rx_packets < budget)) {
> > > > union ixgbe_adv_rx_desc *rx_desc;
> > > > @@ -2321,10 +2322,12 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
> > > > }
> > > >
> > > > if (IS_ERR(skb)) {
> > > > - if (PTR_ERR(skb) == -IXGBE_XDP_TX)
> > > > + if (PTR_ERR(skb) == -IXGBE_XDP_TX) {
> > > > + xdp_xmit = true;
> > > > ixgbe_rx_buffer_flip(rx_ring, rx_buffer, size);
> > > > - else
> > > > + } else {
> > > > rx_buffer->pagecnt_bias++;
> > > > + }
> > > > total_rx_packets++;
> > > > total_rx_bytes += size;
> > > > } else if (skb) {
> > > > @@ -2392,6 +2395,12 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
> > > > total_rx_packets++;
> > > > }
> > > >
> > > > + if (xdp_xmit) {
> > > > + struct ixgbe_ring *ring = adapter->xdp_ring[smp_processor_id()];
> > > > +
> > > > + writel(ring->next_to_use, ring->tail);
> > > > + }
> > > > +
> > >
> > > We will need a wmb here.
> > >
> > > >
> > > > u64_stats_update_begin(&rx_ring->syncp);
> > > > rx_ring->stats.packets += total_rx_packets;
> > > > rx_ring->stats.bytes += total_rx_bytes;
> > > > @@ -8251,7 +8260,6 @@ static int ixgbe_xmit_xdp_ring(struct ixgbe_adapter *adapter,
> > > > tx_buffer->next_to_watch = tx_desc;
> > > > ring->next_to_use = i;
> > > >
> > > > - writel(i, ring->tail);
> > >
> > > So you might want to change the barrier setup for all this to use
> > > smp_wmb instead. We need the wmb to be paired with the writel. That
> > > should give you a slight performance boost since smp_wmb breaks down
> > > to just a barrier on x86 systems.
> > >
> >
> > Not sure I grok this description entirely, but I think you are just saying
> > replace,
> >
> > if (xdp_xmit) {
> > ...
> > writel(...)
> > }
> >
> > with
> >
> > if (xdp_xmit) {
> > ...
> > smp_rmb()
> > writel()
> > }
> >
> > Correct? Did you have some other change in mind ... "for all this"?
> >
> > Thanks,
> > John
>
>
> No. Basically what it should be is find and replace wmb with
> smp_wmb() in ixgbe_xmit_xdp_ring. That way you won't have to worry
> about a race between Tx and clean-up.
>
> Then the if statement should be:
> if (xdp_xmit) {
> ...
> /* wmb required to flush writes to coherent memory before writing
> to non-coherent memory*/
> wmb();
> /* writel is a write to non-coherent memory mapped I/O */
> writel();
> }
>
> >
> > >
> > > >
> > > > return IXGBE_XDP_TX;
> > > > }
> > > >
> > > >
> >
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan at lists.osuosl.org
> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan at lists.osuosl.org
> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2017-03-27 22:42 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-13 16:16 [Intel-wired-lan] [RFC PATCH] ixgbe: delay tail write to every 'n' packets John Fastabend
2017-03-13 17:18 ` Alexander Duyck
2017-03-14 23:50 ` John Fastabend
2017-03-15 1:14 ` Alexander Duyck
2017-03-27 19:28 ` Fastabend, John R
2017-03-27 22:42 ` Duyck, Alexander H
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.