linux-kernel.vger.kernel.org archive mirror
* [RFC 0/2] net-next: hw flow offloading
@ 2017-07-21 15:20 John Crispin
  2017-07-21 15:20 ` [RFC 1/2] net-next: add a dma_desc element to struct skb_shared_info John Crispin
  2017-07-21 15:20 ` [RFC 2/2] net-next: mediatek: populate the shared John Crispin
  0 siblings, 2 replies; 7+ messages in thread
From: John Crispin @ 2017-07-21 15:20 UTC (permalink / raw)
  To: David S . Miller, Eric Dumazet; +Cc: linux-kernel, netdev, John Crispin

Hi,

I managed to bring up flow offloading on the latest MediaTek silicon.

When enabling HW flow offloading, the traffic coming in on either of the
GMACs is first sent to the PPE for processing. Any traffic not offloaded
at this point will then be forwarded to the normal RX DMA ring for SW path
processing. In this case the PPE will send additional data inside RXD4
that is later required by the upper layers to populate the flow offloading
engine's HW tables properly.

This series is an RFC as I am not sure how best to propagate the additional
info from the RX DMA descriptor. The driver is still using NF hooks and
I plan to rebase it and send it upstream once the flow table offloading
patches that folks are working on are upstream.

I am currently trying to get rid of the remaining hacks in the code and
wanted to know if this series would be a feasible solution.

	John

John Crispin (2):
  net-next: add a dma_desc element to struct skb_shared_info
  net-next: mediatek: populate the shared

 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 4 ++++
 include/linux/skbuff.h                      | 1 +
 2 files changed, 5 insertions(+)

-- 
2.11.0

* [RFC 1/2] net-next: add a dma_desc element to struct skb_shared_info
  2017-07-21 15:20 [RFC 0/2] net-next: hw flow offloading John Crispin
@ 2017-07-21 15:20 ` John Crispin
  2017-07-21 15:56   ` Paolo Abeni
  2017-07-21 15:20 ` [RFC 2/2] net-next: mediatek: populate the shared John Crispin
  1 sibling, 1 reply; 7+ messages in thread
From: John Crispin @ 2017-07-21 15:20 UTC (permalink / raw)
  To: David S . Miller, Eric Dumazet; +Cc: linux-kernel, netdev, John Crispin

In order to make HW flow offloading work on the latest MediaTek silicon we need
to propagate part of the RX DMA descriptor to the upper layers populating
the flow offload engine's HW tables. This patch adds an extra element to
struct skb_shared_info, allowing the ethernet driver's RX NAPI code to store
the required information and make it persistent for the lifecycle of the
skb and its clones.

Signed-off-by: John Crispin <john@phrozen.org>
---
 include/linux/skbuff.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4093552be1de..db9576cd946b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -426,6 +426,7 @@ struct skb_shared_info {
 	unsigned int	gso_type;
 	u32		tskey;
 	__be32          ip6_frag_id;
+	u32		dma_desc;
 
 	/*
 	 * Warning : all fields before dataref are cleared in __alloc_skb()
-- 
2.11.0

* [RFC 2/2] net-next: mediatek: populate the shared
  2017-07-21 15:20 [RFC 0/2] net-next: hw flow offloading John Crispin
  2017-07-21 15:20 ` [RFC 1/2] net-next: add a dma_desc element to struct skb_shared_info John Crispin
@ 2017-07-21 15:20 ` John Crispin
  1 sibling, 0 replies; 7+ messages in thread
From: John Crispin @ 2017-07-21 15:20 UTC (permalink / raw)
  To: David S . Miller, Eric Dumazet; +Cc: linux-kernel, netdev, John Crispin

When enabling HW flow offloading, the traffic coming in on either of the
GMACs is first sent to the PPE for processing. Any traffic not offloaded
at this point will then be forwarded to the normal RX DMA ring for SW path
processing. In this case the PPE will send additional data inside RXD4
that is later required by the upper layers to populate the flow offloading
engine's HW tables properly. This patch sets the skb_shared_info's dma_desc
field so that we can use the value later on.

Signed-off-by: John Crispin <john@phrozen.org>
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index a455d1b4f1d8..42d162cd6363 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -918,6 +918,7 @@ static void mtk_update_rx_cpu_idx(struct mtk_eth *eth)
 static int mtk_poll_rx(struct napi_struct *napi, int budget,
 		       struct mtk_eth *eth)
 {
+	struct skb_shared_info *sh;
 	struct mtk_rx_ring *ring;
 	int idx;
 	struct sk_buff *skb;
@@ -1000,6 +1001,9 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
 		else
 			skb_checksum_none_assert(skb);
 		skb->protocol = eth_type_trans(skb, netdev);
+		sh = skb_shinfo(skb);
+
+		sh->dma_desc = trxd.rxd4;
 
 		if (netdev->features & NETIF_F_HW_VLAN_CTAG_RX &&
 		    RX_DMA_VID(trxd.rxd3))
-- 
2.11.0

* Re: [RFC 1/2] net-next: add a dma_desc element to struct skb_shared_info
  2017-07-21 15:20 ` [RFC 1/2] net-next: add a dma_desc element to struct skb_shared_info John Crispin
@ 2017-07-21 15:56   ` Paolo Abeni
  2017-07-21 17:01     ` John Crispin
  0 siblings, 1 reply; 7+ messages in thread
From: Paolo Abeni @ 2017-07-21 15:56 UTC (permalink / raw)
  To: John Crispin, David S . Miller, Eric Dumazet; +Cc: linux-kernel, netdev

Hi,

On Fri, 2017-07-21 at 17:20 +0200, John Crispin wrote:
> In order to make HW flow offloading work on the latest MediaTek silicon we need
> to propagate part of the RX DMA descriptor to the upper layers populating
> the flow offload engine's HW tables. This patch adds an extra element to
> struct skb_shared_info, allowing the ethernet driver's RX NAPI code to store
> the required information and make it persistent for the lifecycle of the
> skb and its clones.
> 
> Signed-off-by: John Crispin <john@phrozen.org>
> ---
>  include/linux/skbuff.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 4093552be1de..db9576cd946b 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -426,6 +426,7 @@ struct skb_shared_info {
>  	unsigned int	gso_type;
>  	u32		tskey;
>  	__be32          ip6_frag_id;
> +	u32		dma_desc;
>  
>  	/*
>  	 * Warning : all fields before dataref are cleared in __alloc_skb()

This will increase the size of struct skb_shared_info, which is already
quite large, and can have several kinds of performance drawbacks.
AFAIK this is discouraged.
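
To illustrate the size concern, here is a minimal user-space sketch. The
two structs are hypothetical, heavily trimmed stand-ins for struct
skb_shared_info (only the fields visible in the diff context plus a
placeholder dataref); the real struct has many more members, so the actual
growth depends on its padding and alignment. The sketch only shows that a
new field placed before dataref grows both the struct and the region that,
per the comment in the diff, is cleared in __alloc_skb().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the struct before the patch. */
struct shinfo_before {
	unsigned int gso_type;
	uint32_t tskey;
	uint32_t ip6_frag_id;
	int dataref;		/* placeholder for the real atomic_t dataref */
};

/* Same mock layout with the proposed field added before dataref. */
struct shinfo_after {
	unsigned int gso_type;
	uint32_t tskey;
	uint32_t ip6_frag_id;
	uint32_t dma_desc;	/* field proposed by this patch */
	int dataref;
};

/* Number of bytes in front of dataref, i.e. the part zeroed per allocation. */
#define SHINFO_CLEAR_LEN(type) offsetof(type, dataref)

/* Growth of the whole mock struct caused by the new field. */
static inline size_t shinfo_size_delta(void)
{
	return sizeof(struct shinfo_after) - sizeof(struct shinfo_before);
}

/* Growth of the zeroed region caused by the new field. */
static inline size_t shinfo_clear_delta(void)
{
	return SHINFO_CLEAR_LEN(struct shinfo_after) -
	       SHINFO_CLEAR_LEN(struct shinfo_before);
}
```

In this mock both deltas equal the size of the added field; in the real
struct the cost is paid on every skb allocation, whether or not the
hardware ever fills the field.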

I don't understand the use case; the driver will set this field, but
who is going to consume it?

Thanks,

Paolo

* Re: [RFC 1/2] net-next: add a dma_desc element to struct skb_shared_info
  2017-07-21 15:56   ` Paolo Abeni
@ 2017-07-21 17:01     ` John Crispin
  2017-07-21 19:21       ` David Miller
  2017-07-21 20:37       ` Florian Westphal
  0 siblings, 2 replies; 7+ messages in thread
From: John Crispin @ 2017-07-21 17:01 UTC (permalink / raw)
  To: Paolo Abeni, David S . Miller, Eric Dumazet; +Cc: linux-kernel, netdev



On 21/07/17 17:56, Paolo Abeni wrote:
> Hi,
>
> On Fri, 2017-07-21 at 17:20 +0200, John Crispin wrote:
>> In order to make HW flow offloading work on the latest MediaTek silicon we need
>> to propagate part of the RX DMA descriptor to the upper layers populating
>> the flow offload engine's HW tables. This patch adds an extra element to
>> struct skb_shared_info, allowing the ethernet driver's RX NAPI code to store
>> the required information and make it persistent for the lifecycle of the
>> skb and its clones.
>>
>> Signed-off-by: John Crispin <john@phrozen.org>
>> ---
>>   include/linux/skbuff.h | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index 4093552be1de..db9576cd946b 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -426,6 +426,7 @@ struct skb_shared_info {
>>   	unsigned int	gso_type;
>>   	u32		tskey;
>>   	__be32          ip6_frag_id;
>> +	u32		dma_desc;
>>   
>>   	/*
>>   	 * Warning : all fields before dataref are cleared in __alloc_skb()
> This will increase the size of struct skb_shared_info, which is already
> quite large, and can have several kinds of performance drawbacks.
> AFAIK this is discouraged.
>
> I don't understand the use case; the driver will set this field, but
> who is going to consume it?
>
> Thanks,
>
> Paolo
Hi Paolo,

When the flow offloading engine forwards a packet to the DMA, it sends
additional info to the SW path. This includes:
* physical switch port
* internal flow hash - this is required to populate the correct flow
  table entry
* PPE state - this indicates what state the PPE's internal table is in
  for the flow
* the reason why the packet was forwarded - these are things like bind,
  unbind, timed out, ...

Once the flow table offloading patches are ready and upstream, the
netfilter layer will see the SKB and pass it on to the flow table
offloading code, at which point it will finally end up inside the
offloading driver. The driver will need access to the info sent to
the SW path inside the RX descriptor to properly work out what state the
flow is in and which table entry to populate in the HW table for
offloading to work.
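
As a rough illustration of how the offloading driver might unpack that
info from the 32-bit RXD4 word, here is a minimal sketch. The bit
positions, field widths, macro names and the rxd4_info/rxd4_decode names
are all assumptions made for this example; the thread does not give the
real layout, which has to come from the hardware documentation.

```c
#include <stdint.h>

/*
 * HYPOTHETICAL layout of RXD4. None of these shifts or masks are taken
 * from the hardware manual; they only make the example concrete.
 */
#define RXD4_FLOW_HASH_SHIFT	0
#define RXD4_FLOW_HASH_MASK	0x3fffu	/* assumed 14-bit flow hash */
#define RXD4_REASON_SHIFT	14
#define RXD4_REASON_MASK	0x1fu	/* assumed 5-bit forward reason */
#define RXD4_PPE_STATE_SHIFT	19
#define RXD4_PPE_STATE_MASK	0x3u	/* assumed 2-bit PPE entry state */
#define RXD4_SRC_PORT_SHIFT	21
#define RXD4_SRC_PORT_MASK	0x7u	/* assumed 3-bit switch port */

/* The four pieces of information listed above, in decoded form. */
struct rxd4_info {
	uint16_t flow_hash;	/* index of the flow table entry */
	uint8_t reason;		/* why the PPE sent the packet to the CPU */
	uint8_t ppe_state;	/* state of the PPE table entry */
	uint8_t src_port;	/* physical switch port */
};

/* Split a raw RXD4 word into its (assumed) fields. */
static inline struct rxd4_info rxd4_decode(uint32_t rxd4)
{
	struct rxd4_info info = {
		.flow_hash = (rxd4 >> RXD4_FLOW_HASH_SHIFT) & RXD4_FLOW_HASH_MASK,
		.reason    = (rxd4 >> RXD4_REASON_SHIFT) & RXD4_REASON_MASK,
		.ppe_state = (rxd4 >> RXD4_PPE_STATE_SHIFT) & RXD4_PPE_STATE_MASK,
		.src_port  = (rxd4 >> RXD4_SRC_PORT_SHIFT) & RXD4_SRC_PORT_MASK,
	};

	return info;
}
```

With a decoder like this, the consumer only needs the raw u32 stored by
the RX path, which is why the series carries the whole RXD4 word rather
than the individual fields.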

Hope that is a little clearer. The current hackish driver is here [1], and
the patch to the ethernet driver is here [2].

     John

[1] 
https://git.lede-project.org/?p=lede/blogic/staging.git;a=tree;f=target/linux/mediatek/files/drivers/net/ethernet/mediatek/mtk_hnat;hb=bc0518b9d928b43d965d8a1f8860281f0ae6a31c
[2] 
https://git.lede-project.org/?p=lede/blogic/staging.git;a=blob;f=target/linux/mediatek/patches-4.9/0310-hwnat.patch;h=57bd0c07b39d2169f3ba08e1aa83b92dffcee025;hb=bc0518b9d928b43d965d8a1f8860281f0ae6a31c

* Re: [RFC 1/2] net-next: add a dma_desc element to struct skb_shared_info
  2017-07-21 17:01     ` John Crispin
@ 2017-07-21 19:21       ` David Miller
  2017-07-21 20:37       ` Florian Westphal
  1 sibling, 0 replies; 7+ messages in thread
From: David Miller @ 2017-07-21 19:21 UTC (permalink / raw)
  To: john; +Cc: pabeni, edumazet, linux-kernel, netdev

From: John Crispin <john@phrozen.org>
Date: Fri, 21 Jul 2017 19:01:57 +0200

> When the flow offloading engine forwards a packet to the DMA, it sends
> additional info to the SW path. This includes:
> * physical switch port
> * internal flow hash - this is required to populate the correct flow
>   table entry
> * PPE state - this indicates what state the PPE's internal table is in
>   for the flow
> * the reason why the packet was forwarded - these are things like bind,
>   unbind, timed out, ...
> 
> Once the flow table offloading patches are ready and upstream, the
> netfilter layer will see the SKB and pass it on to the flow table
> offloading code, at which point it will finally end up inside the
> offloading driver. The driver will need access to the info sent to
> the SW path inside the RX descriptor to properly work out what state
> the flow is in and which table entry to populate in the HW table for
> offloading to work.

You absolutely must justify any change to a core data structure
alongside the complete and full set of patches that actually make use
of that data structure change.

You can't just say "here is the data structure change and BTW what
actually uses this is somewhere else, and not here on the list yet."

That makes it impossible to 1) evaluate the correctness of your change
and 2) validate the actual use so we can suggest alternative schemes
and/or approaches.

So please don't suggest changes this way.

Thanks.

* Re: [RFC 1/2] net-next: add a dma_desc element to struct skb_shared_info
  2017-07-21 17:01     ` John Crispin
  2017-07-21 19:21       ` David Miller
@ 2017-07-21 20:37       ` Florian Westphal
  1 sibling, 0 replies; 7+ messages in thread
From: Florian Westphal @ 2017-07-21 20:37 UTC (permalink / raw)
  To: John Crispin
  Cc: Paolo Abeni, David S . Miller, Eric Dumazet, linux-kernel, netdev

John Crispin <john@phrozen.org> wrote:
> When the flow offloading engine forwards a packet to the DMA, it sends
> additional info to the SW path. This includes:
> * physical switch port
> * internal flow hash - this is required to populate the correct flow table
>   entry
> * PPE state - this indicates what state the PPE's internal table is in for
>   the flow
> * the reason why the packet was forwarded - these are things like bind,
>   unbind, timed out, ...
> 
> Once the flow table offloading patches are ready and upstream, the netfilter
> layer will see the SKB and pass it on to the flow table offloading code,

If this is about conntrack offloading, then I prefer if this is done
without changing any core network structure.

What about adding a new conntrack extension to hold whatever info
you need, and then allocate a conntrack entry in the driver?

This would obviously need core changes in conntrack (such as allowing
calls into conntrack from drivers without hard module dependencies),
and a thorough check of whether this causes backwards-compatibility
problems (e.g. right now a "-m conntrack" check in the raw table can
only succeed for packets from the lo interface).

But I think that could be worked around, esp. if we assume that we
won't see such entries a lot (assuming sw is slowpath and hw handles
most packets).
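
To make the idea a bit more concrete, here is a minimal user-space sketch
of keeping the hardware info in a per-connection extension slot instead of
in a per-packet core structure. Every type and function here (mock_conn,
mock_pkt, mock_hw_offload_ext, mock_ext_set, mock_ext_find) is made up for
illustration; this is not the conntrack extension API, only the shape of
the approach.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extension IDs; a real conntrack extension would register one. */
enum mock_ext_id {
	MOCK_EXT_HW_OFFLOAD,
	MOCK_EXT_NUM,
};

/* Hypothetical payload: whatever the RX descriptor told us about the flow. */
struct mock_hw_offload_ext {
	uint32_t rxd4;
};

/* Stand-in for a connection tracking entry with optional extensions. */
struct mock_conn {
	void *ext[MOCK_EXT_NUM];
};

/* Stand-in for a packet: it only points at its connection entry. */
struct mock_pkt {
	struct mock_conn *conn;
};

/* Driver side: attach the extension to the connection entry. */
static inline void mock_ext_set(struct mock_conn *conn, enum mock_ext_id id,
				void *data)
{
	conn->ext[id] = data;
}

/* Consumer side: look the extension up; NULL if none was attached. */
static inline void *mock_ext_find(const struct mock_conn *conn,
				  enum mock_ext_id id)
{
	return conn->ext[id];
}
```

The point of the design is that only flows seen on the software path pay
for the extra storage, while the packet structure itself stays unchanged.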

Thanks,
Florian
