All of lore.kernel.org
 help / color / mirror / Atom feed
* Alternate location for skb_shared_info other than the tail?
@ 2020-08-15 16:57 Florian Fainelli
  2020-08-15 17:34 ` Eric Dumazet
  0 siblings, 1 reply; 2+ messages in thread
From: Florian Fainelli @ 2020-08-15 16:57 UTC (permalink / raw)
  To: netdev, David S. Miller, Jakub Kicinski, edumazet
  Cc: Doug Berger, Justin Chen

Hi all,

We have an Ethernet controller that is capable of putting several 
Ethernet frames within a single 4KB buffer provided by Linux. The 
rationale for this design is to maximize the DRAM efficiency, especially 
for LPDDR4/5 architectures where the cost to keep a page open is higher 
than say DDR3/4.

We have a programmable alignment which allows us to make sure that we 
can align the start of the Ethernet frames on a cache line boundary (and 
also add the 2 byte stuffing to further align the IP header). What we do 
not have however is a programmable tail room to space out the Ethernet 
frames between one another. Worst case, if the end aligns on a 64b 
boundary, the next frame would start on the adjacent byte, leaving no 
space in between.

We were initially thinking about using build_skb() (and variants) and 
point the data argument to the location within that 4KB buffer, however 
this causes a problem that we have the skb_shared_info structure at the 
end of the Ethernet frame, and that area can be overwritten by the 
hardware. Right now we allocate a new sk_buff and copy from the offset 
within the 4KB buffer. The CPU is fast enough and this warms up the data 
cache that this is not a performance problem, but we would prefer to do 
without the copy to limit the amount of allocations in NAPI context.

What is further complicating is that by the time we process one Ethernet 
frame within the 4KB data buffer, we do not necessarily know whether 
another one will come, and what space we would have between the two. If 
we do know, though, we could see if we have enough room for the 
skb_shared_info and at least avoid copying that sk_buff, but this would 
not be such a big win.

We are unfortunately a bit late to fix that hardware design limitation, 
and we did not catch the requirement for putting skb_shared_info at the 
end until too late :/

In premise skb_shared_info could reside somewhere other than at the 
tail, and the entire network stack uses the skb_shinfo() to access the 
area (there are no concerns about "wild" accesses to skb_shared_info), 
so if we could find a way to encode within the sk_buff that an alternate 
location is to be used, we could get our cookie. There are some parts of 
the stack that do however assume that we need to allocate the header 
part plus the sbk_shared_info structure, those assumptions may be harder 
to break.

Do you have any suggestions on how we could specify an alternate 
location for skb_shared_info or any other suggestions on how to avoid 
copies?

Thanks!
-- 
Florian

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: Alternate location for skb_shared_info other than the tail?
  2020-08-15 16:57 Alternate location for skb_shared_info other than the tail? Florian Fainelli
@ 2020-08-15 17:34 ` Eric Dumazet
  0 siblings, 0 replies; 2+ messages in thread
From: Eric Dumazet @ 2020-08-15 17:34 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, David S. Miller, Jakub Kicinski, Doug Berger, Justin Chen

On Sat, Aug 15, 2020 at 9:57 AM Florian Fainelli <f.fainelli@gmail.com> wrote:
>
> Hi all,
>
> We have an Ethernet controller that is capable of putting several
> Ethernet frames within a single 4KB buffer provided by Linux. The
> rationale for this design is to maximize the DRAM efficiency, especially
> for LPDDR4/5 architectures where the cost to keep a page open is higher
> than say DDR3/4.
>
> We have a programmable alignment which allows us to make sure that we
> can align the start of the Ethernet frames on a cache line boundary (and
> also add the 2 byte stuffing to further align the IP header). What we do
> not have however is a programmable tail room to space out the Ethernet
> frames between one another. Worst case, if the end aligns on a 64b
> boundary, the next frame would start on the adjacent byte, leaving no
> space in between.
>
> We were initially thinking about using build_skb() (and variants) and
> point the data argument to the location within that 4KB buffer, however
> this causes a problem that we have the skb_shared_info structure at the
> end of the Ethernet frame, and that area can be overwritten by the
> hardware. Right now we allocate a new sk_buff and copy from the offset
> within the 4KB buffer. The CPU is fast enough and this warms up the data
> cache that this is not a performance problem, but we would prefer to do
> without the copy to limit the amount of allocations in NAPI context.
>
> What is further complicating is that by the time we process one Ethernet
> frame within the 4KB data buffer, we do not necessarily know whether
> another one will come, and what space we would have between the two. If
> we do know, though, we could see if we have enough room for the
> skb_shared_info and at least avoid copying that sk_buff, but this would
> not be such a big win.
>
> We are unfortunately a bit late to fix that hardware design limitation,
> and we did not catch the requirement for putting skb_shared_info at the
> end until too late :/
>
> In premise skb_shared_info could reside somewhere other than at the
> tail, and the entire network stack uses the skb_shinfo() to access the
> area (there are no concerns about "wild" accesses to skb_shared_info),
> so if we could find a way to encode within the sk_buff that an alternate
> location is to be used, we could get our cookie. There are some parts of
> the stack that do however assume that we need to allocate the header
> part plus the sbk_shared_info structure, those assumptions may be harder
> to break.
>
> Do you have any suggestions on how we could specify an alternate
> location for skb_shared_info or any other suggestions on how to avoid
> copies?

It seems you do _not_ want to use build_skb(), but instead allocate a
standard skb
with enough room in skb->head to pull the headers.

build_skb was fine for MTU=1500 and using 2KB pre-allocated buffers,
because we could use the
2048-1536 = 512 bytes as a placeholder for shared_info.

But for non conventional MTU or buffer splitting (as in your case),
there is no point for using this.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2020-08-15 22:01 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-15 16:57 Alternate location for skb_shared_info other than the tail? Florian Fainelli
2020-08-15 17:34 ` Eric Dumazet

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.