* Is it safe for a NIC driver to use all the 48 bytes of skb->cb?
@ 2020-02-15 5:23 Dexuan Cui
2020-02-15 15:20 ` Haiyang Zhang
0 siblings, 1 reply; 4+ messages in thread
From: Dexuan Cui @ 2020-02-15 5:23 UTC (permalink / raw)
To: Haiyang Zhang, Stephen Hemminger, David S. Miller, netdev,
KY Srinivasan, linux-kernel
Cc: linux-hyperv
Hi,
It looks all the layers of drivers among the network stack can use the 48-byte
skb->cb array. Is there any rule how they should coordinate with each other?
I noticed the last 16 bytes are used by struct skb_gso_cb:
include/linux/skbuff.h:
struct skb_gso_cb {
union {
int mac_offset;
int data_offset;
};
int encap_level;
__wsum csum;
__u16 csum_start;
};
#define SKB_SGO_CB_OFFSET 32
#define SKB_GSO_CB(skb) ((struct skb_gso_cb *)((skb)->cb + SKB_SGO_CB_OFFSET))
Does this mean a low level NIC driver (e.g. hv_netvsc) should only use
the first 32 bytes? What if the upper layer network stack starts to take up
more space in the future?
Now hv_netvsc assumes it can use all of the 48-bytes, though it uses only
20 bytes, but just in case the struct hv_netvsc_packet grows to >32 bytes in the
future, should we change the BUILD_BUG_ON() in netvsc_start_xmit() to
BUILD_BUG_ON(sizeof(struct hv_netvsc_packet) > SKB_SGO_CB_OFFSET);
?
struct hv_netvsc_packet {
/* Bookkeeping stuff */
u8 cp_partial; /* partial copy into send buffer */
u8 rmsg_size; /* RNDIS header and PPI size */
u8 rmsg_pgcnt; /* page count of RNDIS header and PPI */
u8 page_buf_cnt;
u16 q_idx;
u16 total_packets;
u32 total_bytes;
u32 send_buf_index;
u32 total_data_buflen;
};
static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
{
...
/*
* Place the rndis header in the skb head room and
* the skb->cb will be used for hv_netvsc_packet
* structure.
*/
ret = skb_cow_head(skb, RNDIS_AND_PPI_SIZE);
if (ret)
goto no_memory;
/* Use the skb control buffer for building up the packet */
BUILD_BUG_ON(sizeof(struct hv_netvsc_packet) >
FIELD_SIZEOF(struct sk_buff, cb));
packet = (struct hv_netvsc_packet *)skb->cb;
Thanks,
-- Dexuan
^ permalink raw reply [flat|nested] 4+ messages in thread
* RE: Is it safe for a NIC driver to use all the 48 bytes of skb->cb?
2020-02-15 5:23 Is it safe for a NIC driver to use all the 48 bytes of skb->cb? Dexuan Cui
@ 2020-02-15 15:20 ` Haiyang Zhang
2020-02-15 18:04 ` Dexuan Cui
0 siblings, 1 reply; 4+ messages in thread
From: Haiyang Zhang @ 2020-02-15 15:20 UTC (permalink / raw)
To: Dexuan Cui, Stephen Hemminger, David S. Miller, netdev,
KY Srinivasan, linux-kernel
Cc: linux-hyperv
> -----Original Message-----
> From: Dexuan Cui <decui@microsoft.com>
> Sent: Saturday, February 15, 2020 12:24 AM
> To: Haiyang Zhang <haiyangz@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; David S. Miller <davem@davemloft.net>;
> netdev@vger.kernel.org; KY Srinivasan <kys@microsoft.com>; linux-
> kernel@vger.kernel.org
> Cc: linux-hyperv@vger.kernel.org
> Subject: Is it safe for a NIC driver to use all the 48 bytes of skb->cb?
>
> Hi,
> It looks all the layers of drivers among the network stack can use the 48-byte
> skb->cb array. Is there any rule how they should coordinate with each other?
>
> I noticed the last 16 bytes are used by struct skb_gso_cb:
>
> include/linux/skbuff.h:
> struct skb_gso_cb {
> union {
> int mac_offset;
> int data_offset;
> };
> int encap_level;
> __wsum csum;
> __u16 csum_start;
> };
> #define SKB_SGO_CB_OFFSET 32
> #define SKB_GSO_CB(skb) ((struct skb_gso_cb *)((skb)->cb +
> SKB_SGO_CB_OFFSET))
>
> Does this mean a low level NIC driver (e.g. hv_netvsc) should only use the first
> 32 bytes? What if the upper layer network stack starts to take up more space in
> the future?
According to the comments in skbuff.h below, it is the responsibility of the owning
layer to make a SKB clone, if it wants to keep the data across layers. So, every layer
can still use all of the 48 bytes.
/*
* This is the control buffer. It is free to use for every
* layer. Please put your private variables there. If you
* want to keep them across layers you have to do a skb_clone()
* first. This is owned by whoever has the skb queued ATM.
*/
char cb[48] __aligned(8);
> Now hv_netvsc assumes it can use all of the 48-bytes, though it uses only
> 20 bytes, but just in case the struct hv_netvsc_packet grows to >32 bytes in the
> future, should we change the BUILD_BUG_ON() in netvsc_start_xmit() to
> BUILD_BUG_ON(sizeof(struct hv_netvsc_packet) > SKB_SGO_CB_OFFSET); ?
Based on the explanation above, the existing hv_netvsc code is correct.
Thanks,
- Haiyang
^ permalink raw reply [flat|nested] 4+ messages in thread
* RE: Is it safe for a NIC driver to use all the 48 bytes of skb->cb?
2020-02-15 15:20 ` Haiyang Zhang
@ 2020-02-15 18:04 ` Dexuan Cui
2020-02-17 22:31 ` Haiyang Zhang
0 siblings, 1 reply; 4+ messages in thread
From: Dexuan Cui @ 2020-02-15 18:04 UTC (permalink / raw)
To: Haiyang Zhang, Stephen Hemminger, David S. Miller, netdev,
KY Srinivasan, linux-kernel
Cc: linux-hyperv
> From: Haiyang Zhang <haiyangz@microsoft.com>
> Sent: Saturday, February 15, 2020 7:20 AM
> To: Dexuan Cui <decui@microsoft.com>; Stephen Hemminger
>
> According to the comments in skbuff.h below, it is the responsibility of the
> owning layer to make a SKB clone, if it wants to keep the data across layers.
> So, every layer can still use all of the 48 bytes.
>
> /*
> * This is the control buffer. It is free to use for every
> * layer. Please put your private variables there. If you
> * want to keep them across layers you have to do a skb_clone()
> * first. This is owned by whoever has the skb queued ATM.
> */
> char cb[48] __aligned(8);
>
> > Now hv_netvsc assumes it can use all of the 48-bytes, though it uses only
> > 20 bytes, but just in case the struct hv_netvsc_packet grows to >32 bytes in
> the
> > future, should we change the BUILD_BUG_ON() in netvsc_start_xmit() to
> > BUILD_BUG_ON(sizeof(struct hv_netvsc_packet) > SKB_SGO_CB_OFFSET); ?
>
> Based on the explanation above, the existing hv_netvsc code is correct.
>
> Thanks,
> - Haiyang
Got it. So if the upper layer saves something in the cb, it must do a skb_clone()
and pass the new skb to hv_netvsc. hv_netvsc is the lowest layer in the network
stack, so it can use all the 48 bytes without calling skb_clone().
BTW, now I happen to have a different question: in netvsc_probe() we have
net->needed_headroom = RNDIS_AND_PPI_SIZE;
I think this means when the network stack (ARP, IP, ICMP, TCP, UDP,etc) passes a
skb to hv_netvsc, the skb's headroom is increased by an extra size of
net->needed_headroom, right? Then in netvsc_xmit(), why do we still need to
call skb_cow_head(skb, RNDIS_AND_PPI_SIZE)? -- this looks unnecessary to me?
PS, what does the "cow" here mean? Copy On Write? It looks skb_cow_head()
just copies the data (if necessary) and it has nothing to do with the
write-protection in the MMU code.
Thanks,
Dexuan
^ permalink raw reply [flat|nested] 4+ messages in thread
* RE: Is it safe for a NIC driver to use all the 48 bytes of skb->cb?
2020-02-15 18:04 ` Dexuan Cui
@ 2020-02-17 22:31 ` Haiyang Zhang
0 siblings, 0 replies; 4+ messages in thread
From: Haiyang Zhang @ 2020-02-17 22:31 UTC (permalink / raw)
To: Dexuan Cui, Stephen Hemminger, David S. Miller, netdev,
KY Srinivasan, linux-kernel
Cc: linux-hyperv
> -----Original Message-----
> From: Dexuan Cui <decui@microsoft.com>
> Sent: Saturday, February 15, 2020 1:04 PM
> To: Haiyang Zhang <haiyangz@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; David S. Miller <davem@davemloft.net>;
> netdev@vger.kernel.org; KY Srinivasan <kys@microsoft.com>; linux-
> kernel@vger.kernel.org
> Cc: linux-hyperv@vger.kernel.org
> Subject: RE: Is it safe for a NIC driver to use all the 48 bytes of skb->cb?
>
> > From: Haiyang Zhang <haiyangz@microsoft.com>
> > Sent: Saturday, February 15, 2020 7:20 AM
> > To: Dexuan Cui <decui@microsoft.com>; Stephen Hemminger
> >
> > According to the comments in skbuff.h below, it is the responsibility of the
> > owning layer to make a SKB clone, if it wants to keep the data across layers.
> > So, every layer can still use all of the 48 bytes.
> >
> > /*
> > * This is the control buffer. It is free to use for every
> > * layer. Please put your private variables there. If you
> > * want to keep them across layers you have to do a skb_clone()
> > * first. This is owned by whoever has the skb queued ATM.
> > */
> > char cb[48] __aligned(8);
> >
> > > Now hv_netvsc assumes it can use all of the 48-bytes, though it uses only
> > > 20 bytes, but just in case the struct hv_netvsc_packet grows to >32 bytes in
> > the
> > > future, should we change the BUILD_BUG_ON() in netvsc_start_xmit() to
> > > BUILD_BUG_ON(sizeof(struct hv_netvsc_packet) > SKB_SGO_CB_OFFSET); ?
> >
> > Based on the explanation above, the existing hv_netvsc code is correct.
> >
> > Thanks,
> > - Haiyang
>
> Got it. So if the upper layer saves something in the cb, it must do a skb_clone()
> and pass the new skb to hv_netvsc. hv_netvsc is the lowest layer in the network
> stack, so it can use all the 48 bytes without calling skb_clone().
>
> BTW, now I happen to have a different question: in netvsc_probe() we have
> net->needed_headroom = RNDIS_AND_PPI_SIZE;
> I think this means when the network stack (ARP, IP, ICMP, TCP, UDP,etc) passes
> a
> skb to hv_netvsc, the skb's headroom is increased by an extra size of
> net->needed_headroom, right? Then in netvsc_xmit(), why do we still need to
> call skb_cow_head(skb, RNDIS_AND_PPI_SIZE)? -- this looks unnecessary to me?
skb_cow_head() only expands the headroom if it is not enough, in case some
upper layer path didn't reserve enough.
> PS, what does the "cow" here mean? Copy On Write? It looks skb_cow_head()
> just copies the data (if necessary) and it has nothing to do with the
> write-protection in the MMU code.
Unrelated to MMU. It just copies some data to make room for writing.
Thanks,
- Haiyang
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2020-02-17 22:31 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-15 5:23 Is it safe for a NIC driver to use all the 48 bytes of skb->cb? Dexuan Cui
2020-02-15 15:20 ` Haiyang Zhang
2020-02-15 18:04 ` Dexuan Cui
2020-02-17 22:31 ` Haiyang Zhang
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.