netdev.vger.kernel.org archive mirror
* [PATCH bpf v2] xsk: fix memory leak and packet loss in Tx skb path
@ 2020-07-10  6:45 Magnus Karlsson
  2020-07-10 16:34 ` Jonathan Lemon
  2020-07-10 23:26 ` Daniel Borkmann
  0 siblings, 2 replies; 7+ messages in thread
From: Magnus Karlsson @ 2020-07-10  6:45 UTC
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, jonathan.lemon; +Cc: A.Zema

In the skb Tx path, a packet is transmitted with
dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
routines, it returns NETDEV_TX_BUSY, signifying that the packet could
not be sent now and should be retried later. Unfortunately, the
xsk transmit code discarded the packet, failed to free the skb, and
returned EBUSY to the application. Fix this memory leak and
unnecessary packet loss by not discarding the packet from the Tx ring,
freeing the allocated skb, and returning EAGAIN. As EAGAIN is returned
to the application, it can then retry the send operation, and the packet
will eventually be sent, as we will likely no longer be in the
QUEUE_STATE_FROZEN state. So EAGAIN tells the application that the
packet was not discarded from the Tx ring and that it needs to call
send() again. EBUSY, on the other hand, signifies that the packet was
not sent and was discarded from the Tx ring. The application needs to
put the packet on the Tx ring again if it wants it to be sent.

Fixes: 35fcde7f8deb ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
---
The v1 of this patch was called "xsk: do not discard packet when
QUEUE_STATE_FROZEN".
---
 net/xdp/xsk.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 3700266..5304250 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -376,13 +376,22 @@ static int xsk_generic_xmit(struct sock *sk)
 		skb->destructor = xsk_destruct_skb;
 
 		err = dev_direct_xmit(skb, xs->queue_id);
-		xskq_cons_release(xs->tx);
 		/* Ignore NET_XMIT_CN as packet might have been sent */
-		if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
+		if (err == NET_XMIT_DROP) {
 			/* SKB completed but not sent */
+			xskq_cons_release(xs->tx);
 			err = -EBUSY;
 			goto out;
+		} else if (err == NETDEV_TX_BUSY) {
+			/* QUEUE_STATE_FROZEN, tell application to
+			 * retry sending the packet
+			 */
+			skb->destructor = NULL;
+			kfree_skb(skb);
+			err = -EAGAIN;
+			goto out;
 		}
+		xskq_cons_release(xs->tx);
 
 		sent_frame = true;
 	}
-- 
2.7.4
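
The error-code contract described in the commit message can be sketched
from the application side. This is only an illustration, assuming the
EAGAIN/EBUSY semantics proposed in this patch (which the discussion in
this thread shows were not yet settled). The names tx_with_retry,
kick_fn, xsk_kick, fake_kick, fake_state and the tx_result enum are
hypothetical; the only real API used is the zero-length sendto() call
commonly used to kick an AF_XDP socket. A successful kick does not by
itself prove that a given packet left the wire.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical abstraction over "kick the Tx path"; returns <0 and sets errno on failure. */
typedef int (*kick_fn)(void *ctx);

enum tx_result {
	TX_KICK_OK,	/* kick succeeded */
	TX_REQUEUE,	/* EBUSY: descriptor was dropped from the Tx ring, re-enqueue it */
	TX_GAVE_UP,	/* still EAGAIN after max_retries retries */
	TX_ERROR	/* any other errno */
};

/* Kick for a real AF_XDP socket: ctx points at the socket fd. */
int xsk_kick(void *ctx)
{
	int fd = *(const int *)ctx;

	return (int)sendto(fd, NULL, 0, MSG_DONTWAIT, NULL, 0);
}

/* Retry while the kernel reports EAGAIN, i.e. the packet is assumed
 * to still sit on the Tx ring and only the kick must be repeated.
 */
enum tx_result tx_with_retry(kick_fn kick, void *ctx, int max_retries)
{
	assert(kick != NULL);

	for (int attempt = 0; attempt <= max_retries; attempt++) {
		if (kick(ctx) >= 0)
			return TX_KICK_OK;
		if (errno == EAGAIN)
			continue;	/* nothing was discarded, just call again */
		if (errno == EBUSY)
			return TX_REQUEUE;
		return TX_ERROR;
	}
	return TX_GAVE_UP;
}

/* Stand-in kick used only to exercise the loop without a real socket. */
struct fake_state {
	int eagain_left;	/* number of EAGAIN failures to report first */
	int final_errno;	/* errno to report afterwards, 0 for success */
	int calls;		/* how many times the kick was invoked */
};

int fake_kick(void *ctx)
{
	struct fake_state *s = ctx;

	s->calls++;
	if (s->eagain_left > 0) {
		s->eagain_left--;
		errno = EAGAIN;
		return -1;
	}
	if (s->final_errno) {
		errno = s->final_errno;
		return -1;
	}
	return 0;
}
```

With a real socket the call would look like tx_with_retry(xsk_kick, &fd, 8),
where the retry bound is an arbitrary choice. On TX_REQUEUE the
application would have to place the descriptor on the Tx ring again
before kicking.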



* Re: [PATCH bpf v2] xsk: fix memory leak and packet loss in Tx skb path
  2020-07-10  6:45 [PATCH bpf v2] xsk: fix memory leak and packet loss in Tx skb path Magnus Karlsson
@ 2020-07-10 16:34 ` Jonathan Lemon
  2020-07-10 23:26 ` Daniel Borkmann
  1 sibling, 0 replies; 7+ messages in thread
From: Jonathan Lemon @ 2020-07-10 16:34 UTC
  To: Magnus Karlsson; +Cc: bjorn.topel, ast, daniel, netdev, A.Zema

On Fri, Jul 10, 2020 at 08:45:54AM +0200, Magnus Karlsson wrote:
> In the skb Tx path, a packet is transmitted with
> dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> routines, it returns NETDEV_TX_BUSY, signifying that the packet could
> not be sent now and should be retried later. Unfortunately, the
> xsk transmit code discarded the packet, failed to free the skb, and
> returned EBUSY to the application. Fix this memory leak and
> unnecessary packet loss by not discarding the packet from the Tx ring,
> freeing the allocated skb, and returning EAGAIN. As EAGAIN is returned
> to the application, it can then retry the send operation, and the packet
> will eventually be sent, as we will likely no longer be in the
> QUEUE_STATE_FROZEN state. So EAGAIN tells the application that the
> packet was not discarded from the Tx ring and that it needs to call
> send() again. EBUSY, on the other hand, signifies that the packet was
> not sent and was discarded from the Tx ring. The application needs to
> put the packet on the Tx ring again if it wants it to be sent.
> 
> Fixes: 35fcde7f8deb ("xsk: support for Tx")
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>

Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

-- 
Jonathan


* Re: [PATCH bpf v2] xsk: fix memory leak and packet loss in Tx skb path
  2020-07-10  6:45 [PATCH bpf v2] xsk: fix memory leak and packet loss in Tx skb path Magnus Karlsson
  2020-07-10 16:34 ` Jonathan Lemon
@ 2020-07-10 23:26 ` Daniel Borkmann
  2020-07-11  7:39   ` Magnus Karlsson
  1 sibling, 1 reply; 7+ messages in thread
From: Daniel Borkmann @ 2020-07-10 23:26 UTC
  To: Magnus Karlsson, bjorn.topel, ast, netdev, jonathan.lemon; +Cc: A.Zema

Hi Magnus,

On 7/10/20 8:45 AM, Magnus Karlsson wrote:
> In the skb Tx path, a packet is transmitted with
> dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> routines, it returns NETDEV_TX_BUSY, signifying that the packet could
> not be sent now and should be retried later. Unfortunately, the
> xsk transmit code discarded the packet, failed to free the skb, and
> returned EBUSY to the application. Fix this memory leak and
> unnecessary packet loss by not discarding the packet from the Tx ring,
> freeing the allocated skb, and returning EAGAIN. As EAGAIN is returned
> to the application, it can then retry the send operation, and the packet
> will eventually be sent, as we will likely no longer be in the
> QUEUE_STATE_FROZEN state. So EAGAIN tells the application that the
> packet was not discarded from the Tx ring and that it needs to call
> send() again. EBUSY, on the other hand, signifies that the packet was
> not sent and was discarded from the Tx ring. The application needs to
> put the packet on the Tx ring again if it wants it to be sent.
> 
> Fixes: 35fcde7f8deb ("xsk: support for Tx")
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> ---
> The v1 of this patch was called "xsk: do not discard packet when
> QUEUE_STATE_FROZEN".
> ---
>   net/xdp/xsk.c | 13 +++++++++++--
>   1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 3700266..5304250 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -376,13 +376,22 @@ static int xsk_generic_xmit(struct sock *sk)
>   		skb->destructor = xsk_destruct_skb;
>   
>   		err = dev_direct_xmit(skb, xs->queue_id);
> -		xskq_cons_release(xs->tx);
>   		/* Ignore NET_XMIT_CN as packet might have been sent */
> -		if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
> +		if (err == NET_XMIT_DROP) {
>   			/* SKB completed but not sent */
> +			xskq_cons_release(xs->tx);
>   			err = -EBUSY;
>   			goto out;
> +		} else if  (err == NETDEV_TX_BUSY) {
> +			/* QUEUE_STATE_FROZEN, tell application to
> +			 * retry sending the packet
> +			 */
> +			skb->destructor = NULL;
> +			kfree_skb(skb);
> +			err = -EAGAIN;
> +			goto out;

Hmm, I'm probably missing something or I should blame my current lack of coffee,
but I'll ask anyway.. What is the relation here to the kfree_skb{,_list}() in
dev_direct_xmit() when we have NETDEV_TX_BUSY condition? Wouldn't the patch above
double-free with NETDEV_TX_BUSY?

>   		}
> +		xskq_cons_release(xs->tx);
>   
>   		sent_frame = true;
>   	}
> 

Thanks,
Daniel


* Re: [PATCH bpf v2] xsk: fix memory leak and packet loss in Tx skb path
  2020-07-10 23:26 ` Daniel Borkmann
@ 2020-07-11  7:39   ` Magnus Karlsson
  2020-07-13 16:53     ` Jonathan Lemon
  2020-07-15 18:36     ` Daniel Borkmann
  0 siblings, 2 replies; 7+ messages in thread
From: Magnus Karlsson @ 2020-07-11  7:39 UTC
  To: Daniel Borkmann
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Network Development, Jonathan Lemon, A.Zema

On Sat, Jul 11, 2020 at 1:28 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Hi Magnus,
>
> On 7/10/20 8:45 AM, Magnus Karlsson wrote:
> > In the skb Tx path, a packet is transmitted with
> > dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> > routines, it returns NETDEV_TX_BUSY, signifying that the packet could
> > not be sent now and should be retried later. Unfortunately, the
> > xsk transmit code discarded the packet, failed to free the skb, and
> > returned EBUSY to the application. Fix this memory leak and
> > unnecessary packet loss by not discarding the packet from the Tx ring,
> > freeing the allocated skb, and returning EAGAIN. As EAGAIN is returned
> > to the application, it can then retry the send operation, and the packet
> > will eventually be sent, as we will likely no longer be in the
> > QUEUE_STATE_FROZEN state. So EAGAIN tells the application that the
> > packet was not discarded from the Tx ring and that it needs to call
> > send() again. EBUSY, on the other hand, signifies that the packet was
> > not sent and was discarded from the Tx ring. The application needs to
> > put the packet on the Tx ring again if it wants it to be sent.
> >
> > Fixes: 35fcde7f8deb ("xsk: support for Tx")
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> > Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> > ---
> > The v1 of this patch was called "xsk: do not discard packet when
> > QUEUE_STATE_FROZEN".
> > ---
> >   net/xdp/xsk.c | 13 +++++++++++--
> >   1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > index 3700266..5304250 100644
> > --- a/net/xdp/xsk.c
> > +++ b/net/xdp/xsk.c
> > @@ -376,13 +376,22 @@ static int xsk_generic_xmit(struct sock *sk)
> >               skb->destructor = xsk_destruct_skb;
> >
> >               err = dev_direct_xmit(skb, xs->queue_id);
> > -             xskq_cons_release(xs->tx);
> >               /* Ignore NET_XMIT_CN as packet might have been sent */
> > -             if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
> > +             if (err == NET_XMIT_DROP) {
> >                       /* SKB completed but not sent */
> > +                     xskq_cons_release(xs->tx);
> >                       err = -EBUSY;
> >                       goto out;
> > +             } else if  (err == NETDEV_TX_BUSY) {
> > +                     /* QUEUE_STATE_FROZEN, tell application to
> > +                      * retry sending the packet
> > +                      */
> > +                     skb->destructor = NULL;
> > +                     kfree_skb(skb);
> > +                     err = -EAGAIN;
> > +                     goto out;
>
> Hmm, I'm probably missing something or I should blame my current lack of coffee,
> but I'll ask anyway.. What is the relation here to the kfree_skb{,_list}() in
> dev_direct_xmit() when we have NETDEV_TX_BUSY condition? Wouldn't the patch above
> double-free with NETDEV_TX_BUSY?

I think you are correct even without coffee :-). I misinterpreted the
following piece of code in dev_direct_xmit():

if (!dev_xmit_complete(ret))
     kfree_skb(skb);

If the skb was NOT consumed by the transmit, then it goes and frees
the skb. NETDEV_TX_BUSY as a return value will make
dev_xmit_complete() return false which triggers the freeing of the
skb. So if I now understand dev_direct_xmit() correctly, it will
always consume the skb, even when NETDEV_TX_BUSY is returned. And this
is what I would like to avoid. If the skb is freed, the destructor is
triggered and it will complete the packet to user-space, which is the
same thing as dropping it, which is what I want to avoid in the first
place since it is completely unnecessary.
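
The return-value check discussed here can be reproduced in a small
userspace model. This is a sketch, not kernel code: model_xmit_complete
and incomplete_branch_frees_skb are hypothetical names, the comparison
mirrors the "rc < NET_XMIT_MASK" test that dev_xmit_complete() is
understood to perform, and apart from NETDEV_TX_BUSY (0x10) and
NET_XMIT_MASK (0x0f), which are quoted elsewhere in this thread, the
constant values are recalled from include/linux/netdevice.h and should
be checked against the tree in use.

```c
#include <assert.h>
#include <stdbool.h>

/* Values assumed from include/linux/netdevice.h (see note above). */
#define NET_XMIT_SUCCESS	0x00
#define NET_XMIT_DROP		0x01	/* skb dropped */
#define NET_XMIT_CN		0x02	/* congestion notification */
#define NET_XMIT_MASK		0x0f	/* qdisc flags in net/sch_generic.h */
#define NETDEV_TX_OK		0x00	/* driver took care of the packet */
#define NETDEV_TX_BUSY		0x10	/* driver tx path was busy */

/* Model of dev_xmit_complete(): true means the skb was consumed
 * (sent, dropped with an error, or queued elsewhere).
 */
bool model_xmit_complete(int rc)
{
	return rc < NET_XMIT_MASK;
}

/* Model of the branch
 *	if (!dev_xmit_complete(ret))
 *		kfree_skb(skb);
 * true means this branch frees the skb on behalf of the caller.
 */
bool incomplete_branch_frees_skb(int rc)
{
	return !model_xmit_complete(rc);
}

/* Small self-check of the cases relevant to this thread. */
void model_self_check(void)
{
	/* 0x10 < 0x0f is false, so the skb is freed in the callee. */
	assert(incomplete_branch_frees_skb(NETDEV_TX_BUSY));
	/* A successful transmit is "complete"; this branch does not fire. */
	assert(!incomplete_branch_frees_skb(NETDEV_TX_OK));
}
```

Under these assumptions the caller never owns the skb after a
NETDEV_TX_BUSY return, which is why an additional kfree_skb() in the
caller would be a double free.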

So what would be the best way to solve this? Prefer to share the code
with AF_PACKET if possible. Introduce a boolean function parameter to
indicate if it should be freed in this case? Other ideas? Here are the
users of dev_direct_xmit():

drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c, lines 349, 939, 1033, 1303, 1665
include/linux/netdevice.h, line 2719
net/core/dev.c, lines 4095, 4132
net/packet/af_packet.c, line 240
net/xdp/xsk.c, line 425

Thanks: Magnus

> >               }
> > +             xskq_cons_release(xs->tx);
> >
> >               sent_frame = true;
> >       }
> >
>
> Thanks,
> Daniel


* Re: [PATCH bpf v2] xsk: fix memory leak and packet loss in Tx skb path
  2020-07-11  7:39   ` Magnus Karlsson
@ 2020-07-13 16:53     ` Jonathan Lemon
  2020-07-15 18:36     ` Daniel Borkmann
  1 sibling, 0 replies; 7+ messages in thread
From: Jonathan Lemon @ 2020-07-13 16:53 UTC
  To: Magnus Karlsson
  Cc: Daniel Borkmann, Magnus Karlsson, Björn Töpel,
	Alexei Starovoitov, Network Development, A.Zema

On Sat, Jul 11, 2020 at 09:39:58AM +0200, Magnus Karlsson wrote:
> On Sat, Jul 11, 2020 at 1:28 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >
> > Hi Magnus,
> >
> > On 7/10/20 8:45 AM, Magnus Karlsson wrote:
> > > In the skb Tx path, a packet is transmitted with
> > > dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> > > routines, it returns NETDEV_TX_BUSY, signifying that the packet could
> > > not be sent now and should be retried later. Unfortunately, the
> > > xsk transmit code discarded the packet, failed to free the skb, and
> > > returned EBUSY to the application. Fix this memory leak and
> > > unnecessary packet loss by not discarding the packet from the Tx ring,
> > > freeing the allocated skb, and returning EAGAIN. As EAGAIN is returned
> > > to the application, it can then retry the send operation, and the packet
> > > will eventually be sent, as we will likely no longer be in the
> > > QUEUE_STATE_FROZEN state. So EAGAIN tells the application that the
> > > packet was not discarded from the Tx ring and that it needs to call
> > > send() again. EBUSY, on the other hand, signifies that the packet was
> > > not sent and was discarded from the Tx ring. The application needs to
> > > put the packet on the Tx ring again if it wants it to be sent.
> > >
> > > Fixes: 35fcde7f8deb ("xsk: support for Tx")
> > > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > > Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> > > Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> > > ---
> > > The v1 of this patch was called "xsk: do not discard packet when
> > > QUEUE_STATE_FROZEN".
> > > ---
> > >   net/xdp/xsk.c | 13 +++++++++++--
> > >   1 file changed, 11 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > > index 3700266..5304250 100644
> > > --- a/net/xdp/xsk.c
> > > +++ b/net/xdp/xsk.c
> > > @@ -376,13 +376,22 @@ static int xsk_generic_xmit(struct sock *sk)
> > >               skb->destructor = xsk_destruct_skb;
> > >
> > >               err = dev_direct_xmit(skb, xs->queue_id);
> > > -             xskq_cons_release(xs->tx);
> > >               /* Ignore NET_XMIT_CN as packet might have been sent */
> > > -             if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
> > > +             if (err == NET_XMIT_DROP) {
> > >                       /* SKB completed but not sent */
> > > +                     xskq_cons_release(xs->tx);
> > >                       err = -EBUSY;
> > >                       goto out;
> > > +             } else if  (err == NETDEV_TX_BUSY) {
> > > +                     /* QUEUE_STATE_FROZEN, tell application to
> > > +                      * retry sending the packet
> > > +                      */
> > > +                     skb->destructor = NULL;
> > > +                     kfree_skb(skb);
> > > +                     err = -EAGAIN;
> > > +                     goto out;
> >
> > Hmm, I'm probably missing something or I should blame my current lack of coffee,
> > but I'll ask anyway.. What is the relation here to the kfree_skb{,_list}() in
> > dev_direct_xmit() when we have NETDEV_TX_BUSY condition? Wouldn't the patch above
> > double-free with NETDEV_TX_BUSY?
> 
> I think you are correct even without coffee :-). I misinterpreted the
> following piece of code in dev_direct_xmit():
> 
> if (!dev_xmit_complete(ret))
>      kfree_skb(skb);

I did look carefully at this, but apparently forgot about the "!" part of
the conditional while looking at dev_xmit_complete() internals:

    return (NETDEV_TX_BUSY < NET_XMIT_MASK)
    return (0x10 < 0x0f)
    return false;
-- 
Jonathan


* Re: [PATCH bpf v2] xsk: fix memory leak and packet loss in Tx skb path
  2020-07-11  7:39   ` Magnus Karlsson
  2020-07-13 16:53     ` Jonathan Lemon
@ 2020-07-15 18:36     ` Daniel Borkmann
  2020-07-16  4:43       ` Magnus Karlsson
  1 sibling, 1 reply; 7+ messages in thread
From: Daniel Borkmann @ 2020-07-15 18:36 UTC
  To: Magnus Karlsson
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Network Development, Jonathan Lemon, A.Zema

On 7/11/20 9:39 AM, Magnus Karlsson wrote:
> On Sat, Jul 11, 2020 at 1:28 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 7/10/20 8:45 AM, Magnus Karlsson wrote:
>>> In the skb Tx path, a packet is transmitted with
>>> dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
>>> routines, it returns NETDEV_TX_BUSY, signifying that the packet could
>>> not be sent now and should be retried later. Unfortunately, the
>>> xsk transmit code discarded the packet, failed to free the skb, and
>>> returned EBUSY to the application. Fix this memory leak and
>>> unnecessary packet loss by not discarding the packet from the Tx ring,
>>> freeing the allocated skb, and returning EAGAIN. As EAGAIN is returned
>>> to the application, it can then retry the send operation, and the packet
>>> will eventually be sent, as we will likely no longer be in the
>>> QUEUE_STATE_FROZEN state. So EAGAIN tells the application that the
>>> packet was not discarded from the Tx ring and that it needs to call
>>> send() again. EBUSY, on the other hand, signifies that the packet was
>>> not sent and was discarded from the Tx ring. The application needs to
>>> put the packet on the Tx ring again if it wants it to be sent.
>>>
>>> Fixes: 35fcde7f8deb ("xsk: support for Tx")
>>> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
>>> Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
>>> Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
>>> ---
>>> The v1 of this patch was called "xsk: do not discard packet when
>>> QUEUE_STATE_FROZEN".
>>> ---
>>>    net/xdp/xsk.c | 13 +++++++++++--
>>>    1 file changed, 11 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
>>> index 3700266..5304250 100644
>>> --- a/net/xdp/xsk.c
>>> +++ b/net/xdp/xsk.c
>>> @@ -376,13 +376,22 @@ static int xsk_generic_xmit(struct sock *sk)
>>>                skb->destructor = xsk_destruct_skb;
>>>
>>>                err = dev_direct_xmit(skb, xs->queue_id);
>>> -             xskq_cons_release(xs->tx);
>>>                /* Ignore NET_XMIT_CN as packet might have been sent */
>>> -             if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
>>> +             if (err == NET_XMIT_DROP) {
>>>                        /* SKB completed but not sent */
>>> +                     xskq_cons_release(xs->tx);
>>>                        err = -EBUSY;
>>>                        goto out;
>>> +             } else if  (err == NETDEV_TX_BUSY) {
>>> +                     /* QUEUE_STATE_FROZEN, tell application to
>>> +                      * retry sending the packet
>>> +                      */
>>> +                     skb->destructor = NULL;
>>> +                     kfree_skb(skb);
>>> +                     err = -EAGAIN;
>>> +                     goto out;
>>
>> Hmm, I'm probably missing something or I should blame my current lack of coffee,
>> but I'll ask anyway.. What is the relation here to the kfree_skb{,_list}() in
>> dev_direct_xmit() when we have NETDEV_TX_BUSY condition? Wouldn't the patch above
>> double-free with NETDEV_TX_BUSY?
> 
> I think you are correct even without coffee :-). I misinterpreted the
> following piece of code in dev_direct_xmit():
> 
> if (!dev_xmit_complete(ret))
>       kfree_skb(skb);
> 
> If the skb was NOT consumed by the transmit, then it goes and frees
> the skb. NETDEV_TX_BUSY as a return value will make
> dev_xmit_complete() return false which triggers the freeing of the
> skb. So if I now understand dev_direct_xmit() correctly, it will
> always consume the skb, even when NETDEV_TX_BUSY is returned. And this
> is what I would like to avoid. If the skb is freed, the destructor is
> triggered and it will complete the packet to user-space, which is the
> same thing as dropping it, which is what I want to avoid in the first
> place since it is completely unnecessary.
> 
> So what would be the best way to solve this? Prefer to share the code
> with AF_PACKET if possible. Introduce a boolean function parameter to
> indicate if it should be freed in this case? Other ideas? Here are the
> users of dev_direct_xmit():

Another option could be looking at pktgen which mangles skb->users to keep
the skb alive.
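
The idea of holding an extra reference so that a free inside the
transmit function does not destroy the skb can be illustrated with a
toy userspace model. This is a sketch of the general refcount technique
only; it is not pktgen's code and not a proposed fix. Every name below
(model_skb, model_skb_get, model_kfree_skb, model_direct_xmit_busy,
model_xmit_keep_on_busy, MODEL_TX_BUSY) is hypothetical, and a plain int
stands in for the kernel's reference counter.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_TX_BUSY 0x10	/* stand-in for NETDEV_TX_BUSY */

/* Toy skb: a reference count plus a flag set when the destructor runs. */
struct model_skb {
	int users;
	bool completed;	/* destructor ran, i.e. packet completed to user space */
};

/* Take one additional reference. */
void model_skb_get(struct model_skb *skb)
{
	skb->users++;
}

/* Drop one reference; the destructor runs only when the last one goes.
 * Returns true if this call released the final reference.
 */
bool model_kfree_skb(struct model_skb *skb)
{
	assert(skb->users > 0);
	if (--skb->users > 0)
		return false;
	skb->completed = true;
	return true;
}

/* Model of a transmit function that hits the busy case and, as
 * described in this thread, frees the skb itself before returning.
 */
int model_direct_xmit_busy(struct model_skb *skb)
{
	model_kfree_skb(skb);
	return MODEL_TX_BUSY;
}

/* Caller takes an extra reference before transmitting. On a busy
 * return the callee's free only drops that extra reference, so the
 * destructor has not run and the packet can be retried later.
 */
bool model_xmit_keep_on_busy(struct model_skb *skb)
{
	int rc;

	model_skb_get(skb);
	rc = model_direct_xmit_busy(skb);
	return rc == MODEL_TX_BUSY && !skb->completed;
}
```

In this model a caller that does not take the extra reference sees the
destructor run on the busy path, while a caller that does keeps a live
skb with users back at 1. On a successful transmit the caller would
still have to drop its extra reference, which the sketch leaves out.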

Thanks,
Daniel


* Re: [PATCH bpf v2] xsk: fix memory leak and packet loss in Tx skb path
  2020-07-15 18:36     ` Daniel Borkmann
@ 2020-07-16  4:43       ` Magnus Karlsson
  0 siblings, 0 replies; 7+ messages in thread
From: Magnus Karlsson @ 2020-07-16  4:43 UTC
  To: Daniel Borkmann
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Network Development, Jonathan Lemon, A.Zema

On Wed, Jul 15, 2020 at 8:36 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 7/11/20 9:39 AM, Magnus Karlsson wrote:
> > On Sat, Jul 11, 2020 at 1:28 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >> On 7/10/20 8:45 AM, Magnus Karlsson wrote:
> >>> In the skb Tx path, a packet is transmitted with
> >>> dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> >>> routines, it returns NETDEV_TX_BUSY, signifying that the packet could
> >>> not be sent now and should be retried later. Unfortunately, the
> >>> xsk transmit code discarded the packet, failed to free the skb, and
> >>> returned EBUSY to the application. Fix this memory leak and
> >>> unnecessary packet loss by not discarding the packet from the Tx ring,
> >>> freeing the allocated skb, and returning EAGAIN. As EAGAIN is returned
> >>> to the application, it can then retry the send operation, and the packet
> >>> will eventually be sent, as we will likely no longer be in the
> >>> QUEUE_STATE_FROZEN state. So EAGAIN tells the application that the
> >>> packet was not discarded from the Tx ring and that it needs to call
> >>> send() again. EBUSY, on the other hand, signifies that the packet was
> >>> not sent and was discarded from the Tx ring. The application needs to
> >>> put the packet on the Tx ring again if it wants it to be sent.
> >>>
> >>> Fixes: 35fcde7f8deb ("xsk: support for Tx")
> >>> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> >>> Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> >>> Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> >>> ---
> >>> The v1 of this patch was called "xsk: do not discard packet when
> >>> QUEUE_STATE_FROZEN".
> >>> ---
> >>>    net/xdp/xsk.c | 13 +++++++++++--
> >>>    1 file changed, 11 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> >>> index 3700266..5304250 100644
> >>> --- a/net/xdp/xsk.c
> >>> +++ b/net/xdp/xsk.c
> >>> @@ -376,13 +376,22 @@ static int xsk_generic_xmit(struct sock *sk)
> >>>                skb->destructor = xsk_destruct_skb;
> >>>
> >>>                err = dev_direct_xmit(skb, xs->queue_id);
> >>> -             xskq_cons_release(xs->tx);
> >>>                /* Ignore NET_XMIT_CN as packet might have been sent */
> >>> -             if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
> >>> +             if (err == NET_XMIT_DROP) {
> >>>                        /* SKB completed but not sent */
> >>> +                     xskq_cons_release(xs->tx);
> >>>                        err = -EBUSY;
> >>>                        goto out;
> >>> +             } else if  (err == NETDEV_TX_BUSY) {
> >>> +                     /* QUEUE_STATE_FROZEN, tell application to
> >>> +                      * retry sending the packet
> >>> +                      */
> >>> +                     skb->destructor = NULL;
> >>> +                     kfree_skb(skb);
> >>> +                     err = -EAGAIN;
> >>> +                     goto out;
> >>
> >> Hmm, I'm probably missing something or I should blame my current lack of coffee,
> >> but I'll ask anyway.. What is the relation here to the kfree_skb{,_list}() in
> >> dev_direct_xmit() when we have NETDEV_TX_BUSY condition? Wouldn't the patch above
> >> double-free with NETDEV_TX_BUSY?
> >
> > I think you are correct even without coffee :-). I misinterpreted the
> > following piece of code in dev_direct_xmit():
> >
> > if (!dev_xmit_complete(ret))
> >       kfree_skb(skb);
> >
> > If the skb was NOT consumed by the transmit, then it goes and frees
> > the skb. NETDEV_TX_BUSY as a return value will make
> > dev_xmit_complete() return false which triggers the freeing of the
> > skb. So if I now understand dev_direct_xmit() correctly, it will
> > always consume the skb, even when NETDEV_TX_BUSY is returned. And this
> > is what I would like to avoid. If the skb is freed, the destructor is
> > triggered and it will complete the packet to user-space, which is the
> > same thing as dropping it, which is what I want to avoid in the first
> > place since it is completely unnecessary.
> >
> > So what would be the best way to solve this? Prefer to share the code
> > with AF_PACKET if possible. Introduce a boolean function parameter to
> > indicate if it should be freed in this case? Other ideas? Here are the
> > users of dev_direct_xmit():
>
> Another option could be looking at pktgen which mangles skb->users to keep
> the skb alive.

Thanks. Will take a look at that and give it a try.

/Magnus

> Thanks,
> Daniel

