netdev.vger.kernel.org archive mirror
* Re: [PATCH] xfrm: release device reference for invalid state
       [not found] <20191108082059.22515-1-stid.smth@gmail.com>
@ 2019-11-11  6:17 ` Steffen Klassert
  2019-11-11  6:38   ` Xiaodong Xu
  0 siblings, 1 reply; 3+ messages in thread
From: Steffen Klassert @ 2019-11-11  6:17 UTC (permalink / raw)
  To: Xiaodong Xu; +Cc: herbert, davem, chenborfc, netdev

Please make sure to always Cc netdev@vger.kernel.org on networking
patches.

Also, what is the difference between this patch and the one you sent
before? Please add version numbers to your patches and describe the
changes between the versions.

On Fri, Nov 08, 2019 at 12:20:59AM -0800, Xiaodong Xu wrote:
> An ESP packet could be decrypted in async mode if the input handler for
> this packet returns -EINPROGRESS in xfrm_input(). At this moment the device
> reference in skb is held. Later xfrm_input() will be invoked again to
> resume the processing.
> If the transform state is still valid it would continue to release the
> device reference and there won't be a problem; however if the transform
> state is not valid when async resumption happens, the packet will be
> dropped while the device reference is still being held.
> When the device is deleted for some reason and the reference to this
> device is not properly released, the kernel will keep logging like:
> 
> unregister_netdevice: waiting for ppp2 to become free. Usage count = 1
> 
> The issue is observed when running IPsec traffic over a PPPoE device based
> on a bridge interface. By terminating the PPPoE connection on the server
> end multiple times, the PPPoE device on the client side will eventually
> get stuck on the above warning message.
> 
> This patch will check the async mode first and continue to release device
> reference in async resumption, before it is dropped due to invalid state.
> 
> Fixes: 4ce3dbe397d7b ("xfrm: Fix xfrm_input() to verify state is valid when (encap_type < 0)")
> Signed-off-by: Xiaodong Xu <stid.smth@gmail.com>
> Reported-by: Bo Chen <chenborfc@163.com>
> Tested-by: Bo Chen <chenborfc@163.com>
> ---
>  net/xfrm/xfrm_input.c | 30 +++++++++++++++++++++---------
>  1 file changed, 21 insertions(+), 9 deletions(-)
> 
> diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
> index 9b599ed66d97..80c5af7cfec7 100644
> --- a/net/xfrm/xfrm_input.c
> +++ b/net/xfrm/xfrm_input.c
> @@ -474,6 +474,13 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
>  	if (encap_type < 0) {
>  		x = xfrm_input_state(skb);
>  
> +		/* An encap_type of -1 indicates async resumption. */
> +		if (encap_type == -1) {
> +			async = 1;
> +			seq = XFRM_SKB_CB(skb)->seq.input.low;
> +			goto resume;
> +		}
> +
>  		if (unlikely(x->km.state != XFRM_STATE_VALID)) {
>  			if (x->km.state == XFRM_STATE_ACQ)
>  				XFRM_INC_STATS(net, LINUX_MIB_XFRMACQUIREERROR);

Why not just drop the reference here if the state became invalid
after async resumption?
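
To make the leak and the alternative asked about above concrete, here is a
minimal userspace sketch in C. It is not kernel code and not the patch under
review: model_dev, model_dev_hold(), model_dev_put(), model_resume() and
model_leftover_refs() are hypothetical stand-ins for the device reference
counting around xfrm_input(), and the put_on_error flag is an assumption used
only to compare the two behaviours.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
enum model_state { MODEL_STATE_VALID, MODEL_STATE_INVALID };

struct model_dev {
	int refcnt;
};

static void model_dev_hold(struct model_dev *dev)
{
	dev->refcnt++;
}

static void model_dev_put(struct model_dev *dev)
{
	assert(dev->refcnt > 0);
	dev->refcnt--;
}

/*
 * Models the second call into the input path after async decryption
 * (encap_type == -1 in the thread). The reference was taken before the
 * input handler returned -EINPROGRESS.
 * Returns 0 if the packet continues, -1 if it is dropped.
 */
static int model_resume(struct model_dev *dev, enum model_state st,
			bool put_on_error)
{
	if (st != MODEL_STATE_VALID) {
		/* Error path: without this put the reference leaks. */
		if (put_on_error)
			model_dev_put(dev);
		return -1;
	}

	/* Normal resume path: the reference is released here. */
	model_dev_put(dev);
	return 0;
}

/* Runs one async round trip and returns the leftover reference count. */
int model_leftover_refs(enum model_state st_at_resume, bool put_on_error)
{
	struct model_dev dev = { .refcnt = 0 };

	model_dev_hold(&dev);	/* taken before the async decrypt starts */
	(void)model_resume(&dev, st_at_resume, put_on_error);
	return dev.refcnt;
}
```

In this model, an invalid state at resumption leaves one reference behind
unless the error path releases it, which corresponds to the
"Usage count = 1" message quoted in the patch description.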



* Re: [PATCH] xfrm: release device reference for invalid state
  2019-11-11  6:17 ` [PATCH] xfrm: release device reference for invalid state Steffen Klassert
@ 2019-11-11  6:38   ` Xiaodong Xu
  2019-11-11 11:32     ` Steffen Klassert
  0 siblings, 1 reply; 3+ messages in thread
From: Xiaodong Xu @ 2019-11-11  6:38 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: herbert, davem, chenborfc, netdev

Thanks for reviewing the patch, Steffen. Please check my replies below.

On Sun, Nov 10, 2019 at 10:17 PM Steffen Klassert
<steffen.klassert@secunet.com> wrote:
>
> Please make sure to always Cc netdev@vger.kernel.org on networking
> patches.
>
> Also, what is the difference between this patch and the one you sent
> before? Please add version numbers to your patches and describe the
> changes between the versions.
>
The main difference in the new version is that 'family' is no longer
assigned when the state is invalid. Assigning it requires accessing
x->outer_mode, and I'm not sure x->outer_mode is still accessible once
the state is invalid.
I'll add a version number to my patch.

> On Fri, Nov 08, 2019 at 12:20:59AM -0800, Xiaodong Xu wrote:
> > An ESP packet could be decrypted in async mode if the input handler for
> > this packet returns -EINPROGRESS in xfrm_input(). At this moment the device
> > reference in skb is held. Later xfrm_input() will be invoked again to
> > resume the processing.
> > If the transform state is still valid it would continue to release the
> > device reference and there won't be a problem; however if the transform
> > state is not valid when async resumption happens, the packet will be
> > dropped while the device reference is still being held.
> > When the device is deleted for some reason and the reference to this
> > device is not properly released, the kernel will keep logging like:
> >
> > unregister_netdevice: waiting for ppp2 to become free. Usage count = 1
> >
> > The issue is observed when running IPsec traffic over a PPPoE device based
> > on a bridge interface. By terminating the PPPoE connection on the server
> > end multiple times, the PPPoE device on the client side will eventually
> > get stuck on the above warning message.
> >
> > This patch will check the async mode first and continue to release device
> > reference in async resumption, before it is dropped due to invalid state.
> >
> > Fixes: 4ce3dbe397d7b ("xfrm: Fix xfrm_input() to verify state is valid when (encap_type < 0)")
> > Signed-off-by: Xiaodong Xu <stid.smth@gmail.com>
> > Reported-by: Bo Chen <chenborfc@163.com>
> > Tested-by: Bo Chen <chenborfc@163.com>
> > ---
> >  net/xfrm/xfrm_input.c | 30 +++++++++++++++++++++---------
> >  1 file changed, 21 insertions(+), 9 deletions(-)
> >
> > diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
> > index 9b599ed66d97..80c5af7cfec7 100644
> > --- a/net/xfrm/xfrm_input.c
> > +++ b/net/xfrm/xfrm_input.c
> > @@ -474,6 +474,13 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
> >       if (encap_type < 0) {
> >               x = xfrm_input_state(skb);
> >
> > +             /* An encap_type of -1 indicates async resumption. */
> > +             if (encap_type == -1) {
> > +                     async = 1;
> > +                     seq = XFRM_SKB_CB(skb)->seq.input.low;
> > +                     goto resume;
> > +             }
> > +
> >               if (unlikely(x->km.state != XFRM_STATE_VALID)) {
> >                       if (x->km.state == XFRM_STATE_ACQ)
> >                               XFRM_INC_STATS(net, LINUX_MIB_XFRMACQUIREERROR);
>
> Why not just drop the reference here if the state became invalid
> after async resumption?
>
I also considered releasing the device reference immediately after
checking the state in the async resumption. However, it seems more
natural to me to simply jump to the 'resume' label in the async case.
That way, if more resources are ever held across the async resumption,
we don't have to worry about releasing them before dropping the packet.
But if you prefer the other way, I am OK with that too.

Regards,
Xiaodong


* Re: [PATCH] xfrm: release device reference for invalid state
  2019-11-11  6:38   ` Xiaodong Xu
@ 2019-11-11 11:32     ` Steffen Klassert
  0 siblings, 0 replies; 3+ messages in thread
From: Steffen Klassert @ 2019-11-11 11:32 UTC (permalink / raw)
  To: Xiaodong Xu; +Cc: herbert, davem, chenborfc, netdev

On Sun, Nov 10, 2019 at 10:38:41PM -0800, Xiaodong Xu wrote:
> Thanks for reviewing the patch, Steffen. Please check my replies below.
> 
> On Sun, Nov 10, 2019 at 10:17 PM Steffen Klassert
> <steffen.klassert@secunet.com> wrote:
> > > +++ b/net/xfrm/xfrm_input.c
> > > @@ -474,6 +474,13 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
> > >       if (encap_type < 0) {
> > >               x = xfrm_input_state(skb);
> > >
> > > +             /* An encap_type of -1 indicates async resumption. */
> > > +             if (encap_type == -1) {
> > > +                     async = 1;
> > > +                     seq = XFRM_SKB_CB(skb)->seq.input.low;
> > > +                     goto resume;
> > > +             }
> > > +
> > >               if (unlikely(x->km.state != XFRM_STATE_VALID)) {
> > >                       if (x->km.state == XFRM_STATE_ACQ)
> > >                               XFRM_INC_STATS(net, LINUX_MIB_XFRMACQUIREERROR);
> >
> > > Why not just drop the reference here if the state became invalid
> > > after async resumption?
> >
> I was thinking about releasing the device reference immediately after
> checking the state in the async resumption too. However it seems more
> natural to me to simply jump to the 'resume' label in the async case.

If you add the check here, you add it to an error path. If you add the
check to the resume label, it is in the fastpath.
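
The cost difference between the two placements can be sketched with a small
userspace model in C. This is only an illustration under stated assumptions:
model_counts, model_check_in_error_path(), model_check_at_resume() and
model_extra_checks() are hypothetical names, every packet is assumed to be an
async resumption, and one counter increment stands in for one evaluation of
the extra check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters; not kernel data structures. */
struct model_counts {
	size_t extra_checks;	/* times the extra check was evaluated */
	size_t drops;		/* packets dropped for an invalid state */
};

/* Placement A: the extra check sits inside the invalid-state branch. */
static void model_check_in_error_path(bool state_valid,
				      struct model_counts *c)
{
	if (!state_valid) {
		c->extra_checks++;	/* only for packets being dropped */
		c->drops++;
	}
	/* Valid packets pay nothing extra. */
}

/* Placement B: the extra check sits at the point every resumed packet passes. */
static void model_check_at_resume(bool state_valid, struct model_counts *c)
{
	c->extra_checks++;		/* evaluated for every resumed packet */
	if (!state_valid)
		c->drops++;
}

/*
 * Feeds n_packets resumed packets through one placement; the first
 * n_invalid of them have an invalid state. Returns how many times the
 * extra check ran.
 */
size_t model_extra_checks(size_t n_packets, size_t n_invalid, bool at_resume)
{
	struct model_counts c = { 0, 0 };
	size_t i;

	assert(n_invalid <= n_packets);

	for (i = 0; i < n_packets; i++) {
		bool state_valid = (i >= n_invalid);

		if (at_resume)
			model_check_at_resume(state_valid, &c);
		else
			model_check_in_error_path(state_valid, &c);
	}
	return c.extra_checks;
}
```

With 1000 resumed packets of which 3 hit an invalid state, the error-path
placement evaluates the extra check 3 times in this model, while the
resume-label placement evaluates it 1000 times.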

