linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] can: j1939: fix memory leak of skbs
@ 2022-07-08 17:59 Fedor Pchelkin
  2022-07-29  4:22 ` Oleksij Rempel
  0 siblings, 1 reply; 8+ messages in thread
From: Fedor Pchelkin @ 2022-07-08 17:59 UTC (permalink / raw)
  To: Robin van der Gracht, Oleksij Rempel
  Cc: Fedor Pchelkin, kernel, Oliver Hartkopp, Marc Kleine-Budde,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-can, netdev, linux-kernel, ldv-project, Alexey Khoroshilov

Syzkaller reported memory leak of skbs introduced with the commit
2030043e616c ("can: j1939: fix Use-after-Free, hold skb ref while in use").

Link to Syzkaller info and repro: https://forge.ispras.ru/issues/11743

The suggested solution was tested on the new memory-leak Syzkaller repro
and on the old use-after-free repro (that use-after-free bug was solved
with aforementioned commit). Although there can probably be another
situations when the numbers of skb_get() and skb_unref() calls don't match
and I don't see it in right way.

Moreover, skb_unref() call can be harmlessly removed from line 338 in
j1939_session_skb_drop_old() (/net/can/j1939/transport.c). But then I
assume this removal ruins the whole reference counts logic...

Overall, there is definitely something not clear in skb reference counts
management with skb_get() and skb_unref(). The solution we suggested fixes
the leaks and use-after-free's induced by Syzkaller but perhaps the origin
of the problem can be somewhere else.

Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
---
 net/can/j1939/transport.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
index 307ee1174a6e..9600b339cbf8 100644
--- a/net/can/j1939/transport.c
+++ b/net/can/j1939/transport.c
@@ -356,7 +356,6 @@ void j1939_session_skb_queue(struct j1939_session *session,
 
 	skcb->flags |= J1939_ECU_LOCAL_SRC;
 
-	skb_get(skb);
 	skb_queue_tail(&session->skb_queue, skb);
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] can: j1939: fix memory leak of skbs
  2022-07-08 17:59 [PATCH] can: j1939: fix memory leak of skbs Fedor Pchelkin
@ 2022-07-29  4:22 ` Oleksij Rempel
  2022-08-05  8:56   ` Fedor Pchelkin
  0 siblings, 1 reply; 8+ messages in thread
From: Oleksij Rempel @ 2022-07-29  4:22 UTC (permalink / raw)
  To: Fedor Pchelkin
  Cc: Robin van der Gracht, Oleksij Rempel, kernel, Oliver Hartkopp,
	Marc Kleine-Budde, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-can, netdev, linux-kernel, ldv-project,
	Alexey Khoroshilov

Hi Fedor,

thank you for work.

On Fri, Jul 08, 2022 at 08:59:49PM +0300, Fedor Pchelkin wrote:
> Syzkaller reported memory leak of skbs introduced with the commit
> 2030043e616c ("can: j1939: fix Use-after-Free, hold skb ref while in use").
> 
> Link to Syzkaller info and repro: https://forge.ispras.ru/issues/11743
> 
> The suggested solution was tested on the new memory-leak Syzkaller repro
> and on the old use-after-free repro (that use-after-free bug was solved
> with aforementioned commit). Although there can probably be another
> situations when the numbers of skb_get() and skb_unref() calls don't match
> and I don't see it in right way.
> 
> Moreover, skb_unref() call can be harmlessly removed from line 338 in
> j1939_session_skb_drop_old() (/net/can/j1939/transport.c). But then I
> assume this removal ruins the whole reference counts logic...
> 
> Overall, there is definitely something not clear in skb reference counts
> management with skb_get() and skb_unref(). The solution we suggested fixes
> the leaks and use-after-free's induced by Syzkaller but perhaps the origin
> of the problem can be somewhere else.
> 
> Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
> Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
> ---
>  net/can/j1939/transport.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
> index 307ee1174a6e..9600b339cbf8 100644
> --- a/net/can/j1939/transport.c
> +++ b/net/can/j1939/transport.c
> @@ -356,7 +356,6 @@ void j1939_session_skb_queue(struct j1939_session *session,
>  
>  	skcb->flags |= J1939_ECU_LOCAL_SRC;
>  
> -	skb_get(skb);
>  	skb_queue_tail(&session->skb_queue, skb);
>  }

This skb_get() is counter part of skb_unref()
j1939_session_skb_drop_old().

Initial issue can be reproduced by using real (slow) CAN with j1939cat[1]
tool. Both parts should be started to make sure the j1939_session_tx_dat() will
actually start using the queue. After pushing about 100K of data, application
will try to close the socket and exit. After socket is closed, all skb related
to this socket will be freed and j1939_session_tx_dat() will use freed skbs.

NACK for this patch.

1. https://github.com/linux-can/can-utils/blob/master/j1939cat.c
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] can: j1939: fix memory leak of skbs
  2022-07-29  4:22 ` Oleksij Rempel
@ 2022-08-05  8:56   ` Fedor Pchelkin
  2022-08-05  9:55     ` Oleksij Rempel
  0 siblings, 1 reply; 8+ messages in thread
From: Fedor Pchelkin @ 2022-08-05  8:56 UTC (permalink / raw)
  To: Oleksij Rempel
  Cc: Robin van der Gracht, Oleksij Rempel, kernel, Oliver Hartkopp,
	Marc Kleine-Budde, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-can, netdev, linux-kernel, ldv-project,
	Alexey Khoroshilov

Hi Oleksij,

On 29.07.2022 07:22, Oleksij Rempel wrote:
> Initial issue can be reproduced by using real (slow) CAN with j1939cat[1]
> tool. Both parts should be started to make sure the j1939_session_tx_dat() will
> actually start using the queue. After pushing about 100K of data, application
> will try to close the socket and exit. After socket is closed, all skb related
> to this socket will be freed and j1939_session_tx_dat() will use freed skbs.

Ok, the patch I suggested was a kind of a guess, now I understand that
it breaks important logic.

On 29.07.2022 07:22, Oleksij Rempel wrote:
 > This skb_get() is counter part of skb_unref()
 > j1939_session_skb_drop_old().

However, we have a case [1] where j1939_session_skb_queue() is called
but the corresponding j1939_session_skb_drop_old() is not called and it
causes a memory leak.

I tried to investigate it a little bit: the thing is that
j1939_session_skb_drop_old() can be called only when processing
J1939_ETP_CMD_CTS. On the contrary, as I can see,
j1939_session_skb_queue() can be called independently from
J1939_ETP_CMD_CTS so the two functions obviously do not correspond to
each other.

In reproducer case there is a J1939_ETP_CMD_RTS processing, then
we send some messages (where j1939_session_skb_queue() is called) and
after that J1939_ETP_CMD_ABORT is processed and we lose those skbs.

[1] https://forge.ispras.ru/issues/11743

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] can: j1939: fix memory leak of skbs
  2022-08-05  8:56   ` Fedor Pchelkin
@ 2022-08-05  9:55     ` Oleksij Rempel
  2022-08-05 14:55       ` Fedor Pchelkin
  0 siblings, 1 reply; 8+ messages in thread
From: Oleksij Rempel @ 2022-08-05  9:55 UTC (permalink / raw)
  To: Fedor Pchelkin
  Cc: Robin van der Gracht, Oleksij Rempel, kernel, Oliver Hartkopp,
	Marc Kleine-Budde, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-can, netdev, linux-kernel, ldv-project,
	Alexey Khoroshilov

On Fri, Aug 05, 2022 at 11:56:18AM +0300, Fedor Pchelkin wrote:
> Hi Oleksij,
> 
> On 29.07.2022 07:22, Oleksij Rempel wrote:
> > Initial issue can be reproduced by using real (slow) CAN with j1939cat[1]
> > tool. Both parts should be started to make sure the j1939_session_tx_dat() will
> > actually start using the queue. After pushing about 100K of data, application
> > will try to close the socket and exit. After socket is closed, all skb related
> > to this socket will be freed and j1939_session_tx_dat() will use freed skbs.
> 
> Ok, the patch I suggested was a kind of a guess, now I understand that
> it breaks important logic.
> 
> On 29.07.2022 07:22, Oleksij Rempel wrote:
> > This skb_get() is counter part of skb_unref()
> > j1939_session_skb_drop_old().
> 
> However, we have a case [1] where j1939_session_skb_queue() is called
> but the corresponding j1939_session_skb_drop_old() is not called and it
> causes a memory leak.
> 
> I tried to investigate it a little bit: the thing is that
> j1939_session_skb_drop_old() can be called only when processing
> J1939_ETP_CMD_CTS. On the contrary, as I can see,
> j1939_session_skb_queue() can be called independently from
> J1939_ETP_CMD_CTS so the two functions obviously do not correspond to
> each other.
> 
> In reproducer case there is a J1939_ETP_CMD_RTS processing, then
> we send some messages (where j1939_session_skb_queue() is called) and
> after that J1939_ETP_CMD_ABORT is processed and we lose those skbs.

Ah.. good point.
In this case it will go to:
  j1939_session_destroy()
    skb_queue_purge(&session->skb_queue)
      kfree_skb(skb);

And in the normal path we have:
  j1939_session_skb_drop_old()
    skb_unref(do_skb);
    kfree_skb(do_skb);

It means skb_queue_purge() should be replaced with something like:
	while ((skb = skb_dequeue(list)) != NULL) {
		/* drop ref taken in j1939_session_skb_queue() */
		skb_unref(skb);
		kfree_skb(skb);
	}

Can you please test it?

Regards,
Oleksij
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] can: j1939: fix memory leak of skbs
  2022-08-05  9:55     ` Oleksij Rempel
@ 2022-08-05 14:55       ` Fedor Pchelkin
  0 siblings, 0 replies; 8+ messages in thread
From: Fedor Pchelkin @ 2022-08-05 14:55 UTC (permalink / raw)
  To: Oleksij Rempel
  Cc: Robin van der Gracht, Oleksij Rempel, kernel, Oliver Hartkopp,
	Marc Kleine-Budde, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-can, netdev, linux-kernel, ldv-project,
	Alexey Khoroshilov

On 05.08.2022 12:55, Oleksij Rempel wrote:
> Can you please test it?

That works fine. I'm preparing a patch for the issue.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] can: j1939: fix memory leak of skbs
  2022-08-05 15:02 Fedor Pchelkin
  2022-08-05 16:20 ` Oleksij Rempel
@ 2022-08-08  7:56 ` Marc Kleine-Budde
  1 sibling, 0 replies; 8+ messages in thread
From: Marc Kleine-Budde @ 2022-08-08  7:56 UTC (permalink / raw)
  To: Fedor Pchelkin
  Cc: Oleksij Rempel, Robin van der Gracht, kernel, Oliver Hartkopp,
	David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-can, netdev, linux-kernel, ldv-project, Alexey Khoroshilov

[-- Attachment #1: Type: text/plain, Size: 922 bytes --]

On 05.08.2022 18:02:16, Fedor Pchelkin wrote:
> We need to drop skb references taken in j1939_session_skb_queue() when
> destroying a session in j1939_session_destroy(). Otherwise those skbs
> would be lost.
> 
> Link to Syzkaller info and repro: https://forge.ispras.ru/issues/11743.
> 
> Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
> 
> Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
> Suggested-by: Oleksij Rempel <o.rempel@pengutronix.de>
> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
> Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>

Added to can/master

Thanks,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] can: j1939: fix memory leak of skbs
  2022-08-05 15:02 Fedor Pchelkin
@ 2022-08-05 16:20 ` Oleksij Rempel
  2022-08-08  7:56 ` Marc Kleine-Budde
  1 sibling, 0 replies; 8+ messages in thread
From: Oleksij Rempel @ 2022-08-05 16:20 UTC (permalink / raw)
  To: Fedor Pchelkin
  Cc: Robin van der Gracht, ldv-project, Oliver Hartkopp, linux-can,
	Eric Dumazet, Marc Kleine-Budde, Alexey Khoroshilov, kernel,
	netdev, Jakub Kicinski, Paolo Abeni, David S . Miller,
	linux-kernel

On Fri, Aug 05, 2022 at 06:02:16PM +0300, Fedor Pchelkin wrote:
> We need to drop skb references taken in j1939_session_skb_queue() when
> destroying a session in j1939_session_destroy(). Otherwise those skbs
> would be lost.
> 
> Link to Syzkaller info and repro: https://forge.ispras.ru/issues/11743.
> 
> Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
> 
> Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
> Suggested-by: Oleksij Rempel <o.rempel@pengutronix.de>
> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
> Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>

Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>

> ---
>  net/can/j1939/transport.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
> index 307ee1174a6e..d7d86c944d76 100644
> --- a/net/can/j1939/transport.c
> +++ b/net/can/j1939/transport.c
> @@ -260,6 +260,8 @@ static void __j1939_session_drop(struct j1939_session *session)
>  
>  static void j1939_session_destroy(struct j1939_session *session)
>  {
> +	struct sk_buff *skb;
> +
>  	if (session->transmission) {
>  		if (session->err)
>  			j1939_sk_errqueue(session, J1939_ERRQUEUE_TX_ABORT);
> @@ -274,7 +276,11 @@ static void j1939_session_destroy(struct j1939_session *session)
>  	WARN_ON_ONCE(!list_empty(&session->sk_session_queue_entry));
>  	WARN_ON_ONCE(!list_empty(&session->active_session_list_entry));
>  
> -	skb_queue_purge(&session->skb_queue);
> +	while ((skb = skb_dequeue(&session->skb_queue)) != NULL) {
> +		/* drop ref taken in j1939_session_skb_queue() */
> +		skb_unref(skb);
> +		kfree_skb(skb);
> +	}
>  	__j1939_session_drop(session);
>  	j1939_priv_put(session->priv);
>  	kfree(session);
> -- 
> 2.25.1
> 
> 
> 

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] can: j1939: fix memory leak of skbs
@ 2022-08-05 15:02 Fedor Pchelkin
  2022-08-05 16:20 ` Oleksij Rempel
  2022-08-08  7:56 ` Marc Kleine-Budde
  0 siblings, 2 replies; 8+ messages in thread
From: Fedor Pchelkin @ 2022-08-05 15:02 UTC (permalink / raw)
  To: Oleksij Rempel, Robin van der Gracht
  Cc: Fedor Pchelkin, kernel, Oliver Hartkopp, Marc Kleine-Budde,
	David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-can, netdev, linux-kernel, ldv-project, Alexey Khoroshilov

We need to drop skb references taken in j1939_session_skb_queue() when
destroying a session in j1939_session_destroy(). Otherwise those skbs
would be lost.

Link to Syzkaller info and repro: https://forge.ispras.ru/issues/11743.

Found by Linux Verification Center (linuxtesting.org) with Syzkaller.

Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Suggested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
---
 net/can/j1939/transport.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
index 307ee1174a6e..d7d86c944d76 100644
--- a/net/can/j1939/transport.c
+++ b/net/can/j1939/transport.c
@@ -260,6 +260,8 @@ static void __j1939_session_drop(struct j1939_session *session)
 
 static void j1939_session_destroy(struct j1939_session *session)
 {
+	struct sk_buff *skb;
+
 	if (session->transmission) {
 		if (session->err)
 			j1939_sk_errqueue(session, J1939_ERRQUEUE_TX_ABORT);
@@ -274,7 +276,11 @@ static void j1939_session_destroy(struct j1939_session *session)
 	WARN_ON_ONCE(!list_empty(&session->sk_session_queue_entry));
 	WARN_ON_ONCE(!list_empty(&session->active_session_list_entry));
 
-	skb_queue_purge(&session->skb_queue);
+	while ((skb = skb_dequeue(&session->skb_queue)) != NULL) {
+		/* drop ref taken in j1939_session_skb_queue() */
+		skb_unref(skb);
+		kfree_skb(skb);
+	}
 	__j1939_session_drop(session);
 	j1939_priv_put(session->priv);
 	kfree(session);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-08-08  7:57 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-08 17:59 [PATCH] can: j1939: fix memory leak of skbs Fedor Pchelkin
2022-07-29  4:22 ` Oleksij Rempel
2022-08-05  8:56   ` Fedor Pchelkin
2022-08-05  9:55     ` Oleksij Rempel
2022-08-05 14:55       ` Fedor Pchelkin
2022-08-05 15:02 Fedor Pchelkin
2022-08-05 16:20 ` Oleksij Rempel
2022-08-08  7:56 ` Marc Kleine-Budde

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).