netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/1] pull request for net: batman-adv 2023-06-07
@ 2023-06-07 15:55 Simon Wunderlich
  2023-06-07 15:55 ` [PATCH 1/1] batman-adv: Broken sync while rescheduling delayed work Simon Wunderlich
  0 siblings, 1 reply; 8+ messages in thread
From: Simon Wunderlich @ 2023-06-07 15:55 UTC (permalink / raw)
  To: davem, kuba; +Cc: netdev, b.a.t.m.a.n, Simon Wunderlich

Hi David, hi Jakub,

here is a bugfix for batman-adv which we would like to have integrated into net.

Please pull or let me know of any problem!

Thank you,
      Simon

The following changes since commit 44c026a73be8038f03dbdeef028b642880cf1511:

  Linux 6.4-rc3 (2023-05-21 14:05:48 -0700)

are available in the Git repository at:

  git://git.open-mesh.org/linux-merge.git tags/batadv-net-pullrequest-20230607

for you to fetch changes up to abac3ac97fe8734b620e7322a116450d7f90aa43:

  batman-adv: Broken sync while rescheduling delayed work (2023-05-26 23:14:49 +0200)

----------------------------------------------------------------
Here is a batman-adv bugfix:

 - fix a broken sync while rescheduling delayed work, by
   Vladislav Efanov

----------------------------------------------------------------
Vladislav Efanov (1):
      batman-adv: Broken sync while rescheduling delayed work

 net/batman-adv/distributed-arp-table.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/1] batman-adv: Broken sync while rescheduling delayed work
  2023-06-07 15:55 [PATCH 0/1] pull request for net: batman-adv 2023-06-07 Simon Wunderlich
@ 2023-06-07 15:55 ` Simon Wunderlich
  2023-06-08  5:00   ` patchwork-bot+netdevbpf
  2023-06-08  5:01   ` Jakub Kicinski
  0 siblings, 2 replies; 8+ messages in thread
From: Simon Wunderlich @ 2023-06-07 15:55 UTC (permalink / raw)
  To: davem, kuba
  Cc: netdev, b.a.t.m.a.n, Vladislav Efanov, stable, Sven Eckelmann,
	Simon Wunderlich

From: Vladislav Efanov <VEfanov@ispras.ru>

Syzkaller got a lot of crashes like:
KASAN: use-after-free Write in *_timers*

All of these crashes point to the same memory area:

The buggy address belongs to the object at ffff88801f870000
 which belongs to the cache kmalloc-8k of size 8192
The buggy address is located 5320 bytes inside of
 8192-byte region [ffff88801f870000, ffff88801f872000)

This area belongs to :
        batadv_priv->batadv_priv_dat->delayed_work->timer_list

The reason for these issues is the lack of synchronization. Delayed
work (batadv_dat_purge) schedules new timer/work while the device
is being deleted. As the result new timer/delayed work is set after
cancel_delayed_work_sync() was called. So after the device is freed
the timer list contains pointer to already freed memory.

Found by Linux Verification Center (linuxtesting.org) with syzkaller.

Cc: stable@kernel.org
Fixes: 2f1dfbe18507 ("batman-adv: Distributed ARP Table - implement local storage")
Signed-off-by: Vladislav Efanov <VEfanov@ispras.ru>
Acked-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
---
 net/batman-adv/distributed-arp-table.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/batman-adv/distributed-arp-table.c b/net/batman-adv/distributed-arp-table.c
index 6968e55eb971..28a939d56090 100644
--- a/net/batman-adv/distributed-arp-table.c
+++ b/net/batman-adv/distributed-arp-table.c
@@ -101,7 +101,6 @@ static void batadv_dat_purge(struct work_struct *work);
  */
 static void batadv_dat_start_timer(struct batadv_priv *bat_priv)
 {
-	INIT_DELAYED_WORK(&bat_priv->dat.work, batadv_dat_purge);
 	queue_delayed_work(batadv_event_workqueue, &bat_priv->dat.work,
 			   msecs_to_jiffies(10000));
 }
@@ -819,6 +818,7 @@ int batadv_dat_init(struct batadv_priv *bat_priv)
 	if (!bat_priv->dat.hash)
 		return -ENOMEM;
 
+	INIT_DELAYED_WORK(&bat_priv->dat.work, batadv_dat_purge);
 	batadv_dat_start_timer(bat_priv);
 
 	batadv_tvlv_handler_register(bat_priv, batadv_dat_tvlv_ogm_handler_v1,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] batman-adv: Broken sync while rescheduling delayed work
  2023-06-07 15:55 ` [PATCH 1/1] batman-adv: Broken sync while rescheduling delayed work Simon Wunderlich
@ 2023-06-08  5:00   ` patchwork-bot+netdevbpf
  2023-06-08  5:01   ` Jakub Kicinski
  1 sibling, 0 replies; 8+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-06-08  5:00 UTC (permalink / raw)
  To: Simon Wunderlich; +Cc: davem, kuba, netdev, b.a.t.m.a.n, VEfanov, stable, sven

Hello:

This patch was applied to netdev/net.git (main)
by Simon Wunderlich <sw@simonwunderlich.de>:

On Wed,  7 Jun 2023 17:55:15 +0200 you wrote:
> From: Vladislav Efanov <VEfanov@ispras.ru>
> 
> Syzkaller got a lot of crashes like:
> KASAN: use-after-free Write in *_timers*
> 
> All of these crashes point to the same memory area:
> 
> [...]

Here is the summary with links:
  - [1/1] batman-adv: Broken sync while rescheduling delayed work
    https://git.kernel.org/netdev/net/c/abac3ac97fe8

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] batman-adv: Broken sync while rescheduling delayed work
  2023-06-07 15:55 ` [PATCH 1/1] batman-adv: Broken sync while rescheduling delayed work Simon Wunderlich
  2023-06-08  5:00   ` patchwork-bot+netdevbpf
@ 2023-06-08  5:01   ` Jakub Kicinski
  2023-06-08  5:24     ` Keller, Jacob E
  2023-06-08  9:27     ` Paolo Abeni
  1 sibling, 2 replies; 8+ messages in thread
From: Jakub Kicinski @ 2023-06-08  5:01 UTC (permalink / raw)
  To: Simon Wunderlich
  Cc: davem, netdev, b.a.t.m.a.n, Vladislav Efanov, stable, Sven Eckelmann

On Wed,  7 Jun 2023 17:55:15 +0200 Simon Wunderlich wrote:
> The reason for these issues is the lack of synchronization. Delayed
> work (batadv_dat_purge) schedules new timer/work while the device
> is being deleted. As the result new timer/delayed work is set after
> cancel_delayed_work_sync() was called. So after the device is freed
> the timer list contains pointer to already freed memory.

I guess this is better than status quo but is the fix really complete?
We're still not preventing the timer / work from getting scheduled
and staying alive after the netdev has been freed, right?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH 1/1] batman-adv: Broken sync while rescheduling delayed work
  2023-06-08  5:01   ` Jakub Kicinski
@ 2023-06-08  5:24     ` Keller, Jacob E
  2023-06-08  9:01       ` Vlad Efanov
  2023-06-08  9:27     ` Paolo Abeni
  1 sibling, 1 reply; 8+ messages in thread
From: Keller, Jacob E @ 2023-06-08  5:24 UTC (permalink / raw)
  To: Jakub Kicinski, Simon Wunderlich
  Cc: davem, netdev, b.a.t.m.a.n, Vladislav Efanov, stable, Sven Eckelmann



> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Wednesday, June 7, 2023 10:01 PM
> To: Simon Wunderlich <sw@simonwunderlich.de>
> Cc: davem@davemloft.net; netdev@vger.kernel.org; b.a.t.m.a.n@lists.open-
> mesh.org; Vladislav Efanov <VEfanov@ispras.ru>; stable@kernel.org; Sven
> Eckelmann <sven@narfation.org>
> Subject: Re: [PATCH 1/1] batman-adv: Broken sync while rescheduling delayed
> work
> 
> On Wed,  7 Jun 2023 17:55:15 +0200 Simon Wunderlich wrote:
> > The reason for these issues is the lack of synchronization. Delayed
> > work (batadv_dat_purge) schedules new timer/work while the device
> > is being deleted. As the result new timer/delayed work is set after
> > cancel_delayed_work_sync() was called. So after the device is freed
> > the timer list contains pointer to already freed memory.
> 
> I guess this is better than status quo but is the fix really complete?
> We're still not preventing the timer / work from getting scheduled
> and staying alive after the netdev has been freed, right?

Yea, I would expect some synchronization mechanism to ensure that after cancel_delayed_work_sync() you can't queue the work again.

I know for timers there is recently timer_shutdown_sync() which can be used to guarantee a timer can't re-arm at all, and its intended for some situations where there is a cyclic dependency...

Thanks,
Jake

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] batman-adv: Broken sync while rescheduling delayed work
  2023-06-08  5:24     ` Keller, Jacob E
@ 2023-06-08  9:01       ` Vlad Efanov
  0 siblings, 0 replies; 8+ messages in thread
From: Vlad Efanov @ 2023-06-08  9:01 UTC (permalink / raw)
  To: Keller, Jacob E, Jakub Kicinski, Simon Wunderlich
  Cc: davem, netdev, b.a.t.m.a.n, stable, Sven Eckelmann

As far as I found the synchronization is provided by delayed work 
subsystem. It is based on the WORK_STRUCT_PENDING_BIT in work->data field.

The cancel_delayed_work_sync() atomically sets this bit and 
queue_delayed_work() checks it before scheduling new delayed work.


The problem is caused by the INIT_DELAYED_WORK() call inside 
batadv_dat_start_timer(). This call happens before the 
queue_delayed_work() call and clears this bit.


Best regards,

Vlad


On 08.06.2023 08:24, Keller, Jacob E wrote:
>
>> -----Original Message-----
>> From: Jakub Kicinski <kuba@kernel.org>
>> Sent: Wednesday, June 7, 2023 10:01 PM
>> To: Simon Wunderlich <sw@simonwunderlich.de>
>> Cc: davem@davemloft.net; netdev@vger.kernel.org; b.a.t.m.a.n@lists.open-
>> mesh.org; Vladislav Efanov <VEfanov@ispras.ru>; stable@kernel.org; Sven
>> Eckelmann <sven@narfation.org>
>> Subject: Re: [PATCH 1/1] batman-adv: Broken sync while rescheduling delayed
>> work
>>
>> On Wed,  7 Jun 2023 17:55:15 +0200 Simon Wunderlich wrote:
>>> The reason for these issues is the lack of synchronization. Delayed
>>> work (batadv_dat_purge) schedules new timer/work while the device
>>> is being deleted. As the result new timer/delayed work is set after
>>> cancel_delayed_work_sync() was called. So after the device is freed
>>> the timer list contains pointer to already freed memory.
>> I guess this is better than status quo but is the fix really complete?
>> We're still not preventing the timer / work from getting scheduled
>> and staying alive after the netdev has been freed, right?
> Yea, I would expect some synchronization mechanism to ensure that after cancel_delayed_work_sync() you can't queue the work again.
>
> I know for timers there is recently timer_shutdown_sync() which can be used to guarantee a timer can't re-arm at all, and its intended for some situations where there is a cyclic dependency...
>
> Thanks,
> Jake

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] batman-adv: Broken sync while rescheduling delayed work
  2023-06-08  5:01   ` Jakub Kicinski
  2023-06-08  5:24     ` Keller, Jacob E
@ 2023-06-08  9:27     ` Paolo Abeni
  2023-06-08 16:57       ` Sven Eckelmann
  1 sibling, 1 reply; 8+ messages in thread
From: Paolo Abeni @ 2023-06-08  9:27 UTC (permalink / raw)
  To: Jakub Kicinski, Simon Wunderlich
  Cc: davem, netdev, b.a.t.m.a.n, Vladislav Efanov, stable, Sven Eckelmann

On Wed, 2023-06-07 at 22:01 -0700, Jakub Kicinski wrote:
> On Wed,  7 Jun 2023 17:55:15 +0200 Simon Wunderlich wrote:
> > The reason for these issues is the lack of synchronization. Delayed
> > work (batadv_dat_purge) schedules new timer/work while the device
> > is being deleted. As the result new timer/delayed work is set after
> > cancel_delayed_work_sync() was called. So after the device is freed
> > the timer list contains pointer to already freed memory.
> 
> I guess this is better than status quo but is the fix really complete?
> We're still not preventing the timer / work from getting scheduled
> and staying alive after the netdev has been freed, right?

I *think* this specific use case does not expose such problem, as the
delayed work is (AFAICS) scheduled only at device creation time and by
the work itself, it should never be re-scheduled after
cancel_delayed_work_sync()

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] batman-adv: Broken sync while rescheduling delayed work
  2023-06-08  9:27     ` Paolo Abeni
@ 2023-06-08 16:57       ` Sven Eckelmann
  0 siblings, 0 replies; 8+ messages in thread
From: Sven Eckelmann @ 2023-06-08 16:57 UTC (permalink / raw)
  To: Jakub Kicinski, Simon Wunderlich, Paolo Abeni
  Cc: davem, netdev, b.a.t.m.a.n, Vladislav Efanov, stable

[-- Attachment #1: Type: text/plain, Size: 661 bytes --]

On Thursday, 8 June 2023 11:27:31 CEST Paolo Abeni wrote:
[...]
> > We're still not preventing the timer / work from getting scheduled
> > and staying alive after the netdev has been freed, right?
> 
> I *think* this specific use case does not expose such problem, as the
> delayed work is (AFAICS) scheduled only at device creation time and by
> the work itself, it should never be re-scheduled after
> cancel_delayed_work_sync()

Correct.

* batadv_dat_start_timer is the only thing scheduling it
* batadv_dat_start_timer is called by:

  - batadv_dat_purge (the worker rearming itself)
  - batadv_dat_init (when the interface is created)

Kind regards,
	Sven

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-06-08 16:57 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-07 15:55 [PATCH 0/1] pull request for net: batman-adv 2023-06-07 Simon Wunderlich
2023-06-07 15:55 ` [PATCH 1/1] batman-adv: Broken sync while rescheduling delayed work Simon Wunderlich
2023-06-08  5:00   ` patchwork-bot+netdevbpf
2023-06-08  5:01   ` Jakub Kicinski
2023-06-08  5:24     ` Keller, Jacob E
2023-06-08  9:01       ` Vlad Efanov
2023-06-08  9:27     ` Paolo Abeni
2023-06-08 16:57       ` Sven Eckelmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).