All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] ipv4: make fib_info_cnt atomic
@ 2022-01-14 15:39 Eric Dumazet
  2022-01-14 15:50 ` David Laight
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2022-01-14 15:39 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, syzbot, Ido Schimmel, Jiri Pirko

From: Eric Dumazet <edumazet@google.com>

Instead of making sure all free_fib_info() callers
hold rtnl, it seems better to convert fib_info_cnt
to an atomic_t.

BUG: KCSAN: data-race in fib_create_info / free_fib_info

write to 0xffffffff86e243a0 of 4 bytes by task 26429 on cpu 0:
 fib_create_info+0xe78/0x3440 net/ipv4/fib_semantics.c:1428
 fib_table_insert+0x148/0x10c0 net/ipv4/fib_trie.c:1224
 fib_magic+0x195/0x1e0 net/ipv4/fib_frontend.c:1087
 fib_add_ifaddr+0xd0/0x2e0 net/ipv4/fib_frontend.c:1109
 fib_netdev_event+0x178/0x510 net/ipv4/fib_frontend.c:1466
 notifier_call_chain kernel/notifier.c:83 [inline]
 raw_notifier_call_chain+0x53/0xb0 kernel/notifier.c:391
 __dev_notify_flags+0x1d3/0x3b0
 dev_change_flags+0xa2/0xc0 net/core/dev.c:8872
 do_setlink+0x810/0x2410 net/core/rtnetlink.c:2719
 rtnl_group_changelink net/core/rtnetlink.c:3242 [inline]
 __rtnl_newlink net/core/rtnetlink.c:3396 [inline]
 rtnl_newlink+0xb10/0x13b0 net/core/rtnetlink.c:3506
 rtnetlink_rcv_msg+0x745/0x7e0 net/core/rtnetlink.c:5571
 netlink_rcv_skb+0x14e/0x250 net/netlink/af_netlink.c:2496
 rtnetlink_rcv+0x18/0x20 net/core/rtnetlink.c:5589
 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
 netlink_unicast+0x5fc/0x6c0 net/netlink/af_netlink.c:1345
 netlink_sendmsg+0x726/0x840 net/netlink/af_netlink.c:1921
 sock_sendmsg_nosec net/socket.c:704 [inline]
 sock_sendmsg net/socket.c:724 [inline]
 ____sys_sendmsg+0x39a/0x510 net/socket.c:2409
 ___sys_sendmsg net/socket.c:2463 [inline]
 __sys_sendmsg+0x195/0x230 net/socket.c:2492
 __do_sys_sendmsg net/socket.c:2501 [inline]
 __se_sys_sendmsg net/socket.c:2499 [inline]
 __x64_sys_sendmsg+0x42/0x50 net/socket.c:2499
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffffffff86e243a0 of 4 bytes by task 31505 on cpu 1:
 free_fib_info+0x35/0x80 net/ipv4/fib_semantics.c:252
 fib_info_put include/net/ip_fib.h:575 [inline]
 nsim_fib4_rt_destroy drivers/net/netdevsim/fib.c:294 [inline]
 nsim_fib4_rt_replace drivers/net/netdevsim/fib.c:403 [inline]
 nsim_fib4_rt_insert drivers/net/netdevsim/fib.c:431 [inline]
 nsim_fib4_event drivers/net/netdevsim/fib.c:461 [inline]
 nsim_fib_event drivers/net/netdevsim/fib.c:881 [inline]
 nsim_fib_event_work+0x15ca/0x2cf0 drivers/net/netdevsim/fib.c:1477
 process_one_work+0x3fc/0x980 kernel/workqueue.c:2298
 process_scheduled_works kernel/workqueue.c:2361 [inline]
 worker_thread+0x7df/0xa70 kernel/workqueue.c:2447
 kthread+0x2c7/0x2e0 kernel/kthread.c:327
 ret_from_fork+0x1f/0x30

value changed: 0x00000d2d -> 0x00000d2e

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 31505 Comm: kworker/1:21 Not tainted 5.16.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events nsim_fib_event_work

Fixes: 48bb9eb47b27 ("netdevsim: fib: Add dummy implementation for FIB offload")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Ido Schimmel <idosch@mellanox.com>
Cc: Jiri Pirko <jiri@mellanox.com>
---
 net/ipv4/fib_semantics.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 828de171708f599b56f63715514c0259c7cb08a2..302373acf232205c812d10a041a87f22a64b1017 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -51,7 +51,7 @@ static DEFINE_SPINLOCK(fib_info_lock);
 static struct hlist_head *fib_info_hash;
 static struct hlist_head *fib_info_laddrhash;
 static unsigned int fib_info_hash_size;
-static unsigned int fib_info_cnt;
+static atomic_t fib_info_cnt;
 
 #define DEVINDEX_HASHBITS 8
 #define DEVINDEX_HASHSIZE (1U << DEVINDEX_HASHBITS)
@@ -249,7 +249,7 @@ void free_fib_info(struct fib_info *fi)
 		pr_warn("Freeing alive fib_info %p\n", fi);
 		return;
 	}
-	fib_info_cnt--;
+	atomic_dec(&fib_info_cnt);
 
 	call_rcu(&fi->rcu, free_fib_info_rcu);
 }
@@ -1430,7 +1430,7 @@ struct fib_info *fib_create_info(struct fib_config *cfg,
 #endif
 
 	err = -ENOBUFS;
-	if (fib_info_cnt >= fib_info_hash_size) {
+	if (atomic_read(&fib_info_cnt) >= fib_info_hash_size) {
 		unsigned int new_size = fib_info_hash_size << 1;
 		struct hlist_head *new_info_hash;
 		struct hlist_head *new_laddrhash;
@@ -1462,7 +1462,7 @@ struct fib_info *fib_create_info(struct fib_config *cfg,
 		return ERR_PTR(err);
 	}
 
-	fib_info_cnt++;
+	atomic_inc(&fib_info_cnt);
 	fi->fib_net = net;
 	fi->fib_protocol = cfg->fc_protocol;
 	fi->fib_scope = cfg->fc_scope;
-- 
2.34.1.703.g22d0c6ccf7-goog


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* RE: [PATCH net] ipv4: make fib_info_cnt atomic
  2022-01-14 15:39 [PATCH net] ipv4: make fib_info_cnt atomic Eric Dumazet
@ 2022-01-14 15:50 ` David Laight
  2022-01-14 16:25   ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: David Laight @ 2022-01-14 15:50 UTC (permalink / raw)
  To: 'Eric Dumazet', David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, syzbot, Ido Schimmel, Jiri Pirko

From: Eric Dumazet
> Sent: 14 January 2022 15:39
> 
> Instead of making sure all free_fib_info() callers
> hold rtnl, it seems better to convert fib_info_cnt
> to an atomic_t.

Since fib_info_cnt is only used to control the size of the hash table
could it be incremented when a fid is added to the hash table and
decremented when it is removed.

This is all inside the fib_info_lock.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] ipv4: make fib_info_cnt atomic
  2022-01-14 15:50 ` David Laight
@ 2022-01-14 16:25   ` Eric Dumazet
  2022-01-16  8:46     ` Ido Schimmel
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2022-01-14 16:25 UTC (permalink / raw)
  To: David Laight
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, syzbot,
	Ido Schimmel, Jiri Pirko

On Fri, Jan 14, 2022 at 7:50 AM David Laight <David.Laight@aculab.com> wrote:
>
> From: Eric Dumazet
> > Sent: 14 January 2022 15:39
> >
> > Instead of making sure all free_fib_info() callers
> > hold rtnl, it seems better to convert fib_info_cnt
> > to an atomic_t.
>
> Since fib_info_cnt is only used to control the size of the hash table
> could it be incremented when a fid is added to the hash table and
> decremented when it is removed.
>
> This is all inside the fib_info_lock.

Sure, this will need some READ_ONCE()/WRITE_ONCE() annotations
because the resize would read fib_info_cnt without this lock held.

I am not sure this is a stable candidate though, patch looks a bit more risky.

This seems to suggest another issue...

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 828de171708f599b56f63715514c0259c7cb08a2..45619c005b8dddd7ccd5c7029efa4ed69b6ce1de
100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -249,7 +249,6 @@ void free_fib_info(struct fib_info *fi)
                pr_warn("Freeing alive fib_info %p\n", fi);
                return;
        }
-       fib_info_cnt--;

        call_rcu(&fi->rcu, free_fib_info_rcu);
 }
@@ -260,6 +259,10 @@ void fib_release_info(struct fib_info *fi)
        spin_lock_bh(&fib_info_lock);
        if (fi && refcount_dec_and_test(&fi->fib_treeref)) {
                hlist_del(&fi->fib_hash);
+
+               /* Paired with READ_ONCE() in fib_create_info(). */
+               WRITE_ONCE(fib_info_cnt, fib_info_cnt - 1);
+
                if (fi->fib_prefsrc)
                        hlist_del(&fi->fib_lhash);
                if (fi->nh) {
@@ -1430,7 +1433,9 @@ struct fib_info *fib_create_info(struct fib_config *cfg,
 #endif

        err = -ENOBUFS;
-       if (fib_info_cnt >= fib_info_hash_size) {
+
+       /* Paired with WRITE_ONCE() in fib_release_info() */
+       if (READ_ONCE(fib_info_cnt) >= fib_info_hash_size) {
                unsigned int new_size = fib_info_hash_size << 1;
                struct hlist_head *new_info_hash;
                struct hlist_head *new_laddrhash;
@@ -1462,7 +1467,6 @@ struct fib_info *fib_create_info(struct fib_config *cfg,
                return ERR_PTR(err);
        }

-       fib_info_cnt++;
        fi->fib_net = net;
        fi->fib_protocol = cfg->fc_protocol;
        fi->fib_scope = cfg->fc_scope;
@@ -1591,6 +1595,7 @@ struct fib_info *fib_create_info(struct fib_config *cfg,
        refcount_set(&fi->fib_treeref, 1);
        refcount_set(&fi->fib_clntref, 1);
        spin_lock_bh(&fib_info_lock);
+       fib_info_cnt++;
        hlist_add_head(&fi->fib_hash,
                       &fib_info_hash[fib_info_hashfn(fi)]);
        if (fi->fib_prefsrc) {

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] ipv4: make fib_info_cnt atomic
  2022-01-14 16:25   ` Eric Dumazet
@ 2022-01-16  8:46     ` Ido Schimmel
  2022-01-16  8:54       ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Ido Schimmel @ 2022-01-16  8:46 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Laight, Eric Dumazet, David S . Miller, Jakub Kicinski,
	netdev, syzbot, Ido Schimmel, Jiri Pirko

On Fri, Jan 14, 2022 at 08:25:04AM -0800, 'Eric Dumazet' via syzkaller wrote:
> On Fri, Jan 14, 2022 at 7:50 AM David Laight <David.Laight@aculab.com> wrote:
> >
> > From: Eric Dumazet
> > > Sent: 14 January 2022 15:39
> > >
> > > Instead of making sure all free_fib_info() callers
> > > hold rtnl, it seems better to convert fib_info_cnt
> > > to an atomic_t.
> >
> > Since fib_info_cnt is only used to control the size of the hash table
> > could it be incremented when a fid is added to the hash table and
> > decremented when it is removed.
> >
> > This is all inside the fib_info_lock.
> 
> Sure, this will need some READ_ONCE()/WRITE_ONCE() annotations
> because the resize would read fib_info_cnt without this lock held.
> 
> I am not sure this is a stable candidate though, patch looks a bit more risky.
> 
> This seems to suggest another issue...
> 
> diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
> index 828de171708f599b56f63715514c0259c7cb08a2..45619c005b8dddd7ccd5c7029efa4ed69b6ce1de
> 100644
> --- a/net/ipv4/fib_semantics.c
> +++ b/net/ipv4/fib_semantics.c
> @@ -249,7 +249,6 @@ void free_fib_info(struct fib_info *fi)
>                 pr_warn("Freeing alive fib_info %p\n", fi);
>                 return;
>         }
> -       fib_info_cnt--;
> 
>         call_rcu(&fi->rcu, free_fib_info_rcu);
>  }
> @@ -260,6 +259,10 @@ void fib_release_info(struct fib_info *fi)
>         spin_lock_bh(&fib_info_lock);
>         if (fi && refcount_dec_and_test(&fi->fib_treeref)) {
>                 hlist_del(&fi->fib_hash);
> +
> +               /* Paired with READ_ONCE() in fib_create_info(). */
> +               WRITE_ONCE(fib_info_cnt, fib_info_cnt - 1);
> +
>                 if (fi->fib_prefsrc)
>                         hlist_del(&fi->fib_lhash);
>                 if (fi->nh) {
> @@ -1430,7 +1433,9 @@ struct fib_info *fib_create_info(struct fib_config *cfg,
>  #endif
> 
>         err = -ENOBUFS;
> -       if (fib_info_cnt >= fib_info_hash_size) {
> +
> +       /* Paired with WRITE_ONCE() in fib_release_info() */
> +       if (READ_ONCE(fib_info_cnt) >= fib_info_hash_size) {
>                 unsigned int new_size = fib_info_hash_size << 1;
>                 struct hlist_head *new_info_hash;
>                 struct hlist_head *new_laddrhash;
> @@ -1462,7 +1467,6 @@ struct fib_info *fib_create_info(struct fib_config *cfg,
>                 return ERR_PTR(err);
>         }
> 
> -       fib_info_cnt++;
>         fi->fib_net = net;
>         fi->fib_protocol = cfg->fc_protocol;
>         fi->fib_scope = cfg->fc_scope;
> @@ -1591,6 +1595,7 @@ struct fib_info *fib_create_info(struct fib_config *cfg,
>         refcount_set(&fi->fib_treeref, 1);
>         refcount_set(&fi->fib_clntref, 1);
>         spin_lock_bh(&fib_info_lock);
> +       fib_info_cnt++;
>         hlist_add_head(&fi->fib_hash,
>                        &fib_info_hash[fib_info_hashfn(fi)]);
>         if (fi->fib_prefsrc) {

Thanks for the fix. Looks OK to me. The counter is incremented under the
lock when adding to the hash table(s) and decremented under the lock
upon removal. Do you intend to submit this version instead of the first
one?

> 
> -- 
> You received this message because you are subscribed to the Google Groups "syzkaller" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller/CANn89iKA32qt8X6VzFsissZwxHpar6pDEJT_dgYhnxfROcaqRA%40mail.gmail.com.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] ipv4: make fib_info_cnt atomic
  2022-01-16  8:46     ` Ido Schimmel
@ 2022-01-16  8:54       ` Eric Dumazet
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Dumazet @ 2022-01-16  8:54 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: David Laight, Eric Dumazet, David S . Miller, Jakub Kicinski,
	netdev, syzbot, Ido Schimmel, Jiri Pirko

On Sun, Jan 16, 2022 at 12:46 AM Ido Schimmel <idosch@idosch.org> wrote:
>
> Thanks for the fix. Looks OK to me. The counter is incremented under the
> lock when adding to the hash table(s) and decremented under the lock
> upon removal. Do you intend to submit this version instead of the first
> one?

Yes, I will send a V2, thanks !

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-01-16  8:54 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-14 15:39 [PATCH net] ipv4: make fib_info_cnt atomic Eric Dumazet
2022-01-14 15:50 ` David Laight
2022-01-14 16:25   ` Eric Dumazet
2022-01-16  8:46     ` Ido Schimmel
2022-01-16  8:54       ` Eric Dumazet

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.