* neighbour netlink notifications delivered in wrong order
@ 2022-06-06 23:01 Francesco Ruggeri
  2022-06-07  2:07 ` Andy Roulin
  0 siblings, 1 reply; 13+ messages in thread
From: Francesco Ruggeri @ 2022-06-06 23:01 UTC (permalink / raw)
  To: netdev, fruggeri

I have run into a race condition on a 4.19 kernel where netlink
notifications for a neighbour are queued in the wrong order on the
netlink socket.
The diagram below shows one scenario, but I have also seen cases where the
process and softirq processing happen on the same CPU.
An ARP reply (or maybe a gratuitous ARP, I am not sure) is received for a
neighbour while it is being deleted.

	CPU1			CPU2

rtnetlink_rcv_msg
neigh_delete
neigh_update
__neigh_notify(RTM_NEWNEIGH/NUD_FAILED)
__netlink_sendskb
			arp_rcv
			arp_process
			neigh_update
			__neigh_notify(RTM_NEWNEIGH/REACHABLE)
			__netlink_sendskb
			skb_queue_tail(&sk->sk_receive_queue, skb);
skb_queue_tail(&sk->sk_receive_queue, skb);

Is this a known issue?

Thanks,
Francesco Ruggeri




* Re: neighbour netlink notifications delivered in wrong order
  2022-06-06 23:01 neighbour netlink notifications delivered in wrong order Francesco Ruggeri
@ 2022-06-07  2:07 ` Andy Roulin
  2022-06-07  3:19   ` Stephen Hemminger
  0 siblings, 1 reply; 13+ messages in thread
From: Andy Roulin @ 2022-06-07  2:07 UTC (permalink / raw)
  To: fruggeri; +Cc: netdev

Below is the patch I have been using and it has worked for me. I didn't 
get a chance yet to test all cases or with net-next but I am planning to 
send upstream.

----

neigh_update sends an rtnl notification if an update, e.g., a
nud_state change, was made, but there is no guarantee on the
ordering of the rtnl notifications. Consider the following
scenario:

userspace thread                   kernel thread
================                   =============
neigh_update
   write_lock_bh(n->lock)
   n->nud_state = STALE
   write_unlock_bh(n->lock)
   neigh_notify
     neigh_fill_info
       read_lock_bh(n->lock)
       ndm->ndm_state = STALE
       read_unlock_bh(n->lock)
     -------------------------->
                                   neigh_update
                                     write_lock_bh(n->lock)
                                     n->nud_state = REACHABLE
                                     write_unlock_bh(n->lock)
                                     neigh_notify
                                       neigh_fill_info
                                         read_lock_bh(n->lock)
                                         ndm->ndm_state = REACHABLE
                                         read_unlock_bh(n->lock)
                                       rtnl_notify
                                       RTNL REACHABLE sent
     <--------------------------
     rtnl_notify
     RTNL STALE sent

In this scenario, the kernel neigh is updated first to STALE and
then REACHABLE but the netlink notifications are sent out of order,
first REACHABLE and then STALE.

To fix this ordering, use read_lock_bh(n->lock) for both reading the
neigh state (neigh_fill_info) __and__ sending the netlink notification
(rtnl_notify).

Signed-off-by: Andy Roulin <aroulin@nvidia.com>
---
  net/core/neighbour.c | 25 ++++++++++++++++---------
  1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 54625287ee5b..a91dfcbfc01c 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -2531,23 +2531,19 @@ static int neigh_fill_info(struct sk_buff *skb, struct neighbour *neigh,
  	if (nla_put(skb, NDA_DST, neigh->tbl->key_len, neigh->primary_key))
  		goto nla_put_failure;

-	read_lock_bh(&neigh->lock);
  	ndm->ndm_state	 = neigh->nud_state;
  	if (neigh->nud_state & NUD_VALID) {
  		char haddr[MAX_ADDR_LEN];

  		neigh_ha_snapshot(haddr, neigh, neigh->dev);
-		if (nla_put(skb, NDA_LLADDR, neigh->dev->addr_len, haddr) < 0) {
-			read_unlock_bh(&neigh->lock);
+		if (nla_put(skb, NDA_LLADDR, neigh->dev->addr_len, haddr) < 0)
  			goto nla_put_failure;
-		}
  	}

  	ci.ndm_used	 = jiffies_to_clock_t(now - neigh->used);
  	ci.ndm_confirmed = jiffies_to_clock_t(now - neigh->confirmed);
  	ci.ndm_updated	 = jiffies_to_clock_t(now - neigh->updated);
  	ci.ndm_refcnt	 = refcount_read(&neigh->refcnt) - 1;
-	read_unlock_bh(&neigh->lock);

  	if (nla_put_u32(skb, NDA_PROBES, atomic_read(&neigh->probes)) ||
  	    nla_put(skb, NDA_CACHEINFO, sizeof(ci), &ci))
@@ -2674,10 +2670,15 @@ static int neigh_dump_table(struct neigh_table *tbl, struct sk_buff *skb,
  			if (neigh_ifindex_filtered(n->dev, filter->dev_idx) ||
  			    neigh_master_filtered(n->dev, filter->master_idx))
  				goto next;
-			if (neigh_fill_info(skb, n, NETLINK_CB(cb->skb).portid,
-					    cb->nlh->nlmsg_seq,
-					    RTM_NEWNEIGH,
-					    flags) < 0) {
+
+			read_lock_bh(&n->lock);
+			rc = neigh_fill_info(skb, n, NETLINK_CB(cb->skb).portid,
+					     cb->nlh->nlmsg_seq,
+					     RTM_NEWNEIGH,
+					     flags);
+			read_unlock_bh(&n->lock);
+
+			if (rc < 0) {
  				rc = -1;
  				goto out;
  			}
@@ -2926,7 +2927,10 @@ static int neigh_get_reply(struct net *net, struct neighbour *neigh,
  	if (!skb)
  		return -ENOBUFS;

+	read_lock_bh(&neigh->lock);
  	err = neigh_fill_info(skb, neigh, pid, seq, RTM_NEWNEIGH, 0);
+	read_unlock_bh(&neigh->lock);
+
  	if (err) {
  		kfree_skb(skb);
  		goto errout;
@@ -3460,14 +3464,17 @@ static void __neigh_notify(struct neighbour *n, int type, int flags,
  	if (skb == NULL)
  		goto errout;

+	read_lock_bh(&n->lock);
  	err = neigh_fill_info(skb, n, pid, 0, type, flags);
  	if (err < 0) {
  		/* -EMSGSIZE implies BUG in neigh_nlmsg_size() */
  		WARN_ON(err == -EMSGSIZE);
+		read_unlock_bh(&n->lock);
  		kfree_skb(skb);
  		goto errout;
  	}
  	rtnl_notify(skb, net, 0, RTNLGRP_NEIGH, NULL, GFP_ATOMIC);
+	read_unlock_bh(&n->lock);
  	return;
  errout:
  	if (err < 0)
-- 
2.20.1
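
The interleaving from the commit message, and the effect of holding the
read lock across both the fill and the send, can be replayed in a small
single-threaded userspace model. This is only a sketch under stated
assumptions: every name in it (model_neigh, model_queue, model_try_update,
replay_unpatched, replay_patched) is hypothetical and none of it is kernel
code. The lock is modelled as a reader count, and a writer that would have
to wait is modelled as a failed model_try_update() that is retried later.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's NUD_* neighbour states. */
enum model_state { MODEL_NONE = 0, MODEL_STALE = 1, MODEL_REACHABLE = 2 };

/* Minimal model of a neighbour entry. */
struct model_neigh {
	enum model_state nud_state;
	int readers;	/* > 0 models a held read_lock_bh(&n->lock) */
};

/* Models the listener's netlink receive queue. */
struct model_queue {
	enum model_state msgs[4];
	size_t len;
};

/* Models rtnl_notify(): append one message to the queue. */
static void model_send(struct model_queue *q, enum model_state s)
{
	assert(q->len < sizeof(q->msgs) / sizeof(q->msgs[0]));
	q->msgs[q->len++] = s;
}

/*
 * Models the write side of neigh_update(). Returns 0 when a real writer
 * would have to wait because a reader still holds the lock.
 */
static int model_try_update(struct model_neigh *n, enum model_state s)
{
	if (n->readers > 0)
		return 0;
	n->nud_state = s;
	return 1;
}

/* Unpatched: the lock covers only the fill, so the sends can swap. */
static void replay_unpatched(struct model_queue *q)
{
	struct model_neigh n = { MODEL_NONE, 0 };
	enum model_state msg_a, msg_b;

	q->len = 0;
	(void)model_try_update(&n, MODEL_STALE);	/* A: update */
	msg_a = n.nud_state;				/* A: fill, lock dropped */
	(void)model_try_update(&n, MODEL_REACHABLE);	/* B: update */
	msg_b = n.nud_state;				/* B: fill */
	model_send(q, msg_b);				/* B: sends first */
	model_send(q, msg_a);				/* A: sends late */
}

/* Patched: the read lock is held from the fill through the send. */
static int replay_patched(struct model_queue *q)
{
	struct model_neigh n = { MODEL_NONE, 0 };
	enum model_state msg_a, msg_b;
	int blocked;

	q->len = 0;
	(void)model_try_update(&n, MODEL_STALE);	/* A: update */
	n.readers++;					/* A: read lock */
	msg_a = n.nud_state;				/* A: fill */
	blocked = !model_try_update(&n, MODEL_REACHABLE); /* B: must wait */
	model_send(q, msg_a);				/* A: send */
	n.readers--;					/* A: read unlock */
	if (blocked)
		(void)model_try_update(&n, MODEL_REACHABLE); /* B: proceeds */
	n.readers++;					/* B: read lock */
	msg_b = n.nud_state;				/* B: fill */
	model_send(q, msg_b);				/* B: send */
	n.readers--;					/* B: read unlock */
	return blocked;
}
```

In this model replay_unpatched() leaves the queue as REACHABLE then STALE,
while replay_patched() leaves it as STALE then REACHABLE because the second
update cannot run until the first notification has been sent.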



* Re: neighbour netlink notifications delivered in wrong order
  2022-06-07  2:07 ` Andy Roulin
@ 2022-06-07  3:19   ` Stephen Hemminger
  2022-06-07 16:29     ` Francesco Ruggeri
  0 siblings, 1 reply; 13+ messages in thread
From: Stephen Hemminger @ 2022-06-07  3:19 UTC (permalink / raw)
  To: Andy Roulin; +Cc: fruggeri, netdev

On Mon, 6 Jun 2022 19:07:04 -0700
Andy Roulin <aroulin@nvidia.com> wrote:

> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 54625287ee5b..a91dfcbfc01c 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -2531,23 +2531,19 @@ static int neigh_fill_info(struct sk_buff *skb, 
> struct neighbour *neigh,
>   	if (nla_put(skb, NDA_DST, neigh->tbl->key_len, neigh->primary_key))
>   		goto nla_put_failure;
> 
> -	read_lock_bh(&neigh->lock);
>   	ndm->ndm_state	 = neigh->nud_state;

Accessing neighbor state outside of lock is not safe.

But you should be able to use RCU here??


* Re: neighbour netlink notifications delivered in wrong order
  2022-06-07  3:19   ` Stephen Hemminger
@ 2022-06-07 16:29     ` Francesco Ruggeri
  2022-06-07 17:32       ` Stephen Hemminger
  0 siblings, 1 reply; 13+ messages in thread
From: Francesco Ruggeri @ 2022-06-07 16:29 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Andy Roulin, netdev

On Mon, Jun 6, 2022 at 8:19 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Mon, 6 Jun 2022 19:07:04 -0700
> Andy Roulin <aroulin@nvidia.com> wrote:
>
> > diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> > index 54625287ee5b..a91dfcbfc01c 100644
> > --- a/net/core/neighbour.c
> > +++ b/net/core/neighbour.c
> > @@ -2531,23 +2531,19 @@ static int neigh_fill_info(struct sk_buff *skb,
> > struct neighbour *neigh,
> >       if (nla_put(skb, NDA_DST, neigh->tbl->key_len, neigh->primary_key))
> >               goto nla_put_failure;
> >
> > -     read_lock_bh(&neigh->lock);
> >       ndm->ndm_state   = neigh->nud_state;
>
> Accessing neighbor state outside of lock is not safe.
>
> But you should be able to use RCU here??

I think the patch removes the lock from neigh_fill_info but it then uses it
to protect all calls to neigh_fill_info, so the access should still be safe.
In case of __neigh_notify the lock also extends to protect rtnl_notify,
guaranteeing that the state cannot be changed while the notification
is in progress (I assume all state changes are protected by the same lock).
Andy, is that the idea?


* Re: neighbour netlink notifications delivered in wrong order
  2022-06-07 16:29     ` Francesco Ruggeri
@ 2022-06-07 17:32       ` Stephen Hemminger
  2022-06-07 20:03         ` Francesco Ruggeri
  0 siblings, 1 reply; 13+ messages in thread
From: Stephen Hemminger @ 2022-06-07 17:32 UTC (permalink / raw)
  To: Francesco Ruggeri; +Cc: Andy Roulin, netdev

On Tue, 7 Jun 2022 09:29:45 -0700
Francesco Ruggeri <fruggeri@arista.com> wrote:

> On Mon, Jun 6, 2022 at 8:19 PM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > On Mon, 6 Jun 2022 19:07:04 -0700
> > Andy Roulin <aroulin@nvidia.com> wrote:
> >  
> > > diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> > > index 54625287ee5b..a91dfcbfc01c 100644
> > > --- a/net/core/neighbour.c
> > > +++ b/net/core/neighbour.c
> > > @@ -2531,23 +2531,19 @@ static int neigh_fill_info(struct sk_buff *skb,
> > > struct neighbour *neigh,
> > >       if (nla_put(skb, NDA_DST, neigh->tbl->key_len, neigh->primary_key))
> > >               goto nla_put_failure;
> > >
> > > -     read_lock_bh(&neigh->lock);
> > >       ndm->ndm_state   = neigh->nud_state;  
> >
> > Accessing neighbor state outside of lock is not safe.
> >
> > But you should be able to use RCU here??  
> 
> I think the patch removes the lock from neigh_fill_info but it then uses it
> to protect all calls to neigh_fill_info, so the access should still be safe.
> In case of __neigh_notify the lock also extends to protect rtnl_notify,
> guaranteeing that the state cannot be changed while the notification
> is in progress (I assume all state changes are protected by the same lock).
> Andy, is that the idea?

Neigh info is already protected by RCU, is per neighbour reader/writer lock
still needed at all?


* Re: neighbour netlink notifications delivered in wrong order
  2022-06-07 17:32       ` Stephen Hemminger
@ 2022-06-07 20:03         ` Francesco Ruggeri
  2022-06-08  3:49           ` Andy Roulin
  0 siblings, 1 reply; 13+ messages in thread
From: Francesco Ruggeri @ 2022-06-07 20:03 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Andy Roulin, netdev

On Tue, Jun 7, 2022 at 10:32 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Tue, 7 Jun 2022 09:29:45 -0700
> Francesco Ruggeri <fruggeri@arista.com> wrote:
>
> > On Mon, Jun 6, 2022 at 8:19 PM Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > >
> > > On Mon, 6 Jun 2022 19:07:04 -0700
> > > Andy Roulin <aroulin@nvidia.com> wrote:
> > >
> > > > diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> > > > index 54625287ee5b..a91dfcbfc01c 100644
> > > > --- a/net/core/neighbour.c
> > > > +++ b/net/core/neighbour.c
> > > > @@ -2531,23 +2531,19 @@ static int neigh_fill_info(struct sk_buff *skb,
> > > > struct neighbour *neigh,
> > > >       if (nla_put(skb, NDA_DST, neigh->tbl->key_len, neigh->primary_key))
> > > >               goto nla_put_failure;
> > > >
> > > > -     read_lock_bh(&neigh->lock);
> > > >       ndm->ndm_state   = neigh->nud_state;
> > >
> > > Accessing neighbor state outside of lock is not safe.
> > >
> > > But you should be able to use RCU here??
> >
> > I think the patch removes the lock from neigh_fill_info but it then uses it
> > to protect all calls to neigh_fill_info, so the access should still be safe.
> > In case of __neigh_notify the lock also extends to protect rtnl_notify,
> > guaranteeing that the state cannot be changed while the notification
> > is in progress (I assume all state changes are protected by the same lock).
> > Andy, is that the idea?
>
> Neigh info is already protected by RCU, is per neighbour reader/writer lock
> still needed at all?

The goal of the patch seems to be to make changing a neighbour's state and
delivering the corresponding notification atomic, in order to prevent
reordering of notifications. It uses the existing lock to do so.
Can reordering be prevented if the lock is replaced with rcu?


* Re: neighbour netlink notifications delivered in wrong order
  2022-06-07 20:03         ` Francesco Ruggeri
@ 2022-06-08  3:49           ` Andy Roulin
  2022-06-09 16:40             ` Francesco Ruggeri
  2023-04-12  0:41             ` Stephen Hemminger
  0 siblings, 2 replies; 13+ messages in thread
From: Andy Roulin @ 2022-06-08  3:49 UTC (permalink / raw)
  To: Francesco Ruggeri, Stephen Hemminger; +Cc: netdev

On 6/7/22 1:03 PM, Francesco Ruggeri wrote:
> On Tue, Jun 7, 2022 at 10:32 AM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
>>
>> On Tue, 7 Jun 2022 09:29:45 -0700
>> Francesco Ruggeri <fruggeri@arista.com> wrote:
>>
>>> On Mon, Jun 6, 2022 at 8:19 PM Stephen Hemminger
>>> <stephen@networkplumber.org> wrote:
>>>>
>>>> On Mon, 6 Jun 2022 19:07:04 -0700
>>>> Andy Roulin <aroulin@nvidia.com> wrote:
>>>>
>>>>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>>>>> index 54625287ee5b..a91dfcbfc01c 100644
>>>>> --- a/net/core/neighbour.c
>>>>> +++ b/net/core/neighbour.c
>>>>> @@ -2531,23 +2531,19 @@ static int neigh_fill_info(struct sk_buff *skb,
>>>>> struct neighbour *neigh,
>>>>>        if (nla_put(skb, NDA_DST, neigh->tbl->key_len, neigh->primary_key))
>>>>>                goto nla_put_failure;
>>>>>
>>>>> -     read_lock_bh(&neigh->lock);
>>>>>        ndm->ndm_state   = neigh->nud_state;
>>>>
>>>> Accessing neighbor state outside of lock is not safe.
>>>>
>>>> But you should be able to use RCU here??
>>>
>>> I think the patch removes the lock from neigh_fill_info but it then uses it
>>> to protect all calls to neigh_fill_info, so the access should still be safe.
>>> In case of __neigh_notify the lock also extends to protect rtnl_notify,
>>> guaranteeing that the state cannot be changed while the notification
>>> is in progress (I assume all state changes are protected by the same lock).
>>> Andy, is that the idea?

Yes correct.

>>
>> Neigh info is already protected by RCU, is per neighbour reader/writer lock
>> still needed at all?
> 
> The goal of the patch seems to be to make changing a neighbour's state and
> delivering the corresponding notification atomic, in order to prevent
> reordering of notifications. It uses the existing lock to do so.
> Can reordering be prevented if the lock is replaced with rcu?

Yes that's the goal of the patch. I'd have to look in more details if 
there's a better solution with RCU.


* Re: neighbour netlink notifications delivered in wrong order
  2022-06-08  3:49           ` Andy Roulin
@ 2022-06-09 16:40             ` Francesco Ruggeri
  2022-06-10 16:18               ` Francesco Ruggeri
  2023-04-12  0:41             ` Stephen Hemminger
  1 sibling, 1 reply; 13+ messages in thread
From: Francesco Ruggeri @ 2022-06-09 16:40 UTC (permalink / raw)
  To: Andy Roulin; +Cc: Stephen Hemminger, netdev

On Mon, Jun 6, 2022 at 7:07 PM Andy Roulin <aroulin@nvidia.com> wrote:
>
> Below is the patch I have been using and it has worked for me. I didn't
> get a chance yet to test all cases or with net-next but I am planning to
> send upstream.

Thanks Andy, the patch fixes the reordering that I was seeing in my
failure scenario.

Francesco


* Re: neighbour netlink notifications delivered in wrong order
  2022-06-09 16:40             ` Francesco Ruggeri
@ 2022-06-10 16:18               ` Francesco Ruggeri
  2022-06-16 18:33                 ` Andy Roulin
  0 siblings, 1 reply; 13+ messages in thread
From: Francesco Ruggeri @ 2022-06-10 16:18 UTC (permalink / raw)
  To: Andy Roulin; +Cc: Stephen Hemminger, netdev

On Thu, Jun 9, 2022 at 9:40 AM Francesco Ruggeri <fruggeri@arista.com> wrote:
>
> On Mon, Jun 6, 2022 at 7:07 PM Andy Roulin <aroulin@nvidia.com> wrote:
> >
> > Below is the patch I have been using and it has worked for me. I didn't
> > get a chance yet to test all cases or with net-next but I am planning to
> > send upstream.
>
> Thanks Andy, the patch fixes the reordering that I was seeing in my
> failure scenario.

I think that with this patch there may still be a narrower race
condition, though probably not as bad.
The patch guarantees that the notification is for the latest state change,
but not necessarily the change that initiated the notification.
In this scenario:

n->nud_state = STALE
write_unlock_bh(n->lock)
                       n->nud_state = REACHABLE
                       write_unlock_bh(n->lock)
                       neigh_notify
neigh_notify

wouldn't both notifications be for REACHABLE?
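
The narrower race above can be replayed in a few lines of userspace C. This
is a sketch only, with hypothetical names (snap_state, snap_queue,
snap_notify, replay_coalesced). It assumes that a notification snapshots the
neighbour state at the time it is built, not at the time of the update that
triggered it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's NUD_STALE / NUD_REACHABLE. */
enum snap_state { SNAP_STALE = 1, SNAP_REACHABLE = 2 };

/* Models the listener's netlink receive queue. */
struct snap_queue {
	enum snap_state msgs[4];
	size_t len;
};

/*
 * Models neigh_notify() with the lock held across fill and send: the
 * message carries whatever the state is at the moment of the call.
 */
static void snap_notify(struct snap_queue *q, const enum snap_state *nud_state)
{
	assert(q->len < sizeof(q->msgs) / sizeof(q->msgs[0]));
	q->msgs[q->len++] = *nud_state;
}

/* Replays the two-column scenario in the order shown in the diagram. */
static void replay_coalesced(struct snap_queue *q)
{
	enum snap_state nud_state;

	q->len = 0;
	nud_state = SNAP_STALE;		/* left:  update, lock dropped */
	nud_state = SNAP_REACHABLE;	/* right: update, lock dropped */
	snap_notify(q, &nud_state);	/* right: notify */
	snap_notify(q, &nud_state);	/* left:  notify, sees latest state */
}
```

In this model both queued messages report REACHABLE: the STALE transition
never reaches the listener, but no stale state is delivered after a newer
one.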


* Re: neighbour netlink notifications delivered in wrong order
  2022-06-10 16:18               ` Francesco Ruggeri
@ 2022-06-16 18:33                 ` Andy Roulin
  2023-04-11 19:49                   ` Kevin Mitchell
  0 siblings, 1 reply; 13+ messages in thread
From: Andy Roulin @ 2022-06-16 18:33 UTC (permalink / raw)
  To: Francesco Ruggeri; +Cc: Stephen Hemminger, netdev

On 6/10/22 9:18 AM, Francesco Ruggeri wrote:
> On Thu, Jun 9, 2022 at 9:40 AM Francesco Ruggeri <fruggeri@arista.com> wrote:
>>
>> On Mon, Jun 6, 2022 at 7:07 PM Andy Roulin <aroulin@nvidia.com> wrote:
>>>
>>> Below is the patch I have been using and it has worked for me. I didn't
>>> get a chance yet to test all cases or with net-next but I am planning to
>>> send upstream.
>>
>> Thanks Andy, the patch fixes the reordering that I was seeing in my
>> failure scenario.
> 
> I think that with this patch there may still be a narrower race
> condition, though probably not as bad.
> The patch guarantees that the notification is for the latest state change,
> but not necessarily the change that initiated the notification.
> In this scenario:
> 
> n->nud_state = STALE
> write_unlock_bh(n->lock)
>                         n->nud_state = REACHABLE
>                         write_unlock_bh(n->lock)
>                         neigh_notify
> neigh_notify
> 
> wouldn't both notifications be for REACHABLE?

Yes that's right, in this case it will consolidate both notifications to 
be the same, i.e., last state.


* Re: neighbour netlink notifications delivered in wrong order
  2022-06-16 18:33                 ` Andy Roulin
@ 2023-04-11 19:49                   ` Kevin Mitchell
  0 siblings, 0 replies; 13+ messages in thread
From: Kevin Mitchell @ 2023-04-11 19:49 UTC (permalink / raw)
  To: aroulin; +Cc: netdev, stephen

-fruggeri@arista.com as he is no longer at the company

Has there been any progress in getting this patch or some other fix for this
issue into mainline? It's been working well for us so far in our testing.


* Re: neighbour netlink notifications delivered in wrong order
  2022-06-08  3:49           ` Andy Roulin
  2022-06-09 16:40             ` Francesco Ruggeri
@ 2023-04-12  0:41             ` Stephen Hemminger
  2023-04-12  1:22               ` Stephen Hemminger
  1 sibling, 1 reply; 13+ messages in thread
From: Stephen Hemminger @ 2023-04-12  0:41 UTC (permalink / raw)
  To: Andy Roulin; +Cc: Francesco Ruggeri, netdev

On Tue, 7 Jun 2022 20:49:40 -0700
Andy Roulin <aroulin@nvidia.com> wrote:

> On 6/7/22 1:03 PM, Francesco Ruggeri wrote:
> > On Tue, Jun 7, 2022 at 10:32 AM Stephen Hemminger
> > <stephen@networkplumber.org> wrote:  
> >>
> >> On Tue, 7 Jun 2022 09:29:45 -0700
> >> Francesco Ruggeri <fruggeri@arista.com> wrote:
> >>  
> >>> On Mon, Jun 6, 2022 at 8:19 PM Stephen Hemminger
> >>> <stephen@networkplumber.org> wrote:  
> >>>>
> >>>> On Mon, 6 Jun 2022 19:07:04 -0700
> >>>> Andy Roulin <aroulin@nvidia.com> wrote:
> >>>>  
> >>>>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> >>>>> index 54625287ee5b..a91dfcbfc01c 100644
> >>>>> --- a/net/core/neighbour.c
> >>>>> +++ b/net/core/neighbour.c
> >>>>> @@ -2531,23 +2531,19 @@ static int neigh_fill_info(struct sk_buff *skb,
> >>>>> struct neighbour *neigh,
> >>>>>        if (nla_put(skb, NDA_DST, neigh->tbl->key_len, neigh->primary_key))
> >>>>>                goto nla_put_failure;
> >>>>>
> >>>>> -     read_lock_bh(&neigh->lock);
> >>>>>        ndm->ndm_state   = neigh->nud_state;  
> >>>>
> >>>> Accessing neighbor state outside of lock is not safe.
> >>>>
> >>>> But you should be able to use RCU here??  
> >>>
> >>> I think the patch removes the lock from neigh_fill_info but it then uses it
> >>> to protect all calls to neigh_fill_info, so the access should still be safe.
> >>> In case of __neigh_notify the lock also extends to protect rtnl_notify,
> >>> guaranteeing that the state cannot be changed while the notification
> >>> is in progress (I assume all state changes are protected by the same lock).
> >>> Andy, is that the idea?  
> 
> Yes correct.
> 
> >>
> >> Neigh info is already protected by RCU, is per neighbour reader/writer lock
> >> still needed at all?  
> > 
> > The goal of the patch seems to be to make changing a neighbour's state and
> > delivering the corresponding notification atomic, in order to prevent
> > reordering of notifications. It uses the existing lock to do so.
> > Can reordering be prevented if the lock is replaced with rcu?  
> 
> Yes that's the goal of the patch. I'd have to look in more details if 
> there's a better solution with RCU.

But the patch would update ndm->ndm_state based on neigh, but there
is nothing ensuring that neigh is not going to be deleted or modified.


* Re: neighbour netlink notifications delivered in wrong order
  2023-04-12  0:41             ` Stephen Hemminger
@ 2023-04-12  1:22               ` Stephen Hemminger
  0 siblings, 0 replies; 13+ messages in thread
From: Stephen Hemminger @ 2023-04-12  1:22 UTC (permalink / raw)
  To: Andy Roulin; +Cc: Francesco Ruggeri, netdev

On Tue, 11 Apr 2023 17:41:31 -0700
Stephen Hemminger <stephen@networkplumber.org> wrote:

> > >> Neigh info is already protected by RCU, is per neighbour reader/writer lock
> > >> still needed at all?  

Yes, it is still needed: nothing prevents an incoming packet from changing
the contents of a neighbour entry.
  
> > > 
> > > The goal of the patch seems to be to make changing a neighbour's state and
> > > delivering the corresponding notification atomic, in order to prevent
> > > reordering of notifications. It uses the existing lock to do so.
> > > Can reordering be prevented if the lock is replaced with rcu?    
> > 
> > Yes that's the goal of the patch. I'd have to look in more details if 
> > there's a better solution with RCU.  
> 
> But the patch would update ndm->ndm_state based on neigh, but there
> is nothing ensuring that neigh is not going to be deleted or modified.

Making the update atomic would require a redesign of the locking here.
The update would have to acquire the write lock, modify, then call
the code that generates the message; drop the write lock and then
queue the message to the netlink socket.
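
The sequence described above (take the write lock, modify, build the
message, drop the lock, then queue) could look roughly like the userspace
sketch below. It is an illustration under assumptions, not a proposed kernel
change: a pthread mutex stands in for the write side of n->lock, and all
names (seq_neigh, seq_msg, seq_queue, seq_update, seq_enqueue) are
hypothetical. The sketch does not settle how the final queueing step is
ordered between CPUs; the thread leaves that open.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's NUD_STALE / NUD_REACHABLE. */
enum seq_state { SEQ_STALE = 1, SEQ_REACHABLE = 2 };

/* Models a fully built netlink message. */
struct seq_msg {
	enum seq_state state;
};

/* Models a neighbour entry; the mutex stands in for the write lock. */
struct seq_neigh {
	pthread_mutex_t lock;
	enum seq_state nud_state;
};

/* Models the listener's netlink receive queue. */
struct seq_queue {
	struct seq_msg msgs[4];
	size_t len;
};

static void seq_neigh_init(struct seq_neigh *n, enum seq_state s)
{
	pthread_mutex_init(&n->lock, NULL);
	n->nud_state = s;
}

/*
 * Modify the entry and build the message while the lock is held, so the
 * message always describes the change that triggered it. The caller
 * queues the returned message after the lock has been dropped.
 */
static struct seq_msg seq_update(struct seq_neigh *n, enum seq_state s)
{
	struct seq_msg msg;

	pthread_mutex_lock(&n->lock);
	n->nud_state = s;		/* modify */
	msg.state = n->nud_state;	/* generate the message */
	pthread_mutex_unlock(&n->lock);
	return msg;
}

/* Models queueing the message to the netlink socket, outside the lock. */
static void seq_enqueue(struct seq_queue *q, struct seq_msg msg)
{
	assert(q->len < sizeof(q->msgs) / sizeof(q->msgs[0]));
	q->msgs[q->len++] = msg;
}
```

Compared with snapshotting at notify time, each message here is tied to its
own update, so two back-to-back updates produce two distinct messages
instead of two copies of the latest state.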

