linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] bonding: avoid repeated display of same link status change
@ 2018-09-17  7:20 mk.singh
  2018-09-17 14:38 ` Eric Dumazet
  0 siblings, 1 reply; 19+ messages in thread
From: mk.singh @ 2018-09-17  7:20 UTC (permalink / raw)
  To: netdev
  Cc: Manish Kumar Singh, Jay Vosburgh, Veaceslav Falico,
	Andy Gospodarek, David S. Miller, linux-kernel

From: Manish Kumar Singh <mk.singh@oracle.com>

When link status change needs to be committed and rtnl lock couldn't be
taken, avoid redisplay of same link status change message.

Signed-off-by: Manish Kumar Singh <mk.singh@oracle.com>
---
 drivers/net/bonding/bond_main.c | 6 ++++--
 include/net/bonding.h           | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 217b790d22ed..fb4e3aff1677 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2087,7 +2087,7 @@ static int bond_miimon_inspect(struct bonding *bond)
 			bond_propose_link_state(slave, BOND_LINK_FAIL);
 			commit++;
 			slave->delay = bond->params.downdelay;
-			if (slave->delay) {
+			if (slave->delay && !bond->rtnl_needed) {
 				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
 					    (BOND_MODE(bond) ==
 					     BOND_MODE_ACTIVEBACKUP) ?
@@ -2127,7 +2127,7 @@ static int bond_miimon_inspect(struct bonding *bond)
 			commit++;
 			slave->delay = bond->params.updelay;
 
-			if (slave->delay) {
+			if (slave->delay && !bond->rtnl_needed) {
 				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
 					    slave->dev->name,
 					    ignore_updelay ? 0 :
@@ -2301,9 +2301,11 @@ static void bond_mii_monitor(struct work_struct *work)
 		if (!rtnl_trylock()) {
 			delay = 1;
 			should_notify_peers = false;
+			bond->rtnl_needed = true;
 			goto re_arm;
 		}
 
+		bond->rtnl_needed = false;
 		bond_for_each_slave(bond, slave, iter) {
 			bond_commit_link_state(slave, BOND_SLAVE_NOTIFY_LATER);
 		}
diff --git a/include/net/bonding.h b/include/net/bonding.h
index 808f1d167349..50d61cf77855 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -234,6 +234,7 @@ struct bonding {
 	struct	 dentry *debug_dir;
 #endif /* CONFIG_DEBUG_FS */
 	struct rtnl_link_stats64 bond_stats;
+	bool rtnl_needed;
 };
 
 #define bond_slave_get_rcu(dev) \
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] bonding: avoid repeated display of same link status change
  2018-09-17  7:20 [PATCH] bonding: avoid repeated display of same link status change mk.singh
@ 2018-09-17 14:38 ` Eric Dumazet
  2018-09-18  5:05   ` Manish Kumar Singh
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2018-09-17 14:38 UTC (permalink / raw)
  To: mk.singh, netdev
  Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, David S. Miller,
	linux-kernel



On 09/17/2018 12:20 AM, mk.singh@oracle.com wrote:
> From: Manish Kumar Singh <mk.singh@oracle.com>
> 
> When link status change needs to be committed and rtnl lock couldn't be
> taken, avoid redisplay of same link status change message.
> 
> Signed-off-by: Manish Kumar Singh <mk.singh@oracle.com>
> ---
>  drivers/net/bonding/bond_main.c | 6 ++++--
>  include/net/bonding.h           | 1 +
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 217b790d22ed..fb4e3aff1677 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -2087,7 +2087,7 @@ static int bond_miimon_inspect(struct bonding *bond)
>  			bond_propose_link_state(slave, BOND_LINK_FAIL);
>  			commit++;
>  			slave->delay = bond->params.downdelay;
> -			if (slave->delay) {
> +			if (slave->delay && !bond->rtnl_needed) {
>  				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
>  					    (BOND_MODE(bond) ==
>  					     BOND_MODE_ACTIVEBACKUP) ?
> @@ -2127,7 +2127,7 @@ static int bond_miimon_inspect(struct bonding *bond)
>  			commit++;
>  			slave->delay = bond->params.updelay;
>  
> -			if (slave->delay) {
> +			if (slave->delay && !bond->rtnl_needed) {
>  				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
>  					    slave->dev->name,
>  					    ignore_updelay ? 0 :
> @@ -2301,9 +2301,11 @@ static void bond_mii_monitor(struct work_struct *work)
>  		if (!rtnl_trylock()) {
>  			delay = 1;
>  			should_notify_peers = false;
> +			bond->rtnl_needed = true;

How can you set a shared variable with no synchronization ?

A bool is particularly dangerous here, at least on some arches.

>  			goto re_arm;
>  		}
>  
> +		bond->rtnl_needed = false;
>  		bond_for_each_slave(bond, slave, iter) {
>  			bond_commit_link_state(slave, BOND_SLAVE_NOTIFY_LATER);
>  		}
> diff --git a/include/net/bonding.h b/include/net/bonding.h
> index 808f1d167349..50d61cf77855 100644
> --- a/include/net/bonding.h
> +++ b/include/net/bonding.h
> @@ -234,6 +234,7 @@ struct bonding {
>  	struct	 dentry *debug_dir;
>  #endif /* CONFIG_DEBUG_FS */
>  	struct rtnl_link_stats64 bond_stats;
> +	bool rtnl_needed;
>  };
>  
>  #define bond_slave_get_rcu(dev) \
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH] bonding: avoid repeated display of same link status change
  2018-09-17 14:38 ` Eric Dumazet
@ 2018-09-18  5:05   ` Manish Kumar Singh
  2018-09-18 14:00     ` Eric Dumazet
  0 siblings, 1 reply; 19+ messages in thread
From: Manish Kumar Singh @ 2018-09-18  5:05 UTC (permalink / raw)
  To: Eric Dumazet, netdev
  Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, David S. Miller,
	linux-kernel



> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> Sent: 17 सितम्बर 2018 20:08
> To: Manish Kumar Singh; netdev@vger.kernel.org
> Cc: Jay Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH] bonding: avoid repeated display of same link status
> change
> 
> 
> 
> On 09/17/2018 12:20 AM, mk.singh@oracle.com wrote:
> > From: Manish Kumar Singh <mk.singh@oracle.com>
> >
> > When link status change needs to be committed and rtnl lock couldn't be
> > taken, avoid redisplay of same link status change message.
> >
> > Signed-off-by: Manish Kumar Singh <mk.singh@oracle.com>
> > ---
> >  drivers/net/bonding/bond_main.c | 6 ++++--
> >  include/net/bonding.h           | 1 +
> >  2 files changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/bonding/bond_main.c
> b/drivers/net/bonding/bond_main.c
> > index 217b790d22ed..fb4e3aff1677 100644
> > --- a/drivers/net/bonding/bond_main.c
> > +++ b/drivers/net/bonding/bond_main.c
> > @@ -2087,7 +2087,7 @@ static int bond_miimon_inspect(struct bonding
> *bond)
> >  			bond_propose_link_state(slave, BOND_LINK_FAIL);
> >  			commit++;
> >  			slave->delay = bond->params.downdelay;
> > -			if (slave->delay) {
> > +			if (slave->delay && !bond->rtnl_needed) {
> >  				netdev_info(bond->dev, "link status down for
> %sinterface %s, disabling it in %d ms\n",
> >  					    (BOND_MODE(bond) ==
> >  					     BOND_MODE_ACTIVEBACKUP) ?
> > @@ -2127,7 +2127,7 @@ static int bond_miimon_inspect(struct bonding
> *bond)
> >  			commit++;
> >  			slave->delay = bond->params.updelay;
> >
> > -			if (slave->delay) {
> > +			if (slave->delay && !bond->rtnl_needed) {
> >  				netdev_info(bond->dev, "link status up for
> interface %s, enabling it in %d ms\n",
> >  					    slave->dev->name,
> >  					    ignore_updelay ? 0 :
> > @@ -2301,9 +2301,11 @@ static void bond_mii_monitor(struct
> work_struct *work)
> >  		if (!rtnl_trylock()) {
> >  			delay = 1;
> >  			should_notify_peers = false;
> > +			bond->rtnl_needed = true;
> 
> How can you set a shared variable with no synchronization ?
Thanks Eric for reviewing the patch. rtnl_needed is not a shared variable, it is part of bonding structure, that is one per bonding driver instance. There can't be two parallel instances of bond_miimon_inspect for a single  bonding driver instance at any given point of time. and only bond_miimon_inspect updates it. That’s why I think there is no need of any synchronization here.  

> 
> A bool is particularly dangerous here, at least on some arches.
Thank you for cautioning us on bool usage. even a u8 can meet our requirement.  we will change it.  but; if time permits can you share more on "particularly dangerous here, at least on some arches".

F
> 
> >  			goto re_arm;
> >  		}
> >
> > +		bond->rtnl_needed = false;
> >  		bond_for_each_slave(bond, slave, iter) {
> >  			bond_commit_link_state(slave,
> BOND_SLAVE_NOTIFY_LATER);
> >  		}
> > diff --git a/include/net/bonding.h b/include/net/bonding.h
> > index 808f1d167349..50d61cf77855 100644
> > --- a/include/net/bonding.h
> > +++ b/include/net/bonding.h
> > @@ -234,6 +234,7 @@ struct bonding {
> >  	struct	 dentry *debug_dir;
> >  #endif /* CONFIG_DEBUG_FS */
> >  	struct rtnl_link_stats64 bond_stats;
> > +	bool rtnl_needed;
> >  };
> >
> >  #define bond_slave_get_rcu(dev) \
> >

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] bonding: avoid repeated display of same link status change
  2018-09-18  5:05   ` Manish Kumar Singh
@ 2018-09-18 14:00     ` Eric Dumazet
  2018-09-24  7:05       ` Manish Kumar Singh
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2018-09-18 14:00 UTC (permalink / raw)
  To: Manish Kumar Singh, Eric Dumazet, netdev
  Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, David S. Miller,
	linux-kernel



On 09/17/2018 10:05 PM, Manish Kumar Singh wrote:
> 
> 
>> -----Original Message-----
>> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
>> Sent: 17 सितम्बर 2018 20:08
>> To: Manish Kumar Singh; netdev@vger.kernel.org
>> Cc: Jay Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller; linux-
>> kernel@vger.kernel.org
>> Subject: Re: [PATCH] bonding: avoid repeated display of same link status
>> change
>>
>>
>>
>> On 09/17/2018 12:20 AM, mk.singh@oracle.com wrote:
>>> From: Manish Kumar Singh <mk.singh@oracle.com>
>>>
>>> When link status change needs to be committed and rtnl lock couldn't be
>>> taken, avoid redisplay of same link status change message.
>>>
>>> Signed-off-by: Manish Kumar Singh <mk.singh@oracle.com>
>>> ---
>>>  drivers/net/bonding/bond_main.c | 6 ++++--
>>>  include/net/bonding.h           | 1 +
>>>  2 files changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/net/bonding/bond_main.c
>> b/drivers/net/bonding/bond_main.c
>>> index 217b790d22ed..fb4e3aff1677 100644
>>> --- a/drivers/net/bonding/bond_main.c
>>> +++ b/drivers/net/bonding/bond_main.c
>>> @@ -2087,7 +2087,7 @@ static int bond_miimon_inspect(struct bonding
>> *bond)
>>>  			bond_propose_link_state(slave, BOND_LINK_FAIL);
>>>  			commit++;
>>>  			slave->delay = bond->params.downdelay;
>>> -			if (slave->delay) {
>>> +			if (slave->delay && !bond->rtnl_needed) {
>>>  				netdev_info(bond->dev, "link status down for
>> %sinterface %s, disabling it in %d ms\n",
>>>  					    (BOND_MODE(bond) ==
>>>  					     BOND_MODE_ACTIVEBACKUP) ?
>>> @@ -2127,7 +2127,7 @@ static int bond_miimon_inspect(struct bonding
>> *bond)
>>>  			commit++;
>>>  			slave->delay = bond->params.updelay;
>>>
>>> -			if (slave->delay) {
>>> +			if (slave->delay && !bond->rtnl_needed) {
>>>  				netdev_info(bond->dev, "link status up for
>> interface %s, enabling it in %d ms\n",
>>>  					    slave->dev->name,
>>>  					    ignore_updelay ? 0 :
>>> @@ -2301,9 +2301,11 @@ static void bond_mii_monitor(struct
>> work_struct *work)
>>>  		if (!rtnl_trylock()) {
>>>  			delay = 1;
>>>  			should_notify_peers = false;
>>> +			bond->rtnl_needed = true;
>>
>> How can you set a shared variable with no synchronization ?
> Thanks Eric for reviewing the patch. rtnl_needed is not a shared variable, it is part of bonding structure, that is one per bonding driver instance. There can't be two parallel instances of bond_miimon_inspect for a single  bonding driver instance at any given point of time. and only bond_miimon_inspect updates it. That’s why I think there is no need of any synchronization here.  
> 
>

If rtnl_trylock() can not grab RTNL, 
there is no way the current thread can set the  variable without a race, if the word including rtnl_needed is shared by other fields in the structure.

Your patch adds a subtle possibility of future bugs, even if it runs fine today.

Do not pave the way for future bugs, make your code robust, please.








^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH] bonding: avoid repeated display of same link status change
  2018-09-18 14:00     ` Eric Dumazet
@ 2018-09-24  7:05       ` Manish Kumar Singh
  2018-10-22  7:29         ` Manish Kumar Singh
  0 siblings, 1 reply; 19+ messages in thread
From: Manish Kumar Singh @ 2018-09-24  7:05 UTC (permalink / raw)
  To: Eric Dumazet, netdev
  Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, David S. Miller,
	linux-kernel



> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> Sent: 18 सितम्बर 2018 19:30
> To: Manish Kumar Singh; Eric Dumazet; netdev@vger.kernel.org
> Cc: Jay Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH] bonding: avoid repeated display of same link status
> change
> 
> 
> 
> On 09/17/2018 10:05 PM, Manish Kumar Singh wrote:
> >
> >
> >> -----Original Message-----
> >> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> >> Sent: 17 सितम्बर 2018 20:08
> >> To: Manish Kumar Singh; netdev@vger.kernel.org
> >> Cc: Jay Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller;
> linux-
> >> kernel@vger.kernel.org
> >> Subject: Re: [PATCH] bonding: avoid repeated display of same link status
> >> change
> >>
> >>
> >>
> >> On 09/17/2018 12:20 AM, mk.singh@oracle.com wrote:
> >>> From: Manish Kumar Singh <mk.singh@oracle.com>
> >>>
> >>> When link status change needs to be committed and rtnl lock couldn't be
> >>> taken, avoid redisplay of same link status change message.
> >>>
> >>> Signed-off-by: Manish Kumar Singh <mk.singh@oracle.com>
> >>> ---
> >>>  drivers/net/bonding/bond_main.c | 6 ++++--
> >>>  include/net/bonding.h           | 1 +
> >>>  2 files changed, 5 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/net/bonding/bond_main.c
> >> b/drivers/net/bonding/bond_main.c
> >>> index 217b790d22ed..fb4e3aff1677 100644
> >>> --- a/drivers/net/bonding/bond_main.c
> >>> +++ b/drivers/net/bonding/bond_main.c
> >>> @@ -2087,7 +2087,7 @@ static int bond_miimon_inspect(struct bonding
> >> *bond)
> >>>  			bond_propose_link_state(slave, BOND_LINK_FAIL);
> >>>  			commit++;
> >>>  			slave->delay = bond->params.downdelay;
> >>> -			if (slave->delay) {
> >>> +			if (slave->delay && !bond->rtnl_needed) {
> >>>  				netdev_info(bond->dev, "link status down
> for
> >> %sinterface %s, disabling it in %d ms\n",
> >>>  					    (BOND_MODE(bond) ==
> >>>  					     BOND_MODE_ACTIVEBACKUP) ?
> >>> @@ -2127,7 +2127,7 @@ static int bond_miimon_inspect(struct bonding
> >> *bond)
> >>>  			commit++;
> >>>  			slave->delay = bond->params.updelay;
> >>>
> >>> -			if (slave->delay) {
> >>> +			if (slave->delay && !bond->rtnl_needed) {
> >>>  				netdev_info(bond->dev, "link status up for
> >> interface %s, enabling it in %d ms\n",
> >>>  					    slave->dev->name,
> >>>  					    ignore_updelay ? 0 :
> >>> @@ -2301,9 +2301,11 @@ static void bond_mii_monitor(struct
> >> work_struct *work)
> >>>  		if (!rtnl_trylock()) {
> >>>  			delay = 1;
> >>>  			should_notify_peers = false;
> >>> +			bond->rtnl_needed = true;
> >>
> >> How can you set a shared variable with no synchronization ?
> > Thanks Eric for reviewing the patch. rtnl_needed is not a shared variable, it
> is part of bonding structure, that is one per bonding driver instance. There
> can't be two parallel instances of bond_miimon_inspect for a single  bonding
> driver instance at any given point of time. and only bond_miimon_inspect
> updates it. That’s why I think there is no need of any synchronization here.
> >
> >
> 
> If rtnl_trylock() can not grab RTNL,
> there is no way the current thread can set the  variable without a race, if the
> word including rtnl_needed is shared by other fields in the structure.
> 
> Your patch adds a subtle possibility of future bugs, even if it runs fine today.
> 
> Do not pave the way for future bugs, make your code robust, please.

Thankyou Eric, we are making the changes and will repost the patch after testing it.
-Manish
> 
> 
> 
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH] bonding: avoid repeated display of same link status change
  2018-09-24  7:05       ` Manish Kumar Singh
@ 2018-10-22  7:29         ` Manish Kumar Singh
  0 siblings, 0 replies; 19+ messages in thread
From: Manish Kumar Singh @ 2018-10-22  7:29 UTC (permalink / raw)
  To: Eric Dumazet, netdev
  Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, David S. Miller,
	linux-kernel



> -----Original Message-----
> From: Manish Kumar Singh
> Sent: 24 सितम्बर 2018 12:36
> To: Eric Dumazet; netdev@vger.kernel.org
> Cc: Jay Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller; linux-
> kernel@vger.kernel.org
> Subject: RE: [PATCH] bonding: avoid repeated display of same link status
> change
> 
> 
> 
> > -----Original Message-----
> > From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> > Sent: 18 सितम्बर 2018 19:30
> > To: Manish Kumar Singh; Eric Dumazet; netdev@vger.kernel.org
> > Cc: Jay Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller; linux-
> > kernel@vger.kernel.org
> > Subject: Re: [PATCH] bonding: avoid repeated display of same link status
> > change
> >
> >
> >
> > On 09/17/2018 10:05 PM, Manish Kumar Singh wrote:
> > >
> > >
> > >> -----Original Message-----
> > >> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> > >> Sent: 17 सितम्बर 2018 20:08
> > >> To: Manish Kumar Singh; netdev@vger.kernel.org
> > >> Cc: Jay Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller;
> > linux-
> > >> kernel@vger.kernel.org
> > >> Subject: Re: [PATCH] bonding: avoid repeated display of same link status
> > >> change
> > >>
> > >>
> > >>
> > >> On 09/17/2018 12:20 AM, mk.singh@oracle.com wrote:
> > >>> From: Manish Kumar Singh <mk.singh@oracle.com>
> > >>>
> > >>> When link status change needs to be committed and rtnl lock couldn't
> be
> > >>> taken, avoid redisplay of same link status change message.
> > >>>
> > >>> Signed-off-by: Manish Kumar Singh <mk.singh@oracle.com>
> > >>> ---
> > >>>  drivers/net/bonding/bond_main.c | 6 ++++--
> > >>>  include/net/bonding.h           | 1 +
> > >>>  2 files changed, 5 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/drivers/net/bonding/bond_main.c
> > >> b/drivers/net/bonding/bond_main.c
> > >>> index 217b790d22ed..fb4e3aff1677 100644
> > >>> --- a/drivers/net/bonding/bond_main.c
> > >>> +++ b/drivers/net/bonding/bond_main.c
> > >>> @@ -2087,7 +2087,7 @@ static int bond_miimon_inspect(struct
> bonding
> > >> *bond)
> > >>>  			bond_propose_link_state(slave, BOND_LINK_FAIL);
> > >>>  			commit++;
> > >>>  			slave->delay = bond->params.downdelay;
> > >>> -			if (slave->delay) {
> > >>> +			if (slave->delay && !bond->rtnl_needed) {
> > >>>  				netdev_info(bond->dev, "link status down
> > for
> > >> %sinterface %s, disabling it in %d ms\n",
> > >>>  					    (BOND_MODE(bond) ==
> > >>>  					     BOND_MODE_ACTIVEBACKUP) ?
> > >>> @@ -2127,7 +2127,7 @@ static int bond_miimon_inspect(struct
> bonding
> > >> *bond)
> > >>>  			commit++;
> > >>>  			slave->delay = bond->params.updelay;
> > >>>
> > >>> -			if (slave->delay) {
> > >>> +			if (slave->delay && !bond->rtnl_needed) {
> > >>>  				netdev_info(bond->dev, "link status up for
> > >> interface %s, enabling it in %d ms\n",
> > >>>  					    slave->dev->name,
> > >>>  					    ignore_updelay ? 0 :
> > >>> @@ -2301,9 +2301,11 @@ static void bond_mii_monitor(struct
> > >> work_struct *work)
> > >>>  		if (!rtnl_trylock()) {
> > >>>  			delay = 1;
> > >>>  			should_notify_peers = false;
> > >>> +			bond->rtnl_needed = true;
> > >>
> > >> How can you set a shared variable with no synchronization ?
> > > Thanks Eric for reviewing the patch. rtnl_needed is not a shared variable,
> it
> > is part of bonding structure, that is one per bonding driver instance. There
> > can't be two parallel instances of bond_miimon_inspect for a
> single  bonding
> > driver instance at any given point of time. and only bond_miimon_inspect
> > updates it. That’s why I think there is no need of any synchronization here.
> > >
> > >
> >
> > If rtnl_trylock() can not grab RTNL,
> > there is no way the current thread can set the  variable without a race, if
> the
> > word including rtnl_needed is shared by other fields in the structure.
> >
> > Your patch adds a subtle possibility of future bugs, even if it runs fine
> today.
> >
> > Do not pave the way for future bugs, make your code robust, please.
> 
> Thankyou Eric, we are making the changes and will repost the patch after
> testing it.
> -Manish

Please review the updated patch, I have changed the rtnl_needed to an atomic variable, to avoid any race condition:

From: Manish Kumar Singh <mk.singh@oracle.com>

When link status change needs to be committed and rtnl lock couldn't be taken, avoid redisplay of same link status change message.

Signed-off-by: Manish Kumar Singh <mk.singh@oracle.com>
---
 drivers/net/bonding/bond_main.c | 6 ++++--
 include/net/bonding.h           | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 217b790d22ed..fac5350bf19c 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2087,7 +2087,7 @@ static int bond_miimon_inspect(struct bonding *bond)
 			bond_propose_link_state(slave, BOND_LINK_FAIL);
 			commit++;
 			slave->delay = bond->params.downdelay;
-			if (slave->delay) {
+			if (slave->delay && !atomic_read(&bond->rtnl_needed)) {
 				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
 					    (BOND_MODE(bond) ==
 					     BOND_MODE_ACTIVEBACKUP) ?
@@ -2127,7 +2127,7 @@ static int bond_miimon_inspect(struct bonding *bond)
 			commit++;
 			slave->delay = bond->params.updelay;
 
-			if (slave->delay) {
+			if (slave->delay && !atomic_read(&bond->rtnl_needed)) {
 				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
 					    slave->dev->name,
 					    ignore_updelay ? 0 :
@@ -2301,9 +2301,11 @@ static void bond_mii_monitor(struct work_struct *work)
 		if (!rtnl_trylock()) {
 			delay = 1;
 			should_notify_peers = false;
+			atomic_set(&bond->rtnl_needed, 1);
 			goto re_arm;
 		}
 
+		atomic_set(&bond->rtnl_needed, 0);
 		bond_for_each_slave(bond, slave, iter) {
 			bond_commit_link_state(slave, BOND_SLAVE_NOTIFY_LATER);
 		}
diff --git a/include/net/bonding.h b/include/net/bonding.h index 808f1d167349..ffc1219f7a07 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -234,6 +234,7 @@ struct bonding {
 	struct	 dentry *debug_dir;
 #endif /* CONFIG_DEBUG_FS */
 	struct rtnl_link_stats64 bond_stats;
+	atomic_t rtnl_needed;
 };
 
 #define bond_slave_get_rcu(dev) \
--
2.14.1



Thanks,
Manish

> >
> >
> >
> >
> >
> >
> >

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH] bonding:avoid repeated display of same link status change
  2018-11-04 19:41   ` Michal Kubecek
@ 2018-11-20 10:41     ` Manish Kumar Singh
  0 siblings, 0 replies; 19+ messages in thread
From: Manish Kumar Singh @ 2018-11-20 10:41 UTC (permalink / raw)
  To: Michal Kubecek, David Miller
  Cc: netdev, eric.dumazet, j.vosburgh, vfalico, andy, linux-kernel

> -----Original Message-----
> From: Michal Kubecek [mailto:mkubecek@suse.cz]
> Sent: 05 नवम्बर 2018 01:11
> To: David Miller
> Cc: Manish Kumar Singh; netdev@vger.kernel.org; eric.dumazet@gmail.com;
> j.vosburgh@gmail.com; vfalico@gmail.com; andy@greyhouse.net; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH] bonding:avoid repeated display of same link status
> change
> 
> On Fri, Nov 02, 2018 at 11:31:38PM -0700, David Miller wrote:
> > From: mk.singh@oracle.com
> > Date: Wed, 31 Oct 2018 16:27:28 +0530
> >
> > > -			if (slave->delay) {
> > > +			if (slave->delay &&
> > > +			    !atomic64_read(&bond->rtnl_needed)) {
> >  ...
> > > +			    !atomic64_read(&bond->rtnl_needed)) {
> >  ...
> > > +			atomic64_set(&bond->rtnl_needed, 1);
> >  ...
> > > +		atomic64_set(&bond->rtnl_needed, 0);
> >  ...
> > > @@ -229,6 +229,7 @@ struct bonding {
> > >  	struct	 dentry *debug_dir;
> > >  #endif /* CONFIG_DEBUG_FS */
> > >  	struct rtnl_link_stats64 bond_stats;
> > > +	atomic64_t rtnl_needed;
> >
> > There is nothing "atomic" about a value that is only set and read.
> >
> > And using a full 64-bit value for something taking on only '0' and
> > '1' is unnecessary as well.
> 
> Part of the misunderstanding is caused by the fact that this is actually
> a v4 but not marked as such:
> 
>   v1: https://patchwork.ozlabs.org/patch/955789/
>   v2: https://patchwork.ozlabs.org/patch/970421/
>   v3: https://patchwork.ozlabs.org/patch/988241/
> 
> When commenting v3, I didn't know about the v2 discussion where Eric
> Dumazet NACKed the patch because of potential conflict issues:
> 
>   https://patchwork.ozlabs.org/patch/970421/#1992397
>   https://patchwork.ozlabs.org/patch/988241/#2017317
> 
> On the other hand, there is no need for atomic64_t. Simple atomic_t
> (with explaining comment) would suffice. On architectures allowing
> atomic read/write for 32-bit integers, there would be no performance
> penalty. On architectures not allowing it, atomic_read() and
> atomic_set() are implemented to be safe.

Sorry for late response, I was off to work for couple of weeks.

v3: https://patchwork.ozlabs.org/patch/988241/  was sent with atomic_t and after seeing your comment, I sent it with atomic64_t.
Please let me know if v3 was fine ? 

Thanks,
Manish
> 
> Michal Kubecek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] bonding:avoid repeated display of same link status change
  2018-11-03  6:31 ` David Miller
@ 2018-11-04 19:41   ` Michal Kubecek
  2018-11-20 10:41     ` Manish Kumar Singh
  0 siblings, 1 reply; 19+ messages in thread
From: Michal Kubecek @ 2018-11-04 19:41 UTC (permalink / raw)
  To: David Miller
  Cc: mk.singh, netdev, eric.dumazet, j.vosburgh, vfalico, andy, linux-kernel

On Fri, Nov 02, 2018 at 11:31:38PM -0700, David Miller wrote:
> From: mk.singh@oracle.com
> Date: Wed, 31 Oct 2018 16:27:28 +0530
> 
> > -			if (slave->delay) {
> > +			if (slave->delay &&
> > +			    !atomic64_read(&bond->rtnl_needed)) {
>  ...
> > +			    !atomic64_read(&bond->rtnl_needed)) {
>  ...
> > +			atomic64_set(&bond->rtnl_needed, 1);
>  ...
> > +		atomic64_set(&bond->rtnl_needed, 0);
>  ...
> > @@ -229,6 +229,7 @@ struct bonding {
> >  	struct	 dentry *debug_dir;
> >  #endif /* CONFIG_DEBUG_FS */
> >  	struct rtnl_link_stats64 bond_stats;
> > +	atomic64_t rtnl_needed;
> 
> There is nothing "atomic" about a value that is only set and read.
> 
> And using a full 64-bit value for something taking on only '0' and
> '1' is unnecessary as well.

Part of the misunderstanding is caused by the fact that this is actually
a v4 but not marked as such:

  v1: https://patchwork.ozlabs.org/patch/955789/
  v2: https://patchwork.ozlabs.org/patch/970421/
  v3: https://patchwork.ozlabs.org/patch/988241/

When commenting v3, I didn't know about the v2 discussion where Eric
Dumazet NACKed the patch because of potential conflict issues:

  https://patchwork.ozlabs.org/patch/970421/#1992397
  https://patchwork.ozlabs.org/patch/988241/#2017317

On the other hand, there is no need for atomic64_t. Simple atomic_t
(with explaining comment) would suffice. On architectures allowing
atomic read/write for 32-bit integers, there would be no performance
penalty. On architectures not allowing it, atomic_read() and
atomic_set() are implemented to be safe.

Michal Kubecek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] bonding:avoid repeated display of same link status change
  2018-10-31 10:57 mk.singh
@ 2018-11-03  6:31 ` David Miller
  2018-11-04 19:41   ` Michal Kubecek
  0 siblings, 1 reply; 19+ messages in thread
From: David Miller @ 2018-11-03  6:31 UTC (permalink / raw)
  To: mk.singh
  Cc: netdev, eric.dumazet, mkubecek, j.vosburgh, vfalico, andy, linux-kernel

From: mk.singh@oracle.com
Date: Wed, 31 Oct 2018 16:27:28 +0530

> -			if (slave->delay) {
> +			if (slave->delay &&
> +			    !atomic64_read(&bond->rtnl_needed)) {
 ...
> +			    !atomic64_read(&bond->rtnl_needed)) {
 ...
> +			atomic64_set(&bond->rtnl_needed, 1);
 ...
> +		atomic64_set(&bond->rtnl_needed, 0);
 ...
> @@ -229,6 +229,7 @@ struct bonding {
>  	struct	 dentry *debug_dir;
>  #endif /* CONFIG_DEBUG_FS */
>  	struct rtnl_link_stats64 bond_stats;
> +	atomic64_t rtnl_needed;

There is nothing "atomic" about a value that is only set and read.

And using a full 64-bit value for something taking on only '0' and
'1' is unnecessary as well.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH] bonding:avoid repeated display of same link status change
@ 2018-10-31 10:57 mk.singh
  2018-11-03  6:31 ` David Miller
  0 siblings, 1 reply; 19+ messages in thread
From: mk.singh @ 2018-10-31 10:57 UTC (permalink / raw)
  To: netdev
  Cc: eric.dumazet, mkubecek, Manish Kumar Singh, Jay Vosburgh,
	Veaceslav Falico, Andy Gospodarek, David S. Miller, linux-kernel

From: Manish Kumar Singh <mk.singh@oracle.com>

When link status change needs to be committed and rtnl lock couldn't be
taken, avoid redisplay of same link status change message.

Signed-off-by: Manish Kumar Singh <mk.singh@oracle.com>
---
 drivers/net/bonding/bond_main.c | 8 ++++++--
 include/net/bonding.h           | 1 +
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 2b01180be834..b3d95c7040ac 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2096,7 +2096,8 @@ static int bond_miimon_inspect(struct bonding *bond)
 			bond_propose_link_state(slave, BOND_LINK_FAIL);
 			commit++;
 			slave->delay = bond->params.downdelay;
-			if (slave->delay) {
+			if (slave->delay &&
+			    !atomic64_read(&bond->rtnl_needed)) {
 				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
 					    (BOND_MODE(bond) ==
 					     BOND_MODE_ACTIVEBACKUP) ?
@@ -2136,7 +2137,8 @@ static int bond_miimon_inspect(struct bonding *bond)
 			commit++;
 			slave->delay = bond->params.updelay;
 
-			if (slave->delay) {
+			if (slave->delay &&
+			    !atomic64_read(&bond->rtnl_needed)) {
 				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
 					    slave->dev->name,
 					    ignore_updelay ? 0 :
@@ -2310,9 +2312,11 @@ static void bond_mii_monitor(struct work_struct *work)
 		if (!rtnl_trylock()) {
 			delay = 1;
 			should_notify_peers = false;
+			atomic64_set(&bond->rtnl_needed, 1);
 			goto re_arm;
 		}
 
+		atomic64_set(&bond->rtnl_needed, 0);
 		bond_for_each_slave(bond, slave, iter) {
 			bond_commit_link_state(slave, BOND_SLAVE_NOTIFY_LATER);
 		}
diff --git a/include/net/bonding.h b/include/net/bonding.h
index a4f116f06c50..20c3c875266f 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -229,6 +229,7 @@ struct bonding {
 	struct	 dentry *debug_dir;
 #endif /* CONFIG_DEBUG_FS */
 	struct rtnl_link_stats64 bond_stats;
+	atomic64_t rtnl_needed;
 };
 
 #define bond_slave_get_rcu(dev) \
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* RE: [PATCH] bonding:avoid repeated display of same link status change
  2018-10-25  9:29           ` Michal Kubecek
@ 2018-10-26  6:49             ` Manish Kumar Singh
  0 siblings, 0 replies; 19+ messages in thread
From: Manish Kumar Singh @ 2018-10-26  6:49 UTC (permalink / raw)
  To: Michal Kubecek
  Cc: Eric Dumazet,
	Mahesh Bandewar (महेश
	बंडेवार),
	linux-netdev, Jay Vosburgh, Veaceslav Falico, Andy Gospodarek,
	David S. Miller, linux-kernel



> -----Original Message-----
> From: Michal Kubecek [mailto:mkubecek@suse.cz]
> Sent: 25 अक्तूबर 2018 14:59
> To: Manish Kumar Singh
> Cc: Eric Dumazet; Mahesh Bandewar (महेश बंडेवार); linux-netdev; Jay
> Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH] bonding:avoid repeated display of same link status
> change
> 
> On Thu, Oct 25, 2018 at 02:21:05AM -0700, Manish Kumar Singh wrote:
> > > From: Michal Kubecek [mailto:mkubecek@suse.cz]
> > > IMHO it does not. AFAICS multiple instances of bond_mii_monitor()
> cannot
> > > run simultaneously for the same bond so that there doesn't seem to be
> > > anything to collide with. (And if they could, we would need to test and
> > > set the flag atomically in bond_miimon_inspect().)
> > >
> > Yes, Michal, we are inline with your understanding.
> > when the -original- patch was posted to upstream there was no
> > -synchronization- nor -racing- addressing code was in read/write of this
> > added filed, as we -never- saw need for either.
> >
> > -only- writer of the added field is bond_mii_monitor.
> > -only- reader of the added field is bond_miimon_inspect.
> > -this writer & reader -never- can run concurrently.
> > -writer invokes the reader.
> >
> > hence, imo uint_8 rtnl_needed is all what is needed; with
> bond_mii_monitor doing rtnl_needed = 1; and bond_miimon_inspect doing
> if rtnl_needed.
> >
> > here is the gravity of the situation with multiple customers whose names
> including machine names redacted:
> >
> >  4353 May 31 02:38:57 hostname kernel: ixgbe 0000:03:00.0: removed PHC
> on p2p1
> >  4354 May 31 02:38:57 hostname kernel: public: link status down for active
> interface p2p1, disabling it in 100 ms
> >  4355 May 31 02:38:57 hostname kernel: public: link status down for active
> interface p2p1, disabling it in 100 ms
> >  4356 May 31 02:38:57 hostname kernel: public: link status definitely down
> for interface p2p1, disabling it
> >  4357 May 31 02:38:57 hostname kernel: public: making interface p2p2 the
> new active one
> >  4358 May 31 02:38:59 hostname kernel: ixgbe 0000:03:00.0: registered PHC
> device on p2p1
> >  4359 May 31 02:39:00 hostname kernel: ixgbe 0000:03:00.0 p2p1: NIC Link is
> Up 10 Gbps, Flow Control: RX/TX
> >  4360 May 31 02:39:00 hostname kernel: public: link status up for interface
> p2p1, enabling it in 200 ms
> >  4361 May 31 02:39:00 hostname kernel: public: link status definitely up for
> interface p2p1, 10000 Mbps full duplex
> >  4362 May 31 02:45:37 hostname journal: Missed 217723 kernel messages
> >  4363 May 31 02:45:37 hostname kernel: public: link status down for active
> interface p2p2, disabling it in 100 ms
> >  	---------------------
> >                       11000+ APPROX SAME REPEATED MESSAGES in second
> > 	---------------------
> > 15877 May 31 02:45:37 hostname kernel: public: link status down for active
> interface p2p2, disabling it in 100 ms
> > 15878 May 31 02:45:37 hostname kernel: public: link status definitely down
> for interface p2p2, disabling it
> > 15879 May 31 02:45:37 hostname kernel: public: making interface p2p1 the
> new active one
> 
> When I was replying, I didn't know this was a v2 and I haven't seen the
> v1 discussion. I have read it since and I think I understand Eric's
> point now. The thing is that just adding e.g. u8 is OK as it is now.
> However, someone could later add another u8 next to it which would also
> be perfectly OK on its own but reads/writes to these two could collide
> between each other.
> 
> And as pointed out by a colleague, even having atomic_t and u8 flag in
> one 64-bit word could be a problem on architectures which cannot do an
> atomic read/write from/to a 32-bit word (sparc seems to be one).

Thanks Michal for explaining it, now we understand the problem what Eric was referring to in v1 of the patch.
I could think of fixing it in 3 ways, Please suggest which one would be safe and optimal fix:

1. Use type unit64_t for rtnl_needed .
2. Use type atomic64_t for rtnl_needed and atomic64_set/read.
3. Use type uint64_t for rtnl_needed with spinlock protection.

I think option 3 would be overkill keeping in mind the frequency of bond_mii_monitor.

Thanks,
Manish
> 
> Michal Kubecek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] bonding:avoid repeated display of same link status change
  2018-10-25  9:21         ` Manish Kumar Singh
@ 2018-10-25  9:29           ` Michal Kubecek
  2018-10-26  6:49             ` Manish Kumar Singh
  0 siblings, 1 reply; 19+ messages in thread
From: Michal Kubecek @ 2018-10-25  9:29 UTC (permalink / raw)
  To: Manish Kumar Singh
  Cc: Eric Dumazet,
	Mahesh Bandewar (महेश
	बंडेवार),
	linux-netdev, Jay Vosburgh, Veaceslav Falico, Andy Gospodarek,
	David S. Miller, linux-kernel

On Thu, Oct 25, 2018 at 02:21:05AM -0700, Manish Kumar Singh wrote:
> > From: Michal Kubecek [mailto:mkubecek@suse.cz]
> > IMHO it does not. AFAICS multiple instances of bond_mii_monitor() cannot
> > run simultaneously for the same bond so that there doesn't seem to be
> > anything to collide with. (And if they could, we would need to test and
> > set the flag atomically in bond_miimon_inspect().)
> > 
> Yes, Michal, we are inline with your understanding.
> when the -original- patch was posted to upstream there was no
> -synchronization- nor -racing- addressing code was in read/write of this
> added filed, as we -never- saw need for either.
> 
> -only- writer of the added field is bond_mii_monitor.
> -only- reader of the added field is bond_miimon_inspect.
> -this writer & reader -never- can run concurrently.
> -writer invokes the reader.
> 
> hence, imo uint_8 rtnl_needed is all what is needed; with bond_mii_monitor doing rtnl_needed = 1; and bond_miimon_inspect doing if rtnl_needed.
> 
> here is the gravity of the situation with multiple customers whose names including machine names redacted:
> 
>  4353 May 31 02:38:57 hostname kernel: ixgbe 0000:03:00.0: removed PHC on p2p1
>  4354 May 31 02:38:57 hostname kernel: public: link status down for active interface p2p1, disabling it in 100 ms
>  4355 May 31 02:38:57 hostname kernel: public: link status down for active interface p2p1, disabling it in 100 ms
>  4356 May 31 02:38:57 hostname kernel: public: link status definitely down for interface p2p1, disabling it
>  4357 May 31 02:38:57 hostname kernel: public: making interface p2p2 the new active one
>  4358 May 31 02:38:59 hostname kernel: ixgbe 0000:03:00.0: registered PHC device on p2p1
>  4359 May 31 02:39:00 hostname kernel: ixgbe 0000:03:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
>  4360 May 31 02:39:00 hostname kernel: public: link status up for interface p2p1, enabling it in 200 ms
>  4361 May 31 02:39:00 hostname kernel: public: link status definitely up for interface p2p1, 10000 Mbps full duplex
>  4362 May 31 02:45:37 hostname journal: Missed 217723 kernel messages
>  4363 May 31 02:45:37 hostname kernel: public: link status down for active interface p2p2, disabling it in 100 ms
>  	---------------------
>                       11000+ APPROX SAME REPEATED MESSAGES in second
> 	---------------------
> 15877 May 31 02:45:37 hostname kernel: public: link status down for active interface p2p2, disabling it in 100 ms
> 15878 May 31 02:45:37 hostname kernel: public: link status definitely down for interface p2p2, disabling it
> 15879 May 31 02:45:37 hostname kernel: public: making interface p2p1 the new active one

When I was replying, I didn't know this was a v2 and I haven't seen the
v1 discussion. I have read it since and I think I understand Eric's
point now. The thing is that just adding e.g. u8 is OK as it is now.
However, someone could later add another u8 next to it which would also
be perfectly OK on its own but reads/writes to these two could collide
between each other.

And as pointed out by a colleague, even having atomic_t and u8 flag in
one 64-bit word could be a problem on architectures which cannot do an
atomic read/write from/to a 32-bit word (sparc seems to be one).

Michal Kubecek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH] bonding:avoid repeated display of same link status change
  2018-10-23 16:38       ` Michal Kubecek
@ 2018-10-25  9:21         ` Manish Kumar Singh
  2018-10-25  9:29           ` Michal Kubecek
  0 siblings, 1 reply; 19+ messages in thread
From: Manish Kumar Singh @ 2018-10-25  9:21 UTC (permalink / raw)
  To: Michal Kubecek, Eric Dumazet
  Cc: Mahesh Bandewar (महेश
	बंडेवार),
	linux-netdev, Jay Vosburgh, Veaceslav Falico, Andy Gospodarek,
	David S. Miller, linux-kernel



> -----Original Message-----
> From: Michal Kubecek [mailto:mkubecek@suse.cz]
> Sent: 23 अक्तूबर 2018 22:08
> To: Eric Dumazet
> Cc: Mahesh Bandewar (महेश बंडेवार); Manish Kumar Singh; linux-netdev; Jay
> Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH] bonding:avoid repeated display of same link status
> change
> 
> On Tue, Oct 23, 2018 at 06:26:14PM +0200, Michal Kubecek wrote:
> > On Tue, Oct 23, 2018 at 09:10:44AM -0700, Eric Dumazet wrote:
> > >
> > >
> > > On 10/23/2018 08:54 AM, Mahesh Bandewar (महेश बंडेवार) wrote:
> > >
> > > > Atomic operations are expensive (on certain architectures) and miimon
> > > > runs quite frequently. Is the added cost of these atomic operations
> > > > even worth just to avoid *duplicate info* messages? This seems like a
> > > > overkill!
> > >
> > > atomic_read() is a simple read, no atomic operation involved.
> > >
> > > Same remark for atomic_set()
> >
> > Which makes me wonder if the patch really needs atomic_t.
> 
> IMHO it does not. AFAICS multiple instances of bond_mii_monitor() cannot
> run simultaneously for the same bond so that there doesn't seem to be
> anything to collide with. (And if they could, we would need to test and
> set the flag atomically in bond_miimon_inspect().)
> 
Yes, Michal, we are inline with your understanding.
when the -original- patch was posted to upstream there was no -synchronization- nor -racing- addressing code was in read/write of this added filed, as we -never- saw need for either.

-only- writer of the added field is bond_mii_monitor.
-only- reader of the added field is bond_miimon_inspect.
-this writer & reader -never- can run concurrently.
-writer invokes the reader.

hence, imo uint_8 rtnl_needed is all what is needed; with bond_mii_monitor doing rtnl_needed = 1; and bond_miimon_inspect doing if rtnl_needed.

here is the gravity of the situation with multiple customers whose names including machine names redacted:

 4353 May 31 02:38:57 hostname kernel: ixgbe 0000:03:00.0: removed PHC on p2p1
 4354 May 31 02:38:57 hostname kernel: public: link status down for active interface p2p1, disabling it in 100 ms
 4355 May 31 02:38:57 hostname kernel: public: link status down for active interface p2p1, disabling it in 100 ms
 4356 May 31 02:38:57 hostname kernel: public: link status definitely down for interface p2p1, disabling it
 4357 May 31 02:38:57 hostname kernel: public: making interface p2p2 the new active one
 4358 May 31 02:38:59 hostname kernel: ixgbe 0000:03:00.0: registered PHC device on p2p1
 4359 May 31 02:39:00 hostname kernel: ixgbe 0000:03:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
 4360 May 31 02:39:00 hostname kernel: public: link status up for interface p2p1, enabling it in 200 ms
 4361 May 31 02:39:00 hostname kernel: public: link status definitely up for interface p2p1, 10000 Mbps full duplex
 4362 May 31 02:45:37 hostname journal: Missed 217723 kernel messages
 4363 May 31 02:45:37 hostname kernel: public: link status down for active interface p2p2, disabling it in 100 ms
 	---------------------
                      11000+ APPROX SAME REPEATED MESSAGES in second
	---------------------
15877 May 31 02:45:37 hostname kernel: public: link status down for active interface p2p2, disabling it in 100 ms
15878 May 31 02:45:37 hostname kernel: public: link status definitely down for interface p2p2, disabling it
15879 May 31 02:45:37 hostname kernel: public: making interface p2p1 the new active one

Thanks,
Manish

> Michal Kubecek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] bonding:avoid repeated display of same link status change
  2018-10-23 15:29 [PATCH] bonding:avoid " mk.singh
  2018-10-23 15:54 ` Mahesh Bandewar (महेश बंडेवार)
@ 2018-10-23 18:08 ` David Miller
  1 sibling, 0 replies; 19+ messages in thread
From: David Miller @ 2018-10-23 18:08 UTC (permalink / raw)
  To: mk.singh; +Cc: netdev, j.vosburgh, vfalico, andy, linux-kernel

From: mk.singh@oracle.com
Date: Tue, 23 Oct 2018 20:59:24 +0530

> @@ -229,6 +229,7 @@ struct bonding {
>  	struct	 dentry *debug_dir;
>  #endif /* CONFIG_DEBUG_FS */
>  	struct rtnl_link_stats64 bond_stats;
> +	atomic_t rtnl_needed;

As mentioned by others, if the only operations you perform on a value
are set and read, using atomic_t is utterly and totally pointless.

I really have no idea what is achieved by using atomic_t in this set
of circumstances.

It is not guaranteeing that the value stays stable after you read it,
and it is not guaranteeing that another thread won't overwrite the
value you just set it to.

All of those things, if important, need proper synchronization.  An
atomic_t by itself will not do that for you.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] bonding:avoid repeated display of same link status change
  2018-10-23 16:26     ` Michal Kubecek
@ 2018-10-23 16:38       ` Michal Kubecek
  2018-10-25  9:21         ` Manish Kumar Singh
  0 siblings, 1 reply; 19+ messages in thread
From: Michal Kubecek @ 2018-10-23 16:38 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Mahesh Bandewar (महेश
	बंडेवार),
	mk.singh, linux-netdev, Jay Vosburgh, Veaceslav Falico,
	Andy Gospodarek, David S. Miller, linux-kernel

On Tue, Oct 23, 2018 at 06:26:14PM +0200, Michal Kubecek wrote:
> On Tue, Oct 23, 2018 at 09:10:44AM -0700, Eric Dumazet wrote:
> > 
> > 
> > On 10/23/2018 08:54 AM, Mahesh Bandewar (महेश बंडेवार) wrote:
> > 
> > > Atomic operations are expensive (on certain architectures) and miimon
> > > runs quite frequently. Is the added cost of these atomic operations
> > > even worth just to avoid *duplicate info* messages? This seems like a
> > > overkill!
> > 
> > atomic_read() is a simple read, no atomic operation involved.
> > 
> > Same remark for atomic_set()
> 
> Which makes me wonder if the patch really needs atomic_t.

IMHO it does not. AFAICS multiple instances of bond_mii_monitor() cannot
run simultaneously for the same bond so that there doesn't seem to be
anything to collide with. (And if they could, we would need to test and
set the flag atomically in bond_miimon_inspect().)

Michal Kubecek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] bonding:avoid repeated display of same link status change
  2018-10-23 16:10   ` Eric Dumazet
@ 2018-10-23 16:26     ` Michal Kubecek
  2018-10-23 16:38       ` Michal Kubecek
  0 siblings, 1 reply; 19+ messages in thread
From: Michal Kubecek @ 2018-10-23 16:26 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Mahesh Bandewar (महेश
	बंडेवार),
	mk.singh, linux-netdev, Jay Vosburgh, Veaceslav Falico,
	Andy Gospodarek, David S. Miller, linux-kernel

On Tue, Oct 23, 2018 at 09:10:44AM -0700, Eric Dumazet wrote:
> 
> 
> On 10/23/2018 08:54 AM, Mahesh Bandewar (महेश बंडेवार) wrote:
> 
> > Atomic operations are expensive (on certain architectures) and miimon
> > runs quite frequently. Is the added cost of these atomic operations
> > even worth just to avoid *duplicate info* messages? This seems like a
> > overkill!
> 
> atomic_read() is a simple read, no atomic operation involved.
> 
> Same remark for atomic_set()

Which makes me wonder if the patch really needs atomic_t.

Michal Kubecek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] bonding:avoid repeated display of same link status change
  2018-10-23 15:54 ` Mahesh Bandewar (महेश बंडेवार)
@ 2018-10-23 16:10   ` Eric Dumazet
  2018-10-23 16:26     ` Michal Kubecek
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2018-10-23 16:10 UTC (permalink / raw)
  To: Mahesh Bandewar (महेश
	बंडेवार),
	mk.singh
  Cc: linux-netdev, Jay Vosburgh, Veaceslav Falico, Andy Gospodarek,
	David S. Miller, linux-kernel



On 10/23/2018 08:54 AM, Mahesh Bandewar (महेश बंडेवार) wrote:

> Atomic operations are expensive (on certain architectures) and miimon
> runs quite frequently. Is the added cost of these atomic operations
> even worth just to avoid *duplicate info* messages? This seems like a
> overkill!

atomic_read() is a simple read, no atomic operation involved.

Same remark for atomic_set()


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] bonding:avoid repeated display of same link status change
  2018-10-23 15:29 [PATCH] bonding:avoid " mk.singh
@ 2018-10-23 15:54 ` Mahesh Bandewar (महेश बंडेवार)
  2018-10-23 16:10   ` Eric Dumazet
  2018-10-23 18:08 ` David Miller
  1 sibling, 1 reply; 19+ messages in thread
From: Mahesh Bandewar (महेश बंडेवार) @ 2018-10-23 15:54 UTC (permalink / raw)
  To: mk.singh
  Cc: linux-netdev, Jay Vosburgh, Veaceslav Falico, Andy Gospodarek,
	David S. Miller, linux-kernel

On Tue, Oct 23, 2018 at 8:29 AM,  <mk.singh@oracle.com> wrote:
> From: Manish Kumar Singh <mk.singh@oracle.com>
>
> When link status change needs to be committed and rtnl lock couldn't be
> taken, avoid redisplay of same link status change message.
>
> Signed-off-by: Manish Kumar Singh <mk.singh@oracle.com>
> ---
>  drivers/net/bonding/bond_main.c | 6 ++++--
>  include/net/bonding.h           | 1 +
>  2 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 2b01180be834..af9ef889a429 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -2096,7 +2096,7 @@ static int bond_miimon_inspect(struct bonding *bond)
>                         bond_propose_link_state(slave, BOND_LINK_FAIL);
>                         commit++;
>                         slave->delay = bond->params.downdelay;
> -                       if (slave->delay) {
> +                       if (slave->delay && !atomic_read(&bond->rtnl_needed)) {
Atomic operations are expensive (on certain architectures) and miimon
runs quite frequently. Is the added cost of these atomic operations
even worth just to avoid *duplicate info* messages? This seems like a
overkill!

>                                 netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
>                                             (BOND_MODE(bond) ==
>                                              BOND_MODE_ACTIVEBACKUP) ?
> @@ -2136,7 +2136,7 @@ static int bond_miimon_inspect(struct bonding *bond)
>                         commit++;
>                         slave->delay = bond->params.updelay;
>
> -                       if (slave->delay) {
> +                       if (slave->delay && !atomic_read(&bond->rtnl_needed)) {
>                                 netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
>                                             slave->dev->name,
>                                             ignore_updelay ? 0 :
> @@ -2310,9 +2310,11 @@ static void bond_mii_monitor(struct work_struct *work)
>                 if (!rtnl_trylock()) {
>                         delay = 1;
>                         should_notify_peers = false;
> +                       atomic_set(&bond->rtnl_needed, 1);
>                         goto re_arm;
>                 }
>
> +               atomic_set(&bond->rtnl_needed, 0);
>                 bond_for_each_slave(bond, slave, iter) {
>                         bond_commit_link_state(slave, BOND_SLAVE_NOTIFY_LATER);
>                 }
> diff --git a/include/net/bonding.h b/include/net/bonding.h
> index a4f116f06c50..a4353506bb4f 100644
> --- a/include/net/bonding.h
> +++ b/include/net/bonding.h
> @@ -229,6 +229,7 @@ struct bonding {
>         struct   dentry *debug_dir;
>  #endif /* CONFIG_DEBUG_FS */
>         struct rtnl_link_stats64 bond_stats;
> +       atomic_t rtnl_needed;
>  };
>
>  #define bond_slave_get_rcu(dev) \
> --
> 2.14.1
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH] bonding:avoid repeated display of same link status change
@ 2018-10-23 15:29 mk.singh
  2018-10-23 15:54 ` Mahesh Bandewar (महेश बंडेवार)
  2018-10-23 18:08 ` David Miller
  0 siblings, 2 replies; 19+ messages in thread
From: mk.singh @ 2018-10-23 15:29 UTC (permalink / raw)
  To: netdev
  Cc: Manish Kumar Singh, Jay Vosburgh, Veaceslav Falico,
	Andy Gospodarek, David S. Miller, linux-kernel

From: Manish Kumar Singh <mk.singh@oracle.com>

When link status change needs to be committed and rtnl lock couldn't be
taken, avoid redisplay of same link status change message.

Signed-off-by: Manish Kumar Singh <mk.singh@oracle.com>
---
 drivers/net/bonding/bond_main.c | 6 ++++--
 include/net/bonding.h           | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 2b01180be834..af9ef889a429 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2096,7 +2096,7 @@ static int bond_miimon_inspect(struct bonding *bond)
 			bond_propose_link_state(slave, BOND_LINK_FAIL);
 			commit++;
 			slave->delay = bond->params.downdelay;
-			if (slave->delay) {
+			if (slave->delay && !atomic_read(&bond->rtnl_needed)) {
 				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
 					    (BOND_MODE(bond) ==
 					     BOND_MODE_ACTIVEBACKUP) ?
@@ -2136,7 +2136,7 @@ static int bond_miimon_inspect(struct bonding *bond)
 			commit++;
 			slave->delay = bond->params.updelay;
 
-			if (slave->delay) {
+			if (slave->delay && !atomic_read(&bond->rtnl_needed)) {
 				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
 					    slave->dev->name,
 					    ignore_updelay ? 0 :
@@ -2310,9 +2310,11 @@ static void bond_mii_monitor(struct work_struct *work)
 		if (!rtnl_trylock()) {
 			delay = 1;
 			should_notify_peers = false;
+			atomic_set(&bond->rtnl_needed, 1);
 			goto re_arm;
 		}
 
+		atomic_set(&bond->rtnl_needed, 0);
 		bond_for_each_slave(bond, slave, iter) {
 			bond_commit_link_state(slave, BOND_SLAVE_NOTIFY_LATER);
 		}
diff --git a/include/net/bonding.h b/include/net/bonding.h
index a4f116f06c50..a4353506bb4f 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -229,6 +229,7 @@ struct bonding {
 	struct	 dentry *debug_dir;
 #endif /* CONFIG_DEBUG_FS */
 	struct rtnl_link_stats64 bond_stats;
+	atomic_t rtnl_needed;
 };
 
 #define bond_slave_get_rcu(dev) \
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2018-11-20 10:42 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-17  7:20 [PATCH] bonding: avoid repeated display of same link status change mk.singh
2018-09-17 14:38 ` Eric Dumazet
2018-09-18  5:05   ` Manish Kumar Singh
2018-09-18 14:00     ` Eric Dumazet
2018-09-24  7:05       ` Manish Kumar Singh
2018-10-22  7:29         ` Manish Kumar Singh
2018-10-23 15:29 [PATCH] bonding:avoid " mk.singh
2018-10-23 15:54 ` Mahesh Bandewar (महेश बंडेवार)
2018-10-23 16:10   ` Eric Dumazet
2018-10-23 16:26     ` Michal Kubecek
2018-10-23 16:38       ` Michal Kubecek
2018-10-25  9:21         ` Manish Kumar Singh
2018-10-25  9:29           ` Michal Kubecek
2018-10-26  6:49             ` Manish Kumar Singh
2018-10-23 18:08 ` David Miller
2018-10-31 10:57 mk.singh
2018-11-03  6:31 ` David Miller
2018-11-04 19:41   ` Michal Kubecek
2018-11-20 10:41     ` Manish Kumar Singh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).