* [RFC] igb: minimize busy loop on igb_get_hw_semaphore
@ 2013-07-08 21:17 Luis Claudio R. Goncalves
  2013-08-12 13:55 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 3+ messages in thread
From: Luis Claudio R. Goncalves @ 2013-07-08 21:17 UTC (permalink / raw)
  To: linux-rt-users, Thomas Gleixner, rostedt

Hello,

This patch was written against 3.0-rt, but the same code path triggering the
issue exists up to 3.8.13-rt13. It was initially a test patch, to minimize
a problem observed by a customer, but it may be the starting point of a
needed solution.

Rostedt helped me to visualize this small patch in the early stages and
Clark Williams has been bugging me to send it out to the list in order
to gather ideas on how useful this small change really is.

As noted in the description, though the same code is present upstream, it
may be a problem only on RT.

----

igb: minimize busy loop on igb_get_hw_semaphore

Bugzilla: 976912

In drivers/net/igb/e1000_82575.c, function igb_release_swfw_sync_82575()
contains this line:

	while (igb_get_hw_semaphore(hw) != 0);

That is basically a busy loop waiting on a HW semaphore.
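
For context, this is roughly how the surrounding release path looks (a
paraphrased sketch of the driver code, not part of this patch):

	void igb_release_swfw_sync_82575(struct e1000_hw *hw, u16 mask)
	{
		u32 swfw_sync;

		/* The busy loop in question: keep retrying until the HW
		 * semaphore is acquired, with no way to back off or sleep. */
		while (igb_get_hw_semaphore(hw) != 0)
			; /* Empty */

		swfw_sync = rd32(E1000_SW_FW_SYNC);
		swfw_sync &= ~mask;
		wr32(E1000_SW_FW_SYNC, swfw_sync);

		igb_put_hw_semaphore(hw);
	}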

A customer has a setup where two igb NICs are part of a bonding interface.
This customer also has a monitoring script that calls ifconfig often. It was
observed that in this scenario there is a chance that ifconfig, which happens
to hold the bond->lock while collecting statistics, enters this busy loop
waiting for another thread to clear that HW semaphore.

Meanwhile, the irq/xxx-ethY-Tx threads, running at FIFO:85, try to acquire
the bond->lock, held by ifconfig. As happens on RT, a Priority Inheritance
operation is started and ifconfig is boosted to FIFO:85 so that it can
finish its work sooner and release the bond->lock desired by the
aforementioned threads.

As ifconfig is running in a busy loop waiting for the HW semaphore, it now
spins at a very high priority, preventing other threads on that CPU from
making progress.

In that scenario, it seems that the thread holding the HW semaphore is itself
waiting for a lock held by another task. This in turn leads to RCU stall
warnings and, as a side effect, a growing number of stuck threads. As this
progresses, the livelock reaches threads on other CPUs and the system becomes
more and more unresponsive.

This little patch aims to prevent the high-priority busy loop (the code
called by ifconfig in this example) from starving the other threads on the
same CPU. It may not solve the issue, but it should at least lead us closer
to the real problem, which is currently masked by the RCU stalls the busy
loop creates.

This is mostly a debug patch for a testing kernel.

Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>

diff --git a/drivers/net/igb/e1000_mac.c b/drivers/net/igb/e1000_mac.c
index ce8255f..0ca912c 100644
--- a/drivers/net/igb/e1000_mac.c
+++ b/drivers/net/igb/e1000_mac.c
@@ -1037,7 +1037,7 @@ s32 igb_get_hw_semaphore(struct e1000_hw *hw)
 		if (!(swsm & E1000_SWSM_SMBI))
 			break;
 
-		udelay(50);
+		usleep_range(50,51);
 		i++;
 	}
 
@@ -1056,7 +1056,7 @@ s32 igb_get_hw_semaphore(struct e1000_hw *hw)
 		if (rd32(E1000_SWSM) & E1000_SWSM_SWESMBI)
 			break;
 
-		udelay(50);
+		usleep_range(50,51);
 	}
 
 	if (i == timeout) {

-- 
[ Luis Claudio R. Goncalves                    Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9  2696 7203 D980 A448 C8F8 ]



* Re: [RFC] igb: minimize busy loop on igb_get_hw_semaphore
  2013-07-08 21:17 [RFC] igb: minimize busy loop on igb_get_hw_semaphore Luis Claudio R. Goncalves
@ 2013-08-12 13:55 ` Sebastian Andrzej Siewior
  2013-08-13  1:37   ` Luis Claudio R. Goncalves
  0 siblings, 1 reply; 3+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-08-12 13:55 UTC (permalink / raw)
  To: Luis Claudio R. Goncalves; +Cc: linux-rt-users, Thomas Gleixner, rostedt

* Luis Claudio R. Goncalves | 2013-07-08 18:17:05 [-0300]:

>Hello,
Hi Luis,

>	while (igb_get_hw_semaphore(hw) != 0);
>
>That is basically a busy loop waiting on a HW semaphore.
>
>A customer has a setup where two igb NICs are part of a bonding interface.
>This customer also has a monitoring script that calls ifconfig often. It was
>observed that in this scenario there is a chance that ifconfig, which happens
>to hold the bond->lock while collecting statistics, enters this busy loop
>waiting for another thread to clear that HW semaphore.
>
>Meanwhile, the irq/xxx-ethY-Tx threads, running at FIFO:85, try to acquire
>the bond->lock, held by ifconfig. As happens on RT, a Priority Inheritance
>operation is started and ifconfig is boosted to FIFO:85 so that it can
>finish its work sooner and release the bond->lock desired by the
>aforementioned threads.
>
>As ifconfig is running in a busy loop waiting for the HW semaphore, it now
>spins at a very high priority, preventing other threads on that CPU from
>making progress.
>
>In that scenario, it seems that the thread holding the HW semaphore is itself
>waiting for a lock held by another task. This in turn leads to RCU stall
>warnings and, as a side effect, a growing number of stuck threads. As this
>progresses, the livelock reaches threads on other CPUs and the system becomes
>more and more unresponsive.

So you are saying someone is holding the lock and never gets on the CPU
in order to release the lock while in the meantime everyone gets boosted
to grab the lock and busy loops until you call it a day?

If so, then you should tell the locking code about the hw semaphore and that
it needs to boost the owner of the semaphore in order to get it
released. Something like this should do the job:

diff --git a/drivers/net/ethernet/intel/e1000/e1000_hw.h b/drivers/net/ethernet/intel/e1000/e1000_hw.h
index 11578c8..8b7299f 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_hw.h
+++ b/drivers/net/ethernet/intel/e1000/e1000_hw.h
@@ -1433,6 +1433,7 @@ struct e1000_hw {
 	bool leave_av_bit_off;
 	bool bad_tx_carr_stats_fd;
 	bool has_smbus;
+	spinlock_t hwsem_lock;
 };
 
 #define E1000_EEPROM_SWDPIN0   0x0001	/* SWDPIN 0 EEPROM Value */
diff --git a/drivers/net/ethernet/intel/igb/e1000_mac.c b/drivers/net/ethernet/intel/igb/e1000_mac.c
index 2559d70..285cc81 100644
--- a/drivers/net/ethernet/intel/igb/e1000_mac.c
+++ b/drivers/net/ethernet/intel/igb/e1000_mac.c
@@ -1198,6 +1198,8 @@ s32 igb_get_hw_semaphore(struct e1000_hw *hw)
 	s32 timeout = hw->nvm.word_size + 1;
 	s32 i = 0;
 
+	spin_lock(&hw->hwsem_lock);
+
 	/* Get the SW semaphore */
 	while (i < timeout) {
 		swsm = rd32(E1000_SWSM);
@@ -1235,6 +1237,8 @@ s32 igb_get_hw_semaphore(struct e1000_hw *hw)
 	}
 
 out:
+	if (ret_val)
+		spin_unlock(&hw->hwsem_lock);
 	return ret_val;
 }
 
@@ -1253,6 +1257,7 @@ void igb_put_hw_semaphore(struct e1000_hw *hw)
 	swsm &= ~(E1000_SWSM_SMBI | E1000_SWSM_SWESMBI);
 
 	wr32(E1000_SWSM, swsm);
+	spin_unlock(&hw->hwsem_lock);
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 64cbe0d..4ae835a 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2683,6 +2683,8 @@ static int igb_sw_init(struct igb_adapter *adapter)
 	adapter->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
 
 	spin_lock_init(&adapter->stats64_lock);
+	spin_lock_init(&hw->hwsem_lock);
+
 #ifdef CONFIG_PCI_IOV
 	switch (hw->mac.type) {
 	case e1000_82576:


So I don't even know if this compiles and the error code is wrong, but I
think you get the idea:
Before you attempt to grab the hw semaphore you grab a lock. If the lock
is taken, then the semaphore is taken as well. In non-RT you spin on memory
instead of IO-memory, so I doubt somebody will complain :)
If you need to get the lock while it is taken and you are a high-prio
thread, then the code should boost the owner of the hw semaphore, which it
now knows about.
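
A minimal sketch of the mechanism this relies on, assuming a PREEMPT_RT
kernel where spinlock_t is backed by an rtmutex (the names below are
illustrative stand-ins, not code from the posted diff):

	#include <linux/spinlock.h>

	/* Hypothetical stand-in for hw->hwsem_lock. */
	static DEFINE_SPINLOCK(hwsem_lock);

	static void owner_path(void)
	{
		spin_lock(&hwsem_lock);    /* low-prio task owns the lock   */
		/* ... touches E1000_SWSM to take the HW semaphore ... */
		spin_unlock(&hwsem_lock);  /* any priority boost ends here  */
	}

	static void waiter_path(void)
	{
		/* On RT a FIFO:85 waiter sleeps here and lends its priority
		 * to the owner, instead of busy-looping on register reads. */
		spin_lock(&hwsem_lock);
		spin_unlock(&hwsem_lock);
	}

With the owner recorded in the lock, the rtmutex PI machinery can resolve
the inversion that polling the SWSM register directly never could.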

Sebastian


* Re: [RFC] igb: minimize busy loop on igb_get_hw_semaphore
  2013-08-12 13:55 ` Sebastian Andrzej Siewior
@ 2013-08-13  1:37   ` Luis Claudio R. Goncalves
  0 siblings, 0 replies; 3+ messages in thread
From: Luis Claudio R. Goncalves @ 2013-08-13  1:37 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: linux-rt-users, Thomas Gleixner, rostedt

On Mon, Aug 12, 2013 at 03:55:59PM +0200, Sebastian Andrzej Siewior wrote:
| * Luis Claudio R. Goncalves | 2013-07-08 18:17:05 [-0300]:
| 
| >Hello,
| Hi Luis,
| 
| >	while (igb_get_hw_semaphore(hw) != 0);
| >
| >That is basically a busy loop waiting on a HW semaphore.
| >
| >A customer has a setup where two igb NICs are part of a bonding interface.
| >This customer also has a monitoring script that calls ifconfig often. It was
| >observed that in this scenario there is a chance that ifconfig, which happens
| >to hold the bond->lock while collecting statistics, enters this busy loop
| >waiting for another thread to clear that HW semaphore.
| >
| >Meanwhile, the irq/xxx-ethY-Tx threads, running at FIFO:85, try to acquire
| >the bond->lock, held by ifconfig. As happens on RT, a Priority Inheritance
| >operation is started and ifconfig is boosted to FIFO:85 so that it can
| >finish its work sooner and release the bond->lock desired by the
| >aforementioned threads.
| >
| >As ifconfig is running in a busy loop waiting for the HW semaphore, it now
| >spins at a very high priority, preventing other threads on that CPU from
| >making progress.
| >
| >In that scenario, it seems that the thread holding the HW semaphore is itself
| >waiting for a lock held by another task. This in turn leads to RCU stall
| >warnings and, as a side effect, a growing number of stuck threads. As this
| >progresses, the livelock reaches threads on other CPUs and the system becomes
| >more and more unresponsive.
| 
| So you are saying someone is holding the lock and never gets on the CPU
| in order to release the lock while in the meantime everyone gets boosted
| to grab the lock and busy loops until you call it a day?
| 
| If so, then you should tell the locking code about the hw semaphore and that
| it needs to boost the owner of the semaphore in order to get it
| released. Something like this should do the job:

Sebastian, as I told you on IRC, thanks for that great idea!

A few minutes ago I recalled one detail: this semaphore may also be acquired
by the hardware, by the NIC itself. But as the hardware is supposed to hold
this semaphore only for very short periods of time, it is worth trying your
idea.

Cheers,
Luis
 
| diff --git a/drivers/net/ethernet/intel/e1000/e1000_hw.h b/drivers/net/ethernet/intel/e1000/e1000_hw.h
| index 11578c8..8b7299f 100644
| --- a/drivers/net/ethernet/intel/e1000/e1000_hw.h
| +++ b/drivers/net/ethernet/intel/e1000/e1000_hw.h
| @@ -1433,6 +1433,7 @@ struct e1000_hw {
|  	bool leave_av_bit_off;
|  	bool bad_tx_carr_stats_fd;
|  	bool has_smbus;
| +	spinlock_t hwsem_lock;
|  };
|  
|  #define E1000_EEPROM_SWDPIN0   0x0001	/* SWDPIN 0 EEPROM Value */
| diff --git a/drivers/net/ethernet/intel/igb/e1000_mac.c b/drivers/net/ethernet/intel/igb/e1000_mac.c
| index 2559d70..285cc81 100644
| --- a/drivers/net/ethernet/intel/igb/e1000_mac.c
| +++ b/drivers/net/ethernet/intel/igb/e1000_mac.c
| @@ -1198,6 +1198,8 @@ s32 igb_get_hw_semaphore(struct e1000_hw *hw)
|  	s32 timeout = hw->nvm.word_size + 1;
|  	s32 i = 0;
|  
| +	spin_lock(&hw->hwsem_lock);
| +
|  	/* Get the SW semaphore */
|  	while (i < timeout) {
|  		swsm = rd32(E1000_SWSM);
| @@ -1235,6 +1237,8 @@ s32 igb_get_hw_semaphore(struct e1000_hw *hw)
|  	}
|  
|  out:
| +	if (ret_val)
| +		spin_unlock(&hw->hwsem_lock);
|  	return ret_val;
|  }
|  
| @@ -1253,6 +1257,7 @@ void igb_put_hw_semaphore(struct e1000_hw *hw)
|  	swsm &= ~(E1000_SWSM_SMBI | E1000_SWSM_SWESMBI);
|  
|  	wr32(E1000_SWSM, swsm);
| +	spin_unlock(&hw->hwsem_lock);
|  }
|  
|  /**
| diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
| index 64cbe0d..4ae835a 100644
| --- a/drivers/net/ethernet/intel/igb/igb_main.c
| +++ b/drivers/net/ethernet/intel/igb/igb_main.c
| @@ -2683,6 +2683,8 @@ static int igb_sw_init(struct igb_adapter *adapter)
|  	adapter->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
|  
|  	spin_lock_init(&adapter->stats64_lock);
| +	spin_lock_init(&hw->hwsem_lock);
| +
|  #ifdef CONFIG_PCI_IOV
|  	switch (hw->mac.type) {
|  	case e1000_82576:
| 
| 
| So I don't even know if this compiles and the error code is wrong, but I
| think you get the idea:
| Before you attempt to grab the hw semaphore you grab a lock. If the lock
| is taken, then the semaphore is taken as well. In non-RT you spin on memory
| instead of IO-memory, so I doubt somebody will complain :)
| If you need to get the lock while it is taken and you are a high-prio
| thread, then the code should boost the owner of the hw semaphore, which it
| now knows about.
| 
| Sebastian
---end quoted text---

-- 
[ Luis Claudio R. Goncalves                    Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9  2696 7203 D980 A448 C8F8 ]

