All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
@ 2016-11-09 21:41 ` Tyler Baicar
  0 siblings, 0 replies; 25+ messages in thread
From: Tyler Baicar @ 2016-11-09 21:41 UTC (permalink / raw)
  To: jeffrey.t.kirsher, intel-wired-lan, netdev, linux-kernel, okaya, timur
  Cc: alexander.duyck, dima.ruinskiy, Tyler Baicar

Move IRQ free code so that it will happen regardless of the
__E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
if the __E1000_DOWN bit is cleared. This is not sufficient because
it is possible for __E1000_DOWN to be set without releasing the IRQ.
In such a situation, we will hit a kernel bug later in e1000_remove
because the IRQ still has action since it was never freed. A
secondary bus reset can cause this case to happen.

Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 7017281..36cfcb0 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
 
 	if (!test_bit(__E1000_DOWN, &adapter->state)) {
 		e1000e_down(adapter, true);
-		e1000_free_irq(adapter);
 
 		/* Link status message must follow this format */
 		pr_info("%s NIC Link is Down\n", adapter->netdev->name);
 	}
 
+	e1000_free_irq(adapter);
+
 	napi_disable(&adapter->napi);
 
 	e1000e_free_tx_resources(adapter->tx_ring);
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
@ 2016-11-09 21:41 ` Tyler Baicar
  0 siblings, 0 replies; 25+ messages in thread
From: Tyler Baicar @ 2016-11-09 21:41 UTC (permalink / raw)
  To: intel-wired-lan

Move IRQ free code so that it will happen regardless of the
__E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
if the __E1000_DOWN bit is cleared. This is not sufficient because
it is possible for __E1000_DOWN to be set without releasing the IRQ.
In such a situation, we will hit a kernel bug later in e1000_remove
because the IRQ still has action since it was never freed. A
secondary bus reset can cause this case to happen.

Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 7017281..36cfcb0 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
 
 	if (!test_bit(__E1000_DOWN, &adapter->state)) {
 		e1000e_down(adapter, true);
-		e1000_free_irq(adapter);
 
 		/* Link status message must follow this format */
 		pr_info("%s NIC Link is Down\n", adapter->netdev->name);
 	}
 
+	e1000_free_irq(adapter);
+
 	napi_disable(&adapter->napi);
 
 	e1000e_free_tx_resources(adapter->tx_ring);
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
  2016-11-09 21:41 ` [Intel-wired-lan] " Tyler Baicar
@ 2016-11-10  6:19   ` Neftin, Sasha
  -1 siblings, 0 replies; 25+ messages in thread
From: Neftin, Sasha @ 2016-11-10  6:19 UTC (permalink / raw)
  To: Tyler Baicar, jeffrey.t.kirsher, intel-wired-lan, netdev,
	linux-kernel, okaya, timur

On 11/9/2016 11:41 PM, Tyler Baicar wrote:
> Move IRQ free code so that it will happen regardless of the
> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
> if the __E1000_DOWN bit is cleared. This is not sufficient because
> it is possible for __E1000_DOWN to be set without releasing the IRQ.
> In such a situation, we will hit a kernel bug later in e1000_remove
> because the IRQ still has action since it was never freed. A
> secondary bus reset can cause this case to happen.
> 
> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index 7017281..36cfcb0 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>  
>  	if (!test_bit(__E1000_DOWN, &adapter->state)) {
>  		e1000e_down(adapter, true);
> -		e1000_free_irq(adapter);
>  
>  		/* Link status message must follow this format */
>  		pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>  	}				
>  
> +	e1000_free_irq(adapter);
> +
>  	napi_disable(&adapter->napi);
>  
>  	e1000e_free_tx_resources(adapter->tx_ring);
> 
I would like not recommend insert this change. This change related
driver state machine, we afraid from lot of synchronization problem and
issues.
We need keep e1000_free_irq in loop and check for 'test_bit' ready.
Another point, does before execute secondary bus reset your SW back up
pcie configuration space as properly?

Sasha

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
@ 2016-11-10  6:19   ` Neftin, Sasha
  0 siblings, 0 replies; 25+ messages in thread
From: Neftin, Sasha @ 2016-11-10  6:19 UTC (permalink / raw)
  To: intel-wired-lan

On 11/9/2016 11:41 PM, Tyler Baicar wrote:
> Move IRQ free code so that it will happen regardless of the
> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
> if the __E1000_DOWN bit is cleared. This is not sufficient because
> it is possible for __E1000_DOWN to be set without releasing the IRQ.
> In such a situation, we will hit a kernel bug later in e1000_remove
> because the IRQ still has action since it was never freed. A
> secondary bus reset can cause this case to happen.
> 
> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index 7017281..36cfcb0 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>  
>  	if (!test_bit(__E1000_DOWN, &adapter->state)) {
>  		e1000e_down(adapter, true);
> -		e1000_free_irq(adapter);
>  
>  		/* Link status message must follow this format */
>  		pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>  	}				
>  
> +	e1000_free_irq(adapter);
> +
>  	napi_disable(&adapter->napi);
>  
>  	e1000e_free_tx_resources(adapter->tx_ring);
> 
I would like not recommend insert this change. This change related
driver state machine, we afraid from lot of synchronization problem and
issues.
We need keep e1000_free_irq in loop and check for 'test_bit' ready.
Another point, does before execute secondary bus reset your SW back up
pcie configuration space as properly?

Sasha


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
  2016-11-10  6:19   ` Neftin, Sasha
@ 2016-11-10 22:35     ` Baicar, Tyler
  -1 siblings, 0 replies; 25+ messages in thread
From: Baicar, Tyler @ 2016-11-10 22:35 UTC (permalink / raw)
  To: Neftin, Sasha, jeffrey.t.kirsher, intel-wired-lan, netdev,
	linux-kernel, okaya, timur

Hello Sasha,

On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>> Move IRQ free code so that it will happen regardless of the
>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>> In such a situation, we will hit a kernel bug later in e1000_remove
>> because the IRQ still has action since it was never freed. A
>> secondary bus reset can cause this case to happen.
>>
>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>> ---
>>   drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
>> index 7017281..36cfcb0 100644
>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>   
>>   	if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>   		e1000e_down(adapter, true);
>> -		e1000_free_irq(adapter);
>>   
>>   		/* Link status message must follow this format */
>>   		pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>   	}				
>>   
>> +	e1000_free_irq(adapter);
>> +
>>   	napi_disable(&adapter->napi);
>>   
>>   	e1000e_free_tx_resources(adapter->tx_ring);
>>
> I would like not recommend insert this change. This change related
> driver state machine, we afraid from lot of synchronization problem and
> issues.
> We need keep e1000_free_irq in loop and check for 'test_bit' ready.

What do you mean here? There is no loop. If __E1000_DOWN is set then we
will never free the IRQ.

> Another point, does before execute secondary bus reset your SW back up
> pcie configuration space as properly?

After a secondary bus reset, the link needs to recover and go back to a
working state after 1 second.

 From the callstack, the issue is happening while removing the endpoint
from the system, before applying the secondary bus reset.

The order of events is
1. remove the drivers
2. cause a secondary bus reset
3. wait 1 second
4. recover the link

callstack:
free_msi_irqs+0x6c/0x1a8
pci_disable_msi+0xb0/0x148
e1000e_reset_interrupt_capability+0x60/0x78
e1000_remove+0xc8/0x180
pci_device_remove+0x48/0x118
__device_release_driver+0x80/0x108
device_release_driver+0x2c/0x40
pci_stop_bus_device+0xa0/0xb0
pci_stop_bus_device+0x3c/0xb0
pci_stop_root_bus+0x54/0x80
acpi_pci_root_remove+0x28/0x64
acpi_bus_trim+0x6c/0xa4
acpi_device_hotplug+0x19c/0x3f4
acpi_hotplug_work_fn+0x28/0x3c
process_one_work+0x150/0x460
worker_thread+0x50/0x4b8
kthread+0xd4/0xe8
ret_from_fork+0x10/0x50

Thanks,
Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
@ 2016-11-10 22:35     ` Baicar, Tyler
  0 siblings, 0 replies; 25+ messages in thread
From: Baicar, Tyler @ 2016-11-10 22:35 UTC (permalink / raw)
  To: intel-wired-lan

Hello Sasha,

On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>> Move IRQ free code so that it will happen regardless of the
>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>> In such a situation, we will hit a kernel bug later in e1000_remove
>> because the IRQ still has action since it was never freed. A
>> secondary bus reset can cause this case to happen.
>>
>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>> ---
>>   drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
>> index 7017281..36cfcb0 100644
>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>   
>>   	if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>   		e1000e_down(adapter, true);
>> -		e1000_free_irq(adapter);
>>   
>>   		/* Link status message must follow this format */
>>   		pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>   	}				
>>   
>> +	e1000_free_irq(adapter);
>> +
>>   	napi_disable(&adapter->napi);
>>   
>>   	e1000e_free_tx_resources(adapter->tx_ring);
>>
> I would like not recommend insert this change. This change related
> driver state machine, we afraid from lot of synchronization problem and
> issues.
> We need keep e1000_free_irq in loop and check for 'test_bit' ready.

What do you mean here? There is no loop. If __E1000_DOWN is set then we
will never free the IRQ.

> Another point, does before execute secondary bus reset your SW back up
> pcie configuration space as properly?

After a secondary bus reset, the link needs to recover and go back to a
working state after 1 second.

 From the callstack, the issue is happening while removing the endpoint
from the system, before applying the secondary bus reset.

The order of events is
1. remove the drivers
2. cause a secondary bus reset
3. wait 1 second
4. recover the link

callstack:
free_msi_irqs+0x6c/0x1a8
pci_disable_msi+0xb0/0x148
e1000e_reset_interrupt_capability+0x60/0x78
e1000_remove+0xc8/0x180
pci_device_remove+0x48/0x118
__device_release_driver+0x80/0x108
device_release_driver+0x2c/0x40
pci_stop_bus_device+0xa0/0xb0
pci_stop_bus_device+0x3c/0xb0
pci_stop_root_bus+0x54/0x80
acpi_pci_root_remove+0x28/0x64
acpi_bus_trim+0x6c/0xa4
acpi_device_hotplug+0x19c/0x3f4
acpi_hotplug_work_fn+0x28/0x3c
process_one_work+0x150/0x460
worker_thread+0x50/0x4b8
kthread+0xd4/0xe8
ret_from_fork+0x10/0x50

Thanks,
Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
  2016-11-10 22:35     ` Baicar, Tyler
@ 2016-11-13  8:34       ` Neftin, Sasha
  -1 siblings, 0 replies; 25+ messages in thread
From: Neftin, Sasha @ 2016-11-13  8:34 UTC (permalink / raw)
  To: Baicar, Tyler, jeffrey.t.kirsher, intel-wired-lan, netdev,
	linux-kernel, okaya, timur

On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
> Hello Sasha,
> 
> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>> Move IRQ free code so that it will happen regardless of the
>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>> because the IRQ still has action since it was never freed. A
>>> secondary bus reset can cause this case to happen.
>>>
>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>> ---
>>>   drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>> index 7017281..36cfcb0 100644
>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>         if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>           e1000e_down(adapter, true);
>>> -        e1000_free_irq(adapter);
>>>             /* Link status message must follow this format */
>>>           pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>       }               
>>>   +    e1000_free_irq(adapter);
>>> +
>>>       napi_disable(&adapter->napi);
>>>         e1000e_free_tx_resources(adapter->tx_ring);
>>>
>> I would like not recommend insert this change. This change related
>> driver state machine, we afraid from lot of synchronization problem and
>> issues.
>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
> 
> What do you mean here? There is no loop. If __E1000_DOWN is set then we
> will never free the IRQ.
> 
>> Another point, does before execute secondary bus reset your SW back up
>> pcie configuration space as properly?
> 
> After a secondary bus reset, the link needs to recover and go back to a
> working state after 1 second.
> 
> From the callstack, the issue is happening while removing the endpoint
> from the system, before applying the secondary bus reset.
> 
> The order of events is
> 1. remove the drivers
> 2. cause a secondary bus reset
> 3. wait 1 second
Actually, this is too much, usually link up in less than 100ms.You can
check Data Link Layer indication.
> 4. recover the link
> 
> callstack:
> free_msi_irqs+0x6c/0x1a8
> pci_disable_msi+0xb0/0x148
> e1000e_reset_interrupt_capability+0x60/0x78
> e1000_remove+0xc8/0x180
> pci_device_remove+0x48/0x118
> __device_release_driver+0x80/0x108
> device_release_driver+0x2c/0x40
> pci_stop_bus_device+0xa0/0xb0
> pci_stop_bus_device+0x3c/0xb0
> pci_stop_root_bus+0x54/0x80
> acpi_pci_root_remove+0x28/0x64
> acpi_bus_trim+0x6c/0xa4
> acpi_device_hotplug+0x19c/0x3f4
> acpi_hotplug_work_fn+0x28/0x3c
> process_one_work+0x150/0x460
> worker_thread+0x50/0x4b8
> kthread+0xd4/0xe8
> ret_from_fork+0x10/0x50
> 
> Thanks,
> Tyler
> 
Hello Tyler,
Okay, we need consult more about this suggestion.
May I ask what is setup you run? Is there NIC or on board LAN? I would
like try reproduce this issue in our lab's too.
Also, is same issue observed with same scenario and others NIC's too?
Sasha

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
@ 2016-11-13  8:34       ` Neftin, Sasha
  0 siblings, 0 replies; 25+ messages in thread
From: Neftin, Sasha @ 2016-11-13  8:34 UTC (permalink / raw)
  To: intel-wired-lan

On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
> Hello Sasha,
> 
> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>> Move IRQ free code so that it will happen regardless of the
>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>> because the IRQ still has action since it was never freed. A
>>> secondary bus reset can cause this case to happen.
>>>
>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>> ---
>>>   drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>> index 7017281..36cfcb0 100644
>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>         if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>           e1000e_down(adapter, true);
>>> -        e1000_free_irq(adapter);
>>>             /* Link status message must follow this format */
>>>           pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>       }               
>>>   +    e1000_free_irq(adapter);
>>> +
>>>       napi_disable(&adapter->napi);
>>>         e1000e_free_tx_resources(adapter->tx_ring);
>>>
>> I would like not recommend insert this change. This change related
>> driver state machine, we afraid from lot of synchronization problem and
>> issues.
>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
> 
> What do you mean here? There is no loop. If __E1000_DOWN is set then we
> will never free the IRQ.
> 
>> Another point, does before execute secondary bus reset your SW back up
>> pcie configuration space as properly?
> 
> After a secondary bus reset, the link needs to recover and go back to a
> working state after 1 second.
> 
> From the callstack, the issue is happening while removing the endpoint
> from the system, before applying the secondary bus reset.
> 
> The order of events is
> 1. remove the drivers
> 2. cause a secondary bus reset
> 3. wait 1 second
Actually, this is too much, usually link up in less than 100ms.You can
check Data Link Layer indication.
> 4. recover the link
> 
> callstack:
> free_msi_irqs+0x6c/0x1a8
> pci_disable_msi+0xb0/0x148
> e1000e_reset_interrupt_capability+0x60/0x78
> e1000_remove+0xc8/0x180
> pci_device_remove+0x48/0x118
> __device_release_driver+0x80/0x108
> device_release_driver+0x2c/0x40
> pci_stop_bus_device+0xa0/0xb0
> pci_stop_bus_device+0x3c/0xb0
> pci_stop_root_bus+0x54/0x80
> acpi_pci_root_remove+0x28/0x64
> acpi_bus_trim+0x6c/0xa4
> acpi_device_hotplug+0x19c/0x3f4
> acpi_hotplug_work_fn+0x28/0x3c
> process_one_work+0x150/0x460
> worker_thread+0x50/0x4b8
> kthread+0xd4/0xe8
> ret_from_fork+0x10/0x50
> 
> Thanks,
> Tyler
> 
Hello Tyler,
Okay, we need consult more about this suggestion.
May I ask what is setup you run? Is there NIC or on board LAN? I would
like try reproduce this issue in our lab's too.
Also, is same issue observed with same scenario and others NIC's too?
Sasha

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
  2016-11-13  8:34       ` Neftin, Sasha
@ 2016-11-13  9:25         ` Neftin, Sasha
  -1 siblings, 0 replies; 25+ messages in thread
From: Neftin, Sasha @ 2016-11-13  9:25 UTC (permalink / raw)
  To: Baicar, Tyler, jeffrey.t.kirsher, intel-wired-lan, netdev,
	linux-kernel, okaya, timur

On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>> Hello Sasha,
>>
>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>> Move IRQ free code so that it will happen regardless of the
>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>>> because the IRQ still has action since it was never freed. A
>>>> secondary bus reset can cause this case to happen.
>>>>
>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>> ---
>>>>   drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> index 7017281..36cfcb0 100644
>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>         if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>           e1000e_down(adapter, true);
>>>> -        e1000_free_irq(adapter);
>>>>             /* Link status message must follow this format */
>>>>           pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>       }               
>>>>   +    e1000_free_irq(adapter);
>>>> +
>>>>       napi_disable(&adapter->napi);
>>>>         e1000e_free_tx_resources(adapter->tx_ring);
>>>>
>>> I would like not recommend insert this change. This change related
>>> driver state machine, we afraid from lot of synchronization problem and
>>> issues.
>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>
>> What do you mean here? There is no loop. If __E1000_DOWN is set then we
>> will never free the IRQ.
>>
>>> Another point, does before execute secondary bus reset your SW back up
>>> pcie configuration space as properly?
>>
>> After a secondary bus reset, the link needs to recover and go back to a
>> working state after 1 second.
>>
>> From the callstack, the issue is happening while removing the endpoint
>> from the system, before applying the secondary bus reset.
>>
>> The order of events is
>> 1. remove the drivers
>> 2. cause a secondary bus reset
>> 3. wait 1 second
> Actually, this is too much, usually link up in less than 100ms.You can
> check Data Link Layer indication.
>> 4. recover the link
>>
>> callstack:
>> free_msi_irqs+0x6c/0x1a8
>> pci_disable_msi+0xb0/0x148
>> e1000e_reset_interrupt_capability+0x60/0x78
>> e1000_remove+0xc8/0x180
>> pci_device_remove+0x48/0x118
>> __device_release_driver+0x80/0x108
>> device_release_driver+0x2c/0x40
>> pci_stop_bus_device+0xa0/0xb0
>> pci_stop_bus_device+0x3c/0xb0
>> pci_stop_root_bus+0x54/0x80
>> acpi_pci_root_remove+0x28/0x64
>> acpi_bus_trim+0x6c/0xa4
>> acpi_device_hotplug+0x19c/0x3f4
>> acpi_hotplug_work_fn+0x28/0x3c
>> process_one_work+0x150/0x460
>> worker_thread+0x50/0x4b8
>> kthread+0xd4/0xe8
>> ret_from_fork+0x10/0x50
>>
>> Thanks,
>> Tyler
>>
> Hello Tyler,
> Okay, we need consult more about this suggestion.
> May I ask what is setup you run? Is there NIC or on board LAN? I would
> like try reproduce this issue in our lab's too.
> Also, is same issue observed with same scenario and others NIC's too?
> Sasha
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@lists.osuosl.org
> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
> 
Please, specify what is device used.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
@ 2016-11-13  9:25         ` Neftin, Sasha
  0 siblings, 0 replies; 25+ messages in thread
From: Neftin, Sasha @ 2016-11-13  9:25 UTC (permalink / raw)
  To: intel-wired-lan

On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>> Hello Sasha,
>>
>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>> Move IRQ free code so that it will happen regardless of the
>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>>> because the IRQ still has action since it was never freed. A
>>>> secondary bus reset can cause this case to happen.
>>>>
>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>> ---
>>>>   drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> index 7017281..36cfcb0 100644
>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>         if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>           e1000e_down(adapter, true);
>>>> -        e1000_free_irq(adapter);
>>>>             /* Link status message must follow this format */
>>>>           pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>       }               
>>>>   +    e1000_free_irq(adapter);
>>>> +
>>>>       napi_disable(&adapter->napi);
>>>>         e1000e_free_tx_resources(adapter->tx_ring);
>>>>
>>> I would like not recommend insert this change. This change related
>>> driver state machine, we afraid from lot of synchronization problem and
>>> issues.
>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>
>> What do you mean here? There is no loop. If __E1000_DOWN is set then we
>> will never free the IRQ.
>>
>>> Another point, does before execute secondary bus reset your SW back up
>>> pcie configuration space as properly?
>>
>> After a secondary bus reset, the link needs to recover and go back to a
>> working state after 1 second.
>>
>> From the callstack, the issue is happening while removing the endpoint
>> from the system, before applying the secondary bus reset.
>>
>> The order of events is
>> 1. remove the drivers
>> 2. cause a secondary bus reset
>> 3. wait 1 second
> Actually, this is too much, usually link up in less than 100ms.You can
> check Data Link Layer indication.
>> 4. recover the link
>>
>> callstack:
>> free_msi_irqs+0x6c/0x1a8
>> pci_disable_msi+0xb0/0x148
>> e1000e_reset_interrupt_capability+0x60/0x78
>> e1000_remove+0xc8/0x180
>> pci_device_remove+0x48/0x118
>> __device_release_driver+0x80/0x108
>> device_release_driver+0x2c/0x40
>> pci_stop_bus_device+0xa0/0xb0
>> pci_stop_bus_device+0x3c/0xb0
>> pci_stop_root_bus+0x54/0x80
>> acpi_pci_root_remove+0x28/0x64
>> acpi_bus_trim+0x6c/0xa4
>> acpi_device_hotplug+0x19c/0x3f4
>> acpi_hotplug_work_fn+0x28/0x3c
>> process_one_work+0x150/0x460
>> worker_thread+0x50/0x4b8
>> kthread+0xd4/0xe8
>> ret_from_fork+0x10/0x50
>>
>> Thanks,
>> Tyler
>>
> Hello Tyler,
> Okay, we need consult more about this suggestion.
> May I ask what is setup you run? Is there NIC or on board LAN? I would
> like try reproduce this issue in our lab's too.
> Also, is same issue observed with same scenario and others NIC's too?
> Sasha
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan at lists.osuosl.org
> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
> 
Please, specify what is device used.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
  2016-11-13  9:25         ` Neftin, Sasha
@ 2016-11-15 21:50           ` Baicar, Tyler
  -1 siblings, 0 replies; 25+ messages in thread
From: Baicar, Tyler @ 2016-11-15 21:50 UTC (permalink / raw)
  To: Neftin, Sasha, jeffrey.t.kirsher, intel-wired-lan, netdev,
	linux-kernel, okaya, timur

On 11/13/2016 2:25 AM, Neftin, Sasha wrote:
> On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
>> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>>> Hello Sasha,
>>>
>>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>>> Move IRQ free code so that it will happen regardless of the
>>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>>>> because the IRQ still has action since it was never freed. A
>>>>> secondary bus reset can cause this case to happen.
>>>>>
>>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>>> ---
>>>>>    drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> index 7017281..36cfcb0 100644
>>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>>          if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>>            e1000e_down(adapter, true);
>>>>> -        e1000_free_irq(adapter);
>>>>>              /* Link status message must follow this format */
>>>>>            pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>>        }
>>>>>    +    e1000_free_irq(adapter);
>>>>> +
>>>>>        napi_disable(&adapter->napi);
>>>>>          e1000e_free_tx_resources(adapter->tx_ring);
>>>>>
>>>> I would like not recommend insert this change. This change related
>>>> driver state machine, we afraid from lot of synchronization problem and
>>>> issues.
>>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>> What do you mean here? There is no loop. If __E1000_DOWN is set then we
>>> will never free the IRQ.
>>>
>>>> Another point, does before execute secondary bus reset your SW back up
>>>> pcie configuration space as properly?
>>> After a secondary bus reset, the link needs to recover and go back to a
>>> working state after 1 second.
>>>
>>>  From the callstack, the issue is happening while removing the endpoint
>>> from the system, before applying the secondary bus reset.
>>>
>>> The order of events is
>>> 1. remove the drivers
>>> 2. cause a secondary bus reset
>>> 3. wait 1 second
>> Actually, this is too much, usually link up in less than 100ms.You can
>> check Data Link Layer indication.
>>> 4. recover the link
>>>
>>> callstack:
>>> free_msi_irqs+0x6c/0x1a8
>>> pci_disable_msi+0xb0/0x148
>>> e1000e_reset_interrupt_capability+0x60/0x78
>>> e1000_remove+0xc8/0x180
>>> pci_device_remove+0x48/0x118
>>> __device_release_driver+0x80/0x108
>>> device_release_driver+0x2c/0x40
>>> pci_stop_bus_device+0xa0/0xb0
>>> pci_stop_bus_device+0x3c/0xb0
>>> pci_stop_root_bus+0x54/0x80
>>> acpi_pci_root_remove+0x28/0x64
>>> acpi_bus_trim+0x6c/0xa4
>>> acpi_device_hotplug+0x19c/0x3f4
>>> acpi_hotplug_work_fn+0x28/0x3c
>>> process_one_work+0x150/0x460
>>> worker_thread+0x50/0x4b8
>>> kthread+0xd4/0xe8
>>> ret_from_fork+0x10/0x50
>>>
>>> Thanks,
>>> Tyler
>>>
>> Hello Tyler,
>> Okay, we need consult more about this suggestion.
>> May I ask what is setup you run? Is there NIC or on board LAN? I would
>> like try reproduce this issue in our lab's too.
>> Also, is same issue observed with same scenario and others NIC's too?
>> Sasha
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan@lists.osuosl.org
>> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>>
> Please, specify what is device used.
Hello Sasha,
This was on a QDF2432 using an Intel PRO/1000 PT Dual Port server 
adapter. I have not tried
other e1000e PCIe cards, but have not seen any similar issues with 
Mellanox cards. I'm able to
reproduce it with just pulling the card out. Here is the lspci -vvv 
output for this card:

0004:00:00.0 PCI bridge: Airgo Networks, Inc. Device 0400 (prog-if 00 
[Normal decode])
         Physical Slot: 5
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
ParErr- Stepping- SERR- FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
<TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0
         Interrupt: pin A routed to IRQ 297
         Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
         I/O behind bridge: 00002000-00002fff
         Memory behind bridge: 00100000-002fffff
         Prefetchable memory behind bridge: 
00000c0400000000-00000c04001fffff
         Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- 
<TAbort- <MAbort- <SERR- <PERR-
         BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
                 PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
         Capabilities: [40] Power Management version 3
                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
PME(D0+,D1-,D2-,D3hot+,D3cold-)
                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
         Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
                 Address: 00000000397f0040  Data: 0000
         Capabilities: [70] Express (v2) Root Port (Slot+), MSI 00
                 DevCap: MaxPayload 512 bytes, PhantFunc 0
                         ExtTag- RBE+
                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
Unsupported+
                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 256 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- 
AuxPwr- TransPend-
                 LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, 
Exit Latency L0s <1us, L1 <16us
                         ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
                 LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk-
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ 
DLActive+ BWMgmt- ABWMgmt-
                 SltCap: AttnBtn+ PwrCtrl+ MRL+ AttnInd+ PwrInd+ 
HotPlug+ Surprise+
                         Slot #5, PowerLimit 75.000W; Interlock+ NoCompl-
                 SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- 
HPIrq- LinkChg-
                         Control: AttnInd Off, PwrInd Off, Power- Interlock-
                 SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- 
PresDet- Interlock-
                         Changed: MRL- PresDet- LinkState-
                 RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- 
PMEIntEna+ CRSVisible-
                 RootCap: CRSVisible-
                 RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                 DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, 
LTR+, OBFF Not Supported ARIFwd+
                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, 
LTR-, OBFF Disabled ARIFwd-
                 LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- 
SpeedDis-
                          Transmit Margin: Normal Operating Range, 
EnterModifiedCompliance- ComplianceSOS-
                          Compliance De-emphasis: -6dB
                 LnkSta2: Current De-emphasis Level: -3.5dB, 
EqualizationComplete-, EqualizationPhase1-
                          EqualizationPhase2-, EqualizationPhase3-, 
LinkEqualizationRequest-
         Capabilities: [100 v2] Advanced Error Reporting
                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr-
                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr+
                 AERCap: First Error Pointer: 00, GenCap+ CGenEn+ 
ChkCap+ ChkEn+
         Capabilities: [178 v1] #19
         Kernel driver in use: pcieport

0004:01:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit 
Ethernet Controller (rev 06)
         Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
         Physical Slot: 5-1
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
ParErr- Stepping- SERR+ FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
<TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 128 bytes
         Interrupt: pin A routed to IRQ 299
         Region 0: Memory at c0100100000 (32-bit, non-prefetchable) 
[size=128K]
         Region 1: Memory at c0100120000 (32-bit, non-prefetchable) 
[size=128K]
         Region 2: I/O ports at 1000 [size=32]
         Expansion ROM at c0100140000 [disabled] [size=128K]
         Capabilities: [c8] Power Management version 2
                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                 Address: 00000000397f0040  Data: 0000
         Capabilities: [e0] Express (v1) Endpoint, MSI 00
                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s 
<512ns, L1 <64us
                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
Unsupported+
                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 256 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- 
AuxPwr+ TransPend-
                 LnkCap: Port #8, Speed 2.5GT/s, Width x4, ASPM L0s, 
Exit Latency L0s <4us, L1 <64us
                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ 
DLActive- BWMgmt- ABWMgmt-
         Capabilities: [100 v1] Advanced Error Reporting
                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr-
                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr-
                 AERCap: First Error Pointer: 14, GenCap- CGenEn- 
ChkCap- ChkEn-
         Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-3e-5b-7a
         Kernel driver in use: e1000e

0004:01:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit 
Ethernet Controller (rev 06)
         Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
         Physical Slot: 5-1
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
ParErr- Stepping- SERR+ FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
<TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 128 bytes
         Interrupt: pin B routed to IRQ 301
         Region 0: Memory at c0100160000 (32-bit, non-prefetchable) 
[size=128K]
         Region 1: Memory at c0100180000 (32-bit, non-prefetchable) 
[size=128K]
         Region 2: I/O ports at 1020 [size=32]
         Expansion ROM at c01001a0000 [disabled] [size=128K]
         Capabilities: [c8] Power Management version 2
                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                 Address: 00000000397f0040  Data: 0000
         Capabilities: [e0] Express (v1) Endpoint, MSI 00
                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s 
<512ns, L1 <64us
                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
Unsupported+
                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 256 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- 
AuxPwr+ TransPend-
                 LnkCap: Port #8, Speed 2.5GT/s, Width x4, ASPM L0s, 
Exit Latency L0s <4us, L1 <64us
                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ 
DLActive- BWMgmt- ABWMgmt-
         Capabilities: [100 v1] Advanced Error Reporting
                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr-
                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr-
                 AERCap: First Error Pointer: 14, GenCap- CGenEn- 
ChkCap- ChkEn-
         Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-3e-5b-7a
         Kernel driver in use: e1000e

Thanks,
Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
@ 2016-11-15 21:50           ` Baicar, Tyler
  0 siblings, 0 replies; 25+ messages in thread
From: Baicar, Tyler @ 2016-11-15 21:50 UTC (permalink / raw)
  To: intel-wired-lan

On 11/13/2016 2:25 AM, Neftin, Sasha wrote:
> On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
>> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>>> Hello Sasha,
>>>
>>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>>> Move IRQ free code so that it will happen regardless of the
>>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>>>> because the IRQ still has action since it was never freed. A
>>>>> secondary bus reset can cause this case to happen.
>>>>>
>>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>>> ---
>>>>>    drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> index 7017281..36cfcb0 100644
>>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>>          if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>>            e1000e_down(adapter, true);
>>>>> -        e1000_free_irq(adapter);
>>>>>              /* Link status message must follow this format */
>>>>>            pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>>        }
>>>>>    +    e1000_free_irq(adapter);
>>>>> +
>>>>>        napi_disable(&adapter->napi);
>>>>>          e1000e_free_tx_resources(adapter->tx_ring);
>>>>>
>>>> I would like not recommend insert this change. This change related
>>>> driver state machine, we afraid from lot of synchronization problem and
>>>> issues.
>>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>> What do you mean here? There is no loop. If __E1000_DOWN is set then we
>>> will never free the IRQ.
>>>
>>>> Another point, does before execute secondary bus reset your SW back up
>>>> pcie configuration space as properly?
>>> After a secondary bus reset, the link needs to recover and go back to a
>>> working state after 1 second.
>>>
>>>  From the callstack, the issue is happening while removing the endpoint
>>> from the system, before applying the secondary bus reset.
>>>
>>> The order of events is
>>> 1. remove the drivers
>>> 2. cause a secondary bus reset
>>> 3. wait 1 second
>> Actually, this is too much, usually link up in less than 100ms.You can
>> check Data Link Layer indication.
>>> 4. recover the link
>>>
>>> callstack:
>>> free_msi_irqs+0x6c/0x1a8
>>> pci_disable_msi+0xb0/0x148
>>> e1000e_reset_interrupt_capability+0x60/0x78
>>> e1000_remove+0xc8/0x180
>>> pci_device_remove+0x48/0x118
>>> __device_release_driver+0x80/0x108
>>> device_release_driver+0x2c/0x40
>>> pci_stop_bus_device+0xa0/0xb0
>>> pci_stop_bus_device+0x3c/0xb0
>>> pci_stop_root_bus+0x54/0x80
>>> acpi_pci_root_remove+0x28/0x64
>>> acpi_bus_trim+0x6c/0xa4
>>> acpi_device_hotplug+0x19c/0x3f4
>>> acpi_hotplug_work_fn+0x28/0x3c
>>> process_one_work+0x150/0x460
>>> worker_thread+0x50/0x4b8
>>> kthread+0xd4/0xe8
>>> ret_from_fork+0x10/0x50
>>>
>>> Thanks,
>>> Tyler
>>>
>> Hello Tyler,
>> Okay, we need consult more about this suggestion.
>> May I ask what is setup you run? Is there NIC or on board LAN? I would
>> like try reproduce this issue in our lab's too.
>> Also, is same issue observed with same scenario and others NIC's too?
>> Sasha
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan at lists.osuosl.org
>> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>>
> Please, specify what is device used.
Hello Sasha,
This was on a QDF2432 using an Intel PRO/1000 PT Dual Port server 
adapter. I have not tried
other e1000e PCIe cards, but have not seen any similar issues with 
Mellanox cards. I'm able to
reproduce it with just pulling the card out. Here is the lspci -vvv 
output for this card:

0004:00:00.0 PCI bridge: Airgo Networks, Inc. Device 0400 (prog-if 00 
[Normal decode])
         Physical Slot: 5
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
ParErr- Stepping- SERR- FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
<TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0
         Interrupt: pin A routed to IRQ 297
         Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
         I/O behind bridge: 00002000-00002fff
         Memory behind bridge: 00100000-002fffff
         Prefetchable memory behind bridge: 
00000c0400000000-00000c04001fffff
         Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- 
<TAbort- <MAbort- <SERR- <PERR-
         BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
                 PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
         Capabilities: [40] Power Management version 3
                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
PME(D0+,D1-,D2-,D3hot+,D3cold-)
                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
         Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
                 Address: 00000000397f0040  Data: 0000
         Capabilities: [70] Express (v2) Root Port (Slot+), MSI 00
                 DevCap: MaxPayload 512 bytes, PhantFunc 0
                         ExtTag- RBE+
                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
Unsupported+
                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 256 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- 
AuxPwr- TransPend-
                 LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, 
Exit Latency L0s <1us, L1 <16us
                         ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
                 LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk-
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ 
DLActive+ BWMgmt- ABWMgmt-
                 SltCap: AttnBtn+ PwrCtrl+ MRL+ AttnInd+ PwrInd+ 
HotPlug+ Surprise+
                         Slot #5, PowerLimit 75.000W; Interlock+ NoCompl-
                 SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- 
HPIrq- LinkChg-
                         Control: AttnInd Off, PwrInd Off, Power- Interlock-
                 SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- 
PresDet- Interlock-
                         Changed: MRL- PresDet- LinkState-
                 RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- 
PMEIntEna+ CRSVisible-
                 RootCap: CRSVisible-
                 RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                 DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, 
LTR+, OBFF Not Supported ARIFwd+
                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, 
LTR-, OBFF Disabled ARIFwd-
                 LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- 
SpeedDis-
                          Transmit Margin: Normal Operating Range, 
EnterModifiedCompliance- ComplianceSOS-
                          Compliance De-emphasis: -6dB
                 LnkSta2: Current De-emphasis Level: -3.5dB, 
EqualizationComplete-, EqualizationPhase1-
                          EqualizationPhase2-, EqualizationPhase3-, 
LinkEqualizationRequest-
         Capabilities: [100 v2] Advanced Error Reporting
                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr-
                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr+
                 AERCap: First Error Pointer: 00, GenCap+ CGenEn+ 
ChkCap+ ChkEn+
         Capabilities: [178 v1] #19
         Kernel driver in use: pcieport

0004:01:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit 
Ethernet Controller (rev 06)
         Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
         Physical Slot: 5-1
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
ParErr- Stepping- SERR+ FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
<TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 128 bytes
         Interrupt: pin A routed to IRQ 299
         Region 0: Memory at c0100100000 (32-bit, non-prefetchable) 
[size=128K]
         Region 1: Memory at c0100120000 (32-bit, non-prefetchable) 
[size=128K]
         Region 2: I/O ports at 1000 [size=32]
         Expansion ROM@c0100140000 [disabled] [size=128K]
         Capabilities: [c8] Power Management version 2
                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                 Address: 00000000397f0040  Data: 0000
         Capabilities: [e0] Express (v1) Endpoint, MSI 00
                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s 
<512ns, L1 <64us
                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
Unsupported+
                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 256 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- 
AuxPwr+ TransPend-
                 LnkCap: Port #8, Speed 2.5GT/s, Width x4, ASPM L0s, 
Exit Latency L0s <4us, L1 <64us
                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ 
DLActive- BWMgmt- ABWMgmt-
         Capabilities: [100 v1] Advanced Error Reporting
                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr-
                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr-
                 AERCap: First Error Pointer: 14, GenCap- CGenEn- 
ChkCap- ChkEn-
         Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-3e-5b-7a
         Kernel driver in use: e1000e

0004:01:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit 
Ethernet Controller (rev 06)
         Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
         Physical Slot: 5-1
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
ParErr- Stepping- SERR+ FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
<TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 128 bytes
         Interrupt: pin B routed to IRQ 301
         Region 0: Memory at c0100160000 (32-bit, non-prefetchable) 
[size=128K]
         Region 1: Memory at c0100180000 (32-bit, non-prefetchable) 
[size=128K]
         Region 2: I/O ports at 1020 [size=32]
         Expansion ROM@c01001a0000 [disabled] [size=128K]
         Capabilities: [c8] Power Management version 2
                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                 Address: 00000000397f0040  Data: 0000
         Capabilities: [e0] Express (v1) Endpoint, MSI 00
                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s 
<512ns, L1 <64us
                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
Unsupported+
                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 256 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- 
AuxPwr+ TransPend-
                 LnkCap: Port #8, Speed 2.5GT/s, Width x4, ASPM L0s, 
Exit Latency L0s <4us, L1 <64us
                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ 
DLActive- BWMgmt- ABWMgmt-
         Capabilities: [100 v1] Advanced Error Reporting
                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr-
                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr-
                 AERCap: First Error Pointer: 14, GenCap- CGenEn- 
ChkCap- ChkEn-
         Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-3e-5b-7a
         Kernel driver in use: e1000e

Thanks,
Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
       [not found]           ` <630A6B92B7EDEB45A87E20D3D286660153E3B481@hasmsx109.ger.corp.intel.com>
@ 2016-11-17 13:07             ` Neftin, Sasha
  2016-11-17 13:17               ` [Intel-wired-lan] Fwd: " Neftin, Sasha
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: Neftin, Sasha @ 2016-11-17 13:07 UTC (permalink / raw)
  To: intel-wired-lan@lists.osuosl.org; netdev, linux-kernel, Tyler Baicar

> -----Original Message-----
> From: Baicar, Tyler [mailto:tbaicar@codeaurora.org] 
> Sent: Tuesday, November 15, 2016 11:50 PM
> To: Neftin, Sasha <sasha.neftin@intel.com>; Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>; intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; okaya@codeaurora.org; timur@codeaurora.org
> Subject: Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
> 
> On 11/13/2016 2:25 AM, Neftin, Sasha wrote:
>> On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
>>> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>>>> Hello Sasha,
>>>>
>>>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>>>> Move IRQ free code so that it will happen regardless of the 
>>>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its 
>>>>>> IRQ if the __E1000_DOWN bit is cleared. This is not sufficient 
>>>>>> because it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>>>> In such a situation, we will hit a kernel bug later in 
>>>>>> e1000_remove because the IRQ still has action since it was never 
>>>>>> freed. A secondary bus reset can cause this case to happen.
>>>>>>
>>>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>>>> ---
>>>>>>    drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> index 7017281..36cfcb0 100644
>>>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>>>          if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>>>            e1000e_down(adapter, true);
>>>>>> -        e1000_free_irq(adapter);
>>>>>>              /* Link status message must follow this format */
>>>>>>            pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>>>        }
>>>>>>    +    e1000_free_irq(adapter);
>>>>>> +
>>>>>>        napi_disable(&adapter->napi);
>>>>>>          e1000e_free_tx_resources(adapter->tx_ring);
>>>>>>
>>>>> I would like not recommend insert this change. This change related 
>>>>> driver state machine, we afraid from lot of synchronization problem 
>>>>> and issues.
>>>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>>> What do you mean here? There is no loop. If __E1000_DOWN is set then 
>>>> we will never free the IRQ.
>>>>
>>>>> Another point, does before execute secondary bus reset your SW back 
>>>>> up pcie configuration space as properly?
>>>> After a secondary bus reset, the link needs to recover and go back 
>>>> to a working state after 1 second.
>>>>
>>>>  From the callstack, the issue is happening while removing the 
>>>> endpoint from the system, before applying the secondary bus reset.
>>>>
>>>> The order of events is
>>>> 1. remove the drivers
>>>> 2. cause a secondary bus reset
>>>> 3. wait 1 second
>>> Actually, this is too much, usually link up in less than 100ms.You 
>>> can check Data Link Layer indication.
>>>> 4. recover the link
>>>>
>>>> callstack:
>>>> free_msi_irqs+0x6c/0x1a8
>>>> pci_disable_msi+0xb0/0x148
>>>> e1000e_reset_interrupt_capability+0x60/0x78
>>>> e1000_remove+0xc8/0x180
>>>> pci_device_remove+0x48/0x118
>>>> __device_release_driver+0x80/0x108
>>>> device_release_driver+0x2c/0x40
>>>> pci_stop_bus_device+0xa0/0xb0
>>>> pci_stop_bus_device+0x3c/0xb0
>>>> pci_stop_root_bus+0x54/0x80
>>>> acpi_pci_root_remove+0x28/0x64
>>>> acpi_bus_trim+0x6c/0xa4
>>>> acpi_device_hotplug+0x19c/0x3f4
>>>> acpi_hotplug_work_fn+0x28/0x3c
>>>> process_one_work+0x150/0x460
>>>> worker_thread+0x50/0x4b8
>>>> kthread+0xd4/0xe8
>>>> ret_from_fork+0x10/0x50
>>>>
>>>> Thanks,
>>>> Tyler
>>>>
>>> Hello Tyler,
>>> Okay, we need consult more about this suggestion.
>>> May I ask what is setup you run? Is there NIC or on board LAN? I 
>>> would like try reproduce this issue in our lab's too.
>>> Also, is same issue observed with same scenario and others NIC's too?
>>> Sasha
>>> _______________________________________________
>>> Intel-wired-lan mailing list
>>> Intel-wired-lan@lists.osuosl.org
>>> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>>>
>> Please, specify what is device used.
> Hello Sasha,
> This was on a QDF2432 using an Intel PRO/1000 PT Dual Port server adapter. I have not tried other e1000e PCIe cards, but have not seen any similar issues with Mellanox cards. I'm able to reproduce it with just pulling the card out. Here is the lspci -vvv output for this card:
> 
> 0004:00:00.0 PCI bridge: Airgo Networks, Inc. Device 0400 (prog-if 00 [Normal decode])
>          Physical Slot: 5
>          Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR- FastB2B- DisINTx+
>          Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>          Latency: 0
>          Interrupt: pin A routed to IRQ 297
>          Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
>          I/O behind bridge: 00002000-00002fff
>          Memory behind bridge: 00100000-002fffff
>          Prefetchable memory behind bridge: 
> 00000c0400000000-00000c04001fffff
>          Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- <SERR- <PERR-
>          BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
>                  PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>          Capabilities: [40] Power Management version 3
>                  Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
> PME(D0+,D1-,D2-,D3hot+,D3cold-)
>                  Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>          Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
>                  Address: 00000000397f0040  Data: 0000
>          Capabilities: [70] Express (v2) Root Port (Slot+), MSI 00
>                  DevCap: MaxPayload 512 bytes, PhantFunc 0
>                          ExtTag- RBE+
>                  DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
> Unsupported+
>                          RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>                          MaxPayload 256 bytes, MaxReadReq 512 bytes
>                  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
> AuxPwr- TransPend-
>                  LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s <1us, L1 <16us
>                          ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
>                  LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk-
>                          ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                  LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ 
> DLActive+ BWMgmt- ABWMgmt-
>                  SltCap: AttnBtn+ PwrCtrl+ MRL+ AttnInd+ PwrInd+ 
> HotPlug+ Surprise+
>                          Slot #5, PowerLimit 75.000W; Interlock+ NoCompl-
>                  SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt-
> HPIrq- LinkChg-
>                          Control: AttnInd Off, PwrInd Off, Power- Interlock-
>                  SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt-
> PresDet- Interlock-
>                          Changed: MRL- PresDet- LinkState-
>                  RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- 
> PMEIntEna+ CRSVisible-
>                  RootCap: CRSVisible-
>                  RootSta: PME ReqID 0000, PMEStatus- PMEPending-
>                  DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, 
> LTR+, OBFF Not Supported ARIFwd+
>                  DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
>                  LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance-
> SpeedDis-
>                           Transmit Margin: Normal Operating Range,
> EnterModifiedCompliance- ComplianceSOS-
>                           Compliance De-emphasis: -6dB
>                  LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
>                           EqualizationPhase2-, EqualizationPhase3-,
> LinkEqualizationRequest-
>          Capabilities: [100 v2] Advanced Error Reporting
>                  UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                  UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                  UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                  CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr-
>                  CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
> NonFatalErr+
>                  AERCap: First Error Pointer: 00, GenCap+ CGenEn+ 
> ChkCap+ ChkEn+
>          Capabilities: [178 v1] #19
>          Kernel driver in use: pcieport
> 
> 0004:01:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
>          Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
>          Physical Slot: 5-1
>          Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR+ FastB2B- DisINTx+
>          Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>          Latency: 0, Cache Line Size: 128 bytes
>          Interrupt: pin A routed to IRQ 299
>          Region 0: Memory at c0100100000 (32-bit, non-prefetchable) [size=128K]
>          Region 1: Memory at c0100120000 (32-bit, non-prefetchable) [size=128K]
>          Region 2: I/O ports at 1000 [size=32]
>          Expansion ROM at c0100140000 [disabled] [size=128K]
>          Capabilities: [c8] Power Management version 2
>                  Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                  Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>          Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>                  Address: 00000000397f0040  Data: 0000
>          Capabilities: [e0] Express (v1) Endpoint, MSI 00
>                  DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>                          ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
>                  DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
> Unsupported+
>                          RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>                          MaxPayload 256 bytes, MaxReadReq 512 bytes
>                  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- 
> AuxPwr+ TransPend-
>                  LnkCap: Port #8, Speed 2.5GT/s, Width x4, ASPM L0s, Exit Latency L0s <4us, L1 <64us
>                          ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>                  LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
>                          ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                  LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+
> DLActive- BWMgmt- ABWMgmt-
>          Capabilities: [100 v1] Advanced Error Reporting
>                  UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
>                  UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                  UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                  CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr-
>                  CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr-
>                  AERCap: First Error Pointer: 14, GenCap- CGenEn-
> ChkCap- ChkEn-
>          Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-3e-5b-7a
>          Kernel driver in use: e1000e
> 
> 0004:01:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
>          Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
>          Physical Slot: 5-1
>          Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR+ FastB2B- DisINTx+
>          Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>          Latency: 0, Cache Line Size: 128 bytes
>          Interrupt: pin B routed to IRQ 301
>          Region 0: Memory at c0100160000 (32-bit, non-prefetchable) [size=128K]
>          Region 1: Memory at c0100180000 (32-bit, non-prefetchable) [size=128K]
>          Region 2: I/O ports at 1020 [size=32]
>          Expansion ROM at c01001a0000 [disabled] [size=128K]
>          Capabilities: [c8] Power Management version 2
>                  Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                  Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>          Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>                  Address: 00000000397f0040  Data: 0000
>          Capabilities: [e0] Express (v1) Endpoint, MSI 00
>                  DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>                          ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
>                  DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
> Unsupported+
>                          RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>                          MaxPayload 256 bytes, MaxReadReq 512 bytes
>                  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- 
> AuxPwr+ TransPend-
>                  LnkCap: Port #8, Speed 2.5GT/s, Width x4, ASPM L0s, Exit Latency L0s <4us, L1 <64us
>                          ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>                  LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
>                          ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                  LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+
> DLActive- BWMgmt- ABWMgmt-
>          Capabilities: [100 v1] Advanced Error Reporting
>                  UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
>                  UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                  UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                  CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr-
>                  CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr-
>                  AERCap: First Error Pointer: 14, GenCap- CGenEn-
> ChkCap- ChkEn-
>          Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-3e-5b-7a
>          Kernel driver in use: e1000e
> 
> Thanks,
> Tyler
> 
Hello Tyler,
I see some in consistent implementation of __*_close method in our
drivers. Do you have any igb NIC to test if this issue persist there?
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Fwd:[Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
       [not found]           ` <630A6B92B7EDEB45A87E20D3D286660153E3B481@hasmsx109.ger.corp.intel.com>
@ 2016-11-17 13:17               ` Neftin, Sasha
  2016-11-17 13:17               ` [Intel-wired-lan] Fwd: " Neftin, Sasha
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: Neftin, Sasha @ 2016-11-17 13:17 UTC (permalink / raw)
  To: intel-wired-lan, Tyler Baicar, netdev

From: Baicar, Tyler [mailto:tbaicar@codeaurora.org] Sent: Tuesday,
November 15, 2016 11:50 PM
To: Neftin, Sasha <sasha.neftin@intel.com>; Kirsher, Jeffrey T
<jeffrey.t.kirsher@intel.com>; intel-wired-lan@lists.osuosl.org;
netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
okaya@codeaurora.org; timur@codeaurora.org
Subject: Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of
__E1000_DOWN

On 11/13/2016 2:25 AM, Neftin, Sasha wrote:
> On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
>> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>>> Hello Sasha,
>>>
>>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>>> Move IRQ free code so that it will happen regardless of the 
>>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its 
>>>>> IRQ if the __E1000_DOWN bit is cleared. This is not sufficient 
>>>>> because it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>>> In such a situation, we will hit a kernel bug later in 
>>>>> e1000_remove because the IRQ still has action since it was never 
>>>>> freed. A secondary bus reset can cause this case to happen.
>>>>>
>>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>>> ---
>>>>>    drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> index 7017281..36cfcb0 100644
>>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>>          if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>>            e1000e_down(adapter, true);
>>>>> -        e1000_free_irq(adapter);
>>>>>              /* Link status message must follow this format */
>>>>>            pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>>        }
>>>>>    +    e1000_free_irq(adapter);
>>>>> +
>>>>>        napi_disable(&adapter->napi);
>>>>>          e1000e_free_tx_resources(adapter->tx_ring);
>>>>>
>>>> I would like not recommend insert this change. This change related 
>>>> driver state machine, we afraid from lot of synchronization problem 
>>>> and issues.
>>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>> What do you mean here? There is no loop. If __E1000_DOWN is set then 
>>> we will never free the IRQ.
>>>
>>>> Another point, does before execute secondary bus reset your SW back 
>>>> up pcie configuration space as properly?
>>> After a secondary bus reset, the link needs to recover and go back 
>>> to a working state after 1 second.
>>>
>>>  From the callstack, the issue is happening while removing the 
>>> endpoint from the system, before applying the secondary bus reset.
>>>
>>> The order of events is
>>> 1. remove the drivers
>>> 2. cause a secondary bus reset
>>> 3. wait 1 second
>> Actually, this is too much, usually link up in less than 100ms.You 
>> can check Data Link Layer indication.
>>> 4. recover the link
>>>
>>> callstack:
>>> free_msi_irqs+0x6c/0x1a8
>>> pci_disable_msi+0xb0/0x148
>>> e1000e_reset_interrupt_capability+0x60/0x78
>>> e1000_remove+0xc8/0x180
>>> pci_device_remove+0x48/0x118
>>> __device_release_driver+0x80/0x108
>>> device_release_driver+0x2c/0x40
>>> pci_stop_bus_device+0xa0/0xb0
>>> pci_stop_bus_device+0x3c/0xb0
>>> pci_stop_root_bus+0x54/0x80
>>> acpi_pci_root_remove+0x28/0x64
>>> acpi_bus_trim+0x6c/0xa4
>>> acpi_device_hotplug+0x19c/0x3f4
>>> acpi_hotplug_work_fn+0x28/0x3c
>>> process_one_work+0x150/0x460
>>> worker_thread+0x50/0x4b8
>>> kthread+0xd4/0xe8
>>> ret_from_fork+0x10/0x50
>>>
>>> Thanks,
>>> Tyler
>>>
>> Hello Tyler,
>> Okay, we need consult more about this suggestion.
>> May I ask what is setup you run? Is there NIC or on board LAN? I 
>> would like try reproduce this issue in our lab's too.
>> Also, is same issue observed with same scenario and others NIC's too?
>> Sasha
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan@lists.osuosl.org
>> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>>
> Please, specify what is device used.
Hello Sasha,
This was on a QDF2432 using an Intel PRO/1000 PT Dual Port server
adapter. I have not tried other e1000e PCIe cards, but have not seen any
similar issues with Mellanox cards. I'm able to reproduce it with just
pulling the card out. Here is the lspci -vvv output for this card:

0004:00:00.0 PCI bridge: Airgo Networks, Inc. Device 0400 (prog-if 00
[Normal decode])
         Physical Slot: 5
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0
         Interrupt: pin A routed to IRQ 297
         Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
         I/O behind bridge: 00002000-00002fff
         Memory behind bridge: 00100000-002fffff
         Prefetchable memory behind bridge:
00000c0400000000-00000c04001fffff
         Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- <SERR- <PERR-
         BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
                 PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
         Capabilities: [40] Power Management version 3
                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold-)
                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
         Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
                 Address: 00000000397f0040  Data: 0000
         Capabilities: [70] Express (v2) Root Port (Slot+), MSI 00
                 DevCap: MaxPayload 512 bytes, PhantFunc 0
                         ExtTag- RBE+
                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
Unsupported+
                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 256 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
AuxPwr- TransPend-
                 LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1,
Exit Latency L0s <1us, L1 <16us
                         ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
                 LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk-
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+
DLActive+ BWMgmt- ABWMgmt-
                 SltCap: AttnBtn+ PwrCtrl+ MRL+ AttnInd+ PwrInd+
HotPlug+ Surprise+
                         Slot #5, PowerLimit 75.000W; Interlock+ NoCompl-
                 SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt-
HPIrq- LinkChg-
                         Control: AttnInd Off, PwrInd Off, Power- Interlock-
                 SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt-
PresDet- Interlock-
                         Changed: MRL- PresDet- LinkState-
                 RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal-
PMEIntEna+ CRSVisible-
                 RootCap: CRSVisible-
                 RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                 DevCap2: Completion Timeout: Range ABCD, TimeoutDis+,
LTR+, OBFF Not Supported ARIFwd+
                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-,
LTR-, OBFF Disabled ARIFwd-
                 LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance-
SpeedDis-
                          Transmit Margin: Normal Operating Range,
EnterModifiedCompliance- ComplianceSOS-
                          Compliance De-emphasis: -6dB
                 LnkSta2: Current De-emphasis Level: -3.5dB,
EqualizationComplete-, EqualizationPhase1-
                          EqualizationPhase2-, EqualizationPhase3-,
LinkEqualizationRequest-
         Capabilities: [100 v2] Advanced Error Reporting
                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt-
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr-
                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr+
                 AERCap: First Error Pointer: 00, GenCap+ CGenEn+
ChkCap+ ChkEn+
         Capabilities: [178 v1] #19
         Kernel driver in use: pcieport

0004:01:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit
Ethernet Controller (rev 06)
         Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
         Physical Slot: 5-1
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR+ FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 128 bytes
         Interrupt: pin A routed to IRQ 299
         Region 0: Memory at c0100100000 (32-bit, non-prefetchable)
[size=128K]
         Region 1: Memory at c0100120000 (32-bit, non-prefetchable)
[size=128K]
         Region 2: I/O ports at 1000 [size=32]
         Expansion ROM at c0100140000 [disabled] [size=128K]
         Capabilities: [c8] Power Management version 2
                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                 Address: 00000000397f0040  Data: 0000
         Capabilities: [e0] Express (v1) Endpoint, MSI 00
                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
<512ns, L1 <64us
                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
Unsupported+
                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 256 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
AuxPwr+ TransPend-
                 LnkCap: Port #8, Speed 2.5GT/s, Width x4, ASPM L0s,
Exit Latency L0s <4us, L1 <64us
                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+
DLActive- BWMgmt- ABWMgmt-
         Capabilities: [100 v1] Advanced Error Reporting
                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt-
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr-
                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr-
                 AERCap: First Error Pointer: 14, GenCap- CGenEn-
ChkCap- ChkEn-
         Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-3e-5b-7a
         Kernel driver in use: e1000e

0004:01:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit
Ethernet Controller (rev 06)
         Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
         Physical Slot: 5-1
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR+ FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 128 bytes
         Interrupt: pin B routed to IRQ 301
         Region 0: Memory at c0100160000 (32-bit, non-prefetchable)
[size=128K]
         Region 1: Memory at c0100180000 (32-bit, non-prefetchable)
[size=128K]
         Region 2: I/O ports at 1020 [size=32]
         Expansion ROM at c01001a0000 [disabled] [size=128K]
         Capabilities: [c8] Power Management version 2
                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                 Address: 00000000397f0040  Data: 0000
         Capabilities: [e0] Express (v1) Endpoint, MSI 00
                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
<512ns, L1 <64us
                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
Unsupported+
                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 256 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
AuxPwr+ TransPend-
                 LnkCap: Port #8, Speed 2.5GT/s, Width x4, ASPM L0s,
Exit Latency L0s <4us, L1 <64us
                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+
DLActive- BWMgmt- ABWMgmt-
         Capabilities: [100 v1] Advanced Error Reporting
                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt-
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr-
                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr-
                 AERCap: First Error Pointer: 14, GenCap- CGenEn-
ChkCap- ChkEn-
         Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-3e-5b-7a
         Kernel driver in use: e1000e

Thanks,
Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
Hello Tyler,
I see some in consistent implementation of __*_close methods in our
drivers. Do you have any igb NIC to test and check if same problem
persist there?
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-wired-lan] Fwd: [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
@ 2016-11-17 13:17               ` Neftin, Sasha
  0 siblings, 0 replies; 25+ messages in thread
From: Neftin, Sasha @ 2016-11-17 13:17 UTC (permalink / raw)
  To: intel-wired-lan

From: Baicar, Tyler [mailto:tbaicar at codeaurora.org] Sent: Tuesday,
November 15, 2016 11:50 PM
To: Neftin, Sasha <sasha.neftin@intel.com>; Kirsher, Jeffrey T
<jeffrey.t.kirsher@intel.com>; intel-wired-lan at lists.osuosl.org;
netdev at vger.kernel.org; linux-kernel at vger.kernel.org;
okaya at codeaurora.org; timur at codeaurora.org
Subject: Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of
__E1000_DOWN

On 11/13/2016 2:25 AM, Neftin, Sasha wrote:
> On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
>> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>>> Hello Sasha,
>>>
>>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>>> Move IRQ free code so that it will happen regardless of the 
>>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its 
>>>>> IRQ if the __E1000_DOWN bit is cleared. This is not sufficient 
>>>>> because it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>>> In such a situation, we will hit a kernel bug later in 
>>>>> e1000_remove because the IRQ still has action since it was never 
>>>>> freed. A secondary bus reset can cause this case to happen.
>>>>>
>>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>>> ---
>>>>>    drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> index 7017281..36cfcb0 100644
>>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>>          if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>>            e1000e_down(adapter, true);
>>>>> -        e1000_free_irq(adapter);
>>>>>              /* Link status message must follow this format */
>>>>>            pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>>        }
>>>>>    +    e1000_free_irq(adapter);
>>>>> +
>>>>>        napi_disable(&adapter->napi);
>>>>>          e1000e_free_tx_resources(adapter->tx_ring);
>>>>>
>>>> I would like not recommend insert this change. This change related 
>>>> driver state machine, we afraid from lot of synchronization problem 
>>>> and issues.
>>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>> What do you mean here? There is no loop. If __E1000_DOWN is set then 
>>> we will never free the IRQ.
>>>
>>>> Another point, does before execute secondary bus reset your SW back 
>>>> up pcie configuration space as properly?
>>> After a secondary bus reset, the link needs to recover and go back 
>>> to a working state after 1 second.
>>>
>>>  From the callstack, the issue is happening while removing the 
>>> endpoint from the system, before applying the secondary bus reset.
>>>
>>> The order of events is
>>> 1. remove the drivers
>>> 2. cause a secondary bus reset
>>> 3. wait 1 second
>> Actually, this is too much, usually link up in less than 100ms.You 
>> can check Data Link Layer indication.
>>> 4. recover the link
>>>
>>> callstack:
>>> free_msi_irqs+0x6c/0x1a8
>>> pci_disable_msi+0xb0/0x148
>>> e1000e_reset_interrupt_capability+0x60/0x78
>>> e1000_remove+0xc8/0x180
>>> pci_device_remove+0x48/0x118
>>> __device_release_driver+0x80/0x108
>>> device_release_driver+0x2c/0x40
>>> pci_stop_bus_device+0xa0/0xb0
>>> pci_stop_bus_device+0x3c/0xb0
>>> pci_stop_root_bus+0x54/0x80
>>> acpi_pci_root_remove+0x28/0x64
>>> acpi_bus_trim+0x6c/0xa4
>>> acpi_device_hotplug+0x19c/0x3f4
>>> acpi_hotplug_work_fn+0x28/0x3c
>>> process_one_work+0x150/0x460
>>> worker_thread+0x50/0x4b8
>>> kthread+0xd4/0xe8
>>> ret_from_fork+0x10/0x50
>>>
>>> Thanks,
>>> Tyler
>>>
>> Hello Tyler,
>> Okay, we need consult more about this suggestion.
>> May I ask what is setup you run? Is there NIC or on board LAN? I 
>> would like try reproduce this issue in our lab's too.
>> Also, is same issue observed with same scenario and others NIC's too?
>> Sasha
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan at lists.osuosl.org
>> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>>
> Please, specify what is device used.
Hello Sasha,
This was on a QDF2432 using an Intel PRO/1000 PT Dual Port server
adapter. I have not tried other e1000e PCIe cards, but have not seen any
similar issues with Mellanox cards. I'm able to reproduce it with just
pulling the card out. Here is the lspci -vvv output for this card:

0004:00:00.0 PCI bridge: Airgo Networks, Inc. Device 0400 (prog-if 00
[Normal decode])
         Physical Slot: 5
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0
         Interrupt: pin A routed to IRQ 297
         Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
         I/O behind bridge: 00002000-00002fff
         Memory behind bridge: 00100000-002fffff
         Prefetchable memory behind bridge:
00000c0400000000-00000c04001fffff
         Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- <SERR- <PERR-
         BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
                 PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
         Capabilities: [40] Power Management version 3
                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold-)
                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
         Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
                 Address: 00000000397f0040  Data: 0000
         Capabilities: [70] Express (v2) Root Port (Slot+), MSI 00
                 DevCap: MaxPayload 512 bytes, PhantFunc 0
                         ExtTag- RBE+
                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
Unsupported+
                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 256 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
AuxPwr- TransPend-
                 LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1,
Exit Latency L0s <1us, L1 <16us
                         ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
                 LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk-
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+
DLActive+ BWMgmt- ABWMgmt-
                 SltCap: AttnBtn+ PwrCtrl+ MRL+ AttnInd+ PwrInd+
HotPlug+ Surprise+
                         Slot #5, PowerLimit 75.000W; Interlock+ NoCompl-
                 SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt-
HPIrq- LinkChg-
                         Control: AttnInd Off, PwrInd Off, Power- Interlock-
                 SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt-
PresDet- Interlock-
                         Changed: MRL- PresDet- LinkState-
                 RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal-
PMEIntEna+ CRSVisible-
                 RootCap: CRSVisible-
                 RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                 DevCap2: Completion Timeout: Range ABCD, TimeoutDis+,
LTR+, OBFF Not Supported ARIFwd+
                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-,
LTR-, OBFF Disabled ARIFwd-
                 LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance-
SpeedDis-
                          Transmit Margin: Normal Operating Range,
EnterModifiedCompliance- ComplianceSOS-
                          Compliance De-emphasis: -6dB
                 LnkSta2: Current De-emphasis Level: -3.5dB,
EqualizationComplete-, EqualizationPhase1-
                          EqualizationPhase2-, EqualizationPhase3-,
LinkEqualizationRequest-
         Capabilities: [100 v2] Advanced Error Reporting
                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt-
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr-
                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr+
                 AERCap: First Error Pointer: 00, GenCap+ CGenEn+
ChkCap+ ChkEn+
         Capabilities: [178 v1] #19
         Kernel driver in use: pcieport

0004:01:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit
Ethernet Controller (rev 06)
         Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
         Physical Slot: 5-1
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR+ FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 128 bytes
         Interrupt: pin A routed to IRQ 299
         Region 0: Memory at c0100100000 (32-bit, non-prefetchable)
[size=128K]
         Region 1: Memory at c0100120000 (32-bit, non-prefetchable)
[size=128K]
         Region 2: I/O ports at 1000 [size=32]
         Expansion ROM@c0100140000 [disabled] [size=128K]
         Capabilities: [c8] Power Management version 2
                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                 Address: 00000000397f0040  Data: 0000
         Capabilities: [e0] Express (v1) Endpoint, MSI 00
                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
<512ns, L1 <64us
                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
Unsupported+
                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 256 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
AuxPwr+ TransPend-
                 LnkCap: Port #8, Speed 2.5GT/s, Width x4, ASPM L0s,
Exit Latency L0s <4us, L1 <64us
                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+
DLActive- BWMgmt- ABWMgmt-
         Capabilities: [100 v1] Advanced Error Reporting
                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt-
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr-
                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr-
                 AERCap: First Error Pointer: 14, GenCap- CGenEn-
ChkCap- ChkEn-
         Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-3e-5b-7a
         Kernel driver in use: e1000e

0004:01:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit
Ethernet Controller (rev 06)
         Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
         Physical Slot: 5-1
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR+ FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 128 bytes
         Interrupt: pin B routed to IRQ 301
         Region 0: Memory at c0100160000 (32-bit, non-prefetchable)
[size=128K]
         Region 1: Memory at c0100180000 (32-bit, non-prefetchable)
[size=128K]
         Region 2: I/O ports at 1020 [size=32]
         Expansion ROM@c01001a0000 [disabled] [size=128K]
         Capabilities: [c8] Power Management version 2
                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                 Address: 00000000397f0040  Data: 0000
         Capabilities: [e0] Express (v1) Endpoint, MSI 00
                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
<512ns, L1 <64us
                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
Unsupported+
                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 256 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
AuxPwr+ TransPend-
                 LnkCap: Port #8, Speed 2.5GT/s, Width x4, ASPM L0s,
Exit Latency L0s <4us, L1 <64us
                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+
DLActive- BWMgmt- ABWMgmt-
         Capabilities: [100 v1] Advanced Error Reporting
                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt-
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr-
                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr-
                 AERCap: First Error Pointer: 14, GenCap- CGenEn-
ChkCap- ChkEn-
         Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-3e-5b-7a
         Kernel driver in use: e1000e

Thanks,
Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
Hello Tyler,
I see some in consistent implementation of __*_close methods in our
drivers. Do you have any igb NIC to test and check if same problem
persist there?
Thanks,
Sasha



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
       [not found]           ` <630A6B92B7EDEB45A87E20D3D286660153E3B481@hasmsx109.ger.corp.intel.com>
  2016-11-17 13:07             ` Neftin, Sasha
  2016-11-17 13:17               ` [Intel-wired-lan] Fwd: " Neftin, Sasha
@ 2016-11-17 13:27             ` Neftin, Sasha
       [not found]             ` <82260f85-2f8d-20e7-4f52-86b156aff96f@intel.com>
  3 siblings, 0 replies; 25+ messages in thread
From: Neftin, Sasha @ 2016-11-17 13:27 UTC (permalink / raw)
  To: intel-wired-lan

On 11/17/2016 2:53 PM, Neftin, Sasha wrote:
> 
> 
> Skype Me???! 
> -----Original Message-----
> From: Baicar, Tyler [mailto:tbaicar at codeaurora.org] 
> Sent: Tuesday, November 15, 2016 11:50 PM
> To: Neftin, Sasha <sasha.neftin@intel.com>; Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>; intel-wired-lan at lists.osuosl.org; netdev at vger.kernel.org; linux-kernel at vger.kernel.org; okaya at codeaurora.org; timur at codeaurora.org
> Subject: Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
> 
> On 11/13/2016 2:25 AM, Neftin, Sasha wrote:
>> On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
>>> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>>>> Hello Sasha,
>>>>
>>>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>>>> Move IRQ free code so that it will happen regardless of the 
>>>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its 
>>>>>> IRQ if the __E1000_DOWN bit is cleared. This is not sufficient 
>>>>>> because it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>>>> In such a situation, we will hit a kernel bug later in 
>>>>>> e1000_remove because the IRQ still has action since it was never 
>>>>>> freed. A secondary bus reset can cause this case to happen.
>>>>>>
>>>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>>>> ---
>>>>>>    drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> index 7017281..36cfcb0 100644
>>>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>>>          if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>>>            e1000e_down(adapter, true);
>>>>>> -        e1000_free_irq(adapter);
>>>>>>              /* Link status message must follow this format */
>>>>>>            pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>>>        }
>>>>>>    +    e1000_free_irq(adapter);
>>>>>> +
>>>>>>        napi_disable(&adapter->napi);
>>>>>>          e1000e_free_tx_resources(adapter->tx_ring);
>>>>>>
>>>>> I would like not recommend insert this change. This change related 
>>>>> driver state machine, we afraid from lot of synchronization problem 
>>>>> and issues.
>>>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>>> What do you mean here? There is no loop. If __E1000_DOWN is set then 
>>>> we will never free the IRQ.
>>>>
>>>>> Another point, does before execute secondary bus reset your SW back 
>>>>> up pcie configuration space as properly?
>>>> After a secondary bus reset, the link needs to recover and go back 
>>>> to a working state after 1 second.
>>>>
>>>>  From the callstack, the issue is happening while removing the 
>>>> endpoint from the system, before applying the secondary bus reset.
>>>>
>>>> The order of events is
>>>> 1. remove the drivers
>>>> 2. cause a secondary bus reset
>>>> 3. wait 1 second
>>> Actually, this is too much, usually link up in less than 100ms.You 
>>> can check Data Link Layer indication.
>>>> 4. recover the link
>>>>
>>>> callstack:
>>>> free_msi_irqs+0x6c/0x1a8
>>>> pci_disable_msi+0xb0/0x148
>>>> e1000e_reset_interrupt_capability+0x60/0x78
>>>> e1000_remove+0xc8/0x180
>>>> pci_device_remove+0x48/0x118
>>>> __device_release_driver+0x80/0x108
>>>> device_release_driver+0x2c/0x40
>>>> pci_stop_bus_device+0xa0/0xb0
>>>> pci_stop_bus_device+0x3c/0xb0
>>>> pci_stop_root_bus+0x54/0x80
>>>> acpi_pci_root_remove+0x28/0x64
>>>> acpi_bus_trim+0x6c/0xa4
>>>> acpi_device_hotplug+0x19c/0x3f4
>>>> acpi_hotplug_work_fn+0x28/0x3c
>>>> process_one_work+0x150/0x460
>>>> worker_thread+0x50/0x4b8
>>>> kthread+0xd4/0xe8
>>>> ret_from_fork+0x10/0x50
>>>>
>>>> Thanks,
>>>> Tyler
>>>>
>>> Hello Tyler,
>>> Okay, we need consult more about this suggestion.
>>> May I ask what is setup you run? Is there NIC or on board LAN? I 
>>> would like try reproduce this issue in our lab's too.
>>> Also, is same issue observed with same scenario and others NIC's too?
>>> Sasha
>>> _______________________________________________
>>> Intel-wired-lan mailing list
>>> Intel-wired-lan at lists.osuosl.org
>>> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>>>
>> Please, specify what is device used.
> Hello Sasha,
> This was on a QDF2432 using an Intel PRO/1000 PT Dual Port server adapter. I have not tried other e1000e PCIe cards, but have not seen any similar issues with Mellanox cards. I'm able to reproduce it with just pulling the card out. Here is the lspci -vvv output for this card:
> 
> 0004:00:00.0 PCI bridge: Airgo Networks, Inc. Device 0400 (prog-if 00 [Normal decode])
>          Physical Slot: 5
>          Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR- FastB2B- DisINTx+
>          Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>          Latency: 0
>          Interrupt: pin A routed to IRQ 297
>          Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
>          I/O behind bridge: 00002000-00002fff
>          Memory behind bridge: 00100000-002fffff
>          Prefetchable memory behind bridge: 
> 00000c0400000000-00000c04001fffff
>          Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- <SERR- <PERR-
>          BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
>                  PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>          Capabilities: [40] Power Management version 3
>                  Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
> PME(D0+,D1-,D2-,D3hot+,D3cold-)
>                  Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>          Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
>                  Address: 00000000397f0040  Data: 0000
>          Capabilities: [70] Express (v2) Root Port (Slot+), MSI 00
>                  DevCap: MaxPayload 512 bytes, PhantFunc 0
>                          ExtTag- RBE+
>                  DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
> Unsupported+
>                          RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>                          MaxPayload 256 bytes, MaxReadReq 512 bytes
>                  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
> AuxPwr- TransPend-
>                  LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s <1us, L1 <16us
>                          ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
>                  LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk-
>                          ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                  LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ 
> DLActive+ BWMgmt- ABWMgmt-
>                  SltCap: AttnBtn+ PwrCtrl+ MRL+ AttnInd+ PwrInd+ 
> HotPlug+ Surprise+
>                          Slot #5, PowerLimit 75.000W; Interlock+ NoCompl-
>                  SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt-
> HPIrq- LinkChg-
>                          Control: AttnInd Off, PwrInd Off, Power- Interlock-
>                  SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt-
> PresDet- Interlock-
>                          Changed: MRL- PresDet- LinkState-
>                  RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- 
> PMEIntEna+ CRSVisible-
>                  RootCap: CRSVisible-
>                  RootSta: PME ReqID 0000, PMEStatus- PMEPending-
>                  DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, 
> LTR+, OBFF Not Supported ARIFwd+
>                  DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
>                  LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance-
> SpeedDis-
>                           Transmit Margin: Normal Operating Range,
> EnterModifiedCompliance- ComplianceSOS-
>                           Compliance De-emphasis: -6dB
>                  LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
>                           EqualizationPhase2-, EqualizationPhase3-,
> LinkEqualizationRequest-
>          Capabilities: [100 v2] Advanced Error Reporting
>                  UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                  UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                  UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                  CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr-
>                  CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
> NonFatalErr+
>                  AERCap: First Error Pointer: 00, GenCap+ CGenEn+ 
> ChkCap+ ChkEn+
>          Capabilities: [178 v1] #19
>          Kernel driver in use: pcieport
> 
> 0004:01:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
>          Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
>          Physical Slot: 5-1
>          Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR+ FastB2B- DisINTx+
>          Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>          Latency: 0, Cache Line Size: 128 bytes
>          Interrupt: pin A routed to IRQ 299
>          Region 0: Memory at c0100100000 (32-bit, non-prefetchable) [size=128K]
>          Region 1: Memory at c0100120000 (32-bit, non-prefetchable) [size=128K]
>          Region 2: I/O ports at 1000 [size=32]
>          Expansion ROM at c0100140000 [disabled] [size=128K]
>          Capabilities: [c8] Power Management version 2
>                  Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                  Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>          Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>                  Address: 00000000397f0040  Data: 0000
>          Capabilities: [e0] Express (v1) Endpoint, MSI 00
>                  DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>                          ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
>                  DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
> Unsupported+
>                          RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>                          MaxPayload 256 bytes, MaxReadReq 512 bytes
>                  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- 
> AuxPwr+ TransPend-
>                  LnkCap: Port #8, Speed 2.5GT/s, Width x4, ASPM L0s, Exit Latency L0s <4us, L1 <64us
>                          ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>                  LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
>                          ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                  LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+
> DLActive- BWMgmt- ABWMgmt-
>          Capabilities: [100 v1] Advanced Error Reporting
>                  UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
>                  UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                  UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                  CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr-
>                  CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr-
>                  AERCap: First Error Pointer: 14, GenCap- CGenEn-
> ChkCap- ChkEn-
>          Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-3e-5b-7a
>          Kernel driver in use: e1000e
> 
> 0004:01:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
>          Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
>          Physical Slot: 5-1
>          Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR+ FastB2B- DisINTx+
>          Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>          Latency: 0, Cache Line Size: 128 bytes
>          Interrupt: pin B routed to IRQ 301
>          Region 0: Memory at c0100160000 (32-bit, non-prefetchable) [size=128K]
>          Region 1: Memory at c0100180000 (32-bit, non-prefetchable) [size=128K]
>          Region 2: I/O ports at 1020 [size=32]
>          Expansion ROM at c01001a0000 [disabled] [size=128K]
>          Capabilities: [c8] Power Management version 2
>                  Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                  Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>          Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>                  Address: 00000000397f0040  Data: 0000
>          Capabilities: [e0] Express (v1) Endpoint, MSI 00
>                  DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>                          ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
>                  DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
> Unsupported+
>                          RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>                          MaxPayload 256 bytes, MaxReadReq 512 bytes
>                  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- 
> AuxPwr+ TransPend-
>                  LnkCap: Port #8, Speed 2.5GT/s, Width x4, ASPM L0s, Exit Latency L0s <4us, L1 <64us
>                          ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>                  LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
>                          ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                  LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+
> DLActive- BWMgmt- ABWMgmt-
>          Capabilities: [100 v1] Advanced Error Reporting
>                  UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
>                  UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                  UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                  CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr-
>                  CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr-
>                  AERCap: First Error Pointer: 14, GenCap- CGenEn-
> ChkCap- ChkEn-
>          Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-3e-5b-7a
>          Kernel driver in use: e1000e
> 
> Thanks,
> Tyler
> 
Hello Tyler,
I see some in consistent implementation of __*_close methods in our
drivers. Do you have any igb NIC to check if same problem persist there?
Thanks,
Sasha


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
  2016-11-13  8:34       ` Neftin, Sasha
@ 2016-11-17 13:31         ` Neftin, Sasha
  -1 siblings, 0 replies; 25+ messages in thread
From: Neftin, Sasha @ 2016-11-17 13:31 UTC (permalink / raw)
  To: Baicar, Tyler, jeffrey.t.kirsher, intel-wired-lan, netdev,
	linux-kernel, okaya, timur

On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>> Hello Sasha,
>>
>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>> Move IRQ free code so that it will happen regardless of the
>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>>> because the IRQ still has action since it was never freed. A
>>>> secondary bus reset can cause this case to happen.
>>>>
>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>> ---
>>>>   drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> index 7017281..36cfcb0 100644
>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>         if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>           e1000e_down(adapter, true);
>>>> -        e1000_free_irq(adapter);
>>>>             /* Link status message must follow this format */
>>>>           pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>       }               
>>>>   +    e1000_free_irq(adapter);
>>>> +
>>>>       napi_disable(&adapter->napi);
>>>>         e1000e_free_tx_resources(adapter->tx_ring);
>>>>
>>> I would like not recommend insert this change. This change related
>>> driver state machine, we afraid from lot of synchronization problem and
>>> issues.
>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>
>> What do you mean here? There is no loop. If __E1000_DOWN is set then we
>> will never free the IRQ.
>>
>>> Another point, does before execute secondary bus reset your SW back up
>>> pcie configuration space as properly?
>>
>> After a secondary bus reset, the link needs to recover and go back to a
>> working state after 1 second.
>>
>> From the callstack, the issue is happening while removing the endpoint
>> from the system, before applying the secondary bus reset.
>>
>> The order of events is
>> 1. remove the drivers
>> 2. cause a secondary bus reset
>> 3. wait 1 second
> Actually, this is too much, usually link up in less than 100ms.You can
> check Data Link Layer indication.
>> 4. recover the link
>>
>> callstack:
>> free_msi_irqs+0x6c/0x1a8
>> pci_disable_msi+0xb0/0x148
>> e1000e_reset_interrupt_capability+0x60/0x78
>> e1000_remove+0xc8/0x180
>> pci_device_remove+0x48/0x118
>> __device_release_driver+0x80/0x108
>> device_release_driver+0x2c/0x40
>> pci_stop_bus_device+0xa0/0xb0
>> pci_stop_bus_device+0x3c/0xb0
>> pci_stop_root_bus+0x54/0x80
>> acpi_pci_root_remove+0x28/0x64
>> acpi_bus_trim+0x6c/0xa4
>> acpi_device_hotplug+0x19c/0x3f4
>> acpi_hotplug_work_fn+0x28/0x3c
>> process_one_work+0x150/0x460
>> worker_thread+0x50/0x4b8
>> kthread+0xd4/0xe8
>> ret_from_fork+0x10/0x50
>>
>> Thanks,
>> Tyler
>>
> Hello Tyler,
> Okay, we need consult more about this suggestion.
> May I ask what is setup you run? Is there NIC or on board LAN? I would
> like try reproduce this issue in our lab's too.
> Also, is same issue observed with same scenario and others NIC's too?
> Sasha
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@lists.osuosl.org
> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
> 
Hello Tyler,
I see some in consistent implementation of __*_close methods in our
drivers. Do you have any igb NIC to check if same problem persist there?
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
@ 2016-11-17 13:31         ` Neftin, Sasha
  0 siblings, 0 replies; 25+ messages in thread
From: Neftin, Sasha @ 2016-11-17 13:31 UTC (permalink / raw)
  To: intel-wired-lan

On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>> Hello Sasha,
>>
>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>> Move IRQ free code so that it will happen regardless of the
>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>>> because the IRQ still has action since it was never freed. A
>>>> secondary bus reset can cause this case to happen.
>>>>
>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>> ---
>>>>   drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> index 7017281..36cfcb0 100644
>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>         if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>           e1000e_down(adapter, true);
>>>> -        e1000_free_irq(adapter);
>>>>             /* Link status message must follow this format */
>>>>           pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>       }               
>>>>   +    e1000_free_irq(adapter);
>>>> +
>>>>       napi_disable(&adapter->napi);
>>>>         e1000e_free_tx_resources(adapter->tx_ring);
>>>>
>>> I would like not recommend insert this change. This change related
>>> driver state machine, we afraid from lot of synchronization problem and
>>> issues.
>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>
>> What do you mean here? There is no loop. If __E1000_DOWN is set then we
>> will never free the IRQ.
>>
>>> Another point, does before execute secondary bus reset your SW back up
>>> pcie configuration space as properly?
>>
>> After a secondary bus reset, the link needs to recover and go back to a
>> working state after 1 second.
>>
>> From the callstack, the issue is happening while removing the endpoint
>> from the system, before applying the secondary bus reset.
>>
>> The order of events is
>> 1. remove the drivers
>> 2. cause a secondary bus reset
>> 3. wait 1 second
> Actually, this is too much, usually link up in less than 100ms.You can
> check Data Link Layer indication.
>> 4. recover the link
>>
>> callstack:
>> free_msi_irqs+0x6c/0x1a8
>> pci_disable_msi+0xb0/0x148
>> e1000e_reset_interrupt_capability+0x60/0x78
>> e1000_remove+0xc8/0x180
>> pci_device_remove+0x48/0x118
>> __device_release_driver+0x80/0x108
>> device_release_driver+0x2c/0x40
>> pci_stop_bus_device+0xa0/0xb0
>> pci_stop_bus_device+0x3c/0xb0
>> pci_stop_root_bus+0x54/0x80
>> acpi_pci_root_remove+0x28/0x64
>> acpi_bus_trim+0x6c/0xa4
>> acpi_device_hotplug+0x19c/0x3f4
>> acpi_hotplug_work_fn+0x28/0x3c
>> process_one_work+0x150/0x460
>> worker_thread+0x50/0x4b8
>> kthread+0xd4/0xe8
>> ret_from_fork+0x10/0x50
>>
>> Thanks,
>> Tyler
>>
> Hello Tyler,
> Okay, we need consult more about this suggestion.
> May I ask what is setup you run? Is there NIC or on board LAN? I would
> like try reproduce this issue in our lab's too.
> Also, is same issue observed with same scenario and others NIC's too?
> Sasha
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan at lists.osuosl.org
> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
> 
Hello Tyler,
I see some in consistent implementation of __*_close methods in our
drivers. Do you have any igb NIC to check if same problem persist there?
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-wired-lan] Fwd: FW: [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
       [not found]               ` <7a5a4fdc-6086-4bd8-cb95-00d4ec2c0817@intel.com>
@ 2016-11-17 13:49                 ` Neftin, Sasha
  0 siblings, 0 replies; 25+ messages in thread
From: Neftin, Sasha @ 2016-11-17 13:49 UTC (permalink / raw)
  To: intel-wired-lan

On 11/17/2016 3:48 PM, Neftin, Sasha wrote:
> On 11/17/2016 3:47 PM, Neftin, Sasha wrote:
>>
>>
>>
>> -------- Forwarded Message --------
>> Subject: FW: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of
>> __E1000_DOWN
>> Date: Thu, 17 Nov 2016 14:53:26 +0200
>> From: Neftin, Sasha <sasha.neftin@intel.com>
>> To: Neftin, Sasha <sasha.neftin@intel.com>
>>
>>
>>
>> Skype Me???! -----Original Message-----
>> From: Baicar, Tyler [mailto:tbaicar at codeaurora.org] Sent: Tuesday,
>> November 15, 2016 11:50 PM
>> To: Neftin, Sasha <sasha.neftin@intel.com>; Kirsher, Jeffrey T
>> <jeffrey.t.kirsher@intel.com>; intel-wired-lan at lists.osuosl.org;
>> netdev at vger.kernel.org; linux-kernel at vger.kernel.org;
>> okaya at codeaurora.org; timur at codeaurora.org
>> Subject: Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of
>> __E1000_DOWN
>>
>> On 11/13/2016 2:25 AM, Neftin, Sasha wrote:
>>> On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
>>>> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>>>>> Hello Sasha,
>>>>>
>>>>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>>>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>>>>> Move IRQ free code so that it will happen regardless of the 
>>>>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its 
>>>>>>> IRQ if the __E1000_DOWN bit is cleared. This is not sufficient 
>>>>>>> because it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>>>>> In such a situation, we will hit a kernel bug later in 
>>>>>>> e1000_remove because the IRQ still has action since it was never 
>>>>>>> freed. A secondary bus reset can cause this case to happen.
>>>>>>>
>>>>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>>>>> ---
>>>>>>>    drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>>> index 7017281..36cfcb0 100644
>>>>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>>>>          if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>>>>            e1000e_down(adapter, true);
>>>>>>> -        e1000_free_irq(adapter);
>>>>>>>              /* Link status message must follow this format */
>>>>>>>            pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>>>>        }
>>>>>>>    +    e1000_free_irq(adapter);
>>>>>>> +
>>>>>>>        napi_disable(&adapter->napi);
>>>>>>>          e1000e_free_tx_resources(adapter->tx_ring);
>>>>>>>
>>>>>> I would like not recommend insert this change. This change related 
>>>>>> driver state machine, we afraid from lot of synchronization problem 
>>>>>> and issues.
>>>>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>>>> What do you mean here? There is no loop. If __E1000_DOWN is set then 
>>>>> we will never free the IRQ.
>>>>>
>>>>>> Another point, does before execute secondary bus reset your SW back 
>>>>>> up pcie configuration space as properly?
>>>>> After a secondary bus reset, the link needs to recover and go back 
>>>>> to a working state after 1 second.
>>>>>
>>>>>  From the callstack, the issue is happening while removing the 
>>>>> endpoint from the system, before applying the secondary bus reset.
>>>>>
>>>>> The order of events is
>>>>> 1. remove the drivers
>>>>> 2. cause a secondary bus reset
>>>>> 3. wait 1 second
>>>> Actually, this is too much, usually link up in less than 100ms.You 
>>>> can check Data Link Layer indication.
>>>>> 4. recover the link
>>>>>
>>>>> callstack:
>>>>> free_msi_irqs+0x6c/0x1a8
>>>>> pci_disable_msi+0xb0/0x148
>>>>> e1000e_reset_interrupt_capability+0x60/0x78
>>>>> e1000_remove+0xc8/0x180
>>>>> pci_device_remove+0x48/0x118
>>>>> __device_release_driver+0x80/0x108
>>>>> device_release_driver+0x2c/0x40
>>>>> pci_stop_bus_device+0xa0/0xb0
>>>>> pci_stop_bus_device+0x3c/0xb0
>>>>> pci_stop_root_bus+0x54/0x80
>>>>> acpi_pci_root_remove+0x28/0x64
>>>>> acpi_bus_trim+0x6c/0xa4
>>>>> acpi_device_hotplug+0x19c/0x3f4
>>>>> acpi_hotplug_work_fn+0x28/0x3c
>>>>> process_one_work+0x150/0x460
>>>>> worker_thread+0x50/0x4b8
>>>>> kthread+0xd4/0xe8
>>>>> ret_from_fork+0x10/0x50
>>>>>
>>>>> Thanks,
>>>>> Tyler
>>>>>
>>>> Hello Tyler,
>>>> Okay, we need consult more about this suggestion.
>>>> May I ask what is setup you run? Is there NIC or on board LAN? I 
>>>> would like try reproduce this issue in our lab's too.
>>>> Also, is same issue observed with same scenario and others NIC's too?
>>>> Sasha
>>>> _______________________________________________
>>>> Intel-wired-lan mailing list
>>>> Intel-wired-lan at lists.osuosl.org
>>>> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>>>>
>>> Please, specify what is device used.
>> Hello Sasha,
>> This was on a QDF2432 using an Intel PRO/1000 PT Dual Port server
>> adapter. I have not tried other e1000e PCIe cards, but have not seen any
>> similar issues with Mellanox cards. I'm able to reproduce it with just
>> pulling the card out. Here is the lspci -vvv output for this card:
>>
>> 0004:00:00.0 PCI bridge: Airgo Networks, Inc. Device 0400 (prog-if 00
>> [Normal decode])
>>          Physical Slot: 5
>>          Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>> ParErr- Stepping- SERR- FastB2B- DisINTx+
>>          Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>          Latency: 0
>>          Interrupt: pin A routed to IRQ 297
>>          Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
>>          I/O behind bridge: 00002000-00002fff
>>          Memory behind bridge: 00100000-002fffff
>>          Prefetchable memory behind bridge:
>> 00000c0400000000-00000c04001fffff
>>          Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort- <SERR- <PERR-
>>          BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
>>                  PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>>          Capabilities: [40] Power Management version 3
>>                  Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
>> PME(D0+,D1-,D2-,D3hot+,D3cold-)
>>                  Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>>          Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
>>                  Address: 00000000397f0040  Data: 0000
>>          Capabilities: [70] Express (v2) Root Port (Slot+), MSI 00
>>                  DevCap: MaxPayload 512 bytes, PhantFunc 0
>>                          ExtTag- RBE+
>>                  DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
>> Unsupported+
>>                          RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>>                          MaxPayload 256 bytes, MaxReadReq 512 bytes
>>                  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
>> AuxPwr- TransPend-
>>                  LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1,
>> Exit Latency L0s <1us, L1 <16us
>>                          ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
>>                  LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk-
>>                          ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>                  LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+
>> DLActive+ BWMgmt- ABWMgmt-
>>                  SltCap: AttnBtn+ PwrCtrl+ MRL+ AttnInd+ PwrInd+
>> HotPlug+ Surprise+
>>                          Slot #5, PowerLimit 75.000W; Interlock+ NoCompl-
>>                  SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt-
>> HPIrq- LinkChg-
>>                          Control: AttnInd Off, PwrInd Off, Power- Interlock-
>>                  SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt-
>> PresDet- Interlock-
>>                          Changed: MRL- PresDet- LinkState-
>>                  RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal-
>> PMEIntEna+ CRSVisible-
>>                  RootCap: CRSVisible-
>>                  RootSta: PME ReqID 0000, PMEStatus- PMEPending-
>>                  DevCap2: Completion Timeout: Range ABCD, TimeoutDis+,
>> LTR+, OBFF Not Supported ARIFwd+
>>                  DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-,
>> LTR-, OBFF Disabled ARIFwd-
>>                  LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance-
>> SpeedDis-
>>                           Transmit Margin: Normal Operating Range,
>> EnterModifiedCompliance- ComplianceSOS-
>>                           Compliance De-emphasis: -6dB
>>                  LnkSta2: Current De-emphasis Level: -3.5dB,
>> EqualizationComplete-, EqualizationPhase1-
>>                           EqualizationPhase2-, EqualizationPhase3-,
>> LinkEqualizationRequest-
>>          Capabilities: [100 v2] Advanced Error Reporting
>>                  UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
>> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>                  UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
>> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>                  UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt-
>> UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>>                  CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
>> NonFatalErr-
>>                  CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
>> NonFatalErr+
>>                  AERCap: First Error Pointer: 00, GenCap+ CGenEn+
>> ChkCap+ ChkEn+
>>          Capabilities: [178 v1] #19
>>          Kernel driver in use: pcieport
>>
>> 0004:01:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit
>> Ethernet Controller (rev 06)
>>          Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
>>          Physical Slot: 5-1
>>          Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>> ParErr- Stepping- SERR+ FastB2B- DisINTx+
>>          Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>          Latency: 0, Cache Line Size: 128 bytes
>>          Interrupt: pin A routed to IRQ 299
>>          Region 0: Memory at c0100100000 (32-bit, non-prefetchable)
>> [size=128K]
>>          Region 1: Memory at c0100120000 (32-bit, non-prefetchable)
>> [size=128K]
>>          Region 2: I/O ports at 1000 [size=32]
>>          Expansion ROM at c0100140000 [disabled] [size=128K]
>>          Capabilities: [c8] Power Management version 2
>>                  Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
>> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>>                  Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>>          Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>>                  Address: 00000000397f0040  Data: 0000
>>          Capabilities: [e0] Express (v1) Endpoint, MSI 00
>>                  DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
>> <512ns, L1 <64us
>>                          ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
>>                  DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
>> Unsupported+
>>                          RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>>                          MaxPayload 256 bytes, MaxReadReq 512 bytes
>>                  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
>> AuxPwr+ TransPend-
>>                  LnkCap: Port #8, Speed 2.5GT/s, Width x4, ASPM L0s,
>> Exit Latency L0s <4us, L1 <64us
>>                          ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>>                  LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
>>                          ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>                  LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+
>> DLActive- BWMgmt- ABWMgmt-
>>          Capabilities: [100 v1] Advanced Error Reporting
>>                  UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
>> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
>>                  UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
>> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>                  UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt-
>> UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>>                  CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
>> NonFatalErr-
>>                  CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
>> NonFatalErr-
>>                  AERCap: First Error Pointer: 14, GenCap- CGenEn-
>> ChkCap- ChkEn-
>>          Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-3e-5b-7a
>>          Kernel driver in use: e1000e
>>
>> 0004:01:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit
>> Ethernet Controller (rev 06)
>>          Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
>>          Physical Slot: 5-1
>>          Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>> ParErr- Stepping- SERR+ FastB2B- DisINTx+
>>          Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>          Latency: 0, Cache Line Size: 128 bytes
>>          Interrupt: pin B routed to IRQ 301
>>          Region 0: Memory at c0100160000 (32-bit, non-prefetchable)
>> [size=128K]
>>          Region 1: Memory at c0100180000 (32-bit, non-prefetchable)
>> [size=128K]
>>          Region 2: I/O ports at 1020 [size=32]
>>          Expansion ROM at c01001a0000 [disabled] [size=128K]
>>          Capabilities: [c8] Power Management version 2
>>                  Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
>> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>>                  Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>>          Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>>                  Address: 00000000397f0040  Data: 0000
>>          Capabilities: [e0] Express (v1) Endpoint, MSI 00
>>                  DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
>> <512ns, L1 <64us
>>                          ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
>>                  DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
>> Unsupported+
>>                          RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>>                          MaxPayload 256 bytes, MaxReadReq 512 bytes
>>                  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
>> AuxPwr+ TransPend-
>>                  LnkCap: Port #8, Speed 2.5GT/s, Width x4, ASPM L0s,
>> Exit Latency L0s <4us, L1 <64us
>>                          ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>>                  LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
>>                          ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>                  LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+
>> DLActive- BWMgmt- ABWMgmt-
>>          Capabilities: [100 v1] Advanced Error Reporting
>>                  UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
>> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
>>                  UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
>> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>                  UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt-
>> UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>>                  CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
>> NonFatalErr-
>>                  CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
>> NonFatalErr-
>>                  AERCap: First Error Pointer: 14, GenCap- CGenEn-
>> ChkCap- ChkEn-
>>          Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-3e-5b-7a
>>          Kernel driver in use: e1000e
>>
>> Thanks,
>> Tyler
>>
> 
Hello Tyler,
I see some in consistent implementation of __*_close methods in our
drivers. Do you have any igb NIC to check if same problem persist there?
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
  2016-11-17 13:31         ` Neftin, Sasha
@ 2016-11-21 20:40           ` Baicar, Tyler
  -1 siblings, 0 replies; 25+ messages in thread
From: Baicar, Tyler @ 2016-11-21 20:40 UTC (permalink / raw)
  To: Neftin, Sasha, jeffrey.t.kirsher, intel-wired-lan, netdev,
	linux-kernel, okaya, timur

On 11/17/2016 6:31 AM, Neftin, Sasha wrote:
> On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
>> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>>> Hello Sasha,
>>>
>>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>>> Move IRQ free code so that it will happen regardless of the
>>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>>>> because the IRQ still has action since it was never freed. A
>>>>> secondary bus reset can cause this case to happen.
>>>>>
>>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>>> ---
>>>>>    drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> index 7017281..36cfcb0 100644
>>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>>          if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>>            e1000e_down(adapter, true);
>>>>> -        e1000_free_irq(adapter);
>>>>>              /* Link status message must follow this format */
>>>>>            pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>>        }
>>>>>    +    e1000_free_irq(adapter);
>>>>> +
>>>>>        napi_disable(&adapter->napi);
>>>>>          e1000e_free_tx_resources(adapter->tx_ring);
>>>>>
>>>> I would like not recommend insert this change. This change related
>>>> driver state machine, we afraid from lot of synchronization problem and
>>>> issues.
>>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>> What do you mean here? There is no loop. If __E1000_DOWN is set then we
>>> will never free the IRQ.
>>>
>>>> Another point, does before execute secondary bus reset your SW back up
>>>> pcie configuration space as properly?
>>> After a secondary bus reset, the link needs to recover and go back to a
>>> working state after 1 second.
>>>
>>>  From the callstack, the issue is happening while removing the endpoint
>>> from the system, before applying the secondary bus reset.
>>>
>>> The order of events is
>>> 1. remove the drivers
>>> 2. cause a secondary bus reset
>>> 3. wait 1 second
>> Actually, this is too much, usually link up in less than 100ms.You can
>> check Data Link Layer indication.
>>> 4. recover the link
>>>
>>> callstack:
>>> free_msi_irqs+0x6c/0x1a8
>>> pci_disable_msi+0xb0/0x148
>>> e1000e_reset_interrupt_capability+0x60/0x78
>>> e1000_remove+0xc8/0x180
>>> pci_device_remove+0x48/0x118
>>> __device_release_driver+0x80/0x108
>>> device_release_driver+0x2c/0x40
>>> pci_stop_bus_device+0xa0/0xb0
>>> pci_stop_bus_device+0x3c/0xb0
>>> pci_stop_root_bus+0x54/0x80
>>> acpi_pci_root_remove+0x28/0x64
>>> acpi_bus_trim+0x6c/0xa4
>>> acpi_device_hotplug+0x19c/0x3f4
>>> acpi_hotplug_work_fn+0x28/0x3c
>>> process_one_work+0x150/0x460
>>> worker_thread+0x50/0x4b8
>>> kthread+0xd4/0xe8
>>> ret_from_fork+0x10/0x50
>>>
>>> Thanks,
>>> Tyler
>>>
>> Hello Tyler,
>> Okay, we need consult more about this suggestion.
>> May I ask what is setup you run? Is there NIC or on board LAN? I would
>> like try reproduce this issue in our lab's too.
>> Also, is same issue observed with same scenario and others NIC's too?
>> Sasha
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan@lists.osuosl.org
>> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>>
> Hello Tyler,
> I see some in consistent implementation of __*_close methods in our
> drivers. Do you have any igb NIC to check if same problem persist there?
> Thanks,
> Sasha
Hello Sasha,

I couldn't find an igb NIC to test with, but I did find another e1000e 
card that does not cause the same issue. That card is:

0004:01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit 
Network Connection
         Subsystem: Intel Corporation Gigabit CT Desktop Adapter
         Physical Slot: 5-1
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
ParErr- Stepping- SERR+ FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
<TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 64 bytes
         Interrupt: pin A routed to IRQ 299
         Region 0: Memory at c01001c0000 (32-bit, non-prefetchable) 
[size=128K]
         Region 1: Memory at c0100100000 (32-bit, non-prefetchable) 
[size=512K]
         Region 2: I/O ports at 1000 [size=32]
         Region 3: Memory at c01001e0000 (32-bit, non-prefetchable) 
[size=16K]
         Expansion ROM at c0100180000 [disabled] [size=256K]
         Capabilities: [c8] Power Management version 2
                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                 Address: 00000000397f0040  Data: 0000
         Capabilities: [e0] Express (v1) Endpoint, MSI 00
                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s 
<512ns, L1 <64us
                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
Unsupported+
                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 256 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- 
AuxPwr+ TransPend-
                 LnkCap: Port #8, Speed 2.5GT/s, Width x1, ASPM L0s L1, 
Exit Latency L0s <128ns, L1 <64us
                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                 LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ 
DLActive- BWMgmt- ABWMgmt-
         Capabilities: [a0] MSI-X: Enable- Count=5 Masked-
                 Vector table: BAR=3 offset=00000000
                 PBA: BAR=3 offset=00002000
         Capabilities: [100 v1] Advanced Error Reporting
                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr-
                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr+
                 AERCap: First Error Pointer: 00, GenCap- CGenEn- 
ChkCap- ChkEn-
         Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-29-47-34
         Kernel driver in use: e1000e

Here are the kernel logs from first running the test on this card and 
then running the test on the Intel 82571EB card that I originally found 
the issue with (you can see the issue doesn't happen on this card but 
does on the other):

[   44.146749] ACPI: \_SB_.PCI0: Device has suffered a power fault
[   44.155238] pcieport 0000:00:00.0: PCIe Bus Error: 
severity=Uncorrected (Non-Fatal), type=Transaction Layer, 
id=0000(Requester ID)
[   44.166111] pcieport 0000:00:00.0:   device [17cb:0400] error 
status/mask=00004000/00400000
[   44.174420] pcieport 0000:00:00.0:    [14] Completion Timeout (First)
[   44.401943] e1000e 0000:01:00.0 eth0: Timesync Tx Control register 
not set as expected
[   82.445586] pcieport 0002:00:00.0: PCIe Bus Error: 
severity=Corrected, type=Physical Layer, id=0000(Receiver ID)
[   82.454851] pcieport 0002:00:00.0:   device [17cb:0400] error 
status/mask=00000001/00006000
[   82.463209] pcieport 0002:00:00.0:    [ 0] Receiver Error
[   82.469355] pcieport 0002:00:00.0: PCIe Bus Error: 
severity=Uncorrected (Non-Fatal), type=Transaction Layer, 
id=0000(Requester ID)
[   82.481026] pcieport 0002:00:00.0:   device [17cb:0400] error 
status/mask=00004000/00400000
[   82.489343] pcieport 0002:00:00.0:    [14] Completion Timeout (First)
[   82.504573] ACPI: \_SB_.PCI2: Device has suffered a power fault
[   84.528036] kernel BUG at drivers/pci/msi.c:369!

I'm not sure why it reproduces on the 82571EB card and not the 82574L 
card. The only obvious difference is there is no Reciever Error on the 
82574L card.

If you have a patch fixing the inconsistencies you mentioned with the 
__*_close methods I would certainly be willing to test it out!

Thanks,
Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
@ 2016-11-21 20:40           ` Baicar, Tyler
  0 siblings, 0 replies; 25+ messages in thread
From: Baicar, Tyler @ 2016-11-21 20:40 UTC (permalink / raw)
  To: intel-wired-lan

On 11/17/2016 6:31 AM, Neftin, Sasha wrote:
> On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
>> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>>> Hello Sasha,
>>>
>>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>>> Move IRQ free code so that it will happen regardless of the
>>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>>>> because the IRQ still has action since it was never freed. A
>>>>> secondary bus reset can cause this case to happen.
>>>>>
>>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>>> ---
>>>>>    drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> index 7017281..36cfcb0 100644
>>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>>          if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>>            e1000e_down(adapter, true);
>>>>> -        e1000_free_irq(adapter);
>>>>>              /* Link status message must follow this format */
>>>>>            pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>>        }
>>>>>    +    e1000_free_irq(adapter);
>>>>> +
>>>>>        napi_disable(&adapter->napi);
>>>>>          e1000e_free_tx_resources(adapter->tx_ring);
>>>>>
>>>> I would like not recommend insert this change. This change related
>>>> driver state machine, we afraid from lot of synchronization problem and
>>>> issues.
>>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>> What do you mean here? There is no loop. If __E1000_DOWN is set then we
>>> will never free the IRQ.
>>>
>>>> Another point, does before execute secondary bus reset your SW back up
>>>> pcie configuration space as properly?
>>> After a secondary bus reset, the link needs to recover and go back to a
>>> working state after 1 second.
>>>
>>>  From the callstack, the issue is happening while removing the endpoint
>>> from the system, before applying the secondary bus reset.
>>>
>>> The order of events is
>>> 1. remove the drivers
>>> 2. cause a secondary bus reset
>>> 3. wait 1 second
>> Actually, this is too much, usually link up in less than 100ms.You can
>> check Data Link Layer indication.
>>> 4. recover the link
>>>
>>> callstack:
>>> free_msi_irqs+0x6c/0x1a8
>>> pci_disable_msi+0xb0/0x148
>>> e1000e_reset_interrupt_capability+0x60/0x78
>>> e1000_remove+0xc8/0x180
>>> pci_device_remove+0x48/0x118
>>> __device_release_driver+0x80/0x108
>>> device_release_driver+0x2c/0x40
>>> pci_stop_bus_device+0xa0/0xb0
>>> pci_stop_bus_device+0x3c/0xb0
>>> pci_stop_root_bus+0x54/0x80
>>> acpi_pci_root_remove+0x28/0x64
>>> acpi_bus_trim+0x6c/0xa4
>>> acpi_device_hotplug+0x19c/0x3f4
>>> acpi_hotplug_work_fn+0x28/0x3c
>>> process_one_work+0x150/0x460
>>> worker_thread+0x50/0x4b8
>>> kthread+0xd4/0xe8
>>> ret_from_fork+0x10/0x50
>>>
>>> Thanks,
>>> Tyler
>>>
>> Hello Tyler,
>> Okay, we need consult more about this suggestion.
>> May I ask what is setup you run? Is there NIC or on board LAN? I would
>> like try reproduce this issue in our lab's too.
>> Also, is same issue observed with same scenario and others NIC's too?
>> Sasha
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan at lists.osuosl.org
>> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>>
> Hello Tyler,
> I see some in consistent implementation of __*_close methods in our
> drivers. Do you have any igb NIC to check if same problem persist there?
> Thanks,
> Sasha
Hello Sasha,

I couldn't find an igb NIC to test with, but I did find another e1000e 
card that does not cause the same issue. That card is:

0004:01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit 
Network Connection
         Subsystem: Intel Corporation Gigabit CT Desktop Adapter
         Physical Slot: 5-1
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
ParErr- Stepping- SERR+ FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
<TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 64 bytes
         Interrupt: pin A routed to IRQ 299
         Region 0: Memory at c01001c0000 (32-bit, non-prefetchable) 
[size=128K]
         Region 1: Memory at c0100100000 (32-bit, non-prefetchable) 
[size=512K]
         Region 2: I/O ports at 1000 [size=32]
         Region 3: Memory at c01001e0000 (32-bit, non-prefetchable) 
[size=16K]
         Expansion ROM at c0100180000 [disabled] [size=256K]
         Capabilities: [c8] Power Management version 2
                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                 Address: 00000000397f0040  Data: 0000
         Capabilities: [e0] Express (v1) Endpoint, MSI 00
                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s 
<512ns, L1 <64us
                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
Unsupported+
                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 256 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- 
AuxPwr+ TransPend-
                 LnkCap: Port #8, Speed 2.5GT/s, Width x1, ASPM L0s L1, 
Exit Latency L0s <128ns, L1 <64us
                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                 LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ 
DLActive- BWMgmt- ABWMgmt-
         Capabilities: [a0] MSI-X: Enable- Count=5 Masked-
                 Vector table: BAR=3 offset=00000000
                 PBA: BAR=3 offset=00002000
         Capabilities: [100 v1] Advanced Error Reporting
                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- 
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr-
                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
NonFatalErr+
                 AERCap: First Error Pointer: 00, GenCap- CGenEn- 
ChkCap- ChkEn-
         Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-29-47-34
         Kernel driver in use: e1000e

Here are the kernel logs from first running the test on this card and 
then running the test on the Intel 82571EB card that I originally found 
the issue with (you can see the issue doesn't happen on this card but 
does on the other):

[   44.146749] ACPI: \_SB_.PCI0: Device has suffered a power fault
[   44.155238] pcieport 0000:00:00.0: PCIe Bus Error: 
severity=Uncorrected (Non-Fatal), type=Transaction Layer, 
id=0000(Requester ID)
[   44.166111] pcieport 0000:00:00.0:   device [17cb:0400] error 
status/mask=00004000/00400000
[   44.174420] pcieport 0000:00:00.0:    [14] Completion Timeout (First)
[   44.401943] e1000e 0000:01:00.0 eth0: Timesync Tx Control register 
not set as expected
[   82.445586] pcieport 0002:00:00.0: PCIe Bus Error: 
severity=Corrected, type=Physical Layer, id=0000(Receiver ID)
[   82.454851] pcieport 0002:00:00.0:   device [17cb:0400] error 
status/mask=00000001/00006000
[   82.463209] pcieport 0002:00:00.0:    [ 0] Receiver Error
[   82.469355] pcieport 0002:00:00.0: PCIe Bus Error: 
severity=Uncorrected (Non-Fatal), type=Transaction Layer, 
id=0000(Requester ID)
[   82.481026] pcieport 0002:00:00.0:   device [17cb:0400] error 
status/mask=00004000/00400000
[   82.489343] pcieport 0002:00:00.0:    [14] Completion Timeout (First)
[   82.504573] ACPI: \_SB_.PCI2: Device has suffered a power fault
[   84.528036] kernel BUG@drivers/pci/msi.c:369!

I'm not sure why it reproduces on the 82571EB card and not the 82574L 
card. The only obvious difference is there is no Reciever Error on the 
82574L card.

If you have a patch fixing the inconsistencies you mentioned with the 
__*_close methods I would certainly be willing to test it out!

Thanks,
Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
  2016-11-21 20:40           ` Baicar, Tyler
@ 2016-12-02 17:02             ` Baicar, Tyler
  -1 siblings, 0 replies; 25+ messages in thread
From: Baicar, Tyler @ 2016-12-02 17:02 UTC (permalink / raw)
  To: Neftin, Sasha, jeffrey.t.kirsher, intel-wired-lan, netdev,
	linux-kernel, okaya, timur

Hello Sasha,

Were you able to reproduce this issue?

Do you have a patch fixing the close function inconsistencies that you 
mentioned which I could try out?

Thanks,
Tyler

On 11/21/2016 1:40 PM, Baicar, Tyler wrote:
> On 11/17/2016 6:31 AM, Neftin, Sasha wrote:
>> On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
>>> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>>>> Hello Sasha,
>>>>
>>>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>>>> Move IRQ free code so that it will happen regardless of the
>>>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>>>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>>>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>>>>> because the IRQ still has action since it was never freed. A
>>>>>> secondary bus reset can cause this case to happen.
>>>>>>
>>>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>>>> ---
>>>>>>    drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> index 7017281..36cfcb0 100644
>>>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>>>          if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>>>            e1000e_down(adapter, true);
>>>>>> -        e1000_free_irq(adapter);
>>>>>>              /* Link status message must follow this format */
>>>>>>            pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>>>        }
>>>>>>    +    e1000_free_irq(adapter);
>>>>>> +
>>>>>>        napi_disable(&adapter->napi);
>>>>>>          e1000e_free_tx_resources(adapter->tx_ring);
>>>>>>
>>>>> I would like not recommend insert this change. This change related
>>>>> driver state machine, we afraid from lot of synchronization 
>>>>> problem and
>>>>> issues.
>>>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>>> What do you mean here? There is no loop. If __E1000_DOWN is set 
>>>> then we
>>>> will never free the IRQ.
>>>>
>>>>> Another point, does before execute secondary bus reset your SW 
>>>>> back up
>>>>> pcie configuration space as properly?
>>>> After a secondary bus reset, the link needs to recover and go back 
>>>> to a
>>>> working state after 1 second.
>>>>
>>>>  From the callstack, the issue is happening while removing the 
>>>> endpoint
>>>> from the system, before applying the secondary bus reset.
>>>>
>>>> The order of events is
>>>> 1. remove the drivers
>>>> 2. cause a secondary bus reset
>>>> 3. wait 1 second
>>> Actually, this is too much, usually link up in less than 100ms.You can
>>> check Data Link Layer indication.
>>>> 4. recover the link
>>>>
>>>> callstack:
>>>> free_msi_irqs+0x6c/0x1a8
>>>> pci_disable_msi+0xb0/0x148
>>>> e1000e_reset_interrupt_capability+0x60/0x78
>>>> e1000_remove+0xc8/0x180
>>>> pci_device_remove+0x48/0x118
>>>> __device_release_driver+0x80/0x108
>>>> device_release_driver+0x2c/0x40
>>>> pci_stop_bus_device+0xa0/0xb0
>>>> pci_stop_bus_device+0x3c/0xb0
>>>> pci_stop_root_bus+0x54/0x80
>>>> acpi_pci_root_remove+0x28/0x64
>>>> acpi_bus_trim+0x6c/0xa4
>>>> acpi_device_hotplug+0x19c/0x3f4
>>>> acpi_hotplug_work_fn+0x28/0x3c
>>>> process_one_work+0x150/0x460
>>>> worker_thread+0x50/0x4b8
>>>> kthread+0xd4/0xe8
>>>> ret_from_fork+0x10/0x50
>>>>
>>>> Thanks,
>>>> Tyler
>>>>
>>> Hello Tyler,
>>> Okay, we need consult more about this suggestion.
>>> May I ask what is setup you run? Is there NIC or on board LAN? I would
>>> like try reproduce this issue in our lab's too.
>>> Also, is same issue observed with same scenario and others NIC's too?
>>> Sasha
>>> _______________________________________________
>>> Intel-wired-lan mailing list
>>> Intel-wired-lan@lists.osuosl.org
>>> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>>>
>> Hello Tyler,
>> I see some in consistent implementation of __*_close methods in our
>> drivers. Do you have any igb NIC to check if same problem persist there?
>> Thanks,
>> Sasha
> Hello Sasha,
>
> I couldn't find an igb NIC to test with, but I did find another e1000e 
> card that does not cause the same issue. That card is:
>
> 0004:01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit 
> Network Connection
>         Subsystem: Intel Corporation Gigabit CT Desktop Adapter
>         Physical Slot: 5-1
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR+ FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 299
>         Region 0: Memory at c01001c0000 (32-bit, non-prefetchable) 
> [size=128K]
>         Region 1: Memory at c0100100000 (32-bit, non-prefetchable) 
> [size=512K]
>         Region 2: I/O ports at 1000 [size=32]
>         Region 3: Memory at c01001e0000 (32-bit, non-prefetchable) 
> [size=16K]
>         Expansion ROM at c0100180000 [disabled] [size=256K]
>         Capabilities: [c8] Power Management version 2
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>                 Address: 00000000397f0040  Data: 0000
>         Capabilities: [e0] Express (v1) Endpoint, MSI 00
>                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s 
> <512ns, L1 <64us
>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
> Unsupported+
>                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>                         MaxPayload 256 bytes, MaxReadReq 512 bytes
>                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- 
> AuxPwr+ TransPend-
>                 LnkCap: Port #8, Speed 2.5GT/s, Width x1, ASPM L0s L1, 
> Exit Latency L0s <128ns, L1 <64us
>                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>                 LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                 LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- 
> SlotClk+ DLActive- BWMgmt- ABWMgmt-
>         Capabilities: [a0] MSI-X: Enable- Count=5 Masked-
>                 Vector table: BAR=3 offset=00000000
>                 PBA: BAR=3 offset=00002000
>         Capabilities: [100 v1] Advanced Error Reporting
>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- 
> UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
> NonFatalErr-
>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
> NonFatalErr+
>                 AERCap: First Error Pointer: 00, GenCap- CGenEn- 
> ChkCap- ChkEn-
>         Capabilities: [140 v1] Device Serial Number 
> 68-05-ca-ff-ff-29-47-34
>         Kernel driver in use: e1000e
>
> Here are the kernel logs from first running the test on this card and 
> then running the test on the Intel 82571EB card that I originally 
> found the issue with (you can see the issue doesn't happen on this 
> card but does on the other):
>
> [   44.146749] ACPI: \_SB_.PCI0: Device has suffered a power fault
> [   44.155238] pcieport 0000:00:00.0: PCIe Bus Error: 
> severity=Uncorrected (Non-Fatal), type=Transaction Layer, 
> id=0000(Requester ID)
> [   44.166111] pcieport 0000:00:00.0:   device [17cb:0400] error 
> status/mask=00004000/00400000
> [   44.174420] pcieport 0000:00:00.0:    [14] Completion Timeout (First)
> [   44.401943] e1000e 0000:01:00.0 eth0: Timesync Tx Control register 
> not set as expected
> [   82.445586] pcieport 0002:00:00.0: PCIe Bus Error: 
> severity=Corrected, type=Physical Layer, id=0000(Receiver ID)
> [   82.454851] pcieport 0002:00:00.0:   device [17cb:0400] error 
> status/mask=00000001/00006000
> [   82.463209] pcieport 0002:00:00.0:    [ 0] Receiver Error
> [   82.469355] pcieport 0002:00:00.0: PCIe Bus Error: 
> severity=Uncorrected (Non-Fatal), type=Transaction Layer, 
> id=0000(Requester ID)
> [   82.481026] pcieport 0002:00:00.0:   device [17cb:0400] error 
> status/mask=00004000/00400000
> [   82.489343] pcieport 0002:00:00.0:    [14] Completion Timeout (First)
> [   82.504573] ACPI: \_SB_.PCI2: Device has suffered a power fault
> [   84.528036] kernel BUG at drivers/pci/msi.c:369!
>
> I'm not sure why it reproduces on the 82571EB card and not the 82574L 
> card. The only obvious difference is there is no Reciever Error on the 
> 82574L card.
>
> If you have a patch fixing the inconsistencies you mentioned with the 
> __*_close methods I would certainly be willing to test it out!
>
> Thanks,
> Tyler
>

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
@ 2016-12-02 17:02             ` Baicar, Tyler
  0 siblings, 0 replies; 25+ messages in thread
From: Baicar, Tyler @ 2016-12-02 17:02 UTC (permalink / raw)
  To: intel-wired-lan

Hello Sasha,

Were you able to reproduce this issue?

Do you have a patch fixing the close function inconsistencies that you 
mentioned which I could try out?

Thanks,
Tyler

On 11/21/2016 1:40 PM, Baicar, Tyler wrote:
> On 11/17/2016 6:31 AM, Neftin, Sasha wrote:
>> On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
>>> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>>>> Hello Sasha,
>>>>
>>>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>>>> Move IRQ free code so that it will happen regardless of the
>>>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>>>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>>>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>>>>> because the IRQ still has action since it was never freed. A
>>>>>> secondary bus reset can cause this case to happen.
>>>>>>
>>>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>>>> ---
>>>>>>    drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> index 7017281..36cfcb0 100644
>>>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>>>          if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>>>            e1000e_down(adapter, true);
>>>>>> -        e1000_free_irq(adapter);
>>>>>>              /* Link status message must follow this format */
>>>>>>            pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>>>        }
>>>>>>    +    e1000_free_irq(adapter);
>>>>>> +
>>>>>>        napi_disable(&adapter->napi);
>>>>>>          e1000e_free_tx_resources(adapter->tx_ring);
>>>>>>
>>>>> I would like not recommend insert this change. This change related
>>>>> driver state machine, we afraid from lot of synchronization 
>>>>> problem and
>>>>> issues.
>>>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>>> What do you mean here? There is no loop. If __E1000_DOWN is set 
>>>> then we
>>>> will never free the IRQ.
>>>>
>>>>> Another point, does before execute secondary bus reset your SW 
>>>>> back up
>>>>> pcie configuration space as properly?
>>>> After a secondary bus reset, the link needs to recover and go back 
>>>> to a
>>>> working state after 1 second.
>>>>
>>>>  From the callstack, the issue is happening while removing the 
>>>> endpoint
>>>> from the system, before applying the secondary bus reset.
>>>>
>>>> The order of events is
>>>> 1. remove the drivers
>>>> 2. cause a secondary bus reset
>>>> 3. wait 1 second
>>> Actually, this is too much, usually link up in less than 100ms.You can
>>> check Data Link Layer indication.
>>>> 4. recover the link
>>>>
>>>> callstack:
>>>> free_msi_irqs+0x6c/0x1a8
>>>> pci_disable_msi+0xb0/0x148
>>>> e1000e_reset_interrupt_capability+0x60/0x78
>>>> e1000_remove+0xc8/0x180
>>>> pci_device_remove+0x48/0x118
>>>> __device_release_driver+0x80/0x108
>>>> device_release_driver+0x2c/0x40
>>>> pci_stop_bus_device+0xa0/0xb0
>>>> pci_stop_bus_device+0x3c/0xb0
>>>> pci_stop_root_bus+0x54/0x80
>>>> acpi_pci_root_remove+0x28/0x64
>>>> acpi_bus_trim+0x6c/0xa4
>>>> acpi_device_hotplug+0x19c/0x3f4
>>>> acpi_hotplug_work_fn+0x28/0x3c
>>>> process_one_work+0x150/0x460
>>>> worker_thread+0x50/0x4b8
>>>> kthread+0xd4/0xe8
>>>> ret_from_fork+0x10/0x50
>>>>
>>>> Thanks,
>>>> Tyler
>>>>
>>> Hello Tyler,
>>> Okay, we need consult more about this suggestion.
>>> May I ask what is setup you run? Is there NIC or on board LAN? I would
>>> like try reproduce this issue in our lab's too.
>>> Also, is same issue observed with same scenario and others NIC's too?
>>> Sasha
>>> _______________________________________________
>>> Intel-wired-lan mailing list
>>> Intel-wired-lan at lists.osuosl.org
>>> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>>>
>> Hello Tyler,
>> I see some in consistent implementation of __*_close methods in our
>> drivers. Do you have any igb NIC to check if same problem persist there?
>> Thanks,
>> Sasha
> Hello Sasha,
>
> I couldn't find an igb NIC to test with, but I did find another e1000e 
> card that does not cause the same issue. That card is:
>
> 0004:01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit 
> Network Connection
>         Subsystem: Intel Corporation Gigabit CT Desktop Adapter
>         Physical Slot: 5-1
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR+ FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 299
>         Region 0: Memory at c01001c0000 (32-bit, non-prefetchable) 
> [size=128K]
>         Region 1: Memory at c0100100000 (32-bit, non-prefetchable) 
> [size=512K]
>         Region 2: I/O ports at 1000 [size=32]
>         Region 3: Memory at c01001e0000 (32-bit, non-prefetchable) 
> [size=16K]
>         Expansion ROM at c0100180000 [disabled] [size=256K]
>         Capabilities: [c8] Power Management version 2
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>                 Address: 00000000397f0040  Data: 0000
>         Capabilities: [e0] Express (v1) Endpoint, MSI 00
>                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s 
> <512ns, L1 <64us
>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
> Unsupported+
>                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>                         MaxPayload 256 bytes, MaxReadReq 512 bytes
>                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- 
> AuxPwr+ TransPend-
>                 LnkCap: Port #8, Speed 2.5GT/s, Width x1, ASPM L0s L1, 
> Exit Latency L0s <128ns, L1 <64us
>                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>                 LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                 LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- 
> SlotClk+ DLActive- BWMgmt- ABWMgmt-
>         Capabilities: [a0] MSI-X: Enable- Count=5 Masked-
>                 Vector table: BAR=3 offset=00000000
>                 PBA: BAR=3 offset=00002000
>         Capabilities: [100 v1] Advanced Error Reporting
>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- 
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- 
> UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
> NonFatalErr-
>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
> NonFatalErr+
>                 AERCap: First Error Pointer: 00, GenCap- CGenEn- 
> ChkCap- ChkEn-
>         Capabilities: [140 v1] Device Serial Number 
> 68-05-ca-ff-ff-29-47-34
>         Kernel driver in use: e1000e
>
> Here are the kernel logs from first running the test on this card and 
> then running the test on the Intel 82571EB card that I originally 
> found the issue with (you can see the issue doesn't happen on this 
> card but does on the other):
>
> [   44.146749] ACPI: \_SB_.PCI0: Device has suffered a power fault
> [   44.155238] pcieport 0000:00:00.0: PCIe Bus Error: 
> severity=Uncorrected (Non-Fatal), type=Transaction Layer, 
> id=0000(Requester ID)
> [   44.166111] pcieport 0000:00:00.0:   device [17cb:0400] error 
> status/mask=00004000/00400000
> [   44.174420] pcieport 0000:00:00.0:    [14] Completion Timeout (First)
> [   44.401943] e1000e 0000:01:00.0 eth0: Timesync Tx Control register 
> not set as expected
> [   82.445586] pcieport 0002:00:00.0: PCIe Bus Error: 
> severity=Corrected, type=Physical Layer, id=0000(Receiver ID)
> [   82.454851] pcieport 0002:00:00.0:   device [17cb:0400] error 
> status/mask=00000001/00006000
> [   82.463209] pcieport 0002:00:00.0:    [ 0] Receiver Error
> [   82.469355] pcieport 0002:00:00.0: PCIe Bus Error: 
> severity=Uncorrected (Non-Fatal), type=Transaction Layer, 
> id=0000(Requester ID)
> [   82.481026] pcieport 0002:00:00.0:   device [17cb:0400] error 
> status/mask=00004000/00400000
> [   82.489343] pcieport 0002:00:00.0:    [14] Completion Timeout (First)
> [   82.504573] ACPI: \_SB_.PCI2: Device has suffered a power fault
> [   84.528036] kernel BUG at drivers/pci/msi.c:369!
>
> I'm not sure why it reproduces on the 82571EB card and not the 82574L 
> card. The only obvious difference is there is no Reciever Error on the 
> 82574L card.
>
> If you have a patch fixing the inconsistencies you mentioned with the 
> __*_close methods I would certainly be willing to test it out!
>
> Thanks,
> Tyler
>

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
  2016-12-02 17:02             ` Baicar, Tyler
@ 2016-12-04  7:35               ` Neftin, Sasha
  -1 siblings, 0 replies; 25+ messages in thread
From: Neftin, Sasha @ 2016-12-04  7:35 UTC (permalink / raw)
  To: Baicar, Tyler, jeffrey.t.kirsher, intel-wired-lan, netdev,
	linux-kernel, okaya, timur

On 12/2/2016 7:02 PM, Baicar, Tyler wrote:
> Hello Sasha,
> 
> Were you able to reproduce this issue?
> 
> Do you have a patch fixing the close function inconsistencies that you
> mentioned which I could try out?
> 
> Thanks,
> Tyler
> 
> On 11/21/2016 1:40 PM, Baicar, Tyler wrote:
>> On 11/17/2016 6:31 AM, Neftin, Sasha wrote:
>>> On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
>>>> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>>>>> Hello Sasha,
>>>>>
>>>>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>>>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>>>>> Move IRQ free code so that it will happen regardless of the
>>>>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>>>>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>>>>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>>>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>>>>>> because the IRQ still has action since it was never freed. A
>>>>>>> secondary bus reset can cause this case to happen.
>>>>>>>
>>>>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>>>>> ---
>>>>>>>    drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>>> index 7017281..36cfcb0 100644
>>>>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>>>>          if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>>>>            e1000e_down(adapter, true);
>>>>>>> -        e1000_free_irq(adapter);
>>>>>>>              /* Link status message must follow this format */
>>>>>>>            pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>>>>        }
>>>>>>>    +    e1000_free_irq(adapter);
>>>>>>> +
>>>>>>>        napi_disable(&adapter->napi);
>>>>>>>          e1000e_free_tx_resources(adapter->tx_ring);
>>>>>>>
>>>>>> I would like not recommend insert this change. This change related
>>>>>> driver state machine, we afraid from lot of synchronization
>>>>>> problem and
>>>>>> issues.
>>>>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>>>> What do you mean here? There is no loop. If __E1000_DOWN is set
>>>>> then we
>>>>> will never free the IRQ.
>>>>>
>>>>>> Another point, does before execute secondary bus reset your SW
>>>>>> back up
>>>>>> pcie configuration space as properly?
>>>>> After a secondary bus reset, the link needs to recover and go back
>>>>> to a
>>>>> working state after 1 second.
>>>>>
>>>>>  From the callstack, the issue is happening while removing the
>>>>> endpoint
>>>>> from the system, before applying the secondary bus reset.
>>>>>
>>>>> The order of events is
>>>>> 1. remove the drivers
>>>>> 2. cause a secondary bus reset
>>>>> 3. wait 1 second
>>>> Actually, this is too much, usually link up in less than 100ms.You can
>>>> check Data Link Layer indication.
>>>>> 4. recover the link
>>>>>
>>>>> callstack:
>>>>> free_msi_irqs+0x6c/0x1a8
>>>>> pci_disable_msi+0xb0/0x148
>>>>> e1000e_reset_interrupt_capability+0x60/0x78
>>>>> e1000_remove+0xc8/0x180
>>>>> pci_device_remove+0x48/0x118
>>>>> __device_release_driver+0x80/0x108
>>>>> device_release_driver+0x2c/0x40
>>>>> pci_stop_bus_device+0xa0/0xb0
>>>>> pci_stop_bus_device+0x3c/0xb0
>>>>> pci_stop_root_bus+0x54/0x80
>>>>> acpi_pci_root_remove+0x28/0x64
>>>>> acpi_bus_trim+0x6c/0xa4
>>>>> acpi_device_hotplug+0x19c/0x3f4
>>>>> acpi_hotplug_work_fn+0x28/0x3c
>>>>> process_one_work+0x150/0x460
>>>>> worker_thread+0x50/0x4b8
>>>>> kthread+0xd4/0xe8
>>>>> ret_from_fork+0x10/0x50
>>>>>
>>>>> Thanks,
>>>>> Tyler
>>>>>
>>>> Hello Tyler,
>>>> Okay, we need consult more about this suggestion.
>>>> May I ask what is setup you run? Is there NIC or on board LAN? I would
>>>> like try reproduce this issue in our lab's too.
>>>> Also, is same issue observed with same scenario and others NIC's too?
>>>> Sasha
>>>> _______________________________________________
>>>> Intel-wired-lan mailing list
>>>> Intel-wired-lan@lists.osuosl.org
>>>> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>>>>
>>> Hello Tyler,
>>> I see some in consistent implementation of __*_close methods in our
>>> drivers. Do you have any igb NIC to check if same problem persist there?
>>> Thanks,
>>> Sasha
>> Hello Sasha,
>>
>> I couldn't find an igb NIC to test with, but I did find another e1000e
>> card that does not cause the same issue. That card is:
>>
>> 0004:01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit
>> Network Connection
>>         Subsystem: Intel Corporation Gigabit CT Desktop Adapter
>>         Physical Slot: 5-1
>>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>> ParErr- Stepping- SERR+ FastB2B- DisINTx+
>>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>         Latency: 0, Cache Line Size: 64 bytes
>>         Interrupt: pin A routed to IRQ 299
>>         Region 0: Memory at c01001c0000 (32-bit, non-prefetchable)
>> [size=128K]
>>         Region 1: Memory at c0100100000 (32-bit, non-prefetchable)
>> [size=512K]
>>         Region 2: I/O ports at 1000 [size=32]
>>         Region 3: Memory at c01001e0000 (32-bit, non-prefetchable)
>> [size=16K]
>>         Expansion ROM at c0100180000 [disabled] [size=256K]
>>         Capabilities: [c8] Power Management version 2
>>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
>> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>>         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>>                 Address: 00000000397f0040  Data: 0000
>>         Capabilities: [e0] Express (v1) Endpoint, MSI 00
>>                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
>> <512ns, L1 <64us
>>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>>                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
>> Unsupported+
>>                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>>                         MaxPayload 256 bytes, MaxReadReq 512 bytes
>>                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
>> AuxPwr+ TransPend-
>>                 LnkCap: Port #8, Speed 2.5GT/s, Width x1, ASPM L0s L1,
>> Exit Latency L0s <128ns, L1 <64us
>>                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>>                 LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
>>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>                 LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train-
>> SlotClk+ DLActive- BWMgmt- ABWMgmt-
>>         Capabilities: [a0] MSI-X: Enable- Count=5 Masked-
>>                 Vector table: BAR=3 offset=00000000
>>                 PBA: BAR=3 offset=00002000
>>         Capabilities: [100 v1] Advanced Error Reporting
>>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
>> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
>> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt-
>> UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>>                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
>> NonFatalErr-
>>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
>> NonFatalErr+
>>                 AERCap: First Error Pointer: 00, GenCap- CGenEn-
>> ChkCap- ChkEn-
>>         Capabilities: [140 v1] Device Serial Number
>> 68-05-ca-ff-ff-29-47-34
>>         Kernel driver in use: e1000e
>>
>> Here are the kernel logs from first running the test on this card and
>> then running the test on the Intel 82571EB card that I originally
>> found the issue with (you can see the issue doesn't happen on this
>> card but does on the other):
>>
>> [   44.146749] ACPI: \_SB_.PCI0: Device has suffered a power fault
>> [   44.155238] pcieport 0000:00:00.0: PCIe Bus Error:
>> severity=Uncorrected (Non-Fatal), type=Transaction Layer,
>> id=0000(Requester ID)
>> [   44.166111] pcieport 0000:00:00.0:   device [17cb:0400] error
>> status/mask=00004000/00400000
>> [   44.174420] pcieport 0000:00:00.0:    [14] Completion Timeout (First)
>> [   44.401943] e1000e 0000:01:00.0 eth0: Timesync Tx Control register
>> not set as expected
>> [   82.445586] pcieport 0002:00:00.0: PCIe Bus Error:
>> severity=Corrected, type=Physical Layer, id=0000(Receiver ID)
>> [   82.454851] pcieport 0002:00:00.0:   device [17cb:0400] error
>> status/mask=00000001/00006000
>> [   82.463209] pcieport 0002:00:00.0:    [ 0] Receiver Error
>> [   82.469355] pcieport 0002:00:00.0: PCIe Bus Error:
>> severity=Uncorrected (Non-Fatal), type=Transaction Layer,
>> id=0000(Requester ID)
>> [   82.481026] pcieport 0002:00:00.0:   device [17cb:0400] error
>> status/mask=00004000/00400000
>> [   82.489343] pcieport 0002:00:00.0:    [14] Completion Timeout (First)
>> [   82.504573] ACPI: \_SB_.PCI2: Device has suffered a power fault
>> [   84.528036] kernel BUG at drivers/pci/msi.c:369!
>>
>> I'm not sure why it reproduces on the 82571EB card and not the 82574L
>> card. The only obvious difference is there is no Reciever Error on the
>> 82574L card.
>>
>> If you have a patch fixing the inconsistencies you mentioned with the
>> __*_close methods I would certainly be willing to test it out!
>>
>> Thanks,
>> Tyler
>>
> 
Hello Tyler,
We do not tried reproduce issues yet. As I know hit on BUG_ON happen on
very old NIC. Our more new NIC do not experienced such problem. I would
like recommend do not change kernel code in this moment. This is very
sensitive for a driver state machine and we afraid from opening lot of
issues. I will investigate more depth this problem later(hope have a
time for it) and let you know if we have suggest fixes for this problem.
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
@ 2016-12-04  7:35               ` Neftin, Sasha
  0 siblings, 0 replies; 25+ messages in thread
From: Neftin, Sasha @ 2016-12-04  7:35 UTC (permalink / raw)
  To: intel-wired-lan

On 12/2/2016 7:02 PM, Baicar, Tyler wrote:
> Hello Sasha,
> 
> Were you able to reproduce this issue?
> 
> Do you have a patch fixing the close function inconsistencies that you
> mentioned which I could try out?
> 
> Thanks,
> Tyler
> 
> On 11/21/2016 1:40 PM, Baicar, Tyler wrote:
>> On 11/17/2016 6:31 AM, Neftin, Sasha wrote:
>>> On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
>>>> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>>>>> Hello Sasha,
>>>>>
>>>>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>>>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>>>>> Move IRQ free code so that it will happen regardless of the
>>>>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>>>>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>>>>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>>>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>>>>>> because the IRQ still has action since it was never freed. A
>>>>>>> secondary bus reset can cause this case to happen.
>>>>>>>
>>>>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>>>>> ---
>>>>>>>    drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>>> index 7017281..36cfcb0 100644
>>>>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>>>>          if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>>>>            e1000e_down(adapter, true);
>>>>>>> -        e1000_free_irq(adapter);
>>>>>>>              /* Link status message must follow this format */
>>>>>>>            pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>>>>        }
>>>>>>>    +    e1000_free_irq(adapter);
>>>>>>> +
>>>>>>>        napi_disable(&adapter->napi);
>>>>>>>          e1000e_free_tx_resources(adapter->tx_ring);
>>>>>>>
>>>>>> I would like not recommend insert this change. This change related
>>>>>> driver state machine, we afraid from lot of synchronization
>>>>>> problem and
>>>>>> issues.
>>>>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>>>> What do you mean here? There is no loop. If __E1000_DOWN is set
>>>>> then we
>>>>> will never free the IRQ.
>>>>>
>>>>>> Another point, does before execute secondary bus reset your SW
>>>>>> back up
>>>>>> pcie configuration space as properly?
>>>>> After a secondary bus reset, the link needs to recover and go back
>>>>> to a
>>>>> working state after 1 second.
>>>>>
>>>>>  From the callstack, the issue is happening while removing the
>>>>> endpoint
>>>>> from the system, before applying the secondary bus reset.
>>>>>
>>>>> The order of events is
>>>>> 1. remove the drivers
>>>>> 2. cause a secondary bus reset
>>>>> 3. wait 1 second
>>>> Actually, this is too much, usually link up in less than 100ms.You can
>>>> check Data Link Layer indication.
>>>>> 4. recover the link
>>>>>
>>>>> callstack:
>>>>> free_msi_irqs+0x6c/0x1a8
>>>>> pci_disable_msi+0xb0/0x148
>>>>> e1000e_reset_interrupt_capability+0x60/0x78
>>>>> e1000_remove+0xc8/0x180
>>>>> pci_device_remove+0x48/0x118
>>>>> __device_release_driver+0x80/0x108
>>>>> device_release_driver+0x2c/0x40
>>>>> pci_stop_bus_device+0xa0/0xb0
>>>>> pci_stop_bus_device+0x3c/0xb0
>>>>> pci_stop_root_bus+0x54/0x80
>>>>> acpi_pci_root_remove+0x28/0x64
>>>>> acpi_bus_trim+0x6c/0xa4
>>>>> acpi_device_hotplug+0x19c/0x3f4
>>>>> acpi_hotplug_work_fn+0x28/0x3c
>>>>> process_one_work+0x150/0x460
>>>>> worker_thread+0x50/0x4b8
>>>>> kthread+0xd4/0xe8
>>>>> ret_from_fork+0x10/0x50
>>>>>
>>>>> Thanks,
>>>>> Tyler
>>>>>
>>>> Hello Tyler,
>>>> Okay, we need consult more about this suggestion.
>>>> May I ask what is setup you run? Is there NIC or on board LAN? I would
>>>> like try reproduce this issue in our lab's too.
>>>> Also, is same issue observed with same scenario and others NIC's too?
>>>> Sasha
>>>> _______________________________________________
>>>> Intel-wired-lan mailing list
>>>> Intel-wired-lan at lists.osuosl.org
>>>> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>>>>
>>> Hello Tyler,
>>> I see some in consistent implementation of __*_close methods in our
>>> drivers. Do you have any igb NIC to check if same problem persist there?
>>> Thanks,
>>> Sasha
>> Hello Sasha,
>>
>> I couldn't find an igb NIC to test with, but I did find another e1000e
>> card that does not cause the same issue. That card is:
>>
>> 0004:01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit
>> Network Connection
>>         Subsystem: Intel Corporation Gigabit CT Desktop Adapter
>>         Physical Slot: 5-1
>>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>> ParErr- Stepping- SERR+ FastB2B- DisINTx+
>>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>         Latency: 0, Cache Line Size: 64 bytes
>>         Interrupt: pin A routed to IRQ 299
>>         Region 0: Memory at c01001c0000 (32-bit, non-prefetchable)
>> [size=128K]
>>         Region 1: Memory at c0100100000 (32-bit, non-prefetchable)
>> [size=512K]
>>         Region 2: I/O ports at 1000 [size=32]
>>         Region 3: Memory at c01001e0000 (32-bit, non-prefetchable)
>> [size=16K]
>>         Expansion ROM at c0100180000 [disabled] [size=256K]
>>         Capabilities: [c8] Power Management version 2
>>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
>> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>>         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>>                 Address: 00000000397f0040  Data: 0000
>>         Capabilities: [e0] Express (v1) Endpoint, MSI 00
>>                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
>> <512ns, L1 <64us
>>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>>                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
>> Unsupported+
>>                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>>                         MaxPayload 256 bytes, MaxReadReq 512 bytes
>>                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
>> AuxPwr+ TransPend-
>>                 LnkCap: Port #8, Speed 2.5GT/s, Width x1, ASPM L0s L1,
>> Exit Latency L0s <128ns, L1 <64us
>>                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>>                 LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
>>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>                 LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train-
>> SlotClk+ DLActive- BWMgmt- ABWMgmt-
>>         Capabilities: [a0] MSI-X: Enable- Count=5 Masked-
>>                 Vector table: BAR=3 offset=00000000
>>                 PBA: BAR=3 offset=00002000
>>         Capabilities: [100 v1] Advanced Error Reporting
>>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
>> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
>> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt-
>> UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>>                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
>> NonFatalErr-
>>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
>> NonFatalErr+
>>                 AERCap: First Error Pointer: 00, GenCap- CGenEn-
>> ChkCap- ChkEn-
>>         Capabilities: [140 v1] Device Serial Number
>> 68-05-ca-ff-ff-29-47-34
>>         Kernel driver in use: e1000e
>>
>> Here are the kernel logs from first running the test on this card and
>> then running the test on the Intel 82571EB card that I originally
>> found the issue with (you can see the issue doesn't happen on this
>> card but does on the other):
>>
>> [   44.146749] ACPI: \_SB_.PCI0: Device has suffered a power fault
>> [   44.155238] pcieport 0000:00:00.0: PCIe Bus Error:
>> severity=Uncorrected (Non-Fatal), type=Transaction Layer,
>> id=0000(Requester ID)
>> [   44.166111] pcieport 0000:00:00.0:   device [17cb:0400] error
>> status/mask=00004000/00400000
>> [   44.174420] pcieport 0000:00:00.0:    [14] Completion Timeout (First)
>> [   44.401943] e1000e 0000:01:00.0 eth0: Timesync Tx Control register
>> not set as expected
>> [   82.445586] pcieport 0002:00:00.0: PCIe Bus Error:
>> severity=Corrected, type=Physical Layer, id=0000(Receiver ID)
>> [   82.454851] pcieport 0002:00:00.0:   device [17cb:0400] error
>> status/mask=00000001/00006000
>> [   82.463209] pcieport 0002:00:00.0:    [ 0] Receiver Error
>> [   82.469355] pcieport 0002:00:00.0: PCIe Bus Error:
>> severity=Uncorrected (Non-Fatal), type=Transaction Layer,
>> id=0000(Requester ID)
>> [   82.481026] pcieport 0002:00:00.0:   device [17cb:0400] error
>> status/mask=00004000/00400000
>> [   82.489343] pcieport 0002:00:00.0:    [14] Completion Timeout (First)
>> [   82.504573] ACPI: \_SB_.PCI2: Device has suffered a power fault
>> [   84.528036] kernel BUG at drivers/pci/msi.c:369!
>>
>> I'm not sure why it reproduces on the 82571EB card and not the 82574L
>> card. The only obvious difference is there is no Reciever Error on the
>> 82574L card.
>>
>> If you have a patch fixing the inconsistencies you mentioned with the
>> __*_close methods I would certainly be willing to test it out!
>>
>> Thanks,
>> Tyler
>>
> 
Hello Tyler,
We do not tried reproduce issues yet. As I know hit on BUG_ON happen on
very old NIC. Our more new NIC do not experienced such problem. I would
like recommend do not change kernel code in this moment. This is very
sensitive for a driver state machine and we afraid from opening lot of
issues. I will investigate more depth this problem later(hope have a
time for it) and let you know if we have suggest fixes for this problem.
Thanks,
Sasha


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2016-12-04  7:36 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-09 21:41 [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN Tyler Baicar
2016-11-09 21:41 ` [Intel-wired-lan] " Tyler Baicar
2016-11-10  6:19 ` Neftin, Sasha
2016-11-10  6:19   ` Neftin, Sasha
2016-11-10 22:35   ` Baicar, Tyler
2016-11-10 22:35     ` Baicar, Tyler
2016-11-13  8:34     ` Neftin, Sasha
2016-11-13  8:34       ` Neftin, Sasha
2016-11-13  9:25       ` Neftin, Sasha
2016-11-13  9:25         ` Neftin, Sasha
2016-11-15 21:50         ` Baicar, Tyler
2016-11-15 21:50           ` Baicar, Tyler
     [not found]           ` <630A6B92B7EDEB45A87E20D3D286660153E3B481@hasmsx109.ger.corp.intel.com>
2016-11-17 13:07             ` Neftin, Sasha
2016-11-17 13:17             ` Fwd:[Intel-wired-lan] " Neftin, Sasha
2016-11-17 13:17               ` [Intel-wired-lan] Fwd: " Neftin, Sasha
2016-11-17 13:27             ` [Intel-wired-lan] " Neftin, Sasha
     [not found]             ` <82260f85-2f8d-20e7-4f52-86b156aff96f@intel.com>
     [not found]               ` <7a5a4fdc-6086-4bd8-cb95-00d4ec2c0817@intel.com>
2016-11-17 13:49                 ` [Intel-wired-lan] Fwd: FW: " Neftin, Sasha
2016-11-17 13:31       ` [Intel-wired-lan] " Neftin, Sasha
2016-11-17 13:31         ` Neftin, Sasha
2016-11-21 20:40         ` Baicar, Tyler
2016-11-21 20:40           ` Baicar, Tyler
2016-12-02 17:02           ` Baicar, Tyler
2016-12-02 17:02             ` Baicar, Tyler
2016-12-04  7:35             ` Neftin, Sasha
2016-12-04  7:35               ` Neftin, Sasha

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.