All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
To: Stefan Dietrich <roots@gmx.de>
Cc: kuba@kernel.org, greg@kroah.com, netdev@vger.kernel.org,
	intel-wired-lan@lists.osuosl.org, regressions@lists.linux.dev
Subject: Re: [PATCH] igc: Avoid possible deadlock during suspend/resume
Date: Thu, 02 Dec 2021 14:34:21 -0800	[thread overview]
Message-ID: <87o85yljpu.fsf@intel.com> (raw)
In-Reply-To: <5a4b31d43d9bf32e518188f3ef84c433df3a18b1.camel@gmx.de>

Hi Stefan,

Stefan Dietrich <roots@gmx.de> writes:

> Hi Vinicius,
>
> thanks for the patch - unfortunately it did not solve the issue and I
> am still getting reboots/lockups.
>

Thanks for the test. We learned something, not a lot, but something: the
problem you are facing is PTM related and it's not the same bug as that
PM deadlock.

I am still trying to understand what's going on.

Are you able to send me the 'dmesg' output for the two kernel configs
(CONFIG_PCIE_PTM enabled and disabled)? (no need to bring the network
interface up or down). Your kernel .config would be useful as well.

>
> Cheers,
> Stefan
>
> On Wed, 2021-12-01 at 10:57 -0800, Vinicius Costa Gomes wrote:
>> Inspired by:
>> https://bugzilla.kernel.org/show_bug.cgi?id=215129
>>
>> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
>> ---
>> Just to see if it's indeed the same problem as the bug report above.
>>
>>  drivers/net/ethernet/intel/igc/igc_main.c | 19 +++++++++++++------
>>  1 file changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c
>> b/drivers/net/ethernet/intel/igc/igc_main.c
>> index 0e19b4d02e62..c58bf557a2a1 100644
>> --- a/drivers/net/ethernet/intel/igc/igc_main.c
>> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
>> @@ -6619,7 +6619,7 @@ static void igc_deliver_wake_packet(struct
>> net_device *netdev)
>>  	netif_rx(skb);
>>  }
>>
>> -static int __maybe_unused igc_resume(struct device *dev)
>> +static int __maybe_unused __igc_resume(struct device *dev, bool rpm)
>>  {
>>  	struct pci_dev *pdev = to_pci_dev(dev);
>>  	struct net_device *netdev = pci_get_drvdata(pdev);
>> @@ -6661,20 +6661,27 @@ static int __maybe_unused igc_resume(struct
>> device *dev)
>>
>>  	wr32(IGC_WUS, ~0);
>>
>> -	rtnl_lock();
>> +	if (!rpm)
>> +		rtnl_lock();
>>  	if (!err && netif_running(netdev))
>>  		err = __igc_open(netdev, true);
>>
>>  	if (!err)
>>  		netif_device_attach(netdev);
>> -	rtnl_unlock();
>> +	if (!rpm)
>> +		rtnl_unlock();
>>
>>  	return err;
>>  }
>>
>>  static int __maybe_unused igc_runtime_resume(struct device *dev)
>>  {
>> -	return igc_resume(dev);
>> +	return __igc_resume(dev, true);
>> +}
>> +
>> +static int __maybe_unused igc_resume(struct device *dev)
>> +{
>> +	return __igc_resume(dev, false);
>>  }
>>
>>  static int __maybe_unused igc_suspend(struct device *dev)
>> @@ -6738,7 +6745,7 @@ static pci_ers_result_t
>> igc_io_error_detected(struct pci_dev *pdev,
>>   *  @pdev: Pointer to PCI device
>>   *
>>   *  Restart the card from scratch, as if from a cold-boot.
>> Implementation
>> - *  resembles the first-half of the igc_resume routine.
>> + *  resembles the first-half of the __igc_resume routine.
>>   **/
>>  static pci_ers_result_t igc_io_slot_reset(struct pci_dev *pdev)
>>  {
>> @@ -6777,7 +6784,7 @@ static pci_ers_result_t
>> igc_io_slot_reset(struct pci_dev *pdev)
>>   *
>>   *  This callback is called when the error recovery driver tells us
>> that
>>   *  its OK to resume normal operation. Implementation resembles the
>> - *  second-half of the igc_resume routine.
>> + *  second-half of the __igc_resume routine.
>>   */
>>  static void igc_io_resume(struct pci_dev *pdev)
>>  {
>


Cheers,
-- 
Vinicius

WARNING: multiple messages have this Message-ID (diff)
From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
To: intel-wired-lan@osuosl.org
Subject: [Intel-wired-lan] [PATCH] igc: Avoid possible deadlock during suspend/resume
Date: Thu, 02 Dec 2021 14:34:21 -0800	[thread overview]
Message-ID: <87o85yljpu.fsf@intel.com> (raw)
In-Reply-To: <5a4b31d43d9bf32e518188f3ef84c433df3a18b1.camel@gmx.de>

Hi Stefan,

Stefan Dietrich <roots@gmx.de> writes:

> Hi Vinicius,
>
> thanks for the patch - unfortunately it did not solve the issue and I
> am still getting reboots/lockups.
>

Thanks for the test. We learned something, not a lot, but something: the
problem you are facing is PTM related and it's not the same bug as that
PM deadlock.

I am still trying to understand what's going on.

Are you able to send me the 'dmesg' output for the two kernel configs
(CONFIG_PCIE_PTM enabled and disabled)? (no need to bring the network
interface up or down). Your kernel .config would be useful as well.

>
> Cheers,
> Stefan
>
> On Wed, 2021-12-01 at 10:57 -0800, Vinicius Costa Gomes wrote:
>> Inspired by:
>> https://bugzilla.kernel.org/show_bug.cgi?id=215129
>>
>> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
>> ---
>> Just to see if it's indeed the same problem as the bug report above.
>>
>>  drivers/net/ethernet/intel/igc/igc_main.c | 19 +++++++++++++------
>>  1 file changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c
>> b/drivers/net/ethernet/intel/igc/igc_main.c
>> index 0e19b4d02e62..c58bf557a2a1 100644
>> --- a/drivers/net/ethernet/intel/igc/igc_main.c
>> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
>> @@ -6619,7 +6619,7 @@ static void igc_deliver_wake_packet(struct
>> net_device *netdev)
>>  	netif_rx(skb);
>>  }
>>
>> -static int __maybe_unused igc_resume(struct device *dev)
>> +static int __maybe_unused __igc_resume(struct device *dev, bool rpm)
>>  {
>>  	struct pci_dev *pdev = to_pci_dev(dev);
>>  	struct net_device *netdev = pci_get_drvdata(pdev);
>> @@ -6661,20 +6661,27 @@ static int __maybe_unused igc_resume(struct
>> device *dev)
>>
>>  	wr32(IGC_WUS, ~0);
>>
>> -	rtnl_lock();
>> +	if (!rpm)
>> +		rtnl_lock();
>>  	if (!err && netif_running(netdev))
>>  		err = __igc_open(netdev, true);
>>
>>  	if (!err)
>>  		netif_device_attach(netdev);
>> -	rtnl_unlock();
>> +	if (!rpm)
>> +		rtnl_unlock();
>>
>>  	return err;
>>  }
>>
>>  static int __maybe_unused igc_runtime_resume(struct device *dev)
>>  {
>> -	return igc_resume(dev);
>> +	return __igc_resume(dev, true);
>> +}
>> +
>> +static int __maybe_unused igc_resume(struct device *dev)
>> +{
>> +	return __igc_resume(dev, false);
>>  }
>>
>>  static int __maybe_unused igc_suspend(struct device *dev)
>> @@ -6738,7 +6745,7 @@ static pci_ers_result_t
>> igc_io_error_detected(struct pci_dev *pdev,
>>   *  @pdev: Pointer to PCI device
>>   *
>>   *  Restart the card from scratch, as if from a cold-boot.
>> Implementation
>> - *  resembles the first-half of the igc_resume routine.
>> + *  resembles the first-half of the __igc_resume routine.
>>   **/
>>  static pci_ers_result_t igc_io_slot_reset(struct pci_dev *pdev)
>>  {
>> @@ -6777,7 +6784,7 @@ static pci_ers_result_t
>> igc_io_slot_reset(struct pci_dev *pdev)
>>   *
>>   *  This callback is called when the error recovery driver tells us
>> that
>>   *  its OK to resume normal operation. Implementation resembles the
>> - *  second-half of the igc_resume routine.
>> + *  second-half of the __igc_resume routine.
>>   */
>>  static void igc_io_resume(struct pci_dev *pdev)
>>  {
>


Cheers,
-- 
Vinicius

  reply	other threads:[~2021-12-02 22:34 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-24  7:28 [REGRESSION] Kernel 5.15 reboots / freezes upon ifup/ifdown Stefan Dietrich
2021-11-24  7:33 ` Greg KH
2021-11-24  7:42   ` Stefan Dietrich
2021-11-24 17:20   ` Stefan Dietrich
2021-11-24 23:34     ` Jakub Kicinski
2021-11-24 23:34       ` [Intel-wired-lan] " Jakub Kicinski
2021-11-25  1:07       ` Vinicius Costa Gomes
2021-11-25  1:07         ` [Intel-wired-lan] " Vinicius Costa Gomes
2021-11-25  1:13         ` Jakub Kicinski
2021-11-25  1:13           ` [Intel-wired-lan] " Jakub Kicinski
2021-11-25  8:41         ` Stefan Dietrich
2021-11-25  8:41           ` [Intel-wired-lan] " Stefan Dietrich
2021-12-01 11:45           ` Thorsten Leemhuis
2021-12-01 11:45             ` [Intel-wired-lan] " Thorsten Leemhuis
2021-12-01 17:47             ` Vinicius Costa Gomes
2021-12-01 17:47               ` [Intel-wired-lan] " Vinicius Costa Gomes
2021-12-01 18:57               ` [PATCH] igc: Avoid possible deadlock during suspend/resume Vinicius Costa Gomes
2021-12-01 18:57                 ` [Intel-wired-lan] " Vinicius Costa Gomes
2021-12-02  6:41                 ` Greg KH
2021-12-02  6:41                   ` [Intel-wired-lan] " Greg KH
2021-12-02  6:50                   ` Vinicius Costa Gomes
2021-12-02  6:50                     ` [Intel-wired-lan] " Vinicius Costa Gomes
2021-12-02  8:34                 ` Stefan Dietrich
2021-12-02  8:34                   ` [Intel-wired-lan] " Stefan Dietrich
2021-12-02 22:34                   ` Vinicius Costa Gomes [this message]
2021-12-02 22:34                     ` Vinicius Costa Gomes
2021-12-10  9:40                     ` Thorsten Leemhuis
2021-12-10  9:40                       ` [Intel-wired-lan] " Thorsten Leemhuis
2021-12-10 13:45                       ` Stefan Dietrich
2021-12-10 13:45                         ` [Intel-wired-lan] " Stefan Dietrich
2021-12-10 14:01                         ` Thorsten Leemhuis
2021-12-10 14:01                           ` [Intel-wired-lan] " Thorsten Leemhuis
2021-12-10 14:51                           ` Stefan Dietrich
2021-12-10 14:51                             ` [Intel-wired-lan] " Stefan Dietrich
2021-12-11  0:41                             ` Vinicius Costa Gomes
2021-12-11  0:41                               ` [Intel-wired-lan] " Vinicius Costa Gomes
2021-12-11  9:50                               ` Stefan Dietrich
2021-12-11  9:50                                 ` [Intel-wired-lan] " Stefan Dietrich
2021-12-13 18:32                                 ` Vinicius Costa Gomes
2021-12-13 18:32                                   ` [Intel-wired-lan] " Vinicius Costa Gomes
2021-12-14  0:39                                   ` [PATCH net v1] igc: Do not enable crosstimestamping for i225-V models Vinicius Costa Gomes
2021-12-14  0:39                                     ` [Intel-wired-lan] " Vinicius Costa Gomes
2021-12-14  5:49                                     ` Thorsten Leemhuis
2021-12-14  5:49                                       ` [Intel-wired-lan] " Thorsten Leemhuis
2021-12-23  7:21                                       ` Thorsten Leemhuis
2021-12-23  7:21                                         ` [Intel-wired-lan] " Thorsten Leemhuis
2021-12-27 20:35                                     ` Kraus, NechamaX
2021-12-27 20:35                                       ` Kraus, NechamaX
2021-12-14  6:39                                   ` [PATCH] igc: Avoid possible deadlock during suspend/resume Stefan Dietrich
2021-12-14  6:39                                     ` [Intel-wired-lan] " Stefan Dietrich
2021-11-24  7:48 ` [REGRESSION] Kernel 5.15 reboots / freezes upon ifup/ifdown Thorsten Leemhuis
2021-11-25 11:15   ` Thorsten Leemhuis
2021-11-24  8:05 ` Stefan Dietrich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87o85yljpu.fsf@intel.com \
    --to=vinicius.gomes@intel.com \
    --cc=greg@kroah.com \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=regressions@lists.linux.dev \
    --cc=roots@gmx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.