linux-pm.vger.kernel.org archive mirror
* Re: [e1000e] e86e383f28: suspend-stress.fail
       [not found] ` <5A1631F8-259E-4897-BE52-0F5DB406E44F@canonical.com>
@ 2020-07-02 12:20   ` Zhang Rui
  2020-07-02 13:12     ` Kai-Heng Feng
  0 siblings, 1 reply; 7+ messages in thread
From: Zhang Rui @ 2020-07-02 12:20 UTC (permalink / raw)
  To: Kai-Heng Feng, moderated list:INTEL ETHERNET DRIVERS
  Cc: kernel test robot, lkp, Len Brown, Rafael J. Wysocki, Linux PM list

Hi, all,

This patch shipped in 5.8-rc1 as upstream commit 0c80cdbf3320, and we
have observed a big drop in suspend quality.

Previously, we ran into this "e1000e Hardware Error" issue only
occasionally. But now, on a NUC I have, system suspend-to-mem fails
within 10 suspend cycles in most cases, and then won't work again
until a reboot.
https://bugzilla.kernel.org/show_bug.cgi?id=205015

IMO, this is a regression, and we need to find a way to fix it.

thanks,
rui


On Sat, 2020-05-23 at 20:20 +0800, Kai-Heng Feng wrote:
> [+Cc intel-wired-lan]
> 
> > On May 21, 2020, at 13:27, kernel test robot <rong.a.chen@intel.com> wrote:
> > 
> > Greeting,
> > 
> > FYI, we noticed the following commit (built with gcc-7):
> > 
> > commit: e86e383f2854234129c66e90f84ac2c74b2b1828 ("e1000e: Warn if disabling ULP failed")
> > https://git.kernel.org/cgit/linux/kernel/git/jkirsher/next-queue.git dev-queue
> 
> kern  :warn  : [  240.884667] e1000e 0000:00:19.0 eth0: Failed to disable ULP
> kern  :info  : [  241.896122] asix 2-3:1.0 eth1: link up, 100Mbps, full-duplex, lpa 0xC1E1
> kern  :err   : [  242.269348] e1000e 0000:00:19.0 eth0: Hardware Error
> kern  :info  : [  242.772702] e1000e 0000:00:19.0: pci_pm_resume+0x0/0x80 returned 0 after 2985422 usecs
> 
> So the patch does catch issues previously ignored.
> 
> I wonder what's the next move, maybe increase the ULP timeout again?
> 
> Kai-Heng
> 
> > in testcase: suspend-stress
> > with following parameters:
> > 
> > 	mode: mem
> > 	iterations: 10
> > 
> > 
> > 
> > on test machine: 4 threads Broadwell with 8G memory
> > 
> > caused below changes (please refer to attached dmesg/kmsg for
> > entire log/backtrace):
> > 
> > 
> > 
> > 
> > If you fix the issue, kindly add following tag
> > Reported-by: kernel test robot <rong.a.chen@intel.com>
> > 
> > SUSPEND RESUME TEST STARTED
> > Suspend to mem 1/10:
> > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 
> > http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-1/10
> >  -O /dev/null
> > Done
> > Sleep for 10 seconds
> > Suspend to mem 2/10:
> > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 
> > http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-2/10
> >  -O /dev/null
> > Done
> > Sleep for 10 seconds
> > Suspend to mem 3/10:
> > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 
> > http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-3/10
> >  -O /dev/null
> > Done
> > Sleep for 10 seconds
> > Suspend to mem 4/10:
> > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 
> > http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-4/10
> >  -O /dev/null
> > Done
> > Sleep for 10 seconds
> > Suspend to mem 5/10:
> > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 
> > http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-5/10
> >  -O /dev/null
> > Done
> > Sleep for 10 seconds
> > Suspend to mem 6/10:
> > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 
> > http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-6/10
> >  -O /dev/null
> > Failed
> > 
> > 
> > 
> > To reproduce:
> > 
> >        git clone https://github.com/intel/lkp-tests.git
> >        cd lkp-tests
> >        bin/lkp install job.yaml  # job file is attached in this email
> >        bin/lkp run     job.yaml
> > 
> > 
> > 
> > Thanks,
> > Rong Chen
> > 
> > <config-5.7.0-rc4-01618-ge86e383f28542><job-script.txt><kmsg.xz><suspend-stress.txt><job.yaml>
> 
> 
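
The kmsg excerpt quoted above can also be checked mechanically. Below is
a minimal Python sketch, assuming log lines in the
"kern  :level : [ timestamp] message" layout shown in the report. The
names scan_kmsg, KMSG_RE and RESUME_RE are illustrative only and are not
part of lkp-tests or the e1000e driver.

```python
import re

# Matches lines such as:
#   kern  :warn  : [  240.884667] e1000e 0000:00:19.0 eth0: Failed to disable ULP
KMSG_RE = re.compile(
    r"^kern\s*:(?P<level>\w+)\s*:\s*\[\s*(?P<ts>\d+\.\d+)\]\s*(?P<msg>.*)$"
)
# Matches the PM core timing message for the PCI resume callback.
RESUME_RE = re.compile(
    r"pci_pm_resume\S* returned (?P<ret>-?\d+) after (?P<usecs>\d+) usecs"
)


def scan_kmsg(lines):
    """Collect e1000e ULP failures, hardware errors and resume timings."""
    report = {"ulp_failures": [], "hardware_errors": [], "resume_usecs": []}
    for line in lines:
        m = KMSG_RE.match(line.strip())
        # Skip unparsable lines and messages from other drivers (e.g. asix).
        if not m or "e1000e" not in m["msg"]:
            continue
        ts, msg = float(m["ts"]), m["msg"]
        if "Failed to disable ULP" in msg:
            report["ulp_failures"].append(ts)
        elif "Hardware Error" in msg:
            report["hardware_errors"].append(ts)
        else:
            r = RESUME_RE.search(msg)
            if r:
                report["resume_usecs"].append(int(r["usecs"]))
    return report


# Sample input: the four log lines from the report, unwrapped.
SAMPLE = """\
kern  :warn  : [  240.884667] e1000e 0000:00:19.0 eth0: Failed to disable ULP
kern  :info  : [  241.896122] asix 2-3:1.0 eth1: link up, 100Mbps, full-duplex, lpa 0xC1E1
kern  :err   : [  242.269348] e1000e 0000:00:19.0 eth0: Hardware Error
kern  :info  : [  242.772702] e1000e 0000:00:19.0: pci_pm_resume+0x0/0x80 returned 0 after 2985422 usecs
""".splitlines()

if __name__ == "__main__":
    print(scan_kmsg(SAMPLE))
```

On the sample, the sketch records one ULP failure, one hardware error,
and a resume callback time of 2985422 usecs (about 3 seconds) for the
e1000e device.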



* Re: [e1000e] e86e383f28: suspend-stress.fail
  2020-07-02 12:20   ` [e1000e] e86e383f28: suspend-stress.fail Zhang Rui
@ 2020-07-02 13:12     ` Kai-Heng Feng
  2020-07-02 14:10       ` Zhang Rui
  2020-07-09  3:28       ` Zhang Rui
  0 siblings, 2 replies; 7+ messages in thread
From: Kai-Heng Feng @ 2020-07-02 13:12 UTC (permalink / raw)
  To: Zhang Rui
  Cc: moderated list:INTEL ETHERNET DRIVERS, kernel test robot, lkp,
	Len Brown, Rafael J. Wysocki, Linux PM list



> On Jul 2, 2020, at 20:20, Zhang Rui <rui.zhang@intel.com> wrote:
> 
> Hi, all,
> 
> This patch shipped in 5.8-rc1 as upstream commit 0c80cdbf3320, and we
> have observed a big drop in suspend quality.
> 
> Previously, we ran into this "e1000e Hardware Error" issue only
> occasionally. But now, on a NUC I have, system suspend-to-mem fails
> within 10 suspend cycles in most cases, and then won't work again
> until a reboot.
> https://bugzilla.kernel.org/show_bug.cgi?id=205015
> 
> IMO, this is a regression, and we need to find a way to fix it.

Should be fixed by https://lore.kernel.org/lkml/20200618065453.12140-1-aaron.ma@canonical.com/

Kai-Heng




* Re: [e1000e] e86e383f28: suspend-stress.fail
  2020-07-02 13:12     ` Kai-Heng Feng
@ 2020-07-02 14:10       ` Zhang Rui
  2020-07-02 15:09         ` [Intel-wired-lan] " Neftin, Sasha
  2020-07-07 12:50         ` Zhang Rui
  2020-07-09  3:28       ` Zhang Rui
  1 sibling, 2 replies; 7+ messages in thread
From: Zhang Rui @ 2020-07-02 14:10 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: moderated list:INTEL ETHERNET DRIVERS, kernel test robot, lkp,
	Len Brown, Rafael J. Wysocki, Linux PM list

On Thu, 2020-07-02 at 21:12 +0800, Kai-Heng Feng wrote:
> > On Jul 2, 2020, at 20:20, Zhang Rui <rui.zhang@intel.com> wrote:
> > 
> > Hi, all,
> > 
> > This patch shipped in 5.8-rc1 as upstream commit 0c80cdbf3320, and
> > we have observed a big drop in suspend quality.
> > 
> > Previously, we ran into this "e1000e Hardware Error" issue only
> > occasionally. But now, on a NUC I have, system suspend-to-mem fails
> > within 10 suspend cycles in most cases, and then won't work again
> > until a reboot.
> > https://bugzilla.kernel.org/show_bug.cgi?id=205015
> > 
> > IMO, this is a regression, and we need to find a way to fix it.
> 
> Should be fixed by 
> https://lore.kernel.org/lkml/20200618065453.12140-1-aaron.ma@canonical.com/
> 

Great, I will give it a try and update later.

thanks,
rui
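
For re-testing a patched kernel, the shape of the suspend-stress run
from the report (mode mem, 10 iterations, 10 seconds of sleep between
cycles, stop at the first failure) can be sketched as below. This is not
the lkp-tests implementation. The function names are illustrative, and
the default suspend step assumes the util-linux rtcwake tool, root
privileges, and that a failed suspend shows up as a non-zero exit
status.

```python
import subprocess
import time


def rtcwake_suspend(mode="mem", wake_after_s=30):
    """Suspend with util-linux rtcwake and report whether it exited 0.

    Assumptions: rtcwake is installed, the caller is root, and a failed
    suspend is reflected in a non-zero exit status.
    """
    result = subprocess.run(["rtcwake", "-m", mode, "-s", str(wake_after_s)])
    return result.returncode == 0


def suspend_stress(iterations=10, pause_s=10,
                   suspend=rtcwake_suspend, sleep=time.sleep):
    """Run suspend cycles; return the first failing cycle number, or None."""
    for i in range(1, iterations + 1):
        print(f"Suspend to mem {i}/{iterations}:")
        if not suspend():
            print("Failed")
            return i
        print("Done")
        if i < iterations:
            print(f"Sleep for {pause_s} seconds")
            sleep(pause_s)
    return None
```

Passing a stub for suspend and sleep allows a dry run without suspending
the machine, for example
suspend_stress(suspend=lambda: True, sleep=lambda s: None).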




* Re: [Intel-wired-lan] [e1000e] e86e383f28: suspend-stress.fail
  2020-07-02 14:10       ` Zhang Rui
@ 2020-07-02 15:09         ` Neftin, Sasha
  2020-07-02 15:13           ` Zhang Rui
  2020-07-07 12:50         ` Zhang Rui
  1 sibling, 1 reply; 7+ messages in thread
From: Neftin, Sasha @ 2020-07-02 15:09 UTC (permalink / raw)
  To: Zhang Rui, Kai-Heng Feng
  Cc: Len Brown, Linux PM list, Rafael J. Wysocki, lkp,
	moderated list:INTEL ETHERNET DRIVERS, kernel test robot,
	Lifshits, Vitaly, mickey.elya, Avivi, Amir, Neftin, Sasha

On 7/2/2020 17:10, Zhang Rui wrote:
> On Thu, 2020-07-02 at 21:12 +0800, Kai-Heng Feng wrote:
>>> On Jul 2, 2020, at 20:20, Zhang Rui <rui.zhang@intel.com> wrote:
>>>
>>> Hi, all,
>>>
>>> This patch shipped in 5.8-rc1 as upstream commit 0c80cdbf3320, and
>>> we have observed a big drop in suspend quality.
>>>
>>> Previously, we ran into this "e1000e Hardware Error" issue only
>>> occasionally. But now, on a NUC I have, system suspend-to-mem fails
>>> within 10 suspend cycles in most cases, and then won't work again
>>> until a reboot.
>>> https://bugzilla.kernel.org/show_bug.cgi?id=205015
>>>
>>> IMO, this is a regression, and we need to find a way to fix it.
>>
>> Should be fixed by
>> https://lore.kernel.org/lkml/20200618065453.12140-1-aaron.ma@canonical.com/
>>
> 
> Great, I will give it a try and update later.
Rui,
Does ME/CSME AMT run on your machine?
Thanks,
sasha



* Re: [Intel-wired-lan] [e1000e] e86e383f28: suspend-stress.fail
  2020-07-02 15:09         ` [Intel-wired-lan] " Neftin, Sasha
@ 2020-07-02 15:13           ` Zhang Rui
  0 siblings, 0 replies; 7+ messages in thread
From: Zhang Rui @ 2020-07-02 15:13 UTC (permalink / raw)
  To: Neftin, Sasha, Kai-Heng Feng
  Cc: Len Brown, Linux PM list, Rafael J. Wysocki, lkp,
	moderated list:INTEL ETHERNET DRIVERS, kernel test robot,
	Lifshits, Vitaly, mickey.elya, Avivi, Amir

On Thu, 2020-07-02 at 18:09 +0300, Neftin, Sasha wrote:
> On 7/2/2020 17:10, Zhang Rui wrote:
> > On Thu, 2020-07-02 at 21:12 +0800, Kai-Heng Feng wrote:
> > > > On Jul 2, 2020, at 20:20, Zhang Rui <rui.zhang@intel.com> wrote:
> > > > 
> > > > Hi, all,
> > > > 
> > > > This patch shipped in 5.8-rc1 as upstream commit 0c80cdbf3320,
> > > > and we have observed a big drop in suspend quality.
> > > > 
> > > > Previously, we ran into this "e1000e Hardware Error" issue only
> > > > occasionally. But now, on a NUC I have, system suspend-to-mem
> > > > fails within 10 suspend cycles in most cases, and then won't
> > > > work again until a reboot.
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=205015
> > > > 
> > > > IMO, this is a regression, and we need to find a way to fix it.
> > > 
> > > Should be fixed by
> > > https://lore.kernel.org/lkml/20200618065453.12140-1-aaron.ma@canonical.com/
> > > 
> > 
> > Great, I will give it a try and update later.
> 
> Rui,
> Does ME/CSME AMT run on your machine?

I'm not sure. Need to check this tomorrow.

thanks,
rui



* Re: [e1000e] e86e383f28: suspend-stress.fail
  2020-07-02 14:10       ` Zhang Rui
  2020-07-02 15:09         ` [Intel-wired-lan] " Neftin, Sasha
@ 2020-07-07 12:50         ` Zhang Rui
  1 sibling, 0 replies; 7+ messages in thread
From: Zhang Rui @ 2020-07-07 12:50 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: moderated list:INTEL ETHERNET DRIVERS, kernel test robot, lkp,
	Len Brown, Rafael J. Wysocki, Linux PM list

On Thu, 2020-07-02 at 22:10 +0800, Zhang Rui wrote:
> On Thu, 2020-07-02 at 21:12 +0800, Kai-Heng Feng wrote:
> > > On Jul 2, 2020, at 20:20, Zhang Rui <rui.zhang@intel.com> wrote:
> > > 
> > > Hi, all,
> > > 
> > > This patch shipped in 5.8-rc1 as upstream commit 0c80cdbf3320,
> > > and we have observed a big drop in suspend quality.
> > > 
> > > Previously, we ran into this "e1000e Hardware Error" issue only
> > > occasionally. But now, on a NUC I have, system suspend-to-mem
> > > fails within 10 suspend cycles in most cases, and then won't work
> > > again until a reboot.
> > > https://bugzilla.kernel.org/show_bug.cgi?id=205015
> > > 
> > > IMO, this is a regression, and we need to find a way to fix it.
> > 
> > Should be fixed by
> > https://lore.kernel.org/lkml/20200618065453.12140-1-aaron.ma@canonical.com/
> > 
> 
> Great, I will give it a try and update later.

The test box was busy with other work, so I haven't had a chance to try
the patched kernel yet. Will update tomorrow.

thanks,
rui
> 
> thanks,
> rui
> 
> > Kai-Heng
> > 
> > > 
> > > thanks,
> > > rui
> > > 
> > > 
> > > On Sat, 2020-05-23 at 20:20 +0800, Kai-Heng Feng wrote:
> > > > [+Cc intel-wired-lan]
> > > > 
> > > > > On May 21, 2020, at 13:27, kernel test robot <
> > > > > rong.a.chen@intel.com
> > > > > > wrote:
> > > > > 
> > > > > Greeting,
> > > > > 
> > > > > FYI, we noticed the following commit (built with gcc-7):
> > > > > 
> > > > > commit: e86e383f2854234129c66e90f84ac2c74b2b1828 ("e1000e:
> > > > > Warn
> > > > > if
> > > > > disabling ULP failed")
> > > > > 
> > > 
> > > 
> 
> https://git.kernel.org/cgit/linux/kernel/git/jkirsher/next-queue.git
> > > > > dev-queue
> > > > 
> > > > > kern  :warn  : [  240.884667] e1000e 0000:00:19.0 eth0: Failed to disable ULP
> > > > > kern  :info  : [  241.896122] asix 2-3:1.0 eth1: link up, 100Mbps, full-duplex, lpa 0xC1E1
> > > > > kern  :err   : [  242.269348] e1000e 0000:00:19.0 eth0: Hardware Error
> > > > > kern  :info  : [  242.772702] e1000e 0000:00:19.0: pci_pm_resume+0x0/0x80 returned 0 after 2985422 usecs
> > > > 
> > > > So the patch does catch issues previously ignored.
> > > > 
> > > > I wonder what's the next move, maybe increase the ULP timeout
> > > > again?
> > > > 
> > > > Kai-Heng
> > > > 
> > > > > in testcase: suspend-stress
> > > > > with following parameters:
> > > > > 
> > > > > 	mode: mem
> > > > > 	iterations: 10
> > > > > 
> > > > > 
> > > > > 
> > > > > on test machine: 4 threads Broadwell with 8G memory
> > > > > 
> > > > > caused below changes (please refer to attached dmesg/kmsg for
> > > > > entire log/backtrace):
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > If you fix the issue, kindly add following tag
> > > > > Reported-by: kernel test robot <rong.a.chen@intel.com>
> > > > > 
> > > > > > SUSPEND RESUME TEST STARTED
> > > > > > Suspend to mem 1/10:
> > > > > > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-1/10 -O /dev/null
> > > > > > Done
> > > > > > Sleep for 10 seconds
> > > > > > Suspend to mem 2/10:
> > > > > > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-2/10 -O /dev/null
> > > > > > Done
> > > > > > Sleep for 10 seconds
> > > > > > Suspend to mem 3/10:
> > > > > > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-3/10 -O /dev/null
> > > > > > Done
> > > > > > Sleep for 10 seconds
> > > > > > Suspend to mem 4/10:
> > > > > > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-4/10 -O /dev/null
> > > > > > Done
> > > > > > Sleep for 10 seconds
> > > > > > Suspend to mem 5/10:
> > > > > > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-5/10 -O /dev/null
> > > > > > Done
> > > > > > Sleep for 10 seconds
> > > > > > Suspend to mem 6/10:
> > > > > > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-6/10 -O /dev/null
> > > > > > Failed
> > > > > 
> > > > > 
> > > > > 
> > > > > To reproduce:
> > > > > 
> > > > >       git clone https://github.com/intel/lkp-tests.git
> > > > >       cd lkp-tests
> > > > > >       bin/lkp install job.yaml  # job file is attached in this email
> > > > >       bin/lkp run     job.yaml
> > > > > 
> > > > > 
> > > > > 
> > > > > Thanks,
> > > > > Rong Chen
> > > > > 
> > > > > > <config-5.7.0-rc4-01618-ge86e383f28542><job-script.txt><kmsg.xz><suspend-stress.txt><job.yaml>
> > > > 
> > > > 
> > 
> > 


* Re: [e1000e] e86e383f28: suspend-stress.fail
  2020-07-02 13:12     ` Kai-Heng Feng
  2020-07-02 14:10       ` Zhang Rui
@ 2020-07-09  3:28       ` Zhang Rui
  1 sibling, 0 replies; 7+ messages in thread
From: Zhang Rui @ 2020-07-09  3:28 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: moderated list:INTEL ETHERNET DRIVERS, kernel test robot, lkp,
	Len Brown, Rafael J. Wysocki, Linux PM list

On Thu, 2020-07-02 at 21:12 +0800, Kai-Heng Feng wrote:
> > On Jul 2, 2020, at 20:20, Zhang Rui <rui.zhang@intel.com> wrote:
> > 
> > Hi, all,
> > 
> > This patch has been shipped in 5.8-rc1 with its upstream commit id
> > 0c80cdbf3320. And we observed big drop of suspend quality.
> > 
> > Previously, we have run into this "e1000e Hardware Error" issue,
> > occasionally. But now, on a NUC I have, system suspend-to-mem fails
> > within 10 suspend  cycles in most cases, but won't work again until
> > a reboot.
> > https://bugzilla.kernel.org/show_bug.cgi?id=205015
> > 
> > IMO, this is a regression, and we need to find a way to fix it.
> 
> Should be fixed by 
> https://lore.kernel.org/lkml/20200618065453.12140-1-aaron.ma@canonical.com/

With the patch applied on top of a clean 5.8-rc3, suspend/resume always
succeeds, although I still see "Failed to disable ULP" in dmesg on almost
half of the resumes.

thanks,
rui

> 
> Kai-Heng
> 
> > 
> > thanks,
> > rui
> > 
> > 
> > On Sat, 2020-05-23 at 20:20 +0800, Kai-Heng Feng wrote:
> > > [+Cc intel-wired-lan]
> > > 
> > > > On May 21, 2020, at 13:27, kernel test robot
> > > > <rong.a.chen@intel.com> wrote:
> > > > 
> > > > Greeting,
> > > > 
> > > > FYI, we noticed the following commit (built with gcc-7):
> > > > 
> > > > commit: e86e383f2854234129c66e90f84ac2c74b2b1828 ("e1000e: Warn
> > > > if disabling ULP failed")
> > > > https://git.kernel.org/cgit/linux/kernel/git/jkirsher/next-queue.git
> > > > dev-queue
> > > 
> > > kern  :warn  : [  240.884667] e1000e 0000:00:19.0 eth0: Failed to disable ULP
> > > kern  :info  : [  241.896122] asix 2-3:1.0 eth1: link up, 100Mbps, full-duplex, lpa 0xC1E1
> > > kern  :err   : [  242.269348] e1000e 0000:00:19.0 eth0: Hardware Error
> > > kern  :info  : [  242.772702] e1000e 0000:00:19.0: pci_pm_resume+0x0/0x80 returned 0 after 2985422 usecs
> > > 
> > > So the patch does catch issues previously ignored.
> > > 
> > > I wonder what's the next move, maybe increase the ULP timeout
> > > again?
> > > 
> > > Kai-Heng
> > > 
> > > > in testcase: suspend-stress
> > > > with following parameters:
> > > > 
> > > > 	mode: mem
> > > > 	iterations: 10
> > > > 
> > > > 
> > > > 
> > > > on test machine: 4 threads Broadwell with 8G memory
> > > > 
> > > > caused below changes (please refer to attached dmesg/kmsg for
> > > > entire log/backtrace):
> > > > 
> > > > 
> > > > 
> > > > 
> > > > If you fix the issue, kindly add following tag
> > > > Reported-by: kernel test robot <rong.a.chen@intel.com>
> > > > 
> > > > SUSPEND RESUME TEST STARTED
> > > > Suspend to mem 1/10:
> > > > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-1/10 -O /dev/null
> > > > Done
> > > > Sleep for 10 seconds
> > > > Suspend to mem 2/10:
> > > > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-2/10 -O /dev/null
> > > > Done
> > > > Sleep for 10 seconds
> > > > Suspend to mem 3/10:
> > > > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-3/10 -O /dev/null
> > > > Done
> > > > Sleep for 10 seconds
> > > > Suspend to mem 4/10:
> > > > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-4/10 -O /dev/null
> > > > Done
> > > > Sleep for 10 seconds
> > > > Suspend to mem 5/10:
> > > > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-5/10 -O /dev/null
> > > > Done
> > > > Sleep for 10 seconds
> > > > Suspend to mem 6/10:
> > > > /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-bdw-nuc1/suspend-stress-10-mem-debian-x86_64-20180403.cgz-e86e383f2854234129c66e90f84ac2c74b2b1828-20200517-66267-13fgkna-8.yaml&job_state=suspending-6/10 -O /dev/null
> > > > Failed
> > > > 
> > > > 
> > > > 
> > > > To reproduce:
> > > > 
> > > >       git clone https://github.com/intel/lkp-tests.git
> > > >       cd lkp-tests
> > > > >       bin/lkp install job.yaml  # job file is attached in this email
> > > >       bin/lkp run     job.yaml
> > > > 
> > > > 
> > > > 
> > > > Thanks,
> > > > Rong Chen
> > > > 
> > > > > <config-5.7.0-rc4-01618-ge86e383f28542><job-script.txt><kmsg.xz><suspend-stress.txt><job.yaml>
> > > 
> > > 
> 
> 


Thread overview: 7+ messages
     [not found] <20200521052753.GB12456@shao2-debian>
     [not found] ` <5A1631F8-259E-4897-BE52-0F5DB406E44F@canonical.com>
2020-07-02 12:20   ` [e1000e] e86e383f28: suspend-stress.fail Zhang Rui
2020-07-02 13:12     ` Kai-Heng Feng
2020-07-02 14:10       ` Zhang Rui
2020-07-02 15:09         ` [Intel-wired-lan] " Neftin, Sasha
2020-07-02 15:13           ` Zhang Rui
2020-07-07 12:50         ` Zhang Rui
2020-07-09  3:28       ` Zhang Rui