* Fwd: Red Hat (Fedora) bug report 1467674 concerning your kernel functional performance enhancements causing PCI Express crashes,
       [not found] <20170704161352.1cdb2670.wim.ten.have@oracle.com>
@ 2017-07-04 15:32 ` Bjorn Helgaas
  2017-07-04 15:57   ` Sinan Kaya
  0 siblings, 1 reply; 11+ messages in thread
From: Bjorn Helgaas @ 2017-07-04 15:32 UTC
  To: linux-pci; +Cc: Sinan Kaya, Wim ten Have

[+cc linux-pci]

Thanks very much for the detailed problem report, Wim!  I'm taking the
liberty to forward to the linux-pci list in case others trip over the
same thing.


---------- Forwarded message ----------
From: Wim ten Have <wim.ten.have@oracle.com>
Date: Tue, Jul 4, 2017 at 9:13 AM
Subject: Red Hat (Fedora) bug report 1467674 concerning your kernel
functional performance enhancements causing PCI Express crashes,
To: Sinan Kaya <okaya@codeaurora.org>, Bjorn Helgaas <bhelgaas@google.com>
Cc: Wim ten Have <wim.ten.have@oracle.com>


        Howdy,

I created Red Hat (Fedora) bug report 1467674.
        https://bugzilla.redhat.com/show_bug.cgi?id=1467674

This may be of interest to you, given that you were involved in creating
the code that causes a kernel oops/malfunction, introduced in:

commit 60db3a4d8cc9073cf56264785197ba75ee1caca4
Author: Sinan Kaya <okaya@codeaurora.org>
Date:   Fri Jan 20 09:16:51 2017 -0500

    PCI: Enable PCIe Extended Tags if supported

    Every PCIe device can generate 5-bit transaction Tags, which allow up to 32
    concurrent requests.  Some devices can generate 8-bit Extended Tags, which
    allow up to 256 concurrent requests.

    Per the ECN mentioned below, all PCIe Receivers are expected to support
    Extended Tags, so devices are allowed (but not required) to enable them by
    default.

    If a device supports Extended Tags but does not enable them by default,
    enable them.  This allows the device to have up to 256 outstanding
    transactions at a time, which may improve performance.

    [bhelgaas: changelog, check for PCIe device]
    Link: https://pcisig.com/sites/default/files/specification_documents/ECN_Extended_Tag_Enable_Default_05Sept2008_final.pdf
    Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
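
The arithmetic and the enable rule described in the commit message above can be sketched in plain userspace C. This is only an illustration, not the kernel code: the function names `tag_count` and `wants_ext_tag_enable` are made up here, and the two masks are copied from what I understand the kernel's pci_regs.h to define as PCI_EXP_DEVCAP_EXT_TAG and PCI_EXP_DEVCTL_EXT_TAG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror PCI_EXP_DEVCAP_EXT_TAG / PCI_EXP_DEVCTL_EXT_TAG. */
#define DEVCAP_EXT_TAG 0x00000020u	/* Device Capabilities: Extended Tag Field Supported */
#define DEVCTL_EXT_TAG 0x0100u		/* Device Control: Extended Tag Field Enable */

/* Number of distinct tag values for a tag field of the given width. */
uint32_t tag_count(unsigned int tag_bits)
{
	assert(tag_bits < 32);	/* keep the shift well defined */
	return UINT32_C(1) << tag_bits;
}

/* True when a device advertises Extended Tags but has not enabled them. */
bool wants_ext_tag_enable(uint32_t devcap, uint16_t devctl)
{
	return (devcap & DEVCAP_EXT_TAG) && !(devctl & DEVCTL_EXT_TAG);
}
```

With a 5-bit field `tag_count` gives 32 and with an 8-bit field 256, matching the figures in the changelog.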


>>>>>>>>>>>> REPORT <<<<<<<<<<<<<<<<
Wim ten Have 2017-07-04 10:06:19 EDT

Description of problem:
=======================
Systems with an "eth0: Tigon3 [partno(BCM95721) rev 4201] (PCI Express)"
Ethernet controller, such as the Dell PowerEdge SC1435, lose their
Ethernet after the interface is bound and brought up (ifconfig up).
  [    0.000000] SMBIOS 2.4 present.
  [    0.000000] DMI: Dell Inc. PowerEdge SC1435/0H313M, BIOS 2.2.5 03/21/2008


The problem is not specific to this piece of h/w.  I did pin-point the
issue to specific kernel code commit
60db3a4d8cc9073cf56264785197ba75ee1caca4
  * <wtenhave@hagen:55> git bisect good
    60db3a4d8cc9073cf56264785197ba75ee1caca4 is the first bad commit
    commit 60db3a4d8cc9073cf56264785197ba75ee1caca4
    Author: Sinan Kaya <okaya@codeaurora.org>
    Date:   Fri Jan 20 09:16:51 2017 -0500

      PCI: Enable PCIe Extended Tags if supported

Shortly after reaching the ifconfig up step, the system reports the
kernel messages below.
  Jul  4 15:00:12 hagen kernel: tg3 0000:01:00.0 eth0: Tigon3 [partno(BCM95721) rev 4201] (PCI Express) MAC address 00:22:19:27:cd:f8
  Jul  4 15:00:12 hagen kernel: tg3 0000:01:00.0 eth0: attached PHY is 5750 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
  Jul  4 15:00:12 hagen kernel: tg3 0000:01:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
  Jul  4 15:00:12 hagen kernel: tg3 0000:01:00.0 eth0: dma_rwctrl[76180000] dma_mask[64-bit]
  Jul  4 15:00:12 hagen kernel: tg3 0000:02:00.0 eth1: Tigon3 [partno(BCM95721) rev 4201] (PCI Express) MAC address 00:22:19:27:cd:f9
  Jul  4 15:00:12 hagen kernel: tg3 0000:02:00.0 eth1: attached PHY is 5750 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
  Jul  4 15:00:12 hagen kernel: tg3 0000:02:00.0 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
  Jul  4 15:00:12 hagen kernel: tg3 0000:02:00.0 eth1: dma_rwctrl[76180000] dma_mask[64-bit]
     ...
  Jul  4 15:00:12 hagen kernel: tg3 0000:02:00.0 enp2s0: renamed from eth1
     ...
  Jul  4 15:00:39 hagen kernel: tg3 0000:01:00.0 enp1s0: Link is up at 1000 Mbps, full duplex
  Jul  4 15:00:39 hagen kernel: tg3 0000:01:00.0 enp1s0: Flow control is on for TX and on for RX
     ...
  Jul  4 15:00:50 hagen kernel: NETDEV WATCHDOG: enp1s0 (tg3): transmit queue 0 timed out
  Jul  4 15:00:50 hagen kernel: ------------[ cut here ]------------
  Jul  4 15:00:50 hagen kernel: WARNING: CPU: 6 PID: 0 at net/sched/sch_generic.c:316 dev_watchdog+0x215/0x220
  Jul  4 15:00:50 hagen kernel: Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_raw ip6table_security ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle iptable_raw iptable_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle ebtable_filter ebtables ip6table_filter ip6_tables sunrpc xfs libcrc32c amd64_edac_mod edac_mce_amd kvm_amd kvm dcdbas irqbypass ipmi_ssif acpi_cpufreq shpchp ipmi_si ipmi_devintf tpm_tis pcspkr tpm_tis_core k10temp i2c_piix4 ipmi_msghandler tpm target_core_mod amdkfd amd_iommu_v2 radeon i2c_algo_bit drm_kms_helper ttm drm tg3 ptp ata_generic serio_raw pata_acpi pps_core pata_serverworks sata_svw
  Jul  4 15:00:50 hagen kernel: CPU: 6 PID: 0 Comm: swapper/6 Not tainted 4.12.0broken+ #16
  Jul  4 15:00:50 hagen kernel: Hardware name: Dell Inc. PowerEdge SC1435/0H313M, BIOS 2.2.5 03/21/2008
  Jul  4 15:00:50 hagen kernel: task: ffff90a9ede5c980 task.stack: ffffb123031b4000
  Jul  4 15:00:50 hagen kernel: RIP: 0010:dev_watchdog+0x215/0x220
  Jul  4 15:00:50 hagen kernel: RSP: 0018:ffff90a9efd83e60 EFLAGS: 00010286
  Jul  4 15:00:50 hagen kernel: RAX: 0000000000000039 RBX: 0000000000000000 RCX: 0000000000000000
  Jul  4 15:00:50 hagen kernel: RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 0000000000000300
  Jul  4 15:00:50 hagen kernel: RBP: ffff90a9efd83e80 R08: 0000000000000000 R09: 0000000000000346
  Jul  4 15:00:50 hagen kernel: R10: ffff90a9efd92430 R11: 000000000000000f R12: ffff90a9ece46000
  Jul  4 15:00:50 hagen kernel: R13: 0000000000000006 R14: 0000000000000005 R15: ffff90a9ece46000
  Jul  4 15:00:50 hagen kernel: FS:  0000000000000000(0000) GS:ffff90a9efd80000(0000) knlGS:0000000000000000
  Jul  4 15:00:50 hagen kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  Jul  4 15:00:50 hagen kernel: CR2: 000055d4fa41f038 CR3: 00000004c3e09000 CR4: 00000000000006e0
  Jul  4 15:00:50 hagen kernel: Call Trace:
  Jul  4 15:00:50 hagen kernel: <IRQ>
  Jul  4 15:00:50 hagen kernel: ? qdisc_rcu_free+0x50/0x50
  Jul  4 15:00:50 hagen kernel: call_timer_fn+0x35/0x130
  Jul  4 15:00:50 hagen kernel: run_timer_softirq+0x1d1/0x420
  Jul  4 15:00:50 hagen kernel: ? sched_clock+0x9/0x10
  Jul  4 15:00:50 hagen kernel: ? sched_clock+0x9/0x10
  Jul  4 15:00:50 hagen kernel: ? sched_clock_cpu+0x11/0xb0
  Jul  4 15:00:50 hagen kernel: __do_softirq+0x10c/0x2a5
  Jul  4 15:00:50 hagen kernel: irq_exit+0xff/0x110
  Jul  4 15:00:50 hagen kernel: smp_apic_timer_interrupt+0x3d/0x50
  Jul  4 15:00:50 hagen kernel: apic_timer_interrupt+0x93/0xa0
  Jul  4 15:00:50 hagen kernel: RIP: 0010:native_safe_halt+0x6/0x10
  Jul  4 15:00:50 hagen kernel: RSP: 0018:ffffb123031b7e60 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
  Jul  4 15:00:50 hagen kernel: RAX: 6874754100002548 RBX: ffff90a9ede5c980 RCX: 0000000000000000
  Jul  4 15:00:50 hagen kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
  Jul  4 15:00:50 hagen kernel: RBP: ffffb123031b7e60 R08: 00000009c80ff365 R09: ffffb1230837fa38
  Jul  4 15:00:50 hagen kernel: R10: 0000000000000000 R11: 00000000fffc0fe9 R12: 0000000000000006
  Jul  4 15:00:50 hagen kernel: R13: ffff90a9ede5c980 R14: 0000000000000000 R15: 0000000000000000
  Jul  4 15:00:50 hagen kernel: </IRQ>
  Jul  4 15:00:50 hagen kernel: default_idle+0x20/0x100
  Jul  4 15:00:50 hagen kernel: amd_e400_idle+0x3f/0x50
  Jul  4 15:00:50 hagen kernel: arch_cpu_idle+0xf/0x20
  Jul  4 15:00:50 hagen kernel: default_idle_call+0x23/0x30
  Jul  4 15:00:50 hagen kernel: do_idle+0x174/0x1e0
  Jul  4 15:00:50 hagen kernel: cpu_startup_entry+0x71/0x80
  Jul  4 15:00:50 hagen kernel: start_secondary+0x154/0x190
  Jul  4 15:00:50 hagen kernel: secondary_startup_64+0x9f/0x9f
  Jul  4 15:00:50 hagen kernel: Code: 8c 24 64 04 00 00 eb 8f 4c 89 e7 c6 05 ef 0e 88 00 01 e8 4f 6b fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 60 a0 d1 86 e8 12 cc a5 ff <0f> ff eb c1 0f 1f 80 00 00 00 00 0f 1f 44 00 00 48 c7 47 08 00
  Jul  4 15:00:50 hagen kernel: ---[ end trace 6fdc4540cb931145 ]---
  Jul  4 15:00:50 hagen kernel: tg3 0000:01:00.0 enp1s0: transmit timed out, resetting
  Jul  4 15:00:52 hagen abrt-dump-journal-oops: abrt-dump-journal-oops: Found oopses: 1
  Jul  4 15:00:52 hagen abrt-dump-journal-oops: abrt-dump-journal-oops: Creating problem directories
  Jul  4 15:00:52 hagen kernel: tg3 0000:01:00.0 enp1s0: 0x00000000: 0x165914e4, 0x00100406, 0x02000021, 0x00000010
  Jul  4 15:00:52 hagen kernel: tg3 0000:01:00.0 enp1s0: 0x00000010: 0xefef0004, 0x00000000, 0x00000000, 0x00000000
    ...
  Jul  4 15:00:53 hagen kernel: tg3 0000:01:00.0 enp1s0: 0x00007810: 0x00000000, 0x00000060, 0x00000000, 0x00000000
  Jul  4 15:00:53 hagen kernel: tg3 0000:01:00.0 enp1s0: 0: Host status block [00000001:00000014:(0000:0005:0000):(0005:000c)]
  Jul  4 15:00:53 hagen kernel: tg3 0000:01:00.0 enp1s0: 0: NAPI info [00000014:00000014:(0010:000c:01ff):0005:(00cd:0000:0000:0000)]
  Jul  4 15:00:53 hagen kernel: tg3 0000:01:00.0: tg3_stop_block timed out, ofs=4800 enable_bit=2
  Jul  4 15:00:53 hagen kernel: tg3 0000:01:00.0 enp1s0: Link is down
  Jul  4 15:00:53 hagen abrt-dump-journal-oops: Reported 1 kernel oopses to Abrt
  Jul  4 15:00:53 hagen abrt-server: Deleting problem directory oops-2017-07-04-15:00:52-889-0 (dup of oops-2017-07-03-12:49:03-930-0)


Version-Release number of selected component (if applicable):
=============================================================
The issue was first noticeable under Fedora 25 when updating the kernel
from version 4.10.x to 4.11.0.

Since I needed to move forward with the latest available kernel,
I decided to hunt down the bug and report it.
I found it to be present in all (Linux) kernels from commit
60db3a4d8cc9073cf56264785197ba75ee1caca4

* <wtenhave@hagen:55> git bisect good
  60db3a4d8cc9073cf56264785197ba75ee1caca4 is the first bad commit
  commit 60db3a4d8cc9073cf56264785197ba75ee1caca4
  Author: Sinan Kaya <okaya@codeaurora.org>
  Date:   Fri Jan 20 09:16:51 2017 -0500

    PCI: Enable PCIe Extended Tags if supported



How reproducible:
=================
It is 100% reproducible.  In fact, you can take the latest kernel
available today and back out the change made in:
  commit 60db3a4d8cc9073cf56264785197ba75ee1caca4

  <wtenhave@hagen:55> git log 60db3a4d8cc9073cf56264785197ba75ee1caca4
  commit 60db3a4d8cc9073cf56264785197ba75ee1caca4
  Author: Sinan Kaya <okaya@codeaurora.org>
  Date:   Fri Jan 20 09:16:51 2017 -0500

    PCI: Enable PCIe Extended Tags if supported

    Every PCIe device can generate 5-bit transaction Tags, which allow up to 32
    concurrent requests.  Some devices can generate 8-bit Extended Tags, which
    allow up to 256 concurrent requests.

    Per the ECN mentioned below, all PCIe Receivers are expected to support
    Extended Tags, so devices are allowed (but not required) to enable them by
    default.

    If a device supports Extended Tags but does not enable them by default,
    enable them.  This allows the device to have up to 256 outstanding
    transactions at a time, which may improve performance.

    [bhelgaas: changelog, check for PCIe device]
    Link: https://pcisig.com/sites/default/files/specification_documents/ECN_Extended_Tag_Enable_Default_05Sept2008_final.pdf
    Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


To back it out, take any kernel later than 4.11 and apply the code
change below.  Then build and install that kernel.

  <wtenhave@hagen:58> git diff
  diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
  index 19c8950..1005e9d 100644
  --- a/drivers/pci/probe.c
  +++ b/drivers/pci/probe.c
  @@ -1707,7 +1707,7 @@ static void pci_configure_device(struct pci_dev *dev)
          int ret;

          pci_configure_mps(dev);
  -       pci_configure_extended_tags(dev);
  +       // pci_configure_extended_tags(dev);

          memset(&hpp, 0, sizeof(hpp));
          ret = pci_get_hp_params(dev, &hpp);


Steps to Reproduce:
===================
1. Take a machine with the appropriate h/w, such as a Dell Inc. PowerEdge
SC1435/0H313M, BIOS 2.2.5 03/21/2008 with a Tigon3 [partno(BCM95721) rev
4201] (PCI Express) controller.
2. Install Fedora 25 (or another distribution) with a kernel that includes
the commit, such as kernel-4.11.7-200.fc25.x86_64.
3. Boot and see it crash as soon as it starts to operate on the specific
PCI Express Ethernet controller.

Actual results:

Expected results:

Additional info:
================
If you need further input please drop me a line at;
"Wim ten Have <wim.ten.have@oracle.com>"

Regards,
- Wim.


* Re: Fwd: Red Hat (Fedora) bug report 1467674 concerning your kernel functional performance enhancements causing PCI Express crashes,
  2017-07-04 15:32 ` Fwd: Red Hat (Fedora) bug report 1467674 concerning your kernel functional performance enhancements causing PCI Express crashes, Bjorn Helgaas
@ 2017-07-04 15:57   ` Sinan Kaya
  2017-07-04 17:59     ` Wim ten Have
  0 siblings, 1 reply; 11+ messages in thread
From: Sinan Kaya @ 2017-07-04 15:57 UTC
  To: Bjorn Helgaas, linux-pci; +Cc: Wim ten Have

Hi,

On 7/4/2017 11:32 AM, Bjorn Helgaas wrote:
> [+cc linux-pci]
> 
> Thanks very much for the detailed problem report, Wim!  I'm taking the
> liberty to forward to the linux-pci list in case others trip over the
> same thing.
> 

So, the spec is lying :) and reality doesn't match theory.

"Per the ECN mentioned below, all PCIe Receivers are expected to support
 Extended Tags"

> 
> 
> 
> The problem is not specific to this piece of h/w.  I did pin-point the
> issue to specific kernel code commit
<snip>

> 60db3a4d8cc9073cf56264785197ba75ee1caca4
>   * <wtenhave@hagen:55> git bisect good
>     60db3a4d8cc9073cf56264785197ba75ee1caca4 is the first bad commit
>     commit 60db3a4d8cc9073cf56264785197ba75ee1caca4
>     Author: Sinan Kaya <okaya@codeaurora.org>
>     Date:   Fri Jan 20 09:16:51 2017 -0500
> 
>       PCI: Enable PCIe Extended Tags if supported
> 
<snip>

> 3. Boot and see it crash as soon it starts to operate on specific PCI
> Express Ethernet controller.
> 

I guess we have an endpoint/system with errata that needs to be blacklisted.
Can you please try another endpoint with the same system?

You have conflicting information above. I want to understand whether it
is the endpoint or the system that needs to be blacklisted.

Please also provide sudo lspci -vvv output from the system with the patch.

Sinan

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


* Re: Red Hat (Fedora) bug report 1467674 concerning your kernel functional performance enhancements causing PCI Express crashes,
  2017-07-04 15:57   ` Sinan Kaya
@ 2017-07-04 17:59     ` Wim ten Have
  2017-07-04 22:25       ` Sinan Kaya
  0 siblings, 1 reply; 11+ messages in thread
From: Wim ten Have @ 2017-07-04 17:59 UTC
  To: Sinan Kaya; +Cc: Bjorn Helgaas, linux-pci, Wim ten Have

On Tue, 4 Jul 2017 11:57:37 -0400
Sinan Kaya <okaya@codeaurora.org> wrote:

> Hi,
> 
> On 7/4/2017 11:32 AM, Bjorn Helgaas wrote:
> > [+cc linux-pci]
> > 
> > Thanks very much for the detailed problem report, Wim!  I'm taking the
> > liberty to forward to the linux-pci list in case others trip over the
> > same thing.
> >   
> 
> So, the spec is lying :) and reality doesn't match theory.
> 
> "Per the ECN mentioned below, all PCIe Receivers are expected to support
>  Extended Tags"
> 
> > The problem is not specific to this piece of h/w.  I did pin-point the
> > issue to specific kernel code commit  
> <snip>
> 
> > 60db3a4d8cc9073cf56264785197ba75ee1caca4
> >   * <wtenhave@hagen:55> git bisect good
> >     60db3a4d8cc9073cf56264785197ba75ee1caca4 is the first bad commit
> >     commit 60db3a4d8cc9073cf56264785197ba75ee1caca4
> >     Author: Sinan Kaya <okaya@codeaurora.org>
> >     Date:   Fri Jan 20 09:16:51 2017 -0500
> > 
> >       PCI: Enable PCIe Extended Tags if supported
> >   
> <snip>
> 
> > 3. Boot and see it crash as soon it starts to operate on specific PCI
> > Express Ethernet controller.
> >   
> 
> I guess we have an endpoint/system with errata that needs to be blacklisted.
> Can you please try another endpoint with the same system?
> 
> You have conflicting information above. I want to understand whether it
> is the endpoint or the system that needs to be blacklisted.

  The PCI Express Ethernet controllers in question are embedded on the
  system's mainboard.  There is only one PCI Express slot, and it requires
  a riser card.  It is empty.

> Please also provide sudo lspci -vvv output from the system with the patch.
> Sinan

  The lspci -vvv output is attached to the Red Hat Bugzilla entry
  (bug 1467674), since it is rather large.

	https://bugzilla.redhat.com/show_bug.cgi?id=1467674

Enjoy,
- Wim.


* Re: Red Hat (Fedora) bug report 1467674 concerning your kernel functional performance enhancements causing PCI Express crashes,
  2017-07-04 17:59     ` Wim ten Have
@ 2017-07-04 22:25       ` Sinan Kaya
  2017-07-05  1:00         ` Sinan Kaya
  0 siblings, 1 reply; 11+ messages in thread
From: Sinan Kaya @ 2017-07-04 22:25 UTC
  To: Wim ten Have; +Cc: Bjorn Helgaas, linux-pci

On 7/4/2017 1:59 PM, Wim ten Have wrote:
> On Tue, 4 Jul 2017 11:57:37 -0400
> Sinan Kaya <okaya@codeaurora.org> wrote:
> 
>> Hi,
>>
>> On 7/4/2017 11:32 AM, Bjorn Helgaas wrote:
>>> [+cc linux-pci]
>>>
>>> Thanks very much for the detailed problem report, Wim!  I'm taking the
>>> liberty to forward to the linux-pci list in case others trip over the
>>> same thing.
>>>   
>>
>> So, the spec is lying :) and reality doesn't match theory.
>>
>> "Per the ECN mentioned below, all PCIe Receivers are expected to support
>>  Extended Tags"
>>
>>> The problem is not specific to this piece of h/w.  I did pin-point the
>>> issue to specific kernel code commit  
>> <snip>
>>
>>> 60db3a4d8cc9073cf56264785197ba75ee1caca4
>>>   * <wtenhave@hagen:55> git bisect good
>>>     60db3a4d8cc9073cf56264785197ba75ee1caca4 is the first bad commit
>>>     commit 60db3a4d8cc9073cf56264785197ba75ee1caca4
>>>     Author: Sinan Kaya <okaya@codeaurora.org>
>>>     Date:   Fri Jan 20 09:16:51 2017 -0500
>>>
>>>       PCI: Enable PCIe Extended Tags if supported
>>>   
>> <snip>
>>
>>> 3. Boot and see it crash as soon it starts to operate on specific PCI
>>> Express Ethernet controller.
>>>   
>>
>> I guess we have an endpoint/system with errata that needs to be blacklisted.
>> Can you please try another endpoint with the same system?
>>
>> You have conflicting information above. I want to understand whether it
>> is the endpoint or the system that needs to be blacklisted.
> 
>   Specific PCI Express ethernet are embedded on the systems mainboard.
>   There's only one PCI Express that requires a riser card.  It is empty.
>
>> Please also provide sudo lspci -vvv output from the system with the patch.
>> Sinan
> 
>   Detail (lspci -vvv) is added to RedHat filed bugzilla entry; BugID 1467674
>   since the info is rather large.
> 
> 	https://bugzilla.redhat.com/show_bug.cgi?id=1467674

I think I understand the issue better now. The ECN appears to have been
introduced against the PCIe 2.0 spec.

The PCI Express bridge you have is a Broadcom HT-2100, which appears to be
compliant with PCI Express 1.0 and 1.0a only.

http://www.hard-net.de/info_wissen/chipsatz/broadcom/HT-2100.pdf

I can also see this in your lspci output. 

00:08.0 PCI bridge: Broadcom HT2100 PCI-Express Bridge (rev a2) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 19
	NUMA node: 0
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	I/O behind bridge: 0000f000-00000fff [empty]
	Memory behind bridge: efe00000-efefffff [size=1M]
	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff [empty]
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity+ SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [a0] HyperTransport: MSI Mapping Enable+ Fixed-
		Mapping Address Base: 00000000fee00000
	Capabilities: [b0] Express (v1) Root Port (Slot-), MSI 00

I'll post a patch that enables extended tags only on systems whose bridges
are PCI Express v2 or higher.
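
As I understand it, the "(v1)" that lspci prints on the Express capability line comes from the Capability Version field in the low four bits of the PCI Express Capabilities register. A rough userspace sketch of that decode follows. The helper names are hypothetical, the masks are assumed to match PCI_EXP_FLAGS_VERS and PCI_EXP_FLAGS_TYPE in the kernel's pci_regs.h, and the register value 0x0041 used in the note below is an invented example, not a value read from the reporter's machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror PCI_EXP_FLAGS_VERS / PCI_EXP_FLAGS_TYPE. */
#define EXP_FLAGS_VERS 0x000fu	/* Capability Version, bits 3:0 */
#define EXP_FLAGS_TYPE 0x00f0u	/* Device/Port Type, bits 7:4 */

static_assert((EXP_FLAGS_VERS & EXP_FLAGS_TYPE) == 0,
	      "version and type fields must not overlap");

/* Capability Version of the PCI Express capability structure. */
unsigned int exp_cap_version(uint16_t flags)
{
	return flags & EXP_FLAGS_VERS;
}

/* Device/Port Type (4 is a Root Port). */
unsigned int exp_port_type(uint16_t flags)
{
	return (flags & EXP_FLAGS_TYPE) >> 4;
}

/* True for a capability structure older than version 2. */
bool predates_v2(uint16_t flags)
{
	return exp_cap_version(flags) < 2;
}
```

For the example value 0x0041 this yields version 1 and port type 4 (Root Port), which is the shape of thing the lspci line above reports for the HT2100.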


> 
> Enjoy,
> - Wim.
> 


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


* Re: Red Hat (Fedora) bug report 1467674 concerning your kernel functional performance enhancements causing PCI Express crashes,
  2017-07-04 22:25       ` Sinan Kaya
@ 2017-07-05  1:00         ` Sinan Kaya
  2017-07-05  7:42           ` Ethan Zhao
  2017-07-05 11:13           ` Wim ten Have
  0 siblings, 2 replies; 11+ messages in thread
From: Sinan Kaya @ 2017-07-05  1:00 UTC
  To: Wim ten Have; +Cc: Bjorn Helgaas, linux-pci

[-- Attachment #1: Type: text/plain, Size: 2581 bytes --]

On 7/4/2017 6:25 PM, Sinan Kaya wrote:
> On 7/4/2017 1:59 PM, Wim ten Have wrote:
>> On Tue, 4 Jul 2017 11:57:37 -0400
>> Sinan Kaya <okaya@codeaurora.org> wrote:
>>
>>> Hi,
>>>
>>> On 7/4/2017 11:32 AM, Bjorn Helgaas wrote:
>>>> [+cc linux-pci]
>>>>
>>>> Thanks very much for the detailed problem report, Wim!  I'm taking the
>>>> liberty to forward to the linux-pci list in case others trip over the
>>>> same thing.
>>>>   
>>>
>>> So, the spec is lying :) and reality doesn't match theory.
> 
> The PCI Express bridge you have is a Broadcom HT 2100 bridge which seems to support
> PCI-Express V1.0 and 1.0a compliant only.
> 
> http://www.hard-net.de/info_wissen/chipsatz/broadcom/HT-2100.pdf
> 
> I can also see this in your lspci output. 
> 
> 00:08.0 PCI bridge: Broadcom HT2100 PCI-Express Bridge (rev a2) (prog-if 00 [Normal decode])
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> 	Latency: 0, Cache Line Size: 64 bytes
> 	Interrupt: pin A routed to IRQ 19
> 	NUMA node: 0
> 	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
> 	I/O behind bridge: 0000f000-00000fff [empty]
> 	Memory behind bridge: efe00000-efefffff [size=1M]
> 	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff [empty]
> 	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
> 	BridgeCtl: Parity+ SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
> 		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
> 	Capabilities: [a0] HyperTransport: MSI Mapping Enable+ Fixed-
> 		Mapping Address Base: 00000000fee00000
> 	Capabilities: [b0] Express (v1) Root Port (Slot-), MSI 00
> 
> I'll post a patch to apply extended tags to systems with PCI express v2 and higher
> bridges only.
> 

Please give this patch a try. I can make the patch pretty and re-post if it works for you. 

You should be seeing messages like this during boot.

[    3.949621] pci 0003:01:00.0: clearing extended tags capability
[    3.959540] pci 0003:01:00.1: clearing extended tags capability
[    3.969454] pci 0003:01:00.2: clearing extended tags capability
[    3.979373] pci 0003:01:00.3: clearing extended tags capability
[    3.989290] pci 0003:01:00.4: clearing extended tags capability



-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

[-- Attachment #2: 0001-pci-do-not-enable-extended-tags-on-pre-dated-v1.x-sy.patch --]
[-- Type: text/plain, Size: 2658 bytes --]

From a50edf37d58993983ec90dc5aab8ca6d2b8ff10b Mon Sep 17 00:00:00 2001
From: Sinan Kaya <okaya@codeaurora.org>
Date: Tue, 4 Jul 2017 20:39:08 -0400
Subject: [PATCH] pci: do not enable extended tags on pre-dated(v1.x) systems

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/pci/probe.c | 52 +++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 45 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index dfc9a27..c67af22 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1663,21 +1663,58 @@ static void program_hpp_type2(struct pci_dev *dev, struct hpp_type2 *hpp)
 	 */
 }

-static void pci_configure_extended_tags(struct pci_dev *dev)
+static bool pcie_bus_exttags_supported(struct pci_bus *bus)
+{
+	bool exttags_supported = true;
+	struct pci_dev *bridge;
+	int rc;
+	u16 flags;
+
+	bridge = bus->self;
+	while (bridge) {
+		if (pci_is_pcie(bridge)) {
+			rc = pcie_capability_read_word(bridge, PCI_EXP_FLAGS,
+						       &flags);
+			if (!rc && ((flags & PCI_EXP_FLAGS_VERS) < 2)) {
+				exttags_supported = false;
+				break;
+			}
+		}
+		if (!bridge->bus->parent)
+			break;
+		bridge = bridge->bus->parent->self;
+	}
+
+	return exttags_supported;
+}
+
+static int pcie_bus_configure_exttags(struct pci_dev *dev, void *data)
 {
 	u32 dev_cap;
 	int ret;
+	bool supported;

 	if (!pci_is_pcie(dev))
-		return;
+		return 0;

 	ret = pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &dev_cap);
 	if (ret)
-		return;
+		return 0;

-	if (dev_cap & PCI_EXP_DEVCAP_EXT_TAG)
-		pcie_capability_set_word(dev, PCI_EXP_DEVCTL,
-					 PCI_EXP_DEVCTL_EXT_TAG);
+	if (dev_cap & PCI_EXP_DEVCAP_EXT_TAG) {
+		supported = pcie_bus_exttags_supported(dev->bus);
+
+		if (supported) {
+			dev_info(&dev->dev, "setting extended tags capability\n");
+			pcie_capability_set_word(dev, PCI_EXP_DEVCTL,
+						 PCI_EXP_DEVCTL_EXT_TAG);
+		} else {
+			dev_info(&dev->dev, "clearing extended tags capability\n");
+			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL,
+						   PCI_EXP_DEVCTL_EXT_TAG);
+		}
+	}
+	return 0;
 }

 static void pci_configure_device(struct pci_dev *dev)
@@ -1686,7 +1723,6 @@ static void pci_configure_device(struct pci_dev *dev)
 	int ret;

 	pci_configure_mps(dev);
-	pci_configure_extended_tags(dev);

 	memset(&hpp, 0, sizeof(hpp));
 	ret = pci_get_hp_params(dev, &hpp);
@@ -2231,6 +2267,8 @@ void pcie_bus_configure_settings(struct pci_bus *bus)

 	pcie_bus_configure_set(bus->self, &smpss);
 	pci_walk_bus(bus, pcie_bus_configure_set, &smpss);
+
+	pci_walk_bus(bus, pcie_bus_configure_exttags, NULL);
 }
 EXPORT_SYMBOL_GPL(pcie_bus_configure_settings);

--
1.9.1
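
For readers who want to see the upstream walk in isolation, here is a small userspace model of what pcie_bus_exttags_supported() in the patch above appears to do: follow the chain of upstream bridges and give up as soon as a PCIe bridge reports a capability version below 2. The struct and function names are hypothetical stand-ins for struct pci_dev and the bus->self / bus->parent links, and the depth limit is an assumption added only to keep the model safe against a cyclic chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a bridge and its link to the next bridge upstream. */
struct model_bridge {
	bool is_pcie;				/* has a PCI Express capability */
	unsigned int cap_version;		/* Capability Version field */
	const struct model_bridge *upstream;	/* NULL at the root */
};

/* Returns false if any PCIe bridge on the path to the root predates v2. */
bool path_supports_exttags(const struct model_bridge *bridge)
{
	unsigned int depth = 0;

	for (; bridge != NULL; bridge = bridge->upstream) {
		depth++;
		assert(depth <= 256);	/* bus numbers are 8-bit; deeper means a cycle */
		if (bridge->is_pcie && bridge->cap_version < 2)
			return false;
	}
	return true;
}
```

In this model a v2 switch port below a v1 root port is still rejected, because the check covers every hop on the path and not just the nearest bridge.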



* Re: Red Hat (Fedora) bug report 1467674 concerning your kernel functional performance enhancements causing PCI Express crashes,
  2017-07-05  1:00         ` Sinan Kaya
@ 2017-07-05  7:42           ` Ethan Zhao
  2017-07-05 12:28             ` Sinan Kaya
  2017-07-05 11:13           ` Wim ten Have
  1 sibling, 1 reply; 11+ messages in thread
From: Ethan Zhao @ 2017-07-05  7:42 UTC
  To: Sinan Kaya; +Cc: Wim ten Have, Bjorn Helgaas, linux-pci

Sinan,

    About the attached patch: why clear PCI_EXP_DEVCTL_EXT_TAG?  Will the
device have it set by default after POST if it is not supported?

+			dev_info(&dev->dev, "clearing extended tags capability\n");
+			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL,
+						   PCI_EXP_DEVCTL_EXT_TAG);


Thanks,
Ethan
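
The question above turns on what the set and clear helpers do to the Device Control register value. A rough userspace model is sketched below. It is not the kernel implementation: `devctl_apply_exttag` is a hypothetical name, the mask is assumed to equal the kernel's PCI_EXP_DEVCTL_EXT_TAG, and the register values in the note are invented. The sketch only shows that clearing changes anything solely when the enable bit is already set, a state the ECN quoted in the original changelog allows a device to have by default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEVCTL_EXT_TAG 0x0100u	/* assumed equal to PCI_EXP_DEVCTL_EXT_TAG */

static_assert(DEVCTL_EXT_TAG <= UINT16_MAX, "bit must fit a 16-bit register");

/*
 * Model of the read-modify-write: set the Extended Tag Field Enable bit
 * when the upstream path is acceptable, otherwise clear it.  All other
 * bits of the register value are preserved.
 */
uint16_t devctl_apply_exttag(uint16_t devctl, bool path_ok)
{
	if (path_ok)
		return (uint16_t)(devctl | DEVCTL_EXT_TAG);
	return (uint16_t)(devctl & ~DEVCTL_EXT_TAG);
}
```

With an invented starting value of 0x2810 (bit clear), the "clear" branch leaves the register untouched; starting from 0x2910 (bit set) it produces 0x2810.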

On Wed, Jul 5, 2017 at 9:00 AM, Sinan Kaya <okaya@codeaurora.org> wrote:
> On 7/4/2017 6:25 PM, Sinan Kaya wrote:
>> On 7/4/2017 1:59 PM, Wim ten Have wrote:
>>> On Tue, 4 Jul 2017 11:57:37 -0400
>>> Sinan Kaya <okaya@codeaurora.org> wrote:
>>>
>>>> Hi,
>>>>
>>>> On 7/4/2017 11:32 AM, Bjorn Helgaas wrote:
>>>>> [+cc linux-pci]
>>>>>
>>>>> Thanks very much for the detailed problem report, Wim!  I'm taking the
>>>>> liberty to forward to the linux-pci list in case others trip over the
>>>>> same thing.
>>>>>
>>>>
>>>> So, the spec is lying :) and reality doesn't match theory.
>>
>> The PCI Express bridge you have is a Broadcom HT 2100 bridge which seems to support
>> PCI-Express V1.0 and 1.0a compliant only.
>>
>> http://www.hard-net.de/info_wissen/chipsatz/broadcom/HT-2100.pdf
>>
>> I can also see this in your lspci output.
>>
>> 00:08.0 PCI bridge: Broadcom HT2100 PCI-Express Bridge (rev a2) (prog-if 00 [Normal decode])
>>       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
>>       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>       Latency: 0, Cache Line Size: 64 bytes
>>       Interrupt: pin A routed to IRQ 19
>>       NUMA node: 0
>>       Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
>>       I/O behind bridge: 0000f000-00000fff [empty]
>>       Memory behind bridge: efe00000-efefffff [size=1M]
>>       Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff [empty]
>>       Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
>>       BridgeCtl: Parity+ SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
>>               PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>>       Capabilities: [a0] HyperTransport: MSI Mapping Enable+ Fixed-
>>               Mapping Address Base: 00000000fee00000
>>       Capabilities: [b0] Express (v1) Root Port (Slot-), MSI 00
>>
>> I'll post a patch to apply extended tags to systems with PCI express v2 and higher
>> bridges only.
>>
>
> Please give this patch a try. I can make the patch pretty and re-post if it works for you.
>
> You should be seeing messages like this during boot.
>
> [    3.949621] pci 0003:01:00.0: clearing extended tags capability
> [    3.959540] pci 0003:01:00.1: clearing extended tags capability
> [    3.969454] pci 0003:01:00.2: clearing extended tags capability
> [    3.979373] pci 0003:01:00.3: clearing extended tags capability
> [    3.989290] pci 0003:01:00.4: clearing extended tags capability
>
>
>
> --
> Sinan Kaya
> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
> Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


* Re: Red Hat (Fedora) bug report 1467674 concerning your kernel functional performance enhancements causing PCI Express crashes,
  2017-07-05  1:00         ` Sinan Kaya
  2017-07-05  7:42           ` Ethan Zhao
@ 2017-07-05 11:13           ` Wim ten Have
  2017-07-05 12:37             ` Sinan Kaya
  1 sibling, 1 reply; 11+ messages in thread
From: Wim ten Have @ 2017-07-05 11:13 UTC
  To: Sinan Kaya; +Cc: Bjorn Helgaas, linux-pci, Wim ten Have

On Tue, 4 Jul 2017 21:00:03 -0400
Sinan Kaya <okaya@codeaurora.org> wrote:

> On 7/4/2017 6:25 PM, Sinan Kaya wrote:
> > 
> > I can also see this in your lspci output. 
> > 
> > 00:08.0 PCI bridge: Broadcom HT2100 PCI-Express Bridge (rev a2) (prog-if 00 [Normal decode])
> > 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
> > 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> > 	Latency: 0, Cache Line Size: 64 bytes
> > 	Interrupt: pin A routed to IRQ 19
> > 	NUMA node: 0
> > 	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
> > 	I/O behind bridge: 0000f000-00000fff [empty]
> > 	Memory behind bridge: efe00000-efefffff [size=1M]
> > 	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff [empty]
> > 	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
> > 	BridgeCtl: Parity+ SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
> > 		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
> > 	Capabilities: [a0] HyperTransport: MSI Mapping Enable+ Fixed-
> > 		Mapping Address Base: 00000000fee00000
> > 	Capabilities: [b0] Express (v1) Root Port (Slot-), MSI 00
> > 
> > I'll post a patch to apply extended tags to systems with PCI express v2 and higher
> > bridges only.
> 
> Please give this patch a try. I can make the patch pretty and re-post if it works for you. 

    Howdy,

  I set up your patch under a /usr/src/kernel/rpmbuild tree for the current
  "kernel-4.11.8-200.fc25.src.rpm", made the change below to the kernel.spec
  file, and ran an rpmbuild -ba cycle on SPECS/kernel.spec.

  Your patch under the SOURCES tree

	<wtenhave@hagen:55> ls -l SOURCES/0001-pci-do-not-enable-extended-tags-on-pre-dated-v1.x-sy.patch 
	-rw-r--r-- 1 wtenhave users 2658 Jul  5 09:45 SOURCES/0001-pci-do-not-enable-extended-tags-on-pre-dated-v1.x-sy.patch

  Change to the kernel.spec file

	<wtenhave@hagen:56> rcsdiff -u SPECS/kernel.spec
	===================================================================
	RCS file: SPECS/kernel.spec,v
	retrieving revision 1.1
	diff -u -r1.1 SPECS/kernel.spec
	--- SPECS/kernel.spec	2017/07/05 07:54:17	1.1
	+++ SPECS/kernel.spec	2017/07/05 07:54:20
	@@ -635,6 +635,9 @@
	 # rhbz 1459326
	 Patch683: RFC-audit-fix-a-race-condition-with-the-auditd-tracking-code.patch
	 
	+# rhbz 1467674
	+Patch700: 0001-pci-do-not-enable-extended-tags-on-pre-dated-v1.x-sy.patch
	+
	 # END OF PATCH DEFINITIONS
	 
	 %endif

  From there, 'rpmbuild -ba kernel.spec' proceeded nicely and generated all
  packages.  They all installed, and the system boots and works like a charm!

> You should be seeing messages like this during boot.
> 
> [    3.949621] pci 0003:01:00.0: clearing extended tags capability
> [    3.959540] pci 0003:01:00.1: clearing extended tags capability
> [    3.969454] pci 0003:01:00.2: clearing extended tags capability
> [    3.979373] pci 0003:01:00.3: clearing extended tags capability
> [    3.989290] pci 0003:01:00.4: clearing extended tags capability

  Correct ... see excerpt below.

	[    0.000000] Linux version 4.11.8-200.fc25.x86_64 (root@hagen) (gcc version 6.3.1 20161221 (Red Hat 6.3.1-1) (GCC) ) #1 SMP Wed Jul 5 10:37:18 CEST 2017
	[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.11.8-200.fc25.x86_64 root=/dev/mapper/fedora_hagen-root ro rd.lvm.lv=fedora_hagen/root rd.lvm.lv=fedora_hagen/swap audit=0
	[    0.000000] x86/fpu: x87 FPU will use FXSAVE
	[    0.000000] e820: BIOS-provided physical RAM map:
		...
	[    0.532911] PCI host bridge to bus 0000:00
	[    0.533104] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
	[    0.533301] pci_bus 0000:00: root bus resource [io  0xd000-0xefff window]
	[    0.533501] pci_bus 0000:00: root bus resource [io  0x0d00-0x0fff window]
	[    0.533699] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
	[    0.534039] pci_bus 0000:00: root bus resource [mem 0xf0000000-0xf1ffffff window]
	[    0.534376] pci_bus 0000:00: root bus resource [mem 0xefb00000-0xefffffff window]
	[    0.534722] pci_bus 0000:00: root bus resource [mem 0xe0000000-0xe7ffffff window]
	[    0.535060] pci_bus 0000:00: root bus resource [bus 00-fd]
	[    0.535264] pci 0000:00:01.0: [1166:0036] type 01 class 0x060400
	[    0.535299] pci 0000:00:01.0: Enabling HT MSI Mapping
	[    0.535552] pci 0000:00:01.0: System wakeup disabled by ACPI
	[    0.535786] pci 0000:00:02.0: [1166:0205] type 00 class 0x060000
	[    0.535882] pci 0000:00:02.1: [1166:0214] type 00 class 0x01018a
	[    0.535904] pci 0000:00:02.1: reg 0x10: [io  0x01f0-0x01f7]
	[    0.535912] pci 0000:00:02.1: reg 0x14: [io  0x03f4-0x03f7]
	[    0.535920] pci 0000:00:02.1: reg 0x18: [io  0x0170-0x0177]
	[    0.535929] pci 0000:00:02.1: reg 0x1c: [io  0x0374-0x0377]
	[    0.535937] pci 0000:00:02.1: reg 0x20: [io  0x08c0-0x08cf]
	[    0.535956] pci 0000:00:02.1: legacy IDE quirk: reg 0x10: [io  0x01f0-0x01f7]
	[    0.536154] pci 0000:00:02.1: legacy IDE quirk: reg 0x14: [io  0x03f6]
	[    0.536350] pci 0000:00:02.1: legacy IDE quirk: reg 0x18: [io  0x0170-0x0177]
	[    0.536551] pci 0000:00:02.1: legacy IDE quirk: reg 0x1c: [io  0x0376]
	[    0.536814] pci 0000:00:02.2: [1166:0234] type 00 class 0x060100
	[    0.536951] pci 0000:00:03.0: [1166:0223] type 00 class 0x0c0310
	[    0.536967] pci 0000:00:03.0: reg 0x10: [mem 0xefbed000-0xefbedfff]
	[    0.536976] pci 0000:00:03.0: reg 0x14: [io  0xd000-0xd0ff]
	[    0.537089] pci 0000:00:03.1: [1166:0223] type 00 class 0x0c0310
	[    0.537104] pci 0000:00:03.1: reg 0x10: [mem 0xefbee000-0xefbeefff]
	[    0.537113] pci 0000:00:03.1: reg 0x14: [io  0xd400-0xd4ff]
	[    0.537222] pci 0000:00:03.2: [1166:0223] type 00 class 0x0c0320
	[    0.537237] pci 0000:00:03.2: reg 0x10: [mem 0xefbef000-0xefbeffff]
	[    0.537246] pci 0000:00:03.2: reg 0x14: [io  0xd800-0xd8ff]
	[    0.537311] pci 0000:00:03.2: supports D1 D2
	[    0.537312] pci 0000:00:03.2: PME# supported from D0 D1 D2 D3hot
	[    0.537373] pci 0000:00:04.0: [1002:515e] type 00 class 0x030000
	[    0.537389] pci 0000:00:04.0: reg 0x10: [mem 0xe0000000-0xe7ffffff pref]
	[    0.537398] pci 0000:00:04.0: reg 0x14: [io  0xdc00-0xdcff]
	[    0.537407] pci 0000:00:04.0: reg 0x18: [mem 0xefbf0000-0xefbfffff]
	[    0.537440] pci 0000:00:04.0: reg 0x30: [mem 0x00000000-0x0001ffff pref]
	[    0.537468] pci 0000:00:04.0: supports D1 D2
	[    0.537513] pci 0000:00:07.0: [1166:0140] type 01 class 0x060400
	[    0.537550] pci 0000:00:07.0: PME# supported from D0 D3hot D3cold
	[    0.537603] pci 0000:00:08.0: [1166:0142] type 01 class 0x060400
	[    0.537639] pci 0000:00:08.0: PME# supported from D0 D3hot D3cold
	[    0.537666] pci 0000:00:08.0: System wakeup disabled by ACPI
	[    0.537893] pci 0000:00:09.0: [1166:0144] type 01 class 0x060400
	[    0.537936] pci 0000:00:09.0: PME# supported from D0 D3hot D3cold
	[    0.537964] pci 0000:00:09.0: System wakeup disabled by ACPI
	[    0.538187] pci 0000:00:0a.0: [1166:0142] type 01 class 0x060400
	[    0.538221] pci 0000:00:0a.0: PME# supported from D0 D3hot D3cold
	[    0.538248] pci 0000:00:0a.0: System wakeup disabled by ACPI
	[    0.538478] pci 0000:00:0b.0: [1166:0144] type 01 class 0x060400
	[    0.538512] pci 0000:00:0b.0: PME# supported from D0 D3hot D3cold
	[    0.538577] pci 0000:00:18.0: [1022:1200] type 00 class 0x060000
	[    0.538629] pci 0000:00:18.1: [1022:1201] type 00 class 0x060000
	[    0.538683] pci 0000:00:18.2: [1022:1202] type 00 class 0x060000
	[    0.538729] pci 0000:00:18.3: [1022:1203] type 00 class 0x060000
	[    0.538778] pci 0000:00:18.4: [1022:1204] type 00 class 0x060000
	[    0.538827] pci 0000:00:19.0: [1022:1200] type 00 class 0x060000
	[    0.538885] pci 0000:00:19.1: [1022:1201] type 00 class 0x060000
	[    0.538945] pci 0000:00:19.2: [1022:1202] type 00 class 0x060000
	[    0.538993] pci 0000:00:19.3: [1022:1203] type 00 class 0x060000
	[    0.539045] pci 0000:00:19.4: [1022:1204] type 00 class 0x060000
	[    0.539152] pci 0000:03:0d.0: [1166:0104] type 01 class 0x060400
	[    0.539228] pci 0000:03:0e.0: [1166:024b] type 00 class 0x01018f
	[    0.539238] pci 0000:03:0e.0: reg 0x10: [io  0xecb0-0xecb7]
	[    0.539244] pci 0000:03:0e.0: reg 0x14: [io  0xeca0-0xeca3]
	[    0.539250] pci 0000:03:0e.0: reg 0x18: [io  0xecb8-0xecbf]
	[    0.539255] pci 0000:03:0e.0: reg 0x1c: [io  0xeca4-0xeca7]
	[    0.539261] pci 0000:03:0e.0: reg 0x20: [io  0xece0-0xecef]
	[    0.539267] pci 0000:03:0e.0: reg 0x24: [mem 0xefdfe000-0xefdfffff]
	[    0.539273] pci 0000:03:0e.0: reg 0x30: [mem 0x00000000-0x0001ffff pref]
	[    0.539337] pci 0000:00:01.0: PCI bridge to [bus 03-04]
	[    0.539537] pci 0000:00:01.0:   bridge window [io  0xe000-0xefff]
	[    0.539540] pci 0000:00:01.0:   bridge window [mem 0xefc00000-0xefdfffff]
	[    0.539598] pci 0000:03:0d.0: PCI bridge to [bus 04]
	[    0.539828] pci 0000:00:07.0: PCI bridge to [bus 05]
	[    0.540068] pci 0000:01:00.0: [14e4:1659] type 00 class 0x020000
	[    0.540087] pci 0000:01:00.0: reg 0x10: [mem 0xefef0000-0xefefffff 64bit]
	[    0.540185] pci 0000:01:00.0: PME# supported from D3hot D3cold
	[    0.542890] pci 0000:00:08.0: PCI bridge to [bus 01]
	[    0.543094] pci 0000:00:08.0:   bridge window [mem 0xefe00000-0xefefffff]
	[    0.543155] pci 0000:02:00.0: [14e4:1659] type 00 class 0x020000
	[    0.543175] pci 0000:02:00.0: reg 0x10: [mem 0xefff0000-0xefffffff 64bit]
	[    0.543277] pci 0000:02:00.0: PME# supported from D3hot D3cold
	[    0.545889] pci 0000:00:09.0: PCI bridge to [bus 02]
	[    0.546088] pci 0000:00:09.0:   bridge window [mem 0xeff00000-0xefffffff]
	[    0.546130] pci 0000:00:0a.0: PCI bridge to [bus 06]
	[    0.546351] pci 0000:00:0b.0: PCI bridge to [bus 07]
***>	[    0.546570] pci 0000:01:00.0: clearing extended tags capability
***>	[    0.546773] pci 0000:02:00.0: clearing extended tags capability
		...
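
  Per the log above, the two flagged devices, 0000:01:00.0 and 0000:02:00.0,
  sit behind the bridges 0000:00:08.0 and 0000:00:09.0, and the lspci output
  quoted earlier reports 0000:00:08.0 as an Express (v1) Root Port.  A minimal
  userspace sketch of the kind of check discussed in this thread (extended
  tags only below PCI Express v2 and higher bridges) might look like the
  following.  It is not the posted patch: struct mock_pci_dev and the mock_*
  helpers are hypothetical stand-ins for illustration, and only the two
  register constants are taken from the kernel's pci_regs.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in the kernel's include/uapi/linux/pci_regs.h. */
#define PCI_EXP_FLAGS_VERS     0x000f  /* capability version field */
#define PCI_EXP_DEVCTL_EXT_TAG 0x0100  /* Extended Tag Field Enable */

/* Hypothetical, simplified stand-in for struct pci_dev (illustration only). */
struct mock_pci_dev {
    const char *name;
    struct mock_pci_dev *parent;  /* upstream bridge, NULL on the root bus */
    uint16_t exp_flags;           /* PCI Express Capabilities register */
    uint16_t devctl;              /* Device Control register */
};

/* Walk upstream until we reach the device that sits on the root bus. */
static struct mock_pci_dev *mock_root_port(struct mock_pci_dev *dev)
{
    assert(dev != NULL);
    while (dev->parent)
        dev = dev->parent;
    return dev;
}

/* Assumed policy: a root port with capability version < 2 is "v1.x". */
static bool mock_needs_ext_tag_clear(struct mock_pci_dev *dev)
{
    struct mock_pci_dev *root = mock_root_port(dev);

    return (root->exp_flags & PCI_EXP_FLAGS_VERS) < 2;
}

/* Clear the enable bit below a v1 root port; report whether we did so. */
static bool mock_fixup_ext_tags(struct mock_pci_dev *dev)
{
    if (!mock_needs_ext_tag_clear(dev))
        return false;

    dev->devctl &= (uint16_t)~PCI_EXP_DEVCTL_EXT_TAG;
    return true;
}
```

  Calling mock_fixup_ext_tags() on an endpoint whose root port reports
  capability version 1 clears bit 8 of its Device Control value and returns
  true; below a version 2 root port the value is left untouched.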

	<wtenhave@hagen:60> uname -a
	Linux hagen 4.11.8-200.fc25.x86_64 #1 SMP Wed Jul 5 10:37:18 CEST 2017 x86_64 x86_64 x86_64 GNU/Linux

  I'll update RHT Bugzilla entry 1467674 with this info. 
  Hope someone picks this up for Fedora25 and Fedora26 (next week).

Regards,
- Wim.


* Re: Red Hat (Fedora) bug report 1467674 concerning your kernel functional performance enhancements causing PCI Express crashes,
  2017-07-05  7:42           ` Ethan Zhao
@ 2017-07-05 12:28             ` Sinan Kaya
  0 siblings, 0 replies; 11+ messages in thread
From: Sinan Kaya @ 2017-07-05 12:28 UTC (permalink / raw)
  To: Ethan Zhao; +Cc: Wim ten Have, Bjorn Helgaas, linux-pci

Hi,

On 7/5/2017 3:42 AM, Ethan Zhao wrote:
> Sinan,
> 
>     About the attached patch: why clear the PCI_EXP_DEVCTL_EXT_TAG bit?
> Would the bit be set by default after POST even if it is not
> supported?
> 
>    dev_info(&dev->dev, "clearing extended tags capability\n");
> 
> + pcie_capability_clear_word(dev, PCI_EXP_DEVCTL,
> +   PCI_EXP_DEVCTL_EXT_TAG);

We don't usually trust the FW in Linux. Some outdated firmware might have
made the same mistake, and end users generally don't upgrade their BIOS.
We end up working around such issues in Linux with quirks etc.

The other issue is that the default value of this bit is not guaranteed
to be zero:

"Extended Tag Field Enable:
Default value of this bit is implementation specific"

https://pcisig.com/sites/default/files/specification_documents/ECN_Extended_Tag_Enable_Default_05Sept2008_final.pdf
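
Because the power-on value of the bit is implementation specific, the state
after firmware hand-off cannot be assumed, and an unconditional clear leaves
the same result either way. As a rough userspace illustration of the
read-modify-write that a helper such as pcie_capability_clear_word()
performs, here is a sketch against a mock config space. struct mock_cfg and
the mock_* functions are hypothetical names for this example only; the two
constants match the kernel's pci_regs.h.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_EXP_DEVCTL         8       /* Device Control offset in the PCIe capability */
#define PCI_EXP_DEVCTL_EXT_TAG 0x0100  /* Extended Tag Field Enable */

/* Hypothetical mock of a function's 256-byte config space. */
struct mock_cfg {
    uint8_t bytes[256];
    uint8_t pcie_cap;   /* offset of the PCI Express capability */
};

/* Config space is little-endian: low byte first. */
static uint16_t mock_read_word(const struct mock_cfg *c, unsigned int off)
{
    assert(off + 1 < sizeof(c->bytes));
    return (uint16_t)(c->bytes[off] | (c->bytes[off + 1] << 8));
}

static void mock_write_word(struct mock_cfg *c, unsigned int off, uint16_t val)
{
    assert(off + 1 < sizeof(c->bytes));
    c->bytes[off] = (uint8_t)(val & 0xff);
    c->bytes[off + 1] = (uint8_t)(val >> 8);
}

/* Read the register, drop the requested bits, write the result back. */
static void mock_capability_clear_word(struct mock_cfg *c, unsigned int pos,
                                       uint16_t clear)
{
    unsigned int off = c->pcie_cap + pos;
    uint16_t val = mock_read_word(c, off);

    mock_write_word(c, off, (uint16_t)(val & ~clear));
}
```

Whether the mock register starts with the bit set or already clear, the
result after mock_capability_clear_word() has bit 8 clear and every other
bit unchanged.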

Hope this helps,

Sinan

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


* Re: Red Hat (Fedora) bug report 1467674 concerning your kernel functional performance enhancements causing PCI Express crashes,
  2017-07-05 11:13           ` Wim ten Have
@ 2017-07-05 12:37             ` Sinan Kaya
  2017-07-05 13:00               ` Wim ten Have
  0 siblings, 1 reply; 11+ messages in thread
From: Sinan Kaya @ 2017-07-05 12:37 UTC (permalink / raw)
  To: Wim ten Have; +Cc: Bjorn Helgaas, linux-pci

On 7/5/2017 7:13 AM, Wim ten Have wrote:
>  From there, 'rpmbuild -ba kernel.spec' proceeded nicely and generated all
>   packages.  They all installed, and the system boots and works like a charm!

Thanks for testing. I'll re-post the same patch with better commit text
and a Fixes tag so that Bjorn can review it.

Can I have a tested-by?

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


* Re: Red Hat (Fedora) bug report 1467674 concerning your kernel functional performance enhancements causing PCI Express crashes,
  2017-07-05 12:37             ` Sinan Kaya
@ 2017-07-05 13:00               ` Wim ten Have
  2017-07-05 13:20                 ` Sinan Kaya
  0 siblings, 1 reply; 11+ messages in thread
From: Wim ten Have @ 2017-07-05 13:00 UTC (permalink / raw)
  To: Sinan Kaya; +Cc: Bjorn Helgaas, linux-pci, Wim ten Have

On Wed, 5 Jul 2017 08:37:03 -0400
Sinan Kaya <okaya@codeaurora.org> wrote:

> On 7/5/2017 7:13 AM, Wim ten Have wrote:
> >  From there, 'rpmbuild -ba kernel.spec' proceeded nicely and generated all
> >   packages.  They all installed, and the system boots and works like a charm!
> 
> Thanks for testing. I'll re-post the same patch with better commit text
> and a Fixes tag so that Bjorn can review it.
> 
> Can I have a tested-by?

  Sure.

  Tested-by: Wim ten Have <wim.ten.have@oracle.com>

- Wim.


* Re: Red Hat (Fedora) bug report 1467674 concerning your kernel functional performance enhancements causing PCI Express crashes,
  2017-07-05 13:00               ` Wim ten Have
@ 2017-07-05 13:20                 ` Sinan Kaya
  0 siblings, 0 replies; 11+ messages in thread
From: Sinan Kaya @ 2017-07-05 13:20 UTC (permalink / raw)
  To: Wim ten Have; +Cc: Bjorn Helgaas, linux-pci

On 7/5/2017 9:00 AM, Wim ten Have wrote:
>> Can I have a tested-by?
>   Sure.
> 
>   Tested-by: Wim ten Have <wim.ten.have@oracle.com>
> 
> - Wim.
> 

Thanks, I just re-posted the patch. The only change is replacing dev_info
with dev_dbg to reduce verbosity. Feel free to retest.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


