All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device()
@ 2014-04-23  2:26 Wei Yang
  2014-04-23  2:26 ` [PATCH 2/2] powerpc/powernv: release the refcount for pci_dev Wei Yang
  2014-04-28 13:35 ` [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device() Alexey Kardashevskiy
  0 siblings, 2 replies; 10+ messages in thread
From: Wei Yang @ 2014-04-23  2:26 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Wei Yang, aik, gwshan

During the EEH hotplug event, iommu_add_device() will be invoked three times
and two of them will trigger warning or error.

The three times to invoke the iommu_add_device() are:

    pci_device_add
       ...
       set_iommu_table_base_and_group   <- 1st time, fail
    device_add
       ...
       tce_iommu_bus_notifier           <- 2nd time, succees
    pcibios_add_pci_devices
       ...
       pcibios_setup_bus_devices        <- 3rd time, re-attach

The first time fails, since the dev->kobj->sd is not initialized. The
dev->kobj->sd is initialized in device_add().
The third time's warning is triggered by the re-attach of the iommu_group.

After applying this patch, the error

    iommu_tce: 0003:05:00.0 has not been added, ret=-14

and the warning

    [  204.123609] ------------[ cut here ]------------
    [  204.123645] WARNING: at arch/powerpc/kernel/iommu.c:1125
    [  204.123680] Modules linked in: xt_CHECKSUM nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT bnep bluetooth 6lowpan_iphc rfkill xt_conntrack ebtable_nat ebtable_broute bridge stp llc mlx4_ib ib_sa ib_mad ib_core ib_addr ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bnx2x tg3 mlx4_core nfsd ptp mdio ses libcrc32c nfs_acl enclosure be2net pps_core shpchp lockd kvm uinput sunrpc binfmt_misc lpfc scsi_transport_fc ipr scsi_tgt
    [  204.124356] CPU: 18 PID: 650 Comm: eehd Not tainted 3.14.0-rc5yw+ #102
    [  204.124400] task: c0000027ed485670 ti: c0000027ed50c000 task.ti: c0000027ed50c000
    [  204.124453] NIP: c00000000003cf80 LR: c00000000006c648 CTR: c00000000006c5c0
    [  204.124506] REGS: c0000027ed50f440 TRAP: 0700   Not tainted  (3.14.0-rc5yw+)
    [  204.124558] MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI>  CR: 88008084  XER: 20000000
    [  204.124682] CFAR: c00000000006c644 SOFTE: 1
    GPR00: c00000000006c648 c0000027ed50f6c0 c000000001398380 c0000027ec260300
    GPR04: c0000027ea92c000 c00000000006ad00 c0000000016e41b0 0000000000000110
    GPR08: c0000000012cd4c0 0000000000000001 c0000027ec2602ff 0000000000000062
    GPR12: 0000000028008084 c00000000fdca200 c0000000000d1d90 c0000027ec281a80
    GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000001
    GPR24: 000000005342697b 0000000000002906 c000001fe6ac9800 c000001fe6ac9800
    GPR28: 0000000000000000 c0000000016e3a80 c0000027ea92c090 c0000027ea92c000
    [  204.125353] NIP [c00000000003cf80] .iommu_add_device+0x30/0x1f0
    [  204.125399] LR [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
    [  204.125443] Call Trace:
    [  204.125464] [c0000027ed50f6c0] [c0000027ed50f750] 0xc0000027ed50f750 (unreliable)
    [  204.125526] [c0000027ed50f750] [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
    [  204.125588] [c0000027ed50f7d0] [c000000000069cc8] .pnv_pci_dma_dev_setup+0x78/0x340
    [  204.125650] [c0000027ed50f870] [c000000000044408] .pcibios_setup_device+0x88/0x2f0
    [  204.125712] [c0000027ed50f940] [c000000000046040] .pcibios_setup_bus_devices+0x60/0xd0
    [  204.125774] [c0000027ed50f9c0] [c000000000043acc] .pcibios_add_pci_devices+0xdc/0x1c0
    [  204.125837] [c0000027ed50fa50] [c00000000086f970] .eeh_reset_device+0x36c/0x4f0
    [  204.125939] [c0000027ed50fb20] [c00000000003a2d8] .eeh_handle_normal_event+0x448/0x480
    [  204.126068] [c0000027ed50fbc0] [c00000000003a35c] .eeh_handle_event+0x4c/0x340
    [  204.126192] [c0000027ed50fc80] [c00000000003a74c] .eeh_event_handler+0xfc/0x1b0
    [  204.126319] [c0000027ed50fd30] [c0000000000d1ea0] .kthread+0x110/0x130
    [  204.126430] [c0000027ed50fe30] [c00000000000a460] .ret_from_kernel_thread+0x5c/0x7c
    [  204.126556] Instruction dump:
    [  204.126610] 7c0802a6 fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ff71 7c7e1b78 60000000
    [  204.126787] 60000000 e87e0298 3143ffff 7d2a1910 <0b090000> 2fa90000 40de00c8 ebfe0218
    [  204.126966] ---[ end trace 6e7aefd80add2973 ]---

are cleared.

This patch removes iommu_add_device() in pnv_pci_ioda_dma_dev_setup(), which
revert part of the change in commit d905c5df(PPC: POWERNV: move
iommu_add_device earlier).

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index bc8e74e..bd3870e 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1423,7 +1423,7 @@ static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *pdev
 
 	pe = &phb->ioda.pe_array[pdn->pe_number];
 	WARN_ON(get_dma_ops(&pdev->dev) != &dma_iommu_ops);
-	set_iommu_table_base_and_group(&pdev->dev, pe->tce32_table);
+	set_iommu_table_base(&pdev->dev, pe->tce32_table);
 }
 
 static int pnv_pci_ioda_dma_set_mask(struct pnv_phb *phb,
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] powerpc/powernv: release the refcount for pci_dev
  2014-04-23  2:26 [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device() Wei Yang
@ 2014-04-23  2:26 ` Wei Yang
  2014-04-28 13:35 ` [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device() Alexey Kardashevskiy
  1 sibling, 0 replies; 10+ messages in thread
From: Wei Yang @ 2014-04-23  2:26 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Wei Yang, aik, gwshan

On PowerNV platform, we are holding an unnecessary refcount on a pci_dev, which
leads to the pci_dev is not destroyed when hotplugging a pci device.

This patch release the unnecessary refcount.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index bd3870e..cec6a75 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -558,7 +558,6 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
 				pci_name(dev));
 			continue;
 		}
-		pci_dev_get(dev);
 		pdn->pcidev = dev;
 		pdn->pe_number = pe->pe_number;
 		pe->dma_weight += pnv_ioda_dma_weight(dev);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device()
  2014-04-23  2:26 [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device() Wei Yang
  2014-04-23  2:26 ` [PATCH 2/2] powerpc/powernv: release the refcount for pci_dev Wei Yang
@ 2014-04-28 13:35 ` Alexey Kardashevskiy
  2014-04-29  6:49   ` Wei Yang
  1 sibling, 1 reply; 10+ messages in thread
From: Alexey Kardashevskiy @ 2014-04-28 13:35 UTC (permalink / raw)
  To: Wei Yang, linuxppc-dev; +Cc: gwshan

On 04/23/2014 12:26 PM, Wei Yang wrote:
> During the EEH hotplug event, iommu_add_device() will be invoked three times
> and two of them will trigger warning or error.
> 
> The three times to invoke the iommu_add_device() are:
> 
>     pci_device_add
>        ...
>        set_iommu_table_base_and_group   <- 1st time, fail
>     device_add
>        ...
>        tce_iommu_bus_notifier           <- 2nd time, succees
>     pcibios_add_pci_devices
>        ...
>        pcibios_setup_bus_devices        <- 3rd time, re-attach
> 
> The first time fails, since the dev->kobj->sd is not initialized. The
> dev->kobj->sd is initialized in device_add().
> The third time's warning is triggered by the re-attach of the iommu_group.
> 
> After applying this patch, the error

Nack.

The patch still seems incorrect and we actually need to remove
tce_iommu_bus_notifier completely as pcibios_setup_bus_devices is called
from another notifier anyway. Could you please test it?


> 
>     iommu_tce: 0003:05:00.0 has not been added, ret=-14
> 
> and the warning
> 
>     [  204.123609] ------------[ cut here ]------------
>     [  204.123645] WARNING: at arch/powerpc/kernel/iommu.c:1125
>     [  204.123680] Modules linked in: xt_CHECKSUM nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT bnep bluetooth 6lowpan_iphc rfkill xt_conntrack ebtable_nat ebtable_broute bridge stp llc mlx4_ib ib_sa ib_mad ib_core ib_addr ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bnx2x tg3 mlx4_core nfsd ptp mdio ses libcrc32c nfs_acl enclosure be2net pps_core shpchp lockd kvm uinput sunrpc binfmt_misc lpfc scsi_transport_fc ipr scsi_tgt
>     [  204.124356] CPU: 18 PID: 650 Comm: eehd Not tainted 3.14.0-rc5yw+ #102
>     [  204.124400] task: c0000027ed485670 ti: c0000027ed50c000 task.ti: c0000027ed50c000
>     [  204.124453] NIP: c00000000003cf80 LR: c00000000006c648 CTR: c00000000006c5c0
>     [  204.124506] REGS: c0000027ed50f440 TRAP: 0700   Not tainted  (3.14.0-rc5yw+)
>     [  204.124558] MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI>  CR: 88008084  XER: 20000000
>     [  204.124682] CFAR: c00000000006c644 SOFTE: 1
>     GPR00: c00000000006c648 c0000027ed50f6c0 c000000001398380 c0000027ec260300
>     GPR04: c0000027ea92c000 c00000000006ad00 c0000000016e41b0 0000000000000110
>     GPR08: c0000000012cd4c0 0000000000000001 c0000027ec2602ff 0000000000000062
>     GPR12: 0000000028008084 c00000000fdca200 c0000000000d1d90 c0000027ec281a80
>     GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>     GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000001
>     GPR24: 000000005342697b 0000000000002906 c000001fe6ac9800 c000001fe6ac9800
>     GPR28: 0000000000000000 c0000000016e3a80 c0000027ea92c090 c0000027ea92c000
>     [  204.125353] NIP [c00000000003cf80] .iommu_add_device+0x30/0x1f0
>     [  204.125399] LR [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
>     [  204.125443] Call Trace:
>     [  204.125464] [c0000027ed50f6c0] [c0000027ed50f750] 0xc0000027ed50f750 (unreliable)
>     [  204.125526] [c0000027ed50f750] [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
>     [  204.125588] [c0000027ed50f7d0] [c000000000069cc8] .pnv_pci_dma_dev_setup+0x78/0x340
>     [  204.125650] [c0000027ed50f870] [c000000000044408] .pcibios_setup_device+0x88/0x2f0
>     [  204.125712] [c0000027ed50f940] [c000000000046040] .pcibios_setup_bus_devices+0x60/0xd0
>     [  204.125774] [c0000027ed50f9c0] [c000000000043acc] .pcibios_add_pci_devices+0xdc/0x1c0
>     [  204.125837] [c0000027ed50fa50] [c00000000086f970] .eeh_reset_device+0x36c/0x4f0
>     [  204.125939] [c0000027ed50fb20] [c00000000003a2d8] .eeh_handle_normal_event+0x448/0x480
>     [  204.126068] [c0000027ed50fbc0] [c00000000003a35c] .eeh_handle_event+0x4c/0x340
>     [  204.126192] [c0000027ed50fc80] [c00000000003a74c] .eeh_event_handler+0xfc/0x1b0
>     [  204.126319] [c0000027ed50fd30] [c0000000000d1ea0] .kthread+0x110/0x130
>     [  204.126430] [c0000027ed50fe30] [c00000000000a460] .ret_from_kernel_thread+0x5c/0x7c
>     [  204.126556] Instruction dump:
>     [  204.126610] 7c0802a6 fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ff71 7c7e1b78 60000000
>     [  204.126787] 60000000 e87e0298 3143ffff 7d2a1910 <0b090000> 2fa90000 40de00c8 ebfe0218
>     [  204.126966] ---[ end trace 6e7aefd80add2973 ]---
> 
> are cleared.
> 
> This patch removes iommu_add_device() in pnv_pci_ioda_dma_dev_setup(), which
> revert part of the change in commit d905c5df(PPC: POWERNV: move
> iommu_add_device earlier).
> 
> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index bc8e74e..bd3870e 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1423,7 +1423,7 @@ static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *pdev
>  
>  	pe = &phb->ioda.pe_array[pdn->pe_number];
>  	WARN_ON(get_dma_ops(&pdev->dev) != &dma_iommu_ops);
> -	set_iommu_table_base_and_group(&pdev->dev, pe->tce32_table);
> +	set_iommu_table_base(&pdev->dev, pe->tce32_table);
>  }
>  
>  static int pnv_pci_ioda_dma_set_mask(struct pnv_phb *phb,
> 


-- 
Alexey Kardashevskiy
IBM OzLabs, LTC Team

e-mail: aik@au1.ibm.com
notes: Alexey Kardashevskiy/Australia/IBM

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device()
  2014-04-28 13:35 ` [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device() Alexey Kardashevskiy
@ 2014-04-29  6:49   ` Wei Yang
  2014-04-29  7:55     ` Alexey Kardashevskiy
  2014-04-30  0:28     ` Gavin Shan
  0 siblings, 2 replies; 10+ messages in thread
From: Wei Yang @ 2014-04-29  6:49 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: linuxppc-dev, gwshan

On Mon, Apr 28, 2014 at 11:35:32PM +1000, Alexey Kardashevskiy wrote:
>On 04/23/2014 12:26 PM, Wei Yang wrote:
>> During the EEH hotplug event, iommu_add_device() will be invoked three times
>> and two of them will trigger warning or error.
>> 
>> The three times to invoke the iommu_add_device() are:
>> 
>>     pci_device_add
>>        ...
>>        set_iommu_table_base_and_group   <- 1st time, fail
>>     device_add
>>        ...
>>        tce_iommu_bus_notifier           <- 2nd time, succees
>>     pcibios_add_pci_devices
>>        ...
>>        pcibios_setup_bus_devices        <- 3rd time, re-attach
>> 
>> The first time fails, since the dev->kobj->sd is not initialized. The
>> dev->kobj->sd is initialized in device_add().
>> The third time's warning is triggered by the re-attach of the iommu_group.
>> 
>> After applying this patch, the error
>
>Nack.
>
>The patch still seems incorrect and we actually need to remove
>tce_iommu_bus_notifier completely as pcibios_setup_bus_devices is called
>from another notifier anyway. Could you please test it?
>
>

Hi, Alexey,

Nice to see your comment. Let me show what I got fist.

Generally, when kernel enumerate on the pci device, following functions will
be invoked.

     pci_device_add
        pcibios_setup_bus_device
        ...
        set_iommu_table_base_and_group   
     device_add
        ...
        tce_iommu_bus_notifier           
     pcibios_fixp_bus
        pcibios_add_pci_devices
        ...
        pcibios_setup_bus_devices        

>From the call flow, we see for a normall pci device, the
pcibios_setup_bus_device() will be invoked twice.

At the bootup time, none of them succeed to setup the dma, since the PE is not
assigned or the tbl is not set. The iommu tbl and group is setup in
pnv_pci_ioda_setup_DMA().

This call flow maintains the same when EEH error happens on Bus PE, while the
behavior is a little different. 

     pci_device_add
        pcibios_setup_bus_device
        ...
        set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized
     device_add
        ...
        tce_iommu_bus_notifier           <- succeed
     pcibios_fixp_bus
        pcibios_add_pci_devices
        ...
        pcibios_setup_bus_devices        <- warning, re-attach

While this call flow will change a little on a VF. For a VF,
pcibios_fixp_bus() will not be invoked. Current behavior is this.

     pci_device_add
        pcibios_setup_bus_device
        ...
        set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized
     device_add
        ...
        tce_iommu_bus_notifier           <- succeed

And if an EEH error happens just on a VF, I believe the same call flow should
go for recovery. (This is not set down, still under discussion with Gavin.)

My conclusion is:
1. remove the tce_iommu_bus_notifier completely will make the VF not work.
   So I choose to revert the code and attach make the iommu group attachment
   just happens in one place.

BTW, I know my patch is not a perfect one. For a PF, the tbl will still be set
twice. I am not sure why we need to invoke pcibios_setup_device() twice on a
PF, maybe some platform need to fixup some thing after the pci_bus is added.
So I don't remove one of them to solve the problem.

If you have a better idea, I am glad to take it.

-- 
Richard Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device()
  2014-04-29  6:49   ` Wei Yang
@ 2014-04-29  7:55     ` Alexey Kardashevskiy
  2014-04-29  9:37       ` Wei Yang
  2014-04-30  0:28     ` Gavin Shan
  1 sibling, 1 reply; 10+ messages in thread
From: Alexey Kardashevskiy @ 2014-04-29  7:55 UTC (permalink / raw)
  To: Wei Yang; +Cc: linuxppc-dev, gwshan

On 04/29/2014 04:49 PM, Wei Yang wrote:
> On Mon, Apr 28, 2014 at 11:35:32PM +1000, Alexey Kardashevskiy wrote:
>> On 04/23/2014 12:26 PM, Wei Yang wrote:
>>> During the EEH hotplug event, iommu_add_device() will be invoked three times
>>> and two of them will trigger warning or error.
>>>
>>> The three times to invoke the iommu_add_device() are:
>>>
>>>     pci_device_add
>>>        ...
>>>        set_iommu_table_base_and_group   <- 1st time, fail
>>>     device_add
>>>        ...
>>>        tce_iommu_bus_notifier           <- 2nd time, succees
>>>     pcibios_add_pci_devices
>>>        ...
>>>        pcibios_setup_bus_devices        <- 3rd time, re-attach
>>>
>>> The first time fails, since the dev->kobj->sd is not initialized. The
>>> dev->kobj->sd is initialized in device_add().
>>> The third time's warning is triggered by the re-attach of the iommu_group.
>>>
>>> After applying this patch, the error
>>
>> Nack.
>>
>> The patch still seems incorrect and we actually need to remove
>> tce_iommu_bus_notifier completely as pcibios_setup_bus_devices is called
>>from another notifier anyway. Could you please test it?
>>
>>
> 
> Hi, Alexey,
> 
> Nice to see your comment. Let me show what I got fist.
> 
> Generally, when kernel enumerate on the pci device, following functions will
> be invoked.
> 
>      pci_device_add
>         pcibios_setup_bus_device
>         ...
>         set_iommu_table_base_and_group   
>      device_add
>         ...
>         tce_iommu_bus_notifier           
>      pcibios_fixp_bus
>         pcibios_add_pci_devices
>         ...
>         pcibios_setup_bus_devices        
> 
>>From the call flow, we see for a normall pci device, the
> pcibios_setup_bus_device() will be invoked twice.


tce_iommu_bus_notifier should not be called for devices during boot-time
PCI discovery as the discovery is done from subsys_initcall handler and
tce_iommu_bus_notifier is registered as subsys_initcall_sync. Are you sure
this is happening? You should see warnings as for PF's EEH but you do not
say that.


> At the bootup time, none of them succeed to setup the dma, since the PE is not
> assigned or the tbl is not set. The iommu tbl and group is setup in
> pnv_pci_ioda_setup_DMA().
> 
> This call flow maintains the same when EEH error happens on Bus PE, while the
> behavior is a little different. 
> 
>      pci_device_add
>         pcibios_setup_bus_device
>         ...
>         set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized


Sorry, what is uninitialized? The only kobj here is the one in iommu_group
and it must have been taken care of before adding a device.


>      device_add
>         ...
>         tce_iommu_bus_notifier           <- succeed
>      pcibios_fixp_bus
>         pcibios_add_pci_devices
>         ...
>         pcibios_setup_bus_devices        <- warning, re-attach


This is why I am suggesting getting rid of tce_iommu_bus_notifier - we will
avoid the warning.


> While this call flow will change a little on a VF. For a VF,
> pcibios_fixp_bus() will not be invoked. Current behavior is this.


You meant pcibios_fixup_bus? I would expect it to get called (from
pci_rescan_bus() or something like that?) and this seems to be the problem
here.

How are VFs added in terms of code flow? Is there any git tree to look at?



>      pci_device_add
>         pcibios_setup_bus_device
>         ...
>         set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized
>      device_add
>         ...
>         tce_iommu_bus_notifier           <- succeed
> 
> And if an EEH error happens just on a VF, I believe the same call flow should
> go for recovery. (This is not set down, still under discussion with Gavin.)
> 
> My conclusion is:
> 1. remove the tce_iommu_bus_notifier completely will make the VF not work.
>    So I choose to revert the code and attach make the iommu group attachment
>    just happens in one place.
> 
> BTW, I know my patch is not a perfect one. For a PF, the tbl will still be set
> twice. I am not sure why we need to invoke pcibios_setup_device() twice on a
> PF, maybe some platform need to fixup some thing after the pci_bus is added.
> So I don't remove one of them to solve the problem.
> 
> If you have a better idea, I am glad to take it.
> 


-- 
Alexey Kardashevskiy
IBM OzLabs, LTC Team

e-mail: aik@au1.ibm.com
notes: Alexey Kardashevskiy/Australia/IBM

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device()
  2014-04-29  7:55     ` Alexey Kardashevskiy
@ 2014-04-29  9:37       ` Wei Yang
  2014-04-29 13:11         ` Alexey Kardashevskiy
  0 siblings, 1 reply; 10+ messages in thread
From: Wei Yang @ 2014-04-29  9:37 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Wei Yang, linuxppc-dev, gwshan

On Tue, Apr 29, 2014 at 05:55:48PM +1000, Alexey Kardashevskiy wrote:
>On 04/29/2014 04:49 PM, Wei Yang wrote:
>> On Mon, Apr 28, 2014 at 11:35:32PM +1000, Alexey Kardashevskiy wrote:
>>> On 04/23/2014 12:26 PM, Wei Yang wrote:
>>>> During the EEH hotplug event, iommu_add_device() will be invoked three times
>>>> and two of them will trigger warning or error.
>>>>
>>>> The three times to invoke the iommu_add_device() are:
>>>>
>>>>     pci_device_add
>>>>        ...
>>>>        set_iommu_table_base_and_group   <- 1st time, fail
>>>>     device_add
>>>>        ...
>>>>        tce_iommu_bus_notifier           <- 2nd time, succees
>>>>     pcibios_add_pci_devices
>>>>        ...
>>>>        pcibios_setup_bus_devices        <- 3rd time, re-attach
>>>>
>>>> The first time fails, since the dev->kobj->sd is not initialized. The
>>>> dev->kobj->sd is initialized in device_add().
>>>> The third time's warning is triggered by the re-attach of the iommu_group.
>>>>
>>>> After applying this patch, the error
>>>
>>> Nack.
>>>
>>> The patch still seems incorrect and we actually need to remove
>>> tce_iommu_bus_notifier completely as pcibios_setup_bus_devices is called
>>>from another notifier anyway. Could you please test it?
>>>
>>>
>> 
>> Hi, Alexey,
>> 
>> Nice to see your comment. Let me show what I got fist.
>> 
>> Generally, when kernel enumerate on the pci device, following functions will
>> be invoked.
>> 
>>      pci_device_add
>>         pcibios_setup_bus_device
>>         ...
>>         set_iommu_table_base_and_group   
>>      device_add
>>         ...
>>         tce_iommu_bus_notifier           
>>      pcibios_fixp_bus
>>         pcibios_add_pci_devices
>>         ...
>>         pcibios_setup_bus_devices        
>> 
>>>From the call flow, we see for a normall pci device, the
>> pcibios_setup_bus_device() will be invoked twice.
>
>
>tce_iommu_bus_notifier should not be called for devices during boot-time
>PCI discovery as the discovery is done from subsys_initcall handler and
>tce_iommu_bus_notifier is registered as subsys_initcall_sync. Are you sure
>this is happening? You should see warnings as for PF's EEH but you do not
>say that.
>

Let me make it clearer. I list the call flow for general purpose to show the
sequence of the call flow.

The tce_iommu_bus_notifier is not invoked at the boot time. And none of them
do real thing at boot time.

I don't understand your last sentence. I see warning and error during EEH
hotplug. If necessary, I will add this in the commit log.

>
>> At the bootup time, none of them succeed to setup the dma, since the PE is not
>> assigned or the tbl is not set. The iommu tbl and group is setup in
>> pnv_pci_ioda_setup_DMA().
>> 
>> This call flow maintains the same when EEH error happens on Bus PE, while the
>> behavior is a little different. 
>> 
>>      pci_device_add
>>         pcibios_setup_bus_device
>>         ...
>>         set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized
>
>
>Sorry, what is uninitialized? The only kobj here is the one in iommu_group
>and it must have been taken care of before adding a device.

pci_dev->dev->kobj->sd.

I have explained this in previous discussion. This kobj->sd is initialized in
device_add().

>
>
>>      device_add
>>         ...
>>         tce_iommu_bus_notifier           <- succeed
>>      pcibios_fixp_bus
>>         pcibios_add_pci_devices
>>         ...
>>         pcibios_setup_bus_devices        <- warning, re-attach
>
>
>This is why I am suggesting getting rid of tce_iommu_bus_notifier - we will
>avoid the warning.

Then how about the first time's error?

>
>
>> While this call flow will change a little on a VF. For a VF,
>> pcibios_fixp_bus() will not be invoked. Current behavior is this.
>
>
>You meant pcibios_fixup_bus? I would expect it to get called (from
>pci_rescan_bus() or something like that?) and this seems to be the problem
>here.

When EEH happens and hotplug the pci bus, we need to do the pci_scan_bridge()
which will do the pcibios_fixup_bus().

So you suggest to remove it? This is the generic code.

>
>How are VFs added in terms of code flow? Is there any git tree to look at?
>

VF add code flow is a generic code in drivers/pci/iov.c, virtfn_add().

>
>
>>      pci_device_add
>>         pcibios_setup_bus_device
>>         ...
>>         set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized
>>      device_add
>>         ...
>>         tce_iommu_bus_notifier           <- succeed
>> 
>> And if an EEH error happens just on a VF, I believe the same call flow should
>> go for recovery. (This is not set down, still under discussion with Gavin.)
>> 
>> My conclusion is:
>> 1. remove the tce_iommu_bus_notifier completely will make the VF not work.
>>    So I choose to revert the code and attach make the iommu group attachment
>>    just happens in one place.
>> 
>> BTW, I know my patch is not a perfect one. For a PF, the tbl will still be set
>> twice. I am not sure why we need to invoke pcibios_setup_device() twice on a
>> PF, maybe some platform need to fixup some thing after the pci_bus is added.
>> So I don't remove one of them to solve the problem.
>> 
>> If you have a better idea, I am glad to take it.
>> 
>
>
>-- 
>Alexey Kardashevskiy
>IBM OzLabs, LTC Team
>
>e-mail: aik@au1.ibm.com
>notes: Alexey Kardashevskiy/Australia/IBM

-- 
Richard Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device()
  2014-04-29  9:37       ` Wei Yang
@ 2014-04-29 13:11         ` Alexey Kardashevskiy
  2014-04-30  1:31           ` Wei Yang
  0 siblings, 1 reply; 10+ messages in thread
From: Alexey Kardashevskiy @ 2014-04-29 13:11 UTC (permalink / raw)
  To: Wei Yang, Alexey Kardashevskiy; +Cc: linuxppc-dev, gwshan

On 04/29/2014 07:37 PM, Wei Yang wrote:
> On Tue, Apr 29, 2014 at 05:55:48PM +1000, Alexey Kardashevskiy wrote:
>> On 04/29/2014 04:49 PM, Wei Yang wrote:
>>> On Mon, Apr 28, 2014 at 11:35:32PM +1000, Alexey Kardashevskiy wrote:
>>>> On 04/23/2014 12:26 PM, Wei Yang wrote:
>>>>> During the EEH hotplug event, iommu_add_device() will be invoked three times
>>>>> and two of them will trigger warning or error.
>>>>>
>>>>> The three times to invoke the iommu_add_device() are:
>>>>>
>>>>>     pci_device_add
>>>>>        ...
>>>>>        set_iommu_table_base_and_group   <- 1st time, fail
>>>>>     device_add
>>>>>        ...
>>>>>        tce_iommu_bus_notifier           <- 2nd time, succees
>>>>>     pcibios_add_pci_devices
>>>>>        ...
>>>>>        pcibios_setup_bus_devices        <- 3rd time, re-attach
>>>>>
>>>>> The first time fails, since the dev->kobj->sd is not initialized. The
>>>>> dev->kobj->sd is initialized in device_add().
>>>>> The third time's warning is triggered by the re-attach of the iommu_group.
>>>>>
>>>>> After applying this patch, the error
>>>>
>>>> Nack.
>>>>
>>>> The patch still seems incorrect and we actually need to remove
>>>> tce_iommu_bus_notifier completely as pcibios_setup_bus_devices is called
>>> >from another notifier anyway. Could you please test it?
>>>>
>>>>
>>>
>>> Hi, Alexey,
>>>
>>> Nice to see your comment. Let me show what I got fist.
>>>
>>> Generally, when kernel enumerate on the pci device, following functions will
>>> be invoked.
>>>
>>>      pci_device_add
>>>         pcibios_setup_bus_device
>>>         ...
>>>         set_iommu_table_base_and_group   
>>>      device_add
>>>         ...
>>>         tce_iommu_bus_notifier           
>>>      pcibios_fixp_bus
>>>         pcibios_add_pci_devices
>>>         ...
>>>         pcibios_setup_bus_devices        
>>>
>>> >From the call flow, we see for a normall pci device, the
>>> pcibios_setup_bus_device() will be invoked twice.
>>
>>
>> tce_iommu_bus_notifier should not be called for devices during boot-time
>> PCI discovery as the discovery is done from subsys_initcall handler and
>> tce_iommu_bus_notifier is registered as subsys_initcall_sync. Are you sure
>> this is happening? You should see warnings as for PF's EEH but you do not
>> say that.
>>
> 
> Let me make it clearer. I list the call flow for general purpose to show the
> sequence of the call flow.
> 
> The tce_iommu_bus_notifier is not invoked at the boot time. And none of them
> do real thing at boot time.
> 
> I don't understand your last sentence. I see warning and error during EEH
> hotplug. If necessary, I will add this in the commit log.
> 
>>
>>> At the bootup time, none of them succeed to setup the dma, since the PE is not
>>> assigned or the tbl is not set. The iommu tbl and group is setup in
>>> pnv_pci_ioda_setup_DMA().
>>>
>>> This call flow maintains the same when EEH error happens on Bus PE, while the
>>> behavior is a little different. 
>>>
>>>      pci_device_add
>>>         pcibios_setup_bus_device
>>>         ...
>>>         set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized
>>
>>
>> Sorry, what is uninitialized? The only kobj here is the one in iommu_group
>> and it must have been taken care of before adding a device.
> 
> pci_dev->dev->kobj->sd.
> 
> I have explained this in previous discussion. This kobj->sd is initialized in
> device_add().
> 
>>
>>
>>>      device_add
>>>         ...
>>>         tce_iommu_bus_notifier           <- succeed
>>>      pcibios_fixp_bus
>>>         pcibios_add_pci_devices
>>>         ...
>>>         pcibios_setup_bus_devices        <- warning, re-attach
>>
>>
>> This is why I am suggesting getting rid of tce_iommu_bus_notifier - we will
>> avoid the warning.
> 
> Then how about the first time's error?


What do you call "first time error" here?


>>> While this call flow will change a little on a VF. For a VF,
>>> pcibios_fixp_bus() will not be invoked. Current behavior is this.
>>
>>
>> You meant pcibios_fixup_bus? I would expect it to get called (from
>> pci_rescan_bus() or something like that?) and this seems to be the problem
>> here.
> 
> When EEH happens and hotplug the pci bus, we need to do the pci_scan_bridge()
> which will do the pcibios_fixup_bus().

Are you talking now about EEH on PF (physical function) or EEH on VF
(virtual function)?

Are you calling pci_scan_bridge()


> So you suggest to remove it? This is the generic code.


I suggest removing tce_iommu_bus_notifier only. It must go. It was my
mistake at the first place.


>> How are VFs added in terms of code flow? Is there any git tree to look at?
>>
> 
> VF add code flow is a generic code in drivers/pci/iov.c, virtfn_add().

virtfn_add -> pci_device_add -> pcibios_add_device -> pcibios_setup_device
-> ppc_md.pci_dma_dev_setup -> pnv_pci_dma_dev_setup ->
pnv_pci_ioda_dma_dev_setup ->
set_iommu_table_base.


What part of this is missing?


> 
>>
>>
>>>      pci_device_add
>>>         pcibios_setup_bus_device
>>>         ...
>>>         set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized
>>>      device_add
>>>         ...
>>>         tce_iommu_bus_notifier           <- succeed
>>>
>>> And if an EEH error happens just on a VF, I believe the same call flow should
>>> go for recovery. (This is not set down, still under discussion with Gavin.)
>>>
>>> My conclusion is:
>>> 1. remove the tce_iommu_bus_notifier completely will make the VF not work.
>>>    So I choose to revert the code and attach make the iommu group attachment
>>>    just happens in one place.
>>>
>>> BTW, I know my patch is not a perfect one. For a PF, the tbl will still be set
>>> twice. I am not sure why we need to invoke pcibios_setup_device() twice on a
>>> PF, maybe some platform need to fixup some thing after the pci_bus is added.
>>> So I don't remove one of them to solve the problem.
>>>
>>> If you have a better idea, I am glad to take it.



-- 
Alexey

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device()
  2014-04-29  6:49   ` Wei Yang
  2014-04-29  7:55     ` Alexey Kardashevskiy
@ 2014-04-30  0:28     ` Gavin Shan
  2014-04-30  3:28       ` Wei Yang
  1 sibling, 1 reply; 10+ messages in thread
From: Gavin Shan @ 2014-04-30  0:28 UTC (permalink / raw)
  To: Wei Yang; +Cc: linuxppc-dev, Alexey Kardashevskiy, gwshan

On Tue, Apr 29, 2014 at 02:49:55PM +0800, Wei Yang wrote:
>On Mon, Apr 28, 2014 at 11:35:32PM +1000, Alexey Kardashevskiy wrote:
>>On 04/23/2014 12:26 PM, Wei Yang wrote:

.../...

>Generally, when kernel enumerate on the pci device, following functions will
>be invoked.
>
>     pci_device_add
>        pcibios_setup_bus_device
>        ...
>        set_iommu_table_base_and_group   
>     device_add
>        ...
>        tce_iommu_bus_notifier           
>     pcibios_fixup_bus
>        pcibios_add_pci_devices
>        ...
>        pcibios_setup_bus_devices        
>
>From the call flow, we see for a normall pci device, the
>pcibios_setup_bus_device() will be invoked twice.
>
>At the bootup time, none of them succeed to setup the dma, since the PE is not
>assigned or the tbl is not set. The iommu tbl and group is setup in
>pnv_pci_ioda_setup_DMA().
>

Yes, we don't assign PE# for PCI devices until ppc_md.pcibios_fixup().
We gets IOMMU group and IOMMU group device registered in ppc_md.pcibios_fixup().

As Alexy already pointed out, "tce_iommu_bus_notifier" doesn't take effect
during system boot stage.

>This call flow maintains the same when EEH error happens on Bus PE, while the
>behavior is a little different. 
>
>     pci_device_add
>        pcibios_setup_bus_device
>        ...
>        set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized
>     device_add
>        ...
>        tce_iommu_bus_notifier           <- succeed
>     pcibios_fixp_bus
>        pcibios_add_pci_devices
>        ...
>        pcibios_setup_bus_devices        <- warning, re-attach
>
>While this call flow will change a little on a VF. For a VF,
>pcibios_fixp_bus() will not be invoked. Current behavior is this.
>
>     pci_device_add
>        pcibios_setup_bus_device
>        ...
>        set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized
>     device_add
>        ...
>        tce_iommu_bus_notifier           <- succeed
>

It seems that we have 2 problems here:

- For non-SRIOV case, pcibios_setup_device() is called for towice. That
  seems incorrect. We could simply remove pcibios_setup_bus_devices()
  from pcibios_fixup_bus().

- It's too early to register IOMMU group/device in pnv_pci_ioda_dma_dev_setup()
  because the sysfs entries of the PCI device aren't finalized yet. So we could
  remove all logic we have in pnv_pci_ioda_dma_dev_setup() and just purely rely
  on "tce_iommu_bus_notifier".

By the way, I never tried EEH on SRIOV PF/VFs. However, I never hit similar
issue in non-SRIOV cases.

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device()
  2014-04-29 13:11         ` Alexey Kardashevskiy
@ 2014-04-30  1:31           ` Wei Yang
  0 siblings, 0 replies; 10+ messages in thread
From: Wei Yang @ 2014-04-30  1:31 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: linuxppc-dev, Wei Yang, Alexey Kardashevskiy, gwshan

On Tue, Apr 29, 2014 at 11:11:05PM +1000, Alexey Kardashevskiy wrote:
>On 04/29/2014 07:37 PM, Wei Yang wrote:
>> On Tue, Apr 29, 2014 at 05:55:48PM +1000, Alexey Kardashevskiy wrote:
>>> On 04/29/2014 04:49 PM, Wei Yang wrote:
>>>> On Mon, Apr 28, 2014 at 11:35:32PM +1000, Alexey Kardashevskiy wrote:
>>>>> On 04/23/2014 12:26 PM, Wei Yang wrote:
>>>>>> During the EEH hotplug event, iommu_add_device() will be invoked three times
>>>>>> and two of them will trigger warning or error.
>>>>>>
>>>>>> The three times to invoke the iommu_add_device() are:
>>>>>>
>>>>>>     pci_device_add
>>>>>>        ...
>>>>>>        set_iommu_table_base_and_group   <- 1st time, fail
>>>>>>     device_add
>>>>>>        ...
>>>>>>        tce_iommu_bus_notifier           <- 2nd time, succees
>>>>>>     pcibios_add_pci_devices
>>>>>>        ...
>>>>>>        pcibios_setup_bus_devices        <- 3rd time, re-attach
>>>>>>
>>>>>> The first time fails, since the dev->kobj->sd is not initialized. The
>>>>>> dev->kobj->sd is initialized in device_add().
>>>>>> The third time's warning is triggered by the re-attach of the iommu_group.
>>>>>>
>>>>>> After applying this patch, the error
>>>>>
>>>>> Nack.
>>>>>
>>>>> The patch still seems incorrect and we actually need to remove
>>>>> tce_iommu_bus_notifier completely as pcibios_setup_bus_devices is called
>>>> >from another notifier anyway. Could you please test it?
>>>>>
>>>>>
>>>>
>>>> Hi, Alexey,
>>>>
>>>> Nice to see your comment. Let me show what I got fist.
>>>>
>>>> Generally, when kernel enumerate on the pci device, following functions will
>>>> be invoked.
>>>>
>>>>      pci_device_add
>>>>         pcibios_setup_bus_device
>>>>         ...
>>>>         set_iommu_table_base_and_group   
>>>>      device_add
>>>>         ...
>>>>         tce_iommu_bus_notifier           
>>>>      pcibios_fixp_bus
>>>>         pcibios_add_pci_devices
>>>>         ...
>>>>         pcibios_setup_bus_devices        
>>>>
>>>> >From the call flow, we see for a normall pci device, the
>>>> pcibios_setup_bus_device() will be invoked twice.
>>>
>>>
>>> tce_iommu_bus_notifier should not be called for devices during boot-time
>>> PCI discovery as the discovery is done from subsys_initcall handler and
>>> tce_iommu_bus_notifier is registered as subsys_initcall_sync. Are you sure
>>> this is happening? You should see warnings as for PF's EEH but you do not
>>> say that.
>>>
>> 
>> Let me make it clearer. I list the call flow for general purpose to show the
>> sequence of the call flow.
>> 
>> The tce_iommu_bus_notifier is not invoked at the boot time. And none of them
>> do real thing at boot time.
>> 
>> I don't understand your last sentence. I see warning and error during EEH
>> hotplug. If necessary, I will add this in the commit log.
>> 
>>>
>>>> At the bootup time, none of them succeed to setup the dma, since the PE is not
>>>> assigned or the tbl is not set. The iommu tbl and group is setup in
>>>> pnv_pci_ioda_setup_DMA().
>>>>
>>>> This call flow maintains the same when EEH error happens on Bus PE, while the
>>>> behavior is a little different. 
>>>>
>>>>      pci_device_add
>>>>         pcibios_setup_bus_device
>>>>         ...
>>>>         set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized
>>>
>>>
>>> Sorry, what is uninitialized? The only kobj here is the one in iommu_group
>>> and it must have been taken care of before adding a device.
>> 
>> pci_dev->dev->kobj->sd.
>> 
>> I have explained this in previous discussion. This kobj->sd is initialized in
>> device_add().
>> 
>>>
>>>
>>>>      device_add
>>>>         ...
>>>>         tce_iommu_bus_notifier           <- succeed
>>>>      pcibios_fixp_bus
>>>>         pcibios_add_pci_devices
>>>>         ...
>>>>         pcibios_setup_bus_devices        <- warning, re-attach
>>>
>>>
>>> This is why I am suggesting getting rid of tce_iommu_bus_notifier - we will
>>> avoid the warning.
>> 
>> Then how about the first time's error?
>
>
>What do you call "first time error" here?
>

Would you please take a look at my commit log?

Currently, the iommu_group will be added three times. The error happens at the
first time we try to attatch the iommu_group in pci_device_add(). And yes,
this happens at the EEH recovery on PF for a hotplug case.

The error is: 

        iommu_tce: 0003:05:00.0 has not been added, ret=-14

And the reason is:
  
        pci_dev->dev->kobj->sd is not initialized.

>
>>>> While this call flow will change a little on a VF. For a VF,
>>>> pcibios_fixp_bus() will not be invoked. Current behavior is this.
>>>
>>>
>>> You meant pcibios_fixup_bus? I would expect it to get called (from
>>> pci_rescan_bus() or something like that?) and this seems to be the problem
>>> here.
>> 
>> When EEH happens and hotplug the pci bus, we need to do the pci_scan_bridge()
>> which will do the pcibios_fixup_bus().
>
>Are you talking now about EEH on PF (physical function) or EEH on VF
>(virtual function)?
>
>Are you calling pci_scan_bridge()
>

Yes, it is pcibios_add_pci_devices() do the hotplug and invoke the
pci_scan_bridge().

>
>> So you suggest to remove it? This is the generic code.
>
>
>I suggest removing tce_iommu_bus_notifier only. It must go. It was my
>mistake at the first place.
>
>
>>> How are VFs added in terms of code flow? Is there any git tree to look at?
>>>
>> 
>> VF add code flow is a generic code in drivers/pci/iov.c, virtfn_add().
>
>virtfn_add -> pci_device_add -> pcibios_add_device -> pcibios_setup_device
>-> ppc_md.pci_dma_dev_setup -> pnv_pci_dma_dev_setup ->
>pnv_pci_ioda_dma_dev_setup ->
>set_iommu_table_base.
>
>
>What part of this is missing?
>

Currently, you will do set_iommu_table_base_and_group() in
pnv_pci_ioda_dma_dev_setup(). And this will fail and return an error.

The error reason is the kobj->sd is not initialized yet.
I hope it is clear this time.

>
>> 
>>>
>>>
>>>>      pci_device_add
>>>>         pcibios_setup_bus_device
>>>>         ...
>>>>         set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized
>>>>      device_add
>>>>         ...
>>>>         tce_iommu_bus_notifier           <- succeed
>>>>
>>>> And if an EEH error happens just on a VF, I believe the same call flow should
>>>> go for recovery. (This is not set down, still under discussion with Gavin.)
>>>>
>>>> My conclusion is:
>>>> 1. remove the tce_iommu_bus_notifier completely will make the VF not work.
>>>>    So I choose to revert the code and attach make the iommu group attachment
>>>>    just happens in one place.
>>>>
>>>> BTW, I know my patch is not a perfect one. For a PF, the tbl will still be set
>>>> twice. I am not sure why we need to invoke pcibios_setup_device() twice on a
>>>> PF, maybe some platform need to fixup some thing after the pci_bus is added.
>>>> So I don't remove one of them to solve the problem.
>>>>
>>>> If you have a better idea, I am glad to take it.
>
>
>
>-- 
>Alexey

-- 
Richard Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device()
  2014-04-30  0:28     ` Gavin Shan
@ 2014-04-30  3:28       ` Wei Yang
  0 siblings, 0 replies; 10+ messages in thread
From: Wei Yang @ 2014-04-30  3:28 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linuxppc-dev, Wei Yang, Alexey Kardashevskiy

On Wed, Apr 30, 2014 at 10:28:12AM +1000, Gavin Shan wrote:
>
>It seems that we have 2 problems here:
>
>- For non-SRIOV case, pcibios_setup_device() is called for towice. That
>  seems incorrect. We could simply remove pcibios_setup_bus_devices()
>  from pcibios_fixup_bus().

I have thought about this solution before, but I guess this would have some
side effect on other platforms.

Well, just did a test by removing this line in pcibios_fixup_bus(), the result
is:
1. system up, thanks god.
2. no one invoke the pcibios_setup_device() at bootup time.

The reason for no one invoke the pcibios_setup_device() is: in
pcibios_add_device() it will check whether the bus is added before calling
pcibios_setup_device(). And at this time, the bus is not added.

Still wierd, why the system could be up. But one thing for sure is, no one
invoke the pcibios_setup_device() at bootup stage. So this solution may not
work.

>
>- It's too early to register IOMMU group/device in pnv_pci_ioda_dma_dev_setup()
>  because the sysfs entries of the PCI device aren't finalized yet. So we could
>  remove all logic we have in pnv_pci_ioda_dma_dev_setup() and just purely rely
>  on "tce_iommu_bus_notifier".

This would be another solution. 

One concern:

    If we want to do like this, we need to retrieve the pe information and
    get the tce32_table base in tce_iommu_bus_notifier. Hmm... looks a little
    not that nice.

>
>By the way, I never tried EEH on SRIOV PF/VFs. However, I never hit similar
>issue in non-SRIOV cases.

I have test this case on a PF with no VF enabled.

I just make the mlx4_pci_err_detected() return PCI_ERS_RESULT_NONE and do
nothing. So this will force the eeh to do a hotplug recovery.

You could have a try on your machine too.

>
>Thanks,
>Gavin

-- 
Richard Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-04-30  3:28 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-23  2:26 [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device() Wei Yang
2014-04-23  2:26 ` [PATCH 2/2] powerpc/powernv: release the refcount for pci_dev Wei Yang
2014-04-28 13:35 ` [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device() Alexey Kardashevskiy
2014-04-29  6:49   ` Wei Yang
2014-04-29  7:55     ` Alexey Kardashevskiy
2014-04-29  9:37       ` Wei Yang
2014-04-29 13:11         ` Alexey Kardashevskiy
2014-04-30  1:31           ` Wei Yang
2014-04-30  0:28     ` Gavin Shan
2014-04-30  3:28       ` Wei Yang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.