All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Two patches regarding EEH availability checks - DLPAR/DDW
@ 2016-01-19 20:18 Guilherme G. Piccoli
  2016-01-19 20:18 ` [PATCH 1/2] powerpc/eeh: Check for EEH availability in eeh_add_device_early() Guilherme G. Piccoli
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Guilherme G. Piccoli @ 2016-01-19 20:18 UTC (permalink / raw)
  To: mpe, linuxppc-dev; +Cc: gwshan, benh, nfont, paulus, Guilherme G. Piccoli

We have 2 patches here to deal with checks of EEH availability before its use.
The issue that triggered these patches was related to hotplug-add a PCI device
with no other PCI device present in machine. Since EEH wouldn't be enabled in
this case, we hit an oops.

We already sent a patch to ppc-dev ("[PATCH] powerpc/eeh: Validate arch in
eeh_add_device_early()"), but Michael Ellerman made some good points that led us
to elaborate more our approach. Now, we have two patches instead of one:

(i) The first one aims to solve a specific issue in hotplug process: we have
lowered the "constraints" imposed by checking for eeh_enabled() result. To
achieve this, we've added a small function named eeh_available(). Instead of
checking if EEH is enabled on hotplug process (which is not always true), we
check if it's available based on running architecture (currently, only pSeries
and PowerNV have EEH capabilities).

(ii) The DDW mechanism has a "non-exposed" intrinsic dependency of EEH. We
rely on EEH to obtain the devices' config. address, so if EEH is not enabled
(for example if we boot with "eeh=off" on kernel command-line), we hit an oops
during device probe caused by DDW trying to use EEH struct. The 2nd patch in
this series inserts a check based on eeh_enabled() function, so the DDW
mechanism safely fallbacks to non-dynamic iommu if EEH is not enabled.

Guilherme G. Piccoli (2):
  powerpc/eeh: Check for EEH availability in eeh_add_device_early()
  powerpc/pseries: Check if EEH is enabled on DDW mechanism code

 arch/powerpc/kernel/eeh.c              | 19 ++++++++++++++++++-
 arch/powerpc/platforms/pseries/iommu.c |  6 ++++--
 2 files changed, 22 insertions(+), 3 deletions(-)

--
2.1.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] powerpc/eeh: Check for EEH availability in eeh_add_device_early()
  2016-01-19 20:18 [PATCH 0/2] Two patches regarding EEH availability checks - DLPAR/DDW Guilherme G. Piccoli
@ 2016-01-19 20:18 ` Guilherme G. Piccoli
  2016-02-02 22:44   ` Gavin Shan
  2016-01-19 20:18 ` [PATCH 2/2] powerpc/pseries: Check if EEH is enabled on DDW mechanism code Guilherme G. Piccoli
  2016-02-01 12:47 ` [PATCH 0/2] Two patches regarding EEH availability checks - DLPAR/DDW Guilherme G. Piccoli
  2 siblings, 1 reply; 11+ messages in thread
From: Guilherme G. Piccoli @ 2016-01-19 20:18 UTC (permalink / raw)
  To: mpe, linuxppc-dev; +Cc: gwshan, benh, nfont, paulus, Guilherme G. Piccoli

The function eeh_add_device_early() is used to perform EEH initialization in
devices added later on the system, like in hotplug/DLPAR scenarios. Since the
commit 89a51df5ab1d ("powerpc/eeh: Fix crash in eeh_add_device_early() on Cell")
a new check was introduced in this function - Cell has no EEH capabilities
which led to kernel oops if hotplug was performed, so checking for
eeh_enabled() was introduced to avoid the issue.

However, in architectures that EEH is present like pSeries or PowerNV, we might
reach a case in which no PCI devices are present on boot and so EEH is not
initialized. Then, if a device is added via DLPAR for example,
eeh_add_device_early() fails because eeh_enabled() is false.

Also, we can hit a kernel oops on pSeries arch if eeh_add_device_early() fails:
if we have no PCI devices on machine at boot time, and then we add a PCI device
via DLPAR operation, the function query_ddw() triggers the oops on NULL pointer
dereference in the line "cfg_addr = edev->config_addr;". It happens because
config_addr in edev is NULL, since the function eeh_add_device_early() was not
completed successfully.

This patch just changes the way the arch checking is done in function
eeh_add_device_early(): we don't use eeh_enabled() anymore, but instead we
introduce the function eeh_available() that checks the running architecture
by using the macro machine_is(). If we are running on pSeries or PowerNV, the
EEH mechanism is available (even if not initialized yet). This way, we don't
try to enable EEH on Cell and we don't hit the oops on DLPAR either.

Fixes: 89a51df5ab1d ("powerpc/eeh: Fix crash in eeh_add_device_early() on Cell")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/eeh.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 40e4d4a..69031d7 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1056,6 +1056,23 @@ int eeh_init(void)
 core_initcall_sync(eeh_init);
 
 /**
+ * eeh_available - Checks for the availability of EEH based on running
+ * architecture.
+ *
+ * This routine should be used in case we need to check if EEH is
+ * available in some situation, regardless if EEH is enabled or not.
+ * For example, if we hotplug-add a PCI device on a machine with no
+ * other PCI device, EEH won't be enabled, yet it's available if the
+ * arch supports it.
+ */
+static inline bool eeh_available(void)
+{
+	if (machine_is(pseries) || machine_is(powernv))
+		return true;
+	return false;
+}
+
+/**
  * eeh_add_device_early - Enable EEH for the indicated device node
  * @pdn: PCI device node for which to set up EEH
  *
@@ -1072,7 +1089,7 @@ void eeh_add_device_early(struct pci_dn *pdn)
 	struct pci_controller *phb;
 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
 
-	if (!edev || !eeh_enabled())
+	if (!edev || !eeh_available())
 		return;
 
 	if (!eeh_has_flag(EEH_PROBE_MODE_DEVTREE))
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] powerpc/pseries: Check if EEH is enabled on DDW mechanism code
  2016-01-19 20:18 [PATCH 0/2] Two patches regarding EEH availability checks - DLPAR/DDW Guilherme G. Piccoli
  2016-01-19 20:18 ` [PATCH 1/2] powerpc/eeh: Check for EEH availability in eeh_add_device_early() Guilherme G. Piccoli
@ 2016-01-19 20:18 ` Guilherme G. Piccoli
  2016-02-02 23:48   ` Gavin Shan
  2016-02-01 12:47 ` [PATCH 0/2] Two patches regarding EEH availability checks - DLPAR/DDW Guilherme G. Piccoli
  2 siblings, 1 reply; 11+ messages in thread
From: Guilherme G. Piccoli @ 2016-01-19 20:18 UTC (permalink / raw)
  To: mpe, linuxppc-dev; +Cc: gwshan, benh, nfont, paulus, Guilherme G. Piccoli

The Dynamic DMA Window (DDW) mechanism relies on EEH to obtain the
configuration address of devices. For example, the functions query_ddw()
and create_ddw() make use of eeh_dev struct. So, the dependency is
intrinsic - DDW mechanism will fail if EEH is not enabled.

Despite this dependency, no check for EEH availability is performed in DDW
code. This patch adds a check based on eeh_enabled() function, so if EEH is
not enabled before eeh_dev struct use, DDW will fallback to default iommu
mechanism and won't fail.

One use case for this patch is when we disable EEH globally via kernel
command-line ("eeh=off") - without the patch, a device probe can hit a kernel
oops because EEH is disabled but DDW will try to use it.

Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/pseries/iommu.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index bd98ce2..1ff55cc 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1224,8 +1224,10 @@ static int dma_set_mask_pSeriesLP(struct device *dev, u64 dma_mask)
 
 	pdev = to_pci_dev(dev);
 
-	/* only attempt to use a new window if 64-bit DMA is requested */
-	if (!disable_ddw && dma_mask == DMA_BIT_MASK(64)) {
+	/* We should check if EEH is enabled here, since DDW mechanism has
+	 * an intrinsic dependency of EEH config addr information. Also, we
+	 * only attempt to use a new window if 64-bit DMA is requested */
+	if (eeh_enabled() && !disable_ddw && dma_mask == DMA_BIT_MASK(64)) {
 		dn = pci_device_to_OF_node(pdev);
 		dev_dbg(dev, "node is %s\n", dn->full_name);
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Two patches regarding EEH availability checks - DLPAR/DDW
  2016-01-19 20:18 [PATCH 0/2] Two patches regarding EEH availability checks - DLPAR/DDW Guilherme G. Piccoli
  2016-01-19 20:18 ` [PATCH 1/2] powerpc/eeh: Check for EEH availability in eeh_add_device_early() Guilherme G. Piccoli
  2016-01-19 20:18 ` [PATCH 2/2] powerpc/pseries: Check if EEH is enabled on DDW mechanism code Guilherme G. Piccoli
@ 2016-02-01 12:47 ` Guilherme G. Piccoli
  2 siblings, 0 replies; 11+ messages in thread
From: Guilherme G. Piccoli @ 2016-02-01 12:47 UTC (permalink / raw)
  To: mpe, linuxppc-dev, benh; +Cc: gwshan, nfont, paulus

Hello Michael and Benjamin, any news on this one?


Quick correction in this email subject: it was cropped because it's too 
long. The full title was "[PATCH 0/2] Two patches regarding EEH 
availability checks - DLPAR/DDW crash avoidance"

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] powerpc/eeh: Check for EEH availability in eeh_add_device_early()
  2016-01-19 20:18 ` [PATCH 1/2] powerpc/eeh: Check for EEH availability in eeh_add_device_early() Guilherme G. Piccoli
@ 2016-02-02 22:44   ` Gavin Shan
  2016-02-03 11:54     ` Guilherme G. Piccoli
  0 siblings, 1 reply; 11+ messages in thread
From: Gavin Shan @ 2016-02-02 22:44 UTC (permalink / raw)
  To: Guilherme G. Piccoli; +Cc: mpe, linuxppc-dev, gwshan, benh, nfont, paulus

On Tue, Jan 19, 2016 at 06:18:19PM -0200, Guilherme G. Piccoli wrote:
>The function eeh_add_device_early() is used to perform EEH initialization in
>devices added later on the system, like in hotplug/DLPAR scenarios. Since the
>commit 89a51df5ab1d ("powerpc/eeh: Fix crash in eeh_add_device_early() on Cell")
>a new check was introduced in this function - Cell has no EEH capabilities
>which led to kernel oops if hotplug was performed, so checking for
>eeh_enabled() was introduced to avoid the issue.
>
>However, in architectures that EEH is present like pSeries or PowerNV, we might
>reach a case in which no PCI devices are present on boot and so EEH is not
>initialized. Then, if a device is added via DLPAR for example,
>eeh_add_device_early() fails because eeh_enabled() is false.
>
>Also, we can hit a kernel oops on pSeries arch if eeh_add_device_early() fails:
>if we have no PCI devices on machine at boot time, and then we add a PCI device
>via DLPAR operation, the function query_ddw() triggers the oops on NULL pointer
>dereference in the line "cfg_addr = edev->config_addr;". It happens because
>config_addr in edev is NULL, since the function eeh_add_device_early() was not
>completed successfully.
>
>This patch just changes the way the arch checking is done in function
>eeh_add_device_early(): we don't use eeh_enabled() anymore, but instead we
>introduce the function eeh_available() that checks the running architecture
>by using the macro machine_is(). If we are running on pSeries or PowerNV, the
>EEH mechanism is available (even if not initialized yet). This way, we don't
>try to enable EEH on Cell and we don't hit the oops on DLPAR either.
>
>Fixes: 89a51df5ab1d ("powerpc/eeh: Fix crash in eeh_add_device_early() on Cell")
>Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
>---
> arch/powerpc/kernel/eeh.c | 19 ++++++++++++++++++-
> 1 file changed, 18 insertions(+), 1 deletion(-)
>
>diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
>index 40e4d4a..69031d7 100644
>--- a/arch/powerpc/kernel/eeh.c
>+++ b/arch/powerpc/kernel/eeh.c
>@@ -1056,6 +1056,23 @@ int eeh_init(void)
> core_initcall_sync(eeh_init);
>
> /**
>+ * eeh_available - Checks for the availability of EEH based on running
>+ * architecture.
>+ *
>+ * This routine should be used in case we need to check if EEH is
>+ * available in some situation, regardless if EEH is enabled or not.
>+ * For example, if we hotplug-add a PCI device on a machine with no
>+ * other PCI device, EEH won't be enabled, yet it's available if the
>+ * arch supports it.
>+ */
>+static inline bool eeh_available(void)
>+{
>+	if (machine_is(pseries) || machine_is(powernv))
>+		return true;
>+	return false;
>+}
>+

As I was told by somebody else before, the comments for static function
needn't to be exported.

>+/**
>  * eeh_add_device_early - Enable EEH for the indicated device node
>  * @pdn: PCI device node for which to set up EEH
>  *
>@@ -1072,7 +1089,7 @@ void eeh_add_device_early(struct pci_dn *pdn)
> 	struct pci_controller *phb;
> 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>
>-	if (!edev || !eeh_enabled())
>+	if (!edev || !eeh_available())
> 		return;
>
> 	if (!eeh_has_flag(EEH_PROBE_MODE_DEVTREE))

The change here seems not correct enough. Before the changes, eeh_add_device_early()
does nothing if EEH is disabled on pSeries. With the changes applied, the EEH device
(edev) will be scanned even EEH is disabled on pSeries.

>From the code changes, I didn't figure out the real problem you try to fix. Cell
platform doesn't have flag EEH_PROBE_MODE_DEVTREE. So the function does nothing
on Cell platform except calling into pdn_to_eeh_dev(). I'm not sure if the kernel
crashed in pdn_to_eeh_dev() on Cell platform?

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] powerpc/pseries: Check if EEH is enabled on DDW mechanism code
  2016-01-19 20:18 ` [PATCH 2/2] powerpc/pseries: Check if EEH is enabled on DDW mechanism code Guilherme G. Piccoli
@ 2016-02-02 23:48   ` Gavin Shan
  2016-02-03 12:26     ` Guilherme G. Piccoli
  0 siblings, 1 reply; 11+ messages in thread
From: Gavin Shan @ 2016-02-02 23:48 UTC (permalink / raw)
  To: Guilherme G. Piccoli; +Cc: mpe, linuxppc-dev, gwshan, benh, nfont, paulus

On Tue, Jan 19, 2016 at 06:18:20PM -0200, Guilherme G. Piccoli wrote:
>The Dynamic DMA Window (DDW) mechanism relies on EEH to obtain the
>configuration address of devices. For example, the functions query_ddw()
>and create_ddw() make use of eeh_dev struct. So, the dependency is
>intrinsic - DDW mechanism will fail if EEH is not enabled.
>
>Despite this dependency, no check for EEH availability is performed in DDW
>code. This patch adds a check based on eeh_enabled() function, so if EEH is
>not enabled before eeh_dev struct use, DDW will fallback to default iommu
>mechanism and won't fail.
>
>One use case for this patch is when we disable EEH globally via kernel
>command-line ("eeh=off") - without the patch, a device probe can hit a kernel
>oops because EEH is disabled but DDW will try to use it.
>
>Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
>---
> arch/powerpc/platforms/pseries/iommu.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
>diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
>index bd98ce2..1ff55cc 100644
>--- a/arch/powerpc/platforms/pseries/iommu.c
>+++ b/arch/powerpc/platforms/pseries/iommu.c
>@@ -1224,8 +1224,10 @@ static int dma_set_mask_pSeriesLP(struct device *dev, u64 dma_mask)
>
> 	pdev = to_pci_dev(dev);
>
>-	/* only attempt to use a new window if 64-bit DMA is requested */
>-	if (!disable_ddw && dma_mask == DMA_BIT_MASK(64)) {
>+	/* We should check if EEH is enabled here, since DDW mechanism has
>+	 * an intrinsic dependency of EEH config addr information. Also, we
>+	 * only attempt to use a new window if 64-bit DMA is requested */
>+	if (eeh_enabled() && !disable_ddw && dma_mask == DMA_BIT_MASK(64)) {
> 		dn = pci_device_to_OF_node(pdev);
> 		dev_dbg(dev, "node is %s\n", dn->full_name);
>

There are two types of addresses: (1) PCI config address (2) PE config address.
(1) is used to indentify one PCI device which is included in the PE. (2) is the
PCI config address of PE's primary bus in pHyp. Both of them can be used to identify
the PE. It means the (1) PCI config address, which is retrieved from pci_dn, can be
passed to hypervisor. Then we don't have to disable DDW when EEH is disabled.

Guilherme, did you hit the crash on pHyp or PowerKVM?

Thanks,
Gavin 

>-- 
>2.1.0
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] powerpc/eeh: Check for EEH availability in eeh_add_device_early()
  2016-02-02 22:44   ` Gavin Shan
@ 2016-02-03 11:54     ` Guilherme G. Piccoli
  2016-02-04  5:27       ` Gavin Shan
  0 siblings, 1 reply; 11+ messages in thread
From: Guilherme G. Piccoli @ 2016-02-03 11:54 UTC (permalink / raw)
  To: Gavin Shan, Michael Ellerman; +Cc: paulus, nfont, linuxppc-dev

On 02/02/2016 08:44 PM, Gavin Shan wrote:
>> /**
>> + * eeh_available - Checks for the availability of EEH based on running
>> + * architecture.
>> + *
>> + * This routine should be used in case we need to check if EEH is
>> + * available in some situation, regardless if EEH is enabled or not.
>> + * For example, if we hotplug-add a PCI device on a machine with no
>> + * other PCI device, EEH won't be enabled, yet it's available if the
>> + * arch supports it.
>> + */
>> +static inline bool eeh_available(void)
>> +{
>> +	if (machine_is(pseries) || machine_is(powernv))
>> +		return true;
>> +	return false;
>> +}
>> +
>
> As I was told by somebody else before, the comments for static function
> needn't to be exported.
>

Thanks for the advice Gavin, but I didn't get it. What means comments 
not being exported? For sure I can change the comment if desired, but I 
can't see any issue with it.


>> +/**
>>   * eeh_add_device_early - Enable EEH for the indicated device node
>>   * @pdn: PCI device node for which to set up EEH
>>   *
>> @@ -1072,7 +1089,7 @@ void eeh_add_device_early(struct pci_dn *pdn)
>> 	struct pci_controller *phb;
>> 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>>
>> -	if (!edev || !eeh_enabled())
>> +	if (!edev || !eeh_available())
>> 		return;
>>
>> 	if (!eeh_has_flag(EEH_PROBE_MODE_DEVTREE))
>
> The change here seems not correct enough. Before the changes, eeh_add_device_early()
> does nothing if EEH is disabled on pSeries. With the changes applied, the EEH device
> (edev) will be scanned even EEH is disabled on pSeries.
>
>  From the code changes, I didn't figure out the real problem you try to fix. Cell
> platform doesn't have flag EEH_PROBE_MODE_DEVTREE. So the function does nothing
> on Cell platform except calling into pdn_to_eeh_dev(). I'm not sure if the kernel
> crashed in pdn_to_eeh_dev() on Cell platform?

Gavin, thanks for the comments. Seems your commit d91dafc02f4 
("powerpc/eeh: Delay probing EEH device during hotplug") solved the 
problem with Cell arch. Notice that Michael pointed the issue with Cell 
and fixed it by commit 89a51df5ab1d ("powerpc/eeh: Fix crash in 
eeh_add_device_early() on Cell").

But you commit is recent than Michael's and since it adds the check for 
EEH_PROBE_MODE_DEVTREE, the Cell crash doesn't happen anymore. It's my 
bad, I should have checked the dates of commits to figure your commit is 
recent than Michael's one.

Anyway, the modification proposed by this patch is useless if we revert 
Michael's commit then. Gavin and Michael, are you OK if I revert it?

The reason for the revert is that we can't check for eeh_enabled() here 
- imagine the following scenario: we boot a machine with no PCI device, 
then, we hotplug-add a PCI device. When we reach eeh_add_device_early(), 
the function eeh_enabled() is false because at boot time we had no PCI 
devices so EEH is not initialized/enabled.

Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] powerpc/pseries: Check if EEH is enabled on DDW mechanism code
  2016-02-02 23:48   ` Gavin Shan
@ 2016-02-03 12:26     ` Guilherme G. Piccoli
  2016-02-04  5:30       ` Gavin Shan
  0 siblings, 1 reply; 11+ messages in thread
From: Guilherme G. Piccoli @ 2016-02-03 12:26 UTC (permalink / raw)
  To: Gavin Shan; +Cc: paulus, nfont, linuxppc-dev

On 02/02/2016 09:48 PM, Gavin Shan wrote:
> On Tue, Jan 19, 2016 at 06:18:20PM -0200, Guilherme G. Piccoli wrote:
>> -	/* only attempt to use a new window if 64-bit DMA is requested */
>> -	if (!disable_ddw && dma_mask == DMA_BIT_MASK(64)) {
>> +	/* We should check if EEH is enabled here, since DDW mechanism has
>> +	 * an intrinsic dependency of EEH config addr information. Also, we
>> +	 * only attempt to use a new window if 64-bit DMA is requested */
>> +	if (eeh_enabled() && !disable_ddw && dma_mask == DMA_BIT_MASK(64)) {
>> 		dn = pci_device_to_OF_node(pdev);
>> 		dev_dbg(dev, "node is %s\n", dn->full_name);
>>
>
> There are two types of addresses: (1) PCI config address (2) PE config address.
> (1) is used to indentify one PCI device which is included in the PE. (2) is the
> PCI config address of PE's primary bus in pHyp. Both of them can be used to identify
> the PE. It means the (1) PCI config address, which is retrieved from pci_dn, can be
> passed to hypervisor. Then we don't have to disable DDW when EEH is disabled.
>
> Guilherme, did you hit the crash on pHyp or PowerKVM?

Gavin, thanks very much for the clarification. So, we can interchange 
edev->config_addr with pdn->pci_ext_config_space ?
This would solve the issue with DDW being enabled when EEH is not.

I hit the issue on PowerVM (PHyp). I wasn't able to perform hotplug in 
qemu that time I was testing this - I can re-test on qemu. Can we use 
pci_dn config address in qemu guest too?

Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] powerpc/eeh: Check for EEH availability in eeh_add_device_early()
  2016-02-03 11:54     ` Guilherme G. Piccoli
@ 2016-02-04  5:27       ` Gavin Shan
  0 siblings, 0 replies; 11+ messages in thread
From: Gavin Shan @ 2016-02-04  5:27 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: Gavin Shan, Michael Ellerman, paulus, nfont, linuxppc-dev

On Wed, Feb 03, 2016 at 09:54:51AM -0200, Guilherme G. Piccoli wrote:
>On 02/02/2016 08:44 PM, Gavin Shan wrote:
>>>/**
>>>+ * eeh_available - Checks for the availability of EEH based on running
>>>+ * architecture.
>>>+ *
>>>+ * This routine should be used in case we need to check if EEH is
>>>+ * available in some situation, regardless if EEH is enabled or not.
>>>+ * For example, if we hotplug-add a PCI device on a machine with no
>>>+ * other PCI device, EEH won't be enabled, yet it's available if the
>>>+ * arch supports it.
>>>+ */
>>>+static inline bool eeh_available(void)
>>>+{
>>>+	if (machine_is(pseries) || machine_is(powernv))
>>>+		return true;
>>>+	return false;
>>>+}
>>>+
>>
>>As I was told by somebody else before, the comments for static function
>>needn't to be exported.
>>
>
>Thanks for the advice Gavin, but I didn't get it. What means comments not
>being exported? For sure I can change the comment if desired, but I can't see
>any issue with it.
>

The comments in "/** ... */" can be exposed to readable document by makefile.
I think it's necessary to have it for a static function :-)

>>>+/**
>>>  * eeh_add_device_early - Enable EEH for the indicated device node
>>>  * @pdn: PCI device node for which to set up EEH
>>>  *
>>>@@ -1072,7 +1089,7 @@ void eeh_add_device_early(struct pci_dn *pdn)
>>>	struct pci_controller *phb;
>>>	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>>>
>>>-	if (!edev || !eeh_enabled())
>>>+	if (!edev || !eeh_available())
>>>		return;
>>>
>>>	if (!eeh_has_flag(EEH_PROBE_MODE_DEVTREE))
>>
>>The change here seems not correct enough. Before the changes, eeh_add_device_early()
>>does nothing if EEH is disabled on pSeries. With the changes applied, the EEH device
>>(edev) will be scanned even EEH is disabled on pSeries.
>>
>> From the code changes, I didn't figure out the real problem you try to fix. Cell
>>platform doesn't have flag EEH_PROBE_MODE_DEVTREE. So the function does nothing
>>on Cell platform except calling into pdn_to_eeh_dev(). I'm not sure if the kernel
>>crashed in pdn_to_eeh_dev() on Cell platform?
>
>Gavin, thanks for the comments. Seems your commit d91dafc02f4 ("powerpc/eeh:
>Delay probing EEH device during hotplug") solved the problem with Cell arch.
>Notice that Michael pointed the issue with Cell and fixed it by commit
>89a51df5ab1d ("powerpc/eeh: Fix crash in eeh_add_device_early() on Cell").
>
>But you commit is recent than Michael's and since it adds the check for
>EEH_PROBE_MODE_DEVTREE, the Cell crash doesn't happen anymore. It's my bad, I
>should have checked the dates of commits to figure your commit is recent than
>Michael's one.
>
>Anyway, the modification proposed by this patch is useless if we revert
>Michael's commit then. Gavin and Michael, are you OK if I revert it?
>
>The reason for the revert is that we can't check for eeh_enabled() here -
>imagine the following scenario: we boot a machine with no PCI device, then,
>we hotplug-add a PCI device. When we reach eeh_add_device_early(), the
>function eeh_enabled() is false because at boot time we had no PCI devices so
>EEH is not initialized/enabled.
>

Yeah, eeh_enabled() returns false and we don't have EEH_PROBE_MODE_DEVTREE on Cell
platform. That means those two checks are duplicated. I think eeh_enabled() check
can be removed if it really helps to avoid the crash.

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] powerpc/pseries: Check if EEH is enabled on DDW mechanism code
  2016-02-03 12:26     ` Guilherme G. Piccoli
@ 2016-02-04  5:30       ` Gavin Shan
  2016-04-07  0:23         ` Guilherme G. Piccoli
  0 siblings, 1 reply; 11+ messages in thread
From: Gavin Shan @ 2016-02-04  5:30 UTC (permalink / raw)
  To: Guilherme G. Piccoli; +Cc: Gavin Shan, paulus, nfont, linuxppc-dev

On Wed, Feb 03, 2016 at 10:26:36AM -0200, Guilherme G. Piccoli wrote:
>On 02/02/2016 09:48 PM, Gavin Shan wrote:
>>On Tue, Jan 19, 2016 at 06:18:20PM -0200, Guilherme G. Piccoli wrote:
>>>-	/* only attempt to use a new window if 64-bit DMA is requested */
>>>-	if (!disable_ddw && dma_mask == DMA_BIT_MASK(64)) {
>>>+	/* We should check if EEH is enabled here, since DDW mechanism has
>>>+	 * an intrinsic dependency of EEH config addr information. Also, we
>>>+	 * only attempt to use a new window if 64-bit DMA is requested */
>>>+	if (eeh_enabled() && !disable_ddw && dma_mask == DMA_BIT_MASK(64)) {
>>>		dn = pci_device_to_OF_node(pdev);
>>>		dev_dbg(dev, "node is %s\n", dn->full_name);
>>>
>>
>>There are two types of addresses: (1) PCI config address (2) PE config address.
>>(1) is used to indentify one PCI device which is included in the PE. (2) is the
>>PCI config address of PE's primary bus in pHyp. Both of them can be used to identify
>>the PE. It means the (1) PCI config address, which is retrieved from pci_dn, can be
>>passed to hypervisor. Then we don't have to disable DDW when EEH is disabled.
>>
>>Guilherme, did you hit the crash on pHyp or PowerKVM?
>
>Gavin, thanks very much for the clarification. So, we can interchange
>edev->config_addr with pdn->pci_ext_config_space ?
>This would solve the issue with DDW being enabled when EEH is not.
>
>I hit the issue on PowerVM (PHyp). I wasn't able to perform hotplug in qemu
>that time I was testing this - I can re-test on qemu. Can we use pci_dn
>config address in qemu guest too?
>

On PowerKVM, QEMU can't recognize (1). pHyp is expected to support both
of them. Please have a try with (1) when eeh_enabled() returns false.
The address (1) can be retrieved from pci_dn as below:

	(pdn->busno << 8) | (pdn->devfn)

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] powerpc/pseries: Check if EEH is enabled on DDW mechanism code
  2016-02-04  5:30       ` Gavin Shan
@ 2016-04-07  0:23         ` Guilherme G. Piccoli
  0 siblings, 0 replies; 11+ messages in thread
From: Guilherme G. Piccoli @ 2016-04-07  0:23 UTC (permalink / raw)
  To: Gavin Shan; +Cc: nfont, linuxppc-dev, paulus, Michael Ellerman

On 02/04/2016 03:30 AM, Gavin Shan wrote:
> On Wed, Feb 03, 2016 at 10:26:36AM -0200, Guilherme G. Piccoli wrote:
>> On 02/02/2016 09:48 PM, Gavin Shan wrote:
>>
>> Gavin, thanks very much for the clarification. So, we can interchange
>> edev->config_addr with pdn->pci_ext_config_space ?
>> This would solve the issue with DDW being enabled when EEH is not.
>>
>> I hit the issue on PowerVM (PHyp). I wasn't able to perform hotplug in qemu
>> that time I was testing this - I can re-test on qemu. Can we use pci_dn
>> config address in qemu guest too?
>>
>
> On PowerKVM, QEMU can't recognize (1). pHyp is expected to support both
> of them. Please have a try with (1) when eeh_enabled() returns false.
> The address (1) can be retrieved from pci_dn as below:
>
> 	(pdn->busno << 8) | (pdn->devfn)
>
> Thanks,
> Gavin
>

Gavin, thanks very much for your advice. And sorry for my huge delay in 
replying this...

I tested the pci_dn's busno/devfn solution in both PHyp and qemu and it 
worked fine - DDW was enabled with success.

I just sent v2 patches to the list - your review is much appreciated.

Thanks,


Guilherme

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-04-07  0:23 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-19 20:18 [PATCH 0/2] Two patches regarding EEH availability checks - DLPAR/DDW Guilherme G. Piccoli
2016-01-19 20:18 ` [PATCH 1/2] powerpc/eeh: Check for EEH availability in eeh_add_device_early() Guilherme G. Piccoli
2016-02-02 22:44   ` Gavin Shan
2016-02-03 11:54     ` Guilherme G. Piccoli
2016-02-04  5:27       ` Gavin Shan
2016-01-19 20:18 ` [PATCH 2/2] powerpc/pseries: Check if EEH is enabled on DDW mechanism code Guilherme G. Piccoli
2016-02-02 23:48   ` Gavin Shan
2016-02-03 12:26     ` Guilherme G. Piccoli
2016-02-04  5:30       ` Gavin Shan
2016-04-07  0:23         ` Guilherme G. Piccoli
2016-02-01 12:47 ` [PATCH 0/2] Two patches regarding EEH availability checks - DLPAR/DDW Guilherme G. Piccoli

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.