All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 1/2] PCI: pciehp: Merge hotplug work requests
@ 2015-11-03  3:48 Guenter Roeck
  2015-11-03  3:48 ` [RFC PATCH 2/2] PCI: pciehp: Implement support for delayed poweron Guenter Roeck
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Guenter Roeck @ 2015-11-03  3:48 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Yinghai Lu, linux-pci, linux-kernel, Guenter Roeck

Some oddball devices may experience a PCIe link flap after power-on.
This may result in the following sequence of events.

fpc0 kernel: pciehp 0000:02:08.0:pcie24: Card present on Slot(0)
fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Up event
fpc0 kernel: pciehp 0000:02:08.0:pcie24:
	Link Up event ignored on slot(0): already powering on
fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Down event
fpc0 kernel: pciehp 0000:02:08.0:pcie24:
	Link Down event queued on slot(0): currently getting powered on
fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Up event
fpc0 kernel: pciehp 0000:02:08.0:pcie24:
	Link Up event queued on slot(0): currently getting powered off

This causes the driver for affected devices to be instantiated, removed,
and re-instantiated.

An even worse problem is a device which resets itself continuously.
This can result in the following endless sequence of messages.

pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)

The problem in the both cases is that all events are enqueued as hotplug
work requests and executed in sequence, which can overwhelm the system
and even result in "hung task" tracebacks in pciehp_power_thread().

This exposes an underlying limitation of the hotplug state machine: It
executes all received requests, even though only the most recent request
really needs to be handled. As a result, by the time a request is handled,
it may well be obsolete and have been superseded by many other enable /
disable requests which have been enqueued in the meantime.

To solve the problem, fold hotplug work requests into a single request.
Store the request as well as the work data structure in 'struct slot',
thus eliminating the need to allocate memory for each request.
Handle a sequence of requests by setting a 'disable' flag when needed,
indicating that a link needs to be disabled prior to re-enabling it.

With this change, requests and request sequences are handled as follows.

enable (single request):		enable link
disable (single request):		disable link
... disable-enable-disable...disable:	disable link
... disable-enable-disable...enable:	disable link, then enable it

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
This is a different approach to solve the problem I tried to address
earlier with with "PCI: pciehp: Add support for delayed power-on" [1].

While the earlier patch implemented an additional state in the hotplug
state machine to solve the problem, the approach taken here is a bit
simpler and more straightfoward. By folding multiple requests into one,
the follow-up patch can use delayed work to implement power-on delays.
An additional advantage is that this patch can be applied separately
to simplify the state machine.

While working on this patch, I also tried to drop multiple "disable"
requests, and only disable a slot if it was already disabled, to reduce
overhead. Unfortunately, this did not always work. In some instances,
I ended up in a situation where a device could not be enabled
because the system thought that it already existed. I don't know
if this is a weakness in this patch or some state change I did not catch. 
This may be left for further study.

[1] https://lkml.org/lkml/2015/10/12/686

 drivers/pci/hotplug/pciehp.h      |  4 +++
 drivers/pci/hotplug/pciehp_ctrl.c | 52 ++++++++++++++++++---------------------
 drivers/pci/hotplug/pciehp_hpc.c  |  1 +
 3 files changed, 29 insertions(+), 28 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 62d6fe6c3714..364b6fa32978 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -78,6 +78,9 @@ struct slot {
 	struct mutex lock;
 	struct mutex hotplug_lock;
 	struct workqueue_struct *wq;
+	struct work_struct hotplug_work;
+	u32 hotplug_req;
+	bool disable;			/* true to disable before enable */
 };
 
 struct event_info {
@@ -130,6 +133,7 @@ void pciehp_queue_interrupt_event(struct slot *slot, u32 event_type);
 int pciehp_configure_device(struct slot *p_slot);
 int pciehp_unconfigure_device(struct slot *p_slot);
 void pciehp_queue_pushbutton_work(struct work_struct *work);
+void pciehp_power_thread(struct work_struct *work);
 struct controller *pcie_init(struct pcie_device *dev);
 int pcie_init_notification(struct controller *ctrl);
 int pciehp_enable_slot(struct slot *p_slot);
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index 880978b6d534..ad1321e91546 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -156,13 +156,9 @@ static int remove_board(struct slot *p_slot)
 	return 0;
 }
 
-struct power_work_info {
-	struct slot *p_slot;
-	struct work_struct work;
-	unsigned int req;
-#define DISABLE_REQ 0
-#define ENABLE_REQ  1
-};
+/* Hotplug work requests */
+#define DISABLE_REQ	0
+#define ENABLE_REQ	1
 
 /**
  * pciehp_power_thread - handle pushbutton events
@@ -171,14 +167,19 @@ struct power_work_info {
  * Scheduled procedure to handle blocking stuff for the pushbuttons.
  * Handles all pending events and exits.
  */
-static void pciehp_power_thread(struct work_struct *work)
+void pciehp_power_thread(struct work_struct *work)
 {
-	struct power_work_info *info =
-		container_of(work, struct power_work_info, work);
-	struct slot *p_slot = info->p_slot;
-	int ret;
+	struct slot *p_slot = container_of(work, struct slot, hotplug_work);
+	int ret, req;
+	bool disable;
+
+	mutex_lock(&p_slot->lock);
+	req = p_slot->hotplug_req;
+	disable = p_slot->disable;
+	p_slot->disable = false;
+	mutex_unlock(&p_slot->lock);
 
-	switch (info->req) {
+	switch (req) {
 	case DISABLE_REQ:
 		mutex_lock(&p_slot->hotplug_lock);
 		pciehp_disable_slot(p_slot);
@@ -189,6 +190,8 @@ static void pciehp_power_thread(struct work_struct *work)
 		break;
 	case ENABLE_REQ:
 		mutex_lock(&p_slot->hotplug_lock);
+		if (disable)
+			pciehp_disable_slot(p_slot);
 		ret = pciehp_enable_slot(p_slot);
 		mutex_unlock(&p_slot->hotplug_lock);
 		if (ret)
@@ -200,26 +203,19 @@ static void pciehp_power_thread(struct work_struct *work)
 	default:
 		break;
 	}
-
-	kfree(info);
 }
 
 static void pciehp_queue_power_work(struct slot *p_slot, int req)
 {
-	struct power_work_info *info;
-
-	p_slot->state = (req == ENABLE_REQ) ? POWERON_STATE : POWEROFF_STATE;
-
-	info = kmalloc(sizeof(*info), GFP_KERNEL);
-	if (!info) {
-		ctrl_err(p_slot->ctrl, "no memory to queue %s request\n",
-			 (req == ENABLE_REQ) ? "poweron" : "poweroff");
-		return;
+	if (req == ENABLE_REQ) {
+		p_slot->state = POWERON_STATE;
+	} else {
+		p_slot->state = POWEROFF_STATE;
+		p_slot->disable = true;
 	}
-	info->p_slot = p_slot;
-	INIT_WORK(&info->work, pciehp_power_thread);
-	info->req = req;
-	queue_work(p_slot->wq, &info->work);
+	p_slot->hotplug_req = req;
+
+	queue_work(p_slot->wq, &p_slot->hotplug_work);
 }
 
 void pciehp_queue_pushbutton_work(struct work_struct *work)
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 5c24e938042f..e4e6fcbe1e20 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -755,6 +755,7 @@ static int pcie_init_slot(struct controller *ctrl)
 	mutex_init(&slot->lock);
 	mutex_init(&slot->hotplug_lock);
 	INIT_DELAYED_WORK(&slot->work, pciehp_queue_pushbutton_work);
+	INIT_WORK(&slot->hotplug_work, pciehp_power_thread);
 	ctrl->slot = slot;
 	return 0;
 abort:
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH 2/2] PCI: pciehp: Implement support for delayed poweron
  2015-11-03  3:48 [RFC PATCH 1/2] PCI: pciehp: Merge hotplug work requests Guenter Roeck
@ 2015-11-03  3:48 ` Guenter Roeck
  2015-11-16 21:34 ` [RFC PATCH 1/2] PCI: pciehp: Merge hotplug work requests Guenter Roeck
  2016-01-08 16:18 ` Bjorn Helgaas
  2 siblings, 0 replies; 7+ messages in thread
From: Guenter Roeck @ 2015-11-03  3:48 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Yinghai Lu, linux-pci, linux-kernel, Guenter Roeck

Some oddball devices may experience PCIe link flaps after power-on.
This may result in the following sequence of events.

fpc0 kernel: pciehp 0000:02:08.0:pcie24: Card present on Slot(0)
fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Up event
fpc0 kernel: pciehp 0000:02:08.0:pcie24:
	Link Up event ignored on slot(0): already powering on
fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Down event
fpc0 kernel: pciehp 0000:02:08.0:pcie24:
	Link Down event queued on slot(0): currently getting powered on
fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Up event
fpc0 kernel: pciehp 0000:02:08.0:pcie24:
	Link Up event queued on slot(0): currently getting powered off

This causes the driver for affected devices to be instantiated, removed,
and re-instantiated. While commit 'PCI: pciehp: Implement hotplug event
folding' reduces the scope of the problem, it can still occur. In some
cases, device insertion followed by immediate removal can result errors
such as the following.

fpc0 kernel: remove_proc_entry: removing non-empty directory 'irq/148',
	leaking at least 'pic0'
fpc0 kernel: ------------[ cut here ]------------
fpc0 kernel: WARNING: at fs/proc/generic.c:575

This can for example happen if the removed device provides gpio pins
and/or interrupts used by other drivers, if those other drivers are still
instantiated.

Add support for per-port power-on delay to avoid this situation. This can
be enabled globally with the pciehp_poweron_delay module parameter, or
per port (using a quirks function) with a new poweron_delay flag in
struct pci_dev.

With this patch, the link flap still occurs, but because of the delayed
insertion the driver is not immediately instantiated, and the above error
is no longer seen.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/pci/hotplug/pciehp.h      |  4 +++-
 drivers/pci/hotplug/pciehp_core.c |  3 +++
 drivers/pci/hotplug/pciehp_ctrl.c | 30 +++++++++++++++++++-----------
 drivers/pci/hotplug/pciehp_hpc.c  |  5 ++++-
 include/linux/pci.h               |  1 +
 5 files changed, 30 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 364b6fa32978..0d44b1691431 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -40,6 +40,7 @@
 
 #define MY_NAME	"pciehp"
 
+extern bool pciehp_poweron_delay;
 extern bool pciehp_poll_mode;
 extern int pciehp_poll_time;
 extern bool pciehp_debug;
@@ -78,7 +79,7 @@ struct slot {
 	struct mutex lock;
 	struct mutex hotplug_lock;
 	struct workqueue_struct *wq;
-	struct work_struct hotplug_work;
+	struct delayed_work hotplug_work;
 	u32 hotplug_req;
 	bool disable;			/* true to disable before enable */
 };
@@ -101,6 +102,7 @@ struct controller {
 	unsigned int cmd_busy:1;
 	unsigned int link_active_reporting:1;
 	unsigned int notification_enabled:1;
+	unsigned int poweron_delay:1;
 	unsigned int power_fault_detected;
 };
 
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index 612b21a14df5..cc69fd10d884 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -38,6 +38,7 @@
 #include <linux/time.h>
 
 /* Global variables */
+bool pciehp_poweron_delay;
 bool pciehp_debug;
 bool pciehp_poll_mode;
 int pciehp_poll_time;
@@ -51,10 +52,12 @@ MODULE_AUTHOR(DRIVER_AUTHOR);
 MODULE_DESCRIPTION(DRIVER_DESC);
 MODULE_LICENSE("GPL");
 
+module_param(pciehp_poweron_delay, bool, 0644);
 module_param(pciehp_debug, bool, 0644);
 module_param(pciehp_poll_mode, bool, 0644);
 module_param(pciehp_poll_time, int, 0644);
 module_param(pciehp_force, bool, 0644);
+MODULE_PARM_DESC(pciehp_poweron_delay, "Delay port power-on");
 MODULE_PARM_DESC(pciehp_debug, "Debugging mode enabled or not");
 MODULE_PARM_DESC(pciehp_poll_mode, "Using polling mechanism for hot-plug events or not");
 MODULE_PARM_DESC(pciehp_poll_time, "Polling mechanism frequency, in seconds");
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index ad1321e91546..0c5b2e5965ce 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -169,7 +169,8 @@ static int remove_board(struct slot *p_slot)
  */
 void pciehp_power_thread(struct work_struct *work)
 {
-	struct slot *p_slot = container_of(work, struct slot, hotplug_work);
+	struct slot *p_slot = container_of(work, struct slot,
+					   hotplug_work.work);
 	int ret, req;
 	bool disable;
 
@@ -205,17 +206,21 @@ void pciehp_power_thread(struct work_struct *work)
 	}
 }
 
-static void pciehp_queue_power_work(struct slot *p_slot, int req)
+static void pciehp_queue_power_work(struct slot *p_slot, int req, bool delay)
 {
+	int delay_hz = 0;
+
 	if (req == ENABLE_REQ) {
 		p_slot->state = POWERON_STATE;
+		if (delay)
+			delay_hz = HZ;
 	} else {
 		p_slot->state = POWEROFF_STATE;
 		p_slot->disable = true;
 	}
 	p_slot->hotplug_req = req;
 
-	queue_work(p_slot->wq, &p_slot->hotplug_work);
+	mod_delayed_work(p_slot->wq, &p_slot->hotplug_work, delay_hz);
 }
 
 void pciehp_queue_pushbutton_work(struct work_struct *work)
@@ -225,10 +230,10 @@ void pciehp_queue_pushbutton_work(struct work_struct *work)
 	mutex_lock(&p_slot->lock);
 	switch (p_slot->state) {
 	case BLINKINGOFF_STATE:
-		pciehp_queue_power_work(p_slot, DISABLE_REQ);
+		pciehp_queue_power_work(p_slot, DISABLE_REQ, false);
 		break;
 	case BLINKINGON_STATE:
-		pciehp_queue_power_work(p_slot, ENABLE_REQ);
+		pciehp_queue_power_work(p_slot, ENABLE_REQ, false);
 		break;
 	default:
 		break;
@@ -259,7 +264,7 @@ static void handle_button_press_event(struct slot *p_slot)
 		/* blink green LED and turn off amber */
 		pciehp_green_led_blink(p_slot);
 		pciehp_set_attention_status(p_slot, 0);
-		queue_delayed_work(p_slot->wq, &p_slot->work, 5*HZ);
+		mod_delayed_work(p_slot->wq, &p_slot->work, 5*HZ);
 		break;
 	case BLINKINGOFF_STATE:
 	case BLINKINGON_STATE:
@@ -303,9 +308,10 @@ static void handle_surprise_event(struct slot *p_slot)
 
 	pciehp_get_adapter_status(p_slot, &getstatus);
 	if (!getstatus)
-		pciehp_queue_power_work(p_slot, DISABLE_REQ);
+		pciehp_queue_power_work(p_slot, DISABLE_REQ, false);
 	else
-		pciehp_queue_power_work(p_slot, ENABLE_REQ);
+		pciehp_queue_power_work(p_slot, ENABLE_REQ,
+					p_slot->ctrl->poweron_delay);
 }
 
 /*
@@ -322,7 +328,8 @@ static void handle_link_event(struct slot *p_slot, u32 event)
 		/* Fall through */
 	case STATIC_STATE:
 		pciehp_queue_power_work(p_slot, event == INT_LINK_UP ?
-					ENABLE_REQ : DISABLE_REQ);
+					ENABLE_REQ : DISABLE_REQ,
+					ctrl->poweron_delay);
 		break;
 	case POWERON_STATE:
 		if (event == INT_LINK_UP) {
@@ -333,7 +340,7 @@ static void handle_link_event(struct slot *p_slot, u32 event)
 			ctrl_info(ctrl,
 				  "Link Down event queued on slot(%s): currently getting powered on\n",
 				  slot_name(p_slot));
-			pciehp_queue_power_work(p_slot, DISABLE_REQ);
+			pciehp_queue_power_work(p_slot, DISABLE_REQ, false);
 		}
 		break;
 	case POWEROFF_STATE:
@@ -341,7 +348,8 @@ static void handle_link_event(struct slot *p_slot, u32 event)
 			ctrl_info(ctrl,
 				  "Link Up event queued on slot(%s): currently getting powered off\n",
 				  slot_name(p_slot));
-			pciehp_queue_power_work(p_slot, ENABLE_REQ);
+			pciehp_queue_power_work(p_slot, ENABLE_REQ,
+						ctrl->poweron_delay);
 		} else {
 			ctrl_info(ctrl,
 				  "Link Down event ignored on slot(%s): already powering off\n",
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index e4e6fcbe1e20..11e1a34bd57b 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -755,7 +755,7 @@ static int pcie_init_slot(struct controller *ctrl)
 	mutex_init(&slot->lock);
 	mutex_init(&slot->hotplug_lock);
 	INIT_DELAYED_WORK(&slot->work, pciehp_queue_pushbutton_work);
-	INIT_WORK(&slot->hotplug_work, pciehp_power_thread);
+	INIT_DELAYED_WORK(&slot->hotplug_work, pciehp_power_thread);
 	ctrl->slot = slot;
 	return 0;
 abort:
@@ -767,6 +767,7 @@ static void pcie_cleanup_slot(struct controller *ctrl)
 {
 	struct slot *slot = ctrl->slot;
 	cancel_delayed_work(&slot->work);
+	cancel_delayed_work(&slot->hotplug_work);
 	destroy_workqueue(slot->wq);
 	kfree(slot);
 }
@@ -810,6 +811,8 @@ struct controller *pcie_init(struct pcie_device *dev)
 	pcie_capability_read_dword(pdev, PCI_EXP_LNKCAP, &link_cap);
 	if (link_cap & PCI_EXP_LNKCAP_DLLLARC)
 		ctrl->link_active_reporting = 1;
+	if (pciehp_poweron_delay || dev->port->poweron_delay)
+		ctrl->poweron_delay = 1;
 
 	/* Clear all remaining event bits in Slot Status register */
 	pcie_capability_write_word(pdev, PCI_EXP_SLTSTA,
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e90eb22de628..e93c3d4e5b70 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -359,6 +359,7 @@ struct pci_dev {
 	unsigned int	io_window_1k:1;	/* Intel P2P bridge 1K I/O windows */
 	unsigned int	irq_managed:1;
 	unsigned int	has_secondary_link:1;
+	unsigned int	poweron_delay:1; /* Port needs poweron delay */
 	pci_dev_flags_t dev_flags;
 	atomic_t	enable_cnt;	/* pci_enable_device has been called */
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 1/2] PCI: pciehp: Merge hotplug work requests
  2015-11-03  3:48 [RFC PATCH 1/2] PCI: pciehp: Merge hotplug work requests Guenter Roeck
  2015-11-03  3:48 ` [RFC PATCH 2/2] PCI: pciehp: Implement support for delayed poweron Guenter Roeck
@ 2015-11-16 21:34 ` Guenter Roeck
  2016-01-08 16:18 ` Bjorn Helgaas
  2 siblings, 0 replies; 7+ messages in thread
From: Guenter Roeck @ 2015-11-16 21:34 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Yinghai Lu, linux-pci, linux-kernel

Hi folks,

any comments on this series ?

Thanks,
Guenter

On Mon, Nov 02, 2015 at 07:48:15PM -0800, Guenter Roeck wrote:
> Some oddball devices may experience a PCIe link flap after power-on.
> This may result in the following sequence of events.
> 
> fpc0 kernel: pciehp 0000:02:08.0:pcie24: Card present on Slot(0)
> fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Up event
> fpc0 kernel: pciehp 0000:02:08.0:pcie24:
> 	Link Up event ignored on slot(0): already powering on
> fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Down event
> fpc0 kernel: pciehp 0000:02:08.0:pcie24:
> 	Link Down event queued on slot(0): currently getting powered on
> fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Up event
> fpc0 kernel: pciehp 0000:02:08.0:pcie24:
> 	Link Up event queued on slot(0): currently getting powered off
> 
> This causes the driver for affected devices to be instantiated, removed,
> and re-instantiated.
> 
> An even worse problem is a device which resets itself continuously.
> This can result in the following endless sequence of messages.
> 
> pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
> pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
> pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
> pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
> pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
> pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
> 
> The problem in the both cases is that all events are enqueued as hotplug
> work requests and executed in sequence, which can overwhelm the system
> and even result in "hung task" tracebacks in pciehp_power_thread().
> 
> This exposes an underlying limitation of the hotplug state machine: It
> executes all received requests, even though only the most recent request
> really needs to be handled. As a result, by the time a request is handled,
> it may well be obsolete and have been superseded by many other enable /
> disable requests which have been enqueued in the meantime.
> 
> To solve the problem, fold hotplug work requests into a single request.
> Store the request as well as the work data structure in 'struct slot',
> thus eliminating the need to allocate memory for each request.
> Handle a sequence of requests by setting a 'disable' flag when needed,
> indicating that a link needs to be disabled prior to re-enabling it.
> 
> With this change, requests and request sequences are handled as follows.
> 
> enable (single request):		enable link
> disable (single request):		disable link
> ... disable-enable-disable...disable:	disable link
> ... disable-enable-disable...enable:	disable link, then enable it
> 
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
> This is a different approach to solve the problem I tried to address
> earlier with with "PCI: pciehp: Add support for delayed power-on" [1].
> 
> While the earlier patch implemented an additional state in the hotplug
> state machine to solve the problem, the approach taken here is a bit
> simpler and more straightfoward. By folding multiple requests into one,
> the follow-up patch can use delayed work to implement power-on delays.
> An additional advantage is that this patch can be applied separately
> to simplify the state machine.
> 
> While working on this patch, I also tried to drop multiple "disable"
> requests, and only disable a slot if it was already disabled, to reduce
> overhead. Unfortunately, this did not always work. In some instances,
> I ended up in a situation where a device could not be enabled
> because the system thought that it already existed. I don't know
> if this is a weakness in this patch or some state change I did not catch. 
> This may be left for further study.
> 
> [1] https://lkml.org/lkml/2015/10/12/686
> 
>  drivers/pci/hotplug/pciehp.h      |  4 +++
>  drivers/pci/hotplug/pciehp_ctrl.c | 52 ++++++++++++++++++---------------------
>  drivers/pci/hotplug/pciehp_hpc.c  |  1 +
>  3 files changed, 29 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
> index 62d6fe6c3714..364b6fa32978 100644
> --- a/drivers/pci/hotplug/pciehp.h
> +++ b/drivers/pci/hotplug/pciehp.h
> @@ -78,6 +78,9 @@ struct slot {
>  	struct mutex lock;
>  	struct mutex hotplug_lock;
>  	struct workqueue_struct *wq;
> +	struct work_struct hotplug_work;
> +	u32 hotplug_req;
> +	bool disable;			/* true to disable before enable */
>  };
>  
>  struct event_info {
> @@ -130,6 +133,7 @@ void pciehp_queue_interrupt_event(struct slot *slot, u32 event_type);
>  int pciehp_configure_device(struct slot *p_slot);
>  int pciehp_unconfigure_device(struct slot *p_slot);
>  void pciehp_queue_pushbutton_work(struct work_struct *work);
> +void pciehp_power_thread(struct work_struct *work);
>  struct controller *pcie_init(struct pcie_device *dev);
>  int pcie_init_notification(struct controller *ctrl);
>  int pciehp_enable_slot(struct slot *p_slot);
> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
> index 880978b6d534..ad1321e91546 100644
> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -156,13 +156,9 @@ static int remove_board(struct slot *p_slot)
>  	return 0;
>  }
>  
> -struct power_work_info {
> -	struct slot *p_slot;
> -	struct work_struct work;
> -	unsigned int req;
> -#define DISABLE_REQ 0
> -#define ENABLE_REQ  1
> -};
> +/* Hotplug work requests */
> +#define DISABLE_REQ	0
> +#define ENABLE_REQ	1
>  
>  /**
>   * pciehp_power_thread - handle pushbutton events
> @@ -171,14 +167,19 @@ struct power_work_info {
>   * Scheduled procedure to handle blocking stuff for the pushbuttons.
>   * Handles all pending events and exits.
>   */
> -static void pciehp_power_thread(struct work_struct *work)
> +void pciehp_power_thread(struct work_struct *work)
>  {
> -	struct power_work_info *info =
> -		container_of(work, struct power_work_info, work);
> -	struct slot *p_slot = info->p_slot;
> -	int ret;
> +	struct slot *p_slot = container_of(work, struct slot, hotplug_work);
> +	int ret, req;
> +	bool disable;
> +
> +	mutex_lock(&p_slot->lock);
> +	req = p_slot->hotplug_req;
> +	disable = p_slot->disable;
> +	p_slot->disable = false;
> +	mutex_unlock(&p_slot->lock);
>  
> -	switch (info->req) {
> +	switch (req) {
>  	case DISABLE_REQ:
>  		mutex_lock(&p_slot->hotplug_lock);
>  		pciehp_disable_slot(p_slot);
> @@ -189,6 +190,8 @@ static void pciehp_power_thread(struct work_struct *work)
>  		break;
>  	case ENABLE_REQ:
>  		mutex_lock(&p_slot->hotplug_lock);
> +		if (disable)
> +			pciehp_disable_slot(p_slot);
>  		ret = pciehp_enable_slot(p_slot);
>  		mutex_unlock(&p_slot->hotplug_lock);
>  		if (ret)
> @@ -200,26 +203,19 @@ static void pciehp_power_thread(struct work_struct *work)
>  	default:
>  		break;
>  	}
> -
> -	kfree(info);
>  }
>  
>  static void pciehp_queue_power_work(struct slot *p_slot, int req)
>  {
> -	struct power_work_info *info;
> -
> -	p_slot->state = (req == ENABLE_REQ) ? POWERON_STATE : POWEROFF_STATE;
> -
> -	info = kmalloc(sizeof(*info), GFP_KERNEL);
> -	if (!info) {
> -		ctrl_err(p_slot->ctrl, "no memory to queue %s request\n",
> -			 (req == ENABLE_REQ) ? "poweron" : "poweroff");
> -		return;
> +	if (req == ENABLE_REQ) {
> +		p_slot->state = POWERON_STATE;
> +	} else {
> +		p_slot->state = POWEROFF_STATE;
> +		p_slot->disable = true;
>  	}
> -	info->p_slot = p_slot;
> -	INIT_WORK(&info->work, pciehp_power_thread);
> -	info->req = req;
> -	queue_work(p_slot->wq, &info->work);
> +	p_slot->hotplug_req = req;
> +
> +	queue_work(p_slot->wq, &p_slot->hotplug_work);
>  }
>  
>  void pciehp_queue_pushbutton_work(struct work_struct *work)
> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> index 5c24e938042f..e4e6fcbe1e20 100644
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
> @@ -755,6 +755,7 @@ static int pcie_init_slot(struct controller *ctrl)
>  	mutex_init(&slot->lock);
>  	mutex_init(&slot->hotplug_lock);
>  	INIT_DELAYED_WORK(&slot->work, pciehp_queue_pushbutton_work);
> +	INIT_WORK(&slot->hotplug_work, pciehp_power_thread);
>  	ctrl->slot = slot;
>  	return 0;
>  abort:
> -- 
> 2.1.4
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 1/2] PCI: pciehp: Merge hotplug work requests
  2015-11-03  3:48 [RFC PATCH 1/2] PCI: pciehp: Merge hotplug work requests Guenter Roeck
  2015-11-03  3:48 ` [RFC PATCH 2/2] PCI: pciehp: Implement support for delayed poweron Guenter Roeck
  2015-11-16 21:34 ` [RFC PATCH 1/2] PCI: pciehp: Merge hotplug work requests Guenter Roeck
@ 2016-01-08 16:18 ` Bjorn Helgaas
  2016-01-08 16:30   ` Guenter Roeck
  2 siblings, 1 reply; 7+ messages in thread
From: Bjorn Helgaas @ 2016-01-08 16:18 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Bjorn Helgaas, Yinghai Lu, linux-pci, linux-kernel

Hi Guenter,

Sorry for the delay in getting to this.

On Mon, Nov 02, 2015 at 07:48:15PM -0800, Guenter Roeck wrote:
> Some oddball devices may experience a PCIe link flap after power-on.
> This may result in the following sequence of events.
> 
> fpc0 kernel: pciehp 0000:02:08.0:pcie24: Card present on Slot(0)
> fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Up event
> fpc0 kernel: pciehp 0000:02:08.0:pcie24:
> 	Link Up event ignored on slot(0): already powering on
> fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Down event
> fpc0 kernel: pciehp 0000:02:08.0:pcie24:
> 	Link Down event queued on slot(0): currently getting powered on
> fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Up event
> fpc0 kernel: pciehp 0000:02:08.0:pcie24:
> 	Link Up event queued on slot(0): currently getting powered off
> 
> This causes the driver for affected devices to be instantiated, removed,
> and re-instantiated.
> 
> An even worse problem is a device which resets itself continuously.
> This can result in the following endless sequence of messages.
> 
> pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
> pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
> pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
> pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
> pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
> pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
> 
> The problem in the both cases is that all events are enqueued as hotplug
> work requests and executed in sequence, which can overwhelm the system
> and even result in "hung task" tracebacks in pciehp_power_thread().
> 
> This exposes an underlying limitation of the hotplug state machine: It
> executes all received requests, even though only the most recent request
> really needs to be handled. As a result, by the time a request is handled,
> it may well be obsolete and have been superseded by many other enable /
> disable requests which have been enqueued in the meantime.
> 
> To solve the problem, fold hotplug work requests into a single request.
> Store the request as well as the work data structure in 'struct slot',
> thus eliminating the need to allocate memory for each request.
> Handle a sequence of requests by setting a 'disable' flag when needed,
> indicating that a link needs to be disabled prior to re-enabling it.
> 
> With this change, requests and request sequences are handled as follows.
> 
> enable (single request):		enable link
> disable (single request):		disable link
> ... disable-enable-disable...disable:	disable link
> ... disable-enable-disable...enable:	disable link, then enable it

I think this is a really good idea, but I would like to understand
what the critical points are and how they affect the state machine.

I think you're basically accounting for the fact that some hotplug
controller commands are not completed instantaneously, and we might
receive more interrupts before the first command is completed.  I
suspect that your patch only makes a difference on controllers that
support Command Completed events, right?

You're not adding any timeouts, so I *think* you must be effectively
collapsing any events that occur before a Command Completed interrupt.
If that's the case, we should probably handle other events and
commands similarly.  The LED controls probably aren't important, but
what about the Electromechanical Interlock control?  Should that be
handled the same way you're handling power control?

Right now it feels a little bit ad hoc, and it would be nice if we
could make it more explicit somehow.

Bjorn

> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
> This is a different approach to solve the problem I tried to address
> earlier with with "PCI: pciehp: Add support for delayed power-on" [1].
> 
> While the earlier patch implemented an additional state in the hotplug
> state machine to solve the problem, the approach taken here is a bit
> simpler and more straightfoward. By folding multiple requests into one,
> the follow-up patch can use delayed work to implement power-on delays.
> An additional advantage is that this patch can be applied separately
> to simplify the state machine.
> 
> While working on this patch, I also tried to drop multiple "disable"
> requests, and only disable a slot if it was already disabled, to reduce
> overhead. Unfortunately, this did not always work. In some instances,
> I ended up in a situation where a device could not be enabled
> because the system thought that it already existed. I don't know
> if this is a weakness in this patch or some state change I did not catch. 
> This may be left for further study.
> 
> [1] https://lkml.org/lkml/2015/10/12/686
> 
>  drivers/pci/hotplug/pciehp.h      |  4 +++
>  drivers/pci/hotplug/pciehp_ctrl.c | 52 ++++++++++++++++++---------------------
>  drivers/pci/hotplug/pciehp_hpc.c  |  1 +
>  3 files changed, 29 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
> index 62d6fe6c3714..364b6fa32978 100644
> --- a/drivers/pci/hotplug/pciehp.h
> +++ b/drivers/pci/hotplug/pciehp.h
> @@ -78,6 +78,9 @@ struct slot {
>  	struct mutex lock;
>  	struct mutex hotplug_lock;
>  	struct workqueue_struct *wq;
> +	struct work_struct hotplug_work;
> +	u32 hotplug_req;
> +	bool disable;			/* true to disable before enable */
>  };
>  
>  struct event_info {
> @@ -130,6 +133,7 @@ void pciehp_queue_interrupt_event(struct slot *slot, u32 event_type);
>  int pciehp_configure_device(struct slot *p_slot);
>  int pciehp_unconfigure_device(struct slot *p_slot);
>  void pciehp_queue_pushbutton_work(struct work_struct *work);
> +void pciehp_power_thread(struct work_struct *work);
>  struct controller *pcie_init(struct pcie_device *dev);
>  int pcie_init_notification(struct controller *ctrl);
>  int pciehp_enable_slot(struct slot *p_slot);
> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
> index 880978b6d534..ad1321e91546 100644
> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -156,13 +156,9 @@ static int remove_board(struct slot *p_slot)
>  	return 0;
>  }
>  
> -struct power_work_info {
> -	struct slot *p_slot;
> -	struct work_struct work;
> -	unsigned int req;
> -#define DISABLE_REQ 0
> -#define ENABLE_REQ  1
> -};
> +/* Hotplug work requests */
> +#define DISABLE_REQ	0
> +#define ENABLE_REQ	1
>  
>  /**
>   * pciehp_power_thread - handle pushbutton events
> @@ -171,14 +167,19 @@ struct power_work_info {
>   * Scheduled procedure to handle blocking stuff for the pushbuttons.
>   * Handles all pending events and exits.
>   */
> -static void pciehp_power_thread(struct work_struct *work)
> +void pciehp_power_thread(struct work_struct *work)
>  {
> -	struct power_work_info *info =
> -		container_of(work, struct power_work_info, work);
> -	struct slot *p_slot = info->p_slot;
> -	int ret;
> +	struct slot *p_slot = container_of(work, struct slot, hotplug_work);
> +	int ret, req;
> +	bool disable;
> +
> +	mutex_lock(&p_slot->lock);
> +	req = p_slot->hotplug_req;
> +	disable = p_slot->disable;
> +	p_slot->disable = false;
> +	mutex_unlock(&p_slot->lock);
>  
> -	switch (info->req) {
> +	switch (req) {
>  	case DISABLE_REQ:
>  		mutex_lock(&p_slot->hotplug_lock);
>  		pciehp_disable_slot(p_slot);
> @@ -189,6 +190,8 @@ static void pciehp_power_thread(struct work_struct *work)
>  		break;
>  	case ENABLE_REQ:
>  		mutex_lock(&p_slot->hotplug_lock);
> +		if (disable)
> +			pciehp_disable_slot(p_slot);
>  		ret = pciehp_enable_slot(p_slot);
>  		mutex_unlock(&p_slot->hotplug_lock);
>  		if (ret)
> @@ -200,26 +203,19 @@ static void pciehp_power_thread(struct work_struct *work)
>  	default:
>  		break;
>  	}
> -
> -	kfree(info);
>  }
>  
>  static void pciehp_queue_power_work(struct slot *p_slot, int req)
>  {
> -	struct power_work_info *info;
> -
> -	p_slot->state = (req == ENABLE_REQ) ? POWERON_STATE : POWEROFF_STATE;
> -
> -	info = kmalloc(sizeof(*info), GFP_KERNEL);
> -	if (!info) {
> -		ctrl_err(p_slot->ctrl, "no memory to queue %s request\n",
> -			 (req == ENABLE_REQ) ? "poweron" : "poweroff");
> -		return;
> +	if (req == ENABLE_REQ) {
> +		p_slot->state = POWERON_STATE;
> +	} else {
> +		p_slot->state = POWEROFF_STATE;
> +		p_slot->disable = true;
>  	}
> -	info->p_slot = p_slot;
> -	INIT_WORK(&info->work, pciehp_power_thread);
> -	info->req = req;
> -	queue_work(p_slot->wq, &info->work);
> +	p_slot->hotplug_req = req;
> +
> +	queue_work(p_slot->wq, &p_slot->hotplug_work);
>  }
>  
>  void pciehp_queue_pushbutton_work(struct work_struct *work)
> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> index 5c24e938042f..e4e6fcbe1e20 100644
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
> @@ -755,6 +755,7 @@ static int pcie_init_slot(struct controller *ctrl)
>  	mutex_init(&slot->lock);
>  	mutex_init(&slot->hotplug_lock);
>  	INIT_DELAYED_WORK(&slot->work, pciehp_queue_pushbutton_work);
> +	INIT_WORK(&slot->hotplug_work, pciehp_power_thread);
>  	ctrl->slot = slot;
>  	return 0;
>  abort:
> -- 
> 2.1.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 1/2] PCI: pciehp: Merge hotplug work requests
  2016-01-08 16:18 ` Bjorn Helgaas
@ 2016-01-08 16:30   ` Guenter Roeck
  2016-01-08 17:46     ` Bjorn Helgaas
  0 siblings, 1 reply; 7+ messages in thread
From: Guenter Roeck @ 2016-01-08 16:30 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Bjorn Helgaas, Yinghai Lu, linux-pci, linux-kernel

On 01/08/2016 08:18 AM, Bjorn Helgaas wrote:
> Hi Guenter,
>
> Sorry for the delay in getting to this.
>
> On Mon, Nov 02, 2015 at 07:48:15PM -0800, Guenter Roeck wrote:
>> Some oddball devices may experience a PCIe link flap after power-on.
>> This may result in the following sequence of events.
>>
>> fpc0 kernel: pciehp 0000:02:08.0:pcie24: Card present on Slot(0)
>> fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Up event
>> fpc0 kernel: pciehp 0000:02:08.0:pcie24:
>> 	Link Up event ignored on slot(0): already powering on
>> fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Down event
>> fpc0 kernel: pciehp 0000:02:08.0:pcie24:
>> 	Link Down event queued on slot(0): currently getting powered on
>> fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Up event
>> fpc0 kernel: pciehp 0000:02:08.0:pcie24:
>> 	Link Up event queued on slot(0): currently getting powered off
>>
>> This causes the driver for affected devices to be instantiated, removed,
>> and re-instantiated.
>>
>> An even worse problem is a device which resets itself continuously.
>> This can result in the following endless sequence of messages.
>>
>> pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
>> pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
>> pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
>> pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
>> pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
>> pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
>>
>> The problem in the both cases is that all events are enqueued as hotplug
>> work requests and executed in sequence, which can overwhelm the system
>> and even result in "hung task" tracebacks in pciehp_power_thread().
>>
>> This exposes an underlying limitation of the hotplug state machine: It
>> executes all received requests, even though only the most recent request
>> really needs to be handled. As a result, by the time a request is handled,
>> it may well be obsolete and have been superseded by many other enable /
>> disable requests which have been enqueued in the meantime.
>>
>> To solve the problem, fold hotplug work requests into a single request.
>> Store the request as well as the work data structure in 'struct slot',
>> thus eliminating the need to allocate memory for each request.
>> Handle a sequence of requests by setting a 'disable' flag when needed,
>> indicating that a link needs to be disabled prior to re-enabling it.
>>
>> With this change, requests and request sequences are handled as follows.
>>
>> enable (single request):		enable link
>> disable (single request):		disable link
>> ... disable-enable-disable...disable:	disable link
>> ... disable-enable-disable...enable:	disable link, then enable it
>
> I think this is a really good idea, but I would like to understand
> what the critical points are and how they affect the state machine.
>
> I think you're basically accounting for the fact that some hotplug
> controller commands are not completed instantaneously, and we might
> receive more interrupts before the first command is completed.  I
> suspect that your patch only makes a difference on controllers that
> support Command Completed events, right?
>

No, not really. problem is that state change interrupts are not handled
immediately but enqueued. By the time an event is handled by the workqueue,
it is long since obsolete and superseded by other events.

> You're not adding any timeouts, so I *think* you must be effectively
> collapsing any events that occur before a Command Completed interrupt.
> If that's the case, we should probably handle other events and
> commands similarly.  The LED controls probably aren't important, but
> what about the Electromechanical Interlock control?  Should that be
> handled the same way you're handling power control?
>
> Right now it feels a little bit ad hoc, and it would be nice if we
> could make it more explicit somehow.
>

Yes, collapsing events was the idea, though not specifically related
to command completion but related to external events (power, link state
changes. I did not explore interaction with command completed interrupts.

I'll look into it some more to see if there is any problem or
interaction with command completion interrupts.

Guenter

> Bjorn
>
>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
>> ---
>> This is a different approach to solve the problem I tried to address
>> earlier with with "PCI: pciehp: Add support for delayed power-on" [1].
>>
>> While the earlier patch implemented an additional state in the hotplug
>> state machine to solve the problem, the approach taken here is a bit
>> simpler and more straightfoward. By folding multiple requests into one,
>> the follow-up patch can use delayed work to implement power-on delays.
>> An additional advantage is that this patch can be applied separately
>> to simplify the state machine.
>>
>> While working on this patch, I also tried to drop multiple "disable"
>> requests, and only disable a slot if it was already disabled, to reduce
>> overhead. Unfortunately, this did not always work. In some instances,
>> I ended up in a situation where a device could not be enabled
>> because the system thought that it already existed. I don't know
>> if this is a weakness in this patch or some state change I did not catch.
>> This may be left for further study.
>>
>> [1] https://lkml.org/lkml/2015/10/12/686
>>
>>   drivers/pci/hotplug/pciehp.h      |  4 +++
>>   drivers/pci/hotplug/pciehp_ctrl.c | 52 ++++++++++++++++++---------------------
>>   drivers/pci/hotplug/pciehp_hpc.c  |  1 +
>>   3 files changed, 29 insertions(+), 28 deletions(-)
>>
>> diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
>> index 62d6fe6c3714..364b6fa32978 100644
>> --- a/drivers/pci/hotplug/pciehp.h
>> +++ b/drivers/pci/hotplug/pciehp.h
>> @@ -78,6 +78,9 @@ struct slot {
>>   	struct mutex lock;
>>   	struct mutex hotplug_lock;
>>   	struct workqueue_struct *wq;
>> +	struct work_struct hotplug_work;
>> +	u32 hotplug_req;
>> +	bool disable;			/* true to disable before enable */
>>   };
>>
>>   struct event_info {
>> @@ -130,6 +133,7 @@ void pciehp_queue_interrupt_event(struct slot *slot, u32 event_type);
>>   int pciehp_configure_device(struct slot *p_slot);
>>   int pciehp_unconfigure_device(struct slot *p_slot);
>>   void pciehp_queue_pushbutton_work(struct work_struct *work);
>> +void pciehp_power_thread(struct work_struct *work);
>>   struct controller *pcie_init(struct pcie_device *dev);
>>   int pcie_init_notification(struct controller *ctrl);
>>   int pciehp_enable_slot(struct slot *p_slot);
>> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
>> index 880978b6d534..ad1321e91546 100644
>> --- a/drivers/pci/hotplug/pciehp_ctrl.c
>> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
>> @@ -156,13 +156,9 @@ static int remove_board(struct slot *p_slot)
>>   	return 0;
>>   }
>>
>> -struct power_work_info {
>> -	struct slot *p_slot;
>> -	struct work_struct work;
>> -	unsigned int req;
>> -#define DISABLE_REQ 0
>> -#define ENABLE_REQ  1
>> -};
>> +/* Hotplug work requests */
>> +#define DISABLE_REQ	0
>> +#define ENABLE_REQ	1
>>
>>   /**
>>    * pciehp_power_thread - handle pushbutton events
>> @@ -171,14 +167,19 @@ struct power_work_info {
>>    * Scheduled procedure to handle blocking stuff for the pushbuttons.
>>    * Handles all pending events and exits.
>>    */
>> -static void pciehp_power_thread(struct work_struct *work)
>> +void pciehp_power_thread(struct work_struct *work)
>>   {
>> -	struct power_work_info *info =
>> -		container_of(work, struct power_work_info, work);
>> -	struct slot *p_slot = info->p_slot;
>> -	int ret;
>> +	struct slot *p_slot = container_of(work, struct slot, hotplug_work);
>> +	int ret, req;
>> +	bool disable;
>> +
>> +	mutex_lock(&p_slot->lock);
>> +	req = p_slot->hotplug_req;
>> +	disable = p_slot->disable;
>> +	p_slot->disable = false;
>> +	mutex_unlock(&p_slot->lock);
>>
>> -	switch (info->req) {
>> +	switch (req) {
>>   	case DISABLE_REQ:
>>   		mutex_lock(&p_slot->hotplug_lock);
>>   		pciehp_disable_slot(p_slot);
>> @@ -189,6 +190,8 @@ static void pciehp_power_thread(struct work_struct *work)
>>   		break;
>>   	case ENABLE_REQ:
>>   		mutex_lock(&p_slot->hotplug_lock);
>> +		if (disable)
>> +			pciehp_disable_slot(p_slot);
>>   		ret = pciehp_enable_slot(p_slot);
>>   		mutex_unlock(&p_slot->hotplug_lock);
>>   		if (ret)
>> @@ -200,26 +203,19 @@ static void pciehp_power_thread(struct work_struct *work)
>>   	default:
>>   		break;
>>   	}
>> -
>> -	kfree(info);
>>   }
>>
>>   static void pciehp_queue_power_work(struct slot *p_slot, int req)
>>   {
>> -	struct power_work_info *info;
>> -
>> -	p_slot->state = (req == ENABLE_REQ) ? POWERON_STATE : POWEROFF_STATE;
>> -
>> -	info = kmalloc(sizeof(*info), GFP_KERNEL);
>> -	if (!info) {
>> -		ctrl_err(p_slot->ctrl, "no memory to queue %s request\n",
>> -			 (req == ENABLE_REQ) ? "poweron" : "poweroff");
>> -		return;
>> +	if (req == ENABLE_REQ) {
>> +		p_slot->state = POWERON_STATE;
>> +	} else {
>> +		p_slot->state = POWEROFF_STATE;
>> +		p_slot->disable = true;
>>   	}
>> -	info->p_slot = p_slot;
>> -	INIT_WORK(&info->work, pciehp_power_thread);
>> -	info->req = req;
>> -	queue_work(p_slot->wq, &info->work);
>> +	p_slot->hotplug_req = req;
>> +
>> +	queue_work(p_slot->wq, &p_slot->hotplug_work);
>>   }
>>
>>   void pciehp_queue_pushbutton_work(struct work_struct *work)
>> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
>> index 5c24e938042f..e4e6fcbe1e20 100644
>> --- a/drivers/pci/hotplug/pciehp_hpc.c
>> +++ b/drivers/pci/hotplug/pciehp_hpc.c
>> @@ -755,6 +755,7 @@ static int pcie_init_slot(struct controller *ctrl)
>>   	mutex_init(&slot->lock);
>>   	mutex_init(&slot->hotplug_lock);
>>   	INIT_DELAYED_WORK(&slot->work, pciehp_queue_pushbutton_work);
>> +	INIT_WORK(&slot->hotplug_work, pciehp_power_thread);
>>   	ctrl->slot = slot;
>>   	return 0;
>>   abort:
>> --
>> 2.1.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 1/2] PCI: pciehp: Merge hotplug work requests
  2016-01-08 16:30   ` Guenter Roeck
@ 2016-01-08 17:46     ` Bjorn Helgaas
  2016-01-09  1:27       ` Guenter Roeck
  0 siblings, 1 reply; 7+ messages in thread
From: Bjorn Helgaas @ 2016-01-08 17:46 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Bjorn Helgaas, Yinghai Lu, linux-pci, linux-kernel

On Fri, Jan 08, 2016 at 08:30:30AM -0800, Guenter Roeck wrote:
> On 01/08/2016 08:18 AM, Bjorn Helgaas wrote:
> >Hi Guenter,
> >
> >Sorry for the delay in getting to this.
> >
> >On Mon, Nov 02, 2015 at 07:48:15PM -0800, Guenter Roeck wrote:
> >>Some oddball devices may experience a PCIe link flap after power-on.
> >>This may result in the following sequence of events.
> >>
> >>fpc0 kernel: pciehp 0000:02:08.0:pcie24: Card present on Slot(0)
> >>fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Up event
> >>fpc0 kernel: pciehp 0000:02:08.0:pcie24:
> >>	Link Up event ignored on slot(0): already powering on
> >>fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Down event
> >>fpc0 kernel: pciehp 0000:02:08.0:pcie24:
> >>	Link Down event queued on slot(0): currently getting powered on
> >>fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Up event
> >>fpc0 kernel: pciehp 0000:02:08.0:pcie24:
> >>	Link Up event queued on slot(0): currently getting powered off
> >>
> >>This causes the driver for affected devices to be instantiated, removed,
> >>and re-instantiated.
> >>
> >>An even worse problem is a device which resets itself continuously.
> >>This can result in the following endless sequence of messages.
> >>
> >>pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
> >>pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
> >>pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
> >>pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
> >>pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
> >>pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
> >>
> >>The problem in the both cases is that all events are enqueued as hotplug
> >>work requests and executed in sequence, which can overwhelm the system
> >>and even result in "hung task" tracebacks in pciehp_power_thread().
> >>
> >>This exposes an underlying limitation of the hotplug state machine: It
> >>executes all received requests, even though only the most recent request
> >>really needs to be handled. As a result, by the time a request is handled,
> >>it may well be obsolete and have been superseded by many other enable /
> >>disable requests which have been enqueued in the meantime.
> >>
> >>To solve the problem, fold hotplug work requests into a single request.
> >>Store the request as well as the work data structure in 'struct slot',
> >>thus eliminating the need to allocate memory for each request.
> >>Handle a sequence of requests by setting a 'disable' flag when needed,
> >>indicating that a link needs to be disabled prior to re-enabling it.
> >>
> >>With this change, requests and request sequences are handled as follows.
> >>
> >>enable (single request):		enable link
> >>disable (single request):		disable link
> >>... disable-enable-disable...disable:	disable link
> >>... disable-enable-disable...enable:	disable link, then enable it
> >
> >I think this is a really good idea, but I would like to understand
> >what the critical points are and how they affect the state machine.
> >
> >I think you're basically accounting for the fact that some hotplug
> >controller commands are not completed instantaneously, and we might
> >receive more interrupts before the first command is completed.  I
> >suspect that your patch only makes a difference on controllers that
> >support Command Completed events, right?
> 
> No, not really. problem is that state change interrupts are not handled
> immediately but enqueued. By the time an event is handled by the workqueue,
> it is long since obsolete and superseded by other events.

Ah.  So the important interval is the one between pcie_isr(), where we
enqueue work, and the worker threads that are awakened to actually do
the work.  Then the idea is that only the most recent state is
important -- if we have several presence detect changes, e.g.,
present, absent, present, absent, before the worker thread starts
processing them, it should only look at the most recent state.  That
seems like the right thing to me, and I think the removal of kmalloc
from the ISR path is an important consequence of doing that.

Blue sky thinking:

  - Do all interrupt-related CSR reads in pcie_isr() (as opposed to
    doing some in pcie_isr() and others in the worker threads).  I
    think this is important for consistency.  I think it's a mistake
    to read and clear the status register in pcie_isr(), then go back
    and read other CSRs later in the worker threads.

  - Have pcie_isr() update a single set of "current state" CSR values.
    This means we don't need any allocation in pcie_isr().  Some
    values, e.g., Power Fault Detected, might be OR-d in.  Others, 
    e.g., Presence Detect, might overwrite the previous value.

  - Collapse the several worker threads into a single one that
    pcie_isr() would awaken (as opposed to having pcie_isr() decide
    whether this is a button press, presence detect change, link
    event, etc.)

  - Have the single worker thread figure out what happened and how to
    advance the state machine, based on the CSR values read by the
    most recent pcie_isr() invocation.

This would be a lot of changes, but I think it has the potential to
centralize the state machine management and simplify things
significantly.

Bjorn

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 1/2] PCI: pciehp: Merge hotplug work requests
  2016-01-08 17:46     ` Bjorn Helgaas
@ 2016-01-09  1:27       ` Guenter Roeck
  0 siblings, 0 replies; 7+ messages in thread
From: Guenter Roeck @ 2016-01-09  1:27 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Bjorn Helgaas, Yinghai Lu, linux-pci, linux-kernel

Hi Bjorn,

On 01/08/2016 09:46 AM, Bjorn Helgaas wrote:
> On Fri, Jan 08, 2016 at 08:30:30AM -0800, Guenter Roeck wrote:
>> On 01/08/2016 08:18 AM, Bjorn Helgaas wrote:
>>> Hi Guenter,
>>>
>>> Sorry for the delay in getting to this.
>>>
>>> On Mon, Nov 02, 2015 at 07:48:15PM -0800, Guenter Roeck wrote:
>>>> Some oddball devices may experience a PCIe link flap after power-on.
>>>> This may result in the following sequence of events.
>>>>
>>>> fpc0 kernel: pciehp 0000:02:08.0:pcie24: Card present on Slot(0)
>>>> fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Up event
>>>> fpc0 kernel: pciehp 0000:02:08.0:pcie24:
>>>> 	Link Up event ignored on slot(0): already powering on
>>>> fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Down event
>>>> fpc0 kernel: pciehp 0000:02:08.0:pcie24:
>>>> 	Link Down event queued on slot(0): currently getting powered on
>>>> fpc0 kernel: pciehp 0000:02:08.0:pcie24: slot(0): Link Up event
>>>> fpc0 kernel: pciehp 0000:02:08.0:pcie24:
>>>> 	Link Up event queued on slot(0): currently getting powered off
>>>>
>>>> This causes the driver for affected devices to be instantiated, removed,
>>>> and re-instantiated.
>>>>
>>>> An even worse problem is a device which resets itself continuously.
>>>> This can result in the following endless sequence of messages.
>>>>
>>>> pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
>>>> pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
>>>> pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
>>>> pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
>>>> pciehp 0000:02:0a.0:pcie24: Card present on Slot(10)
>>>> pciehp 0000:02:0a.0:pcie24: Card not present on Slot(10)
>>>>
>>>> The problem in the both cases is that all events are enqueued as hotplug
>>>> work requests and executed in sequence, which can overwhelm the system
>>>> and even result in "hung task" tracebacks in pciehp_power_thread().
>>>>
>>>> This exposes an underlying limitation of the hotplug state machine: It
>>>> executes all received requests, even though only the most recent request
>>>> really needs to be handled. As a result, by the time a request is handled,
>>>> it may well be obsolete and have been superseded by many other enable /
>>>> disable requests which have been enqueued in the meantime.
>>>>
>>>> To solve the problem, fold hotplug work requests into a single request.
>>>> Store the request as well as the work data structure in 'struct slot',
>>>> thus eliminating the need to allocate memory for each request.
>>>> Handle a sequence of requests by setting a 'disable' flag when needed,
>>>> indicating that a link needs to be disabled prior to re-enabling it.
>>>>
>>>> With this change, requests and request sequences are handled as follows.
>>>>
>>>> enable (single request):		enable link
>>>> disable (single request):		disable link
>>>> ... disable-enable-disable...disable:	disable link
>>>> ... disable-enable-disable...enable:	disable link, then enable it
>>>
>>> I think this is a really good idea, but I would like to understand
>>> what the critical points are and how they affect the state machine.
>>>
>>> I think you're basically accounting for the fact that some hotplug
>>> controller commands are not completed instantaneously, and we might
>>> receive more interrupts before the first command is completed.  I
>>> suspect that your patch only makes a difference on controllers that
>>> support Command Completed events, right?
>>
>> No, not really. problem is that state change interrupts are not handled
>> immediately but enqueued. By the time an event is handled by the workqueue,
>> it is long since obsolete and superseded by other events.
>
> Ah.  So the important interval is the one between pcie_isr(), where we
> enqueue work, and the worker threads that are awakened to actually do
> the work.  Then the idea is that only the most recent state is
> important -- if we have several presence detect changes, e.g.,
> present, absent, present, absent, before the worker thread starts
> processing them, it should only look at the most recent state.  That
> seems like the right thing to me, and I think the removal of kmalloc
> from the ISR path is an important consequence of doing that.
>
Yes, exactly.

> Blue sky thinking:
>
>    - Do all interrupt-related CSR reads in pcie_isr() (as opposed to
>      doing some in pcie_isr() and others in the worker threads).  I
>      think this is important for consistency.  I think it's a mistake
>      to read and clear the status register in pcie_isr(), then go back
>      and read other CSRs later in the worker threads.
>
>    - Have pcie_isr() update a single set of "current state" CSR values.
>      This means we don't need any allocation in pcie_isr().  Some
>      values, e.g., Power Fault Detected, might be OR-d in.  Others,
>      e.g., Presence Detect, might overwrite the previous value.
>
>    - Collapse the several worker threads into a single one that
>      pcie_isr() would awaken (as opposed to having pcie_isr() decide
>      whether this is a button press, presence detect change, link
>      event, etc.)
>
>    - Have the single worker thread figure out what happened and how to
>      advance the state machine, based on the CSR values read by the
>      most recent pcie_isr() invocation.
>
> This would be a lot of changes, but I think it has the potential to
> centralize the state machine management and simplify things
> significantly.
>
That all makes a lot of sense, and I would love to spend some time on it.
Unfortunately, I don't think I will be able to spend much time on this
anytime soon.

Any chance to accept at least the first of my two patches ? Or, in other
words, do you see an immediate problem with it that I could address in,
say, the next week or two ?

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-01-09  1:27 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-03  3:48 [RFC PATCH 1/2] PCI: pciehp: Merge hotplug work requests Guenter Roeck
2015-11-03  3:48 ` [RFC PATCH 2/2] PCI: pciehp: Implement support for delayed poweron Guenter Roeck
2015-11-16 21:34 ` [RFC PATCH 1/2] PCI: pciehp: Merge hotplug work requests Guenter Roeck
2016-01-08 16:18 ` Bjorn Helgaas
2016-01-08 16:30   ` Guenter Roeck
2016-01-08 17:46     ` Bjorn Helgaas
2016-01-09  1:27       ` Guenter Roeck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.