linux-gpio.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/1] pinctrl: baytrail: Really serialize all register accesses
@ 2019-11-18 14:20 Hans de Goede
  2019-11-18 14:20 ` [PATCH] " Hans de Goede
  2019-11-18 17:23 ` [PATCH 0/1] " Andy Shevchenko
  0 siblings, 2 replies; 6+ messages in thread
From: Hans de Goede @ 2019-11-18 14:20 UTC (permalink / raw)
  To: Mika Westerberg, Andy Shevchenko, Bartosz Golaszewski, Linus Walleij
  Cc: Hans de Goede, linux-gpio, linux-acpi

Hi All,

First let me copy-paste a bit of the commit msg for background:

---begin commit msg---
Commit 39ce8150a079 ("pinctrl: baytrail: Serialize all register access")
added a spinlock around all register accesses because:

"There is a hardware issue in Intel Baytrail where concurrent GPIO register
 access might result reads of 0xffffffff and writes might get dropped
 completely."

Testing has shown that this does not catch all cases, there are still
2 problems remaining

1) The original fix uses a spinlock per byt_gpio device / struct,
additional testing has shown that this is not sufficient concurent
accesses to 2 different GPIO banks also suffer from the same problem.

This commit fixes this by moving to a single global lock.

2) The original fix did not add a lock around the register accesses in
the suspend/resume handling.

Since pinctrl-baytrail.c is using normal suspend/resume handlers,
interrupts are still enabled during suspend/resume handling. Nothing
should be using the GPIOs when they are being taken down, _but_ the
GPIOs themselves may still cause interrupts, which are likely to
use (read) the triggering GPIO. So we need to protect against
concurrent GPIO register accesses in the suspend/resume handlers too.

This commit fixes this by adding the missing spin_lock / unlock calls.
---end commit msg---

I was discussing the problem at my local hackerspace yesterday with
someone who knows quite a bit low-level details about Intel CPUs.
He said that on "big" core's all the GPIO islands sit behind the IOSF
and that there is a bridge which make their registers look like regular
mmio registers.

I wonder if the same is happening on BYT, that would point to an issue
with concurent accesses being done through the IOSF bridge, which would
explain why I've found that we need a single global lock for all GPIO
islands (*). But that in turn would mean that we really need global lock
for _all_ accesses done through the IOSF bridge, not just GPIO register
accesses...  Which would explain why on some machines BYT has never been
really stable with Linux.

Can someone at Intel with access to internal docs about BYT dig a bit into
the docs and see if 1. this theory makes sense, 2. if it does if there us
anything else behind the IOSF for which we also need to serialize accesses?

Regards,

Hans




*) It seems that the GPIO interrupt storm on the test device I hit
this is really good in triggering this


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] pinctrl: baytrail: Really serialize all register accesses
  2019-11-18 14:20 [PATCH 0/1] pinctrl: baytrail: Really serialize all register accesses Hans de Goede
@ 2019-11-18 14:20 ` Hans de Goede
  2019-11-19  8:19   ` Mika Westerberg
  2019-11-18 17:23 ` [PATCH 0/1] " Andy Shevchenko
  1 sibling, 1 reply; 6+ messages in thread
From: Hans de Goede @ 2019-11-18 14:20 UTC (permalink / raw)
  To: Mika Westerberg, Andy Shevchenko, Bartosz Golaszewski, Linus Walleij
  Cc: Hans de Goede, linux-gpio, linux-acpi, stable

Commit 39ce8150a079 ("pinctrl: baytrail: Serialize all register access")
added a spinlock around all register accesses because:

"There is a hardware issue in Intel Baytrail where concurrent GPIO register
 access might result reads of 0xffffffff and writes might get dropped
 completely."

Testing has shown that this does not catch all cases, there are still
2 problems remaining

1) The original fix uses a spinlock per byt_gpio device / struct,
additional testing has shown that this is not sufficient concurent
accesses to 2 different GPIO banks also suffer from the same problem.

This commit fixes this by moving to a single global lock.

2) The original fix did not add a lock around the register accesses in
the suspend/resume handling.

Since pinctrl-baytrail.c is using normal suspend/resume handlers,
interrupts are still enabled during suspend/resume handling. Nothing
should be using the GPIOs when they are being taken down, _but_ the
GPIOs themselves may still cause interrupts, which are likely to
use (read) the triggering GPIO. So we need to protect against
concurrent GPIO register accesses in the suspend/resume handlers too.

This commit fixes this by adding the missing spin_lock / unlock calls.

The 2 fixes together fix the Acer Switch 10 SW5-012 getting completely
confused after a suspend resume. The DSDT for this device has a bug
in its _LID method which reprograms the home and power button trigger-
flags requesting both high and low _level_ interrupts so the IRQs for
these 2 GPIOs continuously fire. This combined with the saving of
registers during suspend, triggers concurrent GPIO register accesses
resulting in saving 0xffffffff as pconf0 value during suspend and then
when restoring this on resume the pinmux settings get all messed up,
resulting in various I2C busses being stuck, the wifi no longer working
and often the tablet simply not coming out of suspend at all.

Cc: stable@vger.kernel.org
Fixes: 39ce8150a079 ("pinctrl: baytrail: Serialize all register access")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 drivers/pinctrl/intel/pinctrl-baytrail.c | 81 +++++++++++++-----------
 1 file changed, 44 insertions(+), 37 deletions(-)

diff --git a/drivers/pinctrl/intel/pinctrl-baytrail.c b/drivers/pinctrl/intel/pinctrl-baytrail.c
index b18336d42252..1b289f64c3a2 100644
--- a/drivers/pinctrl/intel/pinctrl-baytrail.c
+++ b/drivers/pinctrl/intel/pinctrl-baytrail.c
@@ -111,7 +111,6 @@ struct byt_gpio {
 	struct platform_device *pdev;
 	struct pinctrl_dev *pctl_dev;
 	struct pinctrl_desc pctl_desc;
-	raw_spinlock_t lock;
 	const struct intel_pinctrl_soc_data *soc_data;
 	struct intel_community *communities_copy;
 	struct byt_gpio_pin_context *saved_context;
@@ -550,6 +549,8 @@ static const struct intel_pinctrl_soc_data *byt_soc_data[] = {
 	NULL
 };
 
+static DEFINE_RAW_SPINLOCK(byt_gpio_lock);
+
 static struct intel_community *byt_get_community(struct byt_gpio *vg,
 						 unsigned int pin)
 {
@@ -659,7 +660,7 @@ static void byt_set_group_simple_mux(struct byt_gpio *vg,
 	unsigned long flags;
 	int i;
 
-	raw_spin_lock_irqsave(&vg->lock, flags);
+	raw_spin_lock_irqsave(&byt_gpio_lock, flags);
 
 	for (i = 0; i < group.npins; i++) {
 		void __iomem *padcfg0;
@@ -679,7 +680,7 @@ static void byt_set_group_simple_mux(struct byt_gpio *vg,
 		writel(value, padcfg0);
 	}
 
-	raw_spin_unlock_irqrestore(&vg->lock, flags);
+	raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 }
 
 static void byt_set_group_mixed_mux(struct byt_gpio *vg,
@@ -689,7 +690,7 @@ static void byt_set_group_mixed_mux(struct byt_gpio *vg,
 	unsigned long flags;
 	int i;
 
-	raw_spin_lock_irqsave(&vg->lock, flags);
+	raw_spin_lock_irqsave(&byt_gpio_lock, flags);
 
 	for (i = 0; i < group.npins; i++) {
 		void __iomem *padcfg0;
@@ -709,7 +710,7 @@ static void byt_set_group_mixed_mux(struct byt_gpio *vg,
 		writel(value, padcfg0);
 	}
 
-	raw_spin_unlock_irqrestore(&vg->lock, flags);
+	raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 }
 
 static int byt_set_mux(struct pinctrl_dev *pctldev, unsigned int func_selector,
@@ -750,11 +751,11 @@ static void byt_gpio_clear_triggering(struct byt_gpio *vg, unsigned int offset)
 	unsigned long flags;
 	u32 value;
 
-	raw_spin_lock_irqsave(&vg->lock, flags);
+	raw_spin_lock_irqsave(&byt_gpio_lock, flags);
 	value = readl(reg);
 	value &= ~(BYT_TRIG_POS | BYT_TRIG_NEG | BYT_TRIG_LVL);
 	writel(value, reg);
-	raw_spin_unlock_irqrestore(&vg->lock, flags);
+	raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 }
 
 static int byt_gpio_request_enable(struct pinctrl_dev *pctl_dev,
@@ -766,7 +767,7 @@ static int byt_gpio_request_enable(struct pinctrl_dev *pctl_dev,
 	u32 value, gpio_mux;
 	unsigned long flags;
 
-	raw_spin_lock_irqsave(&vg->lock, flags);
+	raw_spin_lock_irqsave(&byt_gpio_lock, flags);
 
 	/*
 	 * In most cases, func pin mux 000 means GPIO function.
@@ -788,7 +789,7 @@ static int byt_gpio_request_enable(struct pinctrl_dev *pctl_dev,
 			 "pin %u forcibly re-configured as GPIO\n", offset);
 	}
 
-	raw_spin_unlock_irqrestore(&vg->lock, flags);
+	raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 
 	pm_runtime_get(&vg->pdev->dev);
 
@@ -816,7 +817,7 @@ static int byt_gpio_set_direction(struct pinctrl_dev *pctl_dev,
 	unsigned long flags;
 	u32 value;
 
-	raw_spin_lock_irqsave(&vg->lock, flags);
+	raw_spin_lock_irqsave(&byt_gpio_lock, flags);
 
 	value = readl(val_reg);
 	value &= ~BYT_DIR_MASK;
@@ -833,7 +834,7 @@ static int byt_gpio_set_direction(struct pinctrl_dev *pctl_dev,
 		     "Potential Error: Setting GPIO with direct_irq_en to output");
 	writel(value, val_reg);
 
-	raw_spin_unlock_irqrestore(&vg->lock, flags);
+	raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 
 	return 0;
 }
@@ -902,11 +903,11 @@ static int byt_pin_config_get(struct pinctrl_dev *pctl_dev, unsigned int offset,
 	u32 conf, pull, val, debounce;
 	u16 arg = 0;
 
-	raw_spin_lock_irqsave(&vg->lock, flags);
+	raw_spin_lock_irqsave(&byt_gpio_lock, flags);
 	conf = readl(conf_reg);
 	pull = conf & BYT_PULL_ASSIGN_MASK;
 	val = readl(val_reg);
-	raw_spin_unlock_irqrestore(&vg->lock, flags);
+	raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 
 	switch (param) {
 	case PIN_CONFIG_BIAS_DISABLE:
@@ -933,9 +934,9 @@ static int byt_pin_config_get(struct pinctrl_dev *pctl_dev, unsigned int offset,
 		if (!(conf & BYT_DEBOUNCE_EN))
 			return -EINVAL;
 
-		raw_spin_lock_irqsave(&vg->lock, flags);
+		raw_spin_lock_irqsave(&byt_gpio_lock, flags);
 		debounce = readl(db_reg);
-		raw_spin_unlock_irqrestore(&vg->lock, flags);
+		raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 
 		switch (debounce & BYT_DEBOUNCE_PULSE_MASK) {
 		case BYT_DEBOUNCE_PULSE_375US:
@@ -987,7 +988,7 @@ static int byt_pin_config_set(struct pinctrl_dev *pctl_dev,
 	u32 conf, val, debounce;
 	int i, ret = 0;
 
-	raw_spin_lock_irqsave(&vg->lock, flags);
+	raw_spin_lock_irqsave(&byt_gpio_lock, flags);
 
 	conf = readl(conf_reg);
 	val = readl(val_reg);
@@ -1095,7 +1096,7 @@ static int byt_pin_config_set(struct pinctrl_dev *pctl_dev,
 	if (!ret)
 		writel(conf, conf_reg);
 
-	raw_spin_unlock_irqrestore(&vg->lock, flags);
+	raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 
 	return ret;
 }
@@ -1120,9 +1121,9 @@ static int byt_gpio_get(struct gpio_chip *chip, unsigned int offset)
 	unsigned long flags;
 	u32 val;
 
-	raw_spin_lock_irqsave(&vg->lock, flags);
+	raw_spin_lock_irqsave(&byt_gpio_lock, flags);
 	val = readl(reg);
-	raw_spin_unlock_irqrestore(&vg->lock, flags);
+	raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 
 	return !!(val & BYT_LEVEL);
 }
@@ -1137,13 +1138,13 @@ static void byt_gpio_set(struct gpio_chip *chip, unsigned int offset, int value)
 	if (!reg)
 		return;
 
-	raw_spin_lock_irqsave(&vg->lock, flags);
+	raw_spin_lock_irqsave(&byt_gpio_lock, flags);
 	old_val = readl(reg);
 	if (value)
 		writel(old_val | BYT_LEVEL, reg);
 	else
 		writel(old_val & ~BYT_LEVEL, reg);
-	raw_spin_unlock_irqrestore(&vg->lock, flags);
+	raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 }
 
 static int byt_gpio_get_direction(struct gpio_chip *chip, unsigned int offset)
@@ -1156,9 +1157,9 @@ static int byt_gpio_get_direction(struct gpio_chip *chip, unsigned int offset)
 	if (!reg)
 		return -EINVAL;
 
-	raw_spin_lock_irqsave(&vg->lock, flags);
+	raw_spin_lock_irqsave(&byt_gpio_lock, flags);
 	value = readl(reg);
-	raw_spin_unlock_irqrestore(&vg->lock, flags);
+	raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 
 	if (!(value & BYT_OUTPUT_EN))
 		return 0;
@@ -1201,14 +1202,14 @@ static void byt_gpio_dbg_show(struct seq_file *s, struct gpio_chip *chip)
 		const char *label;
 		unsigned int pin;
 
-		raw_spin_lock_irqsave(&vg->lock, flags);
+		raw_spin_lock_irqsave(&byt_gpio_lock, flags);
 		pin = vg->soc_data->pins[i].number;
 		reg = byt_gpio_reg(vg, pin, BYT_CONF0_REG);
 		if (!reg) {
 			seq_printf(s,
 				   "Could not retrieve pin %i conf0 reg\n",
 				   pin);
-			raw_spin_unlock_irqrestore(&vg->lock, flags);
+			raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 			continue;
 		}
 		conf0 = readl(reg);
@@ -1217,11 +1218,11 @@ static void byt_gpio_dbg_show(struct seq_file *s, struct gpio_chip *chip)
 		if (!reg) {
 			seq_printf(s,
 				   "Could not retrieve pin %i val reg\n", pin);
-			raw_spin_unlock_irqrestore(&vg->lock, flags);
+			raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 			continue;
 		}
 		val = readl(reg);
-		raw_spin_unlock_irqrestore(&vg->lock, flags);
+		raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 
 		comm = byt_get_community(vg, pin);
 		if (!comm) {
@@ -1305,9 +1306,9 @@ static void byt_irq_ack(struct irq_data *d)
 	if (!reg)
 		return;
 
-	raw_spin_lock(&vg->lock);
+	raw_spin_lock(&byt_gpio_lock);
 	writel(BIT(offset % 32), reg);
-	raw_spin_unlock(&vg->lock);
+	raw_spin_unlock(&byt_gpio_lock);
 }
 
 static void byt_irq_mask(struct irq_data *d)
@@ -1331,7 +1332,7 @@ static void byt_irq_unmask(struct irq_data *d)
 	if (!reg)
 		return;
 
-	raw_spin_lock_irqsave(&vg->lock, flags);
+	raw_spin_lock_irqsave(&byt_gpio_lock, flags);
 	value = readl(reg);
 
 	switch (irqd_get_trigger_type(d)) {
@@ -1354,7 +1355,7 @@ static void byt_irq_unmask(struct irq_data *d)
 
 	writel(value, reg);
 
-	raw_spin_unlock_irqrestore(&vg->lock, flags);
+	raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 }
 
 static int byt_irq_type(struct irq_data *d, unsigned int type)
@@ -1368,7 +1369,7 @@ static int byt_irq_type(struct irq_data *d, unsigned int type)
 	if (!reg || offset >= vg->chip.ngpio)
 		return -EINVAL;
 
-	raw_spin_lock_irqsave(&vg->lock, flags);
+	raw_spin_lock_irqsave(&byt_gpio_lock, flags);
 	value = readl(reg);
 
 	WARN(value & BYT_DIRECT_IRQ_EN,
@@ -1390,7 +1391,7 @@ static int byt_irq_type(struct irq_data *d, unsigned int type)
 	else if (type & IRQ_TYPE_LEVEL_MASK)
 		irq_set_handler_locked(d, handle_level_irq);
 
-	raw_spin_unlock_irqrestore(&vg->lock, flags);
+	raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 
 	return 0;
 }
@@ -1417,9 +1418,9 @@ static void byt_gpio_irq_handler(struct irq_desc *desc)
 			continue;
 		}
 
-		raw_spin_lock(&vg->lock);
+		raw_spin_lock(&byt_gpio_lock);
 		pending = readl(reg);
-		raw_spin_unlock(&vg->lock);
+		raw_spin_unlock(&byt_gpio_lock);
 		for_each_set_bit(pin, &pending, 32) {
 			virq = irq_find_mapping(vg->chip.irq.domain, base + pin);
 			generic_handle_irq(virq);
@@ -1646,8 +1647,6 @@ static int byt_pinctrl_probe(struct platform_device *pdev)
 		return PTR_ERR(vg->pctl_dev);
 	}
 
-	raw_spin_lock_init(&vg->lock);
-
 	ret = byt_gpio_probe(vg);
 	if (ret)
 		return ret;
@@ -1662,8 +1661,11 @@ static int byt_pinctrl_probe(struct platform_device *pdev)
 static int byt_gpio_suspend(struct device *dev)
 {
 	struct byt_gpio *vg = dev_get_drvdata(dev);
+	unsigned long flags;
 	int i;
 
+	raw_spin_lock_irqsave(&byt_gpio_lock, flags);
+
 	for (i = 0; i < vg->soc_data->npins; i++) {
 		void __iomem *reg;
 		u32 value;
@@ -1684,14 +1686,18 @@ static int byt_gpio_suspend(struct device *dev)
 		vg->saved_context[i].val = value;
 	}
 
+	raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 	return 0;
 }
 
 static int byt_gpio_resume(struct device *dev)
 {
 	struct byt_gpio *vg = dev_get_drvdata(dev);
+	unsigned long flags;
 	int i;
 
+	raw_spin_lock_irqsave(&byt_gpio_lock, flags);
+
 	for (i = 0; i < vg->soc_data->npins; i++) {
 		void __iomem *reg;
 		u32 value;
@@ -1729,6 +1735,7 @@ static int byt_gpio_resume(struct device *dev)
 		}
 	}
 
+	raw_spin_unlock_irqrestore(&byt_gpio_lock, flags);
 	return 0;
 }
 #endif
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/1] pinctrl: baytrail: Really serialize all register accesses
  2019-11-18 14:20 [PATCH 0/1] pinctrl: baytrail: Really serialize all register accesses Hans de Goede
  2019-11-18 14:20 ` [PATCH] " Hans de Goede
@ 2019-11-18 17:23 ` Andy Shevchenko
  2019-11-18 18:17   ` Hans de Goede
  1 sibling, 1 reply; 6+ messages in thread
From: Andy Shevchenko @ 2019-11-18 17:23 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Mika Westerberg, Bartosz Golaszewski, Linus Walleij, linux-gpio,
	linux-acpi

On Mon, Nov 18, 2019 at 03:20:19PM +0100, Hans de Goede wrote:
> Hi All,
> 
> First let me copy-paste a bit of the commit msg for background:
> 
> ---begin commit msg---
> Commit 39ce8150a079 ("pinctrl: baytrail: Serialize all register access")
> added a spinlock around all register accesses because:
> 
> "There is a hardware issue in Intel Baytrail where concurrent GPIO register
>  access might result reads of 0xffffffff and writes might get dropped
>  completely."
> 
> Testing has shown that this does not catch all cases, there are still
> 2 problems remaining
> 
> 1) The original fix uses a spinlock per byt_gpio device / struct,
> additional testing has shown that this is not sufficient concurent
> accesses to 2 different GPIO banks also suffer from the same problem.
> 
> This commit fixes this by moving to a single global lock.
> 
> 2) The original fix did not add a lock around the register accesses in
> the suspend/resume handling.
> 
> Since pinctrl-baytrail.c is using normal suspend/resume handlers,
> interrupts are still enabled during suspend/resume handling. Nothing
> should be using the GPIOs when they are being taken down, _but_ the
> GPIOs themselves may still cause interrupts, which are likely to
> use (read) the triggering GPIO. So we need to protect against
> concurrent GPIO register accesses in the suspend/resume handlers too.
> 
> This commit fixes this by adding the missing spin_lock / unlock calls.
> ---end commit msg---
> 
> I was discussing the problem at my local hackerspace yesterday with
> someone who knows quite a bit low-level details about Intel CPUs.
> He said that on "big" core's all the GPIO islands sit behind the IOSF
> and that there is a bridge which make their registers look like regular
> mmio registers.
> 
> I wonder if the same is happening on BYT, that would point to an issue
> with concurent accesses being done through the IOSF bridge, which would
> explain why I've found that we need a single global lock for all GPIO
> islands (*).

> But that in turn would mean that we really need global lock
> for _all_ accesses done through the IOSF bridge, not just GPIO register
> accesses...  Which would explain why on some machines BYT has never been
> really stable with Linux.

I don't think so. The Cherriview seems inherited the same silicon issue (see
comment near to chv_lock), though it seems to be fixed for all new hardware
(Skylake+).

> Can someone at Intel with access to internal docs about BYT dig a bit into
> the docs and see if 1. this theory makes sense, 2. if it does if there us
> anything else behind the IOSF for which we also need to serialize accesses?

Unfortunately I have no time to dig into this right now.

> 
> Regards,
> 
> Hans
> 
> 
> 
> 
> *) It seems that the GPIO interrupt storm on the test device I hit
> this is really good in triggering this
> 

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/1] pinctrl: baytrail: Really serialize all register accesses
  2019-11-18 17:23 ` [PATCH 0/1] " Andy Shevchenko
@ 2019-11-18 18:17   ` Hans de Goede
  0 siblings, 0 replies; 6+ messages in thread
From: Hans de Goede @ 2019-11-18 18:17 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Mika Westerberg, Bartosz Golaszewski, Linus Walleij, linux-gpio,
	linux-acpi

Hi,

On 18-11-2019 18:23, Andy Shevchenko wrote:
> On Mon, Nov 18, 2019 at 03:20:19PM +0100, Hans de Goede wrote:
>> Hi All,
>>
>> First let me copy-paste a bit of the commit msg for background:
>>
>> ---begin commit msg---
>> Commit 39ce8150a079 ("pinctrl: baytrail: Serialize all register access")
>> added a spinlock around all register accesses because:
>>
>> "There is a hardware issue in Intel Baytrail where concurrent GPIO register
>>   access might result reads of 0xffffffff and writes might get dropped
>>   completely."
>>
>> Testing has shown that this does not catch all cases, there are still
>> 2 problems remaining
>>
>> 1) The original fix uses a spinlock per byt_gpio device / struct,
>> additional testing has shown that this is not sufficient concurent
>> accesses to 2 different GPIO banks also suffer from the same problem.
>>
>> This commit fixes this by moving to a single global lock.
>>
>> 2) The original fix did not add a lock around the register accesses in
>> the suspend/resume handling.
>>
>> Since pinctrl-baytrail.c is using normal suspend/resume handlers,
>> interrupts are still enabled during suspend/resume handling. Nothing
>> should be using the GPIOs when they are being taken down, _but_ the
>> GPIOs themselves may still cause interrupts, which are likely to
>> use (read) the triggering GPIO. So we need to protect against
>> concurrent GPIO register accesses in the suspend/resume handlers too.
>>
>> This commit fixes this by adding the missing spin_lock / unlock calls.
>> ---end commit msg---
>>
>> I was discussing the problem at my local hackerspace yesterday with
>> someone who knows quite a bit low-level details about Intel CPUs.
>> He said that on "big" core's all the GPIO islands sit behind the IOSF
>> and that there is a bridge which make their registers look like regular
>> mmio registers.
>>
>> I wonder if the same is happening on BYT, that would point to an issue
>> with concurent accesses being done through the IOSF bridge, which would
>> explain why I've found that we need a single global lock for all GPIO
>> islands (*).
> 
>> But that in turn would mean that we really need global lock
>> for _all_ accesses done through the IOSF bridge, not just GPIO register
>> accesses...  Which would explain why on some machines BYT has never been
>> really stable with Linux.
> 
> I don't think so. The Cherriview seems inherited the same silicon issue (see
> comment near to chv_lock), though it seems to be fixed for all new hardware
> (Skylake+).

Ah, interestingly enough pinctrl-cherryview.c already uses a global
lock and it also already protects suspend and resume, well at least
that shows that my baytrail patch seems sound / the right thing to do.

>> Can someone at Intel with access to internal docs about BYT dig a bit into
>> the docs and see if 1. this theory makes sense, 2. if it does if there us
>> anything else behind the IOSF for which we also need to serialize accesses?
> 
> Unfortunately I have no time to dig into this right now.

Ok.

Regards,

Hans


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] pinctrl: baytrail: Really serialize all register accesses
  2019-11-18 14:20 ` [PATCH] " Hans de Goede
@ 2019-11-19  8:19   ` Mika Westerberg
  2019-11-19 15:45     ` Hans de Goede
  0 siblings, 1 reply; 6+ messages in thread
From: Mika Westerberg @ 2019-11-19  8:19 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Andy Shevchenko, Bartosz Golaszewski, Linus Walleij, linux-gpio,
	linux-acpi, stable

On Mon, Nov 18, 2019 at 03:20:20PM +0100, Hans de Goede wrote:
> Commit 39ce8150a079 ("pinctrl: baytrail: Serialize all register access")
> added a spinlock around all register accesses because:
> 
> "There is a hardware issue in Intel Baytrail where concurrent GPIO register
>  access might result reads of 0xffffffff and writes might get dropped
>  completely."
> 
> Testing has shown that this does not catch all cases, there are still
> 2 problems remaining
> 
> 1) The original fix uses a spinlock per byt_gpio device / struct,
> additional testing has shown that this is not sufficient concurent
> accesses to 2 different GPIO banks also suffer from the same problem.
> 
> This commit fixes this by moving to a single global lock.
> 
> 2) The original fix did not add a lock around the register accesses in
> the suspend/resume handling.
> 
> Since pinctrl-baytrail.c is using normal suspend/resume handlers,
> interrupts are still enabled during suspend/resume handling. Nothing
> should be using the GPIOs when they are being taken down, _but_ the
> GPIOs themselves may still cause interrupts, which are likely to
> use (read) the triggering GPIO. So we need to protect against
> concurrent GPIO register accesses in the suspend/resume handlers too.
> 
> This commit fixes this by adding the missing spin_lock / unlock calls.
> 
> The 2 fixes together fix the Acer Switch 10 SW5-012 getting completely
> confused after a suspend resume. The DSDT for this device has a bug
> in its _LID method which reprograms the home and power button trigger-
> flags requesting both high and low _level_ interrupts so the IRQs for
> these 2 GPIOs continuously fire. This combined with the saving of
> registers during suspend, triggers concurrent GPIO register accesses
> resulting in saving 0xffffffff as pconf0 value during suspend and then
> when restoring this on resume the pinmux settings get all messed up,
> resulting in various I2C busses being stuck, the wifi no longer working
> and often the tablet simply not coming out of suspend at all.
> 
> Cc: stable@vger.kernel.org
> Fixes: 39ce8150a079 ("pinctrl: baytrail: Serialize all register access")
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> ---
>  drivers/pinctrl/intel/pinctrl-baytrail.c | 81 +++++++++++++-----------
>  1 file changed, 44 insertions(+), 37 deletions(-)
> 
> diff --git a/drivers/pinctrl/intel/pinctrl-baytrail.c b/drivers/pinctrl/intel/pinctrl-baytrail.c
> index b18336d42252..1b289f64c3a2 100644
> --- a/drivers/pinctrl/intel/pinctrl-baytrail.c
> +++ b/drivers/pinctrl/intel/pinctrl-baytrail.c
> @@ -111,7 +111,6 @@ struct byt_gpio {
>  	struct platform_device *pdev;
>  	struct pinctrl_dev *pctl_dev;
>  	struct pinctrl_desc pctl_desc;
> -	raw_spinlock_t lock;
>  	const struct intel_pinctrl_soc_data *soc_data;
>  	struct intel_community *communities_copy;
>  	struct byt_gpio_pin_context *saved_context;
> @@ -550,6 +549,8 @@ static const struct intel_pinctrl_soc_data *byt_soc_data[] = {
>  	NULL
>  };
>  
> +static DEFINE_RAW_SPINLOCK(byt_gpio_lock);

Can we call it byt_lock instead? Following same convention we use in
chv.

Other than that looks good and definitely right thing to do. Thanks for
doing this Hans!

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] pinctrl: baytrail: Really serialize all register accesses
  2019-11-19  8:19   ` Mika Westerberg
@ 2019-11-19 15:45     ` Hans de Goede
  0 siblings, 0 replies; 6+ messages in thread
From: Hans de Goede @ 2019-11-19 15:45 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Andy Shevchenko, Bartosz Golaszewski, Linus Walleij, linux-gpio,
	linux-acpi, stable

Hi,

On 19-11-2019 09:19, Mika Westerberg wrote:
> On Mon, Nov 18, 2019 at 03:20:20PM +0100, Hans de Goede wrote:
>> Commit 39ce8150a079 ("pinctrl: baytrail: Serialize all register access")
>> added a spinlock around all register accesses because:
>>
>> "There is a hardware issue in Intel Baytrail where concurrent GPIO register
>>   access might result reads of 0xffffffff and writes might get dropped
>>   completely."
>>
>> Testing has shown that this does not catch all cases, there are still
>> 2 problems remaining
>>
>> 1) The original fix uses a spinlock per byt_gpio device / struct,
>> additional testing has shown that this is not sufficient concurent
>> accesses to 2 different GPIO banks also suffer from the same problem.
>>
>> This commit fixes this by moving to a single global lock.
>>
>> 2) The original fix did not add a lock around the register accesses in
>> the suspend/resume handling.
>>
>> Since pinctrl-baytrail.c is using normal suspend/resume handlers,
>> interrupts are still enabled during suspend/resume handling. Nothing
>> should be using the GPIOs when they are being taken down, _but_ the
>> GPIOs themselves may still cause interrupts, which are likely to
>> use (read) the triggering GPIO. So we need to protect against
>> concurrent GPIO register accesses in the suspend/resume handlers too.
>>
>> This commit fixes this by adding the missing spin_lock / unlock calls.
>>
>> The 2 fixes together fix the Acer Switch 10 SW5-012 getting completely
>> confused after a suspend resume. The DSDT for this device has a bug
>> in its _LID method which reprograms the home and power button trigger-
>> flags requesting both high and low _level_ interrupts so the IRQs for
>> these 2 GPIOs continuously fire. This combined with the saving of
>> registers during suspend, triggers concurrent GPIO register accesses
>> resulting in saving 0xffffffff as pconf0 value during suspend and then
>> when restoring this on resume the pinmux settings get all messed up,
>> resulting in various I2C busses being stuck, the wifi no longer working
>> and often the tablet simply not coming out of suspend at all.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: 39ce8150a079 ("pinctrl: baytrail: Serialize all register access")
>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>> ---
>>   drivers/pinctrl/intel/pinctrl-baytrail.c | 81 +++++++++++++-----------
>>   1 file changed, 44 insertions(+), 37 deletions(-)
>>
>> diff --git a/drivers/pinctrl/intel/pinctrl-baytrail.c b/drivers/pinctrl/intel/pinctrl-baytrail.c
>> index b18336d42252..1b289f64c3a2 100644
>> --- a/drivers/pinctrl/intel/pinctrl-baytrail.c
>> +++ b/drivers/pinctrl/intel/pinctrl-baytrail.c
>> @@ -111,7 +111,6 @@ struct byt_gpio {
>>   	struct platform_device *pdev;
>>   	struct pinctrl_dev *pctl_dev;
>>   	struct pinctrl_desc pctl_desc;
>> -	raw_spinlock_t lock;
>>   	const struct intel_pinctrl_soc_data *soc_data;
>>   	struct intel_community *communities_copy;
>>   	struct byt_gpio_pin_context *saved_context;
>> @@ -550,6 +549,8 @@ static const struct intel_pinctrl_soc_data *byt_soc_data[] = {
>>   	NULL
>>   };
>>   
>> +static DEFINE_RAW_SPINLOCK(byt_gpio_lock);
> 
> Can we call it byt_lock instead? Following same convention we use in
> chv.

Ok, v2 with this changed coming up.

> Other than that looks good and definitely right thing to do. Thanks for
> doing this Hans!

You are welcome. I must say that this was an interesting adventure :)
The interrupt storm issue on the Acer SW5-012 really managed to hit this bug
very reliably, resulting in all sorts of fun due to the pinmux settings
getting messed up.

Regards,

Hans


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-11-19 15:45 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-18 14:20 [PATCH 0/1] pinctrl: baytrail: Really serialize all register accesses Hans de Goede
2019-11-18 14:20 ` [PATCH] " Hans de Goede
2019-11-19  8:19   ` Mika Westerberg
2019-11-19 15:45     ` Hans de Goede
2019-11-18 17:23 ` [PATCH 0/1] " Andy Shevchenko
2019-11-18 18:17   ` Hans de Goede

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).