All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction
@ 2018-05-08  1:06 Benjamin Herrenschmidt
  2018-05-08  1:06 ` [PATCH linux dev-4.13 2/6] gpio/aspeed: Use a cache of output data registers Benjamin Herrenschmidt
                   ` (6 more replies)
  0 siblings, 7 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2018-05-08  1:06 UTC (permalink / raw)
  To: openbmc; +Cc: Christopher Bostic, Benjamin Herrenschmidt

In aspeed_gpio_dir_out(), we need to establish the new output
value in the output latch *before* we change the direction
to output in order to avoid a glitch on the output line if
the previous value of the latch was different.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 drivers/gpio/gpio-aspeed.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpio/gpio-aspeed.c b/drivers/gpio/gpio-aspeed.c
index 67f3eae3b550..ac54b9b25f74 100644
--- a/drivers/gpio/gpio-aspeed.c
+++ b/drivers/gpio/gpio-aspeed.c
@@ -287,11 +287,10 @@ static int aspeed_gpio_dir_out(struct gpio_chip *gc,
 
 	spin_lock_irqsave(&gpio->lock, flags);
 
+	__aspeed_gpio_set(gc, offset, val);
 	reg = ioread32(bank_val_reg(gpio, bank, GPIO_DIR));
 	iowrite32(reg | GPIO_BIT(offset), bank_val_reg(gpio, bank, GPIO_DIR));
 
-	__aspeed_gpio_set(gc, offset, val);
-
 	spin_unlock_irqrestore(&gpio->lock, flags);
 
 	return 0;
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH linux dev-4.13 2/6] gpio/aspeed: Use a cache of output data registers
  2018-05-08  1:06 [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction Benjamin Herrenschmidt
@ 2018-05-08  1:06 ` Benjamin Herrenschmidt
  2018-05-08 16:17   ` Christopher Bostic
  2018-05-08  1:06 ` [PATCH linux dev-4.13 3/6] fsi/fsi-master-gpio: Sample input data on different clock phase Benjamin Herrenschmidt
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2018-05-08  1:06 UTC (permalink / raw)
  To: openbmc; +Cc: Christopher Bostic, Benjamin Herrenschmidt

The current driver does a read/modify/write of the output
registers when changing a bit in __aspeed_gpio_set().

This is sub-optimal for a couple of reasons:

  - If any of the neighbouring GPIOs (sharing the shared
register) isn't (yet) configured as an output, it will
read the current input value, and then apply it to the
output latch, which may not be what the user expects. There
should be no bug in practice as aspeed_gpio_dir_out() will
establish a new value but it's not great either.

  - The GPIO block in the aspeed chip is clocked rather
slowly (typically 25Mhz). That extra MMIO read halves the maximum
speed at which we can toggle the GPIO.

This provides a significant performance improvement to the GPIO
based FSI master.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 drivers/gpio/gpio-aspeed.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/gpio/gpio-aspeed.c b/drivers/gpio/gpio-aspeed.c
index ac54b9b25f74..a62cbf3ed4a8 100644
--- a/drivers/gpio/gpio-aspeed.c
+++ b/drivers/gpio/gpio-aspeed.c
@@ -54,6 +54,8 @@ struct aspeed_gpio {
 	u8 *offset_timer;
 	unsigned int timer_users[4];
 	struct clk *clk;
+
+	u32 *dcache;
 };
 
 struct aspeed_gpio_bank {
@@ -231,12 +233,13 @@ static void __aspeed_gpio_set(struct gpio_chip *gc, unsigned int offset,
 	u32 reg;
 
 	addr = bank_val_reg(gpio, bank, GPIO_DATA);
-	reg = ioread32(addr);
+	reg = gpio->dcache[GPIO_BANK(offset)];
 
 	if (val)
 		reg |= GPIO_BIT(offset);
 	else
 		reg &= ~GPIO_BIT(offset);
+	gpio->dcache[GPIO_BANK(offset)] = reg;
 
 	iowrite32(reg, addr);
 }
@@ -848,7 +851,7 @@ static int __init aspeed_gpio_probe(struct platform_device *pdev)
 	const struct of_device_id *gpio_id;
 	struct aspeed_gpio *gpio;
 	struct resource *res;
-	int rc;
+	int rc, i, banks;
 
 	gpio = devm_kzalloc(&pdev->dev, sizeof(*gpio), GFP_KERNEL);
 	if (!gpio)
@@ -889,6 +892,20 @@ static int __init aspeed_gpio_probe(struct platform_device *pdev)
 	gpio->chip.base = -1;
 	gpio->chip.irq_need_valid_mask = true;
 
+	/* Allocate a cache of the output registers */
+	banks = gpio->config->nr_gpios >> 5;
+	gpio->dcache = devm_kzalloc(&pdev->dev,
+				    sizeof(u32) * banks, GFP_KERNEL);
+	if (!gpio->dcache)
+		return -ENOMEM;
+
+	/* Populate it with initial values read from the HW */
+	for (i = 0; i < banks; i++) {
+		const struct aspeed_gpio_bank *bank = &aspeed_gpio_banks[i];
+		gpio->dcache[i] = ioread32(gpio->base + bank->val_regs +
+					   GPIO_DATA);
+	}
+
 	rc = devm_gpiochip_add_data(&pdev->dev, &gpio->chip, gpio);
 	if (rc < 0)
 		return rc;
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH linux dev-4.13 3/6] fsi/fsi-master-gpio: Sample input data on different clock phase
  2018-05-08  1:06 [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction Benjamin Herrenschmidt
  2018-05-08  1:06 ` [PATCH linux dev-4.13 2/6] gpio/aspeed: Use a cache of output data registers Benjamin Herrenschmidt
@ 2018-05-08  1:06 ` Benjamin Herrenschmidt
  2018-05-08 16:42   ` Christopher Bostic
  2018-05-08  1:06 ` [PATCH linux dev-4.13 4/6] fsi/fsi-master-gpio: Add "no-gpio-delays" option Benjamin Herrenschmidt
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2018-05-08  1:06 UTC (permalink / raw)
  To: openbmc; +Cc: Christopher Bostic, Benjamin Herrenschmidt

We currently sample the input data right after we toggle the
clock low, then high. The slave establishes the data on the
rising edge, so this is not ideal. We should sample it on
the low phase instead.

This currently works because we have an extra delay, but subsequent
patches will remove it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 drivers/fsi/fsi-master-gpio.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/fsi/fsi-master-gpio.c b/drivers/fsi/fsi-master-gpio.c
index 4295a46780cb..d6508bbad1fb 100644
--- a/drivers/fsi/fsi-master-gpio.c
+++ b/drivers/fsi/fsi-master-gpio.c
@@ -86,12 +86,15 @@ static void clock_toggle(struct fsi_master_gpio *master, int count)
 	}
 }
 
-static int sda_in(struct fsi_master_gpio *master)
+static int sda_clock_in(struct fsi_master_gpio *master)
 {
 	int in;
 
 	ndelay(FSI_GPIO_STD_DLY);
+	gpiod_set_value(master->gpio_clk, 0);
 	in = gpiod_get_value(master->gpio_data);
+	ndelay(FSI_GPIO_STD_DLY);
+	gpiod_set_value(master->gpio_clk, 1);
 	return in ? 1 : 0;
 }
 
@@ -126,8 +129,7 @@ static void serial_in(struct fsi_master_gpio *master, struct fsi_gpio_msg *msg,
 	set_sda_input(master);
 
 	for (bit = 0; bit < num_bits; bit++) {
-		clock_toggle(master, 1);
-		in_bit = sda_in(master);
+		in_bit = sda_clock_in(master);
 		msg->msg <<= 1;
 		msg->msg |= ~in_bit & 0x1;	/* Data is active low */
 	}
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH linux dev-4.13 4/6] fsi/fsi-master-gpio: Add "no-gpio-delays" option
  2018-05-08  1:06 [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction Benjamin Herrenschmidt
  2018-05-08  1:06 ` [PATCH linux dev-4.13 2/6] gpio/aspeed: Use a cache of output data registers Benjamin Herrenschmidt
  2018-05-08  1:06 ` [PATCH linux dev-4.13 3/6] fsi/fsi-master-gpio: Sample input data on different clock phase Benjamin Herrenschmidt
@ 2018-05-08  1:06 ` Benjamin Herrenschmidt
  2018-05-08 19:54   ` Christopher Bostic
  2018-05-08  1:06 ` [PATCH linux dev-4.13 5/6] fsi/fsi-master-gpio: Reduce turnaround clocks Benjamin Herrenschmidt
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2018-05-08  1:06 UTC (permalink / raw)
  To: openbmc; +Cc: Christopher Bostic, Benjamin Herrenschmidt

This adds support for an optional device-tree property that
makes the driver skip all the delays around clocking the
GPIOs and set it in the device-tree of common POWER9 based
OpenPower platforms.

This useful on chips like the AST2500 where the GPIO block is
running at a fairly low clock frequency (25Mhz typically). In
this case, the delays are unnecessary and due to the low
precision of the timers, actually quite harmful in terms of
performance.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/arm/boot/dts/aspeed-bmc-opp-romulus.dts  |  1 +
 .../boot/dts/aspeed-bmc-opp-witherspoon.dts   |  1 +
 arch/arm/boot/dts/aspeed-bmc-opp-zaius.dts    |  1 +
 drivers/fsi/fsi-master-gpio.c                 | 20 +++++++++++++++----
 4 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/arm/boot/dts/aspeed-bmc-opp-romulus.dts b/arch/arm/boot/dts/aspeed-bmc-opp-romulus.dts
index d746e562e8fd..0fdf08077ed3 100644
--- a/arch/arm/boot/dts/aspeed-bmc-opp-romulus.dts
+++ b/arch/arm/boot/dts/aspeed-bmc-opp-romulus.dts
@@ -59,6 +59,7 @@
 		compatible = "fsi-master-gpio", "fsi-master";
 		#address-cells = <2>;
 		#size-cells = <0>;
+		no-gpio-delays;
 
 		clock-gpios = <&gpio ASPEED_GPIO(AA, 0) GPIO_ACTIVE_HIGH>;
 		data-gpios = <&gpio ASPEED_GPIO(AA, 2) GPIO_ACTIVE_HIGH>;
diff --git a/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts b/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts
index 596608f4b936..214718b7784d 100644
--- a/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts
+++ b/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts
@@ -160,6 +160,7 @@
 		compatible = "fsi-master-gpio", "fsi-master";
 		#address-cells = <2>;
 		#size-cells = <0>;
+		no-gpio-delays;
 
 		clock-gpios = <&gpio ASPEED_GPIO(AA, 0) GPIO_ACTIVE_HIGH>;
 		data-gpios = <&gpio ASPEED_GPIO(E, 0) GPIO_ACTIVE_HIGH>;
diff --git a/arch/arm/boot/dts/aspeed-bmc-opp-zaius.dts b/arch/arm/boot/dts/aspeed-bmc-opp-zaius.dts
index 7f5a29566801..085926935574 100644
--- a/arch/arm/boot/dts/aspeed-bmc-opp-zaius.dts
+++ b/arch/arm/boot/dts/aspeed-bmc-opp-zaius.dts
@@ -91,6 +91,7 @@
 		compatible = "fsi-master-gpio", "fsi-master";
 		#address-cells = <2>;
 		#size-cells = <0>;
+		no-gpio-delays;
 
 		trans-gpios = <&gpio ASPEED_GPIO(O, 6) GPIO_ACTIVE_HIGH>;
 		enable-gpios = <&gpio ASPEED_GPIO(D, 0) GPIO_ACTIVE_HIGH>;
diff --git a/drivers/fsi/fsi-master-gpio.c b/drivers/fsi/fsi-master-gpio.c
index d6508bbad1fb..c82bbd35276e 100644
--- a/drivers/fsi/fsi-master-gpio.c
+++ b/drivers/fsi/fsi-master-gpio.c
@@ -62,6 +62,7 @@ struct fsi_master_gpio {
 	struct gpio_desc	*gpio_enable;	/* FSI enable */
 	struct gpio_desc	*gpio_mux;	/* Mux control */
 	bool			external_mode;
+	bool			no_delays;
 };
 
 #define CREATE_TRACE_POINTS
@@ -79,9 +80,11 @@ static void clock_toggle(struct fsi_master_gpio *master, int count)
 	int i;
 
 	for (i = 0; i < count; i++) {
-		ndelay(FSI_GPIO_STD_DLY);
+		if (!master->no_delays)
+			ndelay(FSI_GPIO_STD_DLY);
 		gpiod_set_value(master->gpio_clk, 0);
-		ndelay(FSI_GPIO_STD_DLY);
+		if (!master->no_delays)
+			ndelay(FSI_GPIO_STD_DLY);
 		gpiod_set_value(master->gpio_clk, 1);
 	}
 }
@@ -90,10 +93,12 @@ static int sda_clock_in(struct fsi_master_gpio *master)
 {
 	int in;
 
-	ndelay(FSI_GPIO_STD_DLY);
+	if (!master->no_delays)
+		ndelay(FSI_GPIO_STD_DLY);
 	gpiod_set_value(master->gpio_clk, 0);
 	in = gpiod_get_value(master->gpio_data);
-	ndelay(FSI_GPIO_STD_DLY);
+	if (!master->no_delays)
+		ndelay(FSI_GPIO_STD_DLY);
 	gpiod_set_value(master->gpio_clk, 1);
 	return in ? 1 : 0;
 }
@@ -677,6 +682,13 @@ static int fsi_master_gpio_probe(struct platform_device *pdev)
 	}
 	master->gpio_mux = gpio;
 
+	/*
+	 * Check if GPIO block is slow enought that no extra delays
+	 * are necessary. This improves performance on ast2500 by
+	 * an order of magnitude.
+	 */
+	master->no_delays = device_property_present(&pdev->dev, "no-gpio-delays");
+
 	master->master.n_links = 1;
 	master->master.flags = FSI_MASTER_FLAG_SWCLOCK;
 	master->master.read = fsi_master_gpio_read;
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH linux dev-4.13 5/6] fsi/fsi-master-gpio: Reduce turnaround clocks
  2018-05-08  1:06 [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction Benjamin Herrenschmidt
                   ` (2 preceding siblings ...)
  2018-05-08  1:06 ` [PATCH linux dev-4.13 4/6] fsi/fsi-master-gpio: Add "no-gpio-delays" option Benjamin Herrenschmidt
@ 2018-05-08  1:06 ` Benjamin Herrenschmidt
  2018-05-08 19:55   ` Christopher Bostic
  2018-05-08  1:06 ` [PATCH linux dev-4.13 6/6] fsi/fsi-master-gpio: Reduce dpoll clocks Benjamin Herrenschmidt
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2018-05-08  1:06 UTC (permalink / raw)
  To: openbmc; +Cc: Christopher Bostic, Benjamin Herrenschmidt

FSI_GPIO_PRIME_SLAVE_CLOCKS is the number of clocks if the
"idle" phase between the end of a response and the beginning
of the next one. It corresponds to tSendDelay in the FSI
specification.

The default value in the slave is 16 clocks. 100 is way overkill
and significantly reduces the driver performance.

This changes it to 20 (which gives the HW a bit of margin still
just in case).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 drivers/fsi/fsi-master-gpio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/fsi/fsi-master-gpio.c b/drivers/fsi/fsi-master-gpio.c
index c82bbd35276e..029b0a5b6d89 100644
--- a/drivers/fsi/fsi-master-gpio.c
+++ b/drivers/fsi/fsi-master-gpio.c
@@ -49,7 +49,7 @@
 #define	FSI_GPIO_CRC_SIZE	4
 #define	FSI_GPIO_MSG_ID_SIZE		2
 #define	FSI_GPIO_MSG_RESPID_SIZE	2
-#define	FSI_GPIO_PRIME_SLAVE_CLOCKS	100
+#define	FSI_GPIO_PRIME_SLAVE_CLOCKS	20
 
 struct fsi_master_gpio {
 	struct fsi_master	master;
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH linux dev-4.13 6/6] fsi/fsi-master-gpio: Reduce dpoll clocks
  2018-05-08  1:06 [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction Benjamin Herrenschmidt
                   ` (3 preceding siblings ...)
  2018-05-08  1:06 ` [PATCH linux dev-4.13 5/6] fsi/fsi-master-gpio: Reduce turnaround clocks Benjamin Herrenschmidt
@ 2018-05-08  1:06 ` Benjamin Herrenschmidt
  2018-05-08  1:46 ` Benjamin Herrenschmidt
  2018-05-08 16:13 ` [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction Christopher Bostic
  6 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2018-05-08  1:06 UTC (permalink / raw)
  To: openbmc; +Cc: Christopher Bostic, Benjamin Herrenschmidt

FSI_GPIO_DPOLL_CLOCKS is the number of clocks before sending
a DPOLL command after receiving a BUSY status. It should be
at least tSendDelay (16 clocks).

According to comments in the code, it needs to also be at least
21 clocks due to HW issues.

It's currently 100 clocks which impacts performances negatively
in some cases. Reduces it in half to 50 clocks which seems to
still be solid.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 drivers/fsi/fsi-master-gpio.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/fsi/fsi-master-gpio.c b/drivers/fsi/fsi-master-gpio.c
index 029b0a5b6d89..bd2b2cbd5eb5 100644
--- a/drivers/fsi/fsi-master-gpio.c
+++ b/drivers/fsi/fsi-master-gpio.c
@@ -29,7 +29,8 @@
 #define	FSI_GPIO_CMD_TERM	0x3f
 #define FSI_GPIO_CMD_ABS_AR	0x4
 
-#define	FSI_GPIO_DPOLL_CLOCKS	100      /* < 21 will cause slave to hang */
+
+#define	FSI_GPIO_DPOLL_CLOCKS	50      /* < 21 will cause slave to hang */
 
 /* Bus errors */
 #define	FSI_GPIO_ERR_BUSY	1	/* Slave stuck in busy state */
@@ -43,7 +44,7 @@
 #define	FSI_GPIO_RESP_ACK	0
 #define	FSI_GPIO_RESP_ACKD	4
 
-#define	FSI_GPIO_MAX_BUSY	100
+#define	FSI_GPIO_MAX_BUSY	200
 #define	FSI_GPIO_MTOE_COUNT	1000
 #define	FSI_GPIO_DRAIN_BITS	20
 #define	FSI_GPIO_CRC_SIZE	4
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH linux dev-4.13 6/6] fsi/fsi-master-gpio: Reduce dpoll clocks
  2018-05-08  1:06 [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction Benjamin Herrenschmidt
                   ` (4 preceding siblings ...)
  2018-05-08  1:06 ` [PATCH linux dev-4.13 6/6] fsi/fsi-master-gpio: Reduce dpoll clocks Benjamin Herrenschmidt
@ 2018-05-08  1:46 ` Benjamin Herrenschmidt
  2018-05-08 20:00   ` Christopher Bostic
  2018-05-08 16:13 ` [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction Christopher Bostic
  6 siblings, 1 reply; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2018-05-08  1:46 UTC (permalink / raw)
  To: openbmc; +Cc: Christopher Bostic, Benjamin Herrenschmidt

FSI_GPIO_DPOLL_CLOCKS is the number of clocks before sending
a DPOLL command after receiving a BUSY status. It should be
at least tSendDelay (16 clocks).

According to comments in the code, it needs to also be at least
21 clocks due to HW issues.

It's currently 100 clocks which impacts performances negatively
in some cases. Reduces it in half to 50 clocks which seems to
still be solid.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 drivers/fsi/fsi-master-gpio.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/fsi/fsi-master-gpio.c b/drivers/fsi/fsi-master-gpio.c
index 029b0a5b6d89..bd2b2cbd5eb5 100644
--- a/drivers/fsi/fsi-master-gpio.c
+++ b/drivers/fsi/fsi-master-gpio.c
@@ -29,7 +29,8 @@
 #define	FSI_GPIO_CMD_TERM	0x3f
 #define FSI_GPIO_CMD_ABS_AR	0x4
 
-#define	FSI_GPIO_DPOLL_CLOCKS	100      /* < 21 will cause slave to hang */
+
+#define	FSI_GPIO_DPOLL_CLOCKS	50      /* < 21 will cause slave to hang */
 
 /* Bus errors */
 #define	FSI_GPIO_ERR_BUSY	1	/* Slave stuck in busy state */
@@ -43,7 +44,7 @@
 #define	FSI_GPIO_RESP_ACK	0
 #define	FSI_GPIO_RESP_ACKD	4
 
-#define	FSI_GPIO_MAX_BUSY	100
+#define	FSI_GPIO_MAX_BUSY	200
 #define	FSI_GPIO_MTOE_COUNT	1000
 #define	FSI_GPIO_DRAIN_BITS	20
 #define	FSI_GPIO_CRC_SIZE	4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction
  2018-05-08  1:06 [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction Benjamin Herrenschmidt
                   ` (5 preceding siblings ...)
  2018-05-08  1:46 ` Benjamin Herrenschmidt
@ 2018-05-08 16:13 ` Christopher Bostic
  2018-05-10  2:50   ` Andrew Jeffery
  6 siblings, 1 reply; 21+ messages in thread
From: Christopher Bostic @ 2018-05-08 16:13 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, openbmc

Reviewed-by: Christopher Bostic <cbostic@linux.vnet.ibm.com>


On 5/7/18 8:06 PM, Benjamin Herrenschmidt wrote:
> In aspeed_gpio_dir_out(), we need to establish the new output
> value in the output latch *before* we change the direction
> to output in order to avoid a glitch on the output line if
> the previous value of the latch was different.
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>   drivers/gpio/gpio-aspeed.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpio/gpio-aspeed.c b/drivers/gpio/gpio-aspeed.c
> index 67f3eae3b550..ac54b9b25f74 100644
> --- a/drivers/gpio/gpio-aspeed.c
> +++ b/drivers/gpio/gpio-aspeed.c
> @@ -287,11 +287,10 @@ static int aspeed_gpio_dir_out(struct gpio_chip *gc,
>
>   	spin_lock_irqsave(&gpio->lock, flags);
>
> +	__aspeed_gpio_set(gc, offset, val);
>   	reg = ioread32(bank_val_reg(gpio, bank, GPIO_DIR));
>   	iowrite32(reg | GPIO_BIT(offset), bank_val_reg(gpio, bank, GPIO_DIR));
>
> -	__aspeed_gpio_set(gc, offset, val);
> -
>   	spin_unlock_irqrestore(&gpio->lock, flags);
>
>   	return 0;

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH linux dev-4.13 2/6] gpio/aspeed: Use a cache of output data registers
  2018-05-08  1:06 ` [PATCH linux dev-4.13 2/6] gpio/aspeed: Use a cache of output data registers Benjamin Herrenschmidt
@ 2018-05-08 16:17   ` Christopher Bostic
  2018-05-10  2:56     ` Andrew Jeffery
  0 siblings, 1 reply; 21+ messages in thread
From: Christopher Bostic @ 2018-05-08 16:17 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, openbmc

Reviewed-by: Christopher Bostic <cbostic@linux.vnet.ibm.com>


On 5/7/18 8:06 PM, Benjamin Herrenschmidt wrote:
> The current driver does a read/modify/write of the output
> registers when changing a bit in __aspeed_gpio_set().
>
> This is sub-optimal for a couple of reasons:
>
>    - If any of the neighbouring GPIOs (sharing the shared
> register) isn't (yet) configured as an output, it will
> read the current input value, and then apply it to the
> output latch, which may not be what the user expects. There
> should be no bug in practice as aspeed_gpio_dir_out() will
> establish a new value but it's not great either.
>
>    - The GPIO block in the aspeed chip is clocked rather
> slowly (typically 25Mhz). That extra MMIO read halves the maximum
> speed at which we can toggle the GPIO.
>
> This provides a significant performance improvement to the GPIO
> based FSI master.
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>   drivers/gpio/gpio-aspeed.c | 21 +++++++++++++++++++--
>   1 file changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpio/gpio-aspeed.c b/drivers/gpio/gpio-aspeed.c
> index ac54b9b25f74..a62cbf3ed4a8 100644
> --- a/drivers/gpio/gpio-aspeed.c
> +++ b/drivers/gpio/gpio-aspeed.c
> @@ -54,6 +54,8 @@ struct aspeed_gpio {
>   	u8 *offset_timer;
>   	unsigned int timer_users[4];
>   	struct clk *clk;
> +
> +	u32 *dcache;
>   };
>
>   struct aspeed_gpio_bank {
> @@ -231,12 +233,13 @@ static void __aspeed_gpio_set(struct gpio_chip *gc, unsigned int offset,
>   	u32 reg;
>
>   	addr = bank_val_reg(gpio, bank, GPIO_DATA);
> -	reg = ioread32(addr);
> +	reg = gpio->dcache[GPIO_BANK(offset)];
>
>   	if (val)
>   		reg |= GPIO_BIT(offset);
>   	else
>   		reg &= ~GPIO_BIT(offset);
> +	gpio->dcache[GPIO_BANK(offset)] = reg;
>
>   	iowrite32(reg, addr);
>   }
> @@ -848,7 +851,7 @@ static int __init aspeed_gpio_probe(struct platform_device *pdev)
>   	const struct of_device_id *gpio_id;
>   	struct aspeed_gpio *gpio;
>   	struct resource *res;
> -	int rc;
> +	int rc, i, banks;
>
>   	gpio = devm_kzalloc(&pdev->dev, sizeof(*gpio), GFP_KERNEL);
>   	if (!gpio)
> @@ -889,6 +892,20 @@ static int __init aspeed_gpio_probe(struct platform_device *pdev)
>   	gpio->chip.base = -1;
>   	gpio->chip.irq_need_valid_mask = true;
>
> +	/* Allocate a cache of the output registers */
> +	banks = gpio->config->nr_gpios >> 5;
> +	gpio->dcache = devm_kzalloc(&pdev->dev,
> +				    sizeof(u32) * banks, GFP_KERNEL);
> +	if (!gpio->dcache)
> +		return -ENOMEM;
> +
> +	/* Populate it with initial values read from the HW */
> +	for (i = 0; i < banks; i++) {
> +		const struct aspeed_gpio_bank *bank = &aspeed_gpio_banks[i];
> +		gpio->dcache[i] = ioread32(gpio->base + bank->val_regs +
> +					   GPIO_DATA);
> +	}
> +
>   	rc = devm_gpiochip_add_data(&pdev->dev, &gpio->chip, gpio);
>   	if (rc < 0)
>   		return rc;

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH linux dev-4.13 3/6] fsi/fsi-master-gpio: Sample input data on different clock phase
  2018-05-08  1:06 ` [PATCH linux dev-4.13 3/6] fsi/fsi-master-gpio: Sample input data on different clock phase Benjamin Herrenschmidt
@ 2018-05-08 16:42   ` Christopher Bostic
  0 siblings, 0 replies; 21+ messages in thread
From: Christopher Bostic @ 2018-05-08 16:42 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, openbmc

Reviewed-by: Christopher Bostic <cbostic@linux.vnet.ibm.com>

On 5/7/18 8:06 PM, Benjamin Herrenschmidt wrote:
> We currently sample the input data right after we toggle the
> clock low, then high. The slave establishes the data on the
> rising edge, so this is not ideal. We should sample it on
> the low phase instead.
>
> This currently works because we have an extra delay, but subsequent
> patches will remove it.
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>   drivers/fsi/fsi-master-gpio.c | 8 +++++---
>   1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/fsi/fsi-master-gpio.c b/drivers/fsi/fsi-master-gpio.c
> index 4295a46780cb..d6508bbad1fb 100644
> --- a/drivers/fsi/fsi-master-gpio.c
> +++ b/drivers/fsi/fsi-master-gpio.c
> @@ -86,12 +86,15 @@ static void clock_toggle(struct fsi_master_gpio *master, int count)
>   	}
>   }
>
> -static int sda_in(struct fsi_master_gpio *master)
> +static int sda_clock_in(struct fsi_master_gpio *master)
>   {
>   	int in;
>
>   	ndelay(FSI_GPIO_STD_DLY);
> +	gpiod_set_value(master->gpio_clk, 0);
>   	in = gpiod_get_value(master->gpio_data);
> +	ndelay(FSI_GPIO_STD_DLY);
> +	gpiod_set_value(master->gpio_clk, 1);
>   	return in ? 1 : 0;
>   }
>
> @@ -126,8 +129,7 @@ static void serial_in(struct fsi_master_gpio *master, struct fsi_gpio_msg *msg,
>   	set_sda_input(master);
>
>   	for (bit = 0; bit < num_bits; bit++) {
> -		clock_toggle(master, 1);
> -		in_bit = sda_in(master);
> +		in_bit = sda_clock_in(master);
>   		msg->msg <<= 1;
>   		msg->msg |= ~in_bit & 0x1;	/* Data is active low */
>   	}

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH linux dev-4.13 4/6] fsi/fsi-master-gpio: Add "no-gpio-delays" option
  2018-05-08  1:06 ` [PATCH linux dev-4.13 4/6] fsi/fsi-master-gpio: Add "no-gpio-delays" option Benjamin Herrenschmidt
@ 2018-05-08 19:54   ` Christopher Bostic
  2018-05-10  3:03     ` Andrew Jeffery
  0 siblings, 1 reply; 21+ messages in thread
From: Christopher Bostic @ 2018-05-08 19:54 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, openbmc

Reviewed-by: Christopher Bostic <cbostic@linux.vnet.ibm.com>


On 5/7/18 8:06 PM, Benjamin Herrenschmidt wrote:
> This adds support for an optional device-tree property that
> makes the driver skip all the delays around clocking the
> GPIOs and set it in the device-tree of common POWER9 based
> OpenPower platforms.
>
> This useful on chips like the AST2500 where the GPIO block is
> running at a fairly low clock frequency (25Mhz typically). In
> this case, the delays are unnecessary and due to the low
> precision of the timers, actually quite harmful in terms of
> performance.
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>   arch/arm/boot/dts/aspeed-bmc-opp-romulus.dts  |  1 +
>   .../boot/dts/aspeed-bmc-opp-witherspoon.dts   |  1 +
>   arch/arm/boot/dts/aspeed-bmc-opp-zaius.dts    |  1 +
>   drivers/fsi/fsi-master-gpio.c                 | 20 +++++++++++++++----
>   4 files changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm/boot/dts/aspeed-bmc-opp-romulus.dts b/arch/arm/boot/dts/aspeed-bmc-opp-romulus.dts
> index d746e562e8fd..0fdf08077ed3 100644
> --- a/arch/arm/boot/dts/aspeed-bmc-opp-romulus.dts
> +++ b/arch/arm/boot/dts/aspeed-bmc-opp-romulus.dts
> @@ -59,6 +59,7 @@
>   		compatible = "fsi-master-gpio", "fsi-master";
>   		#address-cells = <2>;
>   		#size-cells = <0>;
> +		no-gpio-delays;
>
>   		clock-gpios = <&gpio ASPEED_GPIO(AA, 0) GPIO_ACTIVE_HIGH>;
>   		data-gpios = <&gpio ASPEED_GPIO(AA, 2) GPIO_ACTIVE_HIGH>;
> diff --git a/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts b/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts
> index 596608f4b936..214718b7784d 100644
> --- a/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts
> +++ b/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts
> @@ -160,6 +160,7 @@
>   		compatible = "fsi-master-gpio", "fsi-master";
>   		#address-cells = <2>;
>   		#size-cells = <0>;
> +		no-gpio-delays;
>
>   		clock-gpios = <&gpio ASPEED_GPIO(AA, 0) GPIO_ACTIVE_HIGH>;
>   		data-gpios = <&gpio ASPEED_GPIO(E, 0) GPIO_ACTIVE_HIGH>;
> diff --git a/arch/arm/boot/dts/aspeed-bmc-opp-zaius.dts b/arch/arm/boot/dts/aspeed-bmc-opp-zaius.dts
> index 7f5a29566801..085926935574 100644
> --- a/arch/arm/boot/dts/aspeed-bmc-opp-zaius.dts
> +++ b/arch/arm/boot/dts/aspeed-bmc-opp-zaius.dts
> @@ -91,6 +91,7 @@
>   		compatible = "fsi-master-gpio", "fsi-master";
>   		#address-cells = <2>;
>   		#size-cells = <0>;
> +		no-gpio-delays;
>
>   		trans-gpios = <&gpio ASPEED_GPIO(O, 6) GPIO_ACTIVE_HIGH>;
>   		enable-gpios = <&gpio ASPEED_GPIO(D, 0) GPIO_ACTIVE_HIGH>;
> diff --git a/drivers/fsi/fsi-master-gpio.c b/drivers/fsi/fsi-master-gpio.c
> index d6508bbad1fb..c82bbd35276e 100644
> --- a/drivers/fsi/fsi-master-gpio.c
> +++ b/drivers/fsi/fsi-master-gpio.c
> @@ -62,6 +62,7 @@ struct fsi_master_gpio {
>   	struct gpio_desc	*gpio_enable;	/* FSI enable */
>   	struct gpio_desc	*gpio_mux;	/* Mux control */
>   	bool			external_mode;
> +	bool			no_delays;
>   };
>
>   #define CREATE_TRACE_POINTS
> @@ -79,9 +80,11 @@ static void clock_toggle(struct fsi_master_gpio *master, int count)
>   	int i;
>
>   	for (i = 0; i < count; i++) {
> -		ndelay(FSI_GPIO_STD_DLY);
> +		if (!master->no_delays)
> +			ndelay(FSI_GPIO_STD_DLY);
>   		gpiod_set_value(master->gpio_clk, 0);
> -		ndelay(FSI_GPIO_STD_DLY);
> +		if (!master->no_delays)
> +			ndelay(FSI_GPIO_STD_DLY);
>   		gpiod_set_value(master->gpio_clk, 1);
>   	}
>   }
> @@ -90,10 +93,12 @@ static int sda_clock_in(struct fsi_master_gpio *master)
>   {
>   	int in;
>
> -	ndelay(FSI_GPIO_STD_DLY);
> +	if (!master->no_delays)
> +		ndelay(FSI_GPIO_STD_DLY);
>   	gpiod_set_value(master->gpio_clk, 0);
>   	in = gpiod_get_value(master->gpio_data);
> -	ndelay(FSI_GPIO_STD_DLY);
> +	if (!master->no_delays)
> +		ndelay(FSI_GPIO_STD_DLY);
>   	gpiod_set_value(master->gpio_clk, 1);
>   	return in ? 1 : 0;
>   }
> @@ -677,6 +682,13 @@ static int fsi_master_gpio_probe(struct platform_device *pdev)
>   	}
>   	master->gpio_mux = gpio;
>
> +	/*
> +	 * Check if GPIO block is slow enought that no extra delays
> +	 * are necessary. This improves performance on ast2500 by
> +	 * an order of magnitude.
> +	 */
> +	master->no_delays = device_property_present(&pdev->dev, "no-gpio-delays");
> +
>   	master->master.n_links = 1;
>   	master->master.flags = FSI_MASTER_FLAG_SWCLOCK;
>   	master->master.read = fsi_master_gpio_read;

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH linux dev-4.13 5/6] fsi/fsi-master-gpio: Reduce turnaround clocks
  2018-05-08  1:06 ` [PATCH linux dev-4.13 5/6] fsi/fsi-master-gpio: Reduce turnaround clocks Benjamin Herrenschmidt
@ 2018-05-08 19:55   ` Christopher Bostic
  0 siblings, 0 replies; 21+ messages in thread
From: Christopher Bostic @ 2018-05-08 19:55 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, openbmc

Reviewed-by: Christopher Bostic <cbostic@linux.vnet.ibm.com>


On 5/7/18 8:06 PM, Benjamin Herrenschmidt wrote:
> FSI_GPIO_PRIME_SLAVE_CLOCKS is the number of clocks if the
> "idle" phase between the end of a response and the beginning
> of the next one. It corresponds to tSendDelay in the FSI
> specification.
>
> The default value in the slave is 16 clocks. 100 is way overkill
> and significantly reduces the driver performance.
>
> This changes it to 20 (which gives the HW a bit of margin still
> just in case).
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>   drivers/fsi/fsi-master-gpio.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/fsi/fsi-master-gpio.c b/drivers/fsi/fsi-master-gpio.c
> index c82bbd35276e..029b0a5b6d89 100644
> --- a/drivers/fsi/fsi-master-gpio.c
> +++ b/drivers/fsi/fsi-master-gpio.c
> @@ -49,7 +49,7 @@
>   #define	FSI_GPIO_CRC_SIZE	4
>   #define	FSI_GPIO_MSG_ID_SIZE		2
>   #define	FSI_GPIO_MSG_RESPID_SIZE	2
> -#define	FSI_GPIO_PRIME_SLAVE_CLOCKS	100
> +#define	FSI_GPIO_PRIME_SLAVE_CLOCKS	20
>
>   struct fsi_master_gpio {
>   	struct fsi_master	master;

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH linux dev-4.13 6/6] fsi/fsi-master-gpio: Reduce dpoll clocks
  2018-05-08  1:46 ` Benjamin Herrenschmidt
@ 2018-05-08 20:00   ` Christopher Bostic
  2018-05-10  3:06     ` Andrew Jeffery
  0 siblings, 1 reply; 21+ messages in thread
From: Christopher Bostic @ 2018-05-08 20:00 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, openbmc

Reviewed-by: Christopher Bostic <cbostic@linux.vnet.ibm.com>


On 5/7/18 8:46 PM, Benjamin Herrenschmidt wrote:
> FSI_GPIO_DPOLL_CLOCKS is the number of clocks before sending
> a DPOLL command after receiving a BUSY status. It should be
> at least tSendDelay (16 clocks).
>
> According to comments in the code, it needs to also be at least
> 21 clocks due to HW issues.
>
> It's currently 100 clocks which impacts performances negatively
> in some cases. Reduces it in half to 50 clocks which seems to
> still be solid.
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>   drivers/fsi/fsi-master-gpio.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/fsi/fsi-master-gpio.c b/drivers/fsi/fsi-master-gpio.c
> index 029b0a5b6d89..bd2b2cbd5eb5 100644
> --- a/drivers/fsi/fsi-master-gpio.c
> +++ b/drivers/fsi/fsi-master-gpio.c
> @@ -29,7 +29,8 @@
>   #define	FSI_GPIO_CMD_TERM	0x3f
>   #define FSI_GPIO_CMD_ABS_AR	0x4
>
> -#define	FSI_GPIO_DPOLL_CLOCKS	100      /* < 21 will cause slave to hang */
> +
> +#define	FSI_GPIO_DPOLL_CLOCKS	50      /* < 21 will cause slave to hang */
>
>   /* Bus errors */
>   #define	FSI_GPIO_ERR_BUSY	1	/* Slave stuck in busy state */
> @@ -43,7 +44,7 @@
>   #define	FSI_GPIO_RESP_ACK	0
>   #define	FSI_GPIO_RESP_ACKD	4
>
> -#define	FSI_GPIO_MAX_BUSY	100
> +#define	FSI_GPIO_MAX_BUSY	200
>   #define	FSI_GPIO_MTOE_COUNT	1000
>   #define	FSI_GPIO_DRAIN_BITS	20
>   #define	FSI_GPIO_CRC_SIZE	4
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction
  2018-05-08 16:13 ` [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction Christopher Bostic
@ 2018-05-10  2:50   ` Andrew Jeffery
  2018-05-10  3:48     ` Joel Stanley
  2018-05-11 10:51     ` Joel Stanley
  0 siblings, 2 replies; 21+ messages in thread
From: Andrew Jeffery @ 2018-05-10  2:50 UTC (permalink / raw)
  To: Christopher Bostic, Benjamin Herrenschmidt, openbmc

On Wed, 9 May 2018, at 01:43, Christopher Bostic wrote:
> Reviewed-by: Christopher Bostic <cbostic@linux.vnet.ibm.com>
> 
> 
> On 5/7/18 8:06 PM, Benjamin Herrenschmidt wrote:
> > In aspeed_gpio_dir_out(), we need to establish the new output
> > value in the output latch *before* we change the direction
> > to output in order to avoid a glitch on the output line if
> > the previous value of the latch was different.
> >

I think Milton mentioned this in a review way back when I was upstreaming the driver, but looks like I forgot to address it.

> > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

Reviewed-by: Andrew Jeffery <andrew@aj.id.au>

> > ---
> >   drivers/gpio/gpio-aspeed.c | 3 +--
> >   1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpio/gpio-aspeed.c b/drivers/gpio/gpio-aspeed.c
> > index 67f3eae3b550..ac54b9b25f74 100644
> > --- a/drivers/gpio/gpio-aspeed.c
> > +++ b/drivers/gpio/gpio-aspeed.c
> > @@ -287,11 +287,10 @@ static int aspeed_gpio_dir_out(struct gpio_chip *gc,
> >
> >   	spin_lock_irqsave(&gpio->lock, flags);
> >
> > +	__aspeed_gpio_set(gc, offset, val);
> >   	reg = ioread32(bank_val_reg(gpio, bank, GPIO_DIR));
> >   	iowrite32(reg | GPIO_BIT(offset), bank_val_reg(gpio, bank, GPIO_DIR));
> >
> > -	__aspeed_gpio_set(gc, offset, val);
> > -
> >   	spin_unlock_irqrestore(&gpio->lock, flags);
> >
> >   	return 0;
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH linux dev-4.13 2/6] gpio/aspeed: Use a cache of output data registers
  2018-05-08 16:17   ` Christopher Bostic
@ 2018-05-10  2:56     ` Andrew Jeffery
  0 siblings, 0 replies; 21+ messages in thread
From: Andrew Jeffery @ 2018-05-10  2:56 UTC (permalink / raw)
  To: Christopher Bostic, Benjamin Herrenschmidt, openbmc

On Wed, 9 May 2018, at 01:47, Christopher Bostic wrote:
> Reviewed-by: Christopher Bostic <cbostic@linux.vnet.ibm.com>
> 
> 
> On 5/7/18 8:06 PM, Benjamin Herrenschmidt wrote:
> > The current driver does a read/modify/write of the output
> > registers when changing a bit in __aspeed_gpio_set().
> >
> > This is sub-optimal for a couple of reasons:
> >
> >    - If any of the neighbouring GPIOs (sharing the shared
> > register) isn't (yet) configured as an output, it will
> > read the current input value, and then apply it to the
> > output latch, which may not be what the user expects. There
> > should be no bug in practice as aspeed_gpio_dir_out() will
> > establish a new value but it's not great either.
> >
> >    - The GPIO block in the aspeed chip is clocked rather
> > slowly (typically 25Mhz). That extra MMIO read halves the maximum
> > speed at which we can toggle the GPIO.
> >
> > This provides a significant performance improvement to the GPIO
> > based FSI master.
> >
> > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

Reviewed-by: Andrew Jeffery <andrew@aj.id.au>

> > ---
> >   drivers/gpio/gpio-aspeed.c | 21 +++++++++++++++++++--
> >   1 file changed, 19 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpio/gpio-aspeed.c b/drivers/gpio/gpio-aspeed.c
> > index ac54b9b25f74..a62cbf3ed4a8 100644
> > --- a/drivers/gpio/gpio-aspeed.c
> > +++ b/drivers/gpio/gpio-aspeed.c
> > @@ -54,6 +54,8 @@ struct aspeed_gpio {
> >   	u8 *offset_timer;
> >   	unsigned int timer_users[4];
> >   	struct clk *clk;
> > +
> > +	u32 *dcache;
> >   };
> >
> >   struct aspeed_gpio_bank {
> > @@ -231,12 +233,13 @@ static void __aspeed_gpio_set(struct gpio_chip *gc, unsigned int offset,
> >   	u32 reg;
> >
> >   	addr = bank_val_reg(gpio, bank, GPIO_DATA);
> > -	reg = ioread32(addr);
> > +	reg = gpio->dcache[GPIO_BANK(offset)];
> >
> >   	if (val)
> >   		reg |= GPIO_BIT(offset);
> >   	else
> >   		reg &= ~GPIO_BIT(offset);
> > +	gpio->dcache[GPIO_BANK(offset)] = reg;
> >
> >   	iowrite32(reg, addr);
> >   }
> > @@ -848,7 +851,7 @@ static int __init aspeed_gpio_probe(struct platform_device *pdev)
> >   	const struct of_device_id *gpio_id;
> >   	struct aspeed_gpio *gpio;
> >   	struct resource *res;
> > -	int rc;
> > +	int rc, i, banks;
> >
> >   	gpio = devm_kzalloc(&pdev->dev, sizeof(*gpio), GFP_KERNEL);
> >   	if (!gpio)
> > @@ -889,6 +892,20 @@ static int __init aspeed_gpio_probe(struct platform_device *pdev)
> >   	gpio->chip.base = -1;
> >   	gpio->chip.irq_need_valid_mask = true;
> >
> > +	/* Allocate a cache of the output registers */
> > +	banks = gpio->config->nr_gpios >> 5;
> > +	gpio->dcache = devm_kzalloc(&pdev->dev,
> > +				    sizeof(u32) * banks, GFP_KERNEL);
> > +	if (!gpio->dcache)
> > +		return -ENOMEM;
> > +
> > +	/* Populate it with initial values read from the HW */
> > +	for (i = 0; i < banks; i++) {
> > +		const struct aspeed_gpio_bank *bank = &aspeed_gpio_banks[i];
> > +		gpio->dcache[i] = ioread32(gpio->base + bank->val_regs +
> > +					   GPIO_DATA);
> > +	}
> > +
> >   	rc = devm_gpiochip_add_data(&pdev->dev, &gpio->chip, gpio);
> >   	if (rc < 0)
> >   		return rc;
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH linux dev-4.13 4/6] fsi/fsi-master-gpio: Add "no-gpio-delays" option
  2018-05-08 19:54   ` Christopher Bostic
@ 2018-05-10  3:03     ` Andrew Jeffery
  2018-05-10  3:10       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Jeffery @ 2018-05-10  3:03 UTC (permalink / raw)
  To: Christopher Bostic, Benjamin Herrenschmidt, openbmc



On Wed, 9 May 2018, at 05:24, Christopher Bostic wrote:
> Reviewed-by: Christopher Bostic <cbostic@linux.vnet.ibm.com>
> 
> 
> On 5/7/18 8:06 PM, Benjamin Herrenschmidt wrote:
> > This adds support for an optional device-tree property that
> > makes the driver skip all the delays around clocking the
> > GPIOs and set it in the device-tree of common POWER9 based
> > OpenPower platforms.
> >
> > This useful on chips like the AST2500 where the GPIO block is
> > running at a fairly low clock frequency (25Mhz typically). In
> > this case, the delays are unnecessary and due to the low
> > precision of the timers, actually quite harmful in terms of
> > performance.
> >
> > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > ---
> >   arch/arm/boot/dts/aspeed-bmc-opp-romulus.dts  |  1 +
> >   .../boot/dts/aspeed-bmc-opp-witherspoon.dts   |  1 +
> >   arch/arm/boot/dts/aspeed-bmc-opp-zaius.dts    |  1 +
> >   drivers/fsi/fsi-master-gpio.c                 | 20 +++++++++++++++----
> >   4 files changed, 19 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/arm/boot/dts/aspeed-bmc-opp-romulus.dts b/arch/arm/boot/dts/aspeed-bmc-opp-romulus.dts
> > index d746e562e8fd..0fdf08077ed3 100644
> > --- a/arch/arm/boot/dts/aspeed-bmc-opp-romulus.dts
> > +++ b/arch/arm/boot/dts/aspeed-bmc-opp-romulus.dts
> > @@ -59,6 +59,7 @@
> >   		compatible = "fsi-master-gpio", "fsi-master";
> >   		#address-cells = <2>;
> >   		#size-cells = <0>;
> > +		no-gpio-delays;
> >
> >   		clock-gpios = <&gpio ASPEED_GPIO(AA, 0) GPIO_ACTIVE_HIGH>;
> >   		data-gpios = <&gpio ASPEED_GPIO(AA, 2) GPIO_ACTIVE_HIGH>;
> > diff --git a/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts b/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts
> > index 596608f4b936..214718b7784d 100644
> > --- a/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts
> > +++ b/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts
> > @@ -160,6 +160,7 @@
> >   		compatible = "fsi-master-gpio", "fsi-master";
> >   		#address-cells = <2>;
> >   		#size-cells = <0>;
> > +		no-gpio-delays;
> >
> >   		clock-gpios = <&gpio ASPEED_GPIO(AA, 0) GPIO_ACTIVE_HIGH>;
> >   		data-gpios = <&gpio ASPEED_GPIO(E, 0) GPIO_ACTIVE_HIGH>;
> > diff --git a/arch/arm/boot/dts/aspeed-bmc-opp-zaius.dts b/arch/arm/boot/dts/aspeed-bmc-opp-zaius.dts
> > index 7f5a29566801..085926935574 100644
> > --- a/arch/arm/boot/dts/aspeed-bmc-opp-zaius.dts
> > +++ b/arch/arm/boot/dts/aspeed-bmc-opp-zaius.dts
> > @@ -91,6 +91,7 @@
> >   		compatible = "fsi-master-gpio", "fsi-master";
> >   		#address-cells = <2>;
> >   		#size-cells = <0>;
> > +		no-gpio-delays;
> >
> >   		trans-gpios = <&gpio ASPEED_GPIO(O, 6) GPIO_ACTIVE_HIGH>;
> >   		enable-gpios = <&gpio ASPEED_GPIO(D, 0) GPIO_ACTIVE_HIGH>;
> > diff --git a/drivers/fsi/fsi-master-gpio.c b/drivers/fsi/fsi-master-gpio.c
> > index d6508bbad1fb..c82bbd35276e 100644
> > --- a/drivers/fsi/fsi-master-gpio.c
> > +++ b/drivers/fsi/fsi-master-gpio.c
> > @@ -62,6 +62,7 @@ struct fsi_master_gpio {
> >   	struct gpio_desc	*gpio_enable;	/* FSI enable */
> >   	struct gpio_desc	*gpio_mux;	/* Mux control */
> >   	bool			external_mode;
> > +	bool			no_delays;
> >   };
> >
> >   #define CREATE_TRACE_POINTS
> > @@ -79,9 +80,11 @@ static void clock_toggle(struct fsi_master_gpio *master, int count)
> >   	int i;
> >
> >   	for (i = 0; i < count; i++) {
> > -		ndelay(FSI_GPIO_STD_DLY);
> > +		if (!master->no_delays)

This is a bit of a nit, but we don't have to have the member name reflect the devicetree property. I reckon the code would be mildly easier to read (avoiding the double-negative) if we inverted it:

                        if (master->do_delays)
 
> > +			ndelay(FSI_GPIO_STD_DLY);
> >   		gpiod_set_value(master->gpio_clk, 0);
> > -		ndelay(FSI_GPIO_STD_DLY);
> > +		if (!master->no_delays)
> > +			ndelay(FSI_GPIO_STD_DLY);
> >   		gpiod_set_value(master->gpio_clk, 1);
> >   	}
> >   }
> > @@ -90,10 +93,12 @@ static int sda_clock_in(struct fsi_master_gpio *master)
> >   {
> >   	int in;
> >
> > -	ndelay(FSI_GPIO_STD_DLY);
> > +	if (!master->no_delays)
> > +		ndelay(FSI_GPIO_STD_DLY);
> >   	gpiod_set_value(master->gpio_clk, 0);
> >   	in = gpiod_get_value(master->gpio_data);
> > -	ndelay(FSI_GPIO_STD_DLY);
> > +	if (!master->no_delays)
> > +		ndelay(FSI_GPIO_STD_DLY);
> >   	gpiod_set_value(master->gpio_clk, 1);
> >   	return in ? 1 : 0;
> >   }
> > @@ -677,6 +682,13 @@ static int fsi_master_gpio_probe(struct platform_device *pdev)
> >   	}
> >   	master->gpio_mux = gpio;
> >
> > +	/*
> > +	 * Check if GPIO block is slow enought that no extra delays
> > +	 * are necessary. This improves performance on ast2500 by
> > +	 * an order of magnitude.
> > +	 */
> > +	master->no_delays = device_property_present(&pdev->dev, "no-gpio-delays");

Obviously negate the expression here as well if you agree with the comment above.

Cheers,

Andrew

> > +
> >   	master->master.n_links = 1;
> >   	master->master.flags = FSI_MASTER_FLAG_SWCLOCK;
> >   	master->master.read = fsi_master_gpio_read;
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH linux dev-4.13 6/6] fsi/fsi-master-gpio: Reduce dpoll clocks
  2018-05-08 20:00   ` Christopher Bostic
@ 2018-05-10  3:06     ` Andrew Jeffery
  2018-05-10  3:12       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Jeffery @ 2018-05-10  3:06 UTC (permalink / raw)
  To: Christopher Bostic, Benjamin Herrenschmidt, openbmc

On Wed, 9 May 2018, at 05:30, Christopher Bostic wrote:
> Reviewed-by: Christopher Bostic <cbostic@linux.vnet.ibm.com>
> 
> 
> On 5/7/18 8:46 PM, Benjamin Herrenschmidt wrote:
> > FSI_GPIO_DPOLL_CLOCKS is the number of clocks before sending
> > a DPOLL command after receiving a BUSY status. It should be
> > at least tSendDelay (16 clocks).
> >
> > According to comments in the code, it needs to also be at least
> > 21 clocks due to HW issues.
> >
> > It's currently 100 clocks which impacts performances negatively
> > in some cases. Reduces it in half to 50 clocks which seems to
> > still be solid.

Out of curiosity, was there any science to 50 beyond the (16/21 cycle) reasoning above? Was there instability at lower values?

> >
> > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > ---
> >   drivers/fsi/fsi-master-gpio.c | 5 +++--
> >   1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/fsi/fsi-master-gpio.c b/drivers/fsi/fsi-master-gpio.c
> > index 029b0a5b6d89..bd2b2cbd5eb5 100644
> > --- a/drivers/fsi/fsi-master-gpio.c
> > +++ b/drivers/fsi/fsi-master-gpio.c
> > @@ -29,7 +29,8 @@
> >   #define	FSI_GPIO_CMD_TERM	0x3f
> >   #define FSI_GPIO_CMD_ABS_AR	0x4
> >
> > -#define	FSI_GPIO_DPOLL_CLOCKS	100      /* < 21 will cause slave to hang */
> > +
> > +#define	FSI_GPIO_DPOLL_CLOCKS	50      /* < 21 will cause slave to hang */
> >
> >   /* Bus errors */
> >   #define	FSI_GPIO_ERR_BUSY	1	/* Slave stuck in busy state */
> > @@ -43,7 +44,7 @@
> >   #define	FSI_GPIO_RESP_ACK	0
> >   #define	FSI_GPIO_RESP_ACKD	4
> >
> > -#define	FSI_GPIO_MAX_BUSY	100
> > +#define	FSI_GPIO_MAX_BUSY	200
> >   #define	FSI_GPIO_MTOE_COUNT	1000
> >   #define	FSI_GPIO_DRAIN_BITS	20
> >   #define	FSI_GPIO_CRC_SIZE	4
> >
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH linux dev-4.13 4/6] fsi/fsi-master-gpio: Add "no-gpio-delays" option
  2018-05-10  3:03     ` Andrew Jeffery
@ 2018-05-10  3:10       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2018-05-10  3:10 UTC (permalink / raw)
  To: Andrew Jeffery, Christopher Bostic, openbmc

On Thu, 2018-05-10 at 12:33 +0930, Andrew Jeffery wrote:
> > >      for (i = 0; i < count; i++) {
> > > -           ndelay(FSI_GPIO_STD_DLY);
> > > +           if (!master->no_delays)
> 
> This is a bit of a nit, but we don't have to have the member name
> reflect the devicetree property. I reckon the code would be mildly
> easier to read (avoiding the double-negative) if we inverted it:
> 
>                         if (master->do_delays)
>  
> > > +                   ndelay(FSI_GPIO_STD_DLY);

I prefer it as "no_delays" personally because delays is the norm,
and no delays is the exception for aspeed (or slow GPIO blocks).

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH linux dev-4.13 6/6] fsi/fsi-master-gpio: Reduce dpoll clocks
  2018-05-10  3:06     ` Andrew Jeffery
@ 2018-05-10  3:12       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2018-05-10  3:12 UTC (permalink / raw)
  To: Andrew Jeffery, Christopher Bostic, openbmc

On Thu, 2018-05-10 at 12:36 +0930, Andrew Jeffery wrote:
> On Wed, 9 May 2018, at 05:30, Christopher Bostic wrote:
> > Reviewed-by: Christopher Bostic <cbostic@linux.vnet.ibm.com>
> > 
> > 
> > On 5/7/18 8:46 PM, Benjamin Herrenschmidt wrote:
> > > FSI_GPIO_DPOLL_CLOCKS is the number of clocks before sending
> > > a DPOLL command after receiving a BUSY status. It should be
> > > at least tSendDelay (16 clocks).
> > > 
> > > According to comments in the code, it needs to also be at least
> > > 21 clocks due to HW issues.
> > > 
> > > It's currently 100 clocks which impacts performances negatively
> > > in some cases. Reduces it in half to 50 clocks which seems to
> > > still be solid.
> 
> Out of curiosity, was there any science to 50 beyond the (16/21
> cycle) reasoning above? Was there instability at lower values?

No. The "science" aka spec would need it to be tSendDelay, which is 16
cycles. However there's a comment that says that there are issues below
21. I decided to keep *some* paranoia and cut in half to 50 instead of
100.


> > > 
> > > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > > ---
> > >   drivers/fsi/fsi-master-gpio.c | 5 +++--
> > >   1 file changed, 3 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/fsi/fsi-master-gpio.c b/drivers/fsi/fsi-master-gpio.c
> > > index 029b0a5b6d89..bd2b2cbd5eb5 100644
> > > --- a/drivers/fsi/fsi-master-gpio.c
> > > +++ b/drivers/fsi/fsi-master-gpio.c
> > > @@ -29,7 +29,8 @@
> > >   #define	FSI_GPIO_CMD_TERM	0x3f
> > >   #define FSI_GPIO_CMD_ABS_AR	0x4
> > > 
> > > -#define	FSI_GPIO_DPOLL_CLOCKS	100      /* < 21 will cause slave to hang */
> > > +
> > > +#define	FSI_GPIO_DPOLL_CLOCKS	50      /* < 21 will cause slave to hang */
> > > 
> > >   /* Bus errors */
> > >   #define	FSI_GPIO_ERR_BUSY	1	/* Slave stuck in busy state */
> > > @@ -43,7 +44,7 @@
> > >   #define	FSI_GPIO_RESP_ACK	0
> > >   #define	FSI_GPIO_RESP_ACKD	4
> > > 
> > > -#define	FSI_GPIO_MAX_BUSY	100
> > > +#define	FSI_GPIO_MAX_BUSY	200
> > >   #define	FSI_GPIO_MTOE_COUNT	1000
> > >   #define	FSI_GPIO_DRAIN_BITS	20
> > >   #define	FSI_GPIO_CRC_SIZE	4
> > > 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction
  2018-05-10  2:50   ` Andrew Jeffery
@ 2018-05-10  3:48     ` Joel Stanley
  2018-05-11 10:51     ` Joel Stanley
  1 sibling, 0 replies; 21+ messages in thread
From: Joel Stanley @ 2018-05-10  3:48 UTC (permalink / raw)
  To: Andrew Jeffery
  Cc: Christopher Bostic, Benjamin Herrenschmidt, OpenBMC Maillist

On 10 May 2018 at 12:20, Andrew Jeffery <andrew@aj.id.au> wrote:
> On Wed, 9 May 2018, at 01:43, Christopher Bostic wrote:
>> Reviewed-by: Christopher Bostic <cbostic@linux.vnet.ibm.com>
>>
>>
>> On 5/7/18 8:06 PM, Benjamin Herrenschmidt wrote:
>> > In aspeed_gpio_dir_out(), we need to establish the new output
>> > value in the output latch *before* we change the direction
>> > to output in order to avoid a glitch on the output line if
>> > the previous value of the latch was different.
>> >
>
> I think Milton mentioned this in a review way back when I was upstreaming the driver, but looks like I forgot to address it.
>
>> > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>
> Reviewed-by: Andrew Jeffery <andrew@aj.id.au>

Ben, can you please send this upstream as a fix?

Cheers,

Joel

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction
  2018-05-10  2:50   ` Andrew Jeffery
  2018-05-10  3:48     ` Joel Stanley
@ 2018-05-11 10:51     ` Joel Stanley
  1 sibling, 0 replies; 21+ messages in thread
From: Joel Stanley @ 2018-05-11 10:51 UTC (permalink / raw)
  To: Andrew Jeffery
  Cc: Christopher Bostic, Benjamin Herrenschmidt, OpenBMC Maillist

On 10 May 2018 at 12:20, Andrew Jeffery <andrew@aj.id.au> wrote:
> On Wed, 9 May 2018, at 01:43, Christopher Bostic wrote:
>> Reviewed-by: Christopher Bostic <cbostic@linux.vnet.ibm.com>
>>
>>
>> On 5/7/18 8:06 PM, Benjamin Herrenschmidt wrote:
>> > In aspeed_gpio_dir_out(), we need to establish the new output
>> > value in the output latch *before* we change the direction
>> > to output in order to avoid a glitch on the output line if
>> > the previous value of the latch was different.
>> >
>
> I think Milton mentioned this in a review way back when I was upstreaming the driver, but looks like I forgot to address it.
>
>> > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>
> Reviewed-by: Andrew Jeffery <andrew@aj.id.au>

I've applied this series to dev-4.13. Thanks for the patches, and
thanks to everyone who reviewed.

Cheers,

Joel

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2018-05-11 10:52 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-08  1:06 [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction Benjamin Herrenschmidt
2018-05-08  1:06 ` [PATCH linux dev-4.13 2/6] gpio/aspeed: Use a cache of output data registers Benjamin Herrenschmidt
2018-05-08 16:17   ` Christopher Bostic
2018-05-10  2:56     ` Andrew Jeffery
2018-05-08  1:06 ` [PATCH linux dev-4.13 3/6] fsi/fsi-master-gpio: Sample input data on different clock phase Benjamin Herrenschmidt
2018-05-08 16:42   ` Christopher Bostic
2018-05-08  1:06 ` [PATCH linux dev-4.13 4/6] fsi/fsi-master-gpio: Add "no-gpio-delays" option Benjamin Herrenschmidt
2018-05-08 19:54   ` Christopher Bostic
2018-05-10  3:03     ` Andrew Jeffery
2018-05-10  3:10       ` Benjamin Herrenschmidt
2018-05-08  1:06 ` [PATCH linux dev-4.13 5/6] fsi/fsi-master-gpio: Reduce turnaround clocks Benjamin Herrenschmidt
2018-05-08 19:55   ` Christopher Bostic
2018-05-08  1:06 ` [PATCH linux dev-4.13 6/6] fsi/fsi-master-gpio: Reduce dpoll clocks Benjamin Herrenschmidt
2018-05-08  1:46 ` Benjamin Herrenschmidt
2018-05-08 20:00   ` Christopher Bostic
2018-05-10  3:06     ` Andrew Jeffery
2018-05-10  3:12       ` Benjamin Herrenschmidt
2018-05-08 16:13 ` [PATCH linux dev-4.13 1/6] gpio/aspeed: Set output latch before changing direction Christopher Bostic
2018-05-10  2:50   ` Andrew Jeffery
2018-05-10  3:48     ` Joel Stanley
2018-05-11 10:51     ` Joel Stanley

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.