All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
@ 2020-04-20 17:27 Andy Shevchenko
  2020-04-20 17:27 ` [PATCH v3 2/3] gpio: pca953x: fix handling of automatic address incrementing Andy Shevchenko
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Andy Shevchenko @ 2020-04-20 17:27 UTC (permalink / raw)
  To: Linus Walleij, Bartosz Golaszewski, linux-gpio, Uwe Kleine-König
  Cc: Andy Shevchenko, Paul Thomas

The commit 96d7c7b3e654 ("gpio: gpio-pca953x, Add get_multiple function")
basically did everything wrong from style and code reuse perspective, i.e.
- it didn't utilize existing PCA953x internal helpers
- it didn't utilize bitmap API
- it misses the point that ilog2(), besides that BANK_SFT is useless,
  can be used in macros
- it has indentation issues.

Rewrite the function completely.

Cc: Paul Thomas <pthomas8589@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 drivers/gpio/gpio-pca953x.c | 41 ++++++++++---------------------------
 1 file changed, 11 insertions(+), 30 deletions(-)

diff --git a/drivers/gpio/gpio-pca953x.c b/drivers/gpio/gpio-pca953x.c
index 60ae18e4b5f5a..41be681ae77c2 100644
--- a/drivers/gpio/gpio-pca953x.c
+++ b/drivers/gpio/gpio-pca953x.c
@@ -115,7 +115,6 @@ MODULE_DEVICE_TABLE(acpi, pca953x_acpi_ids);
 
 #define MAX_BANK 5
 #define BANK_SZ 8
-#define BANK_SFT 3 /* ilog2(BANK_SZ) */
 #define MAX_LINE	(MAX_BANK * BANK_SZ)
 
 #define NBANK(chip) DIV_ROUND_UP(chip->gpio_chip.ngpio, BANK_SZ)
@@ -469,38 +468,20 @@ static int pca953x_gpio_get_direction(struct gpio_chip *gc, unsigned off)
 }
 
 static int pca953x_gpio_get_multiple(struct gpio_chip *gc,
-				      unsigned long *mask, unsigned long *bits)
+				     unsigned long *mask, unsigned long *bits)
 {
 	struct pca953x_chip *chip = gpiochip_get_data(gc);
-	unsigned int reg_val;
-	int offset, value, i, ret = 0;
-	u8 inreg;
+	DECLARE_BITMAP(reg_val, MAX_LINE);
+	int ret;
 
-	/* Force offset outside the range of i so that
-	 * at least the first relevant register is read
-	 */
-	offset = gc->ngpio;
-	for_each_set_bit(i, mask, gc->ngpio) {
-		/* whenever i goes into a new bank update inreg
-		 * and read the register
-		 */
-		if ((offset >> BANK_SFT) != (i >> BANK_SFT)) {
-			offset = i;
-			inreg = pca953x_recalc_addr(chip, chip->regs->input,
-						    offset, true, false);
-			mutex_lock(&chip->i2c_lock);
-			ret = regmap_read(chip->regmap, inreg, &reg_val);
-			mutex_unlock(&chip->i2c_lock);
-			if (ret < 0)
-				return ret;
-		}
-		/* reg_val is relative to the last read byte,
-		 * so only shift the relative bits
-		 */
-		value = (reg_val >> (i % 8)) & 0x01;
-		__assign_bit(i, bits, value);
-	}
-	return ret;
+	mutex_lock(&chip->i2c_lock);
+	ret = pca953x_read_regs(chip, chip->regs->input, reg_val);
+	mutex_unlock(&chip->i2c_lock);
+	if (ret)
+		return ret;
+
+	bitmap_replace(bits, bits, reg_val, mask, gc->ngpio);
+	return 0;
 }
 
 static void pca953x_gpio_set_multiple(struct gpio_chip *gc,
-- 
2.26.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v3 2/3] gpio: pca953x: fix handling of automatic address incrementing
  2020-04-20 17:27 [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function Andy Shevchenko
@ 2020-04-20 17:27 ` Andy Shevchenko
  2020-04-20 17:27 ` [PATCH v3 3/3] gpio: pca953x: drop unused parameters of pca953x_recalc_addr() Andy Shevchenko
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 19+ messages in thread
From: Andy Shevchenko @ 2020-04-20 17:27 UTC (permalink / raw)
  To: Linus Walleij, Bartosz Golaszewski, linux-gpio, Uwe Kleine-König
  Cc: Marcel Gudert, Andy Shevchenko

From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

Some of the chips supported by the pca953x driver need the most
significant bit in the address word set to automatically increment the
address pointer on subsequent reads and writes (example: PCA9505). With
this bit unset the same register is read multiple times on a multi-byte
read sequence. Other chips must not have this bit set and autoincrement
always (example: PCA9555).

Up to now this AI bit was interpreted to be part of the address, which
resulted in inconsistent regmap caching when a register was written with
AI set and then read without it. This happened for the PCA9505 in
pca953x_gpio_set_multiple() where pca953x_read_regs() bulk read from the
cache for registers 0x8-0xc and then wrote to registers 0x88-0x8c. (Side
note: reading 5 values from offset 0x8 yiels OP0 5 times because AI must
be set to get OP0-OP4, which is another bug that is resolved here as a
by-product.) The same problem happens when calls to gpio_set_value() and
gpio_set_array_value() were mixed.

With this patch the AI bit is always set for chips that support it. This
works as there are no code locations that make use of the behaviour with
AI unset (for the chips that support it).

Note that the call to pca953x_setup_gpio() had to be done a bit earlier
to make the NBANK macro work.

The history of this bug is a bit complicated. Commit b32cecb46bdc
("gpio: pca953x: Extract the register address mangling to single
function") changed which chips and functions are affected. Commit
3b00691cc46a ("gpio: pca953x: hack to fix 24 bit gpio expanders") used
some duct tape to make the driver at least appear to work. Commit
49427232764d ("gpio: pca953x: Perform basic regmap conversion")
introduced the caching. Commit b4818afeacbd ("gpio: pca953x: Add
set_multiple to allow multiple bits to be set in one write.") introduced
the .set_multiple() callback which didn't work for chips that need the
AI bit which was fixed later for some chips in 8958262af3fb ("gpio:
pca953x: Repair multi-byte IO address increment on PCA9575"). So I'm
sorry, I don't know which commit I should pick for a Fixes: line.

Tested-by: Marcel Gudert <m.gudert@eckelmann.de>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 drivers/gpio/gpio-pca953x.c | 44 +++++++++++++++++++++++--------------
 1 file changed, 28 insertions(+), 16 deletions(-)

diff --git a/drivers/gpio/gpio-pca953x.c b/drivers/gpio/gpio-pca953x.c
index 41be681ae77c2..590b072366377 100644
--- a/drivers/gpio/gpio-pca953x.c
+++ b/drivers/gpio/gpio-pca953x.c
@@ -308,8 +308,22 @@ static const struct regmap_config pca953x_i2c_regmap = {
 
 	.disable_locking = true,
 	.cache_type = REGCACHE_RBTREE,
-	/* REVISIT: should be 0x7f but some 24 bit chips use REG_ADDR_AI */
-	.max_register = 0xff,
+	.max_register = 0x7f,
+};
+
+static const struct regmap_config pca953x_ai_i2c_regmap = {
+	.reg_bits = 8,
+	.val_bits = 8,
+
+	.read_flag_mask = REG_ADDR_AI,
+	.write_flag_mask = REG_ADDR_AI,
+
+	.readable_reg = pca953x_readable_register,
+	.writeable_reg = pca953x_writeable_register,
+	.volatile_reg = pca953x_volatile_register,
+
+	.cache_type = REGCACHE_RBTREE,
+	.max_register = 0x7f,
 };
 
 static u8 pca953x_recalc_addr(struct pca953x_chip *chip, int reg, int off,
@@ -320,18 +334,6 @@ static u8 pca953x_recalc_addr(struct pca953x_chip *chip, int reg, int off,
 	int pinctrl = (reg & PCAL_PINCTRL_MASK) << 1;
 	u8 regaddr = pinctrl | addr | (off / BANK_SZ);
 
-	/* Single byte read doesn't need AI bit set. */
-	if (!addrinc)
-		return regaddr;
-
-	/* Chips with 24 and more GPIOs always support Auto Increment */
-	if (write && NBANK(chip) > 2)
-		regaddr |= REG_ADDR_AI;
-
-	/* PCA9575 needs address-increment on multi-byte writes */
-	if (PCA_CHIP_TYPE(chip->driver_data) == PCA957X_TYPE)
-		regaddr |= REG_ADDR_AI;
-
 	return regaddr;
 }
 
@@ -882,6 +884,7 @@ static int pca953x_probe(struct i2c_client *client,
 	int ret;
 	u32 invert = 0;
 	struct regulator *reg;
+	const struct regmap_config *regmap_config;
 
 	chip = devm_kzalloc(&client->dev, sizeof(*chip), GFP_KERNEL);
 	if (chip == NULL)
@@ -944,7 +947,17 @@ static int pca953x_probe(struct i2c_client *client,
 
 	i2c_set_clientdata(client, chip);
 
-	chip->regmap = devm_regmap_init_i2c(client, &pca953x_i2c_regmap);
+	pca953x_setup_gpio(chip, chip->driver_data & PCA_GPIO_MASK);
+
+	if (NBANK(chip) > 2 || PCA_CHIP_TYPE(chip->driver_data) == PCA957X_TYPE) {
+		dev_info(&client->dev, "using AI\n");
+		regmap_config = &pca953x_ai_i2c_regmap;
+	} else {
+		dev_info(&client->dev, "using no AI\n");
+		regmap_config = &pca953x_i2c_regmap;
+	}
+
+	chip->regmap = devm_regmap_init_i2c(client, regmap_config);
 	if (IS_ERR(chip->regmap)) {
 		ret = PTR_ERR(chip->regmap);
 		goto err_exit;
@@ -975,7 +988,6 @@ static int pca953x_probe(struct i2c_client *client,
 	/* initialize cached registers from their original values.
 	 * we can't share this chip with another i2c master.
 	 */
-	pca953x_setup_gpio(chip, chip->driver_data & PCA_GPIO_MASK);
 
 	if (PCA_CHIP_TYPE(chip->driver_data) == PCA953X_TYPE) {
 		chip->regs = &pca953x_regs;
-- 
2.26.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v3 3/3] gpio: pca953x: drop unused parameters of pca953x_recalc_addr()
  2020-04-20 17:27 [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function Andy Shevchenko
  2020-04-20 17:27 ` [PATCH v3 2/3] gpio: pca953x: fix handling of automatic address incrementing Andy Shevchenko
@ 2020-04-20 17:27 ` Andy Shevchenko
  2020-04-21  3:23 ` [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function Paul Thomas
  2020-04-21 13:03 ` Andy Shevchenko
  3 siblings, 0 replies; 19+ messages in thread
From: Andy Shevchenko @ 2020-04-20 17:27 UTC (permalink / raw)
  To: Linus Walleij, Bartosz Golaszewski, linux-gpio, Uwe Kleine-König
  Cc: Andy Shevchenko

From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

After the previous patch the two last parameters of
pca953x_recalc_addr() are unused and so can be dropped.

Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 drivers/gpio/gpio-pca953x.c | 31 +++++++++++--------------------
 1 file changed, 11 insertions(+), 20 deletions(-)

diff --git a/drivers/gpio/gpio-pca953x.c b/drivers/gpio/gpio-pca953x.c
index 590b072366377..06d6af60e6b76 100644
--- a/drivers/gpio/gpio-pca953x.c
+++ b/drivers/gpio/gpio-pca953x.c
@@ -326,8 +326,7 @@ static const struct regmap_config pca953x_ai_i2c_regmap = {
 	.max_register = 0x7f,
 };
 
-static u8 pca953x_recalc_addr(struct pca953x_chip *chip, int reg, int off,
-			      bool write, bool addrinc)
+static u8 pca953x_recalc_addr(struct pca953x_chip *chip, int reg, int off)
 {
 	int bank_shift = pca953x_bank_shift(chip);
 	int addr = (reg & PCAL_GPIO_MASK) << bank_shift;
@@ -339,7 +338,7 @@ static u8 pca953x_recalc_addr(struct pca953x_chip *chip, int reg, int off,
 
 static int pca953x_write_regs(struct pca953x_chip *chip, int reg, unsigned long *val)
 {
-	u8 regaddr = pca953x_recalc_addr(chip, reg, 0, true, true);
+	u8 regaddr = pca953x_recalc_addr(chip, reg, 0);
 	u8 value[MAX_BANK];
 	int i, ret;
 
@@ -357,7 +356,7 @@ static int pca953x_write_regs(struct pca953x_chip *chip, int reg, unsigned long
 
 static int pca953x_read_regs(struct pca953x_chip *chip, int reg, unsigned long *val)
 {
-	u8 regaddr = pca953x_recalc_addr(chip, reg, 0, false, true);
+	u8 regaddr = pca953x_recalc_addr(chip, reg, 0);
 	u8 value[MAX_BANK];
 	int i, ret;
 
@@ -376,8 +375,7 @@ static int pca953x_read_regs(struct pca953x_chip *chip, int reg, unsigned long *
 static int pca953x_gpio_direction_input(struct gpio_chip *gc, unsigned off)
 {
 	struct pca953x_chip *chip = gpiochip_get_data(gc);
-	u8 dirreg = pca953x_recalc_addr(chip, chip->regs->direction, off,
-					true, false);
+	u8 dirreg = pca953x_recalc_addr(chip, chip->regs->direction, off);
 	u8 bit = BIT(off % BANK_SZ);
 	int ret;
 
@@ -391,10 +389,8 @@ static int pca953x_gpio_direction_output(struct gpio_chip *gc,
 		unsigned off, int val)
 {
 	struct pca953x_chip *chip = gpiochip_get_data(gc);
-	u8 dirreg = pca953x_recalc_addr(chip, chip->regs->direction, off,
-					true, false);
-	u8 outreg = pca953x_recalc_addr(chip, chip->regs->output, off,
-					true, false);
+	u8 dirreg = pca953x_recalc_addr(chip, chip->regs->direction, off);
+	u8 outreg = pca953x_recalc_addr(chip, chip->regs->output, off);
 	u8 bit = BIT(off % BANK_SZ);
 	int ret;
 
@@ -414,8 +410,7 @@ static int pca953x_gpio_direction_output(struct gpio_chip *gc,
 static int pca953x_gpio_get_value(struct gpio_chip *gc, unsigned off)
 {
 	struct pca953x_chip *chip = gpiochip_get_data(gc);
-	u8 inreg = pca953x_recalc_addr(chip, chip->regs->input, off,
-				       true, false);
+	u8 inreg = pca953x_recalc_addr(chip, chip->regs->input, off);
 	u8 bit = BIT(off % BANK_SZ);
 	u32 reg_val;
 	int ret;
@@ -439,8 +434,7 @@ static int pca953x_gpio_get_value(struct gpio_chip *gc, unsigned off)
 static void pca953x_gpio_set_value(struct gpio_chip *gc, unsigned off, int val)
 {
 	struct pca953x_chip *chip = gpiochip_get_data(gc);
-	u8 outreg = pca953x_recalc_addr(chip, chip->regs->output, off,
-					true, false);
+	u8 outreg = pca953x_recalc_addr(chip, chip->regs->output, off);
 	u8 bit = BIT(off % BANK_SZ);
 
 	mutex_lock(&chip->i2c_lock);
@@ -451,8 +445,7 @@ static void pca953x_gpio_set_value(struct gpio_chip *gc, unsigned off, int val)
 static int pca953x_gpio_get_direction(struct gpio_chip *gc, unsigned off)
 {
 	struct pca953x_chip *chip = gpiochip_get_data(gc);
-	u8 dirreg = pca953x_recalc_addr(chip, chip->regs->direction, off,
-					true, false);
+	u8 dirreg = pca953x_recalc_addr(chip, chip->regs->direction, off);
 	u8 bit = BIT(off % BANK_SZ);
 	u32 reg_val;
 	int ret;
@@ -509,10 +502,8 @@ static int pca953x_gpio_set_pull_up_down(struct pca953x_chip *chip,
 					 unsigned int offset,
 					 unsigned long config)
 {
-	u8 pull_en_reg = pca953x_recalc_addr(chip, PCAL953X_PULL_EN, offset,
-					     true, false);
-	u8 pull_sel_reg = pca953x_recalc_addr(chip, PCAL953X_PULL_SEL, offset,
-					      true, false);
+	u8 pull_en_reg = pca953x_recalc_addr(chip, PCAL953X_PULL_EN, offset);
+	u8 pull_sel_reg = pca953x_recalc_addr(chip, PCAL953X_PULL_SEL, offset);
 	u8 bit = BIT(offset % BANK_SZ);
 	int ret;
 
-- 
2.26.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-20 17:27 [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function Andy Shevchenko
  2020-04-20 17:27 ` [PATCH v3 2/3] gpio: pca953x: fix handling of automatic address incrementing Andy Shevchenko
  2020-04-20 17:27 ` [PATCH v3 3/3] gpio: pca953x: drop unused parameters of pca953x_recalc_addr() Andy Shevchenko
@ 2020-04-21  3:23 ` Paul Thomas
  2020-04-21 12:55   ` Andy Shevchenko
  2020-04-21 13:03 ` Andy Shevchenko
  3 siblings, 1 reply; 19+ messages in thread
From: Paul Thomas @ 2020-04-21  3:23 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Linus Walleij, Bartosz Golaszewski, open list:GPIO SUBSYSTEM,
	Uwe Kleine-König

On Mon, Apr 20, 2020 at 1:27 PM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> The commit 96d7c7b3e654 ("gpio: gpio-pca953x, Add get_multiple function")
> basically did everything wrong from style and code reuse perspective, i.e.
Hi Andy,

Well your version is certainly elegant and simple, and does better
with code reuse. However there are a couple of other goals I had in
mind.
First, the "lazy" approach of 96d7c7b3e654 is actually faster when
user space sets up a 8-bit linehandle[1]146us (single regmap_read())
vs 172us (pca953x_read_regs()) which incidentally is what we do in our
application. In lazily reading 1 byte at a time it is the fastest
access for that, if user space is always setting up the linehandle for
the whole chip pca953x_read_regs() would be faster. Seeing as
get_multiple has been unimplemented for this chip until now perhaps
our use case deserves some consideration? That being said, the
pca953x_read_regs() is still far better than calling regmap_read() 8
times.

Second, your version does not work with a 5.2 kernel, bitmap_replace
is not there yet and pca953x_read_regs() signature is different. So
perhaps we'll all move on and no one will care about 5.2, but as
that's what we are using that was the basis for the patch. Have you
tested this with actual hardware? I actually didn't do a proper test
just the timing of the pca953x_read_regs().

> - it didn't utilize existing PCA953x internal helpers
> - it didn't utilize bitmap API
> - it misses the point that ilog2(), besides that BANK_SFT is useless,
>   can be used in macros
Yes, I know ilog2() can be used in macros, I didn't think it was worth
including the .h file just to calculate 3. Putting the ilog2(x) in the
comments seemed to be common in other kernel sections, but maybe that
was historic before the macro version? Either way is fine. The shift
is not useless, without that you would go into the if statement for
every bit, but you only want to do a regmap_read() for every byte.

> - it has indentation issues.
It passed checkpatch.pl, and any indentation fixes are fine with me.

thanks,
Paul

[1] Tested using 16-bit max7312

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-21  3:23 ` [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function Paul Thomas
@ 2020-04-21 12:55   ` Andy Shevchenko
  2020-04-21 14:06     ` Uwe Kleine-König
  0 siblings, 1 reply; 19+ messages in thread
From: Andy Shevchenko @ 2020-04-21 12:55 UTC (permalink / raw)
  To: Paul Thomas
  Cc: Linus Walleij, Bartosz Golaszewski, open list:GPIO SUBSYSTEM,
	Uwe Kleine-König

On Mon, Apr 20, 2020 at 11:23:57PM -0400, Paul Thomas wrote:
> On Mon, Apr 20, 2020 at 1:27 PM Andy Shevchenko
> <andriy.shevchenko@linux.intel.com> wrote:
> >
> > The commit 96d7c7b3e654 ("gpio: gpio-pca953x, Add get_multiple function")
> > basically did everything wrong from style and code reuse perspective, i.e.
> Hi Andy,
> 
> Well your version is certainly elegant and simple, and does better
> with code reuse. However there are a couple of other goals I had in
> mind.
> First, the "lazy" approach of 96d7c7b3e654 is actually faster when
> user space sets up a 8-bit linehandle[1]146us (single regmap_read())
> vs 172us (pca953x_read_regs()) which incidentally is what we do in our
> application. In lazily reading 1 byte at a time it is the fastest
> access for that, if user space is always setting up the linehandle for
> the whole chip pca953x_read_regs() would be faster. Seeing as
> get_multiple has been unimplemented for this chip until now perhaps
> our use case deserves some consideration?

I understand completely your goal, but
- for I²C expanders timings is the last thing to look for (they are quite slow
  by nature), so, I really don't care about 16% speed up for one call; don't
  forget that we are in multi-task OS, where this can be easily interrupted and
  user will see the result quite after expected quick result
- the code maintenance has a priority over micro-optimization (this driver
  suffered many times of breakages because of some optimizations done)
- it breaks Uwe's approach to fix AI chips, after my patch Uwe's ones are
  applied cleanly

>That being said, the
> pca953x_read_regs() is still far better than calling regmap_read() 8
> times.

Yes, it's a best compromise I can come with.

> Second, your version does not work with a 5.2 kernel, bitmap_replace
> is not there yet and pca953x_read_regs() signature is different. So
> perhaps we'll all move on and no one will care about 5.2, but as
> that's what we are using that was the basis for the patch.

That's not an argument at all. This is feature request, not the fix.
Uwe's ones, though, are fixes, and thus much more important.

> Have you
> tested this with actual hardware?

Do you see any issues with it?

For the record, yes, on Intel Edison/Arduino where 4 PCA9555 are present.

% gpiodetect
gpiochip0 [0000:00:0c.0] (192 lines)
gpiochip1 [i2c-INT3491:00] (16 lines)
gpiochip2 [spi-PRP0001:00] (4 lines)
gpiochip3 [i2c-INT3491:01] (16 lines)
gpiochip4 [i2c-INT3491:02] (16 lines)
gpiochip5 [i2c-INT3491:03] (16 lines)

### Consider gpiochip1: 8 and 10 are PUed

% gpioget gpiochip1 0 1 2 3 6 7 8 9 10 11 14 15
[  201.416858] pca953x i2c-INT3491:00: pca953x_gpio_get_multiple <<< 00,0000b58f
1 1 1 1 0 1 1 0 1 0 0 1

### only 8

% gpioget gpiochip1 0 1 2 3 6 7 8 9 10 11 14 15
[  354.153921] pca953x i2c-INT3491:00: pca953x_gpio_get_multiple <<< 00,0000b18f
1 1 1 1 0 1 1 0 0 0 0 1

### only 10

% gpioget gpiochip1 0 1 2 3 6 7 8 9 10 11 14 15
[  362.723867] pca953x i2c-INT3491:00: pca953x_gpio_get_multiple <<< 00,0000b48f
1 1 1 1 0 1 0 0 1 0 0 1

### None of them

% gpioget gpiochip1 0 1 2 3 6 7 8 9 10 11 14 15
[  371.294910] pca953x i2c-INT3491:00: pca953x_gpio_get_multiple <<< 00,0000b08f
1 1 1 1 0 1 0 0 0 0 0 1

Yes, I specifically added a debug print since GPIO trace does not distinguish
between plain get and complex one.

  dev_info(&chip->client->dev, "%s <<< %*pb\n", __func__, MAX_LINE, reg_val);

> I actually didn't do a proper test
> just the timing of the pca953x_read_regs().
> 
> > - it didn't utilize existing PCA953x internal helpers
> > - it didn't utilize bitmap API
> > - it misses the point that ilog2(), besides that BANK_SFT is useless,
> >   can be used in macros
> Yes, I know ilog2() can be used in macros, I didn't think it was worth
> including the .h file just to calculate 3. Putting the ilog2(x) in the
> comments seemed to be common in other kernel sections, but maybe that
> was historic before the macro version? Either way is fine. The shift
> is not useless, without that you would go into the if statement for
> every bit, but you only want to do a regmap_read() for every byte.
> 
> > - it has indentation issues.
> It passed checkpatch.pl, and any indentation fixes are fine with me.

> [1] Tested using 16-bit max7312

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-20 17:27 [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function Andy Shevchenko
                   ` (2 preceding siblings ...)
  2020-04-21  3:23 ` [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function Paul Thomas
@ 2020-04-21 13:03 ` Andy Shevchenko
  2020-04-21 14:31   ` Uwe Kleine-König
  2020-04-21 15:42   ` Bartosz Golaszewski
  3 siblings, 2 replies; 19+ messages in thread
From: Andy Shevchenko @ 2020-04-21 13:03 UTC (permalink / raw)
  To: Linus Walleij, Bartosz Golaszewski, linux-gpio, Uwe Kleine-König
  Cc: Paul Thomas

On Mon, Apr 20, 2020 at 08:27:50PM +0300, Andy Shevchenko wrote:
> The commit 96d7c7b3e654 ("gpio: gpio-pca953x, Add get_multiple function")
> basically did everything wrong from style and code reuse perspective, i.e.
> - it didn't utilize existing PCA953x internal helpers
> - it didn't utilize bitmap API
> - it misses the point that ilog2(), besides that BANK_SFT is useless,
>   can be used in macros
> - it has indentation issues.
> 
> Rewrite the function completely.

Bart, Linus, please, consider this series to be applied, because it has Uwe's fixes.
We may still discuss the approach with ->get_multiple(), though.

For the record, should some of us volunteer to be a reviewer for this driver.
It's awful that almost every release we get something either ugly or broken in it.
Uwe, would you like to be a designated reviewer (I would also support)?

> Cc: Paul Thomas <pthomas8589@gmail.com>
> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> ---
>  drivers/gpio/gpio-pca953x.c | 41 ++++++++++---------------------------
>  1 file changed, 11 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/gpio/gpio-pca953x.c b/drivers/gpio/gpio-pca953x.c
> index 60ae18e4b5f5a..41be681ae77c2 100644
> --- a/drivers/gpio/gpio-pca953x.c
> +++ b/drivers/gpio/gpio-pca953x.c
> @@ -115,7 +115,6 @@ MODULE_DEVICE_TABLE(acpi, pca953x_acpi_ids);
>  
>  #define MAX_BANK 5
>  #define BANK_SZ 8
> -#define BANK_SFT 3 /* ilog2(BANK_SZ) */
>  #define MAX_LINE	(MAX_BANK * BANK_SZ)
>  
>  #define NBANK(chip) DIV_ROUND_UP(chip->gpio_chip.ngpio, BANK_SZ)
> @@ -469,38 +468,20 @@ static int pca953x_gpio_get_direction(struct gpio_chip *gc, unsigned off)
>  }
>  
>  static int pca953x_gpio_get_multiple(struct gpio_chip *gc,
> -				      unsigned long *mask, unsigned long *bits)
> +				     unsigned long *mask, unsigned long *bits)
>  {
>  	struct pca953x_chip *chip = gpiochip_get_data(gc);
> -	unsigned int reg_val;
> -	int offset, value, i, ret = 0;
> -	u8 inreg;
> +	DECLARE_BITMAP(reg_val, MAX_LINE);
> +	int ret;
>  
> -	/* Force offset outside the range of i so that
> -	 * at least the first relevant register is read
> -	 */
> -	offset = gc->ngpio;
> -	for_each_set_bit(i, mask, gc->ngpio) {
> -		/* whenever i goes into a new bank update inreg
> -		 * and read the register
> -		 */
> -		if ((offset >> BANK_SFT) != (i >> BANK_SFT)) {
> -			offset = i;
> -			inreg = pca953x_recalc_addr(chip, chip->regs->input,
> -						    offset, true, false);
> -			mutex_lock(&chip->i2c_lock);
> -			ret = regmap_read(chip->regmap, inreg, &reg_val);
> -			mutex_unlock(&chip->i2c_lock);
> -			if (ret < 0)
> -				return ret;
> -		}
> -		/* reg_val is relative to the last read byte,
> -		 * so only shift the relative bits
> -		 */
> -		value = (reg_val >> (i % 8)) & 0x01;
> -		__assign_bit(i, bits, value);
> -	}
> -	return ret;
> +	mutex_lock(&chip->i2c_lock);
> +	ret = pca953x_read_regs(chip, chip->regs->input, reg_val);
> +	mutex_unlock(&chip->i2c_lock);
> +	if (ret)
> +		return ret;
> +
> +	bitmap_replace(bits, bits, reg_val, mask, gc->ngpio);
> +	return 0;
>  }
>  
>  static void pca953x_gpio_set_multiple(struct gpio_chip *gc,
> -- 
> 2.26.1
> 

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-21 12:55   ` Andy Shevchenko
@ 2020-04-21 14:06     ` Uwe Kleine-König
  2020-04-21 14:21       ` Andy Shevchenko
  0 siblings, 1 reply; 19+ messages in thread
From: Uwe Kleine-König @ 2020-04-21 14:06 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Paul Thomas, Linus Walleij, Bartosz Golaszewski,
	open list:GPIO SUBSYSTEM

Hello Andy,

first of all thanks for picking up my patches, very appreciated.

On Tue, Apr 21, 2020 at 03:55:53PM +0300, Andy Shevchenko wrote:
> On Mon, Apr 20, 2020 at 11:23:57PM -0400, Paul Thomas wrote:
> > On Mon, Apr 20, 2020 at 1:27 PM Andy Shevchenko
> > <andriy.shevchenko@linux.intel.com> wrote:
> > >
> > > The commit 96d7c7b3e654 ("gpio: gpio-pca953x, Add get_multiple function")
> > > basically did everything wrong from style and code reuse perspective, i.e.
> > Hi Andy,
> > 
> > Well your version is certainly elegant and simple, and does better
> > with code reuse. However there are a couple of other goals I had in
> > mind.
> > First, the "lazy" approach of 96d7c7b3e654 is actually faster when
> > user space sets up a 8-bit linehandle[1]146us (single regmap_read())
> > vs 172us (pca953x_read_regs()) which incidentally is what we do in our
> > application. In lazily reading 1 byte at a time it is the fastest
> > access for that, if user space is always setting up the linehandle for
> > the whole chip pca953x_read_regs() would be faster. Seeing as
> > get_multiple has been unimplemented for this chip until now perhaps
> > our use case deserves some consideration?
> 
> I understand completely your goal, but
> - for I²C expanders timings is the last thing to look for (they are quite slow
>   by nature), so, I really don't care about 16% speed up for one call; don't
>   forget that we are in multi-task OS, where this can be easily interrupted and
>   user will see the result quite after expected quick result

I didn't do any timing analysis and while I understand your
argumentation, I'm not sure to agree. I noticed while debugging the
problem that then resulted in my fix that gpioctl uses the .set_multiple
callback. I told my customer to use gpioctl instead of /sys/class/gpio
because it performs better just to notice that when using gpioctl to set
a single GPIO this transfers five bytes instead of only two.

Having said that I think the sane approach is to optimize
.{g,s}et_multiple to reduce the read/write size to the smallest bulk
size possible that covers all bits that have their corresponding bit set
in mask.

> - the code maintenance has a priority over micro-optimization (this driver
>   suffered many times of breakages because of some optimizations done)

ack here. Some parts of the driver were harder to grasp than necessary.

> - it breaks Uwe's approach to fix AI chips, after my patch Uwe's ones are
>   applied cleanly

I didn't check, is 96d7c7b3e654 broken for some chips?

I will add my suggested optimisation to my todo list for evaluation. If
I think it is still nice and maintainable I'll send a patch. Until I
have looked into this (or someone else did) I'm in favour of applying
Andy's patch.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-21 14:06     ` Uwe Kleine-König
@ 2020-04-21 14:21       ` Andy Shevchenko
  2020-04-22  4:33         ` Paul Thomas
  0 siblings, 1 reply; 19+ messages in thread
From: Andy Shevchenko @ 2020-04-21 14:21 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: Paul Thomas, Linus Walleij, Bartosz Golaszewski,
	open list:GPIO SUBSYSTEM

On Tue, Apr 21, 2020 at 04:06:24PM +0200, Uwe Kleine-König wrote:
> On Tue, Apr 21, 2020 at 03:55:53PM +0300, Andy Shevchenko wrote:
> > On Mon, Apr 20, 2020 at 11:23:57PM -0400, Paul Thomas wrote:
> > > On Mon, Apr 20, 2020 at 1:27 PM Andy Shevchenko
> > > <andriy.shevchenko@linux.intel.com> wrote:
> > > >
> > > > The commit 96d7c7b3e654 ("gpio: gpio-pca953x, Add get_multiple function")
> > > > basically did everything wrong from style and code reuse perspective, i.e.
> > > Hi Andy,
> > > 
> > > Well your version is certainly elegant and simple, and does better
> > > with code reuse. However there are a couple of other goals I had in
> > > mind.
> > > First, the "lazy" approach of 96d7c7b3e654 is actually faster when
> > > user space sets up a 8-bit linehandle[1]146us (single regmap_read())
> > > vs 172us (pca953x_read_regs()) which incidentally is what we do in our
> > > application. In lazily reading 1 byte at a time it is the fastest
> > > access for that, if user space is always setting up the linehandle for
> > > the whole chip pca953x_read_regs() would be faster. Seeing as
> > > get_multiple has been unimplemented for this chip until now perhaps
> > > our use case deserves some consideration?
> > 
> > I understand completely your goal, but
> > - for I²C expanders timings is the last thing to look for (they are quite slow
> >   by nature), so, I really don't care about 16% speed up for one call; don't
> >   forget that we are in multi-task OS, where this can be easily interrupted and
> >   user will see the result quite after expected quick result
> 
> I didn't do any timing analysis and while I understand your
> argumentation, I'm not sure to agree. I noticed while debugging the
> problem that then resulted in my fix that gpioctl uses the .set_multiple
> callback. I told my customer to use gpioctl instead of /sys/class/gpio
> because it performs better just to notice that when using gpioctl to set
> a single GPIO this transfers five bytes instead of only two.
> 
> Having said that I think the sane approach is to optimize
> .{g,s}et_multiple to reduce the read/write size to the smallest bulk
> size possible that covers all bits that have their corresponding bit set
> in mask.
> 
> > - the code maintenance has a priority over micro-optimization (this driver
> >   suffered many times of breakages because of some optimizations done)
> 
> ack here. Some parts of the driver were harder to grasp than necessary.
> 
> > - it breaks Uwe's approach to fix AI chips, after my patch Uwe's ones are
> >   applied cleanly
> 
> I didn't check, is 96d7c7b3e654 broken for some chips?

I was referring to another call to recalc address with additional parameters,
which your second patch actually gets rid of.

> I will add my suggested optimisation to my todo list for evaluation. If
> I think it is still nice and maintainable I'll send a patch. Until I
> have looked into this (or someone else did) I'm in favour of applying
> Andy's patch.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-21 13:03 ` Andy Shevchenko
@ 2020-04-21 14:31   ` Uwe Kleine-König
  2020-04-21 15:42   ` Bartosz Golaszewski
  1 sibling, 0 replies; 19+ messages in thread
From: Uwe Kleine-König @ 2020-04-21 14:31 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Linus Walleij, Bartosz Golaszewski, linux-gpio, Paul Thomas, kernel

Hello,

On Tue, Apr 21, 2020 at 04:03:00PM +0300, Andy Shevchenko wrote:
> On Mon, Apr 20, 2020 at 08:27:50PM +0300, Andy Shevchenko wrote:
> > The commit 96d7c7b3e654 ("gpio: gpio-pca953x, Add get_multiple function")
> > basically did everything wrong from style and code reuse perspective, i.e.
> > - it didn't utilize existing PCA953x internal helpers
> > - it didn't utilize bitmap API
> > - it misses the point that ilog2(), besides that BANK_SFT is useless,
> >   can be used in macros
> > - it has indentation issues.
> > 
> > Rewrite the function completely.
> 
> Bart, Linus, please, consider this series to be applied, because it has Uwe's fixes.
> We may still discuss the approach with ->get_multiple(), though.
> 
> For the record, should some of us volunteer to be a reviewer for this driver.
> It's awful that almost every release we get something either ugly or broken in it.
> Uwe, would you like to be a designated reviewer (I would also support)?

Yeah that would be fine. I'm not sure how long I will have access to the
hardware I fixed the problem on, but I can see what to do when it
happens.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-21 13:03 ` Andy Shevchenko
  2020-04-21 14:31   ` Uwe Kleine-König
@ 2020-04-21 15:42   ` Bartosz Golaszewski
  2020-04-28 12:13     ` Linus Walleij
  1 sibling, 1 reply; 19+ messages in thread
From: Bartosz Golaszewski @ 2020-04-21 15:42 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Linus Walleij, linux-gpio, Uwe Kleine-König, Paul Thomas

wt., 21 kwi 2020 o 15:03 Andy Shevchenko
<andriy.shevchenko@linux.intel.com> napisał(a):
>
> On Mon, Apr 20, 2020 at 08:27:50PM +0300, Andy Shevchenko wrote:
> > The commit 96d7c7b3e654 ("gpio: gpio-pca953x, Add get_multiple function")
> > basically did everything wrong from style and code reuse perspective, i.e.
> > - it didn't utilize existing PCA953x internal helpers
> > - it didn't utilize bitmap API
> > - it misses the point that ilog2(), besides that BANK_SFT is useless,
> >   can be used in macros
> > - it has indentation issues.
> >
> > Rewrite the function completely.
>
> Bart, Linus, please, consider this series to be applied, because it has Uwe's fixes.
> We may still discuss the approach with ->get_multiple(), though.
>

Personally I like these patches - if Linus doesn't object in the
coming days I'll pick them up into my tree as a follow-up to Paul's
patch.

Bart

> For the record, should some of us volunteer to be a reviewer for this driver.
> It's awful that almost every release we get something either ugly or broken in it.
> Uwe, would you like to be a designated reviewer (I would also support)?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-21 14:21       ` Andy Shevchenko
@ 2020-04-22  4:33         ` Paul Thomas
  2020-04-22  8:34           ` Andy Shevchenko
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Thomas @ 2020-04-22  4:33 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Uwe Kleine-König, Linus Walleij, Bartosz Golaszewski,
	open list:GPIO SUBSYSTEM

Hi Everyone,

On Tue, Apr 21, 2020 at 10:21 AM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Tue, Apr 21, 2020 at 04:06:24PM +0200, Uwe Kleine-König wrote:
> > On Tue, Apr 21, 2020 at 03:55:53PM +0300, Andy Shevchenko wrote:
> > > On Mon, Apr 20, 2020 at 11:23:57PM -0400, Paul Thomas wrote:
> > > > On Mon, Apr 20, 2020 at 1:27 PM Andy Shevchenko
> > > > <andriy.shevchenko@linux.intel.com> wrote:
> > > > >
> > > > > The commit 96d7c7b3e654 ("gpio: gpio-pca953x, Add get_multiple function")
> > > > > basically did everything wrong from style and code reuse perspective, i.e.
> > > > Hi Andy,
> > > >
> > > > Well your version is certainly elegant and simple, and does better
> > > > with code reuse. However there are a couple of other goals I had in
> > > > mind.
> > > > First, the "lazy" approach of 96d7c7b3e654 is actually faster when
> > > > user space sets up a 8-bit linehandle[1]146us (single regmap_read())
> > > > vs 172us (pca953x_read_regs()) which incidentally is what we do in our
> > > > application. In lazily reading 1 byte at a time it is the fastest
> > > > access for that, if user space is always setting up the linehandle for
> > > > the whole chip pca953x_read_regs() would be faster. Seeing as
> > > > get_multiple has been unimplemented for this chip until now perhaps
> > > > our use case deserves some consideration?
> > >
> > > I understand completely your goal, but
> > > - for I²C expanders timings is the last thing to look for (they are quite slow
> > >   by nature), so, I really don't care about 16% speed up for one call; don't
> > >   forget that we are in multi-task OS, where this can be easily interrupted and
> > >   user will see the result quite after expected quick result
Sure, it's not a HUGE deal, but this will get worse for 5 bank chips.
Also, 26us is not insignificant, with the preempt-rt kernel we're
using that can be more than the max interrupt latency.

> >
> > I didn't do any timing analysis and while I understand your
> > argumentation, I'm not sure to agree. I noticed while debugging the
> > problem that then resulted in my fix that gpioctl uses the .set_multiple
> > callback. I told my customer to use gpioctl instead of /sys/class/gpio
> > because it performs better just to notice that when using gpioctl to set
> > a single GPIO this transfers five bytes instead of only two.
> >
> > Having said that I think the sane approach is to optimize
> > .{g,s}et_multiple to reduce the read/write size to the smallest bulk
> > size possible that covers all bits that have their corresponding bit set
> > in mask.
> >
> > > - the code maintenance has a priority over micro-optimization (this driver
> > >   suffered many times of breakages because of some optimizations done)
Yeah, I appreciate that maintainability needs to be a big priority,
I'm wondering if comments & syntax could be improved so that the
general idea of 96d7c7b3e654 is clear and maintainable. It is just
walking through mask, and whenever it gets to a new byte it reads it
from the hardware.

> >
> > ack here. Some parts of the driver were harder to grasp than necessary.
> >
> > > - it breaks Uwe's approach to fix AI chips, after my patch Uwe's ones are
> > >   applied cleanly
> >
> > I didn't check, is 96d7c7b3e654 broken for some chips?
>
> I was referring to another call to recalc address with additional parameters,
> which your second patch actually gets rid of.

If it's just the call to pca953x_recalc_addr() that caused the
conflict with Uwe's work with 96d7c7b3e654, we can just remove the
last two arguments so it matches what pca953x_gpio_get_value() is
doing.

thanks,
Paul

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-22  4:33         ` Paul Thomas
@ 2020-04-22  8:34           ` Andy Shevchenko
  0 siblings, 0 replies; 19+ messages in thread
From: Andy Shevchenko @ 2020-04-22  8:34 UTC (permalink / raw)
  To: Paul Thomas
  Cc: Uwe Kleine-König, Linus Walleij, Bartosz Golaszewski,
	open list:GPIO SUBSYSTEM

On Wed, Apr 22, 2020 at 12:33:34AM -0400, Paul Thomas wrote:
> On Tue, Apr 21, 2020 at 10:21 AM Andy Shevchenko
> <andriy.shevchenko@linux.intel.com> wrote:
> > On Tue, Apr 21, 2020 at 04:06:24PM +0200, Uwe Kleine-König wrote:
> > > On Tue, Apr 21, 2020 at 03:55:53PM +0300, Andy Shevchenko wrote:
> > > > On Mon, Apr 20, 2020 at 11:23:57PM -0400, Paul Thomas wrote:
> > > > > On Mon, Apr 20, 2020 at 1:27 PM Andy Shevchenko
> > > > > <andriy.shevchenko@linux.intel.com> wrote:

...

> > > > I understand completely your goal, but
> > > > - for I²C expanders timings is the last thing to look for (they are quite slow
> > > >   by nature), so, I really don't care about 16% speed up for one call; don't
> > > >   forget that we are in multi-task OS, where this can be easily interrupted and
> > > >   user will see the result quite after expected quick result
> Sure, it's not a HUGE deal, but this will get worse for 5 bank chips.
> Also, 26us is not insignificant, with the preempt-rt kernel we're
> using that can be more than the max interrupt latency.

Using slow buses, no, not just using, but relying on them, in RT environment
seems pretty much bad hardware design. If you have time critical solution,
it should try its best by excluding points of possible failures.

> > > I didn't do any timing analysis and while I understand your
> > > argumentation, I'm not sure to agree. I noticed while debugging the
> > > problem that then resulted in my fix that gpioctl uses the .set_multiple
> > > callback. I told my customer to use gpioctl instead of /sys/class/gpio
> > > because it performs better just to notice that when using gpioctl to set
> > > a single GPIO this transfers five bytes instead of only two.
> > >
> > > Having said that I think the sane approach is to optimize
> > > .{g,s}et_multiple to reduce the read/write size to the smallest bulk
> > > size possible that covers all bits that have their corresponding bit set
> > > in mask.
> > >
> > > > - the code maintenance has a priority over micro-optimization (this driver
> > > >   suffered many times of breakages because of some optimizations done)
> Yeah, I appreciate that maintainability needs to be a big priority,
> I'm wondering if comments & syntax could be improved so that the
> general idea of 96d7c7b3e654 is clear and maintainable. It is just
> walking through mask, and whenever it gets to a new byte it reads it
> from the hardware.

I think the idea per se is worth to consider, though it should not be limited
to the ->get_multiple(), for example reading IRQ status can advantage of this
as well. So, it should be rather some common helper which takes mask as input
parameter and converts it to the set of registers to read. regmap API, IIRC,
has support of sparse reading. I dunno, though, if it is supported by regmap
I²C case.

> > > ack here. Some parts of the driver were harder to grasp than necessary.
> > >
> > > > - it breaks Uwe's approach to fix AI chips, after my patch Uwe's ones are
> > > >   applied cleanly
> > >
> > > I didn't check, is 96d7c7b3e654 broken for some chips?
> >
> > I was referring to another call to recalc address with additional parameters,
> > which your second patch actually gets rid of.
> 
> If it's just the call to pca953x_recalc_addr() that caused the
> conflict with Uwe's work with 96d7c7b3e654, we can just remove the
> last two arguments so it matches what pca953x_gpio_get_value() is
> doing.

Let's rethink entire approach. See above.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-21 15:42   ` Bartosz Golaszewski
@ 2020-04-28 12:13     ` Linus Walleij
  2020-04-28 12:19       ` Uwe Kleine-König
  2020-04-28 12:41       ` Andy Shevchenko
  0 siblings, 2 replies; 19+ messages in thread
From: Linus Walleij @ 2020-04-28 12:13 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: Andy Shevchenko, linux-gpio, Uwe Kleine-König, Paul Thomas

On Tue, Apr 21, 2020 at 5:42 PM Bartosz Golaszewski
<bgolaszewski@baylibre.com> wrote:
> wt., 21 kwi 2020 o 15:03 Andy Shevchenko
> <andriy.shevchenko@linux.intel.com> napisał(a):
> >
> > On Mon, Apr 20, 2020 at 08:27:50PM +0300, Andy Shevchenko wrote:
> > > The commit 96d7c7b3e654 ("gpio: gpio-pca953x, Add get_multiple function")
> > > basically did everything wrong from style and code reuse perspective, i.e.
> > > - it didn't utilize existing PCA953x internal helpers
> > > - it didn't utilize bitmap API
> > > - it misses the point that ilog2(), besides that BANK_SFT is useless,
> > >   can be used in macros
> > > - it has indentation issues.
> > >
> > > Rewrite the function completely.
> >
> > Bart, Linus, please, consider this series to be applied, because it has Uwe's fixes.
> > We may still discuss the approach with ->get_multiple(), though.
> >
>
> Personally I like these patches - if Linus doesn't object in the
> coming days I'll pick them up into my tree as a follow-up to Paul's
> patch.

I don't mind.

I would like that Uwe also agrees that we can merge
these three and use as a base?

I don't mind trying to put in code to optimize use cases
when accessing single bytes here either. But I'd like there
to be explicit comments in the code saying why these
optimizations are there. Can we do those on top of
this (hopefully) known working base?

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-28 12:13     ` Linus Walleij
@ 2020-04-28 12:19       ` Uwe Kleine-König
  2020-04-28 12:41       ` Andy Shevchenko
  1 sibling, 0 replies; 19+ messages in thread
From: Uwe Kleine-König @ 2020-04-28 12:19 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Bartosz Golaszewski, Andy Shevchenko, linux-gpio, Paul Thomas

Hello Linus,

On Tue, Apr 28, 2020 at 02:13:21PM +0200, Linus Walleij wrote:
> I would like that Uwe also agrees that we can merge
> these three and use as a base?

That's fine for me.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-28 12:13     ` Linus Walleij
  2020-04-28 12:19       ` Uwe Kleine-König
@ 2020-04-28 12:41       ` Andy Shevchenko
  2020-04-28 13:09         ` Paul Thomas
  1 sibling, 1 reply; 19+ messages in thread
From: Andy Shevchenko @ 2020-04-28 12:41 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Bartosz Golaszewski, linux-gpio, Uwe Kleine-König, Paul Thomas

On Tue, Apr 28, 2020 at 02:13:21PM +0200, Linus Walleij wrote:
> On Tue, Apr 21, 2020 at 5:42 PM Bartosz Golaszewski
> <bgolaszewski@baylibre.com> wrote:
> > wt., 21 kwi 2020 o 15:03 Andy Shevchenko
> > <andriy.shevchenko@linux.intel.com> napisał(a):
> > > On Mon, Apr 20, 2020 at 08:27:50PM +0300, Andy Shevchenko wrote:

...

> I don't mind trying to put in code to optimize use cases
> when accessing single bytes here either. But I'd like there
> to be explicit comments in the code saying why these
> optimizations are there. Can we do those on top of
> this (hopefully) known working base?

As I pointed out to Paul, the optimization like he proposed is not bad thing
per se, the implementation is. On top of that I suggested to take a look to IRQ
status bits, which presumable will leverage from this optimization as well.

So, After applying this series it would be matter of change one line in the
->get_multiple() to replace read_regs() with optimized version or so along with
IRQ bits changes.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-28 12:41       ` Andy Shevchenko
@ 2020-04-28 13:09         ` Paul Thomas
  2020-04-28 14:18           ` Andy Shevchenko
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Thomas @ 2020-04-28 13:09 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Linus Walleij, Bartosz Golaszewski, linux-gpio, Uwe Kleine-König

On Tue, Apr 28, 2020 at 8:41 AM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Tue, Apr 28, 2020 at 02:13:21PM +0200, Linus Walleij wrote:
> > On Tue, Apr 21, 2020 at 5:42 PM Bartosz Golaszewski
> > <bgolaszewski@baylibre.com> wrote:
> > > wt., 21 kwi 2020 o 15:03 Andy Shevchenko
> > > <andriy.shevchenko@linux.intel.com> napisał(a):
> > > > On Mon, Apr 20, 2020 at 08:27:50PM +0300, Andy Shevchenko wrote:
>
> ...
>
> > I don't mind trying to put in code to optimize use cases
> > when accessing single bytes here either. But I'd like there
> > to be explicit comments in the code saying why these
> > optimizations are there. Can we do those on top of
> > this (hopefully) known working base?
For the record my original get_multiple patch was a known working base.

>
> As I pointed out to Paul, the optimization like he proposed is not bad thing
> per se, the implementation is. On top of that I suggested to take a look to IRQ
> status bits, which presumable will leverage from this optimization as well.
>
> So, After applying this series it would be matter of change one line in the
> ->get_multiple() to replace read_regs() with optimized version or so along with
> IRQ bits changes.

This new function would then at least need to be called with mask as
an additional argument right? Then the bitmap_replace() will set
everything regardless of if it was read, this is fine I suppose
because it doesn't matter if it's setting bits outside of mask. You
just have two loops one in the new function and one in
bitmap_replace(). If this is what people would like to see I can work
on it. I did look into the sparse reads and it seems as though regmap
has a gather_write, but not a gather_read and gather_write isn't in
regmap-i2c.c anyway.

-Paul

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-28 13:09         ` Paul Thomas
@ 2020-04-28 14:18           ` Andy Shevchenko
  2020-04-28 14:58             ` Paul Thomas
  0 siblings, 1 reply; 19+ messages in thread
From: Andy Shevchenko @ 2020-04-28 14:18 UTC (permalink / raw)
  To: Paul Thomas
  Cc: Linus Walleij, Bartosz Golaszewski, linux-gpio, Uwe Kleine-König

On Tue, Apr 28, 2020 at 09:09:03AM -0400, Paul Thomas wrote:
> On Tue, Apr 28, 2020 at 8:41 AM Andy Shevchenko
> <andriy.shevchenko@linux.intel.com> wrote:
> >
> > On Tue, Apr 28, 2020 at 02:13:21PM +0200, Linus Walleij wrote:
> > > On Tue, Apr 21, 2020 at 5:42 PM Bartosz Golaszewski
> > > <bgolaszewski@baylibre.com> wrote:
> > > > wt., 21 kwi 2020 o 15:03 Andy Shevchenko
> > > > <andriy.shevchenko@linux.intel.com> napisał(a):
> > > > > On Mon, Apr 20, 2020 at 08:27:50PM +0300, Andy Shevchenko wrote:
> >
> > ...
> >
> > > I don't mind trying to put in code to optimize use cases
> > > when accessing single bytes here either. But I'd like there
> > > to be explicit comments in the code saying why these
> > > optimizations are there. Can we do those on top of
> > > this (hopefully) known working base?
> For the record my original get_multiple patch was a known working base.
> 
> >
> > As I pointed out to Paul, the optimization like he proposed is not bad thing
> > per se, the implementation is. On top of that I suggested to take a look to IRQ
> > status bits, which presumable will leverage from this optimization as well.
> >
> > So, After applying this series it would be matter of change one line in the
> > ->get_multiple() to replace read_regs() with optimized version or so along with
> > IRQ bits changes.
> 
> This new function would then at least need to be called with mask as
> an additional argument right? Then the bitmap_replace() will set
> everything regardless of if it was read, this is fine I suppose
> because it doesn't matter if it's setting bits outside of mask. You
> just have two loops one in the new function and one in
> bitmap_replace().

Note, on 64-bit architectures there is no loop in bitmap ops (since 40 <= 64 in
this case). On 32-bit it might be (only for 40 case, which I think less present
in the wild than the rest). And bitmap ops are optimized over longs, so, it's
pretty much fast (esp. in comparison to I²C IO).

> If this is what people would like to see I can work
> on it. I did look into the sparse reads and it seems as though regmap
> has a gather_write, but not a gather_read and gather_write isn't in
> regmap-i2c.c anyway.

Yes, I think this is the right way to go.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-28 14:18           ` Andy Shevchenko
@ 2020-04-28 14:58             ` Paul Thomas
  2020-04-29 11:21               ` Bartosz Golaszewski
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Thomas @ 2020-04-28 14:58 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Linus Walleij, Bartosz Golaszewski, linux-gpio, Uwe Kleine-König

On Tue, Apr 28, 2020 at 10:18 AM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Tue, Apr 28, 2020 at 09:09:03AM -0400, Paul Thomas wrote:
> > On Tue, Apr 28, 2020 at 8:41 AM Andy Shevchenko
> > <andriy.shevchenko@linux.intel.com> wrote:
> > >
> > > On Tue, Apr 28, 2020 at 02:13:21PM +0200, Linus Walleij wrote:
> > > > On Tue, Apr 21, 2020 at 5:42 PM Bartosz Golaszewski
> > > > <bgolaszewski@baylibre.com> wrote:
> > > > > wt., 21 kwi 2020 o 15:03 Andy Shevchenko
> > > > > <andriy.shevchenko@linux.intel.com> napisał(a):
> > > > > > On Mon, Apr 20, 2020 at 08:27:50PM +0300, Andy Shevchenko wrote:
> > >
> > > ...
> > >
> > > > I don't mind trying to put in code to optimize use cases
> > > > when accessing single bytes here either. But I'd like there
> > > > to be explicit comments in the code saying why these
> > > > optimizations are there. Can we do those on top of
> > > > this (hopefully) known working base?
> > For the record my original get_multiple patch was a known working base.
> >
> > >
> > > As I pointed out to Paul, the optimization like he proposed is not bad thing
> > > per se, the implementation is. On top of that I suggested to take a look to IRQ
> > > status bits, which presumable will leverage from this optimization as well.
> > >
> > > So, After applying this series it would be matter of change one line in the
> > > ->get_multiple() to replace read_regs() with optimized version or so along with
> > > IRQ bits changes.
> >
> > This new function would then at least need to be called with mask as
> > an additional argument right? Then the bitmap_replace() will set
> > everything regardless of if it was read, this is fine I suppose
> > because it doesn't matter if it's setting bits outside of mask. You
> > just have two loops one in the new function and one in
> > bitmap_replace().
>
> Note, on 64-bit architectures there is no loop in bitmap ops (since 40 <= 64 in
> this case). On 32-bit it might be (only for 40 case, which I think less present
> in the wild than the rest). And bitmap ops are optimized over longs, so, it's
> pretty much fast (esp. in comparison to I²C IO).
>
> > If this is what people would like to see I can work
> > on it. I did look into the sparse reads and it seems as though regmap
> > has a gather_write, but not a gather_read and gather_write isn't in
> > regmap-i2c.c anyway.
>
> Yes, I think this is the right way to go.
OK, sounds good.

-Paul

>
> --
> With Best Regards,
> Andy Shevchenko
>
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function
  2020-04-28 14:58             ` Paul Thomas
@ 2020-04-29 11:21               ` Bartosz Golaszewski
  0 siblings, 0 replies; 19+ messages in thread
From: Bartosz Golaszewski @ 2020-04-29 11:21 UTC (permalink / raw)
  To: Paul Thomas
  Cc: Andy Shevchenko, Linus Walleij, linux-gpio, Uwe Kleine-König

wt., 28 kwi 2020 o 16:58 Paul Thomas <pthomas8589@gmail.com> napisał(a):
>
> On Tue, Apr 28, 2020 at 10:18 AM Andy Shevchenko
> <andriy.shevchenko@linux.intel.com> wrote:
> >
> > On Tue, Apr 28, 2020 at 09:09:03AM -0400, Paul Thomas wrote:
> > > On Tue, Apr 28, 2020 at 8:41 AM Andy Shevchenko
> > > <andriy.shevchenko@linux.intel.com> wrote:
> > > >
> > > > On Tue, Apr 28, 2020 at 02:13:21PM +0200, Linus Walleij wrote:
> > > > > On Tue, Apr 21, 2020 at 5:42 PM Bartosz Golaszewski
> > > > > <bgolaszewski@baylibre.com> wrote:
> > > > > > wt., 21 kwi 2020 o 15:03 Andy Shevchenko
> > > > > > <andriy.shevchenko@linux.intel.com> napisał(a):
> > > > > > > On Mon, Apr 20, 2020 at 08:27:50PM +0300, Andy Shevchenko wrote:
> > > >
> > > > ...
> > > >
> > > > > I don't mind trying to put in code to optimize use cases
> > > > > when accessing single bytes here either. But I'd like there
> > > > > to be explicit comments in the code saying why these
> > > > > optimizations are there. Can we do those on top of
> > > > > this (hopefully) known working base?
> > > For the record my original get_multiple patch was a known working base.
> > >
> > > >
> > > > As I pointed out to Paul, the optimization like he proposed is not bad thing
> > > > per se, the implementation is. On top of that I suggested to take a look to IRQ
> > > > status bits, which presumable will leverage from this optimization as well.
> > > >
> > > > So, After applying this series it would be matter of change one line in the
> > > > ->get_multiple() to replace read_regs() with optimized version or so along with
> > > > IRQ bits changes.
> > >
> > > This new function would then at least need to be called with mask as
> > > an additional argument right? Then the bitmap_replace() will set
> > > everything regardless of if it was read, this is fine I suppose
> > > because it doesn't matter if it's setting bits outside of mask. You
> > > just have two loops one in the new function and one in
> > > bitmap_replace().
> >
> > Note, on 64-bit architectures there is no loop in bitmap ops (since 40 <= 64 in
> > this case). On 32-bit it might be (only for 40 case, which I think less present
> > in the wild than the rest). And bitmap ops are optimized over longs, so, it's
> > pretty much fast (esp. in comparison to I²C IO).
> >
> > > If this is what people would like to see I can work
> > > on it. I did look into the sparse reads and it seems as though regmap
> > > has a gather_write, but not a gather_read and gather_write isn't in
> > > regmap-i2c.c anyway.
> >
> > Yes, I think this is the right way to go.
> OK, sounds good.
>

I applied the whole series to my for-next branch. I'll soon send a PR to Linus.

Bart

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2020-04-29 11:21 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-20 17:27 [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function Andy Shevchenko
2020-04-20 17:27 ` [PATCH v3 2/3] gpio: pca953x: fix handling of automatic address incrementing Andy Shevchenko
2020-04-20 17:27 ` [PATCH v3 3/3] gpio: pca953x: drop unused parameters of pca953x_recalc_addr() Andy Shevchenko
2020-04-21  3:23 ` [PATCH v3 1/3] gpio: pca953x: Rewrite ->get_multiple() function Paul Thomas
2020-04-21 12:55   ` Andy Shevchenko
2020-04-21 14:06     ` Uwe Kleine-König
2020-04-21 14:21       ` Andy Shevchenko
2020-04-22  4:33         ` Paul Thomas
2020-04-22  8:34           ` Andy Shevchenko
2020-04-21 13:03 ` Andy Shevchenko
2020-04-21 14:31   ` Uwe Kleine-König
2020-04-21 15:42   ` Bartosz Golaszewski
2020-04-28 12:13     ` Linus Walleij
2020-04-28 12:19       ` Uwe Kleine-König
2020-04-28 12:41       ` Andy Shevchenko
2020-04-28 13:09         ` Paul Thomas
2020-04-28 14:18           ` Andy Shevchenko
2020-04-28 14:58             ` Paul Thomas
2020-04-29 11:21               ` Bartosz Golaszewski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.