All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-06 10:37 ` Oskar Schirmer
  0 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-06 10:37 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Mike Frysinger, Andrew Morton, linux-input, linux-kernel,
	Oskar Schirmer, Daniel Glöckner, Oliver Schneidewind

With no word size given in the users platform data,
a generic spi host controller driver will assume a default
word size of eight bit. This causes transmission to be
performed bytewise, which will fail on little endian
machines for sure. Failure on big endian depends on usage
of slave select to mark word boundaries.

Anyway, ad7877 is specified to work with 16 bit per word,
so unconditionally set the word size accordingly.

Signed-off-by: Oskar Schirmer <os@emlix.com>
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Oliver Schneidewind <osw@emlix.com>
---
 drivers/input/touchscreen/ad7877.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
index e019d53..92acd85 100644
--- a/drivers/input/touchscreen/ad7877.c
+++ b/drivers/input/touchscreen/ad7877.c
@@ -669,6 +669,8 @@ static int __devinit ad7877_probe(struct spi_device *spi)
 		dev_dbg(&spi->dev, "SPI CLK %d Hz?\n",spi->max_speed_hz);
 		return -EINVAL;
 	}
+	spi->bits_per_word = 16;
+	spi_setup(spi);
 
 	ts = kzalloc(sizeof(struct ad7877), GFP_KERNEL);
 	input_dev = input_allocate_device();
-- 
1.5.3.7


^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-06 10:37 ` Oskar Schirmer
  0 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-06 10:37 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Mike Frysinger, Andrew Morton, linux-input, linux-kernel,
	Oskar Schirmer, Daniel Glöckner, Oliver Schneidewind

With no word size given in the users platform data,
a generic spi host controller driver will assume a default
word size of eight bit. This causes transmission to be
performed bytewise, which will fail on little endian
machines for sure. Failure on big endian depends on usage
of slave select to mark word boundaries.

Anyway, ad7877 is specified to work with 16 bit per word,
so unconditionally set the word size accordingly.

Signed-off-by: Oskar Schirmer <os@emlix.com>
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Oliver Schneidewind <osw@emlix.com>
---
 drivers/input/touchscreen/ad7877.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
index e019d53..92acd85 100644
--- a/drivers/input/touchscreen/ad7877.c
+++ b/drivers/input/touchscreen/ad7877.c
@@ -669,6 +669,8 @@ static int __devinit ad7877_probe(struct spi_device *spi)
 		dev_dbg(&spi->dev, "SPI CLK %d Hz?\n",spi->max_speed_hz);
 		return -EINVAL;
 	}
+	spi->bits_per_word = 16;
+	spi_setup(spi);
 
 	ts = kzalloc(sizeof(struct ad7877), GFP_KERNEL);
 	input_dev = input_allocate_device();
-- 
1.5.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-06 10:37 ` Oskar Schirmer
  (?)
@ 2010-05-06 10:37 ` Oskar Schirmer
  2010-05-06 18:46     ` Mike Frysinger
  2010-05-10 10:34     ` Oskar Schirmer
  -1 siblings, 2 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-06 10:37 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Mike Frysinger, Andrew Morton, linux-input, linux-kernel,
	Oskar Schirmer, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

With dma based spi transmission, data corruption
is observed occasionally. With dma buffers located
right next to msg and xfer fields, cache lines
correctly flushed in preparation for dma usage
may be polluted again when writing to fields
in the same cache line.

Make sure cache fields used with dma do not
share cache lines with fields changed during
dma handling. As both fields are part of a
struct that is allocated via kzalloc, thus
cache aligned, moving the fields to the 1st
position and insert padding for alignment
does the job.

Signed-off-by: Oskar Schirmer <os@emlix.com>
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Oliver Schneidewind <osw@emlix.com>
Signed-off-by: Johannes Weiner <jw@emlix.com>
---
 drivers/input/touchscreen/ad7877.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
index 92acd85..9719368 100644
--- a/drivers/input/touchscreen/ad7877.c
+++ b/drivers/input/touchscreen/ad7877.c
@@ -153,15 +153,21 @@ enum {
  */
 
 struct ser_req {
+	u16			sample;
+	char			__padalign[L1_CACHE_BYTES - sizeof(u16)];
+
 	u16			reset;
 	u16			ref_on;
 	u16			command;
-	u16			sample;
 	struct spi_message	msg;
 	struct spi_transfer	xfer[6];
 };
 
 struct ad7877 {
+	u16			conversion_data[AD7877_NR_SENSE];
+	char			__padalign[L1_CACHE_BYTES
+					- AD7877_NR_SENSE * sizeof(u16)];
+
 	struct input_dev	*input;
 	char			phys[32];
 
@@ -182,8 +188,6 @@ struct ad7877 {
 	u8			averaging;
 	u8			pen_down_acc_interval;
 
-	u16			conversion_data[AD7877_NR_SENSE];
-
 	struct spi_transfer	xfer[AD7877_NR_SENSE + 2];
 	struct spi_message	msg;
 
-- 
1.5.3.7


^ permalink raw reply related	[flat|nested] 124+ messages in thread

* RE: [PATCH] ad7877: fix spi word size to 16 bit
  2010-05-06 10:37 ` Oskar Schirmer
@ 2010-05-06 11:18   ` Hennerich, Michael
  -1 siblings, 0 replies; 124+ messages in thread
From: Hennerich, Michael @ 2010-05-06 11:18 UTC (permalink / raw)
  To: Oskar Schirmer, Dmitry Torokhov
  Cc: Mike Frysinger, Andrew Morton, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1742 bytes --]

Oskar Schirmer wrote on 2010-05-06:
> With no word size given in the users platform data, a generic spi host
> controller driver will assume a default word size of eight bit. This
> causes transmission to be performed bytewise, which will fail on
> little endian machines for sure. Failure on big endian depends on
> usage of slave select to mark word boundaries.
>
> Anyway, ad7877 is specified to work with 16 bit per word, so
> unconditionally set the word size accordingly.
>
> Signed-off-by: Oskar Schirmer <os@emlix.com>
> Signed-off-by: Daniel Glöckner <dg@emlix.com>
> Signed-off-by: Oliver Schneidewind <osw@emlix.com>

Acked-by: Michael Hennerich <michael.hennerich@analog.com>

> ---
>  drivers/input/touchscreen/ad7877.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> diff --git a/drivers/input/touchscreen/ad7877.c
> b/drivers/input/touchscreen/ad7877.c index e019d53..92acd85 100644 ---
> a/drivers/input/touchscreen/ad7877.c +++
> b/drivers/input/touchscreen/ad7877.c @@ -669,6 +669,8 @@ static int
> __devinit ad7877_probe(struct spi_device *spi)
>               dev_dbg(&spi->dev, "SPI CLK %d Hz?\n",spi->max_speed_hz);
>               return -EINVAL;
>       }
> +     spi->bits_per_word = 16;
> +     spi_setup(spi);
>
>       ts = kzalloc(sizeof(struct ad7877), GFP_KERNEL);
>       input_dev = input_allocate_device();

Greetings,
Michael

Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 4036 Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 124+ messages in thread

* RE: [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-06 11:18   ` Hennerich, Michael
  0 siblings, 0 replies; 124+ messages in thread
From: Hennerich, Michael @ 2010-05-06 11:18 UTC (permalink / raw)
  To: Oskar Schirmer, Dmitry Torokhov
  Cc: Mike Frysinger, Andrew Morton, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind

Oskar Schirmer wrote on 2010-05-06:
> With no word size given in the users platform data, a generic spi host
> controller driver will assume a default word size of eight bit. This
> causes transmission to be performed bytewise, which will fail on
> little endian machines for sure. Failure on big endian depends on
> usage of slave select to mark word boundaries.
>
> Anyway, ad7877 is specified to work with 16 bit per word, so
> unconditionally set the word size accordingly.
>
> Signed-off-by: Oskar Schirmer <os@emlix.com>
> Signed-off-by: Daniel Glöckner <dg@emlix.com>
> Signed-off-by: Oliver Schneidewind <osw@emlix.com>

Acked-by: Michael Hennerich <michael.hennerich@analog.com>

> ---
>  drivers/input/touchscreen/ad7877.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> diff --git a/drivers/input/touchscreen/ad7877.c
> b/drivers/input/touchscreen/ad7877.c index e019d53..92acd85 100644 ---
> a/drivers/input/touchscreen/ad7877.c +++
> b/drivers/input/touchscreen/ad7877.c @@ -669,6 +669,8 @@ static int
> __devinit ad7877_probe(struct spi_device *spi)
>               dev_dbg(&spi->dev, "SPI CLK %d Hz?\n",spi->max_speed_hz);
>               return -EINVAL;
>       }
> +     spi->bits_per_word = 16;
> +     spi_setup(spi);
>
>       ts = kzalloc(sizeof(struct ad7877), GFP_KERNEL);
>       input_dev = input_allocate_device();

Greetings,
Michael

Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 4036 Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif


^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
  2010-05-06 10:37 ` Oskar Schirmer
@ 2010-05-06 18:26   ` Mike Frysinger
  -1 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-06 18:26 UTC (permalink / raw)
  To: Oskar Schirmer
  Cc: Dmitry Torokhov, Andrew Morton, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Michael Hennerich

On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
> --- a/drivers/input/touchscreen/ad7877.c
> +++ b/drivers/input/touchscreen/ad7877.c
> @@ -669,6 +669,8 @@ static int __devinit ad7877_probe(struct spi_device *spi)
>                dev_dbg(&spi->dev, "SPI CLK %d Hz?\n",spi->max_speed_hz);
>                return -EINVAL;
>        }
> +       spi->bits_per_word = 16;
> +       spi_setup(spi);

i think it'd be a better idea to do something like:
  if (spi->bits_per_word != 16) {
    if (spi->bits_per_word) {
      dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
      return -EINVAL;
    }
    spi->bits_per_word = 16;
    spi_setup(spi);
  }
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-06 18:26   ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-06 18:26 UTC (permalink / raw)
  To: Oskar Schirmer
  Cc: Dmitry Torokhov, Andrew Morton, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Michael Hennerich

On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
> --- a/drivers/input/touchscreen/ad7877.c
> +++ b/drivers/input/touchscreen/ad7877.c
> @@ -669,6 +669,8 @@ static int __devinit ad7877_probe(struct spi_device *spi)
>                dev_dbg(&spi->dev, "SPI CLK %d Hz?\n",spi->max_speed_hz);
>                return -EINVAL;
>        }
> +       spi->bits_per_word = 16;
> +       spi_setup(spi);

i think it'd be a better idea to do something like:
  if (spi->bits_per_word != 16) {
    if (spi->bits_per_word) {
      dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
      return -EINVAL;
    }
    spi->bits_per_word = 16;
    spi_setup(spi);
  }
-mike
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-06 10:37 ` [PATCH] ad7877: keep dma rx buffers in seperate cache lines Oskar Schirmer
@ 2010-05-06 18:46     ` Mike Frysinger
  2010-05-10 10:34     ` Oskar Schirmer
  1 sibling, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-06 18:46 UTC (permalink / raw)
  To: Oskar Schirmer
  Cc: Dmitry Torokhov, Andrew Morton, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 1957 bytes --]

On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:>  struct ser_req {> +       u16                     sample;> +       char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];> +>        u16                     reset;>        u16                     ref_on;>        u16                     command;> -       u16                     sample;>        struct spi_message      msg;>        struct spi_transfer     xfer[6];>  };
are you sure this is necessary ?  ser_req is only ever used withspi_sync() and it's allocated/released on the fly, so how couldanything be reading that memory between the start of the transmissionand the return to adi7877 ?
>  struct ad7877 {> +       u16                     conversion_data[AD7877_NR_SENSE];> +       char                    __padalign[L1_CACHE_BYTES> +                                       - AD7877_NR_SENSE * sizeof(u16)];> +>        struct input_dev        *input;>        char                    phys[32];>> @@ -182,8 +188,6 @@ struct ad7877 {>        u8                      averaging;>        u8                      pen_down_acc_interval;>> -       u16                     conversion_data[AD7877_NR_SENSE];> ->        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];>        struct spi_message      msg;
i can see the spi_message inside of this struct being a problembecause the spi transfer is doing asynchronously with spi_async().however, i would add a comment right above these two fields with ashort explanation as to why they're at the start and why the padexists so someone down the line doesnt move it.-mikeÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-06 18:46     ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-06 18:46 UTC (permalink / raw)
  To: Oskar Schirmer
  Cc: Dmitry Torokhov, Andrew Morton, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner

On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
>  struct ser_req {
> +       u16                     sample;
> +       char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
> +
>        u16                     reset;
>        u16                     ref_on;
>        u16                     command;
> -       u16                     sample;
>        struct spi_message      msg;
>        struct spi_transfer     xfer[6];
>  };

are you sure this is necessary ?  ser_req is only ever used with
spi_sync() and it's allocated/released on the fly, so how could
anything be reading that memory between the start of the transmission
and the return to adi7877 ?

>  struct ad7877 {
> +       u16                     conversion_data[AD7877_NR_SENSE];
> +       char                    __padalign[L1_CACHE_BYTES
> +                                       - AD7877_NR_SENSE * sizeof(u16)];
> +
>        struct input_dev        *input;
>        char                    phys[32];
>
> @@ -182,8 +188,6 @@ struct ad7877 {
>        u8                      averaging;
>        u8                      pen_down_acc_interval;
>
> -       u16                     conversion_data[AD7877_NR_SENSE];
> -
>        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
>        struct spi_message      msg;

i can see the spi_message inside of this struct being a problem
because the spi transfer is doing asynchronously with spi_async().
however, i would add a comment right above these two fields with a
short explanation as to why they're at the start and why the pad
exists so someone down the line doesnt move it.
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
  2010-05-06 18:26   ` Mike Frysinger
@ 2010-05-07  9:41     ` Daniel Glöckner
  -1 siblings, 0 replies; 124+ messages in thread
From: Daniel Glöckner @ 2010-05-07  9:41 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Oskar Schirmer, Dmitry Torokhov, Andrew Morton, linux-input,
	linux-kernel, Oliver Schneidewind, Michael Hennerich

On 05/06/2010 08:26 PM, Mike Frysinger wrote:
> i think it'd be a better idea to do something like:
>   if (spi->bits_per_word != 16) {
>     if (spi->bits_per_word) {
>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
>       return -EINVAL;
>     }
>     spi->bits_per_word = 16;
>     spi_setup(spi);
>   }

There is no way to set bits_per_word using struct spi_board_info. The
description of that structure in spi.h explicitly lists the wordsize as
one of the parameters drivers should set themself in probe().

Only struct bfin5xx_spi_chip allows to set this value in the board code.

  Daniel


-- 
Dipl.-Math. Daniel Glöckner, emlix GmbH, http://www.emlix.com
Fon +49 551 30664-0, Fax -11, Bahnhofsallee 1b, 37081 Göttingen, Germany
Sitz der Gesellschaft: Göttingen, Amtsgericht Göttingen HR B 3160
Geschäftsführer: Dr. Uwe Kracke, Ust-IdNr.: DE 205 198 055

emlix - your embedded linux partner

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-07  9:41     ` Daniel Glöckner
  0 siblings, 0 replies; 124+ messages in thread
From: Daniel Glöckner @ 2010-05-07  9:41 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Oskar Schirmer, Dmitry Torokhov, Andrew Morton, linux-input,
	linux-kernel, Oliver Schneidewind, Michael Hennerich

On 05/06/2010 08:26 PM, Mike Frysinger wrote:
> i think it'd be a better idea to do something like:
>   if (spi->bits_per_word != 16) {
>     if (spi->bits_per_word) {
>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
>       return -EINVAL;
>     }
>     spi->bits_per_word = 16;
>     spi_setup(spi);
>   }

There is no way to set bits_per_word using struct spi_board_info. The
description of that structure in spi.h explicitly lists the wordsize as
one of the parameters drivers should set themself in probe().

Only struct bfin5xx_spi_chip allows to set this value in the board code.

  Daniel


-- 
Dipl.-Math. Daniel Glöckner, emlix GmbH, http://www.emlix.com
Fon +49 551 30664-0, Fax -11, Bahnhofsallee 1b, 37081 Göttingen, Germany
Sitz der Gesellschaft: Göttingen, Amtsgericht Göttingen HR B 3160
Geschäftsführer: Dr. Uwe Kracke, Ust-IdNr.: DE 205 198 055

emlix - your embedded linux partner
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-06 18:46     ` Mike Frysinger
@ 2010-05-07 10:15       ` Oskar Schirmer
  -1 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-07 10:15 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Oskar Schirmer, Dmitry Torokhov, Andrew Morton, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Thu, May 06, 2010 at 14:46:04 -0400, Mike Frysinger wrote:
> On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
> >  struct ser_req {
> > +       u16                     sample;
> > +       char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
> > +
> >        u16                     reset;
> >        u16                     ref_on;
> >        u16                     command;
> > -       u16                     sample;
> >        struct spi_message      msg;
> >        struct spi_transfer     xfer[6];
> >  };
> 
> are you sure this is necessary ?  ser_req is only ever used with
> spi_sync() and it's allocated/released on the fly, so how could
> anything be reading that memory between the start of the transmission
> and the return to adi7877 ?

msg is handed over to spi_sync, it contains the addresses
which will be used to programme the DMA: the spi master
transfer function will read these fields to start DMA.

E.g., drivers/spi/atmel_spi.c, assures cache coherency
in function atmel_spi_dma_map_xfer, one xfer at a time.
Multiple transfers are worked on in a loop in atmel_spi_transfer,
so when coherency for transfer #1 is ensured, addresses
for transfer #2 are read from the msg and xfer structs,
caching lines which have been just before invalidated.

And, citing Documentation/DMA-API.txt, Part Id:
"mapped region must begin exactly on a cache line
boundary and end exactly on one", which is our case.

> 
> >  struct ad7877 {
> > +       u16                     conversion_data[AD7877_NR_SENSE];
> > +       char                    __padalign[L1_CACHE_BYTES
> > +                                       - AD7877_NR_SENSE * sizeof(u16)];
> > +
> >        struct input_dev        *input;
> >        char                    phys[32];
> >
> > @@ -182,8 +188,6 @@ struct ad7877 {
> >        u8                      averaging;
> >        u8                      pen_down_acc_interval;
> >
> > -       u16                     conversion_data[AD7877_NR_SENSE];
> > -
> >        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
> >        struct spi_message      msg;
> 
> i can see the spi_message inside of this struct being a problem
> because the spi transfer is doing asynchronously with spi_async().
> however, i would add a comment right above these two fields with a
> short explanation as to why they're at the start and why the pad
> exists so someone down the line doesnt move it.

The code says "pad to align according to L1 cache, and
keep away other stuff by exactly the amount so it is
off the line". I'ld guess a comment would repeat just
this, so it is superfluous. But if opinions differ on
this topic, we can have a comment added, sure.

  Oskar
-- 
oskar schirmer, emlix gmbh, http://www.emlix.com
fon +49 551 30664-0, fax -11, bahnhofsallee 1b, 37081 göttingen, germany
sitz der gesellschaft: göttingen, amtsgericht göttingen hr b 3160
geschäftsführer: dr. uwe kracke, ust-idnr.: de 205 198 055

emlix - your embedded linux partner

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-07 10:15       ` Oskar Schirmer
  0 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-07 10:15 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Oskar Schirmer, Dmitry Torokhov, Andrew Morton, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Thu, May 06, 2010 at 14:46:04 -0400, Mike Frysinger wrote:
> On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
> >  struct ser_req {
> > +       u16                     sample;
> > +       char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
> > +
> >        u16                     reset;
> >        u16                     ref_on;
> >        u16                     command;
> > -       u16                     sample;
> >        struct spi_message      msg;
> >        struct spi_transfer     xfer[6];
> >  };
> 
> are you sure this is necessary ?  ser_req is only ever used with
> spi_sync() and it's allocated/released on the fly, so how could
> anything be reading that memory between the start of the transmission
> and the return to adi7877 ?

msg is handed over to spi_sync, it contains the addresses
which will be used to programme the DMA: the spi master
transfer function will read these fields to start DMA.

E.g., drivers/spi/atmel_spi.c, assures cache coherency
in function atmel_spi_dma_map_xfer, one xfer at a time.
Multiple transfers are worked on in a loop in atmel_spi_transfer,
so when coherency for transfer #1 is ensured, addresses
for transfer #2 are read from the msg and xfer structs,
caching lines which have been just before invalidated.

And, citing Documentation/DMA-API.txt, Part Id:
"mapped region must begin exactly on a cache line
boundary and end exactly on one", which is our case.

> 
> >  struct ad7877 {
> > +       u16                     conversion_data[AD7877_NR_SENSE];
> > +       char                    __padalign[L1_CACHE_BYTES
> > +                                       - AD7877_NR_SENSE * sizeof(u16)];
> > +
> >        struct input_dev        *input;
> >        char                    phys[32];
> >
> > @@ -182,8 +188,6 @@ struct ad7877 {
> >        u8                      averaging;
> >        u8                      pen_down_acc_interval;
> >
> > -       u16                     conversion_data[AD7877_NR_SENSE];
> > -
> >        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
> >        struct spi_message      msg;
> 
> i can see the spi_message inside of this struct being a problem
> because the spi transfer is doing asynchronously with spi_async().
> however, i would add a comment right above these two fields with a
> short explanation as to why they're at the start and why the pad
> exists so someone down the line doesnt move it.

The code says "pad to align according to L1 cache, and
keep away other stuff by exactly the amount so it is
off the line". I'ld guess a comment would repeat just
this, so it is superfluous. But if opinions differ on
this topic, we can have a comment added, sure.

  Oskar
-- 
oskar schirmer, emlix gmbh, http://www.emlix.com
fon +49 551 30664-0, fax -11, bahnhofsallee 1b, 37081 göttingen, germany
sitz der gesellschaft: göttingen, amtsgericht göttingen hr b 3160
geschäftsführer: dr. uwe kracke, ust-idnr.: de 205 198 055

emlix - your embedded linux partner
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-06 18:46     ` Mike Frysinger
  (?)
  (?)
@ 2010-05-07 12:07     ` Johannes Weiner
  -1 siblings, 0 replies; 124+ messages in thread
From: Johannes Weiner @ 2010-05-07 12:07 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Oskar Schirmer, Dmitry Torokhov, Andrew Morton, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Thu, May 06, 2010 at 02:46:04PM -0400, Mike Frysinger wrote:
> On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
> >  struct ser_req {
> > +       u16                     sample;
> > +       char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
> > +
> >        u16                     reset;
> >        u16                     ref_on;
> >        u16                     command;
> > -       u16                     sample;
> >        struct spi_message      msg;
> >        struct spi_transfer     xfer[6];
> >  };
> 
> are you sure this is necessary ?  ser_req is only ever used with
> spi_sync() and it's allocated/released on the fly, so how could
> anything be reading that memory between the start of the transmission
> and the return to adi7877 ?

The master driver can.  atmel_spi flushes the cache of the buffers
pretty early when queuing a message and touches the message members
afterwards.

> >  struct ad7877 {
> > +       u16                     conversion_data[AD7877_NR_SENSE];
> > +       char                    __padalign[L1_CACHE_BYTES
> > +                                       - AD7877_NR_SENSE * sizeof(u16)];
> > +
> >        struct input_dev        *input;
> >        char                    phys[32];
> >
> > @@ -182,8 +188,6 @@ struct ad7877 {
> >        u8                      averaging;
> >        u8                      pen_down_acc_interval;
> >
> > -       u16                     conversion_data[AD7877_NR_SENSE];
> > -
> >        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
> >        struct spi_message      msg;
> 
> i can see the spi_message inside of this struct being a problem
> because the spi transfer is doing asynchronously with spi_async().
> however, i would add a comment right above these two fields with a
> short explanation as to why they're at the start and why the pad
> exists so someone down the line doesnt move it.

Good idea.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
  2010-05-07  9:41     ` Daniel Glöckner
@ 2010-05-07 18:23       ` Mike Frysinger
  -1 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-07 18:23 UTC (permalink / raw)
  To: Daniel Glöckner
  Cc: Oskar Schirmer, Dmitry Torokhov, Andrew Morton, linux-input,
	linux-kernel, Oliver Schneidewind, Michael Hennerich

On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
> On 05/06/2010 08:26 PM, Mike Frysinger wrote:
>> i think it'd be a better idea to do something like:
>>   if (spi->bits_per_word != 16) {
>>     if (spi->bits_per_word) {
>>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
>>       return -EINVAL;
>>     }
>>     spi->bits_per_word = 16;
>>     spi_setup(spi);
>>   }
>
> There is no way to set bits_per_word using struct spi_board_info. The
> description of that structure in spi.h explicitly lists the wordsize as
> one of the parameters drivers should set themself in probe().
>
> Only struct bfin5xx_spi_chip allows to set this value in the board code.

an obvious shortcoming in the SPI framework that should be fixed, but
that doesnt make any difference to the above code now does it ?  it'll
operate correctly regardless of the SPI bus master.
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-07 18:23       ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-07 18:23 UTC (permalink / raw)
  To: Daniel Glöckner
  Cc: Oskar Schirmer, Dmitry Torokhov, Andrew Morton, linux-input,
	linux-kernel, Oliver Schneidewind, Michael Hennerich

On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
> On 05/06/2010 08:26 PM, Mike Frysinger wrote:
>> i think it'd be a better idea to do something like:
>>   if (spi->bits_per_word != 16) {
>>     if (spi->bits_per_word) {
>>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
>>       return -EINVAL;
>>     }
>>     spi->bits_per_word = 16;
>>     spi_setup(spi);
>>   }
>
> There is no way to set bits_per_word using struct spi_board_info. The
> description of that structure in spi.h explicitly lists the wordsize as
> one of the parameters drivers should set themself in probe().
>
> Only struct bfin5xx_spi_chip allows to set this value in the board code.

an obvious shortcoming in the SPI framework that should be fixed, but
that doesnt make any difference to the above code now does it ?  it'll
operate correctly regardless of the SPI bus master.
-mike
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-07 10:15       ` Oskar Schirmer
@ 2010-05-07 18:28         ` Mike Frysinger
  -1 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-07 18:28 UTC (permalink / raw)
  To: Oskar Schirmer
  Cc: Dmitry Torokhov, Andrew Morton, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner

On Fri, May 7, 2010 at 06:15, Oskar Schirmer wrote:
> On Thu, May 06, 2010 at 14:46:04 -0400, Mike Frysinger wrote:
>> On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
>> >  struct ser_req {
>> > +       u16                     sample;
>> > +       char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
>> > +
>> >        u16                     reset;
>> >        u16                     ref_on;
>> >        u16                     command;
>> > -       u16                     sample;
>> >        struct spi_message      msg;
>> >        struct spi_transfer     xfer[6];
>> >  };
>>
>> are you sure this is necessary ?  ser_req is only ever used with
>> spi_sync() and it's allocated/released on the fly, so how could
>> anything be reading that memory between the start of the transmission
>> and the return to adi7877 ?
>
> msg is handed over to spi_sync, it contains the addresses
> which will be used to programme the DMA: the spi master
> transfer function will read these fields to start DMA.

so the issue is coming from the SPI master drivers and not the AD7877 driver

>> >  struct ad7877 {
>> > +       u16                     conversion_data[AD7877_NR_SENSE];
>> > +       char                    __padalign[L1_CACHE_BYTES
>> > +                                       - AD7877_NR_SENSE * sizeof(u16)];
>> > +
>> >        struct input_dev        *input;
>> >        char                    phys[32];
>> >
>> > @@ -182,8 +188,6 @@ struct ad7877 {
>> >        u8                      averaging;
>> >        u8                      pen_down_acc_interval;
>> >
>> > -       u16                     conversion_data[AD7877_NR_SENSE];
>> > -
>> >        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
>> >        struct spi_message      msg;
>>
>> i can see the spi_message inside of this struct being a problem
>> because the spi transfer is doing asynchronously with spi_async().
>> however, i would add a comment right above these two fields with a
>> short explanation as to why they're at the start and why the pad
>> exists so someone down the line doesnt move it.
>
> The code says "pad to align according to L1 cache, and
> keep away other stuff by exactly the amount so it is
> off the line". I'ld guess a comment would repeat just
> this, so it is superfluous. But if opinions differ on
> this topic, we can have a comment added, sure.

not everyone knows to read every single piece of documentation that
may or may not be affected implicitly in the call stack.  a simple
comment here is not superfluous.

since the other struct is also going to be changed, a comment should
be placed there as well.
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-07 18:28         ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-07 18:28 UTC (permalink / raw)
  To: Oskar Schirmer
  Cc: Dmitry Torokhov, Andrew Morton, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner

On Fri, May 7, 2010 at 06:15, Oskar Schirmer wrote:
> On Thu, May 06, 2010 at 14:46:04 -0400, Mike Frysinger wrote:
>> On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
>> >  struct ser_req {
>> > +       u16                     sample;
>> > +       char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
>> > +
>> >        u16                     reset;
>> >        u16                     ref_on;
>> >        u16                     command;
>> > -       u16                     sample;
>> >        struct spi_message      msg;
>> >        struct spi_transfer     xfer[6];
>> >  };
>>
>> are you sure this is necessary ?  ser_req is only ever used with
>> spi_sync() and it's allocated/released on the fly, so how could
>> anything be reading that memory between the start of the transmission
>> and the return to adi7877 ?
>
> msg is handed over to spi_sync, it contains the addresses
> which will be used to programme the DMA: the spi master
> transfer function will read these fields to start DMA.

so the issue is coming from the SPI master drivers and not the AD7877 driver

>> >  struct ad7877 {
>> > +       u16                     conversion_data[AD7877_NR_SENSE];
>> > +       char                    __padalign[L1_CACHE_BYTES
>> > +                                       - AD7877_NR_SENSE * sizeof(u16)];
>> > +
>> >        struct input_dev        *input;
>> >        char                    phys[32];
>> >
>> > @@ -182,8 +188,6 @@ struct ad7877 {
>> >        u8                      averaging;
>> >        u8                      pen_down_acc_interval;
>> >
>> > -       u16                     conversion_data[AD7877_NR_SENSE];
>> > -
>> >        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
>> >        struct spi_message      msg;
>>
>> i can see the spi_message inside of this struct being a problem
>> because the spi transfer is doing asynchronously with spi_async().
>> however, i would add a comment right above these two fields with a
>> short explanation as to why they're at the start and why the pad
>> exists so someone down the line doesnt move it.
>
> The code says "pad to align according to L1 cache, and
> keep away other stuff by exactly the amount so it is
> off the line". I'ld guess a comment would repeat just
> this, so it is superfluous. But if opinions differ on
> this topic, we can have a comment added, sure.

not everyone knows to read every single piece of documentation that
may or may not be affected implicitly in the call stack.  a simple
comment here is not superfluous.

since the other struct is also going to be changed, a comment should
be placed there as well.
-mike
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-07 18:28         ` Mike Frysinger
@ 2010-05-08 22:32           ` Johannes Weiner
  -1 siblings, 0 replies; 124+ messages in thread
From: Johannes Weiner @ 2010-05-08 22:32 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Oskar Schirmer, Dmitry Torokhov, Andrew Morton, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind

On Fri, May 07, 2010 at 02:28:16PM -0400, Mike Frysinger wrote:
> On Fri, May 7, 2010 at 06:15, Oskar Schirmer wrote:
> > On Thu, May 06, 2010 at 14:46:04 -0400, Mike Frysinger wrote:
> >> On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
> >> >  struct ser_req {
> >> > +       u16                     sample;
> >> > +       char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
> >> > +
> >> >        u16                     reset;
> >> >        u16                     ref_on;
> >> >        u16                     command;
> >> > -       u16                     sample;
> >> >        struct spi_message      msg;
> >> >        struct spi_transfer     xfer[6];
> >> >  };
> >>
> >> are you sure this is necessary ?  ser_req is only ever used with
> >> spi_sync() and it's allocated/released on the fly, so how could
> >> anything be reading that memory between the start of the transmission
> >> and the return to adi7877 ?
> >
> > msg is handed over to spi_sync, it contains the addresses
> > which will be used to programme the DMA: the spi master
> > transfer function will read these fields to start DMA.
> 
> so the issue is coming from the SPI master drivers and not the AD7877 driver

No, the issue is coming from ad7877 placing a transmission buffer
into the same cache line with memory locations that are accessed outside
the driver's scope.  It must assume that the SPI driver does DMA, that
cache coherency is maintained at cache line granularity, and must not
make any assumptions about how the struct spi_message members are used.

The following is about slave drivers from Documentation/spi/spi-summary:

  - Follow standard kernel rules, and provide DMA-safe buffers in
    your messages.  That way controller drivers using DMA aren't forced
    to make extra copies unless the hardware requires it (e.g. working
    around hardware errata that force the use of bounce buffering).

> >> >  struct ad7877 {
> >> > +       u16                     conversion_data[AD7877_NR_SENSE];
> >> > +       char                    __padalign[L1_CACHE_BYTES
> >> > +                                       - AD7877_NR_SENSE * sizeof(u16)];
> >> > +
> >> >        struct input_dev        *input;
> >> >        char                    phys[32];
> >> >
> >> > @@ -182,8 +188,6 @@ struct ad7877 {
> >> >        u8                      averaging;
> >> >        u8                      pen_down_acc_interval;
> >> >
> >> > -       u16                     conversion_data[AD7877_NR_SENSE];
> >> > -
> >> >        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
> >> >        struct spi_message      msg;
> >>
> >> i can see the spi_message inside of this struct being a problem
> >> because the spi transfer is doing asynchronously with spi_async().
> >> however, i would add a comment right above these two fields with a
> >> short explanation as to why they're at the start and why the pad
> >> exists so someone down the line doesnt move it.
> >
> > The code says "pad to align according to L1 cache, and
> > keep away other stuff by exactly the amount so it is
> > off the line". I'ld guess a comment would repeat just
> > this, so it is superfluous. But if opinions differ on
> > this topic, we can have a comment added, sure.
> 
> not everyone knows to read every single piece of documentation that
> may or may not be affected implicitly in the call stack.  a simple
> comment here is not superfluous.
> 
> since the other struct is also going to be changed, a comment should
> be placed there as well.

How about

  /*
   * DMA (thus cache coherency maintainance) requires the
   * transfer buffers to live in their own cache lines.
   */
   char		__padalign[...];

?  It might be obvious what the code does, but I agree with
Mike that it might not be immediately apparent why it's needed.

	Hannes

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-08 22:32           ` Johannes Weiner
  0 siblings, 0 replies; 124+ messages in thread
From: Johannes Weiner @ 2010-05-08 22:32 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Oskar Schirmer, Dmitry Torokhov, Andrew Morton, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind

On Fri, May 07, 2010 at 02:28:16PM -0400, Mike Frysinger wrote:
> On Fri, May 7, 2010 at 06:15, Oskar Schirmer wrote:
> > On Thu, May 06, 2010 at 14:46:04 -0400, Mike Frysinger wrote:
> >> On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
> >> >  struct ser_req {
> >> > +       u16                     sample;
> >> > +       char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
> >> > +
> >> >        u16                     reset;
> >> >        u16                     ref_on;
> >> >        u16                     command;
> >> > -       u16                     sample;
> >> >        struct spi_message      msg;
> >> >        struct spi_transfer     xfer[6];
> >> >  };
> >>
> >> are you sure this is necessary ?  ser_req is only ever used with
> >> spi_sync() and it's allocated/released on the fly, so how could
> >> anything be reading that memory between the start of the transmission
> >> and the return to adi7877 ?
> >
> > msg is handed over to spi_sync, it contains the addresses
> > which will be used to programme the DMA: the spi master
> > transfer function will read these fields to start DMA.
> 
> so the issue is coming from the SPI master drivers and not the AD7877 driver

No, the issue is coming from ad7877 placing a transmission buffer
into the same cache line with memory locations that are accessed outside
the driver's scope.  It must assume that the SPI driver does DMA, that
cache coherency is maintained at cache line granularity, and must not
make any assumptions about how the struct spi_message members are used.

The following is about slave drivers from Documentation/spi/spi-summary:

  - Follow standard kernel rules, and provide DMA-safe buffers in
    your messages.  That way controller drivers using DMA aren't forced
    to make extra copies unless the hardware requires it (e.g. working
    around hardware errata that force the use of bounce buffering).

> >> >  struct ad7877 {
> >> > +       u16                     conversion_data[AD7877_NR_SENSE];
> >> > +       char                    __padalign[L1_CACHE_BYTES
> >> > +                                       - AD7877_NR_SENSE * sizeof(u16)];
> >> > +
> >> >        struct input_dev        *input;
> >> >        char                    phys[32];
> >> >
> >> > @@ -182,8 +188,6 @@ struct ad7877 {
> >> >        u8                      averaging;
> >> >        u8                      pen_down_acc_interval;
> >> >
> >> > -       u16                     conversion_data[AD7877_NR_SENSE];
> >> > -
> >> >        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
> >> >        struct spi_message      msg;
> >>
> >> i can see the spi_message inside of this struct being a problem
> >> because the spi transfer is doing asynchronously with spi_async().
> >> however, i would add a comment right above these two fields with a
> >> short explanation as to why they're at the start and why the pad
> >> exists so someone down the line doesnt move it.
> >
> > The code says "pad to align according to L1 cache, and
> > keep away other stuff by exactly the amount so it is
> > off the line". I'ld guess a comment would repeat just
> > this, so it is superfluous. But if opinions differ on
> > this topic, we can have a comment added, sure.
> 
> not everyone knows to read every single piece of documentation that
> may or may not be affected implicitly in the call stack.  a simple
> comment here is not superfluous.
> 
> since the other struct is also going to be changed, a comment should
> be placed there as well.

How about

  /*
   * DMA (thus cache coherency maintainance) requires the
   * transfer buffers to live in their own cache lines.
   */
   char		__padalign[...];

?  It might be obvious what the code does, but I agree with
Mike that it might not be immediately apparent why it's needed.

	Hannes
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-08 22:32           ` Johannes Weiner
@ 2010-05-09  4:45             ` Mike Frysinger
  -1 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-09  4:45 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Oskar Schirmer, Dmitry Torokhov, Andrew Morton, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind

On Sat, May 8, 2010 at 18:32, Johannes Weiner wrote:
> On Fri, May 07, 2010 at 02:28:16PM -0400, Mike Frysinger wrote:
>> On Fri, May 7, 2010 at 06:15, Oskar Schirmer wrote:
>> > On Thu, May 06, 2010 at 14:46:04 -0400, Mike Frysinger wrote:
>> >> On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
>> >> >  struct ser_req {
>> >> > +       u16                     sample;
>> >> > +       char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
>> >> > +
>> >> >        u16                     reset;
>> >> >        u16                     ref_on;
>> >> >        u16                     command;
>> >> > -       u16                     sample;
>> >> >        struct spi_message      msg;
>> >> >        struct spi_transfer     xfer[6];
>> >> >  };
>> >>
>> >> are you sure this is necessary ?  ser_req is only ever used with
>> >> spi_sync() and it's allocated/released on the fly, so how could
>> >> anything be reading that memory between the start of the transmission
>> >> and the return to adi7877 ?
>> >
>> > msg is handed over to spi_sync, it contains the addresses
>> > which will be used to programme the DMA: the spi master
>> > transfer function will read these fields to start DMA.
>>
>> so the issue is coming from the SPI master drivers and not the AD7877 driver
>
> No, the issue is coming from ad7877 placing a transmission buffer
> into the same cache line with memory locations that are accessed outside
> the driver's scope.

you missed the point of my comment.  as i clearly explained in the
other structure, the AD7877 driver was causing the cache desync.  here
it is the SPI master that is implicitly causing it.  i'm not talking
about the AD7877 being correct wrt to the implicit SPI/DMA
requirements, just what code exactly is triggering the cache issues.

>  /*
>   * DMA (thus cache coherency maintainance) requires the
>   * transfer buffers to live in their own cache lines.
>   */
>   char         __padalign[...];
>
> ?  It might be obvious what the code does, but I agree with
> Mike that it might not be immediately apparent why it's needed.

comment looks fine once the spelling is fixed (maintenance).  thanks.
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-09  4:45             ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-09  4:45 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Oskar Schirmer, Dmitry Torokhov, Andrew Morton, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind

On Sat, May 8, 2010 at 18:32, Johannes Weiner wrote:
> On Fri, May 07, 2010 at 02:28:16PM -0400, Mike Frysinger wrote:
>> On Fri, May 7, 2010 at 06:15, Oskar Schirmer wrote:
>> > On Thu, May 06, 2010 at 14:46:04 -0400, Mike Frysinger wrote:
>> >> On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
>> >> >  struct ser_req {
>> >> > +       u16                     sample;
>> >> > +       char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
>> >> > +
>> >> >        u16                     reset;
>> >> >        u16                     ref_on;
>> >> >        u16                     command;
>> >> > -       u16                     sample;
>> >> >        struct spi_message      msg;
>> >> >        struct spi_transfer     xfer[6];
>> >> >  };
>> >>
>> >> are you sure this is necessary ?  ser_req is only ever used with
>> >> spi_sync() and it's allocated/released on the fly, so how could
>> >> anything be reading that memory between the start of the transmission
>> >> and the return to adi7877 ?
>> >
>> > msg is handed over to spi_sync, it contains the addresses
>> > which will be used to programme the DMA: the spi master
>> > transfer function will read these fields to start DMA.
>>
>> so the issue is coming from the SPI master drivers and not the AD7877 driver
>
> No, the issue is coming from ad7877 placing a transmission buffer
> into the same cache line with memory locations that are accessed outside
> the driver's scope.

you missed the point of my comment.  as i clearly explained in the
other structure, the AD7877 driver was causing the cache desync.  here
it is the SPI master that is implicitly causing it.  i'm not talking
about the AD7877 being correct wrt to the implicit SPI/DMA
requirements, just what code exactly is triggering the cache issues.

>  /*
>   * DMA (thus cache coherency maintainance) requires the
>   * transfer buffers to live in their own cache lines.
>   */
>   char         __padalign[...];
>
> ?  It might be obvious what the code does, but I agree with
> Mike that it might not be immediately apparent why it's needed.

comment looks fine once the spelling is fixed (maintenance).  thanks.
-mike
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-09  4:45             ` Mike Frysinger
@ 2010-05-09  8:50               ` Oskar Schirmer
  -1 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-09  8:50 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Johannes Weiner, Oskar Schirmer, Dmitry Torokhov, Andrew Morton,
	linux-input, linux-kernel, Daniel Glöckner,
	Oliver Schneidewind, Hennerich, Michael

On Sun, May 09, 2010 at 00:45:41 -0400, Mike Frysinger wrote:
> On Sat, May 8, 2010 at 18:32, Johannes Weiner wrote:
> > On Fri, May 07, 2010 at 02:28:16PM -0400, Mike Frysinger wrote:
> >> On Fri, May 7, 2010 at 06:15, Oskar Schirmer wrote:
> >> > On Thu, May 06, 2010 at 14:46:04 -0400, Mike Frysinger wrote:
> >> >> On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
> >> >> >  struct ser_req {
> >> >> > +       u16                     sample;
> >> >> > +       char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
> >> >> > +
> >> >> >        u16                     reset;
> >> >> >        u16                     ref_on;
> >> >> >        u16                     command;
> >> >> > -       u16                     sample;
> >> >> >        struct spi_message      msg;
> >> >> >        struct spi_transfer     xfer[6];
> >> >> >  };
> >> >>
> >> >> are you sure this is necessary ?  ser_req is only ever used with
> >> >> spi_sync() and it's allocated/released on the fly, so how could
> >> >> anything be reading that memory between the start of the transmission
> >> >> and the return to adi7877 ?
> >> >
> >> > msg is handed over to spi_sync, it contains the addresses
> >> > which will be used to programme the DMA: the spi master
> >> > transfer function will read these fields to start DMA.
> >>
> >> so the issue is coming from the SPI master drivers and not the AD7877 driver
> >
> > No, the issue is coming from ad7877 placing a transmission buffer
> > into the same cache line with memory locations that are accessed outside
> > the driver's scope.
> 
> you missed the point of my comment.  as i clearly explained in the
> other structure, the AD7877 driver was causing the cache desync.  here
> it is the SPI master that is implicitly causing it.  i'm not talking
> about the AD7877 being correct wrt to the implicit SPI/DMA
> requirements, just what code exactly is triggering the cache issues.

In both cases ad7877 did place DMA buffers in the same
cache line with reference data needed by spi master to
programme the DMA engine. Once the machinery is started thru
spi_sync, the other case uses spi_async. Both cases open out
into master->transfer via spi_async. In both cases, with
drivers/spi/atmel_spi.c, cache lines are flushed and then
reference data is fed into the DMA engine, thereby
causing the line in question to be cached untimely.

Note, that atmel_spi (thus master) is not wrong here,
as it must assume DMA buffers being correctly aligned
into separate cache lines, so accessing reference data
after cache flush is not vicious. So in both cases the
problem is caused by ad7877 and thus fixed analoguously.

> 
> >  /*
> >   * DMA (thus cache coherency maintainance) requires the
> >   * transfer buffers to live in their own cache lines.
> >   */
> >   char         __padalign[...];
> >
> > ?  It might be obvious what the code does, but I agree with
> > Mike that it might not be immediately apparent why it's needed.
> 
> comment looks fine once the spelling is fixed (maintenance).  thanks.

Ok, will prepare that soon.
  Oskar
-- 
oskar schirmer, emlix gmbh, http://www.emlix.com
fon +49 551 30664-0, fax -11, bahnhofsallee 1b, 37081 göttingen, germany
sitz der gesellschaft: göttingen, amtsgericht göttingen hr b 3160
geschäftsführer: dr. uwe kracke, ust-idnr.: de 205 198 055

emlix - your embedded linux partner

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-09  8:50               ` Oskar Schirmer
  0 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-09  8:50 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Johannes Weiner, Oskar Schirmer, Dmitry Torokhov, Andrew Morton,
	linux-input, linux-kernel, Daniel Glöckner,
	Oliver Schneidewind, Hennerich, Michael

On Sun, May 09, 2010 at 00:45:41 -0400, Mike Frysinger wrote:
> On Sat, May 8, 2010 at 18:32, Johannes Weiner wrote:
> > On Fri, May 07, 2010 at 02:28:16PM -0400, Mike Frysinger wrote:
> >> On Fri, May 7, 2010 at 06:15, Oskar Schirmer wrote:
> >> > On Thu, May 06, 2010 at 14:46:04 -0400, Mike Frysinger wrote:
> >> >> On Thu, May 6, 2010 at 06:37, Oskar Schirmer wrote:
> >> >> >  struct ser_req {
> >> >> > +       u16                     sample;
> >> >> > +       char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
> >> >> > +
> >> >> >        u16                     reset;
> >> >> >        u16                     ref_on;
> >> >> >        u16                     command;
> >> >> > -       u16                     sample;
> >> >> >        struct spi_message      msg;
> >> >> >        struct spi_transfer     xfer[6];
> >> >> >  };
> >> >>
> >> >> are you sure this is necessary ?  ser_req is only ever used with
> >> >> spi_sync() and it's allocated/released on the fly, so how could
> >> >> anything be reading that memory between the start of the transmission
> >> >> and the return to adi7877 ?
> >> >
> >> > msg is handed over to spi_sync, it contains the addresses
> >> > which will be used to programme the DMA: the spi master
> >> > transfer function will read these fields to start DMA.
> >>
> >> so the issue is coming from the SPI master drivers and not the AD7877 driver
> >
> > No, the issue is coming from ad7877 placing a transmission buffer
> > into the same cache line with memory locations that are accessed outside
> > the driver's scope.
> 
> you missed the point of my comment.  as i clearly explained in the
> other structure, the AD7877 driver was causing the cache desync.  here
> it is the SPI master that is implicitly causing it.  i'm not talking
> about the AD7877 being correct wrt to the implicit SPI/DMA
> requirements, just what code exactly is triggering the cache issues.

In both cases ad7877 did place DMA buffers in the same
cache line with reference data needed by spi master to
programme the DMA engine. Once the machinery is started thru
spi_sync, the other case uses spi_async. Both cases open out
into master->transfer via spi_async. In both cases, with
drivers/spi/atmel_spi.c, cache lines are flushed and then
reference data is fed into the DMA engine, thereby
causing the line in question to be cached untimely.

Note, that atmel_spi (thus master) is not wrong here,
as it must assume DMA buffers being correctly aligned
into separate cache lines, so accessing reference data
after cache flush is not vicious. So in both cases the
problem is caused by ad7877 and thus fixed analoguously.

> 
> >  /*
> >   * DMA (thus cache coherency maintainance) requires the
> >   * transfer buffers to live in their own cache lines.
> >   */
> >   char         __padalign[...];
> >
> > ?  It might be obvious what the code does, but I agree with
> > Mike that it might not be immediately apparent why it's needed.
> 
> comment looks fine once the spelling is fixed (maintenance).  thanks.

Ok, will prepare that soon.
  Oskar
-- 
oskar schirmer, emlix gmbh, http://www.emlix.com
fon +49 551 30664-0, fax -11, bahnhofsallee 1b, 37081 göttingen, germany
sitz der gesellschaft: göttingen, amtsgericht göttingen hr b 3160
geschäftsführer: dr. uwe kracke, ust-idnr.: de 205 198 055

emlix - your embedded linux partner
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* [PATCH v2] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-06 10:37 ` [PATCH] ad7877: keep dma rx buffers in seperate cache lines Oskar Schirmer
@ 2010-05-10 10:34     ` Oskar Schirmer
  2010-05-10 10:34     ` Oskar Schirmer
  1 sibling, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-10 10:34 UTC (permalink / raw)
  To: Michael Hennerich, Dmitry Torokhov
  Cc: Mike Frysinger, Andrew Morton, linux-input, linux-kernel,
	Oskar Schirmer, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

With dma based spi transmission, data corruption
is observed occasionally. With dma buffers located
right next to msg and xfer fields, cache lines
correctly flushed in preparation for dma usage
may be polluted again when writing to fields
in the same cache line.

Make sure cache fields used with dma do not
share cache lines with fields changed during
dma handling. As both fields are part of a
struct that is allocated via kzalloc, thus
cache aligned, moving the fields to the 1st
position and insert padding for alignment
does the job.

Signed-off-by: Oskar Schirmer <os@emlix.com>
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Oliver Schneidewind <osw@emlix.com>
Signed-off-by: Johannes Weiner <jw@emlix.com>
---
 drivers/input/touchscreen/ad7877.c |   18 +++++++++++++++---
 1 files changed, 15 insertions(+), 3 deletions(-)

v2: add a comment to explain why alignment is needed

diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
index 9cfc8b5..69ebacf 100644
--- a/drivers/input/touchscreen/ad7877.c
+++ b/drivers/input/touchscreen/ad7877.c
@@ -153,15 +153,29 @@ enum {
  */
 
 struct ser_req {
+	u16			sample;
+	/*
+	 * DMA (thus cache coherency maintainance) requires the
+	 * transfer buffers to live in their own cache lines.
+	 */
+	char			__padalign[L1_CACHE_BYTES - sizeof(u16)];
+
 	u16			reset;
 	u16			ref_on;
 	u16			command;
-	u16			sample;
 	struct spi_message	msg;
 	struct spi_transfer	xfer[6];
 };
 
 struct ad7877 {
+	u16			conversion_data[AD7877_NR_SENSE];
+	/*
+	 * DMA (thus cache coherency maintainance) requires the
+	 * transfer buffers to live in their own cache lines.
+	 */
+	char			__padalign[L1_CACHE_BYTES
+					- AD7877_NR_SENSE * sizeof(u16)];
+
 	struct input_dev	*input;
 	char			phys[32];
 
@@ -182,8 +196,6 @@ struct ad7877 {
 	u8			averaging;
 	u8			pen_down_acc_interval;
 
-	u16			conversion_data[AD7877_NR_SENSE];
-
 	struct spi_transfer	xfer[AD7877_NR_SENSE + 2];
 	struct spi_message	msg;
 
-- 
1.5.3.7


^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH v2] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-10 10:34     ` Oskar Schirmer
  0 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-10 10:34 UTC (permalink / raw)
  To: Michael Hennerich, Dmitry Torokhov
  Cc: Mike Frysinger, Andrew Morton, linux-input, linux-kernel,
	Oskar Schirmer, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

With dma based spi transmission, data corruption
is observed occasionally. With dma buffers located
right next to msg and xfer fields, cache lines
correctly flushed in preparation for dma usage
may be polluted again when writing to fields
in the same cache line.

Make sure cache fields used with dma do not
share cache lines with fields changed during
dma handling. As both fields are part of a
struct that is allocated via kzalloc, thus
cache aligned, moving the fields to the 1st
position and insert padding for alignment
does the job.

Signed-off-by: Oskar Schirmer <os@emlix.com>
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Oliver Schneidewind <osw@emlix.com>
Signed-off-by: Johannes Weiner <jw@emlix.com>
---
 drivers/input/touchscreen/ad7877.c |   18 +++++++++++++++---
 1 files changed, 15 insertions(+), 3 deletions(-)

v2: add a comment to explain why alignment is needed

diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
index 9cfc8b5..69ebacf 100644
--- a/drivers/input/touchscreen/ad7877.c
+++ b/drivers/input/touchscreen/ad7877.c
@@ -153,15 +153,29 @@ enum {
  */
 
 struct ser_req {
+	u16			sample;
+	/*
+	 * DMA (thus cache coherency maintainance) requires the
+	 * transfer buffers to live in their own cache lines.
+	 */
+	char			__padalign[L1_CACHE_BYTES - sizeof(u16)];
+
 	u16			reset;
 	u16			ref_on;
 	u16			command;
-	u16			sample;
 	struct spi_message	msg;
 	struct spi_transfer	xfer[6];
 };
 
 struct ad7877 {
+	u16			conversion_data[AD7877_NR_SENSE];
+	/*
+	 * DMA (thus cache coherency maintainance) requires the
+	 * transfer buffers to live in their own cache lines.
+	 */
+	char			__padalign[L1_CACHE_BYTES
+					- AD7877_NR_SENSE * sizeof(u16)];
+
 	struct input_dev	*input;
 	char			phys[32];
 
@@ -182,8 +196,6 @@ struct ad7877 {
 	u8			averaging;
 	u8			pen_down_acc_interval;
 
-	u16			conversion_data[AD7877_NR_SENSE];
-
 	struct spi_transfer	xfer[AD7877_NR_SENSE + 2];
 	struct spi_message	msg;
 
-- 
1.5.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-10 10:34     ` Oskar Schirmer
@ 2010-05-10 10:42       ` Oskar Schirmer
  -1 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-10 10:42 UTC (permalink / raw)
  To: Michael Hennerich, Dmitry Torokhov
  Cc: Mike Frysinger, Andrew Morton, linux-input, linux-kernel,
	Oskar Schirmer, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

With dma based spi transmission, data corruption
is observed occasionally. With dma buffers located
right next to msg and xfer fields, cache lines
correctly flushed in preparation for dma usage
may be polluted again when writing to fields
in the same cache line.

Make sure cache fields used with dma do not
share cache lines with fields changed during
dma handling. As both fields are part of a
struct that is allocated via kzalloc, thus
cache aligned, moving the fields to the 1st
position and insert padding for alignment
does the job.

Signed-off-by: Oskar Schirmer <os@emlix.com>
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Oliver Schneidewind <osw@emlix.com>
Signed-off-by: Johannes Weiner <jw@emlix.com>
---
 drivers/input/touchscreen/ad7877.c |   18 +++++++++++++++---
 1 files changed, 15 insertions(+), 3 deletions(-)

v2: add a comment to explain why alignment is needed

v3: fix the typo in comment and layout (- to end of line)

diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
index 885354c..9ebb1b4 100644
--- a/drivers/input/touchscreen/ad7877.c
+++ b/drivers/input/touchscreen/ad7877.c
@@ -153,15 +153,29 @@ enum {
  */
 
 struct ser_req {
+	u16			sample;
+	/*
+	 * DMA (thus cache coherency maintenance) requires the
+	 * transfer buffers to live in their own cache lines.
+	 */
+	char			__padalign[L1_CACHE_BYTES - sizeof(u16)];
+
 	u16			reset;
 	u16			ref_on;
 	u16			command;
-	u16			sample;
 	struct spi_message	msg;
 	struct spi_transfer	xfer[6];
 };
 
 struct ad7877 {
+	u16			conversion_data[AD7877_NR_SENSE];
+	/*
+	 * DMA (thus cache coherency maintenance) requires the
+	 * transfer buffers to live in their own cache lines.
+	 */
+	char			__padalign[L1_CACHE_BYTES -
+					AD7877_NR_SENSE * sizeof(u16)];
+
 	struct input_dev	*input;
 	char			phys[32];
 
@@ -182,8 +196,6 @@ struct ad7877 {
 	u8			averaging;
 	u8			pen_down_acc_interval;
 
-	u16			conversion_data[AD7877_NR_SENSE];
-
 	struct spi_transfer	xfer[AD7877_NR_SENSE + 2];
 	struct spi_message	msg;
 
-- 
1.5.3.7


^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-10 10:42       ` Oskar Schirmer
  0 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-10 10:42 UTC (permalink / raw)
  To: Michael Hennerich, Dmitry Torokhov
  Cc: Mike Frysinger, Andrew Morton, linux-input, linux-kernel,
	Oskar Schirmer, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

With dma based spi transmission, data corruption
is observed occasionally. With dma buffers located
right next to msg and xfer fields, cache lines
correctly flushed in preparation for dma usage
may be polluted again when writing to fields
in the same cache line.

Make sure cache fields used with dma do not
share cache lines with fields changed during
dma handling. As both fields are part of a
struct that is allocated via kzalloc, thus
cache aligned, moving the fields to the 1st
position and insert padding for alignment
does the job.

Signed-off-by: Oskar Schirmer <os@emlix.com>
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Oliver Schneidewind <osw@emlix.com>
Signed-off-by: Johannes Weiner <jw@emlix.com>
---
 drivers/input/touchscreen/ad7877.c |   18 +++++++++++++++---
 1 files changed, 15 insertions(+), 3 deletions(-)

v2: add a comment to explain why alignment is needed

v3: fix the typo in comment and layout (- to end of line)

diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
index 885354c..9ebb1b4 100644
--- a/drivers/input/touchscreen/ad7877.c
+++ b/drivers/input/touchscreen/ad7877.c
@@ -153,15 +153,29 @@ enum {
  */
 
 struct ser_req {
+	u16			sample;
+	/*
+	 * DMA (thus cache coherency maintenance) requires the
+	 * transfer buffers to live in their own cache lines.
+	 */
+	char			__padalign[L1_CACHE_BYTES - sizeof(u16)];
+
 	u16			reset;
 	u16			ref_on;
 	u16			command;
-	u16			sample;
 	struct spi_message	msg;
 	struct spi_transfer	xfer[6];
 };
 
 struct ad7877 {
+	u16			conversion_data[AD7877_NR_SENSE];
+	/*
+	 * DMA (thus cache coherency maintenance) requires the
+	 * transfer buffers to live in their own cache lines.
+	 */
+	char			__padalign[L1_CACHE_BYTES -
+					AD7877_NR_SENSE * sizeof(u16)];
+
 	struct input_dev	*input;
 	char			phys[32];
 
@@ -182,8 +196,6 @@ struct ad7877 {
 	u8			averaging;
 	u8			pen_down_acc_interval;
 
-	u16			conversion_data[AD7877_NR_SENSE];
-
 	struct spi_transfer	xfer[AD7877_NR_SENSE + 2];
 	struct spi_message	msg;
 
-- 
1.5.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-10 10:42       ` Oskar Schirmer
  (?)
@ 2010-05-10 16:39       ` Mike Frysinger
  2010-05-10 20:54         ` Dmitry Torokhov
  -1 siblings, 1 reply; 124+ messages in thread
From: Mike Frysinger @ 2010-05-10 16:39 UTC (permalink / raw)
  To: Oskar Schirmer
  Cc: Michael Hennerich, Dmitry Torokhov, Andrew Morton, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Mon, May 10, 2010 at 06:42, Oskar Schirmer wrote:
> With dma based spi transmission, data corruption
> is observed occasionally. With dma buffers located
> right next to msg and xfer fields, cache lines
> correctly flushed in preparation for dma usage
> may be polluted again when writing to fields
> in the same cache line.
>
> Make sure cache fields used with dma do not
> share cache lines with fields changed during
> dma handling. As both fields are part of a
> struct that is allocated via kzalloc, thus
> cache aligned, moving the fields to the 1st
> position and insert padding for alignment
> does the job.

Acked-by: Mike Frysinger <vapier@gentoo.org>

i'm guessing Dmitry will pick it up now
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-10 16:39       ` Mike Frysinger
@ 2010-05-10 20:54         ` Dmitry Torokhov
  0 siblings, 0 replies; 124+ messages in thread
From: Dmitry Torokhov @ 2010-05-10 20:54 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Oskar Schirmer, Michael Hennerich, Andrew Morton, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Mon, May 10, 2010 at 12:39:49PM -0400, Mike Frysinger wrote:
> On Mon, May 10, 2010 at 06:42, Oskar Schirmer wrote:
> > With dma based spi transmission, data corruption
> > is observed occasionally. With dma buffers located
> > right next to msg and xfer fields, cache lines
> > correctly flushed in preparation for dma usage
> > may be polluted again when writing to fields
> > in the same cache line.
> >
> > Make sure cache fields used with dma do not
> > share cache lines with fields changed during
> > dma handling. As both fields are part of a
> > struct that is allocated via kzalloc, thus
> > cache aligned, moving the fields to the 1st
> > position and insert padding for alignment
> > does the job.
> 
> Acked-by: Mike Frysinger <vapier@gentoo.org>
> 
> i'm guessing Dmitry will pick it up now

Yep.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-10 10:42       ` Oskar Schirmer
  (?)
  (?)
@ 2010-05-10 21:22       ` Andrew Morton
  2010-05-10 21:27         ` Mike Frysinger
  2010-05-11  6:05           ` Dmitry Torokhov
  -1 siblings, 2 replies; 124+ messages in thread
From: Andrew Morton @ 2010-05-10 21:22 UTC (permalink / raw)
  To: Oskar Schirmer
  Cc: Michael Hennerich, Dmitry Torokhov, Mike Frysinger, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Mon, 10 May 2010 12:42:34 +0200
"Oskar Schirmer" <os@emlix.com> wrote:

> With dma based spi transmission, data corruption
> is observed occasionally. With dma buffers located
> right next to msg and xfer fields, cache lines
> correctly flushed in preparation for dma usage
> may be polluted again when writing to fields
> in the same cache line.
> 
> Make sure cache fields used with dma do not
> share cache lines with fields changed during
> dma handling. As both fields are part of a
> struct that is allocated via kzalloc, thus
> cache aligned, moving the fields to the 1st
> position and insert padding for alignment
> does the job.

This sounds odd.  Doesn't it imply that some code somewhere is missing
some DMA synchronisation actions?

> 
> v2: add a comment to explain why alignment is needed
> 
> v3: fix the typo in comment and layout (- to end of line)
> 
> diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
> index 885354c..9ebb1b4 100644
> --- a/drivers/input/touchscreen/ad7877.c
> +++ b/drivers/input/touchscreen/ad7877.c
> @@ -153,15 +153,29 @@ enum {
>   */
>  
>  struct ser_req {
> +	u16			sample;
> +	/*
> +	 * DMA (thus cache coherency maintenance) requires the
> +	 * transfer buffers to live in their own cache lines.
> +	 */
> +	char			__padalign[L1_CACHE_BYTES - sizeof(u16)];

It would be better to use __cacheline_aligned, rather than open-coding
things in this manner.


^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-10 21:22       ` Andrew Morton
@ 2010-05-10 21:27         ` Mike Frysinger
  2010-05-11  6:05           ` Dmitry Torokhov
  1 sibling, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-10 21:27 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oskar Schirmer, Michael Hennerich, Dmitry Torokhov,
	Mike Frysinger, linux-input, linux-kernel, Daniel Glöckner,
	Oliver Schneidewind, Johannes Weiner

On Mon, May 10, 2010 at 17:22, Andrew Morton wrote:
> On Mon, 10 May 2010 12:42:34 +0200 "Oskar Schirmer" wrote:
>> With dma based spi transmission, data corruption
>> is observed occasionally. With dma buffers located
>> right next to msg and xfer fields, cache lines
>> correctly flushed in preparation for dma usage
>> may be polluted again when writing to fields
>> in the same cache line.
>>
>> Make sure cache fields used with dma do not
>> share cache lines with fields changed during
>> dma handling. As both fields are part of a
>> struct that is allocated via kzalloc, thus
>> cache aligned, moving the fields to the 1st
>> position and insert padding for alignment
>> does the job.
>
> This sounds odd.  Doesn't it imply that some code somewhere is missing
> some DMA synchronisation actions?

i think it's kind of dumb and induces this sort of bug
semi-frequently, but it is what the current DMA API requires (see like
Documentation/spi/spi-summary)
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11  6:11             ` Mike Frysinger
  (?)
@ 2010-05-11  3:20             ` Andrew Morton
  -1 siblings, 0 replies; 124+ messages in thread
From: Andrew Morton @ 2010-05-11  3:20 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Dmitry Torokhov, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Tue, 11 May 2010 02:11:41 -0400 Mike Frysinger <vapier.adi@gmail.com> wrote:

> > __ __ __ __unsigned __ __ __ __ __ __ __ __pending:1; __ __ __/* P: lock */
> > +
> > + __ __ __ /*
> > + __ __ __ __* DMA (thus cache coherency maintenance) requires the
> > + __ __ __ __* transfer buffers to live in their own cache lines.
> > + __ __ __ __*/
> > + __ __ __ u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
> > __};

(^^stupid gmail)

> i'm not sure this is correct.  the cached_aligned attribute makes sure
> it starts on a cache boundary, but it doesnt make sure it pads out to
> one.  so it might work more of the time, but i dont think it's
> guaranteed.

yup.  You'd need to put something like

	int pad ____cacheline_aligned;

_after_ the trashable field.

Then look at the .s file and make sure it came out right ;)

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-10 21:22       ` Andrew Morton
@ 2010-05-11  6:05           ` Dmitry Torokhov
  2010-05-11  6:05           ` Dmitry Torokhov
  1 sibling, 0 replies; 124+ messages in thread
From: Dmitry Torokhov @ 2010-05-11  6:05 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oskar Schirmer, Michael Hennerich, Mike Frysinger, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Mon, May 10, 2010 at 02:22:25PM -0700, Andrew Morton wrote:
> On Mon, 10 May 2010 12:42:34 +0200
> "Oskar Schirmer" <os@emlix.com> wrote:
> 
> > With dma based spi transmission, data corruption
> > is observed occasionally. With dma buffers located
> > right next to msg and xfer fields, cache lines
> > correctly flushed in preparation for dma usage
> > may be polluted again when writing to fields
> > in the same cache line.
> > 
> > Make sure cache fields used with dma do not
> > share cache lines with fields changed during
> > dma handling. As both fields are part of a
> > struct that is allocated via kzalloc, thus
> > cache aligned, moving the fields to the 1st
> > position and insert padding for alignment
> > does the job.
> 
> This sounds odd.  Doesn't it imply that some code somewhere is missing
> some DMA synchronisation actions?
> 
> > 
> > v2: add a comment to explain why alignment is needed
> > 
> > v3: fix the typo in comment and layout (- to end of line)
> > 
> > diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
> > index 885354c..9ebb1b4 100644
> > --- a/drivers/input/touchscreen/ad7877.c
> > +++ b/drivers/input/touchscreen/ad7877.c
> > @@ -153,15 +153,29 @@ enum {
> >   */
> >  
> >  struct ser_req {
> > +	u16			sample;
> > +	/*
> > +	 * DMA (thus cache coherency maintenance) requires the
> > +	 * transfer buffers to live in their own cache lines.
> > +	 */
> > +	char			__padalign[L1_CACHE_BYTES - sizeof(u16)];
> 
> It would be better to use __cacheline_aligned, rather than open-coding
> things in this manner.
> 

OK, then I have the following which I will apply unless someone shouts.

-- 
Dmitry

Input: ad7877 - keep dma rx buffers in seperate cache lines

From: Oskar Schirmer <os@emlix.com>

With dma based spi transmission, data corruption is observed
occasionally. With dma buffers located right next to msg and
xfer fields, cache lines correctly flushed in preparation for
dma usage may be polluted again when writing to fields in the
same cache line.

Make sure cache fields used with dma do not share cache lines
with fields changed during dma handling. As both fields are part
of a struct that is allocated via kzalloc, thus cache aligned,
moving the fields to the 1st position and insert padding for
alignment does the job.

Signed-off-by: Oskar Schirmer <os@emlix.com>
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Oliver Schneidewind <osw@emlix.com>
Signed-off-by: Johannes Weiner <jw@emlix.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
[dtor@mail.ru - changed to use ___cacheline_aligned at suggestion
 of akpm]
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
---

 drivers/input/touchscreen/ad7877.c |   15 ++++++++++++---
 1 files changed, 12 insertions(+), 3 deletions(-)


diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
index e019d53..0d2d7e5 100644
--- a/drivers/input/touchscreen/ad7877.c
+++ b/drivers/input/touchscreen/ad7877.c
@@ -156,9 +156,14 @@ struct ser_req {
 	u16			reset;
 	u16			ref_on;
 	u16			command;
-	u16			sample;
 	struct spi_message	msg;
 	struct spi_transfer	xfer[6];
+
+	/*
+	 * DMA (thus cache coherency maintenance) requires the
+	 * transfer buffers to live in their own cache lines.
+	 */
+	u16 sample ____cacheline_aligned;
 };
 
 struct ad7877 {
@@ -182,8 +187,6 @@ struct ad7877 {
 	u8			averaging;
 	u8			pen_down_acc_interval;
 
-	u16			conversion_data[AD7877_NR_SENSE];
-
 	struct spi_transfer	xfer[AD7877_NR_SENSE + 2];
 	struct spi_message	msg;
 
@@ -195,6 +198,12 @@ struct ad7877 {
 	spinlock_t		lock;
 	struct timer_list	timer;		/* P: lock */
 	unsigned		pending:1;	/* P: lock */
+
+	/*
+	 * DMA (thus cache coherency maintenance) requires the
+	 * transfer buffers to live in their own cache lines.
+	 */
+	u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
 };
 
 static int gpio3;


^ permalink raw reply related	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-11  6:05           ` Dmitry Torokhov
  0 siblings, 0 replies; 124+ messages in thread
From: Dmitry Torokhov @ 2010-05-11  6:05 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oskar Schirmer, Michael Hennerich, Mike Frysinger, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Mon, May 10, 2010 at 02:22:25PM -0700, Andrew Morton wrote:
> On Mon, 10 May 2010 12:42:34 +0200
> "Oskar Schirmer" <os@emlix.com> wrote:
> 
> > With dma based spi transmission, data corruption
> > is observed occasionally. With dma buffers located
> > right next to msg and xfer fields, cache lines
> > correctly flushed in preparation for dma usage
> > may be polluted again when writing to fields
> > in the same cache line.
> > 
> > Make sure cache fields used with dma do not
> > share cache lines with fields changed during
> > dma handling. As both fields are part of a
> > struct that is allocated via kzalloc, thus
> > cache aligned, moving the fields to the 1st
> > position and insert padding for alignment
> > does the job.
> 
> This sounds odd.  Doesn't it imply that some code somewhere is missing
> some DMA synchronisation actions?
> 
> > 
> > v2: add a comment to explain why alignment is needed
> > 
> > v3: fix the typo in comment and layout (- to end of line)
> > 
> > diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
> > index 885354c..9ebb1b4 100644
> > --- a/drivers/input/touchscreen/ad7877.c
> > +++ b/drivers/input/touchscreen/ad7877.c
> > @@ -153,15 +153,29 @@ enum {
> >   */
> >  
> >  struct ser_req {
> > +	u16			sample;
> > +	/*
> > +	 * DMA (thus cache coherency maintenance) requires the
> > +	 * transfer buffers to live in their own cache lines.
> > +	 */
> > +	char			__padalign[L1_CACHE_BYTES - sizeof(u16)];
> 
> It would be better to use __cacheline_aligned, rather than open-coding
> things in this manner.
> 

OK, then I have the following which I will apply unless someone shouts.

-- 
Dmitry

Input: ad7877 - keep dma rx buffers in seperate cache lines

From: Oskar Schirmer <os@emlix.com>

With dma based spi transmission, data corruption is observed
occasionally. With dma buffers located right next to msg and
xfer fields, cache lines correctly flushed in preparation for
dma usage may be polluted again when writing to fields in the
same cache line.

Make sure cache fields used with dma do not share cache lines
with fields changed during dma handling. As both fields are part
of a struct that is allocated via kzalloc, thus cache aligned,
moving the fields to the 1st position and insert padding for
alignment does the job.

Signed-off-by: Oskar Schirmer <os@emlix.com>
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Oliver Schneidewind <osw@emlix.com>
Signed-off-by: Johannes Weiner <jw@emlix.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
[dtor@mail.ru - changed to use ___cacheline_aligned at suggestion
 of akpm]
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
---

 drivers/input/touchscreen/ad7877.c |   15 ++++++++++++---
 1 files changed, 12 insertions(+), 3 deletions(-)


diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
index e019d53..0d2d7e5 100644
--- a/drivers/input/touchscreen/ad7877.c
+++ b/drivers/input/touchscreen/ad7877.c
@@ -156,9 +156,14 @@ struct ser_req {
 	u16			reset;
 	u16			ref_on;
 	u16			command;
-	u16			sample;
 	struct spi_message	msg;
 	struct spi_transfer	xfer[6];
+
+	/*
+	 * DMA (thus cache coherency maintenance) requires the
+	 * transfer buffers to live in their own cache lines.
+	 */
+	u16 sample ____cacheline_aligned;
 };
 
 struct ad7877 {
@@ -182,8 +187,6 @@ struct ad7877 {
 	u8			averaging;
 	u8			pen_down_acc_interval;
 
-	u16			conversion_data[AD7877_NR_SENSE];
-
 	struct spi_transfer	xfer[AD7877_NR_SENSE + 2];
 	struct spi_message	msg;
 
@@ -195,6 +198,12 @@ struct ad7877 {
 	spinlock_t		lock;
 	struct timer_list	timer;		/* P: lock */
 	unsigned		pending:1;	/* P: lock */
+
+	/*
+	 * DMA (thus cache coherency maintenance) requires the
+	 * transfer buffers to live in their own cache lines.
+	 */
+	u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
 };
 
 static int gpio3;

--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11  6:05           ` Dmitry Torokhov
@ 2010-05-11  6:11             ` Mike Frysinger
  -1 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11  6:11 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Andrew Morton, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Tue, May 11, 2010 at 02:05, Dmitry Torokhov wrote:
> On Mon, May 10, 2010 at 02:22:25PM -0700, Andrew Morton wrote:
>> On Mon, 10 May 2010 12:42:34 +0200 Oskar Schirmer wrote:
>> > With dma based spi transmission, data corruption
>> > is observed occasionally. With dma buffers located
>> > right next to msg and xfer fields, cache lines
>> > correctly flushed in preparation for dma usage
>> > may be polluted again when writing to fields
>> > in the same cache line.
>> >
>> > Make sure cache fields used with dma do not
>> > share cache lines with fields changed during
>> > dma handling. As both fields are part of a
>> > struct that is allocated via kzalloc, thus
>> > cache aligned, moving the fields to the 1st
>> > position and insert padding for alignment
>> > does the job.
>>
>> This sounds odd.  Doesn't it imply that some code somewhere is missing
>> some DMA synchronisation actions?
>>
>> >
>> > v2: add a comment to explain why alignment is needed
>> >
>> > v3: fix the typo in comment and layout (- to end of line)
>> >
>> > diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
>> > index 885354c..9ebb1b4 100644
>> > --- a/drivers/input/touchscreen/ad7877.c
>> > +++ b/drivers/input/touchscreen/ad7877.c
>> > @@ -153,15 +153,29 @@ enum {
>> >   */
>> >
>> >  struct ser_req {
>> > +   u16                     sample;
>> > +   /*
>> > +    * DMA (thus cache coherency maintenance) requires the
>> > +    * transfer buffers to live in their own cache lines.
>> > +    */
>> > +   char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
>>
>> It would be better to use __cacheline_aligned, rather than open-coding
>> things in this manner.
>>
>
> OK, then I have the following which I will apply unless someone shouts.
>
> --
> Dmitry
>
> Input: ad7877 - keep dma rx buffers in seperate cache lines
>
> From: Oskar Schirmer <os@emlix.com>
>
> With dma based spi transmission, data corruption is observed
> occasionally. With dma buffers located right next to msg and
> xfer fields, cache lines correctly flushed in preparation for
> dma usage may be polluted again when writing to fields in the
> same cache line.
>
> Make sure cache fields used with dma do not share cache lines
> with fields changed during dma handling. As both fields are part
> of a struct that is allocated via kzalloc, thus cache aligned,
> moving the fields to the 1st position and insert padding for
> alignment does the job.
>
> Signed-off-by: Oskar Schirmer <os@emlix.com>
> Signed-off-by: Daniel Glöckner <dg@emlix.com>
> Signed-off-by: Oliver Schneidewind <osw@emlix.com>
> Signed-off-by: Johannes Weiner <jw@emlix.com>
> Acked-by: Mike Frysinger <vapier@gentoo.org>
> [dtor@mail.ru - changed to use ___cacheline_aligned at suggestion
>  of akpm]
> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
> ---
>
>  drivers/input/touchscreen/ad7877.c |   15 ++++++++++++---
>  1 files changed, 12 insertions(+), 3 deletions(-)
>
>
> diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
> index e019d53..0d2d7e5 100644
> --- a/drivers/input/touchscreen/ad7877.c
> +++ b/drivers/input/touchscreen/ad7877.c
> @@ -156,9 +156,14 @@ struct ser_req {
>        u16                     reset;
>        u16                     ref_on;
>        u16                     command;
> -       u16                     sample;
>        struct spi_message      msg;
>        struct spi_transfer     xfer[6];
> +
> +       /*
> +        * DMA (thus cache coherency maintenance) requires the
> +        * transfer buffers to live in their own cache lines.
> +        */
> +       u16 sample ____cacheline_aligned;
>  };
>
>  struct ad7877 {
> @@ -182,8 +187,6 @@ struct ad7877 {
>        u8                      averaging;
>        u8                      pen_down_acc_interval;
>
> -       u16                     conversion_data[AD7877_NR_SENSE];
> -
>        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
>        struct spi_message      msg;
>
> @@ -195,6 +198,12 @@ struct ad7877 {
>        spinlock_t              lock;
>        struct timer_list       timer;          /* P: lock */
>        unsigned                pending:1;      /* P: lock */
> +
> +       /*
> +        * DMA (thus cache coherency maintenance) requires the
> +        * transfer buffers to live in their own cache lines.
> +        */
> +       u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
>  };

i'm not sure this is correct.  the cached_aligned attribute makes sure
it starts on a cache boundary, but it doesnt make sure it pads out to
one.  so it might work more of the time, but i dont think it's
guaranteed.
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-11  6:11             ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11  6:11 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Andrew Morton, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Tue, May 11, 2010 at 02:05, Dmitry Torokhov wrote:
> On Mon, May 10, 2010 at 02:22:25PM -0700, Andrew Morton wrote:
>> On Mon, 10 May 2010 12:42:34 +0200 Oskar Schirmer wrote:
>> > With dma based spi transmission, data corruption
>> > is observed occasionally. With dma buffers located
>> > right next to msg and xfer fields, cache lines
>> > correctly flushed in preparation for dma usage
>> > may be polluted again when writing to fields
>> > in the same cache line.
>> >
>> > Make sure cache fields used with dma do not
>> > share cache lines with fields changed during
>> > dma handling. As both fields are part of a
>> > struct that is allocated via kzalloc, thus
>> > cache aligned, moving the fields to the 1st
>> > position and insert padding for alignment
>> > does the job.
>>
>> This sounds odd.  Doesn't it imply that some code somewhere is missing
>> some DMA synchronisation actions?
>>
>> >
>> > v2: add a comment to explain why alignment is needed
>> >
>> > v3: fix the typo in comment and layout (- to end of line)
>> >
>> > diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
>> > index 885354c..9ebb1b4 100644
>> > --- a/drivers/input/touchscreen/ad7877.c
>> > +++ b/drivers/input/touchscreen/ad7877.c
>> > @@ -153,15 +153,29 @@ enum {
>> >   */
>> >
>> >  struct ser_req {
>> > +   u16                     sample;
>> > +   /*
>> > +    * DMA (thus cache coherency maintenance) requires the
>> > +    * transfer buffers to live in their own cache lines.
>> > +    */
>> > +   char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
>>
>> It would be better to use __cacheline_aligned, rather than open-coding
>> things in this manner.
>>
>
> OK, then I have the following which I will apply unless someone shouts.
>
> --
> Dmitry
>
> Input: ad7877 - keep dma rx buffers in seperate cache lines
>
> From: Oskar Schirmer <os@emlix.com>
>
> With dma based spi transmission, data corruption is observed
> occasionally. With dma buffers located right next to msg and
> xfer fields, cache lines correctly flushed in preparation for
> dma usage may be polluted again when writing to fields in the
> same cache line.
>
> Make sure cache fields used with dma do not share cache lines
> with fields changed during dma handling. As both fields are part
> of a struct that is allocated via kzalloc, thus cache aligned,
> moving the fields to the 1st position and insert padding for
> alignment does the job.
>
> Signed-off-by: Oskar Schirmer <os@emlix.com>
> Signed-off-by: Daniel Glöckner <dg@emlix.com>
> Signed-off-by: Oliver Schneidewind <osw@emlix.com>
> Signed-off-by: Johannes Weiner <jw@emlix.com>
> Acked-by: Mike Frysinger <vapier@gentoo.org>
> [dtor@mail.ru - changed to use ___cacheline_aligned at suggestion
>  of akpm]
> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
> ---
>
>  drivers/input/touchscreen/ad7877.c |   15 ++++++++++++---
>  1 files changed, 12 insertions(+), 3 deletions(-)
>
>
> diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
> index e019d53..0d2d7e5 100644
> --- a/drivers/input/touchscreen/ad7877.c
> +++ b/drivers/input/touchscreen/ad7877.c
> @@ -156,9 +156,14 @@ struct ser_req {
>        u16                     reset;
>        u16                     ref_on;
>        u16                     command;
> -       u16                     sample;
>        struct spi_message      msg;
>        struct spi_transfer     xfer[6];
> +
> +       /*
> +        * DMA (thus cache coherency maintenance) requires the
> +        * transfer buffers to live in their own cache lines.
> +        */
> +       u16 sample ____cacheline_aligned;
>  };
>
>  struct ad7877 {
> @@ -182,8 +187,6 @@ struct ad7877 {
>        u8                      averaging;
>        u8                      pen_down_acc_interval;
>
> -       u16                     conversion_data[AD7877_NR_SENSE];
> -
>        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
>        struct spi_message      msg;
>
> @@ -195,6 +198,12 @@ struct ad7877 {
>        spinlock_t              lock;
>        struct timer_list       timer;          /* P: lock */
>        unsigned                pending:1;      /* P: lock */
> +
> +       /*
> +        * DMA (thus cache coherency maintenance) requires the
> +        * transfer buffers to live in their own cache lines.
> +        */
> +       u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
>  };

i'm not sure this is correct.  the cached_aligned attribute makes sure
it starts on a cache boundary, but it doesnt make sure it pads out to
one.  so it might work more of the time, but i dont think it's
guaranteed.
-mike
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11  6:11             ` Mike Frysinger
  (?)
  (?)
@ 2010-05-11  6:21             ` Dmitry Torokhov
  2010-05-11  6:23                 ` Mike Frysinger
  -1 siblings, 1 reply; 124+ messages in thread
From: Dmitry Torokhov @ 2010-05-11  6:21 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Andrew Morton, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Tue, May 11, 2010 at 02:11:41AM -0400, Mike Frysinger wrote:
> On Tue, May 11, 2010 at 02:05, Dmitry Torokhov wrote:
> > On Mon, May 10, 2010 at 02:22:25PM -0700, Andrew Morton wrote:
> >> On Mon, 10 May 2010 12:42:34 +0200 Oskar Schirmer wrote:
> >> > With dma based spi transmission, data corruption
> >> > is observed occasionally. With dma buffers located
> >> > right next to msg and xfer fields, cache lines
> >> > correctly flushed in preparation for dma usage
> >> > may be polluted again when writing to fields
> >> > in the same cache line.
> >> >
> >> > Make sure cache fields used with dma do not
> >> > share cache lines with fields changed during
> >> > dma handling. As both fields are part of a
> >> > struct that is allocated via kzalloc, thus
> >> > cache aligned, moving the fields to the 1st
> >> > position and insert padding for alignment
> >> > does the job.
> >>
> >> This sounds odd.  Doesn't it imply that some code somewhere is missing
> >> some DMA synchronisation actions?
> >>
> >> >
> >> > v2: add a comment to explain why alignment is needed
> >> >
> >> > v3: fix the typo in comment and layout (- to end of line)
> >> >
> >> > diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
> >> > index 885354c..9ebb1b4 100644
> >> > --- a/drivers/input/touchscreen/ad7877.c
> >> > +++ b/drivers/input/touchscreen/ad7877.c
> >> > @@ -153,15 +153,29 @@ enum {
> >> >   */
> >> >
> >> >  struct ser_req {
> >> > +   u16                     sample;
> >> > +   /*
> >> > +    * DMA (thus cache coherency maintenance) requires the
> >> > +    * transfer buffers to live in their own cache lines.
> >> > +    */
> >> > +   char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
> >>
> >> It would be better to use __cacheline_aligned, rather than open-coding
> >> things in this manner.
> >>
> >
> > OK, then I have the following which I will apply unless someone shouts.
> >
> > --
> > Dmitry
> >
> > Input: ad7877 - keep dma rx buffers in seperate cache lines
> >
> > From: Oskar Schirmer <os@emlix.com>
> >
> > With dma based spi transmission, data corruption is observed
> > occasionally. With dma buffers located right next to msg and
> > xfer fields, cache lines correctly flushed in preparation for
> > dma usage may be polluted again when writing to fields in the
> > same cache line.
> >
> > Make sure cache fields used with dma do not share cache lines
> > with fields changed during dma handling. As both fields are part
> > of a struct that is allocated via kzalloc, thus cache aligned,
> > moving the fields to the 1st position and insert padding for
> > alignment does the job.
> >
> > Signed-off-by: Oskar Schirmer <os@emlix.com>
> > Signed-off-by: Daniel Glöckner <dg@emlix.com>
> > Signed-off-by: Oliver Schneidewind <osw@emlix.com>
> > Signed-off-by: Johannes Weiner <jw@emlix.com>
> > Acked-by: Mike Frysinger <vapier@gentoo.org>
> > [dtor@mail.ru - changed to use ___cacheline_aligned at suggestion
> >  of akpm]
> > Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
> > ---
> >
> >  drivers/input/touchscreen/ad7877.c |   15 ++++++++++++---
> >  1 files changed, 12 insertions(+), 3 deletions(-)
> >
> >
> > diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
> > index e019d53..0d2d7e5 100644
> > --- a/drivers/input/touchscreen/ad7877.c
> > +++ b/drivers/input/touchscreen/ad7877.c
> > @@ -156,9 +156,14 @@ struct ser_req {
> >        u16                     reset;
> >        u16                     ref_on;
> >        u16                     command;
> > -       u16                     sample;
> >        struct spi_message      msg;
> >        struct spi_transfer     xfer[6];
> > +
> > +       /*
> > +        * DMA (thus cache coherency maintenance) requires the
> > +        * transfer buffers to live in their own cache lines.
> > +        */
> > +       u16 sample ____cacheline_aligned;
> >  };
> >
> >  struct ad7877 {
> > @@ -182,8 +187,6 @@ struct ad7877 {
> >        u8                      averaging;
> >        u8                      pen_down_acc_interval;
> >
> > -       u16                     conversion_data[AD7877_NR_SENSE];
> > -
> >        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
> >        struct spi_message      msg;
> >
> > @@ -195,6 +198,12 @@ struct ad7877 {
> >        spinlock_t              lock;
> >        struct timer_list       timer;          /* P: lock */
> >        unsigned                pending:1;      /* P: lock */
> > +
> > +       /*
> > +        * DMA (thus cache coherency maintenance) requires the
> > +        * transfer buffers to live in their own cache lines.
> > +        */
> > +       u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
> >  };
> 
> i'm not sure this is correct.  the cached_aligned attribute makes sure
> it starts on a cache boundary, but it doesnt make sure it pads out to
> one.  so it might work more of the time, but i dont think it's
> guaranteed.

The buffers are moved to the end of the structure - there is nothing
else there.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11  6:21             ` Dmitry Torokhov
@ 2010-05-11  6:23                 ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11  6:23 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Andrew Morton, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Tue, May 11, 2010 at 02:21, Dmitry Torokhov wrote:
> On Tue, May 11, 2010 at 02:11:41AM -0400, Mike Frysinger wrote:
>> On Tue, May 11, 2010 at 02:05, Dmitry Torokhov wrote:
>> > On Mon, May 10, 2010 at 02:22:25PM -0700, Andrew Morton wrote:
>> >> On Mon, 10 May 2010 12:42:34 +0200 Oskar Schirmer wrote:
>> >> > With dma based spi transmission, data corruption
>> >> > is observed occasionally. With dma buffers located
>> >> > right next to msg and xfer fields, cache lines
>> >> > correctly flushed in preparation for dma usage
>> >> > may be polluted again when writing to fields
>> >> > in the same cache line.
>> >> >
>> >> > Make sure cache fields used with dma do not
>> >> > share cache lines with fields changed during
>> >> > dma handling. As both fields are part of a
>> >> > struct that is allocated via kzalloc, thus
>> >> > cache aligned, moving the fields to the 1st
>> >> > position and insert padding for alignment
>> >> > does the job.
>> >>
>> >> This sounds odd.  Doesn't it imply that some code somewhere is missing
>> >> some DMA synchronisation actions?
>> >>
>> >> >
>> >> > v2: add a comment to explain why alignment is needed
>> >> >
>> >> > v3: fix the typo in comment and layout (- to end of line)
>> >> >
>> >> > diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
>> >> > index 885354c..9ebb1b4 100644
>> >> > --- a/drivers/input/touchscreen/ad7877.c
>> >> > +++ b/drivers/input/touchscreen/ad7877.c
>> >> > @@ -153,15 +153,29 @@ enum {
>> >> >   */
>> >> >
>> >> >  struct ser_req {
>> >> > +   u16                     sample;
>> >> > +   /*
>> >> > +    * DMA (thus cache coherency maintenance) requires the
>> >> > +    * transfer buffers to live in their own cache lines.
>> >> > +    */
>> >> > +   char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
>> >>
>> >> It would be better to use __cacheline_aligned, rather than open-coding
>> >> things in this manner.
>> >>
>> >
>> > OK, then I have the following which I will apply unless someone shouts.
>> >
>> > --
>> > Dmitry
>> >
>> > Input: ad7877 - keep dma rx buffers in seperate cache lines
>> >
>> > From: Oskar Schirmer <os@emlix.com>
>> >
>> > With dma based spi transmission, data corruption is observed
>> > occasionally. With dma buffers located right next to msg and
>> > xfer fields, cache lines correctly flushed in preparation for
>> > dma usage may be polluted again when writing to fields in the
>> > same cache line.
>> >
>> > Make sure cache fields used with dma do not share cache lines
>> > with fields changed during dma handling. As both fields are part
>> > of a struct that is allocated via kzalloc, thus cache aligned,
>> > moving the fields to the 1st position and insert padding for
>> > alignment does the job.
>> >
>> > Signed-off-by: Oskar Schirmer <os@emlix.com>
>> > Signed-off-by: Daniel Glöckner <dg@emlix.com>
>> > Signed-off-by: Oliver Schneidewind <osw@emlix.com>
>> > Signed-off-by: Johannes Weiner <jw@emlix.com>
>> > Acked-by: Mike Frysinger <vapier@gentoo.org>
>> > [dtor@mail.ru - changed to use ___cacheline_aligned at suggestion
>> >  of akpm]
>> > Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
>> > ---
>> >
>> >  drivers/input/touchscreen/ad7877.c |   15 ++++++++++++---
>> >  1 files changed, 12 insertions(+), 3 deletions(-)
>> >
>> >
>> > diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
>> > index e019d53..0d2d7e5 100644
>> > --- a/drivers/input/touchscreen/ad7877.c
>> > +++ b/drivers/input/touchscreen/ad7877.c
>> > @@ -156,9 +156,14 @@ struct ser_req {
>> >        u16                     reset;
>> >        u16                     ref_on;
>> >        u16                     command;
>> > -       u16                     sample;
>> >        struct spi_message      msg;
>> >        struct spi_transfer     xfer[6];
>> > +
>> > +       /*
>> > +        * DMA (thus cache coherency maintenance) requires the
>> > +        * transfer buffers to live in their own cache lines.
>> > +        */
>> > +       u16 sample ____cacheline_aligned;
>> >  };
>> >
>> >  struct ad7877 {
>> > @@ -182,8 +187,6 @@ struct ad7877 {
>> >        u8                      averaging;
>> >        u8                      pen_down_acc_interval;
>> >
>> > -       u16                     conversion_data[AD7877_NR_SENSE];
>> > -
>> >        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
>> >        struct spi_message      msg;
>> >
>> > @@ -195,6 +198,12 @@ struct ad7877 {
>> >        spinlock_t              lock;
>> >        struct timer_list       timer;          /* P: lock */
>> >        unsigned                pending:1;      /* P: lock */
>> > +
>> > +       /*
>> > +        * DMA (thus cache coherency maintenance) requires the
>> > +        * transfer buffers to live in their own cache lines.
>> > +        */
>> > +       u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
>> >  };
>>
>> i'm not sure this is correct.  the cached_aligned attribute makes sure
>> it starts on a cache boundary, but it doesnt make sure it pads out to
>> one.  so it might work more of the time, but i dont think it's
>> guaranteed.
>
> The buffers are moved to the end of the structure - there is nothing
> else there.

what guarantee exactly do you have for that statement ?
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-11  6:23                 ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11  6:23 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Andrew Morton, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Tue, May 11, 2010 at 02:21, Dmitry Torokhov wrote:
> On Tue, May 11, 2010 at 02:11:41AM -0400, Mike Frysinger wrote:
>> On Tue, May 11, 2010 at 02:05, Dmitry Torokhov wrote:
>> > On Mon, May 10, 2010 at 02:22:25PM -0700, Andrew Morton wrote:
>> >> On Mon, 10 May 2010 12:42:34 +0200 Oskar Schirmer wrote:
>> >> > With dma based spi transmission, data corruption
>> >> > is observed occasionally. With dma buffers located
>> >> > right next to msg and xfer fields, cache lines
>> >> > correctly flushed in preparation for dma usage
>> >> > may be polluted again when writing to fields
>> >> > in the same cache line.
>> >> >
>> >> > Make sure cache fields used with dma do not
>> >> > share cache lines with fields changed during
>> >> > dma handling. As both fields are part of a
>> >> > struct that is allocated via kzalloc, thus
>> >> > cache aligned, moving the fields to the 1st
>> >> > position and insert padding for alignment
>> >> > does the job.
>> >>
>> >> This sounds odd.  Doesn't it imply that some code somewhere is missing
>> >> some DMA synchronisation actions?
>> >>
>> >> >
>> >> > v2: add a comment to explain why alignment is needed
>> >> >
>> >> > v3: fix the typo in comment and layout (- to end of line)
>> >> >
>> >> > diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
>> >> > index 885354c..9ebb1b4 100644
>> >> > --- a/drivers/input/touchscreen/ad7877.c
>> >> > +++ b/drivers/input/touchscreen/ad7877.c
>> >> > @@ -153,15 +153,29 @@ enum {
>> >> >   */
>> >> >
>> >> >  struct ser_req {
>> >> > +   u16                     sample;
>> >> > +   /*
>> >> > +    * DMA (thus cache coherency maintenance) requires the
>> >> > +    * transfer buffers to live in their own cache lines.
>> >> > +    */
>> >> > +   char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
>> >>
>> >> It would be better to use __cacheline_aligned, rather than open-coding
>> >> things in this manner.
>> >>
>> >
>> > OK, then I have the following which I will apply unless someone shouts.
>> >
>> > --
>> > Dmitry
>> >
>> > Input: ad7877 - keep dma rx buffers in seperate cache lines
>> >
>> > From: Oskar Schirmer <os@emlix.com>
>> >
>> > With dma based spi transmission, data corruption is observed
>> > occasionally. With dma buffers located right next to msg and
>> > xfer fields, cache lines correctly flushed in preparation for
>> > dma usage may be polluted again when writing to fields in the
>> > same cache line.
>> >
>> > Make sure cache fields used with dma do not share cache lines
>> > with fields changed during dma handling. As both fields are part
>> > of a struct that is allocated via kzalloc, thus cache aligned,
>> > moving the fields to the 1st position and insert padding for
>> > alignment does the job.
>> >
>> > Signed-off-by: Oskar Schirmer <os@emlix.com>
>> > Signed-off-by: Daniel Glöckner <dg@emlix.com>
>> > Signed-off-by: Oliver Schneidewind <osw@emlix.com>
>> > Signed-off-by: Johannes Weiner <jw@emlix.com>
>> > Acked-by: Mike Frysinger <vapier@gentoo.org>
>> > [dtor@mail.ru - changed to use ___cacheline_aligned at suggestion
>> >  of akpm]
>> > Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
>> > ---
>> >
>> >  drivers/input/touchscreen/ad7877.c |   15 ++++++++++++---
>> >  1 files changed, 12 insertions(+), 3 deletions(-)
>> >
>> >
>> > diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
>> > index e019d53..0d2d7e5 100644
>> > --- a/drivers/input/touchscreen/ad7877.c
>> > +++ b/drivers/input/touchscreen/ad7877.c
>> > @@ -156,9 +156,14 @@ struct ser_req {
>> >        u16                     reset;
>> >        u16                     ref_on;
>> >        u16                     command;
>> > -       u16                     sample;
>> >        struct spi_message      msg;
>> >        struct spi_transfer     xfer[6];
>> > +
>> > +       /*
>> > +        * DMA (thus cache coherency maintenance) requires the
>> > +        * transfer buffers to live in their own cache lines.
>> > +        */
>> > +       u16 sample ____cacheline_aligned;
>> >  };
>> >
>> >  struct ad7877 {
>> > @@ -182,8 +187,6 @@ struct ad7877 {
>> >        u8                      averaging;
>> >        u8                      pen_down_acc_interval;
>> >
>> > -       u16                     conversion_data[AD7877_NR_SENSE];
>> > -
>> >        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
>> >        struct spi_message      msg;
>> >
>> > @@ -195,6 +198,12 @@ struct ad7877 {
>> >        spinlock_t              lock;
>> >        struct timer_list       timer;          /* P: lock */
>> >        unsigned                pending:1;      /* P: lock */
>> > +
>> > +       /*
>> > +        * DMA (thus cache coherency maintenance) requires the
>> > +        * transfer buffers to live in their own cache lines.
>> > +        */
>> > +       u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
>> >  };
>>
>> i'm not sure this is correct.  the cached_aligned attribute makes sure
>> it starts on a cache boundary, but it doesnt make sure it pads out to
>> one.  so it might work more of the time, but i dont think it's
>> guaranteed.
>
> The buffers are moved to the end of the structure - there is nothing
> else there.

what guarantee exactly do you have for that statement ?
-mike
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11  6:23                 ` Mike Frysinger
@ 2010-05-11  6:33                   ` Dmitry Torokhov
  -1 siblings, 0 replies; 124+ messages in thread
From: Dmitry Torokhov @ 2010-05-11  6:33 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Andrew Morton, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Tue, May 11, 2010 at 02:23:34AM -0400, Mike Frysinger wrote:
> On Tue, May 11, 2010 at 02:21, Dmitry Torokhov wrote:
> > On Tue, May 11, 2010 at 02:11:41AM -0400, Mike Frysinger wrote:
> >> On Tue, May 11, 2010 at 02:05, Dmitry Torokhov wrote:
> >> > On Mon, May 10, 2010 at 02:22:25PM -0700, Andrew Morton wrote:
> >> >> On Mon, 10 May 2010 12:42:34 +0200 Oskar Schirmer wrote:
> >> >> > With dma based spi transmission, data corruption
> >> >> > is observed occasionally. With dma buffers located
> >> >> > right next to msg and xfer fields, cache lines
> >> >> > correctly flushed in preparation for dma usage
> >> >> > may be polluted again when writing to fields
> >> >> > in the same cache line.
> >> >> >
> >> >> > Make sure cache fields used with dma do not
> >> >> > share cache lines with fields changed during
> >> >> > dma handling. As both fields are part of a
> >> >> > struct that is allocated via kzalloc, thus
> >> >> > cache aligned, moving the fields to the 1st
> >> >> > position and insert padding for alignment
> >> >> > does the job.
> >> >>
> >> >> This sounds odd.  Doesn't it imply that some code somewhere is missing
> >> >> some DMA synchronisation actions?
> >> >>
> >> >> >
> >> >> > v2: add a comment to explain why alignment is needed
> >> >> >
> >> >> > v3: fix the typo in comment and layout (- to end of line)
> >> >> >
> >> >> > diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
> >> >> > index 885354c..9ebb1b4 100644
> >> >> > --- a/drivers/input/touchscreen/ad7877.c
> >> >> > +++ b/drivers/input/touchscreen/ad7877.c
> >> >> > @@ -153,15 +153,29 @@ enum {
> >> >> >   */
> >> >> >
> >> >> >  struct ser_req {
> >> >> > +   u16                     sample;
> >> >> > +   /*
> >> >> > +    * DMA (thus cache coherency maintenance) requires the
> >> >> > +    * transfer buffers to live in their own cache lines.
> >> >> > +    */
> >> >> > +   char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
> >> >>
> >> >> It would be better to use __cacheline_aligned, rather than open-coding
> >> >> things in this manner.
> >> >>
> >> >
> >> > OK, then I have the following which I will apply unless someone shouts.
> >> >
> >> > --
> >> > Dmitry
> >> >
> >> > Input: ad7877 - keep dma rx buffers in seperate cache lines
> >> >
> >> > From: Oskar Schirmer <os@emlix.com>
> >> >
> >> > With dma based spi transmission, data corruption is observed
> >> > occasionally. With dma buffers located right next to msg and
> >> > xfer fields, cache lines correctly flushed in preparation for
> >> > dma usage may be polluted again when writing to fields in the
> >> > same cache line.
> >> >
> >> > Make sure cache fields used with dma do not share cache lines
> >> > with fields changed during dma handling. As both fields are part
> >> > of a struct that is allocated via kzalloc, thus cache aligned,
> >> > moving the fields to the 1st position and insert padding for
> >> > alignment does the job.
> >> >
> >> > Signed-off-by: Oskar Schirmer <os@emlix.com>
> >> > Signed-off-by: Daniel Glöckner <dg@emlix.com>
> >> > Signed-off-by: Oliver Schneidewind <osw@emlix.com>
> >> > Signed-off-by: Johannes Weiner <jw@emlix.com>
> >> > Acked-by: Mike Frysinger <vapier@gentoo.org>
> >> > [dtor@mail.ru - changed to use ___cacheline_aligned at suggestion
> >> >  of akpm]
> >> > Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
> >> > ---
> >> >
> >> >  drivers/input/touchscreen/ad7877.c |   15 ++++++++++++---
> >> >  1 files changed, 12 insertions(+), 3 deletions(-)
> >> >
> >> >
> >> > diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
> >> > index e019d53..0d2d7e5 100644
> >> > --- a/drivers/input/touchscreen/ad7877.c
> >> > +++ b/drivers/input/touchscreen/ad7877.c
> >> > @@ -156,9 +156,14 @@ struct ser_req {
> >> >        u16                     reset;
> >> >        u16                     ref_on;
> >> >        u16                     command;
> >> > -       u16                     sample;
> >> >        struct spi_message      msg;
> >> >        struct spi_transfer     xfer[6];
> >> > +
> >> > +       /*
> >> > +        * DMA (thus cache coherency maintenance) requires the
> >> > +        * transfer buffers to live in their own cache lines.
> >> > +        */
> >> > +       u16 sample ____cacheline_aligned;
> >> >  };
> >> >
> >> >  struct ad7877 {
> >> > @@ -182,8 +187,6 @@ struct ad7877 {
> >> >        u8                      averaging;
> >> >        u8                      pen_down_acc_interval;
> >> >
> >> > -       u16                     conversion_data[AD7877_NR_SENSE];
> >> > -
> >> >        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
> >> >        struct spi_message      msg;
> >> >
> >> > @@ -195,6 +198,12 @@ struct ad7877 {
> >> >        spinlock_t              lock;
> >> >        struct timer_list       timer;          /* P: lock */
> >> >        unsigned                pending:1;      /* P: lock */
> >> > +
> >> > +       /*
> >> > +        * DMA (thus cache coherency maintenance) requires the
> >> > +        * transfer buffers to live in their own cache lines.
> >> > +        */
> >> > +       u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
> >> >  };
> >>
> >> i'm not sure this is correct.  the cached_aligned attribute makes sure
> >> it starts on a cache boundary, but it doesnt make sure it pads out to
> >> one.  so it might work more of the time, but i dont think it's
> >> guaranteed.
> >
> > The buffers are moved to the end of the structure - there is nothing
> > else there.
> 
> what guarantee exactly do you have for that statement ?

The data is kmalloced, kmalloc aligns on cacheline boundary AFAIK which
means that next kmalloc data chunk will not share "our" cacheline.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-11  6:33                   ` Dmitry Torokhov
  0 siblings, 0 replies; 124+ messages in thread
From: Dmitry Torokhov @ 2010-05-11  6:33 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Andrew Morton, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Tue, May 11, 2010 at 02:23:34AM -0400, Mike Frysinger wrote:
> On Tue, May 11, 2010 at 02:21, Dmitry Torokhov wrote:
> > On Tue, May 11, 2010 at 02:11:41AM -0400, Mike Frysinger wrote:
> >> On Tue, May 11, 2010 at 02:05, Dmitry Torokhov wrote:
> >> > On Mon, May 10, 2010 at 02:22:25PM -0700, Andrew Morton wrote:
> >> >> On Mon, 10 May 2010 12:42:34 +0200 Oskar Schirmer wrote:
> >> >> > With dma based spi transmission, data corruption
> >> >> > is observed occasionally. With dma buffers located
> >> >> > right next to msg and xfer fields, cache lines
> >> >> > correctly flushed in preparation for dma usage
> >> >> > may be polluted again when writing to fields
> >> >> > in the same cache line.
> >> >> >
> >> >> > Make sure cache fields used with dma do not
> >> >> > share cache lines with fields changed during
> >> >> > dma handling. As both fields are part of a
> >> >> > struct that is allocated via kzalloc, thus
> >> >> > cache aligned, moving the fields to the 1st
> >> >> > position and insert padding for alignment
> >> >> > does the job.
> >> >>
> >> >> This sounds odd.  Doesn't it imply that some code somewhere is missing
> >> >> some DMA synchronisation actions?
> >> >>
> >> >> >
> >> >> > v2: add a comment to explain why alignment is needed
> >> >> >
> >> >> > v3: fix the typo in comment and layout (- to end of line)
> >> >> >
> >> >> > diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
> >> >> > index 885354c..9ebb1b4 100644
> >> >> > --- a/drivers/input/touchscreen/ad7877.c
> >> >> > +++ b/drivers/input/touchscreen/ad7877.c
> >> >> > @@ -153,15 +153,29 @@ enum {
> >> >> >   */
> >> >> >
> >> >> >  struct ser_req {
> >> >> > +   u16                     sample;
> >> >> > +   /*
> >> >> > +    * DMA (thus cache coherency maintenance) requires the
> >> >> > +    * transfer buffers to live in their own cache lines.
> >> >> > +    */
> >> >> > +   char                    __padalign[L1_CACHE_BYTES - sizeof(u16)];
> >> >>
> >> >> It would be better to use __cacheline_aligned, rather than open-coding
> >> >> things in this manner.
> >> >>
> >> >
> >> > OK, then I have the following which I will apply unless someone shouts.
> >> >
> >> > --
> >> > Dmitry
> >> >
> >> > Input: ad7877 - keep dma rx buffers in seperate cache lines
> >> >
> >> > From: Oskar Schirmer <os@emlix.com>
> >> >
> >> > With dma based spi transmission, data corruption is observed
> >> > occasionally. With dma buffers located right next to msg and
> >> > xfer fields, cache lines correctly flushed in preparation for
> >> > dma usage may be polluted again when writing to fields in the
> >> > same cache line.
> >> >
> >> > Make sure cache fields used with dma do not share cache lines
> >> > with fields changed during dma handling. As both fields are part
> >> > of a struct that is allocated via kzalloc, thus cache aligned,
> >> > moving the fields to the 1st position and insert padding for
> >> > alignment does the job.
> >> >
> >> > Signed-off-by: Oskar Schirmer <os@emlix.com>
> >> > Signed-off-by: Daniel Glöckner <dg@emlix.com>
> >> > Signed-off-by: Oliver Schneidewind <osw@emlix.com>
> >> > Signed-off-by: Johannes Weiner <jw@emlix.com>
> >> > Acked-by: Mike Frysinger <vapier@gentoo.org>
> >> > [dtor@mail.ru - changed to use ___cacheline_aligned at suggestion
> >> >  of akpm]
> >> > Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
> >> > ---
> >> >
> >> >  drivers/input/touchscreen/ad7877.c |   15 ++++++++++++---
> >> >  1 files changed, 12 insertions(+), 3 deletions(-)
> >> >
> >> >
> >> > diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
> >> > index e019d53..0d2d7e5 100644
> >> > --- a/drivers/input/touchscreen/ad7877.c
> >> > +++ b/drivers/input/touchscreen/ad7877.c
> >> > @@ -156,9 +156,14 @@ struct ser_req {
> >> >        u16                     reset;
> >> >        u16                     ref_on;
> >> >        u16                     command;
> >> > -       u16                     sample;
> >> >        struct spi_message      msg;
> >> >        struct spi_transfer     xfer[6];
> >> > +
> >> > +       /*
> >> > +        * DMA (thus cache coherency maintenance) requires the
> >> > +        * transfer buffers to live in their own cache lines.
> >> > +        */
> >> > +       u16 sample ____cacheline_aligned;
> >> >  };
> >> >
> >> >  struct ad7877 {
> >> > @@ -182,8 +187,6 @@ struct ad7877 {
> >> >        u8                      averaging;
> >> >        u8                      pen_down_acc_interval;
> >> >
> >> > -       u16                     conversion_data[AD7877_NR_SENSE];
> >> > -
> >> >        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
> >> >        struct spi_message      msg;
> >> >
> >> > @@ -195,6 +198,12 @@ struct ad7877 {
> >> >        spinlock_t              lock;
> >> >        struct timer_list       timer;          /* P: lock */
> >> >        unsigned                pending:1;      /* P: lock */
> >> > +
> >> > +       /*
> >> > +        * DMA (thus cache coherency maintenance) requires the
> >> > +        * transfer buffers to live in their own cache lines.
> >> > +        */
> >> > +       u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
> >> >  };
> >>
> >> i'm not sure this is correct.  the cached_aligned attribute makes sure
> >> it starts on a cache boundary, but it doesnt make sure it pads out to
> >> one.  so it might work more of the time, but i dont think it's
> >> guaranteed.
> >
> > The buffers are moved to the end of the structure - there is nothing
> > else there.
> 
> what guarantee exactly do you have for that statement ?

The data is kmalloced, kmalloc aligns on cacheline boundary AFAIK which
means that next kmalloc data chunk will not share "our" cacheline.

-- 
Dmitry
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11  6:33                   ` Dmitry Torokhov
  (?)
@ 2010-05-11  6:37                   ` Mike Frysinger
  -1 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11  6:37 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Andrew Morton, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner

On Tue, May 11, 2010 at 02:33, Dmitry Torokhov wrote:
> On Tue, May 11, 2010 at 02:23:34AM -0400, Mike Frysinger wrote:
>> On Tue, May 11, 2010 at 02:21, Dmitry Torokhov wrote:
>> > On Tue, May 11, 2010 at 02:11:41AM -0400, Mike Frysinger wrote:
>> >> On Tue, May 11, 2010 at 02:05, Dmitry Torokhov wrote:
>> >> > Dmitry
>> >> >
>> >> > Input: ad7877 - keep dma rx buffers in seperate cache lines
>> >> >
>> >> > From: Oskar Schirmer <os@emlix.com>
>> >> >
>> >> > With dma based spi transmission, data corruption is observed
>> >> > occasionally. With dma buffers located right next to msg and
>> >> > xfer fields, cache lines correctly flushed in preparation for
>> >> > dma usage may be polluted again when writing to fields in the
>> >> > same cache line.
>> >> >
>> >> > Make sure cache fields used with dma do not share cache lines
>> >> > with fields changed during dma handling. As both fields are part
>> >> > of a struct that is allocated via kzalloc, thus cache aligned,
>> >> > moving the fields to the 1st position and insert padding for
>> >> > alignment does the job.
>> >> >
>> >> > Signed-off-by: Oskar Schirmer <os@emlix.com>
>> >> > Signed-off-by: Daniel Glöckner <dg@emlix.com>
>> >> > Signed-off-by: Oliver Schneidewind <osw@emlix.com>
>> >> > Signed-off-by: Johannes Weiner <jw@emlix.com>
>> >> > Acked-by: Mike Frysinger <vapier@gentoo.org>
>> >> > [dtor@mail.ru - changed to use ___cacheline_aligned at suggestion
>> >> >  of akpm]
>> >> > Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
>> >> > ---
>> >> >
>> >> >  drivers/input/touchscreen/ad7877.c |   15 ++++++++++++---
>> >> >  1 files changed, 12 insertions(+), 3 deletions(-)
>> >> >
>> >> >
>> >> > diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
>> >> > index e019d53..0d2d7e5 100644
>> >> > --- a/drivers/input/touchscreen/ad7877.c
>> >> > +++ b/drivers/input/touchscreen/ad7877.c
>> >> > @@ -156,9 +156,14 @@ struct ser_req {
>> >> >        u16                     reset;
>> >> >        u16                     ref_on;
>> >> >        u16                     command;
>> >> > -       u16                     sample;
>> >> >        struct spi_message      msg;
>> >> >        struct spi_transfer     xfer[6];
>> >> > +
>> >> > +       /*
>> >> > +        * DMA (thus cache coherency maintenance) requires the
>> >> > +        * transfer buffers to live in their own cache lines.
>> >> > +        */
>> >> > +       u16 sample ____cacheline_aligned;
>> >> >  };
>> >> >
>> >> >  struct ad7877 {
>> >> > @@ -182,8 +187,6 @@ struct ad7877 {
>> >> >        u8                      averaging;
>> >> >        u8                      pen_down_acc_interval;
>> >> >
>> >> > -       u16                     conversion_data[AD7877_NR_SENSE];
>> >> > -
>> >> >        struct spi_transfer     xfer[AD7877_NR_SENSE + 2];
>> >> >        struct spi_message      msg;
>> >> >
>> >> > @@ -195,6 +198,12 @@ struct ad7877 {
>> >> >        spinlock_t              lock;
>> >> >        struct timer_list       timer;          /* P: lock */
>> >> >        unsigned                pending:1;      /* P: lock */
>> >> > +
>> >> > +       /*
>> >> > +        * DMA (thus cache coherency maintenance) requires the
>> >> > +        * transfer buffers to live in their own cache lines.
>> >> > +        */
>> >> > +       u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
>> >> >  };
>> >>
>> >> i'm not sure this is correct.  the cached_aligned attribute makes sure
>> >> it starts on a cache boundary, but it doesnt make sure it pads out to
>> >> one.  so it might work more of the time, but i dont think it's
>> >> guaranteed.
>> >
>> > The buffers are moved to the end of the structure - there is nothing
>> > else there.
>>
>> what guarantee exactly do you have for that statement ?
>
> The data is kmalloced, kmalloc aligns on cacheline boundary AFAIK which
> means that next kmalloc data chunk will not share "our" cacheline.

so obvious once you say it out loud.  as long as kmalloc() guarantees
it, your patch sounds fine to me.  once Oskar double checks, ship it!

thanks :)
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11  6:33                   ` Dmitry Torokhov
  (?)
  (?)
@ 2010-05-11  6:42                   ` Pekka Enberg
  2010-05-11 13:57                     ` Christoph Lameter
                                       ` (2 more replies)
  -1 siblings, 3 replies; 124+ messages in thread
From: Pekka Enberg @ 2010-05-11  6:42 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Mike Frysinger, Andrew Morton, Oskar Schirmer, Michael Hennerich,
	linux-input, linux-kernel, Daniel Glöckner,
	Oliver Schneidewind, Johannes Weiner, Christoph Lameter,
	Nick Piggin, David Rientjes, Matt Mackall

Hi Dmitry,

On Tue, May 11, 2010 at 9:33 AM, Dmitry Torokhov
<dmitry.torokhov@gmail.com> wrote:
>> what guarantee exactly do you have for that statement ?
>
> The data is kmalloced, kmalloc aligns on cacheline boundary AFAIK which
> means that next kmalloc data chunk will not share "our" cacheline.

No, there are no such guarantees. kmalloc() aligns on
ARCH_KMALLOC_MINALIGN or ARCH_SLAB_MINALIGN depending on which is
bigger but beyond that, there are no guarantees. You can, of course,
use kmem_cache_create() with SLAB_HWCACHE_ALIGN to align on cacheline
boundary.

                        Pekka

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11  6:42                   ` Pekka Enberg
@ 2010-05-11 13:57                     ` Christoph Lameter
  2010-05-11 16:52                     ` Dmitry Torokhov
  2010-05-11 20:03                     ` Mike Frysinger
  2 siblings, 0 replies; 124+ messages in thread
From: Christoph Lameter @ 2010-05-11 13:57 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Dmitry Torokhov, Mike Frysinger, Andrew Morton, Oskar Schirmer,
	Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Nick Piggin, David Rientjes, Matt Mackall

On Tue, 11 May 2010, Pekka Enberg wrote:

> > The data is kmalloced, kmalloc aligns on cacheline boundary AFAIK which
> > means that next kmalloc data chunk will not share "our" cacheline.
>
> No, there are no such guarantees. kmalloc() aligns on
> ARCH_KMALLOC_MINALIGN or ARCH_SLAB_MINALIGN depending on which is
> bigger but beyond that, there are no guarantees. You can, of course,
> use kmem_cache_create() with SLAB_HWCACHE_ALIGN to align on cacheline
> boundary.

Note the difference between kmalloc aligment and the alignment of manually
created slabs.

Kmalloc data is often aligned on cacheline boundary due to the allocator
methods of placing data in pages. But there is no guarantee that this will
always be the case. In particular if slab debugging is on then the
alignments become different.


^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11  6:05           ` Dmitry Torokhov
@ 2010-05-11 14:12             ` Oskar Schirmer
  -1 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-11 14:12 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Andrew Morton, Oskar Schirmer, Michael Hennerich, Mike Frysinger,
	linux-input, linux-kernel, Daniel Glöckner,
	Oliver Schneidewind, Johannes Weiner

On Mon, May 10, 2010 at 23:05:47 -0700, Dmitry Torokhov wrote:
> OK, then I have the following which I will apply unless someone shouts.

Looks good, we tested it on G45, works fine.

  Oskar


> -- 
> Dmitry
> 
> Input: ad7877 - keep dma rx buffers in seperate cache lines
> 
> From: Oskar Schirmer <os@emlix.com>
> 
> With dma based spi transmission, data corruption is observed
> occasionally. With dma buffers located right next to msg and
> xfer fields, cache lines correctly flushed in preparation for
> dma usage may be polluted again when writing to fields in the
> same cache line.
> 
> Make sure cache fields used with dma do not share cache lines
> with fields changed during dma handling. As both fields are part
> of a struct that is allocated via kzalloc, thus cache aligned,
> moving the fields to the 1st position and insert padding for
> alignment does the job.
> 
> Signed-off-by: Oskar Schirmer <os@emlix.com>
> Signed-off-by: Daniel Glöckner <dg@emlix.com>
> Signed-off-by: Oliver Schneidewind <osw@emlix.com>
> Signed-off-by: Johannes Weiner <jw@emlix.com>
> Acked-by: Mike Frysinger <vapier@gentoo.org>
> [dtor@mail.ru - changed to use ___cacheline_aligned at suggestion
>  of akpm]
> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
> ---
> 
>  drivers/input/touchscreen/ad7877.c |   15 ++++++++++++---
>  1 files changed, 12 insertions(+), 3 deletions(-)
> 
> 
> diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
> index e019d53..0d2d7e5 100644
> --- a/drivers/input/touchscreen/ad7877.c
> +++ b/drivers/input/touchscreen/ad7877.c
> @@ -156,9 +156,14 @@ struct ser_req {
>  	u16			reset;
>  	u16			ref_on;
>  	u16			command;
> -	u16			sample;
>  	struct spi_message	msg;
>  	struct spi_transfer	xfer[6];
> +
> +	/*
> +	 * DMA (thus cache coherency maintenance) requires the
> +	 * transfer buffers to live in their own cache lines.
> +	 */
> +	u16 sample ____cacheline_aligned;
>  };
>  
>  struct ad7877 {
> @@ -182,8 +187,6 @@ struct ad7877 {
>  	u8			averaging;
>  	u8			pen_down_acc_interval;
>  
> -	u16			conversion_data[AD7877_NR_SENSE];
> -
>  	struct spi_transfer	xfer[AD7877_NR_SENSE + 2];
>  	struct spi_message	msg;
>  
> @@ -195,6 +198,12 @@ struct ad7877 {
>  	spinlock_t		lock;
>  	struct timer_list	timer;		/* P: lock */
>  	unsigned		pending:1;	/* P: lock */
> +
> +	/*
> +	 * DMA (thus cache coherency maintenance) requires the
> +	 * transfer buffers to live in their own cache lines.
> +	 */
> +	u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
>  };
>  
>  static int gpio3;
> 
> 

-- 
oskar schirmer, emlix gmbh, http://www.emlix.com
fon +49 551 30664-0, fax -11, bahnhofsallee 1b, 37081 göttingen, germany
sitz der gesellschaft: göttingen, amtsgericht göttingen hr b 3160
geschäftsführer: dr. uwe kracke, ust-idnr.: de 205 198 055

emlix - your embedded linux partner

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-11 14:12             ` Oskar Schirmer
  0 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-11 14:12 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Andrew Morton, Oskar Schirmer, Michael Hennerich, Mike Frysinger,
	linux-input, linux-kernel, Daniel Glöckner,
	Oliver Schneidewind, Johannes Weiner

On Mon, May 10, 2010 at 23:05:47 -0700, Dmitry Torokhov wrote:
> OK, then I have the following which I will apply unless someone shouts.

Looks good, we tested it on G45, works fine.

  Oskar


> -- 
> Dmitry
> 
> Input: ad7877 - keep dma rx buffers in seperate cache lines
> 
> From: Oskar Schirmer <os@emlix.com>
> 
> With dma based spi transmission, data corruption is observed
> occasionally. With dma buffers located right next to msg and
> xfer fields, cache lines correctly flushed in preparation for
> dma usage may be polluted again when writing to fields in the
> same cache line.
> 
> Make sure cache fields used with dma do not share cache lines
> with fields changed during dma handling. As both fields are part
> of a struct that is allocated via kzalloc, thus cache aligned,
> moving the fields to the 1st position and insert padding for
> alignment does the job.
> 
> Signed-off-by: Oskar Schirmer <os@emlix.com>
> Signed-off-by: Daniel Glöckner <dg@emlix.com>
> Signed-off-by: Oliver Schneidewind <osw@emlix.com>
> Signed-off-by: Johannes Weiner <jw@emlix.com>
> Acked-by: Mike Frysinger <vapier@gentoo.org>
> [dtor@mail.ru - changed to use ___cacheline_aligned at suggestion
>  of akpm]
> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
> ---
> 
>  drivers/input/touchscreen/ad7877.c |   15 ++++++++++++---
>  1 files changed, 12 insertions(+), 3 deletions(-)
> 
> 
> diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
> index e019d53..0d2d7e5 100644
> --- a/drivers/input/touchscreen/ad7877.c
> +++ b/drivers/input/touchscreen/ad7877.c
> @@ -156,9 +156,14 @@ struct ser_req {
>  	u16			reset;
>  	u16			ref_on;
>  	u16			command;
> -	u16			sample;
>  	struct spi_message	msg;
>  	struct spi_transfer	xfer[6];
> +
> +	/*
> +	 * DMA (thus cache coherency maintenance) requires the
> +	 * transfer buffers to live in their own cache lines.
> +	 */
> +	u16 sample ____cacheline_aligned;
>  };
>  
>  struct ad7877 {
> @@ -182,8 +187,6 @@ struct ad7877 {
>  	u8			averaging;
>  	u8			pen_down_acc_interval;
>  
> -	u16			conversion_data[AD7877_NR_SENSE];
> -
>  	struct spi_transfer	xfer[AD7877_NR_SENSE + 2];
>  	struct spi_message	msg;
>  
> @@ -195,6 +198,12 @@ struct ad7877 {
>  	spinlock_t		lock;
>  	struct timer_list	timer;		/* P: lock */
>  	unsigned		pending:1;	/* P: lock */
> +
> +	/*
> +	 * DMA (thus cache coherency maintenance) requires the
> +	 * transfer buffers to live in their own cache lines.
> +	 */
> +	u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
>  };
>  
>  static int gpio3;
> 
> 

-- 
oskar schirmer, emlix gmbh, http://www.emlix.com
fon +49 551 30664-0, fax -11, bahnhofsallee 1b, 37081 göttingen, germany
sitz der gesellschaft: göttingen, amtsgericht göttingen hr b 3160
geschäftsführer: dr. uwe kracke, ust-idnr.: de 205 198 055

emlix - your embedded linux partner
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11  6:42                   ` Pekka Enberg
  2010-05-11 13:57                     ` Christoph Lameter
@ 2010-05-11 16:52                     ` Dmitry Torokhov
  2010-05-11 17:31                       ` Mike Frysinger
  2010-05-11 18:59                       ` Pekka Enberg
  2010-05-11 20:03                     ` Mike Frysinger
  2 siblings, 2 replies; 124+ messages in thread
From: Dmitry Torokhov @ 2010-05-11 16:52 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Mike Frysinger, Andrew Morton, Oskar Schirmer, Michael Hennerich,
	linux-input, linux-kernel, Daniel Glöckner,
	Oliver Schneidewind, Johannes Weiner, Christoph Lameter,
	Nick Piggin, David Rientjes, Matt Mackall

On Tue, May 11, 2010 at 09:42:03AM +0300, Pekka Enberg wrote:
> Hi Dmitry,
> 
> On Tue, May 11, 2010 at 9:33 AM, Dmitry Torokhov
> <dmitry.torokhov@gmail.com> wrote:
> >> what guarantee exactly do you have for that statement ?
> >
> > The data is kmalloced, kmalloc aligns on cacheline boundary AFAIK which
> > means that next kmalloc data chunk will not share "our" cacheline.
> 
> No, there are no such guarantees. kmalloc() aligns on
> ARCH_KMALLOC_MINALIGN or ARCH_SLAB_MINALIGN depending on which is
> bigger but beyond that, there are no guarantees. You can, of course,
> use kmem_cache_create() with SLAB_HWCACHE_ALIGN to align on cacheline
> boundary.
> 

The architectures that we are trying to deal with here should be forcing
kmalloc to the cache boundary already though - otherwise they would not
be able to used kmalloced memory for DMA buffers at all. Or am I utterly
lost here?

-- 
Dmitry

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 16:52                     ` Dmitry Torokhov
@ 2010-05-11 17:31                       ` Mike Frysinger
  2010-05-11 18:59                       ` Pekka Enberg
  1 sibling, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11 17:31 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Pekka Enberg, Andrew Morton, Oskar Schirmer, Michael Hennerich,
	linux-input, linux-kernel, Daniel Glöckner,
	Oliver Schneidewind, Johannes Weiner, Christoph Lameter,
	Nick Piggin, David Rientjes, Matt Mackall

On Tue, May 11, 2010 at 12:52, Dmitry Torokhov wrote:
> On Tue, May 11, 2010 at 09:42:03AM +0300, Pekka Enberg wrote:
>> On Tue, May 11, 2010 at 9:33 AM, Dmitry Torokhov wrote:
>> >> what guarantee exactly do you have for that statement ?
>> >
>> > The data is kmalloced, kmalloc aligns on cacheline boundary AFAIK which
>> > means that next kmalloc data chunk will not share "our" cacheline.
>>
>> No, there are no such guarantees. kmalloc() aligns on
>> ARCH_KMALLOC_MINALIGN or ARCH_SLAB_MINALIGN depending on which is
>> bigger but beyond that, there are no guarantees. You can, of course,
>> use kmem_cache_create() with SLAB_HWCACHE_ALIGN to align on cacheline
>> boundary.
>
> The architectures that we are trying to deal with here should be forcing
> kmalloc to the cache boundary already though - otherwise they would not
> be able to used kmalloced memory for DMA buffers at all. Or am I utterly
> lost here?

i dont want to restrict the driver so that it implicitly only works on
certain arches.  that kind of goes against the entire point of the
driver model.  the only thing this driver needs is a SPI bus, and
people can easily do that on most every arch.
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 16:52                     ` Dmitry Torokhov
  2010-05-11 17:31                       ` Mike Frysinger
@ 2010-05-11 18:59                       ` Pekka Enberg
  1 sibling, 0 replies; 124+ messages in thread
From: Pekka Enberg @ 2010-05-11 18:59 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Mike Frysinger, Andrew Morton, Oskar Schirmer, Michael Hennerich,
	linux-input, linux-kernel, Daniel Glöckner,
	Oliver Schneidewind, Johannes Weiner, Christoph Lameter,
	Nick Piggin, David Rientjes, Matt Mackall

Dmitry Torokhov wrote:
> The architectures that we are trying to deal with here should be forcing
> kmalloc to the cache boundary already though - otherwise they would not
> be able to used kmalloced memory for DMA buffers at all. Or am I utterly
> lost here?

I don't know which architecture you're talking about but it's not true 
in general and you probably don't want to depend on it even if it 
happens to work now.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11  6:42                   ` Pekka Enberg
  2010-05-11 13:57                     ` Christoph Lameter
  2010-05-11 16:52                     ` Dmitry Torokhov
@ 2010-05-11 20:03                     ` Mike Frysinger
  2010-05-11 20:07                       ` Matt Mackall
  2 siblings, 1 reply; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11 20:03 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Dmitry Torokhov, Andrew Morton, Oskar Schirmer,
	Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Christoph Lameter, Nick Piggin, David Rientjes, Matt Mackall

On Tue, May 11, 2010 at 02:42, Pekka Enberg wrote:
> On Tue, May 11, 2010 at 9:33 AM, Dmitry Torokhov wrote:
>>> what guarantee exactly do you have for that statement ?
>>
>> The data is kmalloced, kmalloc aligns on cacheline boundary AFAIK which
>> means that next kmalloc data chunk will not share "our" cacheline.
>
> No, there are no such guarantees. kmalloc() aligns on
> ARCH_KMALLOC_MINALIGN or ARCH_SLAB_MINALIGN depending on which is
> bigger but beyond that, there are no guarantees. You can, of course,
> use kmem_cache_create() with SLAB_HWCACHE_ALIGN to align on cacheline
> boundary.

so how is this to be addressed in general ?  this is a problem for any
device that does SPI transactions, and having every driver create its
own kmem cache isnt the answer.

do people need to kmalloc() like 2x the desired size and manually
align it themselves ?  declaring alignments on struct members doesnt
matter if the base of the struct isnt aligned.  seems like we need a
new GFP flag that says we need a cache aligned pointer so we can give
that to kmalloc() and such.
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 20:03                     ` Mike Frysinger
@ 2010-05-11 20:07                       ` Matt Mackall
  2010-05-11 20:10                           ` Mike Frysinger
  0 siblings, 1 reply; 124+ messages in thread
From: Matt Mackall @ 2010-05-11 20:07 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Pekka Enberg, Dmitry Torokhov, Andrew Morton, Oskar Schirmer,
	Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Christoph Lameter, Nick Piggin, David Rientjes

On Tue, 2010-05-11 at 16:03 -0400, Mike Frysinger wrote:
> On Tue, May 11, 2010 at 02:42, Pekka Enberg wrote:
> > On Tue, May 11, 2010 at 9:33 AM, Dmitry Torokhov wrote:
> >>> what guarantee exactly do you have for that statement ?
> >>
> >> The data is kmalloced, kmalloc aligns on cacheline boundary AFAIK which
> >> means that next kmalloc data chunk will not share "our" cacheline.
> >
> > No, there are no such guarantees. kmalloc() aligns on
> > ARCH_KMALLOC_MINALIGN or ARCH_SLAB_MINALIGN depending on which is
> > bigger but beyond that, there are no guarantees. You can, of course,
> > use kmem_cache_create() with SLAB_HWCACHE_ALIGN to align on cacheline
> > boundary.
> 
> so how is this to be addressed in general ?  this is a problem for any
> device that does SPI transactions, and having every driver create its
> own kmem cache isnt the answer.
> 
> do people need to kmalloc() like 2x the desired size and manually
> align it themselves ?  declaring alignments on struct members doesnt
> matter if the base of the struct isnt aligned.  seems like we need a
> new GFP flag that says we need a cache aligned pointer so we can give
> that to kmalloc() and such.

Make your own slab cache with the alignment flag set, as Pekka already
mentioned.

-- 
Mathematics is the supreme nostalgia of our time.



^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 20:07                       ` Matt Mackall
@ 2010-05-11 20:10                           ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11 20:10 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Pekka Enberg, Dmitry Torokhov, Andrew Morton, Oskar Schirmer,
	Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Christoph Lameter, Nick Piggin, David Rientjes

On Tue, May 11, 2010 at 16:07, Matt Mackall wrote:
> On Tue, 2010-05-11 at 16:03 -0400, Mike Frysinger wrote:
>> On Tue, May 11, 2010 at 02:42, Pekka Enberg wrote:
>> > On Tue, May 11, 2010 at 9:33 AM, Dmitry Torokhov wrote:
>> >>> what guarantee exactly do you have for that statement ?
>> >>
>> >> The data is kmalloced, kmalloc aligns on cacheline boundary AFAIK which
>> >> means that next kmalloc data chunk will not share "our" cacheline.
>> >
>> > No, there are no such guarantees. kmalloc() aligns on
>> > ARCH_KMALLOC_MINALIGN or ARCH_SLAB_MINALIGN depending on which is
>> > bigger but beyond that, there are no guarantees. You can, of course,
>> > use kmem_cache_create() with SLAB_HWCACHE_ALIGN to align on cacheline
>> > boundary.
>>
>> so how is this to be addressed in general ?  this is a problem for any
>> device that does SPI transactions, and having every driver create its
>> own kmem cache isnt the answer.
>>
>> do people need to kmalloc() like 2x the desired size and manually
>> align it themselves ?  declaring alignments on struct members doesnt
>> matter if the base of the struct isnt aligned.  seems like we need a
>> new GFP flag that says we need a cache aligned pointer so we can give
>> that to kmalloc() and such.
>
> Make your own slab cache with the alignment flag set, as Pekka already
> mentioned.

and like i said, that doesnt sound like a reasonable solution when
every single SPI driver (over 100 atm) out there is affected
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-11 20:10                           ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11 20:10 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Pekka Enberg, Dmitry Torokhov, Andrew Morton, Oskar Schirmer,
	Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Christoph Lameter, Nick Piggin, David Rientjes

On Tue, May 11, 2010 at 16:07, Matt Mackall wrote:
> On Tue, 2010-05-11 at 16:03 -0400, Mike Frysinger wrote:
>> On Tue, May 11, 2010 at 02:42, Pekka Enberg wrote:
>> > On Tue, May 11, 2010 at 9:33 AM, Dmitry Torokhov wrote:
>> >>> what guarantee exactly do you have for that statement ?
>> >>
>> >> The data is kmalloced, kmalloc aligns on cacheline boundary AFAIK which
>> >> means that next kmalloc data chunk will not share "our" cacheline.
>> >
>> > No, there are no such guarantees. kmalloc() aligns on
>> > ARCH_KMALLOC_MINALIGN or ARCH_SLAB_MINALIGN depending on which is
>> > bigger but beyond that, there are no guarantees. You can, of course,
>> > use kmem_cache_create() with SLAB_HWCACHE_ALIGN to align on cacheline
>> > boundary.
>>
>> so how is this to be addressed in general ?  this is a problem for any
>> device that does SPI transactions, and having every driver create its
>> own kmem cache isnt the answer.
>>
>> do people need to kmalloc() like 2x the desired size and manually
>> align it themselves ?  declaring alignments on struct members doesnt
>> matter if the base of the struct isnt aligned.  seems like we need a
>> new GFP flag that says we need a cache aligned pointer so we can give
>> that to kmalloc() and such.
>
> Make your own slab cache with the alignment flag set, as Pekka already
> mentioned.

and like i said, that doesnt sound like a reasonable solution when
every single SPI driver (over 100 atm) out there is affected
-mike
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 20:10                           ` Mike Frysinger
  (?)
@ 2010-05-11 20:21                           ` Christoph Lameter
  2010-05-11 20:34                             ` Mike Frysinger
  -1 siblings, 1 reply; 124+ messages in thread
From: Christoph Lameter @ 2010-05-11 20:21 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Matt Mackall, Pekka Enberg, Dmitry Torokhov, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Nick Piggin, David Rientjes

On Tue, 11 May 2010, Mike Frysinger wrote:

> and like i said, that doesnt sound like a reasonable solution when
> every single SPI driver (over 100 atm) out there is affected

If you are on an embedded arch then you can of course set the default
kmalloc alignment.

The requirement for kmalloc is that it returns memory that can be used for
DMA. If the arch can only DMA into cacheline aligned objects then the
correct method is to force kmalloc alignment to cacheline size.


^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 20:10                           ` Mike Frysinger
  (?)
  (?)
@ 2010-05-11 20:22                           ` Pekka Enberg
  -1 siblings, 0 replies; 124+ messages in thread
From: Pekka Enberg @ 2010-05-11 20:22 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Matt Mackall, Dmitry Torokhov, Andrew Morton, Oskar Schirmer,
	Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Christoph Lameter, Nick Piggin, David Rientjes

On Tue, May 11, 2010 at 11:10 PM, Mike Frysinger <vapier.adi@gmail.com> wrote:
>> Make your own slab cache with the alignment flag set, as Pekka already
>> mentioned.
>
> and like i said, that doesnt sound like a reasonable solution when
> every single SPI driver (over 100 atm) out there is affected

Why do you need to do it in every driver? Why not create a cache of
size cache_line_size()*2 or something in the SPI stack and add a
helper function spi_alloc_whatever_this_is_about() for the drivers to
use?

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 20:21                           ` Christoph Lameter
@ 2010-05-11 20:34                             ` Mike Frysinger
  2010-05-11 20:38                               ` Pekka Enberg
                                                 ` (2 more replies)
  0 siblings, 3 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11 20:34 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg
  Cc: Matt Mackall, Dmitry Torokhov, Andrew Morton, Oskar Schirmer,
	Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Nick Piggin, David Rientjes, David Brownell, Grant Likely

On Tue, May 11, 2010 at 16:21, Christoph Lameter wrote:
> On Tue, 11 May 2010, Mike Frysinger wrote:
>> and like i said, that doesnt sound like a reasonable solution when
>> every single SPI driver (over 100 atm) out there is affected
>
> If you are on an embedded arch then you can of course set the default
> kmalloc alignment.
>
> The requirement for kmalloc is that it returns memory that can be used for
> DMA. If the arch can only DMA into cacheline aligned objects then the
> correct method is to force kmalloc alignment to cacheline size.

these are SPI drivers and are usable on any arch that supports a SPI
bus (which is pretty much every arch).  forget about "embedded"
arches.

the issue here is simple: a SPI driver (AD7877) needs to do a receive
SPI transfer into a DMA safe buffer.  what is the exact API to
dynamically allocate memory for the structure with this buffer
embedded in it such that the start of the structure is cached aligned
?  creating a dedicated kmem cache may work, but it isnt a scalable
solution if every SPI driver needs to create its own cache.

On Tue, May 11, 2010 at 16:22, Pekka Enberg wrote:
> Why do you need to do it in every driver? Why not create a cache of
> size cache_line_size()*2 or something in the SPI stack and add a
> helper function spi_alloc_whatever_this_is_about() for the drivers to
> use?

that is a question for David/Grant.  i'm not the SPI core maintainer,
i'm merely watching over some SPI drivers.  however, this answer also
doesnt sound like it's thinking big enough because what you're
proposing isnt specific to the SPI bus -- any time a DMA safe buffer
is needed dynamically, this function could be used.
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 20:34                             ` Mike Frysinger
@ 2010-05-11 20:38                               ` Pekka Enberg
  2010-05-11 20:43                                   ` Mike Frysinger
  2010-05-11 20:46                               ` Christoph Lameter
  2010-05-11 22:37                               ` Alan Cox
  2 siblings, 1 reply; 124+ messages in thread
From: Pekka Enberg @ 2010-05-11 20:38 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Christoph Lameter, Matt Mackall, Dmitry Torokhov, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Nick Piggin, David Rientjes, David Brownell, Grant Likely

Mike Frysinger wrote:
> that is a question for David/Grant.  i'm not the SPI core maintainer,
> i'm merely watching over some SPI drivers.  however, this answer also
> doesnt sound like it's thinking big enough because what you're
> proposing isnt specific to the SPI bus -- any time a DMA safe buffer
> is needed dynamically, this function could be used.

Well, we have dma_alloc_coherent(), shouldn't you be using that instead?

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 20:38                               ` Pekka Enberg
@ 2010-05-11 20:43                                   ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11 20:43 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Christoph Lameter, Matt Mackall, Dmitry Torokhov, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Nick Piggin, David Rientjes, David Brownell, Grant Likely

On Tue, May 11, 2010 at 16:38, Pekka Enberg wrote:
> Mike Frysinger wrote:
>> that is a question for David/Grant.  i'm not the SPI core maintainer,
>> i'm merely watching over some SPI drivers.  however, this answer also
>> doesnt sound like it's thinking big enough because what you're
>> proposing isnt specific to the SPI bus -- any time a DMA safe buffer
>> is needed dynamically, this function could be used.
>
> Well, we have dma_alloc_coherent(), shouldn't you be using that instead?

my understanding is that dma_alloc_coherent() gives you a buffer that
is always coherent.  the SPI layers take care of flushing and such on
the fly which means allocating coherent memory is overkill and bad for
performance.
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-11 20:43                                   ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11 20:43 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Christoph Lameter, Matt Mackall, Dmitry Torokhov, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Nick Piggin, David Rientjes, David Brownell, Grant Likely

On Tue, May 11, 2010 at 16:38, Pekka Enberg wrote:
> Mike Frysinger wrote:
>> that is a question for David/Grant.  i'm not the SPI core maintainer,
>> i'm merely watching over some SPI drivers.  however, this answer also
>> doesnt sound like it's thinking big enough because what you're
>> proposing isnt specific to the SPI bus -- any time a DMA safe buffer
>> is needed dynamically, this function could be used.
>
> Well, we have dma_alloc_coherent(), shouldn't you be using that instead?

my understanding is that dma_alloc_coherent() gives you a buffer that
is always coherent.  the SPI layers take care of flushing and such on
the fly which means allocating coherent memory is overkill and bad for
performance.
-mike
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 20:34                             ` Mike Frysinger
  2010-05-11 20:38                               ` Pekka Enberg
@ 2010-05-11 20:46                               ` Christoph Lameter
  2010-05-11 20:54                                   ` Mike Frysinger
  2010-05-11 22:37                               ` Alan Cox
  2 siblings, 1 reply; 124+ messages in thread
From: Christoph Lameter @ 2010-05-11 20:46 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Pekka Enberg, Matt Mackall, Dmitry Torokhov, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Nick Piggin, David Rientjes, David Brownell, Grant Likely

On Tue, 11 May 2010, Mike Frysinger wrote:

> > DMA. If the arch can only DMA into cacheline aligned objects then the
> > correct method is to force kmalloc alignment to cacheline size.
>
> these are SPI drivers and are usable on any arch that supports a SPI
> bus (which is pretty much every arch).  forget about "embedded"
> arches.
>
> the issue here is simple: a SPI driver (AD7877) needs to do a receive
> SPI transfer into a DMA safe buffer.  what is the exact API to
> dynamically allocate memory for the structure with this buffer
> embedded in it such that the start of the structure is cached aligned
> ?  creating a dedicated kmem cache may work, but it isnt a scalable
> solution if every SPI driver needs to create its own cache.

kmalloc returns a pointer to a DMA safe buffer. There is no requirement on
the x86 hardware that the DMA buffers have to be cache aligned. Cachelines
will be invalidated as needed.



^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 20:43                                   ` Mike Frysinger
  (?)
@ 2010-05-11 20:47                                   ` Pekka Enberg
  2010-05-13  6:21                                     ` Paul Mundt
  -1 siblings, 1 reply; 124+ messages in thread
From: Pekka Enberg @ 2010-05-11 20:47 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Christoph Lameter, Matt Mackall, Dmitry Torokhov, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Nick Piggin, David Rientjes, David Brownell, Grant Likely

Mike Frysinger wrote:
> On Tue, May 11, 2010 at 16:38, Pekka Enberg wrote:
>> Mike Frysinger wrote:
>>> that is a question for David/Grant.  i'm not the SPI core maintainer,
>>> i'm merely watching over some SPI drivers.  however, this answer also
>>> doesnt sound like it's thinking big enough because what you're
>>> proposing isnt specific to the SPI bus -- any time a DMA safe buffer
>>> is needed dynamically, this function could be used.
>> Well, we have dma_alloc_coherent(), shouldn't you be using that instead?
> 
> my understanding is that dma_alloc_coherent() gives you a buffer that
> is always coherent.  the SPI layers take care of flushing and such on
> the fly which means allocating coherent memory is overkill and bad for
> performance.

OK, I'm out of my expert area here but if dma_alloc_coherent() doesn't 
work for you, you should probably extend the DMA API, not kmalloc().

			Pekka

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 20:46                               ` Christoph Lameter
@ 2010-05-11 20:54                                   ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11 20:54 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, Matt Mackall, Dmitry Torokhov, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Nick Piggin, David Rientjes, David Brownell, Grant Likely

On Tue, May 11, 2010 at 16:46, Christoph Lameter wrote:
> On Tue, 11 May 2010, Mike Frysinger wrote:
>> > DMA. If the arch can only DMA into cacheline aligned objects then the
>> > correct method is to force kmalloc alignment to cacheline size.
>>
>> these are SPI drivers and are usable on any arch that supports a SPI
>> bus (which is pretty much every arch).  forget about "embedded"
>> arches.
>>
>> the issue here is simple: a SPI driver (AD7877) needs to do a receive
>> SPI transfer into a DMA safe buffer.  what is the exact API to
>> dynamically allocate memory for the structure with this buffer
>> embedded in it such that the start of the structure is cached aligned
>> ?  creating a dedicated kmem cache may work, but it isnt a scalable
>> solution if every SPI driver needs to create its own cache.
>
> kmalloc returns a pointer to a DMA safe buffer. There is no requirement on
> the x86 hardware that the DMA buffers have to be cache aligned. Cachelines
> will be invalidated as needed.

so this guarantee is made by the kmalloc() API ?  and for arches where
the cacheline invalidation is handled in software rather than
hardware, they must declare a min alignment value for kmalloc to be at
least as big as their cache alignment ?

does the phrase "DMA safe buffer" imply cache alignment ?
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-11 20:54                                   ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11 20:54 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, Matt Mackall, Dmitry Torokhov, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Nick Piggin, David Rientjes, David Brownell, Grant Likely

On Tue, May 11, 2010 at 16:46, Christoph Lameter wrote:
> On Tue, 11 May 2010, Mike Frysinger wrote:
>> > DMA. If the arch can only DMA into cacheline aligned objects then the
>> > correct method is to force kmalloc alignment to cacheline size.
>>
>> these are SPI drivers and are usable on any arch that supports a SPI
>> bus (which is pretty much every arch).  forget about "embedded"
>> arches.
>>
>> the issue here is simple: a SPI driver (AD7877) needs to do a receive
>> SPI transfer into a DMA safe buffer.  what is the exact API to
>> dynamically allocate memory for the structure with this buffer
>> embedded in it such that the start of the structure is cached aligned
>> ?  creating a dedicated kmem cache may work, but it isnt a scalable
>> solution if every SPI driver needs to create its own cache.
>
> kmalloc returns a pointer to a DMA safe buffer. There is no requirement on
> the x86 hardware that the DMA buffers have to be cache aligned. Cachelines
> will be invalidated as needed.

so this guarantee is made by the kmalloc() API ?  and for arches where
the cacheline invalidation is handled in software rather than
hardware, they must declare a min alignment value for kmalloc to be at
least as big as their cache alignment ?

does the phrase "DMA safe buffer" imply cache alignment ?
-mike
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 20:54                                   ` Mike Frysinger
  (?)
@ 2010-05-11 21:01                                   ` Pekka Enberg
  2010-05-11 21:11                                       ` Mike Frysinger
  2010-05-12  1:58                                     ` FUJITA Tomonori
  -1 siblings, 2 replies; 124+ messages in thread
From: Pekka Enberg @ 2010-05-11 21:01 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Christoph Lameter, Matt Mackall, Dmitry Torokhov, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Nick Piggin, David Rientjes, David Brownell, Grant Likely

Mike Frysinger wrote:
> On Tue, May 11, 2010 at 16:46, Christoph Lameter wrote:
>> On Tue, 11 May 2010, Mike Frysinger wrote:
>>>> DMA. If the arch can only DMA into cacheline aligned objects then the
>>>> correct method is to force kmalloc alignment to cacheline size.
>>> these are SPI drivers and are usable on any arch that supports a SPI
>>> bus (which is pretty much every arch).  forget about "embedded"
>>> arches.
>>>
>>> the issue here is simple: a SPI driver (AD7877) needs to do a receive
>>> SPI transfer into a DMA safe buffer.  what is the exact API to
>>> dynamically allocate memory for the structure with this buffer
>>> embedded in it such that the start of the structure is cached aligned
>>> ?  creating a dedicated kmem cache may work, but it isnt a scalable
>>> solution if every SPI driver needs to create its own cache.
>> kmalloc returns a pointer to a DMA safe buffer. There is no requirement on
>> the x86 hardware that the DMA buffers have to be cache aligned. Cachelines
>> will be invalidated as needed.
> 
> so this guarantee is made by the kmalloc() API ?  and for arches where
> the cacheline invalidation is handled in software rather than
> hardware, they must declare a min alignment value for kmalloc to be at
> least as big as their cache alignment ?
> 
> does the phrase "DMA safe buffer" imply cache alignment ?

Yes, you should be able to DMA into kmalloc'd memory. IIRC the block or 
the SCSI layer depends on that.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 21:01                                   ` Pekka Enberg
@ 2010-05-11 21:11                                       ` Mike Frysinger
  2010-05-12  1:58                                     ` FUJITA Tomonori
  1 sibling, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11 21:11 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Christoph Lameter, Matt Mackall, Dmitry Torokhov, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Nick Piggin, David Rientjes, David Brownell, Grant Likely

On Tue, May 11, 2010 at 17:01, Pekka Enberg wrote:
> Mike Frysinger wrote:
>> On Tue, May 11, 2010 at 16:46, Christoph Lameter wrote:
>>> On Tue, 11 May 2010, Mike Frysinger wrote:
>>>>> DMA. If the arch can only DMA into cacheline aligned objects then the
>>>>> correct method is to force kmalloc alignment to cacheline size.
>>>>
>>>> these are SPI drivers and are usable on any arch that supports a SPI
>>>> bus (which is pretty much every arch).  forget about "embedded"
>>>> arches.
>>>>
>>>> the issue here is simple: a SPI driver (AD7877) needs to do a receive
>>>> SPI transfer into a DMA safe buffer.  what is the exact API to
>>>> dynamically allocate memory for the structure with this buffer
>>>> embedded in it such that the start of the structure is cached aligned
>>>> ?  creating a dedicated kmem cache may work, but it isnt a scalable
>>>> solution if every SPI driver needs to create its own cache.
>>>
>>> kmalloc returns a pointer to a DMA safe buffer. There is no requirement
>>> on
>>> the x86 hardware that the DMA buffers have to be cache aligned.
>>> Cachelines
>>> will be invalidated as needed.
>>
>> so this guarantee is made by the kmalloc() API ?  and for arches where
>> the cacheline invalidation is handled in software rather than
>> hardware, they must declare a min alignment value for kmalloc to be at
>> least as big as their cache alignment ?
>>
>> does the phrase "DMA safe buffer" imply cache alignment ?
>
> Yes, you should be able to DMA into kmalloc'd memory. IIRC the block or the
> SCSI layer depends on that.

we can DMA into it just fine, it's just that if the small DMA buffer
shares a cacheline with some polled state data (like in current AD7877
driver) instead of getting its own, the core may pull back in stale
data before the DMA finishes.  so we're looking for something that
guarantees both cacheline alignment (when applicable, so this doesnt
apply to x86 apparently) and DMA-safe dynamic memory.

so if Dmitry's original statement:
The data is kmalloced, kmalloc aligns on cacheline boundary AFAIK
which means that next kmalloc data chunk will not share "our"
cacheline.

was revised to say:
The data is kmalloced, kmalloc aligns on cacheline boundary (when the
arch lacks hardware cacheline invalidation between the core and dma
channels) which means that next kmalloc data chunk will not share
"our" cacheline.

and this revised statement is correct, then Dmitry's revised fix
should be sufficient.  and i guess we should update the spi-summary
document with these tips so we dont have to hash this out again.
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-11 21:11                                       ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11 21:11 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Christoph Lameter, Matt Mackall, Dmitry Torokhov, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Johannes Weiner,
	Nick Piggin, David Rientjes, David Brownell, Grant Likely

On Tue, May 11, 2010 at 17:01, Pekka Enberg wrote:
> Mike Frysinger wrote:
>> On Tue, May 11, 2010 at 16:46, Christoph Lameter wrote:
>>> On Tue, 11 May 2010, Mike Frysinger wrote:
>>>>> DMA. If the arch can only DMA into cacheline aligned objects then the
>>>>> correct method is to force kmalloc alignment to cacheline size.
>>>>
>>>> these are SPI drivers and are usable on any arch that supports a SPI
>>>> bus (which is pretty much every arch).  forget about "embedded"
>>>> arches.
>>>>
>>>> the issue here is simple: a SPI driver (AD7877) needs to do a receive
>>>> SPI transfer into a DMA safe buffer.  what is the exact API to
>>>> dynamically allocate memory for the structure with this buffer
>>>> embedded in it such that the start of the structure is cached aligned
>>>> ?  creating a dedicated kmem cache may work, but it isnt a scalable
>>>> solution if every SPI driver needs to create its own cache.
>>>
>>> kmalloc returns a pointer to a DMA safe buffer. There is no requirement
>>> on
>>> the x86 hardware that the DMA buffers have to be cache aligned.
>>> Cachelines
>>> will be invalidated as needed.
>>
>> so this guarantee is made by the kmalloc() API ?  and for arches where
>> the cacheline invalidation is handled in software rather than
>> hardware, they must declare a min alignment value for kmalloc to be at
>> least as big as their cache alignment ?
>>
>> does the phrase "DMA safe buffer" imply cache alignment ?
>
> Yes, you should be able to DMA into kmalloc'd memory. IIRC the block or the
> SCSI layer depends on that.

we can DMA into it just fine, it's just that if the small DMA buffer
shares a cacheline with some polled state data (like in current AD7877
driver) instead of getting its own, the core may pull back in stale
data before the DMA finishes.  so we're looking for something that
guarantees both cacheline alignment (when applicable, so this doesnt
apply to x86 apparently) and DMA-safe dynamic memory.

so if Dmitry's original statement:
The data is kmalloced, kmalloc aligns on cacheline boundary AFAIK
which means that next kmalloc data chunk will not share "our"
cacheline.

was revised to say:
The data is kmalloced, kmalloc aligns on cacheline boundary (when the
arch lacks hardware cacheline invalidation between the core and dma
channels) which means that next kmalloc data chunk will not share
"our" cacheline.

and this revised statement is correct, then Dmitry's revised fix
should be sufficient.  and i guess we should update the spi-summary
document with these tips so we dont have to hash this out again.
-mike
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
       [not found]                                   ` <20100511214836.GH1726@emlix.com>
@ 2010-05-11 21:53                                     ` Dmitry Torokhov
  2010-05-11 22:39                                         ` Mike Frysinger
  0 siblings, 1 reply; 124+ messages in thread
From: Dmitry Torokhov @ 2010-05-11 21:53 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Mike Frysinger, Christoph Lameter, Pekka Enberg, Matt Mackall,
	Andrew Morton, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Nick Piggin, David Rientjes, David Brownell, Grant Likely

On Tue, May 11, 2010 at 11:48:36PM +0200, Johannes Weiner wrote:
> On Tue, May 11, 2010 at 04:54:41PM -0400, Mike Frysinger wrote:
> > does the phrase "DMA safe buffer" imply cache alignment ?
> 
> I guess that depends on the architectural requirements.  On x86,
> apparently, not so much.  On ARM, probably yes, as it's the
> requirement to properly maintain coherency.

It looks liek ARM (and a few others) do this:

[dtor@hammer work]$ grep -r ARCH_KMALLOC_MINALIGN arch/
arch/sh/include/asm/page.h:#define ARCH_KMALLOC_MINALIGN        L1_CACHE_BYTES
arch/frv/include/asm/mem-layout.h:#define       ARCH_KMALLOC_MINALIGN           8
arch/powerpc/include/asm/page_32.h:#define ARCH_KMALLOC_MINALIGN        L1_CACHE_BYTES
arch/arm/include/asm/cache.h:#define ARCH_KMALLOC_MINALIGN      L1_CACHE_BYTES
arch/microblaze/include/asm/page.h:#define ARCH_KMALLOC_MINALIGN        L1_CACHE_BYTES
arch/mips/include/asm/mach-tx49xx/kmalloc.h: * All happy, no need to define ARCH_KMALLOC_MINALIGN
arch/mips/include/asm/mach-ip27/kmalloc.h: * All happy, no need to define ARCH_KMALLOC_MINALIGN
arch/mips/include/asm/mach-ip32/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 32
arch/mips/include/asm/mach-ip32/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 128
arch/mips/include/asm/mach-generic/kmalloc.h:#define ARCH_KMALLOC_MINALIGN      128
arch/avr32/include/asm/cache.h:#define ARCH_KMALLOC_MINALIGN    L1_CACHE_BYTES

-- 
Dmitry

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 20:34                             ` Mike Frysinger
  2010-05-11 20:38                               ` Pekka Enberg
  2010-05-11 20:46                               ` Christoph Lameter
@ 2010-05-11 22:37                               ` Alan Cox
  2 siblings, 0 replies; 124+ messages in thread
From: Alan Cox @ 2010-05-11 22:37 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Christoph Lameter, Pekka Enberg, Matt Mackall, Dmitry Torokhov,
	Andrew Morton, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner, Nick Piggin, David Rientjes, David Brownell,
	Grant Likely

> SPI transfer into a DMA safe buffer.  what is the exact API to
> dynamically allocate memory for the structure with this buffer
> embedded in it such that the start of the structure is cached aligned
> ?  creating a dedicated kmem cache may work, but it isnt a scalable
> solution if every SPI driver needs to create its own cache.

If you are embedding structures then one solution is to cheat a bit and
make the structure, compiler and existing kernel compile abuse do the
work. Something like

struct {
	void *except_where_prohibited;
	long boat;
	unsigned photograph;

	u8 pad[0] __cacheline_aligned;
}..

ought to do the trick providing you align the start of the object - which
should happen naturally with kmalloc or dma_* apis



^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 21:53                                     ` Dmitry Torokhov
@ 2010-05-11 22:39                                         ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11 22:39 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Johannes Weiner, Christoph Lameter, Pekka Enberg, Matt Mackall,
	Andrew Morton, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Nick Piggin, David Rientjes, David Brownell, Grant Likely

On Tue, May 11, 2010 at 17:53, Dmitry Torokhov wrote:
> On Tue, May 11, 2010 at 11:48:36PM +0200, Johannes Weiner wrote:
>> On Tue, May 11, 2010 at 04:54:41PM -0400, Mike Frysinger wrote:
>> > does the phrase "DMA safe buffer" imply cache alignment ?
>>
>> I guess that depends on the architectural requirements.  On x86,
>> apparently, not so much.  On ARM, probably yes, as it's the
>> requirement to properly maintain coherency.
>
> It looks liek ARM (and a few others) do this:
>
> [dtor@hammer work]$ grep -r ARCH_KMALLOC_MINALIGN arch/
> arch/sh/include/asm/page.h:#define ARCH_KMALLOC_MINALIGN        L1_CACHE_BYTES
> arch/frv/include/asm/mem-layout.h:#define       ARCH_KMALLOC_MINALIGN           8
> arch/powerpc/include/asm/page_32.h:#define ARCH_KMALLOC_MINALIGN        L1_CACHE_BYTES
> arch/arm/include/asm/cache.h:#define ARCH_KMALLOC_MINALIGN      L1_CACHE_BYTES
> arch/microblaze/include/asm/page.h:#define ARCH_KMALLOC_MINALIGN        L1_CACHE_BYTES
> arch/mips/include/asm/mach-tx49xx/kmalloc.h: * All happy, no need to define ARCH_KMALLOC_MINALIGN
> arch/mips/include/asm/mach-ip27/kmalloc.h: * All happy, no need to define ARCH_KMALLOC_MINALIGN
> arch/mips/include/asm/mach-ip32/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 32
> arch/mips/include/asm/mach-ip32/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 128
> arch/mips/include/asm/mach-generic/kmalloc.h:#define ARCH_KMALLOC_MINALIGN      128
> arch/avr32/include/asm/cache.h:#define ARCH_KMALLOC_MINALIGN    L1_CACHE_BYTES

if ARCH_KMALLOC_MINALIGN is not defined, the current allocators default to:
 slub - alignof(unsigned long long)
 slab - alignof(unsigned long long)
 slob - alignof(unsigned long)
which for many arches can mean an alignment of merely 4 or 8

lets look at the cacheline sizes for arches that dont set
ARCH_KMALLOC_MINALIGN to L1_CACHE_BYTES:
 - alplha - 32 or 64
 - frv - 32 or 64
 - blackfin - 32
 - parisc - 32 or 64
 - mn10300 - 16
 - s390 - 256
 - score - 16
 - sparc - 32
 - xtensa - 16 or 32

assuming alpha and s390 handle cache coherency in hardware, it looks
to me like the proposed assumption (kmalloc returns cachealigned
pointers when cache management is in software) does not hold true.

so should these other arches also be setting ARCH_KMALLOC_MINALIGN to
L1_CACHE_BYTES ?
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-11 22:39                                         ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-11 22:39 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Johannes Weiner, Christoph Lameter, Pekka Enberg, Matt Mackall,
	Andrew Morton, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Nick Piggin, David Rientjes, David Brownell, Grant Likely

On Tue, May 11, 2010 at 17:53, Dmitry Torokhov wrote:
> On Tue, May 11, 2010 at 11:48:36PM +0200, Johannes Weiner wrote:
>> On Tue, May 11, 2010 at 04:54:41PM -0400, Mike Frysinger wrote:
>> > does the phrase "DMA safe buffer" imply cache alignment ?
>>
>> I guess that depends on the architectural requirements.  On x86,
>> apparently, not so much.  On ARM, probably yes, as it's the
>> requirement to properly maintain coherency.
>
> It looks liek ARM (and a few others) do this:
>
> [dtor@hammer work]$ grep -r ARCH_KMALLOC_MINALIGN arch/
> arch/sh/include/asm/page.h:#define ARCH_KMALLOC_MINALIGN        L1_CACHE_BYTES
> arch/frv/include/asm/mem-layout.h:#define       ARCH_KMALLOC_MINALIGN           8
> arch/powerpc/include/asm/page_32.h:#define ARCH_KMALLOC_MINALIGN        L1_CACHE_BYTES
> arch/arm/include/asm/cache.h:#define ARCH_KMALLOC_MINALIGN      L1_CACHE_BYTES
> arch/microblaze/include/asm/page.h:#define ARCH_KMALLOC_MINALIGN        L1_CACHE_BYTES
> arch/mips/include/asm/mach-tx49xx/kmalloc.h: * All happy, no need to define ARCH_KMALLOC_MINALIGN
> arch/mips/include/asm/mach-ip27/kmalloc.h: * All happy, no need to define ARCH_KMALLOC_MINALIGN
> arch/mips/include/asm/mach-ip32/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 32
> arch/mips/include/asm/mach-ip32/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 128
> arch/mips/include/asm/mach-generic/kmalloc.h:#define ARCH_KMALLOC_MINALIGN      128
> arch/avr32/include/asm/cache.h:#define ARCH_KMALLOC_MINALIGN    L1_CACHE_BYTES

if ARCH_KMALLOC_MINALIGN is not defined, the current allocators default to:
 slub - alignof(unsigned long long)
 slab - alignof(unsigned long long)
 slob - alignof(unsigned long)
which for many arches can mean an alignment of merely 4 or 8

lets look at the cacheline sizes for arches that dont set
ARCH_KMALLOC_MINALIGN to L1_CACHE_BYTES:
 - alplha - 32 or 64
 - frv - 32 or 64
 - blackfin - 32
 - parisc - 32 or 64
 - mn10300 - 16
 - s390 - 256
 - score - 16
 - sparc - 32
 - xtensa - 16 or 32

assuming alpha and s390 handle cache coherency in hardware, it looks
to me like the proposed assumption (kmalloc returns cachealigned
pointers when cache management is in software) does not hold true.

so should these other arches also be setting ARCH_KMALLOC_MINALIGN to
L1_CACHE_BYTES ?
-mike
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 21:01                                   ` Pekka Enberg
  2010-05-11 21:11                                       ` Mike Frysinger
@ 2010-05-12  1:58                                     ` FUJITA Tomonori
  1 sibling, 0 replies; 124+ messages in thread
From: FUJITA Tomonori @ 2010-05-12  1:58 UTC (permalink / raw)
  To: penberg
  Cc: vapier.adi, cl, mpm, dmitry.torokhov, akpm, os,
	Michael.Hennerich, linux-input, linux-kernel, dg, osw, jw,
	npiggin, rientjes, dbrownell, grant.likely

On Wed, 12 May 2010 00:01:02 +0300
Pekka Enberg <penberg@cs.helsinki.fi> wrote:

> Mike Frysinger wrote:
> > On Tue, May 11, 2010 at 16:46, Christoph Lameter wrote:
> >> On Tue, 11 May 2010, Mike Frysinger wrote:
> >>>> DMA. If the arch can only DMA into cacheline aligned objects then the
> >>>> correct method is to force kmalloc alignment to cacheline size.
> >>> these are SPI drivers and are usable on any arch that supports a SPI
> >>> bus (which is pretty much every arch).  forget about "embedded"
> >>> arches.
> >>>
> >>> the issue here is simple: a SPI driver (AD7877) needs to do a receive
> >>> SPI transfer into a DMA safe buffer.  what is the exact API to
> >>> dynamically allocate memory for the structure with this buffer
> >>> embedded in it such that the start of the structure is cached aligned
> >>> ?  creating a dedicated kmem cache may work, but it isnt a scalable
> >>> solution if every SPI driver needs to create its own cache.
> >> kmalloc returns a pointer to a DMA safe buffer. There is no requirement on
> >> the x86 hardware that the DMA buffers have to be cache aligned. Cachelines
> >> will be invalidated as needed.
> > 
> > so this guarantee is made by the kmalloc() API ?  and for arches where
> > the cacheline invalidation is handled in software rather than
> > hardware, they must declare a min alignment value for kmalloc to be at
> > least as big as their cache alignment ?
> > 
> > does the phrase "DMA safe buffer" imply cache alignment ?
> 
> Yes, you should be able to DMA into kmalloc'd memory. IIRC the block or 
> the SCSI layer depends on that.

Yeah, SCSI subsystem and drivers, and block drivers depend on it.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* RE: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 22:39                                         ` Mike Frysinger
  (?)
@ 2010-05-12  2:07                                         ` Marc Gauthier
  2010-05-12  3:03                                           ` Nick Piggin
  -1 siblings, 1 reply; 124+ messages in thread
From: Marc Gauthier @ 2010-05-12  2:07 UTC (permalink / raw)
  To: Mike Frysinger, Dmitry Torokhov
  Cc: Johannes Weiner, Christoph Lameter, Pekka Enberg, Matt Mackall,
	Andrew Morton, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Nick Piggin, David Rientjes, David Brownell, Grant Likely,
	Chris Zankel, Piet Delaney

Mike Frysinger wrote:
> lets look at the cacheline sizes for arches that dont set
> ARCH_KMALLOC_MINALIGN to L1_CACHE_BYTES:
>  - alplha - 32 or 64
>  - frv - 32 or 64
>  - blackfin - 32
>  - parisc - 32 or 64
>  - mn10300 - 16
>  - s390 - 256
>  - score - 16
>  - sparc - 32
>  - xtensa - 16 or 32
>
> assuming alpha and s390 handle cache coherency in hardware, it looks
> to me like the proposed assumption (kmalloc returns cachealigned
> pointers when cache management is in software) does not hold true.
>
> so should these other arches also be setting ARCH_KMALLOC_MINALIGN to
> L1_CACHE_BYTES ?

IMHO, yes.  It just makes sense to avoid false-sharing issues, not to
allocate unrelated blocks in the same cache line.

Also as it turns out (hope he doesn't me telling), Christian Zankel
recently found a bug that was fixed exactly that way, by setting
ARCH_KMALLOC_MINALIGN to L1_CACHE_BYTES for the Xtensa architecture.
(Too recent to have percolated to mainline.)

A lot of the above might be cache line aligned (?).

-Marc

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-12  2:07                                         ` [LKML] " Marc Gauthier
@ 2010-05-12  3:03                                           ` Nick Piggin
  2010-05-12  3:23                                             ` FUJITA Tomonori
  2010-05-19 12:48                                             ` David Woodhouse
  0 siblings, 2 replies; 124+ messages in thread
From: Nick Piggin @ 2010-05-12  3:03 UTC (permalink / raw)
  To: Marc Gauthier
  Cc: Mike Frysinger, Dmitry Torokhov, Johannes Weiner,
	Christoph Lameter, Pekka Enberg, Matt Mackall, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, David Rientjes,
	David Brownell, Grant Likely, Chris Zankel, Piet Delaney

On Tue, May 11, 2010 at 07:07:35PM -0700, Marc Gauthier wrote:
> Mike Frysinger wrote:
> > lets look at the cacheline sizes for arches that dont set
> > ARCH_KMALLOC_MINALIGN to L1_CACHE_BYTES:
> >  - alplha - 32 or 64
> >  - frv - 32 or 64
> >  - blackfin - 32
> >  - parisc - 32 or 64
> >  - mn10300 - 16
> >  - s390 - 256
> >  - score - 16
> >  - sparc - 32
> >  - xtensa - 16 or 32
> >
> > assuming alpha and s390 handle cache coherency in hardware, it looks
> > to me like the proposed assumption (kmalloc returns cachealigned
> > pointers when cache management is in software) does not hold true.
> >
> > so should these other arches also be setting ARCH_KMALLOC_MINALIGN to
> > L1_CACHE_BYTES ?
> 
> IMHO, yes.  It just makes sense to avoid false-sharing issues, not to
> allocate unrelated blocks in the same cache line.
> 
> Also as it turns out (hope he doesn't me telling), Christian Zankel
> recently found a bug that was fixed exactly that way, by setting
> ARCH_KMALLOC_MINALIGN to L1_CACHE_BYTES for the Xtensa architecture.
> (Too recent to have percolated to mainline.)
> 
> A lot of the above might be cache line aligned (?).

I don't think it's necessarily a good idea. MINALIGN is an enforced
minimum alignment and the allocator has no leeway in reducing this.
In a UP system, or in a memory constrained system, it might be a better
idea to pack objects more tightly, for example.

If we allow drivers to assume kmalloc is cacheline aligned, it will be
(practically) impossible to revert this because it would require driver
audits.

So whenever strengthening API guarantees like this, it is better to be
very careful and conservative. Probably even introducing a new API with
the stronger semantics (even if it is just a wrapper in the case where
KMALLOC_MINALIGNED *is* cacheline sized).

I think adding to the DMA API would be a better idea. If the arch knows
that kmalloc is suitable for the job directly, it can be used. Drivers
can use the new interface, and kmalloc doesn't get saddled with
alignment requirements.


^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-12  3:03                                           ` Nick Piggin
@ 2010-05-12  3:23                                             ` FUJITA Tomonori
  2010-05-12  4:35                                                 ` Mike Frysinger
  2010-05-19 12:48                                             ` David Woodhouse
  1 sibling, 1 reply; 124+ messages in thread
From: FUJITA Tomonori @ 2010-05-12  3:23 UTC (permalink / raw)
  To: npiggin
  Cc: marc, vapier.adi, dmitry.torokhov, jw, cl, penberg, mpm, akpm,
	os, Michael.Hennerich, linux-input, linux-kernel, dg, osw,
	rientjes, dbrownell, grant.likely, chris, Piet.Delaney

On Wed, 12 May 2010 13:03:50 +1000
Nick Piggin <npiggin@suse.de> wrote:

> On Tue, May 11, 2010 at 07:07:35PM -0700, Marc Gauthier wrote:
> > Mike Frysinger wrote:
> > > lets look at the cacheline sizes for arches that dont set
> > > ARCH_KMALLOC_MINALIGN to L1_CACHE_BYTES:
> > >  - alplha - 32 or 64
> > >  - frv - 32 or 64
> > >  - blackfin - 32
> > >  - parisc - 32 or 64
> > >  - mn10300 - 16
> > >  - s390 - 256
> > >  - score - 16
> > >  - sparc - 32
> > >  - xtensa - 16 or 32
> > >
> > > assuming alpha and s390 handle cache coherency in hardware, it looks
> > > to me like the proposed assumption (kmalloc returns cachealigned
> > > pointers when cache management is in software) does not hold true.
> > >
> > > so should these other arches also be setting ARCH_KMALLOC_MINALIGN to
> > > L1_CACHE_BYTES ?
> > 
> > IMHO, yes.  It just makes sense to avoid false-sharing issues, not to
> > allocate unrelated blocks in the same cache line.
> > 
> > Also as it turns out (hope he doesn't me telling), Christian Zankel
> > recently found a bug that was fixed exactly that way, by setting
> > ARCH_KMALLOC_MINALIGN to L1_CACHE_BYTES for the Xtensa architecture.
> > (Too recent to have percolated to mainline.)
> > 
> > A lot of the above might be cache line aligned (?).
> 
> I don't think it's necessarily a good idea. MINALIGN is an enforced
> minimum alignment and the allocator has no leeway in reducing this.
> In a UP system, or in a memory constrained system, it might be a better
> idea to pack objects more tightly, for example.

I agree, however...

> If we allow drivers to assume kmalloc is cacheline aligned, it will be
> (practically) impossible to revert this because it would require driver
> audits.

SCSI and some block drivers already depend on it. If kmalloc'ed buffer
is not DMA safe, they breaks.

Seems that kmalloc is not cacheline aligned on some architectures but
they works. Probably, we might be just lucky because in general they
allocate larger buffers than 64 for DMA via kmalloc and the buffers
are aligned on the size?


> So whenever strengthening API guarantees like this, it is better to be
> very careful and conservative. Probably even introducing a new API with
> the stronger semantics (even if it is just a wrapper in the case where
> KMALLOC_MINALIGNED *is* cacheline sized).
> 
> I think adding to the DMA API would be a better idea. If the arch knows
> that kmalloc is suitable for the job directly, it can be used. Drivers
> can use the new interface, and kmalloc doesn't get saddled with
> alignment requirements.

Or a new GFP flag?

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate  cache lines
  2010-05-12  3:23                                             ` FUJITA Tomonori
@ 2010-05-12  4:35                                                 ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-12  4:35 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: npiggin, marc, dmitry.torokhov, jw, cl, penberg, mpm, akpm, os,
	Michael.Hennerich, linux-input, linux-kernel, dg, osw, rientjes,
	dbrownell, grant.likely, chris, Piet.Delaney

On Tue, May 11, 2010 at 23:23, FUJITA Tomonori wrote:
> Seems that kmalloc is not cacheline aligned on some architectures but
> they works. Probably, we might be just lucky because in general they
> allocate larger buffers than 64 for DMA via kmalloc and the buffers
> are aligned on the size?

i think the magic combo is:
 - DMA buffer is written to (receive)
 - some driver state is in the same cacheline as the DMA buffer
 - that driver state is used after the flush but before the DMA finishes
 - only on arches that need software cache coherency

so i could see this not being an obvious issue for many people
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-12  4:35                                                 ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-12  4:35 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: npiggin, marc, dmitry.torokhov, jw, cl, penberg, mpm, akpm, os,
	Michael.Hennerich, linux-input, linux-kernel, dg, osw, rientjes,
	dbrownell, grant.likely, chris, Piet.Delaney

On Tue, May 11, 2010 at 23:23, FUJITA Tomonori wrote:
> Seems that kmalloc is not cacheline aligned on some architectures but
> they works. Probably, we might be just lucky because in general they
> allocate larger buffers than 64 for DMA via kmalloc and the buffers
> are aligned on the size?

i think the magic combo is:
 - DMA buffer is written to (receive)
 - some driver state is in the same cacheline as the DMA buffer
 - that driver state is used after the flush but before the DMA finishes
 - only on arches that need software cache coherency

so i could see this not being an obvious issue for many people
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate  cache lines
  2010-05-12  4:35                                                 ` Mike Frysinger
@ 2010-05-12  5:28                                                   ` FUJITA Tomonori
  -1 siblings, 0 replies; 124+ messages in thread
From: FUJITA Tomonori @ 2010-05-12  5:28 UTC (permalink / raw)
  To: vapier.adi
  Cc: fujita.tomonori, npiggin, marc, dmitry.torokhov, jw, cl, penberg,
	mpm, akpm, os, Michael.Hennerich, linux-input, linux-kernel, dg,
	osw, rientjes, dbrownell, grant.likely, chris, Piet.Delaney

On Wed, 12 May 2010 00:35:45 -0400
Mike Frysinger <vapier.adi@gmail.com> wrote:

> On Tue, May 11, 2010 at 23:23, FUJITA Tomonori wrote:
> > Seems that kmalloc is not cacheline aligned on some architectures but
> > they works. Probably, we might be just lucky because in general they
> > allocate larger buffers than 64 for DMA via kmalloc and the buffers
> > are aligned on the size?
> 
> i think the magic combo is:
>  - DMA buffer is written to (receive)
>  - some driver state is in the same cacheline as the DMA buffer
>  - that driver state is used after the flush but before the DMA finishes
>  - only on arches that need software cache coherency
> 
> so i could see this not being an obvious issue for many people

Yeah, we don't hit such condition everyday. And as I said, SCSI
usually uses kmalloc'ed buffers rather than the cache size (buffers
are aligned on roundup_pow_of_two(the allocated size)). But I think
that we want to fix this issue explicitly. I guess that it would be
better to start a new thread about this issue.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-12  5:28                                                   ` FUJITA Tomonori
  0 siblings, 0 replies; 124+ messages in thread
From: FUJITA Tomonori @ 2010-05-12  5:28 UTC (permalink / raw)
  To: vapier.adi
  Cc: fujita.tomonori, npiggin, marc, dmitry.torokhov, jw, cl, penberg,
	mpm, akpm, os, Michael.Hennerich, linux-input, linux-kernel, dg,
	osw, rientjes, dbrownell, grant.likely, chris, Piet.Delaney

On Wed, 12 May 2010 00:35:45 -0400
Mike Frysinger <vapier.adi@gmail.com> wrote:

> On Tue, May 11, 2010 at 23:23, FUJITA Tomonori wrote:
> > Seems that kmalloc is not cacheline aligned on some architectures but
> > they works. Probably, we might be just lucky because in general they
> > allocate larger buffers than 64 for DMA via kmalloc and the buffers
> > are aligned on the size?
> 
> i think the magic combo is:
>  - DMA buffer is written to (receive)
>  - some driver state is in the same cacheline as the DMA buffer
>  - that driver state is used after the flush but before the DMA finishes
>  - only on arches that need software cache coherency
> 
> so i could see this not being an obvious issue for many people

Yeah, we don't hit such condition everyday. And as I said, SCSI
usually uses kmalloc'ed buffers rather than the cache size (buffers
are aligned on roundup_pow_of_two(the allocated size)). But I think
that we want to fix this issue explicitly. I guess that it would be
better to start a new thread about this issue.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-12  4:35                                                 ` Mike Frysinger
  (?)
  (?)
@ 2010-05-12 10:36                                                 ` Johannes Weiner
  -1 siblings, 0 replies; 124+ messages in thread
From: Johannes Weiner @ 2010-05-12 10:36 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: FUJITA Tomonori, npiggin, marc, dmitry.torokhov, cl, penberg,
	mpm, akpm, os, Michael.Hennerich, linux-input, linux-kernel, dg,
	osw, rientjes, dbrownell, grant.likely, chris, Piet.Delaney

On Wed, May 12, 2010 at 12:35:45AM -0400, Mike Frysinger wrote:
> On Tue, May 11, 2010 at 23:23, FUJITA Tomonori wrote:
> > Seems that kmalloc is not cacheline aligned on some architectures but
> > they works. Probably, we might be just lucky because in general they
> > allocate larger buffers than 64 for DMA via kmalloc and the buffers
> > are aligned on the size?
> 
> i think the magic combo is:
>  - DMA buffer is written to (receive)

Check.

>  - some driver state is in the same cacheline as the DMA buffer
>  - that driver state is used after the flush but before the DMA finishes

The kmalloc caches are system-wide.  Any other kmalloc(samesize) user
could interfer when touching its object between dma_map_single() on
the neighbor object and the end of the transfer.

>  - only on arches that need software cache coherency
> 
> so i could see this not being an obvious issue for many people
> -mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* RE: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-12  4:35                                                 ` Mike Frysinger
                                                                   ` (2 preceding siblings ...)
  (?)
@ 2010-05-12 12:35                                                 ` Marc Gauthier
  2010-05-12 14:36                                                     ` Mike Frysinger
  -1 siblings, 1 reply; 124+ messages in thread
From: Marc Gauthier @ 2010-05-12 12:35 UTC (permalink / raw)
  To: Mike Frysinger, FUJITA Tomonori
  Cc: npiggin, dmitry.torokhov, jw, cl, penberg, mpm, akpm, os,
	Michael.Hennerich, linux-input, linux-kernel, dg, osw, rientjes,
	dbrownell, grant.likely, chris, Piet Delaney

Mike Frysinger wrote:
> On Tue, May 11, 2010 at 23:23, FUJITA Tomonori wrote:
>> Seems that kmalloc is not cacheline aligned on some architectures but
>> they works. Probably, we might be just lucky because in general they
>> allocate larger buffers than 64 for DMA via kmalloc and the buffers
>> are aligned on the size?
>
> i think the magic combo is:
>  - DMA buffer is written to (receive)
[...]
>  - only on arches that need software cache coherency

In particular, when the architecture port uses cache invalidates that
throw away dirty lines.  They're equivalent to writing old data to a
cache line, so an unrelated kmalloc allocation in the same cache line
gets corrupted.

So true, not all architectures need the extra alignment.
(And maybe some get lucky too, don't know.)

-Marc

PS.  Sorry about the gratuitous [LKML] in the subject line, mea culpa.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate  cache lines
  2010-05-12 12:35                                                 ` Marc Gauthier
@ 2010-05-12 14:36                                                     ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-12 14:36 UTC (permalink / raw)
  To: Marc Gauthier
  Cc: FUJITA Tomonori, npiggin, dmitry.torokhov, jw, cl, penberg, mpm,
	akpm, os, Michael.Hennerich, linux-input, linux-kernel, dg, osw,
	rientjes, dbrownell, grant.likely, chris, Piet Delaney

On Wed, May 12, 2010 at 08:35, Marc Gauthier wrote:
> Mike Frysinger wrote:
>> On Tue, May 11, 2010 at 23:23, FUJITA Tomonori wrote:
>>> Seems that kmalloc is not cacheline aligned on some architectures but
>>> they works. Probably, we might be just lucky because in general they
>>> allocate larger buffers than 64 for DMA via kmalloc and the buffers
>>> are aligned on the size?
>>
>> i think the magic combo is:
>>  - DMA buffer is written to (receive)
> [...]
>>  - only on arches that need software cache coherency
>
> In particular, when the architecture port uses cache invalidates that
> throw away dirty lines.  They're equivalent to writing old data to a
> cache line, so an unrelated kmalloc allocation in the same cache line
> gets corrupted.

true; i was thinking of the Blackfin implementation that only has a
FLUSH+INV insn (i complain about the lack of a pure INV insn every now
and again)
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-12 14:36                                                     ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-12 14:36 UTC (permalink / raw)
  To: Marc Gauthier
  Cc: FUJITA Tomonori, npiggin, dmitry.torokhov, jw, cl, penberg, mpm,
	akpm, os, Michael.Hennerich, linux-input, linux-kernel, dg, osw,
	rientjes, dbrownell, grant.likely, chris, Piet Delaney

On Wed, May 12, 2010 at 08:35, Marc Gauthier wrote:
> Mike Frysinger wrote:
>> On Tue, May 11, 2010 at 23:23, FUJITA Tomonori wrote:
>>> Seems that kmalloc is not cacheline aligned on some architectures but
>>> they works. Probably, we might be just lucky because in general they
>>> allocate larger buffers than 64 for DMA via kmalloc and the buffers
>>> are aligned on the size?
>>
>> i think the magic combo is:
>>  - DMA buffer is written to (receive)
> [...]
>>  - only on arches that need software cache coherency
>
> In particular, when the architecture port uses cache invalidates that
> throw away dirty lines.  They're equivalent to writing old data to a
> cache line, so an unrelated kmalloc allocation in the same cache line
> gets corrupted.

true; i was thinking of the Blackfin implementation that only has a
FLUSH+INV insn (i complain about the lack of a pure INV insn every now
and again)
-mike
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate  cache lines
  2010-05-12  5:28                                                   ` FUJITA Tomonori
@ 2010-05-12 14:37                                                     ` Mike Frysinger
  -1 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-12 14:37 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: npiggin, marc, dmitry.torokhov, jw, cl, penberg, mpm, akpm, os,
	Michael.Hennerich, linux-input, linux-kernel, dg, osw, rientjes,
	dbrownell, grant.likely, chris, Piet.Delaney

On Wed, May 12, 2010 at 01:28, FUJITA Tomonori wrote:
> I guess that it would be better to start a new thread about this issue.

will do
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-12 14:37                                                     ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-12 14:37 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: npiggin, marc, dmitry.torokhov, jw, cl, penberg, mpm, akpm, os,
	Michael.Hennerich, linux-input, linux-kernel, dg, osw, rientjes,
	dbrownell, grant.likely, chris, Piet.Delaney

On Wed, May 12, 2010 at 01:28, FUJITA Tomonori wrote:
> I guess that it would be better to start a new thread about this issue.

will do
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-12 14:37                                                     ` Mike Frysinger
  (?)
@ 2010-05-12 18:11                                                     ` Dmitry Torokhov
  2010-05-12 18:28                                                         ` Mike Frysinger
  -1 siblings, 1 reply; 124+ messages in thread
From: Dmitry Torokhov @ 2010-05-12 18:11 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: FUJITA Tomonori, npiggin, marc, jw, cl, penberg, mpm, akpm, os,
	Michael.Hennerich, linux-input, linux-kernel, dg, osw, rientjes,
	dbrownell, grant.likely, chris, Piet.Delaney

On Wed, May 12, 2010 at 10:37:05AM -0400, Mike Frysinger wrote:
> On Wed, May 12, 2010 at 01:28, FUJITA Tomonori wrote:
> > I guess that it would be better to start a new thread about this issue.
> 
> will do

In the mean time I will be applying the version of the patch with
___cacheline_aligned and buffers at the end, 'k?

-- 
Dmitry

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate  cache lines
  2010-05-12 18:11                                                     ` Dmitry Torokhov
@ 2010-05-12 18:28                                                         ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-12 18:28 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: FUJITA Tomonori, npiggin, marc, jw, cl, penberg, mpm, akpm, os,
	Michael.Hennerich, linux-input, linux-kernel, dg, osw, rientjes,
	dbrownell, grant.likely, chris, Piet.Delaney

On Wed, May 12, 2010 at 14:11, Dmitry Torokhov wrote:
> On Wed, May 12, 2010 at 10:37:05AM -0400, Mike Frysinger wrote:
>> On Wed, May 12, 2010 at 01:28, FUJITA Tomonori wrote:
>> > I guess that it would be better to start a new thread about this issue.
>>
>> will do
>
> In the mean time I will be applying the version of the patch with
> ___cacheline_aligned and buffers at the end, 'k?

i dont see a problem with it, and regardless of the route we choose to
go with kmalloc(), this change will be needed ...
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
@ 2010-05-12 18:28                                                         ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-12 18:28 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: FUJITA Tomonori, npiggin, marc, jw, cl, penberg, mpm, akpm, os,
	Michael.Hennerich, linux-input, linux-kernel, dg, osw, rientjes,
	dbrownell, grant.likely, chris, Piet.Delaney

On Wed, May 12, 2010 at 14:11, Dmitry Torokhov wrote:
> On Wed, May 12, 2010 at 10:37:05AM -0400, Mike Frysinger wrote:
>> On Wed, May 12, 2010 at 01:28, FUJITA Tomonori wrote:
>> > I guess that it would be better to start a new thread about this issue.
>>
>> will do
>
> In the mean time I will be applying the version of the patch with
> ___cacheline_aligned and buffers at the end, 'k?

i dont see a problem with it, and regardless of the route we choose to
go with kmalloc(), this change will be needed ...
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 20:47                                   ` Pekka Enberg
@ 2010-05-13  6:21                                     ` Paul Mundt
  0 siblings, 0 replies; 124+ messages in thread
From: Paul Mundt @ 2010-05-13  6:21 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Mike Frysinger, Christoph Lameter, Matt Mackall, Dmitry Torokhov,
	Andrew Morton, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Gl??ckner, Oliver Schneidewind,
	Johannes Weiner, Nick Piggin, David Rientjes, David Brownell,
	Grant Likely

On Tue, May 11, 2010 at 11:47:20PM +0300, Pekka Enberg wrote:
> Mike Frysinger wrote:
> >On Tue, May 11, 2010 at 16:38, Pekka Enberg wrote:
> >>Mike Frysinger wrote:
> >>>that is a question for David/Grant.  i'm not the SPI core maintainer,
> >>>i'm merely watching over some SPI drivers.  however, this answer also
> >>>doesnt sound like it's thinking big enough because what you're
> >>>proposing isnt specific to the SPI bus -- any time a DMA safe buffer
> >>>is needed dynamically, this function could be used.
> >>Well, we have dma_alloc_coherent(), shouldn't you be using that instead?
> >
> >my understanding is that dma_alloc_coherent() gives you a buffer that
> >is always coherent.  the SPI layers take care of flushing and such on
> >the fly which means allocating coherent memory is overkill and bad for
> >performance.
> 
> OK, I'm out of my expert area here but if dma_alloc_coherent() doesn't 
> work for you, you should probably extend the DMA API, not kmalloc().
> 
Note that the DMA API already has dma_alloc_noncoherent() for these
sorts of cases. If the driver is taking care of cache maintenance then
dma_alloc_noncoherent() is certainly a reasonable way to go.

Most architectures today simply wrap dma_alloc_noncoherent() to
dma_alloc_coherent(), but if there were more users of the API then
that would quickly change.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
  2010-05-07 18:23       ` Mike Frysinger
@ 2010-05-13  7:53         ` Dmitry Torokhov
  -1 siblings, 0 replies; 124+ messages in thread
From: Dmitry Torokhov @ 2010-05-13  7:53 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Daniel Glöckner, Oskar Schirmer, Andrew Morton, linux-input,
	linux-kernel, Oliver Schneidewind, Michael Hennerich

On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
> On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
> > On 05/06/2010 08:26 PM, Mike Frysinger wrote:
> >> i think it'd be a better idea to do something like:
> >>   if (spi->bits_per_word != 16) {
> >>     if (spi->bits_per_word) {
> >>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
> >>       return -EINVAL;
> >>     }
> >>     spi->bits_per_word = 16;
> >>     spi_setup(spi);
> >>   }
> >
> > There is no way to set bits_per_word using struct spi_board_info. The
> > description of that structure in spi.h explicitly lists the wordsize as
> > one of the parameters drivers should set themself in probe().
> >
> > Only struct bfin5xx_spi_chip allows to set this value in the board code.
> 
> an obvious shortcoming in the SPI framework that should be fixed, but
> that doesnt make any difference to the above code now does it ?  it'll
> operate correctly regardless of the SPI bus master.

So is the updated patch coming?

-- 
Dmitry

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-13  7:53         ` Dmitry Torokhov
  0 siblings, 0 replies; 124+ messages in thread
From: Dmitry Torokhov @ 2010-05-13  7:53 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Daniel Glöckner, Oskar Schirmer, Andrew Morton, linux-input,
	linux-kernel, Oliver Schneidewind, Michael Hennerich

On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
> On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
> > On 05/06/2010 08:26 PM, Mike Frysinger wrote:
> >> i think it'd be a better idea to do something like:
> >>   if (spi->bits_per_word != 16) {
> >>     if (spi->bits_per_word) {
> >>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
> >>       return -EINVAL;
> >>     }
> >>     spi->bits_per_word = 16;
> >>     spi_setup(spi);
> >>   }
> >
> > There is no way to set bits_per_word using struct spi_board_info. The
> > description of that structure in spi.h explicitly lists the wordsize as
> > one of the parameters drivers should set themself in probe().
> >
> > Only struct bfin5xx_spi_chip allows to set this value in the board code.
> 
> an obvious shortcoming in the SPI framework that should be fixed, but
> that doesnt make any difference to the above code now does it ?  it'll
> operate correctly regardless of the SPI bus master.

So is the updated patch coming?

-- 
Dmitry
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
  2010-05-13  7:53         ` Dmitry Torokhov
@ 2010-05-15 18:15           ` Oskar Schirmer
  -1 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-15 18:15 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Mike Frysinger, Daniel Glöckner, Oskar Schirmer,
	Andrew Morton, linux-input, linux-kernel, Oliver Schneidewind,
	Michael Hennerich

On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
> > On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
> > > On 05/06/2010 08:26 PM, Mike Frysinger wrote:
> > >> i think it'd be a better idea to do something like:
> > >>   if (spi->bits_per_word != 16) {
> > >>     if (spi->bits_per_word) {
> > >>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
> > >>       return -EINVAL;
> > >>     }
> > >>     spi->bits_per_word = 16;
> > >>     spi_setup(spi);
> > >>   }
> > >
> > > There is no way to set bits_per_word using struct spi_board_info. The
> > > description of that structure in spi.h explicitly lists the wordsize as
> > > one of the parameters drivers should set themself in probe().
> > >
> > > Only struct bfin5xx_spi_chip allows to set this value in the board code.
> > 
> > an obvious shortcoming in the SPI framework that should be fixed, but
> > that doesnt make any difference to the above code now does it ?  it'll
> > operate correctly regardless of the SPI bus master.
> 
> So is the updated patch coming?

The basic question I see is, whether it is in the
responsibility of ad7877 to check a wrong setting
possibly caused in board specific code. If so,
then the proposal by Mike should be used, but if not
so, it would introduce unneeded code.

Remember: both versions end up in correctly setting
bits_per_word, with the difference merely in feedback
level.

This is a design decision, I'ld say. So what are the
opinions on it, has it been taken yet, previously?

  Oskar
-- 
oskar schirmer, emlix gmbh, http://www.emlix.com
fon +49 551 30664-0, fax -11, bahnhofsallee 1b, 37081 göttingen, germany
sitz der gesellschaft: göttingen, amtsgericht göttingen hr b 3160
geschäftsführer: dr. uwe kracke, ust-idnr.: de 205 198 055

emlix - your embedded linux partner

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-15 18:15           ` Oskar Schirmer
  0 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-15 18:15 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Mike Frysinger, Daniel Glöckner, Oskar Schirmer,
	Andrew Morton, linux-input, linux-kernel, Oliver Schneidewind,
	Michael Hennerich

On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
> > On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
> > > On 05/06/2010 08:26 PM, Mike Frysinger wrote:
> > >> i think it'd be a better idea to do something like:
> > >>   if (spi->bits_per_word != 16) {
> > >>     if (spi->bits_per_word) {
> > >>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
> > >>       return -EINVAL;
> > >>     }
> > >>     spi->bits_per_word = 16;
> > >>     spi_setup(spi);
> > >>   }
> > >
> > > There is no way to set bits_per_word using struct spi_board_info. The
> > > description of that structure in spi.h explicitly lists the wordsize as
> > > one of the parameters drivers should set themself in probe().
> > >
> > > Only struct bfin5xx_spi_chip allows to set this value in the board code.
> > 
> > an obvious shortcoming in the SPI framework that should be fixed, but
> > that doesnt make any difference to the above code now does it ?  it'll
> > operate correctly regardless of the SPI bus master.
> 
> So is the updated patch coming?

The basic question I see is, whether it is in the
responsibility of ad7877 to check a wrong setting
possibly caused in board specific code. If so,
then the proposal by Mike should be used, but if not
so, it would introduce unneeded code.

Remember: both versions end up in correctly setting
bits_per_word, with the difference merely in feedback
level.

This is a design decision, I'ld say. So what are the
opinions on it, has it been taken yet, previously?

  Oskar
-- 
oskar schirmer, emlix gmbh, http://www.emlix.com
fon +49 551 30664-0, fax -11, bahnhofsallee 1b, 37081 göttingen, germany
sitz der gesellschaft: göttingen, amtsgericht göttingen hr b 3160
geschäftsführer: dr. uwe kracke, ust-idnr.: de 205 198 055

emlix - your embedded linux partner
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
  2010-05-15 18:15           ` Oskar Schirmer
@ 2010-05-16 19:25             ` Mike Frysinger
  -1 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-16 19:25 UTC (permalink / raw)
  To: Oskar Schirmer
  Cc: Dmitry Torokhov, Daniel Glöckner, Andrew Morton,
	linux-input, linux-kernel, Oliver Schneidewind,
	Michael Hennerich

On Sat, May 15, 2010 at 14:15, Oskar Schirmer wrote:
> On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
>> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
>> > On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
>> > > On 05/06/2010 08:26 PM, Mike Frysinger wrote:
>> > >> i think it'd be a better idea to do something like:
>> > >>   if (spi->bits_per_word != 16) {
>> > >>     if (spi->bits_per_word) {
>> > >>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
>> > >>       return -EINVAL;
>> > >>     }
>> > >>     spi->bits_per_word = 16;
>> > >>     spi_setup(spi);
>> > >>   }
>> > >
>> > > There is no way to set bits_per_word using struct spi_board_info. The
>> > > description of that structure in spi.h explicitly lists the wordsize as
>> > > one of the parameters drivers should set themself in probe().
>> > >
>> > > Only struct bfin5xx_spi_chip allows to set this value in the board code.
>> >
>> > an obvious shortcoming in the SPI framework that should be fixed, but
>> > that doesnt make any difference to the above code now does it ?  it'll
>> > operate correctly regardless of the SPI bus master.
>>
>> So is the updated patch coming?
>
> The basic question I see is, whether it is in the
> responsibility of ad7877 to check a wrong setting
> possibly caused in board specific code. If so,
> then the proposal by Mike should be used, but if not
> so, it would introduce unneeded code.
>
> Remember: both versions end up in correctly setting
> bits_per_word, with the difference merely in feedback
> level.

imo, unsupported board settings should always be detected & rejected.
all SPI master drivers do this (detect & reject unsupported SPI slave
settings).
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-16 19:25             ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-16 19:25 UTC (permalink / raw)
  To: Oskar Schirmer
  Cc: Dmitry Torokhov, Daniel Glöckner, Andrew Morton,
	linux-input, linux-kernel, Oliver Schneidewind,
	Michael Hennerich

On Sat, May 15, 2010 at 14:15, Oskar Schirmer wrote:
> On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
>> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
>> > On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
>> > > On 05/06/2010 08:26 PM, Mike Frysinger wrote:
>> > >> i think it'd be a better idea to do something like:
>> > >>   if (spi->bits_per_word != 16) {
>> > >>     if (spi->bits_per_word) {
>> > >>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
>> > >>       return -EINVAL;
>> > >>     }
>> > >>     spi->bits_per_word = 16;
>> > >>     spi_setup(spi);
>> > >>   }
>> > >
>> > > There is no way to set bits_per_word using struct spi_board_info. The
>> > > description of that structure in spi.h explicitly lists the wordsize as
>> > > one of the parameters drivers should set themself in probe().
>> > >
>> > > Only struct bfin5xx_spi_chip allows to set this value in the board code.
>> >
>> > an obvious shortcoming in the SPI framework that should be fixed, but
>> > that doesnt make any difference to the above code now does it ?  it'll
>> > operate correctly regardless of the SPI bus master.
>>
>> So is the updated patch coming?
>
> The basic question I see is, whether it is in the
> responsibility of ad7877 to check a wrong setting
> possibly caused in board specific code. If so,
> then the proposal by Mike should be used, but if not
> so, it would introduce unneeded code.
>
> Remember: both versions end up in correctly setting
> bits_per_word, with the difference merely in feedback
> level.

imo, unsupported board settings should always be detected & rejected.
all SPI master drivers do this (detect & reject unsupported SPI slave
settings).
-mike
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 22:39                                         ` Mike Frysinger
  (?)
  (?)
@ 2010-05-17  6:12                                         ` Michal Simek
  -1 siblings, 0 replies; 124+ messages in thread
From: Michal Simek @ 2010-05-17  6:12 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Dmitry Torokhov, Johannes Weiner, Christoph Lameter,
	Pekka Enberg, Matt Mackall, Andrew Morton, Oskar Schirmer,
	Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, Nick Piggin,
	David Rientjes, David Brownell, Grant Likely

Mike Frysinger wrote:
> On Tue, May 11, 2010 at 17:53, Dmitry Torokhov wrote:
>> On Tue, May 11, 2010 at 11:48:36PM +0200, Johannes Weiner wrote:
>>> On Tue, May 11, 2010 at 04:54:41PM -0400, Mike Frysinger wrote:
>>>> does the phrase "DMA safe buffer" imply cache alignment ?
>>> I guess that depends on the architectural requirements.  On x86,
>>> apparently, not so much.  On ARM, probably yes, as it's the
>>> requirement to properly maintain coherency.
>> It looks liek ARM (and a few others) do this:
>>
>> [dtor@hammer work]$ grep -r ARCH_KMALLOC_MINALIGN arch/
>> arch/sh/include/asm/page.h:#define ARCH_KMALLOC_MINALIGN        L1_CACHE_BYTES
>> arch/frv/include/asm/mem-layout.h:#define       ARCH_KMALLOC_MINALIGN           8
>> arch/powerpc/include/asm/page_32.h:#define ARCH_KMALLOC_MINALIGN        L1_CACHE_BYTES
>> arch/arm/include/asm/cache.h:#define ARCH_KMALLOC_MINALIGN      L1_CACHE_BYTES
>> arch/microblaze/include/asm/page.h:#define ARCH_KMALLOC_MINALIGN        L1_CACHE_BYTES
>> arch/mips/include/asm/mach-tx49xx/kmalloc.h: * All happy, no need to define ARCH_KMALLOC_MINALIGN
>> arch/mips/include/asm/mach-ip27/kmalloc.h: * All happy, no need to define ARCH_KMALLOC_MINALIGN
>> arch/mips/include/asm/mach-ip32/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 32
>> arch/mips/include/asm/mach-ip32/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 128
>> arch/mips/include/asm/mach-generic/kmalloc.h:#define ARCH_KMALLOC_MINALIGN      128
>> arch/avr32/include/asm/cache.h:#define ARCH_KMALLOC_MINALIGN    L1_CACHE_BYTES
> 
> if ARCH_KMALLOC_MINALIGN is not defined, the current allocators default to:
>  slub - alignof(unsigned long long)
>  slab - alignof(unsigned long long)
>  slob - alignof(unsigned long)
> which for many arches can mean an alignment of merely 4 or 8
> 
> lets look at the cacheline sizes for arches that dont set
> ARCH_KMALLOC_MINALIGN to L1_CACHE_BYTES:
>  - alplha - 32 or 64
>  - frv - 32 or 64
>  - blackfin - 32
>  - parisc - 32 or 64
>  - mn10300 - 16
>  - s390 - 256
>  - score - 16
>  - sparc - 32
>  - xtensa - 16 or 32

Just for completion.

microblaze - 16 or 32.

Michal

-- 
Michal Simek, Ing. (M.Eng)
PetaLogix - Linux Solutions for a Reconfigurable World
w: www.petalogix.com p: +61-7-30090663,+42-0-721842854 f: +61-7-30090663

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
  2010-05-16 19:25             ` Mike Frysinger
@ 2010-05-17  7:29               ` Oskar Schirmer
  -1 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-17  7:29 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Oskar Schirmer, Dmitry Torokhov, Daniel Glöckner,
	Andrew Morton, linux-input, linux-kernel, Oliver Schneidewind,
	Michael Hennerich

On Sun, May 16, 2010 at 15:25:34 -0400, Mike Frysinger wrote:
> On Sat, May 15, 2010 at 14:15, Oskar Schirmer wrote:
> > On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
> >> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
> >> > On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
> >> > > On 05/06/2010 08:26 PM, Mike Frysinger wrote:
> >> > >> i think it'd be a better idea to do something like:
> >> > >>   if (spi->bits_per_word != 16) {
> >> > >>     if (spi->bits_per_word) {
> >> > >>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
> >> > >>       return -EINVAL;
> >> > >>     }
> >> > >>     spi->bits_per_word = 16;
> >> > >>     spi_setup(spi);
> >> > >>   }
> >> > >
> >> > > There is no way to set bits_per_word using struct spi_board_info. The
> >> > > description of that structure in spi.h explicitly lists the wordsize as
> >> > > one of the parameters drivers should set themself in probe().
> >> > >
> >> > > Only struct bfin5xx_spi_chip allows to set this value in the board code.
> >> >
> >> > an obvious shortcoming in the SPI framework that should be fixed, but
> >> > that doesnt make any difference to the above code now does it ?  it'll
> >> > operate correctly regardless of the SPI bus master.
> >>
> >> So is the updated patch coming?
> >
> > The basic question I see is, whether it is in the
> > responsibility of ad7877 to check a wrong setting
> > possibly caused in board specific code. If so,
> > then the proposal by Mike should be used, but if not
> > so, it would introduce unneeded code.
> >
> > Remember: both versions end up in correctly setting
> > bits_per_word, with the difference merely in feedback
> > level.
> 
> imo, unsupported board settings should always be detected & rejected.
> all SPI master drivers do this (detect & reject unsupported SPI slave
> settings).

please note, that bits_per_word is not a board setting,
it's a demand of the device. consequently, there is no one
to set unsupported values and thus none to be detected.

the only architecture setting bits_per_word thru spi_chip
is blackfin, but I cannot see a good reason, why the board
settings should engage with a fixed demand of the device?

  Oskar
-- 
oskar schirmer, emlix gmbh, http://www.emlix.com
fon +49 551 30664-0, fax -11, bahnhofsallee 1b, 37081 göttingen, germany
sitz der gesellschaft: göttingen, amtsgericht göttingen hr b 3160
geschäftsführer: dr. uwe kracke, ust-idnr.: de 205 198 055

emlix - your embedded linux partner

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-17  7:29               ` Oskar Schirmer
  0 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-17  7:29 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Oskar Schirmer, Dmitry Torokhov, Daniel Glöckner,
	Andrew Morton, linux-input, linux-kernel, Oliver Schneidewind,
	Michael Hennerich

On Sun, May 16, 2010 at 15:25:34 -0400, Mike Frysinger wrote:
> On Sat, May 15, 2010 at 14:15, Oskar Schirmer wrote:
> > On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
> >> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
> >> > On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
> >> > > On 05/06/2010 08:26 PM, Mike Frysinger wrote:
> >> > >> i think it'd be a better idea to do something like:
> >> > >>   if (spi->bits_per_word != 16) {
> >> > >>     if (spi->bits_per_word) {
> >> > >>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
> >> > >>       return -EINVAL;
> >> > >>     }
> >> > >>     spi->bits_per_word = 16;
> >> > >>     spi_setup(spi);
> >> > >>   }
> >> > >
> >> > > There is no way to set bits_per_word using struct spi_board_info. The
> >> > > description of that structure in spi.h explicitly lists the wordsize as
> >> > > one of the parameters drivers should set themself in probe().
> >> > >
> >> > > Only struct bfin5xx_spi_chip allows to set this value in the board code.
> >> >
> >> > an obvious shortcoming in the SPI framework that should be fixed, but
> >> > that doesnt make any difference to the above code now does it ?  it'll
> >> > operate correctly regardless of the SPI bus master.
> >>
> >> So is the updated patch coming?
> >
> > The basic question I see is, whether it is in the
> > responsibility of ad7877 to check a wrong setting
> > possibly caused in board specific code. If so,
> > then the proposal by Mike should be used, but if not
> > so, it would introduce unneeded code.
> >
> > Remember: both versions end up in correctly setting
> > bits_per_word, with the difference merely in feedback
> > level.
> 
> imo, unsupported board settings should always be detected & rejected.
> all SPI master drivers do this (detect & reject unsupported SPI slave
> settings).

please note, that bits_per_word is not a board setting,
it's a demand of the device. consequently, there is no one
to set unsupported values and thus none to be detected.

the only architecture setting bits_per_word thru spi_chip
is blackfin, but I cannot see a good reason, why the board
settings should engage with a fixed demand of the device?

  Oskar
-- 
oskar schirmer, emlix gmbh, http://www.emlix.com
fon +49 551 30664-0, fax -11, bahnhofsallee 1b, 37081 göttingen, germany
sitz der gesellschaft: göttingen, amtsgericht göttingen hr b 3160
geschäftsführer: dr. uwe kracke, ust-idnr.: de 205 198 055

emlix - your embedded linux partner
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* RE: [PATCH] ad7877: fix spi word size to 16 bit
  2010-05-17  7:29               ` Oskar Schirmer
@ 2010-05-17  8:14                 ` Hennerich, Michael
  -1 siblings, 0 replies; 124+ messages in thread
From: Hennerich, Michael @ 2010-05-17  8:14 UTC (permalink / raw)
  To: Oskar Schirmer, Mike Frysinger
  Cc: Dmitry Torokhov, Daniel Glöckner, Andrew Morton,
	linux-input, linux-kernel, Oliver Schneidewind

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 2932 bytes --]

Oskar Schirmer wrote on 2010-05-17:
> On Sun, May 16, 2010 at 15:25:34 -0400, Mike Frysinger wrote:
>> On Sat, May 15, 2010 at 14:15, Oskar Schirmer wrote:
>>> On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
>>>> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
>>>>> On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
>>>>>> On 05/06/2010 08:26 PM, Mike Frysinger wrote:
>>>>>>> i think it'd be a better idea to do something like:
>>>>>>>   if (spi->bits_per_word != 16) {
>>>>>>>     if (spi->bits_per_word) {
>>>>>>>       dev_err(&spi->dev, "Invalid SPI settings;
>>>>>>> bits_per_word must be 16\n");
>>>>>>>       return -EINVAL;
>>>>>>>     }
>>>>>>>     spi->bits_per_word = 16;
>>>>>>>     spi_setup(spi);
>>>>>>>   }
>>>>>>
>>>>>> There is no way to set bits_per_word using struct spi_board_info.
>>>>>> The description of that structure in spi.h explicitly lists the
>>>>>> wordsize as one of the parameters drivers should set themself in
>>>>>> probe().
>>>>>>
>>>>>> Only struct bfin5xx_spi_chip allows to set this value in the
> board code.
>>>>>
>>>>> an obvious shortcoming in the SPI framework that should be fixed,
>>>>> but that doesnt make any difference to the above code now does it ?
>>>>>  it'll operate correctly regardless of the SPI bus master.
>>>>
>>>> So is the updated patch coming?
>>>
>>> The basic question I see is, whether it is in the responsibility
>>> of
>>> ad7877 to check a wrong setting possibly caused in board specific
>>> code. If so, then the proposal by Mike should be used, but if not
>>> so, it would introduce unneeded code.
>>>
>>> Remember: both versions end up in correctly setting bits_per_word,
>>> with the difference merely in feedback level.
>>
>> imo, unsupported board settings should always be detected & rejected.
>> all SPI master drivers do this (detect & reject unsupported SPI
>> slave settings).
>
> please note, that bits_per_word is not a board setting, it's a demand
> of the device. consequently, there is no one to set unsupported values
> and thus none to be detected.
>
> the only architecture setting bits_per_word thru spi_chip is blackfin,
> but I cannot see a good reason, why the board settings should engage
> with a fixed demand of the device?
>
>   Oskar

100% agreed.

Two ways to address the issue:
1) Forcing spi->bits_per_word = 16 like this patch does.
2) Or going with the default 8-bit transfers and using be16_to_cpu().

Personally I prefer 1) unless someone tells me that not all SPI bus drivers
support 16-bit transfers.

Greetings,
Michael

Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 4036 Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif


ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 124+ messages in thread

* RE: [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-17  8:14                 ` Hennerich, Michael
  0 siblings, 0 replies; 124+ messages in thread
From: Hennerich, Michael @ 2010-05-17  8:14 UTC (permalink / raw)
  To: Oskar Schirmer, Mike Frysinger
  Cc: Dmitry Torokhov, Daniel Glöckner, Andrew Morton,
	linux-input, linux-kernel, Oliver Schneidewind

Oskar Schirmer wrote on 2010-05-17:
> On Sun, May 16, 2010 at 15:25:34 -0400, Mike Frysinger wrote:
>> On Sat, May 15, 2010 at 14:15, Oskar Schirmer wrote:
>>> On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
>>>> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
>>>>> On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
>>>>>> On 05/06/2010 08:26 PM, Mike Frysinger wrote:
>>>>>>> i think it'd be a better idea to do something like:
>>>>>>>   if (spi->bits_per_word != 16) {
>>>>>>>     if (spi->bits_per_word) {
>>>>>>>       dev_err(&spi->dev, "Invalid SPI settings;
>>>>>>> bits_per_word must be 16\n");
>>>>>>>       return -EINVAL;
>>>>>>>     }
>>>>>>>     spi->bits_per_word = 16;
>>>>>>>     spi_setup(spi);
>>>>>>>   }
>>>>>>
>>>>>> There is no way to set bits_per_word using struct spi_board_info.
>>>>>> The description of that structure in spi.h explicitly lists the
>>>>>> wordsize as one of the parameters drivers should set themself in
>>>>>> probe().
>>>>>>
>>>>>> Only struct bfin5xx_spi_chip allows to set this value in the
> board code.
>>>>>
>>>>> an obvious shortcoming in the SPI framework that should be fixed,
>>>>> but that doesnt make any difference to the above code now does it ?
>>>>>  it'll operate correctly regardless of the SPI bus master.
>>>>
>>>> So is the updated patch coming?
>>>
>>> The basic question I see is, whether it is in the responsibility
>>> of
>>> ad7877 to check a wrong setting possibly caused in board specific
>>> code. If so, then the proposal by Mike should be used, but if not
>>> so, it would introduce unneeded code.
>>>
>>> Remember: both versions end up in correctly setting bits_per_word,
>>> with the difference merely in feedback level.
>>
>> imo, unsupported board settings should always be detected & rejected.
>> all SPI master drivers do this (detect & reject unsupported SPI
>> slave settings).
>
> please note, that bits_per_word is not a board setting, it's a demand
> of the device. consequently, there is no one to set unsupported values
> and thus none to be detected.
>
> the only architecture setting bits_per_word thru spi_chip is blackfin,
> but I cannot see a good reason, why the board settings should engage
> with a fixed demand of the device?
>
>   Oskar

100% agreed.

Two ways to address the issue:
1) Forcing spi->bits_per_word = 16 like this patch does.
2) Or going with the default 8-bit transfers and using be16_to_cpu().

Personally I prefer 1) unless someone tells me that not all SPI bus drivers
support 16-bit transfers.

Greetings,
Michael

Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 4036 Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif



^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
  2010-05-17  8:14                 ` Hennerich, Michael
@ 2010-05-17  8:41                   ` Oskar Schirmer
  -1 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-17  8:41 UTC (permalink / raw)
  To: Hennerich, Michael
  Cc: Oskar Schirmer, Mike Frysinger, Dmitry Torokhov,
	Daniel Glöckner, Andrew Morton, linux-input, linux-kernel,
	Oliver Schneidewind

On Mon, May 17, 2010 at 09:14:39 +0100, Hennerich, Michael wrote:
> Oskar Schirmer wrote on 2010-05-17:
> > On Sun, May 16, 2010 at 15:25:34 -0400, Mike Frysinger wrote:
> >> On Sat, May 15, 2010 at 14:15, Oskar Schirmer wrote:
> >>> On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
> >>>> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
> >>>>> On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
> >>>>>> On 05/06/2010 08:26 PM, Mike Frysinger wrote:
> >>>>>>> i think it'd be a better idea to do something like:
> >>>>>>>   if (spi->bits_per_word != 16) {
> >>>>>>>     if (spi->bits_per_word) {
> >>>>>>>       dev_err(&spi->dev, "Invalid SPI settings;
> >>>>>>> bits_per_word must be 16\n");
> >>>>>>>       return -EINVAL;
> >>>>>>>     }
> >>>>>>>     spi->bits_per_word = 16;
> >>>>>>>     spi_setup(spi);
> >>>>>>>   }
> >>>>>>
> >>>>>> There is no way to set bits_per_word using struct spi_board_info.
> >>>>>> The description of that structure in spi.h explicitly lists the
> >>>>>> wordsize as one of the parameters drivers should set themself in
> >>>>>> probe().
> >>>>>>
> >>>>>> Only struct bfin5xx_spi_chip allows to set this value in the
> > board code.
> >>>>>
> >>>>> an obvious shortcoming in the SPI framework that should be fixed,
> >>>>> but that doesnt make any difference to the above code now does it ?
> >>>>>  it'll operate correctly regardless of the SPI bus master.
> >>>>
> >>>> So is the updated patch coming?
> >>>
> >>> The basic question I see is, whether it is in the responsibility
> >>> of
> >>> ad7877 to check a wrong setting possibly caused in board specific
> >>> code. If so, then the proposal by Mike should be used, but if not
> >>> so, it would introduce unneeded code.
> >>>
> >>> Remember: both versions end up in correctly setting bits_per_word,
> >>> with the difference merely in feedback level.
> >>
> >> imo, unsupported board settings should always be detected & rejected.
> >> all SPI master drivers do this (detect & reject unsupported SPI
> >> slave settings).
> >
> > please note, that bits_per_word is not a board setting, it's a demand
> > of the device. consequently, there is no one to set unsupported values
> > and thus none to be detected.
> >
> > the only architecture setting bits_per_word thru spi_chip is blackfin,
> > but I cannot see a good reason, why the board settings should engage
> > with a fixed demand of the device?
> >
> >   Oskar
> 
> 100% agreed.
> 
> Two ways to address the issue:
> 1) Forcing spi->bits_per_word = 16 like this patch does.
> 2) Or going with the default 8-bit transfers and using be16_to_cpu().
> 
> Personally I prefer 1) unless someone tells me that not all SPI bus drivers
> support 16-bit transfers.

yes, I prefer (1) too.

if (2) is for the bus driver that didn't support
16 bit up to now, though the peripheral unit does
support it, that bus driver needs a fix.

but then, if (2) is for, e.g., a 16 bit only device wired
to an 8 bit only bus driver, which, I'ld say, is a
board design fault. sure one may want to implement an
ugly software work around for it, just to rescue some
high budget project and managers calculated that messing
up the software will cost less then having another
board design cycle, but I'ld better propose to give the
hardware engineer the whip.

  Oskar
-- 
oskar schirmer, emlix gmbh, http://www.emlix.com
fon +49 551 30664-0, fax -11, bahnhofsallee 1b, 37081 göttingen, germany
sitz der gesellschaft: göttingen, amtsgericht göttingen hr b 3160
geschäftsführer: dr. uwe kracke, ust-idnr.: de 205 198 055

emlix - your embedded linux partner

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-17  8:41                   ` Oskar Schirmer
  0 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-17  8:41 UTC (permalink / raw)
  To: Hennerich, Michael
  Cc: Oskar Schirmer, Mike Frysinger, Dmitry Torokhov,
	Daniel Glöckner, Andrew Morton, linux-input, linux-kernel,
	Oliver Schneidewind

On Mon, May 17, 2010 at 09:14:39 +0100, Hennerich, Michael wrote:
> Oskar Schirmer wrote on 2010-05-17:
> > On Sun, May 16, 2010 at 15:25:34 -0400, Mike Frysinger wrote:
> >> On Sat, May 15, 2010 at 14:15, Oskar Schirmer wrote:
> >>> On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
> >>>> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
> >>>>> On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
> >>>>>> On 05/06/2010 08:26 PM, Mike Frysinger wrote:
> >>>>>>> i think it'd be a better idea to do something like:
> >>>>>>>   if (spi->bits_per_word != 16) {
> >>>>>>>     if (spi->bits_per_word) {
> >>>>>>>       dev_err(&spi->dev, "Invalid SPI settings;
> >>>>>>> bits_per_word must be 16\n");
> >>>>>>>       return -EINVAL;
> >>>>>>>     }
> >>>>>>>     spi->bits_per_word = 16;
> >>>>>>>     spi_setup(spi);
> >>>>>>>   }
> >>>>>>
> >>>>>> There is no way to set bits_per_word using struct spi_board_info.
> >>>>>> The description of that structure in spi.h explicitly lists the
> >>>>>> wordsize as one of the parameters drivers should set themself in
> >>>>>> probe().
> >>>>>>
> >>>>>> Only struct bfin5xx_spi_chip allows to set this value in the
> > board code.
> >>>>>
> >>>>> an obvious shortcoming in the SPI framework that should be fixed,
> >>>>> but that doesnt make any difference to the above code now does it ?
> >>>>>  it'll operate correctly regardless of the SPI bus master.
> >>>>
> >>>> So is the updated patch coming?
> >>>
> >>> The basic question I see is, whether it is in the responsibility
> >>> of
> >>> ad7877 to check a wrong setting possibly caused in board specific
> >>> code. If so, then the proposal by Mike should be used, but if not
> >>> so, it would introduce unneeded code.
> >>>
> >>> Remember: both versions end up in correctly setting bits_per_word,
> >>> with the difference merely in feedback level.
> >>
> >> imo, unsupported board settings should always be detected & rejected.
> >> all SPI master drivers do this (detect & reject unsupported SPI
> >> slave settings).
> >
> > please note, that bits_per_word is not a board setting, it's a demand
> > of the device. consequently, there is no one to set unsupported values
> > and thus none to be detected.
> >
> > the only architecture setting bits_per_word thru spi_chip is blackfin,
> > but I cannot see a good reason, why the board settings should engage
> > with a fixed demand of the device?
> >
> >   Oskar
> 
> 100% agreed.
> 
> Two ways to address the issue:
> 1) Forcing spi->bits_per_word = 16 like this patch does.
> 2) Or going with the default 8-bit transfers and using be16_to_cpu().
> 
> Personally I prefer 1) unless someone tells me that not all SPI bus drivers
> support 16-bit transfers.

yes, I prefer (1) too.

if (2) is for the bus driver that didn't support
16 bit up to now, though the peripheral unit does
support it, that bus driver needs a fix.

but then, if (2) is for, e.g., a 16 bit only device wired
to an 8 bit only bus driver, which, I'ld say, is a
board design fault. sure one may want to implement an
ugly software work around for it, just to rescue some
high budget project and managers calculated that messing
up the software will cost less then having another
board design cycle, but I'ld better propose to give the
hardware engineer the whip.

  Oskar
-- 
oskar schirmer, emlix gmbh, http://www.emlix.com
fon +49 551 30664-0, fax -11, bahnhofsallee 1b, 37081 göttingen, germany
sitz der gesellschaft: göttingen, amtsgericht göttingen hr b 3160
geschäftsführer: dr. uwe kracke, ust-idnr.: de 205 198 055

emlix - your embedded linux partner
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
  2010-05-17  7:29               ` Oskar Schirmer
@ 2010-05-17 23:49                 ` Mike Frysinger
  -1 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-17 23:49 UTC (permalink / raw)
  To: Oskar Schirmer
  Cc: Dmitry Torokhov, Daniel Glöckner, Andrew Morton,
	linux-input, linux-kernel, Oliver Schneidewind,
	Michael Hennerich

2010/5/17 Oskar Schirmer:
> On Sun, May 16, 2010 at 15:25:34 -0400, Mike Frysinger wrote:
>> On Sat, May 15, 2010 at 14:15, Oskar Schirmer wrote:
>> > On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
>> >> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
>> >> > On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
>> >> > > On 05/06/2010 08:26 PM, Mike Frysinger wrote:
>> >> > >> i think it'd be a better idea to do something like:
>> >> > >>   if (spi->bits_per_word != 16) {
>> >> > >>     if (spi->bits_per_word) {
>> >> > >>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
>> >> > >>       return -EINVAL;
>> >> > >>     }
>> >> > >>     spi->bits_per_word = 16;
>> >> > >>     spi_setup(spi);
>> >> > >>   }
>> >> > >
>> >> > > There is no way to set bits_per_word using struct spi_board_info. The
>> >> > > description of that structure in spi.h explicitly lists the wordsize as
>> >> > > one of the parameters drivers should set themself in probe().
>> >> > >
>> >> > > Only struct bfin5xx_spi_chip allows to set this value in the board code.
>> >> >
>> >> > an obvious shortcoming in the SPI framework that should be fixed, but
>> >> > that doesnt make any difference to the above code now does it ?  it'll
>> >> > operate correctly regardless of the SPI bus master.
>> >>
>> >> So is the updated patch coming?
>> >
>> > The basic question I see is, whether it is in the
>> > responsibility of ad7877 to check a wrong setting
>> > possibly caused in board specific code. If so,
>> > then the proposal by Mike should be used, but if not
>> > so, it would introduce unneeded code.
>> >
>> > Remember: both versions end up in correctly setting
>> > bits_per_word, with the difference merely in feedback
>> > level.
>>
>> imo, unsupported board settings should always be detected & rejected.
>> all SPI master drivers do this (detect & reject unsupported SPI slave
>> settings).
>
> please note, that bits_per_word is not a board setting,
> it's a demand of the device. consequently, there is no one
> to set unsupported values and thus none to be detected.
>
> the only architecture setting bits_per_word thru spi_chip
> is blackfin, but I cannot see a good reason, why the board
> settings should engage with a fixed demand of the device?

you're telling me that every single SPI device out there can only ever
operate in one specific bit length or lacks optional settings ?  i
find that hard to believe which means it does make sense to let the
boards pick a length when appropriate.

the board settings also function as default setup values so when using
generic things like the spidev driver, there is no kernel driver to
request the desired bitmode.

the simple code i posted addresses your concerns as well as reject
invalid settings (wherever they may originate).  if you're not going
to post an updated patch, i'll take the simple one Michael committed
and post that.
-mike

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-17 23:49                 ` Mike Frysinger
  0 siblings, 0 replies; 124+ messages in thread
From: Mike Frysinger @ 2010-05-17 23:49 UTC (permalink / raw)
  To: Oskar Schirmer
  Cc: Dmitry Torokhov, Daniel Glöckner, Andrew Morton,
	linux-input, linux-kernel, Oliver Schneidewind,
	Michael Hennerich

2010/5/17 Oskar Schirmer:
> On Sun, May 16, 2010 at 15:25:34 -0400, Mike Frysinger wrote:
>> On Sat, May 15, 2010 at 14:15, Oskar Schirmer wrote:
>> > On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
>> >> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
>> >> > On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
>> >> > > On 05/06/2010 08:26 PM, Mike Frysinger wrote:
>> >> > >> i think it'd be a better idea to do something like:
>> >> > >>   if (spi->bits_per_word != 16) {
>> >> > >>     if (spi->bits_per_word) {
>> >> > >>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
>> >> > >>       return -EINVAL;
>> >> > >>     }
>> >> > >>     spi->bits_per_word = 16;
>> >> > >>     spi_setup(spi);
>> >> > >>   }
>> >> > >
>> >> > > There is no way to set bits_per_word using struct spi_board_info. The
>> >> > > description of that structure in spi.h explicitly lists the wordsize as
>> >> > > one of the parameters drivers should set themself in probe().
>> >> > >
>> >> > > Only struct bfin5xx_spi_chip allows to set this value in the board code.
>> >> >
>> >> > an obvious shortcoming in the SPI framework that should be fixed, but
>> >> > that doesnt make any difference to the above code now does it ?  it'll
>> >> > operate correctly regardless of the SPI bus master.
>> >>
>> >> So is the updated patch coming?
>> >
>> > The basic question I see is, whether it is in the
>> > responsibility of ad7877 to check a wrong setting
>> > possibly caused in board specific code. If so,
>> > then the proposal by Mike should be used, but if not
>> > so, it would introduce unneeded code.
>> >
>> > Remember: both versions end up in correctly setting
>> > bits_per_word, with the difference merely in feedback
>> > level.
>>
>> imo, unsupported board settings should always be detected & rejected.
>> all SPI master drivers do this (detect & reject unsupported SPI slave
>> settings).
>
> please note, that bits_per_word is not a board setting,
> it's a demand of the device. consequently, there is no one
> to set unsupported values and thus none to be detected.
>
> the only architecture setting bits_per_word thru spi_chip
> is blackfin, but I cannot see a good reason, why the board
> settings should engage with a fixed demand of the device?

you're telling me that every single SPI device out there can only ever
operate in one specific bit length or lacks optional settings ?  i
find that hard to believe which means it does make sense to let the
boards pick a length when appropriate.

the board settings also function as default setup values so when using
generic things like the spidev driver, there is no kernel driver to
request the desired bitmode.

the simple code i posted addresses your concerns as well as reject
invalid settings (wherever they may originate).  if you're not going
to post an updated patch, i'll take the simple one Michael committed
and post that.
-mike
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* RE: [PATCH] ad7877: fix spi word size to 16 bit
  2010-05-17 23:49                 ` Mike Frysinger
@ 2010-05-18  0:18                   ` H Hartley Sweeten
  -1 siblings, 0 replies; 124+ messages in thread
From: H Hartley Sweeten @ 2010-05-18  0:18 UTC (permalink / raw)
  To: Mike Frysinger, Oskar Schirmer
  Cc: Dmitry Torokhov, Daniel Glöckner, Andrew Morton,
	linux-input, linux-kernel, Oliver Schneidewind,
	Michael Hennerich

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 4487 bytes --]

On Monday, May 17, 2010 4:50 PM, Mike Frysinger wrote:
> 2010/5/17 Oskar Schirmer:
>> On Sun, May 16, 2010 at 15:25:34 -0400, Mike Frysinger wrote:
>>> On Sat, May 15, 2010 at 14:15, Oskar Schirmer wrote:
>>>> On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
>>>>> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
>>>>>> On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
>>>>>>> On 05/06/2010 08:26 PM, Mike Frysinger wrote:
>>>>>>>> i think it'd be a better idea to do something like:
>>>>>>>>   if (spi->bits_per_word != 16) {
>>>>>>>>     if (spi->bits_per_word) {
>>>>>>>>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
>>>>>>>>       return -EINVAL;
>>>>>>>>     }
>>>>>>>>     spi->bits_per_word = 16;
>>>>>>>>     spi_setup(spi);
>>>>>>>>   }
>>>>>>>
>>>>>>> There is no way to set bits_per_word using struct spi_board_info. The
>>>>>>> description of that structure in spi.h explicitly lists the wordsize as
>>>>>>> one of the parameters drivers should set themself in probe().
>>>>>> >
>>>>>>> Only struct bfin5xx_spi_chip allows to set this value in the board code.
>>>>>>
>>>>>> an obvious shortcoming in the SPI framework that should be fixed, but
>>>>>> that doesnt make any difference to the above code now does it ?  it'll
>>>>>> operate correctly regardless of the SPI bus master.
>>>>>
>>>>> So is the updated patch coming?
>>>>
>>>> The basic question I see is, whether it is in the
>>>> responsibility of ad7877 to check a wrong setting
>>>> possibly caused in board specific code. If so,
>>>> then the proposal by Mike should be used, but if not
>>>> so, it would introduce unneeded code.
>>>>
>>>> Remember: both versions end up in correctly setting
>>>> bits_per_word, with the difference merely in feedback
>>>> level.
>>>
>>> imo, unsupported board settings should always be detected & rejected.
>>> all SPI master drivers do this (detect & reject unsupported SPI slave
>>> settings).
>>
>> please note, that bits_per_word is not a board setting,
>> it's a demand of the device. consequently, there is no one
>> to set unsupported values and thus none to be detected.
>>
>> the only architecture setting bits_per_word thru spi_chip
>> is blackfin, but I cannot see a good reason, why the board
>> settings should engage with a fixed demand of the device?
>
> you're telling me that every single SPI device out there can only ever
> operate in one specific bit length or lacks optional settings ?  i
> find that hard to believe which means it does make sense to let the
> boards pick a length when appropriate.
>
> the board settings also function as default setup values so when using
> generic things like the spidev driver, there is no kernel driver to
> request the desired bitmode.
>
> the simple code i posted addresses your concerns as well as reject
> invalid settings (wherever they may originate).  if you're not going
> to post an updated patch, i'll take the simple one Michael committed
> and post that.

Just my 0.02, I haven't been following this thread very closely...

I believe it's the spi "device", i.e. the protocol, drivers responsibility
to supply the bits_per_word that it will operate with.  The spi master
driver simply validates it if can support the requested size.  If the
bits_per_word is left at the default (0) it indicates that the protocol
words are eight bit bytes.

If a protocol driver can support multiple word sizes it should probably
also support a "setup" callback during the driver probe so that the
underlying board support can provide the desired word size.  This is what
the libertas_spi driver does (drivers/net/wireless/libertas/if_spi.c).
That driver does nothing with the bits_per_word value but the callbacks in
arch/arm/mach-pxa/cm-x270.c and em-x270.c do.  They both set the bits_per_word
to 16 and then call spi_setup.  They also probably should be checking the
return value of spi_setup to make sure that the requested mode is supported
by the master driver, but they probably don't since it is already known...

If the ad7877 "only" supports 16-bit word sizes it should be setting the
bits_per_word to indicate this.  If the underlying master driver does not
support that word size it will return an error when spi_setup is called.

Just my 0.02... 

Regards,
Hartley
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 124+ messages in thread

* RE: [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-18  0:18                   ` H Hartley Sweeten
  0 siblings, 0 replies; 124+ messages in thread
From: H Hartley Sweeten @ 2010-05-18  0:18 UTC (permalink / raw)
  To: Mike Frysinger, Oskar Schirmer
  Cc: Dmitry Torokhov, Daniel Glöckner, Andrew Morton,
	linux-input, linux-kernel, Oliver Schneidewind,
	Michael Hennerich

On Monday, May 17, 2010 4:50 PM, Mike Frysinger wrote:
> 2010/5/17 Oskar Schirmer:
>> On Sun, May 16, 2010 at 15:25:34 -0400, Mike Frysinger wrote:
>>> On Sat, May 15, 2010 at 14:15, Oskar Schirmer wrote:
>>>> On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
>>>>> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
>>>>>> On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
>>>>>>> On 05/06/2010 08:26 PM, Mike Frysinger wrote:
>>>>>>>> i think it'd be a better idea to do something like:
>>>>>>>>   if (spi->bits_per_word != 16) {
>>>>>>>>     if (spi->bits_per_word) {
>>>>>>>>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
>>>>>>>>       return -EINVAL;
>>>>>>>>     }
>>>>>>>>     spi->bits_per_word = 16;
>>>>>>>>     spi_setup(spi);
>>>>>>>>   }
>>>>>>>
>>>>>>> There is no way to set bits_per_word using struct spi_board_info. The
>>>>>>> description of that structure in spi.h explicitly lists the wordsize as
>>>>>>> one of the parameters drivers should set themself in probe().
>>>>>> >
>>>>>>> Only struct bfin5xx_spi_chip allows to set this value in the board code.
>>>>>>
>>>>>> an obvious shortcoming in the SPI framework that should be fixed, but
>>>>>> that doesnt make any difference to the above code now does it ?  it'll
>>>>>> operate correctly regardless of the SPI bus master.
>>>>>
>>>>> So is the updated patch coming?
>>>>
>>>> The basic question I see is, whether it is in the
>>>> responsibility of ad7877 to check a wrong setting
>>>> possibly caused in board specific code. If so,
>>>> then the proposal by Mike should be used, but if not
>>>> so, it would introduce unneeded code.
>>>>
>>>> Remember: both versions end up in correctly setting
>>>> bits_per_word, with the difference merely in feedback
>>>> level.
>>>
>>> imo, unsupported board settings should always be detected & rejected.
>>> all SPI master drivers do this (detect & reject unsupported SPI slave
>>> settings).
>>
>> please note, that bits_per_word is not a board setting,
>> it's a demand of the device. consequently, there is no one
>> to set unsupported values and thus none to be detected.
>>
>> the only architecture setting bits_per_word thru spi_chip
>> is blackfin, but I cannot see a good reason, why the board
>> settings should engage with a fixed demand of the device?
>
> you're telling me that every single SPI device out there can only ever
> operate in one specific bit length or lacks optional settings ?  i
> find that hard to believe which means it does make sense to let the
> boards pick a length when appropriate.
>
> the board settings also function as default setup values so when using
> generic things like the spidev driver, there is no kernel driver to
> request the desired bitmode.
>
> the simple code i posted addresses your concerns as well as reject
> invalid settings (wherever they may originate).  if you're not going
> to post an updated patch, i'll take the simple one Michael committed
> and post that.

Just my 0.02, I haven't been following this thread very closely...

I believe it's the spi "device", i.e. the protocol, drivers responsibility
to supply the bits_per_word that it will operate with.  The spi master
driver simply validates it if can support the requested size.  If the
bits_per_word is left at the default (0) it indicates that the protocol
words are eight bit bytes.

If a protocol driver can support multiple word sizes it should probably
also support a "setup" callback during the driver probe so that the
underlying board support can provide the desired word size.  This is what
the libertas_spi driver does (drivers/net/wireless/libertas/if_spi.c).
That driver does nothing with the bits_per_word value but the callbacks in
arch/arm/mach-pxa/cm-x270.c and em-x270.c do.  They both set the bits_per_word
to 16 and then call spi_setup.  They also probably should be checking the
return value of spi_setup to make sure that the requested mode is supported
by the master driver, but they probably don't since it is already known...

If the ad7877 "only" supports 16-bit word sizes it should be setting the
bits_per_word to indicate this.  If the underlying master driver does not
support that word size it will return an error when spi_setup is called.

Just my 0.02... 

Regards,
Hartley

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
  2010-05-18  0:18                   ` H Hartley Sweeten
@ 2010-05-18  8:32                     ` Oskar Schirmer
  -1 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-18  8:32 UTC (permalink / raw)
  To: H Hartley Sweeten
  Cc: Mike Frysinger, Oskar Schirmer, Dmitry Torokhov,
	Daniel Glöckner, Andrew Morton, linux-input, linux-kernel,
	Oliver Schneidewind, Michael Hennerich

On Mon, May 17, 2010 at 19:18:28 -0500, H Hartley Sweeten wrote:
> On Monday, May 17, 2010 4:50 PM, Mike Frysinger wrote:
> > 2010/5/17 Oskar Schirmer:
> >> On Sun, May 16, 2010 at 15:25:34 -0400, Mike Frysinger wrote:
> >>> On Sat, May 15, 2010 at 14:15, Oskar Schirmer wrote:
> >>>> On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
> >>>>> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
> >>>>>> On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
> >>>>>>> On 05/06/2010 08:26 PM, Mike Frysinger wrote:
> >>>>>>>> i think it'd be a better idea to do something like:
> >>>>>>>>   if (spi->bits_per_word != 16) {
> >>>>>>>>     if (spi->bits_per_word) {
> >>>>>>>>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
> >>>>>>>>       return -EINVAL;
> >>>>>>>>     }
> >>>>>>>>     spi->bits_per_word = 16;
> >>>>>>>>     spi_setup(spi);
> >>>>>>>>   }
> >>>>>>>
> >>>>>>> There is no way to set bits_per_word using struct spi_board_info. The
> >>>>>>> description of that structure in spi.h explicitly lists the wordsize as
> >>>>>>> one of the parameters drivers should set themself in probe().
> >>>>>> >
> >>>>>>> Only struct bfin5xx_spi_chip allows to set this value in the board code.
> >>>>>>
> >>>>>> an obvious shortcoming in the SPI framework that should be fixed, but
> >>>>>> that doesnt make any difference to the above code now does it ?  it'll
> >>>>>> operate correctly regardless of the SPI bus master.
> >>>>>
> >>>>> So is the updated patch coming?
> >>>>
> >>>> The basic question I see is, whether it is in the
> >>>> responsibility of ad7877 to check a wrong setting
> >>>> possibly caused in board specific code. If so,
> >>>> then the proposal by Mike should be used, but if not
> >>>> so, it would introduce unneeded code.
> >>>>
> >>>> Remember: both versions end up in correctly setting
> >>>> bits_per_word, with the difference merely in feedback
> >>>> level.
> >>>
> >>> imo, unsupported board settings should always be detected & rejected.
> >>> all SPI master drivers do this (detect & reject unsupported SPI slave
> >>> settings).
> >>
> >> please note, that bits_per_word is not a board setting,
> >> it's a demand of the device. consequently, there is no one
> >> to set unsupported values and thus none to be detected.
> >>
> >> the only architecture setting bits_per_word thru spi_chip
> >> is blackfin, but I cannot see a good reason, why the board
> >> settings should engage with a fixed demand of the device?
> >
> > you're telling me that every single SPI device out there can only ever
> > operate in one specific bit length or lacks optional settings ?  i
> > find that hard to believe which means it does make sense to let the
> > boards pick a length when appropriate.

Hartley was so kind to give a description on how this
can be done, including an example. See below.

> > the board settings also function as default setup values so when using
> > generic things like the spidev driver, there is no kernel driver to
> > request the desired bitmode.
> >
> > the simple code i posted addresses your concerns as well as reject
> > invalid settings (wherever they may originate).  if you're not going
> > to post an updated patch, i'll take the simple one Michael committed
> > and post that.
> 
> Just my 0.02, I haven't been following this thread very closely...
> 
> I believe it's the spi "device", i.e. the protocol, drivers responsibility
> to supply the bits_per_word that it will operate with.  The spi master
> driver simply validates it if can support the requested size.  If the
> bits_per_word is left at the default (0) it indicates that the protocol
> words are eight bit bytes.
> 
> If a protocol driver can support multiple word sizes it should probably
> also support a "setup" callback during the driver probe so that the
> underlying board support can provide the desired word size.  This is what
> the libertas_spi driver does (drivers/net/wireless/libertas/if_spi.c).
> That driver does nothing with the bits_per_word value but the callbacks in
> arch/arm/mach-pxa/cm-x270.c and em-x270.c do.  They both set the bits_per_word
> to 16 and then call spi_setup.  They also probably should be checking the
> return value of spi_setup to make sure that the requested mode is supported
> by the master driver, but they probably don't since it is already known...

Right, for cases where both device driver and master driver
support multiple bits_per_word settings, and the intersection
does contain more than one possible setting, it may be desirable
to have some third party select the real value, unless it is
ok to just use the default value, which in most cases will be 8.

So the callback functions you mention solve the dilemma,
don't they?

> If the ad7877 "only" supports 16-bit word sizes it should be setting the
> bits_per_word to indicate this.  If the underlying master driver does not
> support that word size it will return an error when spi_setup is called.

Yes. So the original patch should be extended to
check the return value of spi_setup to see whether
16 bit are possible. Thank you for that point.

  Oskar
-- 
oskar schirmer, emlix gmbh, http://www.emlix.com
fon +49 551 30664-0, fax -11, bahnhofsallee 1b, 37081 göttingen, germany
sitz der gesellschaft: göttingen, amtsgericht göttingen hr b 3160
geschäftsführer: dr. uwe kracke, ust-idnr.: de 205 198 055

emlix - your embedded linux partner

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-18  8:32                     ` Oskar Schirmer
  0 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-18  8:32 UTC (permalink / raw)
  To: H Hartley Sweeten
  Cc: Mike Frysinger, Oskar Schirmer, Dmitry Torokhov,
	Daniel Glöckner, Andrew Morton, linux-input, linux-kernel,
	Oliver Schneidewind, Michael Hennerich

On Mon, May 17, 2010 at 19:18:28 -0500, H Hartley Sweeten wrote:
> On Monday, May 17, 2010 4:50 PM, Mike Frysinger wrote:
> > 2010/5/17 Oskar Schirmer:
> >> On Sun, May 16, 2010 at 15:25:34 -0400, Mike Frysinger wrote:
> >>> On Sat, May 15, 2010 at 14:15, Oskar Schirmer wrote:
> >>>> On Thu, May 13, 2010 at 00:53:35 -0700, Dmitry Torokhov wrote:
> >>>>> On Fri, May 07, 2010 at 02:23:07PM -0400, Mike Frysinger wrote:
> >>>>>> On Fri, May 7, 2010 at 05:41, Daniel Glöckner wrote:
> >>>>>>> On 05/06/2010 08:26 PM, Mike Frysinger wrote:
> >>>>>>>> i think it'd be a better idea to do something like:
> >>>>>>>>   if (spi->bits_per_word != 16) {
> >>>>>>>>     if (spi->bits_per_word) {
> >>>>>>>>       dev_err(&spi->dev, "Invalid SPI settings; bits_per_word must be 16\n");
> >>>>>>>>       return -EINVAL;
> >>>>>>>>     }
> >>>>>>>>     spi->bits_per_word = 16;
> >>>>>>>>     spi_setup(spi);
> >>>>>>>>   }
> >>>>>>>
> >>>>>>> There is no way to set bits_per_word using struct spi_board_info. The
> >>>>>>> description of that structure in spi.h explicitly lists the wordsize as
> >>>>>>> one of the parameters drivers should set themself in probe().
> >>>>>> >
> >>>>>>> Only struct bfin5xx_spi_chip allows to set this value in the board code.
> >>>>>>
> >>>>>> an obvious shortcoming in the SPI framework that should be fixed, but
> >>>>>> that doesnt make any difference to the above code now does it ?  it'll
> >>>>>> operate correctly regardless of the SPI bus master.
> >>>>>
> >>>>> So is the updated patch coming?
> >>>>
> >>>> The basic question I see is, whether it is in the
> >>>> responsibility of ad7877 to check a wrong setting
> >>>> possibly caused in board specific code. If so,
> >>>> then the proposal by Mike should be used, but if not
> >>>> so, it would introduce unneeded code.
> >>>>
> >>>> Remember: both versions end up in correctly setting
> >>>> bits_per_word, with the difference merely in feedback
> >>>> level.
> >>>
> >>> imo, unsupported board settings should always be detected & rejected.
> >>> all SPI master drivers do this (detect & reject unsupported SPI slave
> >>> settings).
> >>
> >> please note, that bits_per_word is not a board setting,
> >> it's a demand of the device. consequently, there is no one
> >> to set unsupported values and thus none to be detected.
> >>
> >> the only architecture setting bits_per_word thru spi_chip
> >> is blackfin, but I cannot see a good reason, why the board
> >> settings should engage with a fixed demand of the device?
> >
> > you're telling me that every single SPI device out there can only ever
> > operate in one specific bit length or lacks optional settings ?  i
> > find that hard to believe which means it does make sense to let the
> > boards pick a length when appropriate.

Hartley was so kind to give a description on how this
can be done, including an example. See below.

> > the board settings also function as default setup values so when using
> > generic things like the spidev driver, there is no kernel driver to
> > request the desired bitmode.
> >
> > the simple code i posted addresses your concerns as well as reject
> > invalid settings (wherever they may originate).  if you're not going
> > to post an updated patch, i'll take the simple one Michael committed
> > and post that.
> 
> Just my 0.02, I haven't been following this thread very closely...
> 
> I believe it's the spi "device", i.e. the protocol, drivers responsibility
> to supply the bits_per_word that it will operate with.  The spi master
> driver simply validates it if can support the requested size.  If the
> bits_per_word is left at the default (0) it indicates that the protocol
> words are eight bit bytes.
> 
> If a protocol driver can support multiple word sizes it should probably
> also support a "setup" callback during the driver probe so that the
> underlying board support can provide the desired word size.  This is what
> the libertas_spi driver does (drivers/net/wireless/libertas/if_spi.c).
> That driver does nothing with the bits_per_word value but the callbacks in
> arch/arm/mach-pxa/cm-x270.c and em-x270.c do.  They both set the bits_per_word
> to 16 and then call spi_setup.  They also probably should be checking the
> return value of spi_setup to make sure that the requested mode is supported
> by the master driver, but they probably don't since it is already known...

Right, for cases where both device driver and master driver
support multiple bits_per_word settings, and the intersection
does contain more than one possible setting, it may be desirable
to have some third party select the real value, unless it is
ok to just use the default value, which in most cases will be 8.

So the callback functions you mention solve the dilemma,
don't they?

> If the ad7877 "only" supports 16-bit word sizes it should be setting the
> bits_per_word to indicate this.  If the underlying master driver does not
> support that word size it will return an error when spi_setup is called.

Yes. So the original patch should be extended to
check the return value of spi_setup to see whether
16 bit are possible. Thank you for that point.

  Oskar
-- 
oskar schirmer, emlix gmbh, http://www.emlix.com
fon +49 551 30664-0, fax -11, bahnhofsallee 1b, 37081 göttingen, germany
sitz der gesellschaft: göttingen, amtsgericht göttingen hr b 3160
geschäftsführer: dr. uwe kracke, ust-idnr.: de 205 198 055

emlix - your embedded linux partner
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* [PATCH v2] ad7877: fix spi word size to 16 bit
  2010-05-06 10:37 ` Oskar Schirmer
@ 2010-05-18  8:34   ` Oskar Schirmer
  -1 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-18  8:34 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Mike Frysinger, Andrew Morton, linux-input, linux-kernel,
	Michael Hennerich, H Hartley Sweeten, Oskar Schirmer,
	Daniel Glöckner, Oliver Schneidewind

With no word size given in the users platform data,
a generic spi host controller driver will assume a default
word size of eight bit. This causes transmission to be
performed bytewise, which will fail on little endian
machines for sure. Failure on big endian depends on usage
of slave select to mark word boundaries.

Anyway, ad7877 is specified to work with 16 bit per word,
so unconditionally set the word size accordingly.
Flag an error where 16 bit per word is not available.

Signed-off-by: Oskar Schirmer <os@emlix.com>
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Oliver Schneidewind <osw@emlix.com>
---
 drivers/input/touchscreen/ad7877.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

v2: check return value for spi_setup failure

diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
index 698d6e7..eebde5b 100644
--- a/drivers/input/touchscreen/ad7877.c
+++ b/drivers/input/touchscreen/ad7877.c
@@ -680,6 +680,13 @@ static int __devinit ad7877_probe(struct spi_device *spi)
 		return -EINVAL;
 	}
 
+	spi->bits_per_word = 16;
+	err = spi_setup(spi);
+	if (err) {
+		dev_dbg(&spi->dev, "spi master doesn't support 16 bits/word\n");
+		return err;
+	}
+
 	ts = kzalloc(sizeof(struct ad7877), GFP_KERNEL);
 	input_dev = input_allocate_device();
 	if (!ts || !input_dev) {
-- 
1.5.3.7


^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH v2] ad7877: fix spi word size to 16 bit
@ 2010-05-18  8:34   ` Oskar Schirmer
  0 siblings, 0 replies; 124+ messages in thread
From: Oskar Schirmer @ 2010-05-18  8:34 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Mike Frysinger, Andrew Morton, linux-input, linux-kernel,
	Michael Hennerich, H Hartley Sweeten, Oskar Schirmer,
	Daniel Glöckner, Oliver Schneidewind

With no word size given in the users platform data,
a generic spi host controller driver will assume a default
word size of eight bit. This causes transmission to be
performed bytewise, which will fail on little endian
machines for sure. Failure on big endian depends on usage
of slave select to mark word boundaries.

Anyway, ad7877 is specified to work with 16 bit per word,
so unconditionally set the word size accordingly.
Flag an error where 16 bit per word is not available.

Signed-off-by: Oskar Schirmer <os@emlix.com>
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Oliver Schneidewind <osw@emlix.com>
---
 drivers/input/touchscreen/ad7877.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

v2: check return value for spi_setup failure

diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
index 698d6e7..eebde5b 100644
--- a/drivers/input/touchscreen/ad7877.c
+++ b/drivers/input/touchscreen/ad7877.c
@@ -680,6 +680,13 @@ static int __devinit ad7877_probe(struct spi_device *spi)
 		return -EINVAL;
 	}
 
+	spi->bits_per_word = 16;
+	err = spi_setup(spi);
+	if (err) {
+		dev_dbg(&spi->dev, "spi master doesn't support 16 bits/word\n");
+		return err;
+	}
+
 	ts = kzalloc(sizeof(struct ad7877), GFP_KERNEL);
 	input_dev = input_allocate_device();
 	if (!ts || !input_dev) {
-- 
1.5.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
  2010-05-17 23:49                 ` Mike Frysinger
@ 2010-05-18  9:37                   ` Daniel Glöckner
  -1 siblings, 0 replies; 124+ messages in thread
From: Daniel Glöckner @ 2010-05-18  9:37 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Oskar Schirmer, Dmitry Torokhov, Andrew Morton, linux-input,
	linux-kernel, Oliver Schneidewind, Michael Hennerich

On 05/18/2010 01:49 AM, Mike Frysinger wrote:
> the board settings also function as default setup values so when using
> generic things like the spidev driver, there is no kernel driver to
> request the desired bitmode.

spidev allows to set the bits per word with the SPI_IOC_WR_BITS_PER_WORD
ioctl. it is also possible to set this on a per-transfer basis.

  Daniel


-- 
Dipl.-Math. Daniel Glöckner, emlix GmbH, http://www.emlix.com
Fon +49 551 30664-0, Fax -11, Bahnhofsallee 1b, 37081 Göttingen, Germany
Sitz der Gesellschaft: Göttingen, Amtsgericht Göttingen HR B 3160
Geschäftsführer: Dr. Uwe Kracke, Ust-IdNr.: DE 205 198 055

emlix - your embedded linux partner

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH] ad7877: fix spi word size to 16 bit
@ 2010-05-18  9:37                   ` Daniel Glöckner
  0 siblings, 0 replies; 124+ messages in thread
From: Daniel Glöckner @ 2010-05-18  9:37 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Oskar Schirmer, Dmitry Torokhov, Andrew Morton, linux-input,
	linux-kernel, Oliver Schneidewind, Michael Hennerich

On 05/18/2010 01:49 AM, Mike Frysinger wrote:
> the board settings also function as default setup values so when using
> generic things like the spidev driver, there is no kernel driver to
> request the desired bitmode.

spidev allows to set the bits per word with the SPI_IOC_WR_BITS_PER_WORD
ioctl. it is also possible to set this on a per-transfer basis.

  Daniel


-- 
Dipl.-Math. Daniel Glöckner, emlix GmbH, http://www.emlix.com
Fon +49 551 30664-0, Fax -11, Bahnhofsallee 1b, 37081 Göttingen, Germany
Sitz der Gesellschaft: Göttingen, Amtsgericht Göttingen HR B 3160
Geschäftsführer: Dr. Uwe Kracke, Ust-IdNr.: DE 205 198 055

emlix - your embedded linux partner
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-12  3:03                                           ` Nick Piggin
  2010-05-12  3:23                                             ` FUJITA Tomonori
@ 2010-05-19 12:48                                             ` David Woodhouse
  2010-05-19 13:07                                               ` Nick Piggin
  1 sibling, 1 reply; 124+ messages in thread
From: David Woodhouse @ 2010-05-19 12:48 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Marc Gauthier, Mike Frysinger, Dmitry Torokhov, Johannes Weiner,
	Christoph Lameter, Pekka Enberg, Matt Mackall, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, David Rientjes,
	David Brownell, Grant Likely, Chris Zankel, Piet Delaney

On Wed, 2010-05-12 at 13:03 +1000, Nick Piggin wrote:
> I don't think it's necessarily a good idea. MINALIGN is an enforced
> minimum alignment and the allocator has no leeway in reducing this.
> In a UP system, or in a memory constrained system, it might be a better
> idea to pack objects more tightly, for example.
> 
> If we allow drivers to assume kmalloc is cacheline aligned, it will be
> (practically) impossible to revert this because it would require driver
> audits.

No, we definitely don't, and shouldn't, allow drivers to assume that
kmalloc is cacheline-aligned.

However, we _do_ allow drivers to assume that kmalloc is DMA-safe. That
happens to mean "cacheline-aligned" for cache-incoherent architectures,
but drivers should never really have to think about that.

> So whenever strengthening API guarantees like this, it is better to be
> very careful and conservative. Probably even introducing a new API with
> the stronger semantics (even if it is just a wrapper in the case where
> KMALLOC_MINALIGNED *is* cacheline sized).

We're not talking about strengthening API guarantees. It's _always_ been
this way; it's just that some architectures are buggy.

But it looks like ARM, PowerPC, SH, MIPS, Microblaze, AVR32 and all
unconditionally cache-coherent architectures _do_ get it right already.

> I think adding to the DMA API would be a better idea. If the arch knows
> that kmalloc is suitable for the job directly, it can be used. Drivers
> can use the new interface, and kmalloc doesn't get saddled with
> alignment requirements.

No, that would be a change which would require auditing all drivers. The
_current_ rule is that buffers returned from kmalloc() are OK for DMA.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation


^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-11 20:54                                   ` Mike Frysinger
                                                     ` (2 preceding siblings ...)
  (?)
@ 2010-05-19 13:00                                   ` David Woodhouse
  -1 siblings, 0 replies; 124+ messages in thread
From: David Woodhouse @ 2010-05-19 13:00 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Christoph Lameter, Pekka Enberg, Matt Mackall, Dmitry Torokhov,
	Andrew Morton, Oskar Schirmer, Michael Hennerich, linux-input,
	linux-kernel, Daniel Glöckner, Oliver Schneidewind,
	Johannes Weiner, Nick Piggin, David Rientjes, David Brownell,
	Grant Likely

On Tue, 2010-05-11 at 16:54 -0400, Mike Frysinger wrote:
> 
> > kmalloc returns a pointer to a DMA safe buffer. There is no requirement on
> > the x86 hardware that the DMA buffers have to be cache aligned. Cachelines
> > will be invalidated as needed.
> 
> so this guarantee is made by the kmalloc() API ?  and for arches where
> the cacheline invalidation is handled in software rather than
> hardware, they must declare a min alignment value for kmalloc to be at
> least as big as their cache alignment ?
> 
> does the phrase "DMA safe buffer" imply cache alignment ? 

Not necessarily. If you have cache-coherent DMA, then there's no need to
align things. You only need cache-alignment when you have
cache-incoherent DMA and have to manage the cache in software. And in
that case, the architecture must set ARCH_KMALLOC_MINALIGN to an
appropriate value so that kmalloc() meets the guarantee.

However, your problem is kind of unrelated to that -- your problem is
actually that you're putting both a DMA buffer _and_ other stuff into
the same kmalloc'd buffer. Don't do that. Instead, allocate your DMA
buffer separately and put a _pointer_ to it into the structure.

Or if you really _must_ include it in your structure, then align to
ARCH_KMALLOC_MINALIGN bytes both before and after the DMA buffer by
adding __attribute__((__aligned__(ARCH_KMALLOC_MINALIGN))) to both the
DMA buffer _and_ the element which follows it in your struct (if any).

Using ____cacheline_aligned wastes a lot of space on the architectures
that don't need it.

Arguably, this __dma_aligned definition should go somewhere generic...

diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
index 0d2d7e5..ae5d56e 100644
--- a/drivers/input/touchscreen/ad7877.c
+++ b/drivers/input/touchscreen/ad7877.c
@@ -151,6 +151,11 @@ enum {
 /*
  * Non-touchscreen sensors only use single-ended conversions.
  */
+#ifdef ARCH_KMALLOC_MINALIGN
+#define __dma_aligned __attribute__((__aligned__(ARCH_KMALLOC_MINALIGN)))
+#else
+#define __dma_aligned
+#endif
 
 struct ser_req {
 	u16			reset;
@@ -163,7 +168,7 @@ struct ser_req {
 	 * DMA (thus cache coherency maintenance) requires the
 	 * transfer buffers to live in their own cache lines.
 	 */
-	u16 sample ____cacheline_aligned;
+	u16 sample __dma_aligned;
 };
 
 struct ad7877 {
@@ -203,7 +208,7 @@ struct ad7877 {
 	 * DMA (thus cache coherency maintenance) requires the
 	 * transfer buffers to live in their own cache lines.
 	 */
-	u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
+	u16 conversion_data[AD7877_NR_SENSE] __dma_aligned;
 };
 
 static int gpio3;


-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation


^ permalink raw reply related	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-19 12:48                                             ` David Woodhouse
@ 2010-05-19 13:07                                               ` Nick Piggin
  2010-05-19 13:17                                                 ` David Woodhouse
  0 siblings, 1 reply; 124+ messages in thread
From: Nick Piggin @ 2010-05-19 13:07 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Marc Gauthier, Mike Frysinger, Dmitry Torokhov, Johannes Weiner,
	Christoph Lameter, Pekka Enberg, Matt Mackall, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, David Rientjes,
	David Brownell, Grant Likely, Chris Zankel, Piet Delaney

On Wed, May 19, 2010 at 01:48:45PM +0100, David Woodhouse wrote:
> On Wed, 2010-05-12 at 13:03 +1000, Nick Piggin wrote:
> > I don't think it's necessarily a good idea. MINALIGN is an enforced
> > minimum alignment and the allocator has no leeway in reducing this.
> > In a UP system, or in a memory constrained system, it might be a better
> > idea to pack objects more tightly, for example.
> > 
> > If we allow drivers to assume kmalloc is cacheline aligned, it will be
> > (practically) impossible to revert this because it would require driver
> > audits.
> 
> No, we definitely don't, and shouldn't, allow drivers to assume that
> kmalloc is cacheline-aligned.

Good.


> However, we _do_ allow drivers to assume that kmalloc is DMA-safe. That
> happens to mean "cacheline-aligned" for cache-incoherent architectures,
> but drivers should never really have to think about that.

DMA-safe for GFP_DMA, or all kmalloc?

Either way, yes the arch should take care of those details.

 
> > So whenever strengthening API guarantees like this, it is better to be
> > very careful and conservative. Probably even introducing a new API with
> > the stronger semantics (even if it is just a wrapper in the case where
> > KMALLOC_MINALIGNED *is* cacheline sized).
> 
> We're not talking about strengthening API guarantees. It's _always_ been
> this way; it's just that some architectures are buggy.

It just appeared, in the post I replied to, that there was a suggestion
of making it explicitly cacheline aligned. If I misread that, ignore
me.

> 
> But it looks like ARM, PowerPC, SH, MIPS, Microblaze, AVR32 and all
> unconditionally cache-coherent architectures _do_ get it right already.
> 
> > I think adding to the DMA API would be a better idea. If the arch knows
> > that kmalloc is suitable for the job directly, it can be used. Drivers
> > can use the new interface, and kmalloc doesn't get saddled with
> > alignment requirements.
> 
> No, that would be a change which would require auditing all drivers. The
> _current_ rule is that buffers returned from kmalloc() are OK for DMA.
> 
> -- 
> David Woodhouse                            Open Source Technology Centre
> David.Woodhouse@intel.com                              Intel Corporation

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-19 13:07                                               ` Nick Piggin
@ 2010-05-19 13:17                                                 ` David Woodhouse
  2010-05-19 13:36                                                   ` Nick Piggin
  0 siblings, 1 reply; 124+ messages in thread
From: David Woodhouse @ 2010-05-19 13:17 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Marc Gauthier, Mike Frysinger, Dmitry Torokhov, Johannes Weiner,
	Christoph Lameter, Pekka Enberg, Matt Mackall, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, David Rientjes,
	David Brownell, Grant Likely, Chris Zankel, Piet Delaney

On Wed, 2010-05-19 at 23:07 +1000, Nick Piggin wrote:
> On Wed, May 19, 2010 at 01:48:45PM +0100, David Woodhouse wrote:
> > However, we _do_ allow drivers to assume that kmalloc is DMA-safe. That
> > happens to mean "cacheline-aligned" for cache-incoherent architectures,
> > but drivers should never really have to think about that.
> 
> DMA-safe for GFP_DMA, or all kmalloc?

Try not to think about that. Seriously, look over there! A kitten!

(You have to use the DMA API anyway, so swiotlb will handle things if
you're trying to use pages which are above the range that certain
devices can reach. But we're only talking about the problems of sharing
cache lines here; don't worry about that bit.)

> > > So whenever strengthening API guarantees like this, it is better to be
> > > very careful and conservative. Probably even introducing a new API with
> > > the stronger semantics (even if it is just a wrapper in the case where
> > > KMALLOC_MINALIGNED *is* cacheline sized).
> > 
> > We're not talking about strengthening API guarantees. It's _always_ been
> > this way; it's just that some architectures are buggy.
> 
> It just appeared, in the post I replied to, that there was a suggestion
> of making it explicitly cacheline aligned. If I misread that, ignore
> me.

Actually I think you read it just fine -- but it was misguided. They
meant to say "cacheline aligned on architectures with cache-incoherent
DMA which need that". But that's what ARCH_KMALLOC_MINALIGN is _already_
supposed to do. If any architecture needs to set ARCH_KMALLOC_MINALIGN
but doesn't, that's broken.

The 'cacheline aligned' misconception did manage to get into the ad7877
driver in commit 3843384a though -- it now uses ____cacheline_aligned
instead of __attribute__((__aligned__(ARCH_KMALLOC_MINALIGN))) as it
should.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation


^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-19 13:17                                                 ` David Woodhouse
@ 2010-05-19 13:36                                                   ` Nick Piggin
  2010-05-19 13:44                                                     ` Johannes Weiner
  0 siblings, 1 reply; 124+ messages in thread
From: Nick Piggin @ 2010-05-19 13:36 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Marc Gauthier, Mike Frysinger, Dmitry Torokhov, Johannes Weiner,
	Christoph Lameter, Pekka Enberg, Matt Mackall, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, David Rientjes,
	David Brownell, Grant Likely, Chris Zankel, Piet Delaney

On Wed, May 19, 2010 at 02:17:47PM +0100, David Woodhouse wrote:
> On Wed, 2010-05-19 at 23:07 +1000, Nick Piggin wrote:
> > On Wed, May 19, 2010 at 01:48:45PM +0100, David Woodhouse wrote:
> > > However, we _do_ allow drivers to assume that kmalloc is DMA-safe. That
> > > happens to mean "cacheline-aligned" for cache-incoherent architectures,
> > > but drivers should never really have to think about that.
> > 
> > DMA-safe for GFP_DMA, or all kmalloc?
> 
> Try not to think about that. Seriously, look over there! A kitten!
> 
> (You have to use the DMA API anyway, so swiotlb will handle things if
> you're trying to use pages which are above the range that certain
> devices can reach. But we're only talking about the problems of sharing
> cache lines here; don't worry about that bit.)

Yeah, well, GFP_DMA is pretty crappy anyway. I was just curious, but
it's no big deal to support dma with regular kmalloc.

 
> > > > So whenever strengthening API guarantees like this, it is better to be
> > > > very careful and conservative. Probably even introducing a new API with
> > > > the stronger semantics (even if it is just a wrapper in the case where
> > > > KMALLOC_MINALIGNED *is* cacheline sized).
> > > 
> > > We're not talking about strengthening API guarantees. It's _always_ been
> > > this way; it's just that some architectures are buggy.
> > 
> > It just appeared, in the post I replied to, that there was a suggestion
> > of making it explicitly cacheline aligned. If I misread that, ignore
> > me.
> 
> Actually I think you read it just fine -- but it was misguided. They
> meant to say "cacheline aligned on architectures with cache-incoherent
> DMA which need that". But that's what ARCH_KMALLOC_MINALIGN is _already_
> supposed to do. If any architecture needs to set ARCH_KMALLOC_MINALIGN
> but doesn't, that's broken.
> 
> The 'cacheline aligned' misconception did manage to get into the ad7877
> driver in commit 3843384a though -- it now uses ____cacheline_aligned
> instead of __attribute__((__aligned__(ARCH_KMALLOC_MINALIGN))) as it
> should.

OK so long as there is not a "must be cacheline aligned" requirement.
Your proposal for a __dma_aligned attribute in an arch header looks
like a good idea there.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-19 13:36                                                   ` Nick Piggin
@ 2010-05-19 13:44                                                     ` Johannes Weiner
  2010-05-19 13:52                                                       ` Nick Piggin
  0 siblings, 1 reply; 124+ messages in thread
From: Johannes Weiner @ 2010-05-19 13:44 UTC (permalink / raw)
  To: Nick Piggin
  Cc: David Woodhouse, Marc Gauthier, Mike Frysinger, Dmitry Torokhov,
	Christoph Lameter, Pekka Enberg, Matt Mackall, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, David Rientjes,
	David Brownell, Grant Likely, Chris Zankel, Piet Delaney

On Wed, May 19, 2010 at 11:36:56PM +1000, Nick Piggin wrote:
> On Wed, May 19, 2010 at 02:17:47PM +0100, David Woodhouse wrote:
> >
> > The 'cacheline aligned' misconception did manage to get into the ad7877
> > driver in commit 3843384a though -- it now uses ____cacheline_aligned
> > instead of __attribute__((__aligned__(ARCH_KMALLOC_MINALIGN))) as it
> > should.
> 
> OK so long as there is not a "must be cacheline aligned" requirement.
> Your proposal for a __dma_aligned attribute in an arch header looks
> like a good idea there.

Would you happen to know of other potential users?  At this point I'd
much rather just allocate the buffers dynamically and hide the issue
nicely behind kmalloc().

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-19 13:44                                                     ` Johannes Weiner
@ 2010-05-19 13:52                                                       ` Nick Piggin
  2010-05-19 14:38                                                         ` FUJITA Tomonori
  2010-05-19 15:00                                                         ` Johannes Weiner
  0 siblings, 2 replies; 124+ messages in thread
From: Nick Piggin @ 2010-05-19 13:52 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: David Woodhouse, Marc Gauthier, Mike Frysinger, Dmitry Torokhov,
	Christoph Lameter, Pekka Enberg, Matt Mackall, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, David Rientjes,
	David Brownell, Grant Likely, Chris Zankel, Piet Delaney

On Wed, May 19, 2010 at 03:44:30PM +0200, Johannes Weiner wrote:
> On Wed, May 19, 2010 at 11:36:56PM +1000, Nick Piggin wrote:
> > On Wed, May 19, 2010 at 02:17:47PM +0100, David Woodhouse wrote:
> > >
> > > The 'cacheline aligned' misconception did manage to get into the ad7877
> > > driver in commit 3843384a though -- it now uses ____cacheline_aligned
> > > instead of __attribute__((__aligned__(ARCH_KMALLOC_MINALIGN))) as it
> > > should.
> > 
> > OK so long as there is not a "must be cacheline aligned" requirement.
> > Your proposal for a __dma_aligned attribute in an arch header looks
> > like a good idea there.
> 
> Would you happen to know of other potential users?  At this point I'd
> much rather just allocate the buffers dynamically and hide the issue
> nicely behind kmalloc().

I don't think we need to hide the fact that some platforms have
specific alignment restrictions for DMA. So if any drivers make use
of the alignment, I see no problem with __dma_aligned.


^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-19 13:52                                                       ` Nick Piggin
@ 2010-05-19 14:38                                                         ` FUJITA Tomonori
  2010-05-19 14:58                                                           ` David Woodhouse
  2010-05-19 16:37                                                           ` Dmitry Torokhov
  2010-05-19 15:00                                                         ` Johannes Weiner
  1 sibling, 2 replies; 124+ messages in thread
From: FUJITA Tomonori @ 2010-05-19 14:38 UTC (permalink / raw)
  To: npiggin
  Cc: jw, dwmw2, marc, vapier.adi, dmitry.torokhov, cl, penberg, mpm,
	akpm, os, Michael.Hennerich, linux-input, linux-kernel, dg, osw,
	rientjes, dbrownell, grant.likely, chris, Piet.Delaney

On Wed, 19 May 2010 23:52:26 +1000
Nick Piggin <npiggin@suse.de> wrote:

> On Wed, May 19, 2010 at 03:44:30PM +0200, Johannes Weiner wrote:
> > On Wed, May 19, 2010 at 11:36:56PM +1000, Nick Piggin wrote:
> > > On Wed, May 19, 2010 at 02:17:47PM +0100, David Woodhouse wrote:
> > > >
> > > > The 'cacheline aligned' misconception did manage to get into the ad7877
> > > > driver in commit 3843384a though -- it now uses ____cacheline_aligned
> > > > instead of __attribute__((__aligned__(ARCH_KMALLOC_MINALIGN))) as it
> > > > should.
> > > 
> > > OK so long as there is not a "must be cacheline aligned" requirement.
> > > Your proposal for a __dma_aligned attribute in an arch header looks
> > > like a good idea there.
> > 
> > Would you happen to know of other potential users?  At this point I'd
> > much rather just allocate the buffers dynamically and hide the issue
> > nicely behind kmalloc().
> 
> I don't think we need to hide the fact that some platforms have
> specific alignment restrictions for DMA. So if any drivers make use
> of the alignment, I see no problem with __dma_aligned.

IIRC, such was proposed several times:

http://www.mail-archive.com/linux-scsi@vger.kernel.org/msg12633.html

I guess that we agreed that it's better to tell driver writers to just
use kmalloc.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-19 14:38                                                         ` FUJITA Tomonori
@ 2010-05-19 14:58                                                           ` David Woodhouse
  2010-05-20  4:42                                                             ` FUJITA Tomonori
  2010-05-19 16:37                                                           ` Dmitry Torokhov
  1 sibling, 1 reply; 124+ messages in thread
From: David Woodhouse @ 2010-05-19 14:58 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: npiggin, jw, marc, vapier.adi, dmitry.torokhov, cl, penberg, mpm,
	akpm, os, Michael.Hennerich, linux-input, linux-kernel, dg, osw,
	rientjes, dbrownell, grant.likely, chris, Piet.Delaney

On Wed, 2010-05-19 at 23:38 +0900, FUJITA Tomonori wrote:
> 
> > I don't think we need to hide the fact that some platforms have
> > specific alignment restrictions for DMA. So if any drivers make use
> > of the alignment, I see no problem with __dma_aligned.
> 
> IIRC, such was proposed several times:
> 
> http://www.mail-archive.com/linux-scsi@vger.kernel.org/msg12633.html
> 
> I guess that we agreed that it's better to tell driver writers to just
> use kmalloc. 

Perhaps -- but only a few days ago in this thread, they were being
advised to use ____cacheline_aligned instead!

And for this case it really does seem to make sense to keep the buffer
in the parent structure rather than allocating it separately. The DMA
buffers are tiny and on cache-coherent architectures it's _much_ more
efficient just to have them in the original structure and use
__dma_aligned.

-- 
dwmw2


^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-19 13:52                                                       ` Nick Piggin
  2010-05-19 14:38                                                         ` FUJITA Tomonori
@ 2010-05-19 15:00                                                         ` Johannes Weiner
  1 sibling, 0 replies; 124+ messages in thread
From: Johannes Weiner @ 2010-05-19 15:00 UTC (permalink / raw)
  To: Nick Piggin
  Cc: David Woodhouse, Marc Gauthier, Mike Frysinger, Dmitry Torokhov,
	Christoph Lameter, Pekka Enberg, Matt Mackall, Andrew Morton,
	Oskar Schirmer, Michael Hennerich, linux-input, linux-kernel,
	Daniel Glöckner, Oliver Schneidewind, David Rientjes,
	David Brownell, Grant Likely, Chris Zankel, Piet Delaney

On Wed, May 19, 2010 at 11:52:26PM +1000, Nick Piggin wrote:
> On Wed, May 19, 2010 at 03:44:30PM +0200, Johannes Weiner wrote:
> > On Wed, May 19, 2010 at 11:36:56PM +1000, Nick Piggin wrote:
> > > On Wed, May 19, 2010 at 02:17:47PM +0100, David Woodhouse wrote:
> > > >
> > > > The 'cacheline aligned' misconception did manage to get into the ad7877
> > > > driver in commit 3843384a though -- it now uses ____cacheline_aligned
> > > > instead of __attribute__((__aligned__(ARCH_KMALLOC_MINALIGN))) as it
> > > > should.
> > > 
> > > OK so long as there is not a "must be cacheline aligned" requirement.
> > > Your proposal for a __dma_aligned attribute in an arch header looks
> > > like a good idea there.
> > 
> > Would you happen to know of other potential users?  At this point I'd
> > much rather just allocate the buffers dynamically and hide the issue
> > nicely behind kmalloc().
> 
> I don't think we need to hide the fact that some platforms have
> specific alignment restrictions for DMA. So if any drivers make use
> of the alignment, I see no problem with __dma_aligned.

In this case, the slave driver does not know whether the master driver
is even using DMA.  That's the reason we do not use dma_alloc_* for
the buffers in the first place.  And that's why I'd prefer to leave
the slave agnostic about it.  Use kmalloc(), don't worry about the
details.

Of course, there may still be more appropriate users for the macro,
that's why I asked.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-19 14:38                                                         ` FUJITA Tomonori
  2010-05-19 14:58                                                           ` David Woodhouse
@ 2010-05-19 16:37                                                           ` Dmitry Torokhov
  1 sibling, 0 replies; 124+ messages in thread
From: Dmitry Torokhov @ 2010-05-19 16:37 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: npiggin, jw, dwmw2, marc, vapier.adi, cl, penberg, mpm, akpm, os,
	Michael.Hennerich, linux-input, linux-kernel, dg, osw, rientjes,
	dbrownell, grant.likely, chris, Piet.Delaney

On Wed, May 19, 2010 at 11:38:34PM +0900, FUJITA Tomonori wrote:
> On Wed, 19 May 2010 23:52:26 +1000
> Nick Piggin <npiggin@suse.de> wrote:
> 
> > On Wed, May 19, 2010 at 03:44:30PM +0200, Johannes Weiner wrote:
> > > On Wed, May 19, 2010 at 11:36:56PM +1000, Nick Piggin wrote:
> > > > On Wed, May 19, 2010 at 02:17:47PM +0100, David Woodhouse wrote:
> > > > >
> > > > > The 'cacheline aligned' misconception did manage to get into the ad7877
> > > > > driver in commit 3843384a though -- it now uses ____cacheline_aligned
> > > > > instead of __attribute__((__aligned__(ARCH_KMALLOC_MINALIGN))) as it
> > > > > should.
> > > > 
> > > > OK so long as there is not a "must be cacheline aligned" requirement.
> > > > Your proposal for a __dma_aligned attribute in an arch header looks
> > > > like a good idea there.
> > > 
> > > Would you happen to know of other potential users?  At this point I'd
> > > much rather just allocate the buffers dynamically and hide the issue
> > > nicely behind kmalloc().
> > 
> > I don't think we need to hide the fact that some platforms have
> > specific alignment restrictions for DMA. So if any drivers make use
> > of the alignment, I see no problem with __dma_aligned.
> 
> IIRC, such was proposed several times:
> 
> http://www.mail-archive.com/linux-scsi@vger.kernel.org/msg12633.html
> 
> I guess that we agreed that it's better to tell driver writers to just
> use kmalloc.

It really dpeends on the size of the buffer. When I need a single byte I
really do not want to mess with separate kmalloced buffer.

If somebody coudl pick David's patch that would be great.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [LKML] Re: [PATCH v3] ad7877: keep dma rx buffers in seperate cache lines
  2010-05-19 14:58                                                           ` David Woodhouse
@ 2010-05-20  4:42                                                             ` FUJITA Tomonori
  0 siblings, 0 replies; 124+ messages in thread
From: FUJITA Tomonori @ 2010-05-20  4:42 UTC (permalink / raw)
  To: dwmw2
  Cc: fujita.tomonori, npiggin, jw, marc, vapier.adi, dmitry.torokhov,
	cl, penberg, mpm, akpm, os, Michael.Hennerich, linux-input,
	linux-kernel, dg, osw, rientjes, dbrownell, grant.likely, chris,
	Piet.Delaney

On Wed, 19 May 2010 15:58:37 +0100
David Woodhouse <dwmw2@infradead.org> wrote:

> On Wed, 2010-05-19 at 23:38 +0900, FUJITA Tomonori wrote:
> > 
> > > I don't think we need to hide the fact that some platforms have
> > > specific alignment restrictions for DMA. So if any drivers make use
> > > of the alignment, I see no problem with __dma_aligned.
> > 
> > IIRC, such was proposed several times:
> > 
> > http://www.mail-archive.com/linux-scsi@vger.kernel.org/msg12633.html
> > 
> > I guess that we agreed that it's better to tell driver writers to just
> > use kmalloc. 
> 
> Perhaps -- but only a few days ago in this thread, they were being
> advised to use ____cacheline_aligned instead!

Hmm, driver writers should read DMA-API and DMA-API-HOWTO docs.


> And for this case it really does seem to make sense to keep the buffer
> in the parent structure rather than allocating it separately. The DMA
> buffers are tiny and on cache-coherent architectures it's _much_ more
> efficient just to have them in the original structure and use
> __dma_aligned.

Yeah, I think that that's a valid point (which was also discussed in
the past). However, I tend to agree on:

http://lwn.net/Articles/2270/

I think that forcing kmalloc is not that bad. Surely more code, but
the performance is not notable in most cases. IMHO, exporting cache
line thing everywhere is worse than it.

^ permalink raw reply	[flat|nested] 124+ messages in thread

end of thread, other threads:[~2010-05-20  4:48 UTC | newest]

Thread overview: 124+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-05-06 10:37 [PATCH] ad7877: fix spi word size to 16 bit Oskar Schirmer
2010-05-06 10:37 ` Oskar Schirmer
2010-05-06 10:37 ` [PATCH] ad7877: keep dma rx buffers in seperate cache lines Oskar Schirmer
2010-05-06 18:46   ` Mike Frysinger
2010-05-06 18:46     ` Mike Frysinger
2010-05-07 10:15     ` Oskar Schirmer
2010-05-07 10:15       ` Oskar Schirmer
2010-05-07 18:28       ` Mike Frysinger
2010-05-07 18:28         ` Mike Frysinger
2010-05-08 22:32         ` Johannes Weiner
2010-05-08 22:32           ` Johannes Weiner
2010-05-09  4:45           ` Mike Frysinger
2010-05-09  4:45             ` Mike Frysinger
2010-05-09  8:50             ` Oskar Schirmer
2010-05-09  8:50               ` Oskar Schirmer
2010-05-07 12:07     ` Johannes Weiner
2010-05-10 10:34   ` [PATCH v2] " Oskar Schirmer
2010-05-10 10:34     ` Oskar Schirmer
2010-05-10 10:42     ` [PATCH v3] " Oskar Schirmer
2010-05-10 10:42       ` Oskar Schirmer
2010-05-10 16:39       ` Mike Frysinger
2010-05-10 20:54         ` Dmitry Torokhov
2010-05-10 21:22       ` Andrew Morton
2010-05-10 21:27         ` Mike Frysinger
2010-05-11  6:05         ` Dmitry Torokhov
2010-05-11  6:05           ` Dmitry Torokhov
2010-05-11  6:11           ` Mike Frysinger
2010-05-11  6:11             ` Mike Frysinger
2010-05-11  3:20             ` Andrew Morton
2010-05-11  6:21             ` Dmitry Torokhov
2010-05-11  6:23               ` Mike Frysinger
2010-05-11  6:23                 ` Mike Frysinger
2010-05-11  6:33                 ` Dmitry Torokhov
2010-05-11  6:33                   ` Dmitry Torokhov
2010-05-11  6:37                   ` Mike Frysinger
2010-05-11  6:42                   ` Pekka Enberg
2010-05-11 13:57                     ` Christoph Lameter
2010-05-11 16:52                     ` Dmitry Torokhov
2010-05-11 17:31                       ` Mike Frysinger
2010-05-11 18:59                       ` Pekka Enberg
2010-05-11 20:03                     ` Mike Frysinger
2010-05-11 20:07                       ` Matt Mackall
2010-05-11 20:10                         ` Mike Frysinger
2010-05-11 20:10                           ` Mike Frysinger
2010-05-11 20:21                           ` Christoph Lameter
2010-05-11 20:34                             ` Mike Frysinger
2010-05-11 20:38                               ` Pekka Enberg
2010-05-11 20:43                                 ` Mike Frysinger
2010-05-11 20:43                                   ` Mike Frysinger
2010-05-11 20:47                                   ` Pekka Enberg
2010-05-13  6:21                                     ` Paul Mundt
2010-05-11 20:46                               ` Christoph Lameter
2010-05-11 20:54                                 ` Mike Frysinger
2010-05-11 20:54                                   ` Mike Frysinger
2010-05-11 21:01                                   ` Pekka Enberg
2010-05-11 21:11                                     ` Mike Frysinger
2010-05-11 21:11                                       ` Mike Frysinger
2010-05-12  1:58                                     ` FUJITA Tomonori
     [not found]                                   ` <20100511214836.GH1726@emlix.com>
2010-05-11 21:53                                     ` Dmitry Torokhov
2010-05-11 22:39                                       ` Mike Frysinger
2010-05-11 22:39                                         ` Mike Frysinger
2010-05-12  2:07                                         ` [LKML] " Marc Gauthier
2010-05-12  3:03                                           ` Nick Piggin
2010-05-12  3:23                                             ` FUJITA Tomonori
2010-05-12  4:35                                               ` Mike Frysinger
2010-05-12  4:35                                                 ` Mike Frysinger
2010-05-12  5:28                                                 ` FUJITA Tomonori
2010-05-12  5:28                                                   ` FUJITA Tomonori
2010-05-12 14:37                                                   ` Mike Frysinger
2010-05-12 14:37                                                     ` Mike Frysinger
2010-05-12 18:11                                                     ` Dmitry Torokhov
2010-05-12 18:28                                                       ` Mike Frysinger
2010-05-12 18:28                                                         ` Mike Frysinger
2010-05-12 10:36                                                 ` Johannes Weiner
2010-05-12 12:35                                                 ` Marc Gauthier
2010-05-12 14:36                                                   ` Mike Frysinger
2010-05-12 14:36                                                     ` Mike Frysinger
2010-05-19 12:48                                             ` David Woodhouse
2010-05-19 13:07                                               ` Nick Piggin
2010-05-19 13:17                                                 ` David Woodhouse
2010-05-19 13:36                                                   ` Nick Piggin
2010-05-19 13:44                                                     ` Johannes Weiner
2010-05-19 13:52                                                       ` Nick Piggin
2010-05-19 14:38                                                         ` FUJITA Tomonori
2010-05-19 14:58                                                           ` David Woodhouse
2010-05-20  4:42                                                             ` FUJITA Tomonori
2010-05-19 16:37                                                           ` Dmitry Torokhov
2010-05-19 15:00                                                         ` Johannes Weiner
2010-05-17  6:12                                         ` Michal Simek
2010-05-19 13:00                                   ` David Woodhouse
2010-05-11 22:37                               ` Alan Cox
2010-05-11 20:22                           ` Pekka Enberg
2010-05-11 14:12           ` Oskar Schirmer
2010-05-11 14:12             ` Oskar Schirmer
2010-05-06 11:18 ` [PATCH] ad7877: fix spi word size to 16 bit Hennerich, Michael
2010-05-06 11:18   ` Hennerich, Michael
2010-05-06 18:26 ` Mike Frysinger
2010-05-06 18:26   ` Mike Frysinger
2010-05-07  9:41   ` Daniel Glöckner
2010-05-07  9:41     ` Daniel Glöckner
2010-05-07 18:23     ` Mike Frysinger
2010-05-07 18:23       ` Mike Frysinger
2010-05-13  7:53       ` Dmitry Torokhov
2010-05-13  7:53         ` Dmitry Torokhov
2010-05-15 18:15         ` Oskar Schirmer
2010-05-15 18:15           ` Oskar Schirmer
2010-05-16 19:25           ` Mike Frysinger
2010-05-16 19:25             ` Mike Frysinger
2010-05-17  7:29             ` Oskar Schirmer
2010-05-17  7:29               ` Oskar Schirmer
2010-05-17  8:14               ` Hennerich, Michael
2010-05-17  8:14                 ` Hennerich, Michael
2010-05-17  8:41                 ` Oskar Schirmer
2010-05-17  8:41                   ` Oskar Schirmer
2010-05-17 23:49               ` Mike Frysinger
2010-05-17 23:49                 ` Mike Frysinger
2010-05-18  0:18                 ` H Hartley Sweeten
2010-05-18  0:18                   ` H Hartley Sweeten
2010-05-18  8:32                   ` Oskar Schirmer
2010-05-18  8:32                     ` Oskar Schirmer
2010-05-18  9:37                 ` Daniel Glöckner
2010-05-18  9:37                   ` Daniel Glöckner
2010-05-18  8:34 ` [PATCH v2] " Oskar Schirmer
2010-05-18  8:34   ` Oskar Schirmer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.