linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] Fix some 4-byte vs. 8-byte alignment issues in atomic bit operations
@ 2019-11-25 19:43 Fenghua Yu
  2019-11-25 19:43 ` [PATCH v2 1/4] drivers/net/b44: Change to non-atomic " Fenghua Yu
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Fenghua Yu @ 2019-11-25 19:43 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	Andy Lutomirski, Peter Zijlstra, David Laight, Ashok Raj,
	Tony Luck, Ravi V Shankar
  Cc: linux-kernel, x86, Fenghua Yu

A split lock is an unaligned atomic operation that spans two cache
lines. The split lock access needs to hold bus lock and will
degrade performance.

Accessing one split lock long can take over one thousand extra cycles
than atomically accessing one unaligned long within one cache line.
And while the split lock access holds the bus lock, any other
memory accesses are not allowed and the overall memory access performance
is degraded.

Because badly performance impact by split lock, this patch series
solve the split lock issues instead of other alignment issues.

These parts are all simple fixes which are a necessary pre-cursor 
before we can enable #AC traps for split lock access. But they 
are also worthwhile performance fixes in their own right. So 
no sense in holding them back while we discuss the merits of 
the rest of the series. 

The alignment issues may be fixed by changing the atomic bit operations
APIs e.g. new set_bit_32() for 4-byte alignment. But the fixes will
be complex because they touch a lot of call sites and architectures.

Change Log:
v2:
- Remove patch 1 and 3 in v1 because they are in the tip tree already.
- Add new patches 2-4 per David Laight's comments:
https://lore.kernel.org/lkml/e7c75de9191847ed98c573f9ad871518@AcuMS.aculab.com/
Running "grep -r --include '*.[ch]' '_bit([^(]*, *([^)]*\*)' ."
returns about 200 results. Most of them don't have split lock issues.

Fenghua Yu (3):
  xen-pcifront: Align address of flags to size of unsigned long
  mtd: rawnand: fsmc: Change to non-atomic bit operations
  x86/mce: Force alignment for atomic bit operations

Peter Zijlstra (1):
  drivers/net/b44: Change to non-atomic bit operations

 arch/x86/include/asm/mce.h          | 3 ++-
 drivers/mtd/nand/raw/fsmc_nand.c    | 4 ++--
 drivers/net/ethernet/broadcom/b44.c | 4 ++--
 include/xen/interface/io/pciif.h    | 7 +++++--
 4 files changed, 11 insertions(+), 7 deletions(-)

-- 
2.19.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/4] drivers/net/b44: Change to non-atomic bit operations
  2019-11-25 19:43 [PATCH v2 0/4] Fix some 4-byte vs. 8-byte alignment issues in atomic bit operations Fenghua Yu
@ 2019-11-25 19:43 ` Fenghua Yu
  2019-11-26  9:49   ` David Laight
  2019-11-25 19:43 ` [PATCH v2 2/4] xen-pcifront: Align address of flags to size of unsigned long Fenghua Yu
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Fenghua Yu @ 2019-11-25 19:43 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	Andy Lutomirski, Peter Zijlstra, David Laight, Ashok Raj,
	Tony Luck, Ravi V Shankar
  Cc: linux-kernel, x86, Fenghua Yu

From: Peter Zijlstra <peterz@infradead.org>

Since "pwol_mask" is local and never exposed to concurrency, there is
no need to set bit in pwol_mask by costly atomic operations.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
---
 drivers/net/ethernet/broadcom/b44.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/b44.c b/drivers/net/ethernet/broadcom/b44.c
index 97ab0dd25552..5738ab963dfb 100644
--- a/drivers/net/ethernet/broadcom/b44.c
+++ b/drivers/net/ethernet/broadcom/b44.c
@@ -1520,7 +1520,7 @@ static int b44_magic_pattern(u8 *macaddr, u8 *ppattern, u8 *pmask, int offset)
 
 	memset(ppattern + offset, 0xff, magicsync);
 	for (j = 0; j < magicsync; j++)
-		set_bit(len++, (unsigned long *) pmask);
+		__set_bit(len++, (unsigned long *)pmask);
 
 	for (j = 0; j < B44_MAX_PATTERNS; j++) {
 		if ((B44_PATTERN_SIZE - len) >= ETH_ALEN)
@@ -1532,7 +1532,7 @@ static int b44_magic_pattern(u8 *macaddr, u8 *ppattern, u8 *pmask, int offset)
 		for (k = 0; k< ethaddr_bytes; k++) {
 			ppattern[offset + magicsync +
 				(j * ETH_ALEN) + k] = macaddr[k];
-			set_bit(len++, (unsigned long *) pmask);
+			__set_bit(len++, (unsigned long *)pmask);
 		}
 	}
 	return len - 1;
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 2/4] xen-pcifront: Align address of flags to size of unsigned long
  2019-11-25 19:43 [PATCH v2 0/4] Fix some 4-byte vs. 8-byte alignment issues in atomic bit operations Fenghua Yu
  2019-11-25 19:43 ` [PATCH v2 1/4] drivers/net/b44: Change to non-atomic " Fenghua Yu
@ 2019-11-25 19:43 ` Fenghua Yu
  2019-11-26  9:53   ` David Laight
  2019-11-26 10:02   ` Jürgen Groß
  2019-11-25 19:43 ` [PATCH v2 3/4] mtd: rawnand: fsmc: Change to non-atomic bit operations Fenghua Yu
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 13+ messages in thread
From: Fenghua Yu @ 2019-11-25 19:43 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	Andy Lutomirski, Peter Zijlstra, David Laight, Ashok Raj,
	Tony Luck, Ravi V Shankar
  Cc: linux-kernel, x86, Fenghua Yu

The address of "flags" is passed to atomic bitops which require the
address is aligned to size of unsigned long.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
---
 include/xen/interface/io/pciif.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/xen/interface/io/pciif.h b/include/xen/interface/io/pciif.h
index d9922ae36eb5..639d5fb484a3 100644
--- a/include/xen/interface/io/pciif.h
+++ b/include/xen/interface/io/pciif.h
@@ -103,8 +103,11 @@ struct xen_pcie_aer_op {
 	uint32_t devfn;
 };
 struct xen_pci_sharedinfo {
-	/* flags - XEN_PCIF_* */
-	uint32_t flags;
+	/* flags - XEN_PCIF_*. Force alignment for atomic bit operations. */
+	union {
+		uint32_t	flags;
+		unsigned long	flags_alignment;
+	};
 	struct xen_pci_op op;
 	struct xen_pcie_aer_op aer_op;
 };
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 3/4] mtd: rawnand: fsmc: Change to non-atomic bit operations
  2019-11-25 19:43 [PATCH v2 0/4] Fix some 4-byte vs. 8-byte alignment issues in atomic bit operations Fenghua Yu
  2019-11-25 19:43 ` [PATCH v2 1/4] drivers/net/b44: Change to non-atomic " Fenghua Yu
  2019-11-25 19:43 ` [PATCH v2 2/4] xen-pcifront: Align address of flags to size of unsigned long Fenghua Yu
@ 2019-11-25 19:43 ` Fenghua Yu
  2019-12-04 15:28   ` Peter Zijlstra
  2019-11-25 19:43 ` [PATCH v2 4/4] x86/mce: Force alignment for atomic " Fenghua Yu
  2019-11-26 10:13 ` [PATCH v2 0/4] Fix some 4-byte vs. 8-byte alignment issues in " David Laight
  4 siblings, 1 reply; 13+ messages in thread
From: Fenghua Yu @ 2019-11-25 19:43 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	Andy Lutomirski, Peter Zijlstra, David Laight, Ashok Raj,
	Tony Luck, Ravi V Shankar
  Cc: linux-kernel, x86, Fenghua Yu

No need for expensive atomic change_bit() on the local variable err_idx[].

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
---
 drivers/mtd/nand/raw/fsmc_nand.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/mtd/nand/raw/fsmc_nand.c b/drivers/mtd/nand/raw/fsmc_nand.c
index a6964feeec77..916496d4ecf2 100644
--- a/drivers/mtd/nand/raw/fsmc_nand.c
+++ b/drivers/mtd/nand/raw/fsmc_nand.c
@@ -809,8 +809,8 @@ static int fsmc_bch8_correct_data(struct nand_chip *chip, u8 *dat,
 
 	i = 0;
 	while (num_err--) {
-		change_bit(0, (unsigned long *)&err_idx[i]);
-		change_bit(1, (unsigned long *)&err_idx[i]);
+		__change_bit(0, (unsigned long *)&err_idx[i]);
+		__change_bit(1, (unsigned long *)&err_idx[i]);
 
 		if (err_idx[i] < chip->ecc.size * 8) {
 			change_bit(err_idx[i], (unsigned long *)dat);
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 4/4] x86/mce: Force alignment for atomic bit operations
  2019-11-25 19:43 [PATCH v2 0/4] Fix some 4-byte vs. 8-byte alignment issues in atomic bit operations Fenghua Yu
                   ` (2 preceding siblings ...)
  2019-11-25 19:43 ` [PATCH v2 3/4] mtd: rawnand: fsmc: Change to non-atomic bit operations Fenghua Yu
@ 2019-11-25 19:43 ` Fenghua Yu
  2019-11-26 10:13 ` [PATCH v2 0/4] Fix some 4-byte vs. 8-byte alignment issues in " David Laight
  4 siblings, 0 replies; 13+ messages in thread
From: Fenghua Yu @ 2019-11-25 19:43 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	Andy Lutomirski, Peter Zijlstra, David Laight, Ashok Raj,
	Tony Luck, Ravi V Shankar
  Cc: linux-kernel, x86, Fenghua Yu

The "flags" field is passed to atomic bit operations which require
unsigned long alignment.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
---
 arch/x86/include/asm/mce.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index dc2d4b206ab7..d16503c5602e 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -134,10 +134,11 @@
  * is set.
  */
 struct mce_log_buffer {
+	/* Force alignment for atomic bit operations. */
+	unsigned flags;
 	char signature[12]; /* "MACHINECHECK" */
 	unsigned len;	    /* = MCE_LOG_LEN */
 	unsigned next;
-	unsigned flags;
 	unsigned recordlen;	/* length of struct mce */
 	struct mce entry[MCE_LOG_LEN];
 };
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* RE: [PATCH v2 1/4] drivers/net/b44: Change to non-atomic bit operations
  2019-11-25 19:43 ` [PATCH v2 1/4] drivers/net/b44: Change to non-atomic " Fenghua Yu
@ 2019-11-26  9:49   ` David Laight
  2019-12-04 15:19     ` Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: David Laight @ 2019-11-26  9:49 UTC (permalink / raw)
  To: 'Fenghua Yu',
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	Andy Lutomirski, Peter Zijlstra, Ashok Raj, Tony Luck,
	Ravi V Shankar
  Cc: linux-kernel, x86

From: Fenghua Yu
> Sent: 25 November 2019 19:43
> From: Peter Zijlstra <peterz@infradead.org>
> 
> Since "pwol_mask" is local and never exposed to concurrency, there is
> no need to set bit in pwol_mask by costly atomic operations.
> 
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> ---
>  drivers/net/ethernet/broadcom/b44.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/b44.c b/drivers/net/ethernet/broadcom/b44.c
> index 97ab0dd25552..5738ab963dfb 100644
> --- a/drivers/net/ethernet/broadcom/b44.c
> +++ b/drivers/net/ethernet/broadcom/b44.c
> @@ -1520,7 +1520,7 @@ static int b44_magic_pattern(u8 *macaddr, u8 *ppattern, u8 *pmask, int offset)
> 
>  	memset(ppattern + offset, 0xff, magicsync);
>  	for (j = 0; j < magicsync; j++)
> -		set_bit(len++, (unsigned long *) pmask);
> +		__set_bit(len++, (unsigned long *)pmask);

While this stops the misaligned locks, the code is still horribly borked on BE systems.
The way pmask is used definitely wanst a u32[] not a u64[] one.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH v2 2/4] xen-pcifront: Align address of flags to size of unsigned long
  2019-11-25 19:43 ` [PATCH v2 2/4] xen-pcifront: Align address of flags to size of unsigned long Fenghua Yu
@ 2019-11-26  9:53   ` David Laight
  2019-11-26 10:02   ` Jürgen Groß
  1 sibling, 0 replies; 13+ messages in thread
From: David Laight @ 2019-11-26  9:53 UTC (permalink / raw)
  To: 'Fenghua Yu',
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	Andy Lutomirski, Peter Zijlstra, Ashok Raj, Tony Luck,
	Ravi V Shankar
  Cc: linux-kernel, x86

From: Fenghua Yu <fenghua.yu@intel.com>
> Sent: 25 November 2019 19:43
> The address of "flags" is passed to atomic bitops which require the
> address is aligned to size of unsigned long.
> 
> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> ---
>  include/xen/interface/io/pciif.h | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/include/xen/interface/io/pciif.h b/include/xen/interface/io/pciif.h
> index d9922ae36eb5..639d5fb484a3 100644
> --- a/include/xen/interface/io/pciif.h
> +++ b/include/xen/interface/io/pciif.h
> @@ -103,8 +103,11 @@ struct xen_pcie_aer_op {
>  	uint32_t devfn;
>  };
>  struct xen_pci_sharedinfo {
> -	/* flags - XEN_PCIF_* */
> -	uint32_t flags;
> +	/* flags - XEN_PCIF_*. Force alignment for atomic bit operations. */
> +	union {
> +		uint32_t	flags;
> +		unsigned long	flags_alignment;
> +	};

This is papering over the cracks.....
Changing flags to be 'unsigned long' and removing the casts is correct.
Either that or change to code to use simple assignments on the single 'flags' variable
instead of the relatively expensive function calls that are designed for large arrays
of bits.

	David.

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/4] xen-pcifront: Align address of flags to size of unsigned long
  2019-11-25 19:43 ` [PATCH v2 2/4] xen-pcifront: Align address of flags to size of unsigned long Fenghua Yu
  2019-11-26  9:53   ` David Laight
@ 2019-11-26 10:02   ` Jürgen Groß
  2019-12-02 18:28     ` Luck, Tony
  2019-12-02 22:29     ` Fenghua Yu
  1 sibling, 2 replies; 13+ messages in thread
From: Jürgen Groß @ 2019-11-26 10:02 UTC (permalink / raw)
  To: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Andy Lutomirski, Peter Zijlstra, David Laight,
	Ashok Raj, Tony Luck, Ravi V Shankar
  Cc: linux-kernel, x86

On 25.11.19 20:43, Fenghua Yu wrote:
> The address of "flags" is passed to atomic bitops which require the
> address is aligned to size of unsigned long.
> 
> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> ---
>   include/xen/interface/io/pciif.h | 7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/include/xen/interface/io/pciif.h b/include/xen/interface/io/pciif.h
> index d9922ae36eb5..639d5fb484a3 100644
> --- a/include/xen/interface/io/pciif.h
> +++ b/include/xen/interface/io/pciif.h
> @@ -103,8 +103,11 @@ struct xen_pcie_aer_op {
>   	uint32_t devfn;
>   };
>   struct xen_pci_sharedinfo {
> -	/* flags - XEN_PCIF_* */
> -	uint32_t flags;
> +	/* flags - XEN_PCIF_*. Force alignment for atomic bit operations. */
> +	union {
> +		uint32_t	flags;
> +		unsigned long	flags_alignment;
> +	};
>   	struct xen_pci_op op;
>   	struct xen_pcie_aer_op aer_op;
>   };
> 

NAK.

This is an interface definition for communication between Xen dom0 and
guests via shared memory. It can't be changed.

BTW: you should Cc: the maintainers for the files you are modifying.


Juergen

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH v2 0/4] Fix some 4-byte vs. 8-byte alignment issues in atomic bit operations
  2019-11-25 19:43 [PATCH v2 0/4] Fix some 4-byte vs. 8-byte alignment issues in atomic bit operations Fenghua Yu
                   ` (3 preceding siblings ...)
  2019-11-25 19:43 ` [PATCH v2 4/4] x86/mce: Force alignment for atomic " Fenghua Yu
@ 2019-11-26 10:13 ` David Laight
  4 siblings, 0 replies; 13+ messages in thread
From: David Laight @ 2019-11-26 10:13 UTC (permalink / raw)
  To: 'Fenghua Yu',
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	Andy Lutomirski, Peter Zijlstra, Ashok Raj, Tony Luck,
	Ravi V Shankar
  Cc: linux-kernel, x86

From: Fenghua Yu
> Sent: 25 November 2019 19:43
...
> Change Log:
> v2:
> - Remove patch 1 and 3 in v1 because they are in the tip tree already.
> - Add new patches 2-4 per David Laight's comments:
> https://lore.kernel.org/lkml/e7c75de9191847ed98c573f9ad871518@AcuMS.aculab.com/
> Running "grep -r --include '*.[ch]' '_bit([^(]*, *([^)]*\*)' ."
> returns about 200 results. Most of them don't have split lock issues.

Except that any code that has to cast the long[] array argument to any of
the bitops functions is almost certainly broken in some way or other.
At best it is relying on accidental alignment of whatever address it is passing
to avoid alignment faults (for general misaligned accesses).

In other cases attempting to run the code on a BE system will update the
wrong bits - potentially in some other member of the structure.

Now, you might say that 'this code is LE only', but at some point it might
get used on a BE system.

So all of those 200 results need fixing to remove the casts.
Just using the unblocked __bit_xxx() functions doesn't fix
all the problems.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/4] xen-pcifront: Align address of flags to size of unsigned long
  2019-11-26 10:02   ` Jürgen Groß
@ 2019-12-02 18:28     ` Luck, Tony
  2019-12-02 22:29     ` Fenghua Yu
  1 sibling, 0 replies; 13+ messages in thread
From: Luck, Tony @ 2019-12-02 18:28 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Andy Lutomirski, Peter Zijlstra, David Laight,
	Ashok Raj, Ravi V Shankar, linux-kernel, x86

On Tue, Nov 26, 2019 at 11:02:13AM +0100, Jürgen Groß wrote:
> On 25.11.19 20:43, Fenghua Yu wrote:
> > The address of "flags" is passed to atomic bitops which require the
> > address is aligned to size of unsigned long.
> > 
> > Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> > ---
> >   include/xen/interface/io/pciif.h | 7 +++++--
> >   1 file changed, 5 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/xen/interface/io/pciif.h b/include/xen/interface/io/pciif.h
> > index d9922ae36eb5..639d5fb484a3 100644
> > --- a/include/xen/interface/io/pciif.h
> > +++ b/include/xen/interface/io/pciif.h
> > @@ -103,8 +103,11 @@ struct xen_pcie_aer_op {
> >   	uint32_t devfn;
> >   };
> >   struct xen_pci_sharedinfo {
> > -	/* flags - XEN_PCIF_* */
> > -	uint32_t flags;
> > +	/* flags - XEN_PCIF_*. Force alignment for atomic bit operations. */
> > +	union {
> > +		uint32_t	flags;
> > +		unsigned long	flags_alignment;
> > +	};
> >   	struct xen_pci_op op;
> >   	struct xen_pcie_aer_op aer_op;
> >   };
> > 
> 
> NAK.
> 
> This is an interface definition for communication between Xen dom0 and
> guests via shared memory. It can't be changed.

Bother. There doesn't seem to be anything else in the xen_pci_sharedinfo
structure (or any of its sub-components) that would force the whole thing
to be any more strictly aligned than uint32_t

Oh ... but it looks like this is the only allocation:


        pdev->sh_info =
            (struct xen_pci_sharedinfo *)__get_free_page(GFP_KERNEL);

so "flags" is page aligned.

Fenghua: drop this patch, it isn't needed.

-Tony

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/4] xen-pcifront: Align address of flags to size of unsigned long
  2019-11-26 10:02   ` Jürgen Groß
  2019-12-02 18:28     ` Luck, Tony
@ 2019-12-02 22:29     ` Fenghua Yu
  1 sibling, 0 replies; 13+ messages in thread
From: Fenghua Yu @ 2019-12-02 22:29 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	Andy Lutomirski, Peter Zijlstra, David Laight, Ashok Raj,
	Tony Luck, Ravi V Shankar, linux-kernel, x86

On Tue, Nov 26, 2019 at 11:02:13AM +0100, Jürgen Groß wrote:
> On 25.11.19 20:43, Fenghua Yu wrote:
> >The address of "flags" is passed to atomic bitops which require the 
> >address is aligned to size of unsigned long.
> >
> >Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> ---
> >  include/xen/interface/io/pciif.h | 7 +++++-- 1 file changed, 5 
> >  insertions(+), 2 deletions(-)
> >
> >diff --git a/include/xen/interface/io/pciif.h 
> >b/include/xen/interface/io/pciif.h index d9922ae36eb5..639d5fb484a3 
> >100644 --- a/include/xen/interface/io/pciif.h +++ 
> >b/include/xen/interface/io/pciif.h @@ -103,8 +103,11 @@ struct 
> >xen_pcie_aer_op {
> >  	uint32_t devfn;
> >  };
> >  struct xen_pci_sharedinfo { - /* flags - XEN_PCIF_* */ - uint32_t 
> >flags; + /* flags - XEN_PCIF_*. Force alignment for atomic bit 
> >operations. */ + union { + uint32_t flags; + unsigned long 
> >flags_alignment; + };
> >  	struct xen_pci_op op; struct xen_pcie_aer_op aer_op;
> >  };
> >
> 
> NAK.
> 
> This is an interface definition for communication between Xen dom0 and 
> guests via shared memory. It can't be changed.
> 
> BTW: you should Cc: the maintainers for the files you are modifying.
> 
> 
> Juergen

After spending more time on the patch, actually I find this patch is not 
needed because there is no split lock issue on the "flags" field:

The bit offsets in all atomic bitops on the "flags" field are always 
constant values. On x86, an atomic bitop is optimized to use a locked byte
instruciton which only operates the byte containing the bit. So the
byte won't across cache line boundaries and there won't be split lock
issue in the atomic bitops

I will remove this patch in the next version of the patch set.

Thanks.

-Fenghua


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/4] drivers/net/b44: Change to non-atomic bit operations
  2019-11-26  9:49   ` David Laight
@ 2019-12-04 15:19     ` Peter Zijlstra
  0 siblings, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2019-12-04 15:19 UTC (permalink / raw)
  To: David Laight
  Cc: 'Fenghua Yu',
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	Andy Lutomirski, Ashok Raj, Tony Luck, Ravi V Shankar,
	linux-kernel, x86

On Tue, Nov 26, 2019 at 09:49:08AM +0000, David Laight wrote:

> >  	memset(ppattern + offset, 0xff, magicsync);
> >  	for (j = 0; j < magicsync; j++)
> > -		set_bit(len++, (unsigned long *) pmask);
> > +		__set_bit(len++, (unsigned long *)pmask);
> 
> While this stops the misaligned locks, the code is still horribly borked on BE systems.

Quite so.

> The way pmask is used definitely wanst a u32[] not a u64[] one.

Not sure, the code seems fairly consistent in using u8[].

And I suppose we can write the above like:

	pmask[len >> 3] |= BIT(len & 7); len++;

instead.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 3/4] mtd: rawnand: fsmc: Change to non-atomic bit operations
  2019-11-25 19:43 ` [PATCH v2 3/4] mtd: rawnand: fsmc: Change to non-atomic bit operations Fenghua Yu
@ 2019-12-04 15:28   ` Peter Zijlstra
  0 siblings, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2019-12-04 15:28 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	Andy Lutomirski, David Laight, Ashok Raj, Tony Luck,
	Ravi V Shankar, linux-kernel, x86, richard

On Mon, Nov 25, 2019 at 11:43:03AM -0800, Fenghua Yu wrote:
> No need for expensive atomic change_bit() on the local variable err_idx[].
> 
> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> ---
>  drivers/mtd/nand/raw/fsmc_nand.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/mtd/nand/raw/fsmc_nand.c b/drivers/mtd/nand/raw/fsmc_nand.c
> index a6964feeec77..916496d4ecf2 100644
> --- a/drivers/mtd/nand/raw/fsmc_nand.c
> +++ b/drivers/mtd/nand/raw/fsmc_nand.c
> @@ -809,8 +809,8 @@ static int fsmc_bch8_correct_data(struct nand_chip *chip, u8 *dat,
>  
>  	i = 0;
>  	while (num_err--) {
> -		change_bit(0, (unsigned long *)&err_idx[i]);
> -		change_bit(1, (unsigned long *)&err_idx[i]);
> +		__change_bit(0, (unsigned long *)&err_idx[i]);
> +		__change_bit(1, (unsigned long *)&err_idx[i]);
>  
>  		if (err_idx[i] < chip->ecc.size * 8) {
>  			change_bit(err_idx[i], (unsigned long *)dat);

I'm thinking this all can be written like:

		err_idx[i] ^= 3;

		if (err_idx[i] < chip->ecc.size * 8) {
			err = err_idx[i];
			dat[err >> 3] ^= BIT(err & 7);
		}

It seems unlikely that the @dat we're correcting has concurrency issues.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-12-04 15:29 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-25 19:43 [PATCH v2 0/4] Fix some 4-byte vs. 8-byte alignment issues in atomic bit operations Fenghua Yu
2019-11-25 19:43 ` [PATCH v2 1/4] drivers/net/b44: Change to non-atomic " Fenghua Yu
2019-11-26  9:49   ` David Laight
2019-12-04 15:19     ` Peter Zijlstra
2019-11-25 19:43 ` [PATCH v2 2/4] xen-pcifront: Align address of flags to size of unsigned long Fenghua Yu
2019-11-26  9:53   ` David Laight
2019-11-26 10:02   ` Jürgen Groß
2019-12-02 18:28     ` Luck, Tony
2019-12-02 22:29     ` Fenghua Yu
2019-11-25 19:43 ` [PATCH v2 3/4] mtd: rawnand: fsmc: Change to non-atomic bit operations Fenghua Yu
2019-12-04 15:28   ` Peter Zijlstra
2019-11-25 19:43 ` [PATCH v2 4/4] x86/mce: Force alignment for atomic " Fenghua Yu
2019-11-26 10:13 ` [PATCH v2 0/4] Fix some 4-byte vs. 8-byte alignment issues in " David Laight

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).