linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] buffer: Fix I/O error due to ARM read-after-read hazard
@ 2019-11-12 13:02 Vincent Whitchurch
  2019-11-12 16:08 ` Catalin Marinas
  0 siblings, 1 reply; 15+ messages in thread
From: Vincent Whitchurch @ 2019-11-12 13:02 UTC (permalink / raw)
  To: torvalds
  Cc: axboe, linux, linux-kernel, linux-arm-kernel, will,
	catalin.marinas, Vincent Whitchurch

On my dual-core ARM Cortex-A9, reading from squashfs (over
dm-verity/ubi/mtd) in a loop for hundreds of hours invariably results in
a read failure in squashfs_read_data().  The errors occur because the
buffer_uptodate() check fails after wait_on_buffer().  Further debugging
shows that the bh was in fact uptodate and that there is no actual I/O
error in the lower layers.

The problem is caused by the read-after-read hazards in the ARM
Cortex-A9 MPCore (erratum #761319, see [1]).  The code generated by the
compiler for the combination of the wait_on_buffer() and
buffer_uptodate() calls reads the flags value twice from memory (see the
excerpt of the assembly below).  The new value of the BH_Lock flag is
seen but the new value of BH_Uptodate is not even though both the bits
are read from the same memory location.

 27c:	9d08      	ldr	r5, [sp, #32]
 27e:	2400      	movs	r4, #0
 280:	e006      	b.n	290 <squashfs_read_data+0x290>
 282:	6803      	ldr	r3, [r0, #0]
 284:	07da      	lsls	r2, r3, #31
 286:	f140 810d 	bpl.w	4a4 <squashfs_read_data+0x4a4>
 28a:	3401      	adds	r4, #1
 28c:	42bc      	cmp	r4, r7
 28e:	da08      	bge.n	2a2 <squashfs_read_data+0x2a2>
 290:	f855 0f04 	ldr.w	r0, [r5, #4]!
 294:	6803      	ldr	r3, [r0, #0]
 296:	0759      	lsls	r1, r3, #29
 298:	d5f3      	bpl.n	282 <squashfs_read_data+0x282>
 29a:	f7ff fffe 	bl	0 <__wait_on_buffer>

Work around this problem by adding a DMB between the two reads of
bh->flags, as recommended in the ARM document.  With this barrier, no
failures have been seen in more than 5000 hours of the same test.

[1] http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
---
v2: Reword commit message.

 include/linux/buffer_head.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 7b73ef7f902d..4ef909a91f8c 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -352,6 +352,14 @@ static inline void wait_on_buffer(struct buffer_head *bh)
 	might_sleep();
 	if (buffer_locked(bh))
 		__wait_on_buffer(bh);
+
+#ifdef CONFIG_ARM
+	/*
+	 * Work around ARM Cortex-A9 MPcore Read-after-Read Hazards (erratum
+	 * 761319).
+	 */
+	smp_rmb();
+#endif
 }
 
 static inline int trylock_buffer(struct buffer_head *bh)
-- 
2.20.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2019-11-21 16:53 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-12 13:02 [PATCH v2] buffer: Fix I/O error due to ARM read-after-read hazard Vincent Whitchurch
2019-11-12 16:08 ` Catalin Marinas
2019-11-12 17:54   ` Linus Torvalds
2019-11-12 18:00   ` Will Deacon
2019-11-12 18:22     ` Catalin Marinas
2019-11-12 18:39       ` Linus Torvalds
2019-11-13 10:23         ` Will Deacon
2019-11-13 10:31           ` Russell King - ARM Linux admin
2019-11-13 10:49             ` Will Deacon
2019-11-14 13:28               ` Herbert Xu
2019-11-20 19:18                 ` Will Deacon
2019-11-21  1:25                   ` Herbert Xu
2019-11-21 16:53                     ` Will Deacon
2019-11-13 16:36           ` Linus Torvalds
2019-11-13 16:40             ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).