From: Peter Zijlstra <peterz@infradead.org>
To: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Cc: Will Deacon <Will.Deacon@arm.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	arcml <linux-snps-arc@lists.infradead.org>,
	lkml <linux-kernel@vger.kernel.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Subject: Re: single copy atomicity for double load/stores on 32-bit systems
Date: Fri, 31 May 2019 10:21:12 +0200
Message-ID: <20190531082112.GH2623@hirez.programming.kicks-ass.net>
In-Reply-To: <2fd3a455-6267-5d21-c530-41964a4f6ce9@synopsys.com>

On Thu, May 30, 2019 at 11:22:42AM -0700, Vineet Gupta wrote:
> Hi Peter,
>
> Had an interesting lunch time discussion with our hardware architects pertinent to
> "minimal guarantees expected of a CPU" section of memory-barriers.txt
>
>
> |  (*) These guarantees apply only to properly aligned and sized scalar
> |     variables.  "Properly sized" currently means variables that are
> |     the same size as "char", "short", "int" and "long".  "Properly
> |     aligned" means the natural alignment, thus no constraints for
> |     "char", two-byte alignment for "short", four-byte alignment for
> |     "int", and either four-byte or eight-byte alignment for "long",
> |     on 32-bit and 64-bit systems, respectively.
>
>
> I'm not sure how to interpret "natural alignment" for the case of double
> load/stores on 32-bit systems where the hardware and ABI allow for 4 byte
> alignment (ARCv2 LDD/STD, ARM LDRD/STRD ....)

Natural alignment:

	!((uintptr_t)ptr % sizeof(*ptr))

For any u64 type, that would give 8 byte alignment. the problem
otherwise being that your data spans two lines/pages etc..

> I presume (and the question) that lkmm doesn't expect such 8 byte load/stores to
> be atomic unless 8-byte aligned
>
> ARMv7 arch ref manual seems to confirm this. Quoting
>
> | LDM, LDC, LDC2, LDRD, STM, STC, STC2, STRD, PUSH, POP, RFE, SRS, VLDM, VLDR,
> | VSTM, and VSTR instructions are executed as a sequence of word-aligned word
> | accesses. Each 32-bit word access is guaranteed to be single-copy atomic. A
> | subsequence of two or more word accesses from the sequence might not exhibit
> | single-copy atomicity
>
> While it seems reasonable form hardware pov to not implement such atomicity by
> default it seems there's an additional burden on application writers. They could
> be happily using a lockless algorithm with just a shared flag between 2 threads
> w/o need for any explicit synchronization.

If you're that careless with lockless code, you deserve all the pain you
get.

> But upgrade to a new compiler which
> aggressively "packs" struct rendering long long 32-bit aligned (vs. 64-bit before)
> causing the code to suddenly stop working. Is the onus on them to declare such
> memory as c11 atomic or some such.

When a programmer wants guarantees they already need to know wth they're
doing.

And I'll stand by my earlier conviction that any architecture that has a
native u64 (be it a 64bit arch or a 32bit with double-width
instructions) but has an ABI that allows u32 alignment on them is daft.
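
The natural-alignment rule and the struct-packing hazard discussed above can be illustrated with a rough sketch. Everything named here (struct packed_msg, struct aligned_msg, offset_is_natural(), is_naturally_aligned()) is hypothetical and not from the thread or the kernel; the attributes assume GCC or Clang, and static_assert assumes C11.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Natural alignment as stated in the mail: address is a multiple of the size. */
static inline bool is_naturally_aligned(const void *ptr, size_t size)
{
	return ((uintptr_t)ptr % size) == 0;
}

/* Same rule applied to a member offset within a struct. */
static inline bool offset_is_natural(size_t offset, size_t size)
{
	return (offset % size) == 0;
}

/*
 * Hypothetical layout: packing removes the padding after 'flag', so the
 * 64-bit member sits at offset 4 and is not naturally aligned.
 */
struct packed_msg {
	uint32_t flag;
	uint64_t value;
} __attribute__((packed));

/*
 * Hypothetical layout: the member is explicitly given 8-byte alignment,
 * which also raises the alignment of the whole struct to 8.
 */
struct aligned_msg {
	uint32_t flag;
	uint64_t value __attribute__((aligned(8)));
};

/* Build-time guard: refuse to compile if the layout ever loses alignment. */
static_assert(offsetof(struct aligned_msg, value) % sizeof(uint64_t) == 0,
	      "value must be naturally aligned");
```

The offset check only covers the position inside the struct; the object itself must also start on a suitably aligned address, which the aligned variant gets from its raised struct alignment. Alignment is a precondition here, not a guarantee: whether an aligned 8-byte access is single-copy atomic still depends on the architecture and on the instructions the compiler emits.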