* seqcount usage in xt_replace_table() @ 2019-01-08 19:33 Anatol Pomozov 2019-01-08 22:37 ` Florian Westphal ` (2 more replies) 0 siblings, 3 replies; 28+ messages in thread From: Anatol Pomozov @ 2019-01-08 19:33 UTC (permalink / raw) To: fw, Dmitry Vyukov, paulmck, LKML

Hello folks,

A bit of context on what I am doing. I am trying to port the KTSAN (Kernel Thread Sanitizer) tool to v4.20. That tool tracks shared data usage and makes sure it is accessed in a thread-safe manner.

seqlock is a synchronization primitive used by the Linux kernel. KTSAN annotates read_seqbegin()/read_seqretry() and tracks what data has been accessed in its critical section.

During the KTSAN port I found an interesting seqcount usage introduced in commit 80055dab5de0c8677bc148c4717ddfc753a9148e. If I read this commit correctly, xt_replace_table() does not use seqlock in the canonical way to delimit a critical section. Instead the code reads the counter and waits until it gets to a specific value.

Now I want KTSAN to play nicely with this code. I need to tell KTSAN something like "this raw_read_seqcount() does not start a critical section, just ignore it". So temporarily I introduced a raw_read_seqcount_nocritical() function that is ignored by KTSAN. Is it a good solution? Or maybe xt_replace_table() can be enhanced? When I hear that something waits until an event happens on all CPUs I think about the wait_event() function. Would it be better for xt_replace_table() to introduce an atomic counter that is decremented by CPUs, and the main CPU waits until the counter gets to zero? WDYT?

^ permalink raw reply [flat|nested] 28+ messages in thread
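The pattern under discussion can be sketched in userspace. This is only an illustrative analogue, not the kernel code: the names NCPUS, ITERS, eval_once, replace_table and run_demo are invented here, and C11 atomics stand in for the kernel's per-cpu seqcounts and barriers. Each "CPU" keeps a sequence counter that is odd while it is inside the rule-evaluation section; the updater publishes the new table and then waits for every odd counter to move on.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

#define NCPUS 4
#define ITERS 20000

/* Per-"CPU" sequence counters: an odd value means that CPU is
 * currently inside the rule-evaluation critical section. */
static atomic_uint recseq[NCPUS];

static int table_a = 1, table_b = 2;
static _Atomic(int *) table_private;

/* Reader side: one traversal of the ruleset, bracketed by
 * counter increments (odd = inside, even = outside). */
static void eval_once(int cpu)
{
    atomic_fetch_add_explicit(&recseq[cpu], 1, memory_order_acq_rel); /* -> odd */
    int *t = atomic_load_explicit(&table_private, memory_order_acquire);
    assert(*t == 1 || *t == 2);   /* "evaluate" whichever table we saw */
    atomic_fetch_add_explicit(&recseq[cpu], 1, memory_order_acq_rel); /* -> even */
}

static void *reader(void *arg)
{
    int cpu = (int)(intptr_t)arg;
    for (int i = 0; i < ITERS; i++)
        eval_once(cpu);
    return NULL;
}

/* Writer side: publish the new table, then for each CPU wait until
 * its counter is even or has moved past the sampled value, so no CPU
 * can still be using the old table. */
static int *replace_table(int *newinfo)
{
    int *old = atomic_exchange_explicit(&table_private, newinfo,
                                        memory_order_acq_rel);
    for (int cpu = 0; cpu < NCPUS; cpu++) {
        unsigned seq = atomic_load_explicit(&recseq[cpu], memory_order_acquire);
        if (seq & 1)   /* reader caught mid-section: spin until it moves */
            while (atomic_load_explicit(&recseq[cpu],
                                        memory_order_acquire) == seq)
                ;      /* cpu_relax() in the kernel */
    }
    return old;        /* old table now quiescent: safe to scan/free */
}

int run_demo(void)
{
    pthread_t tid[NCPUS];

    atomic_store(&table_private, &table_a);
    for (int i = 0; i < NCPUS; i++)
        pthread_create(&tid[i], NULL, reader, (void *)(intptr_t)i);

    int *old = replace_table(&table_b);

    for (int i = 0; i < NCPUS; i++)
        pthread_join(tid[i], NULL);

    return (old == &table_a) ? 0 : 1;
}
```

The kernel version additionally has to worry about which barrier orders the pointer store against the counter loads, which is exactly what the rest of the thread debates.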
* Re: seqcount usage in xt_replace_table() 2019-01-08 19:33 seqcount usage in xt_replace_table() Anatol Pomozov @ 2019-01-08 22:37 ` Florian Westphal 2019-01-10 12:41 ` Peter Zijlstra 2019-01-09 0:02 ` Andrea Parri 2019-01-10 12:44 ` Peter Zijlstra 2 siblings, 1 reply; 28+ messages in thread From: Florian Westphal @ 2019-01-08 22:37 UTC (permalink / raw) To: Anatol Pomozov; +Cc: fw, Dmitry Vyukov, paulmck, LKML Anatol Pomozov <anatol.pomozov@gmail.com> wrote: > Or maybe xt_replace_table() can be enhanced? When I hear that > something waits until an event happens on all CPUs I think about > wait_event() function. Would it be better for xt_replace_table() to > introduce an atomic counter that is decremented by CPUs, and the main > CPU waits until the counter gets zero? That would mean placing an additional atomic op into the iptables evaluation path (ipt_do_table and friends). Only alternative I see that might work is synchronize_rcu (the _do_table functions are called with rcu read lock held). I guess current scheme is cheaper though. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-08 22:37 ` Florian Westphal @ 2019-01-10 12:41 ` Peter Zijlstra 2019-01-10 12:53 ` Dmitry Vyukov ` (2 more replies) 0 siblings, 3 replies; 28+ messages in thread From: Peter Zijlstra @ 2019-01-10 12:41 UTC (permalink / raw) To: Florian Westphal; +Cc: Anatol Pomozov, Dmitry Vyukov, paulmck, LKML On Tue, Jan 08, 2019 at 11:37:46PM +0100, Florian Westphal wrote: > Anatol Pomozov <anatol.pomozov@gmail.com> wrote: > > Or maybe xt_replace_table() can be enhanced? When I hear that > > something waits until an event happens on all CPUs I think about > > wait_event() function. Would it be better for xt_replace_table() to > > introduce an atomic counter that is decremented by CPUs, and the main > > CPU waits until the counter gets zero? > > That would mean placing an additional atomic op into the > iptables evaluation path (ipt_do_table and friends). > For: /* * Ensure contents of newinfo are visible before assigning to * private. */ smp_wmb(); table->private = newinfo; we have: smp_store_release(&table->private, newinfo); But what store does that second smp_wmb() order against? The comment: /* make sure all cpus see new ->private value */ smp_wmb(); makes no sense what so ever, no smp_*() barrier can provide such guarantees. > Only alternative I see that might work is synchronize_rcu (the > _do_table functions are called with rcu read lock held). > > I guess current scheme is cheaper though. Is performance a concern in this path? There is no comment justifying this 'creative' stuff. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-10 12:41 ` Peter Zijlstra @ 2019-01-10 12:53 ` Dmitry Vyukov 2019-01-10 20:18 ` Peter Zijlstra 2019-01-10 14:48 ` Florian Westphal 2019-01-10 14:52 ` Paul E. McKenney 2 siblings, 1 reply; 28+ messages in thread From: Dmitry Vyukov @ 2019-01-10 12:53 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Florian Westphal, Anatol Pomozov, Paul E. McKenney, LKML On Thu, Jan 10, 2019 at 1:41 PM Peter Zijlstra <peterz@infradead.org> wrote: > > On Tue, Jan 08, 2019 at 11:37:46PM +0100, Florian Westphal wrote: > > Anatol Pomozov <anatol.pomozov@gmail.com> wrote: > > > Or maybe xt_replace_table() can be enhanced? When I hear that > > > something waits until an event happens on all CPUs I think about > > > wait_event() function. Would it be better for xt_replace_table() to > > > introduce an atomic counter that is decremented by CPUs, and the main > > > CPU waits until the counter gets zero? > > > > That would mean placing an additional atomic op into the > > iptables evaluation path (ipt_do_table and friends). > > > > For: > > /* > * Ensure contents of newinfo are visible before assigning to > * private. > */ > smp_wmb(); > table->private = newinfo; > > we have: > > smp_store_release(&table->private, newinfo); > > But what store does that second smp_wmb() order against? The comment: > > /* make sure all cpus see new ->private value */ > smp_wmb(); > > makes no sense what so ever, no smp_*() barrier can provide such > guarantees. Do we want WRITE_ONCE here then? We want it to be compiled to an actual memory access and then it's up to hardware to make it visible to other CPUs. smp_wmb should most likely have this as a side effect too, but somewhat indirect. Also race-detector-friendly. > > Only alternative I see that might work is synchronize_rcu (the > > _do_table functions are called with rcu read lock held). > > > > I guess current scheme is cheaper though. > > Is performance a concern in this path? 
> There is no comment justifying this 'creative' stuff.

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-10 12:53 ` Dmitry Vyukov @ 2019-01-10 20:18 ` Peter Zijlstra 0 siblings, 0 replies; 28+ messages in thread From: Peter Zijlstra @ 2019-01-10 20:18 UTC (permalink / raw) To: Dmitry Vyukov; +Cc: Florian Westphal, Anatol Pomozov, Paul E. McKenney, LKML On Thu, Jan 10, 2019 at 01:53:28PM +0100, Dmitry Vyukov wrote: > On Thu, Jan 10, 2019 at 1:41 PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > On Tue, Jan 08, 2019 at 11:37:46PM +0100, Florian Westphal wrote: > > > Anatol Pomozov <anatol.pomozov@gmail.com> wrote: > > > > Or maybe xt_replace_table() can be enhanced? When I hear that > > > > something waits until an event happens on all CPUs I think about > > > > wait_event() function. Would it be better for xt_replace_table() to > > > > introduce an atomic counter that is decremented by CPUs, and the main > > > > CPU waits until the counter gets zero? > > > > > > That would mean placing an additional atomic op into the > > > iptables evaluation path (ipt_do_table and friends). > > > > > > > For: > > > > /* > > * Ensure contents of newinfo are visible before assigning to > > * private. > > */ > > smp_wmb(); > > table->private = newinfo; > > > > we have: > > > > smp_store_release(&table->private, newinfo); > > > > But what store does that second smp_wmb() order against? The comment: > > > > /* make sure all cpus see new ->private value */ > > smp_wmb(); > > > > makes no sense what so ever, no smp_*() barrier can provide such > > guarantees. > > Do we want WRITE_ONCE here then? The smp_store_release() already implies WRITE_ONCE(). ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-10 12:41 ` Peter Zijlstra 2019-01-10 12:53 ` Dmitry Vyukov @ 2019-01-10 14:48 ` Florian Westphal 2019-01-10 20:20 ` Peter Zijlstra 2019-01-10 20:25 ` Peter Zijlstra 2019-01-10 14:52 ` Paul E. McKenney 2 siblings, 2 replies; 28+ messages in thread From: Florian Westphal @ 2019-01-10 14:48 UTC (permalink / raw) To: Peter Zijlstra Cc: Florian Westphal, Anatol Pomozov, Dmitry Vyukov, paulmck, LKML

Peter Zijlstra <peterz@infradead.org> wrote:
> /*
>  * Ensure contents of newinfo are visible before assigning to
>  * private.
>  */
> smp_wmb();
> table->private = newinfo;
>
> we have:
>
> smp_store_release(&table->private, newinfo);
>
> But what store does that second smp_wmb() order against? The comment:
>
> /* make sure all cpus see new ->private value */
> smp_wmb();
>
> makes no sense what so ever, no smp_*() barrier can provide such
> guarantees.

IIRC I added this at the request of a reviewer of an earlier iteration of that patch.

IIRC the concern was that compiler/hw could re-order

  smp_wmb();
  table->private = newinfo;
  /* wait until all cpus are done with old table */

into:

  smp_wmb();
  /* wait until all cpus are done with old table */
  ...
  table->private = newinfo;

and that (obviously) makes the wait-loop useless.

> > Only alternative I see that might work is synchronize_rcu (the
> > _do_table functions are called with rcu read lock held).
> >
> > I guess current scheme is cheaper though.
>
> Is performance a concern in this path? There is no comment justifying
> this 'creative' stuff.

We have to wait until all cpus are done with the current iptables ruleset.

Before this 'creative' change, this relied on get_counters synchronization. And that caused wait times of 30 seconds or more on busy systems.

I have no objections swapping this with a synchronize_rcu() if that makes it easier.
(synchronize_rcu might be confusing though, as we don't use rcu for table->private, and one 'has to know' that all the netfilter hooks, including the iptables eval loop, run with rcu_read_lock held.)

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-10 14:48 ` Florian Westphal @ 2019-01-10 20:20 ` Peter Zijlstra 0 siblings, 0 replies; 28+ messages in thread From: Peter Zijlstra @ 2019-01-10 20:20 UTC (permalink / raw) To: Florian Westphal; +Cc: Anatol Pomozov, Dmitry Vyukov, paulmck, LKML

On Thu, Jan 10, 2019 at 03:48:12PM +0100, Florian Westphal wrote:
> Peter Zijlstra <peterz@infradead.org> wrote:
> > /*
> >  * Ensure contents of newinfo are visible before assigning to
> >  * private.
> >  */
> > smp_wmb();
> > table->private = newinfo;
> >
> > we have:
> >
> > smp_store_release(&table->private, newinfo);
> >
> > But what store does that second smp_wmb() order against? The comment:
> >
> > /* make sure all cpus see new ->private value */
> > smp_wmb();
> >
> > makes no sense what so ever, no smp_*() barrier can provide such
> > guarantees.
>
> IIRC I added this at the request of a reviewer of an earlier iteration
> of that patch.
>
> IIRC the concern was that compiler/hw could re-order
>
>   smp_wmb();
>   table->private = newinfo;
>   /* wait until all cpus are done with old table */
>
> into:
>
>   smp_wmb();
>   /* wait until all cpus are done with old table */
>   ...
>   table->private = newinfo;
>
> and that (obviously) makes the wait-loop useless.

The thing is, the 'wait for all cpus' thing is pure loads, not stores, so smp_wmb() is a complete NOP there.

If you want to ensure those loads happen after that store (which does indeed seem like a sensible thing), you're going to have to use smp_mb().

^ permalink raw reply [flat|nested] 28+ messages in thread
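Put together, the ordering Peter is asking for would look roughly like this inside xt_replace_table(). This is an untested kernel-style sketch for illustration, not a proposed patch; the wait loop follows the shape of the existing code and only the barriers change:

```c
/* Ensure contents of newinfo are visible before the private pointer is. */
smp_store_release(&table->private, newinfo);

/*
 * Order the store above before the per-cpu seqcount *loads* below.
 * smp_wmb() cannot do this: it only orders stores against stores,
 * and the wait loop is pure loads, so a full barrier is needed.
 */
smp_mb();

/* Wait until all cpus are done with the old table. */
for_each_possible_cpu(cpu) {
	seqcount_t *s = &per_cpu(xt_recseq, cpu);
	u32 seq = raw_read_seqcount(s);

	if (seq & 1) {
		do {
			cond_resched();
			cpu_relax();
		} while (seq == raw_read_seqcount(s));
	}
}
```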
* Re: seqcount usage in xt_replace_table() 2019-01-10 14:48 ` Florian Westphal 2019-01-10 20:20 ` Peter Zijlstra @ 2019-01-10 20:25 ` Peter Zijlstra 2019-01-10 22:29 ` Florian Westphal 1 sibling, 1 reply; 28+ messages in thread From: Peter Zijlstra @ 2019-01-10 20:25 UTC (permalink / raw) To: Florian Westphal; +Cc: Anatol Pomozov, Dmitry Vyukov, paulmck, LKML

On Thu, Jan 10, 2019 at 03:48:12PM +0100, Florian Westphal wrote:
> Peter Zijlstra <peterz@infradead.org> wrote:
> > Is performance a concern in this path? There is no comment justifying
> > this 'creative' stuff.
>
> We have to wait until all cpus are done with current iptables ruleset.
>
> Before this 'creative' change, this relied on get_counters
> synchronization. And that caused wait times of 30 seconds or more on
> busy systems.
>
> I have no objections swapping this with a synchronize_rcu() if that
> makes it easier.

Would using synchronize_rcu() not also mean you can get rid of that xt_write_recseq*() stuff entirely?

Anyway, synchronize_rcu() can also take a little while, but I don't think anywhere near 30 seconds.

> (synchronize_rcu might be confusing though, as we don't use rcu
> for table->private), and one 'has to know' that all the netfilter
> hooks, including the iptables eval loop, run with rcu_read_lock
> held).

A comment can help there, right?

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-10 20:25 ` Peter Zijlstra @ 2019-01-10 22:29 ` Florian Westphal 2019-01-11 8:34 ` Peter Zijlstra 0 siblings, 1 reply; 28+ messages in thread From: Florian Westphal @ 2019-01-10 22:29 UTC (permalink / raw) To: Peter Zijlstra Cc: Florian Westphal, Anatol Pomozov, Dmitry Vyukov, paulmck, LKML Peter Zijlstra <peterz@infradead.org> wrote: > Would using synchronize_rcu() not also mean you can get rid of that > xt_write_recseq*() stuff entirely? No, because those are used to synchronize with cpus that read the ruleset counters, see net/ipv4/netfilter/ip_tables.c:get_counters(). > Anyway, synchronize_rcu() can also take a little while, but I don't > think anywere near 30 seconds. Ok, I think in that case it would be best to just replace the recseq value sampling with smp_mb + synchronize_rcu plus a comment that explains why its done. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-10 22:29 ` Florian Westphal @ 2019-01-11 8:34 ` Peter Zijlstra 2019-01-11 14:08 ` Paul E. McKenney 0 siblings, 1 reply; 28+ messages in thread From: Peter Zijlstra @ 2019-01-11 8:34 UTC (permalink / raw) To: Florian Westphal; +Cc: Anatol Pomozov, Dmitry Vyukov, paulmck, LKML On Thu, Jan 10, 2019 at 11:29:20PM +0100, Florian Westphal wrote: > Peter Zijlstra <peterz@infradead.org> wrote: > > Would using synchronize_rcu() not also mean you can get rid of that > > xt_write_recseq*() stuff entirely? > > No, because those are used to synchronize with cpus that read > the ruleset counters, see > > net/ipv4/netfilter/ip_tables.c:get_counters(). Ah, bummer :/ > > Anyway, synchronize_rcu() can also take a little while, but I don't > > think anywere near 30 seconds. > > Ok, I think in that case it would be best to just replace the > recseq value sampling with smp_mb + synchronize_rcu plus a comment > that explains why its done. synchronize_rcu() implies smp_mb() on all CPUs. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-11 8:34 ` Peter Zijlstra @ 2019-01-11 14:08 ` Paul E. McKenney 0 siblings, 0 replies; 28+ messages in thread From: Paul E. McKenney @ 2019-01-11 14:08 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Florian Westphal, Anatol Pomozov, Dmitry Vyukov, LKML

On Fri, Jan 11, 2019 at 09:34:11AM +0100, Peter Zijlstra wrote:
> On Thu, Jan 10, 2019 at 11:29:20PM +0100, Florian Westphal wrote:
> > Peter Zijlstra <peterz@infradead.org> wrote:
> > > Would using synchronize_rcu() not also mean you can get rid of that
> > > xt_write_recseq*() stuff entirely?
> >
> > No, because those are used to synchronize with cpus that read
> > the ruleset counters, see
> >
> > net/ipv4/netfilter/ip_tables.c:get_counters().
>
> Ah, bummer :/
>
> > > Anyway, synchronize_rcu() can also take a little while, but I don't
> > > think anywhere near 30 seconds.
> >
> > Ok, I think in that case it would be best to just replace the
> > recseq value sampling with smp_mb + synchronize_rcu plus a comment
> > that explains why its done.
>
> synchronize_rcu() implies smp_mb() on all CPUs.

Yes, it does, but in the case of idle CPUs, the smp_mb() calls are only required to follow any pre-existing RCU read-side critical section on the one hand and precede any RCU read-side critical section completing after the synchronize_rcu() on the other. To do more would mean waking up idle CPUs, which does not make the battery-powered guys happy. ;-)

Thanx, Paul

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-10 12:41 ` Peter Zijlstra 2019-01-10 12:53 ` Dmitry Vyukov 2019-01-10 14:48 ` Florian Westphal @ 2019-01-10 14:52 ` Paul E. McKenney 2 siblings, 0 replies; 28+ messages in thread From: Paul E. McKenney @ 2019-01-10 14:52 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Florian Westphal, Anatol Pomozov, Dmitry Vyukov, LKML On Thu, Jan 10, 2019 at 01:41:23PM +0100, Peter Zijlstra wrote: > On Tue, Jan 08, 2019 at 11:37:46PM +0100, Florian Westphal wrote: > > Anatol Pomozov <anatol.pomozov@gmail.com> wrote: > > > Or maybe xt_replace_table() can be enhanced? When I hear that > > > something waits until an event happens on all CPUs I think about > > > wait_event() function. Would it be better for xt_replace_table() to > > > introduce an atomic counter that is decremented by CPUs, and the main > > > CPU waits until the counter gets zero? > > > > That would mean placing an additional atomic op into the > > iptables evaluation path (ipt_do_table and friends). > > > > For: > > /* > * Ensure contents of newinfo are visible before assigning to > * private. > */ > smp_wmb(); > table->private = newinfo; > > we have: > > smp_store_release(&table->private, newinfo); > > But what store does that second smp_wmb() order against? The comment: > > /* make sure all cpus see new ->private value */ > smp_wmb(); > > makes no sense what so ever, no smp_*() barrier can provide such > guarantees. Agreed, this would require something like synchronize_rcu() or some sort of IPI-based sys_membarrier() lookalike. Thanx, Paul > > Only alternative I see that might work is synchronize_rcu (the > > _do_table functions are called with rcu read lock held). > > > > I guess current scheme is cheaper though. > > Is performance a concern in this path? There is no comment justifying > this 'creative' stuff. > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-08 19:33 seqcount usage in xt_replace_table() Anatol Pomozov 2019-01-08 22:37 ` Florian Westphal @ 2019-01-09 0:02 ` Andrea Parri 2019-01-09 0:36 ` Anatol Pomozov 2019-01-10 12:44 ` Peter Zijlstra 2 siblings, 1 reply; 28+ messages in thread From: Andrea Parri @ 2019-01-09 0:02 UTC (permalink / raw) To: Anatol Pomozov; +Cc: fw, Dmitry Vyukov, paulmck, LKML Hi Anatol, On Tue, Jan 08, 2019 at 11:33:39AM -0800, Anatol Pomozov wrote: > Hello folks, > > A bit of context what I am doing. I am trying to port KTSAN (Kernel > Thread Sanitizer) tool to v4.20. That tool tracks shared data usage > and makes sure it is accessed in a thread-safe manner. Interesting! FYI, some LKMM's maintainers (Paul included) had and continued to have some "fun" discussing topics related to "thread- safe memory accesses": I'm sure that they'll be very interested in such work of yours and eager to discuss your results. Cheers, Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-09 0:02 ` Andrea Parri @ 2019-01-09 0:36 ` Anatol Pomozov 2019-01-09 5:35 ` Dmitry Vyukov 2019-01-09 11:24 ` Andrea Parri 0 siblings, 2 replies; 28+ messages in thread From: Anatol Pomozov @ 2019-01-09 0:36 UTC (permalink / raw) To: Andrea Parri; +Cc: fw, Dmitry Vyukov, Paul McKenney, LKML

Hello

On Tue, Jan 8, 2019 at 4:02 PM Andrea Parri <andrea.parri@amarulasolutions.com> wrote:
>
> Hi Anatol,
>
> On Tue, Jan 08, 2019 at 11:33:39AM -0800, Anatol Pomozov wrote:
> > Hello folks,
> >
> > A bit of context what I am doing. I am trying to port KTSAN (Kernel
> > Thread Sanitizer) tool to v4.20. That tool tracks shared data usage
> > and makes sure it is accessed in a thread-safe manner.
>
> Interesting! FYI, some LKMM's maintainers (Paul included) had and
> continued to have some "fun" discussing topics related to "thread-
> safe memory accesses": I'm sure that they'll be very interested in
> such work of yours and eager to discuss your results.

Thread Sanitizer is a great tool to find thread-safety issues with user-space code. The tool has been developed by a team of smart people from Google [1].

KTSAN is an attempt to bring the same ideas to the Linux kernel [2]. A bunch of work has been done there but the project is still at a proof-of-concept point.

I am not a part of Google's dynamic tools team. But I've decided to pick something to do during the New Year holidays so started porting KTSAN from v4.2 to v4.20. The work is "almost completed" but I need to fix a few crashes [3].

[1] https://github.com/google/sanitizers
[2] https://github.com/google/ktsan/wiki
[3] https://github.com/anatol/linux/tree/ktsan_4.20

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-09 0:36 ` Anatol Pomozov @ 2019-01-09 5:35 ` Dmitry Vyukov 2019-01-09 11:24 ` Andrea Parri 1 sibling, 0 replies; 28+ messages in thread From: Dmitry Vyukov @ 2019-01-09 5:35 UTC (permalink / raw) To: Anatol Pomozov Cc: Andrea Parri, Florian Westphal, Paul McKenney, LKML, Andrey Konovalov On Wed, Jan 9, 2019 at 1:36 AM Anatol Pomozov <anatol.pomozov@gmail.com> wrote: > > Hello > > On Tue, Jan 8, 2019 at 4:02 PM Andrea Parri > <andrea.parri@amarulasolutions.com> wrote: > > > > Hi Anatol, > > > > On Tue, Jan 08, 2019 at 11:33:39AM -0800, Anatol Pomozov wrote: > > > Hello folks, > > > > > > A bit of context what I am doing. I am trying to port KTSAN (Kernel > > > Thread Sanitizer) tool to v4.20. That tool tracks shared data usage > > > and makes sure it is accessed in a thread-safe manner. > > > > Interesting! FYI, some LKMM's maintainers (Paul included) had and > > continued to have some "fun" discussing topics related to "thread- > > safe memory accesses": I'm sure that they'll be very interested in > > such work of yours and eager to discuss your results. > > Thread Sanitizer is a great tool to find thread-safety issues with > user-space code. The tool been developed by a team of smart people > from Google [1]. > > KTSAN is an attempt to bring the same ideas to Linux kernel [2]. A > bunch of work been done there but the project is still at > proof-of-concept point. > > I am not a part of Google's dynamic tools team. But I've decided to > pick something to do during the New Year holidays so started porting > KTSAN from v4.2 to v4.20. The work is "almost completed" but I need to > fix a few crashes [3]. > > [1] https://github.com/google/sanitizers > [2] https://github.com/google/ktsan/wiki > [3] https://github.com/anatol/linux/tree/ktsan_4.20 For completeness, at the time we also had to add read_seqcount_cancel() function to dynamically mark all seqcount read regions. 
It can be used here too: start a read region and immediately end it. Less clear than raw_read_seqcount_nocritical(), but also less API surface.

/**
 * read_seqcount_cancel - cancel a seq-read critical section
 * @s: pointer to seqcount_t
 *
 * This is a no-op except for ktsan, it needs to know scopes of seq-read
 * critical sections. The sections are denoted either by begin->retry or
 * by begin->cancel.
 */
static inline void read_seqcount_cancel(const seqcount_t *s)
{
	ktsan_seqcount_end(s);
}

https://github.com/google/ktsan/blob/tsan/include/linux/seqlock.h#L235-L246

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-09 0:36 ` Anatol Pomozov 2019-01-09 5:35 ` Dmitry Vyukov @ 2019-01-09 11:24 ` Andrea Parri 2019-01-09 11:55 ` Dmitry Vyukov 1 sibling, 1 reply; 28+ messages in thread From: Andrea Parri @ 2019-01-09 11:24 UTC (permalink / raw) To: Anatol Pomozov; +Cc: fw, Dmitry Vyukov, Paul McKenney, LKML, Andrey Konovalov On Tue, Jan 08, 2019 at 04:36:46PM -0800, Anatol Pomozov wrote: > Hello > > On Tue, Jan 8, 2019 at 4:02 PM Andrea Parri > <andrea.parri@amarulasolutions.com> wrote: > > > > Hi Anatol, > > > > On Tue, Jan 08, 2019 at 11:33:39AM -0800, Anatol Pomozov wrote: > > > Hello folks, > > > > > > A bit of context what I am doing. I am trying to port KTSAN (Kernel > > > Thread Sanitizer) tool to v4.20. That tool tracks shared data usage > > > and makes sure it is accessed in a thread-safe manner. > > > > Interesting! FYI, some LKMM's maintainers (Paul included) had and > > continued to have some "fun" discussing topics related to "thread- > > safe memory accesses": I'm sure that they'll be very interested in > > such work of yours and eager to discuss your results. > > Thread Sanitizer is a great tool to find thread-safety issues with > user-space code. The tool been developed by a team of smart people > from Google [1]. > > KTSAN is an attempt to bring the same ideas to Linux kernel [2]. A > bunch of work been done there but the project is still at > proof-of-concept point. Yes, I have been aware of these tools since at least ;-) https://groups.google.com/forum/#!msg/ktsan/bVZ1c6H2NE0/Dxrw55bfBAAJ > > I am not a part of Google's dynamic tools team. But I've decided to > pick something to do during the New Year holidays so started porting > KTSAN from v4.2 to v4.20. The work is "almost completed" but I need to > fix a few crashes [3]. 
I guess my first reaction would remain

"it's kind of hard (to use a euphemism) to review 7,582 additions or so for a data race detector without a clear/an accepted (by the community) notion of data race..."

Andrea

>
> [1] https://github.com/google/sanitizers
> [2] https://github.com/google/ktsan/wiki
> [3] https://github.com/anatol/linux/tree/ktsan_4.20

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-09 11:24 ` Andrea Parri @ 2019-01-09 11:55 ` Dmitry Vyukov 2019-01-09 12:11 ` Andrea Parri 0 siblings, 1 reply; 28+ messages in thread From: Dmitry Vyukov @ 2019-01-09 11:55 UTC (permalink / raw) To: Andrea Parri Cc: Anatol Pomozov, Florian Westphal, Paul McKenney, LKML, Andrey Konovalov On Wed, Jan 9, 2019 at 12:24 PM Andrea Parri <andrea.parri@amarulasolutions.com> wrote: > > On Tue, Jan 08, 2019 at 04:36:46PM -0800, Anatol Pomozov wrote: > > Hello > > > > On Tue, Jan 8, 2019 at 4:02 PM Andrea Parri > > <andrea.parri@amarulasolutions.com> wrote: > > > > > > Hi Anatol, > > > > > > On Tue, Jan 08, 2019 at 11:33:39AM -0800, Anatol Pomozov wrote: > > > > Hello folks, > > > > > > > > A bit of context what I am doing. I am trying to port KTSAN (Kernel > > > > Thread Sanitizer) tool to v4.20. That tool tracks shared data usage > > > > and makes sure it is accessed in a thread-safe manner. > > > > > > Interesting! FYI, some LKMM's maintainers (Paul included) had and > > > continued to have some "fun" discussing topics related to "thread- > > > safe memory accesses": I'm sure that they'll be very interested in > > > such work of yours and eager to discuss your results. > > > > Thread Sanitizer is a great tool to find thread-safety issues with > > user-space code. The tool been developed by a team of smart people > > from Google [1]. > > > > KTSAN is an attempt to bring the same ideas to Linux kernel [2]. A > > bunch of work been done there but the project is still at > > proof-of-concept point. > > Yes, I have been aware of these tools since at least ;-) > > https://groups.google.com/forum/#!msg/ktsan/bVZ1c6H2NE0/Dxrw55bfBAAJ > > > > > > I am not a part of Google's dynamic tools team. But I've decided to > > pick something to do during the New Year holidays so started porting > > KTSAN from v4.2 to v4.20. The work is "almost completed" but I need to > > fix a few crashes [3]. 
> > I guess my first reaction would remain > > "it's kind of hard (to use an euphemism) to review 7,582 additions > or so for a data race detector without a clear/an accepted (by the > community) notion of data race..." Tsan's notion of a data race is basically the C/C++'s notion: concurrent/unsynchronized non-atomic access in different threads at least one of which is a write. Tremendous (for such a project) benefits of automatic data race detection is a good motivation to finally agree on and accept a practically useful notion of a data race. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-09 11:55 ` Dmitry Vyukov @ 2019-01-09 12:11 ` Andrea Parri 2019-01-09 12:29 ` Dmitry Vyukov 0 siblings, 1 reply; 28+ messages in thread From: Andrea Parri @ 2019-01-09 12:11 UTC (permalink / raw) To: Dmitry Vyukov Cc: Anatol Pomozov, Florian Westphal, Paul McKenney, LKML, Andrey Konovalov, Alan Stern, Luc Maranget, Will Deacon, Peter Zijlstra On Wed, Jan 09, 2019 at 12:55:27PM +0100, Dmitry Vyukov wrote: > On Wed, Jan 9, 2019 at 12:24 PM Andrea Parri > <andrea.parri@amarulasolutions.com> wrote: > > > > On Tue, Jan 08, 2019 at 04:36:46PM -0800, Anatol Pomozov wrote: > > > Hello > > > > > > On Tue, Jan 8, 2019 at 4:02 PM Andrea Parri > > > <andrea.parri@amarulasolutions.com> wrote: > > > > > > > > Hi Anatol, > > > > > > > > On Tue, Jan 08, 2019 at 11:33:39AM -0800, Anatol Pomozov wrote: > > > > > Hello folks, > > > > > > > > > > A bit of context what I am doing. I am trying to port KTSAN (Kernel > > > > > Thread Sanitizer) tool to v4.20. That tool tracks shared data usage > > > > > and makes sure it is accessed in a thread-safe manner. > > > > > > > > Interesting! FYI, some LKMM's maintainers (Paul included) had and > > > > continued to have some "fun" discussing topics related to "thread- > > > > safe memory accesses": I'm sure that they'll be very interested in > > > > such work of yours and eager to discuss your results. > > > > > > Thread Sanitizer is a great tool to find thread-safety issues with > > > user-space code. The tool been developed by a team of smart people > > > from Google [1]. > > > > > > KTSAN is an attempt to bring the same ideas to Linux kernel [2]. A > > > bunch of work been done there but the project is still at > > > proof-of-concept point. > > > > Yes, I have been aware of these tools since at least ;-) > > > > https://groups.google.com/forum/#!msg/ktsan/bVZ1c6H2NE0/Dxrw55bfBAAJ > > > > > > > > > > I am not a part of Google's dynamic tools team. 
But I've decided to > > > pick something to do during the New Year holidays so started porting > > > KTSAN from v4.2 to v4.20. The work is "almost completed" but I need to > > > fix a few crashes [3]. > > > > I guess my first reaction would remain > > > > "it's kind of hard (to use an euphemism) to review 7,582 additions > > or so for a data race detector without a clear/an accepted (by the > > community) notion of data race..." > > Tsan's notion of a data race is basically the C/C++'s notion: > concurrent/unsynchronized non-atomic access in different threads at > least one of which is a write. Yeah, I think that this notion needs to be detailed, discussed, documented, and discussed again. ;-) > Tremendous (for such a project) benefits of automatic data race > detection is a good motivation to finally agree on and accept a > practically useful notion of a data race. Agreed. Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-09 12:11 ` Andrea Parri @ 2019-01-09 12:29 ` Dmitry Vyukov 2019-01-09 17:10 ` Paul E. McKenney 0 siblings, 1 reply; 28+ messages in thread From: Dmitry Vyukov @ 2019-01-09 12:29 UTC (permalink / raw) To: Andrea Parri Cc: Anatol Pomozov, Florian Westphal, Paul McKenney, LKML, Andrey Konovalov, Alan Stern, Luc Maranget, Will Deacon, Peter Zijlstra On Wed, Jan 9, 2019 at 1:11 PM Andrea Parri <andrea.parri@amarulasolutions.com> wrote: > > On Wed, Jan 09, 2019 at 12:55:27PM +0100, Dmitry Vyukov wrote: > > On Wed, Jan 9, 2019 at 12:24 PM Andrea Parri > > <andrea.parri@amarulasolutions.com> wrote: > > > > > > On Tue, Jan 08, 2019 at 04:36:46PM -0800, Anatol Pomozov wrote: > > > > Hello > > > > > > > > On Tue, Jan 8, 2019 at 4:02 PM Andrea Parri > > > > <andrea.parri@amarulasolutions.com> wrote: > > > > > > > > > > Hi Anatol, > > > > > > > > > > On Tue, Jan 08, 2019 at 11:33:39AM -0800, Anatol Pomozov wrote: > > > > > > Hello folks, > > > > > > > > > > > > A bit of context what I am doing. I am trying to port KTSAN (Kernel > > > > > > Thread Sanitizer) tool to v4.20. That tool tracks shared data usage > > > > > > and makes sure it is accessed in a thread-safe manner. > > > > > > > > > > Interesting! FYI, some LKMM's maintainers (Paul included) had and > > > > > continued to have some "fun" discussing topics related to "thread- > > > > > safe memory accesses": I'm sure that they'll be very interested in > > > > > such work of yours and eager to discuss your results. > > > > > > > > Thread Sanitizer is a great tool to find thread-safety issues with > > > > user-space code. The tool been developed by a team of smart people > > > > from Google [1]. > > > > > > > > KTSAN is an attempt to bring the same ideas to Linux kernel [2]. A > > > > bunch of work been done there but the project is still at > > > > proof-of-concept point. 
> > > > > > Yes, I have been aware of these tools since at least ;-) > > > > > > https://groups.google.com/forum/#!msg/ktsan/bVZ1c6H2NE0/Dxrw55bfBAAJ > > > > > > > > > > > > > > I am not a part of Google's dynamic tools team. But I've decided to > > > > pick something to do during the New Year holidays so started porting > > > > KTSAN from v4.2 to v4.20. The work is "almost completed" but I need to > > > > fix a few crashes [3]. > > > > > > I guess my first reaction would remain > > > > > > "it's kind of hard (to use an euphemism) to review 7,582 additions > > > or so for a data race detector without a clear/an accepted (by the > > > community) notion of data race..." > > > > Tsan's notion of a data race is basically the C/C++'s notion: > > concurrent/unsynchronized non-atomic access in different threads at > > least one of which is a write. > > Yeah, I think that this notion needs to be detailed, discussed, > documented, and discussed again. ;-) > > > > Tremendous (for such a project) benefits of automatic data race > > detection is a good motivation to finally agree on and accept a > > practically useful notion of a data race. > > Agreed. While having a 100% formal definition of a data race upfront would be useful, I don't think this is a hard requirement for deployment of KTSAN. What I think is required is: 1. Agree that the overall direction is right. 2. Agree that we want to enable data race detection and resolve problems as they appear in a practical manner (rather than block whole effort on every small thing). We deployed TSAN in user-space in much larger code bases than kernel, and while we had the C/C++ formal definition of a data race, practical and legacy matters were similar to that of the kernel (lots of legacy code, different opinions, etc). 
Doing both things in tandem (defining a memory model and deploying a data race detector) can actually have benefits as a race detector may point to under-defined or impractically defined areas, and will otherwise help to validate that the model works and is useful. KTSAN is not fixed either: we adapted it as we gathered more knowledge and understanding of the kernel. So it's not that we have to commit to something upfront. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-09 12:29 ` Dmitry Vyukov @ 2019-01-09 17:10 ` Paul E. McKenney 2019-01-10 8:49 ` Dmitry Vyukov 0 siblings, 1 reply; 28+ messages in thread From: Paul E. McKenney @ 2019-01-09 17:10 UTC (permalink / raw) To: Dmitry Vyukov Cc: Andrea Parri, Anatol Pomozov, Florian Westphal, LKML, Andrey Konovalov, Alan Stern, Luc Maranget, Will Deacon, Peter Zijlstra On Wed, Jan 09, 2019 at 01:29:02PM +0100, Dmitry Vyukov wrote: > On Wed, Jan 9, 2019 at 1:11 PM Andrea Parri > <andrea.parri@amarulasolutions.com> wrote: > > > > On Wed, Jan 09, 2019 at 12:55:27PM +0100, Dmitry Vyukov wrote: > > > On Wed, Jan 9, 2019 at 12:24 PM Andrea Parri > > > <andrea.parri@amarulasolutions.com> wrote: > > > > > > > > On Tue, Jan 08, 2019 at 04:36:46PM -0800, Anatol Pomozov wrote: > > > > > Hello > > > > > > > > > > On Tue, Jan 8, 2019 at 4:02 PM Andrea Parri > > > > > <andrea.parri@amarulasolutions.com> wrote: > > > > > > > > > > > > Hi Anatol, > > > > > > > > > > > > On Tue, Jan 08, 2019 at 11:33:39AM -0800, Anatol Pomozov wrote: > > > > > > > Hello folks, > > > > > > > > > > > > > > A bit of context what I am doing. I am trying to port KTSAN (Kernel > > > > > > > Thread Sanitizer) tool to v4.20. That tool tracks shared data usage > > > > > > > and makes sure it is accessed in a thread-safe manner. > > > > > > > > > > > > Interesting! FYI, some LKMM's maintainers (Paul included) had and > > > > > > continued to have some "fun" discussing topics related to "thread- > > > > > > safe memory accesses": I'm sure that they'll be very interested in > > > > > > such work of yours and eager to discuss your results. > > > > > > > > > > Thread Sanitizer is a great tool to find thread-safety issues with > > > > > user-space code. The tool been developed by a team of smart people > > > > > from Google [1]. > > > > > > > > > > KTSAN is an attempt to bring the same ideas to Linux kernel [2]. 
A > > > > > bunch of work been done there but the project is still at > > > > > proof-of-concept point. > > > > > > > > Yes, I have been aware of these tools since at least ;-) > > > > > > > > https://groups.google.com/forum/#!msg/ktsan/bVZ1c6H2NE0/Dxrw55bfBAAJ > > > > > > > > > > > > > > > > > > I am not a part of Google's dynamic tools team. But I've decided to > > > > > pick something to do during the New Year holidays so started porting > > > > > KTSAN from v4.2 to v4.20. The work is "almost completed" but I need to > > > > > fix a few crashes [3]. > > > > > > > > I guess my first reaction would remain > > > > > > > > "it's kind of hard (to use an euphemism) to review 7,582 additions > > > > or so for a data race detector without a clear/an accepted (by the > > > > community) notion of data race..." > > > > > > Tsan's notion of a data race is basically the C/C++'s notion: > > > concurrent/unsynchronized non-atomic access in different threads at > > > least one of which is a write. > > > > Yeah, I think that this notion needs to be detailed, discussed, > > documented, and discussed again. ;-) > > > > > > > Tremendous (for such a project) benefits of automatic data race > > > detection is a good motivation to finally agree on and accept a > > > practically useful notion of a data race. > > > > Agreed. > > While having a 100% formal definition of a data race upfront would be > useful, I don't think this is a hard requirement for deployment of > KTSAN. What I think is required is: > 1. Agree that the overall direction is right. > 2. Agree that we want to enable data race detection and resolve > problems as they appear in a practical manner (rather than block whole > effort on every small thing). > We deployed TSAN in user-space in much larger code bases than kernel, > and while we had the C/C++ formal definition of a data race, practical > and legacy matters were similar to that of the kernel (lots of legacy > code, different opinions, etc). 
Doing both things in tandem (defining > a memory model and deploying a data race detector) can actually have > benefits as a race detector may point to under-defined or > impractically defined areas, and will otherwise help to validate that > the model works and is useful. > KTSAN is not fixed as well. We adopted it as we gathered more > knowledge and understanding of the kernel. So it's not that we have to > commit to something upfront. In any case, there might well be some differences in approach between KTSAN and LKMM due to input size differences: One would expect LKMM to be able to tolerate a more computationally intensive definition as a consequence of KTSAN's ability to process much larger code bases. But I nevertheless believe that it would be good to have these differences be a matter of conscious choice rather than a matter of chance. ;-) My guess is that LKMM picks its starting point (which might take some additional time), then KTSAN critiques it, and then we work out what differences should result in a change to one or the other (or both) and which differences are inherent in the different workloads that LKMM and KTSAN are presented with. Seem reasonable? Thanx, Paul ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-09 17:10 ` Paul E. McKenney @ 2019-01-10 8:49 ` Dmitry Vyukov 2019-01-10 12:30 ` Andrea Parri 0 siblings, 1 reply; 28+ messages in thread From: Dmitry Vyukov @ 2019-01-10 8:49 UTC (permalink / raw) To: Paul E. McKenney Cc: Andrea Parri, Anatol Pomozov, Florian Westphal, LKML, Andrey Konovalov, Alan Stern, Luc Maranget, Will Deacon, Peter Zijlstra On Wed, Jan 9, 2019 at 6:11 PM Paul E. McKenney <paulmck@linux.ibm.com> wrote: > > On Wed, Jan 09, 2019 at 01:29:02PM +0100, Dmitry Vyukov wrote: > > On Wed, Jan 9, 2019 at 1:11 PM Andrea Parri > > <andrea.parri@amarulasolutions.com> wrote: > > > > > > On Wed, Jan 09, 2019 at 12:55:27PM +0100, Dmitry Vyukov wrote: > > > > On Wed, Jan 9, 2019 at 12:24 PM Andrea Parri > > > > <andrea.parri@amarulasolutions.com> wrote: > > > > > > > > > > On Tue, Jan 08, 2019 at 04:36:46PM -0800, Anatol Pomozov wrote: > > > > > > Hello > > > > > > > > > > > > On Tue, Jan 8, 2019 at 4:02 PM Andrea Parri > > > > > > <andrea.parri@amarulasolutions.com> wrote: > > > > > > > > > > > > > > Hi Anatol, > > > > > > > > > > > > > > On Tue, Jan 08, 2019 at 11:33:39AM -0800, Anatol Pomozov wrote: > > > > > > > > Hello folks, > > > > > > > > > > > > > > > > A bit of context what I am doing. I am trying to port KTSAN (Kernel > > > > > > > > Thread Sanitizer) tool to v4.20. That tool tracks shared data usage > > > > > > > > and makes sure it is accessed in a thread-safe manner. > > > > > > > > > > > > > > Interesting! FYI, some LKMM's maintainers (Paul included) had and > > > > > > > continued to have some "fun" discussing topics related to "thread- > > > > > > > safe memory accesses": I'm sure that they'll be very interested in > > > > > > > such work of yours and eager to discuss your results. > > > > > > > > > > > > Thread Sanitizer is a great tool to find thread-safety issues with > > > > > > user-space code. The tool been developed by a team of smart people > > > > > > from Google [1]. 
> > > > > > > > > > > > KTSAN is an attempt to bring the same ideas to Linux kernel [2]. A > > > > > > bunch of work been done there but the project is still at > > > > > > proof-of-concept point. > > > > > > > > > > Yes, I have been aware of these tools since at least ;-) > > > > > > > > > > https://groups.google.com/forum/#!msg/ktsan/bVZ1c6H2NE0/Dxrw55bfBAAJ > > > > > > > > > > > > > > > > > > > > > > I am not a part of Google's dynamic tools team. But I've decided to > > > > > > pick something to do during the New Year holidays so started porting > > > > > > KTSAN from v4.2 to v4.20. The work is "almost completed" but I need to > > > > > > fix a few crashes [3]. > > > > > > > > > > I guess my first reaction would remain > > > > > > > > > > "it's kind of hard (to use an euphemism) to review 7,582 additions > > > > > or so for a data race detector without a clear/an accepted (by the > > > > > community) notion of data race..." > > > > > > > > Tsan's notion of a data race is basically the C/C++'s notion: > > > > concurrent/unsynchronized non-atomic access in different threads at > > > > least one of which is a write. > > > > > > Yeah, I think that this notion needs to be detailed, discussed, > > > documented, and discussed again. ;-) > > > > > > > > > > Tremendous (for such a project) benefits of automatic data race > > > > detection is a good motivation to finally agree on and accept a > > > > practically useful notion of a data race. > > > > > > Agreed. > > > > While having a 100% formal definition of a data race upfront would be > > useful, I don't think this is a hard requirement for deployment of > > KTSAN. What I think is required is: > > 1. Agree that the overall direction is right. > > 2. Agree that we want to enable data race detection and resolve > > problems as they appear in a practical manner (rather than block whole > > effort on every small thing). 
> > We deployed TSAN in user-space in much larger code bases than kernel, > > and while we had the C/C++ formal definition of a data race, practical > > and legacy matters were similar to that of the kernel (lots of legacy > > code, different opinions, etc). Doing both things in tandem (defining > > a memory model and deploying a data race detector) can actually have > > benefits as a race detector may point to under-defined or > > impractically defined areas, and will otherwise help to validate that > > the model works and is useful. > > KTSAN is not fixed as well. We adopted it as we gathered more > > knowledge and understanding of the kernel. So it's not that we have to > > commit to something upfront. > > In any case, there might well be some differences in approach between > KTSAN and LKMM due to input size differences: One would expect LKMM > to be able to tolerate a more computationally intensive definition as > a consequence of KTSAN's ability to process much larger code bases. > > But I nevertheless believe that it would be good to have these differences > be a matter of conscious choice rather than a matter of chance. ;-) > > My guess is that LKMM picks its starting point (which might take some > additional time), then KTSAN critiques it, and then we work out what > differences should result in a change to one or the other (or both) > and which differences are inherent in the different workloads that LKMM > and KTSAN are presented with. > > Seem reasonable? Sounds reasonable. For seqcounts we currently simply ignore all accesses within the read section (thus the requirement to dynamically track read sections). What does LKMM say about seqlocks? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-10 8:49 ` Dmitry Vyukov @ 2019-01-10 12:30 ` Andrea Parri 2019-01-10 12:38 ` Dmitry Vyukov 0 siblings, 1 reply; 28+ messages in thread From: Andrea Parri @ 2019-01-10 12:30 UTC (permalink / raw) To: Dmitry Vyukov Cc: Paul E. McKenney, Anatol Pomozov, Florian Westphal, LKML, Andrey Konovalov, Alan Stern, Luc Maranget, Will Deacon, Peter Zijlstra > For seqcounts we currently simply ignore all accesses within the read > section (thus the requirement to dynamically track read sections). > What does LKMM say about seqlocks? LKMM does not currently model seqlocks, if that's what you're asking; cf. tools/memory-model/linux-kernel.def for a list of the currently supported synchronization primitives. LKMM also has no notion of "data race"; it insists that the code must contain no unmarked accesses. We have been discussing such extensions since at least Dec'17 (we're not quite there!, as mentioned by Paul). My opinion is that ignoring all accesses within a given read section _can_ lead to false negatives (in every possible definition of "data race" and "read sections" I can think of at the moment ;D):

  P0                      P1
  read_seqbegin()         x = 1;
  r0 = x;
  read_seqretry() // =0

ought to be "racy"..., right? (I didn't audit all the callsites for read_{seqbegin,seqretry}(), but I wouldn't be surprised to find such a pattern ;D ... "legacy", as you recalled). Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-10 12:30 ` Andrea Parri @ 2019-01-10 12:38 ` Dmitry Vyukov 2019-01-10 12:46 ` Andrea Parri 2019-01-10 14:50 ` Paul E. McKenney 0 siblings, 2 replies; 28+ messages in thread From: Dmitry Vyukov @ 2019-01-10 12:38 UTC (permalink / raw) To: Andrea Parri Cc: Paul E. McKenney, Anatol Pomozov, Florian Westphal, LKML, Andrey Konovalov, Alan Stern, Luc Maranget, Will Deacon, Peter Zijlstra On Thu, Jan 10, 2019 at 1:30 PM Andrea Parri <andrea.parri@amarulasolutions.com> wrote: > > > For seqcounts we currently simply ignore all accesses within the read > > section (thus the requirement to dynamically track read sections). > > What does LKMM say about seqlocks? > > LKMM does not currently model seqlocks, if that's what you're asking; > c.f., tools/memory-model/linux-kernel.def for a list of the currently > supported synchronization primitives. > > LKMM has also no notion of "data race", it insists that the code must > contain no unmarked accesses; we have been discussing such extensions > since at least Dec'17 (we're not quite there!, as mentioned by Paul). How does it call cases that do contain unmarked accesses then? :) > My opinion is that ignoring all accesses within a given read section > _can_ lead to false negatives Absolutely. But this is a deliberate decision. For our tools we consider priority 1: no false positives. Period. Priority 2: also report some true positives in best effort manner. > (in every possible definition of "data > race" and "read sections" I can think of at the moment ;D): > > P0 P1 > read_seqbegin() x = 1; > r0 = x; > read_seqretry() // =0 > > ought to be "racy"..., right? (I didn't audit all the callsites for > read_{seqbegin,seqretry}(), but I wouldn't be surprised to find such > pattern ;D ... "legacy", as you recalled). > > Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-10 12:38 ` Dmitry Vyukov @ 2019-01-10 12:46 ` Andrea Parri 2019-01-10 13:25 ` Dmitry Vyukov 2019-01-10 14:50 ` Paul E. McKenney 1 sibling, 1 reply; 28+ messages in thread From: Andrea Parri @ 2019-01-10 12:46 UTC (permalink / raw) To: Dmitry Vyukov Cc: Paul E. McKenney, Anatol Pomozov, Florian Westphal, LKML, Andrey Konovalov, Alan Stern, Luc Maranget, Will Deacon, Peter Zijlstra On Thu, Jan 10, 2019 at 01:38:11PM +0100, Dmitry Vyukov wrote: > On Thu, Jan 10, 2019 at 1:30 PM Andrea Parri > <andrea.parri@amarulasolutions.com> wrote: > > > > > For seqcounts we currently simply ignore all accesses within the read > > > section (thus the requirement to dynamically track read sections). > > > What does LKMM say about seqlocks? > > > > LKMM does not currently model seqlocks, if that's what you're asking; > > c.f., tools/memory-model/linux-kernel.def for a list of the currently > > supported synchronization primitives. > > > > LKMM has also no notion of "data race", it insists that the code must > > contain no unmarked accesses; we have been discussing such extensions > > since at least Dec'17 (we're not quite there!, as mentioned by Paul). > > How does it call cases that do contain unmarked accesses then? :) "work-in-progress" ;), or "limitation" (see tools/memory-model/README) > > > My opinion is that ignoring all accesses within a given read section > > _can_ lead to false negatives > > Absolutely. But this is a deliberate decision. > > For our tools we consider priority 1: no false positives. Period. > > Priority 2: also report some true positives in best effort manner. This sounds reasonable to me. But please don't overlook the fact that to be able to talk about "false positive" and "false negative" (for a data race detector) we need to agree about "what a data race is". (The hope, of course, is that the LKMM will have a say soon here ...) 
Andrea > > > (in every possible definition of "data > > race" and "read sections" I can think of at the moment ;D): > > > > P0 P1 > > read_seqbegin() x = 1; > > r0 = x; > > read_seqretry() // =0 > > > > ought to be "racy"..., right? (I didn't audit all the callsites for > > read_{seqbegin,seqretry}(), but I wouldn't be surprised to find such > > pattern ;D ... "legacy", as you recalled). > > > > Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-10 12:46 ` Andrea Parri @ 2019-01-10 13:25 ` Dmitry Vyukov 0 siblings, 0 replies; 28+ messages in thread From: Dmitry Vyukov @ 2019-01-10 13:25 UTC (permalink / raw) To: Andrea Parri Cc: Paul E. McKenney, Anatol Pomozov, Florian Westphal, LKML, Andrey Konovalov, Alan Stern, Luc Maranget, Will Deacon, Peter Zijlstra On Thu, Jan 10, 2019 at 1:47 PM Andrea Parri <andrea.parri@amarulasolutions.com> wrote: > > On Thu, Jan 10, 2019 at 01:38:11PM +0100, Dmitry Vyukov wrote: > > On Thu, Jan 10, 2019 at 1:30 PM Andrea Parri > > <andrea.parri@amarulasolutions.com> wrote: > > > > > > > For seqcounts we currently simply ignore all accesses within the read > > > > section (thus the requirement to dynamically track read sections). > > > > What does LKMM say about seqlocks? > > > > > > LKMM does not currently model seqlocks, if that's what you're asking; > > > c.f., tools/memory-model/linux-kernel.def for a list of the currently > > > supported synchronization primitives. > > > > > > LKMM has also no notion of "data race", it insists that the code must > > > contain no unmarked accesses; we have been discussing such extensions > > > since at least Dec'17 (we're not quite there!, as mentioned by Paul). > > > > How does it call cases that do contain unmarked accesses then? :) > > "work-in-progress" ;), or "limitation" (see tools/memory-model/README) Let's call it /data race/ interim then :) Which also have literally undefined behavior in LKMM. Which now precisely matches the implementation language (C) definitions. Which is nice. > > > My opinion is that ignoring all accesses within a given read section > > > _can_ lead to false negatives > > > > Absolutely. But this is a deliberate decision. > > For our tools we consider priority 1: no false positives. Period. > > Priority 2: also report some true positives in best effort manner. > > This sound reasonable to me. 
But please don't overlook the fact that > to be able to talk about "false positive" and "false negative" (for a > data race detector) we need to agree about "what a data race is". Having a formal model would be undoubtedly good. But in practice things are much simpler. The complex cases that the majority of LKMM deals with are <<1% of kernel concurrency. The majority of kernel cases are "no concurrent accesses at all", "always protected by a mutex", "passed as argument to a new thread", "the canonical store-release/load-acquire synchronization". For these I hope there is no controversy across C, POSIX, gcc, clang, kernel. Handling these cases in a race detector brings 99.9% of benefit. And for more complex cases (like seqlock) we can always approximate as "no races there", which inevitably satisfies our priorities (if you report nothing, you don't report false positives). But I am much more concerned about actual kernel code and behavior wrt a memory model. We are talking about interaction between LKMM <-> KTSAN, while a way more important question is LKMM <-> actual kernel behavior. KTSAN is really a secondary thing in this picture. So if anything needs a memory model, or needs to be blocked on a memory model, that's writing kernel code ;) > (The hope, of course, is that the LKMM will have a say soon here ...) > > Andrea > > > > > > > (in every possible definition of "data > > > race" and "read sections" I can think of at the moment ;D): > > > > > > P0 P1 > > > read_seqbegin() x = 1; > > > r0 = x; > > > read_seqretry() // =0 > > > > > > ought to be "racy"..., right? (I didn't audit all the callsites for > > > read_{seqbegin,seqretry}(), but I wouldn't be surprised to find such > > > pattern ;D ... "legacy", as you recalled). > > > > > > Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-10 12:38 ` Dmitry Vyukov 2019-01-10 12:46 ` Andrea Parri @ 2019-01-10 14:50 ` Paul E. McKenney 1 sibling, 0 replies; 28+ messages in thread From: Paul E. McKenney @ 2019-01-10 14:50 UTC (permalink / raw) To: Dmitry Vyukov Cc: Andrea Parri, Anatol Pomozov, Florian Westphal, LKML, Andrey Konovalov, Alan Stern, Luc Maranget, Will Deacon, Peter Zijlstra On Thu, Jan 10, 2019 at 01:38:11PM +0100, Dmitry Vyukov wrote: > On Thu, Jan 10, 2019 at 1:30 PM Andrea Parri > <andrea.parri@amarulasolutions.com> wrote: > > > > > For seqcounts we currently simply ignore all accesses within the read > > > section (thus the requirement to dynamically track read sections). > > > What does LKMM say about seqlocks? > > > > LKMM does not currently model seqlocks, if that's what you're asking; > > c.f., tools/memory-model/linux-kernel.def for a list of the currently > > supported synchronization primitives. > > > > LKMM has also no notion of "data race", it insists that the code must > > contain no unmarked accesses; we have been discussing such extensions > > since at least Dec'17 (we're not quite there!, as mentioned by Paul). > > How does it call cases that do contain unmarked accesses then? :) > > > My opinion is that ignoring all accesses within a given read section > > _can_ lead to false negatives > > Absolutely. But this is a deliberate decision. > For our tools we consider priority 1: no false positives. Period. > Priority 2: also report some true positives in best effort manner. > > > (in every possible definition of "data > > race" and "read sections" I can think of at the moment ;D): > > > > P0 P1 > > read_seqbegin() x = 1; > > r0 = x; > > read_seqretry() // =0 > > > > ought to be "racy"..., right? (I didn't audit all the callsites for > > read_{seqbegin,seqretry}(), but I wouldn't be surprised to find such > > pattern ;D ... "legacy", as you recalled). 
One approach would be to forgive data races in the seqlock read-side critical section only if:

o There was a later matching read_seqretry() that returned true, and

o There were no dereferences of any data-racy load.

(Yeah, this one should be good clean fun to model!)

Do people nest read_seqbegin(), and if so, what does that mean? ;-)

Thanx, Paul ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-08 19:33 seqcount usage in xt_replace_table() Anatol Pomozov 2019-01-08 22:37 ` Florian Westphal 2019-01-09 0:02 ` Andrea Parri @ 2019-01-10 12:44 ` Peter Zijlstra 2019-01-10 12:54 ` Dmitry Vyukov 2 siblings, 1 reply; 28+ messages in thread From: Peter Zijlstra @ 2019-01-10 12:44 UTC (permalink / raw) To: Anatol Pomozov; +Cc: fw, Dmitry Vyukov, paulmck, LKML On Tue, Jan 08, 2019 at 11:33:39AM -0800, Anatol Pomozov wrote: > Hello folks, > > A bit of context what I am doing. I am trying to port KTSAN (Kernel > Thread Sanitizer) tool to v4.20. That tool tracks shared data usage > and makes sure it is accessed in a thread-safe manner. > > seqlock is a synchronization primitive used by Linux kernel. KTSAN > annotates read_seqbegin()/read_seqretry() and tracks what data been > accessed in its critical section. > > During KTSAN port I found and interesting seqcount usage introduced in > commit 80055dab5de0c8677bc148c4717ddfc753a9148e > > If I read this commit correctly xt_replace_table() does not use > seqlock in a canonical way to specify a critical section. Instead the > code reads the counter and waits until it gets to a specific value. (gets away from) > Now I want KTSAN to play with this code nicely. I need to tell KTSAN > something like "this raw_read_seqcount() does not start a critical > section, just ignore it". So temporary I introduced > raw_read_seqcount_nocritical() function that is ignored by KTSAN. Is > it a good solution? This code is special enough to just do: READ_ONCE(->sequence) and be done with it. It doesn't need the smp_rmb() or anything else. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: seqcount usage in xt_replace_table() 2019-01-10 12:44 ` Peter Zijlstra @ 2019-01-10 12:54 ` Dmitry Vyukov 0 siblings, 0 replies; 28+ messages in thread From: Dmitry Vyukov @ 2019-01-10 12:54 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Anatol Pomozov, Florian Westphal, Paul E. McKenney, LKML On Thu, Jan 10, 2019 at 1:44 PM Peter Zijlstra <peterz@infradead.org> wrote: > > On Tue, Jan 08, 2019 at 11:33:39AM -0800, Anatol Pomozov wrote: > > Hello folks, > > > > A bit of context what I am doing. I am trying to port KTSAN (Kernel > > Thread Sanitizer) tool to v4.20. That tool tracks shared data usage > > and makes sure it is accessed in a thread-safe manner. > > > > seqlock is a synchronization primitive used by Linux kernel. KTSAN > > annotates read_seqbegin()/read_seqretry() and tracks what data been > > accessed in its critical section. > > > > During KTSAN port I found and interesting seqcount usage introduced in > > commit 80055dab5de0c8677bc148c4717ddfc753a9148e > > > > If I read this commit correctly xt_replace_table() does not use > > seqlock in a canonical way to specify a critical section. Instead the > > code reads the counter and waits until it gets to a specific value. > > (gets away from) > > > Now I want KTSAN to play with this code nicely. I need to tell KTSAN > > something like "this raw_read_seqcount() does not start a critical > > section, just ignore it". So temporary I introduced > > raw_read_seqcount_nocritical() function that is ignored by KTSAN. Is > > it a good solution? > > This code is special enough to just do: READ_ONCE(->sequence) and be > done with it. It doesn't need the smp_rmb() or anything else. Sounds good to me. From KTSAN perspective it then should work without any additional dance, it's always good when code works as-is. ^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2019-01-11 14:08 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-08 19:33 seqcount usage in xt_replace_table() Anatol Pomozov
2019-01-08 22:37 ` Florian Westphal
2019-01-10 12:41 ` Peter Zijlstra
2019-01-10 12:53 ` Dmitry Vyukov
2019-01-10 20:18 ` Peter Zijlstra
2019-01-10 14:48 ` Florian Westphal
2019-01-10 20:20 ` Peter Zijlstra
2019-01-10 20:25 ` Peter Zijlstra
2019-01-10 22:29 ` Florian Westphal
2019-01-11 8:34 ` Peter Zijlstra
2019-01-11 14:08 ` Paul E. McKenney
2019-01-10 14:52 ` Paul E. McKenney
2019-01-09 0:02 ` Andrea Parri
2019-01-09 0:36 ` Anatol Pomozov
2019-01-09 5:35 ` Dmitry Vyukov
2019-01-09 11:24 ` Andrea Parri
2019-01-09 11:55 ` Dmitry Vyukov
2019-01-09 12:11 ` Andrea Parri
2019-01-09 12:29 ` Dmitry Vyukov
2019-01-09 17:10 ` Paul E. McKenney
2019-01-10 8:49 ` Dmitry Vyukov
2019-01-10 12:30 ` Andrea Parri
2019-01-10 12:38 ` Dmitry Vyukov
2019-01-10 12:46 ` Andrea Parri
2019-01-10 13:25 ` Dmitry Vyukov
2019-01-10 14:50 ` Paul E. McKenney
2019-01-10 12:44 ` Peter Zijlstra
2019-01-10 12:54 ` Dmitry Vyukov