* [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks
@ 2020-05-19 21:45 Ahmed S. Darwish
2020-05-19 21:45 ` [PATCH v1 04/25] block: nr_sects_write(): Disable preemption on seqcount write Ahmed S. Darwish
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Will Deacon
Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
Steven Rostedt, LKML, Ahmed S. Darwish, David S. Miller,
Andrew Morton, Jens Axboe, Jonathan Corbet, Alexander Viro,
David Airlie, Daniel Vetter, netdev, linux-mm, linux-block,
dri-devel, linux-fsdevel, linux-doc
Hi,
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. If the serialization primitive
does not implicitly disable preemption, preemption has to be explicitly
disabled before entering the write side critical section.
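To make the writer-side requirement concrete, here is a minimal userspace
sketch of the sequence counter protocol, written with C11 atomics. It is
not the kernel implementation: every toy_* name is hypothetical, and the
memory ordering is one common way to express the pattern in portable C.
The counter is odd for as long as a write is in progress, so a reader
spins until the writer finishes. That is why a writer that gets preempted
in the middle of its critical section can stall readers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a sequence counter. */
struct toy_seqcount {
	atomic_uint sequence;	/* odd while a write is in progress */
};

/* A 64-bit value stored as two halves that must be read consistently. */
struct toy_sectors {
	struct toy_seqcount seq;
	_Atomic uint32_t lo;
	_Atomic uint32_t hi;
};

/* Writer side. The caller must already serialize against other writers. */
static void toy_write_begin(struct toy_seqcount *s)
{
	unsigned int v = atomic_load_explicit(&s->sequence, memory_order_relaxed);

	atomic_store_explicit(&s->sequence, v + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
}

static void toy_write_end(struct toy_seqcount *s)
{
	unsigned int v = atomic_load_explicit(&s->sequence, memory_order_relaxed);

	atomic_store_explicit(&s->sequence, v + 1, memory_order_release);
}

/* Reader side: wait for an even count, then remember it. */
static unsigned int toy_read_begin(struct toy_seqcount *s)
{
	unsigned int v;

	while ((v = atomic_load_explicit(&s->sequence, memory_order_acquire)) & 1)
		;	/* a write is in progress: spin */
	return v;
}

/* Returns true if a writer ran since toy_read_begin() returned start. */
static bool toy_read_retry(struct toy_seqcount *s, unsigned int start)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&s->sequence, memory_order_relaxed) != start;
}

static void toy_sectors_write(struct toy_sectors *p, uint64_t size)
{
	toy_write_begin(&p->seq);
	atomic_store_explicit(&p->lo, (uint32_t)size, memory_order_relaxed);
	atomic_store_explicit(&p->hi, (uint32_t)(size >> 32), memory_order_relaxed);
	toy_write_end(&p->seq);
}

static uint64_t toy_sectors_read(struct toy_sectors *p)
{
	unsigned int start;
	uint32_t lo, hi;

	do {
		start = toy_read_begin(&p->seq);
		lo = atomic_load_explicit(&p->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&p->hi, memory_order_relaxed);
	} while (toy_read_retry(&p->seq, start));

	return ((uint64_t)hi << 32) | lo;
}
```

The sketch does not serialize writers itself; as in the kernel API, that
is left to the caller, which is exactly the gap the lock association in
this series is meant to make checkable.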
There is no built-in debugging mechanism to verify that the lock used
for writer serialization is held and preemption is disabled. Some usage
sites like dma-buf have explicit lockdep checks for the writer-side
lock, but this covers only a small portion of the sequence counter usage
in the kernel.
Add new sequence counter types which allow a lock to be associated with
the sequence counter at initialization time. The seqcount API functions are
extended to provide appropriate lockdep assertions depending on the
seqcount/lock type.
For sequence counters with associated locks that do not implicitly
disable preemption, preemption protection is enforced in the sequence
counter write side functions. This removes the need to explicitly add
preempt_disable/enable() around the write side critical sections: the
write_begin/end() functions for these new sequence counter types
automatically do this.
Extend the lockdep API with a macro asserting that preemption is
disabled. Use it to verify that preemption is disabled for all sequence
counter write side critical sections.
If lockdep is disabled, these lock associations and non-preemptibility
checks are compiled out and have neither storage size nor runtime
overhead. If lockdep is enabled, a pointer to the lock is stored in the
seqcount and the write side API functions enable lockdep assertions.
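The compile-out behaviour described above can be sketched in plain C as
follows. This is only an illustration under stated assumptions, not the
layout used by the series: TOY_DEBUG_LOCKS stands in for the lockdep
configuration option, and toy_lock with its "held" flag is a hypothetical
placeholder for a real lock plus its lockdep state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy lock that simply records whether it is held. */
struct toy_lock {
	bool held;
};

struct toy_seqcount {
	unsigned int sequence;
};

/* Hypothetical counterpart of a seqcount with an associated lock. */
struct toy_seqcount_locked {
	struct toy_seqcount seqcount;
#ifdef TOY_DEBUG_LOCKS
	struct toy_lock *lock;	/* only stored in debug builds */
#endif
};

#ifdef TOY_DEBUG_LOCKS
#define toy_seqcount_locked_init(s, l) \
	do { (s)->seqcount.sequence = 0; (s)->lock = (l); } while (0)
#define toy_assert_lock_held(s)	assert((s)->lock->held)
#else
/* Without debugging, the association costs neither storage nor runtime. */
#define toy_seqcount_locked_init(s, l) \
	do { (s)->seqcount.sequence = 0; (void)(l); } while (0)
#define toy_assert_lock_held(s)	((void)(s))
#endif

static void toy_write_begin(struct toy_seqcount_locked *s)
{
	toy_assert_lock_held(s);	/* checked only in debug builds */
	s->seqcount.sequence++;
}

static void toy_write_end(struct toy_seqcount_locked *s)
{
	s->seqcount.sequence++;
}
```

Built without -DTOY_DEBUG_LOCKS, struct toy_seqcount_locked has the same
size as the plain counter and the assertion expands to nothing. Built with
it, a write_begin call made without the lock held trips the assertion.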
The following seqcount types with associated locks are introduced:
     seqcount_spinlock_t
     seqcount_raw_spinlock_t
     seqcount_rwlock_t
     seqcount_mutex_t
     seqcount_ww_mutex_t
This lock association is not only useful for debugging purposes, it also
provides a mechanism for PREEMPT_RT to prevent writer starvation. On RT
kernels spinlocks and rwlocks are substituted with sleeping locks and
the code sections protected by these locks become preemptible, which has
the same problem as a write side critical section with preemption enabled
on a non-RT kernel. RT utilizes this association by storing the provided
lock pointer: if a reader sees an active writer (seqcount is odd), it
does not spin, but blocks on the associated lock, similar to
read_seqbegin_or_lock().
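The reader behaviour described for RT can be approximated in userspace
like this. It is a rough sketch only: a pthread mutex plays the role of
the sleeping lock, all toy_* names are hypothetical, and the actual RT
implementation may differ in its details. The point is that a reader
which sees an odd count acquires and immediately releases the writer's
lock, so it sleeps until the writer is done instead of spinning.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical sequence counter that remembers its writers' lock. */
struct toy_seqcount_mutex {
	atomic_uint sequence;
	pthread_mutex_t *lock;	/* lock that serializes the writers */
};

/* Writer side: the caller must hold *s->lock around begin/end. */
static void toy_write_begin(struct toy_seqcount_mutex *s)
{
	atomic_fetch_add_explicit(&s->sequence, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
}

static void toy_write_end(struct toy_seqcount_mutex *s)
{
	atomic_fetch_add_explicit(&s->sequence, 1, memory_order_release);
}

/* Reader side: block on the writers' lock instead of spinning. */
static unsigned int toy_read_begin_blocking(struct toy_seqcount_mutex *s)
{
	unsigned int seq;

	seq = atomic_load_explicit(&s->sequence, memory_order_acquire);
	while (seq & 1) {
		/*
		 * A writer is active and holds the lock. Taking and
		 * dropping the lock puts this reader to sleep until the
		 * writer has left its critical section.
		 */
		pthread_mutex_lock(s->lock);
		pthread_mutex_unlock(s->lock);
		seq = atomic_load_explicit(&s->sequence, memory_order_acquire);
	}
	return seq;
}
```

Because the reader sleeps on the same lock the writer holds, a
higher-priority reader cannot keep a preempted writer off the CPU
indefinitely, which is the starvation scenario the cover letter describes.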
By using the lockdep debugging mechanisms added in this patch series, a
number of erroneous seqcount call-sites were discovered across the
kernel. The fixes are included at the beginning of the series.
Thanks,
8<--------------
Ahmed S. Darwish (25):
net: core: device_rename: Use rwsem instead of a seqcount
mm/swap: Don't abuse the seqcount latching API
net: phy: fixed_phy: Remove unused seqcount
block: nr_sects_write(): Disable preemption on seqcount write
u64_stats: Document writer non-preemptibility requirement
dma-buf: Remove custom seqcount lockdep class key
lockdep: Add preemption disabled assertion API
seqlock: lockdep assert non-preemptibility on seqcount_t write
Documentation: locking: Describe seqlock design and usage
seqlock: Add RST directives to kernel-doc code samples and notes
seqlock: Add missing kernel-doc annotations
seqlock: Extend seqcount API with associated locks
dma-buf: Use sequence counter with associated wound/wait mutex
sched: tasks: Use sequence counter with associated spinlock
netfilter: conntrack: Use sequence counter with associated spinlock
netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
xfrm: policy: Use sequence counters with associated lock
timekeeping: Use sequence counter with associated raw spinlock
vfs: Use sequence counter with associated spinlock
raid5: Use sequence counter with associated spinlock
iocost: Use sequence counter with associated spinlock
NFSv4: Use sequence counter with associated spinlock
userfaultfd: Use sequence counter with associated spinlock
kvm/eventfd: Use sequence counter with associated spinlock
hrtimer: Use sequence counter with associated raw spinlock
Documentation/locking/index.rst | 1 +
Documentation/locking/seqlock.rst | 239 +++++
MAINTAINERS | 2 +-
block/blk-iocost.c | 5 +-
block/blk.h | 2 +
drivers/dma-buf/dma-resv.c | 15 +-
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 -
drivers/md/raid5.c | 2 +-
drivers/md/raid5.h | 2 +-
drivers/net/phy/fixed_phy.c | 25 +-
fs/dcache.c | 2 +-
fs/fs_struct.c | 4 +-
fs/nfs/nfs4_fs.h | 2 +-
fs/nfs/nfs4state.c | 2 +-
fs/userfaultfd.c | 4 +-
include/linux/dcache.h | 2 +-
include/linux/dma-resv.h | 4 +-
include/linux/fs_struct.h | 2 +-
include/linux/hrtimer.h | 2 +-
include/linux/kvm_irqfd.h | 2 +-
include/linux/lockdep.h | 9 +
include/linux/sched.h | 2 +-
include/linux/seqlock.h | 882 +++++++++++++++---
include/linux/seqlock_types_internal.h | 187 ++++
include/linux/u64_stats_sync.h | 38 +-
include/net/netfilter/nf_conntrack.h | 2 +-
init/init_task.c | 3 +-
kernel/fork.c | 2 +-
kernel/locking/lockdep.c | 15 +
kernel/time/hrtimer.c | 13 +-
kernel/time/timekeeping.c | 19 +-
lib/Kconfig.debug | 1 +
mm/swap.c | 57 +-
net/core/dev.c | 30 +-
net/netfilter/nf_conntrack_core.c | 5 +-
net/netfilter/nft_set_rbtree.c | 4 +-
net/xfrm/xfrm_policy.c | 10 +-
virt/kvm/eventfd.c | 2 +-
38 files changed, 1325 insertions(+), 277 deletions(-)
create mode 100644 Documentation/locking/seqlock.rst
create mode 100644 include/linux/seqlock_types_internal.h
base-commit: 2ef96a5bb12be62ef75b5828c0aab838ebb29cb8
--
2.20.1
* [PATCH v1 04/25] block: nr_sects_write(): Disable preemption on seqcount write
2020-05-19 21:45 [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
@ 2020-05-19 21:45 ` Ahmed S. Darwish
2020-05-22 16:39 ` Peter Zijlstra
[not found] ` <20200522001237.A00E8206BE@mail.kernel.org>
2020-05-19 21:45 ` [PATCH v1 21/25] iocost: Use sequence counter with associated spinlock Ahmed S. Darwish
` (2 subsequent siblings)
3 siblings, 2 replies; 11+ messages in thread
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Will Deacon
Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
Steven Rostedt, LKML, Ahmed S. Darwish, Jens Axboe, Phillip Susi,
Vivek Goyal, linux-block
For optimized block readers not holding a mutex, the "number of sectors"
64-bit value is protected from tearing on 32-bit architectures by a
sequence counter.
Disable preemption before entering that sequence counter's write side
critical section. Otherwise, the read side can preempt the write side
section and spin for the entire scheduler tick. If the reader belongs to
a real-time scheduling class, it can spin forever and the kernel will
livelock.
Fixes: c83f6bf98dc1 ("block: add partition resize function to blkpg ioctl")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
block/blk.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/block/blk.h b/block/blk.h
index 0a94ec68af32..151f86932547 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -470,9 +470,11 @@ static inline sector_t part_nr_sects_read(struct hd_struct *part)
 static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
 {
 #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
+	preempt_disable();
 	write_seqcount_begin(&part->nr_sects_seq);
 	part->nr_sects = size;
 	write_seqcount_end(&part->nr_sects_seq);
+	preempt_enable();
 #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION)
 	preempt_disable();
 	part->nr_sects = size;
--
2.20.1
* [PATCH v1 21/25] iocost: Use sequence counter with associated spinlock
2020-05-19 21:45 [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
2020-05-19 21:45 ` [PATCH v1 04/25] block: nr_sects_write(): Disable preemption on seqcount write Ahmed S. Darwish
@ 2020-05-19 21:45 ` Ahmed S. Darwish
2020-06-08 0:57 ` [PATCH v2 00/18] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
2020-06-30 5:44 ` [PATCH v3 00/20] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
3 siblings, 0 replies; 11+ messages in thread
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Will Deacon
Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
Steven Rostedt, LKML, Ahmed S. Darwish, Jens Axboe, linux-block
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.
Use the new seqcount_spinlock_t data type, which allows associating a
spinlock with the sequence counter. This enables lockdep to verify that
the spinlock used for writer serialization is held when the write side
critical section is entered.
If lockdep is disabled, this lock association is compiled out and has
neither storage size nor runtime overhead.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
block/blk-iocost.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 7c1fe605d0d6..8029a9e8fa55 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -405,7 +405,7 @@ struct ioc {
 	enum ioc_running running;
 	atomic64_t vtime_rate;
-	seqcount_t period_seqcount;
+	seqcount_spinlock_t period_seqcount;
 	u32 period_at; /* wallclock starttime */
 	u64 period_at_vtime; /* vtime starttime */
@@ -872,7 +872,6 @@ static void ioc_now(struct ioc *ioc, struct ioc_now *now)
 static void ioc_start_period(struct ioc *ioc, struct ioc_now *now)
 {
-	lockdep_assert_held(&ioc->lock);
 	WARN_ON_ONCE(ioc->running != IOC_RUNNING);
 	write_seqcount_begin(&ioc->period_seqcount);
@@ -1958,7 +1957,7 @@ static int blk_iocost_init(struct request_queue *q)
 	ioc->running = IOC_IDLE;
 	atomic64_set(&ioc->vtime_rate, VTIME_PER_USEC);
-	seqcount_init(&ioc->period_seqcount);
+	seqcount_spinlock_init(&ioc->period_seqcount, &ioc->lock);
 	ioc->period_at = ktime_to_us(ktime_get());
 	atomic64_set(&ioc->cur_period, 0);
 	atomic_set(&ioc->hweight_gen, 0);
--
2.20.1
* Re: [PATCH v1 04/25] block: nr_sects_write(): Disable preemption on seqcount write
2020-05-19 21:45 ` [PATCH v1 04/25] block: nr_sects_write(): Disable preemption on seqcount write Ahmed S. Darwish
@ 2020-05-22 16:39 ` Peter Zijlstra
2020-05-25 9:56 ` Ahmed S. Darwish
[not found] ` <20200522001237.A00E8206BE@mail.kernel.org>
1 sibling, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-22 16:39 UTC (permalink / raw)
To: Ahmed S. Darwish
Cc: Ingo Molnar, Will Deacon, Thomas Gleixner, Paul E. McKenney,
Sebastian A. Siewior, Steven Rostedt, LKML, Jens Axboe,
Phillip Susi, Vivek Goyal, linux-block
On Tue, May 19, 2020 at 11:45:26PM +0200, Ahmed S. Darwish wrote:
> For optimized block readers not holding a mutex, the "number of sectors"
> 64-bit value is protected from tearing on 32-bit architectures by a
> sequence counter.
>
> Disable preemption before entering that sequence counter's write side
> critical section. Otherwise, the read side can preempt the write side
> section and spin for the entire scheduler tick. If the reader belongs to
> a real-time scheduling class, it can spin forever and the kernel will
> livelock.
>
> Fixes: c83f6bf98dc1 ("block: add partition resize function to blkpg ioctl")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> block/blk.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/block/blk.h b/block/blk.h
> index 0a94ec68af32..151f86932547 100644
> --- a/block/blk.h
> +++ b/block/blk.h
> @@ -470,9 +470,11 @@ static inline sector_t part_nr_sects_read(struct hd_struct *part)
>  static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
>  {
>  #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
> +	preempt_disable();
>  	write_seqcount_begin(&part->nr_sects_seq);
>  	part->nr_sects = size;
>  	write_seqcount_end(&part->nr_sects_seq);
> +	preempt_enable();
>  #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION)
>  	preempt_disable();
>  	part->nr_sects = size;
This does look like something that include/linux/u64_stats_sync.h could
help with.
* Re: [PATCH v1 04/25] block: nr_sects_write(): Disable preemption on seqcount write
2020-05-22 16:39 ` Peter Zijlstra
@ 2020-05-25 9:56 ` Ahmed S. Darwish
0 siblings, 0 replies; 11+ messages in thread
From: Ahmed S. Darwish @ 2020-05-25 9:56 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, Will Deacon, Thomas Gleixner, Paul E. McKenney,
Sebastian A. Siewior, Steven Rostedt, LKML, Jens Axboe,
Phillip Susi, Vivek Goyal, linux-block
Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, May 19, 2020 at 11:45:26PM +0200, Ahmed S. Darwish wrote:
> > For optimized block readers not holding a mutex, the "number of sectors"
> > 64-bit value is protected from tearing on 32-bit architectures by a
> > sequence counter.
> >
> > Disable preemption before entering that sequence counter's write side
> > critical section. Otherwise, the read side can preempt the write side
> > section and spin for the entire scheduler tick. If the reader belongs to
> > a real-time scheduling class, it can spin forever and the kernel will
> > livelock.
> >
> > Fixes: c83f6bf98dc1 ("block: add partition resize function to blkpg ioctl")
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
> > Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > ---
> > block/blk.h | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/block/blk.h b/block/blk.h
> > index 0a94ec68af32..151f86932547 100644
> > --- a/block/blk.h
> > +++ b/block/blk.h
> > @@ -470,9 +470,11 @@ static inline sector_t part_nr_sects_read(struct hd_struct *part)
> >  static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
> >  {
> >  #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
> > +	preempt_disable();
> >  	write_seqcount_begin(&part->nr_sects_seq);
> >  	part->nr_sects = size;
> >  	write_seqcount_end(&part->nr_sects_seq);
> > +	preempt_enable();
> >  #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION)
> >  	preempt_disable();
> >  	part->nr_sects = size;
>
> This does look like something that include/linux/u64_stats_sync.h could
> help with.
Correct.
I just felt, though, that this would be too much for a 'Cc: stable' patch.
In another (in-progress) seqlock.h patch series, all of the seqcount_t
call sites that are used for 64-bit value tearing protection on 32-bit
kernels are transformed to the u64_stats_sync.h API.
Thanks,
--
Ahmed S. Darwish
Linutronix GmbH
* Re: [PATCH v1 04/25] block: nr_sects_write(): Disable preemption on seqcount write
[not found] ` <20200522001237.A00E8206BE@mail.kernel.org>
@ 2020-05-25 10:12 ` Ahmed S. Darwish
0 siblings, 0 replies; 11+ messages in thread
From: Ahmed S. Darwish @ 2020-05-25 10:12 UTC (permalink / raw)
To: Sasha Levin
Cc: Peter Zijlstra, Thomas Gleixner, Sebastian A. Siewior, stable,
Jens Axboe, Christoph Hellwig, linux-block, LKML
Sasha Levin <sashal@kernel.org> wrote:
> Hi
>
> [This is an automated email]
>
> This commit has been processed because it contains a "Fixes:" tag
> fixing commit: c83f6bf98dc1 ("block: add partition resize function to blkpg ioctl").
>
> The bot has tested the following trees: v5.6.13, v5.4.41, v4.19.123, v4.14.180, v4.9.223, v4.4.223.
>
> v5.6.13: Failed to apply! Possible dependencies:
...
> v5.4.41: Failed to apply! Possible dependencies:
...
> v4.19.123: Failed to apply! Possible dependencies:
...
> v4.14.180: Failed to apply! Possible dependencies:
...
> v4.9.223: Failed to apply! Possible dependencies:
...
> v4.4.223: Failed to apply! Possible dependencies:
...
>
> NOTE: The patch will not be queued to stable trees until it is upstream.
>
> How should we proceed with this patch?
>
The v5.7-rc1 commit 581e26004a09 ("block: move block layer internals out
of include/linux/genhd.h") moved the part_nr_sects_write() static inline
function from include/linux/genhd.h to block/blk.h.
After review, I'll send a rebased patch to stable.
Thanks,
--
Ahmed S. Darwish
Linutronix GmbH
* [PATCH v2 00/18] seqlock: Extend seqcount API with associated locks
2020-05-19 21:45 [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
2020-05-19 21:45 ` [PATCH v1 04/25] block: nr_sects_write(): Disable preemption on seqcount write Ahmed S. Darwish
2020-05-19 21:45 ` [PATCH v1 21/25] iocost: Use sequence counter with associated spinlock Ahmed S. Darwish
@ 2020-06-08 0:57 ` Ahmed S. Darwish
2020-06-08 0:57 ` [PATCH v2 14/18] iocost: Use sequence counter with associated spinlock Ahmed S. Darwish
2020-06-30 5:44 ` [PATCH v3 00/20] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
3 siblings, 1 reply; 11+ messages in thread
From: Ahmed S. Darwish @ 2020-06-08 0:57 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Will Deacon
Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
Steven Rostedt, LKML, Ahmed S. Darwish, Jonathan Corbet,
linux-doc, David Airlie, Daniel Vetter, dri-devel,
David S. Miller, netdev, Jens Axboe, linux-block, Alexander Viro,
linux-fsdevel
Hi,
This is v2 of the seqlock patch series:
[PATCH v1 00/25] seqlock: Extend seqcount API with associated locks
https://lore.kernel.org/lkml/20200519214547.352050-1-a.darwish@linutronix.de
Patches 1=>3 of this v2 series add documentation for the existing
seqlock.h datatypes and APIs. Hopefully they can hit v5.8 -rc2 or -rc3.
Changelog-v2
============
1. Drop, for now, the seqlock v1 patches #7 and #8. These patches added
lockdep non-preemptibility checks to seqcount_t write paths, but they
now depend on on-going work by Peter:
[PATCH v3 0/5] lockdep: Change IRQ state tracking to use per-cpu variables
https://lkml.kernel.org/r/20200529213550.683440625@infradead.org
[PATCH 00/14] x86/entry: disallow #DB more and x86/entry lockdep/nmi
https://lkml.kernel.org/r/20200529212728.795169701@infradead.org
Once Peter's work gets merged, I'll send the non-preemptibility checks as
a separate series.
2. Drop the v1 seqcount_t call-sites bugfixes. I've already posted them
in an isolated series. They got merged into their respective trees, and
will hit v5.8-rc1 soon:
[PATCH v2 0/6] seqlock: seqcount_t call sites bugfixes
https://lore.kernel.org/lkml/20200603144949.1122421-1-a.darwish@linutronix.de
3. Patch #1: Add a small paragraph explaining that seqcount_t/seqlock_t
cannot be used if the protected data contains pointers. A similar
paragraph already existed in seqlock.h, but got mistakenly dropped.
4. Patch #2: Don't add RST directives inside kernel-doc comments. Peter
doesn't like them :) I've kept the indentation though, and found a
minimal way for Sphinx to properly render these code samples without too
much disruption.
5. Patch #3: Brush up the introduced kernel-doc comments. Make them more
consistent overall, and more concise.
Thanks,
8<--------------
Ahmed S. Darwish (18):
Documentation: locking: Describe seqlock design and usage
seqlock: Properly format kernel-doc code samples
seqlock: Add missing kernel-doc annotations
seqlock: Extend seqcount API with associated locks
dma-buf: Remove custom seqcount lockdep class key
dma-buf: Use sequence counter with associated wound/wait mutex
sched: tasks: Use sequence counter with associated spinlock
netfilter: conntrack: Use sequence counter with associated spinlock
netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
xfrm: policy: Use sequence counters with associated lock
timekeeping: Use sequence counter with associated raw spinlock
vfs: Use sequence counter with associated spinlock
raid5: Use sequence counter with associated spinlock
iocost: Use sequence counter with associated spinlock
NFSv4: Use sequence counter with associated spinlock
userfaultfd: Use sequence counter with associated spinlock
kvm/eventfd: Use sequence counter with associated spinlock
hrtimer: Use sequence counter with associated raw spinlock
Documentation/locking/index.rst | 1 +
Documentation/locking/seqlock.rst | 242 +++++
MAINTAINERS | 2 +-
block/blk-iocost.c | 5 +-
drivers/dma-buf/dma-resv.c | 15 +-
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 -
drivers/md/raid5.c | 2 +-
drivers/md/raid5.h | 2 +-
fs/dcache.c | 2 +-
fs/fs_struct.c | 4 +-
fs/nfs/nfs4_fs.h | 2 +-
fs/nfs/nfs4state.c | 2 +-
fs/userfaultfd.c | 4 +-
include/linux/dcache.h | 2 +-
include/linux/dma-resv.h | 4 +-
include/linux/fs_struct.h | 2 +-
include/linux/hrtimer.h | 2 +-
include/linux/kvm_irqfd.h | 2 +-
include/linux/sched.h | 2 +-
include/linux/seqlock.h | 855 ++++++++++++++----
include/linux/seqlock_types_internal.h | 187 ++++
include/net/netfilter/nf_conntrack.h | 2 +-
init/init_task.c | 3 +-
kernel/fork.c | 2 +-
kernel/time/hrtimer.c | 13 +-
kernel/time/timekeeping.c | 19 +-
net/netfilter/nf_conntrack_core.c | 5 +-
net/netfilter/nft_set_rbtree.c | 4 +-
net/xfrm/xfrm_policy.c | 10 +-
virt/kvm/eventfd.c | 2 +-
30 files changed, 1175 insertions(+), 226 deletions(-)
create mode 100644 Documentation/locking/seqlock.rst
create mode 100644 include/linux/seqlock_types_internal.h
base-commit: 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162
--
2.20.1
* [PATCH v2 14/18] iocost: Use sequence counter with associated spinlock
2020-06-08 0:57 ` [PATCH v2 00/18] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
@ 2020-06-08 0:57 ` Ahmed S. Darwish
0 siblings, 0 replies; 11+ messages in thread
From: Ahmed S. Darwish @ 2020-06-08 0:57 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Will Deacon
Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
Steven Rostedt, LKML, Ahmed S. Darwish, Jens Axboe, linux-block
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.
Use the new seqcount_spinlock_t data type, which allows associating a
spinlock with the sequence counter. This enables lockdep to verify that
the spinlock used for writer serialization is held when the write side
critical section is entered.
If lockdep is disabled, this lock association is compiled out and has
neither storage size nor runtime overhead.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
block/blk-iocost.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 7c1fe605d0d6..8029a9e8fa55 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -405,7 +405,7 @@ struct ioc {
 	enum ioc_running running;
 	atomic64_t vtime_rate;
-	seqcount_t period_seqcount;
+	seqcount_spinlock_t period_seqcount;
 	u32 period_at; /* wallclock starttime */
 	u64 period_at_vtime; /* vtime starttime */
@@ -872,7 +872,6 @@ static void ioc_now(struct ioc *ioc, struct ioc_now *now)
 static void ioc_start_period(struct ioc *ioc, struct ioc_now *now)
 {
-	lockdep_assert_held(&ioc->lock);
 	WARN_ON_ONCE(ioc->running != IOC_RUNNING);
 	write_seqcount_begin(&ioc->period_seqcount);
@@ -1958,7 +1957,7 @@ static int blk_iocost_init(struct request_queue *q)
 	ioc->running = IOC_IDLE;
 	atomic64_set(&ioc->vtime_rate, VTIME_PER_USEC);
-	seqcount_init(&ioc->period_seqcount);
+	seqcount_spinlock_init(&ioc->period_seqcount, &ioc->lock);
 	ioc->period_at = ktime_to_us(ktime_get());
 	atomic64_set(&ioc->cur_period, 0);
 	atomic_set(&ioc->hweight_gen, 0);
--
2.20.1
* [PATCH v3 00/20] seqlock: Extend seqcount API with associated locks
2020-05-19 21:45 [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
` (2 preceding siblings ...)
2020-06-08 0:57 ` [PATCH v2 00/18] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
@ 2020-06-30 5:44 ` Ahmed S. Darwish
2020-06-30 5:44 ` [PATCH v3 16/20] iocost: Use sequence counter with associated spinlock Ahmed S. Darwish
3 siblings, 1 reply; 11+ messages in thread
From: Ahmed S. Darwish @ 2020-06-30 5:44 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Will Deacon
Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
Steven Rostedt, LKML, Ahmed S. Darwish, Jonathan Corbet,
linux-doc, David Airlie, Daniel Vetter, dri-devel,
David S. Miller, netdev, Jens Axboe, linux-block, Alexander Viro,
linux-fsdevel
Hi,
This is v3 of the seqlock patch series:
[PATCH v1 00/25] seqlock: Extend seqcount API with associated locks
https://lore.kernel.org/lkml/20200519214547.352050-1-a.darwish@linutronix.de
[PATCH v2 00/18]
https://lore.kernel.org/lkml/20200608005729.1874024-1-a.darwish@linutronix.de
It's based on:
git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git locking/core
to get Peter's lockdep irqstate tracking series below, which untangles
mainline seqlock.h<=>sched.h 'current->' task_struct circular dependency:
https://lkml.kernel.org/r/linuxppc-dev/20200623083645.277342609@infradead.org
Changelog-v3:
- Re-add lockdep non-preemptibility checks on seqcount_t write paths.
They were removed from v2 due to the circular dependencies mentioned.
- Slight rebase over the new v5.8-rc1 KCSAN seqlock.h changes
- Collect seqcount_t call-sites acked-by tags
Thanks,
8<--------------
Ahmed S. Darwish (20):
Documentation: locking: Describe seqlock design and usage
seqlock: Properly format kernel-doc code samples
seqlock: Add missing kernel-doc annotations
lockdep: Add preemption enabled/disabled assertion APIs
seqlock: lockdep assert non-preemptibility on seqcount_t write
seqlock: Extend seqcount API with associated locks
dma-buf: Remove custom seqcount lockdep class key
dma-buf: Use sequence counter with associated wound/wait mutex
sched: tasks: Use sequence counter with associated spinlock
netfilter: conntrack: Use sequence counter with associated spinlock
netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
xfrm: policy: Use sequence counters with associated lock
timekeeping: Use sequence counter with associated raw spinlock
vfs: Use sequence counter with associated spinlock
raid5: Use sequence counter with associated spinlock
iocost: Use sequence counter with associated spinlock
NFSv4: Use sequence counter with associated spinlock
userfaultfd: Use sequence counter with associated spinlock
kvm/eventfd: Use sequence counter with associated spinlock
hrtimer: Use sequence counter with associated raw spinlock
Documentation/locking/index.rst | 1 +
Documentation/locking/seqlock.rst | 242 +++++
MAINTAINERS | 2 +-
block/blk-iocost.c | 5 +-
drivers/dma-buf/dma-resv.c | 15 +-
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 -
drivers/md/raid5.c | 2 +-
drivers/md/raid5.h | 2 +-
fs/dcache.c | 2 +-
fs/fs_struct.c | 4 +-
fs/nfs/nfs4_fs.h | 2 +-
fs/nfs/nfs4state.c | 2 +-
fs/userfaultfd.c | 4 +-
include/linux/dcache.h | 2 +-
include/linux/dma-resv.h | 4 +-
include/linux/fs_struct.h | 2 +-
include/linux/hrtimer.h | 2 +-
include/linux/kvm_irqfd.h | 2 +-
include/linux/lockdep.h | 18 +
include/linux/sched.h | 2 +-
include/linux/seqlock.h | 872 ++++++++++++++----
include/linux/seqlock_types_internal.h | 186 ++++
include/net/netfilter/nf_conntrack.h | 2 +-
init/init_task.c | 3 +-
kernel/fork.c | 2 +-
kernel/time/hrtimer.c | 13 +-
kernel/time/timekeeping.c | 19 +-
lib/Kconfig.debug | 1 +
net/netfilter/nf_conntrack_core.c | 5 +-
net/netfilter/nft_set_rbtree.c | 4 +-
net/xfrm/xfrm_policy.c | 10 +-
virt/kvm/eventfd.c | 2 +-
32 files changed, 1211 insertions(+), 225 deletions(-)
create mode 100644 Documentation/locking/seqlock.rst
create mode 100644 include/linux/seqlock_types_internal.h
base-commit: 997e89fa345e9006f311cf9f9c8fd9f7d96c240f
--
2.20.1
* [PATCH v3 16/20] iocost: Use sequence counter with associated spinlock
2020-06-30 5:44 ` [PATCH v3 00/20] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
@ 2020-06-30 5:44 ` Ahmed S. Darwish
2020-06-30 7:11 ` Daniel Wagner
0 siblings, 1 reply; 11+ messages in thread
From: Ahmed S. Darwish @ 2020-06-30 5:44 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Will Deacon
Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
Steven Rostedt, LKML, Ahmed S. Darwish, Jens Axboe, linux-block
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.
Use the new seqcount_spinlock_t data type, which allows associating a
spinlock with the sequence counter. This enables lockdep to verify that
the spinlock used for writer serialization is held when the write side
critical section is entered.
If lockdep is disabled, this lock association is compiled out and has
neither storage size nor runtime overhead.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
block/blk-iocost.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 8ac4aad66ebc..8e940c27c27c 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -406,7 +406,7 @@ struct ioc {
 	enum ioc_running running;
 	atomic64_t vtime_rate;
-	seqcount_t period_seqcount;
+	seqcount_spinlock_t period_seqcount;
 	u32 period_at; /* wallclock starttime */
 	u64 period_at_vtime; /* vtime starttime */
@@ -873,7 +873,6 @@ static void ioc_now(struct ioc *ioc, struct ioc_now *now)
 static void ioc_start_period(struct ioc *ioc, struct ioc_now *now)
 {
-	lockdep_assert_held(&ioc->lock);
 	WARN_ON_ONCE(ioc->running != IOC_RUNNING);
 	write_seqcount_begin(&ioc->period_seqcount);
@@ -2001,7 +2000,7 @@ static int blk_iocost_init(struct request_queue *q)
 	ioc->running = IOC_IDLE;
 	atomic64_set(&ioc->vtime_rate, VTIME_PER_USEC);
-	seqcount_init(&ioc->period_seqcount);
+	seqcount_spinlock_init(&ioc->period_seqcount, &ioc->lock);
 	ioc->period_at = ktime_to_us(ktime_get());
 	atomic64_set(&ioc->cur_period, 0);
 	atomic_set(&ioc->hweight_gen, 0);
--
2.20.1
* Re: [PATCH v3 16/20] iocost: Use sequence counter with associated spinlock
2020-06-30 5:44 ` [PATCH v3 16/20] iocost: Use sequence counter with associated spinlock Ahmed S. Darwish
@ 2020-06-30 7:11 ` Daniel Wagner
0 siblings, 0 replies; 11+ messages in thread
From: Daniel Wagner @ 2020-06-30 7:11 UTC (permalink / raw)
To: Ahmed S. Darwish
Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Thomas Gleixner,
Paul E. McKenney, Sebastian A. Siewior, Steven Rostedt, LKML,
Jens Axboe, linux-block
On Tue, Jun 30, 2020 at 07:44:48AM +0200, Ahmed S. Darwish wrote:
> A sequence counter write side critical section must be protected by some
> form of locking to serialize writers. A plain seqcount_t does not
> contain the information of which lock must be held when entering a write
> side critical section.
>
> Use the new seqcount_spinlock_t data type, which allows to associate a
> spinlock with the sequence counter. This enables lockdep to verify that
> the spinlock used for writer serialization is held when the write side
> critical section is entered.
>
> If lockdep is disabled this lock association is compiled out and has
> neither storage size nor runtime overhead.
>
> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>