* [PATCH 1/5] sched/cputime: Remove symbol exports from IRQ time accounting
2020-12-02 11:57 [PATCH 0/5] irq: Reorder time handling against HARDIRQ_OFFSET on IRQ entry v3 Frederic Weisbecker
@ 2020-12-02 11:57 ` Frederic Weisbecker
2020-12-02 19:23 ` [tip: irq/core] " tip-bot2 for Frederic Weisbecker
2020-12-02 19:28 ` [PATCH 1/5] " Christian Borntraeger
2020-12-02 11:57 ` [PATCH 2/5] s390/vtime: Use the generic IRQ entry accounting Frederic Weisbecker
` (3 subsequent siblings)
4 siblings, 2 replies; 28+ messages in thread
From: Frederic Weisbecker @ 2020-12-02 11:57 UTC (permalink / raw)
To: Thomas Gleixner, Peter Zijlstra
Cc: LKML, Frederic Weisbecker, Tony Luck, Vasily Gorbik,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Christian Borntraeger, Fenghua Yu, Heiko Carstens
account_irq_enter_time() and account_irq_exit_time() are not called
from modules. EXPORT_SYMBOL_GPL() can be safely removed from the IRQ
cputime accounting functions called from there.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
---
arch/s390/kernel/vtime.c | 10 +++++-----
kernel/sched/cputime.c | 2 --
2 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
index 8df10d3c8f6c..f9f2a11958a5 100644
--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -226,7 +226,7 @@ void vtime_flush(struct task_struct *tsk)
* Update process times based on virtual cpu times stored by entry.S
* to the lowcore fields user_timer, system_timer & steal_clock.
*/
-void vtime_account_irq_enter(struct task_struct *tsk)
+void vtime_account_kernel(struct task_struct *tsk)
{
u64 timer;
@@ -245,12 +245,12 @@ void vtime_account_irq_enter(struct task_struct *tsk)
virt_timer_forward(timer);
}
-EXPORT_SYMBOL_GPL(vtime_account_irq_enter);
-
-void vtime_account_kernel(struct task_struct *tsk)
-__attribute__((alias("vtime_account_irq_enter")));
EXPORT_SYMBOL_GPL(vtime_account_kernel);
+void vtime_account_irq_enter(struct task_struct *tsk)
+__attribute__((alias("vtime_account_kernel")));
+
+
/*
* Sorted add to a list. List is linear searched until first bigger
* element is found.
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 5a55d2300452..61ce9f9bf0a3 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -71,7 +71,6 @@ void irqtime_account_irq(struct task_struct *curr)
else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
}
-EXPORT_SYMBOL_GPL(irqtime_account_irq);
static u64 irqtime_tick_accounted(u64 maxtime)
{
@@ -434,7 +433,6 @@ void vtime_account_irq_enter(struct task_struct *tsk)
else
vtime_account_kernel(tsk);
}
-EXPORT_SYMBOL_GPL(vtime_account_irq_enter);
#endif /* __ARCH_HAS_VTIME_ACCOUNT */
void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
--
2.25.1
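A side detail of the reordering above: vtime_account_kernel() becomes the
real definition and vtime_account_irq_enter() the alias, so the surviving
EXPORT_SYMBOL_GPL() attaches to the canonical name. A minimal userspace
sketch of the GCC alias attribute used here; the function names are made
up for illustration and this is not kernel code:

#include <stdio.h>

/* The alias target must be a definition in the same translation
 * unit, and GCC requires it to appear before the alias declaration,
 * which is why the real body has to come first. */
void real_impl(int x)
{
	printf("accounted %d\n", x);
}

/* old_name becomes a second symbol for the very same code. */
void old_name(int x) __attribute__((alias("real_impl")));

int main(void)
{
	real_impl(1);
	old_name(2);	/* runs the same body as real_impl() */
	return 0;
}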
* [tip: irq/core] sched/cputime: Remove symbol exports from IRQ time accounting
2020-12-02 11:57 ` [PATCH 1/5] sched/cputime: Remove symbol exports from IRQ time accounting Frederic Weisbecker
@ 2020-12-02 19:23 ` tip-bot2 for Frederic Weisbecker
2020-12-02 19:28 ` [PATCH 1/5] " Christian Borntraeger
1 sibling, 0 replies; 28+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2020-12-02 19:23 UTC (permalink / raw)
To: linux-tip-commits
Cc: Frederic Weisbecker, Thomas Gleixner, x86, linux-kernel, maz
The following commit has been merged into the irq/core branch of tip:
Commit-ID: 7197688b2006357da75a014e0a76be89ca9c2d46
Gitweb: https://git.kernel.org/tip/7197688b2006357da75a014e0a76be89ca9c2d46
Author: Frederic Weisbecker <frederic@kernel.org>
AuthorDate: Wed, 02 Dec 2020 12:57:28 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 02 Dec 2020 20:20:04 +01:00
sched/cputime: Remove symbol exports from IRQ time accounting
account_irq_enter_time() and account_irq_exit_time() are not called
from modules. EXPORT_SYMBOL_GPL() can be safely removed from the IRQ
cputime accounting functions called from there.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201202115732.27827-2-frederic@kernel.org
---
arch/s390/kernel/vtime.c | 10 +++++-----
kernel/sched/cputime.c | 2 --
2 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
index 8df10d3..f9f2a11 100644
--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -226,7 +226,7 @@ void vtime_flush(struct task_struct *tsk)
* Update process times based on virtual cpu times stored by entry.S
* to the lowcore fields user_timer, system_timer & steal_clock.
*/
-void vtime_account_irq_enter(struct task_struct *tsk)
+void vtime_account_kernel(struct task_struct *tsk)
{
u64 timer;
@@ -245,12 +245,12 @@ void vtime_account_irq_enter(struct task_struct *tsk)
virt_timer_forward(timer);
}
-EXPORT_SYMBOL_GPL(vtime_account_irq_enter);
-
-void vtime_account_kernel(struct task_struct *tsk)
-__attribute__((alias("vtime_account_irq_enter")));
EXPORT_SYMBOL_GPL(vtime_account_kernel);
+void vtime_account_irq_enter(struct task_struct *tsk)
+__attribute__((alias("vtime_account_kernel")));
+
+
/*
* Sorted add to a list. List is linear searched until first bigger
* element is found.
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 5a55d23..61ce9f9 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -71,7 +71,6 @@ void irqtime_account_irq(struct task_struct *curr)
else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
}
-EXPORT_SYMBOL_GPL(irqtime_account_irq);
static u64 irqtime_tick_accounted(u64 maxtime)
{
@@ -434,7 +433,6 @@ void vtime_account_irq_enter(struct task_struct *tsk)
else
vtime_account_kernel(tsk);
}
-EXPORT_SYMBOL_GPL(vtime_account_irq_enter);
#endif /* __ARCH_HAS_VTIME_ACCOUNT */
void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
* Re: [PATCH 1/5] sched/cputime: Remove symbol exports from IRQ time accounting
2020-12-02 11:57 ` [PATCH 1/5] sched/cputime: Remove symbol exports from IRQ time accounting Frederic Weisbecker
2020-12-02 19:23 ` [tip: irq/core] " tip-bot2 for Frederic Weisbecker
@ 2020-12-02 19:28 ` Christian Borntraeger
1 sibling, 0 replies; 28+ messages in thread
From: Christian Borntraeger @ 2020-12-02 19:28 UTC (permalink / raw)
To: Frederic Weisbecker, Thomas Gleixner, Peter Zijlstra
Cc: LKML, Tony Luck, Vasily Gorbik, Michael Ellerman,
Benjamin Herrenschmidt, Paul Mackerras, Fenghua Yu,
Heiko Carstens
On 02.12.20 12:57, Frederic Weisbecker wrote:
> account_irq_enter_time() and account_irq_exit_time() are not called
> from modules. EXPORT_SYMBOL_GPL() can be safely removed from the IRQ
> cputime accounting functions called from there.
>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Vasily Gorbik <gor@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
> arch/s390/kernel/vtime.c | 10 +++++-----
> kernel/sched/cputime.c | 2 --
> 2 files changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
> index 8df10d3c8f6c..f9f2a11958a5 100644
> --- a/arch/s390/kernel/vtime.c
> +++ b/arch/s390/kernel/vtime.c
> @@ -226,7 +226,7 @@ void vtime_flush(struct task_struct *tsk)
> * Update process times based on virtual cpu times stored by entry.S
> * to the lowcore fields user_timer, system_timer & steal_clock.
> */
> -void vtime_account_irq_enter(struct task_struct *tsk)
> +void vtime_account_kernel(struct task_struct *tsk)
> {
> u64 timer;
>
> @@ -245,12 +245,12 @@ void vtime_account_irq_enter(struct task_struct *tsk)
>
> virt_timer_forward(timer);
> }
> -EXPORT_SYMBOL_GPL(vtime_account_irq_enter);
> -
> -void vtime_account_kernel(struct task_struct *tsk)
> -__attribute__((alias("vtime_account_irq_enter")));
> EXPORT_SYMBOL_GPL(vtime_account_kernel);
>
> +void vtime_account_irq_enter(struct task_struct *tsk)
> +__attribute__((alias("vtime_account_kernel")));
> +
> +
One new line is enough I think. Apart from that this looks sane from an s390 perspective.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
* [PATCH 2/5] s390/vtime: Use the generic IRQ entry accounting
2020-12-02 11:57 [PATCH 0/5] irq: Reorder time handling against HARDIRQ_OFFSET on IRQ entry v3 Frederic Weisbecker
2020-12-02 11:57 ` [PATCH 1/5] sched/cputime: Remove symbol exports from IRQ time accounting Frederic Weisbecker
@ 2020-12-02 11:57 ` Frederic Weisbecker
2020-12-02 19:23 ` [tip: irq/core] " tip-bot2 for Frederic Weisbecker
2020-12-02 19:34 ` [PATCH 2/5] " Christian Borntraeger
2020-12-02 11:57 ` [PATCH 3/5] sched/vtime: Consolidate IRQ time accounting Frederic Weisbecker
` (2 subsequent siblings)
4 siblings, 2 replies; 28+ messages in thread
From: Frederic Weisbecker @ 2020-12-02 11:57 UTC (permalink / raw)
To: Thomas Gleixner, Peter Zijlstra
Cc: LKML, Frederic Weisbecker, Tony Luck, Vasily Gorbik,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Christian Borntraeger, Fenghua Yu, Heiko Carstens
s390 has its own version of IRQ entry accounting because it doesn't
account the idle time the same way the other architectures do. Only
the actual idle sleep time is accounted as idle time; the rest of the
idle task execution is accounted as system time.
Make the generic IRQ entry accounting aware of architectures that have
their own way of accounting idle time and convert s390 to use it.
This prepares s390 to get involved in further consolidations of IRQ
time accounting.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
---
arch/Kconfig | 7 ++++++-
arch/s390/Kconfig | 1 +
arch/s390/include/asm/vtime.h | 1 -
arch/s390/kernel/vtime.c | 4 ----
kernel/sched/cputime.c | 13 ++-----------
5 files changed, 9 insertions(+), 17 deletions(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index 56b6ccc0e32d..0f151b49c7b7 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -627,6 +627,12 @@ config HAVE_TIF_NOHZ
config HAVE_VIRT_CPU_ACCOUNTING
bool
+config HAVE_VIRT_CPU_ACCOUNTING_IDLE
+ bool
+ help
+ Architecture has its own way to account idle CPU time and therefore
+ doesn't implement vtime_account_idle().
+
config ARCH_HAS_SCALED_CPUTIME
bool
@@ -641,7 +647,6 @@ config HAVE_VIRT_CPU_ACCOUNTING_GEN
some 32-bit arches may require multiple accesses, so proper
locking is needed to protect against concurrent accesses.
-
config HAVE_IRQ_TIME_ACCOUNTING
bool
help
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 4a2a12be04c9..6f1fdcd3b5db 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -181,6 +181,7 @@ config S390
select HAVE_RSEQ
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_VIRT_CPU_ACCOUNTING
+ select HAVE_VIRT_CPU_ACCOUNTING_IDLE
select IOMMU_HELPER if PCI
select IOMMU_SUPPORT if PCI
select MODULES_USE_ELF_RELA
diff --git a/arch/s390/include/asm/vtime.h b/arch/s390/include/asm/vtime.h
index 3622d4ebc73a..fac6a67988eb 100644
--- a/arch/s390/include/asm/vtime.h
+++ b/arch/s390/include/asm/vtime.h
@@ -2,7 +2,6 @@
#ifndef _S390_VTIME_H
#define _S390_VTIME_H
-#define __ARCH_HAS_VTIME_ACCOUNT
#define __ARCH_HAS_VTIME_TASK_SWITCH
#endif /* _S390_VTIME_H */
diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
index f9f2a11958a5..ebd8e5655789 100644
--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -247,10 +247,6 @@ void vtime_account_kernel(struct task_struct *tsk)
}
EXPORT_SYMBOL_GPL(vtime_account_kernel);
-void vtime_account_irq_enter(struct task_struct *tsk)
-__attribute__((alias("vtime_account_kernel")));
-
-
/*
* Sorted add to a list. List is linear searched until first bigger
* element is found.
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 61ce9f9bf0a3..2783162542b1 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -417,23 +417,14 @@ void vtime_task_switch(struct task_struct *prev)
}
# endif
-/*
- * Archs that account the whole time spent in the idle task
- * (outside irq) as idle time can rely on this and just implement
- * vtime_account_kernel() and vtime_account_idle(). Archs that
- * have other meaning of the idle time (s390 only includes the
- * time spent by the CPU when it's in low power mode) must override
- * vtime_account().
- */
-#ifndef __ARCH_HAS_VTIME_ACCOUNT
void vtime_account_irq_enter(struct task_struct *tsk)
{
- if (!in_interrupt() && is_idle_task(tsk))
+ if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) &&
+ !in_interrupt() && is_idle_task(tsk))
vtime_account_idle(tsk);
else
vtime_account_kernel(tsk);
}
-#endif /* __ARCH_HAS_VTIME_ACCOUNT */
void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
u64 *ut, u64 *st)
--
2.25.1
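The IS_ENABLED() test that replaces the __ARCH_HAS_VTIME_ACCOUNT ifdeffery
expands to a compile-time constant: the idle branch is dead code the
optimizer drops on architectures selecting HAVE_VIRT_CPU_ACCOUNTING_IDLE,
yet it is still parsed and type-checked in every configuration, unlike
code hidden behind #ifdef. A standalone sketch of the trick, simplified
from the kernel's include/linux/kconfig.h; the demo config symbol below
is made up:

#include <stdio.h>

#define __ARG_PLACEHOLDER_1 0,
#define __take_second_arg(__ignored, val, ...) val
#define ____is_defined(arg1_or_junk) __take_second_arg(arg1_or_junk 1, 0)
#define ___is_defined(val) ____is_defined(__ARG_PLACEHOLDER_##val)
#define IS_ENABLED(option) ___is_defined(option)

#define CONFIG_DEMO_ACCOUNTING_IDLE 1	/* remove to flip the branch */

int main(void)
{
	if (!IS_ENABLED(CONFIG_DEMO_ACCOUNTING_IDLE))
		puts("generic code accounts idle time");
	else
		puts("architecture accounts idle time itself");
	return 0;
}

With the symbol defined to 1 the condition is the constant 0; with the
#define removed it is the constant 1. Either way both branches compile.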
* [tip: irq/core] s390/vtime: Use the generic IRQ entry accounting
2020-12-02 11:57 ` [PATCH 2/5] s390/vtime: Use the generic IRQ entry accounting Frederic Weisbecker
@ 2020-12-02 19:23 ` tip-bot2 for Frederic Weisbecker
2020-12-02 19:34 ` [PATCH 2/5] " Christian Borntraeger
1 sibling, 0 replies; 28+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2020-12-02 19:23 UTC (permalink / raw)
To: linux-tip-commits
Cc: Frederic Weisbecker, Thomas Gleixner, x86, linux-kernel, maz
The following commit has been merged into the irq/core branch of tip:
Commit-ID: 2b91ec9f551b56751cde48792f1c0a1130358844
Gitweb: https://git.kernel.org/tip/2b91ec9f551b56751cde48792f1c0a1130358844
Author: Frederic Weisbecker <frederic@kernel.org>
AuthorDate: Wed, 02 Dec 2020 12:57:29 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 02 Dec 2020 20:20:04 +01:00
s390/vtime: Use the generic IRQ entry accounting
s390 has its own version of IRQ entry accounting because it doesn't
account the idle time the same way the other architectures do. Only
the actual idle sleep time is accounted as idle time; the rest of the
idle task execution is accounted as system time.
Make the generic IRQ entry accounting aware of architectures that have
their own way of accounting idle time and convert s390 to use it.
This prepares s390 to get involved in further consolidations of IRQ
time accounting.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201202115732.27827-3-frederic@kernel.org
---
arch/Kconfig | 7 ++++++-
arch/s390/Kconfig | 1 +
arch/s390/include/asm/vtime.h | 1 -
arch/s390/kernel/vtime.c | 4 ----
kernel/sched/cputime.c | 13 ++-----------
5 files changed, 9 insertions(+), 17 deletions(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index 56b6ccc..0f151b4 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -627,6 +627,12 @@ config HAVE_TIF_NOHZ
config HAVE_VIRT_CPU_ACCOUNTING
bool
+config HAVE_VIRT_CPU_ACCOUNTING_IDLE
+ bool
+ help
+ Architecture has its own way to account idle CPU time and therefore
+ doesn't implement vtime_account_idle().
+
config ARCH_HAS_SCALED_CPUTIME
bool
@@ -641,7 +647,6 @@ config HAVE_VIRT_CPU_ACCOUNTING_GEN
some 32-bit arches may require multiple accesses, so proper
locking is needed to protect against concurrent accesses.
-
config HAVE_IRQ_TIME_ACCOUNTING
bool
help
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 4a2a12b..6f1fdcd 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -181,6 +181,7 @@ config S390
select HAVE_RSEQ
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_VIRT_CPU_ACCOUNTING
+ select HAVE_VIRT_CPU_ACCOUNTING_IDLE
select IOMMU_HELPER if PCI
select IOMMU_SUPPORT if PCI
select MODULES_USE_ELF_RELA
diff --git a/arch/s390/include/asm/vtime.h b/arch/s390/include/asm/vtime.h
index 3622d4e..fac6a67 100644
--- a/arch/s390/include/asm/vtime.h
+++ b/arch/s390/include/asm/vtime.h
@@ -2,7 +2,6 @@
#ifndef _S390_VTIME_H
#define _S390_VTIME_H
-#define __ARCH_HAS_VTIME_ACCOUNT
#define __ARCH_HAS_VTIME_TASK_SWITCH
#endif /* _S390_VTIME_H */
diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
index f9f2a11..ebd8e56 100644
--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -247,10 +247,6 @@ void vtime_account_kernel(struct task_struct *tsk)
}
EXPORT_SYMBOL_GPL(vtime_account_kernel);
-void vtime_account_irq_enter(struct task_struct *tsk)
-__attribute__((alias("vtime_account_kernel")));
-
-
/*
* Sorted add to a list. List is linear searched until first bigger
* element is found.
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 61ce9f9..2783162 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -417,23 +417,14 @@ void vtime_task_switch(struct task_struct *prev)
}
# endif
-/*
- * Archs that account the whole time spent in the idle task
- * (outside irq) as idle time can rely on this and just implement
- * vtime_account_kernel() and vtime_account_idle(). Archs that
- * have other meaning of the idle time (s390 only includes the
- * time spent by the CPU when it's in low power mode) must override
- * vtime_account().
- */
-#ifndef __ARCH_HAS_VTIME_ACCOUNT
void vtime_account_irq_enter(struct task_struct *tsk)
{
- if (!in_interrupt() && is_idle_task(tsk))
+ if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) &&
+ !in_interrupt() && is_idle_task(tsk))
vtime_account_idle(tsk);
else
vtime_account_kernel(tsk);
}
-#endif /* __ARCH_HAS_VTIME_ACCOUNT */
void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
u64 *ut, u64 *st)
* Re: [PATCH 2/5] s390/vtime: Use the generic IRQ entry accounting
2020-12-02 11:57 ` [PATCH 2/5] s390/vtime: Use the generic IRQ entry accounting Frederic Weisbecker
2020-12-02 19:23 ` [tip: irq/core] " tip-bot2 for Frederic Weisbecker
@ 2020-12-02 19:34 ` Christian Borntraeger
1 sibling, 0 replies; 28+ messages in thread
From: Christian Borntraeger @ 2020-12-02 19:34 UTC (permalink / raw)
To: Frederic Weisbecker, Thomas Gleixner, Peter Zijlstra
Cc: LKML, Tony Luck, Vasily Gorbik, Michael Ellerman,
Benjamin Herrenschmidt, Paul Mackerras, Fenghua Yu,
Heiko Carstens
On 02.12.20 12:57, Frederic Weisbecker wrote:
> s390 has its own version of IRQ entry accounting because it doesn't
> account the idle time the same way the other architectures do. Only
> the actual idle sleep time is accounted as idle time; the rest of the
> idle task execution is accounted as system time.
>
> Make the generic IRQ entry accounting aware of architectures that have
> their own way of accounting idle time and convert s390 to use it.
>
> This prepares s390 to get involved in further consolidations of IRQ
> time accounting.
>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Vasily Gorbik <gor@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
As far as I can tell, this patch should be a no-op for s390 function-wise.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
> arch/Kconfig | 7 ++++++-
> arch/s390/Kconfig | 1 +
> arch/s390/include/asm/vtime.h | 1 -
> arch/s390/kernel/vtime.c | 4 ----
> kernel/sched/cputime.c | 13 ++-----------
> 5 files changed, 9 insertions(+), 17 deletions(-)
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 56b6ccc0e32d..0f151b49c7b7 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -627,6 +627,12 @@ config HAVE_TIF_NOHZ
> config HAVE_VIRT_CPU_ACCOUNTING
> bool
>
> +config HAVE_VIRT_CPU_ACCOUNTING_IDLE
> + bool
> + help
> + Architecture has its own way to account idle CPU time and therefore
> + doesn't implement vtime_account_idle().
> +
> config ARCH_HAS_SCALED_CPUTIME
> bool
>
> @@ -641,7 +647,6 @@ config HAVE_VIRT_CPU_ACCOUNTING_GEN
> some 32-bit arches may require multiple accesses, so proper
> locking is needed to protect against concurrent accesses.
>
> -
> config HAVE_IRQ_TIME_ACCOUNTING
> bool
> help
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index 4a2a12be04c9..6f1fdcd3b5db 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -181,6 +181,7 @@ config S390
> select HAVE_RSEQ
> select HAVE_SYSCALL_TRACEPOINTS
> select HAVE_VIRT_CPU_ACCOUNTING
> + select HAVE_VIRT_CPU_ACCOUNTING_IDLE
> select IOMMU_HELPER if PCI
> select IOMMU_SUPPORT if PCI
> select MODULES_USE_ELF_RELA
> diff --git a/arch/s390/include/asm/vtime.h b/arch/s390/include/asm/vtime.h
> index 3622d4ebc73a..fac6a67988eb 100644
> --- a/arch/s390/include/asm/vtime.h
> +++ b/arch/s390/include/asm/vtime.h
> @@ -2,7 +2,6 @@
> #ifndef _S390_VTIME_H
> #define _S390_VTIME_H
>
> -#define __ARCH_HAS_VTIME_ACCOUNT
> #define __ARCH_HAS_VTIME_TASK_SWITCH
>
> #endif /* _S390_VTIME_H */
> diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
> index f9f2a11958a5..ebd8e5655789 100644
> --- a/arch/s390/kernel/vtime.c
> +++ b/arch/s390/kernel/vtime.c
> @@ -247,10 +247,6 @@ void vtime_account_kernel(struct task_struct *tsk)
> }
> EXPORT_SYMBOL_GPL(vtime_account_kernel);
>
> -void vtime_account_irq_enter(struct task_struct *tsk)
> -__attribute__((alias("vtime_account_kernel")));
> -
> -
> /*
> * Sorted add to a list. List is linear searched until first bigger
> * element is found.
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 61ce9f9bf0a3..2783162542b1 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -417,23 +417,14 @@ void vtime_task_switch(struct task_struct *prev)
> }
> # endif
>
> -/*
> - * Archs that account the whole time spent in the idle task
> - * (outside irq) as idle time can rely on this and just implement
> - * vtime_account_kernel() and vtime_account_idle(). Archs that
> - * have other meaning of the idle time (s390 only includes the
> - * time spent by the CPU when it's in low power mode) must override
> - * vtime_account().
> - */
> -#ifndef __ARCH_HAS_VTIME_ACCOUNT
> void vtime_account_irq_enter(struct task_struct *tsk)
> {
> - if (!in_interrupt() && is_idle_task(tsk))
> + if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) &&
> + !in_interrupt() && is_idle_task(tsk))
> vtime_account_idle(tsk);
> else
> vtime_account_kernel(tsk);
> }
> -#endif /* __ARCH_HAS_VTIME_ACCOUNT */
>
> void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
> u64 *ut, u64 *st)
>
* [PATCH 3/5] sched/vtime: Consolidate IRQ time accounting
2020-12-02 11:57 [PATCH 0/5] irq: Reorder time handling against HARDIRQ_OFFSET on IRQ entry v3 Frederic Weisbecker
2020-12-02 11:57 ` [PATCH 1/5] sched/cputime: Remove symbol exports from IRQ time accounting Frederic Weisbecker
2020-12-02 11:57 ` [PATCH 2/5] s390/vtime: Use the generic IRQ entry accounting Frederic Weisbecker
@ 2020-12-02 11:57 ` Frederic Weisbecker
2020-12-02 19:23 ` [tip: irq/core] " tip-bot2 for Frederic Weisbecker
2020-12-02 11:57 ` [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation Frederic Weisbecker
2020-12-02 11:57 ` [PATCH 5/5] irq: Call tick_irq_enter() inside HARDIRQ_OFFSET Frederic Weisbecker
4 siblings, 1 reply; 28+ messages in thread
From: Frederic Weisbecker @ 2020-12-02 11:57 UTC (permalink / raw)
To: Thomas Gleixner, Peter Zijlstra
Cc: LKML, Frederic Weisbecker, Tony Luck, Vasily Gorbik,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Christian Borntraeger, Fenghua Yu, Heiko Carstens
The 3 architectures implementing CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
all have their own version of irq time accounting that dispatches the
cputime to the appropriate index: hardirq, softirq, system, idle,
guest... from an all-in-one function.
Instead of having these ad-hoc versions, move the cputime destination
dispatch decision to the core code and leave only the actual per-index
cputime accounting to the architecture.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
---
arch/ia64/kernel/time.c | 20 +++++++++----
arch/powerpc/kernel/time.c | 58 +++++++++++++++++++++++++++-----------
arch/s390/kernel/vtime.c | 51 ++++++++++++++++++++++-----------
include/linux/vtime.h | 16 ++++-------
kernel/sched/cputime.c | 13 ++++++---
5 files changed, 106 insertions(+), 52 deletions(-)
diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c
index 7abc5f37bfaf..733e0e3324b8 100644
--- a/arch/ia64/kernel/time.c
+++ b/arch/ia64/kernel/time.c
@@ -138,12 +138,8 @@ void vtime_account_kernel(struct task_struct *tsk)
struct thread_info *ti = task_thread_info(tsk);
__u64 stime = vtime_delta(tsk);
- if ((tsk->flags & PF_VCPU) && !irq_count())
+ if (tsk->flags & PF_VCPU)
ti->gtime += stime;
- else if (hardirq_count())
- ti->hardirq_time += stime;
- else if (in_serving_softirq())
- ti->softirq_time += stime;
else
ti->stime += stime;
}
@@ -156,6 +152,20 @@ void vtime_account_idle(struct task_struct *tsk)
ti->idle_time += vtime_delta(tsk);
}
+void vtime_account_softirq(struct task_struct *tsk)
+{
+ struct thread_info *ti = task_thread_info(tsk);
+
+ ti->softirq_time += vtime_delta(tsk);
+}
+
+void vtime_account_hardirq(struct task_struct *tsk)
+{
+ struct thread_info *ti = task_thread_info(tsk);
+
+ ti->hardirq_time += vtime_delta(tsk);
+}
+
#endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
static irqreturn_t
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 74efe46f5532..cf3f8db7e0e3 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -311,12 +311,11 @@ static unsigned long vtime_delta_scaled(struct cpu_accounting_data *acct,
return stime_scaled;
}
-static unsigned long vtime_delta(struct task_struct *tsk,
+static unsigned long vtime_delta(struct cpu_accounting_data *acct,
unsigned long *stime_scaled,
unsigned long *steal_time)
{
unsigned long now, stime;
- struct cpu_accounting_data *acct = get_accounting(tsk);
WARN_ON_ONCE(!irqs_disabled());
@@ -331,29 +330,30 @@ static unsigned long vtime_delta(struct task_struct *tsk,
return stime;
}
-void vtime_account_kernel(struct task_struct *tsk)
+static void vtime_delta_kernel(struct cpu_accounting_data *acct,
+ unsigned long *stime, unsigned long *stime_scaled)
{
- unsigned long stime, stime_scaled, steal_time;
- struct cpu_accounting_data *acct = get_accounting(tsk);
+ unsigned long steal_time;
- stime = vtime_delta(tsk, &stime_scaled, &steal_time);
-
- stime -= min(stime, steal_time);
+ *stime = vtime_delta(acct, stime_scaled, &steal_time);
+ *stime -= min(*stime, steal_time);
acct->steal_time += steal_time;
+}
- if ((tsk->flags & PF_VCPU) && !irq_count()) {
+void vtime_account_kernel(struct task_struct *tsk)
+{
+ struct cpu_accounting_data *acct = get_accounting(tsk);
+ unsigned long stime, stime_scaled;
+
+ vtime_delta_kernel(acct, &stime, &stime_scaled);
+
+ if (tsk->flags & PF_VCPU) {
acct->gtime += stime;
#ifdef CONFIG_ARCH_HAS_SCALED_CPUTIME
acct->utime_scaled += stime_scaled;
#endif
} else {
- if (hardirq_count())
- acct->hardirq_time += stime;
- else if (in_serving_softirq())
- acct->softirq_time += stime;
- else
- acct->stime += stime;
-
+ acct->stime += stime;
#ifdef CONFIG_ARCH_HAS_SCALED_CPUTIME
acct->stime_scaled += stime_scaled;
#endif
@@ -366,10 +366,34 @@ void vtime_account_idle(struct task_struct *tsk)
unsigned long stime, stime_scaled, steal_time;
struct cpu_accounting_data *acct = get_accounting(tsk);
- stime = vtime_delta(tsk, &stime_scaled, &steal_time);
+ stime = vtime_delta(acct, &stime_scaled, &steal_time);
acct->idle_time += stime + steal_time;
}
+static void vtime_account_irq_field(struct cpu_accounting_data *acct,
+ unsigned long *field)
+{
+ unsigned long stime, stime_scaled;
+
+ vtime_delta_kernel(acct, &stime, &stime_scaled);
+ *field += stime;
+#ifdef CONFIG_ARCH_HAS_SCALED_CPUTIME
+ acct->stime_scaled += stime_scaled;
+#endif
+}
+
+void vtime_account_softirq(struct task_struct *tsk)
+{
+ struct cpu_accounting_data *acct = get_accounting(tsk);
+ vtime_account_irq_field(acct, &acct->softirq_time);
+}
+
+void vtime_account_hardirq(struct task_struct *tsk)
+{
+ struct cpu_accounting_data *acct = get_accounting(tsk);
+ vtime_account_irq_field(acct, &acct->hardirq_time);
+}
+
static void vtime_flush_scaled(struct task_struct *tsk,
struct cpu_accounting_data *acct)
{
diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
index ebd8e5655789..5aaa2ca6a928 100644
--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -222,31 +222,50 @@ void vtime_flush(struct task_struct *tsk)
S390_lowcore.avg_steal_timer = avg_steal;
}
-/*
- * Update process times based on virtual cpu times stored by entry.S
- * to the lowcore fields user_timer, system_timer & steal_clock.
- */
-void vtime_account_kernel(struct task_struct *tsk)
+static u64 vtime_delta(void)
{
- u64 timer;
+ u64 timer = S390_lowcore.last_update_timer;
- timer = S390_lowcore.last_update_timer;
S390_lowcore.last_update_timer = get_vtimer();
- timer -= S390_lowcore.last_update_timer;
- if ((tsk->flags & PF_VCPU) && (irq_count() == 0))
- S390_lowcore.guest_timer += timer;
- else if (hardirq_count())
- S390_lowcore.hardirq_timer += timer;
- else if (in_serving_softirq())
- S390_lowcore.softirq_timer += timer;
+ return timer - S390_lowcore.last_update_timer;
+}
+
+/*
+ * Update process times based on virtual cpu times stored by entry.S
+ * to the lowcore fields user_timer, system_timer & steal_clock.
+ */
+void vtime_account_kernel(struct task_struct *tsk)
+{
+ u64 delta = vtime_delta();
+
+ if (tsk->flags & PF_VCPU)
+ S390_lowcore.guest_timer += delta;
else
- S390_lowcore.system_timer += timer;
+ S390_lowcore.system_timer += delta;
- virt_timer_forward(timer);
+ virt_timer_forward(delta);
}
EXPORT_SYMBOL_GPL(vtime_account_kernel);
+void vtime_account_softirq(struct task_struct *tsk)
+{
+ u64 delta = vtime_delta();
+
+ S390_lowcore.softirq_timer += delta;
+
+ virt_timer_forward(delta);
+}
+
+void vtime_account_hardirq(struct task_struct *tsk)
+{
+ u64 delta = vtime_delta();
+
+ S390_lowcore.hardirq_timer += delta;
+
+ virt_timer_forward(delta);
+}
+
/*
* Sorted add to a list. List is linear searched until first bigger
* element is found.
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index 2cdeca062db3..6c9867419615 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -83,16 +83,12 @@ static inline void vtime_init_idle(struct task_struct *tsk, int cpu) { }
#endif
#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
-extern void vtime_account_irq_enter(struct task_struct *tsk);
-static inline void vtime_account_irq_exit(struct task_struct *tsk)
-{
- /* On hard|softirq exit we always account to hard|softirq cputime */
- vtime_account_kernel(tsk);
-}
+extern void vtime_account_irq(struct task_struct *tsk);
+extern void vtime_account_softirq(struct task_struct *tsk);
+extern void vtime_account_hardirq(struct task_struct *tsk);
extern void vtime_flush(struct task_struct *tsk);
#else /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
-static inline void vtime_account_irq_enter(struct task_struct *tsk) { }
-static inline void vtime_account_irq_exit(struct task_struct *tsk) { }
+static inline void vtime_account_irq(struct task_struct *tsk) { }
static inline void vtime_flush(struct task_struct *tsk) { }
#endif
@@ -105,13 +101,13 @@ static inline void irqtime_account_irq(struct task_struct *tsk) { }
static inline void account_irq_enter_time(struct task_struct *tsk)
{
- vtime_account_irq_enter(tsk);
+ vtime_account_irq(tsk);
irqtime_account_irq(tsk);
}
static inline void account_irq_exit_time(struct task_struct *tsk)
{
- vtime_account_irq_exit(tsk);
+ vtime_account_irq(tsk);
irqtime_account_irq(tsk);
}
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 2783162542b1..02163d4260d7 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -417,13 +417,18 @@ void vtime_task_switch(struct task_struct *prev)
}
# endif
-void vtime_account_irq_enter(struct task_struct *tsk)
+void vtime_account_irq(struct task_struct *tsk)
{
- if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) &&
- !in_interrupt() && is_idle_task(tsk))
+ if (hardirq_count()) {
+ vtime_account_hardirq(tsk);
+ } else if (in_serving_softirq()) {
+ vtime_account_softirq(tsk);
+ } else if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) &&
+ is_idle_task(tsk)) {
vtime_account_idle(tsk);
- else
+ } else {
vtime_account_kernel(tsk);
+ }
}
void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
--
2.25.1
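After this patch the split of responsibilities is: the core decides which
cputime index a delta belongs to, and each architecture only implements
dumb per-index accumulators (vtime_account_hardirq(),
vtime_account_softirq(), vtime_account_idle(), vtime_account_kernel()).
A rough standalone model of that contract; every name and the boolean
context flags below are made up for illustration, not kernel APIs:

#include <stdio.h>
#include <stdbool.h>

/* "Architecture" side: one accumulator per cputime index. */
static unsigned long long hardirq_time, softirq_time, idle_time, system_time;

static void arch_account_hardirq(unsigned long long d) { hardirq_time += d; }
static void arch_account_softirq(unsigned long long d) { softirq_time += d; }
static void arch_account_idle(unsigned long long d)    { idle_time    += d; }
static void arch_account_kernel(unsigned long long d)  { system_time  += d; }

/* "Core" side: the single dispatch point, mirroring the shape of
 * vtime_account_irq() introduced above. */
static void core_account(bool in_hardirq, bool serving_softirq,
			 bool arch_owns_idle, bool is_idle_task,
			 unsigned long long delta)
{
	if (in_hardirq)
		arch_account_hardirq(delta);
	else if (serving_softirq)
		arch_account_softirq(delta);
	else if (!arch_owns_idle && is_idle_task)
		arch_account_idle(delta);
	else
		arch_account_kernel(delta);	/* s390-style idle lands here */
}

int main(void)
{
	core_account(true,  false, false, false, 10);	/* hardirq         */
	core_account(false, true,  false, false, 20);	/* softirq         */
	core_account(false, false, true,  true,  30);	/* s390-style idle */
	printf("hardirq=%llu softirq=%llu idle=%llu system=%llu\n",
	       hardirq_time, softirq_time, idle_time, system_time);
	return 0;
}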
* [tip: irq/core] sched/vtime: Consolidate IRQ time accounting
2020-12-02 11:57 ` [PATCH 3/5] sched/vtime: Consolidate IRQ time accounting Frederic Weisbecker
@ 2020-12-02 19:23 ` tip-bot2 for Frederic Weisbecker
0 siblings, 0 replies; 28+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2020-12-02 19:23 UTC (permalink / raw)
To: linux-tip-commits
Cc: Frederic Weisbecker, Thomas Gleixner, x86, linux-kernel, maz
The following commit has been merged into the irq/core branch of tip:
Commit-ID: 8a6a5920d3286eb0eae9f36a4ec4fc9df511eccb
Gitweb: https://git.kernel.org/tip/8a6a5920d3286eb0eae9f36a4ec4fc9df511eccb
Author: Frederic Weisbecker <frederic@kernel.org>
AuthorDate: Wed, 02 Dec 2020 12:57:30 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 02 Dec 2020 20:20:05 +01:00
sched/vtime: Consolidate IRQ time accounting
The 3 architectures implementing CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
all have their own version of irq time accounting that dispatches the
cputime to the appropriate index: hardirq, softirq, system, idle,
guest... from an all-in-one function.
Instead of having these ad-hoc versions, move the cputime destination
dispatch decision to the core code and leave only the actual per-index
cputime accounting to the architecture.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201202115732.27827-4-frederic@kernel.org
---
arch/ia64/kernel/time.c | 20 +++++++++----
arch/powerpc/kernel/time.c | 56 ++++++++++++++++++++++++++-----------
arch/s390/kernel/vtime.c | 45 +++++++++++++++++++++---------
include/linux/vtime.h | 16 ++++-------
kernel/sched/cputime.c | 13 ++++++---
5 files changed, 102 insertions(+), 48 deletions(-)
diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c
index 7abc5f3..733e0e3 100644
--- a/arch/ia64/kernel/time.c
+++ b/arch/ia64/kernel/time.c
@@ -138,12 +138,8 @@ void vtime_account_kernel(struct task_struct *tsk)
struct thread_info *ti = task_thread_info(tsk);
__u64 stime = vtime_delta(tsk);
- if ((tsk->flags & PF_VCPU) && !irq_count())
+ if (tsk->flags & PF_VCPU)
ti->gtime += stime;
- else if (hardirq_count())
- ti->hardirq_time += stime;
- else if (in_serving_softirq())
- ti->softirq_time += stime;
else
ti->stime += stime;
}
@@ -156,6 +152,20 @@ void vtime_account_idle(struct task_struct *tsk)
ti->idle_time += vtime_delta(tsk);
}
+void vtime_account_softirq(struct task_struct *tsk)
+{
+ struct thread_info *ti = task_thread_info(tsk);
+
+ ti->softirq_time += vtime_delta(tsk);
+}
+
+void vtime_account_hardirq(struct task_struct *tsk)
+{
+ struct thread_info *ti = task_thread_info(tsk);
+
+ ti->hardirq_time += vtime_delta(tsk);
+}
+
#endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
static irqreturn_t
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 74efe46..cf3f8db 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -311,12 +311,11 @@ static unsigned long vtime_delta_scaled(struct cpu_accounting_data *acct,
return stime_scaled;
}
-static unsigned long vtime_delta(struct task_struct *tsk,
+static unsigned long vtime_delta(struct cpu_accounting_data *acct,
unsigned long *stime_scaled,
unsigned long *steal_time)
{
unsigned long now, stime;
- struct cpu_accounting_data *acct = get_accounting(tsk);
WARN_ON_ONCE(!irqs_disabled());
@@ -331,29 +330,30 @@ static unsigned long vtime_delta(struct task_struct *tsk,
return stime;
}
+static void vtime_delta_kernel(struct cpu_accounting_data *acct,
+ unsigned long *stime, unsigned long *stime_scaled)
+{
+ unsigned long steal_time;
+
+ *stime = vtime_delta(acct, stime_scaled, &steal_time);
+ *stime -= min(*stime, steal_time);
+ acct->steal_time += steal_time;
+}
+
void vtime_account_kernel(struct task_struct *tsk)
{
- unsigned long stime, stime_scaled, steal_time;
struct cpu_accounting_data *acct = get_accounting(tsk);
+ unsigned long stime, stime_scaled;
- stime = vtime_delta(tsk, &stime_scaled, &steal_time);
-
- stime -= min(stime, steal_time);
- acct->steal_time += steal_time;
+ vtime_delta_kernel(acct, &stime, &stime_scaled);
- if ((tsk->flags & PF_VCPU) && !irq_count()) {
+ if (tsk->flags & PF_VCPU) {
acct->gtime += stime;
#ifdef CONFIG_ARCH_HAS_SCALED_CPUTIME
acct->utime_scaled += stime_scaled;
#endif
} else {
- if (hardirq_count())
- acct->hardirq_time += stime;
- else if (in_serving_softirq())
- acct->softirq_time += stime;
- else
- acct->stime += stime;
-
+ acct->stime += stime;
#ifdef CONFIG_ARCH_HAS_SCALED_CPUTIME
acct->stime_scaled += stime_scaled;
#endif
@@ -366,10 +366,34 @@ void vtime_account_idle(struct task_struct *tsk)
unsigned long stime, stime_scaled, steal_time;
struct cpu_accounting_data *acct = get_accounting(tsk);
- stime = vtime_delta(tsk, &stime_scaled, &steal_time);
+ stime = vtime_delta(acct, &stime_scaled, &steal_time);
acct->idle_time += stime + steal_time;
}
+static void vtime_account_irq_field(struct cpu_accounting_data *acct,
+ unsigned long *field)
+{
+ unsigned long stime, stime_scaled;
+
+ vtime_delta_kernel(acct, &stime, &stime_scaled);
+ *field += stime;
+#ifdef CONFIG_ARCH_HAS_SCALED_CPUTIME
+ acct->stime_scaled += stime_scaled;
+#endif
+}
+
+void vtime_account_softirq(struct task_struct *tsk)
+{
+ struct cpu_accounting_data *acct = get_accounting(tsk);
+ vtime_account_irq_field(acct, &acct->softirq_time);
+}
+
+void vtime_account_hardirq(struct task_struct *tsk)
+{
+ struct cpu_accounting_data *acct = get_accounting(tsk);
+ vtime_account_irq_field(acct, &acct->hardirq_time);
+}
+
static void vtime_flush_scaled(struct task_struct *tsk,
struct cpu_accounting_data *acct)
{
diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
index ebd8e56..5aaa2ca 100644
--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -222,31 +222,50 @@ void vtime_flush(struct task_struct *tsk)
S390_lowcore.avg_steal_timer = avg_steal;
}
+static u64 vtime_delta(void)
+{
+ u64 timer = S390_lowcore.last_update_timer;
+
+ S390_lowcore.last_update_timer = get_vtimer();
+
+ return timer - S390_lowcore.last_update_timer;
+}
+
/*
* Update process times based on virtual cpu times stored by entry.S
* to the lowcore fields user_timer, system_timer & steal_clock.
*/
void vtime_account_kernel(struct task_struct *tsk)
{
- u64 timer;
-
- timer = S390_lowcore.last_update_timer;
- S390_lowcore.last_update_timer = get_vtimer();
- timer -= S390_lowcore.last_update_timer;
+ u64 delta = vtime_delta();
- if ((tsk->flags & PF_VCPU) && (irq_count() == 0))
- S390_lowcore.guest_timer += timer;
- else if (hardirq_count())
- S390_lowcore.hardirq_timer += timer;
- else if (in_serving_softirq())
- S390_lowcore.softirq_timer += timer;
+ if (tsk->flags & PF_VCPU)
+ S390_lowcore.guest_timer += delta;
else
- S390_lowcore.system_timer += timer;
+ S390_lowcore.system_timer += delta;
- virt_timer_forward(timer);
+ virt_timer_forward(delta);
}
EXPORT_SYMBOL_GPL(vtime_account_kernel);
+void vtime_account_softirq(struct task_struct *tsk)
+{
+ u64 delta = vtime_delta();
+
+ S390_lowcore.softirq_timer += delta;
+
+ virt_timer_forward(delta);
+}
+
+void vtime_account_hardirq(struct task_struct *tsk)
+{
+ u64 delta = vtime_delta();
+
+ S390_lowcore.hardirq_timer += delta;
+
+ virt_timer_forward(delta);
+}
+
/*
* Sorted add to a list. List is linear searched until first bigger
* element is found.
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index 2cdeca0..6c98674 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -83,16 +83,12 @@ static inline void vtime_init_idle(struct task_struct *tsk, int cpu) { }
#endif
#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
-extern void vtime_account_irq_enter(struct task_struct *tsk);
-static inline void vtime_account_irq_exit(struct task_struct *tsk)
-{
- /* On hard|softirq exit we always account to hard|softirq cputime */
- vtime_account_kernel(tsk);
-}
+extern void vtime_account_irq(struct task_struct *tsk);
+extern void vtime_account_softirq(struct task_struct *tsk);
+extern void vtime_account_hardirq(struct task_struct *tsk);
extern void vtime_flush(struct task_struct *tsk);
#else /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
-static inline void vtime_account_irq_enter(struct task_struct *tsk) { }
-static inline void vtime_account_irq_exit(struct task_struct *tsk) { }
+static inline void vtime_account_irq(struct task_struct *tsk) { }
static inline void vtime_flush(struct task_struct *tsk) { }
#endif
@@ -105,13 +101,13 @@ static inline void irqtime_account_irq(struct task_struct *tsk) { }
static inline void account_irq_enter_time(struct task_struct *tsk)
{
- vtime_account_irq_enter(tsk);
+ vtime_account_irq(tsk);
irqtime_account_irq(tsk);
}
static inline void account_irq_exit_time(struct task_struct *tsk)
{
- vtime_account_irq_exit(tsk);
+ vtime_account_irq(tsk);
irqtime_account_irq(tsk);
}
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 2783162..02163d4 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -417,13 +417,18 @@ void vtime_task_switch(struct task_struct *prev)
}
# endif
-void vtime_account_irq_enter(struct task_struct *tsk)
+void vtime_account_irq(struct task_struct *tsk)
{
- if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) &&
- !in_interrupt() && is_idle_task(tsk))
+ if (hardirq_count()) {
+ vtime_account_hardirq(tsk);
+ } else if (in_serving_softirq()) {
+ vtime_account_softirq(tsk);
+ } else if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) &&
+ is_idle_task(tsk)) {
vtime_account_idle(tsk);
- else
+ } else {
vtime_account_kernel(tsk);
+ }
}
void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
* [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation
2020-12-02 11:57 [PATCH 0/5] irq: Reorder time handling against HARDIRQ_OFFSET on IRQ entry v3 Frederic Weisbecker
` (2 preceding siblings ...)
2020-12-02 11:57 ` [PATCH 3/5] sched/vtime: Consolidate IRQ time accounting Frederic Weisbecker
@ 2020-12-02 11:57 ` Frederic Weisbecker
2020-12-02 12:36 ` Peter Zijlstra
` (2 more replies)
2020-12-02 11:57 ` [PATCH 5/5] irq: Call tick_irq_enter() inside HARDIRQ_OFFSET Frederic Weisbecker
4 siblings, 3 replies; 28+ messages in thread
From: Frederic Weisbecker @ 2020-12-02 11:57 UTC (permalink / raw)
To: Thomas Gleixner, Peter Zijlstra
Cc: LKML, Frederic Weisbecker, Tony Luck, Vasily Gorbik,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Christian Borntraeger, Fenghua Yu, Heiko Carstens
IRQ time entry is currently accounted before HARDIRQ_OFFSET or
SOFTIRQ_OFFSET is incremented. This makes it convenient to decide to
which index the accounted cputime should be dispatched.
Unfortunately it prevents tick_irq_enter() from being called under
HARDIRQ_OFFSET because tick_irq_enter() has to be called before the IRQ
entry accounting due to the necessary clock catch-up. As a result, we
don't benefit from appropriate lockdep coverage on tick_irq_enter().
To prepare for fixing this, move the IRQ entry cputime accounting after
the preempt offset is incremented. This requires the cputime dispatch
code to handle the extra offset.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
---
include/linux/hardirq.h | 4 ++--
include/linux/vtime.h | 34 ++++++++++++++++++++++++----------
kernel/sched/cputime.c | 18 +++++++++++-------
kernel/softirq.c | 6 +++---
4 files changed, 40 insertions(+), 22 deletions(-)
diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index 754f67ac4326..7c9d6a2d7e90 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -32,9 +32,9 @@ static __always_inline void rcu_irq_enter_check_tick(void)
*/
#define __irq_enter() \
do { \
- account_irq_enter_time(current); \
preempt_count_add(HARDIRQ_OFFSET); \
lockdep_hardirq_enter(); \
+ account_hardirq_enter(current); \
} while (0)
/*
@@ -62,8 +62,8 @@ void irq_enter_rcu(void);
*/
#define __irq_exit() \
do { \
+ account_hardirq_exit(current); \
lockdep_hardirq_exit(); \
- account_irq_exit_time(current); \
preempt_count_sub(HARDIRQ_OFFSET); \
} while (0)
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index 6c9867419615..041d6524d144 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -83,32 +83,46 @@ static inline void vtime_init_idle(struct task_struct *tsk, int cpu) { }
#endif
#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
-extern void vtime_account_irq(struct task_struct *tsk);
+extern void vtime_account_irq(struct task_struct *tsk, unsigned int offset);
extern void vtime_account_softirq(struct task_struct *tsk);
extern void vtime_account_hardirq(struct task_struct *tsk);
extern void vtime_flush(struct task_struct *tsk);
#else /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
-static inline void vtime_account_irq(struct task_struct *tsk) { }
+static inline void vtime_account_irq(struct task_struct *tsk, unsigned int offset) { }
+static inline void vtime_account_softirq(struct task_struct *tsk) { }
+static inline void vtime_account_hardirq(struct task_struct *tsk) { }
static inline void vtime_flush(struct task_struct *tsk) { }
#endif
#ifdef CONFIG_IRQ_TIME_ACCOUNTING
-extern void irqtime_account_irq(struct task_struct *tsk);
+extern void irqtime_account_irq(struct task_struct *tsk, unsigned int offset);
#else
-static inline void irqtime_account_irq(struct task_struct *tsk) { }
+static inline void irqtime_account_irq(struct task_struct *tsk, unsigned int offset) { }
#endif
-static inline void account_irq_enter_time(struct task_struct *tsk)
+static inline void account_softirq_enter(struct task_struct *tsk)
{
- vtime_account_irq(tsk);
- irqtime_account_irq(tsk);
+ vtime_account_irq(tsk, SOFTIRQ_OFFSET);
+ irqtime_account_irq(tsk, SOFTIRQ_OFFSET);
}
-static inline void account_irq_exit_time(struct task_struct *tsk)
+static inline void account_softirq_exit(struct task_struct *tsk)
{
- vtime_account_irq(tsk);
- irqtime_account_irq(tsk);
+ vtime_account_softirq(tsk);
+ irqtime_account_irq(tsk, 0);
+}
+
+static inline void account_hardirq_enter(struct task_struct *tsk)
+{
+ vtime_account_irq(tsk, HARDIRQ_OFFSET);
+ irqtime_account_irq(tsk, HARDIRQ_OFFSET);
+}
+
+static inline void account_hardirq_exit(struct task_struct *tsk)
+{
+ vtime_account_hardirq(tsk);
+ irqtime_account_irq(tsk, 0);
}
#endif /* _LINUX_KERNEL_VTIME_H */
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 02163d4260d7..5f611658eeab 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -44,12 +44,13 @@ static void irqtime_account_delta(struct irqtime *irqtime, u64 delta,
}
/*
- * Called before incrementing preempt_count on {soft,}irq_enter
+ * Called after incrementing preempt_count on {soft,}irq_enter
* and before decrementing preempt_count on {soft,}irq_exit.
*/
-void irqtime_account_irq(struct task_struct *curr)
+void irqtime_account_irq(struct task_struct *curr, unsigned int offset)
{
struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
+ unsigned int pc;
s64 delta;
int cpu;
@@ -59,6 +60,7 @@ void irqtime_account_irq(struct task_struct *curr)
cpu = smp_processor_id();
delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
irqtime->irq_start_time += delta;
+ pc = preempt_count() - offset;
/*
* We do not account for softirq time from ksoftirqd here.
@@ -66,9 +68,9 @@ void irqtime_account_irq(struct task_struct *curr)
* in that case, so as not to confuse scheduler with a special task
* that do not consume any time, but still wants to run.
*/
- if (hardirq_count())
+ if (pc & HARDIRQ_MASK)
irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
- else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
+ else if ((pc & SOFTIRQ_OFFSET) && curr != this_cpu_ksoftirqd())
irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
}
@@ -417,11 +419,13 @@ void vtime_task_switch(struct task_struct *prev)
}
# endif
-void vtime_account_irq(struct task_struct *tsk)
+void vtime_account_irq(struct task_struct *tsk, unsigned int offset)
{
- if (hardirq_count()) {
+ unsigned int pc = preempt_count() - offset;
+
+ if (pc & HARDIRQ_OFFSET) {
vtime_account_hardirq(tsk);
- } else if (in_serving_softirq()) {
+ } else if (pc & SOFTIRQ_OFFSET) {
vtime_account_softirq(tsk);
} else if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) &&
is_idle_task(tsk)) {
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 617009ccd82c..b8f42b3ba8ca 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -315,10 +315,10 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
current->flags &= ~PF_MEMALLOC;
pending = local_softirq_pending();
- account_irq_enter_time(current);
__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
in_hardirq = lockdep_softirq_start();
+ account_softirq_enter(current);
restart:
/* Reset the pending bitmask before enabling irqs */
@@ -365,8 +365,8 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
wakeup_softirqd();
}
+ account_softirq_exit(current);
lockdep_softirq_end(in_hardirq);
- account_irq_exit_time(current);
__local_bh_enable(SOFTIRQ_OFFSET);
WARN_ON_ONCE(in_interrupt());
current_restore_flags(old_flags, PF_MEMALLOC);
@@ -418,7 +418,7 @@ static inline void __irq_exit_rcu(void)
#else
lockdep_assert_irqs_disabled();
#endif
- account_irq_exit_time(current);
+ account_hardirq_exit(current);
preempt_count_sub(HARDIRQ_OFFSET);
if (!in_interrupt() && local_softirq_pending())
invoke_softirq();
--
2.25.1
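The subtle part is the offset arithmetic. On entry, the accounting now
runs after preempt_count has been incremented, so subtracting the
caller's own offset recovers the context whose time slice is being
closed (the interrupted one). On exit, the offset passed is 0, because
the accounting runs before the decrement and the delta being closed is
the hard/softirq section itself. A standalone sketch; the bit layout
mirrors include/linux/preempt.h, but the scenario and the helper are
illustrative, not kernel code:

#include <stdio.h>

#define SOFTIRQ_OFFSET	(1U << 8)
#define HARDIRQ_OFFSET	(1U << 16)
#define HARDIRQ_MASK	(0xfU << 16)

static unsigned int preempt_count;

/* Same decision as irqtime_account_irq() after this patch: classify
 * the delta against the count *minus* the offset the caller just
 * added (entry), or against the full count (exit passes offset 0). */
static void account(unsigned int offset)
{
	unsigned int pc = preempt_count - offset;

	if (pc & HARDIRQ_MASK)
		puts("delta -> CPUTIME_IRQ");
	else if (pc & SOFTIRQ_OFFSET)
		puts("delta -> CPUTIME_SOFTIRQ");
	else
		puts("delta -> task context");
}

int main(void)
{
	/* A hardirq fires while a softirq is being served. */
	preempt_count += SOFTIRQ_OFFSET;	/* __do_softirq() */
	preempt_count += HARDIRQ_OFFSET;	/* __irq_enter()  */

	account(HARDIRQ_OFFSET);	/* entry: the closed delta was
					 * softirq time -> CPUTIME_SOFTIRQ */
	account(0);			/* exit: the closed delta is the
					 * hardirq section -> CPUTIME_IRQ  */

	preempt_count -= HARDIRQ_OFFSET;
	preempt_count -= SOFTIRQ_OFFSET;
	return 0;
}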
* Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation
2020-12-02 11:57 ` [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation Frederic Weisbecker
@ 2020-12-02 12:36 ` Peter Zijlstra
2020-12-02 19:23 ` [tip: irq/core] " tip-bot2 for Frederic Weisbecker
2020-12-28 2:15 ` [PATCH 4/5] " Qais Yousef
2 siblings, 0 replies; 28+ messages in thread
From: Peter Zijlstra @ 2020-12-02 12:36 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Thomas Gleixner, LKML, Tony Luck, Vasily Gorbik,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Christian Borntraeger, Fenghua Yu, Heiko Carstens
On Wed, Dec 02, 2020 at 12:57:31PM +0100, Frederic Weisbecker wrote:
> IRQ time entry is currently accounted before HARDIRQ_OFFSET or
> SOFTIRQ_OFFSET is incremented. This makes it convenient to decide to
> which index the accounted cputime should be dispatched.
>
> Unfortunately it prevents tick_irq_enter() from being called under
> HARDIRQ_OFFSET because tick_irq_enter() has to be called before the IRQ
> entry accounting due to the necessary clock catch-up. As a result, we
> don't benefit from appropriate lockdep coverage on tick_irq_enter().
>
> To prepare for fixing this, move the IRQ entry cputime accounting after
> the preempt offset is incremented. This requires the cputime dispatch
> code to handle the extra offset.
>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
* [tip: irq/core] irqtime: Move irqtime entry accounting after irq offset incrementation
2020-12-02 11:57 ` [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation Frederic Weisbecker
2020-12-02 12:36 ` Peter Zijlstra
@ 2020-12-02 19:23 ` tip-bot2 for Frederic Weisbecker
2020-12-28 2:15 ` [PATCH 4/5] " Qais Yousef
2 siblings, 0 replies; 28+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2020-12-02 19:23 UTC (permalink / raw)
To: linux-tip-commits
Cc: Frederic Weisbecker, Thomas Gleixner, Peter Zijlstra (Intel),
x86, linux-kernel, maz
The following commit has been merged into the irq/core branch of tip:
Commit-ID: d3759e7184f8f6187e62f8c4e7dcb1f6c47c075a
Gitweb: https://git.kernel.org/tip/d3759e7184f8f6187e62f8c4e7dcb1f6c47c075a
Author: Frederic Weisbecker <frederic@kernel.org>
AuthorDate: Wed, 02 Dec 2020 12:57:31 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 02 Dec 2020 20:20:05 +01:00
irqtime: Move irqtime entry accounting after irq offset incrementation
IRQ time entry is currently accounted before HARDIRQ_OFFSET or
SOFTIRQ_OFFSET is incremented. This makes it convenient to decide to
which index the accounted cputime should be dispatched.
Unfortunately it prevents tick_irq_enter() from being called under
HARDIRQ_OFFSET because tick_irq_enter() has to be called before the IRQ
entry accounting due to the necessary clock catch-up. As a result, we
don't benefit from appropriate lockdep coverage on tick_irq_enter().
To prepare for fixing this, move the IRQ entry cputime accounting after
the preempt offset is incremented. This requires the cputime dispatch
code to handle the extra offset.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20201202115732.27827-5-frederic@kernel.org
---
include/linux/hardirq.h | 4 ++--
include/linux/vtime.h | 34 ++++++++++++++++++++++++----------
kernel/sched/cputime.c | 18 +++++++++++-------
kernel/softirq.c | 6 +++---
4 files changed, 40 insertions(+), 22 deletions(-)
diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index 754f67a..7c9d6a2 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -32,9 +32,9 @@ static __always_inline void rcu_irq_enter_check_tick(void)
*/
#define __irq_enter() \
do { \
- account_irq_enter_time(current); \
preempt_count_add(HARDIRQ_OFFSET); \
lockdep_hardirq_enter(); \
+ account_hardirq_enter(current); \
} while (0)
/*
@@ -62,8 +62,8 @@ void irq_enter_rcu(void);
*/
#define __irq_exit() \
do { \
+ account_hardirq_exit(current); \
lockdep_hardirq_exit(); \
- account_irq_exit_time(current); \
preempt_count_sub(HARDIRQ_OFFSET); \
} while (0)
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index 6c98674..041d652 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -83,32 +83,46 @@ static inline void vtime_init_idle(struct task_struct *tsk, int cpu) { }
#endif
#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
-extern void vtime_account_irq(struct task_struct *tsk);
+extern void vtime_account_irq(struct task_struct *tsk, unsigned int offset);
extern void vtime_account_softirq(struct task_struct *tsk);
extern void vtime_account_hardirq(struct task_struct *tsk);
extern void vtime_flush(struct task_struct *tsk);
#else /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
-static inline void vtime_account_irq(struct task_struct *tsk) { }
+static inline void vtime_account_irq(struct task_struct *tsk, unsigned int offset) { }
+static inline void vtime_account_softirq(struct task_struct *tsk) { }
+static inline void vtime_account_hardirq(struct task_struct *tsk) { }
static inline void vtime_flush(struct task_struct *tsk) { }
#endif
#ifdef CONFIG_IRQ_TIME_ACCOUNTING
-extern void irqtime_account_irq(struct task_struct *tsk);
+extern void irqtime_account_irq(struct task_struct *tsk, unsigned int offset);
#else
-static inline void irqtime_account_irq(struct task_struct *tsk) { }
+static inline void irqtime_account_irq(struct task_struct *tsk, unsigned int offset) { }
#endif
-static inline void account_irq_enter_time(struct task_struct *tsk)
+static inline void account_softirq_enter(struct task_struct *tsk)
{
- vtime_account_irq(tsk);
- irqtime_account_irq(tsk);
+ vtime_account_irq(tsk, SOFTIRQ_OFFSET);
+ irqtime_account_irq(tsk, SOFTIRQ_OFFSET);
}
-static inline void account_irq_exit_time(struct task_struct *tsk)
+static inline void account_softirq_exit(struct task_struct *tsk)
{
- vtime_account_irq(tsk);
- irqtime_account_irq(tsk);
+ vtime_account_softirq(tsk);
+ irqtime_account_irq(tsk, 0);
+}
+
+static inline void account_hardirq_enter(struct task_struct *tsk)
+{
+ vtime_account_irq(tsk, HARDIRQ_OFFSET);
+ irqtime_account_irq(tsk, HARDIRQ_OFFSET);
+}
+
+static inline void account_hardirq_exit(struct task_struct *tsk)
+{
+ vtime_account_hardirq(tsk);
+ irqtime_account_irq(tsk, 0);
}
#endif /* _LINUX_KERNEL_VTIME_H */
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 02163d4..5f61165 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -44,12 +44,13 @@ static void irqtime_account_delta(struct irqtime *irqtime, u64 delta,
}
/*
- * Called before incrementing preempt_count on {soft,}irq_enter
+ * Called after incrementing preempt_count on {soft,}irq_enter
* and before decrementing preempt_count on {soft,}irq_exit.
*/
-void irqtime_account_irq(struct task_struct *curr)
+void irqtime_account_irq(struct task_struct *curr, unsigned int offset)
{
struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
+ unsigned int pc;
s64 delta;
int cpu;
@@ -59,6 +60,7 @@ void irqtime_account_irq(struct task_struct *curr)
cpu = smp_processor_id();
delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
irqtime->irq_start_time += delta;
+ pc = preempt_count() - offset;
/*
* We do not account for softirq time from ksoftirqd here.
@@ -66,9 +68,9 @@ void irqtime_account_irq(struct task_struct *curr)
* in that case, so as not to confuse scheduler with a special task
* that do not consume any time, but still wants to run.
*/
- if (hardirq_count())
+ if (pc & HARDIRQ_MASK)
irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
- else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
+ else if ((pc & SOFTIRQ_OFFSET) && curr != this_cpu_ksoftirqd())
irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
}
@@ -417,11 +419,13 @@ void vtime_task_switch(struct task_struct *prev)
}
# endif
-void vtime_account_irq(struct task_struct *tsk)
+void vtime_account_irq(struct task_struct *tsk, unsigned int offset)
{
- if (hardirq_count()) {
+ unsigned int pc = preempt_count() - offset;
+
+ if (pc & HARDIRQ_OFFSET) {
vtime_account_hardirq(tsk);
- } else if (in_serving_softirq()) {
+ } else if (pc & SOFTIRQ_OFFSET) {
vtime_account_softirq(tsk);
} else if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) &&
is_idle_task(tsk)) {
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 617009c..b8f42b3 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -315,10 +315,10 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
current->flags &= ~PF_MEMALLOC;
pending = local_softirq_pending();
- account_irq_enter_time(current);
__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
in_hardirq = lockdep_softirq_start();
+ account_softirq_enter(current);
restart:
/* Reset the pending bitmask before enabling irqs */
@@ -365,8 +365,8 @@ restart:
wakeup_softirqd();
}
+ account_softirq_exit(current);
lockdep_softirq_end(in_hardirq);
- account_irq_exit_time(current);
__local_bh_enable(SOFTIRQ_OFFSET);
WARN_ON_ONCE(in_interrupt());
current_restore_flags(old_flags, PF_MEMALLOC);
@@ -418,7 +418,7 @@ static inline void __irq_exit_rcu(void)
#else
lockdep_assert_irqs_disabled();
#endif
- account_irq_exit_time(current);
+ account_hardirq_exit(current);
preempt_count_sub(HARDIRQ_OFFSET);
if (!in_interrupt() && local_softirq_pending())
invoke_softirq();
* Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation
2020-12-02 11:57 ` [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation Frederic Weisbecker
2020-12-02 12:36 ` Peter Zijlstra
2020-12-02 19:23 ` [tip: irq/core] " tip-bot2 for Frederic Weisbecker
@ 2020-12-28 2:15 ` Qais Yousef
2020-12-29 13:41 ` Frederic Weisbecker
2 siblings, 1 reply; 28+ messages in thread
From: Qais Yousef @ 2020-12-28 2:15 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Thomas Gleixner, Peter Zijlstra, LKML, Tony Luck, Vasily Gorbik,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Christian Borntraeger, Fenghua Yu, Heiko Carstens
Hi Frederic
On 12/02/20 12:57, Frederic Weisbecker wrote:
> #endif /* _LINUX_KERNEL_VTIME_H */
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 02163d4260d7..5f611658eeab 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -44,12 +44,13 @@ static void irqtime_account_delta(struct irqtime *irqtime, u64 delta,
> }
>
> /*
> - * Called before incrementing preempt_count on {soft,}irq_enter
> + * Called after incrementing preempt_count on {soft,}irq_enter
> * and before decrementing preempt_count on {soft,}irq_exit.
> */
> -void irqtime_account_irq(struct task_struct *curr)
> +void irqtime_account_irq(struct task_struct *curr, unsigned int offset)
> {
> struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
> + unsigned int pc;
> s64 delta;
> int cpu;
>
> @@ -59,6 +60,7 @@ void irqtime_account_irq(struct task_struct *curr)
> cpu = smp_processor_id();
> delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
> irqtime->irq_start_time += delta;
> + pc = preempt_count() - offset;
>
> /*
> * We do not account for softirq time from ksoftirqd here.
> @@ -66,9 +68,9 @@ void irqtime_account_irq(struct task_struct *curr)
> * in that case, so as not to confuse scheduler with a special task
> * that do not consume any time, but still wants to run.
> */
> - if (hardirq_count())
> + if (pc & HARDIRQ_MASK)
> irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
> - else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
> + else if ((pc & SOFTIRQ_OFFSET) && curr != this_cpu_ksoftirqd())
Noob question. Why, for SOFTIRQs, do we do softirq_count() & *SOFTIRQ_OFFSET*? It
seems we're in-softirq only if the count is odd-numbered.
/me tries to dig more
Hmm could it be because the softirq count is actually 1 bit and the rest is
for SOFTIRQ_DISABLE_OFFSET (BH disabled)?
IOW, 1 bit is for 'we're in softirq context', and the remaining 7 bits are to
count BH disable nesting, right?
I guess this would make sense; we don't nest softirq processing AFAIK. But
I could be misreading the code too :-)
> irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
> }
>
> @@ -417,11 +419,13 @@ void vtime_task_switch(struct task_struct *prev)
> }
> # endif
>
> -void vtime_account_irq(struct task_struct *tsk)
> +void vtime_account_irq(struct task_struct *tsk, unsigned int offset)
> {
> - if (hardirq_count()) {
> + unsigned int pc = preempt_count() - offset;
> +
> + if (pc & HARDIRQ_OFFSET) {
Shouldn't this be HARDIRQ_MASK like above?
> vtime_account_hardirq(tsk);
> - } else if (in_serving_softirq()) {
> + } else if (pc & SOFTIRQ_OFFSET) {
> vtime_account_softirq(tsk);
> } else if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) &&
> is_idle_task(tsk)) {
Thanks
--
Qais Yousef
* Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation
2020-12-28 2:15 ` [PATCH 4/5] " Qais Yousef
@ 2020-12-29 13:41 ` Frederic Weisbecker
2020-12-29 14:12 ` Qais Yousef
0 siblings, 1 reply; 28+ messages in thread
From: Frederic Weisbecker @ 2020-12-29 13:41 UTC (permalink / raw)
To: Qais Yousef
Cc: Thomas Gleixner, Peter Zijlstra, LKML, Tony Luck, Vasily Gorbik,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Christian Borntraeger, Fenghua Yu, Heiko Carstens
On Mon, Dec 28, 2020 at 02:15:29AM +0000, Qais Yousef wrote:
> Hi Frederic
>
> On 12/02/20 12:57, Frederic Weisbecker wrote:
> > @@ -66,9 +68,9 @@ void irqtime_account_irq(struct task_struct *curr)
> > * in that case, so as not to confuse scheduler with a special task
> > * that do not consume any time, but still wants to run.
> > */
> > - if (hardirq_count())
> > + if (pc & HARDIRQ_MASK)
> > irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
> > - else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
> > + else if ((pc & SOFTIRQ_OFFSET) && curr != this_cpu_ksoftirqd())
>
> Noob question. Why, for SOFTIRQs, do we do softirq_count() & *SOFTIRQ_OFFSET*? It
> seems we're in-softirq only if the count is odd-numbered.
>
> /me tries to dig more
>
> Hmm could it be because the softirq count is actually 1 bit and the rest is
> for SOFTIRQ_DISABLE_OFFSET (BH disabled)?
Exactly!
>
> IOW, 1 bit is for 'we're in softirq context', and the remaining 7 bits are to
> count BH disable nesting, right?
>
> I guess this would make sense; we don't nest softirq processing AFAIK. But
> I could be misreading the code too :-)
You got it right!
This is commented in softirq.c somewhere:
/*
* preempt_count and SOFTIRQ_OFFSET usage:
* - preempt_count is changed by SOFTIRQ_OFFSET on entering or leaving
* softirq processing.
* - preempt_count is changed by SOFTIRQ_DISABLE_OFFSET (= 2 * SOFTIRQ_OFFSET)
* on local_bh_disable or local_bh_enable.
* This lets us distinguish between whether we are currently processing
* softirq and whether we just have bh disabled.
*/
But we should elaborate on the fact that, indeed, softirq processing can't nest,
while softirq disablement can. I should try to send a patch and comment more
thoroughly on the subtleties of the preempt mask in preempt.h.
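In the meantime, here is a tiny userspace sketch of the SOFTIRQ field,
assuming the bit layout from include/linux/preempt.h (the numbers are the
point, the demo itself is made up):
    #include <stdio.h>

    /* SOFTIRQ field of preempt_count: bits 8-15 */
    #define SOFTIRQ_SHIFT           8
    #define SOFTIRQ_OFFSET          (1U << SOFTIRQ_SHIFT)   /* serving bit */
    #define SOFTIRQ_DISABLE_OFFSET  (2 * SOFTIRQ_OFFSET)    /* one BH disable */
    #define SOFTIRQ_MASK            (0xffU << SOFTIRQ_SHIFT)

    int main(void)
    {
        /* serving a softirq under two nested local_bh_disable() */
        unsigned int pc = SOFTIRQ_OFFSET + 2 * SOFTIRQ_DISABLE_OFFSET;

        /* in_softirq(): any softirq bit (serving OR bh disabled) */
        printf("in_softirq():         %#x\n", pc & SOFTIRQ_MASK);
        /* in_serving_softirq(): only the low, non-nesting bit */
        printf("in_serving_softirq(): %#x\n", pc & SOFTIRQ_OFFSET);
        return 0;
    }
Since softirq servicing can't nest, that single low bit is enough, and
everything above it is free to count the bh-disable depth.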
>
> > irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
> > }
> >
> > @@ -417,11 +419,13 @@ void vtime_task_switch(struct task_struct *prev)
> > }
> > # endif
> >
> > -void vtime_account_irq(struct task_struct *tsk)
> > +void vtime_account_irq(struct task_struct *tsk, unsigned int offset)
> > {
> > - if (hardirq_count()) {
> > + unsigned int pc = preempt_count() - offset;
> > +
> > + if (pc & HARDIRQ_OFFSET) {
>
> Shouldn't this be HARDIRQ_MASK like above?
In the rare cases of nested hardirqs happening with broken drivers, only the
outer hardirq matters. All the time spent in the inner hardirqs is included in
the outer one.
Thanks.
>
> > vtime_account_hardirq(tsk);
> > - } else if (in_serving_softirq()) {
> > + } else if (pc & SOFTIRQ_OFFSET) {
> > vtime_account_softirq(tsk);
> > } else if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) &&
> > is_idle_task(tsk)) {
>
> Thanks
>
> --
> Qais Yousef
* Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation
2020-12-29 13:41 ` Frederic Weisbecker
@ 2020-12-29 14:12 ` Qais Yousef
2020-12-29 14:30 ` Frederic Weisbecker
0 siblings, 1 reply; 28+ messages in thread
From: Qais Yousef @ 2020-12-29 14:12 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Thomas Gleixner, Peter Zijlstra, LKML, Tony Luck, Vasily Gorbik,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Christian Borntraeger, Fenghua Yu, Heiko Carstens
On 12/29/20 14:41, Frederic Weisbecker wrote:
> On Mon, Dec 28, 2020 at 02:15:29AM +0000, Qais Yousef wrote:
> > Hi Frederic
> >
> > On 12/02/20 12:57, Frederic Weisbecker wrote:
> > > @@ -66,9 +68,9 @@ void irqtime_account_irq(struct task_struct *curr)
> > > * in that case, so as not to confuse scheduler with a special task
> > > * that do not consume any time, but still wants to run.
> > > */
> > > - if (hardirq_count())
> > > + if (pc & HARDIRQ_MASK)
> > > irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
> > > - else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
> > > + else if ((pc & SOFTIRQ_OFFSET) && curr != this_cpu_ksoftirqd())
> >
> > Noob question. Why, for SOFTIRQs, do we do softirq_count() & *SOFTIRQ_OFFSET*? It
> > seems we're in-softirq only if the count is odd-numbered.
> >
> > /me tries to dig more
> >
> > Hmm could it be because the softirq count is actually 1 bit and the rest is
> > for SOFTIRQ_DISABLE_OFFSET (BH disabled)?
>
> Exactly!
>
> >
> > IOW, 1 bit is for 'we're in softirq context', and the remaining 7 bits are to
> > count BH disable nesting, right?
> >
> > I guess this would make sense; we don't nest softirq processing AFAIK. But
> > I could be misreading the code too :-)
>
> You got it right!
>
> This is commented in softirq.c somewhere:
>
> /*
> * preempt_count and SOFTIRQ_OFFSET usage:
> * - preempt_count is changed by SOFTIRQ_OFFSET on entering or leaving
> * softirq processing.
> * - preempt_count is changed by SOFTIRQ_DISABLE_OFFSET (= 2 * SOFTIRQ_OFFSET)
> * on local_bh_disable or local_bh_enable.
> * This lets us distinguish between whether we are currently processing
> * softirq and whether we just have bh disabled.
> */
>
> But we should elaborate on the fact that, indeed, softirq processing can't nest,
> while softirq disablement can. I should try to send a patch and comment more
> thoroughly on the subtleties of the preempt mask in preempt.h.
Thanks for the info!
>
> >
> > > irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
> > > }
> > >
> > > @@ -417,11 +419,13 @@ void vtime_task_switch(struct task_struct *prev)
> > > }
> > > # endif
> > >
> > > -void vtime_account_irq(struct task_struct *tsk)
> > > +void vtime_account_irq(struct task_struct *tsk, unsigned int offset)
> > > {
> > > - if (hardirq_count()) {
> > > + unsigned int pc = preempt_count() - offset;
> > > +
> > > + if (pc & HARDIRQ_OFFSET) {
> >
> > Shouldn't this be HARDIRQ_MASK like above?
>
> In the rare cases of nested hardirqs happening with broken drivers, only the
> outer hardirq matters. All the time spent in the inner hardirqs is included in
> the outer one.
Ah I see. The original code was doing hardirq_count(), which apparently wasn't
right either.
Shouldn't it be pc == HARDIRQ_OFFSET then? All odd nest counts will trigger
this otherwise, and IIUC we want this to trigger once on first entry only.
Thanks
--
Qais Yousef
* Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation
2020-12-29 14:12 ` Qais Yousef
@ 2020-12-29 14:30 ` Frederic Weisbecker
2020-12-29 15:58 ` Qais Yousef
0 siblings, 1 reply; 28+ messages in thread
From: Frederic Weisbecker @ 2020-12-29 14:30 UTC (permalink / raw)
To: Qais Yousef
Cc: Thomas Gleixner, Peter Zijlstra, LKML, Tony Luck, Vasily Gorbik,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Christian Borntraeger, Fenghua Yu, Heiko Carstens
On Tue, Dec 29, 2020 at 02:12:31PM +0000, Qais Yousef wrote:
> On 12/29/20 14:41, Frederic Weisbecker wrote:
> > > > -void vtime_account_irq(struct task_struct *tsk)
> > > > +void vtime_account_irq(struct task_struct *tsk, unsigned int offset)
> > > > {
> > > > - if (hardirq_count()) {
> > > > + unsigned int pc = preempt_count() - offset;
> > > > +
> > > > + if (pc & HARDIRQ_OFFSET) {
> > >
> > > Shouldn't this be HARDIRQ_MASK like above?
> >
> > In the rare cases of nested hardirqs happening with broken drivers, only the
> > outer hardirq matters. All the time spent in the inner hardirqs is included in
> > the outer one.
>
> Ah I see. The original code was doing hardirq_count(), which apparently wasn't
> right either.
>
> Shouldn't it be pc == HARDIRQ_OFFSET then? All odd nest counts will trigger
> this otherwise, and IIUC we want this to trigger once on first entry only.
Right, but we must also handle hardirqs interrupting either preempt-disabled
sections or softirq servicing/disabled sections.
3 stacked hardirqs should be rare enough that we don't really care. In the
worst case we are going to account the third IRQ separately. Not a correctness
issue, just a rare unoptimized case.
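To make that concrete, with HARDIRQ_OFFSET == 0x10000 and HARDIRQ_MASK ==
0xf0000 (a worked example, not code from the patch):
    depth 1: pc & HARDIRQ_MASK == 0x10000 -> HARDIRQ_OFFSET bit set:   account
    depth 2: pc & HARDIRQ_MASK == 0x20000 -> HARDIRQ_OFFSET bit clear: fold into outer
    depth 3: pc & HARDIRQ_MASK == 0x30000 -> HARDIRQ_OFFSET bit set:   account separately
So only the odd depths hit the branch, which is exactly the first level in
the common, non-broken case.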
>
> Thanks
>
> --
> Qais Yousef
* Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation
2020-12-29 14:30 ` Frederic Weisbecker
@ 2020-12-29 15:58 ` Qais Yousef
0 siblings, 0 replies; 28+ messages in thread
From: Qais Yousef @ 2020-12-29 15:58 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Thomas Gleixner, Peter Zijlstra, LKML, Tony Luck, Vasily Gorbik,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Christian Borntraeger, Fenghua Yu, Heiko Carstens
On 12/29/20 15:30, Frederic Weisbecker wrote:
> On Tue, Dec 29, 2020 at 02:12:31PM +0000, Qais Yousef wrote:
> > On 12/29/20 14:41, Frederic Weisbecker wrote:
> > > > > -void vtime_account_irq(struct task_struct *tsk)
> > > > > +void vtime_account_irq(struct task_struct *tsk, unsigned int offset)
> > > > > {
> > > > > - if (hardirq_count()) {
> > > > > + unsigned int pc = preempt_count() - offset;
> > > > > +
> > > > > + if (pc & HARDIRQ_OFFSET) {
> > > >
> > > > Shouldn't this be HARDIRQ_MASK like above?
> > >
> > > In the rare cases of nested hardirqs happening with broken drivers, only the
> > > outer hardirq matters. All the time spent in the inner hardirqs is included in
> > > the outer one.
> >
> > Ah I see. The original code was doing hardirq_count(), which apparently wasn't
> > right either.
> >
> > Shouldn't it be pc == HARDIRQ_OFFSET then? All odd nest counts will trigger
> > this otherwise, and IIUC we want this to trigger once on first entry only.
>
> Right, but we must also handle hardirqs interrupting either preempt-disabled
> sections or softirq servicing/disabled sections.
>
> 3 stacked hardirqs should be rare enough that we don't really care. In the
> worst case we are going to account the third IRQ separately. Not a correctness
> issue, just a rare unoptimized case.
I admit I need to wrap my head around some more details to fully comprehend
that, but that's my own confusion to clear out :-)
Thanks for your answer.
Cheers
--
Qais Yousef
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 5/5] irq: Call tick_irq_enter() inside HARDIRQ_OFFSET
2020-12-02 11:57 [PATCH 0/5] irq: Reorder time handling against HARDIRQ_OFFSET on IRQ entry v3 Frederic Weisbecker
` (3 preceding siblings ...)
2020-12-02 11:57 ` [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation Frederic Weisbecker
@ 2020-12-02 11:57 ` Frederic Weisbecker
2020-12-02 19:23 ` [tip: irq/core] " tip-bot2 for Frederic Weisbecker
4 siblings, 1 reply; 28+ messages in thread
From: Frederic Weisbecker @ 2020-12-02 11:57 UTC (permalink / raw)
To: Thomas Gleixner, Peter Zijlstra
Cc: LKML, Frederic Weisbecker, Tony Luck, Vasily Gorbik,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Christian Borntraeger, Fenghua Yu, Heiko Carstens
Now that account_hardirq_enter() is called after HARDIRQ_OFFSET has
been incremented, there is nothing left that prevents us from also
moving tick_irq_enter() after HARDIRQ_OFFSET is incremented.
The desired outcome is to remove the nasty hack that prevents softirqs
from being raised through ksoftirqd instead of the hardirq bottom half.
Also tick_irq_enter() then becomes appropriately covered by lockdep.
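For reference, the first-level check used below relies on irq_count()
folding the HARDIRQ, SOFTIRQ and NMI fields together (a sketch of the
reasoning under the usual preempt.h layout, not new code):
    /*
     * After __irq_enter_raw() on a hardirq that interrupted the idle
     * loop outside any softirq servicing or bh-disabled section, the
     * HARDIRQ field holds exactly one level and the SOFTIRQ/NMI fields
     * are empty, so:
     *
     *     irq_count() == preempt_count() & (HARDIRQ_MASK |
     *                    SOFTIRQ_MASK | NMI_MASK)
     *                 == HARDIRQ_OFFSET
     *
     * Any nested hardirq, softirq servicing or bh-disable makes the
     * comparison fail, much as !in_interrupt() did before the increment.
     */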
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
---
kernel/softirq.c | 14 +++++---------
1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/kernel/softirq.c b/kernel/softirq.c
index b8f42b3ba8ca..d5bfd5e661fc 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -377,16 +377,12 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
*/
void irq_enter_rcu(void)
{
- if (is_idle_task(current) && !in_interrupt()) {
- /*
- * Prevent raise_softirq from needlessly waking up ksoftirqd
- * here, as softirq will be serviced on return from interrupt.
- */
- local_bh_disable();
+ __irq_enter_raw();
+
+ if (is_idle_task(current) && (irq_count() == HARDIRQ_OFFSET))
tick_irq_enter();
- _local_bh_enable();
- }
- __irq_enter();
+
+ account_hardirq_enter(current);
}
/**
--
2.25.1
* [tip: irq/core] irq: Call tick_irq_enter() inside HARDIRQ_OFFSET
2020-12-02 11:57 ` [PATCH 5/5] irq: Call tick_irq_enter() inside HARDIRQ_OFFSET Frederic Weisbecker
@ 2020-12-02 19:23 ` tip-bot2 for Frederic Weisbecker
0 siblings, 0 replies; 28+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2020-12-02 19:23 UTC (permalink / raw)
To: linux-tip-commits
Cc: Frederic Weisbecker, Thomas Gleixner, x86, linux-kernel, maz
The following commit has been merged into the irq/core branch of tip:
Commit-ID: d14ce74f1fb376ccbbc0b05ded477ada51253729
Gitweb: https://git.kernel.org/tip/d14ce74f1fb376ccbbc0b05ded477ada51253729
Author: Frederic Weisbecker <frederic@kernel.org>
AuthorDate: Wed, 02 Dec 2020 12:57:32 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 02 Dec 2020 20:20:05 +01:00
irq: Call tick_irq_enter() inside HARDIRQ_OFFSET
Now that account_hardirq_enter() is called after HARDIRQ_OFFSET has
been incremented, there is nothing left that prevents us from also
moving tick_irq_enter() after HARDIRQ_OFFSET is incremented.
The desired outcome is to remove the nasty hack that prevents softirqs
from being raised through ksoftirqd instead of the hardirq bottom half.
Also tick_irq_enter() then becomes appropriately covered by lockdep.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201202115732.27827-6-frederic@kernel.org
---
kernel/softirq.c | 14 +++++---------
1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/kernel/softirq.c b/kernel/softirq.c
index b8f42b3..d5bfd5e 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -377,16 +377,12 @@ restart:
*/
void irq_enter_rcu(void)
{
- if (is_idle_task(current) && !in_interrupt()) {
- /*
- * Prevent raise_softirq from needlessly waking up ksoftirqd
- * here, as softirq will be serviced on return from interrupt.
- */
- local_bh_disable();
+ __irq_enter_raw();
+
+ if (is_idle_task(current) && (irq_count() == HARDIRQ_OFFSET))
tick_irq_enter();
- _local_bh_enable();
- }
- __irq_enter();
+
+ account_hardirq_enter(current);
}
/**