All of lore.kernel.org
 help / color / mirror / Atom feed
* [BUG nohz]: wrong user and system time accounting
@ 2017-03-23 20:55 Luiz Capitulino
  2017-03-24  0:56 ` Rik van Riel
                   ` (3 more replies)
  0 siblings, 4 replies; 67+ messages in thread
From: Luiz Capitulino @ 2017-03-23 20:55 UTC (permalink / raw)
  To: fweisbec; +Cc: riel, linux-kernel, linux-rt-users


When there are two or more tasks executing in user-space and
taking 100% of a nohz_full CPU, top reports 70% system time
and 30% user time utilization. Sometimes I'm even able to get
100% system time and 0% user time.

This was reproduced with latest Linus tree (093b995), but I
don't believe it's a regression (at least not a recent one)
as I can reproduce it with older kernels. Also, I have
CONFIG_IRQ_TIME_ACCOUNTING=y and haven't tried to reproduce
without it yet.

Below you'll find the steps to reproduce and some initial
analysis.

Steps to reproduce
------------------

1. Set up a CPU for nohz_full with isolcpus= nohz_full=

2. Pin two tasks that hog the CPU 100% of the time to that CPU

3. Run top -d1 and check system time

NOTE: When there's only one task hogging a nohz_full CPU, top
      shows 100% user-time, as expected

Initial analysis
----------------

When tracing vtime accounting functions and the user-space/kernel
transitions when the issue is taking place, I see several of the
following:

hog-10552 [015]  1132.711104: function:             enter_from_user_mode <-- apic_timer_interrupt
hog-10552 [015]  1132.711105: function:             __context_tracking_exit <-- enter_from_user_mode
hog-10552 [015]  1132.711105: bprint:               __context_tracking_exit.part.4: new state=1 cur state=1 active=1
hog-10552 [015]  1132.711105: function:             vtime_account_user <-- __context_tracking_exit.part.4
hog-10552 [015]  1132.711105: function:             smp_apic_timer_interrupt <-- apic_timer_interrupt
hog-10552 [015]  1132.711106: function:             irq_enter <-- smp_apic_timer_interrupt
hog-10552 [015]  1132.711106: function:             tick_sched_timer <-- __hrtimer_run_queues
hog-10552 [015]  1132.711108: function:             irq_exit <-- smp_apic_timer_interrupt
hog-10552 [015]  1132.711108: function:             __context_tracking_enter <-- prepare_exit_to_usermode
hog-10552 [015]  1132.711108: bprint:               __context_tracking_enter.part.2: new state=1 cur state=0 active=1
hog-10552 [015]  1132.711109: function:             vtime_user_enter <-- __context_tracking_enter.part.2
hog-10552 [015]  1132.711109: function:             __vtime_account_system <-- vtime_user_enter
hog-10552 [015]  1132.711109: function:             account_system_time <-- __vtime_account_system

On entering the kernel due to a timer interrupt, vtime_account_user()
skips user-time accounting. Then later on when returning to user-space,
vtime_user_enter() is probably accounting the whole time (ie. user-space
plus kernel-space) to system time.

Now, when does vtime_account_user() skips accounting? Well, when the
time delta is less then one jiffie. This would imply that vtime_account_user()
is being called less than one jiffie since the last accounting, but I haven't
confirmed any of this yet.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-23 20:55 [BUG nohz]: wrong user and system time accounting Luiz Capitulino
@ 2017-03-24  0:56 ` Rik van Riel
  2017-03-24  1:05   ` Luiz Capitulino
  2017-03-27  5:33   ` lkml
  2017-03-24  1:52 ` Wanpeng Li
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 67+ messages in thread
From: Rik van Riel @ 2017-03-24  0:56 UTC (permalink / raw)
  To: Luiz Capitulino, fweisbec; +Cc: linux-kernel, linux-rt-users

On Thu, 2017-03-23 at 16:55 -0400, Luiz Capitulino wrote:
> When there are two or more tasks executing in user-space and
> taking 100% of a nohz_full CPU, top reports 70% system time
> and 30% user time utilization. Sometimes I'm even able to get
> 100% system time and 0% user time.
> 
> This was reproduced with latest Linus tree (093b995), but I
> don't believe it's a regression (at least not a recent one)
> as I can reproduce it with older kernels. Also, I have
> CONFIG_IRQ_TIME_ACCOUNTING=y and haven't tried to reproduce
> without it yet.
> 
> Below you'll find the steps to reproduce and some initial
> analysis.
> 
> Steps to reproduce
> ------------------
> 
> 1. Set up a CPU for nohz_full with isolcpus= nohz_full=
> 
> 2. Pin two tasks that hog the CPU 100% of the time to that CPU
> 
> 3. Run top -d1 and check system time
> 
> NOTE: When there's only one task hogging a nohz_full CPU, top
>       shows 100% user-time, as expected
> 
> Initial analysis
> ----------------
> 
> When tracing vtime accounting functions and the user-space/kernel
> transitions when the issue is taking place, I see several of the
> following:
> 
> hog-10552 [015]  1132.711104:
> function:             enter_from_user_mode <-- apic_timer_interrupt
> hog-10552 [015]  1132.711105:
> function:             __context_tracking_exit <--
> enter_from_user_mode
> hog-10552 [015]  1132.711105:
> bprint:               __context_tracking_exit.part.4: new state=1 cur
> state=1 active=1
> hog-10552 [015]  1132.711105:
> function:             vtime_account_user <--
> __context_tracking_exit.part.4
> hog-10552 [015]  1132.711105:
> function:             smp_apic_timer_interrupt <--
> apic_timer_interrupt
> hog-10552 [015]  1132.711106: function:             irq_enter <--
> smp_apic_timer_interrupt
> hog-10552 [015]  1132.711106: function:             tick_sched_timer
> <-- __hrtimer_run_queues
> hog-10552 [015]  1132.711108: function:             irq_exit <--
> smp_apic_timer_interrupt
> hog-10552 [015]  1132.711108:
> function:             __context_tracking_enter <--
> prepare_exit_to_usermode
> hog-10552 [015]  1132.711108:
> bprint:               __context_tracking_enter.part.2: new state=1
> cur state=0 active=1
> hog-10552 [015]  1132.711109: function:             vtime_user_enter
> <-- __context_tracking_enter.part.2
> hog-10552 [015]  1132.711109:
> function:             __vtime_account_system <-- vtime_user_enter
> hog-10552 [015]  1132.711109:
> function:             account_system_time <-- __vtime_account_system
> 
> On entering the kernel due to a timer interrupt, vtime_account_user()
> skips user-time accounting. Then later on when returning to user-
> space,
> vtime_user_enter() is probably accounting the whole time (ie. user-
> space
> plus kernel-space) to system time.
> 
> Now, when does vtime_account_user() skips accounting? Well, when the
> time delta is less then one jiffie. This would imply that
> vtime_account_user()
> is being called less than one jiffie since the last accounting, but I
> haven't
> confirmed any of this yet.

Jiffies should be advanced by the timer interrupt, on the
housekeeping CPU, which is not doing context tracking.

Why is the isolated/nohz_full CPU receiving timer interrupts
at all?

I thought it would not, but obviously I am wrong. What is
going on here?

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-24  0:56 ` Rik van Riel
@ 2017-03-24  1:05   ` Luiz Capitulino
  2017-03-24  1:08     ` Rik van Riel
  2017-03-27  5:33   ` lkml
  1 sibling, 1 reply; 67+ messages in thread
From: Luiz Capitulino @ 2017-03-24  1:05 UTC (permalink / raw)
  To: Rik van Riel; +Cc: fweisbec, linux-kernel, linux-rt-users

On Thu, 23 Mar 2017 20:56:02 -0400
Rik van Riel <riel@redhat.com> wrote:

> On Thu, 2017-03-23 at 16:55 -0400, Luiz Capitulino wrote:
> > When there are two or more tasks executing in user-space and
> > taking 100% of a nohz_full CPU, top reports 70% system time
> > and 30% user time utilization. Sometimes I'm even able to get
> > 100% system time and 0% user time.
> > 
> > This was reproduced with latest Linus tree (093b995), but I
> > don't believe it's a regression (at least not a recent one)
> > as I can reproduce it with older kernels. Also, I have
> > CONFIG_IRQ_TIME_ACCOUNTING=y and haven't tried to reproduce
> > without it yet.
> > 
> > Below you'll find the steps to reproduce and some initial
> > analysis.
> > 
> > Steps to reproduce
> > ------------------
> > 
> > 1. Set up a CPU for nohz_full with isolcpus= nohz_full=
> > 
> > 2. Pin two tasks that hog the CPU 100% of the time to that CPU
> > 
> > 3. Run top -d1 and check system time
> > 
> > NOTE: When there's only one task hogging a nohz_full CPU, top
> >       shows 100% user-time, as expected
> > 
> > Initial analysis
> > ----------------
> > 
> > When tracing vtime accounting functions and the user-space/kernel
> > transitions when the issue is taking place, I see several of the
> > following:
> > 
> > hog-10552 [015]  1132.711104:
> > function:             enter_from_user_mode <-- apic_timer_interrupt
> > hog-10552 [015]  1132.711105:
> > function:             __context_tracking_exit <--
> > enter_from_user_mode
> > hog-10552 [015]  1132.711105:
> > bprint:               __context_tracking_exit.part.4: new state=1 cur
> > state=1 active=1
> > hog-10552 [015]  1132.711105:
> > function:             vtime_account_user <--
> > __context_tracking_exit.part.4
> > hog-10552 [015]  1132.711105:
> > function:             smp_apic_timer_interrupt <--
> > apic_timer_interrupt
> > hog-10552 [015]  1132.711106: function:             irq_enter <--
> > smp_apic_timer_interrupt
> > hog-10552 [015]  1132.711106: function:             tick_sched_timer
> > <-- __hrtimer_run_queues
> > hog-10552 [015]  1132.711108: function:             irq_exit <--
> > smp_apic_timer_interrupt
> > hog-10552 [015]  1132.711108:
> > function:             __context_tracking_enter <--
> > prepare_exit_to_usermode
> > hog-10552 [015]  1132.711108:
> > bprint:               __context_tracking_enter.part.2: new state=1
> > cur state=0 active=1
> > hog-10552 [015]  1132.711109: function:             vtime_user_enter
> > <-- __context_tracking_enter.part.2
> > hog-10552 [015]  1132.711109:
> > function:             __vtime_account_system <-- vtime_user_enter
> > hog-10552 [015]  1132.711109:
> > function:             account_system_time <-- __vtime_account_system
> > 
> > On entering the kernel due to a timer interrupt, vtime_account_user()
> > skips user-time accounting. Then later on when returning to user-
> > space,
> > vtime_user_enter() is probably accounting the whole time (ie. user-
> > space
> > plus kernel-space) to system time.
> > 
> > Now, when does vtime_account_user() skips accounting? Well, when the
> > time delta is less then one jiffie. This would imply that
> > vtime_account_user()
> > is being called less than one jiffie since the last accounting, but I
> > haven't
> > confirmed any of this yet.  
> 
> Jiffies should be advanced by the timer interrupt, on the
> housekeeping CPU, which is not doing context tracking.

The hypothesis isn't that it wasn't advanced, but that we stayed in
user-space less than 1ms.

> Why is the isolated/nohz_full CPU receiving timer interrupts
> at all?
> 
> I thought it would not, but obviously I am wrong. What is
> going on here?

There are two runnable SCHED_OTHER tasks on the nohz_full CPU. When
that happens, the tick is re-activated. We're not nohz_full anymore,
but accounting should still work.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-24  1:05   ` Luiz Capitulino
@ 2017-03-24  1:08     ` Rik van Riel
  2017-03-24  1:39       ` Luiz Capitulino
  0 siblings, 1 reply; 67+ messages in thread
From: Rik van Riel @ 2017-03-24  1:08 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: fweisbec, linux-kernel, linux-rt-users

On Thu, 2017-03-23 at 21:05 -0400, Luiz Capitulino wrote:
> On Thu, 23 Mar 2017 20:56:02 -0400
> Rik van Riel <riel@redhat.com> wrote:
> 
> > On Thu, 2017-03-23 at 16:55 -0400, Luiz Capitulino wrote:
> > > When there are two or more tasks executing in user-space and
> > > taking 100% of a nohz_full CPU, top reports 70% system time
> > > and 30% user time utilization. Sometimes I'm even able to get
> > > 100% system time and 0% user time.
> > > 
> > > This was reproduced with latest Linus tree (093b995), but I
> > > don't believe it's a regression (at least not a recent one)
> > > as I can reproduce it with older kernels. Also, I have
> > > CONFIG_IRQ_TIME_ACCOUNTING=y and haven't tried to reproduce
> > > without it yet.
> > > 
> > > Below you'll find the steps to reproduce and some initial
> > > analysis.
> > > 
> > > Steps to reproduce
> > > ------------------
> > > 
> > > 1. Set up a CPU for nohz_full with isolcpus= nohz_full=
> > > 
> > > 2. Pin two tasks that hog the CPU 100% of the time to that CPU
> > > 
> > > 3. Run top -d1 and check system time
> > > 
> > > NOTE: When there's only one task hogging a nohz_full CPU, top
> > >       shows 100% user-time, as expected
> > > 
> > > Initial analysis
> > > ----------------
> > > 
> > > When tracing vtime accounting functions and the user-space/kernel
> > > transitions when the issue is taking place, I see several of the
> > > following:
> > > 
> > > hog-10552 [015]  1132.711104:
> > > function:             enter_from_user_mode <--
> > > apic_timer_interrupt
> > > hog-10552 [015]  1132.711105:
> > > function:             __context_tracking_exit <--
> > > enter_from_user_mode
> > > hog-10552 [015]  1132.711105:
> > > bprint:               __context_tracking_exit.part.4: new state=1
> > > cur
> > > state=1 active=1
> > > hog-10552 [015]  1132.711105:
> > > function:             vtime_account_user <--
> > > __context_tracking_exit.part.4
> > > hog-10552 [015]  1132.711105:
> > > function:             smp_apic_timer_interrupt <--
> > > apic_timer_interrupt
> > > hog-10552 [015]  1132.711106: function:             irq_enter <--
> > > smp_apic_timer_interrupt
> > > hog-10552 [015]  1132.711106:
> > > function:             tick_sched_timer
> > > <-- __hrtimer_run_queues
> > > hog-10552 [015]  1132.711108: function:             irq_exit <--
> > > smp_apic_timer_interrupt
> > > hog-10552 [015]  1132.711108:
> > > function:             __context_tracking_enter <--
> > > prepare_exit_to_usermode
> > > hog-10552 [015]  1132.711108:
> > > bprint:               __context_tracking_enter.part.2: new
> > > state=1
> > > cur state=0 active=1
> > > hog-10552 [015]  1132.711109:
> > > function:             vtime_user_enter
> > > <-- __context_tracking_enter.part.2
> > > hog-10552 [015]  1132.711109:
> > > function:             __vtime_account_system <-- vtime_user_enter
> > > hog-10552 [015]  1132.711109:
> > > function:             account_system_time <--
> > > __vtime_account_system
> > > 
> > > On entering the kernel due to a timer interrupt,
> > > vtime_account_user()
> > > skips user-time accounting. Then later on when returning to user-
> > > space,
> > > vtime_user_enter() is probably accounting the whole time (ie.
> > > user-
> > > space
> > > plus kernel-space) to system time.
> > > 
> > > Now, when does vtime_account_user() skips accounting? Well, when
> > > the
> > > time delta is less then one jiffie. This would imply that
> > > vtime_account_user()
> > > is being called less than one jiffie since the last accounting,
> > > but I
> > > haven't
> > > confirmed any of this yet.  
> > 
> > Jiffies should be advanced by the timer interrupt, on the
> > housekeeping CPU, which is not doing context tracking.
> 
> The hypothesis isn't that it wasn't advanced, but that we stayed in
> user-space less than 1ms.

That is part of the hypothesis. The other part of the hypothesis
involves jiffies advancing on the nohz_full & isolated CPU while
that CPU is in kernel mode 30% of the time.

I have no good explanation for the latter yet...

> > Why is the isolated/nohz_full CPU receiving timer interrupts
> > at all?
> > 
> > I thought it would not, but obviously I am wrong. What is
> > going on here?
> 
> There are two runnable SCHED_OTHER tasks on the nohz_full CPU. When
> that happens, the tick is re-activated. We're not nohz_full anymore,
> but accounting should still work.

Isn't the scheduler tick distinct from the timer interrupt,
or am I confused?

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-24  1:08     ` Rik van Riel
@ 2017-03-24  1:39       ` Luiz Capitulino
  0 siblings, 0 replies; 67+ messages in thread
From: Luiz Capitulino @ 2017-03-24  1:39 UTC (permalink / raw)
  To: Rik van Riel; +Cc: fweisbec, linux-kernel, linux-rt-users

On Thu, 23 Mar 2017 21:08:38 -0400
Rik van Riel <riel@redhat.com> wrote:

> On Thu, 2017-03-23 at 21:05 -0400, Luiz Capitulino wrote:
> > On Thu, 23 Mar 2017 20:56:02 -0400
> > Rik van Riel <riel@redhat.com> wrote:
> >   
> > > On Thu, 2017-03-23 at 16:55 -0400, Luiz Capitulino wrote:  
> > > > When there are two or more tasks executing in user-space and
> > > > taking 100% of a nohz_full CPU, top reports 70% system time
> > > > and 30% user time utilization. Sometimes I'm even able to get
> > > > 100% system time and 0% user time.
> > > > 
> > > > This was reproduced with latest Linus tree (093b995), but I
> > > > don't believe it's a regression (at least not a recent one)
> > > > as I can reproduce it with older kernels. Also, I have
> > > > CONFIG_IRQ_TIME_ACCOUNTING=y and haven't tried to reproduce
> > > > without it yet.
> > > > 
> > > > Below you'll find the steps to reproduce and some initial
> > > > analysis.
> > > > 
> > > > Steps to reproduce
> > > > ------------------
> > > > 
> > > > 1. Set up a CPU for nohz_full with isolcpus= nohz_full=
> > > > 
> > > > 2. Pin two tasks that hog the CPU 100% of the time to that CPU
> > > > 
> > > > 3. Run top -d1 and check system time
> > > > 
> > > > NOTE: When there's only one task hogging a nohz_full CPU, top
> > > >       shows 100% user-time, as expected
> > > > 
> > > > Initial analysis
> > > > ----------------
> > > > 
> > > > When tracing vtime accounting functions and the user-space/kernel
> > > > transitions when the issue is taking place, I see several of the
> > > > following:
> > > > 
> > > > hog-10552 [015]  1132.711104:
> > > > function:             enter_from_user_mode <--
> > > > apic_timer_interrupt
> > > > hog-10552 [015]  1132.711105:
> > > > function:             __context_tracking_exit <--
> > > > enter_from_user_mode
> > > > hog-10552 [015]  1132.711105:
> > > > bprint:               __context_tracking_exit.part.4: new state=1
> > > > cur
> > > > state=1 active=1
> > > > hog-10552 [015]  1132.711105:
> > > > function:             vtime_account_user <--
> > > > __context_tracking_exit.part.4
> > > > hog-10552 [015]  1132.711105:
> > > > function:             smp_apic_timer_interrupt <--
> > > > apic_timer_interrupt
> > > > hog-10552 [015]  1132.711106: function:             irq_enter <--
> > > > smp_apic_timer_interrupt
> > > > hog-10552 [015]  1132.711106:
> > > > function:             tick_sched_timer
> > > > <-- __hrtimer_run_queues
> > > > hog-10552 [015]  1132.711108: function:             irq_exit <--
> > > > smp_apic_timer_interrupt
> > > > hog-10552 [015]  1132.711108:
> > > > function:             __context_tracking_enter <--
> > > > prepare_exit_to_usermode
> > > > hog-10552 [015]  1132.711108:
> > > > bprint:               __context_tracking_enter.part.2: new
> > > > state=1
> > > > cur state=0 active=1
> > > > hog-10552 [015]  1132.711109:
> > > > function:             vtime_user_enter
> > > > <-- __context_tracking_enter.part.2
> > > > hog-10552 [015]  1132.711109:
> > > > function:             __vtime_account_system <-- vtime_user_enter
> > > > hog-10552 [015]  1132.711109:
> > > > function:             account_system_time <--
> > > > __vtime_account_system
> > > > 
> > > > On entering the kernel due to a timer interrupt,
> > > > vtime_account_user()
> > > > skips user-time accounting. Then later on when returning to user-
> > > > space,
> > > > vtime_user_enter() is probably accounting the whole time (ie.
> > > > user-
> > > > space
> > > > plus kernel-space) to system time.
> > > > 
> > > > Now, when does vtime_account_user() skips accounting? Well, when
> > > > the
> > > > time delta is less then one jiffie. This would imply that
> > > > vtime_account_user()
> > > > is being called less than one jiffie since the last accounting,
> > > > but I
> > > > haven't
> > > > confirmed any of this yet.    
> > > 
> > > Jiffies should be advanced by the timer interrupt, on the
> > > housekeeping CPU, which is not doing context tracking.  
> > 
> > The hypothesis isn't that it wasn't advanced, but that we stayed in
> > user-space less than 1ms.  
> 
> That is part of the hypothesis. The other part of the hypothesis
> involves jiffies advancing on the nohz_full & isolated CPU while
> that CPU is in kernel mode 30% of the time.

OK.

> I have no good explanation for the latter yet...
> 
> > > Why is the isolated/nohz_full CPU receiving timer interrupts
> > > at all?
> > > 
> > > I thought it would not, but obviously I am wrong. What is
> > > going on here?  
> > 
> > There are two runnable SCHED_OTHER tasks on the nohz_full CPU. When
> > that happens, the tick is re-activated. We're not nohz_full anymore,
> > but accounting should still work.  
> 
> Isn't the scheduler tick distinct from the timer interrupt,
> or am I confused?

If you consider the scheduler tick to be the code that's run
by scheduler_tick(), yes they are distinct. But I was referring
to tick_sched_timer() the "main" tick handler. This one runs
as a hrtimer handler. In the case described in this email, the
timer interrupt fires because the nohz code sets up a hrtimer
to run (which is the tick, tick_sched_timer()).

Btw, _if_ the hypothesis is correct, I guess I might be able to
create a reproducer that doesn't depend on the tick. A task
staying 980us busy-looping in user-space and then making a
few dozen microseconds kernel call will probably report 100%
system time. This will be hard to do, but I'll give it try tomorrow.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-23 20:55 [BUG nohz]: wrong user and system time accounting Luiz Capitulino
  2017-03-24  0:56 ` Rik van Riel
@ 2017-03-24  1:52 ` Wanpeng Li
  2017-03-24  3:56   ` Luiz Capitulino
  2017-03-27  1:56 ` Wanpeng Li
  2017-03-29 13:04 ` Frederic Weisbecker
  3 siblings, 1 reply; 67+ messages in thread
From: Wanpeng Li @ 2017-03-24  1:52 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: Frederic Weisbecker, Rik van Riel, linux-kernel, linux-rt-users

2017-03-24 4:55 GMT+08:00 Luiz Capitulino <lcapitulino@redhat.com>:
>
> When there are two or more tasks executing in user-space and
> taking 100% of a nohz_full CPU, top reports 70% system time
> and 30% user time utilization. Sometimes I'm even able to get
> 100% system time and 0% user time.
>
> This was reproduced with latest Linus tree (093b995), but I
> don't believe it's a regression (at least not a recent one)
> as I can reproduce it with older kernels. Also, I have
> CONFIG_IRQ_TIME_ACCOUNTING=y and haven't tried to reproduce
> without it yet.
>
> Below you'll find the steps to reproduce and some initial
> analysis.
>
> Steps to reproduce
> ------------------
>
> 1. Set up a CPU for nohz_full with isolcpus= nohz_full=
>
> 2. Pin two tasks that hog the CPU 100% of the time to that CPU
>
> 3. Run top -d1 and check system time
>
> NOTE: When there's only one task hogging a nohz_full CPU, top
>       shows 100% user-time, as expected

I just saw at most 12% system time instead of 30% or 100%. Could you
grep HZ /boot/config-`uname -r` and post here?

Regards,
Wanpeng Li

>
> Initial analysis
> ----------------
>
> When tracing vtime accounting functions and the user-space/kernel
> transitions when the issue is taking place, I see several of the
> following:
>
> hog-10552 [015]  1132.711104: function:             enter_from_user_mode <-- apic_timer_interrupt
> hog-10552 [015]  1132.711105: function:             __context_tracking_exit <-- enter_from_user_mode
> hog-10552 [015]  1132.711105: bprint:               __context_tracking_exit.part.4: new state=1 cur state=1 active=1
> hog-10552 [015]  1132.711105: function:             vtime_account_user <-- __context_tracking_exit.part.4
> hog-10552 [015]  1132.711105: function:             smp_apic_timer_interrupt <-- apic_timer_interrupt
> hog-10552 [015]  1132.711106: function:             irq_enter <-- smp_apic_timer_interrupt
> hog-10552 [015]  1132.711106: function:             tick_sched_timer <-- __hrtimer_run_queues
> hog-10552 [015]  1132.711108: function:             irq_exit <-- smp_apic_timer_interrupt
> hog-10552 [015]  1132.711108: function:             __context_tracking_enter <-- prepare_exit_to_usermode
> hog-10552 [015]  1132.711108: bprint:               __context_tracking_enter.part.2: new state=1 cur state=0 active=1
> hog-10552 [015]  1132.711109: function:             vtime_user_enter <-- __context_tracking_enter.part.2
> hog-10552 [015]  1132.711109: function:             __vtime_account_system <-- vtime_user_enter
> hog-10552 [015]  1132.711109: function:             account_system_time <-- __vtime_account_system
>
> On entering the kernel due to a timer interrupt, vtime_account_user()
> skips user-time accounting. Then later on when returning to user-space,
> vtime_user_enter() is probably accounting the whole time (ie. user-space
> plus kernel-space) to system time.
>
> Now, when does vtime_account_user() skips accounting? Well, when the
> time delta is less then one jiffie. This would imply that vtime_account_user()
> is being called less than one jiffie since the last accounting, but I haven't
> confirmed any of this yet.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-24  1:52 ` Wanpeng Li
@ 2017-03-24  3:56   ` Luiz Capitulino
  0 siblings, 0 replies; 67+ messages in thread
From: Luiz Capitulino @ 2017-03-24  3:56 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Frederic Weisbecker, Rik van Riel, linux-kernel, linux-rt-users

On Fri, 24 Mar 2017 09:52:11 +0800
Wanpeng Li <kernellwp@gmail.com> wrote:

> 2017-03-24 4:55 GMT+08:00 Luiz Capitulino <lcapitulino@redhat.com>:
> >
> > When there are two or more tasks executing in user-space and
> > taking 100% of a nohz_full CPU, top reports 70% system time
> > and 30% user time utilization. Sometimes I'm even able to get
> > 100% system time and 0% user time.
> >
> > This was reproduced with latest Linus tree (093b995), but I
> > don't believe it's a regression (at least not a recent one)
> > as I can reproduce it with older kernels. Also, I have
> > CONFIG_IRQ_TIME_ACCOUNTING=y and haven't tried to reproduce
> > without it yet.
> >
> > Below you'll find the steps to reproduce and some initial
> > analysis.
> >
> > Steps to reproduce
> > ------------------
> >
> > 1. Set up a CPU for nohz_full with isolcpus= nohz_full=
> >
> > 2. Pin two tasks that hog the CPU 100% of the time to that CPU
> >
> > 3. Run top -d1 and check system time
> >
> > NOTE: When there's only one task hogging a nohz_full CPU, top
> >       shows 100% user-time, as expected  
> 
> I just saw at most 12% system time instead of 30% or 100%. Could you
> grep HZ /boot/config-`uname -r` and post here?

I'm sending you the whole thing:

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 4.11.0-rc3 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_HAVE_INTEL_TXT=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_FHANDLE=y
CONFIG_USELIB=y
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
# CONFIG_NO_HZ_FULL_ALL is not set
# CONFIG_NO_HZ_FULL_SYSIDLE is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
CONFIG_VIRT_CPU_ACCOUNTING=y
CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_IRQ_TIME_ACCOUNTING=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y

#
# RCU Subsystem
#
CONFIG_PREEMPT_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
# CONFIG_TASKS_RCU is not set
CONFIG_RCU_STALL_COMMON=y
CONFIG_CONTEXT_TRACKING=y
# CONFIG_CONTEXT_TRACKING_FORCE is not set
# CONFIG_TREE_RCU_TRACE is not set
CONFIG_RCU_NOCB_CPU=y
# CONFIG_RCU_NOCB_CPU_NONE is not set
# CONFIG_RCU_NOCB_CPU_ZERO is not set
CONFIG_RCU_NOCB_CPU_ALL=y
CONFIG_BUILD_BIN2C=y
# CONFIG_IKCONFIG is not set
CONFIG_LOG_BUF_SHIFT=20
CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
CONFIG_ARCH_SUPPORTS_INT128=y
CONFIG_NUMA_BALANCING=y
CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
CONFIG_CGROUPS=y
CONFIG_PAGE_COUNTER=y
CONFIG_MEMCG=y
CONFIG_MEMCG_SWAP=y
CONFIG_MEMCG_SWAP_ENABLED=y
CONFIG_BLK_CGROUP=y
# CONFIG_DEBUG_BLK_CGROUP is not set
CONFIG_CGROUP_WRITEBACK=y
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_CFS_BANDWIDTH=y
CONFIG_RT_GROUP_SCHED=y
CONFIG_CGROUP_PIDS=y
# CONFIG_CGROUP_RDMA is not set
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_HUGETLB=y
CONFIG_CPUSETS=y
CONFIG_PROC_PID_CPUSET=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_PERF=y
# CONFIG_CGROUP_BPF is not set
# CONFIG_CGROUP_DEBUG is not set
CONFIG_SOCK_CGROUP_DATA=y
# CONFIG_CHECKPOINT_RESTORE is not set
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
CONFIG_USER_NS=y
CONFIG_PID_NS=y
CONFIG_NET_NS=y
CONFIG_SCHED_AUTOGROUP=y
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_RD_LZ4=y
CONFIG_INITRAMFS_COMPRESSION=".gz"
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
CONFIG_HAVE_UID16=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
CONFIG_BPF=y
# CONFIG_EXPERT is not set
CONFIG_UID16=y
CONFIG_MULTIUSER=y
CONFIG_SGETMASK_SYSCALL=y
CONFIG_SYSFS_SYSCALL=y
# CONFIG_SYSCTL_SYSCALL is not set
CONFIG_POSIX_TIMERS=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
CONFIG_KALLSYMS_BASE_RELATIVE=y
CONFIG_PRINTK=y
CONFIG_PRINTK_NMI=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_BPF_SYSCALL=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_ADVISE_SYSCALLS=y
CONFIG_USERFAULTFD=y
CONFIG_PCI_QUIRKS=y
CONFIG_MEMBARRIER=y
# CONFIG_EMBEDDED is not set
CONFIG_HAVE_PERF_EVENTS=y
# CONFIG_PC104 is not set

#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_SLUB_MEMCG_SYSFS_ON is not set
# CONFIG_COMPAT_BRK is not set
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLAB_FREELIST_RANDOM is not set
CONFIG_SLUB_CPU_PARTIAL=y
CONFIG_SYSTEM_DATA_VERIFICATION=y
CONFIG_PROFILING=y
CONFIG_TRACEPOINTS=y
CONFIG_KEXEC_CORE=y
CONFIG_OPROFILE=m
CONFIG_OPROFILE_EVENT_MULTIPLEX=y
CONFIG_HAVE_OPROFILE=y
CONFIG_OPROFILE_NMI_TIMER=y
CONFIG_KPROBES=y
CONFIG_JUMP_LABEL=y
# CONFIG_STATIC_KEYS_SELFTEST is not set
CONFIG_KPROBES_ON_FTRACE=y
# CONFIG_UPROBES is not set
# CONFIG_HAVE_64BIT_ALIGNED_ACCESS is not set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_KRETPROBES=y
CONFIG_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_HAVE_NMI=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_CONTIGUOUS=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_ARCH_HAS_SET_MEMORY=y
CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_CLK=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
CONFIG_HAVE_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
CONFIG_HAVE_CMPXCHG_LOCAL=y
CONFIG_HAVE_CMPXCHG_DOUBLE=y
CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y
CONFIG_ARCH_WANT_OLD_COMPAT_IPC=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
CONFIG_SECCOMP_FILTER=y
CONFIG_HAVE_GCC_PLUGINS=y
CONFIG_GCC_PLUGINS=y
# CONFIG_GCC_PLUGIN_LATENT_ENTROPY is not set
# CONFIG_GCC_PLUGIN_STRUCTLEAK is not set
CONFIG_HAVE_CC_STACKPROTECTOR=y
CONFIG_CC_STACKPROTECTOR=y
# CONFIG_CC_STACKPROTECTOR_NONE is not set
# CONFIG_CC_STACKPROTECTOR_REGULAR is not set
CONFIG_CC_STACKPROTECTOR_STRONG=y
CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES=y
CONFIG_HAVE_CONTEXT_TRACKING=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
CONFIG_HAVE_ARCH_HUGE_VMAP=y
CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
CONFIG_HAVE_EXIT_THREAD=y
CONFIG_ARCH_MMAP_RND_BITS=28
CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y
CONFIG_ARCH_MMAP_RND_COMPAT_BITS=8
CONFIG_HAVE_COPY_THREAD_TLS=y
CONFIG_HAVE_STACK_VALIDATION=y
# CONFIG_HAVE_ARCH_HASH is not set
# CONFIG_ISA_BUS_API is not set
CONFIG_OLD_SIGSUSPEND3=y
CONFIG_COMPAT_OLD_SIGACTION=y
# CONFIG_CPU_NO_EFFICIENT_FFS is not set
CONFIG_HAVE_ARCH_VMAP_STACK=y
CONFIG_VMAP_STACK=y
# CONFIG_ARCH_OPTIONAL_KERNEL_RWX is not set
# CONFIG_ARCH_OPTIONAL_KERNEL_RWX_DEFAULT is not set
CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y
CONFIG_STRICT_KERNEL_RWX=y
CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
CONFIG_STRICT_MODULE_RWX=y

#
# GCOV-based kernel profiling
#
# CONFIG_GCOV_KERNEL is not set
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
# CONFIG_HAVE_GENERIC_DMA_COHERENT is not set
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_FORCE_LOAD=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_MODULE_SIG=y
# CONFIG_MODULE_SIG_FORCE is not set
CONFIG_MODULE_SIG_ALL=y
# CONFIG_MODULE_SIG_SHA1 is not set
# CONFIG_MODULE_SIG_SHA224 is not set
CONFIG_MODULE_SIG_SHA256=y
# CONFIG_MODULE_SIG_SHA384 is not set
# CONFIG_MODULE_SIG_SHA512 is not set
CONFIG_MODULE_SIG_HASH="sha256"
# CONFIG_MODULE_COMPRESS is not set
# CONFIG_TRIM_UNUSED_KSYMS is not set
CONFIG_MODULES_TREE_LOOKUP=y
CONFIG_BLOCK=y
CONFIG_BLK_SCSI_REQUEST=y
CONFIG_BLK_DEV_BSG=y
CONFIG_BLK_DEV_BSGLIB=y
CONFIG_BLK_DEV_INTEGRITY=y
# CONFIG_BLK_DEV_ZONED is not set
CONFIG_BLK_DEV_THROTTLING=y
# CONFIG_BLK_CMDLINE_PARSER is not set
# CONFIG_BLK_WBT is not set
CONFIG_BLK_DEBUG_FS=y
# CONFIG_BLK_SED_OPAL is not set

#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
# CONFIG_AIX_PARTITION is not set
CONFIG_OSF_PARTITION=y
CONFIG_AMIGA_PARTITION=y
# CONFIG_ATARI_PARTITION is not set
CONFIG_MAC_PARTITION=y
CONFIG_MSDOS_PARTITION=y
CONFIG_BSD_DISKLABEL=y
CONFIG_MINIX_SUBPARTITION=y
CONFIG_SOLARIS_X86_PARTITION=y
CONFIG_UNIXWARE_DISKLABEL=y
# CONFIG_LDM_PARTITION is not set
CONFIG_SGI_PARTITION=y
# CONFIG_ULTRIX_PARTITION is not set
CONFIG_SUN_PARTITION=y
CONFIG_KARMA_PARTITION=y
CONFIG_EFI_PARTITION=y
# CONFIG_SYSV68_PARTITION is not set
# CONFIG_CMDLINE_PARTITION is not set
CONFIG_BLOCK_COMPAT=y
CONFIG_BLK_MQ_PCI=y
CONFIG_BLK_MQ_VIRTIO=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_CFQ_GROUP_IOSCHED=y
CONFIG_DEFAULT_DEADLINE=y
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="deadline"
CONFIG_MQ_IOSCHED_DEADLINE=y
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_PADATA=y
CONFIG_ASN1=y
CONFIG_UNINLINE_SPIN_UNLOCK=y
CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
CONFIG_MUTEX_SPIN_ON_OWNER=y
CONFIG_RWSEM_SPIN_ON_OWNER=y
CONFIG_LOCK_SPIN_ON_OWNER=y
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
CONFIG_QUEUED_RWLOCKS=y
CONFIG_FREEZER=y

#
# Processor type and features
#
CONFIG_ZONE_DMA=y
CONFIG_SMP=y
CONFIG_X86_FEATURE_NAMES=y
CONFIG_X86_FAST_FEATURE_TESTS=y
CONFIG_X86_X2APIC=y
CONFIG_X86_MPPARSE=y
# CONFIG_GOLDFISH is not set
CONFIG_INTEL_RDT_A=y
CONFIG_X86_EXTENDED_PLATFORM=y
# CONFIG_X86_NUMACHIP is not set
# CONFIG_X86_VSMP is not set
CONFIG_X86_UV=y
# CONFIG_X86_GOLDFISH is not set
# CONFIG_X86_INTEL_MID is not set
CONFIG_X86_INTEL_LPSS=y
# CONFIG_X86_AMD_PLATFORM_DEVICE is not set
CONFIG_IOSF_MBI=y
# CONFIG_IOSF_MBI_DEBUG is not set
CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
CONFIG_SCHED_OMIT_FRAME_POINTER=y
CONFIG_HYPERVISOR_GUEST=y
CONFIG_PARAVIRT=y
# CONFIG_PARAVIRT_DEBUG is not set
CONFIG_PARAVIRT_SPINLOCKS=y
# CONFIG_QUEUED_LOCK_STAT is not set
CONFIG_XEN=y
CONFIG_XEN_DOM0=y
CONFIG_XEN_PVHVM=y
CONFIG_XEN_512GB=y
CONFIG_XEN_SAVE_RESTORE=y
# CONFIG_XEN_DEBUG_FS is not set
# CONFIG_XEN_PVH is not set
CONFIG_KVM_GUEST=y
# CONFIG_KVM_DEBUG_FS is not set
CONFIG_PARAVIRT_TIME_ACCOUNTING=y
CONFIG_PARAVIRT_CLOCK=y
CONFIG_NO_BOOTMEM=y
# CONFIG_MK8 is not set
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
# CONFIG_MATOM is not set
CONFIG_GENERIC_CPU=y
CONFIG_X86_INTERNODE_CACHE_SHIFT=6
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_DMI=y
CONFIG_GART_IOMMU=y
# CONFIG_CALGARY_IOMMU is not set
CONFIG_SWIOTLB=y
CONFIG_IOMMU_HELPER=y
CONFIG_MAXSMP=y
CONFIG_NR_CPUS=8192
CONFIG_SCHED_SMT=y
CONFIG_SCHED_MC=y
CONFIG_SCHED_MC_PRIO=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_INTEL=y
# CONFIG_X86_MCE_AMD is not set
CONFIG_X86_MCE_THRESHOLD=y
CONFIG_X86_MCE_INJECT=m
CONFIG_X86_THERMAL_VECTOR=y

#
# Performance monitoring
#
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_PERF_EVENTS_INTEL_RAPL=y
CONFIG_PERF_EVENTS_INTEL_CSTATE=y
# CONFIG_PERF_EVENTS_AMD_POWER is not set
# CONFIG_VM86 is not set
CONFIG_X86_16BIT=y
CONFIG_X86_ESPFIX64=y
CONFIG_X86_VSYSCALL_EMULATION=y
CONFIG_I8K=m
CONFIG_MICROCODE=y
CONFIG_MICROCODE_INTEL=y
CONFIG_MICROCODE_AMD=y
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_X86_DIRECT_GBPAGES=y
CONFIG_NUMA=y
CONFIG_AMD_NUMA=y
CONFIG_X86_64_ACPI_NUMA=y
CONFIG_NODES_SPAN_OTHER_NODES=y
# CONFIG_NUMA_EMU is not set
CONFIG_NODES_SHIFT=10
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_MEMORY_PROBE=y
CONFIG_ARCH_PROC_KCORE_TEXT=y
CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_NEED_MULTIPLE_NODES=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_HAVE_MEMBLOCK=y
CONFIG_HAVE_MEMBLOCK_NODE_MAP=y
CONFIG_ARCH_DISCARD_MEMBLOCK=y
CONFIG_MEMORY_ISOLATION=y
CONFIG_MOVABLE_NODE=y
CONFIG_HAVE_BOOTMEM_INFO_NODE=y
CONFIG_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTPLUG_SPARSE=y
# CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is not set
CONFIG_MEMORY_HOTREMOVE=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
CONFIG_MEMORY_BALLOON=y
CONFIG_BALLOON_COMPACTION=y
CONFIG_COMPACTION=y
CONFIG_MIGRATION=y
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
CONFIG_MMU_NOTIFIER=y
CONFIG_KSM=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
CONFIG_MEMORY_FAILURE=y
CONFIG_HWPOISON_INJECT=m
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
CONFIG_TRANSPARENT_HUGE_PAGECACHE=y
CONFIG_CLEANCACHE=y
CONFIG_FRONTSWAP=y
CONFIG_CMA=y
# CONFIG_CMA_DEBUG is not set
# CONFIG_CMA_DEBUGFS is not set
CONFIG_CMA_AREAS=7
CONFIG_ZSWAP=y
CONFIG_ZPOOL=y
CONFIG_ZBUD=y
# CONFIG_Z3FOLD is not set
CONFIG_ZSMALLOC=y
# CONFIG_PGTABLE_MAPPING is not set
# CONFIG_ZSMALLOC_STAT is not set
CONFIG_GENERIC_EARLY_IOREMAP=y
CONFIG_ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT=y
# CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set
# CONFIG_IDLE_PAGE_TRACKING is not set
# CONFIG_ZONE_DEVICE is not set
CONFIG_ARCH_USES_HIGH_VMA_FLAGS=y
CONFIG_ARCH_HAS_PKEYS=y
# CONFIG_X86_PMEM_LEGACY is not set
CONFIG_X86_CHECK_BIOS_CORRUPTION=y
# CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK is not set
CONFIG_X86_RESERVE_LOW=64
CONFIG_MTRR=y
CONFIG_MTRR_SANITIZER=y
CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=1
CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1
CONFIG_X86_PAT=y
CONFIG_ARCH_USES_PG_UNCACHED=y
CONFIG_ARCH_RANDOM=y
CONFIG_X86_SMAP=y
# CONFIG_X86_INTEL_MPX is not set
CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=y
CONFIG_EFI=y
CONFIG_EFI_STUB=y
# CONFIG_EFI_MIXED is not set
CONFIG_SECCOMP=y
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_SCHED_HRTICK=y
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
CONFIG_KEXEC_VERIFY_SIG=y
CONFIG_KEXEC_BZIMAGE_VERIFY_SIG=y
CONFIG_CRASH_DUMP=y
CONFIG_KEXEC_JUMP=y
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
# CONFIG_RANDOMIZE_BASE is not set
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_HOTPLUG_CPU=y
CONFIG_BOOTPARAM_HOTPLUG_CPU0=y
# CONFIG_DEBUG_HOTPLUG_CPU0 is not set
# CONFIG_COMPAT_VDSO is not set
# CONFIG_LEGACY_VSYSCALL_NATIVE is not set
CONFIG_LEGACY_VSYSCALL_EMULATE=y
# CONFIG_LEGACY_VSYSCALL_NONE is not set
# CONFIG_CMDLINE_BOOL is not set
CONFIG_MODIFY_LDT_SYSCALL=y
CONFIG_HAVE_LIVEPATCH=y
# CONFIG_LIVEPATCH is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
CONFIG_USE_PERCPU_NUMA_NODE_ID=y

#
# Power management and ACPI options
#
CONFIG_ARCH_HIBERNATION_HEADER=y
CONFIG_SUSPEND=y
CONFIG_SUSPEND_FREEZER=y
CONFIG_HIBERNATE_CALLBACKS=y
CONFIG_HIBERNATION=y
CONFIG_PM_STD_PARTITION=""
CONFIG_PM_SLEEP=y
CONFIG_PM_SLEEP_SMP=y
# CONFIG_PM_AUTOSLEEP is not set
# CONFIG_PM_WAKELOCKS is not set
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
CONFIG_PM_CLK=y
# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set
CONFIG_ACPI=y
CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT=y
# CONFIG_ACPI_DEBUGGER is not set
CONFIG_ACPI_SLEEP=y
# CONFIG_ACPI_PROCFS_POWER is not set
CONFIG_ACPI_REV_OVERRIDE_POSSIBLE=y
CONFIG_ACPI_EC_DEBUGFS=m
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=m
CONFIG_ACPI_FAN=y
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_CPU_FREQ_PSS=y
CONFIG_ACPI_PROCESSOR_CSTATE=y
CONFIG_ACPI_PROCESSOR_IDLE=y
CONFIG_ACPI_CPPC_LIB=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_IPMI=m
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_PROCESSOR_AGGREGATOR=m
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_NUMA=y
# CONFIG_ACPI_CUSTOM_DSDT is not set
CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y
CONFIG_ACPI_TABLE_UPGRADE=y
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_PCI_SLOT=y
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=y
CONFIG_ACPI_HOTPLUG_MEMORY=y
CONFIG_ACPI_HOTPLUG_IOAPIC=y
CONFIG_ACPI_SBS=m
CONFIG_ACPI_HED=y
CONFIG_ACPI_CUSTOM_METHOD=m
CONFIG_ACPI_BGRT=y
# CONFIG_ACPI_REDUCED_HARDWARE_ONLY is not set
# CONFIG_ACPI_NFIT is not set
CONFIG_HAVE_ACPI_APEI=y
CONFIG_HAVE_ACPI_APEI_NMI=y
CONFIG_ACPI_APEI=y
CONFIG_ACPI_APEI_GHES=y
CONFIG_ACPI_APEI_PCIEAER=y
CONFIG_ACPI_APEI_MEMORY_FAILURE=y
CONFIG_ACPI_APEI_EINJ=m
# CONFIG_ACPI_APEI_ERST_DEBUG is not set
# CONFIG_DPTF_POWER is not set
CONFIG_ACPI_EXTLOG=m
# CONFIG_PMIC_OPREGION is not set
# CONFIG_ACPI_CONFIGFS is not set
CONFIG_SFI=y

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_ATTR_SET=y
CONFIG_CPU_FREQ_GOV_COMMON=y
# CONFIG_CPU_FREQ_STAT is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
# CONFIG_CPU_FREQ_GOV_SCHEDUTIL is not set

#
# CPU frequency scaling drivers
#
CONFIG_X86_INTEL_PSTATE=y
CONFIG_X86_PCC_CPUFREQ=m
CONFIG_X86_ACPI_CPUFREQ=m
CONFIG_X86_ACPI_CPUFREQ_CPB=y
CONFIG_X86_POWERNOW_K8=m
CONFIG_X86_AMD_FREQ_SENSITIVITY=m
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
CONFIG_X86_P4_CLOCKMOD=m

#
# shared options
#
CONFIG_X86_SPEEDSTEP_LIB=m

#
# CPU Idle
#
CONFIG_CPU_IDLE=y
# CONFIG_CPU_IDLE_GOV_LADDER is not set
CONFIG_CPU_IDLE_GOV_MENU=y
# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
CONFIG_INTEL_IDLE=y

#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_XEN=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCIEPORTBUS=y
CONFIG_HOTPLUG_PCI_PCIE=y
CONFIG_PCIEAER=y
CONFIG_PCIE_ECRC=y
CONFIG_PCIEAER_INJECT=m
CONFIG_PCIEASPM=y
# CONFIG_PCIEASPM_DEBUG is not set
CONFIG_PCIEASPM_DEFAULT=y
# CONFIG_PCIEASPM_POWERSAVE is not set
# CONFIG_PCIEASPM_POWER_SUPERSAVE is not set
# CONFIG_PCIEASPM_PERFORMANCE is not set
CONFIG_PCIE_PME=y
# CONFIG_PCIE_DPC is not set
# CONFIG_PCIE_PTM is not set
CONFIG_PCI_BUS_ADDR_T_64BIT=y
CONFIG_PCI_MSI=y
CONFIG_PCI_MSI_IRQ_DOMAIN=y
# CONFIG_PCI_DEBUG is not set
# CONFIG_PCI_REALLOC_ENABLE_AUTO is not set
CONFIG_PCI_STUB=y
# CONFIG_XEN_PCIDEV_FRONTEND is not set
CONFIG_HT_IRQ=y
CONFIG_PCI_ATS=y
CONFIG_PCI_IOV=y
CONFIG_PCI_PRI=y
CONFIG_PCI_PASID=y
CONFIG_PCI_LABEL=y
# CONFIG_PCI_HYPERV is not set
CONFIG_HOTPLUG_PCI=y
CONFIG_HOTPLUG_PCI_ACPI=y
CONFIG_HOTPLUG_PCI_ACPI_IBM=m
# CONFIG_HOTPLUG_PCI_CPCI is not set
CONFIG_HOTPLUG_PCI_SHPC=m

#
# DesignWare PCI Core Support
#
# CONFIG_PCIE_DW_PLAT is not set

#
# PCI host controller drivers
#
# CONFIG_VMD is not set
CONFIG_ISA_DMA_API=y
CONFIG_AMD_NB=y
CONFIG_PCCARD=y
# CONFIG_PCMCIA is not set
CONFIG_CARDBUS=y

#
# PC-card bridges
#
CONFIG_YENTA=m
CONFIG_YENTA_O2=y
CONFIG_YENTA_RICOH=y
CONFIG_YENTA_TI=y
CONFIG_YENTA_ENE_TUNE=y
CONFIG_YENTA_TOSHIBA=y
# CONFIG_RAPIDIO is not set
# CONFIG_X86_SYSFB is not set

#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
CONFIG_COMPAT_BINFMT_ELF=y
CONFIG_ELFCORE=y
CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
CONFIG_BINFMT_SCRIPT=y
# CONFIG_HAVE_AOUT is not set
CONFIG_BINFMT_MISC=m
CONFIG_COREDUMP=y
CONFIG_IA32_EMULATION=y
# CONFIG_IA32_AOUT is not set
# CONFIG_X86_X32 is not set
CONFIG_COMPAT_32=y
CONFIG_COMPAT=y
CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_KEYS_COMPAT=y
CONFIG_X86_DEV_DMA_OPS=y
CONFIG_NET=y
CONFIG_NET_INGRESS=y
CONFIG_NET_EGRESS=y

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_DIAG=m
CONFIG_UNIX=y
CONFIG_UNIX_DIAG=m
CONFIG_XFRM=y
CONFIG_XFRM_ALGO=y
CONFIG_XFRM_USER=y
CONFIG_XFRM_SUB_POLICY=y
CONFIG_XFRM_MIGRATE=y
CONFIG_XFRM_STATISTICS=y
CONFIG_XFRM_IPCOMP=m
CONFIG_NET_KEY=m
CONFIG_NET_KEY_MIGRATE=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_FIB_TRIE_STATS=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
CONFIG_IP_ROUTE_CLASSID=y
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=m
CONFIG_NET_IPGRE_DEMUX=m
CONFIG_NET_IP_TUNNEL=m
CONFIG_NET_IPGRE=m
CONFIG_NET_IPGRE_BROADCAST=y
CONFIG_IP_MROUTE=y
CONFIG_IP_MROUTE_MULTIPLE_TABLES=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
CONFIG_SYN_COOKIES=y
CONFIG_NET_IPVTI=m
CONFIG_NET_UDP_TUNNEL=m
# CONFIG_NET_FOU is not set
# CONFIG_NET_FOU_IP_TUNNELS is not set
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
# CONFIG_INET_ESP_OFFLOAD is not set
CONFIG_INET_IPCOMP=m
CONFIG_INET_XFRM_TUNNEL=m
CONFIG_INET_TUNNEL=m
CONFIG_INET_XFRM_MODE_TRANSPORT=m
CONFIG_INET_XFRM_MODE_TUNNEL=m
CONFIG_INET_XFRM_MODE_BEET=m
CONFIG_INET_DIAG=m
CONFIG_INET_TCP_DIAG=m
CONFIG_INET_UDP_DIAG=m
# CONFIG_INET_RAW_DIAG is not set
# CONFIG_INET_DIAG_DESTROY is not set
CONFIG_TCP_CONG_ADVANCED=y
CONFIG_TCP_CONG_BIC=m
CONFIG_TCP_CONG_CUBIC=y
CONFIG_TCP_CONG_WESTWOOD=m
CONFIG_TCP_CONG_HTCP=m
CONFIG_TCP_CONG_HSTCP=m
CONFIG_TCP_CONG_HYBLA=m
CONFIG_TCP_CONG_VEGAS=m
# CONFIG_TCP_CONG_NV is not set
CONFIG_TCP_CONG_SCALABLE=m
CONFIG_TCP_CONG_LP=m
CONFIG_TCP_CONG_VENO=m
CONFIG_TCP_CONG_YEAH=m
CONFIG_TCP_CONG_ILLINOIS=m
CONFIG_TCP_CONG_DCTCP=m
# CONFIG_TCP_CONG_CDG is not set
# CONFIG_TCP_CONG_BBR is not set
CONFIG_DEFAULT_CUBIC=y
# CONFIG_DEFAULT_RENO is not set
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_TCP_MD5SIG=y
CONFIG_IPV6=y
CONFIG_IPV6_ROUTER_PREF=y
CONFIG_IPV6_ROUTE_INFO=y
CONFIG_IPV6_OPTIMISTIC_DAD=y
CONFIG_INET6_AH=m
CONFIG_INET6_ESP=m
# CONFIG_INET6_ESP_OFFLOAD is not set
CONFIG_INET6_IPCOMP=m
CONFIG_IPV6_MIP6=m
# CONFIG_IPV6_ILA is not set
CONFIG_INET6_XFRM_TUNNEL=m
CONFIG_INET6_TUNNEL=m
CONFIG_INET6_XFRM_MODE_TRANSPORT=m
CONFIG_INET6_XFRM_MODE_TUNNEL=m
CONFIG_INET6_XFRM_MODE_BEET=m
CONFIG_INET6_XFRM_MODE_ROUTEOPTIMIZATION=m
CONFIG_IPV6_VTI=m
CONFIG_IPV6_SIT=m
CONFIG_IPV6_SIT_6RD=y
CONFIG_IPV6_NDISC_NODETYPE=y
CONFIG_IPV6_TUNNEL=m
CONFIG_IPV6_GRE=m
# CONFIG_IPV6_FOU is not set
# CONFIG_IPV6_FOU_TUNNEL is not set
CONFIG_IPV6_MULTIPLE_TABLES=y
# CONFIG_IPV6_SUBTREES is not set
CONFIG_IPV6_MROUTE=y
CONFIG_IPV6_MROUTE_MULTIPLE_TABLES=y
CONFIG_IPV6_PIMSM_V2=y
# CONFIG_IPV6_SEG6_LWTUNNEL is not set
# CONFIG_IPV6_SEG6_HMAC is not set
CONFIG_NETLABEL=y
CONFIG_NETWORK_SECMARK=y
CONFIG_NET_PTP_CLASSIFY=y
CONFIG_NETWORK_PHY_TIMESTAMPING=y
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_NETFILTER_ADVANCED=y
CONFIG_BRIDGE_NETFILTER=m

#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_INGRESS=y
CONFIG_NETFILTER_NETLINK=m
CONFIG_NETFILTER_NETLINK_ACCT=m
CONFIG_NETFILTER_NETLINK_QUEUE=m
CONFIG_NETFILTER_NETLINK_LOG=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_LOG_COMMON=m
# CONFIG_NF_LOG_NETDEV is not set
CONFIG_NF_CONNTRACK_MARK=y
CONFIG_NF_CONNTRACK_SECMARK=y
CONFIG_NF_CONNTRACK_ZONES=y
CONFIG_NF_CONNTRACK_PROCFS=y
CONFIG_NF_CONNTRACK_EVENTS=y
# CONFIG_NF_CONNTRACK_TIMEOUT is not set
CONFIG_NF_CONNTRACK_TIMESTAMP=y
CONFIG_NF_CONNTRACK_LABELS=y
CONFIG_NF_CT_PROTO_DCCP=y
CONFIG_NF_CT_PROTO_GRE=m
CONFIG_NF_CT_PROTO_SCTP=y
CONFIG_NF_CT_PROTO_UDPLITE=y
CONFIG_NF_CONNTRACK_AMANDA=m
CONFIG_NF_CONNTRACK_FTP=m
CONFIG_NF_CONNTRACK_H323=m
CONFIG_NF_CONNTRACK_IRC=m
CONFIG_NF_CONNTRACK_BROADCAST=m
CONFIG_NF_CONNTRACK_NETBIOS_NS=m
CONFIG_NF_CONNTRACK_SNMP=m
CONFIG_NF_CONNTRACK_PPTP=m
CONFIG_NF_CONNTRACK_SANE=m
CONFIG_NF_CONNTRACK_SIP=m
CONFIG_NF_CONNTRACK_TFTP=m
CONFIG_NF_CT_NETLINK=m
# CONFIG_NF_CT_NETLINK_TIMEOUT is not set
# CONFIG_NETFILTER_NETLINK_GLUE_CT is not set
CONFIG_NF_NAT=m
CONFIG_NF_NAT_NEEDED=y
CONFIG_NF_NAT_PROTO_DCCP=y
CONFIG_NF_NAT_PROTO_UDPLITE=y
CONFIG_NF_NAT_PROTO_SCTP=y
CONFIG_NF_NAT_AMANDA=m
CONFIG_NF_NAT_FTP=m
CONFIG_NF_NAT_IRC=m
CONFIG_NF_NAT_SIP=m
CONFIG_NF_NAT_TFTP=m
CONFIG_NF_NAT_REDIRECT=m
CONFIG_NETFILTER_SYNPROXY=m
CONFIG_NF_TABLES=m
CONFIG_NF_TABLES_INET=m
# CONFIG_NF_TABLES_NETDEV is not set
CONFIG_NFT_EXTHDR=m
CONFIG_NFT_META=m
# CONFIG_NFT_RT is not set
# CONFIG_NFT_NUMGEN is not set
CONFIG_NFT_CT=m
# CONFIG_NFT_SET_RBTREE is not set
# CONFIG_NFT_SET_HASH is not set
# CONFIG_NFT_SET_BITMAP is not set
CONFIG_NFT_COUNTER=m
CONFIG_NFT_LOG=m
CONFIG_NFT_LIMIT=m
# CONFIG_NFT_MASQ is not set
# CONFIG_NFT_REDIR is not set
CONFIG_NFT_NAT=m
# CONFIG_NFT_OBJREF is not set
CONFIG_NFT_QUEUE=m
# CONFIG_NFT_QUOTA is not set
# CONFIG_NFT_REJECT is not set
# CONFIG_NFT_REJECT_INET is not set
CONFIG_NFT_COMPAT=m
CONFIG_NFT_HASH=m
CONFIG_NETFILTER_XTABLES=y

#
# Xtables combined modules
#
CONFIG_NETFILTER_XT_MARK=m
CONFIG_NETFILTER_XT_CONNMARK=m
CONFIG_NETFILTER_XT_SET=m

#
# Xtables targets
#
CONFIG_NETFILTER_XT_TARGET_AUDIT=m
CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m
CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m
CONFIG_NETFILTER_XT_TARGET_CT=m
CONFIG_NETFILTER_XT_TARGET_DSCP=m
CONFIG_NETFILTER_XT_TARGET_HL=m
CONFIG_NETFILTER_XT_TARGET_HMARK=m
CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m
CONFIG_NETFILTER_XT_TARGET_LED=m
CONFIG_NETFILTER_XT_TARGET_LOG=m
CONFIG_NETFILTER_XT_TARGET_MARK=m
CONFIG_NETFILTER_XT_NAT=m
CONFIG_NETFILTER_XT_TARGET_NETMAP=m
CONFIG_NETFILTER_XT_TARGET_NFLOG=m
CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
CONFIG_NETFILTER_XT_TARGET_NOTRACK=m
CONFIG_NETFILTER_XT_TARGET_RATEEST=m
CONFIG_NETFILTER_XT_TARGET_REDIRECT=m
CONFIG_NETFILTER_XT_TARGET_TEE=m
CONFIG_NETFILTER_XT_TARGET_TPROXY=m
CONFIG_NETFILTER_XT_TARGET_TRACE=m
CONFIG_NETFILTER_XT_TARGET_SECMARK=m
CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m

#
# Xtables matches
#
CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
CONFIG_NETFILTER_XT_MATCH_BPF=m
CONFIG_NETFILTER_XT_MATCH_CGROUP=m
CONFIG_NETFILTER_XT_MATCH_CLUSTER=m
CONFIG_NETFILTER_XT_MATCH_COMMENT=m
CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
CONFIG_NETFILTER_XT_MATCH_CONNLABEL=m
CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m
CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
CONFIG_NETFILTER_XT_MATCH_CPU=m
CONFIG_NETFILTER_XT_MATCH_DCCP=m
CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m
CONFIG_NETFILTER_XT_MATCH_DSCP=m
CONFIG_NETFILTER_XT_MATCH_ECN=m
CONFIG_NETFILTER_XT_MATCH_ESP=m
CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
CONFIG_NETFILTER_XT_MATCH_HELPER=m
CONFIG_NETFILTER_XT_MATCH_HL=m
# CONFIG_NETFILTER_XT_MATCH_IPCOMP is not set
CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
CONFIG_NETFILTER_XT_MATCH_IPVS=m
CONFIG_NETFILTER_XT_MATCH_L2TP=m
CONFIG_NETFILTER_XT_MATCH_LENGTH=m
CONFIG_NETFILTER_XT_MATCH_LIMIT=m
CONFIG_NETFILTER_XT_MATCH_MAC=m
CONFIG_NETFILTER_XT_MATCH_MARK=m
CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
CONFIG_NETFILTER_XT_MATCH_NFACCT=m
CONFIG_NETFILTER_XT_MATCH_OSF=m
CONFIG_NETFILTER_XT_MATCH_OWNER=m
CONFIG_NETFILTER_XT_MATCH_POLICY=m
CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m
CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
CONFIG_NETFILTER_XT_MATCH_QUOTA=m
CONFIG_NETFILTER_XT_MATCH_RATEEST=m
CONFIG_NETFILTER_XT_MATCH_REALM=m
CONFIG_NETFILTER_XT_MATCH_RECENT=m
CONFIG_NETFILTER_XT_MATCH_SCTP=m
CONFIG_NETFILTER_XT_MATCH_STATE=m
CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
CONFIG_NETFILTER_XT_MATCH_STRING=m
CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
CONFIG_NETFILTER_XT_MATCH_TIME=m
CONFIG_NETFILTER_XT_MATCH_U32=m
CONFIG_IP_SET=m
CONFIG_IP_SET_MAX=256
CONFIG_IP_SET_BITMAP_IP=m
CONFIG_IP_SET_BITMAP_IPMAC=m
CONFIG_IP_SET_BITMAP_PORT=m
CONFIG_IP_SET_HASH_IP=m
# CONFIG_IP_SET_HASH_IPMARK is not set
CONFIG_IP_SET_HASH_IPPORT=m
CONFIG_IP_SET_HASH_IPPORTIP=m
CONFIG_IP_SET_HASH_IPPORTNET=m
# CONFIG_IP_SET_HASH_IPMAC is not set
# CONFIG_IP_SET_HASH_MAC is not set
# CONFIG_IP_SET_HASH_NETPORTNET is not set
CONFIG_IP_SET_HASH_NET=m
# CONFIG_IP_SET_HASH_NETNET is not set
CONFIG_IP_SET_HASH_NETPORT=m
CONFIG_IP_SET_HASH_NETIFACE=m
CONFIG_IP_SET_LIST_SET=m
CONFIG_IP_VS=m
CONFIG_IP_VS_IPV6=y
# CONFIG_IP_VS_DEBUG is not set
CONFIG_IP_VS_TAB_BITS=12

#
# IPVS transport protocol load balancing support
#
CONFIG_IP_VS_PROTO_TCP=y
CONFIG_IP_VS_PROTO_UDP=y
CONFIG_IP_VS_PROTO_AH_ESP=y
CONFIG_IP_VS_PROTO_ESP=y
CONFIG_IP_VS_PROTO_AH=y
CONFIG_IP_VS_PROTO_SCTP=y

#
# IPVS scheduler
#
CONFIG_IP_VS_RR=m
CONFIG_IP_VS_WRR=m
CONFIG_IP_VS_LC=m
CONFIG_IP_VS_WLC=m
# CONFIG_IP_VS_FO is not set
# CONFIG_IP_VS_OVF is not set
CONFIG_IP_VS_LBLC=m
CONFIG_IP_VS_LBLCR=m
CONFIG_IP_VS_DH=m
CONFIG_IP_VS_SH=m
CONFIG_IP_VS_SED=m
CONFIG_IP_VS_NQ=m

#
# IPVS SH scheduler
#
CONFIG_IP_VS_SH_TAB_BITS=8

#
# IPVS application helper
#
CONFIG_IP_VS_FTP=m
CONFIG_IP_VS_NFCT=y
CONFIG_IP_VS_PE_SIP=m

#
# IP: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV4=m
CONFIG_NF_CONNTRACK_IPV4=m
# CONFIG_NF_SOCKET_IPV4 is not set
CONFIG_NF_TABLES_IPV4=m
CONFIG_NFT_CHAIN_ROUTE_IPV4=m
# CONFIG_NFT_REJECT_IPV4 is not set
# CONFIG_NFT_DUP_IPV4 is not set
# CONFIG_NFT_FIB_IPV4 is not set
# CONFIG_NF_TABLES_ARP is not set
CONFIG_NF_DUP_IPV4=m
# CONFIG_NF_LOG_ARP is not set
CONFIG_NF_LOG_IPV4=m
CONFIG_NF_REJECT_IPV4=m
CONFIG_NF_NAT_IPV4=m
CONFIG_NFT_CHAIN_NAT_IPV4=m
CONFIG_NF_NAT_MASQUERADE_IPV4=m
CONFIG_NF_NAT_SNMP_BASIC=m
CONFIG_NF_NAT_PROTO_GRE=m
CONFIG_NF_NAT_PPTP=m
CONFIG_NF_NAT_H323=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_AH=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_RPFILTER=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_SYNPROXY=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_CLUSTERIP=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_TTL=m
CONFIG_IP_NF_RAW=m
CONFIG_IP_NF_SECURITY=m
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m

#
# IPv6: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV6=m
CONFIG_NF_CONNTRACK_IPV6=m
# CONFIG_NF_SOCKET_IPV6 is not set
CONFIG_NF_TABLES_IPV6=m
CONFIG_NFT_CHAIN_ROUTE_IPV6=m
# CONFIG_NFT_REJECT_IPV6 is not set
# CONFIG_NFT_DUP_IPV6 is not set
# CONFIG_NFT_FIB_IPV6 is not set
CONFIG_NF_DUP_IPV6=m
CONFIG_NF_REJECT_IPV6=m
CONFIG_NF_LOG_IPV6=m
CONFIG_NF_NAT_IPV6=m
CONFIG_NFT_CHAIN_NAT_IPV6=m
CONFIG_NF_NAT_MASQUERADE_IPV6=m
CONFIG_IP6_NF_IPTABLES=m
CONFIG_IP6_NF_MATCH_AH=m
CONFIG_IP6_NF_MATCH_EUI64=m
CONFIG_IP6_NF_MATCH_FRAG=m
CONFIG_IP6_NF_MATCH_OPTS=m
CONFIG_IP6_NF_MATCH_HL=m
CONFIG_IP6_NF_MATCH_IPV6HEADER=m
CONFIG_IP6_NF_MATCH_MH=m
CONFIG_IP6_NF_MATCH_RPFILTER=m
CONFIG_IP6_NF_MATCH_RT=m
CONFIG_IP6_NF_TARGET_HL=m
CONFIG_IP6_NF_FILTER=m
CONFIG_IP6_NF_TARGET_REJECT=m
CONFIG_IP6_NF_TARGET_SYNPROXY=m
CONFIG_IP6_NF_MANGLE=m
CONFIG_IP6_NF_RAW=m
CONFIG_IP6_NF_SECURITY=m
CONFIG_IP6_NF_NAT=m
CONFIG_IP6_NF_TARGET_MASQUERADE=m
# CONFIG_IP6_NF_TARGET_NPT is not set
CONFIG_NF_TABLES_BRIDGE=m
CONFIG_NFT_BRIDGE_META=m
# CONFIG_NF_LOG_BRIDGE is not set
CONFIG_BRIDGE_NF_EBTABLES=m
CONFIG_BRIDGE_EBT_BROUTE=m
CONFIG_BRIDGE_EBT_T_FILTER=m
CONFIG_BRIDGE_EBT_T_NAT=m
CONFIG_BRIDGE_EBT_802_3=m
CONFIG_BRIDGE_EBT_AMONG=m
CONFIG_BRIDGE_EBT_ARP=m
CONFIG_BRIDGE_EBT_IP=m
CONFIG_BRIDGE_EBT_IP6=m
CONFIG_BRIDGE_EBT_LIMIT=m
CONFIG_BRIDGE_EBT_MARK=m
CONFIG_BRIDGE_EBT_PKTTYPE=m
CONFIG_BRIDGE_EBT_STP=m
CONFIG_BRIDGE_EBT_VLAN=m
CONFIG_BRIDGE_EBT_ARPREPLY=m
CONFIG_BRIDGE_EBT_DNAT=m
CONFIG_BRIDGE_EBT_MARK_T=m
CONFIG_BRIDGE_EBT_REDIRECT=m
CONFIG_BRIDGE_EBT_SNAT=m
CONFIG_BRIDGE_EBT_LOG=m
CONFIG_BRIDGE_EBT_NFLOG=m
CONFIG_IP_DCCP=m
CONFIG_INET_DCCP_DIAG=m

#
# DCCP CCIDs Configuration
#
# CONFIG_IP_DCCP_CCID2_DEBUG is not set
CONFIG_IP_DCCP_CCID3=y
# CONFIG_IP_DCCP_CCID3_DEBUG is not set
CONFIG_IP_DCCP_TFRC_LIB=y

#
# DCCP Kernel Hacking
#
# CONFIG_IP_DCCP_DEBUG is not set
# CONFIG_NET_DCCPPROBE is not set
CONFIG_IP_SCTP=m
CONFIG_NET_SCTPPROBE=m
# CONFIG_SCTP_DBG_OBJCNT is not set
# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5 is not set
CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1=y
# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_NONE is not set
CONFIG_SCTP_COOKIE_HMAC_MD5=y
CONFIG_SCTP_COOKIE_HMAC_SHA1=y
CONFIG_INET_SCTP_DIAG=m
# CONFIG_RDS is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
CONFIG_L2TP=m
CONFIG_L2TP_DEBUGFS=m
CONFIG_L2TP_V3=y
CONFIG_L2TP_IP=m
CONFIG_L2TP_ETH=m
CONFIG_STP=m
CONFIG_GARP=m
CONFIG_MRP=m
CONFIG_BRIDGE=m
CONFIG_BRIDGE_IGMP_SNOOPING=y
CONFIG_BRIDGE_VLAN_FILTERING=y
CONFIG_HAVE_NET_DSA=y
# CONFIG_NET_DSA is not set
CONFIG_VLAN_8021Q=m
CONFIG_VLAN_8021Q_GVRP=y
CONFIG_VLAN_8021Q_MVRP=y
# CONFIG_DECNET is not set
CONFIG_LLC=m
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_PHONET is not set
# CONFIG_6LOWPAN is not set
CONFIG_IEEE802154=m
# CONFIG_IEEE802154_NL802154_EXPERIMENTAL is not set
CONFIG_IEEE802154_SOCKET=m
CONFIG_MAC802154=m
CONFIG_NET_SCHED=y

#
# Queueing/Scheduling
#
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
CONFIG_NET_SCH_HFSC=m
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_MULTIQ=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFB=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
CONFIG_NET_SCH_NETEM=m
CONFIG_NET_SCH_DRR=m
CONFIG_NET_SCH_MQPRIO=m
CONFIG_NET_SCH_CHOKE=m
CONFIG_NET_SCH_QFQ=m
CONFIG_NET_SCH_CODEL=m
CONFIG_NET_SCH_FQ_CODEL=m
CONFIG_NET_SCH_FQ=m
# CONFIG_NET_SCH_HHF is not set
# CONFIG_NET_SCH_PIE is not set
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_SCH_PLUG=m

#
# Classification
#
CONFIG_NET_CLS=y
CONFIG_NET_CLS_BASIC=m
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
CONFIG_CLS_U32_PERF=y
CONFIG_CLS_U32_MARK=y
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
CONFIG_NET_CLS_FLOW=m
CONFIG_NET_CLS_CGROUP=y
CONFIG_NET_CLS_BPF=m
# CONFIG_NET_CLS_FLOWER is not set
# CONFIG_NET_CLS_MATCHALL is not set
CONFIG_NET_EMATCH=y
CONFIG_NET_EMATCH_STACK=32
CONFIG_NET_EMATCH_CMP=m
CONFIG_NET_EMATCH_NBYTE=m
CONFIG_NET_EMATCH_U32=m
CONFIG_NET_EMATCH_META=m
CONFIG_NET_EMATCH_TEXT=m
CONFIG_NET_EMATCH_IPSET=m
CONFIG_NET_CLS_ACT=y
CONFIG_NET_ACT_POLICE=m
CONFIG_NET_ACT_GACT=m
CONFIG_GACT_PROB=y
CONFIG_NET_ACT_MIRRED=m
# CONFIG_NET_ACT_SAMPLE is not set
CONFIG_NET_ACT_IPT=m
CONFIG_NET_ACT_NAT=m
CONFIG_NET_ACT_PEDIT=m
CONFIG_NET_ACT_SIMP=m
CONFIG_NET_ACT_SKBEDIT=m
CONFIG_NET_ACT_CSUM=m
# CONFIG_NET_ACT_VLAN is not set
# CONFIG_NET_ACT_BPF is not set
# CONFIG_NET_ACT_CONNMARK is not set
# CONFIG_NET_ACT_SKBMOD is not set
# CONFIG_NET_ACT_IFE is not set
# CONFIG_NET_ACT_TUNNEL_KEY is not set
CONFIG_NET_CLS_IND=y
CONFIG_NET_SCH_FIFO=y
CONFIG_DCB=y
CONFIG_DNS_RESOLVER=m
# CONFIG_BATMAN_ADV is not set
CONFIG_OPENVSWITCH=m
CONFIG_OPENVSWITCH_GRE=m
CONFIG_OPENVSWITCH_VXLAN=m
CONFIG_OPENVSWITCH_GENEVE=m
CONFIG_VSOCKETS=m
CONFIG_VMWARE_VMCI_VSOCKETS=m
# CONFIG_VIRTIO_VSOCKETS is not set
CONFIG_NETLINK_DIAG=m
CONFIG_MPLS=y
CONFIG_NET_MPLS_GSO=m
# CONFIG_MPLS_ROUTING is not set
# CONFIG_HSR is not set
# CONFIG_NET_SWITCHDEV is not set
# CONFIG_NET_L3_MASTER_DEV is not set
# CONFIG_NET_NCSI is not set
CONFIG_RPS=y
CONFIG_RFS_ACCEL=y
CONFIG_XPS=y
# CONFIG_CGROUP_NET_PRIO is not set
CONFIG_CGROUP_NET_CLASSID=y
CONFIG_NET_RX_BUSY_POLL=y
CONFIG_BQL=y
CONFIG_BPF_JIT=y
CONFIG_NET_FLOW_LIMIT=y

#
# Network testing
#
CONFIG_NET_PKTGEN=m
# CONFIG_NET_TCPPROBE is not set
CONFIG_NET_DROP_MONITOR=y
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
# CONFIG_AF_RXRPC is not set
# CONFIG_AF_KCM is not set
# CONFIG_STREAM_PARSER is not set
CONFIG_FIB_RULES=y
# CONFIG_WIRELESS is not set
# CONFIG_WIMAX is not set
# CONFIG_RFKILL is not set
# CONFIG_NET_9P is not set
# CONFIG_CAIF is not set
CONFIG_CEPH_LIB=m
# CONFIG_CEPH_LIB_PRETTYDEBUG is not set
CONFIG_CEPH_LIB_USE_DNS_RESOLVER=y
# CONFIG_NFC is not set
# CONFIG_PSAMPLE is not set
# CONFIG_NET_IFE is not set
# CONFIG_LWTUNNEL is not set
CONFIG_DST_CACHE=y
CONFIG_GRO_CELLS=y
# CONFIG_NET_DEVLINK is not set
CONFIG_MAY_USE_DEVLINK=y
CONFIG_HAVE_EBPF_JIT=y

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER=y
CONFIG_UEVENT_HELPER_PATH=""
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
# CONFIG_FIRMWARE_IN_KERNEL is not set
CONFIG_EXTRA_FIRMWARE=""
CONFIG_FW_LOADER_USER_HELPER=y
# CONFIG_FW_LOADER_USER_HELPER_FALLBACK is not set
CONFIG_ALLOW_DEV_COREDUMP=y
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set
# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set
CONFIG_SYS_HYPERVISOR=y
# CONFIG_GENERIC_CPU_DEVICES is not set
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_REGMAP=y
CONFIG_REGMAP_I2C=m
CONFIG_DMA_SHARED_BUFFER=y
# CONFIG_DMA_FENCE_TRACE is not set
# CONFIG_DMA_CMA is not set

#
# Bus devices
#
CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y
CONFIG_MTD=m
# CONFIG_MTD_TESTS is not set
# CONFIG_MTD_REDBOOT_PARTS is not set
# CONFIG_MTD_CMDLINE_PARTS is not set
# CONFIG_MTD_AR7_PARTS is not set

#
# User Modules And Translation Layers
#
CONFIG_MTD_BLKDEVS=m
CONFIG_MTD_BLOCK=m
# CONFIG_MTD_BLOCK_RO is not set
# CONFIG_FTL is not set
# CONFIG_NFTL is not set
# CONFIG_INFTL is not set
# CONFIG_RFD_FTL is not set
# CONFIG_SSFDC is not set
# CONFIG_SM_FTL is not set
# CONFIG_MTD_OOPS is not set
# CONFIG_MTD_SWAP is not set
# CONFIG_MTD_PARTITIONED_MASTER is not set

#
# RAM/ROM/Flash chip drivers
#
# CONFIG_MTD_CFI is not set
# CONFIG_MTD_JEDECPROBE is not set
CONFIG_MTD_MAP_BANK_WIDTH_1=y
CONFIG_MTD_MAP_BANK_WIDTH_2=y
CONFIG_MTD_MAP_BANK_WIDTH_4=y
# CONFIG_MTD_MAP_BANK_WIDTH_8 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_16 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_32 is not set
CONFIG_MTD_CFI_I1=y
CONFIG_MTD_CFI_I2=y
# CONFIG_MTD_CFI_I4 is not set
# CONFIG_MTD_CFI_I8 is not set
# CONFIG_MTD_RAM is not set
# CONFIG_MTD_ROM is not set
# CONFIG_MTD_ABSENT is not set

#
# Mapping drivers for chip access
#
# CONFIG_MTD_COMPLEX_MAPPINGS is not set
# CONFIG_MTD_INTEL_VR_NOR is not set
# CONFIG_MTD_PLATRAM is not set

#
# Self-contained MTD device drivers
#
# CONFIG_MTD_PMC551 is not set
# CONFIG_MTD_SLRAM is not set
# CONFIG_MTD_PHRAM is not set
# CONFIG_MTD_MTDRAM is not set
# CONFIG_MTD_BLOCK2MTD is not set

#
# Disk-On-Chip Device Drivers
#
# CONFIG_MTD_DOCG3 is not set
# CONFIG_MTD_NAND is not set
# CONFIG_MTD_ONENAND is not set

#
# LPDDR & LPDDR2 PCM memory drivers
#
# CONFIG_MTD_LPDDR is not set
# CONFIG_MTD_SPI_NOR is not set
CONFIG_MTD_UBI=m
CONFIG_MTD_UBI_WL_THRESHOLD=4096
CONFIG_MTD_UBI_BEB_LIMIT=20
# CONFIG_MTD_UBI_FASTMAP is not set
# CONFIG_MTD_UBI_GLUEBI is not set
# CONFIG_MTD_UBI_BLOCK is not set
# CONFIG_OF is not set
CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
CONFIG_PARPORT=m
CONFIG_PARPORT_PC=m
CONFIG_PARPORT_SERIAL=m
# CONFIG_PARPORT_PC_FIFO is not set
# CONFIG_PARPORT_PC_SUPERIO is not set
# CONFIG_PARPORT_GSC is not set
# CONFIG_PARPORT_AX88796 is not set
CONFIG_PARPORT_1284=y
CONFIG_PARPORT_NOT_PC=y
CONFIG_PNP=y
# CONFIG_PNP_DEBUG_MESSAGES is not set

#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
CONFIG_BLK_DEV_NULL_BLK=m
CONFIG_BLK_DEV_FD=m
# CONFIG_PARIDE is not set
CONFIG_BLK_DEV_PCIESSD_MTIP32XX=m
CONFIG_ZRAM=m
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_LOOP_MIN_COUNT=0
# CONFIG_BLK_DEV_CRYPTOLOOP is not set
# CONFIG_BLK_DEV_DRBD is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_SKD is not set
CONFIG_BLK_DEV_OSD=m
CONFIG_BLK_DEV_SX8=m
CONFIG_BLK_DEV_RAM=m
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=16384
CONFIG_CDROM_PKTCDVD=m
CONFIG_CDROM_PKTCDVD_BUFFERS=8
# CONFIG_CDROM_PKTCDVD_WCACHE is not set
CONFIG_ATA_OVER_ETH=m
CONFIG_XEN_BLKDEV_FRONTEND=m
# CONFIG_XEN_BLKDEV_BACKEND is not set
CONFIG_VIRTIO_BLK=m
# CONFIG_VIRTIO_BLK_SCSI is not set
# CONFIG_BLK_DEV_HD is not set
CONFIG_BLK_DEV_RBD=m
# CONFIG_BLK_DEV_RSXX is not set
CONFIG_NVME_CORE=m
CONFIG_BLK_DEV_NVME=m
# CONFIG_BLK_DEV_NVME_SCSI is not set
# CONFIG_NVME_FC is not set
# CONFIG_NVME_TARGET is not set

#
# Misc devices
#
CONFIG_SENSORS_LIS3LV02D=m
# CONFIG_AD525X_DPOT is not set
# CONFIG_DUMMY_IRQ is not set
# CONFIG_IBM_ASM is not set
# CONFIG_PHANTOM is not set
CONFIG_SGI_IOC4=m
CONFIG_TIFM_CORE=m
CONFIG_TIFM_7XX1=m
# CONFIG_ICS932S401 is not set
CONFIG_ENCLOSURE_SERVICES=m
CONFIG_SGI_XP=m
CONFIG_HP_ILO=m
CONFIG_SGI_GRU=m
# CONFIG_SGI_GRU_DEBUG is not set
CONFIG_APDS9802ALS=m
CONFIG_ISL29003=m
CONFIG_ISL29020=m
CONFIG_SENSORS_TSL2550=m
CONFIG_SENSORS_BH1770=m
CONFIG_SENSORS_APDS990X=m
# CONFIG_HMC6352 is not set
# CONFIG_DS1682 is not set
CONFIG_VMWARE_BALLOON=m
# CONFIG_USB_SWITCH_FSA9480 is not set
# CONFIG_SRAM is not set
# CONFIG_PANEL is not set
# CONFIG_C2PORT is not set

#
# EEPROM support
#
CONFIG_EEPROM_AT24=m
CONFIG_EEPROM_LEGACY=m
CONFIG_EEPROM_MAX6875=m
CONFIG_EEPROM_93CX6=m
# CONFIG_EEPROM_IDT_89HPESX is not set
CONFIG_CB710_CORE=m
# CONFIG_CB710_DEBUG is not set
CONFIG_CB710_DEBUG_ASSUMPTIONS=y

#
# Texas Instruments shared transport line discipline
#
CONFIG_SENSORS_LIS3_I2C=m

#
# Altera FPGA firmware download module
#
CONFIG_ALTERA_STAPL=m
CONFIG_INTEL_MEI=m
CONFIG_INTEL_MEI_ME=m
# CONFIG_INTEL_MEI_TXE is not set
CONFIG_VMWARE_VMCI=m

#
# Intel MIC Bus Driver
#
# CONFIG_INTEL_MIC_BUS is not set

#
# SCIF Bus Driver
#
# CONFIG_SCIF_BUS is not set

#
# VOP Bus Driver
#
# CONFIG_VOP_BUS is not set

#
# Intel MIC Host Driver
#

#
# Intel MIC Card Driver
#

#
# SCIF Driver
#

#
# Intel MIC Coprocessor State Management (COSM) Drivers
#

#
# VOP Driver
#
# CONFIG_GENWQE is not set
# CONFIG_ECHO is not set
# CONFIG_CXL_BASE is not set
# CONFIG_CXL_AFU_DRIVER_OPS is not set
CONFIG_HAVE_IDE=y
# CONFIG_IDE is not set

#
# SCSI device support
#
CONFIG_SCSI_MOD=y
CONFIG_RAID_ATTRS=m
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
CONFIG_SCSI_NETLINK=y
# CONFIG_SCSI_MQ_DEFAULT is not set
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=m
CONFIG_CHR_DEV_ST=m
CONFIG_CHR_DEV_OSST=m
CONFIG_BLK_DEV_SR=m
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_CHR_DEV_SG=m
CONFIG_CHR_DEV_SCH=m
CONFIG_SCSI_ENCLOSURE=m
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_SCAN_ASYNC=y

#
# SCSI Transports
#
CONFIG_SCSI_SPI_ATTRS=m
CONFIG_SCSI_FC_ATTRS=m
CONFIG_SCSI_ISCSI_ATTRS=m
CONFIG_SCSI_SAS_ATTRS=m
CONFIG_SCSI_SAS_LIBSAS=m
CONFIG_SCSI_SAS_ATA=y
CONFIG_SCSI_SAS_HOST_SMP=y
CONFIG_SCSI_SRP_ATTRS=m
CONFIG_SCSI_LOWLEVEL=y
CONFIG_ISCSI_TCP=m
CONFIG_ISCSI_BOOT_SYSFS=m
CONFIG_SCSI_CXGB3_ISCSI=m
CONFIG_SCSI_CXGB4_ISCSI=m
CONFIG_SCSI_BNX2_ISCSI=m
CONFIG_SCSI_BNX2X_FCOE=m
CONFIG_BE2ISCSI=m
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
CONFIG_SCSI_HPSA=m
CONFIG_SCSI_3W_9XXX=m
CONFIG_SCSI_3W_SAS=m
# CONFIG_SCSI_ACARD is not set
CONFIG_SCSI_AACRAID=m
# CONFIG_SCSI_AIC7XXX is not set
CONFIG_SCSI_AIC79XX=m
CONFIG_AIC79XX_CMDS_PER_DEVICE=4
CONFIG_AIC79XX_RESET_DELAY_MS=15000
# CONFIG_AIC79XX_DEBUG_ENABLE is not set
CONFIG_AIC79XX_DEBUG_MASK=0
# CONFIG_AIC79XX_REG_PRETTY_PRINT is not set
# CONFIG_SCSI_AIC94XX is not set
CONFIG_SCSI_MVSAS=m
# CONFIG_SCSI_MVSAS_DEBUG is not set
CONFIG_SCSI_MVSAS_TASKLET=y
CONFIG_SCSI_MVUMI=m
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
CONFIG_SCSI_ARCMSR=m
# CONFIG_SCSI_ESAS2R is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
CONFIG_MEGARAID_SAS=m
CONFIG_SCSI_MPT3SAS=m
CONFIG_SCSI_MPT2SAS_MAX_SGE=128
CONFIG_SCSI_MPT3SAS_MAX_SGE=128
CONFIG_SCSI_MPT2SAS=m
# CONFIG_SCSI_SMARTPQI is not set
CONFIG_SCSI_UFSHCD=m
CONFIG_SCSI_UFSHCD_PCI=m
# CONFIG_SCSI_UFS_DWC_TC_PCI is not set
# CONFIG_SCSI_UFSHCD_PLATFORM is not set
CONFIG_SCSI_HPTIOP=m
# CONFIG_SCSI_BUSLOGIC is not set
CONFIG_VMWARE_PVSCSI=m
# CONFIG_XEN_SCSI_FRONTEND is not set
CONFIG_HYPERV_STORAGE=m
CONFIG_LIBFC=m
CONFIG_LIBFCOE=m
CONFIG_FCOE=m
CONFIG_FCOE_FNIC=m
# CONFIG_SCSI_SNIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
CONFIG_SCSI_ISCI=m
# CONFIG_SCSI_IPS is not set
CONFIG_SCSI_INITIO=m
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_PPA is not set
# CONFIG_SCSI_IMM is not set
CONFIG_SCSI_STEX=m
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
CONFIG_SCSI_QLA_FC=m
# CONFIG_TCM_QLA2XXX is not set
CONFIG_SCSI_QLA_ISCSI=m
CONFIG_SCSI_LPFC=m
# CONFIG_SCSI_LPFC_DEBUG_FS is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_AM53C974 is not set
# CONFIG_SCSI_WD719X is not set
CONFIG_SCSI_DEBUG=m
CONFIG_SCSI_PMCRAID=m
CONFIG_SCSI_PM8001=m
CONFIG_SCSI_BFA_FC=m
CONFIG_SCSI_VIRTIO=m
# CONFIG_SCSI_CHELSIO_FCOE is not set
CONFIG_SCSI_DH=y
CONFIG_SCSI_DH_RDAC=y
CONFIG_SCSI_DH_HP_SW=y
CONFIG_SCSI_DH_EMC=y
CONFIG_SCSI_DH_ALUA=y
CONFIG_SCSI_OSD_INITIATOR=m
CONFIG_SCSI_OSD_ULD=m
CONFIG_SCSI_OSD_DPRINT_SENSE=1
# CONFIG_SCSI_OSD_DEBUG is not set
CONFIG_ATA=m
# CONFIG_ATA_NONSTANDARD is not set
CONFIG_ATA_VERBOSE_ERROR=y
CONFIG_ATA_ACPI=y
# CONFIG_SATA_ZPODD is not set
CONFIG_SATA_PMP=y

#
# Controllers with non-SFF native interface
#
CONFIG_SATA_AHCI=m
CONFIG_SATA_AHCI_PLATFORM=m
# CONFIG_SATA_INIC162X is not set
CONFIG_SATA_ACARD_AHCI=m
CONFIG_SATA_SIL24=m
CONFIG_ATA_SFF=y

#
# SFF controllers with custom DMA interface
#
CONFIG_PDC_ADMA=m
CONFIG_SATA_QSTOR=m
CONFIG_SATA_SX4=m
CONFIG_ATA_BMDMA=y

#
# SATA SFF controllers with BMDMA
#
CONFIG_ATA_PIIX=m
# CONFIG_SATA_DWC is not set
CONFIG_SATA_MV=m
CONFIG_SATA_NV=m
CONFIG_SATA_PROMISE=m
CONFIG_SATA_SIL=m
CONFIG_SATA_SIS=m
CONFIG_SATA_SVW=m
CONFIG_SATA_ULI=m
CONFIG_SATA_VIA=m
CONFIG_SATA_VITESSE=m

#
# PATA SFF controllers with BMDMA
#
CONFIG_PATA_ALI=m
CONFIG_PATA_AMD=m
CONFIG_PATA_ARTOP=m
CONFIG_PATA_ATIIXP=m
CONFIG_PATA_ATP867X=m
CONFIG_PATA_CMD64X=m
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
CONFIG_PATA_HPT366=m
CONFIG_PATA_HPT37X=m
CONFIG_PATA_HPT3X2N=m
CONFIG_PATA_HPT3X3=m
# CONFIG_PATA_HPT3X3_DMA is not set
CONFIG_PATA_IT8213=m
CONFIG_PATA_IT821X=m
CONFIG_PATA_JMICRON=m
CONFIG_PATA_MARVELL=m
CONFIG_PATA_NETCELL=m
CONFIG_PATA_NINJA32=m
# CONFIG_PATA_NS87415 is not set
CONFIG_PATA_OLDPIIX=m
# CONFIG_PATA_OPTIDMA is not set
CONFIG_PATA_PDC2027X=m
CONFIG_PATA_PDC_OLD=m
# CONFIG_PATA_RADISYS is not set
CONFIG_PATA_RDC=m
CONFIG_PATA_SCH=m
CONFIG_PATA_SERVERWORKS=m
CONFIG_PATA_SIL680=m
CONFIG_PATA_SIS=m
CONFIG_PATA_TOSHIBA=m
# CONFIG_PATA_TRIFLEX is not set
CONFIG_PATA_VIA=m
# CONFIG_PATA_WINBOND is not set

#
# PIO-only SFF controllers
#
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_MPIIX is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_OPTI is not set
# CONFIG_PATA_RZ1000 is not set

#
# Generic fallback / legacy drivers
#
CONFIG_PATA_ACPI=m
CONFIG_ATA_GENERIC=m
# CONFIG_PATA_LEGACY is not set
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_AUTODETECT=y
CONFIG_MD_LINEAR=m
CONFIG_MD_RAID0=m
CONFIG_MD_RAID1=m
CONFIG_MD_RAID10=m
CONFIG_MD_RAID456=m
# CONFIG_MD_MULTIPATH is not set
CONFIG_MD_FAULTY=m
# CONFIG_MD_CLUSTER is not set
# CONFIG_BCACHE is not set
CONFIG_BLK_DEV_DM_BUILTIN=y
CONFIG_BLK_DEV_DM=m
# CONFIG_DM_MQ_DEFAULT is not set
CONFIG_DM_DEBUG=y
CONFIG_DM_BUFIO=m
# CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING is not set
CONFIG_DM_BIO_PRISON=m
CONFIG_DM_PERSISTENT_DATA=m
CONFIG_DM_CRYPT=m
CONFIG_DM_SNAPSHOT=m
CONFIG_DM_THIN_PROVISIONING=m
CONFIG_DM_CACHE=m
CONFIG_DM_CACHE_SMQ=m
CONFIG_DM_CACHE_CLEANER=m
CONFIG_DM_ERA=m
CONFIG_DM_MIRROR=m
CONFIG_DM_LOG_USERSPACE=m
CONFIG_DM_RAID=m
CONFIG_DM_ZERO=m
CONFIG_DM_MULTIPATH=m
CONFIG_DM_MULTIPATH_QL=m
CONFIG_DM_MULTIPATH_ST=m
CONFIG_DM_DELAY=m
CONFIG_DM_UEVENT=y
CONFIG_DM_FLAKEY=m
CONFIG_DM_VERITY=m
# CONFIG_DM_VERITY_FEC is not set
CONFIG_DM_SWITCH=m
# CONFIG_DM_LOG_WRITES is not set
CONFIG_TARGET_CORE=m
CONFIG_TCM_IBLOCK=m
CONFIG_TCM_FILEIO=m
CONFIG_TCM_PSCSI=m
# CONFIG_TCM_USER2 is not set
CONFIG_LOOPBACK_TARGET=m
CONFIG_TCM_FC=m
CONFIG_ISCSI_TARGET=m
# CONFIG_ISCSI_TARGET_CXGB4 is not set
# CONFIG_SBP_TARGET is not set
CONFIG_FUSION=y
CONFIG_FUSION_SPI=m
# CONFIG_FUSION_FC is not set
CONFIG_FUSION_SAS=m
CONFIG_FUSION_MAX_SGE=128
CONFIG_FUSION_CTL=m
CONFIG_FUSION_LOGGING=y

#
# IEEE 1394 (FireWire) support
#
CONFIG_FIREWIRE=m
CONFIG_FIREWIRE_OHCI=m
CONFIG_FIREWIRE_SBP2=m
CONFIG_FIREWIRE_NET=m
# CONFIG_FIREWIRE_NOSY is not set
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
CONFIG_MII=m
CONFIG_NET_CORE=y
CONFIG_BONDING=m
CONFIG_DUMMY=m
# CONFIG_EQUALIZER is not set
CONFIG_NET_FC=y
CONFIG_IFB=m
CONFIG_NET_TEAM=m
CONFIG_NET_TEAM_MODE_BROADCAST=m
CONFIG_NET_TEAM_MODE_ROUNDROBIN=m
CONFIG_NET_TEAM_MODE_RANDOM=m
CONFIG_NET_TEAM_MODE_ACTIVEBACKUP=m
CONFIG_NET_TEAM_MODE_LOADBALANCE=m
CONFIG_MACVLAN=m
CONFIG_MACVTAP=m
CONFIG_VXLAN=m
CONFIG_GENEVE=m
# CONFIG_GTP is not set
# CONFIG_MACSEC is not set
CONFIG_NETCONSOLE=m
CONFIG_NETCONSOLE_DYNAMIC=y
CONFIG_NETPOLL=y
CONFIG_NET_POLL_CONTROLLER=y
CONFIG_TUN=m
CONFIG_TAP=m
# CONFIG_TUN_VNET_CROSS_LE is not set
CONFIG_VETH=m
CONFIG_VIRTIO_NET=m
CONFIG_NLMON=m
# CONFIG_ARCNET is not set

#
# CAIF transport drivers
#

#
# Distributed Switch Architecture drivers
#
CONFIG_ETHERNET=y
CONFIG_MDIO=m
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_NET_VENDOR_ADAPTEC is not set
# CONFIG_NET_VENDOR_AGERE is not set
CONFIG_NET_VENDOR_ALACRITECH=y
# CONFIG_SLICOSS is not set
# CONFIG_NET_VENDOR_ALTEON is not set
# CONFIG_ALTERA_TSE is not set
CONFIG_NET_VENDOR_AMAZON=y
# CONFIG_ENA_ETHERNET is not set
# CONFIG_NET_VENDOR_AMD is not set
CONFIG_NET_VENDOR_AQUANTIA=y
# CONFIG_AQTION is not set
# CONFIG_NET_VENDOR_ARC is not set
# CONFIG_NET_VENDOR_ATHEROS is not set
# CONFIG_NET_VENDOR_AURORA is not set
# CONFIG_NET_CADENCE is not set
CONFIG_NET_VENDOR_BROADCOM=y
CONFIG_B44=m
CONFIG_B44_PCI_AUTOSELECT=y
CONFIG_B44_PCICORE_AUTOSELECT=y
CONFIG_B44_PCI=y
# CONFIG_BCMGENET is not set
CONFIG_BNX2=m
CONFIG_CNIC=m
CONFIG_TIGON3=m
# CONFIG_BNX2X is not set
# CONFIG_BNXT is not set
# CONFIG_NET_VENDOR_BROCADE is not set
# CONFIG_NET_VENDOR_CAVIUM is not set
CONFIG_NET_VENDOR_CHELSIO=y
# CONFIG_CHELSIO_T1 is not set
CONFIG_CHELSIO_T3=m
CONFIG_CHELSIO_T4=m
# CONFIG_CHELSIO_T4_DCB is not set
# CONFIG_CHELSIO_T4VF is not set
CONFIG_CHELSIO_LIB=m
CONFIG_NET_VENDOR_CISCO=y
CONFIG_ENIC=m
# CONFIG_CX_ECAT is not set
# CONFIG_DNET is not set
# CONFIG_NET_VENDOR_DEC is not set
# CONFIG_NET_VENDOR_DLINK is not set
CONFIG_NET_VENDOR_EMULEX=y
CONFIG_BE2NET=m
# CONFIG_BE2NET_HWMON is not set
# CONFIG_NET_VENDOR_EZCHIP is not set
# CONFIG_NET_VENDOR_EXAR is not set
# CONFIG_NET_VENDOR_HP is not set
CONFIG_NET_VENDOR_INTEL=y
# CONFIG_E100 is not set
CONFIG_E1000=m
CONFIG_E1000E=m
CONFIG_E1000E_HWTS=y
CONFIG_IGB=m
CONFIG_IGB_HWMON=y
CONFIG_IGB_DCA=y
CONFIG_IGBVF=m
# CONFIG_IXGB is not set
CONFIG_IXGBE=m
CONFIG_IXGBE_HWMON=y
CONFIG_IXGBE_DCA=y
CONFIG_IXGBE_DCB=y
CONFIG_IXGBEVF=m
CONFIG_I40E=m
CONFIG_I40E_DCB=y
# CONFIG_I40E_FCOE is not set
CONFIG_I40EVF=m
CONFIG_FM10K=m
# CONFIG_NET_VENDOR_I825XX is not set
# CONFIG_JME is not set
# CONFIG_NET_VENDOR_MARVELL is not set
CONFIG_NET_VENDOR_MELLANOX=y
# CONFIG_MLX4_EN is not set
# CONFIG_MLX4_CORE is not set
# CONFIG_MLX5_CORE is not set
# CONFIG_MLXSW_CORE is not set
# CONFIG_NET_VENDOR_MICREL is not set
# CONFIG_NET_VENDOR_MYRI is not set
# CONFIG_FEALNX is not set
# CONFIG_NET_VENDOR_NATSEMI is not set
CONFIG_NET_VENDOR_NETRONOME=y
# CONFIG_NFP is not set
# CONFIG_NET_VENDOR_NVIDIA is not set
# CONFIG_NET_VENDOR_OKI is not set
# CONFIG_ETHOC is not set
# CONFIG_NET_PACKET_ENGINE is not set
# CONFIG_NET_VENDOR_QLOGIC is not set
# CONFIG_NET_VENDOR_QUALCOMM is not set
# CONFIG_NET_VENDOR_REALTEK is not set
# CONFIG_NET_VENDOR_RENESAS is not set
# CONFIG_NET_VENDOR_RDC is not set
# CONFIG_NET_VENDOR_ROCKER is not set
# CONFIG_NET_VENDOR_SAMSUNG is not set
# CONFIG_NET_VENDOR_SEEQ is not set
# CONFIG_NET_VENDOR_SILAN is not set
# CONFIG_NET_VENDOR_SIS is not set
CONFIG_NET_VENDOR_SOLARFLARE=y
# CONFIG_SFC is not set
# CONFIG_SFC_FALCON is not set
# CONFIG_NET_VENDOR_SMSC is not set
# CONFIG_NET_VENDOR_STMICRO is not set
# CONFIG_NET_VENDOR_SUN is not set
# CONFIG_NET_VENDOR_TEHUTI is not set
# CONFIG_NET_VENDOR_TI is not set
# CONFIG_NET_VENDOR_VIA is not set
# CONFIG_NET_VENDOR_WIZNET is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_NET_SB1000 is not set
CONFIG_PHYLIB=y
CONFIG_SWPHY=y
# CONFIG_LED_TRIGGER_PHY is not set

#
# MDIO bus device drivers
#
# CONFIG_MDIO_BCM_UNIMAC is not set
CONFIG_MDIO_BITBANG=m
# CONFIG_MDIO_OCTEON is not set
# CONFIG_MDIO_THUNDER is not set

#
# MII PHY device drivers
#
CONFIG_AMD_PHY=m
# CONFIG_AQUANTIA_PHY is not set
CONFIG_AT803X_PHY=m
# CONFIG_BCM7XXX_PHY is not set
CONFIG_BCM87XX_PHY=m
CONFIG_BCM_NET_PHYLIB=m
CONFIG_BROADCOM_PHY=m
CONFIG_CICADA_PHY=m
CONFIG_DAVICOM_PHY=m
# CONFIG_DP83848_PHY is not set
# CONFIG_DP83867_PHY is not set
CONFIG_FIXED_PHY=y
CONFIG_ICPLUS_PHY=m
# CONFIG_INTEL_XWAY_PHY is not set
CONFIG_LSI_ET1011C_PHY=m
CONFIG_LXT_PHY=m
CONFIG_MARVELL_PHY=m
CONFIG_MICREL_PHY=m
# CONFIG_MICROCHIP_PHY is not set
# CONFIG_MICROSEMI_PHY is not set
CONFIG_NATIONAL_PHY=m
CONFIG_QSEMI_PHY=m
CONFIG_REALTEK_PHY=m
CONFIG_SMSC_PHY=m
CONFIG_STE10XP=m
# CONFIG_TERANETICS_PHY is not set
CONFIG_VITESSE_PHY=m
# CONFIG_XILINX_GMII2RGMII is not set
# CONFIG_PLIP is not set
CONFIG_PPP=m
CONFIG_PPP_BSDCOMP=m
CONFIG_PPP_DEFLATE=m
CONFIG_PPP_FILTER=y
CONFIG_PPP_MPPE=m
CONFIG_PPP_MULTILINK=y
CONFIG_PPPOE=m
CONFIG_PPTP=m
CONFIG_PPPOL2TP=m
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_SLIP=m
CONFIG_SLHC=m
CONFIG_SLIP_COMPRESSED=y
CONFIG_SLIP_SMART=y
# CONFIG_SLIP_MODE_SLIP6 is not set
CONFIG_USB_NET_DRIVERS=y
CONFIG_USB_CATC=m
CONFIG_USB_KAWETH=m
CONFIG_USB_PEGASUS=m
CONFIG_USB_RTL8150=m
CONFIG_USB_RTL8152=m
# CONFIG_USB_LAN78XX is not set
CONFIG_USB_USBNET=m
CONFIG_USB_NET_AX8817X=m
CONFIG_USB_NET_AX88179_178A=m
CONFIG_USB_NET_CDCETHER=m
CONFIG_USB_NET_CDC_EEM=m
CONFIG_USB_NET_CDC_NCM=m
CONFIG_USB_NET_HUAWEI_CDC_NCM=m
CONFIG_USB_NET_CDC_MBIM=m
CONFIG_USB_NET_DM9601=m
# CONFIG_USB_NET_SR9700 is not set
# CONFIG_USB_NET_SR9800 is not set
CONFIG_USB_NET_SMSC75XX=m
CONFIG_USB_NET_SMSC95XX=m
CONFIG_USB_NET_GL620A=m
CONFIG_USB_NET_NET1080=m
CONFIG_USB_NET_PLUSB=m
CONFIG_USB_NET_MCS7830=m
CONFIG_USB_NET_RNDIS_HOST=m
CONFIG_USB_NET_CDC_SUBSET_ENABLE=m
CONFIG_USB_NET_CDC_SUBSET=m
CONFIG_USB_ALI_M5632=y
CONFIG_USB_AN2720=y
CONFIG_USB_BELKIN=y
CONFIG_USB_ARMLINUX=y
CONFIG_USB_EPSON2888=y
CONFIG_USB_KC2190=y
CONFIG_USB_NET_ZAURUS=m
CONFIG_USB_NET_CX82310_ETH=m
CONFIG_USB_NET_KALMIA=m
CONFIG_USB_NET_QMI_WWAN=m
CONFIG_USB_NET_INT51X1=m
CONFIG_USB_IPHETH=m
CONFIG_USB_SIERRA_NET=m
CONFIG_USB_VL600=m
# CONFIG_USB_NET_CH9200 is not set
# CONFIG_WLAN is not set

#
# Enable WiMAX (Networking options) to see the WiMAX drivers
#
# CONFIG_WAN is not set
CONFIG_IEEE802154_DRIVERS=m
CONFIG_IEEE802154_FAKELB=m
# CONFIG_IEEE802154_ATUSB is not set
# CONFIG_XEN_NETDEV_FRONTEND is not set
# CONFIG_XEN_NETDEV_BACKEND is not set
# CONFIG_VMXNET3 is not set
# CONFIG_FUJITSU_ES is not set
# CONFIG_HYPERV_NET is not set
# CONFIG_ISDN is not set
# CONFIG_NVM is not set

#
# Input device support
#
CONFIG_INPUT=y
CONFIG_INPUT_LEDS=y
CONFIG_INPUT_FF_MEMLESS=m
CONFIG_INPUT_POLLDEV=m
CONFIG_INPUT_SPARSEKMAP=m
# CONFIG_INPUT_MATRIXKMAP is not set

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
# CONFIG_INPUT_MOUSEDEV_PSAUX is not set
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
# CONFIG_KEYBOARD_ADP5588 is not set
# CONFIG_KEYBOARD_ADP5589 is not set
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_QT1070 is not set
# CONFIG_KEYBOARD_QT2160 is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_TCA6416 is not set
# CONFIG_KEYBOARD_TCA8418 is not set
# CONFIG_KEYBOARD_LM8323 is not set
# CONFIG_KEYBOARD_LM8333 is not set
# CONFIG_KEYBOARD_MAX7359 is not set
# CONFIG_KEYBOARD_MCS is not set
# CONFIG_KEYBOARD_MPR121 is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_OPENCORES is not set
# CONFIG_KEYBOARD_SAMSUNG is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_TM2_TOUCHKEY is not set
# CONFIG_KEYBOARD_XTKBD is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_BYD=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_CYPRESS=y
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
CONFIG_MOUSE_PS2_ELANTECH=y
CONFIG_MOUSE_PS2_SENTELIC=y
# CONFIG_MOUSE_PS2_TOUCHKIT is not set
CONFIG_MOUSE_PS2_FOCALTECH=y
# CONFIG_MOUSE_PS2_VMMOUSE is not set
CONFIG_MOUSE_SERIAL=m
CONFIG_MOUSE_APPLETOUCH=m
CONFIG_MOUSE_BCM5974=m
CONFIG_MOUSE_CYAPA=m
# CONFIG_MOUSE_ELAN_I2C is not set
CONFIG_MOUSE_VSXXXAA=m
CONFIG_MOUSE_SYNAPTICS_I2C=m
CONFIG_MOUSE_SYNAPTICS_USB=m
# CONFIG_INPUT_JOYSTICK is not set
CONFIG_INPUT_TABLET=y
CONFIG_TABLET_USB_ACECAD=m
CONFIG_TABLET_USB_AIPTEK=m
CONFIG_TABLET_USB_GTCO=m
# CONFIG_TABLET_USB_HANWANG is not set
CONFIG_TABLET_USB_KBTAB=m
# CONFIG_TABLET_USB_PEGASUS is not set
# CONFIG_TABLET_SERIAL_WACOM4 is not set
CONFIG_INPUT_TOUCHSCREEN=y
CONFIG_TOUCHSCREEN_PROPERTIES=y
# CONFIG_TOUCHSCREEN_AD7879 is not set
# CONFIG_TOUCHSCREEN_ATMEL_MXT is not set
# CONFIG_TOUCHSCREEN_BU21013 is not set
# CONFIG_TOUCHSCREEN_CYTTSP_CORE is not set
# CONFIG_TOUCHSCREEN_CYTTSP4_CORE is not set
# CONFIG_TOUCHSCREEN_DYNAPRO is not set
# CONFIG_TOUCHSCREEN_HAMPSHIRE is not set
# CONFIG_TOUCHSCREEN_EETI is not set
# CONFIG_TOUCHSCREEN_EGALAX_SERIAL is not set
# CONFIG_TOUCHSCREEN_FUJITSU is not set
# CONFIG_TOUCHSCREEN_ILI210X is not set
# CONFIG_TOUCHSCREEN_GUNZE is not set
# CONFIG_TOUCHSCREEN_EKTF2127 is not set
# CONFIG_TOUCHSCREEN_ELAN is not set
CONFIG_TOUCHSCREEN_ELO=m
CONFIG_TOUCHSCREEN_WACOM_W8001=m
CONFIG_TOUCHSCREEN_WACOM_I2C=m
# CONFIG_TOUCHSCREEN_MAX11801 is not set
# CONFIG_TOUCHSCREEN_MCS5000 is not set
# CONFIG_TOUCHSCREEN_MMS114 is not set
# CONFIG_TOUCHSCREEN_MELFAS_MIP4 is not set
# CONFIG_TOUCHSCREEN_MTOUCH is not set
# CONFIG_TOUCHSCREEN_INEXIO is not set
# CONFIG_TOUCHSCREEN_MK712 is not set
# CONFIG_TOUCHSCREEN_PENMOUNT is not set
# CONFIG_TOUCHSCREEN_EDT_FT5X06 is not set
# CONFIG_TOUCHSCREEN_TOUCHRIGHT is not set
# CONFIG_TOUCHSCREEN_TOUCHWIN is not set
# CONFIG_TOUCHSCREEN_PIXCIR is not set
# CONFIG_TOUCHSCREEN_WDT87XX_I2C is not set
# CONFIG_TOUCHSCREEN_USB_COMPOSITE is not set
# CONFIG_TOUCHSCREEN_TOUCHIT213 is not set
# CONFIG_TOUCHSCREEN_TSC_SERIO is not set
# CONFIG_TOUCHSCREEN_TSC2004 is not set
# CONFIG_TOUCHSCREEN_TSC2007 is not set
# CONFIG_TOUCHSCREEN_SILEAD is not set
# CONFIG_TOUCHSCREEN_ST1232 is not set
# CONFIG_TOUCHSCREEN_SX8654 is not set
# CONFIG_TOUCHSCREEN_TPS6507X is not set
# CONFIG_TOUCHSCREEN_ZET6223 is not set
# CONFIG_TOUCHSCREEN_ROHM_BU21023 is not set
CONFIG_INPUT_MISC=y
# CONFIG_INPUT_AD714X is not set
# CONFIG_INPUT_BMA150 is not set
# CONFIG_INPUT_E3X0_BUTTON is not set
CONFIG_INPUT_PCSPKR=m
# CONFIG_INPUT_MMA8450 is not set
CONFIG_INPUT_APANEL=m
CONFIG_INPUT_ATLAS_BTNS=m
CONFIG_INPUT_ATI_REMOTE2=m
CONFIG_INPUT_KEYSPAN_REMOTE=m
# CONFIG_INPUT_KXTJ9 is not set
CONFIG_INPUT_POWERMATE=m
CONFIG_INPUT_YEALINK=m
CONFIG_INPUT_CM109=m
CONFIG_INPUT_UINPUT=m
# CONFIG_INPUT_PCF8574 is not set
# CONFIG_INPUT_ADXL34X is not set
# CONFIG_INPUT_IMS_PCU is not set
# CONFIG_INPUT_CMA3000 is not set
CONFIG_INPUT_XEN_KBDDEV_FRONTEND=m
# CONFIG_INPUT_IDEAPAD_SLIDEBAR is not set
# CONFIG_INPUT_DRV2665_HAPTICS is not set
# CONFIG_INPUT_DRV2667_HAPTICS is not set
CONFIG_RMI4_CORE=m
# CONFIG_RMI4_I2C is not set
# CONFIG_RMI4_SMB is not set
CONFIG_RMI4_F03=y
CONFIG_RMI4_F03_SERIO=m
CONFIG_RMI4_2D_SENSOR=y
CONFIG_RMI4_F11=y
CONFIG_RMI4_F12=y
CONFIG_RMI4_F30=y
# CONFIG_RMI4_F34 is not set
# CONFIG_RMI4_F55 is not set

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PARKBD is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
CONFIG_SERIO_RAW=m
CONFIG_SERIO_ALTERA_PS2=m
# CONFIG_SERIO_PS2MULT is not set
CONFIG_SERIO_ARC_PS2=m
CONFIG_HYPERV_KEYBOARD=m
# CONFIG_USERIO is not set
# CONFIG_GAMEPORT is not set

#
# Character devices
#
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_VT_CONSOLE_SLEEP=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_UNIX98_PTYS=y
# CONFIG_LEGACY_PTYS is not set
CONFIG_SERIAL_NONSTANDARD=y
# CONFIG_ROCKETPORT is not set
CONFIG_CYCLADES=m
# CONFIG_CYZ_INTR is not set
# CONFIG_MOXA_INTELLIO is not set
# CONFIG_MOXA_SMARTIO is not set
CONFIG_SYNCLINK=m
CONFIG_SYNCLINKMP=m
CONFIG_SYNCLINK_GT=m
CONFIG_NOZOMI=m
# CONFIG_ISI is not set
CONFIG_N_HDLC=m
CONFIG_N_GSM=m
# CONFIG_TRACE_SINK is not set
CONFIG_DEVMEM=y
# CONFIG_DEVKMEM is not set

#
# Serial drivers
#
CONFIG_SERIAL_EARLYCON=y
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
CONFIG_SERIAL_8250_PNP=y
# CONFIG_SERIAL_8250_FINTEK is not set
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_DMA=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_EXAR=y
CONFIG_SERIAL_8250_NR_UARTS=32
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_SHARE_IRQ=y
# CONFIG_SERIAL_8250_DETECT_IRQ is not set
CONFIG_SERIAL_8250_RSA=y
# CONFIG_SERIAL_8250_FSL is not set
# CONFIG_SERIAL_8250_DW is not set
# CONFIG_SERIAL_8250_RT288X is not set
CONFIG_SERIAL_8250_LPSS=y
CONFIG_SERIAL_8250_MID=y
# CONFIG_SERIAL_8250_MOXA is not set

#
# Non-8250 serial port support
#
# CONFIG_SERIAL_KGDB_NMI is not set
# CONFIG_SERIAL_UARTLITE is not set
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_CONSOLE_POLL=y
CONFIG_SERIAL_JSM=m
# CONFIG_SERIAL_SCCNXP is not set
# CONFIG_SERIAL_SC16IS7XX is not set
# CONFIG_SERIAL_ALTERA_JTAGUART is not set
# CONFIG_SERIAL_ALTERA_UART is not set
CONFIG_SERIAL_ARC=m
CONFIG_SERIAL_ARC_NR_PORTS=1
# CONFIG_SERIAL_RP2 is not set
# CONFIG_SERIAL_FSL_LPUART is not set
# CONFIG_SERIAL_DEV_BUS is not set
CONFIG_PRINTER=m
# CONFIG_LP_CONSOLE is not set
CONFIG_PPDEV=m
CONFIG_HVC_DRIVER=y
CONFIG_HVC_IRQ=y
CONFIG_HVC_XEN=y
CONFIG_HVC_XEN_FRONTEND=y
CONFIG_VIRTIO_CONSOLE=m
CONFIG_IPMI_HANDLER=m
# CONFIG_IPMI_PANIC_EVENT is not set
CONFIG_IPMI_DEVICE_INTERFACE=m
CONFIG_IPMI_SI=m
CONFIG_IPMI_SSIF=m
CONFIG_IPMI_WATCHDOG=m
CONFIG_IPMI_POWEROFF=m
CONFIG_HW_RANDOM=y
CONFIG_HW_RANDOM_TIMERIOMEM=m
CONFIG_HW_RANDOM_INTEL=m
CONFIG_HW_RANDOM_AMD=m
CONFIG_HW_RANDOM_VIA=m
CONFIG_HW_RANDOM_VIRTIO=m
CONFIG_HW_RANDOM_TPM=m
CONFIG_NVRAM=y
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_MWAVE is not set
CONFIG_RAW_DRIVER=y
CONFIG_MAX_RAW_DEVS=8192
CONFIG_HPET=y
CONFIG_HPET_MMAP=y
# CONFIG_HPET_MMAP_DEFAULT is not set
CONFIG_HANGCHECK_TIMER=m
CONFIG_UV_MMTIMER=m
CONFIG_TCG_TPM=y
CONFIG_TCG_TIS_CORE=y
CONFIG_TCG_TIS=y
CONFIG_TCG_TIS_I2C_ATMEL=m
CONFIG_TCG_TIS_I2C_INFINEON=m
CONFIG_TCG_TIS_I2C_NUVOTON=m
CONFIG_TCG_NSC=m
CONFIG_TCG_ATMEL=m
CONFIG_TCG_INFINEON=m
# CONFIG_TCG_XEN is not set
CONFIG_TCG_CRB=m
# CONFIG_TCG_VTPM_PROXY is not set
# CONFIG_TCG_TIS_ST33ZP24_I2C is not set
CONFIG_TELCLOCK=m
CONFIG_DEVPORT=y
# CONFIG_XILLYBUS is not set

#
# I2C support
#
CONFIG_I2C=m
CONFIG_I2C_BOARDINFO=y
CONFIG_I2C_COMPAT=y
CONFIG_I2C_CHARDEV=m
CONFIG_I2C_MUX=m

#
# Multiplexer I2C Chip support
#
# CONFIG_I2C_MUX_PCA9541 is not set
# CONFIG_I2C_MUX_PINCTRL is not set
# CONFIG_I2C_MUX_REG is not set
# CONFIG_I2C_MUX_MLXCPLD is not set
CONFIG_I2C_HELPER_AUTO=y
CONFIG_I2C_SMBUS=m
CONFIG_I2C_ALGOBIT=m
CONFIG_I2C_ALGOPCA=m

#
# I2C Hardware Bus support
#

#
# PC SMBus host controller drivers
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI1563 is not set
# CONFIG_I2C_ALI15X3 is not set
CONFIG_I2C_AMD756=m
CONFIG_I2C_AMD756_S4882=m
CONFIG_I2C_AMD8111=m
CONFIG_I2C_I801=m
CONFIG_I2C_ISCH=m
CONFIG_I2C_ISMT=m
CONFIG_I2C_PIIX4=m
CONFIG_I2C_NFORCE2=m
CONFIG_I2C_NFORCE2_S4985=m
# CONFIG_I2C_SIS5595 is not set
# CONFIG_I2C_SIS630 is not set
CONFIG_I2C_SIS96X=m
CONFIG_I2C_VIA=m
CONFIG_I2C_VIAPRO=m

#
# ACPI drivers
#
CONFIG_I2C_SCMI=m

#
# I2C system bus drivers (mostly embedded / system-on-chip)
#
CONFIG_I2C_DESIGNWARE_CORE=m
CONFIG_I2C_DESIGNWARE_PLATFORM=m
# CONFIG_I2C_DESIGNWARE_PCI is not set
# CONFIG_I2C_DESIGNWARE_BAYTRAIL is not set
# CONFIG_I2C_EMEV2 is not set
# CONFIG_I2C_OCORES is not set
CONFIG_I2C_PCA_PLATFORM=m
# CONFIG_I2C_PXA_PCI is not set
CONFIG_I2C_SIMTEC=m
# CONFIG_I2C_XILINX is not set

#
# External I2C/SMBus adapter drivers
#
CONFIG_I2C_DIOLAN_U2C=m
CONFIG_I2C_PARPORT=m
CONFIG_I2C_PARPORT_LIGHT=m
# CONFIG_I2C_ROBOTFUZZ_OSIF is not set
# CONFIG_I2C_TAOS_EVM is not set
CONFIG_I2C_TINY_USB=m
CONFIG_I2C_VIPERBOARD=m

#
# Other I2C/SMBus bus drivers
#
# CONFIG_I2C_MLXCPLD is not set
CONFIG_I2C_STUB=m
# CONFIG_I2C_SLAVE is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_SPI is not set
# CONFIG_SPMI is not set
# CONFIG_HSI is not set

#
# PPS support
#
CONFIG_PPS=m
# CONFIG_PPS_DEBUG is not set

#
# PPS clients support
#
# CONFIG_PPS_CLIENT_KTIMER is not set
CONFIG_PPS_CLIENT_LDISC=m
CONFIG_PPS_CLIENT_PARPORT=m
CONFIG_PPS_CLIENT_GPIO=m

#
# PPS generators support
#

#
# PTP clock support
#
CONFIG_PTP_1588_CLOCK=m
CONFIG_DP83640_PHY=m
CONFIG_PTP_1588_CLOCK_KVM=m
CONFIG_PINCTRL=y

#
# Pin controllers
#
# CONFIG_DEBUG_PINCTRL is not set
# CONFIG_PINCTRL_CHERRYVIEW is not set
# CONFIG_PINCTRL_BROXTON is not set
# CONFIG_PINCTRL_GEMINILAKE is not set
# CONFIG_PINCTRL_SUNRISEPOINT is not set
# CONFIG_GPIOLIB is not set
# CONFIG_W1 is not set
# CONFIG_POWER_AVS is not set
CONFIG_POWER_RESET=y
# CONFIG_POWER_RESET_RESTART is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
# CONFIG_PDA_POWER is not set
# CONFIG_TEST_POWER is not set
# CONFIG_BATTERY_DS2780 is not set
# CONFIG_BATTERY_DS2781 is not set
# CONFIG_BATTERY_DS2782 is not set
# CONFIG_BATTERY_SBS is not set
# CONFIG_CHARGER_SBS is not set
# CONFIG_BATTERY_BQ27XXX is not set
# CONFIG_BATTERY_MAX17040 is not set
# CONFIG_BATTERY_MAX17042 is not set
# CONFIG_CHARGER_MAX8903 is not set
# CONFIG_CHARGER_LP8727 is not set
# CONFIG_CHARGER_BQ2415X is not set
CONFIG_CHARGER_SMB347=m
# CONFIG_BATTERY_GAUGE_LTC2941 is not set
CONFIG_HWMON=y
CONFIG_HWMON_VID=m
# CONFIG_HWMON_DEBUG_CHIP is not set

#
# Native drivers
#
CONFIG_SENSORS_ABITUGURU=m
CONFIG_SENSORS_ABITUGURU3=m
CONFIG_SENSORS_AD7414=m
CONFIG_SENSORS_AD7418=m
CONFIG_SENSORS_ADM1021=m
CONFIG_SENSORS_ADM1025=m
CONFIG_SENSORS_ADM1026=m
CONFIG_SENSORS_ADM1029=m
CONFIG_SENSORS_ADM1031=m
CONFIG_SENSORS_ADM9240=m
CONFIG_SENSORS_ADT7X10=m
CONFIG_SENSORS_ADT7410=m
CONFIG_SENSORS_ADT7411=m
CONFIG_SENSORS_ADT7462=m
CONFIG_SENSORS_ADT7470=m
CONFIG_SENSORS_ADT7475=m
CONFIG_SENSORS_ASC7621=m
CONFIG_SENSORS_K8TEMP=m
CONFIG_SENSORS_K10TEMP=m
CONFIG_SENSORS_FAM15H_POWER=m
CONFIG_SENSORS_APPLESMC=m
CONFIG_SENSORS_ASB100=m
CONFIG_SENSORS_ATXP1=m
CONFIG_SENSORS_DS620=m
CONFIG_SENSORS_DS1621=m
CONFIG_SENSORS_DELL_SMM=m
CONFIG_SENSORS_I5K_AMB=m
CONFIG_SENSORS_F71805F=m
CONFIG_SENSORS_F71882FG=m
CONFIG_SENSORS_F75375S=m
CONFIG_SENSORS_FSCHMD=m
# CONFIG_SENSORS_FTSTEUTATES is not set
CONFIG_SENSORS_GL518SM=m
CONFIG_SENSORS_GL520SM=m
CONFIG_SENSORS_G760A=m
# CONFIG_SENSORS_G762 is not set
# CONFIG_SENSORS_HIH6130 is not set
CONFIG_SENSORS_IBMAEM=m
CONFIG_SENSORS_IBMPEX=m
# CONFIG_SENSORS_I5500 is not set
CONFIG_SENSORS_CORETEMP=m
CONFIG_SENSORS_IT87=m
# CONFIG_SENSORS_JC42 is not set
# CONFIG_SENSORS_POWR1220 is not set
CONFIG_SENSORS_LINEAGE=m
# CONFIG_SENSORS_LTC2945 is not set
# CONFIG_SENSORS_LTC2990 is not set
CONFIG_SENSORS_LTC4151=m
CONFIG_SENSORS_LTC4215=m
# CONFIG_SENSORS_LTC4222 is not set
CONFIG_SENSORS_LTC4245=m
# CONFIG_SENSORS_LTC4260 is not set
CONFIG_SENSORS_LTC4261=m
CONFIG_SENSORS_MAX16065=m
CONFIG_SENSORS_MAX1619=m
CONFIG_SENSORS_MAX1668=m
CONFIG_SENSORS_MAX197=m
CONFIG_SENSORS_MAX6639=m
CONFIG_SENSORS_MAX6642=m
CONFIG_SENSORS_MAX6650=m
CONFIG_SENSORS_MAX6697=m
# CONFIG_SENSORS_MAX31790 is not set
CONFIG_SENSORS_MCP3021=m
# CONFIG_SENSORS_TC654 is not set
CONFIG_SENSORS_LM63=m
CONFIG_SENSORS_LM73=m
CONFIG_SENSORS_LM75=m
CONFIG_SENSORS_LM77=m
CONFIG_SENSORS_LM78=m
CONFIG_SENSORS_LM80=m
CONFIG_SENSORS_LM83=m
CONFIG_SENSORS_LM85=m
CONFIG_SENSORS_LM87=m
CONFIG_SENSORS_LM90=m
CONFIG_SENSORS_LM92=m
CONFIG_SENSORS_LM93=m
CONFIG_SENSORS_LM95234=m
CONFIG_SENSORS_LM95241=m
CONFIG_SENSORS_LM95245=m
CONFIG_SENSORS_PC87360=m
CONFIG_SENSORS_PC87427=m
CONFIG_SENSORS_NTC_THERMISTOR=m
# CONFIG_SENSORS_NCT6683 is not set
CONFIG_SENSORS_NCT6775=m
# CONFIG_SENSORS_NCT7802 is not set
# CONFIG_SENSORS_NCT7904 is not set
CONFIG_SENSORS_PCF8591=m
CONFIG_PMBUS=m
CONFIG_SENSORS_PMBUS=m
CONFIG_SENSORS_ADM1275=m
CONFIG_SENSORS_LM25066=m
CONFIG_SENSORS_LTC2978=m
# CONFIG_SENSORS_LTC3815 is not set
CONFIG_SENSORS_MAX16064=m
# CONFIG_SENSORS_MAX20751 is not set
CONFIG_SENSORS_MAX34440=m
CONFIG_SENSORS_MAX8688=m
# CONFIG_SENSORS_TPS40422 is not set
CONFIG_SENSORS_UCD9000=m
CONFIG_SENSORS_UCD9200=m
CONFIG_SENSORS_ZL6100=m
CONFIG_SENSORS_SHT21=m
# CONFIG_SENSORS_SHT3x is not set
# CONFIG_SENSORS_SHTC1 is not set
CONFIG_SENSORS_SIS5595=m
CONFIG_SENSORS_DME1737=m
CONFIG_SENSORS_EMC1403=m
# CONFIG_SENSORS_EMC2103 is not set
CONFIG_SENSORS_EMC6W201=m
CONFIG_SENSORS_SMSC47M1=m
CONFIG_SENSORS_SMSC47M192=m
CONFIG_SENSORS_SMSC47B397=m
CONFIG_SENSORS_SCH56XX_COMMON=m
CONFIG_SENSORS_SCH5627=m
CONFIG_SENSORS_SCH5636=m
# CONFIG_SENSORS_STTS751 is not set
# CONFIG_SENSORS_SMM665 is not set
# CONFIG_SENSORS_ADC128D818 is not set
CONFIG_SENSORS_ADS1015=m
CONFIG_SENSORS_ADS7828=m
CONFIG_SENSORS_AMC6821=m
CONFIG_SENSORS_INA209=m
CONFIG_SENSORS_INA2XX=m
# CONFIG_SENSORS_INA3221 is not set
# CONFIG_SENSORS_TC74 is not set
CONFIG_SENSORS_THMC50=m
CONFIG_SENSORS_TMP102=m
# CONFIG_SENSORS_TMP103 is not set
# CONFIG_SENSORS_TMP108 is not set
CONFIG_SENSORS_TMP401=m
CONFIG_SENSORS_TMP421=m
CONFIG_SENSORS_VIA_CPUTEMP=m
CONFIG_SENSORS_VIA686A=m
CONFIG_SENSORS_VT1211=m
CONFIG_SENSORS_VT8231=m
CONFIG_SENSORS_W83781D=m
CONFIG_SENSORS_W83791D=m
CONFIG_SENSORS_W83792D=m
CONFIG_SENSORS_W83793=m
CONFIG_SENSORS_W83795=m
# CONFIG_SENSORS_W83795_FANCTRL is not set
CONFIG_SENSORS_W83L785TS=m
CONFIG_SENSORS_W83L786NG=m
CONFIG_SENSORS_W83627HF=m
CONFIG_SENSORS_W83627EHF=m
# CONFIG_SENSORS_XGENE is not set

#
# ACPI drivers
#
CONFIG_SENSORS_ACPI_POWER=m
CONFIG_SENSORS_ATK0110=m
CONFIG_THERMAL=y
CONFIG_THERMAL_HWMON=y
CONFIG_THERMAL_WRITABLE_TRIPS=y
CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y
# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
# CONFIG_THERMAL_DEFAULT_GOV_POWER_ALLOCATOR is not set
CONFIG_THERMAL_GOV_FAIR_SHARE=y
CONFIG_THERMAL_GOV_STEP_WISE=y
CONFIG_THERMAL_GOV_BANG_BANG=y
CONFIG_THERMAL_GOV_USER_SPACE=y
# CONFIG_THERMAL_GOV_POWER_ALLOCATOR is not set
# CONFIG_THERMAL_EMULATION is not set
CONFIG_INTEL_POWERCLAMP=m
CONFIG_X86_PKG_TEMP_THERMAL=m
# CONFIG_INTEL_SOC_DTS_THERMAL is not set

#
# ACPI INT340X thermal drivers
#
# CONFIG_INT340X_THERMAL is not set
# CONFIG_INTEL_PCH_THERMAL is not set
CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_CORE=y
# CONFIG_WATCHDOG_NOWAYOUT is not set
# CONFIG_WATCHDOG_SYSFS is not set

#
# Watchdog Device Drivers
#
CONFIG_SOFT_WATCHDOG=m
# CONFIG_WDAT_WDT is not set
# CONFIG_XILINX_WATCHDOG is not set
# CONFIG_ZIIRAVE_WATCHDOG is not set
# CONFIG_CADENCE_WATCHDOG is not set
# CONFIG_DW_WATCHDOG is not set
# CONFIG_MAX63XX_WATCHDOG is not set
# CONFIG_ACQUIRE_WDT is not set
# CONFIG_ADVANTECH_WDT is not set
CONFIG_ALIM1535_WDT=m
CONFIG_ALIM7101_WDT=m
CONFIG_F71808E_WDT=m
CONFIG_SP5100_TCO=m
CONFIG_SBC_FITPC2_WATCHDOG=m
# CONFIG_EUROTECH_WDT is not set
CONFIG_IB700_WDT=m
CONFIG_IBMASR=m
# CONFIG_WAFER_WDT is not set
CONFIG_I6300ESB_WDT=m
CONFIG_IE6XX_WDT=m
CONFIG_ITCO_WDT=m
CONFIG_ITCO_VENDOR_SUPPORT=y
CONFIG_IT8712F_WDT=m
CONFIG_IT87_WDT=m
CONFIG_HP_WATCHDOG=m
CONFIG_HPWDT_NMI_DECODING=y
# CONFIG_SC1200_WDT is not set
# CONFIG_PC87413_WDT is not set
CONFIG_NV_TCO=m
# CONFIG_60XX_WDT is not set
# CONFIG_CPU5_WDT is not set
CONFIG_SMSC_SCH311X_WDT=m
# CONFIG_SMSC37B787_WDT is not set
CONFIG_VIA_WDT=m
CONFIG_W83627HF_WDT=m
CONFIG_W83877F_WDT=m
CONFIG_W83977F_WDT=m
CONFIG_MACHZ_WDT=m
# CONFIG_SBC_EPX_C3_WATCHDOG is not set
# CONFIG_INTEL_MEI_WDT is not set
# CONFIG_NI903X_WDT is not set
# CONFIG_NIC7018_WDT is not set
CONFIG_XEN_WDT=m

#
# PCI-based Watchdog Cards
#
CONFIG_PCIPCWATCHDOG=m
CONFIG_WDTPCI=m

#
# USB-based Watchdog Cards
#
CONFIG_USBPCWATCHDOG=m

#
# Watchdog Pretimeout Governors
#
# CONFIG_WATCHDOG_PRETIMEOUT_GOV is not set
CONFIG_SSB_POSSIBLE=y

#
# Sonics Silicon Backplane
#
CONFIG_SSB=m
CONFIG_SSB_SPROM=y
CONFIG_SSB_PCIHOST_POSSIBLE=y
CONFIG_SSB_PCIHOST=y
# CONFIG_SSB_B43_PCI_BRIDGE is not set
CONFIG_SSB_SDIOHOST_POSSIBLE=y
CONFIG_SSB_SDIOHOST=y
# CONFIG_SSB_DEBUG is not set
CONFIG_SSB_DRIVER_PCICORE_POSSIBLE=y
CONFIG_SSB_DRIVER_PCICORE=y
CONFIG_BCMA_POSSIBLE=y

#
# Broadcom specific AMBA
#
CONFIG_BCMA=m
CONFIG_BCMA_HOST_PCI_POSSIBLE=y
CONFIG_BCMA_HOST_PCI=y
# CONFIG_BCMA_HOST_SOC is not set
CONFIG_BCMA_DRIVER_PCI=y
CONFIG_BCMA_DRIVER_GMAC_CMN=y
# CONFIG_BCMA_DEBUG is not set

#
# Multifunction device drivers
#
CONFIG_MFD_CORE=m
# CONFIG_MFD_BCM590XX is not set
# CONFIG_MFD_AXP20X_I2C is not set
# CONFIG_MFD_CROS_EC is not set
# CONFIG_MFD_DA9062 is not set
# CONFIG_MFD_DA9063 is not set
# CONFIG_MFD_DA9150 is not set
# CONFIG_MFD_DLN2 is not set
# CONFIG_MFD_MC13XXX_I2C is not set
# CONFIG_HTC_PASIC3 is not set
# CONFIG_MFD_INTEL_QUARK_I2C_GPIO is not set
CONFIG_LPC_ICH=m
CONFIG_LPC_SCH=m
# CONFIG_MFD_INTEL_LPSS_ACPI is not set
# CONFIG_MFD_INTEL_LPSS_PCI is not set
# CONFIG_MFD_JANZ_CMODIO is not set
# CONFIG_MFD_KEMPLD is not set
# CONFIG_MFD_88PM800 is not set
# CONFIG_MFD_88PM805 is not set
# CONFIG_MFD_MAX14577 is not set
# CONFIG_MFD_MAX77693 is not set
# CONFIG_MFD_MAX8907 is not set
# CONFIG_MFD_MT6397 is not set
# CONFIG_MFD_MENF21BMC is not set
CONFIG_MFD_VIPERBOARD=m
# CONFIG_MFD_RETU is not set
# CONFIG_MFD_PCF50633 is not set
# CONFIG_MFD_RDC321X is not set
CONFIG_MFD_RTSX_PCI=m
# CONFIG_MFD_RT5033 is not set
# CONFIG_MFD_RTSX_USB is not set
# CONFIG_MFD_SI476X_CORE is not set
CONFIG_MFD_SM501=m
# CONFIG_MFD_SKY81452 is not set
# CONFIG_ABX500_CORE is not set
# CONFIG_MFD_SYSCON is not set
# CONFIG_MFD_TI_AM335X_TSCADC is not set
# CONFIG_MFD_LP3943 is not set
# CONFIG_TPS6105X is not set
# CONFIG_TPS6507X is not set
# CONFIG_MFD_TPS65086 is not set
# CONFIG_MFD_TPS65217 is not set
# CONFIG_MFD_TI_LP873X is not set
# CONFIG_MFD_TPS65218 is not set
# CONFIG_MFD_TPS65912_I2C is not set
# CONFIG_MFD_WL1273_CORE is not set
# CONFIG_MFD_LM3533 is not set
# CONFIG_MFD_TMIO is not set
CONFIG_MFD_VX855=m
# CONFIG_MFD_ARIZONA_I2C is not set
# CONFIG_MFD_WM8994 is not set
# CONFIG_REGULATOR is not set
# CONFIG_MEDIA_SUPPORT is not set

#
# Graphics support
#
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
CONFIG_AGP_INTEL=y
CONFIG_AGP_SIS=y
CONFIG_AGP_VIA=y
CONFIG_INTEL_GTT=y
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=64
CONFIG_VGA_SWITCHEROO=y
CONFIG_DRM=m
CONFIG_DRM_MIPI_DSI=y
# CONFIG_DRM_DP_AUX_CHARDEV is not set
# CONFIG_DRM_DEBUG_MM_SELFTEST is not set
CONFIG_DRM_KMS_HELPER=m
CONFIG_DRM_KMS_FB_HELPER=y
CONFIG_DRM_FBDEV_EMULATION=y
CONFIG_DRM_LOAD_EDID_FIRMWARE=y
CONFIG_DRM_TTM=m
CONFIG_DRM_VM=y

#
# I2C encoder or helper chips
#
CONFIG_DRM_I2C_CH7006=m
CONFIG_DRM_I2C_SIL164=m
CONFIG_DRM_I2C_NXP_TDA998X=m
CONFIG_DRM_RADEON=m
# CONFIG_DRM_RADEON_USERPTR is not set
# CONFIG_DRM_AMDGPU is not set

#
# ACP (Audio CoProcessor) Configuration
#
CONFIG_DRM_NOUVEAU=m
CONFIG_NOUVEAU_DEBUG=5
CONFIG_NOUVEAU_DEBUG_DEFAULT=3
CONFIG_DRM_NOUVEAU_BACKLIGHT=y
CONFIG_DRM_I915=m
# CONFIG_DRM_I915_ALPHA_SUPPORT is not set
CONFIG_DRM_I915_CAPTURE_ERROR=y
CONFIG_DRM_I915_COMPRESS_ERROR=y
CONFIG_DRM_I915_USERPTR=y
# CONFIG_DRM_I915_GVT is not set
# CONFIG_DRM_VGEM is not set
CONFIG_DRM_VMWGFX=m
CONFIG_DRM_VMWGFX_FBCON=y
CONFIG_DRM_GMA500=m
CONFIG_DRM_GMA600=y
CONFIG_DRM_GMA3600=y
CONFIG_DRM_UDL=m
CONFIG_DRM_AST=m
CONFIG_DRM_MGAG200=m
CONFIG_DRM_CIRRUS_QEMU=m
CONFIG_DRM_QXL=m
CONFIG_DRM_BOCHS=m
# CONFIG_DRM_VIRTIO_GPU is not set
CONFIG_DRM_PANEL=y

#
# Display Panels
#
CONFIG_DRM_BRIDGE=y

#
# Display Interface Bridges
#
# CONFIG_DRM_ANALOGIX_ANX78XX is not set
# CONFIG_HSA_AMD is not set
# CONFIG_DRM_HISI_HIBMC is not set
# CONFIG_DRM_TINYDRM is not set
# CONFIG_DRM_LEGACY is not set
# CONFIG_DRM_LIB_RANDOM is not set

#
# Frame buffer Devices
#
CONFIG_FB=y
# CONFIG_FIRMWARE_EDID is not set
CONFIG_FB_CMDLINE=y
CONFIG_FB_NOTIFY=y
# CONFIG_FB_DDC is not set
CONFIG_FB_BOOT_VESA_SUPPORT=y
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
# CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set
CONFIG_FB_SYS_FILLRECT=m
CONFIG_FB_SYS_COPYAREA=m
CONFIG_FB_SYS_IMAGEBLIT=m
# CONFIG_FB_PROVIDE_GET_FB_UNMAPPED_AREA is not set
# CONFIG_FB_FOREIGN_ENDIAN is not set
CONFIG_FB_SYS_FOPS=m
CONFIG_FB_DEFERRED_IO=y
# CONFIG_FB_SVGALIB is not set
# CONFIG_FB_MACMODES is not set
CONFIG_FB_BACKLIGHT=y
# CONFIG_FB_MODE_HELPERS is not set
CONFIG_FB_TILEBLITTING=y

#
# Frame buffer hardware drivers
#
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_UVESA is not set
CONFIG_FB_VESA=y
CONFIG_FB_EFI=y
# CONFIG_FB_N411 is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_OPENCORES is not set
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_I740 is not set
# CONFIG_FB_LE80578 is not set
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_VT8623 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_CARMINE is not set
# CONFIG_FB_SM501 is not set
# CONFIG_FB_SMSCUFX is not set
# CONFIG_FB_UDL is not set
# CONFIG_FB_IBM_GXT4500 is not set
# CONFIG_FB_VIRTUAL is not set
# CONFIG_XEN_FBDEV_FRONTEND is not set
# CONFIG_FB_METRONOME is not set
# CONFIG_FB_MB862XX is not set
# CONFIG_FB_BROADSHEET is not set
# CONFIG_FB_AUO_K190X is not set
CONFIG_FB_HYPERV=m
# CONFIG_FB_SIMPLE is not set
# CONFIG_FB_SM712 is not set
CONFIG_BACKLIGHT_LCD_SUPPORT=y
CONFIG_LCD_CLASS_DEVICE=m
CONFIG_LCD_PLATFORM=m
CONFIG_BACKLIGHT_CLASS_DEVICE=y
# CONFIG_BACKLIGHT_GENERIC is not set
CONFIG_BACKLIGHT_APPLE=m
# CONFIG_BACKLIGHT_PM8941_WLED is not set
# CONFIG_BACKLIGHT_SAHARA is not set
# CONFIG_BACKLIGHT_ADP8860 is not set
# CONFIG_BACKLIGHT_ADP8870 is not set
# CONFIG_BACKLIGHT_LM3639 is not set
# CONFIG_BACKLIGHT_LV5207LP is not set
# CONFIG_BACKLIGHT_BD6107 is not set
# CONFIG_VGASTATE is not set
CONFIG_HDMI=y

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_VGACON_SOFT_SCROLLBACK=y
CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=64
# CONFIG_VGACON_SOFT_SCROLLBACK_PERSISTENT_ENABLE_BY_DEFAULT is not set
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_DUMMY_CONSOLE_ROWS=25
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
CONFIG_LOGO=y
# CONFIG_LOGO_LINUX_MONO is not set
# CONFIG_LOGO_LINUX_VGA16 is not set
CONFIG_LOGO_LINUX_CLUT224=y
# CONFIG_SOUND is not set

#
# HID support
#
CONFIG_HID=y
CONFIG_HID_BATTERY_STRENGTH=y
CONFIG_HIDRAW=y
CONFIG_UHID=m
CONFIG_HID_GENERIC=y

#
# Special HID drivers
#
CONFIG_HID_A4TECH=y
CONFIG_HID_ACRUX=m
# CONFIG_HID_ACRUX_FF is not set
CONFIG_HID_APPLE=y
CONFIG_HID_APPLEIR=m
# CONFIG_HID_ASUS is not set
CONFIG_HID_AUREAL=m
CONFIG_HID_BELKIN=y
# CONFIG_HID_BETOP_FF is not set
CONFIG_HID_CHERRY=y
CONFIG_HID_CHICONY=y
# CONFIG_HID_CORSAIR is not set
# CONFIG_HID_CMEDIA is not set
CONFIG_HID_CYPRESS=y
CONFIG_HID_DRAGONRISE=m
# CONFIG_DRAGONRISE_FF is not set
# CONFIG_HID_EMS_FF is not set
CONFIG_HID_ELECOM=m
# CONFIG_HID_ELO is not set
CONFIG_HID_EZKEY=y
# CONFIG_HID_GEMBIRD is not set
# CONFIG_HID_GFRM is not set
CONFIG_HID_HOLTEK=m
# CONFIG_HOLTEK_FF is not set
# CONFIG_HID_GT683R is not set
CONFIG_HID_KEYTOUCH=m
CONFIG_HID_KYE=m
CONFIG_HID_UCLOGIC=m
CONFIG_HID_WALTOP=m
CONFIG_HID_GYRATION=m
CONFIG_HID_ICADE=m
CONFIG_HID_TWINHAN=m
CONFIG_HID_KENSINGTON=y
CONFIG_HID_LCPOWER=m
CONFIG_HID_LED=m
# CONFIG_HID_LENOVO is not set
CONFIG_HID_LOGITECH=y
CONFIG_HID_LOGITECH_DJ=m
CONFIG_HID_LOGITECH_HIDPP=m
# CONFIG_LOGITECH_FF is not set
# CONFIG_LOGIRUMBLEPAD2_FF is not set
# CONFIG_LOGIG940_FF is not set
# CONFIG_LOGIWHEELS_FF is not set
CONFIG_HID_MAGICMOUSE=y
# CONFIG_HID_MAYFLASH is not set
CONFIG_HID_MICROSOFT=y
CONFIG_HID_MONTEREY=y
CONFIG_HID_MULTITOUCH=m
CONFIG_HID_NTRIG=y
CONFIG_HID_ORTEK=m
CONFIG_HID_PANTHERLORD=m
# CONFIG_PANTHERLORD_FF is not set
# CONFIG_HID_PENMOUNT is not set
CONFIG_HID_PETALYNX=m
CONFIG_HID_PICOLCD=m
CONFIG_HID_PICOLCD_FB=y
CONFIG_HID_PICOLCD_BACKLIGHT=y
CONFIG_HID_PICOLCD_LCD=y
CONFIG_HID_PICOLCD_LEDS=y
# CONFIG_HID_PLANTRONICS is not set
CONFIG_HID_PRIMAX=m
CONFIG_HID_ROCCAT=m
CONFIG_HID_SAITEK=m
CONFIG_HID_SAMSUNG=m
CONFIG_HID_SONY=m
# CONFIG_SONY_FF is not set
CONFIG_HID_SPEEDLINK=m
CONFIG_HID_STEELSERIES=m
CONFIG_HID_SUNPLUS=m
CONFIG_HID_RMI=m
CONFIG_HID_GREENASIA=m
# CONFIG_GREENASIA_FF is not set
CONFIG_HID_HYPERV_MOUSE=m
CONFIG_HID_SMARTJOYPLUS=m
# CONFIG_SMARTJOYPLUS_FF is not set
CONFIG_HID_TIVO=m
CONFIG_HID_TOPSEED=m
CONFIG_HID_THINGM=m
CONFIG_HID_THRUSTMASTER=m
# CONFIG_THRUSTMASTER_FF is not set
# CONFIG_HID_UDRAW_PS3 is not set
CONFIG_HID_WACOM=m
CONFIG_HID_WIIMOTE=m
# CONFIG_HID_XINMO is not set
CONFIG_HID_ZEROPLUS=m
# CONFIG_ZEROPLUS_FF is not set
CONFIG_HID_ZYDACRON=m
# CONFIG_HID_SENSOR_HUB is not set
# CONFIG_HID_ALPS is not set

#
# USB HID support
#
CONFIG_USB_HID=y
CONFIG_HID_PID=y
CONFIG_USB_HIDDEV=y

#
# I2C HID support
#
CONFIG_I2C_HID=m

#
# Intel ISH HID support
#
# CONFIG_INTEL_ISH_HID is not set
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_SUPPORT=y
CONFIG_USB_COMMON=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB=y
CONFIG_USB_ANNOUNCE_NEW_DEVICES=y

#
# Miscellaneous USB options
#
CONFIG_USB_DEFAULT_PERSIST=y
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_OTG is not set
# CONFIG_USB_OTG_WHITELIST is not set
# CONFIG_USB_LEDS_TRIGGER_USBPORT is not set
CONFIG_USB_MON=y
CONFIG_USB_WUSB=m
CONFIG_USB_WUSB_CBAF=m
# CONFIG_USB_WUSB_CBAF_DEBUG is not set

#
# USB Host Controller Drivers
#
# CONFIG_USB_C67X00_HCD is not set
CONFIG_USB_XHCI_HCD=y
CONFIG_USB_XHCI_PCI=y
# CONFIG_USB_XHCI_PLATFORM is not set
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_EHCI_ROOT_HUB_TT=y
CONFIG_USB_EHCI_TT_NEWSCHED=y
CONFIG_USB_EHCI_PCI=y
# CONFIG_USB_EHCI_HCD_PLATFORM is not set
# CONFIG_USB_OXU210HP_HCD is not set
# CONFIG_USB_ISP116X_HCD is not set
# CONFIG_USB_ISP1362_HCD is not set
# CONFIG_USB_FOTG210_HCD is not set
CONFIG_USB_OHCI_HCD=y
CONFIG_USB_OHCI_HCD_PCI=y
# CONFIG_USB_OHCI_HCD_PLATFORM is not set
CONFIG_USB_UHCI_HCD=y
# CONFIG_USB_U132_HCD is not set
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set
# CONFIG_USB_WHCI_HCD is not set
CONFIG_USB_HWA_HCD=m
# CONFIG_USB_HCD_BCMA is not set
# CONFIG_USB_HCD_SSB is not set
# CONFIG_USB_HCD_TEST_MODE is not set

#
# USB Device Class drivers
#
CONFIG_USB_ACM=m
CONFIG_USB_PRINTER=m
CONFIG_USB_WDM=m
CONFIG_USB_TMC=m

#
# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may
#

#
# also be needed; see USB_STORAGE Help for more info
#
CONFIG_USB_STORAGE=m
# CONFIG_USB_STORAGE_DEBUG is not set
CONFIG_USB_STORAGE_REALTEK=m
CONFIG_REALTEK_AUTOPM=y
CONFIG_USB_STORAGE_DATAFAB=m
CONFIG_USB_STORAGE_FREECOM=m
CONFIG_USB_STORAGE_ISD200=m
CONFIG_USB_STORAGE_USBAT=m
CONFIG_USB_STORAGE_SDDR09=m
CONFIG_USB_STORAGE_SDDR55=m
CONFIG_USB_STORAGE_JUMPSHOT=m
CONFIG_USB_STORAGE_ALAUDA=m
CONFIG_USB_STORAGE_ONETOUCH=m
CONFIG_USB_STORAGE_KARMA=m
CONFIG_USB_STORAGE_CYPRESS_ATACB=m
CONFIG_USB_STORAGE_ENE_UB6250=m
# CONFIG_USB_UAS is not set

#
# USB Imaging devices
#
CONFIG_USB_MDC800=m
CONFIG_USB_MICROTEK=m
# CONFIG_USBIP_CORE is not set
# CONFIG_USB_MUSB_HDRC is not set
# CONFIG_USB_DWC3 is not set
# CONFIG_USB_DWC2 is not set
# CONFIG_USB_CHIPIDEA is not set
# CONFIG_USB_ISP1760 is not set

#
# USB port drivers
#
CONFIG_USB_USS720=m
CONFIG_USB_SERIAL=y
CONFIG_USB_SERIAL_CONSOLE=y
CONFIG_USB_SERIAL_GENERIC=y
# CONFIG_USB_SERIAL_SIMPLE is not set
CONFIG_USB_SERIAL_AIRCABLE=m
CONFIG_USB_SERIAL_ARK3116=m
CONFIG_USB_SERIAL_BELKIN=m
CONFIG_USB_SERIAL_CH341=m
CONFIG_USB_SERIAL_WHITEHEAT=m
CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m
CONFIG_USB_SERIAL_CP210X=m
CONFIG_USB_SERIAL_CYPRESS_M8=m
CONFIG_USB_SERIAL_EMPEG=m
CONFIG_USB_SERIAL_FTDI_SIO=m
CONFIG_USB_SERIAL_VISOR=m
CONFIG_USB_SERIAL_IPAQ=m
CONFIG_USB_SERIAL_IR=m
CONFIG_USB_SERIAL_EDGEPORT=m
CONFIG_USB_SERIAL_EDGEPORT_TI=m
# CONFIG_USB_SERIAL_F81232 is not set
# CONFIG_USB_SERIAL_F8153X is not set
CONFIG_USB_SERIAL_GARMIN=m
CONFIG_USB_SERIAL_IPW=m
CONFIG_USB_SERIAL_IUU=m
CONFIG_USB_SERIAL_KEYSPAN_PDA=m
CONFIG_USB_SERIAL_KEYSPAN=m
CONFIG_USB_SERIAL_KLSI=m
CONFIG_USB_SERIAL_KOBIL_SCT=m
CONFIG_USB_SERIAL_MCT_U232=m
# CONFIG_USB_SERIAL_METRO is not set
CONFIG_USB_SERIAL_MOS7720=m
CONFIG_USB_SERIAL_MOS7715_PARPORT=y
CONFIG_USB_SERIAL_MOS7840=m
# CONFIG_USB_SERIAL_MXUPORT is not set
CONFIG_USB_SERIAL_NAVMAN=m
CONFIG_USB_SERIAL_PL2303=m
CONFIG_USB_SERIAL_OTI6858=m
CONFIG_USB_SERIAL_QCAUX=m
CONFIG_USB_SERIAL_QUALCOMM=m
CONFIG_USB_SERIAL_SPCP8X5=m
CONFIG_USB_SERIAL_SAFE=m
CONFIG_USB_SERIAL_SAFE_PADDED=y
CONFIG_USB_SERIAL_SIERRAWIRELESS=m
CONFIG_USB_SERIAL_SYMBOL=m
CONFIG_USB_SERIAL_TI=m
CONFIG_USB_SERIAL_CYBERJACK=m
CONFIG_USB_SERIAL_XIRCOM=m
CONFIG_USB_SERIAL_WWAN=m
CONFIG_USB_SERIAL_OPTION=m
CONFIG_USB_SERIAL_OMNINET=m
CONFIG_USB_SERIAL_OPTICON=m
CONFIG_USB_SERIAL_XSENS_MT=m
# CONFIG_USB_SERIAL_WISHBONE is not set
CONFIG_USB_SERIAL_SSU100=m
CONFIG_USB_SERIAL_QT2=m
# CONFIG_USB_SERIAL_UPD78F0730 is not set
CONFIG_USB_SERIAL_DEBUG=m

#
# USB Miscellaneous drivers
#
CONFIG_USB_EMI62=m
CONFIG_USB_EMI26=m
CONFIG_USB_ADUTUX=m
CONFIG_USB_SEVSEG=m
# CONFIG_USB_RIO500 is not set
CONFIG_USB_LEGOTOWER=m
CONFIG_USB_LCD=m
# CONFIG_USB_CYPRESS_CY7C63 is not set
# CONFIG_USB_CYTHERM is not set
CONFIG_USB_IDMOUSE=m
CONFIG_USB_FTDI_ELAN=m
CONFIG_USB_APPLEDISPLAY=m
CONFIG_USB_SISUSBVGA=m
CONFIG_USB_SISUSBVGA_CON=y
CONFIG_USB_LD=m
# CONFIG_USB_TRANCEVIBRATOR is not set
CONFIG_USB_IOWARRIOR=m
# CONFIG_USB_TEST is not set
# CONFIG_USB_EHSET_TEST_FIXTURE is not set
CONFIG_USB_ISIGHTFW=m
# CONFIG_USB_YUREX is not set
CONFIG_USB_EZUSB_FX2=m
# CONFIG_USB_HUB_USB251XB is not set
CONFIG_USB_HSIC_USB3503=m
# CONFIG_USB_HSIC_USB4604 is not set
# CONFIG_USB_LINK_LAYER_TEST is not set
# CONFIG_USB_CHAOSKEY is not set
# CONFIG_UCSI is not set

#
# USB Physical Layer drivers
#
# CONFIG_USB_PHY is not set
# CONFIG_NOP_USB_XCEIV is not set
# CONFIG_USB_ISP1301 is not set
# CONFIG_USB_GADGET is not set
# CONFIG_USB_LED_TRIG is not set
# CONFIG_USB_ULPI_BUS is not set
CONFIG_UWB=m
CONFIG_UWB_HWA=m
CONFIG_UWB_WHCI=m
CONFIG_UWB_I1480U=m
CONFIG_MMC=m
# CONFIG_MMC_DEBUG is not set
CONFIG_MMC_BLOCK=m
CONFIG_MMC_BLOCK_MINORS=8
CONFIG_MMC_BLOCK_BOUNCE=y
CONFIG_SDIO_UART=m
# CONFIG_MMC_TEST is not set

#
# MMC/SD/SDIO Host Controller Drivers
#
CONFIG_MMC_SDHCI=m
CONFIG_MMC_SDHCI_PCI=m
CONFIG_MMC_RICOH_MMC=y
CONFIG_MMC_SDHCI_ACPI=m
CONFIG_MMC_SDHCI_PLTFM=m
# CONFIG_MMC_WBSD is not set
CONFIG_MMC_TIFM_SD=m
CONFIG_MMC_CB710=m
CONFIG_MMC_VIA_SDMMC=m
CONFIG_MMC_VUB300=m
CONFIG_MMC_USHC=m
# CONFIG_MMC_USDHI6ROL0 is not set
CONFIG_MMC_REALTEK_PCI=m
# CONFIG_MMC_TOSHIBA_PCI is not set
# CONFIG_MMC_MTK is not set
CONFIG_MEMSTICK=m
# CONFIG_MEMSTICK_DEBUG is not set

#
# MemoryStick drivers
#
# CONFIG_MEMSTICK_UNSAFE_RESUME is not set
CONFIG_MSPRO_BLOCK=m
# CONFIG_MS_BLOCK is not set

#
# MemoryStick Host Controller Drivers
#
CONFIG_MEMSTICK_TIFM_MS=m
CONFIG_MEMSTICK_JMICRON_38X=m
CONFIG_MEMSTICK_R592=m
CONFIG_MEMSTICK_REALTEK_PCI=m
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y
# CONFIG_LEDS_CLASS_FLASH is not set
# CONFIG_LEDS_BRIGHTNESS_HW_CHANGED is not set

#
# LED drivers
#
CONFIG_LEDS_LM3530=m
# CONFIG_LEDS_LM3642 is not set
# CONFIG_LEDS_PCA9532 is not set
CONFIG_LEDS_LP3944=m
CONFIG_LEDS_LP55XX_COMMON=m
CONFIG_LEDS_LP5521=m
CONFIG_LEDS_LP5523=m
CONFIG_LEDS_LP5562=m
# CONFIG_LEDS_LP8501 is not set
# CONFIG_LEDS_LP8860 is not set
CONFIG_LEDS_CLEVO_MAIL=m
# CONFIG_LEDS_PCA955X is not set
# CONFIG_LEDS_PCA963X is not set
# CONFIG_LEDS_BD2802 is not set
CONFIG_LEDS_INTEL_SS4200=m
# CONFIG_LEDS_TCA6507 is not set
# CONFIG_LEDS_TLC591XX is not set
# CONFIG_LEDS_LM355x is not set

#
# LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM)
#
CONFIG_LEDS_BLINKM=m
# CONFIG_LEDS_MLXCPLD is not set
# CONFIG_LEDS_USER is not set
# CONFIG_LEDS_NIC78BX is not set

#
# LED Triggers
#
CONFIG_LEDS_TRIGGERS=y
CONFIG_LEDS_TRIGGER_TIMER=m
CONFIG_LEDS_TRIGGER_ONESHOT=m
# CONFIG_LEDS_TRIGGER_DISK is not set
# CONFIG_LEDS_TRIGGER_MTD is not set
CONFIG_LEDS_TRIGGER_HEARTBEAT=m
CONFIG_LEDS_TRIGGER_BACKLIGHT=m
# CONFIG_LEDS_TRIGGER_CPU is not set
CONFIG_LEDS_TRIGGER_DEFAULT_ON=m

#
# iptables trigger is under Netfilter config (LED target)
#
CONFIG_LEDS_TRIGGER_TRANSIENT=m
CONFIG_LEDS_TRIGGER_CAMERA=m
# CONFIG_LEDS_TRIGGER_PANIC is not set
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_EDAC_SUPPORT=y
CONFIG_EDAC=y
CONFIG_EDAC_LEGACY_SYSFS=y
# CONFIG_EDAC_DEBUG is not set
CONFIG_EDAC_MM_EDAC=m
CONFIG_EDAC_E752X=m
CONFIG_EDAC_I82975X=m
CONFIG_EDAC_I3000=m
CONFIG_EDAC_I3200=m
CONFIG_EDAC_IE31200=m
CONFIG_EDAC_X38=m
CONFIG_EDAC_I5400=m
CONFIG_EDAC_I7CORE=m
CONFIG_EDAC_I5000=m
CONFIG_EDAC_I5100=m
CONFIG_EDAC_I7300=m
CONFIG_EDAC_SBRIDGE=m
# CONFIG_EDAC_SKX is not set
CONFIG_RTC_LIB=y
CONFIG_RTC_MC146818_LIB=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_HCTOSYS=y
CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
# CONFIG_RTC_SYSTOHC is not set
# CONFIG_RTC_DEBUG is not set

#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
# CONFIG_RTC_DRV_TEST is not set

#
# I2C RTC drivers
#
# CONFIG_RTC_DRV_ABB5ZES3 is not set
# CONFIG_RTC_DRV_ABX80X is not set
CONFIG_RTC_DRV_DS1307=m
CONFIG_RTC_DRV_DS1307_HWMON=y
# CONFIG_RTC_DRV_DS1307_CENTURY is not set
CONFIG_RTC_DRV_DS1374=m
# CONFIG_RTC_DRV_DS1374_WDT is not set
CONFIG_RTC_DRV_DS1672=m
CONFIG_RTC_DRV_MAX6900=m
CONFIG_RTC_DRV_RS5C372=m
CONFIG_RTC_DRV_ISL1208=m
CONFIG_RTC_DRV_ISL12022=m
CONFIG_RTC_DRV_X1205=m
CONFIG_RTC_DRV_PCF8523=m
# CONFIG_RTC_DRV_PCF85063 is not set
CONFIG_RTC_DRV_PCF8563=m
CONFIG_RTC_DRV_PCF8583=m
CONFIG_RTC_DRV_M41T80=m
CONFIG_RTC_DRV_M41T80_WDT=y
CONFIG_RTC_DRV_BQ32K=m
# CONFIG_RTC_DRV_S35390A is not set
CONFIG_RTC_DRV_FM3130=m
# CONFIG_RTC_DRV_RX8010 is not set
CONFIG_RTC_DRV_RX8581=m
CONFIG_RTC_DRV_RX8025=m
CONFIG_RTC_DRV_EM3027=m
# CONFIG_RTC_DRV_RV8803 is not set

#
# SPI RTC drivers
#
CONFIG_RTC_I2C_AND_SPI=m

#
# SPI and I2C RTC drivers
#
CONFIG_RTC_DRV_DS3232=m
# CONFIG_RTC_DRV_PCF2127 is not set
CONFIG_RTC_DRV_RV3029C2=m
CONFIG_RTC_DRV_RV3029_HWMON=y

#
# Platform RTC drivers
#
CONFIG_RTC_DRV_CMOS=y
CONFIG_RTC_DRV_DS1286=m
CONFIG_RTC_DRV_DS1511=m
CONFIG_RTC_DRV_DS1553=m
# CONFIG_RTC_DRV_DS1685_FAMILY is not set
CONFIG_RTC_DRV_DS1742=m
CONFIG_RTC_DRV_DS2404=m
CONFIG_RTC_DRV_STK17TA8=m
# CONFIG_RTC_DRV_M48T86 is not set
CONFIG_RTC_DRV_M48T35=m
CONFIG_RTC_DRV_M48T59=m
CONFIG_RTC_DRV_MSM6242=m
CONFIG_RTC_DRV_BQ4802=m
CONFIG_RTC_DRV_RP5C01=m
CONFIG_RTC_DRV_V3020=m

#
# on-CPU RTC drivers
#

#
# HID Sensor RTC drivers
#
# CONFIG_RTC_DRV_HID_SENSOR_TIME is not set
CONFIG_DMADEVICES=y
# CONFIG_DMADEVICES_DEBUG is not set

#
# DMA Devices
#
CONFIG_DMA_ENGINE=y
CONFIG_DMA_VIRTUAL_CHANNELS=y
CONFIG_DMA_ACPI=y
# CONFIG_INTEL_IDMA64 is not set
CONFIG_INTEL_IOATDMA=m
# CONFIG_QCOM_HIDMA_MGMT is not set
# CONFIG_QCOM_HIDMA is not set
CONFIG_DW_DMAC_CORE=y
CONFIG_DW_DMAC=m
CONFIG_DW_DMAC_PCI=y
CONFIG_HSU_DMA=y

#
# DMA Clients
#
CONFIG_ASYNC_TX_DMA=y
# CONFIG_DMATEST is not set
CONFIG_DMA_ENGINE_RAID=y

#
# DMABUF options
#
CONFIG_SYNC_FILE=y
# CONFIG_SW_SYNC is not set
CONFIG_DCA=m
CONFIG_AUXDISPLAY=y
CONFIG_KS0108=m
CONFIG_KS0108_PORT=0x378
CONFIG_KS0108_DELAY=2
CONFIG_CFAG12864B=m
CONFIG_CFAG12864B_RATE=20
# CONFIG_IMG_ASCII_LCD is not set
CONFIG_UIO=m
CONFIG_UIO_CIF=m
CONFIG_UIO_PDRV_GENIRQ=m
# CONFIG_UIO_DMEM_GENIRQ is not set
CONFIG_UIO_AEC=m
CONFIG_UIO_SERCOS3=m
CONFIG_UIO_PCI_GENERIC=m
# CONFIG_UIO_NETX is not set
# CONFIG_UIO_PRUSS is not set
# CONFIG_UIO_MF624 is not set
# CONFIG_UIO_HV_GENERIC is not set
CONFIG_VFIO_IOMMU_TYPE1=m
CONFIG_VFIO_VIRQFD=m
CONFIG_VFIO=m
# CONFIG_VFIO_NOIOMMU is not set
CONFIG_VFIO_PCI=m
# CONFIG_VFIO_PCI_VGA is not set
CONFIG_VFIO_PCI_MMAP=y
CONFIG_VFIO_PCI_INTX=y
CONFIG_VFIO_PCI_IGD=y
# CONFIG_VFIO_MDEV is not set
CONFIG_IRQ_BYPASS_MANAGER=m
# CONFIG_VIRT_DRIVERS is not set
CONFIG_VIRTIO=m

#
# Virtio drivers
#
CONFIG_VIRTIO_PCI=m
CONFIG_VIRTIO_PCI_LEGACY=y
CONFIG_VIRTIO_BALLOON=m
CONFIG_VIRTIO_INPUT=m
# CONFIG_VIRTIO_MMIO is not set

#
# Microsoft Hyper-V guest support
#
CONFIG_HYPERV=m
CONFIG_HYPERV_UTILS=m
CONFIG_HYPERV_BALLOON=m

#
# Xen driver support
#
CONFIG_XEN_BALLOON=y
# CONFIG_XEN_SELFBALLOONING is not set
# CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is not set
CONFIG_XEN_SCRUB_PAGES=y
CONFIG_XEN_DEV_EVTCHN=m
CONFIG_XEN_BACKEND=y
CONFIG_XENFS=m
CONFIG_XEN_COMPAT_XENFS=y
CONFIG_XEN_SYS_HYPERVISOR=y
CONFIG_XEN_XENBUS_FRONTEND=y
# CONFIG_XEN_GNTDEV is not set
# CONFIG_XEN_GRANT_DEV_ALLOC is not set
CONFIG_SWIOTLB_XEN=y
CONFIG_XEN_TMEM=m
CONFIG_XEN_PCIDEV_BACKEND=m
# CONFIG_XEN_SCSI_BACKEND is not set
CONFIG_XEN_PRIVCMD=m
CONFIG_XEN_ACPI_PROCESSOR=m
# CONFIG_XEN_MCE_LOG is not set
CONFIG_XEN_HAVE_PVMMU=y
CONFIG_XEN_EFI=y
CONFIG_XEN_AUTO_XLATE=y
CONFIG_XEN_ACPI=y
CONFIG_XEN_SYMS=y
CONFIG_XEN_HAVE_VPMU=y
# CONFIG_STAGING is not set
CONFIG_X86_PLATFORM_DEVICES=y
CONFIG_ACER_WMI=m
CONFIG_ACERHDF=m
# CONFIG_ALIENWARE_WMI is not set
CONFIG_ASUS_LAPTOP=m
# CONFIG_DELL_LAPTOP is not set
# CONFIG_DELL_WMI is not set
CONFIG_DELL_WMI_AIO=m
# CONFIG_DELL_SMO8800 is not set
CONFIG_FUJITSU_LAPTOP=m
# CONFIG_FUJITSU_LAPTOP_DEBUG is not set
CONFIG_FUJITSU_TABLET=m
CONFIG_HP_ACCEL=m
CONFIG_HP_WIRELESS=m
CONFIG_HP_WMI=m
CONFIG_PANASONIC_LAPTOP=m
CONFIG_THINKPAD_ACPI=m
# CONFIG_THINKPAD_ACPI_DEBUGFACILITIES is not set
# CONFIG_THINKPAD_ACPI_DEBUG is not set
# CONFIG_THINKPAD_ACPI_UNSAFE_LEDS is not set
CONFIG_THINKPAD_ACPI_VIDEO=y
CONFIG_THINKPAD_ACPI_HOTKEY_POLL=y
CONFIG_SENSORS_HDAPS=m
# CONFIG_INTEL_MENLOW is not set
CONFIG_EEEPC_LAPTOP=m
CONFIG_ASUS_WMI=m
CONFIG_ASUS_NB_WMI=m
CONFIG_EEEPC_WMI=m
# CONFIG_ASUS_WIRELESS is not set
CONFIG_ACPI_WMI=m
CONFIG_MSI_WMI=m
CONFIG_TOPSTAR_LAPTOP=m
CONFIG_TOSHIBA_BT_RFKILL=m
# CONFIG_TOSHIBA_HAPS is not set
# CONFIG_TOSHIBA_WMI is not set
CONFIG_ACPI_CMPC=m
# CONFIG_INTEL_HID_EVENT is not set
# CONFIG_INTEL_VBTN is not set
CONFIG_INTEL_IPS=m
# CONFIG_INTEL_PMC_CORE is not set
# CONFIG_IBM_RTL is not set
CONFIG_SAMSUNG_LAPTOP=m
CONFIG_MXM_WMI=m
CONFIG_SAMSUNG_Q10=m
CONFIG_APPLE_GMUX=m
# CONFIG_INTEL_RST is not set
# CONFIG_INTEL_SMARTCONNECT is not set
CONFIG_PVPANIC=y
# CONFIG_INTEL_PMC_IPC is not set
# CONFIG_SURFACE_PRO3_BUTTON is not set
# CONFIG_INTEL_PUNIT_IPC is not set
# CONFIG_MLX_PLATFORM is not set
# CONFIG_MLX_CPLD_PLATFORM is not set
# CONFIG_INTEL_TURBO_MAX_3 is not set
CONFIG_PMC_ATOM=y
# CONFIG_CHROME_PLATFORMS is not set
CONFIG_CLKDEV_LOOKUP=y
CONFIG_HAVE_CLK_PREPARE=y
CONFIG_COMMON_CLK=y

#
# Common Clock Framework
#
# CONFIG_COMMON_CLK_SI5351 is not set
# CONFIG_COMMON_CLK_CDCE706 is not set
# CONFIG_COMMON_CLK_CS2000_CP is not set
# CONFIG_COMMON_CLK_NXP is not set
# CONFIG_COMMON_CLK_PXA is not set
# CONFIG_COMMON_CLK_PIC32 is not set

#
# Hardware Spinlock drivers
#

#
# Clock Source drivers
#
CONFIG_CLKEVT_I8253=y
CONFIG_I8253_LOCK=y
CONFIG_CLKBLD_I8253=y
# CONFIG_ATMEL_PIT is not set
# CONFIG_SH_TIMER_CMT is not set
# CONFIG_SH_TIMER_MTU2 is not set
# CONFIG_SH_TIMER_TMU is not set
# CONFIG_EM_TIMER_STI is not set
CONFIG_MAILBOX=y
CONFIG_PCC=y
# CONFIG_ALTERA_MBOX is not set
CONFIG_IOMMU_API=y
CONFIG_IOMMU_SUPPORT=y

#
# Generic IOMMU Pagetable Support
#
CONFIG_IOMMU_IOVA=y
CONFIG_AMD_IOMMU=y
CONFIG_AMD_IOMMU_V2=m
CONFIG_DMAR_TABLE=y
CONFIG_INTEL_IOMMU=y
# CONFIG_INTEL_IOMMU_SVM is not set
# CONFIG_INTEL_IOMMU_DEFAULT_ON is not set
CONFIG_INTEL_IOMMU_FLOPPY_WA=y
CONFIG_IRQ_REMAP=y

#
# Remoteproc drivers
#
# CONFIG_REMOTEPROC is not set

#
# Rpmsg drivers
#

#
# SOC (System On Chip) specific Drivers
#

#
# Broadcom SoC drivers
#
# CONFIG_SUNXI_SRAM is not set
# CONFIG_SOC_TI is not set
# CONFIG_SOC_ZTE is not set
CONFIG_PM_DEVFREQ=y

#
# DEVFREQ Governors
#
CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND=m
# CONFIG_DEVFREQ_GOV_PERFORMANCE is not set
# CONFIG_DEVFREQ_GOV_POWERSAVE is not set
# CONFIG_DEVFREQ_GOV_USERSPACE is not set
# CONFIG_DEVFREQ_GOV_PASSIVE is not set

#
# DEVFREQ Drivers
#
# CONFIG_PM_DEVFREQ_EVENT is not set
# CONFIG_EXTCON is not set
# CONFIG_MEMORY is not set
# CONFIG_IIO is not set
CONFIG_NTB=m
# CONFIG_NTB_AMD is not set
# CONFIG_NTB_INTEL is not set
# CONFIG_NTB_PINGPONG is not set
# CONFIG_NTB_TOOL is not set
# CONFIG_NTB_PERF is not set
# CONFIG_NTB_TRANSPORT is not set
# CONFIG_VME_BUS is not set
# CONFIG_PWM is not set
CONFIG_ARM_GIC_MAX_NR=1
# CONFIG_IPACK_BUS is not set
# CONFIG_RESET_CONTROLLER is not set
# CONFIG_FMC is not set

#
# PHY Subsystem
#
CONFIG_GENERIC_PHY=y
# CONFIG_PHY_PXA_28NM_HSIC is not set
# CONFIG_PHY_PXA_28NM_USB2 is not set
# CONFIG_BCM_KONA_USB2_PHY is not set
CONFIG_POWERCAP=y
# CONFIG_INTEL_RAPL is not set
# CONFIG_MCB is not set

#
# Performance monitor support
#
CONFIG_RAS=y
# CONFIG_MCE_AMD_INJ is not set
# CONFIG_THUNDERBOLT is not set

#
# Android
#
# CONFIG_ANDROID is not set
# CONFIG_LIBNVDIMM is not set
# CONFIG_DEV_DAX is not set
CONFIG_NVMEM=m
# CONFIG_STM is not set
# CONFIG_INTEL_TH is not set

#
# FPGA Configuration Support
#
# CONFIG_FPGA is not set

#
# FSI support
#
# CONFIG_FSI is not set

#
# Firmware Drivers
#
CONFIG_EDD=m
# CONFIG_EDD_OFF is not set
CONFIG_FIRMWARE_MEMMAP=y
CONFIG_DELL_RBU=m
CONFIG_DCDBAS=m
CONFIG_DMIID=y
CONFIG_DMI_SYSFS=y
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
CONFIG_ISCSI_IBFT_FIND=y
CONFIG_ISCSI_IBFT=m
# CONFIG_FW_CFG_SYSFS is not set
# CONFIG_GOOGLE_FIRMWARE is not set

#
# EFI (Extensible Firmware Interface) Support
#
CONFIG_EFI_VARS=y
CONFIG_EFI_ESRT=y
CONFIG_EFI_VARS_PSTORE=y
CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE=y
CONFIG_EFI_RUNTIME_MAP=y
# CONFIG_EFI_FAKE_MEMMAP is not set
CONFIG_EFI_RUNTIME_WRAPPERS=y
# CONFIG_EFI_BOOTLOADER_CONTROL is not set
# CONFIG_EFI_CAPSULE_LOADER is not set
# CONFIG_EFI_TEST is not set
# CONFIG_APPLE_PROPERTIES is not set
CONFIG_UEFI_CPER=y
# CONFIG_EFI_DEV_PATH_PARSER is not set

#
# Tegra firmware driver
#

#
# File systems
#
CONFIG_DCACHE_WORD_ACCESS=y
CONFIG_FS_IOMAP=y
# CONFIG_EXT2_FS is not set
# CONFIG_EXT3_FS is not set
CONFIG_EXT4_FS=m
CONFIG_EXT4_USE_FOR_EXT2=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
# CONFIG_EXT4_ENCRYPTION is not set
# CONFIG_EXT4_DEBUG is not set
CONFIG_JBD2=m
# CONFIG_JBD2_DEBUG is not set
CONFIG_FS_MBCACHE=m
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
CONFIG_XFS_FS=m
CONFIG_XFS_QUOTA=y
CONFIG_XFS_POSIX_ACL=y
# CONFIG_XFS_RT is not set
# CONFIG_XFS_WARN is not set
# CONFIG_XFS_DEBUG is not set
CONFIG_GFS2_FS=m
CONFIG_GFS2_FS_LOCKING_DLM=y
# CONFIG_OCFS2_FS is not set
CONFIG_BTRFS_FS=m
CONFIG_BTRFS_FS_POSIX_ACL=y
# CONFIG_BTRFS_FS_CHECK_INTEGRITY is not set
# CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set
# CONFIG_BTRFS_DEBUG is not set
# CONFIG_BTRFS_ASSERT is not set
# CONFIG_NILFS2_FS is not set
# CONFIG_F2FS_FS is not set
# CONFIG_FS_DAX is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_EXPORTFS=y
# CONFIG_EXPORTFS_BLOCK_OPS is not set
CONFIG_FILE_LOCKING=y
CONFIG_MANDATORY_FILE_LOCKING=y
# CONFIG_FS_ENCRYPTION is not set
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
CONFIG_INOTIFY_USER=y
CONFIG_FANOTIFY=y
CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y
CONFIG_QUOTA=y
CONFIG_QUOTA_NETLINK_INTERFACE=y
CONFIG_PRINT_QUOTA_WARNING=y
# CONFIG_QUOTA_DEBUG is not set
CONFIG_QUOTA_TREE=y
# CONFIG_QFMT_V1 is not set
CONFIG_QFMT_V2=y
CONFIG_QUOTACTL=y
CONFIG_QUOTACTL_COMPAT=y
CONFIG_AUTOFS4_FS=y
CONFIG_FUSE_FS=m
CONFIG_CUSE=m
CONFIG_OVERLAY_FS=m
# CONFIG_OVERLAY_FS_REDIRECT_DIR is not set

#
# Caches
#
CONFIG_FSCACHE=m
CONFIG_FSCACHE_STATS=y
# CONFIG_FSCACHE_HISTOGRAM is not set
# CONFIG_FSCACHE_DEBUG is not set
# CONFIG_FSCACHE_OBJECT_LIST is not set
CONFIG_CACHEFILES=m
# CONFIG_CACHEFILES_DEBUG is not set
# CONFIG_CACHEFILES_HISTOGRAM is not set

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=m
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_UDF_FS=m
CONFIG_UDF_NLS=y

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="ascii"
# CONFIG_FAT_DEFAULT_UTF8 is not set
# CONFIG_NTFS_FS is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_VMCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
# CONFIG_PROC_CHILDREN is not set
CONFIG_KERNFS=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_TMPFS_XATTR=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_ARCH_HAS_GIGANTIC_PAGE=y
CONFIG_CONFIGFS_FS=y
CONFIG_EFIVAR_FS=y
CONFIG_MISC_FILESYSTEMS=y
# CONFIG_ORANGEFS_FS is not set
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_ECRYPT_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_JFFS2_FS is not set
# CONFIG_UBIFS_FS is not set
CONFIG_CRAMFS=m
CONFIG_SQUASHFS=m
CONFIG_SQUASHFS_FILE_CACHE=y
# CONFIG_SQUASHFS_FILE_DIRECT is not set
CONFIG_SQUASHFS_DECOMP_SINGLE=y
# CONFIG_SQUASHFS_DECOMP_MULTI is not set
# CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU is not set
CONFIG_SQUASHFS_XATTR=y
CONFIG_SQUASHFS_ZLIB=y
# CONFIG_SQUASHFS_LZ4 is not set
CONFIG_SQUASHFS_LZO=y
CONFIG_SQUASHFS_XZ=y
# CONFIG_SQUASHFS_4K_DEVBLK_SIZE is not set
# CONFIG_SQUASHFS_EMBEDDED is not set
CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE=3
# CONFIG_VXFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_OMFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_QNX6FS_FS is not set
# CONFIG_ROMFS_FS is not set
CONFIG_PSTORE=y
CONFIG_PSTORE_ZLIB_COMPRESS=y
# CONFIG_PSTORE_LZO_COMPRESS is not set
# CONFIG_PSTORE_LZ4_COMPRESS is not set
# CONFIG_PSTORE_CONSOLE is not set
# CONFIG_PSTORE_PMSG is not set
# CONFIG_PSTORE_FTRACE is not set
CONFIG_PSTORE_RAM=m
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
# CONFIG_EXOFS_FS is not set
CONFIG_ORE=m
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=m
# CONFIG_NFS_V2 is not set
CONFIG_NFS_V3=m
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=m
# CONFIG_NFS_SWAP is not set
CONFIG_NFS_V4_1=y
CONFIG_NFS_V4_2=y
CONFIG_PNFS_FILE_LAYOUT=m
CONFIG_PNFS_BLOCK=m
CONFIG_PNFS_OBJLAYOUT=m
CONFIG_PNFS_FLEXFILE_LAYOUT=m
CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
# CONFIG_NFS_V4_1_MIGRATION is not set
CONFIG_NFS_V4_SECURITY_LABEL=y
CONFIG_NFS_FSCACHE=y
# CONFIG_NFS_USE_LEGACY_DNS is not set
CONFIG_NFS_USE_KERNEL_DNS=y
CONFIG_NFS_DEBUG=y
CONFIG_NFSD=m
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
# CONFIG_NFSD_BLOCKLAYOUT is not set
# CONFIG_NFSD_SCSILAYOUT is not set
# CONFIG_NFSD_FLEXFILELAYOUT is not set
CONFIG_NFSD_V4_SECURITY_LABEL=y
# CONFIG_NFSD_FAULT_INJECTION is not set
CONFIG_GRACE_PERIOD=m
CONFIG_LOCKD=m
CONFIG_LOCKD_V4=y
CONFIG_NFS_ACL_SUPPORT=m
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=m
CONFIG_SUNRPC_GSS=m
CONFIG_SUNRPC_BACKCHANNEL=y
CONFIG_RPCSEC_GSS_KRB5=m
CONFIG_SUNRPC_DEBUG=y
CONFIG_CEPH_FS=m
# CONFIG_CEPH_FSCACHE is not set
# CONFIG_CEPH_FS_POSIX_ACL is not set
CONFIG_CIFS=m
CONFIG_CIFS_STATS=y
# CONFIG_CIFS_STATS2 is not set
CONFIG_CIFS_WEAK_PW_HASH=y
CONFIG_CIFS_UPCALL=y
CONFIG_CIFS_XATTR=y
CONFIG_CIFS_POSIX=y
CONFIG_CIFS_ACL=y
CONFIG_CIFS_DEBUG=y
# CONFIG_CIFS_DEBUG2 is not set
CONFIG_CIFS_DFS_UPCALL=y
CONFIG_CIFS_SMB2=y
# CONFIG_CIFS_SMB311 is not set
# CONFIG_CIFS_FSCACHE is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_CODEPAGE_737=m
CONFIG_NLS_CODEPAGE_775=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_NLS_CODEPAGE_852=m
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_857=m
CONFIG_NLS_CODEPAGE_860=m
CONFIG_NLS_CODEPAGE_861=m
CONFIG_NLS_CODEPAGE_862=m
CONFIG_NLS_CODEPAGE_863=m
CONFIG_NLS_CODEPAGE_864=m
CONFIG_NLS_CODEPAGE_865=m
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_869=m
CONFIG_NLS_CODEPAGE_936=m
CONFIG_NLS_CODEPAGE_950=m
CONFIG_NLS_CODEPAGE_932=m
CONFIG_NLS_CODEPAGE_949=m
CONFIG_NLS_CODEPAGE_874=m
CONFIG_NLS_ISO8859_8=m
CONFIG_NLS_CODEPAGE_1250=m
CONFIG_NLS_CODEPAGE_1251=m
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_ISO8859_6=m
CONFIG_NLS_ISO8859_7=m
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_13=m
CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_KOI8_U=m
CONFIG_NLS_MAC_ROMAN=m
CONFIG_NLS_MAC_CELTIC=m
CONFIG_NLS_MAC_CENTEURO=m
CONFIG_NLS_MAC_CROATIAN=m
CONFIG_NLS_MAC_CYRILLIC=m
CONFIG_NLS_MAC_GAELIC=m
CONFIG_NLS_MAC_GREEK=m
CONFIG_NLS_MAC_ICELAND=m
CONFIG_NLS_MAC_INUIT=m
CONFIG_NLS_MAC_ROMANIAN=m
CONFIG_NLS_MAC_TURKISH=m
CONFIG_NLS_UTF8=m
CONFIG_DLM=m
CONFIG_DLM_DEBUG=y

#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y

#
# printk and dmesg options
#
CONFIG_PRINTK_TIME=y
CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
CONFIG_BOOT_PRINTK_DELAY=y
CONFIG_DYNAMIC_DEBUG=y

#
# Compile-time checks and compiler options
#
CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_INFO_REDUCED is not set
# CONFIG_DEBUG_INFO_SPLIT is not set
# CONFIG_DEBUG_INFO_DWARF4 is not set
# CONFIG_GDB_SCRIPTS is not set
# CONFIG_ENABLE_WARN_DEPRECATED is not set
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_FRAME_WARN=2048
CONFIG_STRIP_ASM_SYMS=y
# CONFIG_READABLE_ASM is not set
# CONFIG_UNUSED_SYMBOLS is not set
# CONFIG_PAGE_OWNER is not set
CONFIG_DEBUG_FS=y
CONFIG_HEADERS_CHECK=y
CONFIG_DEBUG_SECTION_MISMATCH=y
CONFIG_SECTION_MISMATCH_WARN_ONLY=y
CONFIG_ARCH_WANT_FRAME_POINTERS=y
CONFIG_FRAME_POINTER=y
# CONFIG_STACK_VALIDATION is not set
# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
CONFIG_MAGIC_SYSRQ_SERIAL=y
CONFIG_DEBUG_KERNEL=y

#
# Memory Debugging
#
# CONFIG_PAGE_EXTENSION is not set
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_PAGE_POISONING is not set
# CONFIG_DEBUG_PAGE_REF is not set
CONFIG_DEBUG_RODATA_TEST=y
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
CONFIG_HAVE_DEBUG_KMEMLEAK=y
# CONFIG_DEBUG_KMEMLEAK is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_VM is not set
CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
# CONFIG_DEBUG_VIRTUAL is not set
CONFIG_DEBUG_MEMORY_INIT=y
# CONFIG_DEBUG_PER_CPU_MAPS is not set
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_HAVE_ARCH_KMEMCHECK=y
CONFIG_HAVE_ARCH_KASAN=y
# CONFIG_KASAN is not set
CONFIG_ARCH_HAS_KCOV=y
# CONFIG_KCOV is not set
CONFIG_DEBUG_SHIRQ=y

#
# Debug Lockups and Hangs
#
CONFIG_LOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=1
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0
# CONFIG_WQ_WATCHDOG is not set
CONFIG_PANIC_ON_OOPS=y
CONFIG_PANIC_ON_OOPS_VALUE=1
CONFIG_PANIC_TIMEOUT=0
CONFIG_SCHED_DEBUG=y
CONFIG_SCHED_INFO=y
# CONFIG_SCHEDSTATS is not set
# CONFIG_SCHED_STACK_END_CHECK is not set
# CONFIG_DEBUG_TIMEKEEPING is not set
CONFIG_DEBUG_PREEMPT=y

#
# Lock Debugging (spinlocks, mutexes, etc...)
#
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_MUTEXES is not set
# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
# CONFIG_DEBUG_LOCK_ALLOC is not set
# CONFIG_PROVE_LOCKING is not set
# CONFIG_LOCK_STAT is not set
# CONFIG_DEBUG_ATOMIC_SLEEP is not set
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_LOCK_TORTURE_TEST is not set
# CONFIG_WW_MUTEX_SELFTEST is not set
CONFIG_STACKTRACE=y
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_LIST=y
# CONFIG_DEBUG_PI_LIST is not set
# CONFIG_DEBUG_SG is not set
# CONFIG_DEBUG_NOTIFIERS is not set
# CONFIG_DEBUG_CREDENTIALS is not set

#
# RCU Debugging
#
# CONFIG_PROVE_RCU is not set
CONFIG_SPARSE_RCU_POINTER=y
# CONFIG_TORTURE_TEST is not set
# CONFIG_RCU_PERF_TEST is not set
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=60
# CONFIG_RCU_TRACE is not set
# CONFIG_RCU_EQS_DEBUG is not set
# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set
# CONFIG_NOTIFIER_ERROR_INJECTION is not set
# CONFIG_FAULT_INJECTION is not set
# CONFIG_LATENCYTOP is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_FENTRY=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_TRACER_MAX_TRACE=y
CONFIG_TRACE_CLOCK=y
CONFIG_RING_BUFFER=y
CONFIG_EVENT_TRACING=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_RING_BUFFER_ALLOW_SWAP=y
CONFIG_TRACING=y
CONFIG_GENERIC_TRACER=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
# CONFIG_IRQSOFF_TRACER is not set
# CONFIG_PREEMPT_TRACER is not set
CONFIG_SCHED_TRACER=y
# CONFIG_HWLAT_TRACER is not set
CONFIG_FTRACE_SYSCALLS=y
CONFIG_TRACER_SNAPSHOT=y
# CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP is not set
CONFIG_BRANCH_PROFILE_NONE=y
# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
# CONFIG_PROFILE_ALL_BRANCHES is not set
CONFIG_STACK_TRACER=y
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_KPROBE_EVENTS=y
# CONFIG_UPROBE_EVENTS is not set
CONFIG_BPF_EVENTS=y
CONFIG_PROBE_EVENTS=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_FUNCTION_PROFILER=y
CONFIG_FTRACE_MCOUNT_RECORD=y
# CONFIG_FTRACE_STARTUP_TEST is not set
# CONFIG_MMIOTRACE is not set
# CONFIG_HIST_TRIGGERS is not set
# CONFIG_TRACEPOINT_BENCHMARK is not set
CONFIG_RING_BUFFER_BENCHMARK=m
# CONFIG_RING_BUFFER_STARTUP_TEST is not set
# CONFIG_TRACE_ENUM_MAP_FILE is not set

#
# Runtime Testing
#
# CONFIG_LKDTM is not set
# CONFIG_TEST_LIST_SORT is not set
# CONFIG_TEST_SORT is not set
# CONFIG_KPROBES_SANITY_TEST is not set
# CONFIG_BACKTRACE_SELF_TEST is not set
# CONFIG_RBTREE_TEST is not set
# CONFIG_INTERVAL_TREE_TEST is not set
# CONFIG_PERCPU_TEST is not set
CONFIG_ATOMIC64_SELFTEST=y
CONFIG_ASYNC_RAID6_TEST=m
# CONFIG_TEST_HEXDUMP is not set
# CONFIG_TEST_STRING_HELPERS is not set
CONFIG_TEST_KSTRTOX=y
# CONFIG_TEST_PRINTF is not set
# CONFIG_TEST_BITMAP is not set
# CONFIG_TEST_UUID is not set
# CONFIG_TEST_RHASHTABLE is not set
# CONFIG_TEST_HASH is not set
CONFIG_PROVIDE_OHCI1394_DMA_INIT=y
# CONFIG_DMA_API_DEBUG is not set
# CONFIG_TEST_LKM is not set
# CONFIG_TEST_USER_COPY is not set
# CONFIG_TEST_BPF is not set
# CONFIG_TEST_FIRMWARE is not set
# CONFIG_TEST_UDELAY is not set
# CONFIG_MEMTEST is not set
# CONFIG_TEST_STATIC_KEYS is not set
# CONFIG_BUG_ON_DATA_CORRUPTION is not set
# CONFIG_SAMPLES is not set
CONFIG_HAVE_ARCH_KGDB=y
CONFIG_KGDB=y
CONFIG_KGDB_SERIAL_CONSOLE=y
CONFIG_KGDB_TESTS=y
# CONFIG_KGDB_TESTS_ON_BOOT is not set
CONFIG_KGDB_LOW_LEVEL_TRAP=y
CONFIG_KGDB_KDB=y
CONFIG_KDB_DEFAULT_ENABLE=0x1
CONFIG_KDB_KEYBOARD=y
CONFIG_KDB_CONTINUE_CATASTROPHIC=0
CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
# CONFIG_ARCH_WANTS_UBSAN_NO_NULL is not set
# CONFIG_UBSAN is not set
CONFIG_ARCH_HAS_DEVMEM_IS_ALLOWED=y
CONFIG_STRICT_DEVMEM=y
# CONFIG_IO_STRICT_DEVMEM is not set
# CONFIG_X86_VERBOSE_BOOTUP is not set
CONFIG_EARLY_PRINTK=y
CONFIG_EARLY_PRINTK_DBGP=y
CONFIG_EARLY_PRINTK_EFI=y
# CONFIG_X86_PTDUMP_CORE is not set
# CONFIG_X86_PTDUMP is not set
# CONFIG_EFI_PGT_DUMP is not set
# CONFIG_DEBUG_WX is not set
CONFIG_DOUBLEFAULT=y
# CONFIG_DEBUG_TLBFLUSH is not set
# CONFIG_IOMMU_DEBUG is not set
# CONFIG_IOMMU_STRESS is not set
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
CONFIG_X86_DECODER_SELFTEST=y
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEFAULT_IO_DELAY_TYPE=0
CONFIG_DEBUG_BOOT_PARAMS=y
# CONFIG_CPA_DEBUG is not set
CONFIG_OPTIMIZE_INLINING=y
# CONFIG_DEBUG_ENTRY is not set
# CONFIG_DEBUG_NMI_SELFTEST is not set
CONFIG_X86_DEBUG_FPU=y
# CONFIG_PUNIT_ATOM_DEBUG is not set

#
# Security options
#
CONFIG_KEYS=y
CONFIG_PERSISTENT_KEYRINGS=y
CONFIG_BIG_KEYS=y
CONFIG_TRUSTED_KEYS=y
CONFIG_ENCRYPTED_KEYS=y
# CONFIG_KEY_DH_OPERATIONS is not set
# CONFIG_SECURITY_DMESG_RESTRICT is not set
CONFIG_SECURITY=y
CONFIG_SECURITYFS=y
CONFIG_SECURITY_NETWORK=y
CONFIG_SECURITY_NETWORK_XFRM=y
# CONFIG_SECURITY_PATH is not set
CONFIG_INTEL_TXT=y
CONFIG_LSM_MMAP_MIN_ADDR=65535
CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=y
CONFIG_HAVE_ARCH_HARDENED_USERCOPY=y
# CONFIG_HARDENED_USERCOPY is not set
# CONFIG_STATIC_USERMODEHELPER is not set
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_SELINUX_BOOTPARAM=y
CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=1
CONFIG_SECURITY_SELINUX_DISABLE=y
CONFIG_SECURITY_SELINUX_DEVELOP=y
CONFIG_SECURITY_SELINUX_AVC_STATS=y
CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1
# CONFIG_SECURITY_SMACK is not set
# CONFIG_SECURITY_TOMOYO is not set
# CONFIG_SECURITY_APPARMOR is not set
# CONFIG_SECURITY_LOADPIN is not set
# CONFIG_SECURITY_YAMA is not set
CONFIG_INTEGRITY=y
CONFIG_INTEGRITY_SIGNATURE=y
CONFIG_INTEGRITY_ASYMMETRIC_KEYS=y
CONFIG_INTEGRITY_TRUSTED_KEYRING=y
CONFIG_INTEGRITY_AUDIT=y
CONFIG_IMA=y
CONFIG_IMA_MEASURE_PCR_IDX=10
CONFIG_IMA_LSM_RULES=y
# CONFIG_IMA_TEMPLATE is not set
CONFIG_IMA_NG_TEMPLATE=y
# CONFIG_IMA_SIG_TEMPLATE is not set
CONFIG_IMA_DEFAULT_TEMPLATE="ima-ng"
CONFIG_IMA_DEFAULT_HASH_SHA1=y
# CONFIG_IMA_DEFAULT_HASH_SHA256 is not set
# CONFIG_IMA_DEFAULT_HASH_SHA512 is not set
# CONFIG_IMA_DEFAULT_HASH_WP512 is not set
CONFIG_IMA_DEFAULT_HASH="sha1"
# CONFIG_IMA_WRITE_POLICY is not set
# CONFIG_IMA_READ_POLICY is not set
CONFIG_IMA_APPRAISE=y
CONFIG_IMA_TRUSTED_KEYRING=y
# CONFIG_IMA_BLACKLIST_KEYRING is not set
# CONFIG_IMA_LOAD_X509 is not set
CONFIG_EVM=y
CONFIG_EVM_ATTR_FSUUID=y
# CONFIG_EVM_LOAD_X509 is not set
CONFIG_DEFAULT_SECURITY_SELINUX=y
# CONFIG_DEFAULT_SECURITY_DAC is not set
CONFIG_DEFAULT_SECURITY="selinux"
CONFIG_XOR_BLOCKS=m
CONFIG_ASYNC_CORE=m
CONFIG_ASYNC_MEMCPY=m
CONFIG_ASYNC_XOR=m
CONFIG_ASYNC_PQ=m
CONFIG_ASYNC_RAID6_RECOV=m
CONFIG_CRYPTO=y

#
# Crypto core or helper
#
CONFIG_CRYPTO_FIPS=y
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_RNG_DEFAULT=y
CONFIG_CRYPTO_AKCIPHER2=y
CONFIG_CRYPTO_AKCIPHER=y
CONFIG_CRYPTO_KPP2=y
CONFIG_CRYPTO_KPP=m
CONFIG_CRYPTO_ACOMP2=y
CONFIG_CRYPTO_RSA=y
CONFIG_CRYPTO_DH=m
# CONFIG_CRYPTO_ECDH is not set
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
CONFIG_CRYPTO_USER=m
# CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set
CONFIG_CRYPTO_GF128MUL=m
CONFIG_CRYPTO_NULL=y
CONFIG_CRYPTO_NULL2=y
CONFIG_CRYPTO_PCRYPT=m
CONFIG_CRYPTO_WORKQUEUE=y
CONFIG_CRYPTO_CRYPTD=m
CONFIG_CRYPTO_MCRYPTD=m
CONFIG_CRYPTO_AUTHENC=m
CONFIG_CRYPTO_TEST=m
CONFIG_CRYPTO_ABLK_HELPER=m
CONFIG_CRYPTO_SIMD=m
CONFIG_CRYPTO_GLUE_HELPER_X86=m
CONFIG_CRYPTO_ENGINE=m

#
# Authenticated Encryption with Associated Data
#
CONFIG_CRYPTO_CCM=m
CONFIG_CRYPTO_GCM=m
# CONFIG_CRYPTO_CHACHA20POLY1305 is not set
CONFIG_CRYPTO_SEQIV=y
CONFIG_CRYPTO_ECHAINIV=m

#
# Block modes
#
CONFIG_CRYPTO_CBC=y
CONFIG_CRYPTO_CTR=y
CONFIG_CRYPTO_CTS=m
CONFIG_CRYPTO_ECB=y
CONFIG_CRYPTO_LRW=m
CONFIG_CRYPTO_PCBC=m
CONFIG_CRYPTO_XTS=m
# CONFIG_CRYPTO_KEYWRAP is not set

#
# Hash modes
#
CONFIG_CRYPTO_CMAC=m
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_XCBC=m
CONFIG_CRYPTO_VMAC=m

#
# Digest
#
CONFIG_CRYPTO_CRC32C=y
CONFIG_CRYPTO_CRC32C_INTEL=m
CONFIG_CRYPTO_CRC32=m
CONFIG_CRYPTO_CRC32_PCLMUL=m
CONFIG_CRYPTO_CRCT10DIF=y
CONFIG_CRYPTO_CRCT10DIF_PCLMUL=m
CONFIG_CRYPTO_GHASH=m
# CONFIG_CRYPTO_POLY1305 is not set
# CONFIG_CRYPTO_POLY1305_X86_64 is not set
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_RMD128=m
CONFIG_CRYPTO_RMD160=m
CONFIG_CRYPTO_RMD256=m
CONFIG_CRYPTO_RMD320=m
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA1_SSSE3=y
CONFIG_CRYPTO_SHA256_SSSE3=y
CONFIG_CRYPTO_SHA512_SSSE3=m
CONFIG_CRYPTO_SHA1_MB=m
# CONFIG_CRYPTO_SHA256_MB is not set
# CONFIG_CRYPTO_SHA512_MB is not set
CONFIG_CRYPTO_SHA256=y
CONFIG_CRYPTO_SHA512=m
# CONFIG_CRYPTO_SHA3 is not set
CONFIG_CRYPTO_TGR192=m
CONFIG_CRYPTO_WP512=m
CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL=m

#
# Ciphers
#
CONFIG_CRYPTO_AES=y
# CONFIG_CRYPTO_AES_TI is not set
CONFIG_CRYPTO_AES_X86_64=y
CONFIG_CRYPTO_AES_NI_INTEL=m
CONFIG_CRYPTO_ANUBIS=m
CONFIG_CRYPTO_ARC4=m
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_BLOWFISH_COMMON=m
CONFIG_CRYPTO_BLOWFISH_X86_64=m
CONFIG_CRYPTO_CAMELLIA=m
CONFIG_CRYPTO_CAMELLIA_X86_64=m
CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64=m
CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64=m
CONFIG_CRYPTO_CAST_COMMON=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST5_AVX_X86_64=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_CAST6_AVX_X86_64=m
CONFIG_CRYPTO_DES=m
# CONFIG_CRYPTO_DES3_EDE_X86_64 is not set
CONFIG_CRYPTO_FCRYPT=m
CONFIG_CRYPTO_KHAZAD=m
CONFIG_CRYPTO_SALSA20=m
CONFIG_CRYPTO_SALSA20_X86_64=m
# CONFIG_CRYPTO_CHACHA20 is not set
# CONFIG_CRYPTO_CHACHA20_X86_64 is not set
CONFIG_CRYPTO_SEED=m
CONFIG_CRYPTO_SERPENT=m
CONFIG_CRYPTO_SERPENT_SSE2_X86_64=m
CONFIG_CRYPTO_SERPENT_AVX_X86_64=m
CONFIG_CRYPTO_SERPENT_AVX2_X86_64=m
CONFIG_CRYPTO_TEA=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_TWOFISH_COMMON=m
CONFIG_CRYPTO_TWOFISH_X86_64=m
CONFIG_CRYPTO_TWOFISH_X86_64_3WAY=m
CONFIG_CRYPTO_TWOFISH_AVX_X86_64=m

#
# Compression
#
CONFIG_CRYPTO_DEFLATE=m
CONFIG_CRYPTO_LZO=y
# CONFIG_CRYPTO_842 is not set
# CONFIG_CRYPTO_LZ4 is not set
# CONFIG_CRYPTO_LZ4HC is not set

#
# Random Number Generation
#
CONFIG_CRYPTO_ANSI_CPRNG=m
CONFIG_CRYPTO_DRBG_MENU=y
CONFIG_CRYPTO_DRBG_HMAC=y
CONFIG_CRYPTO_DRBG_HASH=y
CONFIG_CRYPTO_DRBG_CTR=y
CONFIG_CRYPTO_DRBG=y
CONFIG_CRYPTO_JITTERENTROPY=y
CONFIG_CRYPTO_USER_API=y
CONFIG_CRYPTO_USER_API_HASH=y
CONFIG_CRYPTO_USER_API_SKCIPHER=y
# CONFIG_CRYPTO_USER_API_RNG is not set
# CONFIG_CRYPTO_USER_API_AEAD is not set
CONFIG_CRYPTO_HASH_INFO=y
CONFIG_CRYPTO_HW=y
CONFIG_CRYPTO_DEV_PADLOCK=m
CONFIG_CRYPTO_DEV_PADLOCK_AES=m
CONFIG_CRYPTO_DEV_PADLOCK_SHA=m
# CONFIG_CRYPTO_DEV_FSL_CAAM_CRYPTO_API_DESC is not set
# CONFIG_CRYPTO_DEV_CCP is not set
CONFIG_CRYPTO_DEV_QAT=m
CONFIG_CRYPTO_DEV_QAT_DH895xCC=m
# CONFIG_CRYPTO_DEV_QAT_C3XXX is not set
# CONFIG_CRYPTO_DEV_QAT_C62X is not set
# CONFIG_CRYPTO_DEV_QAT_DH895xCCVF is not set
# CONFIG_CRYPTO_DEV_QAT_C3XXXVF is not set
# CONFIG_CRYPTO_DEV_QAT_C62XVF is not set
# CONFIG_CRYPTO_DEV_CHELSIO is not set
CONFIG_CRYPTO_DEV_VIRTIO=m
CONFIG_ASYMMETRIC_KEY_TYPE=y
CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
CONFIG_X509_CERTIFICATE_PARSER=y
CONFIG_PKCS7_MESSAGE_PARSER=y
# CONFIG_PKCS7_TEST_KEY is not set
CONFIG_SIGNED_PE_FILE_VERIFICATION=y

#
# Certificates for signature checking
#
CONFIG_MODULE_SIG_KEY="certs/signing_key.pem"
CONFIG_SYSTEM_TRUSTED_KEYRING=y
CONFIG_SYSTEM_TRUSTED_KEYS=""
# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set
# CONFIG_SECONDARY_TRUSTED_KEYRING is not set
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_HAVE_KVM_IRQFD=y
CONFIG_HAVE_KVM_IRQ_ROUTING=y
CONFIG_HAVE_KVM_EVENTFD=y
CONFIG_KVM_MMIO=y
CONFIG_KVM_ASYNC_PF=y
CONFIG_HAVE_KVM_MSI=y
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
CONFIG_KVM_VFIO=y
CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y
CONFIG_KVM_COMPAT=y
CONFIG_HAVE_KVM_IRQ_BYPASS=y
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=m
CONFIG_KVM_INTEL=m
CONFIG_KVM_AMD=m
CONFIG_KVM_MMU_AUDIT=y
# CONFIG_KVM_DEVICE_ASSIGNMENT is not set
CONFIG_VHOST_NET=m
# CONFIG_VHOST_SCSI is not set
# CONFIG_VHOST_VSOCK is not set
CONFIG_VHOST=m
# CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set
CONFIG_BINARY_PRINTF=y

#
# Library routines
#
CONFIG_RAID6_PQ=m
CONFIG_BITREVERSE=y
# CONFIG_HAVE_ARCH_BITREVERSE is not set
CONFIG_RATIONAL=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
CONFIG_GENERIC_NET_UTILS=y
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_IO=y
CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
CONFIG_CRC_CCITT=y
CONFIG_CRC16=y
CONFIG_CRC_T10DIF=y
CONFIG_CRC_ITU_T=m
CONFIG_CRC32=y
# CONFIG_CRC32_SELFTEST is not set
CONFIG_CRC32_SLICEBY8=y
# CONFIG_CRC32_SLICEBY4 is not set
# CONFIG_CRC32_SARWATE is not set
# CONFIG_CRC32_BIT is not set
# CONFIG_CRC7 is not set
CONFIG_LIBCRC32C=m
CONFIG_CRC8=m
# CONFIG_AUDIT_ARCH_COMPAT_GENERIC is not set
# CONFIG_RANDOM32_SELFTEST is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_LZO_COMPRESS=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_LZ4_DECOMPRESS=y
CONFIG_XZ_DEC=y
CONFIG_XZ_DEC_X86=y
CONFIG_XZ_DEC_POWERPC=y
CONFIG_XZ_DEC_IA64=y
CONFIG_XZ_DEC_ARM=y
CONFIG_XZ_DEC_ARMTHUMB=y
CONFIG_XZ_DEC_SPARC=y
CONFIG_XZ_DEC_BCJ=y
# CONFIG_XZ_DEC_TEST is not set
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZMA=y
CONFIG_DECOMPRESS_XZ=y
CONFIG_DECOMPRESS_LZO=y
CONFIG_DECOMPRESS_LZ4=y
CONFIG_GENERIC_ALLOCATOR=y
CONFIG_REED_SOLOMON=m
CONFIG_REED_SOLOMON_ENC8=y
CONFIG_REED_SOLOMON_DEC8=y
CONFIG_TEXTSEARCH=y
CONFIG_TEXTSEARCH_KMP=m
CONFIG_TEXTSEARCH_BM=m
CONFIG_TEXTSEARCH_FSM=m
CONFIG_BTREE=y
CONFIG_INTERVAL_TREE=y
CONFIG_RADIX_TREE_MULTIORDER=y
CONFIG_ASSOCIATIVE_ARRAY=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
# CONFIG_DMA_NOOP_OPS is not set
# CONFIG_DMA_VIRT_OPS is not set
CONFIG_CHECK_SIGNATURE=y
CONFIG_CPUMASK_OFFSTACK=y
CONFIG_CPU_RMAP=y
CONFIG_DQL=y
CONFIG_GLOB=y
# CONFIG_GLOB_SELFTEST is not set
CONFIG_NLATTR=y
CONFIG_CLZ_TAB=y
CONFIG_CORDIC=m
# CONFIG_DDR is not set
CONFIG_IRQ_POLL=y
CONFIG_MPILIB=y
CONFIG_SIGNATURE=y
CONFIG_OID_REGISTRY=y
CONFIG_UCS2_STRING=y
CONFIG_FONT_SUPPORT=y
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
# CONFIG_SG_SPLIT is not set
CONFIG_SG_POOL=y
CONFIG_ARCH_HAS_SG_CHAIN=y
CONFIG_ARCH_HAS_PMEM_API=y
CONFIG_ARCH_HAS_MMIO_FLUSH=y
CONFIG_SBITMAP=y

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-23 20:55 [BUG nohz]: wrong user and system time accounting Luiz Capitulino
  2017-03-24  0:56 ` Rik van Riel
  2017-03-24  1:52 ` Wanpeng Li
@ 2017-03-27  1:56 ` Wanpeng Li
  2017-03-27 17:35   ` Rik van Riel
  2017-03-27 18:38   ` Luiz Capitulino
  2017-03-29 13:04 ` Frederic Weisbecker
  3 siblings, 2 replies; 67+ messages in thread
From: Wanpeng Li @ 2017-03-27  1:56 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: Frederic Weisbecker, Rik van Riel, linux-kernel, linux-rt-users

2017-03-24 4:55 GMT+08:00 Luiz Capitulino <lcapitulino@redhat.com>:
>
> When there are two or more tasks executing in user-space and
> taking 100% of a nohz_full CPU, top reports 70% system time
> and 30% user time utilization. Sometimes I'm even able to get
> 100% system time and 0% user time.
>
> This was reproduced with latest Linus tree (093b995), but I
> don't believe it's a regression (at least not a recent one)
> as I can reproduce it with older kernels. Also, I have
> CONFIG_IRQ_TIME_ACCOUNTING=y and haven't tried to reproduce
> without it yet.
>
> Below you'll find the steps to reproduce and some initial
> analysis.
>
> Steps to reproduce
> ------------------
>
> 1. Set up a CPU for nohz_full with isolcpus= nohz_full=
>
> 2. Pin two tasks that hog the CPU 100% of the time to that CPU
>
> 3. Run top -d1 and check system time
>
> NOTE: When there's only one task hogging a nohz_full CPU, top
>       shows 100% user-time, as expected
>
> Initial analysis
> ----------------
>
> When tracing vtime accounting functions and the user-space/kernel
> transitions when the issue is taking place, I see several of the
> following:
>
> hog-10552 [015]  1132.711104: function:             enter_from_user_mode <-- apic_timer_interrupt
> hog-10552 [015]  1132.711105: function:             __context_tracking_exit <-- enter_from_user_mode
> hog-10552 [015]  1132.711105: bprint:               __context_tracking_exit.part.4: new state=1 cur state=1 active=1
> hog-10552 [015]  1132.711105: function:             vtime_account_user <-- __context_tracking_exit.part.4
> hog-10552 [015]  1132.711105: function:             smp_apic_timer_interrupt <-- apic_timer_interrupt
> hog-10552 [015]  1132.711106: function:             irq_enter <-- smp_apic_timer_interrupt
> hog-10552 [015]  1132.711106: function:             tick_sched_timer <-- __hrtimer_run_queues
> hog-10552 [015]  1132.711108: function:             irq_exit <-- smp_apic_timer_interrupt
> hog-10552 [015]  1132.711108: function:             __context_tracking_enter <-- prepare_exit_to_usermode
> hog-10552 [015]  1132.711108: bprint:               __context_tracking_enter.part.2: new state=1 cur state=0 active=1
> hog-10552 [015]  1132.711109: function:             vtime_user_enter <-- __context_tracking_enter.part.2
> hog-10552 [015]  1132.711109: function:             __vtime_account_system <-- vtime_user_enter
> hog-10552 [015]  1132.711109: function:             account_system_time <-- __vtime_account_system
>
> On entering the kernel due to a timer interrupt, vtime_account_user()
> skips user-time accounting. Then later on when returning to user-space,
> vtime_user_enter() is probably accounting the whole time (ie. user-space
> plus kernel-space) to system time.

Actually after I bisect, the first bad commit is ff9a9b4c4334 ("sched,
time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity"). The bug
can be reproduced readily if CONFIG_CONTEXT_TRACKING_FORCE is true,
then just stress all the online cpus or just one cpu and leave others
idle(so it stresses the global timekeeping one), top show 100%
sys-time. And another way to reproduce it is by nohz_full, and gives
the stress to the house keeping cpu, the top show 100% sys-time of the
house keeping cpu, and also the other cpus who have at least two tasks
running on and in full_nohz mode.

Let's consider the cpu which has responsibility for the global
timekeeping, as the tracing posted above, the vtime_account_user() is
called before tick_sched_timer() which will update jiffies, so jiffies
is stale in vtime_account_user() and the run time in userspace is
skipped, the vtime_user_enter() is called after jiffies update, so
both the time in userspace and in  kernel are accumulated to sys time.
If the housekeeping cpu is idle when CONFIG_NO_HZ_FULL, everything is
fine. However, if you give stress to the housekeeping cpu, top will
show 100% sys-time of both the housekeeping cpu and the other cpus who
have at least two tasks running on and in full_nohz mode. I think it
is because the stress delays the timer interrupt handling in some
degree, then the jiffies is not updated timely before other cpus
access it in vtime_account_user().

I think we can keep syscalls/exceptions context tracking still in
jiffies based sampling and utilize local_clock() in vtime_delta()
again for irqs which avoids jiffies stale influence. I can make a
patch if the idea is acceptable or there is any better proposal. :)

Regards,
Wanpeng Li

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-24  0:56 ` Rik van Riel
  2017-03-24  1:05   ` Luiz Capitulino
@ 2017-03-27  5:33   ` lkml
  1 sibling, 0 replies; 67+ messages in thread
From: lkml @ 2017-03-27  5:33 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Luiz Capitulino, fweisbec, linux-kernel, linux-rt-users

On Thu, Mar 23, 2017 at 08:56:02PM -0400, Rik van Riel wrote:
> On Thu, 2017-03-23 at 16:55 -0400, Luiz Capitulino wrote:
> > When there are two or more tasks executing in user-space and
> > taking 100% of a nohz_full CPU, top reports 70% system time
> > and 30% user time utilization. Sometimes I'm even able to get
> > 100% system time and 0% user time.
> > 
> > This was reproduced with latest Linus tree (093b995), but I
> > don't believe it's a regression (at least not a recent one)
> > as I can reproduce it with older kernels. Also, I have
> > CONFIG_IRQ_TIME_ACCOUNTING=y and haven't tried to reproduce
> > without it yet.
> > 
> > Below you'll find the steps to reproduce and some initial
> > analysis.
> > 
> > Steps to reproduce
> > ------------------
> > 
> > 1. Set up a CPU for nohz_full with isolcpus= nohz_full=
> > 
> > 2. Pin two tasks that hog the CPU 100% of the time to that CPU
> > 
> > 3. Run top -d1 and check system time
> > 
> > NOTE: When there's only one task hogging a nohz_full CPU, top
> >       shows 100% user-time, as expected
> > 
> > Initial analysis
> > ----------------
> > 
> > When tracing vtime accounting functions and the user-space/kernel
> > transitions when the issue is taking place, I see several of the
> > following:
> > 
> > hog-10552 [015]  1132.711104:
> > function:             enter_from_user_mode <-- apic_timer_interrupt
> > hog-10552 [015]  1132.711105:
> > function:             __context_tracking_exit <--
> > enter_from_user_mode
> > hog-10552 [015]  1132.711105:
> > bprint:               __context_tracking_exit.part.4: new state=1 cur
> > state=1 active=1
> > hog-10552 [015]  1132.711105:
> > function:             vtime_account_user <--
> > __context_tracking_exit.part.4
> > hog-10552 [015]  1132.711105:
> > function:             smp_apic_timer_interrupt <--
> > apic_timer_interrupt
> > hog-10552 [015]  1132.711106: function:             irq_enter <--
> > smp_apic_timer_interrupt
> > hog-10552 [015]  1132.711106: function:             tick_sched_timer
> > <-- __hrtimer_run_queues
> > hog-10552 [015]  1132.711108: function:             irq_exit <--
> > smp_apic_timer_interrupt
> > hog-10552 [015]  1132.711108:
> > function:             __context_tracking_enter <--
> > prepare_exit_to_usermode
> > hog-10552 [015]  1132.711108:
> > bprint:               __context_tracking_enter.part.2: new state=1
> > cur state=0 active=1
> > hog-10552 [015]  1132.711109: function:             vtime_user_enter
> > <-- __context_tracking_enter.part.2
> > hog-10552 [015]  1132.711109:
> > function:             __vtime_account_system <-- vtime_user_enter
> > hog-10552 [015]  1132.711109:
> > function:             account_system_time <-- __vtime_account_system
> > 
> > On entering the kernel due to a timer interrupt, vtime_account_user()
> > skips user-time accounting. Then later on when returning to user-
> > space,
> > vtime_user_enter() is probably accounting the whole time (ie. user-
> > space
> > plus kernel-space) to system time.
> > 
> > Now, when does vtime_account_user() skips accounting? Well, when the
> > time delta is less then one jiffie. This would imply that
> > vtime_account_user()
> > is being called less than one jiffie since the last accounting, but I
> > haven't
> > confirmed any of this yet.
> 
> Jiffies should be advanced by the timer interrupt, on the
> housekeeping CPU, which is not doing context tracking.
> 
> Why is the isolated/nohz_full CPU receiving timer interrupts
> at all?
> 
> I thought it would not, but obviously I am wrong. What is
> going on here?

This thread sounds awful familiar to me.

With CONFIG_NO_HZ_FULL=y && CONFIG_VIRT_CPU_ACCOUNTING_GEN=y I observed process
accounting anomalies with user CPU time being misaccounted as system time all
the way back to 4.6.0.

After switching to CONFIG_NO_HZ_IDLE=y && CONFIG_VIRT_CPU_ACCOUNTING_GEN=n the
issues went away.

The lkml thread I had seen at that time which compelled me to suspect these
settings was this:
http://lkml.iu.edu/hypermail/linux/kernel/1608.2/05860.html

It sounds like this issue is finally beginning to be understood though, good
work!

Regards,
Vito Caputo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-27  1:56 ` Wanpeng Li
@ 2017-03-27 17:35   ` Rik van Riel
  2017-03-28  7:19     ` Wanpeng Li
       [not found]     ` <20170328132406.7d23579c@redhat.com>
  2017-03-27 18:38   ` Luiz Capitulino
  1 sibling, 2 replies; 67+ messages in thread
From: Rik van Riel @ 2017-03-27 17:35 UTC (permalink / raw)
  To: Wanpeng Li, Luiz Capitulino
  Cc: Frederic Weisbecker, linux-kernel, linux-rt-users

On Mon, 2017-03-27 at 09:56 +0800, Wanpeng Li wrote:
> 
> Actually after I bisect, the first bad commit is ff9a9b4c4334
> ("sched,
> time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity"). The bug
> can be reproduced readily if CONFIG_CONTEXT_TRACKING_FORCE is true

At the time, we thought it was an "occasionally bad" / "unlucky"
kind of bug, not a systemic issue, like your observations seem
to suggest.

> Let's consider the cpu which has responsibility for the global
> timekeeping, as the tracing posted above, the vtime_account_user() is
> called before tick_sched_timer() which will update jiffies, so
> jiffies
> is stale in vtime_account_user() and the run time in userspace is
> skipped, the vtime_user_enter() is called after jiffies update, so
> both the time in userspace and in  kernel are accumulated to sys
> time.
> If the housekeeping cpu is idle when CONFIG_NO_HZ_FULL, everything is
> fine. However, if you give stress to the housekeeping cpu, top will
> show 100% sys-time of both the housekeeping cpu and the other cpus
> who
> have at least two tasks running on and in full_nohz mode. I think it
> is because the stress delays the timer interrupt handling in some
> degree, then the jiffies is not updated timely before other cpus
> access it in vtime_account_user().
> 
> I think we can keep syscalls/exceptions context tracking still in
> jiffies based sampling and utilize local_clock() in vtime_delta()
> again for irqs which avoids jiffies stale influence. I can make a
> patch if the idea is acceptable or there is any better proposal. :)

Making that patch seems worthwhile, but I would like to
know what the root cause is of the issue that is being
observed.

Is the problem due to the nohz_full CPU receiving an
interrupt at the same time the timer interrupt fires on
the housekeeping CPU?

Is it due to a nohz_full CPU updating jiffies all by
itself from irq context?  In that case, could it be
better to always have that be done by the housekeeping
CPU?

What exactly is going on here?

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-27  1:56 ` Wanpeng Li
  2017-03-27 17:35   ` Rik van Riel
@ 2017-03-27 18:38   ` Luiz Capitulino
  2017-03-28  5:28     ` Wanpeng Li
  1 sibling, 1 reply; 67+ messages in thread
From: Luiz Capitulino @ 2017-03-27 18:38 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Frederic Weisbecker, Rik van Riel, linux-kernel, linux-rt-users

On Mon, 27 Mar 2017 09:56:47 +0800
Wanpeng Li <kernellwp@gmail.com> wrote:

> Actually after I bisect, the first bad commit is ff9a9b4c4334 ("sched,
> time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity"). The bug
> can be reproduced readily if CONFIG_CONTEXT_TRACKING_FORCE is true,
> then just stress all the online cpus or just one cpu and leave others
> idle(so it stresses the global timekeeping one), top show 100%
> sys-time. And another way to reproduce it is by nohz_full, and gives
> the stress to the house keeping cpu, the top show 100% sys-time of the
> house keeping cpu, and also the other cpus who have at least two tasks
> running on and in full_nohz mode.

We're not short on reproducers, I have a new one too:

 http://people.redhat.com/~lcapitul/real-time/acct-bug.c

This is a single threaded task that reproduces the issue. If you
run it as instructed, you'll get:

 - nohz_full CPU: 95% system time 5% idle time
 - non-nohz_full CPU: 95% user time 5% idle time (expected behavior)

This reproduces the issue, but not for the reasons I expected. I was
trying to mimic what I was seeing on my trace when tracing the two
task problem. Which is: a task stays 995us in user-space and then
enters the kernel. Time won't be accounted for user-space because
we're not 1 jiffies yet, but if the task stays in the kernel for more
than 5us, then time will be accounted for system time when going
back to user-space.

However, what really seems to be happening is: acct-bug is causing
the tick to be re-activated (why? it shouldn't) and that causes the
issue to appear. This is consistent with my other observations: I
can only reproduce the issue if the nohz_full CPU re-activates the tick.

> Let's consider the cpu which has responsibility for the global
> timekeeping, as the tracing posted above, the vtime_account_user() is
> called before tick_sched_timer() which will update jiffies,

But the vtime_account_user() call and the jiffies update happen
on different CPUs, no? So the ordering shouldn't matter.

> so jiffies
> is stale in vtime_account_user() and the run time in userspace is
> skipped, the vtime_user_enter() is called after jiffies update, so
> both the time in userspace and in  kernel are accumulated to sys time.
>
> If the housekeeping cpu is idle when CONFIG_NO_HZ_FULL, everything is
> fine. However, if you give stress to the housekeeping cpu, top will
> show 100% sys-time of both the housekeeping cpu and the other cpus who
> have at least two tasks running on and in full_nohz mode.

The housekeeping CPUs are idle with my reproducers.

> I think it
> is because the stress delays the timer interrupt handling in some
> degree, then the jiffies is not updated timely before other cpus
> access it in vtime_account_user().
> 
> I think we can keep syscalls/exceptions context tracking still in
> jiffies based sampling and utilize local_clock() in vtime_delta()
> again for irqs which avoids jiffies stale influence. I can make a
> patch if the idea is acceptable or there is any better proposal. :)
> 
> Regards,
> Wanpeng Li
> 

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-27 18:38   ` Luiz Capitulino
@ 2017-03-28  5:28     ` Wanpeng Li
  2017-03-28 13:44       ` Luiz Capitulino
  0 siblings, 1 reply; 67+ messages in thread
From: Wanpeng Li @ 2017-03-28  5:28 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: Frederic Weisbecker, Rik van Riel, linux-kernel, linux-rt-users

2017-03-28 2:38 GMT+08:00 Luiz Capitulino <lcapitulino@redhat.com>:
> On Mon, 27 Mar 2017 09:56:47 +0800
> Wanpeng Li <kernellwp@gmail.com> wrote:
>
>> Actually after I bisect, the first bad commit is ff9a9b4c4334 ("sched,
>> time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity"). The bug
>> can be reproduced readily if CONFIG_CONTEXT_TRACKING_FORCE is true,
>> then just stress all the online cpus or just one cpu and leave others
>> idle(so it stresses the global timekeeping one), top show 100%
>> sys-time. And another way to reproduce it is by nohz_full, and gives
>> the stress to the house keeping cpu, the top show 100% sys-time of the
>> house keeping cpu, and also the other cpus who have at least two tasks
>> running on and in full_nohz mode.
>
> We're not short on reproducers, I have a new one too:
>
>  http://people.redhat.com/~lcapitul/real-time/acct-bug.c
>
> This is a single threaded task that reproduces the issue. If you
> run it as instructed, you'll get:
>
>  - nohz_full CPU: 95% system time 5% idle time
>  - non-nohz_full CPU: 95% user time 5% idle time (expected behavior)
>
> This reproduces the issue, but not for the reasons I expected. I was
> trying to mimic what I was seeing on my trace when tracing the two
> task problem. Which is: a task stays 995us in user-space and then
> enters the kernel. Time won't be accounted for user-space because
> we're not 1 jiffies yet, but if the task stays in the kernel for more
> than 5us, then time will be accounted for system time when going
> back to user-space.
>
> However, what really seems to be happening is: acct-bug is causing
> the tick to be re-activated (why? it shouldn't) and that causes the
> issue to appear. This is consistent with my other observations: I
> can only reproduce the issue if the nohz_full CPU re-activates the tick.

I see there are other kthreads like migration, kworker,
torture_shuffle etc on the isolated CPU.

Regards,
Wanpeng Li

>
>> Let's consider the cpu which has responsibility for the global
>> timekeeping, as the tracing posted above, the vtime_account_user() is
>> called before tick_sched_timer() which will update jiffies,
>
> But the vtime_account_user() call and the jiffies update happen
> on different CPUs, no? So the ordering shouldn't matter.
>
>> so jiffies
>> is stale in vtime_account_user() and the run time in userspace is
>> skipped, the vtime_user_enter() is called after jiffies update, so
>> both the time in userspace and in  kernel are accumulated to sys time.
>>
>> If the housekeeping cpu is idle when CONFIG_NO_HZ_FULL, everything is
>> fine. However, if you give stress to the housekeeping cpu, top will
>> show 100% sys-time of both the housekeeping cpu and the other cpus who
>> have at least two tasks running on and in full_nohz mode.
>
> The housekeeping CPUs are idle with my reproducers.
>
>> I think it
>> is because the stress delays the timer interrupt handling in some
>> degree, then the jiffies is not updated timely before other cpus
>> access it in vtime_account_user().
>>
>> I think we can keep syscalls/exceptions context tracking still in
>> jiffies based sampling and utilize local_clock() in vtime_delta()
>> again for irqs which avoids jiffies stale influence. I can make a
>> patch if the idea is acceptable or there is any better proposal. :)
>>
>> Regards,
>> Wanpeng Li
>>
>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-27 17:35   ` Rik van Riel
@ 2017-03-28  7:19     ` Wanpeng Li
       [not found]     ` <20170328132406.7d23579c@redhat.com>
  1 sibling, 0 replies; 67+ messages in thread
From: Wanpeng Li @ 2017-03-28  7:19 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Luiz Capitulino, Frederic Weisbecker, linux-kernel, linux-rt-users

2017-03-28 1:35 GMT+08:00 Rik van Riel <riel@redhat.com>:
> On Mon, 2017-03-27 at 09:56 +0800, Wanpeng Li wrote:
>>
>> Actually after I bisect, the first bad commit is ff9a9b4c4334
>> ("sched,
>> time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity"). The bug
>> can be reproduced readily if CONFIG_CONTEXT_TRACKING_FORCE is true
>
> At the time, we thought it was an "occasionally bad" / "unlucky"
> kind of bug, not a systemic issue, like your observations seem
> to suggest.
>
>> Let's consider the cpu which has responsibility for the global
>> timekeeping, as the tracing posted above, the vtime_account_user() is
>> called before tick_sched_timer() which will update jiffies, so
>> jiffies
>> is stale in vtime_account_user() and the run time in userspace is
>> skipped, the vtime_user_enter() is called after jiffies update, so
>> both the time in userspace and in  kernel are accumulated to sys
>> time.
>> If the housekeeping cpu is idle when CONFIG_NO_HZ_FULL, everything is
>> fine. However, if you give stress to the housekeeping cpu, top will
>> show 100% sys-time of both the housekeeping cpu and the other cpus
>> who
>> have at least two tasks running on and in full_nohz mode. I think it
>> is because the stress delays the timer interrupt handling in some
>> degree, then the jiffies is not updated timely before other cpus
>> access it in vtime_account_user().
>>
>> I think we can keep syscalls/exceptions context tracking still in
>> jiffies based sampling and utilize local_clock() in vtime_delta()
>> again for irqs which avoids jiffies stale influence. I can make a
>> patch if the idea is acceptable or there is any better proposal. :)
>
> Making that patch seems worthwhile, but I would like to
> know what the root cause is of the issue that is being
> observed.
>
> Is the problem due to the nohz_full CPU receiving an
> interrupt at the same time the timer interrupt fires on
> the housekeeping CPU?
>
> Is it due to a nohz_full CPU updating jiffies all by
> itself from irq context?  In that case, could it be
> better to always have that be done by the housekeeping
> CPU?

I observed that the jiffies is always updated by housekeeping CPU as
we expected.

Regards,
Wanpeng Li

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-28  5:28     ` Wanpeng Li
@ 2017-03-28 13:44       ` Luiz Capitulino
  0 siblings, 0 replies; 67+ messages in thread
From: Luiz Capitulino @ 2017-03-28 13:44 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Frederic Weisbecker, Rik van Riel, linux-kernel, linux-rt-users

On Tue, 28 Mar 2017 13:28:13 +0800
Wanpeng Li <kernellwp@gmail.com> wrote:

> 2017-03-28 2:38 GMT+08:00 Luiz Capitulino <lcapitulino@redhat.com>:
> > On Mon, 27 Mar 2017 09:56:47 +0800
> > Wanpeng Li <kernellwp@gmail.com> wrote:
> >  
> >> Actually after I bisect, the first bad commit is ff9a9b4c4334 ("sched,
> >> time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity"). The bug
> >> can be reproduced readily if CONFIG_CONTEXT_TRACKING_FORCE is true,
> >> then just stress all the online cpus or just one cpu and leave others
> >> idle(so it stresses the global timekeeping one), top show 100%
> >> sys-time. And another way to reproduce it is by nohz_full, and gives
> >> the stress to the house keeping cpu, the top show 100% sys-time of the
> >> house keeping cpu, and also the other cpus who have at least two tasks
> >> running on and in full_nohz mode.  
> >
> > We're not short on reproducers, I have a new one too:
> >
> >  http://people.redhat.com/~lcapitul/real-time/acct-bug.c
> >
> > This is a single threaded task that reproduces the issue. If you
> > run it as instructed, you'll get:
> >
> >  - nohz_full CPU: 95% system time 5% idle time
> >  - non-nohz_full CPU: 95% user time 5% idle time (expected behavior)
> >
> > This reproduces the issue, but not for the reasons I expected. I was
> > trying to mimic what I was seeing on my trace when tracing the two
> > task problem. Which is: a task stays 995us in user-space and then
> > enters the kernel. Time won't be accounted for user-space because
> > we're not 1 jiffies yet, but if the task stays in the kernel for more
> > than 5us, then time will be accounted for system time when going
> > back to user-space.
> >
> > However, what really seems to be happening is: acct-bug is causing
> > the tick to be re-activated (why? it shouldn't) and that causes the
> > issue to appear. This is consistent with my other observations: I
> > can only reproduce the issue if the nohz_full CPU re-activates the tick.  
> 
> I see there are other kthreads like migration, kworker,
> torture_shuffle etc on the isolated CPU.

Except for torture_shuffle (which is new to me, and I guess could
be disabled in .config) the other threads should not be runnable
for most of the time.

> 
> Regards,
> Wanpeng Li
> 
> >  
> >> Let's consider the cpu which has responsibility for the global
> >> timekeeping, as the tracing posted above, the vtime_account_user() is
> >> called before tick_sched_timer() which will update jiffies,  
> >
> > But the vtime_account_user() call and the jiffies update happen
> > on different CPUs, no? So the ordering shouldn't matter.
> >  
> >> so jiffies
> >> is stale in vtime_account_user() and the run time in userspace is
> >> skipped, the vtime_user_enter() is called after jiffies update, so
> >> both the time in userspace and in  kernel are accumulated to sys time.
> >>
> >> If the housekeeping cpu is idle when CONFIG_NO_HZ_FULL, everything is
> >> fine. However, if you give stress to the housekeeping cpu, top will
> >> show 100% sys-time of both the housekeeping cpu and the other cpus who
> >> have at least two tasks running on and in full_nohz mode.  
> >
> > The housekeeping CPUs are idle with my reproducers.
> >  
> >> I think it
> >> is because the stress delays the timer interrupt handling in some
> >> degree, then the jiffies is not updated timely before other cpus
> >> access it in vtime_account_user().
> >>
> >> I think we can keep syscalls/exceptions context tracking still in
> >> jiffies based sampling and utilize local_clock() in vtime_delta()
> >> again for irqs which avoids jiffies stale influence. I can make a
> >> patch if the idea is acceptable or there is any better proposal. :)
> >>
> >> Regards,
> >> Wanpeng Li
> >>  
> >  
> 

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
       [not found]       ` <20170328161454.4a5d9e8b@redhat.com>
@ 2017-03-28 21:01         ` Rik van Riel
  2017-03-28 21:26           ` Luiz Capitulino
  2017-03-28 21:24         ` Rik van Riel
  1 sibling, 1 reply; 67+ messages in thread
From: Rik van Riel @ 2017-03-28 21:01 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: Wanpeng Li, Frederic Weisbecker, linux-kernel

On Tue, 2017-03-28 at 16:14 -0400, Luiz Capitulino wrote:
> On Tue, 28 Mar 2017 13:24:06 -0400
> Luiz Capitulino <lcapitulino@redhat.com> wrote:
> > I'm starting to suspect that the nohz code may be programming
> > the tick period to be shorter than 1ms when it re-activates
> > the tick.
> 
> And I think I was right, it looks like the nohz code is programming
> the tick period incorrectly when restarting the tick. The patch below
> fixes things for me, but I still have some homework todo and more
> testing before posting a patch for inclusion. Could you guys test it?

Your patch seems to work. I don't claim to understand why
your patch makes a difference, but for this particular test
case, on this particular setup, it seems to work...

> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 7fe53be..9abe979 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -1152,6 +1152,7 @@ static enum hrtimer_restart
> tick_sched_timer(struct hrtimer *timer)
>         struct pt_regs *regs = get_irq_regs();
>         ktime_t now = ktime_get();
>  
> +       ts->last_tick = now;
>         tick_sched_do_timer(now);
>  
>         /*

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
       [not found]       ` <20170328161454.4a5d9e8b@redhat.com>
  2017-03-28 21:01         ` Rik van Riel
@ 2017-03-28 21:24         ` Rik van Riel
  2017-03-28 21:30           ` Luiz Capitulino
  1 sibling, 1 reply; 67+ messages in thread
From: Rik van Riel @ 2017-03-28 21:24 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: Wanpeng Li, Frederic Weisbecker, linux-kernel

On Tue, 2017-03-28 at 16:14 -0400, Luiz Capitulino wrote:

> And I think I was right, it looks like the nohz code is programming
> the tick period incorrectly when restarting the tick. The patch below
> fixes things for me, but I still have some homework todo and more
> testing before posting a patch for inclusion. Could you guys test it?

I spoke too soon.  After half an hour of runtime,
things have gotten aligned to give me about 50/50
user time and system time with your test case,
again.

This is on an 8 VCPU virtual machine, with
nohz_full=2-7, and the test case running on one
of the nohz_full CPUs.

> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 7fe53be..9abe979 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -1152,6 +1152,7 @@ static enum hrtimer_restart
> tick_sched_timer(struct hrtimer *timer)
>         struct pt_regs *regs = get_irq_regs();
>         ktime_t now = ktime_get();
>  
> +       ts->last_tick = now;
>         tick_sched_do_timer(now);
>  
>         /*

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-28 21:01         ` Rik van Riel
@ 2017-03-28 21:26           ` Luiz Capitulino
  2017-03-29  9:56             ` Wanpeng Li
  0 siblings, 1 reply; 67+ messages in thread
From: Luiz Capitulino @ 2017-03-28 21:26 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Wanpeng Li, Frederic Weisbecker, linux-kernel

On Tue, 28 Mar 2017 17:01:52 -0400
Rik van Riel <riel@redhat.com> wrote:

> On Tue, 2017-03-28 at 16:14 -0400, Luiz Capitulino wrote:
> > On Tue, 28 Mar 2017 13:24:06 -0400
> > Luiz Capitulino <lcapitulino@redhat.com> wrote:  
> > > I'm starting to suspect that the nohz code may be programming
> > > the tick period to be shorter than 1ms when it re-activates
> > > the tick.  
> > 
> > And I think I was right, it looks like the nohz code is programming
> > the tick period incorrectly when restarting the tick. The patch below
> > fixes things for me, but I still have some homework todo and more
> > testing before posting a patch for inclusion. Could you guys test it?  
> 
> Your patch seems to work. I don't claim to understand why
> your patch makes a difference, but for this particular test
> case, on this particular setup, it seems to work...

I don't fully understand why either yet. I was looking for places
where nohz might be programming the tick period incorrectly and
I found that there's a case in tick_nohz_stop_sched_tick() where
tick_nohz_restart() is called only to reprogram the tick timer,
not cancel the tick. In this case, ts->last_tick seems to be out
of date. Fixing this fixed accounting for me.

> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index 7fe53be..9abe979 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -1152,6 +1152,7 @@ static enum hrtimer_restart
> > tick_sched_timer(struct hrtimer *timer)
> >         struct pt_regs *regs = get_irq_regs();
> >         ktime_t now = ktime_get();
> >  
> > +       ts->last_tick = now;
> >         tick_sched_do_timer(now);
> >  
> >         /*  
> 

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-28 21:24         ` Rik van Riel
@ 2017-03-28 21:30           ` Luiz Capitulino
  0 siblings, 0 replies; 67+ messages in thread
From: Luiz Capitulino @ 2017-03-28 21:30 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Wanpeng Li, Frederic Weisbecker, linux-kernel

On Tue, 28 Mar 2017 17:24:11 -0400
Rik van Riel <riel@redhat.com> wrote:

> On Tue, 2017-03-28 at 16:14 -0400, Luiz Capitulino wrote:
> 
> > And I think I was right, it looks like the nohz code is programming
> > the tick period incorrectly when restarting the tick. The patch below
> > fixes things for me, but I still have some homework todo and more
> > testing before posting a patch for inclusion. Could you guys test it?  
> 
> I spoke too soon.  After half an hour of runtime,
> things have gotten aligned to give me about 50/50
> user time and system time with your test case,
> again.

Hmmm, maybe it's incomplete. I still think that nohz might screwing
something up when re-activating the tick.

> 
> This is on an 8 VCPU virtual machine, with
> nohz_full=2-7, and the test case running on one
> of the nohz_full CPUs.
> 
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index 7fe53be..9abe979 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -1152,6 +1152,7 @@ static enum hrtimer_restart
> > tick_sched_timer(struct hrtimer *timer)
> >         struct pt_regs *regs = get_irq_regs();
> >         ktime_t now = ktime_get();
> >  
> > +       ts->last_tick = now;
> >         tick_sched_do_timer(now);
> >  
> >         /*  
> 

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-28 21:26           ` Luiz Capitulino
@ 2017-03-29  9:56             ` Wanpeng Li
  2017-03-29 12:56               ` Frederic Weisbecker
  0 siblings, 1 reply; 67+ messages in thread
From: Wanpeng Li @ 2017-03-29  9:56 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: Rik van Riel, Frederic Weisbecker, linux-kernel

2017-03-29 5:26 GMT+08:00 Luiz Capitulino <lcapitulino@redhat.com>:
> On Tue, 28 Mar 2017 17:01:52 -0400
> Rik van Riel <riel@redhat.com> wrote:
>
>> On Tue, 2017-03-28 at 16:14 -0400, Luiz Capitulino wrote:
>> > On Tue, 28 Mar 2017 13:24:06 -0400
>> > Luiz Capitulino <lcapitulino@redhat.com> wrote:
>> > > I'm starting to suspect that the nohz code may be programming
>> > > the tick period to be shorter than 1ms when it re-activates
>> > > the tick.
>> >
>> > And I think I was right, it looks like the nohz code is programming
>> > the tick period incorrectly when restarting the tick. The patch below
>> > fixes things for me, but I still have some homework todo and more
>> > testing before posting a patch for inclusion. Could you guys test it?
>>
>> Your patch seems to work. I don't claim to understand why
>> your patch makes a difference, but for this particular test
>> case, on this particular setup, it seems to work...
>
> I don't fully understand why either yet. I was looking for places
> where nohz might be programming the tick period incorrectly and

The bug is still present when I config CONTEXT_TRACKING_FORCE and
nohz=off in the boot parameter.

Regards,
Wanpeng Li

> I found that there's a case in tick_nohz_stop_sched_tick() where
> tick_nohz_restart() is called only to reprogram the tick timer,
> not cancel the tick. In this case, ts->last_tick seems to be out
> of date. Fixing this fixed accounting for me.
>
>> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
>> > index 7fe53be..9abe979 100644
>> > --- a/kernel/time/tick-sched.c
>> > +++ b/kernel/time/tick-sched.c
>> > @@ -1152,6 +1152,7 @@ static enum hrtimer_restart
>> > tick_sched_timer(struct hrtimer *timer)
>> >         struct pt_regs *regs = get_irq_regs();
>> >         ktime_t now = ktime_get();
>> >
>> > +       ts->last_tick = now;
>> >         tick_sched_do_timer(now);
>> >
>> >         /*
>>
>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-29  9:56             ` Wanpeng Li
@ 2017-03-29 12:56               ` Frederic Weisbecker
  0 siblings, 0 replies; 67+ messages in thread
From: Frederic Weisbecker @ 2017-03-29 12:56 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: Luiz Capitulino, Rik van Riel, linux-kernel

On Wed, Mar 29, 2017 at 05:56:30PM +0800, Wanpeng Li wrote:
> 2017-03-29 5:26 GMT+08:00 Luiz Capitulino <lcapitulino@redhat.com>:
> > On Tue, 28 Mar 2017 17:01:52 -0400
> > Rik van Riel <riel@redhat.com> wrote:
> >
> >> On Tue, 2017-03-28 at 16:14 -0400, Luiz Capitulino wrote:
> >> > On Tue, 28 Mar 2017 13:24:06 -0400
> >> > Luiz Capitulino <lcapitulino@redhat.com> wrote:
> >> > > I'm starting to suspect that the nohz code may be programming
> >> > > the tick period to be shorter than 1ms when it re-activates
> >> > > the tick.
> >> >
> >> > And I think I was right, it looks like the nohz code is programming
> >> > the tick period incorrectly when restarting the tick. The patch below
> >> > fixes things for me, but I still have some homework todo and more
> >> > testing before posting a patch for inclusion. Could you guys test it?
> >>
> >> Your patch seems to work. I don't claim to understand why
> >> your patch makes a difference, but for this particular test
> >> case, on this particular setup, it seems to work...
> >
> > I don't fully understand why either yet. I was looking for places
> > where nohz might be programming the tick period incorrectly and
> 
> The bug is still present when I config CONTEXT_TRACKING_FORCE and
> nohz=off in the boot parameter.

Indeed I saw something similar a few days ago with:

    !CONFIG_NO_HZ_FULL && CONFIG_VIRT_CPU_ACCOUNTING_GEN && CONTEXT_TRACKING_FORCE

And it disappeared with CONFIG_NO_HZ_FULL=y so I didn't care much because that setting
isn't used in production and in fact I intend to remove CONTEXT_TRACKING_FORCE. But
it could be the sign of something important.

It might be different than Luiz's bug because I can't reproduce his bug yet even with
his config.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-23 20:55 [BUG nohz]: wrong user and system time accounting Luiz Capitulino
                   ` (2 preceding siblings ...)
  2017-03-27  1:56 ` Wanpeng Li
@ 2017-03-29 13:04 ` Frederic Weisbecker
  2017-03-29 13:14   ` Rik van Riel
  3 siblings, 1 reply; 67+ messages in thread
From: Frederic Weisbecker @ 2017-03-29 13:04 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: riel, linux-kernel, linux-rt-users, Wanpeng Li

On Thu, Mar 23, 2017 at 04:55:12PM -0400, Luiz Capitulino wrote:
> 
> When there are two or more tasks executing in user-space and
> taking 100% of a nohz_full CPU, top reports 70% system time
> and 30% user time utilization. Sometimes I'm even able to get
> 100% system time and 0% user time.
> 
> This was reproduced with latest Linus tree (093b995), but I
> don't believe it's a regression (at least not a recent one)
> as I can reproduce it with older kernels. Also, I have
> CONFIG_IRQ_TIME_ACCOUNTING=y and haven't tried to reproduce
> without it yet.
> 
> Below you'll find the steps to reproduce and some initial
> analysis.
> 
> Steps to reproduce
> ------------------
> 
> 1. Set up a CPU for nohz_full with isolcpus= nohz_full=
> 
> 2. Pin two tasks that hog the CPU 100% of the time to that CPU

I failed to reproduce with your config. I'm still getting 99% userspace
cputime. So I'm wondering if the hogging style plays a role.

I run pure user loops:

    int main(int argc, char **argv)
    {
        for (;;);
        return 0
    }

Does your user program perform syscalls or IOs of some sort?

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-29 13:04 ` Frederic Weisbecker
@ 2017-03-29 13:14   ` Rik van Riel
  2017-03-29 13:23     ` Luiz Capitulino
  0 siblings, 1 reply; 67+ messages in thread
From: Rik van Riel @ 2017-03-29 13:14 UTC (permalink / raw)
  To: Frederic Weisbecker, Luiz Capitulino
  Cc: linux-kernel, linux-rt-users, Wanpeng Li

On Wed, 2017-03-29 at 15:04 +0200, Frederic Weisbecker wrote:
> On Thu, Mar 23, 2017 at 04:55:12PM -0400, Luiz Capitulino wrote:
> > 
> > When there are two or more tasks executing in user-space and
> > taking 100% of a nohz_full CPU, top reports 70% system time
> > and 30% user time utilization. Sometimes I'm even able to get
> > 100% system time and 0% user time.
> > 
> > This was reproduced with latest Linus tree (093b995), but I
> > don't believe it's a regression (at least not a recent one)
> > as I can reproduce it with older kernels. Also, I have
> > CONFIG_IRQ_TIME_ACCOUNTING=y and haven't tried to reproduce
> > without it yet.
> > 
> > Below you'll find the steps to reproduce and some initial
> > analysis.
> > 
> > Steps to reproduce
> > ------------------
> > 
> > 1. Set up a CPU for nohz_full with isolcpus= nohz_full=
> > 
> > 2. Pin two tasks that hog the CPU 100% of the time to that CPU
> 
> I failed to reproduce with your config. I'm still getting 99%
> userspace
> cputime. So I'm wondering if the hogging style plays a role.
> 
> I run pure user loops:
> 
>     int main(int argc, char **argv)
>     {
>         for (;;);
>         return 0
>     }
> 
> Does your user program perform syscalls or IOs of some sort?

Luiz's program makes a syscall every millisecond,
if started with the arguments he gave as his
reproducer.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-29 13:14   ` Rik van Riel
@ 2017-03-29 13:23     ` Luiz Capitulino
  2017-03-29 21:12       ` Frederic Weisbecker
  0 siblings, 1 reply; 67+ messages in thread
From: Luiz Capitulino @ 2017-03-29 13:23 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Frederic Weisbecker, linux-kernel, linux-rt-users, Wanpeng Li

[-- Attachment #1: Type: text/plain, Size: 874 bytes --]

On Wed, 29 Mar 2017 09:14:32 -0400
Rik van Riel <riel@redhat.com> wrote:

> > I failed to reproduce with your config. I'm still getting 99%
> > userspace
> > cputime. So I'm wondering if the hogging style plays a role.
> > 
> > I run pure user loops:
> > 
> >     int main(int argc, char **argv)
> >     {
> >         for (;;);
> >         return 0
> >     }
> > 
> > Does your user program perform syscalls or IOs of some sort?  
> 
> Luiz's program makes a syscall every millisecond,
> if started with the arguments he gave as his
> reproducer.

There are various reproducers actually. I started off with the simple
loop above, then wrote the attach program and then wrote the one
you're mentioning:

 http://people.redhat.com/~lcapitul/real-time/acct-bug.c

All of them reproduce the issue 100% of the time for me.

[-- Attachment #2: hog.c --]
[-- Type: text/x-c++src, Size: 998 bytes --]

#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sched.h>
#include <sys/types.h>

static int move_to_cpu(int cpu)
{
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        return sched_setaffinity(0, sizeof(set), &set);
}

static void loop(void)
{
        for (;;) ;
}

static int fork_hog(int cpu)
{
        int pid;

        pid = (int) fork();
        if (pid == 0) {
                move_to_cpu(cpu);
                loop();
                exit(0);
        }

        return pid;
}

int main(int argc, char *argv[])
{
        int i, pid, cpu, nr_procs;

		if (argc != 3) {
			printf("usage: hog < nr-procs > < CPU >\n");
			exit(1);
		}

		cpu = atoi(argv[2]);
		nr_procs = atoi(argv[1]);

        for (i = 0; i < nr_procs; i++) {
                pid = fork_hog(cpu);
                fprintf(stderr, "created hog%d pid=%d\n", i, pid);
        }

        fprintf(stderr, "pausing...\n");
        pause();

        return 0;
}

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
       [not found]       ` <20170329131656.1d6cb743@redhat.com>
@ 2017-03-29 20:08         ` Rik van Riel
  2017-03-29 22:54           ` Frederic Weisbecker
                             ` (2 more replies)
       [not found]         ` <20170329221700.GB23895@lerouge>
  1 sibling, 3 replies; 67+ messages in thread
From: Rik van Riel @ 2017-03-29 20:08 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: Wanpeng Li, Frederic Weisbecker, linux-kernel

On Wed, 2017-03-29 at 13:16 -0400, Luiz Capitulino wrote:
> On Tue, 28 Mar 2017 13:24:06 -0400
> Luiz Capitulino <lcapitulino@redhat.com> wrote:
> 
> >  1. In my tracing I'm seeing that sometimes (always?) the
> >     time interval between two timer interrupts is less than 1ms
> 
> I think that's the root cause.
> 
> In this trace, we see the following:
> 
>  1. On CPU15, we transition from user-space to kernel-space because
>     of a timer interrupt (it's the tick)
> 
>  2. vtimer_delta() returns 0, because jiffies didn't change since the
>     last accounting
> 
>  3. While CPU15 is executing in kernel-space, jiffies is updated
>     by CPU0
> 
>  4. When going back to user-space, vtime_delta() returns non-zero
>     and the whole time is accounted for system time (observe how
>     the cputime parameter in account_system_time() is less than 1ms)

In other words, the tick on cpu0 is aligned
with the tick on the nohz_full cpus, and
jiffies is advanced while the nohz_full cpus
with an active tick happen to be in kernel
mode?

Frederic, can you think of any reason why
the tick on nohz_full CPUs would end up aligned
with the tick on cpu0, instead of running at some
random offset?

A random offset, or better yet a somewhat randomized
tick length to make sure that simultaneous ticks are
fairly rare and the vtime sampling does not end up
"in phase" with the jiffies incrementing, could make
the accounting work right again.

Of course, that assumes the above hypothesis is correct :)

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-29 13:23     ` Luiz Capitulino
@ 2017-03-29 21:12       ` Frederic Weisbecker
  2017-03-30  1:48         ` Luiz Capitulino
  0 siblings, 1 reply; 67+ messages in thread
From: Frederic Weisbecker @ 2017-03-29 21:12 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: Rik van Riel, linux-kernel, linux-rt-users, Wanpeng Li

On Wed, Mar 29, 2017 at 09:23:57AM -0400, Luiz Capitulino wrote:
> 
> There are various reproducers actually. I started off with the simple
> loop above, then wrote the attach program and then wrote the one
> you're mentioning:
> 
>  http://people.redhat.com/~lcapitul/real-time/acct-bug.c
> 
> All of them reproduce the issue 100% of the time for me.

> #define _GNU_SOURCE
> #include <stdio.h>
> #include <unistd.h>
> #include <stdlib.h>
> #include <sched.h>
> #include <sys/types.h>
> 
> static int move_to_cpu(int cpu)
> {
>         cpu_set_t set;
> 
>         CPU_ZERO(&set);
>         CPU_SET(cpu, &set);
>         return sched_setaffinity(0, sizeof(set), &set);
> }
> 
> static void loop(void)
> {
>         for (;;) ;
> }
> 
> static int fork_hog(int cpu)
> {
>         int pid;
> 
>         pid = (int) fork();
>         if (pid == 0) {
>                 move_to_cpu(cpu);
>                 loop();
>                 exit(0);
>         }
> 
>         return pid;
> }
> 
> int main(int argc, char *argv[])
> {
>         int i, pid, cpu, nr_procs;
> 
> 		if (argc != 3) {
> 			printf("usage: hog < nr-procs > < CPU >\n");
> 			exit(1);
> 		}
> 
> 		cpu = atoi(argv[2]);
> 		nr_procs = atoi(argv[1]);
> 
>         for (i = 0; i < nr_procs; i++) {
>                 pid = fork_hog(cpu);
>                 fprintf(stderr, "created hog%d pid=%d\n", i, pid);
>         }
> 
>         fprintf(stderr, "pausing...\n");
>         pause();
> 
>         return 0;
> }

I just tried both of these and none seem to show incorrect cputime :-/
I'm wondering if that bug depends on some hardware.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
       [not found]         ` <20170329221700.GB23895@lerouge>
@ 2017-03-29 22:46           ` Wanpeng Li
  2017-03-30  2:14             ` Luiz Capitulino
  0 siblings, 1 reply; 67+ messages in thread
From: Wanpeng Li @ 2017-03-29 22:46 UTC (permalink / raw)
  To: Frederic Weisbecker, linux-kernel, linux-rt-users
  Cc: Luiz Capitulino, Rik van Riel

2017-03-30 6:17 GMT+08:00 Frederic Weisbecker <fweisbec@gmail.com>:
> On Wed, Mar 29, 2017 at 01:16:56PM -0400, Luiz Capitulino wrote:
>> On Tue, 28 Mar 2017 13:24:06 -0400
>> Luiz Capitulino <lcapitulino@redhat.com> wrote:
>>
>> >  1. In my tracing I'm seeing that sometimes (always?) the
>> >     time interval between two timer interrupts is less than 1ms
>>
>> I think that's the root cause.
>>
>> I'm getting traces like this:
>>
>>    hog-11980 [015]   341.494491: function:             enter_from_user_mode <-- apic_timer_interrupt
>> <idle>-0     [000]   341.494492: function:             smp_apic_timer_interrupt <-- apic_timer_interrupt
>>    hog-11980 [015]   341.494492: function:             __context_tracking_exit <-- enter_from_user_mode
>> <idle>-0     [000]   341.494492: function:             irq_enter <-- smp_apic_timer_interrupt
>>    hog-11980 [015]   341.494492: bprint:               vtime_delta: diff=0 (now=4295008339 vtime_snap=4295008339)
>>    hog-11980 [015]   341.494492: function:             smp_apic_timer_interrupt <-- apic_timer_interrupt
>>    hog-11980 [015]   341.494492: function:             irq_enter <-- smp_apic_timer_interrupt
>>    hog-11980 [015]   341.494493: function:             tick_sched_timer <-- __hrtimer_run_queues
>> <idle>-0     [000]   341.494493: function:             tick_sched_timer <-- __hrtimer_run_queues
>> <idle>-0     [000]   341.494493: function:             tick_do_update_jiffies64.part.14 <-- tick_sched_do_timer
>> <idle>-0     [000]   341.494494: function:             do_timer <-- tick_do_update_jiffies64.part.14
>>    hog-11980 [015]   341.494494: function:             irq_exit <-- smp_apic_timer_interrupt
>> <idle>-0     [000]   341.494494: bprint:               do_timer: updated jiffies_64=4295008340 ticks=1
>>    hog-11980 [015]   341.494494: function:             __context_tracking_enter <-- prepare_exit_to_usermode
>>    hog-11980 [015]   341.494494: function:             vtime_user_enter <-- __context_tracking_enter
>>    hog-11980 [015]   341.494495: bprint:               vtime_delta: diff=1000000 (now=4295008340 vtime_snap=4295008339)
>>    hog-11980 [015]   341.494495: function:             __vtime_account_system <-- vtime_user_enter
>>    hog-11980 [015]   341.494495: bprint:               get_vtime_delta: vtime_snap=4295008339 now=4295008340
>>    hog-11980 [015]   341.494495: function:             account_system_time <-- __vtime_account_system
>>    hog-11980 [015]   341.494495: bprint:               account_system_time: cputime=995488
>> <idle>-0     [000]   341.494497: function:             irq_exit <-- smp_apic_timer_interrupt
>>
>> In this trace, we see the following:
>>
>>  1. On CPU15, we transition from user-space to kernel-space because
>>     of a timer interrupt (it's the tick)
>>
>>  2. vtimer_delta() returns 0, because jiffies didn't change since the
>>     last accounting
>>
>>  3. While CPU15 is executing in kernel-space, jiffies is updated
>>     by CPU0
>>
>>  4. When going back to user-space, vtime_delta() returns non-zero
>>     and the whole time is accounted for system time (observe how
>>     the cputime parameter in account_system_time() is less than 1ms)
>
> Aah, so the issue can indeed happen if all CPUs fire their ticks at the same time:
>
>
>                  CPU 0                         CPU 1
>                  -----                         -----
>                                                exit_user() // no cputime update
> tick X           update_jiffies
>                                                enter_user() // cputime update
>
>
>                                                exit_user() //no cputime update
> tick X+1         update_jiffies
>                                                enter_user() // cputime update
>
>>
>> That's why my patch from yesterday fixed the issue, it increased the
>> tick period to more than 1ms. So vtime_delta() always evaluate to true
>> when transitioning from user-space to kernel-space (because we spend
>> more than 1ms in user-space between ticks). The patch below achieves
>> the same result by adding 10us to the tick period.
>>
>> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
>> index 7fe53be..00e46df 100644
>> --- a/kernel/time/tick-sched.c
>> +++ b/kernel/time/tick-sched.c
>> @@ -1165,7 +1165,7 @@ static enum hrtimer_restart tick_sched_timer(struct hrtimer *timer)
>>         if (unlikely(ts->tick_stopped))
>>                 return HRTIMER_NORESTART;
>>
>> -       hrtimer_forward(timer, now, tick_period);
>> +       hrtimer_forward(timer, now, tick_period + 10000);
>
> I'm surprised it works though. If the 10us shift was only applied to CPU 0 and not the
> others then yes, but if it is applied to all CPUs, the ticks stay synchronized and the
> problem should stay...
>
> Ah wait! It can work because the nohz_full CPUs have their ticks sometimes scheduled
> by tick_nohz_stop_sched_tick() or tick_nohz_restart_sched_tick() which don't have the
> 10us shift. So a drift happens everytime the nohz_full CPUs have their tick stopped.
>
>> Now, why is the tick ticking at less than 1ms? I think it's the time
>> difference between "now" (that we pass to hrtimer_forward()) and the
>> time the timer hardware is actually programmed. That should account
>> for a few microseconds.
>
> Right, that's my feeling. And if it is the case, then it shouldn't matter.
>
> So! Now we need to find a proper fix :o)
>
> Hmm, how bad would it be to revert to sched_clock() instead of jiffies in vtime_delta()?
> We could use nanosecond granularity to check deltas but only perform an actual cputime update
> when that delta >= TICK_NSEC. That should keep the load ok.

Yeah, I mentioned something similar before.
https://lkml.org/lkml/2017/3/26/138 However, Rik's commit optimized
syscalls by not utilize sched_clock(), so if we should distinguish
between syscalls/exceptions and irqs?

Regards,
Wanpeng Li

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-29 20:08         ` Rik van Riel
@ 2017-03-29 22:54           ` Frederic Weisbecker
  2017-03-30 12:57             ` Rik van Riel
  2017-03-30  1:58           ` Wanpeng Li
  2017-03-30  4:27           ` Mike Galbraith
  2 siblings, 1 reply; 67+ messages in thread
From: Frederic Weisbecker @ 2017-03-29 22:54 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Luiz Capitulino, Wanpeng Li, linux-kernel, Thomas Gleixner

(Adding Thomas in Cc)

On Wed, Mar 29, 2017 at 04:08:45PM -0400, Rik van Riel wrote:
> On Wed, 2017-03-29 at 13:16 -0400, Luiz Capitulino wrote:
> > On Tue, 28 Mar 2017 13:24:06 -0400
> > Luiz Capitulino <lcapitulino@redhat.com> wrote:
> > 
> > >  1. In my tracing I'm seeing that sometimes (always?) the
> > >     time interval between two timer interrupts is less than 1ms
> > 
> > I think that's the root cause.
> > 
> > In this trace, we see the following:
> > 
> >  1. On CPU15, we transition from user-space to kernel-space because
> >     of a timer interrupt (it's the tick)
> > 
> >  2. vtimer_delta() returns 0, because jiffies didn't change since the
> >     last accounting
> > 
> >  3. While CPU15 is executing in kernel-space, jiffies is updated
> >     by CPU0
> > 
> >  4. When going back to user-space, vtime_delta() returns non-zero
> >     and the whole time is accounted for system time (observe how
> >     the cputime parameter in account_system_time() is less than 1ms)
> 
> In other words, the tick on cpu0 is aligned
> with the tick on the nohz_full cpus, and
> jiffies is advanced while the nohz_full cpus
> with an active tick happen to be in kernel
> mode?

Ah you found out faster than me :-)

> Frederic, can you think of any reason why
> the tick on nohz_full CPUs would end up aligned
> with the tick on cpu0, instead of running at some
> random offset?

tick_init_jiffy_update() takes that decision to align all ticks.

I'm not sure why. I don't see anything that could depend on that
wide tick synchronization. The jiffies update itself relies on ktime
to check when to update it. So even if the tick fires a bit later
on CPU 1 than on CPU 0, the jiffies updates should stay coherent and
should never exceed 999us delay in the worst case (for HZ=1000)

Now I might overlook something.

> 
> A random offset, or better yet a somewhat randomized
> tick length to make sure that simultaneous ticks are
> fairly rare and the vtime sampling does not end up
> "in phase" with the jiffies incrementing, could make
> the accounting work right again.
> 
> Of course, that assumes the above hypothesis is correct :)

I'm not sure that randomizing the tick start per CPU would be a
right solution. Somewhere in the world you can be sure the tick
randomization of some nohz_full CPU will coincide with the tick
of CPU 0 :o)

Or we could force that tick on nohz_full CPUs to be far from
CPU 0's tick... I'm not sure such a solution would be accepted though.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-29 21:12       ` Frederic Weisbecker
@ 2017-03-30  1:48         ` Luiz Capitulino
  0 siblings, 0 replies; 67+ messages in thread
From: Luiz Capitulino @ 2017-03-30  1:48 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Rik van Riel, linux-kernel, linux-rt-users, Wanpeng Li

On Wed, 29 Mar 2017 23:12:00 +0200
Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Wed, Mar 29, 2017 at 09:23:57AM -0400, Luiz Capitulino wrote:
> > 
> > There are various reproducers actually. I started off with the simple
> > loop above, then wrote the attach program and then wrote the one
> > you're mentioning:
> > 
> >  http://people.redhat.com/~lcapitul/real-time/acct-bug.c
> > 
> > All of them reproduce the issue 100% of the time for me.  
> 
> > #define _GNU_SOURCE
> > #include <stdio.h>
> > #include <unistd.h>
> > #include <stdlib.h>
> > #include <sched.h>
> > #include <sys/types.h>
> > 
> > static int move_to_cpu(int cpu)
> > {
> >         cpu_set_t set;
> > 
> >         CPU_ZERO(&set);
> >         CPU_SET(cpu, &set);
> >         return sched_setaffinity(0, sizeof(set), &set);
> > }
> > 
> > static void loop(void)
> > {
> >         for (;;) ;
> > }
> > 
> > static int fork_hog(int cpu)
> > {
> >         int pid;
> > 
> >         pid = (int) fork();
> >         if (pid == 0) {
> >                 move_to_cpu(cpu);
> >                 loop();
> >                 exit(0);
> >         }
> > 
> >         return pid;
> > }
> > 
> > int main(int argc, char *argv[])
> > {
> >         int i, pid, cpu, nr_procs;
> > 
> > 		if (argc != 3) {
> > 			printf("usage: hog < nr-procs > < CPU >\n");
> > 			exit(1);
> > 		}
> > 
> > 		cpu = atoi(argv[2]);
> > 		nr_procs = atoi(argv[1]);
> > 
> >         for (i = 0; i < nr_procs; i++) {
> >                 pid = fork_hog(cpu);
> >                 fprintf(stderr, "created hog%d pid=%d\n", i, pid);
> >         }
> > 
> >         fprintf(stderr, "pausing...\n");
> >         pause();
> > 
> >         return 0;
> > }  
> 
> I just tried both of these and none seem to show incorrect cputime :-/
> I'm wondering if that bug depends on some hardware.

Are you running on x86? My CPU is:

Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz

I wonder if this issue depends on the timer used by the hrtimer
subsystem.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-29 20:08         ` Rik van Riel
  2017-03-29 22:54           ` Frederic Weisbecker
@ 2017-03-30  1:58           ` Wanpeng Li
  2017-03-30 12:40             ` Frederic Weisbecker
  2017-03-30  4:27           ` Mike Galbraith
  2 siblings, 1 reply; 67+ messages in thread
From: Wanpeng Li @ 2017-03-30  1:58 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Luiz Capitulino, Frederic Weisbecker, linux-kernel

2017-03-30 4:08 GMT+08:00 Rik van Riel <riel@redhat.com>:
> On Wed, 2017-03-29 at 13:16 -0400, Luiz Capitulino wrote:
>> On Tue, 28 Mar 2017 13:24:06 -0400
>> Luiz Capitulino <lcapitulino@redhat.com> wrote:
>>
>> >  1. In my tracing I'm seeing that sometimes (always?) the
>> >     time interval between two timer interrupts is less than 1ms
>>
>> I think that's the root cause.
>>
>> In this trace, we see the following:
>>
>>  1. On CPU15, we transition from user-space to kernel-space because
>>     of a timer interrupt (it's the tick)
>>
>>  2. vtimer_delta() returns 0, because jiffies didn't change since the
>>     last accounting
>>
>>  3. While CPU15 is executing in kernel-space, jiffies is updated
>>     by CPU0
>>
>>  4. When going back to user-space, vtime_delta() returns non-zero
>>     and the whole time is accounted for system time (observe how
>>     the cputime parameter in account_system_time() is less than 1ms)
>
> In other words, the tick on cpu0 is aligned
> with the tick on the nohz_full cpus, and
> jiffies is advanced while the nohz_full cpus
> with an active tick happen to be in kernel
> mode?
>
> Frederic, can you think of any reason why
> the tick on nohz_full CPUs would end up aligned
> with the tick on cpu0, instead of running at some
> random offset?
>
> A random offset, or better yet a somewhat randomized
> tick length to make sure that simultaneous ticks are
> fairly rare and the vtime sampling does not end up
> "in phase" with the jiffies incrementing, could make
> the accounting work right again.
>
> Of course, that assumes the above hypothesis is correct :)

There is such a feature skew_tick currently, refer to commit
5307c9556bc (tick: add tick skew boot option), w/ skew_tick=1 boot
parameter, the bug disappear, however, the commit also mentioned that
it will hurt power consumption. I will try Frederic's proposal which
is similar to my original idea "how bad would it be to revert to
sched_clock() instead of jiffies in vtime_delta()? We could use
nanosecond granularity to check deltas but only perform an actual
cputime update when that delta >= TICK_NSEC."

Regards,
Wanpeng Li

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-29 22:46           ` Wanpeng Li
@ 2017-03-30  2:14             ` Luiz Capitulino
  2017-03-30 12:27               ` Wanpeng Li
  0 siblings, 1 reply; 67+ messages in thread
From: Luiz Capitulino @ 2017-03-30  2:14 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Frederic Weisbecker, linux-kernel, linux-rt-users, Rik van Riel

On Thu, 30 Mar 2017 06:46:30 +0800
Wanpeng Li <kernellwp@gmail.com> wrote:

> > So! Now we need to find a proper fix :o)
> >
> > Hmm, how bad would it be to revert to sched_clock() instead of jiffies in vtime_delta()?
> > We could use nanosecond granularity to check deltas but only perform an actual cputime update
> > when that delta >= TICK_NSEC. That should keep the load ok.  
> 
> Yeah, I mentioned something similar before.
> https://lkml.org/lkml/2017/3/26/138 However, Rik's commit optimized
> syscalls by not utilize sched_clock(), so if we should distinguish
> between syscalls/exceptions and irqs?

Why not use ktime_get()?

Here's the solution I was thinking about, it's mostly untested. I'm
rate limiting below TICK_NSEC because I want to avoid syncing with
the tick.

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index f3778e2b..a8b1e85 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -676,18 +676,20 @@ void thread_group_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
 static u64 vtime_delta(struct task_struct *tsk)
 {
-	unsigned long now = READ_ONCE(jiffies);
+	return ktime_sub(ktime_get(), tsk->vtime_snap);
+}
 
-	if (time_before(now, (unsigned long)tsk->vtime_snap))
-		return 0;
+/* A little bit less than the tick period */
+#define VTIME_RATE_LIMIT (TICK_NSEC - 200000)
 
-	return jiffies_to_nsecs(now - tsk->vtime_snap);
+static bool vtime_should_account(struct task_struct *tsk)
+{
+	return vtime_delta(tsk) > VTIME_RATE_LIMIT;
 }
 
 static u64 get_vtime_delta(struct task_struct *tsk)
 {
-	unsigned long now = READ_ONCE(jiffies);
-	u64 delta, other;
+	u64 delta, other, now = ktime_get();
 
 	/*
 	 * Unlike tick based timing, vtime based timing never has lost
@@ -696,7 +698,7 @@ static u64 get_vtime_delta(struct task_struct *tsk)
 	 * elapsed time. Limit account_other_time to prevent rounding
 	 * errors from causing elapsed vtime to go negative.
 	 */
-	delta = jiffies_to_nsecs(now - tsk->vtime_snap);
+	delta = ktime_sub(now, tsk->vtime_snap);
 	other = account_other_time(delta);
 	WARN_ON_ONCE(tsk->vtime_snap_whence == VTIME_INACTIVE);
 	tsk->vtime_snap = now;
@@ -711,7 +713,7 @@ static void __vtime_account_system(struct task_struct *tsk)
 
 void vtime_account_system(struct task_struct *tsk)
 {
-	if (!vtime_delta(tsk))
+	if (!vtime_should_account(tsk))
 		return;
 
 	write_seqcount_begin(&tsk->vtime_seqcount);
@@ -723,7 +725,7 @@ void vtime_account_user(struct task_struct *tsk)
 {
 	write_seqcount_begin(&tsk->vtime_seqcount);
 	tsk->vtime_snap_whence = VTIME_SYS;
-	if (vtime_delta(tsk))
+	if (vtime_should_account(tsk))
 		account_user_time(tsk, get_vtime_delta(tsk));
 	write_seqcount_end(&tsk->vtime_seqcount);
 }
@@ -731,7 +733,7 @@ void vtime_account_user(struct task_struct *tsk)
 void vtime_user_enter(struct task_struct *tsk)
 {
 	write_seqcount_begin(&tsk->vtime_seqcount);
-	if (vtime_delta(tsk))
+	if (vtime_should_account(tsk))
 		__vtime_account_system(tsk);
 	tsk->vtime_snap_whence = VTIME_USER;
 	write_seqcount_end(&tsk->vtime_seqcount);
@@ -747,7 +749,7 @@ void vtime_guest_enter(struct task_struct *tsk)
 	 * that can thus safely catch up with a tickless delta.
 	 */
 	write_seqcount_begin(&tsk->vtime_seqcount);
-	if (vtime_delta(tsk))
+	if (vtime_should_account(tsk))
 		__vtime_account_system(tsk);
 	current->flags |= PF_VCPU;
 	write_seqcount_end(&tsk->vtime_seqcount);
@@ -776,7 +778,7 @@ void arch_vtime_task_switch(struct task_struct *prev)
 
 	write_seqcount_begin(&current->vtime_seqcount);
 	current->vtime_snap_whence = VTIME_SYS;
-	current->vtime_snap = jiffies;
+	current->vtime_snap = ktime_get();
 	write_seqcount_end(&current->vtime_seqcount);
 }
 

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-29 20:08         ` Rik van Riel
  2017-03-29 22:54           ` Frederic Weisbecker
  2017-03-30  1:58           ` Wanpeng Li
@ 2017-03-30  4:27           ` Mike Galbraith
  2017-03-30  6:47             ` Wanpeng Li
  2017-03-30 12:51             ` Frederic Weisbecker
  2 siblings, 2 replies; 67+ messages in thread
From: Mike Galbraith @ 2017-03-30  4:27 UTC (permalink / raw)
  To: Rik van Riel, Luiz Capitulino
  Cc: Wanpeng Li, Frederic Weisbecker, linux-kernel

On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote:

> In other words, the tick on cpu0 is aligned
> with the tick on the nohz_full cpus, and
> jiffies is advanced while the nohz_full cpus
> with an active tick happen to be in kernel
> mode?

You really want skew_tick=1, especially on big boxen.
 
> Frederic, can you think of any reason why
> the tick on nohz_full CPUs would end up aligned
> with the tick on cpu0, instead of running at some
> random offset?

(I or low rq->clock bits as crude NOHZ collision avoidance)

> A random offset, or better yet a somewhat randomized
> tick length to make sure that simultaneous ticks are
> fairly rare and the vtime sampling does not end up
> "in phase" with the jiffies incrementing, could make
> the accounting work right again.

That improves jitter, especially on big boxen.  I have an 8 socket box
that thinks it's an extra large PC, there, collision avoidance matters
hugely.  I couldn't reproduce bean counting woes, no idea if collision
avoidance will help that.

	-Mike

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30  4:27           ` Mike Galbraith
@ 2017-03-30  6:47             ` Wanpeng Li
  2017-03-30 11:52               ` Wanpeng Li
                                 ` (2 more replies)
  2017-03-30 12:51             ` Frederic Weisbecker
  1 sibling, 3 replies; 67+ messages in thread
From: Wanpeng Li @ 2017-03-30  6:47 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Rik van Riel, Luiz Capitulino, Frederic Weisbecker, linux-kernel,
	Peter Zijlstra, Thomas Gleixner

Cc Peterz, Thomas,
2017-03-30 12:27 GMT+08:00 Mike Galbraith <efault@gmx.de>:
> On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote:
>
>> In other words, the tick on cpu0 is aligned
>> with the tick on the nohz_full cpus, and
>> jiffies is advanced while the nohz_full cpus
>> with an active tick happen to be in kernel
>> mode?
>
> You really want skew_tick=1, especially on big boxen.
>
>> Frederic, can you think of any reason why
>> the tick on nohz_full CPUs would end up aligned
>> with the tick on cpu0, instead of running at some
>> random offset?
>
> (I or low rq->clock bits as crude NOHZ collision avoidance)
>
>> A random offset, or better yet a somewhat randomized
>> tick length to make sure that simultaneous ticks are
>> fairly rare and the vtime sampling does not end up
>> "in phase" with the jiffies incrementing, could make
>> the accounting work right again.
>
> That improves jitter, especially on big boxen.  I have an 8 socket box
> that thinks it's an extra large PC, there, collision avoidance matters
> hugely.  I couldn't reproduce bean counting woes, no idea if collision
> avoidance will help that.

So I implement two methods, one is from Rik's random offset proposal
through skew tick, the other one is from Frederic's proposal and it is
the same as my original idea through use nanosecond granularity to
check deltas but only perform an actual cputime update when that delta
>= TICK_NSEC. Both methods can solve the bug which Luiz reported.
Peterz, Thomas, any ideas?

--------------------------->8-------------------------------------------------------------

skew tick:

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 7fe53be..9981437 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1198,7 +1198,11 @@ void tick_setup_sched_timer(void)
     hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update());

     /* Offset the tick to avert jiffies_lock contention. */
+#ifdef CONFIG_NO_HZ_FULL
+    if (sched_skew_tick || tick_nohz_full_running) {
+#else
     if (sched_skew_tick) {
+#endif
         u64 offset = ktime_to_ns(tick_period) >> 1;
         do_div(offset, num_possible_cpus());
         offset *= smp_processor_id();

-------------------------------------->8-----------------------------------------------------

use nanosecond granularity to check deltas but only perform an actual
cputime update when that delta >= TICK_NSEC.

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index f3778e2b..f1ee393 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -676,18 +676,21 @@ void thread_group_cputime_adjusted(struct
task_struct *p, u64 *ut, u64 *st)
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
 static u64 vtime_delta(struct task_struct *tsk)
 {
-    unsigned long now = READ_ONCE(jiffies);
+    u64 now = local_clock();
+    u64 delta;
+
+    delta = now - tsk->vtime_snap;

-    if (time_before(now, (unsigned long)tsk->vtime_snap))
+    if (delta < TICK_NSEC)
         return 0;

-    return jiffies_to_nsecs(now - tsk->vtime_snap);
+    return jiffies_to_nsecs(delta / TICK_NSEC);
 }

 static u64 get_vtime_delta(struct task_struct *tsk)
 {
-    unsigned long now = READ_ONCE(jiffies);
-    u64 delta, other;
+    u64 delta = vtime_delta(tsk);
+    u64 other;

     /*
      * Unlike tick based timing, vtime based timing never has lost
@@ -696,10 +699,9 @@ static u64 get_vtime_delta(struct task_struct *tsk)
      * elapsed time. Limit account_other_time to prevent rounding
      * errors from causing elapsed vtime to go negative.
      */
-    delta = jiffies_to_nsecs(now - tsk->vtime_snap);
     other = account_other_time(delta);
     WARN_ON_ONCE(tsk->vtime_snap_whence == VTIME_INACTIVE);
-    tsk->vtime_snap = now;
+    tsk->vtime_snap += delta;

     return delta - other;
 }
@@ -776,7 +778,7 @@ void arch_vtime_task_switch(struct task_struct *prev)

     write_seqcount_begin(&current->vtime_seqcount);
     current->vtime_snap_whence = VTIME_SYS;
-    current->vtime_snap = jiffies;
+    current->vtime_snap = sched_clock_cpu(smp_processor_id());
     write_seqcount_end(&current->vtime_seqcount);
 }

@@ -787,7 +789,7 @@ void vtime_init_idle(struct task_struct *t, int cpu)
     local_irq_save(flags);
     write_seqcount_begin(&t->vtime_seqcount);
     t->vtime_snap_whence = VTIME_SYS;
-    t->vtime_snap = jiffies;
+    t->vtime_snap = sched_clock_cpu(cpu);
     write_seqcount_end(&t->vtime_seqcount);
     local_irq_restore(flags);
 }

Regards,
Wanpeng Li

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30  6:47             ` Wanpeng Li
@ 2017-03-30 11:52               ` Wanpeng Li
  2017-03-30 12:33                 ` Mike Galbraith
  2017-03-30 13:38               ` Frederic Weisbecker
  2017-04-11 14:22               ` Thomas Gleixner
  2 siblings, 1 reply; 67+ messages in thread
From: Wanpeng Li @ 2017-03-30 11:52 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Rik van Riel, Luiz Capitulino, Frederic Weisbecker, linux-kernel,
	Peter Zijlstra, Thomas Gleixner

2017-03-30 14:47 GMT+08:00 Wanpeng Li <kernellwp@gmail.com>:
> Cc Peterz, Thomas,
> 2017-03-30 12:27 GMT+08:00 Mike Galbraith <efault@gmx.de>:
>> On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote:
>>
>>> In other words, the tick on cpu0 is aligned
>>> with the tick on the nohz_full cpus, and
>>> jiffies is advanced while the nohz_full cpus
>>> with an active tick happen to be in kernel
>>> mode?
>>
>> You really want skew_tick=1, especially on big boxen.
>>
>>> Frederic, can you think of any reason why
>>> the tick on nohz_full CPUs would end up aligned
>>> with the tick on cpu0, instead of running at some
>>> random offset?
>>
>> (I or low rq->clock bits as crude NOHZ collision avoidance)
>>
>>> A random offset, or better yet a somewhat randomized
>>> tick length to make sure that simultaneous ticks are
>>> fairly rare and the vtime sampling does not end up
>>> "in phase" with the jiffies incrementing, could make
>>> the accounting work right again.
>>
>> That improves jitter, especially on big boxen.  I have an 8 socket box
>> that thinks it's an extra large PC, there, collision avoidance matters
>> hugely.  I couldn't reproduce bean counting woes, no idea if collision
>> avoidance will help that.
>
> So I implement two methods, one is from Rik's random offset proposal

If we should just add random offset to the cpu in the nohz_full mode?

> through skew tick, the other one is from Frederic's proposal and it is
> the same as my original idea through use nanosecond granularity to
> check deltas but only perform an actual cputime update when that delta
>>= TICK_NSEC. Both methods can solve the bug which Luiz reported.

This can just solves two cpu hogs running on the cpu in nohz_full
mode. However, Luiz's testcase w/ ./acct-bug 1 995 shows idle 100%.

Regards,
Wanpeng Li

> Peterz, Thomas, any ideas?
>
> --------------------------->8-------------------------------------------------------------
>
> skew tick:
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 7fe53be..9981437 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -1198,7 +1198,11 @@ void tick_setup_sched_timer(void)
>      hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update());
>
>      /* Offset the tick to avert jiffies_lock contention. */
> +#ifdef CONFIG_NO_HZ_FULL
> +    if (sched_skew_tick || tick_nohz_full_running) {
> +#else
>      if (sched_skew_tick) {
> +#endif
>          u64 offset = ktime_to_ns(tick_period) >> 1;
>          do_div(offset, num_possible_cpus());
>          offset *= smp_processor_id();
>
> -------------------------------------->8-----------------------------------------------------
>
> use nanosecond granularity to check deltas but only perform an actual
> cputime update when that delta >= TICK_NSEC.
>
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index f3778e2b..f1ee393 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -676,18 +676,21 @@ void thread_group_cputime_adjusted(struct
> task_struct *p, u64 *ut, u64 *st)
>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
>  static u64 vtime_delta(struct task_struct *tsk)
>  {
> -    unsigned long now = READ_ONCE(jiffies);
> +    u64 now = local_clock();
> +    u64 delta;
> +
> +    delta = now - tsk->vtime_snap;
>
> -    if (time_before(now, (unsigned long)tsk->vtime_snap))
> +    if (delta < TICK_NSEC)
>          return 0;
>
> -    return jiffies_to_nsecs(now - tsk->vtime_snap);
> +    return jiffies_to_nsecs(delta / TICK_NSEC);
>  }
>
>  static u64 get_vtime_delta(struct task_struct *tsk)
>  {
> -    unsigned long now = READ_ONCE(jiffies);
> -    u64 delta, other;
> +    u64 delta = vtime_delta(tsk);
> +    u64 other;
>
>      /*
>       * Unlike tick based timing, vtime based timing never has lost
> @@ -696,10 +699,9 @@ static u64 get_vtime_delta(struct task_struct *tsk)
>       * elapsed time. Limit account_other_time to prevent rounding
>       * errors from causing elapsed vtime to go negative.
>       */
> -    delta = jiffies_to_nsecs(now - tsk->vtime_snap);
>      other = account_other_time(delta);
>      WARN_ON_ONCE(tsk->vtime_snap_whence == VTIME_INACTIVE);
> -    tsk->vtime_snap = now;
> +    tsk->vtime_snap += delta;
>
>      return delta - other;
>  }
> @@ -776,7 +778,7 @@ void arch_vtime_task_switch(struct task_struct *prev)
>
>      write_seqcount_begin(&current->vtime_seqcount);
>      current->vtime_snap_whence = VTIME_SYS;
> -    current->vtime_snap = jiffies;
> +    current->vtime_snap = sched_clock_cpu(smp_processor_id());
>      write_seqcount_end(&current->vtime_seqcount);
>  }
>
> @@ -787,7 +789,7 @@ void vtime_init_idle(struct task_struct *t, int cpu)
>      local_irq_save(flags);
>      write_seqcount_begin(&t->vtime_seqcount);
>      t->vtime_snap_whence = VTIME_SYS;
> -    t->vtime_snap = jiffies;
> +    t->vtime_snap = sched_clock_cpu(cpu);
>      write_seqcount_end(&t->vtime_seqcount);
>      local_irq_restore(flags);
>  }
>
> Regards,
> Wanpeng Li

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30  2:14             ` Luiz Capitulino
@ 2017-03-30 12:27               ` Wanpeng Li
  0 siblings, 0 replies; 67+ messages in thread
From: Wanpeng Li @ 2017-03-30 12:27 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: Frederic Weisbecker, linux-kernel, linux-rt-users, Rik van Riel

2017-03-30 10:14 GMT+08:00 Luiz Capitulino <lcapitulino@redhat.com>:
> On Thu, 30 Mar 2017 06:46:30 +0800
> Wanpeng Li <kernellwp@gmail.com> wrote:
>
>> > So! Now we need to find a proper fix :o)
>> >
>> > Hmm, how bad would it be to revert to sched_clock() instead of jiffies in vtime_delta()?
>> > We could use nanosecond granularity to check deltas but only perform an actual cputime update
>> > when that delta >= TICK_NSEC. That should keep the load ok.
>>
>> Yeah, I mentioned something similar before.
>> https://lkml.org/lkml/2017/3/26/138 However, Rik's commit optimized
>> syscalls by not utilize sched_clock(), so if we should distinguish
>> between syscalls/exceptions and irqs?
>
> Why not use ktime_get()?

I believe ktime_get() is more heavy than local_clock() when sched
clock is stable. So we can cooperate to improve
https://lkml.org/lkml/2017/3/30/456.

Regards,
Wanpeng Li

>
> Here's the solution I was thinking about, it's mostly untested. I'm
> rate limiting below TICK_NSEC because I want to avoid syncing with
> the tick.
>
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index f3778e2b..a8b1e85 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -676,18 +676,20 @@ void thread_group_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
>  static u64 vtime_delta(struct task_struct *tsk)
>  {
> -       unsigned long now = READ_ONCE(jiffies);
> +       return ktime_sub(ktime_get(), tsk->vtime_snap);
> +}
>
> -       if (time_before(now, (unsigned long)tsk->vtime_snap))
> -               return 0;
> +/* A little bit less than the tick period */
> +#define VTIME_RATE_LIMIT (TICK_NSEC - 200000)
>
> -       return jiffies_to_nsecs(now - tsk->vtime_snap);
> +static bool vtime_should_account(struct task_struct *tsk)
> +{
> +       return vtime_delta(tsk) > VTIME_RATE_LIMIT;
>  }
>
>  static u64 get_vtime_delta(struct task_struct *tsk)
>  {
> -       unsigned long now = READ_ONCE(jiffies);
> -       u64 delta, other;
> +       u64 delta, other, now = ktime_get();
>
>         /*
>          * Unlike tick based timing, vtime based timing never has lost
> @@ -696,7 +698,7 @@ static u64 get_vtime_delta(struct task_struct *tsk)
>          * elapsed time. Limit account_other_time to prevent rounding
>          * errors from causing elapsed vtime to go negative.
>          */
> -       delta = jiffies_to_nsecs(now - tsk->vtime_snap);
> +       delta = ktime_sub(now, tsk->vtime_snap);
>         other = account_other_time(delta);
>         WARN_ON_ONCE(tsk->vtime_snap_whence == VTIME_INACTIVE);
>         tsk->vtime_snap = now;
> @@ -711,7 +713,7 @@ static void __vtime_account_system(struct task_struct *tsk)
>
>  void vtime_account_system(struct task_struct *tsk)
>  {
> -       if (!vtime_delta(tsk))
> +       if (!vtime_should_account(tsk))
>                 return;
>
>         write_seqcount_begin(&tsk->vtime_seqcount);
> @@ -723,7 +725,7 @@ void vtime_account_user(struct task_struct *tsk)
>  {
>         write_seqcount_begin(&tsk->vtime_seqcount);
>         tsk->vtime_snap_whence = VTIME_SYS;
> -       if (vtime_delta(tsk))
> +       if (vtime_should_account(tsk))
>                 account_user_time(tsk, get_vtime_delta(tsk));
>         write_seqcount_end(&tsk->vtime_seqcount);
>  }
> @@ -731,7 +733,7 @@ void vtime_account_user(struct task_struct *tsk)
>  void vtime_user_enter(struct task_struct *tsk)
>  {
>         write_seqcount_begin(&tsk->vtime_seqcount);
> -       if (vtime_delta(tsk))
> +       if (vtime_should_account(tsk))
>                 __vtime_account_system(tsk);
>         tsk->vtime_snap_whence = VTIME_USER;
>         write_seqcount_end(&tsk->vtime_seqcount);
> @@ -747,7 +749,7 @@ void vtime_guest_enter(struct task_struct *tsk)
>          * that can thus safely catch up with a tickless delta.
>          */
>         write_seqcount_begin(&tsk->vtime_seqcount);
> -       if (vtime_delta(tsk))
> +       if (vtime_should_account(tsk))
>                 __vtime_account_system(tsk);
>         current->flags |= PF_VCPU;
>         write_seqcount_end(&tsk->vtime_seqcount);
> @@ -776,7 +778,7 @@ void arch_vtime_task_switch(struct task_struct *prev)
>
>         write_seqcount_begin(&current->vtime_seqcount);
>         current->vtime_snap_whence = VTIME_SYS;
> -       current->vtime_snap = jiffies;
> +       current->vtime_snap = ktime_get();
>         write_seqcount_end(&current->vtime_seqcount);
>  }
>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30 11:52               ` Wanpeng Li
@ 2017-03-30 12:33                 ` Mike Galbraith
  0 siblings, 0 replies; 67+ messages in thread
From: Mike Galbraith @ 2017-03-30 12:33 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Rik van Riel, Luiz Capitulino, Frederic Weisbecker, linux-kernel,
	Peter Zijlstra, Thomas Gleixner

On Thu, 2017-03-30 at 19:52 +0800, Wanpeng Li wrote:

> If we should just add random offset to the cpu in the nohz_full mode?

Up to you, whatever works best.  I left the regular skew alone, just
added some noise to scheduler_tick_max_deferment().

	-Mike

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30  1:58           ` Wanpeng Li
@ 2017-03-30 12:40             ` Frederic Weisbecker
  2017-03-30 13:19               ` Mike Galbraith
  0 siblings, 1 reply; 67+ messages in thread
From: Frederic Weisbecker @ 2017-03-30 12:40 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: Rik van Riel, Luiz Capitulino, linux-kernel

On Thu, Mar 30, 2017 at 09:58:44AM +0800, Wanpeng Li wrote:
> 2017-03-30 4:08 GMT+08:00 Rik van Riel <riel@redhat.com>:
> >
> > In other words, the tick on cpu0 is aligned
> > with the tick on the nohz_full cpus, and
> > jiffies is advanced while the nohz_full cpus
> > with an active tick happen to be in kernel
> > mode?
> >
> > Frederic, can you think of any reason why
> > the tick on nohz_full CPUs would end up aligned
> > with the tick on cpu0, instead of running at some
> > random offset?
> >
> > A random offset, or better yet a somewhat randomized
> > tick length to make sure that simultaneous ticks are
> > fairly rare and the vtime sampling does not end up
> > "in phase" with the jiffies incrementing, could make
> > the accounting work right again.
> >
> > Of course, that assumes the above hypothesis is correct :)
> 
> There is such a feature skew_tick currently, refer to commit
> 5307c9556bc (tick: add tick skew boot option), w/ skew_tick=1 boot
> parameter, the bug disappear, however, the commit also mentioned that
> it will hurt power consumption.

Oh, I completely missed that!

> I will try Frederic's proposal which
> is similar to my original idea "how bad would it be to revert to
> sched_clock() instead of jiffies in vtime_delta()? We could use
> nanosecond granularity to check deltas but only perform an actual
> cputime update when that delta >= TICK_NSEC."

Thanks! I hope sched_clock() won't introduce too much overhead.
Otherwise we may want to pick up the skew_tick solution.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30  4:27           ` Mike Galbraith
  2017-03-30  6:47             ` Wanpeng Li
@ 2017-03-30 12:51             ` Frederic Weisbecker
  2017-03-30 13:02               ` Rik van Riel
  1 sibling, 1 reply; 67+ messages in thread
From: Frederic Weisbecker @ 2017-03-30 12:51 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Rik van Riel, Luiz Capitulino, Wanpeng Li, linux-kernel

On Thu, Mar 30, 2017 at 06:27:31AM +0200, Mike Galbraith wrote:
> On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote:
> 
> > A random offset, or better yet a somewhat randomized
> > tick length to make sure that simultaneous ticks are
> > fairly rare and the vtime sampling does not end up
> > "in phase" with the jiffies incrementing, could make
> > the accounting work right again.
> 
> That improves jitter, especially on big boxen.  I have an 8 socket box
> that thinks it's an extra large PC, there, collision avoidance matters
> hugely.  I couldn't reproduce bean counting woes, no idea if collision
> avoidance will help that.

Out of curiosity, where is the main contention between ticks? I indeed
know some locks that can be taken on special cases, such as posix cpu timers.

Also, why does it raise power consumption issues?

Thanks.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-29 22:54           ` Frederic Weisbecker
@ 2017-03-30 12:57             ` Rik van Riel
  0 siblings, 0 replies; 67+ messages in thread
From: Rik van Riel @ 2017-03-30 12:57 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Luiz Capitulino, Wanpeng Li, linux-kernel, Thomas Gleixner

On Thu, 2017-03-30 at 00:54 +0200, Frederic Weisbecker wrote:
> (Adding Thomas in Cc)
> 
> On Wed, Mar 29, 2017 at 04:08:45PM -0400, Rik van Riel wrote:
> > 
> > Frederic, can you think of any reason why
> > the tick on nohz_full CPUs would end up aligned
> > with the tick on cpu0, instead of running at some
> > random offset?
> 
> tick_init_jiffy_update() takes that decision to align all ticks.
> 
> I'm not sure why. 

I don't see why that would matter, either.

> I'm not sure that randomizing the tick start per CPU would be a
> right solution. Somewhere in the world you can be sure the tick
> randomization of some nohz_full CPU will coincide with the tick
> of CPU 0 :o)
> 
> Or we could force that tick on nohz_full CPUs to be far from
> CPU 0's tick... I'm not sure such a solution would be accepted
> though.

I am not sure we would have to force things.

Simply getting rid of tick_init_jiffy_update
and scheduling the next tick for "now + tick
period" might have the same effect, when the
tick gets stopped and restarted on nohz_full
CPUs.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30 12:51             ` Frederic Weisbecker
@ 2017-03-30 13:02               ` Rik van Riel
  2017-03-30 13:35                 ` Mike Galbraith
  2017-03-30 13:44                 ` Frederic Weisbecker
  0 siblings, 2 replies; 67+ messages in thread
From: Rik van Riel @ 2017-03-30 13:02 UTC (permalink / raw)
  To: Frederic Weisbecker, Mike Galbraith
  Cc: Luiz Capitulino, Wanpeng Li, linux-kernel

On Thu, 2017-03-30 at 14:51 +0200, Frederic Weisbecker wrote:
> On Thu, Mar 30, 2017 at 06:27:31AM +0200, Mike Galbraith wrote:
> > On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote:
> > 
> > > A random offset, or better yet a somewhat randomized
> > > tick length to make sure that simultaneous ticks are
> > > fairly rare and the vtime sampling does not end up
> > > "in phase" with the jiffies incrementing, could make
> > > the accounting work right again.
> > 
> > That improves jitter, especially on big boxen.  I have an 8 socket
> > box
> > that thinks it's an extra large PC, there, collision avoidance
> > matters
> > hugely.  I couldn't reproduce bean counting woes, no idea if
> > collision
> > avoidance will help that.
> 
> Out of curiosity, where is the main contention between ticks? I
> indeed
> know some locks that can be taken on special cases, such as posix cpu
> timers.
> 
> Also, why does it raise power consumption issues?

On a system without either nohz_full or nohz idle
mode, skewed ticks result in CPU cores waking up
at different times, and keeping an idle system
consuming power for more time than it would if all
the ticks happened simultaneously.

This is not a factor at all on systems that switch
off the tick while idle, since the CPU will be busy
anyway while the tick is enabled.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30 12:40             ` Frederic Weisbecker
@ 2017-03-30 13:19               ` Mike Galbraith
  0 siblings, 0 replies; 67+ messages in thread
From: Mike Galbraith @ 2017-03-30 13:19 UTC (permalink / raw)
  To: Frederic Weisbecker, Wanpeng Li
  Cc: Rik van Riel, Luiz Capitulino, linux-kernel

On Thu, 2017-03-30 at 14:40 +0200, Frederic Weisbecker wrote:
> On Thu, Mar 30, 2017 at 09:58:44AM +0800, Wanpeng Li wrote:

> > There is such a feature skew_tick currently, refer to commit
> > 5307c9556bc (tick: add tick skew boot option), w/ skew_tick=1 boot
> > parameter, the bug disappear, however, the commit also mentioned that
> > it will hurt power consumption.
> 
> Oh, I completely missed that!

It suggests it'll harm power consumption because skew removal allegedly
saved power.  Recalling what removal did to my 8 socket box, I doubt
adding it back costs large boxen anything at all, rather the opposite.

	-Mike

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30 13:02               ` Rik van Riel
@ 2017-03-30 13:35                 ` Mike Galbraith
  2017-04-03 14:40                   ` Frederic Weisbecker
  2017-03-30 13:44                 ` Frederic Weisbecker
  1 sibling, 1 reply; 67+ messages in thread
From: Mike Galbraith @ 2017-03-30 13:35 UTC (permalink / raw)
  To: Rik van Riel, Frederic Weisbecker
  Cc: Luiz Capitulino, Wanpeng Li, linux-kernel

On Thu, 2017-03-30 at 09:02 -0400, Rik van Riel wrote:
> On Thu, 2017-03-30 at 14:51 +0200, Frederic Weisbecker wrote:

> > Also, why does it raise power consumption issues?
> 
> On a system without either nohz_full or nohz idle
> mode, skewed ticks result in CPU cores waking up
> at different times, and keeping an idle system
> consuming power for more time than it would if all
> the ticks happened simultaneously.

And if your server farm is mostly idle, that power savings may delay
your bankruptcy proceedings by a whole microsecond ;-)

Or more seriously, what skew does do on boxen of size X today is
something for perf to say.  At the time, removal was very bad for my 8
socket box, and allegedly caused huge SGI beasts in horrific pain.

	-Mike

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30  6:47             ` Wanpeng Li
  2017-03-30 11:52               ` Wanpeng Li
@ 2017-03-30 13:38               ` Frederic Weisbecker
  2017-03-30 13:59                 ` Wanpeng Li
  2017-04-11 11:03                 ` Wanpeng Li
  2017-04-11 14:22               ` Thomas Gleixner
  2 siblings, 2 replies; 67+ messages in thread
From: Frederic Weisbecker @ 2017-03-30 13:38 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Mike Galbraith, Rik van Riel, Luiz Capitulino, linux-kernel,
	Peter Zijlstra, Thomas Gleixner

On Thu, Mar 30, 2017 at 02:47:11PM +0800, Wanpeng Li wrote:
> Cc Peterz, Thomas,
> 2017-03-30 12:27 GMT+08:00 Mike Galbraith <efault@gmx.de>:
> > On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote:
> >
> >> In other words, the tick on cpu0 is aligned
> >> with the tick on the nohz_full cpus, and
> >> jiffies is advanced while the nohz_full cpus
> >> with an active tick happen to be in kernel
> >> mode?
> >
> > You really want skew_tick=1, especially on big boxen.
> >
> >> Frederic, can you think of any reason why
> >> the tick on nohz_full CPUs would end up aligned
> >> with the tick on cpu0, instead of running at some
> >> random offset?
> >
> > (I or low rq->clock bits as crude NOHZ collision avoidance)
> >
> >> A random offset, or better yet a somewhat randomized
> >> tick length to make sure that simultaneous ticks are
> >> fairly rare and the vtime sampling does not end up
> >> "in phase" with the jiffies incrementing, could make
> >> the accounting work right again.
> >
> > That improves jitter, especially on big boxen.  I have an 8 socket box
> > that thinks it's an extra large PC, there, collision avoidance matters
> > hugely.  I couldn't reproduce bean counting woes, no idea if collision
> > avoidance will help that.
> 
> So I implement two methods, one is from Rik's random offset proposal
> through skew tick, the other one is from Frederic's proposal and it is
> the same as my original idea through use nanosecond granularity to
> check deltas but only perform an actual cputime update when that delta
> >= TICK_NSEC. Both methods can solve the bug which Luiz reported.
> Peterz, Thomas, any ideas?
> 
> --------------------------->8-------------------------------------------------------------
> 
> skew tick:
> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 7fe53be..9981437 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -1198,7 +1198,11 @@ void tick_setup_sched_timer(void)
>      hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update());
> 
>      /* Offset the tick to avert jiffies_lock contention. */
> +#ifdef CONFIG_NO_HZ_FULL
> +    if (sched_skew_tick || tick_nohz_full_running) {
> +#else
>      if (sched_skew_tick) {
> +#endif

Please rather use tick_nohz_full_enabled() to avoid ifdeffery.

>          u64 offset = ktime_to_ns(tick_period) >> 1;
>          do_div(offset, num_possible_cpus());
>          offset *= smp_processor_id();

If it works, we may want to take that solution, likely less performance sensitive
than using sched_clock(). In fact sched_clock() is fast, especially as we require it to
be stable for nohz_full, but using it involves costly conversion back and forth to jiffies.

> 
> -------------------------------------->8-----------------------------------------------------
> 
> use nanosecond granularity to check deltas but only perform an actual
> cputime update when that delta >= TICK_NSEC.
> 
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index f3778e2b..f1ee393 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -676,18 +676,21 @@ void thread_group_cputime_adjusted(struct
> task_struct *p, u64 *ut, u64 *st)
>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
>  static u64 vtime_delta(struct task_struct *tsk)
>  {
> -    unsigned long now = READ_ONCE(jiffies);
> +    u64 now = local_clock();

I fear we need a global clock, because the reader (task_cputime()) needs
to compute the delta and therefore use the same clock from any CPU.

Or we can use the local_clock() but the reader must access the same.

So there would be vtime_delta_writer() which uses local_clock and stores
the current CPU to tsk->vtime_cpu (under the vtime_seqcount). And then
vtime_delta_reader() which calls sched_clock_cpu(tsk->vtime_cpu) which
is protected by vtime_seqcount as well.

Although those sched_clock_cpu() things seem to only matter when the
sched_clock() is unstable. And that stability is a condition for nohz_full
to work anyway. So probably sched_clock() alone would be enough.

Thanks.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30 13:02               ` Rik van Riel
  2017-03-30 13:35                 ` Mike Galbraith
@ 2017-03-30 13:44                 ` Frederic Weisbecker
  1 sibling, 0 replies; 67+ messages in thread
From: Frederic Weisbecker @ 2017-03-30 13:44 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Mike Galbraith, Luiz Capitulino, Wanpeng Li, linux-kernel

On Thu, Mar 30, 2017 at 09:02:31AM -0400, Rik van Riel wrote:
> On Thu, 2017-03-30 at 14:51 +0200, Frederic Weisbecker wrote:
> > On Thu, Mar 30, 2017 at 06:27:31AM +0200, Mike Galbraith wrote:
> > > On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote:
> > > 
> > > > A random offset, or better yet a somewhat randomized
> > > > tick length to make sure that simultaneous ticks are
> > > > fairly rare and the vtime sampling does not end up
> > > > "in phase" with the jiffies incrementing, could make
> > > > the accounting work right again.
> > > 
> > > That improves jitter, especially on big boxen.  I have an 8 socket
> > > box
> > > that thinks it's an extra large PC, there, collision avoidance
> > > matters
> > > hugely.  I couldn't reproduce bean counting woes, no idea if
> > > collision
> > > avoidance will help that.
> > 
> > Out of curiosity, where is the main contention between ticks? I
> > indeed
> > know some locks that can be taken on special cases, such as posix cpu
> > timers.
> > 
> > Also, why does it raise power consumption issues?
> 
> On a system without either nohz_full or nohz idle
> mode, skewed ticks result in CPU cores waking up
> at different times, and keeping an idle system
> consuming power for more time than it would if all
> the ticks happened simultaneously.

Ah fair point!

> 
> This is not a factor at all on systems that switch
> off the tick while idle, since the CPU will be busy
> anyway while the tick is enabled.

I see. Thanks!

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30 13:38               ` Frederic Weisbecker
@ 2017-03-30 13:59                 ` Wanpeng Li
  2017-03-30 14:18                   ` Frederic Weisbecker
  2017-04-11 11:03                 ` Wanpeng Li
  1 sibling, 1 reply; 67+ messages in thread
From: Wanpeng Li @ 2017-03-30 13:59 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Mike Galbraith, Rik van Riel, Luiz Capitulino, linux-kernel,
	Peter Zijlstra, Thomas Gleixner

2017-03-30 21:38 GMT+08:00 Frederic Weisbecker <fweisbec@gmail.com>:
> On Thu, Mar 30, 2017 at 02:47:11PM +0800, Wanpeng Li wrote:
>> Cc Peterz, Thomas,
>> 2017-03-30 12:27 GMT+08:00 Mike Galbraith <efault@gmx.de>:
>> > On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote:
>> >
>> >> In other words, the tick on cpu0 is aligned
>> >> with the tick on the nohz_full cpus, and
>> >> jiffies is advanced while the nohz_full cpus
>> >> with an active tick happen to be in kernel
>> >> mode?
>> >
>> > You really want skew_tick=1, especially on big boxen.
>> >
>> >> Frederic, can you think of any reason why
>> >> the tick on nohz_full CPUs would end up aligned
>> >> with the tick on cpu0, instead of running at some
>> >> random offset?
>> >
>> > (I or low rq->clock bits as crude NOHZ collision avoidance)
>> >
>> >> A random offset, or better yet a somewhat randomized
>> >> tick length to make sure that simultaneous ticks are
>> >> fairly rare and the vtime sampling does not end up
>> >> "in phase" with the jiffies incrementing, could make
>> >> the accounting work right again.
>> >
>> > That improves jitter, especially on big boxen.  I have an 8 socket box
>> > that thinks it's an extra large PC, there, collision avoidance matters
>> > hugely.  I couldn't reproduce bean counting woes, no idea if collision
>> > avoidance will help that.
>>
>> So I implement two methods, one is from Rik's random offset proposal
>> through skew tick, the other one is from Frederic's proposal and it is
>> the same as my original idea through use nanosecond granularity to
>> check deltas but only perform an actual cputime update when that delta
>> >= TICK_NSEC. Both methods can solve the bug which Luiz reported.
>> Peterz, Thomas, any ideas?
>>
>> --------------------------->8-------------------------------------------------------------
>>
>> skew tick:
>>
>> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
>> index 7fe53be..9981437 100644
>> --- a/kernel/time/tick-sched.c
>> +++ b/kernel/time/tick-sched.c
>> @@ -1198,7 +1198,11 @@ void tick_setup_sched_timer(void)
>>      hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update());
>>
>>      /* Offset the tick to avert jiffies_lock contention. */
>> +#ifdef CONFIG_NO_HZ_FULL
>> +    if (sched_skew_tick || tick_nohz_full_running) {
>> +#else
>>      if (sched_skew_tick) {
>> +#endif
>
> Please rather use tick_nohz_full_enabled() to avoid ifdeffery.
>
>>          u64 offset = ktime_to_ns(tick_period) >> 1;
>>          do_div(offset, num_possible_cpus());
>>          offset *= smp_processor_id();
>
> If it works, we may want to take that solution, likely less performance sensitive
> than using sched_clock(). In fact sched_clock() is fast, especially as we require it to
> be stable for nohz_full, but using it involves costly conversion back and forth to jiffies.

So both Rik and you agree with the skew tick solution, I will try it
tomorrow. Btw, if we should just add random offset to the cpu in the
nohz_full mode or add random offset to all cpus like the codes above?

Regards,
Wanpeng Li

>
>>
>> -------------------------------------->8-----------------------------------------------------
>>
>> use nanosecond granularity to check deltas but only perform an actual
>> cputime update when that delta >= TICK_NSEC.
>>
>> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
>> index f3778e2b..f1ee393 100644
>> --- a/kernel/sched/cputime.c
>> +++ b/kernel/sched/cputime.c
>> @@ -676,18 +676,21 @@ void thread_group_cputime_adjusted(struct
>> task_struct *p, u64 *ut, u64 *st)
>>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
>>  static u64 vtime_delta(struct task_struct *tsk)
>>  {
>> -    unsigned long now = READ_ONCE(jiffies);
>> +    u64 now = local_clock();
>
> I fear we need a global clock, because the reader (task_cputime()) needs
> to compute the delta and therefore use the same clock from any CPU.
>
> Or we can use the local_clock() but the reader must access the same.
>
> So there would be vtime_delta_writer() which uses local_clock and stores
> the current CPU to tsk->vtime_cpu (under the vtime_seqcount). And then
> vtime_delta_reader() which calls sched_clock_cpu(tsk->vtime_cpu) which
> is protected by vtime_seqcount as well.
>
> Although those sched_clock_cpu() things seem to only matter when the
> sched_clock() is unstable. And that stability is a condition for nohz_full
> to work anyway. So probably sched_clock() alone would be enough.
>
> Thanks.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30 13:59                 ` Wanpeng Li
@ 2017-03-30 14:18                   ` Frederic Weisbecker
  2017-03-30 21:25                     ` Luiz Capitulino
  0 siblings, 1 reply; 67+ messages in thread
From: Frederic Weisbecker @ 2017-03-30 14:18 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Mike Galbraith, Rik van Riel, Luiz Capitulino, linux-kernel,
	Peter Zijlstra, Thomas Gleixner

On Thu, Mar 30, 2017 at 09:59:54PM +0800, Wanpeng Li wrote:
> 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker <fweisbec@gmail.com>:
> > If it works, we may want to take that solution, likely less performance sensitive
> > than using sched_clock(). In fact sched_clock() is fast, especially as we require it to
> > be stable for nohz_full, but using it involves costly conversion back and forth to jiffies.
> 
> So both Rik and you agree with the skew tick solution, I will try it
> tomorrow. Btw, if we should just add random offset to the cpu in the
> nohz_full mode or add random offset to all cpus like the codes above?

Lets just keep it to all CPUs for simplicty.
Also please add a comment that explains why we need that skew_tick on nohz_full.

Thanks!

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30 14:18                   ` Frederic Weisbecker
@ 2017-03-30 21:25                     ` Luiz Capitulino
  2017-03-31 20:09                       ` Luiz Capitulino
  0 siblings, 1 reply; 67+ messages in thread
From: Luiz Capitulino @ 2017-03-30 21:25 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Wanpeng Li, Mike Galbraith, Rik van Riel, linux-kernel,
	Peter Zijlstra, Thomas Gleixner

On Thu, 30 Mar 2017 16:18:17 +0200
Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Thu, Mar 30, 2017 at 09:59:54PM +0800, Wanpeng Li wrote:
> > 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker <fweisbec@gmail.com>:  
> > > If it works, we may want to take that solution, likely less performance sensitive
> > > than using sched_clock(). In fact sched_clock() is fast, especially as we require it to
> > > be stable for nohz_full, but using it involves costly conversion back and forth to jiffies.  
> > 
> > So both Rik and you agree with the skew tick solution, I will try it
> > tomorrow. Btw, if we should just add random offset to the cpu in the
> > nohz_full mode or add random offset to all cpus like the codes above?  
> 
> Lets just keep it to all CPUs for simplicty.
> Also please add a comment that explains why we need that skew_tick on nohz_full.

I've tried all the test-cases we discussed in this thread with skew_tick=1
and it worked as expected in bare-metal and KVM guests.

However, I found a test-case that works in bare-metal but show problems
in KVM guests. It could something that's KVM specific, or it could be
something that's harder to reproduce in bare-metal.

The reproducer is (not sure all the steps are necessary):

1. Isolate 8 cores in the host with isolcpus= and nohz_full= (and skew_tick=1)

2. Create a KVM guest with 8 vCPUs and pin each vCPU to an isolated
   host core

3. Boot the guest with isolcpus=2,3,4,5,6,7 nohz_full=2,3,4,5,6,7 skew_tick=1

4. Once the guest is booted, run:

# for i in $(seq 2 7); do taskset -c $i hog& ;done
# taskset -c 2,3,4,5,6,7 \
  cyclictest -m -n -q -p95 -D 1m -h60 -i 200 -t 6 -a 2,3,4,5,6,7

  (where hog is a program taking 100% of the CPU, and cyclictest
   is RT's cyclictest)

5. Run top -d1

In a few minutes into this test-case, I see one isolated CPU in the
guest reporting around 95% system time (where the expected is close
to 100% user time, which the others isolated CPUs correctly report).

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30 21:25                     ` Luiz Capitulino
@ 2017-03-31 20:09                       ` Luiz Capitulino
  2017-03-31 23:24                         ` Frederic Weisbecker
  0 siblings, 1 reply; 67+ messages in thread
From: Luiz Capitulino @ 2017-03-31 20:09 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Wanpeng Li, Mike Galbraith, Rik van Riel, linux-kernel,
	Peter Zijlstra, Thomas Gleixner

On Thu, 30 Mar 2017 17:25:46 -0400
Luiz Capitulino <lcapitulino@redhat.com> wrote:

> On Thu, 30 Mar 2017 16:18:17 +0200
> Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > On Thu, Mar 30, 2017 at 09:59:54PM +0800, Wanpeng Li wrote:  
> > > 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker <fweisbec@gmail.com>:    
> > > > If it works, we may want to take that solution, likely less performance sensitive
> > > > than using sched_clock(). In fact sched_clock() is fast, especially as we require it to
> > > > be stable for nohz_full, but using it involves costly conversion back and forth to jiffies.    
> > > 
> > > So both Rik and you agree with the skew tick solution, I will try it
> > > tomorrow. Btw, if we should just add random offset to the cpu in the
> > > nohz_full mode or add random offset to all cpus like the codes above?    
> > 
> > Lets just keep it to all CPUs for simplicty.
> > Also please add a comment that explains why we need that skew_tick on nohz_full.  
> 
> I've tried all the test-cases we discussed in this thread with skew_tick=1
> and it worked as expected in bare-metal and KVM guests.
> 
> However, I found a test-case that works in bare-metal but show problems
> in KVM guests. It could something that's KVM specific, or it could be
> something that's harder to reproduce in bare-metal.

After discussing some findings on this issue with Rik, I realized that
we don't add the skew when restarting the tick in tick_nohz_restart().
Adding the offset there seems to solve this problem.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-31 20:09                       ` Luiz Capitulino
@ 2017-03-31 23:24                         ` Frederic Weisbecker
  2017-04-01  3:11                           ` Luiz Capitulino
  0 siblings, 1 reply; 67+ messages in thread
From: Frederic Weisbecker @ 2017-03-31 23:24 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: Wanpeng Li, Mike Galbraith, Rik van Riel, linux-kernel,
	Peter Zijlstra, Thomas Gleixner

On Fri, Mar 31, 2017 at 04:09:10PM -0400, Luiz Capitulino wrote:
> On Thu, 30 Mar 2017 17:25:46 -0400
> Luiz Capitulino <lcapitulino@redhat.com> wrote:
> 
> > On Thu, 30 Mar 2017 16:18:17 +0200
> > Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > 
> > > On Thu, Mar 30, 2017 at 09:59:54PM +0800, Wanpeng Li wrote:  
> > > > 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker <fweisbec@gmail.com>:    
> > > > > If it works, we may want to take that solution, likely less performance sensitive
> > > > > than using sched_clock(). In fact sched_clock() is fast, especially as we require it to
> > > > > be stable for nohz_full, but using it involves costly conversion back and forth to jiffies.    
> > > > 
> > > > So both Rik and you agree with the skew tick solution, I will try it
> > > > tomorrow. Btw, if we should just add random offset to the cpu in the
> > > > nohz_full mode or add random offset to all cpus like the codes above?    
> > > 
> > > Lets just keep it to all CPUs for simplicty.
> > > Also please add a comment that explains why we need that skew_tick on nohz_full.  
> > 
> > I've tried all the test-cases we discussed in this thread with skew_tick=1
> > and it worked as expected in bare-metal and KVM guests.
> > 
> > However, I found a test-case that works in bare-metal but show problems
> > in KVM guests. It could something that's KVM specific, or it could be
> > something that's harder to reproduce in bare-metal.
> 
> After discussing some findings on this issue with Rik, I realized that
> we don't add the skew when restarting the tick in tick_nohz_restart().
> Adding the offset there seems to solve this problem.

Are you sure? tick_nohz_restart() doesn't seem to override the initial skew. It
always forwards the expiration time on top of the last tick.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-31 23:24                         ` Frederic Weisbecker
@ 2017-04-01  3:11                           ` Luiz Capitulino
  2017-04-03 15:23                             ` Frederic Weisbecker
  0 siblings, 1 reply; 67+ messages in thread
From: Luiz Capitulino @ 2017-04-01  3:11 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Wanpeng Li, Mike Galbraith, Rik van Riel, linux-kernel,
	Peter Zijlstra, Thomas Gleixner

On Sat, 1 Apr 2017 01:24:54 +0200
Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Fri, Mar 31, 2017 at 04:09:10PM -0400, Luiz Capitulino wrote:
> > On Thu, 30 Mar 2017 17:25:46 -0400
> > Luiz Capitulino <lcapitulino@redhat.com> wrote:
> >   
> > > On Thu, 30 Mar 2017 16:18:17 +0200
> > > Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > >   
> > > > On Thu, Mar 30, 2017 at 09:59:54PM +0800, Wanpeng Li wrote:    
> > > > > 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker <fweisbec@gmail.com>:      
> > > > > > If it works, we may want to take that solution, likely less performance sensitive
> > > > > > than using sched_clock(). In fact sched_clock() is fast, especially as we require it to
> > > > > > be stable for nohz_full, but using it involves costly conversion back and forth to jiffies.      
> > > > > 
> > > > > So both Rik and you agree with the skew tick solution, I will try it
> > > > > tomorrow. Btw, if we should just add random offset to the cpu in the
> > > > > nohz_full mode or add random offset to all cpus like the codes above?      
> > > > 
> > > > Lets just keep it to all CPUs for simplicty.
> > > > Also please add a comment that explains why we need that skew_tick on nohz_full.    
> > > 
> > > I've tried all the test-cases we discussed in this thread with skew_tick=1
> > > and it worked as expected in bare-metal and KVM guests.
> > > 
> > > However, I found a test-case that works in bare-metal but show problems
> > > in KVM guests. It could something that's KVM specific, or it could be
> > > something that's harder to reproduce in bare-metal.  
> > 
> > After discussing some findings on this issue with Rik, I realized that
> > we don't add the skew when restarting the tick in tick_nohz_restart().
> > Adding the offset there seems to solve this problem.  
> 
> Are you sure? tick_nohz_restart() doesn't seem to override the initial skew. It
> always forwards the expiration time on top of the last tick.

OK, I'll double check. Without my change the bug triggers almost
instantly with the described reproducer. With my change it didn't
trig for several minutes (but it does look wrong looking at it now).

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30 13:35                 ` Mike Galbraith
@ 2017-04-03 14:40                   ` Frederic Weisbecker
  2017-04-04  7:32                     ` Mike Galbraith
  0 siblings, 1 reply; 67+ messages in thread
From: Frederic Weisbecker @ 2017-04-03 14:40 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Rik van Riel, Luiz Capitulino, Wanpeng Li, linux-kernel

On Thu, Mar 30, 2017 at 03:35:22PM +0200, Mike Galbraith wrote:
> On Thu, 2017-03-30 at 09:02 -0400, Rik van Riel wrote:
> > On Thu, 2017-03-30 at 14:51 +0200, Frederic Weisbecker wrote:
> 
> > > Also, why does it raise power consumption issues?
> > 
> > On a system without either nohz_full or nohz idle
> > mode, skewed ticks result in CPU cores waking up
> > at different times, and keeping an idle system
> > consuming power for more time than it would if all
> > the ticks happened simultaneously.
> 
> And if your server farm is mostly idle, that power savings may delay
> your bankruptcy proceedings by a whole microsecond ;-)
> 
> Or more seriously, what skew does do on boxen of size X today is
> something for perf to say.  At the time, removal was very bad for my 8
> socket box, and allegedly caused huge SGI beasts in horrific pain.

I see.
Nohz_full is already bad for powersavings anyway. CPU 0 always ticks :-)

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-04-01  3:11                           ` Luiz Capitulino
@ 2017-04-03 15:23                             ` Frederic Weisbecker
  2017-04-03 19:06                               ` Luiz Capitulino
  0 siblings, 1 reply; 67+ messages in thread
From: Frederic Weisbecker @ 2017-04-03 15:23 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: Wanpeng Li, Mike Galbraith, Rik van Riel, linux-kernel,
	Peter Zijlstra, Thomas Gleixner

On Fri, Mar 31, 2017 at 11:11:19PM -0400, Luiz Capitulino wrote:
> On Sat, 1 Apr 2017 01:24:54 +0200
> Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > On Fri, Mar 31, 2017 at 04:09:10PM -0400, Luiz Capitulino wrote:
> > > On Thu, 30 Mar 2017 17:25:46 -0400
> > > Luiz Capitulino <lcapitulino@redhat.com> wrote:
> > >   
> > > > On Thu, 30 Mar 2017 16:18:17 +0200
> > > > Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > >   
> > > > > On Thu, Mar 30, 2017 at 09:59:54PM +0800, Wanpeng Li wrote:    
> > > > > > 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker <fweisbec@gmail.com>:      
> > > > > > > If it works, we may want to take that solution, likely less performance sensitive
> > > > > > > than using sched_clock(). In fact sched_clock() is fast, especially as we require it to
> > > > > > > be stable for nohz_full, but using it involves costly conversion back and forth to jiffies.      
> > > > > > 
> > > > > > So both Rik and you agree with the skew tick solution, I will try it
> > > > > > tomorrow. Btw, if we should just add random offset to the cpu in the
> > > > > > nohz_full mode or add random offset to all cpus like the codes above?      
> > > > > 
> > > > > Lets just keep it to all CPUs for simplicty.
> > > > > Also please add a comment that explains why we need that skew_tick on nohz_full.    
> > > > 
> > > > I've tried all the test-cases we discussed in this thread with skew_tick=1
> > > > and it worked as expected in bare-metal and KVM guests.
> > > > 
> > > > However, I found a test-case that works in bare-metal but show problems
> > > > in KVM guests. It could something that's KVM specific, or it could be
> > > > something that's harder to reproduce in bare-metal.  
> > > 
> > > After discussing some findings on this issue with Rik, I realized that
> > > we don't add the skew when restarting the tick in tick_nohz_restart().
> > > Adding the offset there seems to solve this problem.  
> > 
> > Are you sure? tick_nohz_restart() doesn't seem to override the initial skew. It
> > always forwards the expiration time on top of the last tick.
> 
> OK, I'll double check. Without my change the bug triggers almost
> instantly with the described reproducer. With my change it didn't
> trig for several minutes (but it does look wrong looking at it now).

Do you observe aligned ticks with trace events (hrtimer_expire_entry)?

You might want to enforce the global clock to trace that:

    echo "global" > /sys/kernel/debug/tracing/trace_clock

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-04-03 15:23                             ` Frederic Weisbecker
@ 2017-04-03 19:06                               ` Luiz Capitulino
  2017-04-04 17:36                                 ` Luiz Capitulino
  0 siblings, 1 reply; 67+ messages in thread
From: Luiz Capitulino @ 2017-04-03 19:06 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Wanpeng Li, Mike Galbraith, Rik van Riel, linux-kernel,
	Peter Zijlstra, Thomas Gleixner

On Mon, 3 Apr 2017 17:23:17 +0200
Frederic Weisbecker <fweisbec@gmail.com> wrote:

> Do you observe aligned ticks with trace events (hrtimer_expire_entry)?
> 
> You might want to enforce the global clock to trace that:
> 
>     echo "global" > /sys/kernel/debug/tracing/trace_clock

I've used the same trace points & debugging code I've been using to debug
this issue, and this what I'm seeing:

    stress-25757 [002]  2742.717507: function:             enter_from_user_mode <-- apic_timer_interrupt
    stress-25757 [002]  2742.717508: function:             __context_tracking_exit <-- enter_from_user_mode
    stress-25757 [002]  2742.717508: bprint:               vtime_delta: diff=0 (now=4297409970 vtime_snap=4297409970)
    stress-25757 [002]  2742.717509: function:             smp_apic_timer_interrupt <-- apic_timer_interrupt
    stress-25757 [002]  2742.717509: function:             irq_enter <-- smp_apic_timer_interrupt
    stress-25757 [002]  2742.717510: hrtimer_expire_entry: hrtimer=0xffffc900039fbe58 function=hrtimer_wakeup now=2742674000776
    stress-25757 [002]  2742.717514: function:             irq_exit <-- smp_apic_timer_interrupt
cyclictest-25760 [002]  2742.717518: function:             vtime_account_system <-- vtime_common_task_switch
cyclictest-25760 [002]  2742.717518: bprint:               vtime_delta: diff=1000000 (now=4297409971 vtime_snap=4297409970)
cyclictest-25760 [002]  2742.717519: function:             __vtime_account_system <-- vtime_account_system
cyclictest-25760 [002]  2742.717519: bprint:               get_vtime_delta: vtime_snap=4297409970 now=4297409971
cyclictest-25760 [002]  2742.717520: function:             account_system_time <-- __vtime_account_system
cyclictest-25760 [002]  2742.717520: bprint:               account_system_time: cputime=961981
cyclictest-25760 [002]  2742.717521: function:             __context_tracking_enter <-- do_syscall_64
cyclictest-25760 [002]  2742.717522: function:             vtime_user_enter <-- __context_tracking_enter
cyclictest-25760 [002]  2742.717522: bprint:               vtime_delta: diff=0 (now=4297409971 vtime_snap=4297409971)

CPU2 shows 98% system time while the other CPUs (from CPU3 to CPU7)
show 98% user time (they're all running the same workload).

What's happening here is:

1. Timer interrupt
2. Transition from user-space to kernel-space, vtimer_delta()
   returns zero
3. Context switch from hog application to cyclictest
4. This time vtime_delta() returns != zero, which implies
   jiffies was updated between steps 2 and 3

This seems to be the pattern that accounts incorrectly,
and seem to suggest that the ticks are aligned because
this repeats over and over.

Please, let me know if you want me to run a different
trace-cmd command-line.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-04-03 14:40                   ` Frederic Weisbecker
@ 2017-04-04  7:32                     ` Mike Galbraith
  0 siblings, 0 replies; 67+ messages in thread
From: Mike Galbraith @ 2017-04-04  7:32 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Rik van Riel, Luiz Capitulino, Wanpeng Li, linux-kernel

On Mon, 2017-04-03 at 16:40 +0200, Frederic Weisbecker wrote:
> On Thu, Mar 30, 2017 at 03:35:22PM +0200, Mike Galbraith wrote:

> Nohz_full is already bad for powersavings anyway. CPU 0 always ticks :-)

OTOH, if a nohz_full set is doing what it was born to do, CPU0 tick
spikes won't be noticeable on your (pegged/glowing) watt meter :)

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-04-03 19:06                               ` Luiz Capitulino
@ 2017-04-04 17:36                                 ` Luiz Capitulino
  2017-04-05 14:26                                   ` Rik van Riel
  0 siblings, 1 reply; 67+ messages in thread
From: Luiz Capitulino @ 2017-04-04 17:36 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Wanpeng Li, Mike Galbraith, Rik van Riel, linux-kernel,
	Peter Zijlstra, Thomas Gleixner

On Mon, 3 Apr 2017 15:06:13 -0400
Luiz Capitulino <lcapitulino@redhat.com> wrote:

> On Mon, 3 Apr 2017 17:23:17 +0200
> Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > Do you observe aligned ticks with trace events (hrtimer_expire_entry)?
> > 
> > You might want to enforce the global clock to trace that:
> > 
> >     echo "global" > /sys/kernel/debug/tracing/trace_clock  
> 
> I've used the same trace points & debugging code I've been using to debug
> this issue, and this what I'm seeing:
> 
>     stress-25757 [002]  2742.717507: function:             enter_from_user_mode <-- apic_timer_interrupt
>     stress-25757 [002]  2742.717508: function:             __context_tracking_exit <-- enter_from_user_mode
>     stress-25757 [002]  2742.717508: bprint:               vtime_delta: diff=0 (now=4297409970 vtime_snap=4297409970)
>     stress-25757 [002]  2742.717509: function:             smp_apic_timer_interrupt <-- apic_timer_interrupt
>     stress-25757 [002]  2742.717509: function:             irq_enter <-- smp_apic_timer_interrupt
>     stress-25757 [002]  2742.717510: hrtimer_expire_entry: hrtimer=0xffffc900039fbe58 function=hrtimer_wakeup now=2742674000776
>     stress-25757 [002]  2742.717514: function:             irq_exit <-- smp_apic_timer_interrupt
> cyclictest-25760 [002]  2742.717518: function:             vtime_account_system <-- vtime_common_task_switch
> cyclictest-25760 [002]  2742.717518: bprint:               vtime_delta: diff=1000000 (now=4297409971 vtime_snap=4297409970)
> cyclictest-25760 [002]  2742.717519: function:             __vtime_account_system <-- vtime_account_system
> cyclictest-25760 [002]  2742.717519: bprint:               get_vtime_delta: vtime_snap=4297409970 now=4297409971
> cyclictest-25760 [002]  2742.717520: function:             account_system_time <-- __vtime_account_system
> cyclictest-25760 [002]  2742.717520: bprint:               account_system_time: cputime=961981
> cyclictest-25760 [002]  2742.717521: function:             __context_tracking_enter <-- do_syscall_64
> cyclictest-25760 [002]  2742.717522: function:             vtime_user_enter <-- __context_tracking_enter
> cyclictest-25760 [002]  2742.717522: bprint:               vtime_delta: diff=0 (now=4297409971 vtime_snap=4297409971)
> 
> CPU2 shows 98% system time while the other CPUs (from CPU3 to CPU7)
> show 98% user time (they're all running the same workload).

On further debugging this, I realized that I had overlooked something:
the timer interrupt in this trace is not the tick, but cyclictest's timer
(remember that the test-case consists of pinning cyclictest and a task
hogging the CPU to the same CPU).

I'm running cyclictest with -i 200. If I increase this to -i 1000, then
I seem unable to reproduce the issue (caution: even with -i 200 it
doesn't always happen. But it does usually happen after I restart the
test-case a few times. However, I've never been able to reproduce
with -i 1000).

Now, if it's really cyclictest that's causing the timer interrupts to
get aligned, I guess this might not have a solution? (note: I haven't
been able to reproduce this on bare-metal).

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-04-04 17:36                                 ` Luiz Capitulino
@ 2017-04-05 14:26                                   ` Rik van Riel
  0 siblings, 0 replies; 67+ messages in thread
From: Rik van Riel @ 2017-04-05 14:26 UTC (permalink / raw)
  To: Luiz Capitulino, Frederic Weisbecker
  Cc: Wanpeng Li, Mike Galbraith, linux-kernel, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: Type: text/plain, Size: 1799 bytes --]

On Tue, 2017-04-04 at 13:36 -0400, Luiz Capitulino wrote:
> 
> On further debugging this, I realized that I had overlooked
> something:
> the timer interrupt in this trace is not the tick, but cyclictest's
> timer
> (remember that the test-case consists of pinning cyclictest and a
> task
> hogging the CPU to the same CPU).
> 
> I'm running cyclictest with -i 200. If I increase this to -i 1000,
> then
> I seem unable to reproduce the issue (caution: even with -i 200 it
> doesn't always happen. But it does usually happen after I restart the
> test-case a few times. However, I've never been able to reproduce
> with -i 1000).
> 
> Now, if it's really cyclictest that's causing the timer interrupts to
> get aligned, I guess this might not have a solution? (note: I haven't
> been able to reproduce this on bare-metal).

With any sample (tick) based timekeeping, it is possible
to construct workloads that avoid the sampling and result
in skewed statistics as a result.

However, given that local users can already DoS the system
in all kinds of ways, skewed statistics are probably not
that high up on the list of importance.

If there were a way to do accurate accounting (true vtime
accounting) without increasing the overhead of every
syscall and interrupt noticeably, that might be worth it,
but syscall overhead is likely to be a more important
factor than the accuracy of statistics.

I don't know if doing TSC reads and subtraction/addition
only, and delaying the conversion to cputime until a later
point would slow down system calls measurably, compared
with reading jiffies and comparing it against a cached
value of jiffies, nor do I know whether spending time
implementing and testing that would be worthwhile :)

-- 
All rights reversed

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30 13:38               ` Frederic Weisbecker
  2017-03-30 13:59                 ` Wanpeng Li
@ 2017-04-11 11:03                 ` Wanpeng Li
  2017-04-11 11:36                   ` Peter Zijlstra
  1 sibling, 1 reply; 67+ messages in thread
From: Wanpeng Li @ 2017-04-11 11:03 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Mike Galbraith, Rik van Riel, Luiz Capitulino, linux-kernel,
	Peter Zijlstra, Thomas Gleixner

2017-03-30 21:38 GMT+08:00 Frederic Weisbecker <fweisbec@gmail.com>:
> On Thu, Mar 30, 2017 at 02:47:11PM +0800, Wanpeng Li wrote:

[...]

>
>>
>> -------------------------------------->8-----------------------------------------------------
>>
>> use nanosecond granularity to check deltas but only perform an actual
>> cputime update when that delta >= TICK_NSEC.
>>
>> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
>> index f3778e2b..f1ee393 100644
>> --- a/kernel/sched/cputime.c
>> +++ b/kernel/sched/cputime.c
>> @@ -676,18 +676,21 @@ void thread_group_cputime_adjusted(struct
>> task_struct *p, u64 *ut, u64 *st)
>>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
>>  static u64 vtime_delta(struct task_struct *tsk)
>>  {
>> -    unsigned long now = READ_ONCE(jiffies);
>> +    u64 now = local_clock();
>
> I fear we need a global clock, because the reader (task_cputime()) needs
> to compute the delta and therefore use the same clock from any CPU.
>
> Or we can use the local_clock() but the reader must access the same.
>
> So there would be vtime_delta_writer() which uses local_clock and stores
> the current CPU to tsk->vtime_cpu (under the vtime_seqcount). And then
> vtime_delta_reader() which calls sched_clock_cpu(tsk->vtime_cpu) which
> is protected by vtime_seqcount as well.
>
> Although those sched_clock_cpu() things seem to only matter when the
> sched_clock() is unstable. And that stability is a condition for nohz_full
> to work anyway. So probably sched_clock() alone would be enough.

I observed ~60% user time and ~40% sys time when replace local_clock()
above by sched_clock()(two cpu hogs on the cpu in nohz_full mode). In
addition, Luiz's testcast ./acct-bug 1 995 will show 100% idle time.
If keep local_clock() in vtime_delta(), cpu hogs testcase will
success. However, Luiz's testcase still show 100% idle time.

Regards,
Wanpeng Li

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-04-11 11:03                 ` Wanpeng Li
@ 2017-04-11 11:36                   ` Peter Zijlstra
  2017-04-11 11:43                     ` Wanpeng Li
  0 siblings, 1 reply; 67+ messages in thread
From: Peter Zijlstra @ 2017-04-11 11:36 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Frederic Weisbecker, Mike Galbraith, Rik van Riel,
	Luiz Capitulino, linux-kernel, Thomas Gleixner

On Tue, Apr 11, 2017 at 07:03:17PM +0800, Wanpeng Li wrote:
> 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker <fweisbec@gmail.com>:
> > On Thu, Mar 30, 2017 at 02:47:11PM +0800, Wanpeng Li wrote:
> 
> [...]
> 
> >
> >>
> >> -------------------------------------->8-----------------------------------------------------
> >>
> >> use nanosecond granularity to check deltas but only perform an actual
> >> cputime update when that delta >= TICK_NSEC.
> >>
> >> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> >> index f3778e2b..f1ee393 100644
> >> --- a/kernel/sched/cputime.c
> >> +++ b/kernel/sched/cputime.c
> >> @@ -676,18 +676,21 @@ void thread_group_cputime_adjusted(struct
> >> task_struct *p, u64 *ut, u64 *st)
> >>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
> >>  static u64 vtime_delta(struct task_struct *tsk)
> >>  {
> >> -    unsigned long now = READ_ONCE(jiffies);
> >> +    u64 now = local_clock();
> >
> > I fear we need a global clock, because the reader (task_cputime()) needs
> > to compute the delta and therefore use the same clock from any CPU.
> >
> > Or we can use the local_clock() but the reader must access the same.
> >
> > So there would be vtime_delta_writer() which uses local_clock and stores
> > the current CPU to tsk->vtime_cpu (under the vtime_seqcount). And then
> > vtime_delta_reader() which calls sched_clock_cpu(tsk->vtime_cpu) which
> > is protected by vtime_seqcount as well.
> >
> > Although those sched_clock_cpu() things seem to only matter when the
> > sched_clock() is unstable. And that stability is a condition for nohz_full
> > to work anyway. So probably sched_clock() alone would be enough.
> 
> I observed ~60% user time and ~40% sys time when replace local_clock()
> above by sched_clock()(two cpu hogs on the cpu in nohz_full mode). In
> addition, Luiz's testcast ./acct-bug 1 995 will show 100% idle time.
> If keep local_clock() in vtime_delta(), cpu hogs testcase will
> success. However, Luiz's testcase still show 100% idle time.

Assuming a stable TSC, there should be no difference between
local_clock() and sched_clock().

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-04-11 11:36                   ` Peter Zijlstra
@ 2017-04-11 11:43                     ` Wanpeng Li
  0 siblings, 0 replies; 67+ messages in thread
From: Wanpeng Li @ 2017-04-11 11:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Mike Galbraith, Rik van Riel,
	Luiz Capitulino, linux-kernel, Thomas Gleixner

2017-04-11 19:36 GMT+08:00 Peter Zijlstra <peterz@infradead.org>:
> On Tue, Apr 11, 2017 at 07:03:17PM +0800, Wanpeng Li wrote:
>> 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker <fweisbec@gmail.com>:
>> > On Thu, Mar 30, 2017 at 02:47:11PM +0800, Wanpeng Li wrote:
>>
>> [...]
>>
>> >
>> >>
>> >> -------------------------------------->8-----------------------------------------------------
>> >>
>> >> use nanosecond granularity to check deltas but only perform an actual
>> >> cputime update when that delta >= TICK_NSEC.
>> >>
>> >> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
>> >> index f3778e2b..f1ee393 100644
>> >> --- a/kernel/sched/cputime.c
>> >> +++ b/kernel/sched/cputime.c
>> >> @@ -676,18 +676,21 @@ void thread_group_cputime_adjusted(struct
>> >> task_struct *p, u64 *ut, u64 *st)
>> >>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
>> >>  static u64 vtime_delta(struct task_struct *tsk)
>> >>  {
>> >> -    unsigned long now = READ_ONCE(jiffies);
>> >> +    u64 now = local_clock();
>> >
>> > I fear we need a global clock, because the reader (task_cputime()) needs
>> > to compute the delta and therefore use the same clock from any CPU.
>> >
>> > Or we can use the local_clock() but the reader must access the same.
>> >
>> > So there would be vtime_delta_writer() which uses local_clock and stores
>> > the current CPU to tsk->vtime_cpu (under the vtime_seqcount). And then
>> > vtime_delta_reader() which calls sched_clock_cpu(tsk->vtime_cpu) which
>> > is protected by vtime_seqcount as well.
>> >
>> > Although those sched_clock_cpu() things seem to only matter when the
>> > sched_clock() is unstable. And that stability is a condition for nohz_full
>> > to work anyway. So probably sched_clock() alone would be enough.
>>
>> I observed ~60% user time and ~40% sys time when replace local_clock()
>> above by sched_clock()(two cpu hogs on the cpu in nohz_full mode). In
>> addition, Luiz's testcast ./acct-bug 1 995 will show 100% idle time.
>> If keep local_clock() in vtime_delta(), cpu hogs testcase will
>> success. However, Luiz's testcase still show 100% idle time.
>
> Assuming a stable TSC, there should be no difference between
> local_clock() and sched_clock().

So it is weird. I did't see any unstable tsc dump in dmesg.

Regards,
Wanpeng Li

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-03-30  6:47             ` Wanpeng Li
  2017-03-30 11:52               ` Wanpeng Li
  2017-03-30 13:38               ` Frederic Weisbecker
@ 2017-04-11 14:22               ` Thomas Gleixner
  2017-04-12 13:18                 ` Frederic Weisbecker
  2 siblings, 1 reply; 67+ messages in thread
From: Thomas Gleixner @ 2017-04-11 14:22 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Mike Galbraith, Rik van Riel, Luiz Capitulino,
	Frederic Weisbecker, linux-kernel, Peter Zijlstra

On Thu, 30 Mar 2017, Wanpeng Li wrote:
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index f3778e2b..f1ee393 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -676,18 +676,21 @@ void thread_group_cputime_adjusted(struct
> task_struct *p, u64 *ut, u64 *st)
>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
>  static u64 vtime_delta(struct task_struct *tsk)
>  {
> -    unsigned long now = READ_ONCE(jiffies);
> +    u64 now = local_clock();
> +    u64 delta;
> +
> +    delta = now - tsk->vtime_snap;
> 
> -    if (time_before(now, (unsigned long)tsk->vtime_snap))
> +    if (delta < TICK_NSEC)
>          return 0;
> 
> -    return jiffies_to_nsecs(now - tsk->vtime_snap);
> +    return jiffies_to_nsecs(delta / TICK_NSEC);

So you replaced a jiffies based approach with a jiffies based approach.

>  }
> 
>  static u64 get_vtime_delta(struct task_struct *tsk)
>  {
> -    unsigned long now = READ_ONCE(jiffies);
> -    u64 delta, other;
> +    u64 delta = vtime_delta(tsk);
> +    u64 other;
> 
>      /*
>       * Unlike tick based timing, vtime based timing never has lost
> @@ -696,10 +699,9 @@ static u64 get_vtime_delta(struct task_struct *tsk)
>       * elapsed time. Limit account_other_time to prevent rounding
>       * errors from causing elapsed vtime to go negative.
>       */
> -    delta = jiffies_to_nsecs(now - tsk->vtime_snap);
>      other = account_other_time(delta);
>      WARN_ON_ONCE(tsk->vtime_snap_whence == VTIME_INACTIVE);
> -    tsk->vtime_snap = now;
> +    tsk->vtime_snap += delta;

Here is how it works^Wfails

	       For simplicity tsk->vtime_snap starts at 0
	       HZ = 1000

CPU0 	       CPU1
	       sysexit()
		 account_system()
		   now == 0
		   delta = vtime_delta()	<- 0ns
	           tsk->vtime_snap += delta;	== 0ns

	       busy_loop(995us)

	       sysenter()
		 now == 996us
		 account_user()
		   delta = vtime_delta()	<- 0ns
		   tsk->vtime_snap += delta	== 0ns
		   
	       sysexit()
		 account_system()
		   now == 1001us
		   delta = vtime_delta()	<- 10000000ns

		   ^^^^ Gets accounted to system

	           tsk->vtime_snap += delta;	== 10000000ns

It's not different from the current jiffies based stuff at all. Same
failure mode.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-04-11 14:22               ` Thomas Gleixner
@ 2017-04-12 13:18                 ` Frederic Weisbecker
  2017-04-12 14:57                   ` Thomas Gleixner
  0 siblings, 1 reply; 67+ messages in thread
From: Frederic Weisbecker @ 2017-04-12 13:18 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Wanpeng Li, Mike Galbraith, Rik van Riel, Luiz Capitulino,
	linux-kernel, Peter Zijlstra

On Tue, Apr 11, 2017 at 04:22:48PM +0200, Thomas Gleixner wrote:
> On Thu, 30 Mar 2017, Wanpeng Li wrote:
> > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> > index f3778e2b..f1ee393 100644
> > --- a/kernel/sched/cputime.c
> > +++ b/kernel/sched/cputime.c
> > @@ -676,18 +676,21 @@ void thread_group_cputime_adjusted(struct
> > task_struct *p, u64 *ut, u64 *st)
> >  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
> >  static u64 vtime_delta(struct task_struct *tsk)
> >  {
> > -    unsigned long now = READ_ONCE(jiffies);
> > +    u64 now = local_clock();
> > +    u64 delta;
> > +
> > +    delta = now - tsk->vtime_snap;
> > 
> > -    if (time_before(now, (unsigned long)tsk->vtime_snap))
> > +    if (delta < TICK_NSEC)
> >          return 0;
> > 
> > -    return jiffies_to_nsecs(now - tsk->vtime_snap);
> > +    return jiffies_to_nsecs(delta / TICK_NSEC);
> 
> So you replaced a jiffies based approach with a jiffies based approach.
> 
> >  }
> > 
> >  static u64 get_vtime_delta(struct task_struct *tsk)
> >  {
> > -    unsigned long now = READ_ONCE(jiffies);
> > -    u64 delta, other;
> > +    u64 delta = vtime_delta(tsk);
> > +    u64 other;
> > 
> >      /*
> >       * Unlike tick based timing, vtime based timing never has lost
> > @@ -696,10 +699,9 @@ static u64 get_vtime_delta(struct task_struct *tsk)
> >       * elapsed time. Limit account_other_time to prevent rounding
> >       * errors from causing elapsed vtime to go negative.
> >       */
> > -    delta = jiffies_to_nsecs(now - tsk->vtime_snap);
> >      other = account_other_time(delta);
> >      WARN_ON_ONCE(tsk->vtime_snap_whence == VTIME_INACTIVE);
> > -    tsk->vtime_snap = now;
> > +    tsk->vtime_snap += delta;
> 
> Here is how it works^Wfails
> 
> 	       For simplicity tsk->vtime_snap starts at 0
> 	       HZ = 1000
> 
> CPU0 	       CPU1
> 	       sysexit()
> 		 account_system()
> 		   now == 0
> 		   delta = vtime_delta()	<- 0ns
> 	           tsk->vtime_snap += delta;	== 0ns
> 
> 	       busy_loop(995us)
> 
> 	       sysenter()
> 		 now == 996us
> 		 account_user()
> 		   delta = vtime_delta()	<- 0ns
> 		   tsk->vtime_snap += delta	== 0ns
> 		   
> 	       sysexit()
> 		 account_system()
> 		   now == 1001us
> 		   delta = vtime_delta()	<- 10000000ns
> 
> 		   ^^^^ Gets accounted to system
> 
> 	           tsk->vtime_snap += delta;	== 10000000ns
> 
> It's not different from the current jiffies based stuff at all. Same
> failure mode.

Yes you're right, I got confused again. So to fix this we could do our snapshots
at a frequency lower than HZ but still high enough to avoid overhead.

Something like TICK_NSEC / 2 ?

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-04-12 13:18                 ` Frederic Weisbecker
@ 2017-04-12 14:57                   ` Thomas Gleixner
  2017-04-12 15:14                     ` Frederic Weisbecker
  2017-04-13  4:31                     ` Wanpeng Li
  0 siblings, 2 replies; 67+ messages in thread
From: Thomas Gleixner @ 2017-04-12 14:57 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Wanpeng Li, Mike Galbraith, Rik van Riel, Luiz Capitulino,
	linux-kernel, Peter Zijlstra

On Wed, 12 Apr 2017, Frederic Weisbecker wrote:
> On Tue, Apr 11, 2017 at 04:22:48PM +0200, Thomas Gleixner wrote:
> > It's not different from the current jiffies based stuff at all. Same
> > failure mode.
> 
> Yes you're right, I got confused again. So to fix this we could do our snapshots
> at a frequency lower than HZ but still high enough to avoid overhead.
> 
> Something like TICK_NSEC / 2 ?

If you are using TSC anyway then you can do proper accumulation for both
system and user and only account the data when the accumulation is more
than a jiffie.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-04-12 14:57                   ` Thomas Gleixner
@ 2017-04-12 15:14                     ` Frederic Weisbecker
  2017-04-13  4:31                     ` Wanpeng Li
  1 sibling, 0 replies; 67+ messages in thread
From: Frederic Weisbecker @ 2017-04-12 15:14 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Wanpeng Li, Mike Galbraith, Rik van Riel, Luiz Capitulino,
	linux-kernel, Peter Zijlstra

On Wed, Apr 12, 2017 at 04:57:58PM +0200, Thomas Gleixner wrote:
> On Wed, 12 Apr 2017, Frederic Weisbecker wrote:
> > On Tue, Apr 11, 2017 at 04:22:48PM +0200, Thomas Gleixner wrote:
> > > It's not different from the current jiffies based stuff at all. Same
> > > failure mode.
> > 
> > Yes you're right, I got confused again. So to fix this we could do our snapshots
> > at a frequency lower than HZ but still high enough to avoid overhead.
> > 
> > Something like TICK_NSEC / 2 ?
> 
> If you are using TSC anyway then you can do proper accumulation for both
> system and user and only account the data when the accumulation is more
> than a jiffie.

Sounds nice, and accumulation shouldn't introduce too much overhead. Let's try that.

Thanks.

> 
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-04-12 14:57                   ` Thomas Gleixner
  2017-04-12 15:14                     ` Frederic Weisbecker
@ 2017-04-13  4:31                     ` Wanpeng Li
  2017-04-13 13:32                       ` Frederic Weisbecker
  1 sibling, 1 reply; 67+ messages in thread
From: Wanpeng Li @ 2017-04-13  4:31 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Frederic Weisbecker, Mike Galbraith, Rik van Riel,
	Luiz Capitulino, linux-kernel, Peter Zijlstra

2017-04-12 22:57 GMT+08:00 Thomas Gleixner <tglx@linutronix.de>:
> On Wed, 12 Apr 2017, Frederic Weisbecker wrote:
>> On Tue, Apr 11, 2017 at 04:22:48PM +0200, Thomas Gleixner wrote:
>> > It's not different from the current jiffies based stuff at all. Same
>> > failure mode.
>>
>> Yes you're right, I got confused again. So to fix this we could do our snapshots
>> at a frequency lower than HZ but still high enough to avoid overhead.
>>
>> Something like TICK_NSEC / 2 ?
>
> If you are using TSC anyway then you can do proper accumulation for both
> system and user and only account the data when the accumulation is more
> than a jiffie.

So I implement it as below:

- HZ=1000.
  1) two cpu hogs on cpu in nohz_full mode, 100% user time
  2) Luzi's testcase, ~95% user time, ~5% idle time (as we expected)
- HZ=250
   1) two cpu hogs on cpu in nohz_full mode, 100% user time
   2) Luzi's testcase, 100% idle

So the codes below still not work correctly for HZ=250, any suggestions?

-------------------------------------->8-----------------------------------------------------

diff --git a/include/linux/sched.h b/include/linux/sched.h
index d67eee8..6a11771 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -668,6 +668,8 @@ struct task_struct {
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
     seqcount_t            vtime_seqcount;
     unsigned long long        vtime_snap;
+    u64                vtime_acct_utime;
+    u64                vtime_acct_stime;
     enum {
         /* Task is sleeping or running in a CPU with VTIME inactive: */
         VTIME_INACTIVE = 0,
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index f3778e2b..f8e54ba 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -674,20 +674,41 @@ void thread_group_cputime_adjusted(struct
task_struct *p, u64 *ut, u64 *st)
 #endif /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */

 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
-static u64 vtime_delta(struct task_struct *tsk)
+static u64 vtime_delta(struct task_struct *tsk, bool user)
 {
-    unsigned long now = READ_ONCE(jiffies);
+    u64 delta, ret = 0;

-    if (time_before(now, (unsigned long)tsk->vtime_snap))
-        return 0;
+    delta = sched_clock() - tsk->vtime_snap;

-    return jiffies_to_nsecs(now - tsk->vtime_snap);
+    if (is_idle_task(tsk)) {
+        if (delta >= TICK_NSEC)
+            ret = delta;
+    } else {
+        if (user) {
+            tsk->vtime_acct_utime += delta;
+            if (tsk->vtime_acct_utime >= TICK_NSEC)
+                ret = tsk->vtime_acct_utime;
+        } else {
+            tsk->vtime_acct_stime += delta;
+            if (tsk->vtime_acct_utime >= TICK_NSEC)
+                ret = tsk->vtime_acct_stime;
+        }
+    }
+
+    return ret;
 }

-static u64 get_vtime_delta(struct task_struct *tsk)
+static u64 get_vtime_delta(struct task_struct *tsk, bool user)
 {
-    unsigned long now = READ_ONCE(jiffies);
-    u64 delta, other;
+    u64 delta = vtime_delta(tsk, user);
+    u64 other;
+
+    if (!is_idle_task(tsk)) {
+        if (user)
+            tsk->vtime_acct_utime = 0;
+        else
+            tsk->vtime_acct_stime = 0;
+    }

     /*
      * Unlike tick based timing, vtime based timing never has lost
@@ -696,22 +717,21 @@ static u64 get_vtime_delta(struct task_struct *tsk)
      * elapsed time. Limit account_other_time to prevent rounding
      * errors from causing elapsed vtime to go negative.
      */
-    delta = jiffies_to_nsecs(now - tsk->vtime_snap);
     other = account_other_time(delta);
     WARN_ON_ONCE(tsk->vtime_snap_whence == VTIME_INACTIVE);
-    tsk->vtime_snap = now;
+    tsk->vtime_snap += delta;

     return delta - other;
 }

 static void __vtime_account_system(struct task_struct *tsk)
 {
-    account_system_time(tsk, irq_count(), get_vtime_delta(tsk));
+    account_system_time(tsk, irq_count(), get_vtime_delta(tsk, false));
 }

 void vtime_account_system(struct task_struct *tsk)
 {
-    if (!vtime_delta(tsk))
+    if (!vtime_delta(tsk, false))
         return;

     write_seqcount_begin(&tsk->vtime_seqcount);
@@ -723,15 +743,15 @@ void vtime_account_user(struct task_struct *tsk)
 {
     write_seqcount_begin(&tsk->vtime_seqcount);
     tsk->vtime_snap_whence = VTIME_SYS;
-    if (vtime_delta(tsk))
-        account_user_time(tsk, get_vtime_delta(tsk));
+    if (vtime_delta(tsk, true))
+        account_user_time(tsk, get_vtime_delta(tsk, true));
     write_seqcount_end(&tsk->vtime_seqcount);
 }

 void vtime_user_enter(struct task_struct *tsk)
 {
     write_seqcount_begin(&tsk->vtime_seqcount);
-    if (vtime_delta(tsk))
+    if (vtime_delta(tsk, false))
         __vtime_account_system(tsk);
     tsk->vtime_snap_whence = VTIME_USER;
     write_seqcount_end(&tsk->vtime_seqcount);
@@ -747,7 +767,7 @@ void vtime_guest_enter(struct task_struct *tsk)
      * that can thus safely catch up with a tickless delta.
      */
     write_seqcount_begin(&tsk->vtime_seqcount);
-    if (vtime_delta(tsk))
+    if (vtime_delta(tsk, false))
         __vtime_account_system(tsk);
     current->flags |= PF_VCPU;
     write_seqcount_end(&tsk->vtime_seqcount);
@@ -765,7 +785,7 @@ EXPORT_SYMBOL_GPL(vtime_guest_exit);

 void vtime_account_idle(struct task_struct *tsk)
 {
-    account_idle_time(get_vtime_delta(tsk));
+    account_idle_time(get_vtime_delta(tsk, false));
 }

 void arch_vtime_task_switch(struct task_struct *prev)
@@ -776,7 +796,7 @@ void arch_vtime_task_switch(struct task_struct *prev)

     write_seqcount_begin(&current->vtime_seqcount);
     current->vtime_snap_whence = VTIME_SYS;
-    current->vtime_snap = jiffies;
+    current->vtime_snap = sched_clock_cpu(smp_processor_id());
     write_seqcount_end(&current->vtime_seqcount);
 }

@@ -787,7 +807,7 @@ void vtime_init_idle(struct task_struct *t, int cpu)
     local_irq_save(flags);
     write_seqcount_begin(&t->vtime_seqcount);
     t->vtime_snap_whence = VTIME_SYS;
-    t->vtime_snap = jiffies;
+    t->vtime_snap = sched_clock_cpu(cpu);
     write_seqcount_end(&t->vtime_seqcount);
     local_irq_restore(flags);
 }
@@ -805,7 +825,7 @@ u64 task_gtime(struct task_struct *t)

         gtime = t->gtime;
         if (t->vtime_snap_whence == VTIME_SYS && t->flags & PF_VCPU)
-            gtime += vtime_delta(t);
+            gtime += vtime_delta(t, false);

     } while (read_seqcount_retry(&t->vtime_seqcount, seq));

@@ -819,7 +839,6 @@ u64 task_gtime(struct task_struct *t)
  */
 void task_cputime(struct task_struct *t, u64 *utime, u64 *stime)
 {
-    u64 delta;
     unsigned int seq;

     if (!vtime_accounting_enabled()) {
@@ -838,16 +857,14 @@ void task_cputime(struct task_struct *t, u64
*utime, u64 *stime)
         if (t->vtime_snap_whence == VTIME_INACTIVE || is_idle_task(t))
             continue;

-        delta = vtime_delta(t);
-
         /*
          * Task runs either in user or kernel space, add pending nohz time to
          * the right place.
          */
         if (t->vtime_snap_whence == VTIME_USER || t->flags & PF_VCPU)
-            *utime += delta;
+            *utime += vtime_delta(t, true);
         else if (t->vtime_snap_whence == VTIME_SYS)
-            *stime += delta;
+            *stime += vtime_delta(t, false);
     } while (read_seqcount_retry(&t->vtime_seqcount, seq));
 }
 #endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-04-13  4:31                     ` Wanpeng Li
@ 2017-04-13 13:32                       ` Frederic Weisbecker
  2017-05-02 10:01                         ` Wanpeng Li
  0 siblings, 1 reply; 67+ messages in thread
From: Frederic Weisbecker @ 2017-04-13 13:32 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Thomas Gleixner, Mike Galbraith, Rik van Riel, Luiz Capitulino,
	linux-kernel, Peter Zijlstra

On Thu, Apr 13, 2017 at 12:31:12PM +0800, Wanpeng Li wrote:
> 2017-04-12 22:57 GMT+08:00 Thomas Gleixner <tglx@linutronix.de>:
> > On Wed, 12 Apr 2017, Frederic Weisbecker wrote:
> >> On Tue, Apr 11, 2017 at 04:22:48PM +0200, Thomas Gleixner wrote:
> >> > It's not different from the current jiffies based stuff at all. Same
> >> > failure mode.
> >>
> >> Yes you're right, I got confused again. So to fix this we could do our snapshots
> >> at a frequency lower than HZ but still high enough to avoid overhead.
> >>
> >> Something like TICK_NSEC / 2 ?
> >
> > If you are using TSC anyway then you can do proper accumulation for both
> > system and user and only account the data when the accumulation is more
> > than a jiffie.
> 
> So I implement it as below:
> 
> - HZ=1000.
>   1) two cpu hogs on cpu in nohz_full mode, 100% user time
>   2) Luzi's testcase, ~95% user time, ~5% idle time (as we expected)
> - HZ=250
>    1) two cpu hogs on cpu in nohz_full mode, 100% user time
>    2) Luzi's testcase, 100% idle
> 
> So the codes below still not work correctly for HZ=250, any suggestions?

Right, so first lets reorder that code a bit so we can see clear inside :-)

> 
> -------------------------------------->8-----------------------------------------------------
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index d67eee8..6a11771 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -668,6 +668,8 @@ struct task_struct {
>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
>      seqcount_t            vtime_seqcount;
>      unsigned long long        vtime_snap;
> +    u64                vtime_acct_utime;
> +    u64                vtime_acct_stime;

You need to accumulate guest and steal time as well.

>      enum {
>          /* Task is sleeping or running in a CPU with VTIME inactive: */
>          VTIME_INACTIVE = 0,
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index f3778e2b..f8e54ba 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -674,20 +674,41 @@ void thread_group_cputime_adjusted(struct
> task_struct *p, u64 *ut, u64 *st)
>  #endif /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
> 
>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
> -static u64 vtime_delta(struct task_struct *tsk)
> +static u64 vtime_delta(struct task_struct *tsk, bool user)
>  {
> -    unsigned long now = READ_ONCE(jiffies);
> +    u64 delta, ret = 0;
> 
> -    if (time_before(now, (unsigned long)tsk->vtime_snap))
> -        return 0;
> +    delta = sched_clock() - tsk->vtime_snap;
> 
> -    return jiffies_to_nsecs(now - tsk->vtime_snap);
> +    if (is_idle_task(tsk)) {
> +        if (delta >= TICK_NSEC)
> +            ret = delta;
> +    } else {
> +        if (user) {
> +            tsk->vtime_acct_utime += delta;
> +            if (tsk->vtime_acct_utime >= TICK_NSEC)
> +                ret = tsk->vtime_acct_utime;
> +        } else {
> +            tsk->vtime_acct_stime += delta;
> +            if (tsk->vtime_acct_utime >= TICK_NSEC)
> +                ret = tsk->vtime_acct_stime;
> +        }

We already have vtime_account_idle, vtime_account_user, etc...
The accumulation should be done by these functions that know what and where
to account. vtime_delta() should really just return the difference against vtime_snap,
it's too low level to care about these details.

> +    }
> +
> +    return ret;
>  }
> 
> -static u64 get_vtime_delta(struct task_struct *tsk)
> +static u64 get_vtime_delta(struct task_struct *tsk, bool user)
>  {
> -    unsigned long now = READ_ONCE(jiffies);
> -    u64 delta, other;
> +    u64 delta = vtime_delta(tsk, user);
> +    u64 other;
> +
> +    if (!is_idle_task(tsk)) {
> +        if (user)
> +            tsk->vtime_acct_utime = 0;
> +        else
> +            tsk->vtime_acct_stime = 0;
> +    }

Like vtime_delta(), get_vtime_delta() shouldn't touch these accumulators.
Reset and accounting really should be done by the upper level functions
vtime_account_*()

Thanks.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-04-13 13:32                       ` Frederic Weisbecker
@ 2017-05-02 10:01                         ` Wanpeng Li
  2017-05-15  8:17                           ` Wanpeng Li
  0 siblings, 1 reply; 67+ messages in thread
From: Wanpeng Li @ 2017-05-02 10:01 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Thomas Gleixner, Mike Galbraith, Rik van Riel, Luiz Capitulino,
	linux-kernel, Peter Zijlstra, Paolo Bonzini

Cc Paolo,
2017-04-13 21:32 GMT+08:00 Frederic Weisbecker <fweisbec@gmail.com>:
> On Thu, Apr 13, 2017 at 12:31:12PM +0800, Wanpeng Li wrote:
>> 2017-04-12 22:57 GMT+08:00 Thomas Gleixner <tglx@linutronix.de>:
>> > On Wed, 12 Apr 2017, Frederic Weisbecker wrote:
>> >> On Tue, Apr 11, 2017 at 04:22:48PM +0200, Thomas Gleixner wrote:
>> >> > It's not different from the current jiffies based stuff at all. Same
>> >> > failure mode.
>> >>
>> >> Yes you're right, I got confused again. So to fix this we could do our snapshots
>> >> at a frequency lower than HZ but still high enough to avoid overhead.
>> >>
>> >> Something like TICK_NSEC / 2 ?
>> >
>> > If you are using TSC anyway then you can do proper accumulation for both
>> > system and user and only account the data when the accumulation is more
>> > than a jiffie.
>>
>> So I implement it as below:
>>
>> - HZ=1000.
>>   1) two cpu hogs on cpu in nohz_full mode, 100% user time
>>   2) Luzi's testcase, ~95% user time, ~5% idle time (as we expected)
>> - HZ=250
>>    1) two cpu hogs on cpu in nohz_full mode, 100% user time
>>    2) Luzi's testcase, 100% idle
>>
>> So the codes below still not work correctly for HZ=250, any suggestions?
>
> Right, so first lets reorder that code a bit so we can see clear inside :-)
>
>>
>> -------------------------------------->8-----------------------------------------------------
>>
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index d67eee8..6a11771 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -668,6 +668,8 @@ struct task_struct {
>>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
>>      seqcount_t            vtime_seqcount;
>>      unsigned long long        vtime_snap;
>> +    u64                vtime_acct_utime;
>> +    u64                vtime_acct_stime;
>
> You need to accumulate guest and steal time as well.
>

Hi Frederic,

Sorry for the delay since I'm too busy recently, I just add guest time
and idle time accumulations as below, the code work as we expected for
native kernel, however, the testcase fails when it runs in kvm guest.
Top shows ~99% sys for Luzi's testcase "./acct-bug 1 995" which we
expect 95% user  and %5 idle. In addition, what's the design idea of
steal time accumluation in your mind? Pass the tsk parameter in the
function get_vtime_delta() down to the function
steal_account_process_time()?

-------------------------------------->8-----------------------------------------------------

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4cf9a59..56815cd 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -672,6 +672,10 @@ struct task_struct {
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
     seqcount_t            vtime_seqcount;
     unsigned long long        vtime_snap;
+    u64                vtime_acct_utime;
+    u64                vtime_acct_stime;
+    u64                vtime_acct_idle_time;
+    u64                vtime_acct_guest_time;
     enum {
         /* Task is sleeping or running in a CPU with VTIME inactive: */
         VTIME_INACTIVE = 0,
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index f3778e2b..2d950c6 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -676,18 +676,19 @@ void thread_group_cputime_adjusted(struct
task_struct *p, u64 *ut, u64 *st)
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
 static u64 vtime_delta(struct task_struct *tsk)
 {
-    unsigned long now = READ_ONCE(jiffies);
+    unsigned long long clock;

-    if (time_before(now, (unsigned long)tsk->vtime_snap))
+    clock = sched_clock();
+    if (clock < tsk->vtime_snap)
         return 0;

-    return jiffies_to_nsecs(now - tsk->vtime_snap);
+    return clock - tsk->vtime_snap;
 }

 static u64 get_vtime_delta(struct task_struct *tsk)
 {
-    unsigned long now = READ_ONCE(jiffies);
-    u64 delta, other;
+    u64 delta = vtime_delta(tsk);
+    u64 other;

     /*
      * Unlike tick based timing, vtime based timing never has lost
@@ -696,17 +697,16 @@ static u64 get_vtime_delta(struct task_struct *tsk)
      * elapsed time. Limit account_other_time to prevent rounding
      * errors from causing elapsed vtime to go negative.
      */
-    delta = jiffies_to_nsecs(now - tsk->vtime_snap);
     other = account_other_time(delta);
     WARN_ON_ONCE(tsk->vtime_snap_whence == VTIME_INACTIVE);
-    tsk->vtime_snap = now;
+    tsk->vtime_snap += delta;

     return delta - other;
 }

 static void __vtime_account_system(struct task_struct *tsk)
 {
-    account_system_time(tsk, irq_count(), get_vtime_delta(tsk));
+    account_system_time(tsk, irq_count(), tsk->vtime_acct_stime);
 }

 void vtime_account_system(struct task_struct *tsk)
@@ -715,7 +715,11 @@ void vtime_account_system(struct task_struct *tsk)
         return;

     write_seqcount_begin(&tsk->vtime_seqcount);
-    __vtime_account_system(tsk);
+    tsk->vtime_acct_stime += get_vtime_delta(tsk);
+    if (tsk->vtime_acct_stime >= TICK_NSEC) {
+        __vtime_account_system(tsk);
+        tsk->vtime_acct_stime = 0;
+    }
     write_seqcount_end(&tsk->vtime_seqcount);
 }

@@ -723,16 +727,22 @@ void vtime_account_user(struct task_struct *tsk)
 {
     write_seqcount_begin(&tsk->vtime_seqcount);
     tsk->vtime_snap_whence = VTIME_SYS;
-    if (vtime_delta(tsk))
-        account_user_time(tsk, get_vtime_delta(tsk));
+    tsk->vtime_acct_utime += get_vtime_delta(tsk);
+    if (tsk->vtime_acct_utime >= TICK_NSEC) {
+        account_user_time(tsk, tsk->vtime_acct_utime);
+        tsk->vtime_acct_utime = 0;
+    }
     write_seqcount_end(&tsk->vtime_seqcount);
 }

 void vtime_user_enter(struct task_struct *tsk)
 {
     write_seqcount_begin(&tsk->vtime_seqcount);
-    if (vtime_delta(tsk))
+    tsk->vtime_acct_stime += get_vtime_delta(tsk);
+    if (tsk->vtime_acct_stime >= TICK_NSEC) {
         __vtime_account_system(tsk);
+        tsk->vtime_acct_stime = 0;
+    }
     tsk->vtime_snap_whence = VTIME_USER;
     write_seqcount_end(&tsk->vtime_seqcount);
 }
@@ -747,8 +757,11 @@ void vtime_guest_enter(struct task_struct *tsk)
      * that can thus safely catch up with a tickless delta.
      */
     write_seqcount_begin(&tsk->vtime_seqcount);
-    if (vtime_delta(tsk))
+    tsk->vtime_acct_stime += get_vtime_delta(tsk);
+    if (tsk->vtime_acct_stime >= TICK_NSEC) {
         __vtime_account_system(tsk);
+        tsk->vtime_acct_stime = 0;
+    }
     current->flags |= PF_VCPU;
     write_seqcount_end(&tsk->vtime_seqcount);
 }
@@ -757,7 +770,11 @@ EXPORT_SYMBOL_GPL(vtime_guest_enter);
 void vtime_guest_exit(struct task_struct *tsk)
 {
     write_seqcount_begin(&tsk->vtime_seqcount);
-    __vtime_account_system(tsk);
+    tsk->vtime_acct_stime += get_vtime_delta(tsk);
+    if (tsk->vtime_acct_stime >= TICK_NSEC) {
+        __vtime_account_system(tsk);
+        tsk->vtime_acct_stime = 0;
+    }
     current->flags &= ~PF_VCPU;
     write_seqcount_end(&tsk->vtime_seqcount);
 }
@@ -765,7 +782,11 @@ EXPORT_SYMBOL_GPL(vtime_guest_exit);

 void vtime_account_idle(struct task_struct *tsk)
 {
-    account_idle_time(get_vtime_delta(tsk));
+    tsk->vtime_acct_idle_time += get_vtime_delta(tsk);
+    if (tsk->vtime_acct_idle_time >= TICK_NSEC) {
+        account_idle_time(tsk->vtime_acct_idle_time);
+        tsk->vtime_acct_idle_time = 0;
+    }
 }

 void arch_vtime_task_switch(struct task_struct *prev)
@@ -776,7 +797,7 @@ void arch_vtime_task_switch(struct task_struct *prev)

     write_seqcount_begin(&current->vtime_seqcount);
     current->vtime_snap_whence = VTIME_SYS;
-    current->vtime_snap = jiffies;
+    current->vtime_snap = sched_clock_cpu(smp_processor_id());
     write_seqcount_end(&current->vtime_seqcount);
 }

@@ -787,7 +808,7 @@ void vtime_init_idle(struct task_struct *t, int cpu)
     local_irq_save(flags);
     write_seqcount_begin(&t->vtime_seqcount);
     t->vtime_snap_whence = VTIME_SYS;
-    t->vtime_snap = jiffies;
+    t->vtime_snap = sched_clock_cpu(cpu);
     write_seqcount_end(&t->vtime_seqcount);
     local_irq_restore(flags);
 }

Regards,
Wanpeng Li

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-05-02 10:01                         ` Wanpeng Li
@ 2017-05-15  8:17                           ` Wanpeng Li
  2017-06-29 17:22                             ` Frederic Weisbecker
  0 siblings, 1 reply; 67+ messages in thread
From: Wanpeng Li @ 2017-05-15  8:17 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Thomas Gleixner, Mike Galbraith, Rik van Riel, Luiz Capitulino,
	linux-kernel, Peter Zijlstra, Paolo Bonzini

Ping,
2017-05-02 18:01 GMT+08:00 Wanpeng Li <kernellwp@gmail.com>:
> Cc Paolo,
> 2017-04-13 21:32 GMT+08:00 Frederic Weisbecker <fweisbec@gmail.com>:
>> On Thu, Apr 13, 2017 at 12:31:12PM +0800, Wanpeng Li wrote:
>>> 2017-04-12 22:57 GMT+08:00 Thomas Gleixner <tglx@linutronix.de>:
>>> > On Wed, 12 Apr 2017, Frederic Weisbecker wrote:
>>> >> On Tue, Apr 11, 2017 at 04:22:48PM +0200, Thomas Gleixner wrote:
>>> >> > It's not different from the current jiffies based stuff at all. Same
>>> >> > failure mode.
>>> >>
>>> >> Yes you're right, I got confused again. So to fix this we could do our snapshots
>>> >> at a frequency lower than HZ but still high enough to avoid overhead.
>>> >>
>>> >> Something like TICK_NSEC / 2 ?
>>> >
>>> > If you are using TSC anyway then you can do proper accumulation for both
>>> > system and user and only account the data when the accumulation is more
>>> > than a jiffie.
>>>
>>> So I implement it as below:
>>>
>>> - HZ=1000.
>>>   1) two cpu hogs on cpu in nohz_full mode, 100% user time
>>>   2) Luzi's testcase, ~95% user time, ~5% idle time (as we expected)
>>> - HZ=250
>>>    1) two cpu hogs on cpu in nohz_full mode, 100% user time
>>>    2) Luzi's testcase, 100% idle
>>>
>>> So the codes below still not work correctly for HZ=250, any suggestions?
>>
>> Right, so first lets reorder that code a bit so we can see clear inside :-)
>>
>>>
>>> -------------------------------------->8-----------------------------------------------------
>>>
>>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>>> index d67eee8..6a11771 100644
>>> --- a/include/linux/sched.h
>>> +++ b/include/linux/sched.h
>>> @@ -668,6 +668,8 @@ struct task_struct {
>>>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
>>>      seqcount_t            vtime_seqcount;
>>>      unsigned long long        vtime_snap;
>>> +    u64                vtime_acct_utime;
>>> +    u64                vtime_acct_stime;
>>
>> You need to accumulate guest and steal time as well.
>>
>
> Hi Frederic,
>
> Sorry for the delay since I'm too busy recently, I just add guest time
> and idle time accumulations as below, the code work as we expected for
> native kernel, however, the testcase fails when it runs in kvm guest.
> Top shows ~99% sys for Luzi's testcase "./acct-bug 1 995" which we
> expect 95% user  and %5 idle. In addition, what's the design idea of
> steal time accumluation in your mind? Pass the tsk parameter in the
> function get_vtime_delta() down to the function
> steal_account_process_time()?
>
> -------------------------------------->8-----------------------------------------------------
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 4cf9a59..56815cd 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -672,6 +672,10 @@ struct task_struct {
>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
>      seqcount_t            vtime_seqcount;
>      unsigned long long        vtime_snap;
> +    u64                vtime_acct_utime;
> +    u64                vtime_acct_stime;
> +    u64                vtime_acct_idle_time;
> +    u64                vtime_acct_guest_time;
>      enum {
>          /* Task is sleeping or running in a CPU with VTIME inactive: */
>          VTIME_INACTIVE = 0,
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index f3778e2b..2d950c6 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -676,18 +676,19 @@ void thread_group_cputime_adjusted(struct
> task_struct *p, u64 *ut, u64 *st)
>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
>  static u64 vtime_delta(struct task_struct *tsk)
>  {
> -    unsigned long now = READ_ONCE(jiffies);
> +    unsigned long long clock;
>
> -    if (time_before(now, (unsigned long)tsk->vtime_snap))
> +    clock = sched_clock();
> +    if (clock < tsk->vtime_snap)
>          return 0;
>
> -    return jiffies_to_nsecs(now - tsk->vtime_snap);
> +    return clock - tsk->vtime_snap;
>  }
>
>  static u64 get_vtime_delta(struct task_struct *tsk)
>  {
> -    unsigned long now = READ_ONCE(jiffies);
> -    u64 delta, other;
> +    u64 delta = vtime_delta(tsk);
> +    u64 other;
>
>      /*
>       * Unlike tick based timing, vtime based timing never has lost
> @@ -696,17 +697,16 @@ static u64 get_vtime_delta(struct task_struct *tsk)
>       * elapsed time. Limit account_other_time to prevent rounding
>       * errors from causing elapsed vtime to go negative.
>       */
> -    delta = jiffies_to_nsecs(now - tsk->vtime_snap);
>      other = account_other_time(delta);
>      WARN_ON_ONCE(tsk->vtime_snap_whence == VTIME_INACTIVE);
> -    tsk->vtime_snap = now;
> +    tsk->vtime_snap += delta;
>
>      return delta - other;
>  }
>
>  static void __vtime_account_system(struct task_struct *tsk)
>  {
> -    account_system_time(tsk, irq_count(), get_vtime_delta(tsk));
> +    account_system_time(tsk, irq_count(), tsk->vtime_acct_stime);
>  }
>
>  void vtime_account_system(struct task_struct *tsk)
> @@ -715,7 +715,11 @@ void vtime_account_system(struct task_struct *tsk)
>          return;
>
>      write_seqcount_begin(&tsk->vtime_seqcount);
> -    __vtime_account_system(tsk);
> +    tsk->vtime_acct_stime += get_vtime_delta(tsk);
> +    if (tsk->vtime_acct_stime >= TICK_NSEC) {
> +        __vtime_account_system(tsk);
> +        tsk->vtime_acct_stime = 0;
> +    }
>      write_seqcount_end(&tsk->vtime_seqcount);
>  }
>
> @@ -723,16 +727,22 @@ void vtime_account_user(struct task_struct *tsk)
>  {
>      write_seqcount_begin(&tsk->vtime_seqcount);
>      tsk->vtime_snap_whence = VTIME_SYS;
> -    if (vtime_delta(tsk))
> -        account_user_time(tsk, get_vtime_delta(tsk));
> +    tsk->vtime_acct_utime += get_vtime_delta(tsk);
> +    if (tsk->vtime_acct_utime >= TICK_NSEC) {
> +        account_user_time(tsk, tsk->vtime_acct_utime);
> +        tsk->vtime_acct_utime = 0;
> +    }
>      write_seqcount_end(&tsk->vtime_seqcount);
>  }
>
>  void vtime_user_enter(struct task_struct *tsk)
>  {
>      write_seqcount_begin(&tsk->vtime_seqcount);
> -    if (vtime_delta(tsk))
> +    tsk->vtime_acct_stime += get_vtime_delta(tsk);
> +    if (tsk->vtime_acct_stime >= TICK_NSEC) {
>          __vtime_account_system(tsk);
> +        tsk->vtime_acct_stime = 0;
> +    }
>      tsk->vtime_snap_whence = VTIME_USER;
>      write_seqcount_end(&tsk->vtime_seqcount);
>  }
> @@ -747,8 +757,11 @@ void vtime_guest_enter(struct task_struct *tsk)
>       * that can thus safely catch up with a tickless delta.
>       */
>      write_seqcount_begin(&tsk->vtime_seqcount);
> -    if (vtime_delta(tsk))
> +    tsk->vtime_acct_stime += get_vtime_delta(tsk);
> +    if (tsk->vtime_acct_stime >= TICK_NSEC) {
>          __vtime_account_system(tsk);
> +        tsk->vtime_acct_stime = 0;
> +    }
>      current->flags |= PF_VCPU;
>      write_seqcount_end(&tsk->vtime_seqcount);
>  }
> @@ -757,7 +770,11 @@ EXPORT_SYMBOL_GPL(vtime_guest_enter);
>  void vtime_guest_exit(struct task_struct *tsk)
>  {
>      write_seqcount_begin(&tsk->vtime_seqcount);
> -    __vtime_account_system(tsk);
> +    tsk->vtime_acct_stime += get_vtime_delta(tsk);
> +    if (tsk->vtime_acct_stime >= TICK_NSEC) {
> +        __vtime_account_system(tsk);
> +        tsk->vtime_acct_stime = 0;
> +    }
>      current->flags &= ~PF_VCPU;
>      write_seqcount_end(&tsk->vtime_seqcount);
>  }
> @@ -765,7 +782,11 @@ EXPORT_SYMBOL_GPL(vtime_guest_exit);
>
>  void vtime_account_idle(struct task_struct *tsk)
>  {
> -    account_idle_time(get_vtime_delta(tsk));
> +    tsk->vtime_acct_idle_time += get_vtime_delta(tsk);
> +    if (tsk->vtime_acct_idle_time >= TICK_NSEC) {
> +        account_idle_time(tsk->vtime_acct_idle_time);
> +        tsk->vtime_acct_idle_time = 0;
> +    }
>  }
>
>  void arch_vtime_task_switch(struct task_struct *prev)
> @@ -776,7 +797,7 @@ void arch_vtime_task_switch(struct task_struct *prev)
>
>      write_seqcount_begin(&current->vtime_seqcount);
>      current->vtime_snap_whence = VTIME_SYS;
> -    current->vtime_snap = jiffies;
> +    current->vtime_snap = sched_clock_cpu(smp_processor_id());
>      write_seqcount_end(&current->vtime_seqcount);
>  }
>
> @@ -787,7 +808,7 @@ void vtime_init_idle(struct task_struct *t, int cpu)
>      local_irq_save(flags);
>      write_seqcount_begin(&t->vtime_seqcount);
>      t->vtime_snap_whence = VTIME_SYS;
> -    t->vtime_snap = jiffies;
> +    t->vtime_snap = sched_clock_cpu(cpu);
>      write_seqcount_end(&t->vtime_seqcount);
>      local_irq_restore(flags);
>  }
>
> Regards,
> Wanpeng Li

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [BUG nohz]: wrong user and system time accounting
  2017-05-15  8:17                           ` Wanpeng Li
@ 2017-06-29 17:22                             ` Frederic Weisbecker
  0 siblings, 0 replies; 67+ messages in thread
From: Frederic Weisbecker @ 2017-06-29 17:22 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Thomas Gleixner, Mike Galbraith, Rik van Riel, Luiz Capitulino,
	linux-kernel, Peter Zijlstra, Paolo Bonzini

On Mon, May 15, 2017 at 04:17:10PM +0800, Wanpeng Li wrote:
> Ping,

Sorry for the late answer, I was focused on some other bugs.

So since my ideas weren't even clear on that issue yet, I took your
patch and enhanced the code around. I just posted a new series
with it, please have a look.

Thanks.

^ permalink raw reply	[flat|nested] 67+ messages in thread

end of thread, other threads:[~2017-06-29 17:22 UTC | newest]

Thread overview: 67+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-23 20:55 [BUG nohz]: wrong user and system time accounting Luiz Capitulino
2017-03-24  0:56 ` Rik van Riel
2017-03-24  1:05   ` Luiz Capitulino
2017-03-24  1:08     ` Rik van Riel
2017-03-24  1:39       ` Luiz Capitulino
2017-03-27  5:33   ` lkml
2017-03-24  1:52 ` Wanpeng Li
2017-03-24  3:56   ` Luiz Capitulino
2017-03-27  1:56 ` Wanpeng Li
2017-03-27 17:35   ` Rik van Riel
2017-03-28  7:19     ` Wanpeng Li
     [not found]     ` <20170328132406.7d23579c@redhat.com>
     [not found]       ` <20170328161454.4a5d9e8b@redhat.com>
2017-03-28 21:01         ` Rik van Riel
2017-03-28 21:26           ` Luiz Capitulino
2017-03-29  9:56             ` Wanpeng Li
2017-03-29 12:56               ` Frederic Weisbecker
2017-03-28 21:24         ` Rik van Riel
2017-03-28 21:30           ` Luiz Capitulino
     [not found]       ` <20170329131656.1d6cb743@redhat.com>
2017-03-29 20:08         ` Rik van Riel
2017-03-29 22:54           ` Frederic Weisbecker
2017-03-30 12:57             ` Rik van Riel
2017-03-30  1:58           ` Wanpeng Li
2017-03-30 12:40             ` Frederic Weisbecker
2017-03-30 13:19               ` Mike Galbraith
2017-03-30  4:27           ` Mike Galbraith
2017-03-30  6:47             ` Wanpeng Li
2017-03-30 11:52               ` Wanpeng Li
2017-03-30 12:33                 ` Mike Galbraith
2017-03-30 13:38               ` Frederic Weisbecker
2017-03-30 13:59                 ` Wanpeng Li
2017-03-30 14:18                   ` Frederic Weisbecker
2017-03-30 21:25                     ` Luiz Capitulino
2017-03-31 20:09                       ` Luiz Capitulino
2017-03-31 23:24                         ` Frederic Weisbecker
2017-04-01  3:11                           ` Luiz Capitulino
2017-04-03 15:23                             ` Frederic Weisbecker
2017-04-03 19:06                               ` Luiz Capitulino
2017-04-04 17:36                                 ` Luiz Capitulino
2017-04-05 14:26                                   ` Rik van Riel
2017-04-11 11:03                 ` Wanpeng Li
2017-04-11 11:36                   ` Peter Zijlstra
2017-04-11 11:43                     ` Wanpeng Li
2017-04-11 14:22               ` Thomas Gleixner
2017-04-12 13:18                 ` Frederic Weisbecker
2017-04-12 14:57                   ` Thomas Gleixner
2017-04-12 15:14                     ` Frederic Weisbecker
2017-04-13  4:31                     ` Wanpeng Li
2017-04-13 13:32                       ` Frederic Weisbecker
2017-05-02 10:01                         ` Wanpeng Li
2017-05-15  8:17                           ` Wanpeng Li
2017-06-29 17:22                             ` Frederic Weisbecker
2017-03-30 12:51             ` Frederic Weisbecker
2017-03-30 13:02               ` Rik van Riel
2017-03-30 13:35                 ` Mike Galbraith
2017-04-03 14:40                   ` Frederic Weisbecker
2017-04-04  7:32                     ` Mike Galbraith
2017-03-30 13:44                 ` Frederic Weisbecker
     [not found]         ` <20170329221700.GB23895@lerouge>
2017-03-29 22:46           ` Wanpeng Li
2017-03-30  2:14             ` Luiz Capitulino
2017-03-30 12:27               ` Wanpeng Li
2017-03-27 18:38   ` Luiz Capitulino
2017-03-28  5:28     ` Wanpeng Li
2017-03-28 13:44       ` Luiz Capitulino
2017-03-29 13:04 ` Frederic Weisbecker
2017-03-29 13:14   ` Rik van Riel
2017-03-29 13:23     ` Luiz Capitulino
2017-03-29 21:12       ` Frederic Weisbecker
2017-03-30  1:48         ` Luiz Capitulino

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.