linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Fwd: High cpu usage caused by kernel process when upgraded to linux 5.19.17 or later
@ 2023-06-23  0:58 Bagas Sanjaya
  2023-06-23  1:26 ` Steven Rostedt
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Bagas Sanjaya @ 2023-06-23  0:58 UTC (permalink / raw)
  To: Steven Rostedt (Google),
	Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Lingutla Chandrasekhar, Frederic Weisbecker, J. Avila,
	Vivek Anand, Rafael J. Wysocki, Thomas Renninger, Shuah Khan
  Cc: Linux Kernel Mailing List, Linux Regressions,
	Linux Netfilter Development, Netfilter Core Developers,
	Linux Networking, Linux Power Management

Hi,

I notice a regression report on Bugzilla [1]. Quoting from it:

> kernel process "kworker/events_power_efficient" uses a lot of cpu power (100% on ESXI 6.7, ~30% on ESXI 7.0U3 or later) after upgrading from 5.17.3 to 5.19.17 or later.
> 
> dmesg log:
> [ 2430.973102]  </TASK>
> [ 2430.973131] Sending NMI from CPU 1 to CPUs 0:
> [ 2430.973241] NMI backtrace for cpu 0
> [ 2430.973247] CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 6.3.3 #1
> [ 2430.973254] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
> [ 2430.973258] Workqueue: events_power_efficient htable_gc [xt_hashlimit]
> [ 2430.973275] RIP: 0010:preempt_count_sub+0x2e/0xa0
> [ 2430.973289] Code: 36 01 85 c9 75 1b 65 8b 15 a7 da f8 5e 89 d1 81 e1 ff ff ff 7f 39 f9 7c 16 81 ff fe 00 00 00 76 3b f7 df 65 01 3d 8a da f8 5e <c3> cc cc cc cc e8 98 aa 25 00 85 c0 74 f2 8b 15 da 71 ed 00 85 d2
> [ 2430.973294] RSP: 0018:ffffb15ec00dbe58 EFLAGS: 00000297
> [ 2430.973299] RAX: 0000000000000000 RBX: ffffb15ec12ad000 RCX: 0000000000000001
> [ 2430.973302] RDX: 0000000080000001 RSI: ffffffffa1c3313b RDI: 00000000ffffffff
> [ 2430.973306] RBP: dead000000000122 R08: 0000000000000010 R09: 0000746e65696369
> [ 2430.973309] R10: 8080808080808080 R11: 0000000000000018 R12: 0000000000000000
> [ 2430.973312] R13: 0000000000001e2b R14: ffffb15ec12ad048 R15: ffff91c279c26a05
> [ 2430.973316] FS:  0000000000000000(0000) GS:ffff91c279c00000(0000) knlGS:0000000000000000
> [ 2430.973320] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2430.973324] CR2: 000055fc138890e0 CR3: 000000010810e002 CR4: 00000000001706f0
> [ 2430.973374] Call Trace:
> [ 2430.973388]  <TASK>
> [ 2430.973390]  __local_bh_enable_ip+0x32/0x70
> [ 2430.973413]  htable_selective_cleanup+0x95/0xc0 [xt_hashlimit]
> [ 2430.973428]  htable_gc+0xf/0x30 [xt_hashlimit]
> [ 2430.973440]  process_one_work+0x1d4/0x360
> [ 2430.973459]  ? process_one_work+0x360/0x360
> [ 2430.973467]  worker_thread+0x25/0x3b0
> [ 2430.973476]  ? process_one_work+0x360/0x360
> [ 2430.973483]  kthread+0xe1/0x110
> [ 2430.973499]  ? kthread_complete_and_exit+0x20/0x20
> [ 2430.973507]  ret_from_fork+0x1f/0x30
> [ 2430.973526]  </TASK>

See Bugzilla for the full thread and perf output.

Anyway, I'm tracking it in regzbot so that it doesn't fall through
cracks unnoticed:

#regzbot introduced: v5.17.3..v5.19.17 https://bugzilla.kernel.org/show_bug.cgi?id=217586
#regzbot title: kworker/events_power_efficient utilizes full CPU power after kernel upgrade

Thanks.

[1]: https://bugzilla.kernel.org/show_bug.cgi?id=217586

-- 
An old man doll... just what I always wanted! - Clara

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: High cpu usage caused by kernel process when upgraded to linux 5.19.17 or later
  2023-06-23  0:58 Fwd: High cpu usage caused by kernel process when upgraded to linux 5.19.17 or later Bagas Sanjaya
@ 2023-06-23  1:26 ` Steven Rostedt
  2023-06-27  2:29 ` Fwd: " Bagas Sanjaya
  2023-10-20 10:55 ` Fwd: " Linux regression tracking #update (Thorsten Leemhuis)
  2 siblings, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2023-06-23  1:26 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Lingutla Chandrasekhar, Frederic Weisbecker, J. Avila,
	Vivek Anand, Rafael J. Wysocki, Thomas Renninger, Shuah Khan,
	Linux Kernel Mailing List, Linux Regressions,
	Linux Netfilter Development, Netfilter Core Developers,
	Linux Networking, Linux Power Management

On Fri, 23 Jun 2023 07:58:51 +0700
Bagas Sanjaya <bagasdotme@gmail.com> wrote:

> Hi,
> 
> I notice a regression report on Bugzilla [1]. Quoting from it:
> 
> > kernel process "kworker/events_power_efficient" uses a lot of cpu power (100% on ESXI 6.7, ~30% on ESXI 7.0U3 or later) after upgrading from 5.17.3 to 5.19.17 or later.
> > 
> > dmesg log:
> > [ 2430.973102]  </TASK>
> > [ 2430.973131] Sending NMI from CPU 1 to CPUs 0:
> > [ 2430.973241] NMI backtrace for cpu 0
> > [ 2430.973247] CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 6.3.3 #1
> > [ 2430.973254] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
> > [ 2430.973258] Workqueue: events_power_efficient htable_gc [xt_hashlimit]
> > [ 2430.973275] RIP: 0010:preempt_count_sub+0x2e/0xa0
> > [ 2430.973289] Code: 36 01 85 c9 75 1b 65 8b 15 a7 da f8 5e 89 d1 81 e1 ff ff ff 7f 39 f9 7c 16 81 ff fe 00 00 00 76 3b f7 df 65 01 3d 8a da f8 5e <c3> cc cc cc cc e8 98 aa 25 00 85 c0 74 f2 8b 15 da 71 ed 00 85 d2
> > [ 2430.973294] RSP: 0018:ffffb15ec00dbe58 EFLAGS: 00000297
> > [ 2430.973299] RAX: 0000000000000000 RBX: ffffb15ec12ad000 RCX: 0000000000000001
> > [ 2430.973302] RDX: 0000000080000001 RSI: ffffffffa1c3313b RDI: 00000000ffffffff
> > [ 2430.973306] RBP: dead000000000122 R08: 0000000000000010 R09: 0000746e65696369
> > [ 2430.973309] R10: 8080808080808080 R11: 0000000000000018 R12: 0000000000000000
> > [ 2430.973312] R13: 0000000000001e2b R14: ffffb15ec12ad048 R15: ffff91c279c26a05
> > [ 2430.973316] FS:  0000000000000000(0000) GS:ffff91c279c00000(0000) knlGS:0000000000000000
> > [ 2430.973320] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 2430.973324] CR2: 000055fc138890e0 CR3: 000000010810e002 CR4: 00000000001706f0
> > [ 2430.973374] Call Trace:
> > [ 2430.973388]  <TASK>
> > [ 2430.973390]  __local_bh_enable_ip+0x32/0x70
> > [ 2430.973413]  htable_selective_cleanup+0x95/0xc0 [xt_hashlimit]
> > [ 2430.973428]  htable_gc+0xf/0x30 [xt_hashlimit]

I take it that the "gc" in "htable_gc" means "garbage collection". It may
not have anything to do with changes to this thread. It could very well be
something is feeding it too much garbage!

-- Steve


> > [ 2430.973440]  process_one_work+0x1d4/0x360
> > [ 2430.973459]  ? process_one_work+0x360/0x360
> > [ 2430.973467]  worker_thread+0x25/0x3b0
> > [ 2430.973476]  ? process_one_work+0x360/0x360
> > [ 2430.973483]  kthread+0xe1/0x110
> > [ 2430.973499]  ? kthread_complete_and_exit+0x20/0x20
> > [ 2430.973507]  ret_from_fork+0x1f/0x30
> > [ 2430.973526]  </TASK>  
> 
> See Bugzilla for the full thread and perf output.
> 
> Anyway, I'm tracking it in regzbot so that it doesn't fall through
> cracks unnoticed:
> 
> #regzbot introduced: v5.17.3..v5.19.17 https://bugzilla.kernel.org/show_bug.cgi?id=217586
> #regzbot title: kworker/events_power_efficient utilizes full CPU power after kernel upgrade
> 
> Thanks.
> 
> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=217586
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Fwd: High cpu usage caused by kernel process when upgraded to linux 5.19.17 or later
  2023-06-23  0:58 Fwd: High cpu usage caused by kernel process when upgraded to linux 5.19.17 or later Bagas Sanjaya
  2023-06-23  1:26 ` Steven Rostedt
@ 2023-06-27  2:29 ` Bagas Sanjaya
  2023-06-27  7:30   ` Peter Zijlstra
  2023-10-20 10:55 ` Fwd: " Linux regression tracking #update (Thorsten Leemhuis)
  2 siblings, 1 reply; 9+ messages in thread
From: Bagas Sanjaya @ 2023-06-27  2:29 UTC (permalink / raw)
  To: Steven Rostedt (Google),
	Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Lingutla Chandrasekhar, Frederic Weisbecker, J. Avila,
	Vivek Anand, Rafael J. Wysocki, Thomas Renninger, Shuah Khan,
	Peter Zijlstra, Borislav Petkov, Josh Poimboeuf
  Cc: Linux Kernel Mailing List, Linux Regressions,
	Linux Netfilter Development, Netfilter Core Developers,
	Linux Networking, Linux Power Management

[-- Attachment #1: Type: text/plain, Size: 3199 bytes --]

On Fri, Jun 23, 2023 at 07:58:51AM +0700, Bagas Sanjaya wrote:
> Hi,
> 
> I notice a regression report on Bugzilla [1]. Quoting from it:
> 
> > kernel process "kworker/events_power_efficient" uses a lot of cpu power (100% on ESXI 6.7, ~30% on ESXI 7.0U3 or later) after upgrading from 5.17.3 to 5.19.17 or later.
> > 
> > dmesg log:
> > [ 2430.973102]  </TASK>
> > [ 2430.973131] Sending NMI from CPU 1 to CPUs 0:
> > [ 2430.973241] NMI backtrace for cpu 0
> > [ 2430.973247] CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 6.3.3 #1
> > [ 2430.973254] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
> > [ 2430.973258] Workqueue: events_power_efficient htable_gc [xt_hashlimit]
> > [ 2430.973275] RIP: 0010:preempt_count_sub+0x2e/0xa0
> > [ 2430.973289] Code: 36 01 85 c9 75 1b 65 8b 15 a7 da f8 5e 89 d1 81 e1 ff ff ff 7f 39 f9 7c 16 81 ff fe 00 00 00 76 3b f7 df 65 01 3d 8a da f8 5e <c3> cc cc cc cc e8 98 aa 25 00 85 c0 74 f2 8b 15 da 71 ed 00 85 d2
> > [ 2430.973294] RSP: 0018:ffffb15ec00dbe58 EFLAGS: 00000297
> > [ 2430.973299] RAX: 0000000000000000 RBX: ffffb15ec12ad000 RCX: 0000000000000001
> > [ 2430.973302] RDX: 0000000080000001 RSI: ffffffffa1c3313b RDI: 00000000ffffffff
> > [ 2430.973306] RBP: dead000000000122 R08: 0000000000000010 R09: 0000746e65696369
> > [ 2430.973309] R10: 8080808080808080 R11: 0000000000000018 R12: 0000000000000000
> > [ 2430.973312] R13: 0000000000001e2b R14: ffffb15ec12ad048 R15: ffff91c279c26a05
> > [ 2430.973316] FS:  0000000000000000(0000) GS:ffff91c279c00000(0000) knlGS:0000000000000000
> > [ 2430.973320] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 2430.973324] CR2: 000055fc138890e0 CR3: 000000010810e002 CR4: 00000000001706f0
> > [ 2430.973374] Call Trace:
> > [ 2430.973388]  <TASK>
> > [ 2430.973390]  __local_bh_enable_ip+0x32/0x70
> > [ 2430.973413]  htable_selective_cleanup+0x95/0xc0 [xt_hashlimit]
> > [ 2430.973428]  htable_gc+0xf/0x30 [xt_hashlimit]
> > [ 2430.973440]  process_one_work+0x1d4/0x360
> > [ 2430.973459]  ? process_one_work+0x360/0x360
> > [ 2430.973467]  worker_thread+0x25/0x3b0
> > [ 2430.973476]  ? process_one_work+0x360/0x360
> > [ 2430.973483]  kthread+0xe1/0x110
> > [ 2430.973499]  ? kthread_complete_and_exit+0x20/0x20
> > [ 2430.973507]  ret_from_fork+0x1f/0x30
> > [ 2430.973526]  </TASK>
> 
> See Bugzilla for the full thread and perf output.
> 
> Anyway, I'm tracking it in regzbot so that it doesn't fall through
> cracks unnoticed:
> 
> #regzbot introduced: v5.17.3..v5.19.17 https://bugzilla.kernel.org/show_bug.cgi?id=217586
> #regzbot title: kworker/events_power_efficient utilizes full CPU power after kernel upgrade
> 

The reporter had found the culprit (see Bugzilla for more information), thus
telling regzbot:

#regzbot introduced: 6ad0ad2bf8a67e
#regzbot title: retbleed reporting causes high cpu utilization due to kworker/events_power_efficient
#regzbot link: https://lore.kernel.org/all/PH0PR05MB8448A203A909959FAC754B7AAF439@PH0PR05MB8448.namprd05.prod.outlook.com/
#regzbot link: 6ad0ad2bf8a67e

Thanks.

-- 
An old man doll... just what I always wanted! - Clara

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Fwd: High cpu usage caused by kernel process when upgraded to linux 5.19.17 or later
  2023-06-27  2:29 ` Fwd: " Bagas Sanjaya
@ 2023-06-27  7:30   ` Peter Zijlstra
  2023-06-27 10:15     ` Bagas Sanjaya
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2023-06-27  7:30 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: Steven Rostedt (Google),
	Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Lingutla Chandrasekhar, Frederic Weisbecker, J. Avila,
	Vivek Anand, Rafael J. Wysocki, Thomas Renninger, Shuah Khan,
	Borislav Petkov, Josh Poimboeuf, Linux Kernel Mailing List,
	Linux Regressions, Linux Netfilter Development,
	Netfilter Core Developers, Linux Networking,
	Linux Power Management, x86

On Tue, Jun 27, 2023 at 09:29:36AM +0700, Bagas Sanjaya wrote:
> On Fri, Jun 23, 2023 at 07:58:51AM +0700, Bagas Sanjaya wrote:
> > Hi,
> > 
> > I notice a regression report on Bugzilla [1]. Quoting from it:
> > 
> > > kernel process "kworker/events_power_efficient" uses a lot of cpu power (100% on ESXI 6.7, ~30% on ESXI 7.0U3 or later) after upgrading from 5.17.3 to 5.19.17 or later.
> > > 
> > > dmesg log:
> > > [ 2430.973102]  </TASK>
> > > [ 2430.973131] Sending NMI from CPU 1 to CPUs 0:
> > > [ 2430.973241] NMI backtrace for cpu 0
> > > [ 2430.973247] CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 6.3.3 #1
> > > [ 2430.973254] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
> > > [ 2430.973258] Workqueue: events_power_efficient htable_gc [xt_hashlimit]
> > > [ 2430.973275] RIP: 0010:preempt_count_sub+0x2e/0xa0
> > > [ 2430.973289] Code: 36 01 85 c9 75 1b 65 8b 15 a7 da f8 5e 89 d1 81 e1 ff ff ff 7f 39 f9 7c 16 81 ff fe 00 00 00 76 3b f7 df 65 01 3d 8a da f8 5e <c3> cc cc cc cc e8 98 aa 25 00 85 c0 74 f2 8b 15 da 71 ed 00 85 d2
> > > [ 2430.973294] RSP: 0018:ffffb15ec00dbe58 EFLAGS: 00000297
> > > [ 2430.973299] RAX: 0000000000000000 RBX: ffffb15ec12ad000 RCX: 0000000000000001
> > > [ 2430.973302] RDX: 0000000080000001 RSI: ffffffffa1c3313b RDI: 00000000ffffffff
> > > [ 2430.973306] RBP: dead000000000122 R08: 0000000000000010 R09: 0000746e65696369
> > > [ 2430.973309] R10: 8080808080808080 R11: 0000000000000018 R12: 0000000000000000
> > > [ 2430.973312] R13: 0000000000001e2b R14: ffffb15ec12ad048 R15: ffff91c279c26a05
> > > [ 2430.973316] FS:  0000000000000000(0000) GS:ffff91c279c00000(0000) knlGS:0000000000000000
> > > [ 2430.973320] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [ 2430.973324] CR2: 000055fc138890e0 CR3: 000000010810e002 CR4: 00000000001706f0
> > > [ 2430.973374] Call Trace:
> > > [ 2430.973388]  <TASK>
> > > [ 2430.973390]  __local_bh_enable_ip+0x32/0x70
> > > [ 2430.973413]  htable_selective_cleanup+0x95/0xc0 [xt_hashlimit]
> > > [ 2430.973428]  htable_gc+0xf/0x30 [xt_hashlimit]
> > > [ 2430.973440]  process_one_work+0x1d4/0x360
> > > [ 2430.973459]  ? process_one_work+0x360/0x360
> > > [ 2430.973467]  worker_thread+0x25/0x3b0
> > > [ 2430.973476]  ? process_one_work+0x360/0x360
> > > [ 2430.973483]  kthread+0xe1/0x110
> > > [ 2430.973499]  ? kthread_complete_and_exit+0x20/0x20
> > > [ 2430.973507]  ret_from_fork+0x1f/0x30
> > > [ 2430.973526]  </TASK>
> > 
> > See Bugzilla for the full thread and perf output.
> > 
> > Anyway, I'm tracking it in regzbot so that it doesn't fall through
> > cracks unnoticed:
> > 
> > #regzbot introduced: v5.17.3..v5.19.17 https://bugzilla.kernel.org/show_bug.cgi?id=217586
> > #regzbot title: kworker/events_power_efficient utilizes full CPU power after kernel upgrade
> > 
> 
> The reporter had found the culprit (see Bugzilla for more information), thus
> telling regzbot:
> 
> #regzbot introduced: 6ad0ad2bf8a67e
> #regzbot title: retbleed reporting causes high cpu utilization due to kworker/events_power_efficient
> #regzbot link: https://lore.kernel.org/all/PH0PR05MB8448A203A909959FAC754B7AAF439@PH0PR05MB8448.namprd05.prod.outlook.com/
> #regzbot link: 6ad0ad2bf8a67e

Is there a problem other than that IBRS is slow and slower still on
shitty virt?

Does booting with: "spectre_v2=retpoline retbleed=off" make it go away?

I can't tell from this. Also, please don't use bugzilla.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Fwd: High cpu usage caused by kernel process when upgraded to linux 5.19.17 or later
  2023-06-27  7:30   ` Peter Zijlstra
@ 2023-06-27 10:15     ` Bagas Sanjaya
  2023-06-27 10:19       ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Bagas Sanjaya @ 2023-06-27 10:15 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Rostedt (Google),
	Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Lingutla Chandrasekhar, Frederic Weisbecker, J. Avila,
	Vivek Anand, Rafael J. Wysocki, Thomas Renninger, Shuah Khan,
	Borislav Petkov, Josh Poimboeuf, Linux Kernel Mailing List,
	Linux Regressions, Linux Netfilter Development,
	Netfilter Core Developers, Linux Networking,
	Linux Power Management, x86

On 6/27/23 14:30, Peter Zijlstra wrote:
> I can't tell from this. Also, please don't use bugzilla.

Why not BZ? I'm confused too...

-- 
An old man doll... just what I always wanted! - Clara


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Fwd: High cpu usage caused by kernel process when upgraded to linux 5.19.17 or later
  2023-06-27 10:15     ` Bagas Sanjaya
@ 2023-06-27 10:19       ` Peter Zijlstra
       [not found]         ` <CAJnqnX5rYn65zVQ+SLN4m4ZzM_jOa_RjGhazWO=Fh8ZvdOCadg@mail.gmail.com>
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2023-06-27 10:19 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: Steven Rostedt (Google),
	Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Lingutla Chandrasekhar, Frederic Weisbecker, J. Avila,
	Vivek Anand, Rafael J. Wysocki, Thomas Renninger, Shuah Khan,
	Borislav Petkov, Josh Poimboeuf, Linux Kernel Mailing List,
	Linux Regressions, Linux Netfilter Development,
	Netfilter Core Developers, Linux Networking,
	Linux Power Management, x86

On Tue, Jun 27, 2023 at 05:15:42PM +0700, Bagas Sanjaya wrote:
> On 6/27/23 14:30, Peter Zijlstra wrote:
> > I can't tell from this. Also, please don't use bugzilla.
> 
> Why not BZ? I'm confused too...

Because we have email; and why would I want to touch a browser for this?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Fwd: High cpu usage caused by kernel process when upgraded to linux 5.19.17 or later
       [not found]         ` <CAJnqnX5rYn65zVQ+SLN4m4ZzM_jOa_RjGhazWO=Fh8ZvdOCadg@mail.gmail.com>
@ 2023-06-29 14:18           ` Thomas Gleixner
       [not found]             ` <CAJnqnX5dHiXe3smKhj6JT9+6FNdgrAR=5_Hm8BSRpVF3uARYUg@mail.gmail.com>
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2023-06-29 14:18 UTC (permalink / raw)
  To: Vivek Anand, Peter Zijlstra
  Cc: Bagas Sanjaya, Steven Rostedt (Google),
	Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Lingutla Chandrasekhar, Frederic Weisbecker, J. Avila,
	Rafael J. Wysocki, Thomas Renninger, Shuah Khan, Borislav Petkov,
	Josh Poimboeuf, Linux Kernel Mailing List, Linux Regressions,
	Linux Netfilter Development, Netfilter Core Developers,
	Linux Networking, Linux Power Management, x86

On Thu, Jun 29 2023 at 12:05, Vivek Anand wrote:
> I've tried booting with "spectre_v2=retpoline retbleed=off".
> This change didn't work. Still CPU is 100%

This does not make sense.

retbleed=off has the same effect as CONFIG_X86_IBRS_ENTRY=n.

The only difference is that with CONFIG_X86_IBRS_ENTRY=y and
retbleed=off there is one extra jump in the low level entry code
(syscall, interrupts, exceptions) and one extra jump on the exit side.

But those extra jumps are completely irrelevant for the kworker threads.

Can you please provide dmesg and the content of the files in

 /sys/devices/system/cpu/vulnerabilities/

on a kernel booted with "spectre_v2=retpoline retbleed=off" ?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: High cpu usage caused by kernel process when upgraded to linux 5.19.17 or later
       [not found]             ` <CAJnqnX5dHiXe3smKhj6JT9+6FNdgrAR=5_Hm8BSRpVF3uARYUg@mail.gmail.com>
@ 2023-07-06 16:33               ` Steven Rostedt
  0 siblings, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2023-07-06 16:33 UTC (permalink / raw)
  To: Vivek Anand
  Cc: Thomas Gleixner, Peter Zijlstra, Bagas Sanjaya,
	Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Lingutla Chandrasekhar, Frederic Weisbecker, J. Avila,
	Rafael J. Wysocki, Thomas Renninger, Shuah Khan, Borislav Petkov,
	Josh Poimboeuf, Linux Kernel Mailing List, Linux Regressions,
	Linux Netfilter Development, Netfilter Core Developers,
	Linux Networking, Linux Power Management, x86

On Mon, 3 Jul 2023 18:23:25 +0530
Vivek Anand <vivekanand754@gmail.com> wrote:

> Hi Thomas,
> 
> Further analyzing, I found that I did set
> "CONFIG_NETFILTER_XT_MATCH_LIMIT=m" in my kernel config earlier which was
> causing high CPU consumption.
> Setting it to "CONFIG_NETFILTER_XT_MATCH_LIMIT=n" resolved the high CPU
> issue.
> 
> Is there any suggestion regarding the use of this config
> "CONFIG_NETFILTER_XT_MATCH_LIMIT" as I'm getting high CPU by setting it to
> "m" ?

That config enables the compiling of: net/netfilter/xt_limit.c

The htable_gc that you reported is defined in: net/netfilter/xt_hashlimit.c

It has:

static void htable_gc(struct work_struct *work)
{
        struct xt_hashlimit_htable *ht;

        ht = container_of(work, struct xt_hashlimit_htable, gc_work.work);

        htable_selective_cleanup(ht, false);

        queue_delayed_work(system_power_efficient_wq,
                           &ht->gc_work, msecs_to_jiffies(ht->cfg.gc_interval));
}

So it queues itself every ht->cfg.gc_interval msecs. That variable seems to
come from some configuration of netfilter, and I think you can see these in:

 find /proc/sys/net -name 'gc_interval'

Perhaps you have it set off to go too much?

-- Steve


> 
> Thanks,
> Vivek
> 
> On Thu, Jun 29, 2023 at 7:48 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> > On Thu, Jun 29 2023 at 12:05, Vivek Anand wrote:
> > > I've tried booting with "spectre_v2=retpoline retbleed=off".
> > > This change didn't work. Still CPU is 100%
> >
> > This does not make sense.
> >
> > retbleed=off has the same effect as CONFIG_X86_IBRS_ENTRY=n.
> >
> > The only difference is that with CONFIG_X86_IBRS_ENTRY=y and
> > retbleed=off there is one extra jump in the low level entry code
> > (syscall, interrupts, exceptions) and one extra jump on the exit side.
> >
> > But those extra jumps are completely irrelevant for the kworker threads.
> >
> > Can you please provide dmesg and the content of the files in
> >
> >  /sys/devices/system/cpu/vulnerabilities/
> >
> > on a kernel booted with "spectre_v2=retpoline retbleed=off" ?
> >
> > Thanks,
> >
> >         tglx
> >


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Fwd: High cpu usage caused by kernel process when upgraded to linux 5.19.17 or later
  2023-06-23  0:58 Fwd: High cpu usage caused by kernel process when upgraded to linux 5.19.17 or later Bagas Sanjaya
  2023-06-23  1:26 ` Steven Rostedt
  2023-06-27  2:29 ` Fwd: " Bagas Sanjaya
@ 2023-10-20 10:55 ` Linux regression tracking #update (Thorsten Leemhuis)
  2 siblings, 0 replies; 9+ messages in thread
From: Linux regression tracking #update (Thorsten Leemhuis) @ 2023-10-20 10:55 UTC (permalink / raw)
  To: Linux Regressions
  Cc: Linux Kernel Mailing List, Linux Netfilter Development,
	Netfilter Core Developers, Linux Networking,
	Linux Power Management

[TLDR: This mail in primarily relevant for Linux kernel regression
tracking. See link in footer if these mails annoy you.]

On 23.06.23 02:58, Bagas Sanjaya wrote:
> 
> I notice a regression report on Bugzilla [1]. Quoting from it:
> 
>> kernel process "kworker/events_power_efficient" uses a lot of cpu power (100% on ESXI 6.7, ~30% on ESXI 7.0U3 or later) after upgrading from 5.17.3 to 5.19.17 or later.
> [...]
> #regzbot introduced: v5.17.3..v5.19.17 https://bugzilla.kernel.org/show_bug.cgi?id=217586
> #regzbot title: kworker/events_power_efficient utilizes full CPU power after kernel upgrade
> 
> Thanks.
> 
> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=217586

#regzbot resolve: afaics: misconfiguration became more problematic due
to new mitigations
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-10-20 10:56 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-23  0:58 Fwd: High cpu usage caused by kernel process when upgraded to linux 5.19.17 or later Bagas Sanjaya
2023-06-23  1:26 ` Steven Rostedt
2023-06-27  2:29 ` Fwd: " Bagas Sanjaya
2023-06-27  7:30   ` Peter Zijlstra
2023-06-27 10:15     ` Bagas Sanjaya
2023-06-27 10:19       ` Peter Zijlstra
     [not found]         ` <CAJnqnX5rYn65zVQ+SLN4m4ZzM_jOa_RjGhazWO=Fh8ZvdOCadg@mail.gmail.com>
2023-06-29 14:18           ` Thomas Gleixner
     [not found]             ` <CAJnqnX5dHiXe3smKhj6JT9+6FNdgrAR=5_Hm8BSRpVF3uARYUg@mail.gmail.com>
2023-07-06 16:33               ` Steven Rostedt
2023-10-20 10:55 ` Fwd: " Linux regression tracking #update (Thorsten Leemhuis)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).