linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Zhang Rui <rui.zhang@intel.com>,
	Robert Moore <robert.moore@intel.com>,
	Erik Kaneda <erik.kaneda@intel.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Len Brown <lenb@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
	Stephen Berman <stephen.berman@gmx.net>,
	devel@acpica.org
Subject: Re: power-off delay/hang due to commit 6d25be57 (mainline)
Date: Tue, 14 Jul 2020 15:44:10 +0200	[thread overview]
Message-ID: <20200714134410.3odqfvjq6rndjjf6@linutronix.de> (raw)
In-Reply-To: <87ftak2kxr.fsf@rub.de>

On 2020-06-24 23:49:52 [+0200], Stephen Berman wrote:

Let me summarize the thread here:

On Stephen's system, ACPI informs the thermal zone driver to poll the
temperature every second and the driver does so.
The driver queries the temperature by invoking acpi_evaluate_integer()
which invokes (at some point) acpi_ev_queue_notify_request(). 
This then invokes acpi_os_execute_deferred() via
    queue_work_on(, kacpi_notify_wq, )
 
acpi_os_execute_deferred() invokes acpi_ev_notify_dispatch() and this is
no longer synchronised with the initial acpi_evaluate_integer() request.

Before commit
   6d25be5782e48 ("sched/core, workqueues: Distangle worker accounting from rq lock")

that function took on average 1.023993 seconds to complete. The
interval when the thermal driver invokes acpi_evaluate_integer() isn't
exactly 1 second but almost (it is not an absolute timer so). Still it
looks that one function slowly overtakes the other. After 5 Minutes
uptime there is:

| kworker/0:1-12 [000] 312.315565: acpi_ev_queue_notify_request: Dispatching Notify on [TZ10] (Thermal) Value 0x81 (Information Change) Node ffff9935ab20f7a8
| kworker/0:1-12 [000] 312.315567: acpi_os_execute: Adding acpi_ev_notify_dispatch+0x0/0x5a ffff9935a6c64050 <ffff9935a7eb5e00>

Enqueue worker with job ffff9935a7eb5e00

| kworker/0:1-12 [000] 312.315596: acpi_os_execute_deferred_notify: End ffff9935a7eb5c80 acpi_ev_notify_dispatch+0x0/0x5a(ffff9935a6c64c80)
| kworker/0:1-12 [000] 312.315607: acpi_os_execute_deferred_notify: Start ffff9935a7eb5d80 acpi_ev_notify_dispatch+0x0/0x5a(ffff9935a6c64dc0)

previous worker completed, another (already enqueued) started.

| kworker/0:1-12 [000] 313.339564: acpi_ev_queue_notify_request: Dispatching Notify on [TZ10] (Thermal) Value 0x81 (Information Change) Node ffff9935ab20f7a8
| kworker/0:1-12 [000] 313.339566: acpi_os_execute: Adding acpi_ev_notify_dispatch+0x0/0x5a ffff9935a6c64f00 <ffff9935a7eb5c80>

another one enqueued.

| kworker/0:1-12 [000] 313.339595: acpi_os_execute_deferred_notify: End ffff9935a7eb5d80 acpi_ev_notify_dispatch+0x0/0x5a(ffff9935a6c64dc0)
| kworker/0:1-12 [000] 313.339597: acpi_os_execute_deferred_notify: Start ffff9935a7eb5e00 acpi_ev_notify_dispatch+0x0/0x5a(ffff9935a6c64050)

finally, job ffff9935a7eb5e00 started (one second after enqueue).

| kworker/0:1-12 [000] 314.363571: acpi_ev_queue_notify_request: Dispatching Notify on [TZ10] (Thermal) Value 0x81 (Information Change) Node ffff9935ab20f7a8
| kworker/0:1-12 [000] 314.363574: acpi_os_execute: Adding acpi_ev_notify_dispatch+0x0/0x5a ffff9935a6c646e0 <ffff9935a7eb5d80>
| kworker/0:1-12 [000] 314.363608: acpi_os_execute_deferred_notify: End ffff9935a7eb5e00 acpi_ev_notify_dispatch+0x0/0x5a(ffff9935a6c64050)

and ended, two seconds after enqueue. Before it ended, the system
enqueued two more jobs. And this just within 5 Minutes of uptime. The
worker pile slowly up.

According to Stephen, after previously mentioned commit the situation
gets worse. According to tracing the execution time of
acpi_ev_notify_dispatch() varies between 1.023990 (like before) and
2.048005 seconds. While I don't have an explanation for the changed
behaviour (or see anything wrong the commit in question), the pile up of
worker increased.
Stephen noticed this because system shutdown flushes kacpi_notify_wq
and this takes quite some time (with that amount of worker pending).

I tried a few test cases to reproduce that behaviour but failed.
However, even before that commit in question the situation is far from
perfect: 
- There is the lack of synchronisation between thermal driver's
  requests and its tail (that acpi_ev_notify_dispatch() part). 
- Do we have to respect ACPI interval of one seconds. Is the hardware
  really so accurate that it can change noticeable in one second.
- The requests are already back to back which keeps the CPU busy (maybe
  sched-switch will reveal that the CPU is idle most of the time).

So...

Is there a simple way to synchronize the ACPI part? The obvious thing
would be

	flush_workqueue(kacpi_notify_wq);
or
	acpi_os_wait_events_complete()

in thermal_get_temp().
Would it make sense to ensure that the polling interval is no less than
10 seconds?

> Steve Berman

Sebastian

  reply	other threads:[~2020-07-14 13:44 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-01 15:46 power-off delay/hang due to commit 6d25be57 (mainline) Stephen Berman
2020-05-06 21:57 ` Sebastian Andrzej Siewior
2020-05-08 21:30   ` Stephen Berman
2020-05-13 22:04     ` Sebastian Andrzej Siewior
2020-05-14 21:39       ` Stephen Berman
2020-05-22 16:40         ` Sebastian Andrzej Siewior
2020-06-09 10:06           ` Stephen Berman
2020-06-09 20:23             ` Sebastian Andrzej Siewior
2020-06-10  8:21               ` Stephen Berman
2020-06-10 10:25                 ` Sebastian Andrzej Siewior
2020-06-10 22:49                   ` Stephen Berman
2020-06-11 15:39                     ` Stephen Berman
2020-06-12 11:01                       ` Sebastian Andrzej Siewior
2020-06-14 12:12                         ` Stephen Berman
2020-06-14 17:10                           ` Sebastian Andrzej Siewior
2020-06-15  7:58                             ` Stephen Berman
2020-06-15 14:51                               ` Sebastian Andrzej Siewior
2020-06-15 15:41                                 ` Stephen Berman
2020-06-15 15:58                                   ` Sebastian Andrzej Siewior
2020-06-15 16:19                                     ` Stephen Berman
2020-06-15 16:32                                       ` Sebastian Andrzej Siewior
2020-06-16  7:14                                 ` Stephen Berman
2020-06-16  7:38                                   ` Sebastian Andrzej Siewior
2020-06-16  8:13                                     ` Stephen Berman
2020-06-16 15:55                                       ` Sebastian Andrzej Siewior
2020-06-16 20:28                                         ` Stephen Berman
2020-06-17 14:27                                           ` Sebastian Andrzej Siewior
2020-06-17 21:09                                             ` Stephen Berman
2020-06-24 20:11                                               ` Sebastian Andrzej Siewior
2020-06-24 21:49                                                 ` Stephen Berman
2020-07-14 13:44                                                   ` Sebastian Andrzej Siewior [this message]
2020-07-14 13:54                                                     ` Rafael J. Wysocki
2020-07-14 14:11                                                       ` Sebastian Andrzej Siewior
2020-07-14 15:53                                                         ` Rafael J. Wysocki
2020-07-14 16:10                                                           ` Sebastian Andrzej Siewior
2020-08-11 10:27                                                           ` Sebastian Andrzej Siewior
2020-08-11 14:02                                                             ` Rafael J. Wysocki
2020-07-19 10:07                                                         ` Stephen Berman
2020-08-11 11:58                                                           ` Stephen Berman
2020-08-11 13:29                                                             ` Sebastian Andrzej Siewior
2020-08-11 14:34                                                               ` Rafael J. Wysocki
2020-08-11 15:25                                                                 ` Sebastian Andrzej Siewior
2020-08-11 17:22                                                                   ` Stephen Berman
2020-08-11 18:49                                                                     ` Sebastian Andrzej Siewior
2020-10-06 21:49                                                                       ` Sebastian Andrzej Siewior
2020-10-07 16:18                                                                         ` Rafael J. Wysocki
2020-10-26 17:20                                                                           ` Sebastian Andrzej Siewior
2020-12-02 18:03                                                                             ` Sebastian Andrzej Siewior
2020-12-02 18:31                                                                               ` Rafael J. Wysocki
2020-12-02 19:13                                                                                 ` Rafael J. Wysocki
2020-12-31 20:46                                                                                   ` Rafael J. Wysocki
2021-01-02 11:03                                                                                     ` Rafael J. Wysocki
2021-01-04 15:38                                                                                     ` Stephen Berman
2021-01-24 13:49                                                                                       ` Stephen Berman
2021-01-25 16:25                                                                                         ` Rafael J. Wysocki
2020-06-20 19:08               ` [PATCH] SCSI: Disable CD-ROM poll on shutdown kernel test robot
2020-06-09 21:26             ` power-off delay/hang due to commit 6d25be57 (mainline) Stephen Berman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200714134410.3odqfvjq6rndjjf6@linutronix.de \
    --to=bigeasy@linutronix.de \
    --cc=devel@acpica.org \
    --cc=erik.kaneda@intel.com \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=rafael.j.wysocki@intel.com \
    --cc=robert.moore@intel.com \
    --cc=rui.zhang@intel.com \
    --cc=stephen.berman@gmx.net \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).