Hey, On Wed, Apr 12, 2017 at 11:31:18AM -0500, Grygorii Strashko wrote: > > > On 04/12/2017 10:44 AM, Eduardo Valentin wrote: > > Hello, > > > ... > > > > > I agree. But there it nothing that says it is not reenterable. If you > > saw something in this line, can you please share? > > > >>>> will you generate a patch to do this? > >>> Sure. I will generate a patch to take care of 1) To make sure that > >>> orderly_poweroff is called only once right away. I have already > >>> tested. > >>> > >>> for 2) Cancel all the scheduled work queues to monitor the > >>> temperature. > >>> I will take some more time to make it and test. > >>> > >>> Is that okay? Or you want me to send both together? > >>> > >> I think you can send patch for step 1 first. > > > > I am happy to see that Keerthy found the problem with his setup and a > > possible solution. But I have a few concerns here. > > > > 1. If regular shutdown process takes 10seconds, that is a ballpark that > > thermal should never wait. orderly_poweroff() calls run_cmd() with wait > > flag set. That means, if regular userland shutdown takes 10s, we are > > waiting for it. Obviously this not acceptable. Specially if you setup > > critical trip to be 125C. Now, if you properly size the critical trip to > > fire before hotspot really reach 125C, for 10s (or the time it takes to > > shutdown), then fine. But based on what was described in this thread, > > his system is waiting 10s on regular shutdown, and his silicon is on > > out-of-spec temperature for 10s, which is wrong. > > > > 2. The above scenario is not acceptable in a long run, specially from a > > reliability perspective. If orderly_poweroff() has a possibility to > > simply never return (or take too long), I would say the thermal > > subsystem is using the wrong API. > > > > > Hh, I do not see that orderly_poweroff() will wait for anything now: > void orderly_poweroff(bool force) > { > if (force) /* do not override the pending "true" */ > poweroff_force = true; > schedule_work(&poweroff_work); > ^^^^^^^ async call. even here can be pretty big delay if system is under pressure > } > > > static int __orderly_poweroff(bool force) > { > int ret; > > ret = run_cmd(poweroff_cmd); > ^^^^ no wait for the process - only for exec. flags == UMH_WAIT_EXEC Yeah, and that is what I really meant. Sorry for the confusion. The exec is problematic in his scenario too, given he is running on a very interesting NFS setup. Yes, the WAIT_EXEC is set: 392 static int run_cmd(const char *cmd) 393 { 394 char **argv; 395 static char *envp[] = { 396 "HOME=/", 397 "PATH=/sbin:/bin:/usr/sbin:/usr/bin", 398 NULL 399 }; 400 int ret; 401 argv = argv_split(GFP_KERNEL, cmd, NULL); 402 if (argv) { 403 ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC); 404 argv_free(argv); 405 } else { 406 ret = -ENOMEM; 407 } 408 409 return ret; 410 } 411 > > if (ret && force) { > pr_warn("Failed to start orderly shutdown: forcing the issue\n"); > > /* > * I guess this should try to kick off some daemon to sync and > * poweroff asap. Or not even bother syncing if we're doing an > * emergency shutdown? > */ > emergency_sync(); > kernel_power_off(); > ^^^ force power off, but only if run_cmd() failed - for example /sbin/poweroff doesn't exist > } > > return ret; > } > > static bool poweroff_force; > > static void poweroff_work_func(struct work_struct *work) > { > __orderly_poweroff(poweroff_force); > } > > As result thermal has no control of power off any more after calling orderly_poweroff() and can get the result > of US poweroff binary execution. > > > > > If you are going to implement the above two patches, keep in mind: > > i. At least within the thermal subsystem, you need to take care of all > > zones that could trigger a shutdown. > > ii. serializing the calls to orderly_poweroff() seams to be more > > concerning than cancelling all monitoring. > > > > > > -- > regards, > -grygorii