From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S967419AbdEZLR6 (ORCPT ); Fri, 26 May 2017 07:17:58 -0400 Received: from out03.mta.xmission.com ([166.70.13.233]:41012 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1764148AbdEZLQU (ORCPT ); Fri, 26 May 2017 07:16:20 -0400 From: ebiederm@xmission.com (Eric W. Biederman) To: "Fuzzey\, Martin" Cc: Andy Lutomirski , "Luis R. Rodriguez" , "Michael Kerrisk \(man-pages\)" , Linux API , Peter Zijlstra , Greg KH , Daniel Wagner , David Woodhouse , jewalt@lgsinnovations.com, rafal@milecki.pl, Arend Van Spriel , "Rafael J. Wysocki" , "Li\, Yi" , atull@opensource.altera.com, Moritz Fischer , Petr Mladek , Johannes Berg , Emmanuel Grumbach , Luca Coelho , Kalle Valo , Linus Torvalds , Kees Cook , AKASHI Takahiro , David Howells , Peter Jones , Hans de Goede , Alan Cox , "Ted Ts'o" , "linux-kernel\@vger.kernel.org" References: <20170524205658.GK8951@wotan.suse.de> <20170524214027.7775-1-mcgrof@kernel.org> Date: Fri, 26 May 2017 06:09:29 -0500 In-Reply-To: (Martin Fuzzey's message of "Thu, 25 May 2017 10:28:38 +0200") Message-ID: <87fufr3mdy.fsf@xmission.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-XM-SPF: eid=1dEDE0-0001J4-3D;;;mid=<87fufr3mdy.fsf@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=97.121.81.159;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX194d0A4k0ISmxiAK/TX9kd5342YLmlJaM4= X-SA-Exim-Connect-IP: 97.121.81.159 X-SA-Exim-Mail-From: ebiederm@xmission.com X-Spam-Report: * -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP * 0.7 XMSubLong Long Subject * 0.0 TVD_RCVD_IP Message was received from an IP address * 0.0 T_TM2_M_HEADER_IN_MSG BODY: No description available. * 0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60% * [score: 0.4973] * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC * [sa08 1397; Body=1 Fuz1=1 Fuz2=1] * 1.0 T_XMDrugObfuBody_08 obfuscated drug references * 0.1 XMSolicitRefs_0 Weightloss drug X-Spam-DCC: XMission; sa08 1397; Body=1 Fuz1=1 Fuz2=1 X-Spam-Combo: *;"Fuzzey\, Martin" X-Spam-Relay-Country: X-Spam-Timing: total 5301 ms - load_scoreonly_sql: 0.04 (0.0%), signal_user_changed: 3.6 (0.1%), b_tie_ro: 2.5 (0.0%), parse: 1.17 (0.0%), extract_message_metadata: 11 (0.2%), get_uri_detail_list: 2.2 (0.0%), tests_pri_-1000: 6 (0.1%), tests_pri_-950: 0.86 (0.0%), tests_pri_-900: 0.76 (0.0%), tests_pri_-400: 28 (0.5%), check_bayes: 27 (0.5%), b_tokenize: 10 (0.2%), b_tok_get_all: 10 (0.2%), b_comp_prob: 2.3 (0.0%), b_tok_touch_all: 3.7 (0.1%), b_finish: 0.66 (0.0%), tests_pri_0: 996 (18.8%), check_dkim_signature: 0.46 (0.0%), check_dkim_adsp: 2.6 (0.0%), tests_pri_500: 4250 (80.2%), poll_dns_idle: 4246 (80.1%), rewrite_mail: 0.00 (0.0%) Subject: Re: [PATCH v2] firmware: fix sending -ERESTARTSYS due to signal on fallback X-Spam-Flag: No X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org "Fuzzey, Martin" writes: > On 25 May 2017 at 06:13, Andy Lutomirski wrote: >>>> >>>> Can you give a simple example of what's going on and why it matters? >>>> > > > Here is the use case in which I ran into this problem. > > I have a driver which does request_firmware() when a write() is done > to a sysfs file. > > The write() was being done by an android init script (with the init > interpreter "write" command). > init, of course, forks lots of processes and some of the children die. > > So the scenario was the following: > > 1) Android init calls write() on the sysfs file > 2) The sysfs .store() callback registered by a driver is called > 3) The driver calls request_firmware() > 4) request_firmware() sends the firmware load request to userspace and > calls wait_for_completion_interruptible() > 5) A child dies and raises SIGCHLD > 6) wait_for_completion_interruptible() returns -ERESTARTSYS due to the signal > 7) request_firmware() [before this patch] translated that to -EAGAIN > 8) The driver (in my case) ignored this [because the firmware was not > critical - it was for checking if a microcontroler was up to date] > (but it could have returned it to userspace, same problem) > > The point being that, due to a signal (SIGCHLD) which has nothing to > do with the firmware loading process, the firmware load was not done. > Also EAGAIN is the same error used if the load request times out so it > was impossible to distinguish the two cases. > > ERESTARTSYS is an internal error and is not returned to userspace. > Instead it is handled by the linux syscall machinery which, after > processing the signal either restarts (transpently to userspace) the > syscall or returns EINTR to userspace (depending if the signal handler > users SA_RESTART - see man 7 signal) > > > With this patch here is what happens: > > 1) Android init calls write() on the sysfs file > 2) The sysfs .store() callback registered by a driver is called > 3) The driver calls request_firmware() > 4) request_firmware() sends the firmware load request to userspace and > calls wait_for_completion_interruptible() > 5) A child dies and raises SIGCHLD > 6) wait_for_completion_interruptible() returns -ERESTARTSYS due to the signal > 7) request_firmware() [with this patch] returns -ERESTARTSYS > 8) The driver returns -ERSTARTSYS from its sysfs .store method > 9) The system call machinery invokes the signal handler > 10) The signal handler does its stuff > 11) Because SA_RESTART was set the system call is restarted (calling > the sysfs .store) and we try it all again from step 2 > > Note that, on the the userspace side write() is only called once (the > restart is transparent to userspace which is oblivious to all this) > The kernel side write() (which calls .store() is called multiple times > (so that code does need to know about this) > > >>>> ERESTARTSYS and friends are highly magical, and I'm not convinced that >>>> allowing _request_firmware_load to return -ERESTARTSYS is actually a >>>> good idea. What if there are system calls that can't handle this >>>> style of restart that start being restarted as a result? >>> > > If the caller is unable to restart (for example if the driver's > .store() callback had already done lots of stuff that couldn't be > undone) it is free to translate -ERSTARTSYS to -EINTR before > returning. > But request_frimware() can't know about that. > > >>>> Maybe SIGCHLD shouldn't interrupt firmware loading? > > I don't think there's a way of doing that without disabling all > signals (ie using the non interruptible wait variants). > It used to be that way (which is why I only ran into this after > updating from an ancient 3.16 kernel to a slightly less ancient 4.4) > But there are valid reasons for wanting to be able to interrupt > firmware loading (like being able to kill the userspace helper) Perhaps simply using a killable wait and not a fully interruptible wait would be better? It sounds like the code really is not prepared for an truly interruptible wait here. Eric