* [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD @ 2017-06-14 22:20 Luis R. Rodriguez 2017-06-14 22:20 ` [PATCH 1/4] test_firmware: add test case for SIGCHLD on sync fallback Luis R. Rodriguez ` (5 more replies) 0 siblings, 6 replies; 38+ messages in thread From: Luis R. Rodriguez @ 2017-06-14 22:20 UTC (permalink / raw) To: gregkh Cc: mfuzzey, ebiederm, dmitry.torokhov, wagi, dwmw2, jewalt, rafal, arend.vanspriel, rjw, yi1.li, atull, moritz.fischer, pmladek, johannes.berg, emmanuel.grumbach, luciano.coelho, kvalo, luto, torvalds, keescook, takahiro.akashi, dhowells, pjones, hdegoede, alan, tytso, mtk.manpages, paul.gortmaker, mtosatti, mawilcox, linux-api, linux-fsdevel, linux-kernel, L Martin reported an issue with Android where if sysfs is used to trigger a sync fw load which *relies* on the fallback mechanism and a background job completes while the trigger is ongoing in the foreground it will immediately fail the fw request. The issue can be observed in this simple test script using the test_firmware driver: set -e /etc/init.d/udev stop modprobe test_firmware DIR=/sys/devices/virtual/misc/test_firmware echo 10 >/sys/class/firmware/timeout sleep 2 & echo -n "does-not-exist-file.bin" > "$DIR"/trigger_request The background sleep triggers the SIGCHLD signal and we fail the firmware request on the fallback mechanism. This was due to the type of wait used which captures all signals, but we currently lack the killable swaits which would only allow killing the fallback wait on SIGKILL. This adds the missing killable swaits and fixes the firmware API to use it. This goes along with a test case to demo the issue clearly and how its fixed afterwards. Lastly, ensure to use -EINTR when interrupted so callers can distinguish between an interrupted failure and other types of failure. As suggested I've tagged the addition of the killable swaits and the firmware fix as stable. Between v4.0 and v4.10 the stable fix is to instead change the firmware to use wait_for_completion_killable_timeout() as the firmware API was only converted to swait as of v4.10. The last patch must be applied only after the killable wait is applied to ensure only SIGKILL triggers an interruption. Otherwise it has been observed using -EINTR on other signals like SIGCHLD will trigger a restart of the call, if a sysfs write() is used to trigger the sync firmware request. This might be due what signal(7) says about using the SA_RESTART flag when write() is called, in such cases the call will be automatically restarted after the signal handler returns. The excemption here naturally seems to be on SIGKILL. Note that although I *feared* this might implicate any use of non-killable waits on other system calls, such as finit_module(), initial testing confirms this to not be the case. For instance replacing the echo with modprobe on a module which does the same on init does not present the same issues. This could be due to the special SA_RESTART flag case on write() as noted above and sysfs... however, its not perfectly clear yet to me. As usual these patches are on my linux-next git tree, they are on the 20170614-fw-fixes branch based on linux-next 20170614 [0]. [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170614-fw-fixes Luis R. Rodriguez (4): test_firmware: add test case for SIGCHLD on sync fallback swait: add the missing killable swaits firmware: avoid invalid fallback aborts by using killable swait firmware: send -EINTR on signal abort on fallback mechanism drivers/base/firmware_class.c | 11 +++++---- include/linux/swait.h | 25 ++++++++++++++++++++ tools/testing/selftests/firmware/fw_fallback.sh | 31 +++++++++++++++++++++++++ 3 files changed, 63 insertions(+), 4 deletions(-) -- 2.11.0 ^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH 1/4] test_firmware: add test case for SIGCHLD on sync fallback 2017-06-14 22:20 [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Luis R. Rodriguez @ 2017-06-14 22:20 ` Luis R. Rodriguez 2017-06-14 22:20 ` [PATCH 2/4] swait: add the missing killable swaits Luis R. Rodriguez ` (4 subsequent siblings) 5 siblings, 0 replies; 38+ messages in thread From: Luis R. Rodriguez @ 2017-06-14 22:20 UTC (permalink / raw) To: gregkh Cc: mfuzzey, ebiederm, dmitry.torokhov, wagi, dwmw2, jewalt, rafal, arend.vanspriel, rjw, yi1.li, atull, moritz.fischer, pmladek, johannes.berg, emmanuel.grumbach, luciano.coelho, kvalo, luto, torvalds, keescook, takahiro.akashi, dhowells, pjones, hdegoede, alan, tytso, mtk.manpages, paul.gortmaker, mtosatti, mawilcox, linux-api, linux-fsdevel, linux-kernel, L It has been reported that SIGCHLD will trigger an immediate abort on sync firmware requests which rely on the sysfs interface for a trigger. This is unexpected behaviour, this reproduces this issue. This test case currenty fails. Reported-by: Martin Fuzzey <mfuzzey@parkeon.com> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> --- tools/testing/selftests/firmware/fw_fallback.sh | 31 +++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/tools/testing/selftests/firmware/fw_fallback.sh b/tools/testing/selftests/firmware/fw_fallback.sh index 2e4c22d5abf7..8f511035f783 100755 --- a/tools/testing/selftests/firmware/fw_fallback.sh +++ b/tools/testing/selftests/firmware/fw_fallback.sh @@ -134,6 +134,27 @@ load_fw_custom_cancel() wait } +load_fw_fallback_with_child() +{ + local name="$1" + local file="$2" + + # This is the value already set but we want to be explicit + echo 4 >/sys/class/firmware/timeout + + sleep 1 & + SECONDS_BEFORE=$(date +%s) + echo -n "$name" >"$DIR"/trigger_request 2>/dev/null + SECONDS_AFTER=$(date +%s) + SECONDS_DELTA=$(($SECONDS_AFTER - $SECONDS_BEFORE)) + if [ "$SECONDS_DELTA" -lt 4 ]; then + RET=1 + else + RET=0 + fi + wait + return $RET +} trap "test_finish" EXIT @@ -221,4 +242,14 @@ else echo "$0: cancelling custom fallback mechanism works" fi +set +e +load_fw_fallback_with_child "nope-signal-$NAME" "$FW" +if [ "$?" -eq 0 ]; then + echo "$0: SIGCHLD on sync ignored as expected" >&2 +else + echo "$0: error - sync firmware request cancelled due to SIGCHLD" >&2 + exit 1 +fi +set -e + exit 0 -- 2.11.0 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 2/4] swait: add the missing killable swaits 2017-06-14 22:20 [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Luis R. Rodriguez 2017-06-14 22:20 ` [PATCH 1/4] test_firmware: add test case for SIGCHLD on sync fallback Luis R. Rodriguez @ 2017-06-14 22:20 ` Luis R. Rodriguez [not found] ` <20170614222017.14653-3-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> 2017-06-14 22:20 ` [PATCH 3/4] firmware: avoid invalid fallback aborts by using killable swait Luis R. Rodriguez ` (3 subsequent siblings) 5 siblings, 1 reply; 38+ messages in thread From: Luis R. Rodriguez @ 2017-06-14 22:20 UTC (permalink / raw) To: gregkh Cc: mfuzzey, ebiederm, dmitry.torokhov, wagi, dwmw2, jewalt, rafal, arend.vanspriel, rjw, yi1.li, atull, moritz.fischer, pmladek, johannes.berg, emmanuel.grumbach, luciano.coelho, kvalo, luto, torvalds, keescook, takahiro.akashi, dhowells, pjones, hdegoede, alan, tytso, mtk.manpages, paul.gortmaker, mtosatti, mawilcox, linux-api, linux-fsdevel, linux-kernel, L Code in kernel which incorrectly used the non-killable variants could end up having waits killed improperly. The respective killable waits have been upstream for a while: o wait_for_completion_killable() o wait_for_completion_killable_timeout() swait has been upstream since v4.6. Older kernels have had the above variants in place for a long time. Cc: stable <stable@vger.kernel.org> # 4.6 Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> --- include/linux/swait.h | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/include/linux/swait.h b/include/linux/swait.h index c1f9c62a8a50..2c700694d50a 100644 --- a/include/linux/swait.h +++ b/include/linux/swait.h @@ -169,4 +169,29 @@ do { \ __ret; \ }) +#define __swait_event_killable(wq, condition) \ + ___swait_event(wq, condition, TASK_KILLABLE, 0, schedule()) + +#define swait_event_killable(wq, condition) \ +({ \ + int __ret = 0; \ + if (!(condition)) \ + __ret = __swait_event_killable(wq, condition); \ + __ret; \ +}) + +#define __swait_event_killable_timeout(wq, condition, timeout) \ + ___swait_event(wq, ___wait_cond_timeout(condition), \ + TASK_KILLABLE, timeout, \ + __ret = schedule_timeout(__ret)) + +#define swait_event_killable_timeout(wq, condition, timeout) \ +({ \ + long __ret = timeout; \ + if (!___wait_cond_timeout(condition)) \ + __ret = __swait_event_killable_timeout(wq, \ + condition, timeout); \ + __ret; \ +}) + #endif /* _LINUX_SWAIT_H */ -- 2.11.0 ^ permalink raw reply related [flat|nested] 38+ messages in thread
[parent not found: <20170614222017.14653-3-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>]
* Re: [PATCH 2/4] swait: add the missing killable swaits [not found] ` <20170614222017.14653-3-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> @ 2017-06-29 12:54 ` Greg KH [not found] ` <20170629125402.GH26046-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org> 0 siblings, 1 reply; 38+ messages in thread From: Greg KH @ 2017-06-29 12:54 UTC (permalink / raw) To: Luis R. Rodriguez Cc: mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, ebiederm-aS9lmoZGLiVWk0Htik3J/w, dmitry.torokhov-Re5JQEeQqe8AvxtiuMwx3w, wagi-kQCPcA+X3s7YtjvyW6yDsg, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8, rafal-g1n6cQUeyibVItvQsEIGlw, arend.vanspriel-dY08KVG/lbpWk0Htik3J/w, rjw-LthD3rsA81gm4RdzfppkhA, yi1.li-VuQAYsv1563Yd54FQh9/CA, atull-DgEjT+Ai2ygdnm+yROfE0A, moritz.fischer-+aYTwkv1SeIAvxtiuMwx3w, pmladek-IBi9RG/b67k, johannes.berg-ral2JQCrhuEAvxtiuMwx3w, emmanuel.grumbach-ral2JQCrhuEAvxtiuMwx3w, luciano.coelho-ral2JQCrhuEAvxtiuMwx3w, kvalo-sgV2jX0FEOL9JmXXK+q4OQ, luto-DgEjT+Ai2ygdnm+yROfE0A, torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, keescook-F7+t8E8rja9g9hUCZPvPmw, takahiro.akashi-QSEj5FYQhm4dnm+yROfE0A, dhowells-H+wXaHxf7aLQT0dZR+AlfA, pjones-H+wXaHxf7aLQT0dZR+AlfA, hdegoede-H+wXaHxf7aLQT0dZR+AlfA, alan-VuQAYsv1563Yd54FQh9/CA, tytso-3s7WtUTddSA, mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, paul.gortmaker-CWA4WttNNZF54TAoqtyWWQ, mtosatti-H+wXaHxf7aLQT0dZR+AlfA, mawilcox-0li6OtcxBFHby3iVrkZq2A, linux-api-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA On Wed, Jun 14, 2017 at 03:20:15PM -0700, Luis R. Rodriguez wrote: > Code in kernel which incorrectly used the non-killable variants could > end up having waits killed improperly. The respective killable waits > have been upstream for a while: > > o wait_for_completion_killable() > o wait_for_completion_killable_timeout() > > swait has been upstream since v4.6. Older kernels have had the > above variants in place for a long time. > > Cc: stable <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> # 4.6 > Signed-off-by: Luis R. Rodriguez <mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > --- > include/linux/swait.h | 25 +++++++++++++++++++++++++ > 1 file changed, 25 insertions(+) > > diff --git a/include/linux/swait.h b/include/linux/swait.h > index c1f9c62a8a50..2c700694d50a 100644 > --- a/include/linux/swait.h > +++ b/include/linux/swait.h > @@ -169,4 +169,29 @@ do { \ > __ret; \ > }) > > +#define __swait_event_killable(wq, condition) \ > + ___swait_event(wq, condition, TASK_KILLABLE, 0, schedule()) > + > +#define swait_event_killable(wq, condition) \ > +({ \ > + int __ret = 0; \ > + if (!(condition)) \ > + __ret = __swait_event_killable(wq, condition); \ > + __ret; \ > +}) > + > +#define __swait_event_killable_timeout(wq, condition, timeout) \ > + ___swait_event(wq, ___wait_cond_timeout(condition), \ > + TASK_KILLABLE, timeout, \ > + __ret = schedule_timeout(__ret)) > + > +#define swait_event_killable_timeout(wq, condition, timeout) \ > +({ \ > + long __ret = timeout; \ > + if (!___wait_cond_timeout(condition)) \ > + __ret = __swait_event_killable_timeout(wq, \ > + condition, timeout); \ > + __ret; \ > +}) > + > #endif /* _LINUX_SWAIT_H */ Do you really still want to add these, now that we know we shouldn't be using swait in "real" code? :) thanks, greg k-h ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <20170629125402.GH26046-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>]
* Re: [PATCH 2/4] swait: add the missing killable swaits [not found] ` <20170629125402.GH26046-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org> @ 2017-06-29 13:05 ` Thomas Gleixner 2017-06-29 13:35 ` Greg KH 0 siblings, 1 reply; 38+ messages in thread From: Thomas Gleixner @ 2017-06-29 13:05 UTC (permalink / raw) To: Greg KH Cc: Luis R. Rodriguez, mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, ebiederm-aS9lmoZGLiVWk0Htik3J/w, dmitry.torokhov-Re5JQEeQqe8AvxtiuMwx3w, wagi-kQCPcA+X3s7YtjvyW6yDsg, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8, rafal-g1n6cQUeyibVItvQsEIGlw, arend.vanspriel-dY08KVG/lbpWk0Htik3J/w, rjw-LthD3rsA81gm4RdzfppkhA, yi1.li-VuQAYsv1563Yd54FQh9/CA, atull-DgEjT+Ai2ygdnm+yROfE0A, moritz.fischer-+aYTwkv1SeIAvxtiuMwx3w, pmladek-IBi9RG/b67k, johannes.berg-ral2JQCrhuEAvxtiuMwx3w, emmanuel.grumbach-ral2JQCrhuEAvxtiuMwx3w, luciano.coelho-ral2JQCrhuEAvxtiuMwx3w, kvalo-sgV2jX0FEOL9JmXXK+q4OQ, luto-DgEjT+Ai2ygdnm+yROfE0A, torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, keescook-F7+t8E8rja9g9hUCZPvPmw, takahiro.akashi-QSEj5FYQhm4dnm+yROfE0A, dhowells-H+wXaHxf7aLQT0dZR+AlfA, pjones-H+wXaHxf7aLQT0dZR+AlfA, hdegoede-H+wXaHxf7aLQT0dZR+AlfA, alan-VuQAYsv1563Yd54FQh9/CA, tytso-3s7WtUTddSA, mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, paul.gortmaker-CWA4WttNNZF54TAoqtyWWQ, mtosatti-H+wXaHxf7aLQT0dZR+AlfA, mawilcox-0li6OtcxBFHby3iVrkZq2A, linux-api-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA On Thu, 29 Jun 2017, Greg KH wrote: > On Wed, Jun 14, 2017 at 03:20:15PM -0700, Luis R. Rodriguez wrote: > Do you really still want to add these, now that we know we shouldn't be > using swait in "real" code? :) And who defined that it should not be used in real code? Thanks, tglx ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-29 13:05 ` Thomas Gleixner @ 2017-06-29 13:35 ` Greg KH 2017-06-29 13:46 ` Thomas Gleixner 0 siblings, 1 reply; 38+ messages in thread From: Greg KH @ 2017-06-29 13:35 UTC (permalink / raw) To: Thomas Gleixner Cc: Luis R. Rodriguez, mfuzzey, ebiederm, dmitry.torokhov, wagi, dwmw2, jewalt, rafal, arend.vanspriel, rjw, yi1.li, atull, moritz.fischer, pmladek, johannes.berg, emmanuel.grumbach, luciano.coelho, kvalo, luto, torvalds, keescook, takahiro.akashi, dhowells, pjones, hdegoede, alan, tytso, mtk.manpages, paul.gortmaker, mtosatti, mawilcox, linux-api, linux-fsdevel On Thu, Jun 29, 2017 at 03:05:26PM +0200, Thomas Gleixner wrote: > On Thu, 29 Jun 2017, Greg KH wrote: > > On Wed, Jun 14, 2017 at 03:20:15PM -0700, Luis R. Rodriguez wrote: > > Do you really still want to add these, now that we know we shouldn't be > > using swait in "real" code? :) > > And who defined that it should not be used in real code? Linus did, in a different firmware thread. You have to _really_ know what you are doing to use this interface, and the firmware interface shouldn't be using it. So adding new apis just for firmware does not seem like a wise decision at this point in time. thanks, greg k-h ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-29 13:35 ` Greg KH @ 2017-06-29 13:46 ` Thomas Gleixner 2017-06-29 16:13 ` Linus Torvalds 0 siblings, 1 reply; 38+ messages in thread From: Thomas Gleixner @ 2017-06-29 13:46 UTC (permalink / raw) To: Greg KH Cc: Luis R. Rodriguez, mfuzzey, ebiederm, dmitry.torokhov, wagi, dwmw2, jewalt, rafal, arend.vanspriel, rjw, yi1.li, atull, moritz.fischer, pmladek, johannes.berg, emmanuel.grumbach, luciano.coelho, kvalo, luto, torvalds, keescook, takahiro.akashi, dhowells, pjones, hdegoede, alan, tytso, mtk.manpages, paul.gortmaker, mtosatti, mawilcox, linux-api, linux-fsdevel On Thu, 29 Jun 2017, Greg KH wrote: > On Thu, Jun 29, 2017 at 03:05:26PM +0200, Thomas Gleixner wrote: > > On Thu, 29 Jun 2017, Greg KH wrote: > > > On Wed, Jun 14, 2017 at 03:20:15PM -0700, Luis R. Rodriguez wrote: > > > Do you really still want to add these, now that we know we shouldn't be > > > using swait in "real" code? :) > > > > And who defined that it should not be used in real code? > > Linus did, in a different firmware thread. You have to _really_ know > what you are doing to use this interface, and the firmware interface > shouldn't be using it. So adding new apis just for firmware does not > seem like a wise decision at this point in time. So it's not about code in general, it's about a particular piece of code. Fair enough. Thanks, tglx ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-29 13:46 ` Thomas Gleixner @ 2017-06-29 16:13 ` Linus Torvalds 2017-06-29 16:31 ` Matthew Wilcox ` (2 more replies) 0 siblings, 3 replies; 38+ messages in thread From: Linus Torvalds @ 2017-06-29 16:13 UTC (permalink / raw) To: Thomas Gleixner Cc: Greg KH, Luis R. Rodriguez, mfuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo, Andrew Lutomirski, Kees Cook On Thu, Jun 29, 2017 at 6:46 AM, Thomas Gleixner <tglx@linutronix.de> wrote: > On Thu, 29 Jun 2017, Greg KH wrote: >> On Thu, Jun 29, 2017 at 03:05:26PM +0200, Thomas Gleixner wrote: >> > >> > And who defined that it should not be used in real code? >> >> Linus did, in a different firmware thread. You have to _really_ know >> what you are doing to use this interface, and the firmware interface >> shouldn't be using it. So adding new apis just for firmware does not >> seem like a wise decision at this point in time. > > So it's not about code in general, it's about a particular piece of > code. Fair enough. Well, I'd actually say it the other way around: swait should not be used in general, only in _very_ particular pieces of code that actually explicitly need the odd swait semantics. swait uses special locking and has odd semantics that are not at all the same as the default wait queue ones. It should not be used without very strong reasons (and honestly, the only strong enough reason seems to be "RT"). The special locking means that swait doesn't do DEBUG_LOCK_ALLOC, but it also means that it doesn't even work in all contexts. So "swake_up()" does surprising things (only wake up one - that's what caused a firmware loading bug), and "swake_up_all()" has magic rules about interrupt disabling. The thing is simply a collection of small hacks and should NOT be used in general. I never want to see a driver use that code, for example. It was designed for RCU and RT, and it should damn well be limited to that. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* RE: [PATCH 2/4] swait: add the missing killable swaits 2017-06-29 16:13 ` Linus Torvalds @ 2017-06-29 16:31 ` Matthew Wilcox 2017-06-29 17:29 ` Luis R. Rodriguez 2017-06-29 17:40 ` Davidlohr Bueso 2017-06-29 19:15 ` Marcelo Tosatti 2 siblings, 1 reply; 38+ messages in thread From: Matthew Wilcox @ 2017-06-29 16:31 UTC (permalink / raw) To: Linus Torvalds, Thomas Gleixner Cc: Greg KH, Luis R. Rodriguez, mfuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano Linus Torvalds wrote: > The thing is simply a collection of small hacks and should NOT be used > in general. > > I never want to see a driver use that code, for example. It was > designed for RCU and RT, and it should damn well be limited to that. Maybe put #ifndef MODULE around the entire include file then? And definitely remove this line: * One would recommend using this wait queue where possible. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-29 16:31 ` Matthew Wilcox @ 2017-06-29 17:29 ` Luis R. Rodriguez 0 siblings, 0 replies; 38+ messages in thread From: Luis R. Rodriguez @ 2017-06-29 17:29 UTC (permalink / raw) To: Matthew Wilcox Cc: Linus Torvalds, Thomas Gleixner, Greg KH, Luis R. Rodriguez, mfuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emman On Thu, Jun 29, 2017 at 04:31:08PM +0000, Matthew Wilcox wrote: > Linus Torvalds wrote: > > The thing is simply a collection of small hacks and should NOT be used > > in general. > > > > I never want to see a driver use that code, for example. It was > > designed for RCU and RT, and it should damn well be limited to that. > > Maybe put #ifndef MODULE around the entire include file then? > And definitely remove this line: > > * One would recommend using this wait queue where possible. I'll respin and port away from swait, and add the #ifndef MODULE and special note on swait.h. Luis ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-29 16:13 ` Linus Torvalds 2017-06-29 16:31 ` Matthew Wilcox @ 2017-06-29 17:40 ` Davidlohr Bueso 2017-06-29 17:57 ` Linus Torvalds 2017-06-29 19:15 ` Marcelo Tosatti 2 siblings, 1 reply; 38+ messages in thread From: Davidlohr Bueso @ 2017-06-29 17:40 UTC (permalink / raw) To: Linus Torvalds Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, mfuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo, Andrew Lutomirski On Thu, 29 Jun 2017, Linus Torvalds wrote: >Well, I'd actually say it the other way around: swait should not be >used in general, only in _very_ particular pieces of code that >actually explicitly need the odd swait semantics. > >swait uses special locking and has odd semantics that are not at all >the same as the default wait queue ones. It should not be used without >very strong reasons (and honestly, the only strong enough reason seems >to be "RT"). > >The special locking means that swait doesn't do DEBUG_LOCK_ALLOC, but >it also means that it doesn't even work in all contexts. > >So "swake_up()" does surprising things (only wake up one - that's what >caused a firmware loading bug), and "swake_up_all()" has magic rules >about interrupt disabling. > >The thing is simply a collection of small hacks and should NOT be used >in general. For all the above, what do you think of my 'sswait' proposal? Thanks, Davidlohr ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-29 17:40 ` Davidlohr Bueso @ 2017-06-29 17:57 ` Linus Torvalds 2017-06-29 18:33 ` Davidlohr Bueso 0 siblings, 1 reply; 38+ messages in thread From: Linus Torvalds @ 2017-06-29 17:57 UTC (permalink / raw) To: Davidlohr Bueso Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, mfuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo, Andrew Lutomirski On Thu, Jun 29, 2017 at 10:40 AM, Davidlohr Bueso <dave@stgolabs.net> wrote: > > For all the above, what do you think of my 'sswait' proposal? I see no actual users of such a specialty interface. I don't think we've *ever* had any actual problems with our current wait-queues, apart from the RT issues, which were not about the waitqueues themselves, but purely about RT itself. So without some very compelling reason, I'd not want to add yet another wait-queue. I actually think swait is pure garbage. Most users only wake up one process anyway, and using swait for that is stupid. If you only wake up one, you might as well just have a single process pointer, not a wait list at all, and then use "wake_up_process()". There is *one* single user of swake_up_all(), and that one looks like bogus crap also: it does it outside of the spinlock that could have been used to protect the queue - p,lus I'm not sure there's really a queue anyway, since I think it's just the grace-period kthread that is there. kvm uses swait, but doesn't use swake_up_all(), so it always just wakes up a single waiter. That may be the right thing to do. Or it's just another bug. I don't know. The KVM use looks broken too, since it does if (swait_active(wqp)) { swake_up(wqp); which is racy wrt new waiters, which implies that there is some upper-level synchronization - possibly the same "only one thread anyway". So swake really looks like crap. It has crap semantics, it has crap users, it's just broken. The last thing we want to do is to create something _else_ specialized like this. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-29 17:57 ` Linus Torvalds @ 2017-06-29 18:33 ` Davidlohr Bueso [not found] ` <20170629183339.GD3954-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org> 2017-07-05 2:06 ` Davidlohr Bueso 0 siblings, 2 replies; 38+ messages in thread From: Davidlohr Bueso @ 2017-06-29 18:33 UTC (permalink / raw) To: Linus Torvalds Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, mfuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo, Andrew Lutomirski On Thu, 29 Jun 2017, Linus Torvalds wrote: >So without some very compelling reason, I'd not want to add yet >another wait-queue. Yes, I was expecting this and very much agree. I'll actually take a look at wake_q for wake_up_all() and co. to see if we can reduce the spinlock hold times. Of course it would only make sense for more than a one wakeup. >I actually think swait is pure garbage. Most users only wake up one >process anyway, and using swait for that is stupid. If you only wake >up one, you might as well just have a single process pointer, not a >wait list at all, and then use "wake_up_process()". But you still need the notion of a queue, even if you wake one task at a time... I'm probably missing your point here. >There is *one* single user of swake_up_all(), and that one looks like >bogus crap also: it does it outside of the spinlock that could have >been used to protect the queue - p,lus I'm not sure there's really a >queue anyway, since I think it's just the grace-period kthread that is >there. So those cases when there's only one waiter I completely agree should not be using waitqueues. pcpu-rwsems in the past suffered from this. Thanks, Davidlohr ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <20170629183339.GD3954-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org>]
* Re: [PATCH 2/4] swait: add the missing killable swaits [not found] ` <20170629183339.GD3954-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org> @ 2017-06-29 18:59 ` Linus Torvalds [not found] ` <CA+55aFz8Mhx+A-g-5yOG-O1ZLRUR_fpeeA4iBNGH8EnDBZEdpA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 38+ messages in thread From: Linus Torvalds @ 2017-06-29 18:59 UTC (permalink / raw) To: Davidlohr Bueso Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8, rafal-g1n6cQUeyibVItvQsEIGlw, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull-DgEjT+Ai2ygdnm+yROfE0A, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo, Andrew Lutomirski On Thu, Jun 29, 2017 at 11:33 AM, Davidlohr Bueso <dave-h16yJtLeMjHk1uMJSBkQmQ@public.gmane.org> wrote: > On Thu, 29 Jun 2017, Linus Torvalds wrote: > >> I actually think swait is pure garbage. Most users only wake up one >> process anyway, and using swait for that is stupid. If you only wake >> up one, you might as well just have a single process pointer, not a >> wait list at all, and then use "wake_up_process()". > > But you still need the notion of a queue, even if you wake one task > at a time... I'm probably missing your point here. The *reason* they wake up only one seems to be that there really is just one. It's some per-cpu idle thread for kvm, and for RCU it's the RCU workqueue thread. So the queue literally looks suspiciously pointless. But I might be wrong, and there can actually be multiple entries. If there are, I don't see why the wake-up-one semantics the code uses would be valid, though. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <CA+55aFz8Mhx+A-g-5yOG-O1ZLRUR_fpeeA4iBNGH8EnDBZEdpA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 2/4] swait: add the missing killable swaits [not found] ` <CA+55aFz8Mhx+A-g-5yOG-O1ZLRUR_fpeeA4iBNGH8EnDBZEdpA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2017-06-29 19:40 ` Luis R. Rodriguez 2017-06-29 19:44 ` Luis R. Rodriguez [not found] ` <20170629194015.GQ21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org> 0 siblings, 2 replies; 38+ messages in thread From: Luis R. Rodriguez @ 2017-06-29 19:40 UTC (permalink / raw) To: Linus Torvalds Cc: Davidlohr Bueso, Thomas Gleixner, Greg KH, Luis R. Rodriguez, mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8, rafal-g1n6cQUeyibVItvQsEIGlw, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull-DgEjT+Ai2ygdnm+yROfE0A, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo On Thu, Jun 29, 2017 at 11:59:29AM -0700, Linus Torvalds wrote: > On Thu, Jun 29, 2017 at 11:33 AM, Davidlohr Bueso <dave-h16yJtLeMjHk1uMJSBkQmQ@public.gmane.org> wrote: > > On Thu, 29 Jun 2017, Linus Torvalds wrote: > > > >> I actually think swait is pure garbage. Most users only wake up one > >> process anyway, and using swait for that is stupid. If you only wake > >> up one, you might as well just have a single process pointer, not a > >> wait list at all, and then use "wake_up_process()". > > > > But you still need the notion of a queue, even if you wake one task > > at a time... I'm probably missing your point here. > > The *reason* they wake up only one seems to be that there really is > just one. It's some per-cpu idle thread for kvm, and for RCU it's the > RCU workqueue thread. > > So the queue literally looks suspiciously pointless. > > But I might be wrong, and there can actually be multiple entries. Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up issue it would seem this does queue [0]. That said, I don't see any simple tests tools/testing/selftests/swait but then again we don't have test for regular waits either... [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477 > If there are, I don't see why the wake-up-one semantics the code uses > would be valid, though. Not sure what's wrong with it? I believe one use case for example is for when we know that waker alone would be able to ensure the next item in queue will also be woken up. Such was the case for the kmod.c conversion I tested, and behold it seemed to have wored with just swake_up(). Its obviously *fragile* though given you *assume* error cases also wake up. In the case of kmod.c we have no such error cases but in firmware_class.c we *do*, and actually that is part of the next set of fixes I have to address next, but that issue would be present even if we move to wait for completion and complete_all() is used. Luis ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-29 19:40 ` Luis R. Rodriguez @ 2017-06-29 19:44 ` Luis R. Rodriguez [not found] ` <20170629194455.GR21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org> [not found] ` <20170629194015.GQ21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org> 1 sibling, 1 reply; 38+ messages in thread From: Luis R. Rodriguez @ 2017-06-29 19:44 UTC (permalink / raw) To: nbroeking, Jakub Kicinski, Linus Torvalds Cc: Davidlohr Bueso, Thomas Gleixner, Greg KH, mfuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, mcgrof, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo, Andrew On Thu, Jun 29, 2017 at 09:40:15PM +0200, Luis R. Rodriguez wrote: > On Thu, Jun 29, 2017 at 11:59:29AM -0700, Linus Torvalds wrote: > > On Thu, Jun 29, 2017 at 11:33 AM, Davidlohr Bueso <dave@stgolabs.net> wrote: > > > On Thu, 29 Jun 2017, Linus Torvalds wrote: > > > > > >> I actually think swait is pure garbage. Most users only wake up one > > >> process anyway, and using swait for that is stupid. If you only wake > > >> up one, you might as well just have a single process pointer, not a > > >> wait list at all, and then use "wake_up_process()". > > > > > > But you still need the notion of a queue, even if you wake one task > > > at a time... I'm probably missing your point here. > > > > The *reason* they wake up only one seems to be that there really is > > just one. It's some per-cpu idle thread for kvm, and for RCU it's the > > RCU workqueue thread. > > > > So the queue literally looks suspiciously pointless. > > > > But I might be wrong, and there can actually be multiple entries. > > Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up > issue it would seem this does queue [0]. That said, I don't see any simple tests > tools/testing/selftests/swait but then again we don't have test for regular > waits either... > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477 I should also note that the swake_up_all() should have only helped in cases where 3 cards were used, as if only 2 were used that should have been covered by just the swake_up(). Unless of course I hear otherwise by the reporter, Nicolas or from Jakub. Luis ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <20170629194455.GR21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>]
* Re: [PATCH 2/4] swait: add the missing killable swaits [not found] ` <20170629194455.GR21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org> @ 2017-06-29 20:58 ` Jakub Kicinski 2017-06-29 22:50 ` Luis R. Rodriguez 0 siblings, 1 reply; 38+ messages in thread From: Jakub Kicinski @ 2017-06-29 20:58 UTC (permalink / raw) To: Luis R. Rodriguez Cc: nbroeking-BUHhN+a2lJ4, Linus Torvalds, Davidlohr Bueso, Thomas Gleixner, Greg KH, mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8, rafal-g1n6cQUeyibVItvQsEIGlw, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull-DgEjT+Ai2ygdnm+yROfE0A, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano On Thu, 29 Jun 2017 21:44:55 +0200, Luis R. Rodriguez wrote: > On Thu, Jun 29, 2017 at 09:40:15PM +0200, Luis R. Rodriguez wrote: > > On Thu, Jun 29, 2017 at 11:59:29AM -0700, Linus Torvalds wrote: > > > On Thu, Jun 29, 2017 at 11:33 AM, Davidlohr Bueso <dave-h16yJtLeMjHk1uMJSBkQmQ@public.gmane.org> wrote: > > > > On Thu, 29 Jun 2017, Linus Torvalds wrote: > > > > > > > >> I actually think swait is pure garbage. Most users only wake up one > > > >> process anyway, and using swait for that is stupid. If you only wake > > > >> up one, you might as well just have a single process pointer, not a > > > >> wait list at all, and then use "wake_up_process()". > > > > > > > > But you still need the notion of a queue, even if you wake one task > > > > at a time... I'm probably missing your point here. > > > > > > The *reason* they wake up only one seems to be that there really is > > > just one. It's some per-cpu idle thread for kvm, and for RCU it's the > > > RCU workqueue thread. > > > > > > So the queue literally looks suspiciously pointless. > > > > > > But I might be wrong, and there can actually be multiple entries. > > > > Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up > > issue it would seem this does queue [0]. That said, I don't see any simple tests > > tools/testing/selftests/swait but then again we don't have test for regular > > waits either... > > > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477 > > I should also note that the swake_up_all() should have only helped in cases where > 3 cards were used, as if only 2 were used that should have been covered by just > the swake_up(). Unless of course I hear otherwise by the reporter, Nicolas or > from Jakub. I was hitting this with 2 cards. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-29 20:58 ` Jakub Kicinski @ 2017-06-29 22:50 ` Luis R. Rodriguez 2017-06-29 22:53 ` Jakub Kicinski [not found] ` <20170629225003.GU21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org> 0 siblings, 2 replies; 38+ messages in thread From: Luis R. Rodriguez @ 2017-06-29 22:50 UTC (permalink / raw) To: Jakub Kicinski Cc: Luis R. Rodriguez, nbroeking, Linus Torvalds, Davidlohr Bueso, Thomas Gleixner, Greg KH, mfuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coel On Thu, Jun 29, 2017 at 01:58:22PM -0700, Jakub Kicinski wrote: > On Thu, 29 Jun 2017 21:44:55 +0200, Luis R. Rodriguez wrote: > > > Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up > > > issue it would seem this does queue [0]. That said, I don't see any simple tests > > > tools/testing/selftests/swait but then again we don't have test for regular > > > waits either... > > > > > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477 > > > > I should also note that the swake_up_all() should have only helped in cases where > > 3 cards were used, as if only 2 were used that should have been covered by just > > the swake_up(). Unless of course I hear otherwise by the reporter, Nicolas or > > from Jakub. > > I was hitting this with 2 cards. Thanks! Thing is I'm not convinced the issue with 2 cards was the swake_up() Vs swake_up_all() in this case though. Let's recall also the missing wake up on errors! And the fact that netronome has optional firmware, which naturally can fail. So could the issue with 2 cards instead of the miss of a wake up on error due to batched requests ? If so then that still would not put blame on the swake_up()! We can find out by you testing the series I just posted [0] and if that did not fix the issue then try this patch, which I do expect to actually have fixed most issues considering optional firmware. ie, I expect the combination of both to fix your issues, not just the last series I just posted [0]. If you want this in git form you can find all of the patches bundled on the 20170629-fw-fixes-wait-v4 branch [1]. I just wrote this patch it but it seems to have not broken the tests: $ sudo tools/testing/selftests/firmware/fw_fallback.sh tools/testing/selftests/firmware/fw_fallback.sh: timeout works tools/testing/selftests/firmware/fw_fallback.sh: firmware comparison works tools/testing/selftests/firmware/fw_fallback.sh: fallback mechanism works tools/testing/selftests/firmware/fw_fallback.sh: cancelling fallback mechanism works tools/testing/selftests/firmware/fw_fallback.sh: custom fallback loading mechanism works tools/testing/selftests/firmware/fw_fallback.sh: cancelling custom fallback mechanism works tools/testing/selftests/firmware/fw_fallback.sh: SIGCHLD on sync ignored as expected $ sudo tools/testing/selftests/firmware/fw_filesystem.sh tools/testing/selftests/firmware/fw_filesystem.sh: timeout works tools/testing/selftests/firmware/fw_filesystem.sh: filesystem loading works tools/testing/selftests/firmware/fw_filesystem.sh: async filesystem loading works [0] https://lkml.kernel.org/r/20170629205151.5329-1-mcgrof@kernel.org [1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170629-fw-fixes-wait-v4 Luis >From cb7fee12c6d539405793e883dfd79e0b21c2baad Mon Sep 17 00:00:00 2001 From: "Luis R. Rodriguez" <mcgrof@kernel.org> Date: Thu, 29 Jun 2017 15:19:04 -0700 Subject: [RFT] firmware: send wake up on failure for batched requests Fix batched requests from waiting forever on failure. The firmware API supports "batched requests" which means requests with the same name share the same lookup effort. They wait for the first request to complete, however they are set to always wait for what seem to be forever (MAX_SCHEDULE_TIMEOUT). We currently handle informing waited batched requests on success but we never seem to have sent smoke signals of any kind on failure! This should mean secondary requests batched in seem to just wait forever when the request fails. For device drivers with optional firmware schemes (Intel, or Netronome), this could mean that when you boot a system with multiple cards the firmware will seem to never load on the system, or that the card is just not responsive even the driver initialized. Due to differences in scheduling possible this should not always trigger, so triggering batched requests actually needs to be triggered for this to be an issue. Its reported that at least with the Intel WiFi cards on one system this issue was creeping up 50% of the boots [0]. [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477 Reported-by: Nicolas <nbroeking@me.com> Reported-by: John Ewalt <jewalt@lgsinnovations.com> Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> --- drivers/base/firmware_class.c | 40 ++++++++++++++++++++++++++++++++-------- 1 file changed, 32 insertions(+), 8 deletions(-) diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c index d3d071dbd2a5..40d1351660c0 100644 --- a/drivers/base/firmware_class.c +++ b/drivers/base/firmware_class.c @@ -152,28 +152,27 @@ static void __fw_state_set(struct fw_state *fw_st, __fw_state_set(fw_st, FW_STATUS_LOADING) #define fw_state_done(fw_st) \ __fw_state_set(fw_st, FW_STATUS_DONE) +#define fw_state_aborted(fw_st) \ + __fw_state_set(fw_st, FW_STATUS_ABORTED) #define fw_state_wait(fw_st) \ __fw_state_wait_common(fw_st, MAX_SCHEDULE_TIMEOUT) -#ifndef CONFIG_FW_LOADER_USER_HELPER - -#define fw_state_is_aborted(fw_st) false - -#else /* CONFIG_FW_LOADER_USER_HELPER */ - static int __fw_state_check(struct fw_state *fw_st, enum fw_status status) { return fw_st->status == status; } +#define fw_state_is_aborted(fw_st) \ + __fw_state_check(fw_st, FW_STATUS_ABORTED) + +#ifdef CONFIG_FW_LOADER_USER_HELPER + #define fw_state_aborted(fw_st) \ __fw_state_set(fw_st, FW_STATUS_ABORTED) #define fw_state_is_done(fw_st) \ __fw_state_check(fw_st, FW_STATUS_DONE) #define fw_state_is_loading(fw_st) \ __fw_state_check(fw_st, FW_STATUS_LOADING) -#define fw_state_is_aborted(fw_st) \ - __fw_state_check(fw_st, FW_STATUS_ABORTED) #define fw_state_wait_timeout(fw_st, timeout) \ __fw_state_wait_common(fw_st, timeout) @@ -332,6 +331,7 @@ static struct firmware_buf *__fw_lookup_buf(const char *fw_name) return NULL; } +/* Returns 1 for batching firmware requests with the same name */ static int fw_lookup_and_allocate_buf(const char *fw_name, struct firmware_cache *fwc, struct firmware_buf **buf, void *dbuf, @@ -345,6 +345,7 @@ static int fw_lookup_and_allocate_buf(const char *fw_name, kref_get(&tmp->ref); spin_unlock(&fwc->lock); *buf = tmp; + /* requests are batched ! */ return 1; } tmp = __allocate_fw_buf(fw_name, fwc, dbuf, size); @@ -1200,6 +1201,28 @@ _request_firmware_prepare(struct firmware **firmware_p, const char *name, return 1; /* need to load */ } +/* + * Batched requests need only one wake, we need to do this step last due to the + * fallback mechanism. The buf is protected with kref_get(), and it won't be + * released until the last user calls release_firmware(). + * + * Failed batched requests are possible as well, in such cases we just share + * the struct firmware_buf and won't release it until all requests are woken + * and have gone through this same path. + */ +static void fw_abort_batch_reqs(struct firmware *fw) +{ + struct firmware_buf *buf; + + /* Loaded directly? */ + if (!fw || !fw->priv) + return; + + buf = fw->priv; + if (!fw_state_is_aborted(&buf->fw_st)) + fw_state_aborted(&buf->fw_st); +} + /* called from request_firmware() and request_firmware_work_func() */ static int _request_firmware(const struct firmware **firmware_p, const char *name, @@ -1243,6 +1266,7 @@ _request_firmware(const struct firmware **firmware_p, const char *name, out: if (ret < 0) { + fw_abort_batch_reqs(fw); release_firmware(fw); fw = NULL; } -- 2.11.0 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-29 22:50 ` Luis R. Rodriguez @ 2017-06-29 22:53 ` Jakub Kicinski 2017-06-29 23:00 ` Luis R. Rodriguez [not found] ` <20170629225003.GU21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org> 1 sibling, 1 reply; 38+ messages in thread From: Jakub Kicinski @ 2017-06-29 22:53 UTC (permalink / raw) To: Luis R. Rodriguez Cc: nbroeking, Linus Torvalds, Davidlohr Bueso, Thomas Gleixner, Greg KH, mfuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano On Fri, 30 Jun 2017 00:50:03 +0200, Luis R. Rodriguez wrote: > On Thu, Jun 29, 2017 at 01:58:22PM -0700, Jakub Kicinski wrote: > > On Thu, 29 Jun 2017 21:44:55 +0200, Luis R. Rodriguez wrote: > > > > Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up > > > > issue it would seem this does queue [0]. That said, I don't see any simple tests > > > > tools/testing/selftests/swait but then again we don't have test for regular > > > > waits either... > > > > > > > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477 > > > > > > I should also note that the swake_up_all() should have only helped in cases where > > > 3 cards were used, as if only 2 were used that should have been covered by just > > > the swake_up(). Unless of course I hear otherwise by the reporter, Nicolas or > > > from Jakub. > > > > I was hitting this with 2 cards. > > Thanks! > > Thing is I'm not convinced the issue with 2 cards was the swake_up() Vs > swake_up_all() in this case though. Let's recall also the missing wake up on > errors! And the fact that netronome has optional firmware, which naturally can > fail. > > So could the issue with 2 cards instead of the miss of a wake up on error due > to batched requests ? If so then that still would not put blame on the > swake_up()! Sorry, that was during manual tests. I had the driver request the firmware with _nowait() without an uevent, and then after I manually wrote -1 to loading only one would get woken up. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-29 22:53 ` Jakub Kicinski @ 2017-06-29 23:00 ` Luis R. Rodriguez 2017-06-29 23:06 ` Jakub Kicinski 0 siblings, 1 reply; 38+ messages in thread From: Luis R. Rodriguez @ 2017-06-29 23:00 UTC (permalink / raw) To: Jakub Kicinski Cc: Nicolas Broeking, Linus Torvalds, Davidlohr Bueso, Thomas Gleixner, Greg KH, Martin Fuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Co On Thu, Jun 29, 2017 at 3:53 PM, Jakub Kicinski <jakub.kicinski@netronome.com> wrote: > On Fri, 30 Jun 2017 00:50:03 +0200, Luis R. Rodriguez wrote: >> On Thu, Jun 29, 2017 at 01:58:22PM -0700, Jakub Kicinski wrote: >> > On Thu, 29 Jun 2017 21:44:55 +0200, Luis R. Rodriguez wrote: >> > > > Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up >> > > > issue it would seem this does queue [0]. That said, I don't see any simple tests >> > > > tools/testing/selftests/swait but then again we don't have test for regular >> > > > waits either... >> > > > >> > > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477 >> > > >> > > I should also note that the swake_up_all() should have only helped in cases where >> > > 3 cards were used, as if only 2 were used that should have been covered by just >> > > the swake_up(). Unless of course I hear otherwise by the reporter, Nicolas or >> > > from Jakub. >> > >> > I was hitting this with 2 cards. >> >> Thanks! >> >> Thing is I'm not convinced the issue with 2 cards was the swake_up() Vs >> swake_up_all() in this case though. Let's recall also the missing wake up on >> errors! And the fact that netronome has optional firmware, which naturally can >> fail. >> >> So could the issue with 2 cards instead of the miss of a wake up on error due >> to batched requests ? If so then that still would not put blame on the >> swake_up()! > > Sorry, that was during manual tests. I had the driver request the > firmware with _nowait() without an uevent, Can you provide the output of grep CONFIG_FW_LOADER_USER_HELPER .config I'd like to see if CONFIG_FW_LOADER_USER_HELPER_FALLBACK was disabled. Not to be confused with the CONFIG_FW_LOADER_USER_HELPER. When the uevent is not set its also known as "a custom fallback mechanism" and by default that has set a timeout of MAX_SCHEDULE_TIMEOUT, which means that even if the direct filesystem lookup failed the fallback mechanism would be used and just sit and wait for what seems to be forever until user input was provided. > and then after I manually > wrote -1 to loading only one would get woken up. Great, this helps, so for those not familiar with the firmware sysfs fallback mechanism: The relevant values one could use are: 1: Start a load, discarding any previous partial load. 0: Conclude the load and hand the data to the driver code. -1: Conclude the load with an error and discard any written data. So you are triggering a failure. And indeed I expect the patch I just provided to be the one to fix your issue with 2 cards. Luis ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-29 23:00 ` Luis R. Rodriguez @ 2017-06-29 23:06 ` Jakub Kicinski 0 siblings, 0 replies; 38+ messages in thread From: Jakub Kicinski @ 2017-06-29 23:06 UTC (permalink / raw) To: Luis R. Rodriguez Cc: Nicolas Broeking, Linus Torvalds, Davidlohr Bueso, Thomas Gleixner, Greg KH, Martin Fuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Co On Thu, 29 Jun 2017 16:00:32 -0700, Luis R. Rodriguez wrote: > On Thu, Jun 29, 2017 at 3:53 PM, Jakub Kicinski > <jakub.kicinski@netronome.com> wrote: > > On Fri, 30 Jun 2017 00:50:03 +0200, Luis R. Rodriguez wrote: > >> On Thu, Jun 29, 2017 at 01:58:22PM -0700, Jakub Kicinski wrote: > >> > On Thu, 29 Jun 2017 21:44:55 +0200, Luis R. Rodriguez wrote: > >> > > > Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up > >> > > > issue it would seem this does queue [0]. That said, I don't see any simple tests > >> > > > tools/testing/selftests/swait but then again we don't have test for regular > >> > > > waits either... > >> > > > > >> > > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477 > >> > > > >> > > I should also note that the swake_up_all() should have only helped in cases where > >> > > 3 cards were used, as if only 2 were used that should have been covered by just > >> > > the swake_up(). Unless of course I hear otherwise by the reporter, Nicolas or > >> > > from Jakub. > >> > > >> > I was hitting this with 2 cards. > >> > >> Thanks! > >> > >> Thing is I'm not convinced the issue with 2 cards was the swake_up() Vs > >> swake_up_all() in this case though. Let's recall also the missing wake up on > >> errors! And the fact that netronome has optional firmware, which naturally can > >> fail. > >> > >> So could the issue with 2 cards instead of the miss of a wake up on error due > >> to batched requests ? If so then that still would not put blame on the > >> swake_up()! > > > > Sorry, that was during manual tests. I had the driver request the > > firmware with _nowait() without an uevent, > > Can you provide the output of > > grep CONFIG_FW_LOADER_USER_HELPER .config > > I'd like to see if CONFIG_FW_LOADER_USER_HELPER_FALLBACK was disabled. > Not to be confused with the CONFIG_FW_LOADER_USER_HELPER. > > When the uevent is not set its also known as "a custom fallback > mechanism" and by default that has set a timeout of > MAX_SCHEDULE_TIMEOUT, which means that even if the direct filesystem > lookup failed the fallback mechanism would be used and just sit and > wait for what seems to be forever until user input was provided. FWIW CONFIG_FW_LOADER=y CONFIG_FW_LOADER_USER_HELPER=y CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y # CONFIG_FW_CFG_SYSFS is not set > > and then after I manually > > wrote -1 to loading only one would get woken up. > > Great, this helps, so for those not familiar with the firmware sysfs > fallback mechanism: > > The relevant values one could use are: > > 1: Start a load, discarding any previous partial load. > 0: Conclude the load and hand the data to the driver code. > -1: Conclude the load with an error and discard any written data. > > So you are triggering a failure. And indeed I expect the patch I just > provided to be the one to fix your issue with 2 cards. Cool, thanks. ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <20170629225003.GU21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>]
* Re: [PATCH 2/4] swait: add the missing killable swaits [not found] ` <20170629225003.GU21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org> @ 2017-07-12 21:33 ` Luis R. Rodriguez 0 siblings, 0 replies; 38+ messages in thread From: Luis R. Rodriguez @ 2017-07-12 21:33 UTC (permalink / raw) To: Luis R. Rodriguez Cc: Jakub Kicinski, nbroeking-BUHhN+a2lJ4, Linus Torvalds, Davidlohr Bueso, Thomas Gleixner, Greg KH, mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8, rafal-g1n6cQUeyibVItvQsEIGlw, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull-DgEjT+Ai2ygdnm+yROfE0A, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach On Fri, Jun 30, 2017 at 12:50:03AM +0200, Luis R. Rodriguez wrote: > ie, I expect the combination of both to fix your issues, not just the last > series I just posted [0]. If you want this in git form you can find all of > the patches bundled on the 20170629-fw-fixes-wait-v4 branch [1]. I just > wrote this patch it but it seems to have not broken the tests > > From cb7fee12c6d539405793e883dfd79e0b21c2baad Mon Sep 17 00:00:00 2001 > From: "Luis R. Rodriguez" <mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Date: Thu, 29 Jun 2017 15:19:04 -0700 > Subject: [RFT] firmware: send wake up on failure for batched requests > > Fix batched requests from waiting forever on failure. > > The firmware API supports "batched requests" which means requests with > the same name share the same lookup effort. They wait for the first > request to complete, however they are set to always wait for what seem > to be forever (MAX_SCHEDULE_TIMEOUT). > > We currently handle informing waited batched requests on success but we > never seem to have sent smoke signals of any kind on failure! This > should mean secondary requests batched in seem to just wait forever when > the request fails. > > For device drivers with optional firmware schemes (Intel, or Netronome), > this could mean that when you boot a system with multiple cards the > firmware will seem to never load on the system, or that the card is just > not responsive even the driver initialized. Due to differences in scheduling > possible this should not always trigger, so triggering batched requests > actually needs to be triggered for this to be an issue. > > Its reported that at least with the Intel WiFi cards on one system this > issue was creeping up 50% of the boots [0]. > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477 > > Reported-by: Nicolas <nbroeking-BUHhN+a2lJ4@public.gmane.org> > Reported-by: John Ewalt <jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8@public.gmane.org> > Reported-by: Jakub Kicinski <jakub.kicinski-wFxRvT7yatFl57MIdRCFDg@public.gmane.org> > Signed-off-by: Luis R. Rodriguez <mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > --- FWIW I wrote a test case for this and indeed as I expected, it fixed the last remaining issue I was aware of with using multiple cards and the firmware API. Determining the first affected kernel was rather hard, but it would seem to be that this became an issue once we started supporting making the fallback mechanism optional via commit bba3a87e982ad5 ("firmware: Introduce request_firmware_direct()", merged via v3.14. Will follow up with patches. Luis ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <20170629194015.GQ21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>]
* Re: [PATCH 2/4] swait: add the missing killable swaits [not found] ` <20170629194015.GQ21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org> @ 2017-06-29 20:57 ` Linus Torvalds 0 siblings, 0 replies; 38+ messages in thread From: Linus Torvalds @ 2017-06-29 20:57 UTC (permalink / raw) To: Luis R. Rodriguez Cc: Davidlohr Bueso, Thomas Gleixner, Greg KH, mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8, rafal-g1n6cQUeyibVItvQsEIGlw, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull-DgEjT+Ai2ygdnm+yROfE0A, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo, Andrew Lutomirski On Thu, Jun 29, 2017 at 12:40 PM, Luis R. Rodriguez <mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > On Thu, Jun 29, 2017 at 11:59:29AM -0700, Linus Torvalds wrote: >> > at a time... I'm probably missing your point here. >> >> The *reason* they wake up only one seems to be that there really is >> just one. It's some per-cpu idle thread for kvm, and for RCU it's the >> RCU workqueue thread. >> >> So the queue literally looks suspiciously pointless. >> >> But I might be wrong, and there can actually be multiple entries. > > Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up > issue it would seem this does queue [0]. I'm not talking about the firmware code. That thing never had an excuse to use swait in the first place. I'm talking about kvm and rcu, which *do* have excuses to use it, but where I argue that swait is _still_ a questionable interface for other reasons. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-29 18:33 ` Davidlohr Bueso [not found] ` <20170629183339.GD3954-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org> @ 2017-07-05 2:06 ` Davidlohr Bueso [not found] ` <20170705020635.GD11168-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org> 1 sibling, 1 reply; 38+ messages in thread From: Davidlohr Bueso @ 2017-07-05 2:06 UTC (permalink / raw) To: Linus Torvalds Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, mfuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo, Andrew Lutomirski On Thu, 29 Jun 2017, Davidlohr Bueso wrote: >I'll actually take a look at wake_q for wake_up_all() and co. to see if >we can reduce the spinlock hold times. Of course it would only make sense >for more than a one wakeup. So here's something that boots and builds a kernel. Any thoughts? Thanks, Davidlohr ----------8<---------------------------------------------- From: Davidlohr Bueso <dave@stgolabs.net> Subject: [PATCH] sched/wait: Perform lockless wake_up_all() The use of wake_qs have proven a solid option for performing wakeups without holding the corresponding lock -- to the extent that many core subsystems use them. We can make use of wake_qs in waitqueue wakeups. There are a few constraints, nonetheless, that limit the usage to _only_ wake_up_all(): (i) We cannot use them in exclusive wakes as wake_qs do not provide a way to acknowledge a successful wakeup vs when the task is already running. Therefore any node skipping cannot be determined. (ii) Lockless wakeups are only under TASK_NORMAL wakeups, and therefore cannot be used by wake_up_all_interruptible(). For minimal overhead, wake_q does not understand queues with mixed states. Similar changes in the past have shown measurable performance improvements and more bounded latencies in benchmarks where threads are contending for the lock. Ie: futex_wake(N) via mark_wait_futex(), or rwsem waiter wakeups. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> --- include/linux/wait.h | 37 +++++++++++++++++++++++++------------ kernel/sched/wait.c | 36 ++++++++++++++++++++++++++++++++++++ 2 files changed, 61 insertions(+), 12 deletions(-) diff --git a/include/linux/wait.h b/include/linux/wait.h index b289c96151ee..9f6075271e52 100644 --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -12,8 +12,10 @@ typedef struct wait_queue_entry wait_queue_entry_t; -typedef int (*wait_queue_func_t)(struct wait_queue_entry *wq_entry, unsigned mode, int flags, void *key); -int default_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int flags, void *key); +typedef int (*wait_queue_func_t)(struct wait_queue_entry *wq_entry, + unsigned mode, int flags, void *key); +int default_wake_function(struct wait_queue_entry *wq_entry, + unsigned mode, int flags, void *key); /* wait_queue_entry::flags */ #define WQ_FLAG_EXCLUSIVE 0x01 @@ -56,7 +58,8 @@ struct task_struct; #define DECLARE_WAIT_QUEUE_HEAD(name) \ struct wait_queue_head name = __WAIT_QUEUE_HEAD_INITIALIZER(name) -extern void __init_waitqueue_head(struct wait_queue_head *wq_head, const char *name, struct lock_class_key *); +extern void __init_waitqueue_head(struct wait_queue_head *wq_head, + const char *name, struct lock_class_key *); #define init_waitqueue_head(wq_head) \ do { \ @@ -158,39 +161,49 @@ static inline void __add_wait_queue(struct wait_queue_head *wq_head, struct wait * Used for wake-one threads: */ static inline void -__add_wait_queue_exclusive(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry) +__add_wait_queue_exclusive(struct wait_queue_head *wq_head, + struct wait_queue_entry *wq_entry) { wq_entry->flags |= WQ_FLAG_EXCLUSIVE; __add_wait_queue(wq_head, wq_entry); } -static inline void __add_wait_queue_entry_tail(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry) +static inline void +__add_wait_queue_entry_tail(struct wait_queue_head *wq_head, + struct wait_queue_entry *wq_entry) { list_add_tail(&wq_entry->entry, &wq_head->head); } static inline void -__add_wait_queue_entry_tail_exclusive(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry) +__add_wait_queue_entry_tail_exclusive(struct wait_queue_head *wq_head, + struct wait_queue_entry *wq_entry) { wq_entry->flags |= WQ_FLAG_EXCLUSIVE; __add_wait_queue_entry_tail(wq_head, wq_entry); } static inline void -__remove_wait_queue(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry) +__remove_wait_queue(struct wait_queue_head *wq_head, + struct wait_queue_entry *wq_entry) { list_del(&wq_entry->entry); } -void __wake_up(struct wait_queue_head *wq_head, unsigned int mode, int nr, void *key); -void __wake_up_locked_key(struct wait_queue_head *wq_head, unsigned int mode, void *key); -void __wake_up_sync_key(struct wait_queue_head *wq_head, unsigned int mode, int nr, void *key); -void __wake_up_locked(struct wait_queue_head *wq_head, unsigned int mode, int nr); +void __wake_up(struct wait_queue_head *wq_head, + unsigned int mode, int nr, void *key); +void __wake_up_all_lockless(struct wait_queue_head *wq_head); +void __wake_up_locked_key(struct wait_queue_head *wq_head, + unsigned int mode, void *key); +void __wake_up_sync_key(struct wait_queue_head *wq_head, + unsigned int mode, int nr, void *key); +void __wake_up_locked(struct wait_queue_head *wq_head, + unsigned int mode, int nr); void __wake_up_sync(struct wait_queue_head *wq_head, unsigned int mode, int nr); #define wake_up(x) __wake_up(x, TASK_NORMAL, 1, NULL) #define wake_up_nr(x, nr) __wake_up(x, TASK_NORMAL, nr, NULL) -#define wake_up_all(x) __wake_up(x, TASK_NORMAL, 0, NULL) +#define wake_up_all(x) __wake_up_all_lockless(x) #define wake_up_locked(x) __wake_up_locked((x), TASK_NORMAL, 1) #define wake_up_all_locked(x) __wake_up_locked((x), TASK_NORMAL, 0) diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c index 17f11c6b0a9f..d029e13529ed 100644 --- a/kernel/sched/wait.c +++ b/kernel/sched/wait.c @@ -5,6 +5,7 @@ */ #include <linux/init.h> #include <linux/export.h> +#include <linux/sched/wake_q.h> #include <linux/sched/signal.h> #include <linux/sched/debug.h> #include <linux/mm.h> @@ -98,6 +99,41 @@ void __wake_up(struct wait_queue_head *wq_head, unsigned int mode, } EXPORT_SYMBOL(__wake_up); +/** + * __wake_up_all_lockless - wake up all threads blocked on a waitqueue, with + * assistance from wake_qs. + * @wq_head: the waitqueue + * + * Using lockless wake_qs allows the wakeups to occur after releasing the + * @wq_head->lock, thus reducing hold times. As such, it only makes sense + * when there is more than a single wakeup to be performed, otherwise it's + * not really worth it. Similarly, this can only be used for TASK_NORMAL + * wakeups, such that we avoid queues with mixed states. + * + * It may be assumed that unlike __wake_up() this function always implies + * a memory barrier before changing the task state due to wake_q_add(), + * regardless of whether or not the task was actually running. + */ +void __wake_up_all_lockless(struct wait_queue_head *wq_head) +{ + unsigned long flags; + wait_queue_entry_t *curr, *next; + DEFINE_WAKE_Q(wake_q); + + spin_lock_irqsave(&wq_head->lock, flags); + /* + * Because we are to awake all tasks, we don't care about + * dealing with WQ_FLAG_EXCLUSIVE cases. + */ + list_for_each_entry_safe(curr, next, &wq_head->head, entry) + wake_q_add(&wake_q, curr->private); + + spin_unlock_irqrestore(&wq_head->lock, flags); + + wake_up_q(&wake_q); +} +EXPORT_SYMBOL(__wake_up_all_lockless); + /* * Same as __wake_up but called with the spinlock in wait_queue_head_t held. */ -- 2.12.0 ^ permalink raw reply related [flat|nested] 38+ messages in thread
[parent not found: <20170705020635.GD11168-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org>]
* Re: [PATCH 2/4] swait: add the missing killable swaits [not found] ` <20170705020635.GD11168-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org> @ 2017-07-07 19:58 ` Linus Torvalds 2017-07-07 22:27 ` Davidlohr Bueso 0 siblings, 1 reply; 38+ messages in thread From: Linus Torvalds @ 2017-07-07 19:58 UTC (permalink / raw) To: Davidlohr Bueso Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, Martin Fuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8, Rafał Miłecki, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull-DgEjT+Ai2ygdnm+yROfE0A, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano On Tue, Jul 4, 2017 at 7:06 PM, Davidlohr Bueso <dave-h16yJtLeMjHk1uMJSBkQmQ@public.gmane.org> wrote: > > So here's something that boots and builds a kernel. Any thoughts? This patch ios just nasty crap. Sorry. It has random whitespace changfes that look entirely unrelated to trhe actual real meat of the patch, and that actually make whitespace *worse*. WHY? That alone should just mean that this patch needs to be thrown away and never ever looked at again. But also, this is fundamentally garbage. Exactly for the same reasons that the swait interfaces were fundamentally broken. It *looks* like it works on regular wait queues, and people can start using it that way, but it actually doesn't have the right semantics at all. The new "lockless" function ONLY works if you don't have a private wakeup function. So it completely changes the semantics of "wake_up_all()". It used to be that "wake_up_all()" did what the name suggested. With this patch, "wake_up_all()" will not cause random kernel oopses or memory corruption if you use any of the more specialized wakeup functions Guys, STOP DOING SHIT LIKE THIS! The swake interface was incredible crap, with bad semantics that confused people, and bad performance too. Don't make the normal wait-queues have similar idiotic problems. So no, this is not only NAK'ed, the whole approach is pure and utter shit and this needs to be buried deep and forgotten about so that it never ever comes back to life. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-07-07 19:58 ` Linus Torvalds @ 2017-07-07 22:27 ` Davidlohr Bueso 2017-07-07 22:48 ` Linus Torvalds 0 siblings, 1 reply; 38+ messages in thread From: Davidlohr Bueso @ 2017-07-07 22:27 UTC (permalink / raw) To: Linus Torvalds Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, Martin Fuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, Rafa?? Mi??ecki, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo On Fri, 07 Jul 2017, Linus Torvalds wrote: >On Tue, Jul 4, 2017 at 7:06 PM, Davidlohr Bueso <dave@stgolabs.net> wrote: >> >> So here's something that boots and builds a kernel. Any thoughts? > >This patch ios just nasty crap. Sorry. > >It has random whitespace changfes that look entirely unrelated to trhe >actual real meat of the patch, and that actually make whitespace >*worse*. Ok sorry, fwiw those were 80-line fixlets I thought were trivial enough to just fly by. > >WHY? > >That alone should just mean that this patch needs to be thrown away >and never ever looked at again. > >But also, this is fundamentally garbage. > >Exactly for the same reasons that the swait interfaces were >fundamentally broken. > >It *looks* like it works on regular wait queues, and people can start >using it that way, but it actually doesn't have the right semantics at >all. > >The new "lockless" function ONLY works if you don't have a private >wakeup function. Oh indeed, this was always my intent. Going back to the patch, when checking DEFINE_WAIT_FUNC I clearly overlooked the ->func() implications, breaking all kinds of semantics. With that and the constraints aforementioned in the patch, I see no sane way of using wake_qs. > >So no, this is not only NAK'ed, the whole approach is pure and utter >shit and this needs to be buried deep and forgotten about so that it >never ever comes back to life. Given that you seem to agree that the lockless version is possible as long as we keep semantics, this imho is another point for some form of simplified waitqueues. But yeah. Thanks, Davidlohr ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-07-07 22:27 ` Davidlohr Bueso @ 2017-07-07 22:48 ` Linus Torvalds 0 siblings, 0 replies; 38+ messages in thread From: Linus Torvalds @ 2017-07-07 22:48 UTC (permalink / raw) To: Davidlohr Bueso Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, Martin Fuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, Rafa?? Mi??ecki, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo On Fri, Jul 7, 2017 at 3:27 PM, Davidlohr Bueso <dave@stgolabs.net> wrote: > > Ok sorry, fwiw those were 80-line fixlets I thought were trivial enough > to just fly by. I find them annoying, because it makes it so much harder to see what the patch actually does. In this case, I think that more than 50% of the patch was just whitespace changes.. > Oh indeed, this was always my intent. Going back to the patch, when > checking DEFINE_WAIT_FUNC I clearly overlooked the ->func() > implications, breaking all kinds of semantics. With that and the > constraints aforementioned in the patch, I see no sane way of using > wake_qs. Well, very few people actually use "wake_up_all()", particularly for any of the things that use special wake functions. So it probably works in practice. And then somebody starts using pollfd or something on one the things that *do* use wake_up_all() and happens to also allow polling (or whatever), and you get nasty crashes. > Given that you seem to agree that the lockless version is possible as > long as we keep semantics, this imho is another point for some form of > simplified waitqueues. We just really haven't had a lot of problems with the waitqueues in my experience. Many of the historical big problems were about the whole "exclusive vs non-exclusive" thundering herd problems, which is actually the most complex thing about them (the callback function adds a pointer to the wait queue, so makes it bigger, but that is very seldom a huge issue). Most of the things that want specific wakeups tend to be some really low-level stuff (ie semaphores etc - both the sysvipc kind of ones and the kernel locking kind of ones). They are often doing their very special own things anyway. And often the regular waitqueues actually work fine, and the biggest thing is to use the lock within the waitqueue for the object that is being waited on too, so that you just avoid the double locking. So you may have hit the one or two cases where the usual wait-queues didn't work well, but in *most* cases they work wonderfully. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-29 16:13 ` Linus Torvalds 2017-06-29 16:31 ` Matthew Wilcox 2017-06-29 17:40 ` Davidlohr Bueso @ 2017-06-29 19:15 ` Marcelo Tosatti [not found] ` <20170629191506.GB12368-I4X2Mt4zSy4@public.gmane.org> 2 siblings, 1 reply; 38+ messages in thread From: Marcelo Tosatti @ 2017-06-29 19:15 UTC (permalink / raw) To: Linus Torvalds, h Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, mfuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo, Andrew Lutomirski On Thu, Jun 29, 2017 at 09:13:29AM -0700, Linus Torvalds wrote: > On Thu, Jun 29, 2017 at 6:46 AM, Thomas Gleixner <tglx@linutronix.de> wrote: > > On Thu, 29 Jun 2017, Greg KH wrote: > >> On Thu, Jun 29, 2017 at 03:05:26PM +0200, Thomas Gleixner wrote: > >> > > >> > And who defined that it should not be used in real code? > >> > >> Linus did, in a different firmware thread. You have to _really_ know > >> what you are doing to use this interface, and the firmware interface > >> shouldn't be using it. So adding new apis just for firmware does not > >> seem like a wise decision at this point in time. > > > > So it's not about code in general, it's about a particular piece of > > code. Fair enough. > > Well, I'd actually say it the other way around: swait should not be > used in general, only in _very_ particular pieces of code that > actually explicitly need the odd swait semantics. > > swait uses special locking and has odd semantics that are not at all > the same as the default wait queue ones. It should not be used without > very strong reasons (and honestly, the only strong enough reason seems > to be "RT"). Performance shortcut: https://lkml.org/lkml/2016/2/25/301 > The special locking means that swait doesn't do DEBUG_LOCK_ALLOC, but > it also means that it doesn't even work in all contexts. > > So "swake_up()" does surprising things (only wake up one - that's what > caused a firmware loading bug), and "swake_up_all()" has magic rules > about interrupt disabling. > > The thing is simply a collection of small hacks and should NOT be used > in general. Its a very smart performance speed up ;-) > I never want to see a driver use that code, for example. It was > designed for RCU and RT, and it should damn well be limited to that. > > Linus If KVM is the only user, feel free to remove it, you're past the point where that performance improvement matters (due to VMX hardware improvements). ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <20170629191506.GB12368-I4X2Mt4zSy4@public.gmane.org>]
* Re: [PATCH 2/4] swait: add the missing killable swaits [not found] ` <20170629191506.GB12368-I4X2Mt4zSy4@public.gmane.org> @ 2017-06-30 4:03 ` Linus Torvalds 2017-06-30 11:55 ` Marcelo Tosatti ` (2 more replies) 0 siblings, 3 replies; 38+ messages in thread From: Linus Torvalds @ 2017-06-30 4:03 UTC (permalink / raw) To: Marcelo Tosatti Cc: h, Thomas Gleixner, Greg KH, Luis R. Rodriguez, Martin Fuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8, rafal-g1n6cQUeyibVItvQsEIGlw, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull-DgEjT+Ai2ygdnm+yROfE0A, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo On Thu, Jun 29, 2017 at 12:15 PM, Marcelo Tosatti <mtosatti-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > On Thu, Jun 29, 2017 at 09:13:29AM -0700, Linus Torvalds wrote: >> >> swait uses special locking and has odd semantics that are not at all >> the same as the default wait queue ones. It should not be used without >> very strong reasons (and honestly, the only strong enough reason seems >> to be "RT"). > > Performance shortcut: > > https://lkml.org/lkml/2016/2/25/301 Yes, I know why kvm uses it, I just don't think it's necessarily the right thing. That kvm commit is actually a great example: it uses swake_up() from an interrupt, and that's in fact the *reason* it uses swake_up(). But that also fundamentally means that it cannot use swake_up_all(), so it basically *relies* on there only ever being one single entry that needs to be woken up. And as far as I can tell, it really is because the queue only ever has one entry (ie it's per-vcpu, and when the vcpu is blocked, it's blocked - so no other user will be waiting there). So it isn't that you migth queue multiple entries and then just wake them up one at a time. There really is just one entry at a time, right? And that means that swait is actuially completely the wrong thing to do. It's more expensive and more complex than just saving the single process pointer away and just doing "wake_up_process()". Now, it really is entirely possible that I'm missing something, but it does look like that to me. We've had wake_up_process() since pretty much day #1. THAT is the fastest and simplest direct wake-up there is, not some "simple wait-queue". Now, admittedly I don't know the code and really may be entirely off, but looking at the commit (no need to go to the lkml archives - it's commit 8577370fb0cb ("KVM: Use simple waitqueue for vcpu->wq") in mainline), I really think the swait() use is simply not correct if there can be multiple waiters, exactly because swake_up() only wakes up a single entry. So either there is only a single entry, or *all* the code like dvcpu->arch.wait = 0; - if (waitqueue_active(&dvcpu->wq)) - wake_up_interruptible(&dvcpu->wq); + if (swait_active(&dvcpu->wq)) + swake_up(&dvcpu->wq); is simply wrong. If there are multiple blockers, and you just cleared "arch.wait", I think they should *all* be woken up. And that's not what swake_up() does. So I think that kvm_vcpu_block() could easily have instead done vcpu->process = current; as the "prepare_to_wait()" part, and "finish_wait()" would be to just clear vcpu->process. No wait-queue, just a single pointer to the single blocking thread. (Of course, you still need serialization, so that "wake_up_process(vcpu->process)" doesn't end up using a stale value, but since processes are already freed with RCU because of other things like that, the serialization is very low-cost, you only need to be RCU-read safe when waking up). See what I'm saying? Note that "wake_up_process()" really is fairly widely used. It's widely used because it's fairly obvious, and because that really *is* the lowest-possible cost: a single pointer to the sleeping thread, and you can often do almost no locking at all. And unlike swake_up(), it's obvious that you only wake up a single thread. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-30 4:03 ` Linus Torvalds @ 2017-06-30 11:55 ` Marcelo Tosatti 2017-06-30 11:57 ` Marcelo Tosatti 2017-06-30 17:30 ` Krister Johansen 2 siblings, 0 replies; 38+ messages in thread From: Marcelo Tosatti @ 2017-06-30 11:55 UTC (permalink / raw) To: Linus Torvalds Cc: h, Thomas Gleixner, Greg KH, Luis R. Rodriguez, Martin Fuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo On Thu, Jun 29, 2017 at 09:03:42PM -0700, Linus Torvalds wrote: > On Thu, Jun 29, 2017 at 12:15 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote: > > On Thu, Jun 29, 2017 at 09:13:29AM -0700, Linus Torvalds wrote: > >> > >> swait uses special locking and has odd semantics that are not at all > >> the same as the default wait queue ones. It should not be used without > >> very strong reasons (and honestly, the only strong enough reason seems > >> to be "RT"). > > > > Performance shortcut: > > > > https://lkml.org/lkml/2016/2/25/301 > > Yes, I know why kvm uses it, I just don't think it's necessarily the > right thing. > > That kvm commit is actually a great example: it uses swake_up() from > an interrupt, and that's in fact the *reason* it uses swake_up(). > > But that also fundamentally means that it cannot use swake_up_all(), > so it basically *relies* on there only ever being one single entry > that needs to be woken up. > > And as far as I can tell, it really is because the queue only ever has > one entry (ie it's per-vcpu, and when the vcpu is blocked, it's > blocked - so no other user will be waiting there). Exactly. > > So it isn't that you migth queue multiple entries and then just wake > them up one at a time. There really is just one entry at a time, > right? Yes. > And that means that swait is actuially completely the wrong thing to > do. It's more expensive and more complex than just saving the single > process pointer away and just doing "wake_up_process()". Aha, i see. > > Now, it really is entirely possible that I'm missing something, but it > does look like that to me. Just drop it -- the optimization is not relevant anymore given VMX hardware improvements. > We've had wake_up_process() since pretty much day #1. THAT is the > fastest and simplest direct wake-up there is, not some "simple > wait-queue". > > Now, admittedly I don't know the code and really may be entirely off, > but looking at the commit (no need to go to the lkml archives - it's > commit 8577370fb0cb ("KVM: Use simple waitqueue for vcpu->wq") in > mainline), I really think the swait() use is simply not correct if > there can be multiple waiters, exactly because swake_up() only wakes > up a single entry. There can't be: its one emulated LAPIC per vcpu. So only one vcpu waits for that waitqueue. > So either there is only a single entry, or *all* the code like > > dvcpu->arch.wait = 0; > > - if (waitqueue_active(&dvcpu->wq)) > - wake_up_interruptible(&dvcpu->wq); > + if (swait_active(&dvcpu->wq)) > + swake_up(&dvcpu->wq); > > is simply wrong. If there are multiple blockers, and you just cleared > "arch.wait", I think they should *all* be woken up. And that's not > what swake_up() does. > > So I think that kvm_vcpu_block() could easily have instead done > > vcpu->process = current; > > as the "prepare_to_wait()" part, and "finish_wait()" would be to just > clear vcpu->process. No wait-queue, just a single pointer to the > single blocking thread. > > (Of course, you still need serialization, so that > "wake_up_process(vcpu->process)" doesn't end up using a stale value, > but since processes are already freed with RCU because of other things > like that, the serialization is very low-cost, you only need to be > RCU-read safe when waking up). > > See what I'm saying? > > Note that "wake_up_process()" really is fairly widely used. It's > widely used because it's fairly obvious, and because that really *is* > the lowest-possible cost: a single pointer to the sleeping thread, and > you can often do almost no locking at all. > > And unlike swake_up(), it's obvious that you only wake up a single thread. > > Linus Feel free to drop the KVM usage... agreed the interface is a special case and a generic one which handles multiple waiters and debugging etc should be preferred. Not sure if other people are using it, thought. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-30 4:03 ` Linus Torvalds 2017-06-30 11:55 ` Marcelo Tosatti @ 2017-06-30 11:57 ` Marcelo Tosatti 2017-06-30 17:30 ` Krister Johansen 2 siblings, 0 replies; 38+ messages in thread From: Marcelo Tosatti @ 2017-06-30 11:57 UTC (permalink / raw) To: Linus Torvalds Cc: h, Thomas Gleixner, Greg KH, Luis R. Rodriguez, Martin Fuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo On Thu, Jun 29, 2017 at 09:03:42PM -0700, Linus Torvalds wrote: > On Thu, Jun 29, 2017 at 12:15 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote: > > On Thu, Jun 29, 2017 at 09:13:29AM -0700, Linus Torvalds wrote: > >> > >> swait uses special locking and has odd semantics that are not at all > >> the same as the default wait queue ones. It should not be used without > >> very strong reasons (and honestly, the only strong enough reason seems > >> to be "RT"). > > > > Performance shortcut: > > > > https://lkml.org/lkml/2016/2/25/301 > > Yes, I know why kvm uses it, I just don't think it's necessarily the > right thing. > > That kvm commit is actually a great example: it uses swake_up() from > an interrupt, and that's in fact the *reason* it uses swake_up(). > > But that also fundamentally means that it cannot use swake_up_all(), > so it basically *relies* on there only ever being one single entry > that needs to be woken up. > > And as far as I can tell, it really is because the queue only ever has > one entry (ie it's per-vcpu, and when the vcpu is blocked, it's > blocked - so no other user will be waiting there). Exactly. > > So it isn't that you migth queue multiple entries and then just wake > them up one at a time. There really is just one entry at a time, > right? Yes. > And that means that swait is actuially completely the wrong thing to > do. It's more expensive and more complex than just saving the single > process pointer away and just doing "wake_up_process()". Aha, i see. > > Now, it really is entirely possible that I'm missing something, but it > does look like that to me. Just drop it -- the optimization is not relevant anymore given VMX hardware improvements. > We've had wake_up_process() since pretty much day #1. THAT is the > fastest and simplest direct wake-up there is, not some "simple > wait-queue". > > Now, admittedly I don't know the code and really may be entirely off, > but looking at the commit (no need to go to the lkml archives - it's > commit 8577370fb0cb ("KVM: Use simple waitqueue for vcpu->wq") in > mainline), I really think the swait() use is simply not correct if > there can be multiple waiters, exactly because swake_up() only wakes > up a single entry. There can't be: its one emulated LAPIC per vcpu. So only one vcpu waits for that waitqueue. > So either there is only a single entry, or *all* the code like > > dvcpu->arch.wait = 0; > > - if (waitqueue_active(&dvcpu->wq)) > - wake_up_interruptible(&dvcpu->wq); > + if (swait_active(&dvcpu->wq)) > + swake_up(&dvcpu->wq); > > is simply wrong. If there are multiple blockers, and you just cleared > "arch.wait", I think they should *all* be woken up. And that's not > what swake_up() does. > > So I think that kvm_vcpu_block() could easily have instead done > > vcpu->process = current; > > as the "prepare_to_wait()" part, and "finish_wait()" would be to just > clear vcpu->process. No wait-queue, just a single pointer to the > single blocking thread. > > (Of course, you still need serialization, so that > "wake_up_process(vcpu->process)" doesn't end up using a stale value, > but since processes are already freed with RCU because of other things > like that, the serialization is very low-cost, you only need to be > RCU-read safe when waking up). > > See what I'm saying? > > Note that "wake_up_process()" really is fairly widely used. It's > widely used because it's fairly obvious, and because that really *is* > the lowest-possible cost: a single pointer to the sleeping thread, and > you can often do almost no locking at all. > > And unlike swake_up(), it's obvious that you only wake up a single thread. > > Linus Feel free to drop the KVM usage... agreed the interface is a special case and a generic one which handles multiple waiters and has debugging etc should be preferred to avoid bugs Not sure if other people are using it (swait). ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] swait: add the missing killable swaits 2017-06-30 4:03 ` Linus Torvalds 2017-06-30 11:55 ` Marcelo Tosatti 2017-06-30 11:57 ` Marcelo Tosatti @ 2017-06-30 17:30 ` Krister Johansen 2 siblings, 0 replies; 38+ messages in thread From: Krister Johansen @ 2017-06-30 17:30 UTC (permalink / raw) To: Linus Torvalds Cc: Marcelo Tosatti, h, Thomas Gleixner, Greg KH, Luis R. Rodriguez, Martin Fuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach, Coelho, Luciano On Thu, Jun 29, 2017 at 09:03:42PM -0700, Linus Torvalds wrote: > On Thu, Jun 29, 2017 at 12:15 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote: > > On Thu, Jun 29, 2017 at 09:13:29AM -0700, Linus Torvalds wrote: > >> > >> swait uses special locking and has odd semantics that are not at all > >> the same as the default wait queue ones. It should not be used without > >> very strong reasons (and honestly, the only strong enough reason seems > >> to be "RT"). > > > > Performance shortcut: > > > > https://lkml.org/lkml/2016/2/25/301 > > Now, admittedly I don't know the code and really may be entirely off, > but looking at the commit (no need to go to the lkml archives - it's > commit 8577370fb0cb ("KVM: Use simple waitqueue for vcpu->wq") in > mainline), I really think the swait() use is simply not correct if > there can be multiple waiters, exactly because swake_up() only wakes > up a single entry. > > So either there is only a single entry, or *all* the code like > > dvcpu->arch.wait = 0; > > - if (waitqueue_active(&dvcpu->wq)) > - wake_up_interruptible(&dvcpu->wq); > + if (swait_active(&dvcpu->wq)) > + swake_up(&dvcpu->wq); > > is simply wrong. If there are multiple blockers, and you just cleared > "arch.wait", I think they should *all* be woken up. And that's not > what swake_up() does. Code like this is probably wrong for another reason too. The swait_active() is likely redudant, since swake_up() also calls swait_active(). The check in swake_up() returns if it thinks there are no active waiters. However, the synchronization needed to ensure a proper wakeup is left as an exercise to swake_up's caller. There have been a couple of other discussions around this topic recently: https://lkml.org/lkml/2017/5/25/722 https://lkml.org/lkml/2017/6/8/1222 The above is better written as the following, but even then you still have the single/multiple wakeup problem: - if (waitqueue_active(&dvcpu->wq)) - wake_up_interruptible(&dvcpu->wq); + smp_mb(); + swake_up(&dvcpu->wq); Just to add to the confusion, the last time I checked, the semantics of swake_up() even differ between RT Linux and mainline, which makes this even more confusing. -K ^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH 3/4] firmware: avoid invalid fallback aborts by using killable swait 2017-06-14 22:20 [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Luis R. Rodriguez 2017-06-14 22:20 ` [PATCH 1/4] test_firmware: add test case for SIGCHLD on sync fallback Luis R. Rodriguez 2017-06-14 22:20 ` [PATCH 2/4] swait: add the missing killable swaits Luis R. Rodriguez @ 2017-06-14 22:20 ` Luis R. Rodriguez 2017-06-14 22:20 ` [PATCH 4/4] firmware: send -EINTR on signal abort on fallback mechanism Luis R. Rodriguez ` (2 subsequent siblings) 5 siblings, 0 replies; 38+ messages in thread From: Luis R. Rodriguez @ 2017-06-14 22:20 UTC (permalink / raw) To: gregkh Cc: mfuzzey, ebiederm, dmitry.torokhov, wagi, dwmw2, jewalt, rafal, arend.vanspriel, rjw, yi1.li, atull, moritz.fischer, pmladek, johannes.berg, emmanuel.grumbach, luciano.coelho, kvalo, luto, torvalds, keescook, takahiro.akashi, dhowells, pjones, hdegoede, alan, tytso, mtk.manpages, paul.gortmaker, mtosatti, mawilcox, linux-api, linux-fsdevel, linux-kernel, L Commit 0cb64249ca500 ("firmware_loader: abort request if wait_for_completion is interrupted") added via 4.0 added support to abort the fallback mechanism when a signal was detected and wait_for_completion_interruptible() returned -ERESTARTSYS -- for instance when a user hits CTRL-C. The abort was overly *too* effective. When a child process terminates (successful or not) the signal SIGCHLD can be sent to the parent process which ran the child in the background and later triggered a sync request for firmware through a sysfs interface which relies on the fallback mechanism. This signal in turn can be recieved by the interruptible swait we constructed on firmware_class and detects it as an abort *before* userspace could get a chance to write the firmware. Upon failure -EAGAIN is returned, so userspace is also kept in the dark about exactly what happened. We can reproduce the issue with the fw_fallback.sh selftest: Before this patch: $ sudo tools/testing/selftests/firmware/fw_fallback.sh ... tools/testing/selftests/firmware/fw_fallback.sh: error - sync firmware request cancelled due to SIGCHLD After this patch: $ sudo tools/testing/selftests/firmware/fw_fallback.sh ... tools/testing/selftests/firmware/fw_fallback.sh: SIGCHLD on sync ignored as expected Fix this by making the swait killable -- only killable by SIGKILL (kill -9). We loose the ability to allow userspace to cancel a write with CTRL-C (SIGINT), however its been decided the compromise to require SIGKILL is worth the gains. Chances of this issue occuring are low due to the number of drivers upstream exclusively relying on the fallback mechanism for firmware (2 drivers), however this is observed in the field with custom drivers with sysfs triggers to load firmware. Only distributions relying on the fallback mechanism are impacted as well. An example reported issue was on Android, as follows: 1) Android init (pid=1) fork()s (say pid=42) [this child process is totally unrelated to firmware loading, it could be sleep 2; for all we care ] 2) Android init (pid=1) does a write() on a (driver custom) sysfs file which ends up calling request_firmware() kernel side 3) The firmware loading fallback mechanism is used, the request is sent to userspace and pid 1 waits in the kernel on wait_* 4) before firmware loading completes pid 42 dies (for any reason, even normal termination) 5) Kernel delivers SIGCHLD to pid=1 to tell it a child has died, which causes -ERESTARTSYS to be returned from wait_* 6) The kernel's wait aborts and return -EAGAIN for the request_firmware() caller. swait was introduced as of v4.6 and the firmware_class code was modified to use swait only as of v4.10 via commit 5b029624948d6 ("firmware: do not use fw_lock for fw_state protection"), as such stable kernels older than v4.10 must modify the old firmware_class call: from wait_for_completion_interruptible_timeout() to wait_for_completion_killable_timeout() Cc: stable <stable@vger.kernel.org> # 4.0 Suggested-by: "Eric W. Biederman" <ebiederm@xmission.com> Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Tested-by: Martin Fuzzey <mfuzzey@parkeon.com> Reported-by: Martin Fuzzey <mfuzzey@parkeon.com> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> --- drivers/base/firmware_class.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c index b9f907eedbf7..70fc42e5e0da 100644 --- a/drivers/base/firmware_class.c +++ b/drivers/base/firmware_class.c @@ -131,7 +131,7 @@ static int __fw_state_wait_common(struct fw_state *fw_st, long timeout) { long ret; - ret = swait_event_interruptible_timeout(fw_st->wq, + ret = swait_event_killable_timeout(fw_st->wq, __fw_state_is_done(READ_ONCE(fw_st->status)), timeout); if (ret != 0 && fw_st->status == FW_STATUS_ABORTED) -- 2.11.0 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 4/4] firmware: send -EINTR on signal abort on fallback mechanism 2017-06-14 22:20 [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Luis R. Rodriguez ` (2 preceding siblings ...) 2017-06-14 22:20 ` [PATCH 3/4] firmware: avoid invalid fallback aborts by using killable swait Luis R. Rodriguez @ 2017-06-14 22:20 ` Luis R. Rodriguez 2017-06-15 7:49 ` [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Martin Fuzzey [not found] ` <20170614222017.14653-1-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> 5 siblings, 0 replies; 38+ messages in thread From: Luis R. Rodriguez @ 2017-06-14 22:20 UTC (permalink / raw) To: gregkh Cc: mfuzzey, ebiederm, dmitry.torokhov, wagi, dwmw2, jewalt, rafal, arend.vanspriel, rjw, yi1.li, atull, moritz.fischer, pmladek, johannes.berg, emmanuel.grumbach, luciano.coelho, kvalo, luto, torvalds, keescook, takahiro.akashi, dhowells, pjones, hdegoede, alan, tytso, mtk.manpages, paul.gortmaker, mtosatti, mawilcox, linux-api, linux-fsdevel, linux-kernel, L Right now we send -EAGAIN to a syfs write which got interrupted. Userspace can't tell what happened though, send -EINTR if we were killed due to a signal so userspace can tell things apart. This is only applicable to the fallback mechanism. Reported-by: Martin Fuzzey <mfuzzey@parkeon.com> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> --- drivers/base/firmware_class.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c index 70fc42e5e0da..da043cb16e2f 100644 --- a/drivers/base/firmware_class.c +++ b/drivers/base/firmware_class.c @@ -1089,9 +1089,12 @@ static int _request_firmware_load(struct firmware_priv *fw_priv, mutex_unlock(&fw_lock); } - if (fw_state_is_aborted(&buf->fw_st)) - retval = -EAGAIN; - else if (buf->is_paged_buf && !buf->data) + if (fw_state_is_aborted(&buf->fw_st)) { + if (retval == -ERESTARTSYS) + retval = -EINTR; + else + retval = -EAGAIN; + } else if (buf->is_paged_buf && !buf->data) retval = -ENOMEM; device_del(f_dev); -- 2.11.0 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD 2017-06-14 22:20 [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Luis R. Rodriguez ` (3 preceding siblings ...) 2017-06-14 22:20 ` [PATCH 4/4] firmware: send -EINTR on signal abort on fallback mechanism Luis R. Rodriguez @ 2017-06-15 7:49 ` Martin Fuzzey [not found] ` <20170614222017.14653-1-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> 5 siblings, 0 replies; 38+ messages in thread From: Martin Fuzzey @ 2017-06-15 7:49 UTC (permalink / raw) To: Luis R. Rodriguez, gregkh Cc: ebiederm, dmitry.torokhov, wagi, dwmw2, jewalt, rafal, arend.vanspriel, rjw, yi1.li, atull, moritz.fischer, pmladek, johannes.berg, emmanuel.grumbach, luciano.coelho, kvalo, luto, torvalds, keescook, takahiro.akashi, dhowells, pjones, hdegoede, alan, tytso, mtk.manpages, paul.gortmaker, mtosatti, mawilcox, linux-api, linux-fsdevel, linux-kernel On 15/06/17 00:20, Luis R. Rodriguez wrote: > Martin reported an issue with Android where if sysfs is used to trigger a sync > fw load which *relies* on the fallback mechanism and a background job completes > while the trigger is ongoing in the foreground it will immediately fail the fw > request. The issue can be observed in this simple test script using the > test_firmware driver: > > set -e > /etc/init.d/udev stop > modprobe test_firmware > DIR=/sys/devices/virtual/misc/test_firmware > echo 10 >/sys/class/firmware/timeout > sleep 2 & > echo -n "does-not-exist-file.bin" > "$DIR"/trigger_request > > The background sleep triggers the SIGCHLD signal and we fail the firmware > request on the fallback mechanism. This was due to the type of wait used which > ... > > Note that although I *feared* this might implicate any use of non-killable waits > on other system calls, such as finit_module(), initial testing confirms this to > not be the case. For instance replacing the echo with modprobe on a module > which does the same on init does not present the same issues. This could be due > to the special SA_RESTART flag case on write() as noted above and sysfs... > however, its not perfectly clear yet to me. > > The reason the problem does not occur with modprobe is that in that case the processes triggering the firmware load (modprobe) and the process dying (sleep) are *siblings* rather than father and child. So the modprobe process does *not* receive a SIGCHLD when its' *brother* dies. echo is a shell built-in so the process triggering the firmware load (the shell) and the process dying (sleep) *are* father and child. Martin ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <20170614222017.14653-1-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>]
* Re: [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD [not found] ` <20170614222017.14653-1-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> @ 2017-06-26 21:19 ` Luis R. Rodriguez 2017-06-29 15:14 ` Greg KH 1 sibling, 0 replies; 38+ messages in thread From: Luis R. Rodriguez @ 2017-06-26 21:19 UTC (permalink / raw) To: Luis R. Rodriguez Cc: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, ebiederm-aS9lmoZGLiVWk0Htik3J/w, dmitry.torokhov-Re5JQEeQqe8AvxtiuMwx3w, wagi-kQCPcA+X3s7YtjvyW6yDsg, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8, rafal-g1n6cQUeyibVItvQsEIGlw, arend.vanspriel-dY08KVG/lbpWk0Htik3J/w, rjw-LthD3rsA81gm4RdzfppkhA, yi1.li-VuQAYsv1563Yd54FQh9/CA, atull-DgEjT+Ai2ygdnm+yROfE0A, moritz.fischer-+aYTwkv1SeIAvxtiuMwx3w, pmladek-IBi9RG/b67k, johannes.berg-ral2JQCrhuEAvxtiuMwx3w, emmanuel.grumbach-ral2JQCrhuEAvxtiuMwx3w, luciano.coelho-ral2JQCrhuEAvxtiuMwx3w, kvalo-sgV2jX0FEOL9JmXXK+q4OQ, luto-DgEjT+Ai2ygdnm+yROfE0A, torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, keescook-F7+t8E8rja9g9hUCZPvPmw, takahiro.akashi-QSEj5FYQhm4dnm+yROfE0A, dhowells-H+wXaHxf7aLQT0dZR+AlfA, pjones-H+wXaHxf7aLQT0dZR+AlfA, hdegoede-H+wXaHxf7aLQT0dZR+AlfA, alan-VuQAYsv1563Yd54FQh9/CA, tytso-3s7WtUTddSA, mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, paul.gortmaker-CWA4WttNNZF54TAoqtyWWQ, mtosatti-H+wXaHxf7aLQT0dZR+AlfA, mawilcox-0li6OtcxBFHby3iVrkZq2A, linux-api-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-kernel On Wed, Jun 14, 2017 at 03:20:13PM -0700, Luis R. Rodriguez wrote: > Martin reported an issue with Android where if sysfs is used to trigger a sync > fw load which *relies* on the fallback mechanism and a background job completes > while the trigger is ongoing in the foreground it will immediately fail the fw > request. The issue can be observed in this simple test script using the > test_firmware driver: > > set -e > /etc/init.d/udev stop > modprobe test_firmware > DIR=/sys/devices/virtual/misc/test_firmware > echo 10 >/sys/class/firmware/timeout > sleep 2 & > echo -n "does-not-exist-file.bin" > "$DIR"/trigger_request > > The background sleep triggers the SIGCHLD signal and we fail the firmware > request on the fallback mechanism. This was due to the type of wait used which > captures all signals, but we currently lack the killable swaits which would > only allow killing the fallback wait on SIGKILL. This adds the missing killable > swaits and fixes the firmware API to use it. This goes along with a test case to > demo the issue clearly and how its fixed afterwards. Lastly, ensure to use > -EINTR when interrupted so callers can distinguish between an interrupted > failure and other types of failure. > > As suggested I've tagged the addition of the killable swaits and the firmware > fix as stable. Between v4.0 and v4.10 the stable fix is to instead change > the firmware to use wait_for_completion_killable_timeout() as the firmware > API was only converted to swait as of v4.10. > > The last patch must be applied only after the killable wait is applied to > ensure only SIGKILL triggers an interruption. Otherwise it has been observed > using -EINTR on other signals like SIGCHLD will trigger a restart of the call, > if a sysfs write() is used to trigger the sync firmware request. This might be > due what signal(7) says about using the SA_RESTART flag when write() is called, > in such cases the call will be automatically restarted after the signal handler > returns. The excemption here naturally seems to be on SIGKILL. > > Note that although I *feared* this might implicate any use of non-killable waits > on other system calls, such as finit_module(), initial testing confirms this to > not be the case. For instance replacing the echo with modprobe on a module > which does the same on init does not present the same issues. This could be due > to the special SA_RESTART flag case on write() as noted above and sysfs... > however, its not perfectly clear yet to me. > > As usual these patches are on my linux-next git tree, they are on the > 20170614-fw-fixes branch based on linux-next 20170614 [0]. > > [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170614-fw-fixes Greg, *poke* Luis ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD [not found] ` <20170614222017.14653-1-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> 2017-06-26 21:19 ` Luis R. Rodriguez @ 2017-06-29 15:14 ` Greg KH [not found] ` <20170629151442.GA4880-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org> 1 sibling, 1 reply; 38+ messages in thread From: Greg KH @ 2017-06-29 15:14 UTC (permalink / raw) To: Luis R. Rodriguez Cc: mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, ebiederm-aS9lmoZGLiVWk0Htik3J/w, dmitry.torokhov-Re5JQEeQqe8AvxtiuMwx3w, wagi-kQCPcA+X3s7YtjvyW6yDsg, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8, rafal-g1n6cQUeyibVItvQsEIGlw, arend.vanspriel-dY08KVG/lbpWk0Htik3J/w, rjw-LthD3rsA81gm4RdzfppkhA, yi1.li-VuQAYsv1563Yd54FQh9/CA, atull-DgEjT+Ai2ygdnm+yROfE0A, moritz.fischer-+aYTwkv1SeIAvxtiuMwx3w, pmladek-IBi9RG/b67k, johannes.berg-ral2JQCrhuEAvxtiuMwx3w, emmanuel.grumbach-ral2JQCrhuEAvxtiuMwx3w, luciano.coelho-ral2JQCrhuEAvxtiuMwx3w, kvalo-sgV2jX0FEOL9JmXXK+q4OQ, luto-DgEjT+Ai2ygdnm+yROfE0A, torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, keescook-F7+t8E8rja9g9hUCZPvPmw, takahiro.akashi-QSEj5FYQhm4dnm+yROfE0A, dhowells-H+wXaHxf7aLQT0dZR+AlfA, pjones-H+wXaHxf7aLQT0dZR+AlfA, hdegoede-H+wXaHxf7aLQT0dZR+AlfA, alan-VuQAYsv1563Yd54FQh9/CA, tytso-3s7WtUTddSA, mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, paul.gortmaker-CWA4WttNNZF54TAoqtyWWQ, mtosatti-H+wXaHxf7aLQT0dZR+AlfA, mawilcox-0li6OtcxBFHby3iVrkZq2A, linux-api-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA On Wed, Jun 14, 2017 at 03:20:13PM -0700, Luis R. Rodriguez wrote: > Martin reported an issue with Android where if sysfs is used to trigger a sync > fw load which *relies* on the fallback mechanism and a background job completes > while the trigger is ongoing in the foreground it will immediately fail the fw > request. The issue can be observed in this simple test script using the > test_firmware driver: Please respin this with the swait code removed. thanks, greg k-h ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <20170629151442.GA4880-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>]
* Re: [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD [not found] ` <20170629151442.GA4880-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org> @ 2017-06-29 17:29 ` Luis R. Rodriguez 0 siblings, 0 replies; 38+ messages in thread From: Luis R. Rodriguez @ 2017-06-29 17:29 UTC (permalink / raw) To: Greg KH Cc: Luis R. Rodriguez, mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, ebiederm-aS9lmoZGLiVWk0Htik3J/w, dmitry.torokhov-Re5JQEeQqe8AvxtiuMwx3w, wagi-kQCPcA+X3s7YtjvyW6yDsg, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8, rafal-g1n6cQUeyibVItvQsEIGlw, arend.vanspriel-dY08KVG/lbpWk0Htik3J/w, rjw-LthD3rsA81gm4RdzfppkhA, yi1.li-VuQAYsv1563Yd54FQh9/CA, atull-DgEjT+Ai2ygdnm+yROfE0A, moritz.fischer-+aYTwkv1SeIAvxtiuMwx3w, pmladek-IBi9RG/b67k, johannes.berg-ral2JQCrhuEAvxtiuMwx3w, emmanuel.grumbach-ral2JQCrhuEAvxtiuMwx3w, luciano.coelho-ral2JQCrhuEAvxtiuMwx3w, kvalo-sgV2jX0FEOL9JmXXK+q4OQ, luto-DgEjT+Ai2ygdnm+yROfE0A, torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, keescook-F7+t8E8rja9g9hUCZPvPmw, takahiro.akashi-QSEj5FYQhm4dnm+yROfE0A, dhowells-H+wXaHxf7aLQT0dZR+AlfA, pjones-H+wXaHxf7aLQT0dZR+AlfA, hdegoede-H+wXaHxf7aLQT0dZR+AlfA, alan-VuQAYsv1563Yd54FQh9/CA, tytso-3s7WtUTddSA, mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, paul.gortmaker-CWA4WttNNZF54TAoqtyWWQ, mtosatti-H+wXaHxf7aLQT0dZR+AlfA, mawilcox-0li6OtcxBFHby3iVrkZq2A, linux-api-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA On Thu, Jun 29, 2017 at 05:14:42PM +0200, Greg KH wrote: > On Wed, Jun 14, 2017 at 03:20:13PM -0700, Luis R. Rodriguez wrote: > > Martin reported an issue with Android where if sysfs is used to trigger a sync > > fw load which *relies* on the fallback mechanism and a background job completes > > while the trigger is ongoing in the foreground it will immediately fail the fw > > request. The issue can be observed in this simple test script using the > > test_firmware driver: > > Please respin this with the swait code removed. Sure thing. Luis ^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2017-07-12 21:33 UTC | newest] Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-06-14 22:20 [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Luis R. Rodriguez 2017-06-14 22:20 ` [PATCH 1/4] test_firmware: add test case for SIGCHLD on sync fallback Luis R. Rodriguez 2017-06-14 22:20 ` [PATCH 2/4] swait: add the missing killable swaits Luis R. Rodriguez [not found] ` <20170614222017.14653-3-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> 2017-06-29 12:54 ` Greg KH [not found] ` <20170629125402.GH26046-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org> 2017-06-29 13:05 ` Thomas Gleixner 2017-06-29 13:35 ` Greg KH 2017-06-29 13:46 ` Thomas Gleixner 2017-06-29 16:13 ` Linus Torvalds 2017-06-29 16:31 ` Matthew Wilcox 2017-06-29 17:29 ` Luis R. Rodriguez 2017-06-29 17:40 ` Davidlohr Bueso 2017-06-29 17:57 ` Linus Torvalds 2017-06-29 18:33 ` Davidlohr Bueso [not found] ` <20170629183339.GD3954-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org> 2017-06-29 18:59 ` Linus Torvalds [not found] ` <CA+55aFz8Mhx+A-g-5yOG-O1ZLRUR_fpeeA4iBNGH8EnDBZEdpA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2017-06-29 19:40 ` Luis R. Rodriguez 2017-06-29 19:44 ` Luis R. Rodriguez [not found] ` <20170629194455.GR21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org> 2017-06-29 20:58 ` Jakub Kicinski 2017-06-29 22:50 ` Luis R. Rodriguez 2017-06-29 22:53 ` Jakub Kicinski 2017-06-29 23:00 ` Luis R. Rodriguez 2017-06-29 23:06 ` Jakub Kicinski [not found] ` <20170629225003.GU21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org> 2017-07-12 21:33 ` Luis R. Rodriguez [not found] ` <20170629194015.GQ21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org> 2017-06-29 20:57 ` Linus Torvalds 2017-07-05 2:06 ` Davidlohr Bueso [not found] ` <20170705020635.GD11168-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org> 2017-07-07 19:58 ` Linus Torvalds 2017-07-07 22:27 ` Davidlohr Bueso 2017-07-07 22:48 ` Linus Torvalds 2017-06-29 19:15 ` Marcelo Tosatti [not found] ` <20170629191506.GB12368-I4X2Mt4zSy4@public.gmane.org> 2017-06-30 4:03 ` Linus Torvalds 2017-06-30 11:55 ` Marcelo Tosatti 2017-06-30 11:57 ` Marcelo Tosatti 2017-06-30 17:30 ` Krister Johansen 2017-06-14 22:20 ` [PATCH 3/4] firmware: avoid invalid fallback aborts by using killable swait Luis R. Rodriguez 2017-06-14 22:20 ` [PATCH 4/4] firmware: send -EINTR on signal abort on fallback mechanism Luis R. Rodriguez 2017-06-15 7:49 ` [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Martin Fuzzey [not found] ` <20170614222017.14653-1-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> 2017-06-26 21:19 ` Luis R. Rodriguez 2017-06-29 15:14 ` Greg KH [not found] ` <20170629151442.GA4880-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org> 2017-06-29 17:29 ` Luis R. Rodriguez
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).