[PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD

linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD
@ 2017-06-14 22:20 Luis R. Rodriguez
  2017-06-14 22:20 ` [PATCH 1/4] test_firmware: add test case for SIGCHLD on sync fallback Luis R. Rodriguez
                   ` (5 more replies)
  0 siblings, 6 replies; 38+ messages in thread
From: Luis R. Rodriguez @ 2017-06-14 22:20 UTC (permalink / raw)
  To: gregkh
  Cc: mfuzzey, ebiederm, dmitry.torokhov, wagi, dwmw2, jewalt, rafal,
	arend.vanspriel, rjw, yi1.li, atull, moritz.fischer, pmladek,
	johannes.berg, emmanuel.grumbach, luciano.coelho, kvalo, luto,
	torvalds, keescook, takahiro.akashi, dhowells, pjones, hdegoede,
	alan, tytso, mtk.manpages, paul.gortmaker, mtosatti, mawilcox,
	linux-api, linux-fsdevel, linux-kernel, L

Martin reported an issue with Android where if sysfs is used to trigger a sync
fw load which *relies* on the fallback mechanism and a background job completes
while the trigger is ongoing in the foreground it will immediately fail the fw
request.  The issue can be observed in this simple test script using the
test_firmware driver:

	set -e
	/etc/init.d/udev stop
	modprobe test_firmware
	DIR=/sys/devices/virtual/misc/test_firmware
	echo 10 >/sys/class/firmware/timeout
	sleep 2 &
	echo -n "does-not-exist-file.bin" > "$DIR"/trigger_request

The background sleep triggers the SIGCHLD signal and we fail the firmware
request on the fallback mechanism. This was due to the type of wait used which
captures all signals, but we currently lack the killable swaits which would
only allow killing the fallback wait on SIGKILL. This adds the missing killable
swaits and fixes the firmware API to use it. This goes along with a test case to
demo the issue clearly and how its fixed afterwards.  Lastly, ensure to use
-EINTR when interrupted so callers can distinguish between an interrupted
failure and other types of failure.

As suggested I've tagged the addition of the killable swaits and the firmware
fix as stable. Between v4.0 and v4.10 the stable fix is to instead change
the firmware to use wait_for_completion_killable_timeout() as the firmware
API was only converted to swait as of v4.10.

The last patch must be applied only after the killable wait is applied to
ensure only SIGKILL triggers an interruption. Otherwise it has been observed
using -EINTR on other signals like SIGCHLD will trigger a restart of the call,
if a sysfs write() is used to trigger the sync firmware request. This might be
due what signal(7) says about using the SA_RESTART flag when write() is called,
in such cases the call will be automatically restarted after the signal handler
returns. The excemption here naturally seems to be on SIGKILL.

Note that although I *feared* this might implicate any use of non-killable waits
on other system calls, such as finit_module(), initial testing confirms this to
not be the case.  For instance replacing the echo with modprobe on a module
which does the same on init does not present the same issues. This could be due
to the special SA_RESTART flag case on write() as noted above and sysfs...
however, its not perfectly clear yet to me.

As usual these patches are on my linux-next git tree, they are on the
20170614-fw-fixes branch based on linux-next 20170614 [0].

[0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170614-fw-fixes

Luis R. Rodriguez (4):
  test_firmware: add test case for SIGCHLD on sync fallback
  swait: add the missing killable swaits
  firmware: avoid invalid fallback aborts by using killable swait
  firmware: send -EINTR on signal abort on fallback mechanism

 drivers/base/firmware_class.c                   | 11 +++++----
 include/linux/swait.h                           | 25 ++++++++++++++++++++
 tools/testing/selftests/firmware/fw_fallback.sh | 31 +++++++++++++++++++++++++
 3 files changed, 63 insertions(+), 4 deletions(-)

-- 
2.11.0

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/4] test_firmware: add test case for SIGCHLD on sync fallback
  2017-06-14 22:20 [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Luis R. Rodriguez
@ 2017-06-14 22:20 ` Luis R. Rodriguez
  2017-06-14 22:20 ` [PATCH 2/4] swait: add the missing killable swaits Luis R. Rodriguez
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 38+ messages in thread
From: Luis R. Rodriguez @ 2017-06-14 22:20 UTC (permalink / raw)
  To: gregkh
  Cc: mfuzzey, ebiederm, dmitry.torokhov, wagi, dwmw2, jewalt, rafal,
	arend.vanspriel, rjw, yi1.li, atull, moritz.fischer, pmladek,
	johannes.berg, emmanuel.grumbach, luciano.coelho, kvalo, luto,
	torvalds, keescook, takahiro.akashi, dhowells, pjones, hdegoede,
	alan, tytso, mtk.manpages, paul.gortmaker, mtosatti, mawilcox,
	linux-api, linux-fsdevel, linux-kernel, L

It has been reported that SIGCHLD will trigger an immediate abort
on sync firmware requests which rely on the sysfs interface for a
trigger. This is unexpected behaviour, this reproduces this issue.

This test case currenty fails.

Reported-by: Martin Fuzzey <mfuzzey@parkeon.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 tools/testing/selftests/firmware/fw_fallback.sh | 31 +++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/tools/testing/selftests/firmware/fw_fallback.sh b/tools/testing/selftests/firmware/fw_fallback.sh
index 2e4c22d5abf7..8f511035f783 100755
--- a/tools/testing/selftests/firmware/fw_fallback.sh
+++ b/tools/testing/selftests/firmware/fw_fallback.sh
@@ -134,6 +134,27 @@ load_fw_custom_cancel()
 	wait
 }
 
+load_fw_fallback_with_child()
+{
+	local name="$1"
+	local file="$2"
+
+	# This is the value already set but we want to be explicit
+	echo 4 >/sys/class/firmware/timeout
+
+	sleep 1 &
+	SECONDS_BEFORE=$(date +%s)
+	echo -n "$name" >"$DIR"/trigger_request 2>/dev/null
+	SECONDS_AFTER=$(date +%s)
+	SECONDS_DELTA=$(($SECONDS_AFTER - $SECONDS_BEFORE))
+	if [ "$SECONDS_DELTA" -lt 4 ]; then
+		RET=1
+	else
+		RET=0
+	fi
+	wait
+	return $RET
+}
 
 trap "test_finish" EXIT
 
@@ -221,4 +242,14 @@ else
 	echo "$0: cancelling custom fallback mechanism works"
 fi
 
+set +e
+load_fw_fallback_with_child "nope-signal-$NAME" "$FW"
+if [ "$?" -eq 0 ]; then
+	echo "$0: SIGCHLD on sync ignored as expected" >&2
+else
+	echo "$0: error - sync firmware request cancelled due to SIGCHLD" >&2
+	exit 1
+fi
+set -e
+
 exit 0
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 2/4] swait: add the missing killable swaits
  2017-06-14 22:20 [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Luis R. Rodriguez
  2017-06-14 22:20 ` [PATCH 1/4] test_firmware: add test case for SIGCHLD on sync fallback Luis R. Rodriguez
@ 2017-06-14 22:20 ` Luis R. Rodriguez
       [not found]   ` <20170614222017.14653-3-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2017-06-14 22:20 ` [PATCH 3/4] firmware: avoid invalid fallback aborts by using killable swait Luis R. Rodriguez
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 38+ messages in thread
From: Luis R. Rodriguez @ 2017-06-14 22:20 UTC (permalink / raw)
  To: gregkh
  Cc: mfuzzey, ebiederm, dmitry.torokhov, wagi, dwmw2, jewalt, rafal,
	arend.vanspriel, rjw, yi1.li, atull, moritz.fischer, pmladek,
	johannes.berg, emmanuel.grumbach, luciano.coelho, kvalo, luto,
	torvalds, keescook, takahiro.akashi, dhowells, pjones, hdegoede,
	alan, tytso, mtk.manpages, paul.gortmaker, mtosatti, mawilcox,
	linux-api, linux-fsdevel, linux-kernel, L

Code in kernel which incorrectly used the non-killable variants could
end up having waits killed improperly. The respective killable waits
have been upstream for a while:

  o wait_for_completion_killable()
  o wait_for_completion_killable_timeout()

swait has been upstream since v4.6. Older kernels have had the
above variants in place for a long time.

Cc: stable <stable@vger.kernel.org> # 4.6
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 include/linux/swait.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/include/linux/swait.h b/include/linux/swait.h
index c1f9c62a8a50..2c700694d50a 100644
--- a/include/linux/swait.h
+++ b/include/linux/swait.h
@@ -169,4 +169,29 @@ do {									\
 	__ret;								\
 })
 
+#define __swait_event_killable(wq, condition)				\
+	___swait_event(wq, condition, TASK_KILLABLE, 0, schedule())
+
+#define swait_event_killable(wq, condition)				\
+({									\
+	int __ret = 0;							\
+	if (!(condition))						\
+		__ret = __swait_event_killable(wq, condition);		\
+	__ret;								\
+})
+
+#define __swait_event_killable_timeout(wq, condition, timeout)		\
+	___swait_event(wq, ___wait_cond_timeout(condition),		\
+		       TASK_KILLABLE, timeout,				\
+		       __ret = schedule_timeout(__ret))
+
+#define swait_event_killable_timeout(wq, condition, timeout)		\
+({									\
+	long __ret = timeout;						\
+	if (!___wait_cond_timeout(condition))				\
+		__ret = __swait_event_killable_timeout(wq,		\
+						condition, timeout);	\
+	__ret;								\
+})
+
 #endif /* _LINUX_SWAIT_H */
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 3/4] firmware: avoid invalid fallback aborts by using killable swait
  2017-06-14 22:20 [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Luis R. Rodriguez
  2017-06-14 22:20 ` [PATCH 1/4] test_firmware: add test case for SIGCHLD on sync fallback Luis R. Rodriguez
  2017-06-14 22:20 ` [PATCH 2/4] swait: add the missing killable swaits Luis R. Rodriguez
@ 2017-06-14 22:20 ` Luis R. Rodriguez
  2017-06-14 22:20 ` [PATCH 4/4] firmware: send -EINTR on signal abort on fallback mechanism Luis R. Rodriguez
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 38+ messages in thread
From: Luis R. Rodriguez @ 2017-06-14 22:20 UTC (permalink / raw)
  To: gregkh
  Cc: mfuzzey, ebiederm, dmitry.torokhov, wagi, dwmw2, jewalt, rafal,
	arend.vanspriel, rjw, yi1.li, atull, moritz.fischer, pmladek,
	johannes.berg, emmanuel.grumbach, luciano.coelho, kvalo, luto,
	torvalds, keescook, takahiro.akashi, dhowells, pjones, hdegoede,
	alan, tytso, mtk.manpages, paul.gortmaker, mtosatti, mawilcox,
	linux-api, linux-fsdevel, linux-kernel, L

Commit 0cb64249ca500 ("firmware_loader: abort request if wait_for_completion
is interrupted") added via 4.0 added support to abort the fallback mechanism
when a signal was detected and wait_for_completion_interruptible() returned
-ERESTARTSYS -- for instance when a user hits CTRL-C. The abort was overly
*too* effective.

When a child process terminates (successful or not) the signal SIGCHLD can
be sent to the parent process which ran the child in the background and
later triggered a sync request for firmware through a sysfs interface which
relies on the fallback mechanism. This signal in turn can be recieved by the
interruptible swait we constructed on firmware_class and detects it as an
abort *before* userspace could get a chance to write the firmware. Upon
failure -EAGAIN is returned, so userspace is also kept in the dark about
exactly what happened.

We can reproduce the issue with the fw_fallback.sh selftest:

Before this patch:
$ sudo tools/testing/selftests/firmware/fw_fallback.sh
...
tools/testing/selftests/firmware/fw_fallback.sh: error - sync firmware request cancelled due to SIGCHLD

After this patch:
$ sudo tools/testing/selftests/firmware/fw_fallback.sh
...
tools/testing/selftests/firmware/fw_fallback.sh: SIGCHLD on sync ignored as expected

Fix this by making the swait killable -- only killable by SIGKILL (kill -9).
We loose the ability to allow userspace to cancel a write with CTRL-C
(SIGINT), however its been decided the compromise to require SIGKILL is
worth the gains.

Chances of this issue occuring are low due to the number of drivers upstream
exclusively relying on the fallback mechanism for firmware (2 drivers),
however this is observed in the field with custom drivers with sysfs
triggers to load firmware. Only distributions relying on the fallback
mechanism are impacted as well. An example reported issue was on Android,
as follows:

1) Android init (pid=1) fork()s (say pid=42) [this child process is totally
   unrelated to firmware loading, it could be sleep 2; for all we care ]
2) Android init (pid=1) does a write() on a (driver custom) sysfs file which
   ends up calling request_firmware() kernel side
3) The firmware loading fallback mechanism is used, the request is sent to
   userspace and pid 1 waits in the kernel on wait_*
4) before firmware loading completes pid 42 dies (for any reason, even
   normal termination)
5) Kernel delivers SIGCHLD to pid=1 to tell it a child has died, which
   causes -ERESTARTSYS to be returned from wait_*
6) The kernel's wait aborts and return -EAGAIN for the
   request_firmware() caller.

swait was introduced as of v4.6 and the firmware_class code was modified to
use swait only as of v4.10 via commit 5b029624948d6 ("firmware: do not use
fw_lock for fw_state protection"), as such stable kernels older than v4.10
must modify the old firmware_class call:

	from  wait_for_completion_interruptible_timeout()
	to    wait_for_completion_killable_timeout()

Cc: stable <stable@vger.kernel.org> # 4.0
Suggested-by: "Eric W. Biederman" <ebiederm@xmission.com>
Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Tested-by: Martin Fuzzey <mfuzzey@parkeon.com>
Reported-by: Martin Fuzzey <mfuzzey@parkeon.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 drivers/base/firmware_class.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index b9f907eedbf7..70fc42e5e0da 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -131,7 +131,7 @@ static int __fw_state_wait_common(struct fw_state *fw_st, long timeout)
 {
 	long ret;

-	ret = swait_event_interruptible_timeout(fw_st->wq,
+	ret = swait_event_killable_timeout(fw_st->wq,
 				__fw_state_is_done(READ_ONCE(fw_st->status)),
 				timeout);
 	if (ret != 0 && fw_st->status == FW_STATUS_ABORTED)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 4/4] firmware: send -EINTR on signal abort on fallback mechanism
  2017-06-14 22:20 [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Luis R. Rodriguez
                   ` (2 preceding siblings ...)
  2017-06-14 22:20 ` [PATCH 3/4] firmware: avoid invalid fallback aborts by using killable swait Luis R. Rodriguez
@ 2017-06-14 22:20 ` Luis R. Rodriguez
  2017-06-15  7:49 ` [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Martin Fuzzey
       [not found] ` <20170614222017.14653-1-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  5 siblings, 0 replies; 38+ messages in thread
From: Luis R. Rodriguez @ 2017-06-14 22:20 UTC (permalink / raw)
  To: gregkh
  Cc: mfuzzey, ebiederm, dmitry.torokhov, wagi, dwmw2, jewalt, rafal,
	arend.vanspriel, rjw, yi1.li, atull, moritz.fischer, pmladek,
	johannes.berg, emmanuel.grumbach, luciano.coelho, kvalo, luto,
	torvalds, keescook, takahiro.akashi, dhowells, pjones, hdegoede,
	alan, tytso, mtk.manpages, paul.gortmaker, mtosatti, mawilcox,
	linux-api, linux-fsdevel, linux-kernel, L

Right now we send -EAGAIN to a syfs write which got interrupted.
Userspace can't tell what happened though, send -EINTR if we
were killed due to a signal so userspace can tell things apart.

This is only applicable to the fallback mechanism.

Reported-by: Martin Fuzzey <mfuzzey@parkeon.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 drivers/base/firmware_class.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 70fc42e5e0da..da043cb16e2f 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -1089,9 +1089,12 @@ static int _request_firmware_load(struct firmware_priv *fw_priv,
 		mutex_unlock(&fw_lock);
 	}
 
-	if (fw_state_is_aborted(&buf->fw_st))
-		retval = -EAGAIN;
-	else if (buf->is_paged_buf && !buf->data)
+	if (fw_state_is_aborted(&buf->fw_st)) {
+		if (retval == -ERESTARTSYS)
+			retval = -EINTR;
+		else
+			retval = -EAGAIN;
+	} else if (buf->is_paged_buf && !buf->data)
 		retval = -ENOMEM;
 
 	device_del(f_dev);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD
  2017-06-14 22:20 [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Luis R. Rodriguez
                   ` (3 preceding siblings ...)
  2017-06-14 22:20 ` [PATCH 4/4] firmware: send -EINTR on signal abort on fallback mechanism Luis R. Rodriguez
@ 2017-06-15  7:49 ` Martin Fuzzey
       [not found] ` <20170614222017.14653-1-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  5 siblings, 0 replies; 38+ messages in thread
From: Martin Fuzzey @ 2017-06-15  7:49 UTC (permalink / raw)
  To: Luis R. Rodriguez, gregkh
  Cc: ebiederm, dmitry.torokhov, wagi, dwmw2, jewalt, rafal,
	arend.vanspriel, rjw, yi1.li, atull, moritz.fischer, pmladek,
	johannes.berg, emmanuel.grumbach, luciano.coelho, kvalo, luto,
	torvalds, keescook, takahiro.akashi, dhowells, pjones, hdegoede,
	alan, tytso, mtk.manpages, paul.gortmaker, mtosatti, mawilcox,
	linux-api, linux-fsdevel, linux-kernel

On 15/06/17 00:20, Luis R. Rodriguez wrote:
> Martin reported an issue with Android where if sysfs is used to trigger a sync
> fw load which *relies* on the fallback mechanism and a background job completes
> while the trigger is ongoing in the foreground it will immediately fail the fw
> request.  The issue can be observed in this simple test script using the
> test_firmware driver:
>
> 	set -e
> 	/etc/init.d/udev stop
> 	modprobe test_firmware
> 	DIR=/sys/devices/virtual/misc/test_firmware
> 	echo 10 >/sys/class/firmware/timeout
> 	sleep 2 &
> 	echo -n "does-not-exist-file.bin" > "$DIR"/trigger_request
>
> The background sleep triggers the SIGCHLD signal and we fail the firmware
> request on the fallback mechanism. This was due to the type of wait used which
> ...
>
> Note that although I *feared* this might implicate any use of non-killable waits
> on other system calls, such as finit_module(), initial testing confirms this to
> not be the case.  For instance replacing the echo with modprobe on a module
> which does the same on init does not present the same issues. This could be due
> to the special SA_RESTART flag case on write() as noted above and sysfs...
> however, its not perfectly clear yet to me.
>
>
The reason the problem does not occur with modprobe is that in that case 
the processes triggering the firmware load
(modprobe) and the process dying (sleep) are *siblings* rather than 
father and child.
So the modprobe process does *not* receive a SIGCHLD when its' *brother* 
dies.

echo is a shell built-in so the process triggering the firmware load 
(the shell) and the process dying (sleep) *are* father and child.

Martin

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD
       [not found] ` <20170614222017.14653-1-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2017-06-26 21:19   ` Luis R. Rodriguez
  2017-06-29 15:14   ` Greg KH
  1 sibling, 0 replies; 38+ messages in thread
From: Luis R. Rodriguez @ 2017-06-26 21:19 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	dmitry.torokhov-Re5JQEeQqe8AvxtiuMwx3w,
	wagi-kQCPcA+X3s7YtjvyW6yDsg, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8,
	rafal-g1n6cQUeyibVItvQsEIGlw,
	arend.vanspriel-dY08KVG/lbpWk0Htik3J/w,
	rjw-LthD3rsA81gm4RdzfppkhA, yi1.li-VuQAYsv1563Yd54FQh9/CA,
	atull-DgEjT+Ai2ygdnm+yROfE0A,
	moritz.fischer-+aYTwkv1SeIAvxtiuMwx3w, pmladek-IBi9RG/b67k,
	johannes.berg-ral2JQCrhuEAvxtiuMwx3w,
	emmanuel.grumbach-ral2JQCrhuEAvxtiuMwx3w,
	luciano.coelho-ral2JQCrhuEAvxtiuMwx3w,
	kvalo-sgV2jX0FEOL9JmXXK+q4OQ, luto-DgEjT+Ai2ygdnm+yROfE0A,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	keescook-F7+t8E8rja9g9hUCZPvPmw,
	takahiro.akashi-QSEj5FYQhm4dnm+yROfE0A,
	dhowells-H+wXaHxf7aLQT0dZR+AlfA, pjones-H+wXaHxf7aLQT0dZR+AlfA,
	hdegoede-H+wXaHxf7aLQT0dZR+AlfA, alan-VuQAYsv1563Yd54FQh9/CA,
	tytso-3s7WtUTddSA, mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	paul.gortmaker-CWA4WttNNZF54TAoqtyWWQ,
	mtosatti-H+wXaHxf7aLQT0dZR+AlfA, mawilcox-0li6OtcxBFHby3iVrkZq2A,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-kernel

On Wed, Jun 14, 2017 at 03:20:13PM -0700, Luis R. Rodriguez wrote:
> Martin reported an issue with Android where if sysfs is used to trigger a sync
> fw load which *relies* on the fallback mechanism and a background job completes
> while the trigger is ongoing in the foreground it will immediately fail the fw
> request.  The issue can be observed in this simple test script using the
> test_firmware driver:
> 
> 	set -e
> 	/etc/init.d/udev stop
> 	modprobe test_firmware
> 	DIR=/sys/devices/virtual/misc/test_firmware
> 	echo 10 >/sys/class/firmware/timeout
> 	sleep 2 &
> 	echo -n "does-not-exist-file.bin" > "$DIR"/trigger_request
> 
> The background sleep triggers the SIGCHLD signal and we fail the firmware
> request on the fallback mechanism. This was due to the type of wait used which
> captures all signals, but we currently lack the killable swaits which would
> only allow killing the fallback wait on SIGKILL. This adds the missing killable
> swaits and fixes the firmware API to use it. This goes along with a test case to
> demo the issue clearly and how its fixed afterwards.  Lastly, ensure to use
> -EINTR when interrupted so callers can distinguish between an interrupted
> failure and other types of failure.
> 
> As suggested I've tagged the addition of the killable swaits and the firmware
> fix as stable. Between v4.0 and v4.10 the stable fix is to instead change
> the firmware to use wait_for_completion_killable_timeout() as the firmware
> API was only converted to swait as of v4.10.
> 
> The last patch must be applied only after the killable wait is applied to
> ensure only SIGKILL triggers an interruption. Otherwise it has been observed
> using -EINTR on other signals like SIGCHLD will trigger a restart of the call,
> if a sysfs write() is used to trigger the sync firmware request. This might be
> due what signal(7) says about using the SA_RESTART flag when write() is called,
> in such cases the call will be automatically restarted after the signal handler
> returns. The excemption here naturally seems to be on SIGKILL.
> 
> Note that although I *feared* this might implicate any use of non-killable waits
> on other system calls, such as finit_module(), initial testing confirms this to
> not be the case.  For instance replacing the echo with modprobe on a module
> which does the same on init does not present the same issues. This could be due
> to the special SA_RESTART flag case on write() as noted above and sysfs...
> however, its not perfectly clear yet to me.
> 
> As usual these patches are on my linux-next git tree, they are on the
> 20170614-fw-fixes branch based on linux-next 20170614 [0].
> 
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170614-fw-fixes

Greg, *poke*

  Luis

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
       [not found]   ` <20170614222017.14653-3-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2017-06-29 12:54     ` Greg KH
       [not found]       ` <20170629125402.GH26046-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Greg KH @ 2017-06-29 12:54 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	dmitry.torokhov-Re5JQEeQqe8AvxtiuMwx3w,
	wagi-kQCPcA+X3s7YtjvyW6yDsg, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8,
	rafal-g1n6cQUeyibVItvQsEIGlw,
	arend.vanspriel-dY08KVG/lbpWk0Htik3J/w,
	rjw-LthD3rsA81gm4RdzfppkhA, yi1.li-VuQAYsv1563Yd54FQh9/CA,
	atull-DgEjT+Ai2ygdnm+yROfE0A,
	moritz.fischer-+aYTwkv1SeIAvxtiuMwx3w, pmladek-IBi9RG/b67k,
	johannes.berg-ral2JQCrhuEAvxtiuMwx3w,
	emmanuel.grumbach-ral2JQCrhuEAvxtiuMwx3w,
	luciano.coelho-ral2JQCrhuEAvxtiuMwx3w,
	kvalo-sgV2jX0FEOL9JmXXK+q4OQ, luto-DgEjT+Ai2ygdnm+yROfE0A,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	keescook-F7+t8E8rja9g9hUCZPvPmw,
	takahiro.akashi-QSEj5FYQhm4dnm+yROfE0A,
	dhowells-H+wXaHxf7aLQT0dZR+AlfA, pjones-H+wXaHxf7aLQT0dZR+AlfA,
	hdegoede-H+wXaHxf7aLQT0dZR+AlfA, alan-VuQAYsv1563Yd54FQh9/CA,
	tytso-3s7WtUTddSA, mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	paul.gortmaker-CWA4WttNNZF54TAoqtyWWQ,
	mtosatti-H+wXaHxf7aLQT0dZR+AlfA, mawilcox-0li6OtcxBFHby3iVrkZq2A,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Wed, Jun 14, 2017 at 03:20:15PM -0700, Luis R. Rodriguez wrote:
> Code in kernel which incorrectly used the non-killable variants could
> end up having waits killed improperly. The respective killable waits
> have been upstream for a while:
> 
>   o wait_for_completion_killable()
>   o wait_for_completion_killable_timeout()
> 
> swait has been upstream since v4.6. Older kernels have had the
> above variants in place for a long time.
> 
> Cc: stable <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> # 4.6
> Signed-off-by: Luis R. Rodriguez <mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> ---
>  include/linux/swait.h | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
> 
> diff --git a/include/linux/swait.h b/include/linux/swait.h
> index c1f9c62a8a50..2c700694d50a 100644
> --- a/include/linux/swait.h
> +++ b/include/linux/swait.h
> @@ -169,4 +169,29 @@ do {									\
>  	__ret;								\
>  })
>  
> +#define __swait_event_killable(wq, condition)				\
> +	___swait_event(wq, condition, TASK_KILLABLE, 0, schedule())
> +
> +#define swait_event_killable(wq, condition)				\
> +({									\
> +	int __ret = 0;							\
> +	if (!(condition))						\
> +		__ret = __swait_event_killable(wq, condition);		\
> +	__ret;								\
> +})
> +
> +#define __swait_event_killable_timeout(wq, condition, timeout)		\
> +	___swait_event(wq, ___wait_cond_timeout(condition),		\
> +		       TASK_KILLABLE, timeout,				\
> +		       __ret = schedule_timeout(__ret))
> +
> +#define swait_event_killable_timeout(wq, condition, timeout)		\
> +({									\
> +	long __ret = timeout;						\
> +	if (!___wait_cond_timeout(condition))				\
> +		__ret = __swait_event_killable_timeout(wq,		\
> +						condition, timeout);	\
> +	__ret;								\
> +})
> +
>  #endif /* _LINUX_SWAIT_H */

Do you really still want to add these, now that we know we shouldn't be
using swait in "real" code? :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
       [not found]       ` <20170629125402.GH26046-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
@ 2017-06-29 13:05         ` Thomas Gleixner
  2017-06-29 13:35           ` Greg KH
  0 siblings, 1 reply; 38+ messages in thread
From: Thomas Gleixner @ 2017-06-29 13:05 UTC (permalink / raw)
  To: Greg KH
  Cc: Luis R. Rodriguez, mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	dmitry.torokhov-Re5JQEeQqe8AvxtiuMwx3w,
	wagi-kQCPcA+X3s7YtjvyW6yDsg, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8,
	rafal-g1n6cQUeyibVItvQsEIGlw,
	arend.vanspriel-dY08KVG/lbpWk0Htik3J/w,
	rjw-LthD3rsA81gm4RdzfppkhA, yi1.li-VuQAYsv1563Yd54FQh9/CA,
	atull-DgEjT+Ai2ygdnm+yROfE0A,
	moritz.fischer-+aYTwkv1SeIAvxtiuMwx3w, pmladek-IBi9RG/b67k,
	johannes.berg-ral2JQCrhuEAvxtiuMwx3w,
	emmanuel.grumbach-ral2JQCrhuEAvxtiuMwx3w,
	luciano.coelho-ral2JQCrhuEAvxtiuMwx3w,
	kvalo-sgV2jX0FEOL9JmXXK+q4OQ, luto-DgEjT+Ai2ygdnm+yROfE0A,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	keescook-F7+t8E8rja9g9hUCZPvPmw,
	takahiro.akashi-QSEj5FYQhm4dnm+yROfE0A,
	dhowells-H+wXaHxf7aLQT0dZR+AlfA, pjones-H+wXaHxf7aLQT0dZR+AlfA,
	hdegoede-H+wXaHxf7aLQT0dZR+AlfA, alan-VuQAYsv1563Yd54FQh9/CA,
	tytso-3s7WtUTddSA, mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	paul.gortmaker-CWA4WttNNZF54TAoqtyWWQ,
	mtosatti-H+wXaHxf7aLQT0dZR+AlfA, mawilcox-0li6OtcxBFHby3iVrkZq2A,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA

On Thu, 29 Jun 2017, Greg KH wrote:
> On Wed, Jun 14, 2017 at 03:20:15PM -0700, Luis R. Rodriguez wrote:
> Do you really still want to add these, now that we know we shouldn't be
> using swait in "real" code? :)

And who defined that it should not be used in real code?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-29 13:05         ` Thomas Gleixner
@ 2017-06-29 13:35           ` Greg KH
  2017-06-29 13:46             ` Thomas Gleixner
  0 siblings, 1 reply; 38+ messages in thread
From: Greg KH @ 2017-06-29 13:35 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Luis R. Rodriguez, mfuzzey, ebiederm, dmitry.torokhov, wagi,
	dwmw2, jewalt, rafal, arend.vanspriel, rjw, yi1.li, atull,
	moritz.fischer, pmladek, johannes.berg, emmanuel.grumbach,
	luciano.coelho, kvalo, luto, torvalds, keescook, takahiro.akashi,
	dhowells, pjones, hdegoede, alan, tytso, mtk.manpages,
	paul.gortmaker, mtosatti, mawilcox, linux-api, linux-fsdevel

On Thu, Jun 29, 2017 at 03:05:26PM +0200, Thomas Gleixner wrote:
> On Thu, 29 Jun 2017, Greg KH wrote:
> > On Wed, Jun 14, 2017 at 03:20:15PM -0700, Luis R. Rodriguez wrote:
> > Do you really still want to add these, now that we know we shouldn't be
> > using swait in "real" code? :)
> 
> And who defined that it should not be used in real code?

Linus did, in a different firmware thread.  You have to _really_ know
what you are doing to use this interface, and the firmware interface
shouldn't be using it.  So adding new apis just for firmware does not
seem like a wise decision at this point in time.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-29 13:35           ` Greg KH
@ 2017-06-29 13:46             ` Thomas Gleixner
  2017-06-29 16:13               ` Linus Torvalds
  0 siblings, 1 reply; 38+ messages in thread
From: Thomas Gleixner @ 2017-06-29 13:46 UTC (permalink / raw)
  To: Greg KH
  Cc: Luis R. Rodriguez, mfuzzey, ebiederm, dmitry.torokhov, wagi,
	dwmw2, jewalt, rafal, arend.vanspriel, rjw, yi1.li, atull,
	moritz.fischer, pmladek, johannes.berg, emmanuel.grumbach,
	luciano.coelho, kvalo, luto, torvalds, keescook, takahiro.akashi,
	dhowells, pjones, hdegoede, alan, tytso, mtk.manpages,
	paul.gortmaker, mtosatti, mawilcox, linux-api, linux-fsdevel

On Thu, 29 Jun 2017, Greg KH wrote:
> On Thu, Jun 29, 2017 at 03:05:26PM +0200, Thomas Gleixner wrote:
> > On Thu, 29 Jun 2017, Greg KH wrote:
> > > On Wed, Jun 14, 2017 at 03:20:15PM -0700, Luis R. Rodriguez wrote:
> > > Do you really still want to add these, now that we know we shouldn't be
> > > using swait in "real" code? :)
> > 
> > And who defined that it should not be used in real code?
> 
> Linus did, in a different firmware thread.  You have to _really_ know
> what you are doing to use this interface, and the firmware interface
> shouldn't be using it.  So adding new apis just for firmware does not
> seem like a wise decision at this point in time.

So it's not about code in general, it's about a particular piece of
code. Fair enough.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD
       [not found] ` <20170614222017.14653-1-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2017-06-26 21:19   ` Luis R. Rodriguez
@ 2017-06-29 15:14   ` Greg KH
       [not found]     ` <20170629151442.GA4880-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
  1 sibling, 1 reply; 38+ messages in thread
From: Greg KH @ 2017-06-29 15:14 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	dmitry.torokhov-Re5JQEeQqe8AvxtiuMwx3w,
	wagi-kQCPcA+X3s7YtjvyW6yDsg, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8,
	rafal-g1n6cQUeyibVItvQsEIGlw,
	arend.vanspriel-dY08KVG/lbpWk0Htik3J/w,
	rjw-LthD3rsA81gm4RdzfppkhA, yi1.li-VuQAYsv1563Yd54FQh9/CA,
	atull-DgEjT+Ai2ygdnm+yROfE0A,
	moritz.fischer-+aYTwkv1SeIAvxtiuMwx3w, pmladek-IBi9RG/b67k,
	johannes.berg-ral2JQCrhuEAvxtiuMwx3w,
	emmanuel.grumbach-ral2JQCrhuEAvxtiuMwx3w,
	luciano.coelho-ral2JQCrhuEAvxtiuMwx3w,
	kvalo-sgV2jX0FEOL9JmXXK+q4OQ, luto-DgEjT+Ai2ygdnm+yROfE0A,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	keescook-F7+t8E8rja9g9hUCZPvPmw,
	takahiro.akashi-QSEj5FYQhm4dnm+yROfE0A,
	dhowells-H+wXaHxf7aLQT0dZR+AlfA, pjones-H+wXaHxf7aLQT0dZR+AlfA,
	hdegoede-H+wXaHxf7aLQT0dZR+AlfA, alan-VuQAYsv1563Yd54FQh9/CA,
	tytso-3s7WtUTddSA, mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	paul.gortmaker-CWA4WttNNZF54TAoqtyWWQ,
	mtosatti-H+wXaHxf7aLQT0dZR+AlfA, mawilcox-0li6OtcxBFHby3iVrkZq2A,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Wed, Jun 14, 2017 at 03:20:13PM -0700, Luis R. Rodriguez wrote:
> Martin reported an issue with Android where if sysfs is used to trigger a sync
> fw load which *relies* on the fallback mechanism and a background job completes
> while the trigger is ongoing in the foreground it will immediately fail the fw
> request.  The issue can be observed in this simple test script using the
> test_firmware driver:

Please respin this with the swait code removed.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-29 13:46             ` Thomas Gleixner
@ 2017-06-29 16:13               ` Linus Torvalds
  2017-06-29 16:31                 ` Matthew Wilcox
                                   ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Linus Torvalds @ 2017-06-29 16:13 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Greg KH, Luis R. Rodriguez, mfuzzey, Eric W. Biederman,
	Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal,
	Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull,
	Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach,
	Coelho, Luciano, Kalle Valo, Andrew Lutomirski, Kees Cook

On Thu, Jun 29, 2017 at 6:46 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Thu, 29 Jun 2017, Greg KH wrote:
>> On Thu, Jun 29, 2017 at 03:05:26PM +0200, Thomas Gleixner wrote:
>> >
>> > And who defined that it should not be used in real code?
>>
>> Linus did, in a different firmware thread.  You have to _really_ know
>> what you are doing to use this interface, and the firmware interface
>> shouldn't be using it.  So adding new apis just for firmware does not
>> seem like a wise decision at this point in time.
>
> So it's not about code in general, it's about a particular piece of
> code. Fair enough.

Well, I'd actually say it the other way around: swait should not be
used in general, only in _very_ particular pieces of code that
actually explicitly need the odd swait semantics.

swait uses special locking and has odd semantics that are not at all
the same as the default wait queue ones. It should not be used without
very strong reasons (and honestly, the only strong enough reason seems
to be "RT").

The special locking means that swait doesn't do DEBUG_LOCK_ALLOC, but
it also means that it doesn't even work in all contexts.

So "swake_up()" does surprising things (only wake up one - that's what
caused a firmware loading bug), and "swake_up_all()" has magic rules
about interrupt disabling.

The thing is simply a collection of small hacks and should NOT be used
in general.

I never want to see a driver use that code, for example. It was
designed for RCU and RT, and it should damn well be limited to that.

              Linus

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-29 16:13               ` Linus Torvalds
@ 2017-06-29 16:31                 ` Matthew Wilcox
  2017-06-29 17:29                   ` Luis R. Rodriguez
  2017-06-29 17:40                 ` Davidlohr Bueso
  2017-06-29 19:15                 ` Marcelo Tosatti
  2 siblings, 1 reply; 38+ messages in thread
From: Matthew Wilcox @ 2017-06-29 16:31 UTC (permalink / raw)
  To: Linus Torvalds, Thomas Gleixner
  Cc: Greg KH, Luis R. Rodriguez, mfuzzey, Eric W. Biederman,
	Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal,
	Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull,
	Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach,
	Coelho, Luciano

Linus Torvalds wrote:
> The thing is simply a collection of small hacks and should NOT be used
> in general.
> 
> I never want to see a driver use that code, for example. It was
> designed for RCU and RT, and it should damn well be limited to that.

Maybe put #ifndef MODULE around the entire include file then?
And definitely remove this line:

* One would recommend using this wait queue where possible.



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-29 16:31                 ` Matthew Wilcox
@ 2017-06-29 17:29                   ` Luis R. Rodriguez
  0 siblings, 0 replies; 38+ messages in thread
From: Luis R. Rodriguez @ 2017-06-29 17:29 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Linus Torvalds, Thomas Gleixner, Greg KH, Luis R. Rodriguez,
	mfuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner,
	David Woodhouse, jewalt, rafal, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek,
	Johannes Berg, Emman

On Thu, Jun 29, 2017 at 04:31:08PM +0000, Matthew Wilcox wrote:
> Linus Torvalds wrote:
> > The thing is simply a collection of small hacks and should NOT be used
> > in general.
> > 
> > I never want to see a driver use that code, for example. It was
> > designed for RCU and RT, and it should damn well be limited to that.
> 
> Maybe put #ifndef MODULE around the entire include file then?
> And definitely remove this line:
> 
> * One would recommend using this wait queue where possible.

I'll respin and port away from swait, and add the #ifndef MODULE and special
note on swait.h.

  Luis

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD
       [not found]     ` <20170629151442.GA4880-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
@ 2017-06-29 17:29       ` Luis R. Rodriguez
  0 siblings, 0 replies; 38+ messages in thread
From: Luis R. Rodriguez @ 2017-06-29 17:29 UTC (permalink / raw)
  To: Greg KH
  Cc: Luis R. Rodriguez, mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	dmitry.torokhov-Re5JQEeQqe8AvxtiuMwx3w,
	wagi-kQCPcA+X3s7YtjvyW6yDsg, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8,
	rafal-g1n6cQUeyibVItvQsEIGlw,
	arend.vanspriel-dY08KVG/lbpWk0Htik3J/w,
	rjw-LthD3rsA81gm4RdzfppkhA, yi1.li-VuQAYsv1563Yd54FQh9/CA,
	atull-DgEjT+Ai2ygdnm+yROfE0A,
	moritz.fischer-+aYTwkv1SeIAvxtiuMwx3w, pmladek-IBi9RG/b67k,
	johannes.berg-ral2JQCrhuEAvxtiuMwx3w,
	emmanuel.grumbach-ral2JQCrhuEAvxtiuMwx3w,
	luciano.coelho-ral2JQCrhuEAvxtiuMwx3w,
	kvalo-sgV2jX0FEOL9JmXXK+q4OQ, luto-DgEjT+Ai2ygdnm+yROfE0A,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	keescook-F7+t8E8rja9g9hUCZPvPmw,
	takahiro.akashi-QSEj5FYQhm4dnm+yROfE0A,
	dhowells-H+wXaHxf7aLQT0dZR+AlfA, pjones-H+wXaHxf7aLQT0dZR+AlfA,
	hdegoede-H+wXaHxf7aLQT0dZR+AlfA, alan-VuQAYsv1563Yd54FQh9/CA,
	tytso-3s7WtUTddSA, mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	paul.gortmaker-CWA4WttNNZF54TAoqtyWWQ,
	mtosatti-H+wXaHxf7aLQT0dZR+AlfA, mawilcox-0li6OtcxBFHby3iVrkZq2A,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA

On Thu, Jun 29, 2017 at 05:14:42PM +0200, Greg KH wrote:
> On Wed, Jun 14, 2017 at 03:20:13PM -0700, Luis R. Rodriguez wrote:
> > Martin reported an issue with Android where if sysfs is used to trigger a sync
> > fw load which *relies* on the fallback mechanism and a background job completes
> > while the trigger is ongoing in the foreground it will immediately fail the fw
> > request.  The issue can be observed in this simple test script using the
> > test_firmware driver:
> 
> Please respin this with the swait code removed.

Sure thing.

  Luis

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-29 16:13               ` Linus Torvalds
  2017-06-29 16:31                 ` Matthew Wilcox
@ 2017-06-29 17:40                 ` Davidlohr Bueso
  2017-06-29 17:57                   ` Linus Torvalds
  2017-06-29 19:15                 ` Marcelo Tosatti
  2 siblings, 1 reply; 38+ messages in thread
From: Davidlohr Bueso @ 2017-06-29 17:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, mfuzzey,
	Eric W. Biederman, Dmitry Torokhov, Daniel Wagner,
	David Woodhouse, jewalt, rafal, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek,
	Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo,
	Andrew Lutomirski

On Thu, 29 Jun 2017, Linus Torvalds wrote:

>Well, I'd actually say it the other way around: swait should not be
>used in general, only in _very_ particular pieces of code that
>actually explicitly need the odd swait semantics.
>
>swait uses special locking and has odd semantics that are not at all
>the same as the default wait queue ones. It should not be used without
>very strong reasons (and honestly, the only strong enough reason seems
>to be "RT").
>
>The special locking means that swait doesn't do DEBUG_LOCK_ALLOC, but
>it also means that it doesn't even work in all contexts.
>
>So "swake_up()" does surprising things (only wake up one - that's what
>caused a firmware loading bug), and "swake_up_all()" has magic rules
>about interrupt disabling.
>
>The thing is simply a collection of small hacks and should NOT be used
>in general.

For all the above, what do you think of my 'sswait' proposal?

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-29 17:40                 ` Davidlohr Bueso
@ 2017-06-29 17:57                   ` Linus Torvalds
  2017-06-29 18:33                     ` Davidlohr Bueso
  0 siblings, 1 reply; 38+ messages in thread
From: Linus Torvalds @ 2017-06-29 17:57 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, mfuzzey,
	Eric W. Biederman, Dmitry Torokhov, Daniel Wagner,
	David Woodhouse, jewalt, rafal, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek,
	Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo,
	Andrew Lutomirski

On Thu, Jun 29, 2017 at 10:40 AM, Davidlohr Bueso <dave@stgolabs.net> wrote:
>
> For all the above, what do you think of my 'sswait' proposal?

I see no actual users of such a specialty interface.

I don't think we've *ever* had any actual problems with our current
wait-queues, apart from the RT issues, which were not about the
waitqueues themselves, but purely about RT itself.

So without some very compelling reason, I'd not want to add yet
another wait-queue.

I actually think swait is pure garbage. Most users only wake up one
process anyway, and using swait for that is stupid. If you only wake
up one, you might as well just have a single process pointer, not a
wait list at all, and then use "wake_up_process()".

There is *one* single user of swake_up_all(), and that one looks like
bogus crap also: it does it outside of the spinlock that could have
been used to protect the queue - p,lus I'm not sure there's really a
queue anyway, since I think it's just the grace-period kthread that is
there.

kvm uses swait, but doesn't use swake_up_all(), so it always just
wakes up a single waiter. That may be the right thing to do. Or it's
just another bug. I don't know. The KVM use looks broken too, since it
does

        if (swait_active(wqp)) {
                swake_up(wqp);

which is racy wrt new waiters, which implies that there is some
upper-level synchronization - possibly the same "only one thread
anyway".

So swake really looks like crap. It has crap semantics, it has crap
users, it's just broken.

The last thing we want to do is to create something _else_ specialized
like this.

                Linus

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-29 17:57                   ` Linus Torvalds
@ 2017-06-29 18:33                     ` Davidlohr Bueso
       [not found]                       ` <20170629183339.GD3954-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org>
  2017-07-05  2:06                       ` Davidlohr Bueso
  0 siblings, 2 replies; 38+ messages in thread
From: Davidlohr Bueso @ 2017-06-29 18:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, mfuzzey,
	Eric W. Biederman, Dmitry Torokhov, Daniel Wagner,
	David Woodhouse, jewalt, rafal, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek,
	Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo,
	Andrew Lutomirski

On Thu, 29 Jun 2017, Linus Torvalds wrote:

>So without some very compelling reason, I'd not want to add yet
>another wait-queue.

Yes, I was expecting this and very much agree.

I'll actually take a look at wake_q for wake_up_all() and co. to see if
we can reduce the spinlock hold times. Of course it would only make sense
for more than a one wakeup.

>I actually think swait is pure garbage. Most users only wake up one
>process anyway, and using swait for that is stupid. If you only wake
>up one, you might as well just have a single process pointer, not a
>wait list at all, and then use "wake_up_process()".

But you still need the notion of a queue, even if you wake one task
at a time... I'm probably missing your point here.

>There is *one* single user of swake_up_all(), and that one looks like
>bogus crap also: it does it outside of the spinlock that could have
>been used to protect the queue - p,lus I'm not sure there's really a
>queue anyway, since I think it's just the grace-period kthread that is
>there.

So those cases when there's only one waiter I completely agree should
not be using waitqueues. pcpu-rwsems in the past suffered from this.

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
       [not found]                       ` <20170629183339.GD3954-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org>
@ 2017-06-29 18:59                         ` Linus Torvalds
       [not found]                           ` <CA+55aFz8Mhx+A-g-5yOG-O1ZLRUR_fpeeA4iBNGH8EnDBZEdpA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Linus Torvalds @ 2017-06-29 18:59 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez,
	mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, Eric W. Biederman,
	Dmitry Torokhov, Daniel Wagner, David Woodhouse,
	jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8,
	rafal-g1n6cQUeyibVItvQsEIGlw, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull-DgEjT+Ai2ygdnm+yROfE0A,
	Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach,
	Coelho, Luciano, Kalle Valo, Andrew Lutomirski

On Thu, Jun 29, 2017 at 11:33 AM, Davidlohr Bueso <dave-h16yJtLeMjHk1uMJSBkQmQ@public.gmane.org> wrote:
> On Thu, 29 Jun 2017, Linus Torvalds wrote:
>
>> I actually think swait is pure garbage. Most users only wake up one
>> process anyway, and using swait for that is stupid. If you only wake
>> up one, you might as well just have a single process pointer, not a
>> wait list at all, and then use "wake_up_process()".
>
> But you still need the notion of a queue, even if you wake one task
> at a time... I'm probably missing your point here.

The *reason* they wake up only one seems to be that there really is
just one. It's some per-cpu idle thread for kvm, and for RCU it's the
RCU workqueue thread.

So the queue literally looks suspiciously pointless.

But I might be wrong, and there can actually be multiple entries. If
there are, I don't see why the wake-up-one semantics the code uses
would be valid, though.

                   Linus

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-29 16:13               ` Linus Torvalds
  2017-06-29 16:31                 ` Matthew Wilcox
  2017-06-29 17:40                 ` Davidlohr Bueso
@ 2017-06-29 19:15                 ` Marcelo Tosatti
       [not found]                   ` <20170629191506.GB12368-I4X2Mt4zSy4@public.gmane.org>
  2 siblings, 1 reply; 38+ messages in thread
From: Marcelo Tosatti @ 2017-06-29 19:15 UTC (permalink / raw)
  To: Linus Torvalds, h
  Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, mfuzzey,
	Eric W. Biederman, Dmitry Torokhov, Daniel Wagner,
	David Woodhouse, jewalt, rafal, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek,
	Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo,
	Andrew Lutomirski

On Thu, Jun 29, 2017 at 09:13:29AM -0700, Linus Torvalds wrote:
> On Thu, Jun 29, 2017 at 6:46 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Thu, 29 Jun 2017, Greg KH wrote:
> >> On Thu, Jun 29, 2017 at 03:05:26PM +0200, Thomas Gleixner wrote:
> >> >
> >> > And who defined that it should not be used in real code?
> >>
> >> Linus did, in a different firmware thread.  You have to _really_ know
> >> what you are doing to use this interface, and the firmware interface
> >> shouldn't be using it.  So adding new apis just for firmware does not
> >> seem like a wise decision at this point in time.
> >
> > So it's not about code in general, it's about a particular piece of
> > code. Fair enough.
> 
> Well, I'd actually say it the other way around: swait should not be
> used in general, only in _very_ particular pieces of code that
> actually explicitly need the odd swait semantics.
> 
> swait uses special locking and has odd semantics that are not at all
> the same as the default wait queue ones. It should not be used without
> very strong reasons (and honestly, the only strong enough reason seems
> to be "RT").

Performance shortcut:

https://lkml.org/lkml/2016/2/25/301

> The special locking means that swait doesn't do DEBUG_LOCK_ALLOC, but
> it also means that it doesn't even work in all contexts.
> 
> So "swake_up()" does surprising things (only wake up one - that's what
> caused a firmware loading bug), and "swake_up_all()" has magic rules
> about interrupt disabling.
> 
> The thing is simply a collection of small hacks and should NOT be used
> in general.

Its a very smart performance speed up ;-)

> I never want to see a driver use that code, for example. It was
> designed for RCU and RT, and it should damn well be limited to that.
> 
>               Linus

If KVM is the only user, feel free to remove it, you're past the point
where that performance improvement matters (due to VMX hardware
improvements).

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
       [not found]                           ` <CA+55aFz8Mhx+A-g-5yOG-O1ZLRUR_fpeeA4iBNGH8EnDBZEdpA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-06-29 19:40                             ` Luis R. Rodriguez
  2017-06-29 19:44                               ` Luis R. Rodriguez
       [not found]                               ` <20170629194015.GQ21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
  0 siblings, 2 replies; 38+ messages in thread
From: Luis R. Rodriguez @ 2017-06-29 19:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Davidlohr Bueso, Thomas Gleixner, Greg KH, Luis R. Rodriguez,
	mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, Eric W. Biederman,
	Dmitry Torokhov, Daniel Wagner, David Woodhouse,
	jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8,
	rafal-g1n6cQUeyibVItvQsEIGlw, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull-DgEjT+Ai2ygdnm+yROfE0A,
	Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach,
	Coelho, Luciano, Kalle Valo

On Thu, Jun 29, 2017 at 11:59:29AM -0700, Linus Torvalds wrote:
> On Thu, Jun 29, 2017 at 11:33 AM, Davidlohr Bueso <dave-h16yJtLeMjHk1uMJSBkQmQ@public.gmane.org> wrote:
> > On Thu, 29 Jun 2017, Linus Torvalds wrote:
> >
> >> I actually think swait is pure garbage. Most users only wake up one
> >> process anyway, and using swait for that is stupid. If you only wake
> >> up one, you might as well just have a single process pointer, not a
> >> wait list at all, and then use "wake_up_process()".
> >
> > But you still need the notion of a queue, even if you wake one task
> > at a time... I'm probably missing your point here.
> 
> The *reason* they wake up only one seems to be that there really is
> just one. It's some per-cpu idle thread for kvm, and for RCU it's the
> RCU workqueue thread.
> 
> So the queue literally looks suspiciously pointless.
> 
> But I might be wrong, and there can actually be multiple entries.

Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up
issue it would seem this does queue [0]. That said, I don't see any simple tests
tools/testing/selftests/swait but then again we don't have test for regular
waits either...

[0] https://bugzilla.kernel.org/show_bug.cgi?id=195477

> If there are, I don't see why the wake-up-one semantics the code uses
> would be valid, though.

Not sure what's wrong with it?

I believe one use case for example is for when we know that waker alone would
be able to ensure the next item in queue will also be woken up. Such was the
case for the kmod.c conversion I tested, and behold it seemed to have wored
with just swake_up(). Its obviously *fragile* though given you *assume* error
cases also wake up. In the case of kmod.c we have no such error cases but in
firmware_class.c we *do*, and actually that is part of the next set of fixes I
have to address next, but that issue would be present even if we move to wait
for completion and complete_all() is used.

  Luis

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-29 19:40                             ` Luis R. Rodriguez
@ 2017-06-29 19:44                               ` Luis R. Rodriguez
       [not found]                                 ` <20170629194455.GR21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
       [not found]                               ` <20170629194015.GQ21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
  1 sibling, 1 reply; 38+ messages in thread
From: Luis R. Rodriguez @ 2017-06-29 19:44 UTC (permalink / raw)
  To: nbroeking, Jakub Kicinski, Linus Torvalds
  Cc: Davidlohr Bueso, Thomas Gleixner, Greg KH, mfuzzey,
	Eric W. Biederman, Dmitry Torokhov, Daniel Wagner, mcgrof,
	David Woodhouse, jewalt, rafal, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek,
	Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo,
	Andrew

On Thu, Jun 29, 2017 at 09:40:15PM +0200, Luis R. Rodriguez wrote:
> On Thu, Jun 29, 2017 at 11:59:29AM -0700, Linus Torvalds wrote:
> > On Thu, Jun 29, 2017 at 11:33 AM, Davidlohr Bueso <dave@stgolabs.net> wrote:
> > > On Thu, 29 Jun 2017, Linus Torvalds wrote:
> > >
> > >> I actually think swait is pure garbage. Most users only wake up one
> > >> process anyway, and using swait for that is stupid. If you only wake
> > >> up one, you might as well just have a single process pointer, not a
> > >> wait list at all, and then use "wake_up_process()".
> > >
> > > But you still need the notion of a queue, even if you wake one task
> > > at a time... I'm probably missing your point here.
> > 
> > The *reason* they wake up only one seems to be that there really is
> > just one. It's some per-cpu idle thread for kvm, and for RCU it's the
> > RCU workqueue thread.
> > 
> > So the queue literally looks suspiciously pointless.
> > 
> > But I might be wrong, and there can actually be multiple entries.
> 
> Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up
> issue it would seem this does queue [0]. That said, I don't see any simple tests
> tools/testing/selftests/swait but then again we don't have test for regular
> waits either...
> 
> [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477

I should also note that the swake_up_all() should have only helped in cases where
3 cards were used, as if only 2 were used that should have been covered by just
the swake_up(). Unless of course I hear otherwise by the reporter, Nicolas or
from Jakub.

  Luis

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
       [not found]                               ` <20170629194015.GQ21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
@ 2017-06-29 20:57                                 ` Linus Torvalds
  0 siblings, 0 replies; 38+ messages in thread
From: Linus Torvalds @ 2017-06-29 20:57 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Davidlohr Bueso, Thomas Gleixner, Greg KH,
	mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, Eric W. Biederman,
	Dmitry Torokhov, Daniel Wagner, David Woodhouse,
	jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8,
	rafal-g1n6cQUeyibVItvQsEIGlw, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull-DgEjT+Ai2ygdnm+yROfE0A,
	Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach,
	Coelho, Luciano, Kalle Valo, Andrew Lutomirski

On Thu, Jun 29, 2017 at 12:40 PM, Luis R. Rodriguez <mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Thu, Jun 29, 2017 at 11:59:29AM -0700, Linus Torvalds wrote:
>> > at a time... I'm probably missing your point here.
>>
>> The *reason* they wake up only one seems to be that there really is
>> just one. It's some per-cpu idle thread for kvm, and for RCU it's the
>> RCU workqueue thread.
>>
>> So the queue literally looks suspiciously pointless.
>>
>> But I might be wrong, and there can actually be multiple entries.
>
> Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up
> issue it would seem this does queue [0].

I'm not talking about the firmware code.

That thing never had an excuse to use swait in the first place.

I'm talking about kvm and rcu, which *do* have excuses to use it, but
where I argue that swait is _still_ a questionable interface for other
reasons.

                Linus

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
       [not found]                                 ` <20170629194455.GR21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
@ 2017-06-29 20:58                                   ` Jakub Kicinski
  2017-06-29 22:50                                     ` Luis R. Rodriguez
  0 siblings, 1 reply; 38+ messages in thread
From: Jakub Kicinski @ 2017-06-29 20:58 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: nbroeking-BUHhN+a2lJ4, Linus Torvalds, Davidlohr Bueso,
	Thomas Gleixner, Greg KH, mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ,
	Eric W. Biederman, Dmitry Torokhov, Daniel Wagner,
	David Woodhouse, jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8,
	rafal-g1n6cQUeyibVItvQsEIGlw, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull-DgEjT+Ai2ygdnm+yROfE0A,
	Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach,
	Coelho, Luciano

On Thu, 29 Jun 2017 21:44:55 +0200, Luis R. Rodriguez wrote:
> On Thu, Jun 29, 2017 at 09:40:15PM +0200, Luis R. Rodriguez wrote:
> > On Thu, Jun 29, 2017 at 11:59:29AM -0700, Linus Torvalds wrote:  
> > > On Thu, Jun 29, 2017 at 11:33 AM, Davidlohr Bueso <dave-h16yJtLeMjHk1uMJSBkQmQ@public.gmane.org> wrote:  
> > > > On Thu, 29 Jun 2017, Linus Torvalds wrote:
> > > >  
> > > >> I actually think swait is pure garbage. Most users only wake up one
> > > >> process anyway, and using swait for that is stupid. If you only wake
> > > >> up one, you might as well just have a single process pointer, not a
> > > >> wait list at all, and then use "wake_up_process()".  
> > > >
> > > > But you still need the notion of a queue, even if you wake one task
> > > > at a time... I'm probably missing your point here.  
> > > 
> > > The *reason* they wake up only one seems to be that there really is
> > > just one. It's some per-cpu idle thread for kvm, and for RCU it's the
> > > RCU workqueue thread.
> > > 
> > > So the queue literally looks suspiciously pointless.
> > > 
> > > But I might be wrong, and there can actually be multiple entries.  
> > 
> > Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up
> > issue it would seem this does queue [0]. That said, I don't see any simple tests
> > tools/testing/selftests/swait but then again we don't have test for regular
> > waits either...
> > 
> > [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477  
> 
> I should also note that the swake_up_all() should have only helped in cases where
> 3 cards were used, as if only 2 were used that should have been covered by just
> the swake_up(). Unless of course I hear otherwise by the reporter, Nicolas or
> from Jakub.

I was hitting this with 2 cards.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-29 20:58                                   ` Jakub Kicinski
@ 2017-06-29 22:50                                     ` Luis R. Rodriguez
  2017-06-29 22:53                                       ` Jakub Kicinski
       [not found]                                       ` <20170629225003.GU21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
  0 siblings, 2 replies; 38+ messages in thread
From: Luis R. Rodriguez @ 2017-06-29 22:50 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Luis R. Rodriguez, nbroeking, Linus Torvalds, Davidlohr Bueso,
	Thomas Gleixner, Greg KH, mfuzzey, Eric W. Biederman,
	Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal,
	Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull,
	Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach,
	Coel

On Thu, Jun 29, 2017 at 01:58:22PM -0700, Jakub Kicinski wrote:
> On Thu, 29 Jun 2017 21:44:55 +0200, Luis R. Rodriguez wrote:
> > > Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up
> > > issue it would seem this does queue [0]. That said, I don't see any simple tests
> > > tools/testing/selftests/swait but then again we don't have test for regular
> > > waits either...
> > > 
> > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477  
> > 
> > I should also note that the swake_up_all() should have only helped in cases where
> > 3 cards were used, as if only 2 were used that should have been covered by just
> > the swake_up(). Unless of course I hear otherwise by the reporter, Nicolas or
> > from Jakub.
> 
> I was hitting this with 2 cards.

Thanks!

Thing is I'm not convinced the issue with 2 cards was the swake_up() Vs
swake_up_all() in this case though. Let's recall also the missing wake up on
errors! And the fact that netronome has optional firmware, which naturally can
fail.

So could the issue with 2 cards instead of the miss of a wake up on error due
to batched requests ? If so then that still would not put blame on the
swake_up()!

We can find out by you testing the series I just posted [0] and if that did not
fix the issue then try this patch, which I do expect to actually have fixed
most issues considering optional firmware.

ie, I expect the combination of both to fix your issues, not just the last
series I just posted [0]. If you want this in git form you can find all of
the patches bundled on the 20170629-fw-fixes-wait-v4 branch [1]. I just
wrote this patch it but it seems to have not broken the tests:

$ sudo tools/testing/selftests/firmware/fw_fallback.sh 
tools/testing/selftests/firmware/fw_fallback.sh: timeout works
tools/testing/selftests/firmware/fw_fallback.sh: firmware comparison works
tools/testing/selftests/firmware/fw_fallback.sh: fallback mechanism works
tools/testing/selftests/firmware/fw_fallback.sh: cancelling fallback mechanism works
tools/testing/selftests/firmware/fw_fallback.sh: custom fallback loading mechanism works
tools/testing/selftests/firmware/fw_fallback.sh: cancelling custom fallback mechanism works
tools/testing/selftests/firmware/fw_fallback.sh: SIGCHLD on sync ignored as expected

$ sudo tools/testing/selftests/firmware/fw_filesystem.sh 
tools/testing/selftests/firmware/fw_filesystem.sh: timeout works
tools/testing/selftests/firmware/fw_filesystem.sh: filesystem loading works
tools/testing/selftests/firmware/fw_filesystem.sh: async filesystem loading works

[0] https://lkml.kernel.org/r/20170629205151.5329-1-mcgrof@kernel.org
[1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170629-fw-fixes-wait-v4

 Luis

>From cb7fee12c6d539405793e883dfd79e0b21c2baad Mon Sep 17 00:00:00 2001
From: "Luis R. Rodriguez" <mcgrof@kernel.org>
Date: Thu, 29 Jun 2017 15:19:04 -0700
Subject: [RFT] firmware: send wake up on failure for batched requests

Fix batched requests from waiting forever on failure.

The firmware API supports "batched requests" which means requests with
the same name share the same lookup effort. They wait for the first
request to complete, however they are set to always wait for what seem
to be forever (MAX_SCHEDULE_TIMEOUT).

We currently handle informing waited batched requests on success but we
never seem to have sent smoke signals of any kind on failure! This
should mean secondary requests batched in seem to just wait forever when
the request fails.

For device drivers with optional firmware schemes (Intel, or Netronome),
this could mean that when you boot a system with multiple cards the
firmware will seem to never load on the system, or that the card is just
not responsive even the driver initialized. Due to differences in scheduling
possible this should not always trigger, so triggering batched requests
actually needs to be triggered for this to be an issue.

Its reported that at least with the Intel WiFi cards on one system this
issue was creeping up 50% of the boots [0].

[0] https://bugzilla.kernel.org/show_bug.cgi?id=195477

Reported-by: Nicolas <nbroeking@me.com>
Reported-by: John Ewalt  <jewalt@lgsinnovations.com>
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 drivers/base/firmware_class.c | 40 ++++++++++++++++++++++++++++++++--------
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index d3d071dbd2a5..40d1351660c0 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -152,28 +152,27 @@ static void __fw_state_set(struct fw_state *fw_st,
 	__fw_state_set(fw_st, FW_STATUS_LOADING)
 #define fw_state_done(fw_st)					\
 	__fw_state_set(fw_st, FW_STATUS_DONE)
+#define fw_state_aborted(fw_st)					\
+	__fw_state_set(fw_st, FW_STATUS_ABORTED)
 #define fw_state_wait(fw_st)					\
 	__fw_state_wait_common(fw_st, MAX_SCHEDULE_TIMEOUT)
 
-#ifndef CONFIG_FW_LOADER_USER_HELPER
-
-#define fw_state_is_aborted(fw_st)	false
-
-#else /* CONFIG_FW_LOADER_USER_HELPER */
-
 static int __fw_state_check(struct fw_state *fw_st, enum fw_status status)
 {
 	return fw_st->status == status;
 }
 
+#define fw_state_is_aborted(fw_st)				\
+	__fw_state_check(fw_st, FW_STATUS_ABORTED)
+
+#ifdef CONFIG_FW_LOADER_USER_HELPER
+
 #define fw_state_aborted(fw_st)					\
 	__fw_state_set(fw_st, FW_STATUS_ABORTED)
 #define fw_state_is_done(fw_st)					\
 	__fw_state_check(fw_st, FW_STATUS_DONE)
 #define fw_state_is_loading(fw_st)				\
 	__fw_state_check(fw_st, FW_STATUS_LOADING)
-#define fw_state_is_aborted(fw_st)				\
-	__fw_state_check(fw_st, FW_STATUS_ABORTED)
 #define fw_state_wait_timeout(fw_st, timeout)			\
 	__fw_state_wait_common(fw_st, timeout)
 
@@ -332,6 +331,7 @@ static struct firmware_buf *__fw_lookup_buf(const char *fw_name)
 	return NULL;
 }
 
+/* Returns 1 for batching firmware requests with the same name */
 static int fw_lookup_and_allocate_buf(const char *fw_name,
 				      struct firmware_cache *fwc,
 				      struct firmware_buf **buf, void *dbuf,
@@ -345,6 +345,7 @@ static int fw_lookup_and_allocate_buf(const char *fw_name,
 		kref_get(&tmp->ref);
 		spin_unlock(&fwc->lock);
 		*buf = tmp;
+		/* requests are batched ! */
 		return 1;
 	}
 	tmp = __allocate_fw_buf(fw_name, fwc, dbuf, size);
@@ -1200,6 +1201,28 @@ _request_firmware_prepare(struct firmware **firmware_p, const char *name,
 	return 1; /* need to load */
 }
 
+/*
+ * Batched requests need only one wake, we need to do this step last due to the
+ * fallback mechanism. The buf is protected with kref_get(), and it won't be
+ * released until the last user calls release_firmware().
+ *
+ * Failed batched requests are possible as well, in such cases we just share
+ * the struct firmware_buf and won't release it until all requests are woken
+ * and have gone through this same path.
+ */
+static void fw_abort_batch_reqs(struct firmware *fw)
+{
+	struct firmware_buf *buf;
+
+	/* Loaded directly? */
+	if (!fw || !fw->priv)
+		return;
+
+	buf = fw->priv;
+	if (!fw_state_is_aborted(&buf->fw_st))
+		fw_state_aborted(&buf->fw_st);
+}
+
 /* called from request_firmware() and request_firmware_work_func() */
 static int
 _request_firmware(const struct firmware **firmware_p, const char *name,
@@ -1243,6 +1266,7 @@ _request_firmware(const struct firmware **firmware_p, const char *name,
 
  out:
 	if (ret < 0) {
+		fw_abort_batch_reqs(fw);
 		release_firmware(fw);
 		fw = NULL;
 	}
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-29 22:50                                     ` Luis R. Rodriguez
@ 2017-06-29 22:53                                       ` Jakub Kicinski
  2017-06-29 23:00                                         ` Luis R. Rodriguez
       [not found]                                       ` <20170629225003.GU21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
  1 sibling, 1 reply; 38+ messages in thread
From: Jakub Kicinski @ 2017-06-29 22:53 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: nbroeking, Linus Torvalds, Davidlohr Bueso, Thomas Gleixner,
	Greg KH, mfuzzey, Eric W. Biederman, Dmitry Torokhov,
	Daniel Wagner, David Woodhouse, jewalt, rafal, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek,
	Johannes Berg, Emmanuel Grumbach, Coelho, Luciano

On Fri, 30 Jun 2017 00:50:03 +0200, Luis R. Rodriguez wrote:
> On Thu, Jun 29, 2017 at 01:58:22PM -0700, Jakub Kicinski wrote:
> > On Thu, 29 Jun 2017 21:44:55 +0200, Luis R. Rodriguez wrote:  
> > > > Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up
> > > > issue it would seem this does queue [0]. That said, I don't see any simple tests
> > > > tools/testing/selftests/swait but then again we don't have test for regular
> > > > waits either...
> > > > 
> > > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477    
> > > 
> > > I should also note that the swake_up_all() should have only helped in cases where
> > > 3 cards were used, as if only 2 were used that should have been covered by just
> > > the swake_up(). Unless of course I hear otherwise by the reporter, Nicolas or
> > > from Jakub.  
> > 
> > I was hitting this with 2 cards.  
> 
> Thanks!
> 
> Thing is I'm not convinced the issue with 2 cards was the swake_up() Vs
> swake_up_all() in this case though. Let's recall also the missing wake up on
> errors! And the fact that netronome has optional firmware, which naturally can
> fail.
> 
> So could the issue with 2 cards instead of the miss of a wake up on error due
> to batched requests ? If so then that still would not put blame on the
> swake_up()!

Sorry, that was during manual tests.  I had the driver request the
firmware with _nowait() without an uevent, and then after I manually
wrote -1 to loading only one would get woken up.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-29 22:53                                       ` Jakub Kicinski
@ 2017-06-29 23:00                                         ` Luis R. Rodriguez
  2017-06-29 23:06                                           ` Jakub Kicinski
  0 siblings, 1 reply; 38+ messages in thread
From: Luis R. Rodriguez @ 2017-06-29 23:00 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Nicolas Broeking, Linus Torvalds, Davidlohr Bueso,
	Thomas Gleixner, Greg KH, Martin Fuzzey, Eric W. Biederman,
	Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal,
	Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull,
	Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach,
	Co

On Thu, Jun 29, 2017 at 3:53 PM, Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
> On Fri, 30 Jun 2017 00:50:03 +0200, Luis R. Rodriguez wrote:
>> On Thu, Jun 29, 2017 at 01:58:22PM -0700, Jakub Kicinski wrote:
>> > On Thu, 29 Jun 2017 21:44:55 +0200, Luis R. Rodriguez wrote:
>> > > > Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up
>> > > > issue it would seem this does queue [0]. That said, I don't see any simple tests
>> > > > tools/testing/selftests/swait but then again we don't have test for regular
>> > > > waits either...
>> > > >
>> > > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477
>> > >
>> > > I should also note that the swake_up_all() should have only helped in cases where
>> > > 3 cards were used, as if only 2 were used that should have been covered by just
>> > > the swake_up(). Unless of course I hear otherwise by the reporter, Nicolas or
>> > > from Jakub.
>> >
>> > I was hitting this with 2 cards.
>>
>> Thanks!
>>
>> Thing is I'm not convinced the issue with 2 cards was the swake_up() Vs
>> swake_up_all() in this case though. Let's recall also the missing wake up on
>> errors! And the fact that netronome has optional firmware, which naturally can
>> fail.
>>
>> So could the issue with 2 cards instead of the miss of a wake up on error due
>> to batched requests ? If so then that still would not put blame on the
>> swake_up()!
>
> Sorry, that was during manual tests.  I had the driver request the
> firmware with _nowait() without an uevent,

Can you provide the output of

grep  CONFIG_FW_LOADER_USER_HELPER .config

I'd like to see if CONFIG_FW_LOADER_USER_HELPER_FALLBACK was disabled.
Not to be confused with the CONFIG_FW_LOADER_USER_HELPER.

When the uevent is not set its also known as "a custom fallback
mechanism" and by default that has set a timeout of
MAX_SCHEDULE_TIMEOUT, which means that even if the direct filesystem
lookup failed the fallback mechanism would be used and just sit and
wait for what seems to be forever until user input was provided.

> and then after I manually
> wrote -1 to loading only one would get woken up.

Great, this helps, so for those not familiar with the firmware sysfs
fallback mechanism:

The relevant values one could use are:

1: Start a load, discarding any previous partial load.
0: Conclude the load and hand the data to the driver code.
-1: Conclude the load with an error and discard any written data.

So you are triggering a failure. And indeed I expect the patch I just
provided to be the one to fix your issue with 2 cards.

  Luis

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-29 23:00                                         ` Luis R. Rodriguez
@ 2017-06-29 23:06                                           ` Jakub Kicinski
  0 siblings, 0 replies; 38+ messages in thread
From: Jakub Kicinski @ 2017-06-29 23:06 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Nicolas Broeking, Linus Torvalds, Davidlohr Bueso,
	Thomas Gleixner, Greg KH, Martin Fuzzey, Eric W. Biederman,
	Dmitry Torokhov, Daniel Wagner, David Woodhouse, jewalt, rafal,
	Arend Van Spriel, Rafael J. Wysocki, Li, Yi, atull,
	Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach,
	Co

On Thu, 29 Jun 2017 16:00:32 -0700, Luis R. Rodriguez wrote:
> On Thu, Jun 29, 2017 at 3:53 PM, Jakub Kicinski
> <jakub.kicinski@netronome.com> wrote:
> > On Fri, 30 Jun 2017 00:50:03 +0200, Luis R. Rodriguez wrote:  
> >> On Thu, Jun 29, 2017 at 01:58:22PM -0700, Jakub Kicinski wrote:  
> >> > On Thu, 29 Jun 2017 21:44:55 +0200, Luis R. Rodriguez wrote:  
> >> > > > Since this swake_up() --> swake_up_all() reportedly *fixed* the one wake up
> >> > > > issue it would seem this does queue [0]. That said, I don't see any simple tests
> >> > > > tools/testing/selftests/swait but then again we don't have test for regular
> >> > > > waits either...
> >> > > >
> >> > > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477  
> >> > >
> >> > > I should also note that the swake_up_all() should have only helped in cases where
> >> > > 3 cards were used, as if only 2 were used that should have been covered by just
> >> > > the swake_up(). Unless of course I hear otherwise by the reporter, Nicolas or
> >> > > from Jakub.  
> >> >
> >> > I was hitting this with 2 cards.  
> >>
> >> Thanks!
> >>
> >> Thing is I'm not convinced the issue with 2 cards was the swake_up() Vs
> >> swake_up_all() in this case though. Let's recall also the missing wake up on
> >> errors! And the fact that netronome has optional firmware, which naturally can
> >> fail.
> >>
> >> So could the issue with 2 cards instead of the miss of a wake up on error due
> >> to batched requests ? If so then that still would not put blame on the
> >> swake_up()!  
> >
> > Sorry, that was during manual tests.  I had the driver request the
> > firmware with _nowait() without an uevent,  
> 
> Can you provide the output of
> 
> grep  CONFIG_FW_LOADER_USER_HELPER .config
> 
> I'd like to see if CONFIG_FW_LOADER_USER_HELPER_FALLBACK was disabled.
> Not to be confused with the CONFIG_FW_LOADER_USER_HELPER.
> 
> When the uevent is not set its also known as "a custom fallback
> mechanism" and by default that has set a timeout of
> MAX_SCHEDULE_TIMEOUT, which means that even if the direct filesystem
> lookup failed the fallback mechanism would be used and just sit and
> wait for what seems to be forever until user input was provided.

FWIW

CONFIG_FW_LOADER=y
CONFIG_FW_LOADER_USER_HELPER=y
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y
# CONFIG_FW_CFG_SYSFS is not set

> > and then after I manually
> > wrote -1 to loading only one would get woken up.  
> 
> Great, this helps, so for those not familiar with the firmware sysfs
> fallback mechanism:
> 
> The relevant values one could use are:
> 
> 1: Start a load, discarding any previous partial load.
> 0: Conclude the load and hand the data to the driver code.
> -1: Conclude the load with an error and discard any written data.
> 
> So you are triggering a failure. And indeed I expect the patch I just
> provided to be the one to fix your issue with 2 cards.

Cool, thanks.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
       [not found]                   ` <20170629191506.GB12368-I4X2Mt4zSy4@public.gmane.org>
@ 2017-06-30  4:03                     ` Linus Torvalds
  2017-06-30 11:55                       ` Marcelo Tosatti
                                         ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Linus Torvalds @ 2017-06-30  4:03 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: h, Thomas Gleixner, Greg KH, Luis R. Rodriguez, Martin Fuzzey,
	Eric W. Biederman, Dmitry Torokhov, Daniel Wagner,
	David Woodhouse, jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8,
	rafal-g1n6cQUeyibVItvQsEIGlw, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull-DgEjT+Ai2ygdnm+yROfE0A,
	Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach,
	Coelho, Luciano, Kalle Valo

On Thu, Jun 29, 2017 at 12:15 PM, Marcelo Tosatti <mtosatti-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On Thu, Jun 29, 2017 at 09:13:29AM -0700, Linus Torvalds wrote:
>>
>> swait uses special locking and has odd semantics that are not at all
>> the same as the default wait queue ones. It should not be used without
>> very strong reasons (and honestly, the only strong enough reason seems
>> to be "RT").
>
> Performance shortcut:
>
> https://lkml.org/lkml/2016/2/25/301

Yes, I know why kvm uses it, I just don't think it's necessarily the
right thing.

That kvm commit is actually a great example: it uses swake_up() from
an interrupt, and that's in fact the *reason* it uses swake_up().

But that also fundamentally means that it cannot use swake_up_all(),
so it basically *relies* on there only ever being one single entry
that needs to be woken up.

And as far as I can tell, it really is because the queue only ever has
one entry (ie it's per-vcpu, and when the vcpu is blocked, it's
blocked - so no other user will be waiting there).

So it isn't that you migth queue multiple entries and then just wake
them up one at a time. There really is just one entry at a time,
right?

And that means that swait is actuially completely the wrong thing to
do. It's more expensive and more complex than just saving the single
process pointer away and just doing "wake_up_process()".

Now, it really is entirely possible that I'm missing something, but it
does look like that to me.

We've had wake_up_process() since pretty much day #1. THAT is the
fastest and simplest direct wake-up there is, not some "simple
wait-queue".

Now, admittedly I don't know the code and really may be entirely off,
but looking at the commit (no need to go to the lkml archives - it's
commit 8577370fb0cb ("KVM: Use simple waitqueue for vcpu->wq") in
mainline), I really think the swait() use is simply not correct if
there can be multiple waiters, exactly because swake_up() only wakes
up a single entry.

So either there is only a single entry, or *all* the code like

        dvcpu->arch.wait = 0;

-       if (waitqueue_active(&dvcpu->wq))
-               wake_up_interruptible(&dvcpu->wq);
+       if (swait_active(&dvcpu->wq))
+               swake_up(&dvcpu->wq);

is simply wrong. If there are multiple blockers, and you just cleared
"arch.wait", I think they should *all* be woken up. And that's not
what swake_up() does.

So I think that kvm_vcpu_block() could easily have instead done

    vcpu->process = current;

as the "prepare_to_wait()" part, and "finish_wait()" would be to just
clear vcpu->process. No wait-queue, just a single pointer to the
single blocking thread.

(Of course, you still need serialization, so that
"wake_up_process(vcpu->process)" doesn't end up using a stale value,
but since processes are already freed with RCU because of other things
like that, the serialization is very low-cost, you only need to be
RCU-read safe when waking up).

See what I'm saying?

Note that "wake_up_process()" really is fairly widely used. It's
widely used because it's fairly obvious, and because that really *is*
the lowest-possible cost: a single pointer to the sleeping thread, and
you can often do almost no locking at all.

And unlike swake_up(), it's obvious that you only wake up a single thread.

           Linus

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-30  4:03                     ` Linus Torvalds
@ 2017-06-30 11:55                       ` Marcelo Tosatti
  2017-06-30 11:57                       ` Marcelo Tosatti
  2017-06-30 17:30                       ` Krister Johansen
  2 siblings, 0 replies; 38+ messages in thread
From: Marcelo Tosatti @ 2017-06-30 11:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: h, Thomas Gleixner, Greg KH, Luis R. Rodriguez, Martin Fuzzey,
	Eric W. Biederman, Dmitry Torokhov, Daniel Wagner,
	David Woodhouse, jewalt, rafal, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek,
	Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo

On Thu, Jun 29, 2017 at 09:03:42PM -0700, Linus Torvalds wrote:
> On Thu, Jun 29, 2017 at 12:15 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Thu, Jun 29, 2017 at 09:13:29AM -0700, Linus Torvalds wrote:
> >>
> >> swait uses special locking and has odd semantics that are not at all
> >> the same as the default wait queue ones. It should not be used without
> >> very strong reasons (and honestly, the only strong enough reason seems
> >> to be "RT").
> >
> > Performance shortcut:
> >
> > https://lkml.org/lkml/2016/2/25/301
> 
> Yes, I know why kvm uses it, I just don't think it's necessarily the
> right thing.
> 
> That kvm commit is actually a great example: it uses swake_up() from
> an interrupt, and that's in fact the *reason* it uses swake_up().
> 
> But that also fundamentally means that it cannot use swake_up_all(),
> so it basically *relies* on there only ever being one single entry
> that needs to be woken up.
> 
> And as far as I can tell, it really is because the queue only ever has
> one entry (ie it's per-vcpu, and when the vcpu is blocked, it's
> blocked - so no other user will be waiting there).

Exactly.
> 
> So it isn't that you migth queue multiple entries and then just wake
> them up one at a time. There really is just one entry at a time,
> right?

Yes.

> And that means that swait is actuially completely the wrong thing to
> do. It's more expensive and more complex than just saving the single
> process pointer away and just doing "wake_up_process()".

Aha, i see.

> 
> Now, it really is entirely possible that I'm missing something, but it
> does look like that to me.

Just drop it -- the optimization is not relevant anymore given VMX
hardware improvements.

> We've had wake_up_process() since pretty much day #1. THAT is the
> fastest and simplest direct wake-up there is, not some "simple
> wait-queue".
> 
> Now, admittedly I don't know the code and really may be entirely off,
> but looking at the commit (no need to go to the lkml archives - it's
> commit 8577370fb0cb ("KVM: Use simple waitqueue for vcpu->wq") in
> mainline), I really think the swait() use is simply not correct if
> there can be multiple waiters, exactly because swake_up() only wakes
> up a single entry.

There can't be: its one emulated LAPIC per vcpu. So only one vcpu
waits for that waitqueue.

> So either there is only a single entry, or *all* the code like
> 
>         dvcpu->arch.wait = 0;
> 
> -       if (waitqueue_active(&dvcpu->wq))
> -               wake_up_interruptible(&dvcpu->wq);
> +       if (swait_active(&dvcpu->wq))
> +               swake_up(&dvcpu->wq);
> 
> is simply wrong. If there are multiple blockers, and you just cleared
> "arch.wait", I think they should *all* be woken up. And that's not
> what swake_up() does.
> 
> So I think that kvm_vcpu_block() could easily have instead done
> 
>     vcpu->process = current;
> 
> as the "prepare_to_wait()" part, and "finish_wait()" would be to just
> clear vcpu->process. No wait-queue, just a single pointer to the
> single blocking thread.
> 
> (Of course, you still need serialization, so that
> "wake_up_process(vcpu->process)" doesn't end up using a stale value,
> but since processes are already freed with RCU because of other things
> like that, the serialization is very low-cost, you only need to be
> RCU-read safe when waking up).
> 
> See what I'm saying?
> 
> Note that "wake_up_process()" really is fairly widely used. It's
> widely used because it's fairly obvious, and because that really *is*
> the lowest-possible cost: a single pointer to the sleeping thread, and
> you can often do almost no locking at all.
> 
> And unlike swake_up(), it's obvious that you only wake up a single thread.
> 
>            Linus

Feel free to drop the KVM usage... agreed the interface is a special 
case and a generic one which handles multiple waiters 
and debugging etc should be preferred.

Not sure if other people are using it, thought.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-30  4:03                     ` Linus Torvalds
  2017-06-30 11:55                       ` Marcelo Tosatti
@ 2017-06-30 11:57                       ` Marcelo Tosatti
  2017-06-30 17:30                       ` Krister Johansen
  2 siblings, 0 replies; 38+ messages in thread
From: Marcelo Tosatti @ 2017-06-30 11:57 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: h, Thomas Gleixner, Greg KH, Luis R. Rodriguez, Martin Fuzzey,
	Eric W. Biederman, Dmitry Torokhov, Daniel Wagner,
	David Woodhouse, jewalt, rafal, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek,
	Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo


On Thu, Jun 29, 2017 at 09:03:42PM -0700, Linus Torvalds wrote:
> On Thu, Jun 29, 2017 at 12:15 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Thu, Jun 29, 2017 at 09:13:29AM -0700, Linus Torvalds wrote:
> >>
> >> swait uses special locking and has odd semantics that are not at all
> >> the same as the default wait queue ones. It should not be used without
> >> very strong reasons (and honestly, the only strong enough reason seems
> >> to be "RT").
> >
> > Performance shortcut:
> >
> > https://lkml.org/lkml/2016/2/25/301
> 
> Yes, I know why kvm uses it, I just don't think it's necessarily the
> right thing.
> 
> That kvm commit is actually a great example: it uses swake_up() from
> an interrupt, and that's in fact the *reason* it uses swake_up().
> 
> But that also fundamentally means that it cannot use swake_up_all(),
> so it basically *relies* on there only ever being one single entry
> that needs to be woken up.
> 
> And as far as I can tell, it really is because the queue only ever has
> one entry (ie it's per-vcpu, and when the vcpu is blocked, it's
> blocked - so no other user will be waiting there).

Exactly.
> 
> So it isn't that you migth queue multiple entries and then just wake
> them up one at a time. There really is just one entry at a time,
> right?

Yes.

> And that means that swait is actuially completely the wrong thing to
> do. It's more expensive and more complex than just saving the single
> process pointer away and just doing "wake_up_process()".

Aha, i see.

> 
> Now, it really is entirely possible that I'm missing something, but it
> does look like that to me.

Just drop it -- the optimization is not relevant anymore given VMX
hardware improvements.

> We've had wake_up_process() since pretty much day #1. THAT is the
> fastest and simplest direct wake-up there is, not some "simple
> wait-queue".
> 
> Now, admittedly I don't know the code and really may be entirely off,
> but looking at the commit (no need to go to the lkml archives - it's
> commit 8577370fb0cb ("KVM: Use simple waitqueue for vcpu->wq") in
> mainline), I really think the swait() use is simply not correct if
> there can be multiple waiters, exactly because swake_up() only wakes
> up a single entry.

There can't be: its one emulated LAPIC per vcpu. So only one vcpu
waits for that waitqueue.

> So either there is only a single entry, or *all* the code like
> 
>         dvcpu->arch.wait = 0;
> 
> -       if (waitqueue_active(&dvcpu->wq))
> -               wake_up_interruptible(&dvcpu->wq);
> +       if (swait_active(&dvcpu->wq))
> +               swake_up(&dvcpu->wq);
> 
> is simply wrong. If there are multiple blockers, and you just cleared
> "arch.wait", I think they should *all* be woken up. And that's not
> what swake_up() does.
> 
> So I think that kvm_vcpu_block() could easily have instead done
> 
>     vcpu->process = current;
> 
> as the "prepare_to_wait()" part, and "finish_wait()" would be to just
> clear vcpu->process. No wait-queue, just a single pointer to the
> single blocking thread.
> 
> (Of course, you still need serialization, so that
> "wake_up_process(vcpu->process)" doesn't end up using a stale value,
> but since processes are already freed with RCU because of other things
> like that, the serialization is very low-cost, you only need to be
> RCU-read safe when waking up).
> 
> See what I'm saying?
> 
> Note that "wake_up_process()" really is fairly widely used. It's
> widely used because it's fairly obvious, and because that really *is*
> the lowest-possible cost: a single pointer to the sleeping thread, and
> you can often do almost no locking at all.
> 
> And unlike swake_up(), it's obvious that you only wake up a single thread.
> 
>            Linus

Feel free to drop the KVM usage... agreed the interface is a special 
case and a generic one which handles multiple waiters 
and has debugging etc should be preferred to avoid bugs

Not sure if other people are using it (swait). 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-30  4:03                     ` Linus Torvalds
  2017-06-30 11:55                       ` Marcelo Tosatti
  2017-06-30 11:57                       ` Marcelo Tosatti
@ 2017-06-30 17:30                       ` Krister Johansen
  2 siblings, 0 replies; 38+ messages in thread
From: Krister Johansen @ 2017-06-30 17:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Marcelo Tosatti, h, Thomas Gleixner, Greg KH, Luis R. Rodriguez,
	Martin Fuzzey, Eric W. Biederman, Dmitry Torokhov, Daniel Wagner,
	David Woodhouse, jewalt, rafal, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek,
	Johannes Berg, Emmanuel Grumbach, Coelho, Luciano

On Thu, Jun 29, 2017 at 09:03:42PM -0700, Linus Torvalds wrote:
> On Thu, Jun 29, 2017 at 12:15 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Thu, Jun 29, 2017 at 09:13:29AM -0700, Linus Torvalds wrote:
> >>
> >> swait uses special locking and has odd semantics that are not at all
> >> the same as the default wait queue ones. It should not be used without
> >> very strong reasons (and honestly, the only strong enough reason seems
> >> to be "RT").
> >
> > Performance shortcut:
> >
> > https://lkml.org/lkml/2016/2/25/301
> 
> Now, admittedly I don't know the code and really may be entirely off,
> but looking at the commit (no need to go to the lkml archives - it's
> commit 8577370fb0cb ("KVM: Use simple waitqueue for vcpu->wq") in
> mainline), I really think the swait() use is simply not correct if
> there can be multiple waiters, exactly because swake_up() only wakes
> up a single entry.
> 
> So either there is only a single entry, or *all* the code like
> 
>         dvcpu->arch.wait = 0;
> 
> -       if (waitqueue_active(&dvcpu->wq))
> -               wake_up_interruptible(&dvcpu->wq);
> +       if (swait_active(&dvcpu->wq))
> +               swake_up(&dvcpu->wq);
> 
> is simply wrong. If there are multiple blockers, and you just cleared
> "arch.wait", I think they should *all* be woken up. And that's not
> what swake_up() does.

Code like this is probably wrong for another reason too.  The
swait_active() is likely redudant, since swake_up() also calls
swait_active().  The check in swake_up() returns if it thinks there are
no active waiters.  However, the synchronization needed to ensure a
proper wakeup is left as an exercise to swake_up's caller.

There have been a couple of other discussions around this topic
recently:

https://lkml.org/lkml/2017/5/25/722
https://lkml.org/lkml/2017/6/8/1222

The above is better written as the following, but even then you still
have the single/multiple wakeup problem:

 -       if (waitqueue_active(&dvcpu->wq))
 -               wake_up_interruptible(&dvcpu->wq);
 +       smp_mb();
 +       swake_up(&dvcpu->wq);


Just to add to the confusion, the last time I checked, the semantics of
swake_up() even differ between RT Linux and mainline, which makes this
even more confusing.

-K

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-06-29 18:33                     ` Davidlohr Bueso
       [not found]                       ` <20170629183339.GD3954-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org>
@ 2017-07-05  2:06                       ` Davidlohr Bueso
       [not found]                         ` <20170705020635.GD11168-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org>
  1 sibling, 1 reply; 38+ messages in thread
From: Davidlohr Bueso @ 2017-07-05  2:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, mfuzzey,
	Eric W. Biederman, Dmitry Torokhov, Daniel Wagner,
	David Woodhouse, jewalt, rafal, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek,
	Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo,
	Andrew Lutomirski

On Thu, 29 Jun 2017, Davidlohr Bueso wrote:

>I'll actually take a look at wake_q for wake_up_all() and co. to see if
>we can reduce the spinlock hold times. Of course it would only make sense
>for more than a one wakeup.

So here's something that boots and builds a kernel. Any thoughts?

Thanks,
Davidlohr

----------8<----------------------------------------------
From: Davidlohr Bueso <dave@stgolabs.net>
Subject: [PATCH] sched/wait: Perform lockless wake_up_all()

The use of wake_qs have proven a solid option for performing
wakeups without holding the corresponding lock -- to the extent
that many core subsystems use them. We can make use of wake_qs
in waitqueue wakeups. There are a few constraints, nonetheless,
that limit the usage to _only_ wake_up_all():

(i) We cannot use them in exclusive wakes as wake_qs do not
provide a way to acknowledge a successful wakeup vs when the
task is already running. Therefore any node skipping cannot
be determined.

(ii) Lockless wakeups are only under TASK_NORMAL wakeups, and
therefore cannot be used by wake_up_all_interruptible(). For
minimal overhead, wake_q does not understand queues with mixed
states.

Similar changes in the past have shown measurable performance
improvements and more bounded latencies in benchmarks where
threads are contending for the lock. Ie: futex_wake(N) via
mark_wait_futex(), or rwsem waiter wakeups.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 include/linux/wait.h | 37 +++++++++++++++++++++++++------------
 kernel/sched/wait.c  | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+), 12 deletions(-)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index b289c96151ee..9f6075271e52 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -12,8 +12,10 @@
 
 typedef struct wait_queue_entry wait_queue_entry_t;
 
-typedef int (*wait_queue_func_t)(struct wait_queue_entry *wq_entry, unsigned mode, int flags, void *key);
-int default_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int flags, void *key);
+typedef int (*wait_queue_func_t)(struct wait_queue_entry *wq_entry,
+				 unsigned mode, int flags, void *key);
+int default_wake_function(struct wait_queue_entry *wq_entry,
+			  unsigned mode, int flags, void *key);
 
 /* wait_queue_entry::flags */
 #define WQ_FLAG_EXCLUSIVE	0x01
@@ -56,7 +58,8 @@ struct task_struct;
 #define DECLARE_WAIT_QUEUE_HEAD(name) \
 	struct wait_queue_head name = __WAIT_QUEUE_HEAD_INITIALIZER(name)
 
-extern void __init_waitqueue_head(struct wait_queue_head *wq_head, const char *name, struct lock_class_key *);
+extern void __init_waitqueue_head(struct wait_queue_head *wq_head,
+				  const char *name, struct lock_class_key *);
 
 #define init_waitqueue_head(wq_head)						\
 	do {									\
@@ -158,39 +161,49 @@ static inline void __add_wait_queue(struct wait_queue_head *wq_head, struct wait
  * Used for wake-one threads:
  */
 static inline void
-__add_wait_queue_exclusive(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry)
+__add_wait_queue_exclusive(struct wait_queue_head *wq_head,
+			   struct wait_queue_entry *wq_entry)
 {
 	wq_entry->flags |= WQ_FLAG_EXCLUSIVE;
 	__add_wait_queue(wq_head, wq_entry);
 }
 
-static inline void __add_wait_queue_entry_tail(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry)
+static inline void
+__add_wait_queue_entry_tail(struct wait_queue_head *wq_head,
+			    struct wait_queue_entry *wq_entry)
 {
 	list_add_tail(&wq_entry->entry, &wq_head->head);
 }
 
 static inline void
-__add_wait_queue_entry_tail_exclusive(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry)
+__add_wait_queue_entry_tail_exclusive(struct wait_queue_head *wq_head,
+				      struct wait_queue_entry *wq_entry)
 {
 	wq_entry->flags |= WQ_FLAG_EXCLUSIVE;
 	__add_wait_queue_entry_tail(wq_head, wq_entry);
 }
 
 static inline void
-__remove_wait_queue(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry)
+__remove_wait_queue(struct wait_queue_head *wq_head,
+		    struct wait_queue_entry *wq_entry)
 {
 	list_del(&wq_entry->entry);
 }
 
-void __wake_up(struct wait_queue_head *wq_head, unsigned int mode, int nr, void *key);
-void __wake_up_locked_key(struct wait_queue_head *wq_head, unsigned int mode, void *key);
-void __wake_up_sync_key(struct wait_queue_head *wq_head, unsigned int mode, int nr, void *key);
-void __wake_up_locked(struct wait_queue_head *wq_head, unsigned int mode, int nr);
+void __wake_up(struct wait_queue_head *wq_head,
+	       unsigned int mode, int nr, void *key);
+void __wake_up_all_lockless(struct wait_queue_head *wq_head);
+void __wake_up_locked_key(struct wait_queue_head *wq_head,
+			  unsigned int mode, void *key);
+void __wake_up_sync_key(struct wait_queue_head *wq_head,
+			unsigned int mode, int nr, void *key);
+void __wake_up_locked(struct wait_queue_head *wq_head,
+		      unsigned int mode, int nr);
 void __wake_up_sync(struct wait_queue_head *wq_head, unsigned int mode, int nr);
 
 #define wake_up(x)			__wake_up(x, TASK_NORMAL, 1, NULL)
 #define wake_up_nr(x, nr)		__wake_up(x, TASK_NORMAL, nr, NULL)
-#define wake_up_all(x)			__wake_up(x, TASK_NORMAL, 0, NULL)
+#define wake_up_all(x)			__wake_up_all_lockless(x)
 #define wake_up_locked(x)		__wake_up_locked((x), TASK_NORMAL, 1)
 #define wake_up_all_locked(x)		__wake_up_locked((x), TASK_NORMAL, 0)
 
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 17f11c6b0a9f..d029e13529ed 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -5,6 +5,7 @@
  */
 #include <linux/init.h>
 #include <linux/export.h>
+#include <linux/sched/wake_q.h>
 #include <linux/sched/signal.h>
 #include <linux/sched/debug.h>
 #include <linux/mm.h>
@@ -98,6 +99,41 @@ void __wake_up(struct wait_queue_head *wq_head, unsigned int mode,
 }
 EXPORT_SYMBOL(__wake_up);
 
+/**
+ * __wake_up_all_lockless - wake up all threads blocked on a waitqueue, with
+ *                          assistance from wake_qs.
+ * @wq_head: the waitqueue
+ *
+ * Using lockless wake_qs allows the wakeups to occur after releasing the
+ * @wq_head->lock, thus reducing hold times. As such, it only makes sense
+ * when there is more than a single wakeup to be performed, otherwise it's
+ * not really worth it. Similarly, this can only be used for TASK_NORMAL
+ * wakeups, such that we avoid queues with mixed states.
+ *
+ * It may be assumed that unlike __wake_up() this function always implies
+ * a memory barrier before changing the task state due to wake_q_add(),
+ * regardless of whether or not the task was actually running.
+ */
+void __wake_up_all_lockless(struct wait_queue_head *wq_head)
+{
+	unsigned long flags;
+	wait_queue_entry_t *curr, *next;
+	DEFINE_WAKE_Q(wake_q);
+
+	spin_lock_irqsave(&wq_head->lock, flags);
+	/*
+	 * Because we are to awake all tasks, we don't care about
+	 * dealing with WQ_FLAG_EXCLUSIVE cases.
+	 */
+	list_for_each_entry_safe(curr, next, &wq_head->head, entry)
+		wake_q_add(&wake_q, curr->private);
+
+	spin_unlock_irqrestore(&wq_head->lock, flags);
+
+	wake_up_q(&wake_q);
+}
+EXPORT_SYMBOL(__wake_up_all_lockless);
+
 /*
  * Same as __wake_up but called with the spinlock in wait_queue_head_t held.
  */
-- 
2.12.0

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
       [not found]                         ` <20170705020635.GD11168-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org>
@ 2017-07-07 19:58                           ` Linus Torvalds
  2017-07-07 22:27                             ` Davidlohr Bueso
  0 siblings, 1 reply; 38+ messages in thread
From: Linus Torvalds @ 2017-07-07 19:58 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, Martin Fuzzey,
	Eric W. Biederman, Dmitry Torokhov, Daniel Wagner,
	David Woodhouse, jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8,
	Rafał Miłecki, Arend Van Spriel, Rafael J. Wysocki, Li,
	Yi, atull-DgEjT+Ai2ygdnm+yROfE0A, Moritz Fischer, Petr Mladek,
	Johannes Berg, Emmanuel Grumbach, Coelho, Luciano

On Tue, Jul 4, 2017 at 7:06 PM, Davidlohr Bueso <dave-h16yJtLeMjHk1uMJSBkQmQ@public.gmane.org> wrote:
>
> So here's something that boots and builds a kernel. Any thoughts?

This patch ios just nasty crap. Sorry.

It has random whitespace changfes that look entirely unrelated to trhe
actual real meat of the patch, and that actually make whitespace
*worse*.

WHY?

That alone should just mean that this patch needs to be thrown away
and never ever looked at again.

But also, this is fundamentally garbage.

Exactly for the same reasons that the swait interfaces were
fundamentally broken.

It *looks* like it works on regular wait queues, and people can start
using it that way, but it actually doesn't have the right semantics at
all.

The new "lockless" function ONLY works if you don't have a private
wakeup function.

So it completely changes the semantics of "wake_up_all()".

It used to be that "wake_up_all()" did what the name suggested.

With this patch, "wake_up_all()" will not cause random kernel oopses
or memory corruption if you use any of the more specialized wakeup
functions

Guys, STOP DOING SHIT LIKE THIS!

The swake interface was incredible crap, with bad semantics that
confused people, and bad performance too.

Don't make the normal wait-queues have similar idiotic problems.

So no, this is not only NAK'ed, the whole approach is pure and utter
shit and this needs to be buried deep and forgotten about so that it
never ever comes back to life.

               Linus

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-07-07 19:58                           ` Linus Torvalds
@ 2017-07-07 22:27                             ` Davidlohr Bueso
  2017-07-07 22:48                               ` Linus Torvalds
  0 siblings, 1 reply; 38+ messages in thread
From: Davidlohr Bueso @ 2017-07-07 22:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, Martin Fuzzey,
	Eric W. Biederman, Dmitry Torokhov, Daniel Wagner,
	David Woodhouse, jewalt, Rafa?? Mi??ecki, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek,
	Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo

On Fri, 07 Jul 2017, Linus Torvalds wrote:

>On Tue, Jul 4, 2017 at 7:06 PM, Davidlohr Bueso <dave@stgolabs.net> wrote:
>>
>> So here's something that boots and builds a kernel. Any thoughts?
>
>This patch ios just nasty crap. Sorry.
>
>It has random whitespace changfes that look entirely unrelated to trhe
>actual real meat of the patch, and that actually make whitespace
>*worse*.

Ok sorry, fwiw those were 80-line fixlets I thought were trivial enough
to just fly by.

>
>WHY?
>
>That alone should just mean that this patch needs to be thrown away
>and never ever looked at again.
>
>But also, this is fundamentally garbage.
>
>Exactly for the same reasons that the swait interfaces were
>fundamentally broken.
>
>It *looks* like it works on regular wait queues, and people can start
>using it that way, but it actually doesn't have the right semantics at
>all.
>
>The new "lockless" function ONLY works if you don't have a private
>wakeup function.

Oh indeed, this was always my intent. Going back to the patch, when
checking DEFINE_WAIT_FUNC I clearly overlooked the ->func()
implications, breaking all kinds of semantics. With that and the
constraints aforementioned in the patch, I see no sane way of using
wake_qs.

>
>So no, this is not only NAK'ed, the whole approach is pure and utter
>shit and this needs to be buried deep and forgotten about so that it
>never ever comes back to life.

Given that you seem to agree that the lockless version is possible as
long as we keep semantics, this imho is another point for some form of
simplified waitqueues.

But yeah.

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
  2017-07-07 22:27                             ` Davidlohr Bueso
@ 2017-07-07 22:48                               ` Linus Torvalds
  0 siblings, 0 replies; 38+ messages in thread
From: Linus Torvalds @ 2017-07-07 22:48 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Thomas Gleixner, Greg KH, Luis R. Rodriguez, Martin Fuzzey,
	Eric W. Biederman, Dmitry Torokhov, Daniel Wagner,
	David Woodhouse, jewalt, Rafa?? Mi??ecki, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull, Moritz Fischer, Petr Mladek,
	Johannes Berg, Emmanuel Grumbach, Coelho, Luciano, Kalle Valo

On Fri, Jul 7, 2017 at 3:27 PM, Davidlohr Bueso <dave@stgolabs.net> wrote:
>
> Ok sorry, fwiw those were 80-line fixlets I thought were trivial enough
> to just fly by.

I find them annoying, because it makes it so much harder to see what
the patch actually does.

In this case, I think that more than 50% of the patch was just
whitespace changes..

> Oh indeed, this was always my intent. Going back to the patch, when
> checking DEFINE_WAIT_FUNC I clearly overlooked the ->func()
> implications, breaking all kinds of semantics. With that and the
> constraints aforementioned in the patch, I see no sane way of using
> wake_qs.

Well, very few people actually use "wake_up_all()", particularly for
any of the things that use special wake functions.

So it probably works in practice.

And then somebody starts using pollfd or something on one the things
that *do* use wake_up_all() and happens to also allow polling (or
whatever), and you get nasty crashes.

> Given that you seem to agree that the lockless version is possible as
> long as we keep semantics, this imho is another point for some form of
> simplified waitqueues.

We just really haven't had a lot of problems with the waitqueues in my
experience. Many of the historical big problems were about the whole
"exclusive vs non-exclusive" thundering herd problems, which is
actually the most complex thing about them (the callback function adds
a pointer to the wait queue, so makes it bigger, but that is very
seldom a huge issue).

Most of the things that want specific wakeups tend to be some really
low-level stuff (ie semaphores etc - both the sysvipc kind of ones and
the kernel locking kind of ones).

They are often doing their very special own things anyway. And often
the regular waitqueues actually work fine, and the biggest thing is to
use the lock within the waitqueue for the object that is being waited
on too, so that you just avoid the double locking.

So you may have hit the one or two cases where the usual wait-queues
didn't work well, but in *most* cases they work wonderfully.

             Linus

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] swait: add the missing killable swaits
       [not found]                                       ` <20170629225003.GU21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
@ 2017-07-12 21:33                                         ` Luis R. Rodriguez
  0 siblings, 0 replies; 38+ messages in thread
From: Luis R. Rodriguez @ 2017-07-12 21:33 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Jakub Kicinski, nbroeking-BUHhN+a2lJ4, Linus Torvalds,
	Davidlohr Bueso, Thomas Gleixner, Greg KH,
	mfuzzey-mB3Nsq4MPf1BDgjK7y7TUQ, Eric W. Biederman,
	Dmitry Torokhov, Daniel Wagner, David Woodhouse,
	jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8,
	rafal-g1n6cQUeyibVItvQsEIGlw, Arend Van Spriel,
	Rafael J. Wysocki, Li, Yi, atull-DgEjT+Ai2ygdnm+yROfE0A,
	Moritz Fischer, Petr Mladek, Johannes Berg, Emmanuel Grumbach

On Fri, Jun 30, 2017 at 12:50:03AM +0200, Luis R. Rodriguez wrote:
> ie, I expect the combination of both to fix your issues, not just the last
> series I just posted [0]. If you want this in git form you can find all of
> the patches bundled on the 20170629-fw-fixes-wait-v4 branch [1]. I just
> wrote this patch it but it seems to have not broken the tests
> 
> From cb7fee12c6d539405793e883dfd79e0b21c2baad Mon Sep 17 00:00:00 2001
> From: "Luis R. Rodriguez" <mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Date: Thu, 29 Jun 2017 15:19:04 -0700
> Subject: [RFT] firmware: send wake up on failure for batched requests
> 
> Fix batched requests from waiting forever on failure.
> 
> The firmware API supports "batched requests" which means requests with
> the same name share the same lookup effort. They wait for the first
> request to complete, however they are set to always wait for what seem
> to be forever (MAX_SCHEDULE_TIMEOUT).
> 
> We currently handle informing waited batched requests on success but we
> never seem to have sent smoke signals of any kind on failure! This
> should mean secondary requests batched in seem to just wait forever when
> the request fails.
> 
> For device drivers with optional firmware schemes (Intel, or Netronome),
> this could mean that when you boot a system with multiple cards the
> firmware will seem to never load on the system, or that the card is just
> not responsive even the driver initialized. Due to differences in scheduling
> possible this should not always trigger, so triggering batched requests
> actually needs to be triggered for this to be an issue.
> 
> Its reported that at least with the Intel WiFi cards on one system this
> issue was creeping up 50% of the boots [0].
> 
> [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477
> 
> Reported-by: Nicolas <nbroeking-BUHhN+a2lJ4@public.gmane.org>
> Reported-by: John Ewalt  <jewalt-d4N2ExZK1jaSe5ORCPIMD9BPR1lH4CV8@public.gmane.org>
> Reported-by: Jakub Kicinski <jakub.kicinski-wFxRvT7yatFl57MIdRCFDg@public.gmane.org>
> Signed-off-by: Luis R. Rodriguez <mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> ---

FWIW I wrote a test case for this and indeed as I expected, it fixed the last
remaining issue I was aware of with using multiple cards and the firmware API.

Determining the first affected kernel was rather hard, but it would seem to be
that this became an issue once we started supporting making the fallback
mechanism optional via commit bba3a87e982ad5 ("firmware: Introduce
request_firmware_direct()", merged via v3.14.

Will follow up with patches.

  Luis

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2017-07-12 21:33 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-14 22:20 [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Luis R. Rodriguez
2017-06-14 22:20 ` [PATCH 1/4] test_firmware: add test case for SIGCHLD on sync fallback Luis R. Rodriguez
2017-06-14 22:20 ` [PATCH 2/4] swait: add the missing killable swaits Luis R. Rodriguez
     [not found]   ` <20170614222017.14653-3-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2017-06-29 12:54     ` Greg KH
     [not found]       ` <20170629125402.GH26046-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2017-06-29 13:05         ` Thomas Gleixner
2017-06-29 13:35           ` Greg KH
2017-06-29 13:46             ` Thomas Gleixner
2017-06-29 16:13               ` Linus Torvalds
2017-06-29 16:31                 ` Matthew Wilcox
2017-06-29 17:29                   ` Luis R. Rodriguez
2017-06-29 17:40                 ` Davidlohr Bueso
2017-06-29 17:57                   ` Linus Torvalds
2017-06-29 18:33                     ` Davidlohr Bueso
     [not found]                       ` <20170629183339.GD3954-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org>
2017-06-29 18:59                         ` Linus Torvalds
     [not found]                           ` <CA+55aFz8Mhx+A-g-5yOG-O1ZLRUR_fpeeA4iBNGH8EnDBZEdpA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-06-29 19:40                             ` Luis R. Rodriguez
2017-06-29 19:44                               ` Luis R. Rodriguez
     [not found]                                 ` <20170629194455.GR21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
2017-06-29 20:58                                   ` Jakub Kicinski
2017-06-29 22:50                                     ` Luis R. Rodriguez
2017-06-29 22:53                                       ` Jakub Kicinski
2017-06-29 23:00                                         ` Luis R. Rodriguez
2017-06-29 23:06                                           ` Jakub Kicinski
     [not found]                                       ` <20170629225003.GU21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
2017-07-12 21:33                                         ` Luis R. Rodriguez
     [not found]                               ` <20170629194015.GQ21846-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
2017-06-29 20:57                                 ` Linus Torvalds
2017-07-05  2:06                       ` Davidlohr Bueso
     [not found]                         ` <20170705020635.GD11168-3dK4OQgjB4rH06JGZaSw0A@public.gmane.org>
2017-07-07 19:58                           ` Linus Torvalds
2017-07-07 22:27                             ` Davidlohr Bueso
2017-07-07 22:48                               ` Linus Torvalds
2017-06-29 19:15                 ` Marcelo Tosatti
     [not found]                   ` <20170629191506.GB12368-I4X2Mt4zSy4@public.gmane.org>
2017-06-30  4:03                     ` Linus Torvalds
2017-06-30 11:55                       ` Marcelo Tosatti
2017-06-30 11:57                       ` Marcelo Tosatti
2017-06-30 17:30                       ` Krister Johansen
2017-06-14 22:20 ` [PATCH 3/4] firmware: avoid invalid fallback aborts by using killable swait Luis R. Rodriguez
2017-06-14 22:20 ` [PATCH 4/4] firmware: send -EINTR on signal abort on fallback mechanism Luis R. Rodriguez
2017-06-15  7:49 ` [PATCH 0/4] firmware: fix fallback mechanism by ignoring SIGCHLD Martin Fuzzey
     [not found] ` <20170614222017.14653-1-mcgrof-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2017-06-26 21:19   ` Luis R. Rodriguez
2017-06-29 15:14   ` Greg KH
     [not found]     ` <20170629151442.GA4880-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2017-06-29 17:29       ` Luis R. Rodriguez

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).