* [PATCH v4] firmware_loader: fix use-after-free in firmware_fallback_sysfs
@ 2021-05-18 15:59 ` Anirudh Rayabharam
  0 siblings, 0 replies; 14+ messages in thread
From: Anirudh Rayabharam @ 2021-05-18 15:59 UTC (permalink / raw)
  To: Luis Chamberlain, Greg Kroah-Hartman, Rafael J. Wysocki, Junyong Sun
  Cc: linux-kernel-mentees, Anirudh Rayabharam,
	syzbot+de271708674e2093097b, linux-kernel

This use-after-free happens when a fw_priv object has been freed but
hasn't been removed from the pending list (pending_fw_head). The next
time fw_load_sysfs_fallback tries to insert into the list, it ends up
accessing the pending_list member of the previously freed fw_priv.

The root cause here is that not all code paths that abort the fw load
delete it from the pending list. For example:

	_request_firmware()
	  -> fw_abort_batch_reqs()
	      -> fw_state_aborted()

To fix this, delete the fw_priv from the list in __fw_state_set() if
the new state is DONE or ABORTED. This way, all aborts will remove
the fw_priv from the list. Accordingly, remove calls to list_del_init
that were being made before calling fw_state_(aborted|done)().

Also, in fw_load_sysfs_fallback, don't add the fw_priv to the list
if it is already aborted. Instead, just jump out and return early.
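
As a rough illustration of the idea above (not the kernel code itself), the
following userspace sketch models a pending list in which every transition to
a terminal state also unlinks the entry. The list helpers are simplified
stand-ins for the kernel's list API, and struct model_fw_priv,
model_state_set() and model_abort_leaves_list_clean() are hypothetical names
introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct list_head (illustrative only). */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink and re-initialise, so a second call on the same entry is harmless. */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	init_list_head(entry);
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

enum model_status { MODEL_LOADING, MODEL_DONE, MODEL_ABORTED };

/* Hypothetical, reduced model of fw_priv: only the fields the sketch needs. */
struct model_fw_priv {
	enum model_status status;
	struct list_head pending_list;
};

/* Every transition to a terminal state also unlinks the entry. */
static void model_state_set(struct model_fw_priv *p, enum model_status st)
{
	p->status = st;
	if (st == MODEL_DONE || st == MODEL_ABORTED)
		list_del_init(&p->pending_list);
}

/* Returns true if the pending list is empty after an abort. */
bool model_abort_leaves_list_clean(void)
{
	struct list_head pending_head;
	struct model_fw_priv priv = { .status = MODEL_LOADING };

	init_list_head(&pending_head);
	init_list_head(&priv.pending_list);
	list_add(&priv.pending_list, &pending_head);
	assert(!list_empty(&pending_head));

	model_state_set(&priv, MODEL_ABORTED);
	model_state_set(&priv, MODEL_ABORTED); /* double abort is safe */

	return list_empty(&pending_head);
}
```

Because the unlink lives in the state setter, no abort path can forget it, so
the entry can be freed afterwards without the list still pointing at it. The
sketch leaves out locking entirely; the patch itself takes fw_lock around the
state change.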

Fixes: bcfbd3523f3c ("firmware: fix a double abort case with fw_load_sysfs_fallback")
Reported-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
Tested-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
Signed-off-by: Anirudh Rayabharam <mail@anirudhrb.com>
---

Changes in v4:
Documented the reasons behind the error codes returned from
fw_sysfs_wait_timeout() as suggested by Luis Chamberlain.

Changes in v3:
Modified the patch to incorporate suggestions by Luis Chamberlain in
order to fix the root cause instead of applying a "band-aid" kind of
fix.
https://lore.kernel.org/lkml/20210403013143.GV4332@42.do-not-panic.com/

Changes in v2:
1. Fixed 1 error and 1 warning (in the commit message) reported by
checkpatch.pl. The error was regarding the format for referring to
another commit "commit <sha> ("oneline")". The warning was for line
longer than 75 chars. 

---
 drivers/base/firmware_loader/fallback.c | 46 ++++++++++++++++++-------
 drivers/base/firmware_loader/firmware.h |  6 +++-
 drivers/base/firmware_loader/main.c     |  2 ++
 3 files changed, 40 insertions(+), 14 deletions(-)

diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
index 91899d185e31..f244c7b89ba5 100644
--- a/drivers/base/firmware_loader/fallback.c
+++ b/drivers/base/firmware_loader/fallback.c
@@ -70,7 +70,31 @@ static inline bool fw_sysfs_loading(struct fw_priv *fw_priv)
 
 static inline int fw_sysfs_wait_timeout(struct fw_priv *fw_priv,  long timeout)
 {
-	return __fw_state_wait_common(fw_priv, timeout);
+	int ret = __fw_state_wait_common(fw_priv, timeout);
+
+	/*
+	 * A signal could be sent to abort a wait. Consider Android's init
+	 * getting a SIGCHLD, which in turn was the same process issuing the
+	 * sysfs store call for the fallback. In such cases we want to be able
+	 * to tell apart in userspace when a signal caused a failure on the
+	 * wait. In such cases we'd get -ERESTARTSYS.
+	 *
+	 * Likewise though another race can happen and abort the load earlier.
+	 *
+	 * In either case the situation is interrupted so we just inform
+	 * userspace of that and we end things right away.
+	 *
+	 * When we really time out just tell userspace it should try again,
+	 * perhaps later.
+	 */
+	if (ret == -ERESTARTSYS || fw_state_is_aborted(fw_priv))
+		ret = -EINTR;
+	else if (ret == -ETIMEDOUT)
+		ret = -EAGAIN;
+	else if (fw_priv->is_paged_buf && !fw_priv->data)
+		ret = -ENOMEM;
+
+	return ret;
 }
 
 struct fw_sysfs {
@@ -91,10 +115,9 @@ static void __fw_load_abort(struct fw_priv *fw_priv)
 	 * There is a small window in which user can write to 'loading'
 	 * between loading done and disappearance of 'loading'
 	 */
-	if (fw_sysfs_done(fw_priv))
+	if (fw_state_is_aborted(fw_priv) || fw_sysfs_done(fw_priv))
 		return;
 
-	list_del_init(&fw_priv->pending_list);
 	fw_state_aborted(fw_priv);
 }
 
@@ -280,7 +303,6 @@ static ssize_t firmware_loading_store(struct device *dev,
 			 * Same logic as fw_load_abort, only the DONE bit
 			 * is ignored and we set ABORT only on failure.
 			 */
-			list_del_init(&fw_priv->pending_list);
 			if (rc) {
 				fw_state_aborted(fw_priv);
 				written = rc;
@@ -513,6 +535,11 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
 	}
 
 	mutex_lock(&fw_lock);
+	if (fw_state_is_aborted(fw_priv)) {
+		mutex_unlock(&fw_lock);
+		retval = -EINTR;
+		goto out;
+	}
 	list_add(&fw_priv->pending_list, &pending_fw_head);
 	mutex_unlock(&fw_lock);
 
@@ -526,20 +553,13 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
 	}
 
 	retval = fw_sysfs_wait_timeout(fw_priv, timeout);
-	if (retval < 0 && retval != -ENOENT) {
+	if (retval < 0) {
 		mutex_lock(&fw_lock);
 		fw_load_abort(fw_sysfs);
 		mutex_unlock(&fw_lock);
 	}
 
-	if (fw_state_is_aborted(fw_priv)) {
-		if (retval == -ERESTARTSYS)
-			retval = -EINTR;
-		else
-			retval = -EAGAIN;
-	} else if (fw_priv->is_paged_buf && !fw_priv->data)
-		retval = -ENOMEM;
-
+out:
 	device_del(f_dev);
 err_put_dev:
 	put_device(f_dev);
diff --git a/drivers/base/firmware_loader/firmware.h b/drivers/base/firmware_loader/firmware.h
index 63bd29fdcb9c..36bdb413c998 100644
--- a/drivers/base/firmware_loader/firmware.h
+++ b/drivers/base/firmware_loader/firmware.h
@@ -117,8 +117,12 @@ static inline void __fw_state_set(struct fw_priv *fw_priv,
 
 	WRITE_ONCE(fw_st->status, status);
 
-	if (status == FW_STATUS_DONE || status == FW_STATUS_ABORTED)
+	if (status == FW_STATUS_DONE || status == FW_STATUS_ABORTED) {
+#ifdef CONFIG_FW_LOADER_USER_HELPER
+		list_del_init(&fw_priv->pending_list);
+#endif
 		complete_all(&fw_st->completion);
+	}
 }
 
 static inline void fw_state_aborted(struct fw_priv *fw_priv)
diff --git a/drivers/base/firmware_loader/main.c b/drivers/base/firmware_loader/main.c
index 4fdb8219cd08..68c549d71230 100644
--- a/drivers/base/firmware_loader/main.c
+++ b/drivers/base/firmware_loader/main.c
@@ -783,8 +783,10 @@ static void fw_abort_batch_reqs(struct firmware *fw)
 		return;
 
 	fw_priv = fw->priv;
+	mutex_lock(&fw_lock);
 	if (!fw_state_is_aborted(fw_priv))
 		fw_state_aborted(fw_priv);
+	mutex_unlock(&fw_lock);
 }
 
 /* called from request_firmware() and request_firmware_work_func() */
-- 
2.26.2


* Re: [PATCH v4] firmware_loader: fix use-after-free in firmware_fallback_sysfs
  2021-05-18 15:59 ` Anirudh Rayabharam
@ 2021-05-19  3:20   ` Luis Chamberlain
  -1 siblings, 0 replies; 14+ messages in thread
From: Luis Chamberlain @ 2021-05-19  3:20 UTC (permalink / raw)
  To: Anirudh Rayabharam
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Junyong Sun,
	linux-kernel-mentees, syzbot+de271708674e2093097b, linux-kernel

On Tue, May 18, 2021 at 09:29:20PM +0530, Anirudh Rayabharam wrote:
> This use-after-free happens when a fw_priv object has been freed but
> hasn't been removed from the pending list (pending_fw_head). The next
> time fw_load_sysfs_fallback tries to insert into the list, it ends up
> accessing the pending_list member of the previously freed fw_priv.
> 
> The root cause here is that not all code paths that abort the fw load
> delete it from the pending list. For example:
> 
> 	_request_firmware()
> 	  -> fw_abort_batch_reqs()
> 	      -> fw_state_aborted()
> 
> To fix this, delete the fw_priv from the list in __fw_state_set() if
> the new state is DONE or ABORTED. This way, all aborts will remove
> the fw_priv from the list. Accordingly, remove calls to list_del_init
> that were being made before calling fw_state_(aborted|done)().
> 
> Also, in fw_load_sysfs_fallback, don't add the fw_priv to the list
> if it is already aborted. Instead, just jump out and return early.
> 
> Fixes: bcfbd3523f3c ("firmware: fix a double abort case with fw_load_sysfs_fallback")
> Reported-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
> Tested-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
> Signed-off-by: Anirudh Rayabharam <mail@anirudhrb.com>
> ---
> 
> Changes in v4:
> Documented the reasons behind the error codes returned from
> fw_sysfs_wait_timeout() as suggested by Luis Chamberlain.
> 
> Changes in v3:
> Modified the patch to incorporate suggestions by Luis Chamberlain in
> order to fix the root cause instead of applying a "band-aid" kind of
> fix.
> https://lore.kernel.org/lkml/20210403013143.GV4332@42.do-not-panic.com/
> 
> Changes in v2:
> 1. Fixed 1 error and 1 warning (in the commit message) reported by
> checkpatch.pl. The error was regarding the format for referring to
> another commit "commit <sha> ("oneline")". The warning was for line
> longer than 75 chars. 
> 
> ---
>  drivers/base/firmware_loader/fallback.c | 46 ++++++++++++++++++-------
>  drivers/base/firmware_loader/firmware.h |  6 +++-
>  drivers/base/firmware_loader/main.c     |  2 ++
>  3 files changed, 40 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
> index 91899d185e31..f244c7b89ba5 100644
> --- a/drivers/base/firmware_loader/fallback.c
> +++ b/drivers/base/firmware_loader/fallback.c
> @@ -70,7 +70,31 @@ static inline bool fw_sysfs_loading(struct fw_priv *fw_priv)
>  
>  static inline int fw_sysfs_wait_timeout(struct fw_priv *fw_priv,  long timeout)
>  {
> -	return __fw_state_wait_common(fw_priv, timeout);
> +	int ret = __fw_state_wait_common(fw_priv, timeout);
> +
> +	/*
> +	 * A signal could be sent to abort a wait. Consider Android's init
> +	 * getting a SIGCHLD, which in turn was the same process issuing the
> +	 * sysfs store call for the fallback. In such cases we want to be able
> +	 * to tell apart in userspace when a signal caused a failure on the
> +	 * wait. In such cases we'd get -ERESTARTSYS.
> +	 *
> +	 * Likewise though another race can happen and abort the load earlier.
> +	 *
> +	 * In either case the situation is interrupted so we just inform
> +	 * userspace of that and we end things right away.
> +	 *
> +	 * When we really time out just tell userspace it should try again,
> +	 * perhaps later.
> +	 */
> +	if (ret == -ERESTARTSYS || fw_state_is_aborted(fw_priv))
> +		ret = -EINTR;
> +	else if (ret == -ETIMEDOUT)
> +		ret = -EAGAIN;


Shuah has explained to me that the only motivation on her part with
using -EAGAIN on commit 0542ad88fbdd81bb ("firmware loader: Fix
_request_firmware_load() return val for fw load abort") was to
distinguish the error from -ENOMEM, and so there was no real
reason to stick to -EAGAIN. Given that -EAGAIN is typically used to
ask the user to retry, which makes no sense in this case since the
sysfs interface is ephemeral, I think we should do away with it
and document this rationale.

I think we should stick to using -ETIMEDOUT. It's more telling of what
happened. And so I think just removing the check, along with
augmenting the comment, should suffice.

Since this change is already big, it would be good for this other
change to go in as a separate change. If you can test to ensure the
-ETIMEDOUT does indeed get propagated that'd be appreciated.
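
To make the two behaviours under discussion concrete, here is a small
userspace sketch of the return-code mapping. It is only one possible reading
of the suggestion, not the final kernel code: model_wait_result() and its
parameters are hypothetical, MODEL_ERESTARTSYS is defined locally because
ERESTARTSYS is kernel-internal and absent from userspace <errno.h>, and the
timeout_as_eagain flag exists purely to switch between the v4 mapping and a
variant that lets -ETIMEDOUT through unchanged.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal errno value, defined here only for the sketch. */
#define MODEL_ERESTARTSYS 512

/*
 * ret:                    value returned by the wait
 * aborted:                whether the load was aborted meanwhile
 * paged_buf_without_data: models fw_priv->is_paged_buf && !fw_priv->data
 * timeout_as_eagain:      true  -> v4 behaviour (-ETIMEDOUT becomes -EAGAIN)
 *                         false -> -ETIMEDOUT is passed through as-is
 */
int model_wait_result(int ret, bool aborted, bool paged_buf_without_data,
		      bool timeout_as_eagain)
{
	if (ret == -MODEL_ERESTARTSYS || aborted)
		return -EINTR;
	if (ret == -ETIMEDOUT)
		return timeout_as_eagain ? -EAGAIN : -ETIMEDOUT;
	if (paged_buf_without_data)
		return -ENOMEM;
	return ret;
}
```

With timeout_as_eagain set to false, a caller that sees -ETIMEDOUT knows the
wait really timed out, while -EINTR still covers both the signal case and an
earlier abort.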

Otherwise looks good. Thanks for your patience!

  Luis

> +	else if (fw_priv->is_paged_buf && !fw_priv->data)
> +		ret = -ENOMEM;
> +
> +	return ret;
>  }
>  
>  struct fw_sysfs {
> @@ -91,10 +115,9 @@ static void __fw_load_abort(struct fw_priv *fw_priv)
>  	 * There is a small window in which user can write to 'loading'
>  	 * between loading done and disappearance of 'loading'
>  	 */
> -	if (fw_sysfs_done(fw_priv))
> +	if (fw_state_is_aborted(fw_priv) || fw_sysfs_done(fw_priv))
>  		return;
>  
> -	list_del_init(&fw_priv->pending_list);
>  	fw_state_aborted(fw_priv);
>  }
>  
> @@ -280,7 +303,6 @@ static ssize_t firmware_loading_store(struct device *dev,
>  			 * Same logic as fw_load_abort, only the DONE bit
>  			 * is ignored and we set ABORT only on failure.
>  			 */
> -			list_del_init(&fw_priv->pending_list);
>  			if (rc) {
>  				fw_state_aborted(fw_priv);
>  				written = rc;
> @@ -513,6 +535,11 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
>  	}
>  
>  	mutex_lock(&fw_lock);
> +	if (fw_state_is_aborted(fw_priv)) {
> +		mutex_unlock(&fw_lock);
> +		retval = -EINTR;
> +		goto out;
> +	}
>  	list_add(&fw_priv->pending_list, &pending_fw_head);
>  	mutex_unlock(&fw_lock);
>  
> @@ -526,20 +553,13 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
>  	}
>  
>  	retval = fw_sysfs_wait_timeout(fw_priv, timeout);
> -	if (retval < 0 && retval != -ENOENT) {
> +	if (retval < 0) {
>  		mutex_lock(&fw_lock);
>  		fw_load_abort(fw_sysfs);
>  		mutex_unlock(&fw_lock);
>  	}
>  
> -	if (fw_state_is_aborted(fw_priv)) {
> -		if (retval == -ERESTARTSYS)
> -			retval = -EINTR;
> -		else
> -			retval = -EAGAIN;
> -	} else if (fw_priv->is_paged_buf && !fw_priv->data)
> -		retval = -ENOMEM;
> -
> +out:
>  	device_del(f_dev);
>  err_put_dev:
>  	put_device(f_dev);
> diff --git a/drivers/base/firmware_loader/firmware.h b/drivers/base/firmware_loader/firmware.h
> index 63bd29fdcb9c..36bdb413c998 100644
> --- a/drivers/base/firmware_loader/firmware.h
> +++ b/drivers/base/firmware_loader/firmware.h
> @@ -117,8 +117,12 @@ static inline void __fw_state_set(struct fw_priv *fw_priv,
>  
>  	WRITE_ONCE(fw_st->status, status);
>  
> -	if (status == FW_STATUS_DONE || status == FW_STATUS_ABORTED)
> +	if (status == FW_STATUS_DONE || status == FW_STATUS_ABORTED) {
> +#ifdef CONFIG_FW_LOADER_USER_HELPER
> +		list_del_init(&fw_priv->pending_list);
> +#endif
>  		complete_all(&fw_st->completion);
> +	}
>  }
>  
>  static inline void fw_state_aborted(struct fw_priv *fw_priv)
> diff --git a/drivers/base/firmware_loader/main.c b/drivers/base/firmware_loader/main.c
> index 4fdb8219cd08..68c549d71230 100644
> --- a/drivers/base/firmware_loader/main.c
> +++ b/drivers/base/firmware_loader/main.c
> @@ -783,8 +783,10 @@ static void fw_abort_batch_reqs(struct firmware *fw)
>  		return;
>  
>  	fw_priv = fw->priv;
> +	mutex_lock(&fw_lock);
>  	if (!fw_state_is_aborted(fw_priv))
>  		fw_state_aborted(fw_priv);
> +	mutex_unlock(&fw_lock);
>  }
>  
>  /* called from request_firmware() and request_firmware_work_func() */
> -- 
> 2.26.2
> 

* Re: [PATCH v4] firmware_loader: fix use-after-free in firmware_fallback_sysfs
  2021-05-18 15:59 ` Anirudh Rayabharam
  (?)
  (?)
@ 2021-05-19  9:10 ` Hillf Danton
  2021-05-19 18:56     ` Anirudh Rayabharam
  -1 siblings, 1 reply; 14+ messages in thread
From: Hillf Danton @ 2021-05-19  9:10 UTC (permalink / raw)
  To: Anirudh Rayabharam
  Cc: syzbot+de271708674e2093097b, Rafael J. Wysocki, linux-kernel,
	Luis Chamberlain, Junyong Sun, linux-kernel-mentees

On  Tue, 18 May 2021 21:29:20 +0530 Anirudh Rayabharam wrote:
>This use-after-free happens when a fw_priv object has been freed but
>hasn't been removed from the pending list (pending_fw_head). The next
>time fw_load_sysfs_fallback tries to insert into the list, it ends up
>accessing the pending_list member of the previously freed fw_priv.
>
>The root cause here is that not all code paths that abort the fw load
>delete it from the pending list. For example:
>
>	_request_firmware()
>	  -> fw_abort_batch_reqs()
>	      -> fw_state_aborted()
>
>To fix this, delete the fw_priv from the list in __fw_state_set() if
>the new state is DONE or ABORTED. This way, all aborts will remove
>the fw_priv from the list. Accordingly, remove calls to list_del_init
>that were being made before calling fw_state_(aborted|done)().
>
>Also, in fw_load_sysfs_fallback, don't add the fw_priv to the list
>if it is already aborted. Instead, just jump out and return early.
>
>Fixes: bcfbd3523f3c ("firmware: fix a double abort case with fw_load_sysfs_fallback")
>Reported-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
>Tested-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
>Signed-off-by: Anirudh Rayabharam <mail@anirudhrb.com>
>---
>
>Changes in v4:
>Documented the reasons behind the error codes returned from
>fw_sysfs_wait_timeout() as suggested by Luis Chamberlain.
>
>Changes in v3:
>Modified the patch to incorporate suggestions by Luis Chamberlain in
>order to fix the root cause instead of applying a "band-aid" kind of
>fix.
>https://lore.kernel.org/lkml/20210403013143.GV4332@42.do-not-panic.com/
>
>Changes in v2:
>1. Fixed 1 error and 1 warning (in the commit message) reported by
>checkpatch.pl. The error was regarding the format for referring to
>another commit "commit <sha> ("oneline")". The warning was for line
>longer than 75 chars. 
>
>---
> drivers/base/firmware_loader/fallback.c | 46 ++++++++++++++++++-------
> drivers/base/firmware_loader/firmware.h |  6 +++-
> drivers/base/firmware_loader/main.c     |  2 ++
> 3 files changed, 40 insertions(+), 14 deletions(-)
>
>diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
>index 91899d185e31..f244c7b89ba5 100644
>--- a/drivers/base/firmware_loader/fallback.c
>+++ b/drivers/base/firmware_loader/fallback.c
>@@ -70,7 +70,31 @@ static inline bool fw_sysfs_loading(struct fw_priv *fw_priv)
> 
> static inline int fw_sysfs_wait_timeout(struct fw_priv *fw_priv,  long timeout)
> {
>-	return __fw_state_wait_common(fw_priv, timeout);
>+	int ret = __fw_state_wait_common(fw_priv, timeout);
>+
>+	/*
>+	 * A signal could be sent to abort a wait. Consider Android's init
>+	 * getting a SIGCHLD, which in turn was the same process issuing the
>+	 * sysfs store call for the fallback. In such cases we want to be able
>+	 * to tell apart in userspace when a signal caused a failure on the
>+	 * wait. In such cases we'd get -ERESTARTSYS.
>+	 *
>+	 * Likewise though another race can happen and abort the load earlier.
>+	 *
>+	 * In either case the situation is interrupted so we just inform
>+	 * userspace of that and we end things right away.
>+	 *
>+	 * When we really time out just tell userspace it should try again,
>+	 * perhaps later.
>+	 */
>+	if (ret == -ERESTARTSYS || fw_state_is_aborted(fw_priv))
>+		ret = -EINTR;
>+	else if (ret == -ETIMEDOUT)
>+		ret = -EAGAIN;
>+	else if (fw_priv->is_paged_buf && !fw_priv->data)
>+		ret = -ENOMEM;
>+
>+	return ret;
> }
> 
> struct fw_sysfs {
>@@ -91,10 +115,9 @@ static void __fw_load_abort(struct fw_priv *fw_priv)
> 	 * There is a small window in which user can write to 'loading'
> 	 * between loading done and disappearance of 'loading'
> 	 */
>-	if (fw_sysfs_done(fw_priv))
>+	if (fw_state_is_aborted(fw_priv) || fw_sysfs_done(fw_priv))
> 		return;
> 
>-	list_del_init(&fw_priv->pending_list);
> 	fw_state_aborted(fw_priv);
> }
> 
>@@ -280,7 +303,6 @@ static ssize_t firmware_loading_store(struct device *dev,
> 			 * Same logic as fw_load_abort, only the DONE bit
> 			 * is ignored and we set ABORT only on failure.
> 			 */
>-			list_del_init(&fw_priv->pending_list);
> 			if (rc) {
> 				fw_state_aborted(fw_priv);
> 				written = rc;
>@@ -513,6 +535,11 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
> 	}
> 
> 	mutex_lock(&fw_lock);
>+	if (fw_state_is_aborted(fw_priv)) {
>+		mutex_unlock(&fw_lock);
>+		retval = -EINTR;
>+		goto out;
>+	}
> 	list_add(&fw_priv->pending_list, &pending_fw_head);

This looks like prepare_to_wait().

> 	mutex_unlock(&fw_lock);
> 
>@@ -526,20 +553,13 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
> 	}
> 
> 	retval = fw_sysfs_wait_timeout(fw_priv, timeout);
>-	if (retval < 0 && retval != -ENOENT) {
>+	if (retval < 0) {
> 		mutex_lock(&fw_lock);
> 		fw_load_abort(fw_sysfs);
		  __fw_load_abort();
		    fw_state_aborted();
		      __fw_state_set();

Is this your finish_wait() part? See what you add below.

> 		mutex_unlock(&fw_lock);
> 	}
> 
>-	if (fw_state_is_aborted(fw_priv)) {
>-		if (retval == -ERESTARTSYS)
>-			retval = -EINTR;
>-		else
>-			retval = -EAGAIN;
>-	} else if (fw_priv->is_paged_buf && !fw_priv->data)
>-		retval = -ENOMEM;
>-
>+out:
> 	device_del(f_dev);
> err_put_dev:
> 	put_device(f_dev);
>diff --git a/drivers/base/firmware_loader/firmware.h b/drivers/base/firmware_loader/firmware.h
>index 63bd29fdcb9c..36bdb413c998 100644
>--- a/drivers/base/firmware_loader/firmware.h
>+++ b/drivers/base/firmware_loader/firmware.h
>@@ -117,8 +117,12 @@ static inline void __fw_state_set(struct fw_priv *fw_priv,
> 
> 	WRITE_ONCE(fw_st->status, status);
> 
>-	if (status == FW_STATUS_DONE || status == FW_STATUS_ABORTED)
>+	if (status == FW_STATUS_DONE || status == FW_STATUS_ABORTED) {
>+#ifdef CONFIG_FW_LOADER_USER_HELPER
>+		list_del_init(&fw_priv->pending_list);
>+#endif
> 		complete_all(&fw_st->completion);
>+	}

Fine, apart from what you are fixing, you are adding something like
finish_wait() into the waker's backyard. Why are you calling
complete_all() on the waiter side?

> }
> 
> static inline void fw_state_aborted(struct fw_priv *fw_priv)
>diff --git a/drivers/base/firmware_loader/main.c b/drivers/base/firmware_loader/main.c
>index 4fdb8219cd08..68c549d71230 100644
>--- a/drivers/base/firmware_loader/main.c
>+++ b/drivers/base/firmware_loader/main.c
>@@ -783,8 +783,10 @@ static void fw_abort_batch_reqs(struct firmware *fw)
> 		return;
> 
> 	fw_priv = fw->priv;
>+	mutex_lock(&fw_lock);
> 	if (!fw_state_is_aborted(fw_priv))
> 		fw_state_aborted(fw_priv);
>+	mutex_unlock(&fw_lock);
> }
> 
> /* called from request_firmware() and request_firmware_work_func() */
>-- 
>2.26.2
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v4] firmware_loader: fix use-after-free in firmware_fallback_sysfs
  2021-05-19  3:20   ` Luis Chamberlain
@ 2021-05-19 18:44     ` Anirudh Rayabharam
  -1 siblings, 0 replies; 14+ messages in thread
From: Anirudh Rayabharam @ 2021-05-19 18:44 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Junyong Sun,
	linux-kernel-mentees, syzbot+de271708674e2093097b, linux-kernel

On Wed, May 19, 2021 at 03:20:14AM +0000, Luis Chamberlain wrote:
> On Tue, May 18, 2021 at 09:29:20PM +0530, Anirudh Rayabharam wrote:
> > This use-after-free happens when a fw_priv object has been freed but
> > hasn't been removed from the pending list (pending_fw_head). The next
> > time fw_load_sysfs_fallback tries to insert into the list, it ends up
> > accessing the pending_list member of the previously freed fw_priv.
> > 
> > The root cause here is that all code paths that abort the fw load
> > don't delete it from the pending list. For example:
> > 
> > 	_request_firmware()
> > 	  -> fw_abort_batch_reqs()
> > 	      -> fw_state_aborted()
> > 
> > To fix this, delete the fw_priv from the list in __fw_state_set() if
> > the new state is DONE or ABORTED. This way, all aborts will remove
> > the fw_priv from the list. Accordingly, remove calls to list_del_init
> > that were being made before calling fw_state_(aborted|done)().
> > 
> > Also, in fw_load_sysfs_fallback, don't add the fw_priv to the list
> > if it is already aborted. Instead, just jump out and return early.
> > 
> > Fixes: bcfbd3523f3c ("firmware: fix a double abort case with fw_load_sysfs_fallback")
> > Reported-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
> > Tested-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
> > Signed-off-by: Anirudh Rayabharam <mail@anirudhrb.com>
> > ---
> > 
> > Changes in v4:
> > Documented the reasons behind the error codes returned from
> > fw_sysfs_wait_timeout() as suggested by Luis Chamberlain.
> > 
> > Changes in v3:
> > Modified the patch to incorporate suggestions by Luis Chamberlain in
> > order to fix the root cause instead of applying a "band-aid" kind of
> > fix.
> > https://lore.kernel.org/lkml/20210403013143.GV4332@42.do-not-panic.com/
> > 
> > Changes in v2:
> > 1. Fixed 1 error and 1 warning (in the commit message) reported by
> > checkpatch.pl. The error was regarding the format for referring to
> > another commit "commit <sha> ("oneline")". The warning was for line
> > longer than 75 chars. 
> > 
> > ---
> >  drivers/base/firmware_loader/fallback.c | 46 ++++++++++++++++++-------
> >  drivers/base/firmware_loader/firmware.h |  6 +++-
> >  drivers/base/firmware_loader/main.c     |  2 ++
> >  3 files changed, 40 insertions(+), 14 deletions(-)
> > 
> > diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
> > index 91899d185e31..f244c7b89ba5 100644
> > --- a/drivers/base/firmware_loader/fallback.c
> > +++ b/drivers/base/firmware_loader/fallback.c
> > @@ -70,7 +70,31 @@ static inline bool fw_sysfs_loading(struct fw_priv *fw_priv)
> >  
> >  static inline int fw_sysfs_wait_timeout(struct fw_priv *fw_priv,  long timeout)
> >  {
> > -	return __fw_state_wait_common(fw_priv, timeout);
> > +	int ret = __fw_state_wait_common(fw_priv, timeout);
> > +
> > +	/*
> > +	 * A signal could be sent to abort a wait. Consider Android's init
> > +	 * getting a SIGCHLD, which in turn was the same process issuing the
> > +	 * sysfs store call for the fallback. In such cases we want to be able
> > +	 * to tell apart in userspace when a signal caused a failure on the
> > +	 * wait. In such cases we'd get -ERESTARTSYS.
> > +	 *
> > +	 * Likewise though another race can happen and abort the load earlier.
> > +	 *
> > +	 * In either case the situation is interrupted so we just inform
> > +	 * userspace of that and we end things right away.
> > +	 *
> > +	 * When we really time out just tell userspace it should try again,
> > +	 * perhaps later.
> > +	 */
> > +	if (ret == -ERESTARTSYS || fw_state_is_aborted(fw_priv))
> > +		ret = -EINTR;
> > +	else if (ret == -ETIMEDOUT)
> > +		ret = -EAGAIN;
> 
> 
> Shuah has explained to me that the only motivation on her part with
> using -EAGAIN on commit 0542ad88fbdd81bb ("firmware loader: Fix
> _request_firmware_load() return val for fw load abort") was to
> distinguish the error from -ENOMEM, and so there was no real
> reason to stick to -EAGAIN. Given -EAGAIN is used typically to
> ask user to retry, but it makes no sense in this case since the
> sysfs interface is ephemeral, I think we should do away with it
> and document this rationale.

As per Shuah's explanation here:
https://lore.kernel.org/lkml/04b5bb2f-edf7-5b43-585a-3267d83bd8c3@linuxfoundation.org/

It looks like some media drivers expect -EAGAIN on a timeout so that
they can retry. So we probably have to retain -EAGAIN until we have
verified that these media drivers no longer depend on it.
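
A rough sketch of the kind of caller-side retry this implies. This is a
user-space model, not taken from any media driver: load_fw_stub(),
load_fw_with_retry() and MAX_FW_TRIES are made-up names, and the stub
only pretends to time out twice before succeeding.

```c
#include <assert.h>
#include <errno.h>

#define MAX_FW_TRIES 3	/* hypothetical retry budget */

/* Hypothetical stand-in for a firmware request: times out twice, then succeeds. */
static int stub_calls;

static int load_fw_stub(void)
{
	return ++stub_calls < 3 ? -EAGAIN : 0;
}

/* Retry only while the loader reports -EAGAIN; any other result ends the loop. */
static int load_fw_with_retry(int (*load)(void))
{
	int tries, ret = -EAGAIN;

	for (tries = 0; tries < MAX_FW_TRIES && ret == -EAGAIN; tries++)
		ret = load();

	return ret;
}
```

A caller written like this would stop retrying if the timeout started
showing up as -ETIMEDOUT, which is why the existing users need to be
checked first.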

> 
> I think we should stick to using -ETIMEDOUT. It's more telling of what
> happened. And so I think just removing the check should do it, but
> augmenting the comment should suffice.
> 
> Since this change is already big, it would be good for this other
> change to go in as a separate change. If you can test to ensure the

"other change" here refers to getting rid of -EAGAIN and returning
-ETIMEDOUT instead, right?

> -ETIMEDOUT does indeed get propagated that'd be appreciated.

Can you please clarify what needs to be tested here?
As the patch stands, -ETIMEDOUT is not propagated from this function to
the caller: we return -EAGAIN whenever we receive -ETIMEDOUT.
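
To make that concrete, here is a user-space model of the mapping done in
fw_sysfs_wait_timeout() in this version of the patch. ERESTARTSYS is
kernel-internal, so the sketch defines it locally (512, as in
include/linux/errno.h). map_wait_ret() and its two boolean parameters
are illustrative names only, not kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal value; the user-space <errno.h> does not provide it. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/*
 * Illustrative model of the v4 mapping. 'aborted' stands in for
 * fw_state_is_aborted(fw_priv), 'paged_no_data' for
 * (fw_priv->is_paged_buf && !fw_priv->data).
 */
static int map_wait_ret(int ret, bool aborted, bool paged_no_data)
{
	if (ret == -ERESTARTSYS || aborted)
		return -EINTR;		/* signal or racing abort */
	if (ret == -ETIMEDOUT)
		return -EAGAIN;		/* real timeout */
	if (paged_no_data)
		return -ENOMEM;
	return ret;
}
```

With this mapping a real timeout reaches the caller as -EAGAIN, and an
abort that races with the timeout is reported as -EINTR.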

Thanks for the review!

	- Anirudh.

> 
> Otherwise looks good. Thanks for your patience!
> 
>   Luis
> 
> > +	else if (fw_priv->is_paged_buf && !fw_priv->data)
> > +		ret = -ENOMEM;
> > +
> > +	return ret;
> >  }
> >  
> >  struct fw_sysfs {
> > @@ -91,10 +115,9 @@ static void __fw_load_abort(struct fw_priv *fw_priv)
> >  	 * There is a small window in which user can write to 'loading'
> >  	 * between loading done and disappearance of 'loading'
> >  	 */
> > -	if (fw_sysfs_done(fw_priv))
> > +	if (fw_state_is_aborted(fw_priv) || fw_sysfs_done(fw_priv))
> >  		return;
> >  
> > -	list_del_init(&fw_priv->pending_list);
> >  	fw_state_aborted(fw_priv);
> >  }
> >  
> > @@ -280,7 +303,6 @@ static ssize_t firmware_loading_store(struct device *dev,
> >  			 * Same logic as fw_load_abort, only the DONE bit
> >  			 * is ignored and we set ABORT only on failure.
> >  			 */
> > -			list_del_init(&fw_priv->pending_list);
> >  			if (rc) {
> >  				fw_state_aborted(fw_priv);
> >  				written = rc;
> > @@ -513,6 +535,11 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
> >  	}
> >  
> >  	mutex_lock(&fw_lock);
> > +	if (fw_state_is_aborted(fw_priv)) {
> > +		mutex_unlock(&fw_lock);
> > +		retval = -EINTR;
> > +		goto out;
> > +	}
> >  	list_add(&fw_priv->pending_list, &pending_fw_head);
> >  	mutex_unlock(&fw_lock);
> >  
> > @@ -526,20 +553,13 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
> >  	}
> >  
> >  	retval = fw_sysfs_wait_timeout(fw_priv, timeout);
> > -	if (retval < 0 && retval != -ENOENT) {
> > +	if (retval < 0) {
> >  		mutex_lock(&fw_lock);
> >  		fw_load_abort(fw_sysfs);
> >  		mutex_unlock(&fw_lock);
> >  	}
> >  
> > -	if (fw_state_is_aborted(fw_priv)) {
> > -		if (retval == -ERESTARTSYS)
> > -			retval = -EINTR;
> > -		else
> > -			retval = -EAGAIN;
> > -	} else if (fw_priv->is_paged_buf && !fw_priv->data)
> > -		retval = -ENOMEM;
> > -
> > +out:
> >  	device_del(f_dev);
> >  err_put_dev:
> >  	put_device(f_dev);
> > diff --git a/drivers/base/firmware_loader/firmware.h b/drivers/base/firmware_loader/firmware.h
> > index 63bd29fdcb9c..36bdb413c998 100644
> > --- a/drivers/base/firmware_loader/firmware.h
> > +++ b/drivers/base/firmware_loader/firmware.h
> > @@ -117,8 +117,12 @@ static inline void __fw_state_set(struct fw_priv *fw_priv,
> >  
> >  	WRITE_ONCE(fw_st->status, status);
> >  
> > -	if (status == FW_STATUS_DONE || status == FW_STATUS_ABORTED)
> > +	if (status == FW_STATUS_DONE || status == FW_STATUS_ABORTED) {
> > +#ifdef CONFIG_FW_LOADER_USER_HELPER
> > +		list_del_init(&fw_priv->pending_list);
> > +#endif
> >  		complete_all(&fw_st->completion);
> > +	}
> >  }
> >  
> >  static inline void fw_state_aborted(struct fw_priv *fw_priv)
> > diff --git a/drivers/base/firmware_loader/main.c b/drivers/base/firmware_loader/main.c
> > index 4fdb8219cd08..68c549d71230 100644
> > --- a/drivers/base/firmware_loader/main.c
> > +++ b/drivers/base/firmware_loader/main.c
> > @@ -783,8 +783,10 @@ static void fw_abort_batch_reqs(struct firmware *fw)
> >  		return;
> >  
> >  	fw_priv = fw->priv;
> > +	mutex_lock(&fw_lock);
> >  	if (!fw_state_is_aborted(fw_priv))
> >  		fw_state_aborted(fw_priv);
> > +	mutex_unlock(&fw_lock);
> >  }
> >  
> >  /* called from request_firmware() and request_firmware_work_func() */
> > -- 
> > 2.26.2
> > 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v4] firmware_loader: fix use-after-free in firmware_fallback_sysfs
  2021-05-19  9:10 ` Hillf Danton
@ 2021-05-19 18:56     ` Anirudh Rayabharam
  0 siblings, 0 replies; 14+ messages in thread
From: Anirudh Rayabharam @ 2021-05-19 18:56 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Luis Chamberlain, Greg Kroah-Hartman, Rafael J. Wysocki,
	Junyong Sun, linux-kernel-mentees, syzbot+de271708674e2093097b,
	linux-kernel

On Wed, May 19, 2021 at 05:10:47PM +0800, Hillf Danton wrote:
> On  Tue, 18 May 2021 21:29:20 +0530 Anirudh Rayabharam wrote:
> >This use-after-free happens when a fw_priv object has been freed but
> >hasn't been removed from the pending list (pending_fw_head). The next
> >time fw_load_sysfs_fallback tries to insert into the list, it ends up
> >accessing the pending_list member of the previously freed fw_priv.
> >
> >The root cause here is that all code paths that abort the fw load
> >don't delete it from the pending list. For example:
> >
> >	_request_firmware()
> >	  -> fw_abort_batch_reqs()
> >	      -> fw_state_aborted()
> >
> >To fix this, delete the fw_priv from the list in __fw_state_set() if
> >the new state is DONE or ABORTED. This way, all aborts will remove
> >the fw_priv from the list. Accordingly, remove calls to list_del_init
> >that were being made before calling fw_state_(aborted|done)().
> >
> >Also, in fw_load_sysfs_fallback, don't add the fw_priv to the list
> >if it is already aborted. Instead, just jump out and return early.
> >
> >Fixes: bcfbd3523f3c ("firmware: fix a double abort case with fw_load_sysfs_fallback")
> >Reported-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
> >Tested-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
> >Signed-off-by: Anirudh Rayabharam <mail@anirudhrb.com>
> >---
> >
> >Changes in v4:
> >Documented the reasons behind the error codes returned from
> >fw_sysfs_wait_timeout() as suggested by Luis Chamberlain.
> >
> >Changes in v3:
> >Modified the patch to incorporate suggestions by Luis Chamberlain in
> >order to fix the root cause instead of applying a "band-aid" kind of
> >fix.
> >https://lore.kernel.org/lkml/20210403013143.GV4332@42.do-not-panic.com/
> >
> >Changes in v2:
> >1. Fixed 1 error and 1 warning (in the commit message) reported by
> >checkpatch.pl. The error was regarding the format for referring to
> >another commit "commit <sha> ("oneline")". The warning was for line
> >longer than 75 chars. 
> >
> >---
> > drivers/base/firmware_loader/fallback.c | 46 ++++++++++++++++++-------
> > drivers/base/firmware_loader/firmware.h |  6 +++-
> > drivers/base/firmware_loader/main.c     |  2 ++
> > 3 files changed, 40 insertions(+), 14 deletions(-)
> >
> >diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
> >index 91899d185e31..f244c7b89ba5 100644
> >--- a/drivers/base/firmware_loader/fallback.c
> >+++ b/drivers/base/firmware_loader/fallback.c
> >@@ -70,7 +70,31 @@ static inline bool fw_sysfs_loading(struct fw_priv *fw_priv)
> > 
> > static inline int fw_sysfs_wait_timeout(struct fw_priv *fw_priv,  long timeout)
> > {
> >-	return __fw_state_wait_common(fw_priv, timeout);
> >+	int ret = __fw_state_wait_common(fw_priv, timeout);
> >+
> >+	/*
> >+	 * A signal could be sent to abort a wait. Consider Android's init
> >+	 * getting a SIGCHLD, which in turn was the same process issuing the
> >+	 * sysfs store call for the fallback. In such cases we want to be able
> >+	 * to tell apart in userspace when a signal caused a failure on the
> >+	 * wait. In such cases we'd get -ERESTARTSYS.
> >+	 *
> >+	 * Likewise though another race can happen and abort the load earlier.
> >+	 *
> >+	 * In either case the situation is interrupted so we just inform
> >+	 * userspace of that and we end things right away.
> >+	 *
> >+	 * When we really time out just tell userspace it should try again,
> >+	 * perhaps later.
> >+	 */
> >+	if (ret == -ERESTARTSYS || fw_state_is_aborted(fw_priv))
> >+		ret = -EINTR;
> >+	else if (ret == -ETIMEDOUT)
> >+		ret = -EAGAIN;
> >+	else if (fw_priv->is_paged_buf && !fw_priv->data)
> >+		ret = -ENOMEM;
> >+
> >+	return ret;
> > }
> > 
> > struct fw_sysfs {
> >@@ -91,10 +115,9 @@ static void __fw_load_abort(struct fw_priv *fw_priv)
> > 	 * There is a small window in which user can write to 'loading'
> > 	 * between loading done and disappearance of 'loading'
> > 	 */
> >-	if (fw_sysfs_done(fw_priv))
> >+	if (fw_state_is_aborted(fw_priv) || fw_sysfs_done(fw_priv))
> > 		return;
> > 
> >-	list_del_init(&fw_priv->pending_list);
> > 	fw_state_aborted(fw_priv);
> > }
> > 
> >@@ -280,7 +303,6 @@ static ssize_t firmware_loading_store(struct device *dev,
> > 			 * Same logic as fw_load_abort, only the DONE bit
> > 			 * is ignored and we set ABORT only on failure.
> > 			 */
> >-			list_del_init(&fw_priv->pending_list);
> > 			if (rc) {
> > 				fw_state_aborted(fw_priv);
> > 				written = rc;
> >@@ -513,6 +535,11 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
> > 	}
> > 
> > 	mutex_lock(&fw_lock);
> >+	if (fw_state_is_aborted(fw_priv)) {
> >+		mutex_unlock(&fw_lock);
> >+		retval = -EINTR;
> >+		goto out;
> >+	}
> > 	list_add(&fw_priv->pending_list, &pending_fw_head);
> 
> This looks like prepare_to_wait().
> 
> > 	mutex_unlock(&fw_lock);
> > 
> >@@ -526,20 +553,13 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
> > 	}
> > 
> > 	retval = fw_sysfs_wait_timeout(fw_priv, timeout);
> >-	if (retval < 0 && retval != -ENOENT) {
> >+	if (retval < 0) {
> > 		mutex_lock(&fw_lock);
> > 		fw_load_abort(fw_sysfs);
> 		  __fw_load_abort();
> 		    fw_state_aborted();
> 		      __fw_state_set();
> 
> Is this your finish_wait() part? See what you add below.
> 
> > 		mutex_unlock(&fw_lock);
> > 	}
> > 
> >-	if (fw_state_is_aborted(fw_priv)) {
> >-		if (retval == -ERESTARTSYS)
> >-			retval = -EINTR;
> >-		else
> >-			retval = -EAGAIN;
> >-	} else if (fw_priv->is_paged_buf && !fw_priv->data)
> >-		retval = -ENOMEM;
> >-
> >+out:
> > 	device_del(f_dev);
> > err_put_dev:
> > 	put_device(f_dev);
> >diff --git a/drivers/base/firmware_loader/firmware.h b/drivers/base/firmware_loader/firmware.h
> >index 63bd29fdcb9c..36bdb413c998 100644
> >--- a/drivers/base/firmware_loader/firmware.h
> >+++ b/drivers/base/firmware_loader/firmware.h
> >@@ -117,8 +117,12 @@ static inline void __fw_state_set(struct fw_priv *fw_priv,
> > 
> > 	WRITE_ONCE(fw_st->status, status);
> > 
> >-	if (status == FW_STATUS_DONE || status == FW_STATUS_ABORTED)
> >+	if (status == FW_STATUS_DONE || status == FW_STATUS_ABORTED) {
> >+#ifdef CONFIG_FW_LOADER_USER_HELPER
> >+		list_del_init(&fw_priv->pending_list);
> >+#endif
> > 		complete_all(&fw_st->completion);
> >+	}
> 
> Fine, apart from what you are fixing, you are adding something like
> finish_wait() into the waker's backyard. Why are you calling
> complete_all() on the waiter side?

Sorry, I don't really get your point here. I did not add complete_all().
It was already there. Could you please elaborate?
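
For context, the intent of the list_del_init() added to __fw_state_set()
can be modelled in user space as below. The list helpers are minimal
re-implementations of the kernel ones, and struct fw_model,
enum fw_model_status and fw_model_set_state() are illustrative names
only. Locking (fw_lock) and the completion are left out of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal re-implementation of the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink the entry and leave it self-linked, so a second call is harmless. */
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}

static bool list_empty(const struct list_head *h) { return h->next == h; }

enum fw_model_status { FW_MODEL_UNKNOWN, FW_MODEL_LOADING,
		       FW_MODEL_DONE, FW_MODEL_ABORTED };

struct fw_model {
	enum fw_model_status status;
	struct list_head pending_list;
};

/* Every transition to DONE or ABORTED also drops the entry from the list. */
static void fw_model_set_state(struct fw_model *fw, enum fw_model_status status)
{
	fw->status = status;
	if (status == FW_MODEL_DONE || status == FW_MODEL_ABORTED)
		list_del_init(&fw->pending_list);
}
```

Since the entry is re-initialised on removal, any abort path ends up with
it off the pending list, and a repeated abort does not touch stale
neighbours.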

	- Anirudh.

> 
> > }
> > 
> > static inline void fw_state_aborted(struct fw_priv *fw_priv)
> >diff --git a/drivers/base/firmware_loader/main.c b/drivers/base/firmware_loader/main.c
> >index 4fdb8219cd08..68c549d71230 100644
> >--- a/drivers/base/firmware_loader/main.c
> >+++ b/drivers/base/firmware_loader/main.c
> >@@ -783,8 +783,10 @@ static void fw_abort_batch_reqs(struct firmware *fw)
> > 		return;
> > 
> > 	fw_priv = fw->priv;
> >+	mutex_lock(&fw_lock);
> > 	if (!fw_state_is_aborted(fw_priv))
> > 		fw_state_aborted(fw_priv);
> >+	mutex_unlock(&fw_lock);
> > }
> > 
> > /* called from request_firmware() and request_firmware_work_func() */
> >-- 
> >2.26.2

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v4] firmware_loader: fix use-after-free in firmware_fallback_sysfs
@ 2021-05-19 18:56     ` Anirudh Rayabharam
  0 siblings, 0 replies; 14+ messages in thread
From: Anirudh Rayabharam @ 2021-05-19 18:56 UTC (permalink / raw)
  To: Hillf Danton
  Cc: syzbot+de271708674e2093097b, Rafael J. Wysocki, linux-kernel,
	Luis Chamberlain, Junyong Sun, linux-kernel-mentees

On Wed, May 19, 2021 at 05:10:47PM +0800, Hillf Danton wrote:
> On  Tue, 18 May 2021 21:29:20 +0530 Anirudh Rayabharam wrote:
> >This use-after-free happens when a fw_priv object has been freed but
> >hasn't been removed from the pending list (pending_fw_head). The next
> >time fw_load_sysfs_fallback tries to insert into the list, it ends up
> >accessing the pending_list member of the previously freed fw_priv.
> >
> >The root cause here is that all code paths that abort the fw load
> >don't delete it from the pending list. For example:
> >
> >	_request_firmware()
> >	  -> fw_abort_batch_reqs()
> >	      -> fw_state_aborted()
> >
> >To fix this, delete the fw_priv from the list in __fw_state_set() if
> >the new state is DONE or ABORTED. This way, all aborts will remove
> >the fw_priv from the list. Accordingly, remove calls to list_del_init
> >that were being made before calling fw_state_(aborted|done)().
> >
> >Also, in fw_load_sysfs_fallback, don't add the fw_priv to the list
> >if it is already aborted. Instead, just jump out and return early.
> >
> >Fixes: bcfbd3523f3c ("firmware: fix a double abort case with fw_load_sysfs_fallback")
> >Reported-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
> >Tested-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
> >Signed-off-by: Anirudh Rayabharam <mail@anirudhrb.com>
> >---
> >
> >Changes in v4:
> >Documented the reasons behind the error codes returned from
> >fw_sysfs_wait_timeout() as suggested by Luis Chamberlain.
> >
> >Changes in v3:
> >Modified the patch to incorporate suggestions by Luis Chamberlain in
> >order to fix the root cause instead of applying a "band-aid" kind of
> >fix.
> >https://lore.kernel.org/lkml/20210403013143.GV4332@42.do-not-panic.com/
> >
> >Changes in v2:
> >1. Fixed 1 error and 1 warning (in the commit message) reported by
> >checkpatch.pl. The error was regarding the format for referring to
> >another commit "commit <sha> ("oneline")". The warning was for line
> >longer than 75 chars. 
> >
> >---
> > drivers/base/firmware_loader/fallback.c | 46 ++++++++++++++++++-------
> > drivers/base/firmware_loader/firmware.h |  6 +++-
> > drivers/base/firmware_loader/main.c     |  2 ++
> > 3 files changed, 40 insertions(+), 14 deletions(-)
> >
> >diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
> >index 91899d185e31..f244c7b89ba5 100644
> >--- a/drivers/base/firmware_loader/fallback.c
> >+++ b/drivers/base/firmware_loader/fallback.c
> >@@ -70,7 +70,31 @@ static inline bool fw_sysfs_loading(struct fw_priv *fw_priv)
> > 
> > static inline int fw_sysfs_wait_timeout(struct fw_priv *fw_priv,  long timeout)
> > {
> >-	return __fw_state_wait_common(fw_priv, timeout);
> >+	int ret = __fw_state_wait_common(fw_priv, timeout);
> >+
> >+	/*
> >+	 * A signal could be sent to abort a wait. Consider Android's init
> >+	 * getting a SIGCHLD, which in turn was the same process issuing the
> >+	 * sysfs store call for the fallback. In such cases we want to be able
> >+	 * to tell apart in userspace when a signal caused a failure on the
> >+	 * wait. In such cases we'd get -ERESTARTSYS.
> >+	 *
> >+	 * Likewise though another race can happen and abort the load earlier.
> >+	 *
> >+	 * In either case the situation is interrupted so we just inform
> >+	 * userspace of that and we end things right away.
> >+	 *
> >+	 * When we really time out just tell userspace it should try again,
> >+	 * perhaps later.
> >+	 */
> >+	if (ret == -ERESTARTSYS || fw_state_is_aborted(fw_priv))
> >+		ret = -EINTR;
> >+	else if (ret == -ETIMEDOUT)
> >+		ret = -EAGAIN;
> >+	else if (fw_priv->is_paged_buf && !fw_priv->data)
> >+		ret = -ENOMEM;
> >+
> >+	return ret;
> > }
> > 
> > struct fw_sysfs {
> >@@ -91,10 +115,9 @@ static void __fw_load_abort(struct fw_priv *fw_priv)
> > 	 * There is a small window in which user can write to 'loading'
> > 	 * between loading done and disappearance of 'loading'
> > 	 */
> >-	if (fw_sysfs_done(fw_priv))
> >+	if (fw_state_is_aborted(fw_priv) || fw_sysfs_done(fw_priv))
> > 		return;
> > 
> >-	list_del_init(&fw_priv->pending_list);
> > 	fw_state_aborted(fw_priv);
> > }
> > 
> >@@ -280,7 +303,6 @@ static ssize_t firmware_loading_store(struct device *dev,
> > 			 * Same logic as fw_load_abort, only the DONE bit
> > 			 * is ignored and we set ABORT only on failure.
> > 			 */
> >-			list_del_init(&fw_priv->pending_list);
> > 			if (rc) {
> > 				fw_state_aborted(fw_priv);
> > 				written = rc;
> >@@ -513,6 +535,11 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
> > 	}
> > 
> > 	mutex_lock(&fw_lock);
> >+	if (fw_state_is_aborted(fw_priv)) {
> >+		mutex_unlock(&fw_lock);
> >+		retval = -EINTR;
> >+		goto out;
> >+	}
> > 	list_add(&fw_priv->pending_list, &pending_fw_head);
> 
> This looks like prepare_to_wait().
> 
> > 	mutex_unlock(&fw_lock);
> > 
> >@@ -526,20 +553,13 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
> > 	}
> > 
> > 	retval = fw_sysfs_wait_timeout(fw_priv, timeout);
> >-	if (retval < 0 && retval != -ENOENT) {
> >+	if (retval < 0) {
> > 		mutex_lock(&fw_lock);
> > 		fw_load_abort(fw_sysfs);
> 		  __fw_load_abort();
> 		    fw_state_aborted();
> 		      __fw_state_set();
> 
> Is this your finish_wait() part? See what you add below.
> 
> > 		mutex_unlock(&fw_lock);
> > 	}
> > 
> >-	if (fw_state_is_aborted(fw_priv)) {
> >-		if (retval == -ERESTARTSYS)
> >-			retval = -EINTR;
> >-		else
> >-			retval = -EAGAIN;
> >-	} else if (fw_priv->is_paged_buf && !fw_priv->data)
> >-		retval = -ENOMEM;
> >-
> >+out:
> > 	device_del(f_dev);
> > err_put_dev:
> > 	put_device(f_dev);
> >diff --git a/drivers/base/firmware_loader/firmware.h b/drivers/base/firmware_loader/firmware.h
> >index 63bd29fdcb9c..36bdb413c998 100644
> >--- a/drivers/base/firmware_loader/firmware.h
> >+++ b/drivers/base/firmware_loader/firmware.h
> >@@ -117,8 +117,12 @@ static inline void __fw_state_set(struct fw_priv *fw_priv,
> > 
> > 	WRITE_ONCE(fw_st->status, status);
> > 
> >-	if (status == FW_STATUS_DONE || status == FW_STATUS_ABORTED)
> >+	if (status == FW_STATUS_DONE || status == FW_STATUS_ABORTED) {
> >+#ifdef CONFIG_FW_LOADER_USER_HELPER
> >+		list_del_init(&fw_priv->pending_list);
> >+#endif
> > 		complete_all(&fw_st->completion);
> >+	}
> 
> Fine, apart from what you are fixing, you are adding something like
> finish_wait() into the waker's backyard. Why are you calling
> complete_all() on the waiter side?

Sorry, I don't really get your point here. I did not add complete_all().
It was already there. Could you please elaborate?

	- Anirudh.

> 
> > }
> > 
> > static inline void fw_state_aborted(struct fw_priv *fw_priv)
> >diff --git a/drivers/base/firmware_loader/main.c b/drivers/base/firmware_loader/main.c
> >index 4fdb8219cd08..68c549d71230 100644
> >--- a/drivers/base/firmware_loader/main.c
> >+++ b/drivers/base/firmware_loader/main.c
> >@@ -783,8 +783,10 @@ static void fw_abort_batch_reqs(struct firmware *fw)
> > 		return;
> > 
> > 	fw_priv = fw->priv;
> >+	mutex_lock(&fw_lock);
> > 	if (!fw_state_is_aborted(fw_priv))
> > 		fw_state_aborted(fw_priv);
> >+	mutex_unlock(&fw_lock);
> > }
> > 
> > /* called from request_firmware() and request_firmware_work_func() */
> >-- 
> >2.26.2
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v4] firmware_loader: fix use-after-free in firmware_fallback_sysfs
  2021-05-19 18:56     ` Anirudh Rayabharam
  (?)
@ 2021-05-20  3:52     ` Hillf Danton
  -1 siblings, 0 replies; 14+ messages in thread
From: Hillf Danton @ 2021-05-20  3:52 UTC (permalink / raw)
  To: Anirudh Rayabharam
  Cc: syzbot+de271708674e2093097b, Rafael J. Wysocki, linux-kernel,
	Luis Chamberlain, Junyong Sun, linux-kernel-mentees

On Thu, 20 May 2021 00:26:12 +0530 Anirudh Rayabharam wrote:
>On Wed, May 19, 2021 at 05:10:47PM +0800, Hillf Danton wrote:
>> 
>> Fine, apart from what you are fixing, you are adding something like
>> finish_wait() into the waker's backyard. Why are you calling
>> complete_all() on the waiter side?
>
>Sorry, I don't really get your point here. I did not add complete_all().
>It was already there. Could you please elaborate?

If a simple pattern works for you,

	mutex_lock(&fw_lock);
	list_add(&fw_priv->pending_list, &pending_fw_head);
	mutex_unlock(&fw_lock);

	retval = fw_sysfs_wait_timeout(fw_priv, timeout);

	mutex_lock(&fw_lock);
	list_del_init(&fw_priv->pending_list);
	mutex_unlock(&fw_lock);

	device_del(f_dev);
	put_device(f_dev);
	return retval;

add a follow-up cleanup to cut the list_del out of the waker side, instead of
putting a spanner in the waker's job of completing all waiters.
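
An illustrative aside, not part of the original message: the waiter-side
pattern above can be modelled in userspace C. This is only a sketch under
stated assumptions. The list helpers, pending_head, pending_lock,
struct request, fake_wait() and load_with_waiter_side_cleanup() are all
hypothetical stand-ins (for the kernel list API, pending_fw_head, fw_lock,
fw_priv and fw_sysfs_wait_timeout() respectively), and a pthread mutex
replaces the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal intrusive list, loosely modelled on the kernel's struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *n) { n->next = n; n->prev = n; }
static bool list_is_empty(const struct list_node *n) { return n->next == n; }

static void list_add_head(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink and re-initialise the node, so that a second call is harmless. */
static void list_del_reinit(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical stand-ins for pending_fw_head and fw_lock. */
static struct list_node pending_head = { &pending_head, &pending_head };
static pthread_mutex_t pending_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for struct fw_priv. */
struct request {
	struct list_node pending;
	bool was_listed_during_wait;
};

/* Placeholder for the real wait: records list membership while "waiting". */
static int fake_wait(struct request *req)
{
	req->was_listed_during_wait = !list_is_empty(&req->pending);
	return 0;
}

int load_with_waiter_side_cleanup(struct request *req)
{
	int ret;

	assert(req != NULL);
	list_init(&req->pending);

	pthread_mutex_lock(&pending_lock);
	list_add_head(&req->pending, &pending_head);
	pthread_mutex_unlock(&pending_lock);

	ret = fake_wait(req);

	/* The waiter unlinks itself, whatever the wait returned. */
	pthread_mutex_lock(&pending_lock);
	list_del_reinit(&req->pending);
	pthread_mutex_unlock(&pending_lock);

	return ret;
}
```

In this model the entry is on the pending list only for the duration of the
wait, so no abort path needs to remember to unlink it. Whether that is
preferable to unlinking in the state setter, as the patch does, is the
question under discussion in this thread.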

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v4] firmware_loader: fix use-after-free in firmware_fallback_sysfs
  2021-05-19  3:20   ` Luis Chamberlain
@ 2021-06-10 22:32     ` Luis Chamberlain
  -1 siblings, 0 replies; 14+ messages in thread
From: Luis Chamberlain @ 2021-06-10 22:32 UTC (permalink / raw)
  To: Anirudh Rayabharam
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Junyong Sun,
	linux-kernel-mentees, syzbot+de271708674e2093097b, linux-kernel

On Tue, May 18, 2021 at 8:20 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Tue, May 18, 2021 at 09:29:20PM +0530, Anirudh Rayabharam wrote:
> > This use-after-free happens when a fw_priv object has been freed but
> > hasn't been removed from the pending list (pending_fw_head). The next
> > time fw_load_sysfs_fallback tries to insert into the list, it ends up
> > accessing the pending_list member of the previously freed fw_priv.
> >
> > The root cause here is that all code paths that abort the fw load
> > don't delete it from the pending list. For example:
> >
> >       _request_firmware()
> >         -> fw_abort_batch_reqs()
> >             -> fw_state_aborted()
> >
> > To fix this, delete the fw_priv from the list in __fw_state_set() if
> > the new state is DONE or ABORTED. This way, all aborts will remove
> > the fw_priv from the list. Accordingly, remove calls to list_del_init
> > that were being made before calling fw_state_(aborted|done)().
> >
> > Also, in fw_load_sysfs_fallback, don't add the fw_priv to the list
> > if it is already aborted. Instead, just jump out and return early.
> >
> > Fixes: bcfbd3523f3c ("firmware: fix a double abort case with fw_load_sysfs_fallback")
> > Reported-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
> > Tested-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
> > Signed-off-by: Anirudh Rayabharam <mail@anirudhrb.com>
> > ---
> >
> > Changes in v4:
> > Documented the reasons behind the error codes returned from
> > fw_sysfs_wait_timeout() as suggested by Luis Chamberlain.
> >
> > Changes in v3:
> > Modified the patch to incorporate suggestions by Luis Chamberlain in
> > order to fix the root cause instead of applying a "band-aid" kind of
> > fix.
> > https://lore.kernel.org/lkml/20210403013143.GV4332@42.do-not-panic.com/
> >
> > Changes in v2:
> > 1. Fixed 1 error and 1 warning (in the commit message) reported by
> > checkpatch.pl. The error was regarding the format for referring to
> > another commit "commit <sha> ("oneline")". The warning was for line
> > longer than 75 chars.
> >
> > ---
> >  drivers/base/firmware_loader/fallback.c | 46 ++++++++++++++++++-------
> >  drivers/base/firmware_loader/firmware.h |  6 +++-
> >  drivers/base/firmware_loader/main.c     |  2 ++
> >  3 files changed, 40 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
> > index 91899d185e31..f244c7b89ba5 100644
> > --- a/drivers/base/firmware_loader/fallback.c
> > +++ b/drivers/base/firmware_loader/fallback.c
> > @@ -70,7 +70,31 @@ static inline bool fw_sysfs_loading(struct fw_priv *fw_priv)
> >
> >  static inline int fw_sysfs_wait_timeout(struct fw_priv *fw_priv,  long timeout)
> >  {
> > -     return __fw_state_wait_common(fw_priv, timeout);
> > +     int ret = __fw_state_wait_common(fw_priv, timeout);
> > +
> > +     /*
> > +      * A signal could be sent to abort a wait. Consider Android's init
> > +      * getting a SIGCHLD, which in turn was the same process issuing the
> > +      * sysfs store call for the fallback. In such cases we want to be able
> > +      * to tell apart in userspace when a signal caused a failure on the
> > +      * wait. In such cases we'd get -ERESTARTSYS.
> > +      *
> > +      * Likewise though another race can happen and abort the load earlier.
> > +      *
> > +      * In either case the situation is interrupted so we just inform
> > +      * userspace of that and we end things right away.
> > +      *
> > +      * When we really time out just tell userspace it should try again,
> > +      * perhaps later.
> > +      */
> > +     if (ret == -ERESTARTSYS || fw_state_is_aborted(fw_priv))
> > +             ret = -EINTR;
> > +     else if (ret == -ETIMEDOUT)
> > +             ret = -EAGAIN;
>
>
> Shuah has explained to me that the only motivation on her part with
> using -EAGAIN on commit 0542ad88fbdd81bb ("firmware loader: Fix
> _request_firmware_load() return val for fw load abort") was to
> distinguish the error from -ENOMEM, and so there was no real
> reason to stick to -EAGAIN. Given -EAGAIN is used typically to
> ask user to retry, but it makes no sense in this case since the
> sysfs interface is ephemeral, I think we should do away with it
> and document this rationale.
>
> I think we should stick to using -ETIMEDOUT. It's more telling of what
> happened. And so I think just removing the check should do it, but
> augmenting the comment should suffice.
>
> Since this change is already big, it would be good for this other
> change to go in as a separate change. If you can test to ensure the
> -ETIMEDOUT does indeed get propagated that'd be appreciated.
>
> Otherwise looks good. Thanks for your patience!

Anirudh, did you get a chance to test?
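
An illustrative aside, not part of the original message: the error-code
mapping discussed in the quoted text can be sketched in userspace C. This is
a minimal sketch under stated assumptions, not the kernel code.
struct wait_outcome and map_wait_result() are hypothetical names, ERESTARTSYS
is defined locally because it is a kernel-internal code that userspace
<errno.h> does not provide, and the timeout branch reflects one reading of
the suggestion above (propagate -ETIMEDOUT) rather than the v4 patch, which
maps it to -EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Kernel-internal restart code; the value mirrors include/linux/errno.h. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Hypothetical summary of the state seen after the wait returns. */
struct wait_outcome {
	int wait_ret;                /* 0, -ERESTARTSYS or -ETIMEDOUT */
	bool aborted;                /* load was aborted by another path */
	bool paged_buf_without_data; /* paged buffer requested, no data */
};

int map_wait_result(const struct wait_outcome *o)
{
	assert(o != NULL);

	/* A signal or a racing abort both count as an interruption. */
	if (o->wait_ret == -ERESTARTSYS || o->aborted)
		return -EINTR;

	/* Assumed follow-up behaviour: report a real timeout as such. */
	if (o->wait_ret == -ETIMEDOUT)
		return -ETIMEDOUT;

	if (o->paged_buf_without_data)
		return -ENOMEM;

	return o->wait_ret;
}
```

With this mapping a caller can tell an interrupted wait (-EINTR) apart from a
genuine timeout (-ETIMEDOUT) and from an allocation failure (-ENOMEM).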

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v4] firmware_loader: fix use-after-free in firmware_fallback_sysfs
  2021-06-10 22:32     ` Luis Chamberlain
@ 2021-06-11  3:16       ` Anirudh Rayabharam
  -1 siblings, 0 replies; 14+ messages in thread
From: Anirudh Rayabharam @ 2021-06-11  3:16 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Junyong Sun,
	linux-kernel-mentees, syzbot+de271708674e2093097b, linux-kernel,
	mail

On Thu, Jun 10, 2021 at 03:32:53PM -0700, Luis Chamberlain wrote:
> On Tue, May 18, 2021 at 8:20 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> >
> > On Tue, May 18, 2021 at 09:29:20PM +0530, Anirudh Rayabharam wrote:
> > > This use-after-free happens when a fw_priv object has been freed but
> > > hasn't been removed from the pending list (pending_fw_head). The next
> > > time fw_load_sysfs_fallback tries to insert into the list, it ends up
> > > accessing the pending_list member of the previously freed fw_priv.
> > >
> > > The root cause here is that all code paths that abort the fw load
> > > don't delete it from the pending list. For example:
> > >
> > >       _request_firmware()
> > >         -> fw_abort_batch_reqs()
> > >             -> fw_state_aborted()
> > >
> > > To fix this, delete the fw_priv from the list in __fw_state_set() if
> > > the new state is DONE or ABORTED. This way, all aborts will remove
> > > the fw_priv from the list. Accordingly, remove calls to list_del_init
> > > that were being made before calling fw_state_(aborted|done)().
> > >
> > > Also, in fw_load_sysfs_fallback, don't add the fw_priv to the list
> > > if it is already aborted. Instead, just jump out and return early.
> > >
> > > Fixes: bcfbd3523f3c ("firmware: fix a double abort case with fw_load_sysfs_fallback")
> > > Reported-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
> > > Tested-by: syzbot+de271708674e2093097b@syzkaller.appspotmail.com
> > > Signed-off-by: Anirudh Rayabharam <mail@anirudhrb.com>
> > > ---
> > >
> > > Changes in v4:
> > > Documented the reasons behind the error codes returned from
> > > fw_sysfs_wait_timeout() as suggested by Luis Chamberlain.
> > >
> > > Changes in v3:
> > > Modified the patch to incorporate suggestions by Luis Chamberlain in
> > > order to fix the root cause instead of applying a "band-aid" kind of
> > > fix.
> > > https://lore.kernel.org/lkml/20210403013143.GV4332@42.do-not-panic.com/
> > >
> > > Changes in v2:
> > > 1. Fixed 1 error and 1 warning (in the commit message) reported by
> > > checkpatch.pl. The error was regarding the format for referring to
> > > another commit "commit <sha> ("oneline")". The warning was for line
> > > longer than 75 chars.
> > >
> > > ---
> > >  drivers/base/firmware_loader/fallback.c | 46 ++++++++++++++++++-------
> > >  drivers/base/firmware_loader/firmware.h |  6 +++-
> > >  drivers/base/firmware_loader/main.c     |  2 ++
> > >  3 files changed, 40 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
> > > index 91899d185e31..f244c7b89ba5 100644
> > > --- a/drivers/base/firmware_loader/fallback.c
> > > +++ b/drivers/base/firmware_loader/fallback.c
> > > @@ -70,7 +70,31 @@ static inline bool fw_sysfs_loading(struct fw_priv *fw_priv)
> > >
> > >  static inline int fw_sysfs_wait_timeout(struct fw_priv *fw_priv,  long timeout)
> > >  {
> > > -     return __fw_state_wait_common(fw_priv, timeout);
> > > +     int ret = __fw_state_wait_common(fw_priv, timeout);
> > > +
> > > +     /*
> > > +      * A signal could be sent to abort a wait. Consider Android's init
> > > +      * getting a SIGCHLD, which in turn was the same process issuing the
> > > +      * sysfs store call for the fallback. In such cases we want to be able
> > > +      * to tell apart in userspace when a signal caused a failure on the
> > > +      * wait. In such cases we'd get -ERESTARTSYS.
> > > +      *
> > > +      * Likewise though another race can happen and abort the load earlier.
> > > +      *
> > > +      * In either case the situation is interrupted so we just inform
> > > +      * userspace of that and we end things right away.
> > > +      *
> > > +      * When we really time out just tell userspace it should try again,
> > > +      * perhaps later.
> > > +      */
> > > +     if (ret == -ERESTARTSYS || fw_state_is_aborted(fw_priv))
> > > +             ret = -EINTR;
> > > +     else if (ret == -ETIMEDOUT)
> > > +             ret = -EAGAIN;
> >
> >
> > Shuah has explained to me that the only motivation on her part with
> > using -EAGAIN on commit 0542ad88fbdd81bb ("firmware loader: Fix
> > _request_firmware_load() return val for fw load abort") was to
> > distinguish the error from -ENOMEM, and so there was no real
> > reason to stick to -EAGAIN. Given -EAGAIN is used typically to
> > ask user to retry, but it makes no sense in this case since the
> > sysfs interface is ephemeral, I think we should do away with it
> > and document this rationale.
> >
> > I think we should stick to using -ETIMEDOUT. It's more telling of what
> > happened. And so I think just removing the check should do it, but
> > augmenting the comment should suffice.
> >
> > Since this change is already big, it would be good for this other
> > change to go in as a separate change. If you can test to ensure the
> > -ETIMEDOUT does indeed get propagated that'd be appreciated.
> >
> > Otherwise looks good. Thanks for your patience!
> 
> Anirudh, did you get a chance to test?

Hi Luis, I had replied to your email here:
https://lore.kernel.org/lkml/YKVcnQ7mm1b92mbR@anirudhrb.com/

Thanks!

	- Anirudh

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-06-11  3:17 UTC | newest]

Thread overview: 14+ messages
2021-05-18 15:59 [PATCH v4] firmware_loader: fix use-after-free in firmware_fallback_sysfs Anirudh Rayabharam
2021-05-18 15:59 ` Anirudh Rayabharam
2021-05-19  3:20 ` Luis Chamberlain
2021-05-19  3:20   ` Luis Chamberlain
2021-05-19 18:44   ` Anirudh Rayabharam
2021-05-19 18:44     ` Anirudh Rayabharam
2021-06-10 22:32   ` Luis Chamberlain
2021-06-10 22:32     ` Luis Chamberlain
2021-06-11  3:16     ` Anirudh Rayabharam
2021-06-11  3:16       ` Anirudh Rayabharam
2021-05-19  9:10 ` Hillf Danton
2021-05-19 18:56   ` Anirudh Rayabharam
2021-05-19 18:56     ` Anirudh Rayabharam
2021-05-20  3:52     ` Hillf Danton
