linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/1] binder: fix freeze race
@ 2021-09-10 16:42 Li Li
  2021-09-10 16:42 ` [PATCH v3 1/1] " Li Li
  0 siblings, 1 reply; 3+ messages in thread
From: Li Li @ 2021-09-10 16:42 UTC (permalink / raw)
  To: dualli, tkjos, gregkh, christian, arve, devel, linux-kernel,
	maco, hridya, surenb
  Cc: joel, kernel-team

From: Li Li <dualli@google.com>

As there isn't an atomic operation to freeze the main thread and binder
interface together, it's possible the main thread initiates a new binder
transaction while the binder interfaces are already frozen. This race issue
will result in failed binder transaction and unexpectedly crash the app.

This patch allows a post-froze rollback mechanism by checking if there's
any new pending binder transaction waiting for response. At the same time,
it treats the response transaction like an oneway transaction so that the
response can successfully reach the frozen process.

Changes in v2:
1. Improve commit msg, adding "Fixes";
2. Adding missing "_ilocked" suffix to binder_txns_pending();
3. Document bit assignment of struct binder_frozen_status_info in binder.h.

Changes in v3:
1. Make function binder_txns_pending() bool;
2. Remove redundant outstanding_txns check in binder_ioctl_freeze;
3. Change local variable txns_pending from int to __u32 for alignment;
4. Clarify uapi backward compatibility in commit msg.

Li Li (1):
  binder: fix freeze race

 drivers/android/binder.c            | 35 ++++++++++++++++++++++++-----
 drivers/android/binder_internal.h   |  2 ++
 include/uapi/linux/android/binder.h |  7 ++++++
 3 files changed, 38 insertions(+), 6 deletions(-)

-- 
2.33.0.309.g3052b89438-goog


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH v3 1/1] binder: fix freeze race
  2021-09-10 16:42 [PATCH v3 0/1] binder: fix freeze race Li Li
@ 2021-09-10 16:42 ` Li Li
  2021-09-13 21:15   ` Todd Kjos
  0 siblings, 1 reply; 3+ messages in thread
From: Li Li @ 2021-09-10 16:42 UTC (permalink / raw)
  To: dualli, tkjos, gregkh, christian, arve, devel, linux-kernel,
	maco, hridya, surenb
  Cc: joel, kernel-team

From: Li Li <dualli@google.com>

Currently cgroup freezer is used to freeze the application threads, and
BINDER_FREEZE is used to freeze the corresponding binder interface.
There's already a mechanism in ioctl(BINDER_FREEZE) to wait for any
existing transactions to drain out before actually freezing the binder
interface.

But freezing an app requires 2 steps, freezing the binder interface with
ioctl(BINDER_FREEZE) and then freezing the application main threads with
cgroupfs. This is not an atomic operation. The following race issue
might happen.

1) Binder interface is frozen by ioctl(BINDER_FREEZE);
2) Main thread A initiates a new sync binder transaction to process B;
3) Main thread A is frozen by "echo 1 > cgroup.freeze";
4) The response from process B reaches the frozen thread, which will
unexpectedly fail.

This patch provides a mechanism to check if there's any new pending
transaction happening between ioctl(BINDER_FREEZE) and freezing the
main thread. If there's any, the main thread freezing operation can
be rolled back to finish the pending transaction.

Furthermore, the response might reach the binder driver before the
rollback actually happens. That will still cause failed transaction.

As the other process doesn't wait for another response of the response,
the response transaction failure can be fixed by treating the response
transaction like an oneway/async one, allowing it to reach the frozen
thread. And it will be consumed when the thread gets unfrozen later.

NOTE: This patch reuses the existing definition of struct
binder_frozen_status_info but expands the bit assignments of __u32
member sync_recv.

To ensure backward compatibility, bit 0 of sync_recv still indicates
there's an outstanding sync binder transaction. This patch adds new
information to bit 1 of sync_recv, indicating the binder transaction
happens exactly when there's a race.

If an existing userspace app runs on a new kernel, a sync binder call
will set bit 0 of sync_recv so ioctl(BINDER_GET_FROZEN_INFO) still
return the expected value (true). The app just doesn't check bit 1
intentionally so it doesn't have the ability to tell if there's a race.
This behavior is aligned with what happens on an old kernel which
doesn't set bit 1 at all.

A new userspace app can 1) check bit 0 to know if there's a sync binder
transaction happened when being frozen - same as before; and 2) check
bit 1 to know if that sync binder transaction happened exactly when
there's a race - a new information for rollback decision.

Fixes: 432ff1e91694 ("binder: BINDER_FREEZE ioctl")
Test: stress test with apps being frozen and initiating binder calls at
the same time, confirmed the pending transactions succeeded.
Signed-off-by: Li Li <dualli@google.com>
---
 drivers/android/binder.c            | 35 ++++++++++++++++++++++++-----
 drivers/android/binder_internal.h   |  2 ++
 include/uapi/linux/android/binder.h |  7 ++++++
 3 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index d9030cb6b1e4..1a68c2f590cf 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -3038,9 +3038,8 @@ static void binder_transaction(struct binder_proc *proc,
 	if (reply) {
 		binder_enqueue_thread_work(thread, tcomplete);
 		binder_inner_proc_lock(target_proc);
-		if (target_thread->is_dead || target_proc->is_frozen) {
-			return_error = target_thread->is_dead ?
-				BR_DEAD_REPLY : BR_FROZEN_REPLY;
+		if (target_thread->is_dead) {
+			return_error = BR_DEAD_REPLY;
 			binder_inner_proc_unlock(target_proc);
 			goto err_dead_proc_or_thread;
 		}
@@ -4648,6 +4647,22 @@ static int binder_ioctl_get_node_debug_info(struct binder_proc *proc,
 	return 0;
 }
 
+static bool binder_txns_pending_ilocked(struct binder_proc *proc)
+{
+	struct rb_node *n;
+	struct binder_thread *thread;
+
+	if (proc->outstanding_txns > 0)
+		return true;
+
+	for (n = rb_first(&proc->threads); n; n = rb_next(n)) {
+		thread = rb_entry(n, struct binder_thread, rb_node);
+		if (thread->transaction_stack)
+			return true;
+	}
+	return false;
+}
+
 static int binder_ioctl_freeze(struct binder_freeze_info *info,
 			       struct binder_proc *target_proc)
 {
@@ -4679,8 +4694,13 @@ static int binder_ioctl_freeze(struct binder_freeze_info *info,
 			(!target_proc->outstanding_txns),
 			msecs_to_jiffies(info->timeout_ms));
 
-	if (!ret && target_proc->outstanding_txns)
-		ret = -EAGAIN;
+	/* Check pending transactions that wait for reply */
+	if (ret >= 0) {
+		binder_inner_proc_lock(target_proc);
+		if (binder_txns_pending_ilocked(target_proc))
+			ret = -EAGAIN;
+		binder_inner_proc_unlock(target_proc);
+	}
 
 	if (ret < 0) {
 		binder_inner_proc_lock(target_proc);
@@ -4696,6 +4716,7 @@ static int binder_ioctl_get_freezer_info(
 {
 	struct binder_proc *target_proc;
 	bool found = false;
+	__u32 txns_pending;
 
 	info->sync_recv = 0;
 	info->async_recv = 0;
@@ -4705,7 +4726,9 @@ static int binder_ioctl_get_freezer_info(
 		if (target_proc->pid == info->pid) {
 			found = true;
 			binder_inner_proc_lock(target_proc);
-			info->sync_recv |= target_proc->sync_recv;
+			txns_pending = binder_txns_pending_ilocked(target_proc);
+			info->sync_recv |= target_proc->sync_recv |
+					(txns_pending << 1);
 			info->async_recv |= target_proc->async_recv;
 			binder_inner_proc_unlock(target_proc);
 		}
diff --git a/drivers/android/binder_internal.h b/drivers/android/binder_internal.h
index 810c0b84d3f8..402c4d4362a8 100644
--- a/drivers/android/binder_internal.h
+++ b/drivers/android/binder_internal.h
@@ -378,6 +378,8 @@ struct binder_ref {
  *                        binder transactions
  *                        (protected by @inner_lock)
  * @sync_recv:            process received sync transactions since last frozen
+ *                        bit 0: received sync transaction after being frozen
+ *                        bit 1: new pending sync transaction during freezing
  *                        (protected by @inner_lock)
  * @async_recv:           process received async transactions since last frozen
  *                        (protected by @inner_lock)
diff --git a/include/uapi/linux/android/binder.h b/include/uapi/linux/android/binder.h
index 20e435fe657a..3246f2c74696 100644
--- a/include/uapi/linux/android/binder.h
+++ b/include/uapi/linux/android/binder.h
@@ -225,7 +225,14 @@ struct binder_freeze_info {
 
 struct binder_frozen_status_info {
 	__u32            pid;
+
+	/* process received sync transactions since last frozen
+	 * bit 0: received sync transaction after being frozen
+	 * bit 1: new pending sync transaction during freezing
+	 */
 	__u32            sync_recv;
+
+	/* process received async transactions since last frozen */
 	__u32            async_recv;
 };
 
-- 
2.33.0.309.g3052b89438-goog


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v3 1/1] binder: fix freeze race
  2021-09-10 16:42 ` [PATCH v3 1/1] " Li Li
@ 2021-09-13 21:15   ` Todd Kjos
  0 siblings, 0 replies; 3+ messages in thread
From: Todd Kjos @ 2021-09-13 21:15 UTC (permalink / raw)
  To: Li Li
  Cc: dualli, gregkh, christian, arve, devel, linux-kernel, maco,
	hridya, surenb, joel, kernel-team

On Fri, Sep 10, 2021 at 9:42 AM Li Li <dualli@chromium.org> wrote:
>
> From: Li Li <dualli@google.com>
>
> Currently cgroup freezer is used to freeze the application threads, and
> BINDER_FREEZE is used to freeze the corresponding binder interface.
> There's already a mechanism in ioctl(BINDER_FREEZE) to wait for any
> existing transactions to drain out before actually freezing the binder
> interface.
>
> But freezing an app requires 2 steps, freezing the binder interface with
> ioctl(BINDER_FREEZE) and then freezing the application main threads with
> cgroupfs. This is not an atomic operation. The following race issue
> might happen.
>
> 1) Binder interface is frozen by ioctl(BINDER_FREEZE);
> 2) Main thread A initiates a new sync binder transaction to process B;
> 3) Main thread A is frozen by "echo 1 > cgroup.freeze";
> 4) The response from process B reaches the frozen thread, which will
> unexpectedly fail.
>
> This patch provides a mechanism to check if there's any new pending
> transaction happening between ioctl(BINDER_FREEZE) and freezing the
> main thread. If there's any, the main thread freezing operation can
> be rolled back to finish the pending transaction.
>
> Furthermore, the response might reach the binder driver before the
> rollback actually happens. That will still cause failed transaction.
>
> As the other process doesn't wait for another response of the response,
> the response transaction failure can be fixed by treating the response
> transaction like an oneway/async one, allowing it to reach the frozen
> thread. And it will be consumed when the thread gets unfrozen later.
>
> NOTE: This patch reuses the existing definition of struct
> binder_frozen_status_info but expands the bit assignments of __u32
> member sync_recv.
>
> To ensure backward compatibility, bit 0 of sync_recv still indicates
> there's an outstanding sync binder transaction. This patch adds new
> information to bit 1 of sync_recv, indicating the binder transaction
> happens exactly when there's a race.
>
> If an existing userspace app runs on a new kernel, a sync binder call
> will set bit 0 of sync_recv so ioctl(BINDER_GET_FROZEN_INFO) still
> return the expected value (true). The app just doesn't check bit 1
> intentionally so it doesn't have the ability to tell if there's a race.
> This behavior is aligned with what happens on an old kernel which
> doesn't set bit 1 at all.
>
> A new userspace app can 1) check bit 0 to know if there's a sync binder
> transaction happened when being frozen - same as before; and 2) check
> bit 1 to know if that sync binder transaction happened exactly when
> there's a race - a new information for rollback decision.
>
> Fixes: 432ff1e91694 ("binder: BINDER_FREEZE ioctl")
> Test: stress test with apps being frozen and initiating binder calls at
> the same time, confirmed the pending transactions succeeded.
> Signed-off-by: Li Li <dualli@google.com>

Acked-by: Todd Kjos <tkjos@google.com>

> ---
>  drivers/android/binder.c            | 35 ++++++++++++++++++++++++-----
>  drivers/android/binder_internal.h   |  2 ++
>  include/uapi/linux/android/binder.h |  7 ++++++
>  3 files changed, 38 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/android/binder.c b/drivers/android/binder.c
> index d9030cb6b1e4..1a68c2f590cf 100644
> --- a/drivers/android/binder.c
> +++ b/drivers/android/binder.c
> @@ -3038,9 +3038,8 @@ static void binder_transaction(struct binder_proc *proc,
>         if (reply) {
>                 binder_enqueue_thread_work(thread, tcomplete);
>                 binder_inner_proc_lock(target_proc);
> -               if (target_thread->is_dead || target_proc->is_frozen) {
> -                       return_error = target_thread->is_dead ?
> -                               BR_DEAD_REPLY : BR_FROZEN_REPLY;
> +               if (target_thread->is_dead) {
> +                       return_error = BR_DEAD_REPLY;
>                         binder_inner_proc_unlock(target_proc);
>                         goto err_dead_proc_or_thread;
>                 }
> @@ -4648,6 +4647,22 @@ static int binder_ioctl_get_node_debug_info(struct binder_proc *proc,
>         return 0;
>  }
>
> +static bool binder_txns_pending_ilocked(struct binder_proc *proc)
> +{
> +       struct rb_node *n;
> +       struct binder_thread *thread;
> +
> +       if (proc->outstanding_txns > 0)
> +               return true;
> +
> +       for (n = rb_first(&proc->threads); n; n = rb_next(n)) {
> +               thread = rb_entry(n, struct binder_thread, rb_node);
> +               if (thread->transaction_stack)
> +                       return true;
> +       }
> +       return false;
> +}
> +
>  static int binder_ioctl_freeze(struct binder_freeze_info *info,
>                                struct binder_proc *target_proc)
>  {
> @@ -4679,8 +4694,13 @@ static int binder_ioctl_freeze(struct binder_freeze_info *info,
>                         (!target_proc->outstanding_txns),
>                         msecs_to_jiffies(info->timeout_ms));
>
> -       if (!ret && target_proc->outstanding_txns)
> -               ret = -EAGAIN;
> +       /* Check pending transactions that wait for reply */
> +       if (ret >= 0) {
> +               binder_inner_proc_lock(target_proc);
> +               if (binder_txns_pending_ilocked(target_proc))
> +                       ret = -EAGAIN;
> +               binder_inner_proc_unlock(target_proc);
> +       }
>
>         if (ret < 0) {
>                 binder_inner_proc_lock(target_proc);
> @@ -4696,6 +4716,7 @@ static int binder_ioctl_get_freezer_info(
>  {
>         struct binder_proc *target_proc;
>         bool found = false;
> +       __u32 txns_pending;
>
>         info->sync_recv = 0;
>         info->async_recv = 0;
> @@ -4705,7 +4726,9 @@ static int binder_ioctl_get_freezer_info(
>                 if (target_proc->pid == info->pid) {
>                         found = true;
>                         binder_inner_proc_lock(target_proc);
> -                       info->sync_recv |= target_proc->sync_recv;
> +                       txns_pending = binder_txns_pending_ilocked(target_proc);
> +                       info->sync_recv |= target_proc->sync_recv |
> +                                       (txns_pending << 1);
>                         info->async_recv |= target_proc->async_recv;
>                         binder_inner_proc_unlock(target_proc);
>                 }
> diff --git a/drivers/android/binder_internal.h b/drivers/android/binder_internal.h
> index 810c0b84d3f8..402c4d4362a8 100644
> --- a/drivers/android/binder_internal.h
> +++ b/drivers/android/binder_internal.h
> @@ -378,6 +378,8 @@ struct binder_ref {
>   *                        binder transactions
>   *                        (protected by @inner_lock)
>   * @sync_recv:            process received sync transactions since last frozen
> + *                        bit 0: received sync transaction after being frozen
> + *                        bit 1: new pending sync transaction during freezing
>   *                        (protected by @inner_lock)
>   * @async_recv:           process received async transactions since last frozen
>   *                        (protected by @inner_lock)
> diff --git a/include/uapi/linux/android/binder.h b/include/uapi/linux/android/binder.h
> index 20e435fe657a..3246f2c74696 100644
> --- a/include/uapi/linux/android/binder.h
> +++ b/include/uapi/linux/android/binder.h
> @@ -225,7 +225,14 @@ struct binder_freeze_info {
>
>  struct binder_frozen_status_info {
>         __u32            pid;
> +
> +       /* process received sync transactions since last frozen
> +        * bit 0: received sync transaction after being frozen
> +        * bit 1: new pending sync transaction during freezing
> +        */
>         __u32            sync_recv;
> +
> +       /* process received async transactions since last frozen */
>         __u32            async_recv;
>  };
>
> --
> 2.33.0.309.g3052b89438-goog
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-09-13 21:15 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-10 16:42 [PATCH v3 0/1] binder: fix freeze race Li Li
2021-09-10 16:42 ` [PATCH v3 1/1] " Li Li
2021-09-13 21:15   ` Todd Kjos

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).