linux-kernel.vger.kernel.org archive mirror
* [PATCH RFC 00/20] Setting up Binder for the future
@ 2023-11-01 18:01 Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 01/20] rust_binder: define a Rust binder driver Alice Ryhl
                   ` (20 more replies)
  0 siblings, 21 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

We're generally not proponents of rewrites (nasty uncomfortable things
that make you late for dinner!). So why rewrite Binder? 

Binder has evolved over the past 15+ years to meet the changing needs
of Android. Its responsibilities, expectations, and complexity
have grown considerably during that time. While we expect Binder to
continue to evolve along with Android, there are a number of factors
that currently constrain our ability to develop/maintain it. Briefly,
those are:

1. Complexity: Binder is at the intersection of everything in Android and
   fulfills many responsibilities beyond IPC. It has become many things
   to many people, and due to its many features and their interactions
   with each other, its complexity is quite high. In just 6kLOC it must
   deliver transactions to the right threads. It must correctly parse
   and translate the contents of transactions, which can contain several
   objects of different types (e.g., pointers, fds) that can interact
   with each other. It controls the size of thread pools in userspace,
   and ensures that transactions are assigned to threads in ways that
   avoid deadlocks where the threadpool has run out of threads. It must
   track refcounts of objects that are shared by several processes by
   forwarding refcount changes between the processes correctly. It must
   handle numerous error scenarios and it combines/nests 13 different
   locks, 7 reference counters, and atomic variables. Finally, it must
   do all of this as fast and efficiently as possible. Minor performance
   regressions can cause a noticeably degraded user experience.

2. Things to improve: Thousand-line functions [1], error-prone error
   handling [2], and confusing structure can occur as a code base grows
   organically. After more than a decade of development, this codebase
   could use an overhaul.

3. Security critical: Binder is a critical part of Android's sandboxing
   strategy. Even Android's most de-privileged sandboxes (e.g. the
   Chrome renderer, or SW Codec) have direct access to Binder. More than
   just about any other component, it's important that Binder provide
   robust security, and itself be robust against security
   vulnerabilities.

It's #1 (high complexity) that has made continuing to evolve Binder and
resolving #2 (tech debt) exceptionally difficult without causing #3
(security issues). For Binder to continue to meet Android's needs, we
need better ways to manage (and reduce!) complexity without increasing
the risk.

The biggest change is obviously the choice of programming language. We
decided to use Rust because it directly addresses a number of the
challenges within Binder that we have faced in recent years. It prevents
mistakes with refcounting, locking, and bounds checking, and also does a
lot to reduce the complexity of error handling. Additionally,
we've been able to use the more expressive type system to encode the
ownership semantics of the various structs and pointers, which takes the
complexity of managing object lifetimes out of the hands of the
programmer, reducing the risk of use-after-frees and similar problems.

Rust has many different pointer types that it uses to encode ownership
semantics into the type system, and this is probably one of the most
important aspects of how it helps in Binder. The Binder driver has a lot
of different objects that have complex ownership semantics; some
pointers own a refcount, some pointers have exclusive ownership, and
some pointers just reference the object and it is kept alive in some
other manner. With Rust, we can use a different pointer type for each
kind of pointer, which enables the compiler to enforce that the
ownership semantics are implemented correctly.
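
As a rough userspace illustration of the three kinds of pointer described
above, the sketch below uses the standard library's `Box`, `Arc`, and plain
references. The `Node` struct and the helper functions are hypothetical and
are not taken from the driver; the kernel crate has its own pointer types,
so this only shows the general idea.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a driver object; not a type from this series.
struct Node {
    id: u32,
}

/// Exclusive ownership: the `Box` is the only handle, and the node is freed
/// when the box goes out of scope at the end of this function.
fn consume(node: Box<Node>) -> u32 {
    node.id
}

/// Shared ownership: every `Arc` clone owns exactly one reference count.
fn share(node: &Arc<Node>) -> Arc<Node> {
    Arc::clone(node)
}

/// Plain borrow: something else keeps the node alive, and the compiler
/// rejects any use of the reference after that owner is gone.
fn peek(node: &Node) -> u32 {
    node.id
}

fn main() {
    let shared = Arc::new(Node { id: 7 });
    let second = share(&shared);
    assert_eq!(Arc::strong_count(&shared), 2);
    assert_eq!(peek(&second), 7);

    // Dropping a clone releases its refcount; no manual "put" call exists
    // that could be forgotten or issued twice.
    drop(second);
    assert_eq!(Arc::strong_count(&shared), 1);

    assert_eq!(consume(Box::new(Node { id: 9 })), 9);
}
```

Because the kind of ownership is part of each function signature, passing a
borrowed node where an owned one is expected fails to compile rather than
becoming a refcount imbalance at runtime.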

Another useful feature is Rust's error handling. Rust allows for
simpler error handling with features such as destructors, and you get
compilation failures if errors are not properly handled. This means that
even though Rust requires you to spend more lines of code than C on
things such as writing down invariants that are left implicit in C, the
Rust driver is still slightly smaller than C binder: Rust is 5.5kLOC and
C is 5.8kLOC. (These numbers exclude blank lines, comments,
binderfs, and any debugging facilities in C that are not yet implemented
in the Rust driver. The numbers include abstractions in rust/kernel/
that are unlikely to be used by drivers other than Binder.)
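
To make the point about destructors concrete, here is a small userspace
sketch (not code from the driver). The `Guard` type, the `Error` enum, and
the 4096-byte limit are all made up for illustration. The destructor runs
on every exit path, so the early return produced by `?` needs no separate
cleanup label.

```rust
use std::cell::Cell;

/// Hypothetical resource guard; `Drop` takes the place of a C cleanup label.
struct Guard<'a> {
    released: &'a Cell<u32>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        // Runs on every exit path, including early returns via `?`.
        self.released.set(self.released.get() + 1);
    }
}

#[derive(Debug, PartialEq)]
enum Error {
    TooLarge,
}

/// Illustrative validation step; the limit is an arbitrary assumption.
fn check_size(size: usize) -> Result<usize, Error> {
    if size > 4096 {
        Err(Error::TooLarge)
    } else {
        Ok(size)
    }
}

fn handle(size: usize, released: &Cell<u32>) -> Result<usize, Error> {
    let _guard = Guard { released };
    // `?` returns early on error; `_guard` is still dropped.
    let size = check_size(size)?;
    Ok(size * 2)
}

fn main() {
    let released = Cell::new(0);
    assert_eq!(handle(16, &released), Ok(32));
    assert_eq!(handle(8192, &released), Err(Error::TooLarge));
    // The guard was released once per call, on both success and failure.
    assert_eq!(released.get(), 2);
}
```

The value inside a `Result` cannot be reached without going through the
error case, which is what keeps the success path from silently ignoring a
failure.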

Although this rewrite completely rethinks how the code is structured and
how assumptions are enforced, we do not fundamentally change *how* the
driver does the things it does. A lot of careful thought has gone into
the existing design. The rewrite is aimed rather at improving code
health, structure, readability, robustness, security, maintainability
and extensibility. We also include more inline documentation, and
improve how assumptions in the code are enforced. Furthermore, all
unsafe code is annotated with a SAFETY comment that explains why it is
correct.
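
For readers unfamiliar with the SAFETY comment convention, a minimal
userspace sketch looks like this. The `first_byte` function is a made-up
example and does not appear in the driver.

```rust
/// Returns the first byte of `bytes`, or `None` if the slice is empty.
fn first_byte(bytes: &[u8]) -> Option<u8> {
    if bytes.is_empty() {
        return None;
    }
    // SAFETY: The slice is non-empty (checked above), so index 0 is in
    // bounds.
    Some(unsafe { *bytes.get_unchecked(0) })
}

fn main() {
    assert_eq!(first_byte(b"binder"), Some(b'b'));
    assert_eq!(first_byte(b""), None);
}
```

The comment states the precondition of the unsafe operation and why it
holds at this call site, so a reviewer can check the argument locally.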

We have left the binderfs filesystem component in C. Rewriting it in
Rust would be a large amount of work and would require a lot of bindings
to the file system interfaces. Binderfs has not historically had the same
challenges with security and complexity, so rewriting binderfs seems to
have lower value than the rest of Binder.

Correctness and feature parity
------------------------------

Rust binder passes all tests that validate the correctness of Binder in
the Android Open Source Project. We can boot a device, and run a variety
of apps and functionality without issues. We have performed this both on
the Cuttlefish Android emulator device, and on a Pixel 6 Pro.

As for feature parity, Rust binder currently implements all features
that C binder supports, with the exception of some debugging facilities.
The missing debugging facilities will be added before we submit the Rust
implementation upstream.

Performance numbers
-------------------

We have tested the driver using two different benchmarks:
binderThroughputTest [3] and binderRpcBenchmark [4]. These benchmarks
show that the Rust implementation has very promising performance
characteristics. That said, these are only microbenchmarks with very
simple workloads, and there is still a lot of work to be done before we
can truly understand how the drivers compare in the real world.

binderThroughputTest:
Some visualizations of the benchmarking results are available at the
following links:

Average latency with no payload: https://raw.githubusercontent.com/Darksonn/linux/rust-binder-rfc/img-for-rust-binder-rfc/Average%20latency%20with%20no%20payload.png
Average latency with 4k payload: https://raw.githubusercontent.com/Darksonn/linux/rust-binder-rfc/img-for-rust-binder-rfc/Average%20latency%20with%204k%20payload.png
99 percentile latency with no payload: https://raw.githubusercontent.com/Darksonn/linux/rust-binder-rfc/img-for-rust-binder-rfc/99%20percentile%20latency%20with%20no%20payload.png
99 percentile latency with 4k payload: https://raw.githubusercontent.com/Darksonn/linux/rust-binder-rfc/img-for-rust-binder-rfc/99%20percentile%20latency%20with%204k%20payload.png

Raw data with empty payloads:
    +-----------+----------+---------+----------+---------+----------+----------+
    | c/s pairs | Rust avg |  C avg  | Rust 99p |  C 99p  | Avg frac | 99p frac |
    +-----------+----------+---------+----------+---------+----------+----------+
    |         1 |   17.517 |  17.278 |   31.169 |  34.464 |   +1.38% |   -9.56% |
    |         2 |   17.405 |  17.425 |   36.051 |  36.825 |   -0.11% |   -2.10% |
    |         4 |   27.623 |  27.524 |   46.305 |  45.776 |   +0.36% |   +1.16% |
    |         8 |   25.152 |  25.461 |   61.442 |  61.279 |   -1.21% |   +0.27% |
    |        16 |   50.251 |  49.987 |  120.158 | 121.297 |   +0.53% |   -0.94% |
    |        32 |   99.439 | 100.537 |  238.891 | 238.404 |   -1.09% |   +0.20% |
    +-----------+----------+---------+----------+---------+----------+----------+
Raw data with 4k payloads:
    +-----------+----------+---------+----------+---------+----------+----------+
    | c/s pairs | Rust avg |  C avg  | Rust 99p |  C 99p  | Avg frac | 99p frac |
    +-----------+----------+---------+----------+---------+----------+----------+
    |         1 |   19.422 |  19.811 |   30.233 |  31.616 |   -1.96% |   -4.37% |
    |         2 |   18.393 |  18.277 |   34.790 |  35.319 |   +0.63% |   -1.50% |
    |         4 |   29.350 |  29.283 |   48.544 |  47.730 |   +0.23% |   +1.71% |
    |         8 |   25.075 |  25.283 |   66.040 |  65.226 |   -0.82% |   +1.25% |
    |        16 |   58.608 |  58.949 |  156.657 | 159.709 |   -0.58% |   -1.91% |
    |        32 |  127.404 | 129.459 |  321.249 | 326.945 |   -1.59% |   -1.74% |
    +-----------+----------+---------+----------+---------+----------+----------+
These tables depict roundtrip latencies of transactions as measured by
binderThroughputTest. Each measurement is given in microseconds. Each
row has a sample size of 10 million iterations. Negative percentages are
better for Rust.

We've found that Rust binder has similar performance to C binder on the
binderThroughputTest benchmark. The average latencies fluctuate between
-1.96% and +1.38%.
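
The "frac" columns are not defined explicitly above. Assuming they are the
relative difference of the Rust value against the C value, the sketch below
reproduces the first row of the empty-payload table. The function names are
ours and are not part of the benchmark.

```rust
/// Relative difference of `rust` against `c`, in percent.
/// Assumed formula: (rust - c) / c * 100. Negative means Rust is faster.
fn frac_percent(rust: f64, c: f64) -> f64 {
    (rust - c) / c * 100.0
}

/// Formats the difference with an explicit sign and two decimals,
/// matching the layout of the tables.
fn format_frac(rust: f64, c: f64) -> String {
    format!("{:+.2}%", frac_percent(rust, c))
}

fn main() {
    // First row of the empty-payload table (1 c/s pair).
    println!("avg: {}", format_frac(17.517, 17.278));
    println!("99p: {}", format_frac(31.169, 34.464));
}
```

With the inputs from that row, this yields +1.38% for the average and
-9.56% for the 99th percentile, which matches the table.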

binderRpcBenchmark:
    +---------------------+-----------+---------+----------+---------+-----------+----------+
    |      Benchmark      | Time Rust | Time C  | CPU Rust |   CPU C | Time frac | CPU frac |
    +---------------------+-----------+---------+----------+---------+-----------+----------+
    | pingTransaction     |    21.595 |  22.167 |    9.625 |   9.692 |    -2.58% |   -0.69% |
    | repeatBinder        |    33.982 |  34.648 |   16.252 |  16.681 |    -1.92% |   -2.57% |
    | throughput/64       |    26.774 |  26.587 |   11.995 |  11.823 |    +0.70% |   +1.45% |
    | throughput/1024     |    33.679 |  33.867 |   15.140 |  15.137 |    -0.56% |   +0.02% |
    | throughput/2048     |    39.744 |  40.092 |   17.898 |  17.926 |    -0.87% |   -0.16% |
    | throughput/4096     |    52.585 |  53.457 |   23.788 |  24.067 |    -1.63% |   -1.16% |
    | throughput/8182     |    76.352 |  77.148 |   35.135 |  35.228 |    -1.03% |   -0.26% |
    | throughput/16364    |   121.875 | 122.877 |   57.342 |  57.614 |    -0.82% |   -0.47% |
    | throughput/32728    |   212.380 | 212.765 |  101.838 | 101.589 |    -0.18% |   +0.25% |
    | throughput/65535    |   442.983 | 421.935 |  222.642 | 212.494 |    +4.99% |   +4.78% |
    | throughput/65536    |   431.250 | 416.916 |  216.634 | 210.160 |    +3.44% |   +3.08% |
    | throughput/65537    |   512.902 | 492.272 |  242.472 | 232.786 |    +4.19% |   +4.16% |
    | repeatTwoPageString |   456.546 | 445.398 |  222.921 | 219.821 |    +2.50% |   +1.41% |
    +---------------------+-----------+---------+----------+---------+-----------+----------+
This table depicts wall clock time and CPU time measurements over
various test cases. Each measurement is given in microseconds. The
throughput benchmarks correspond to the
BM_throughputForTransportAndBytes test case, and the number is the size
of the payload. Negative percentages are better for Rust.

From the above, we find that Rust binder is competitive for all test
cases except for those with very large transaction sizes. However, this
is a very rare case in practice [5] and we've been able to fix all other
performance issues that we've run into, so there's no reason to think
that we won't also be able to fix this issue. We did not fix it for this
RFC because we prioritized getting the RFC out to provide context for
the upcoming discussion at Linux Plumbers Conference [6].

We ran all of the benchmarks with cross-language LTO enabled, so that C
code can be inlined into Rust code. We get similar results on the
Cuttlefish Android emulator (which has an x86 architecture).

The Binder driver is very performance critical, and although our initial
numbers are promising, we must gain a better understanding of how it
performs in realistic workloads and not just in simple benchmarks. What
we ultimately care about is the performance impact that it has on the
whole system. Much work remains to be done on this front.

Dependencies
------------

When implementing kernel drivers in Rust, you must write bindings for
each subsystem that you need to call into from Rust. Binder requires
quite a few of them. We have not included them in this patch series, but
you can view them at the following branch:

https://github.com/Darksonn/linux/commits/rust-binder-rfc

The branch is based on top of commit 639409a4ac8e ("Merge tag
'wq-for-6.7-rust-bindings' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq"),
which is available in mainline. I did not base it on a tag, since there
is not yet any tag that includes the Rust workqueue abstractions.

This RFC uses the kernel's red-black tree for key/value mappings, but we
are aware that the red-black tree is deprecated. We did this to make the
performance comparison more fair, since C binder also uses rbtree for
this. We intend to replace these with XArrays instead. That said, we
don't think that XArray is a good fit for the range allocator, and we
propose to continue using the red-black tree for the range allocator
(see patch 6).

Thank you
---------

 * Wedson Almeida Filho who wrote the first version of the driver and
   started the project.

 * Miguel Ojeda for his support, for leading the Rust-for-Linux effort,
   and for helping us navigate the upstream community.

 * Matt Gilbride for his work on the range allocator and oneway spam
   detection.

 * Carlos Llamas for patiently answering all my questions to help me
   understand the C driver, and co-presenting with me at LPC and
   Kangrejos.

 * Greg KH for reviews and guidance on upstream development.

 * Todd Kjos for reviewing the cover letter, answering questions, and
   giving pointers on benchmarking the driver.

 * Matthew Maurer for his mentorship and help with navigating the build
   system, including getting LTO working.

 * John Stultz for his help with debugging a performance issue.

 * Andreas Hindborg for his help with getting LTO working.

 * Benno Lossin, Gary Guo, Andreas Hindborg, Miguel Ojeda, Wedson
   Almeida Filho, Martin Rodriguez Reboredo, Björn Roy Baron, Boqun
   Feng, Tejun Heo, and Nathan Huckleberry for reviewing various
   bindings needed by Binder.

Thank you,
Alice

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/android/binder.c?h=v6.5#n2896
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/android/binder.c?h=v6.5#n3658
[3]: https://android-review.googlesource.com/c/platform/frameworks/native/+/2680818
[4]: https://cs.android.com/android/platform/superproject/main/+/main:frameworks/native/libs/binder/tests/binderRpcBenchmark.cpp
[5]: https://cs.android.com/android/_/android/platform/frameworks/native/+/b85e7f7dbd0463d2ba78d53d50e64489fcb01ec4:libs/binder/tests/binderRpcBenchmark.cpp;l=206-217;bpv=1;bpt=0
[6]: https://lpc.events/event/17/contributions/1427/

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
Alice Ryhl (15):
      rust_binder: add binderfs support to Rust binder
      rust_binder: add threading support
      rust_binder: add work lists
      rust_binder: add nodes and context managers
      rust_binder: add oneway transactions
      rust_binder: serialize oneway transactions
      rust_binder: send nodes in transactions
      rust_binder: add BINDER_TYPE_PTR support
      rust_binder: add BINDER_TYPE_FD support
      rust_binder: add BINDER_TYPE_FDA support
      rust_binder: add process freezing
      rust_binder: add TF_UPDATE_TXN support
      rust_binder: add binder_logs/state
      rust_binder: add vma shrinker
      binder: delete the C implementation

Matt Gilbride (1):
      rust_binder: add oneway spam detection

Wedson Almeida Filho (4):
      rust_binder: define a Rust binder driver
      rust_binder: add epoll support
      rust_binder: add non-oneway transactions
      rust_binder: add death notifications

 drivers/android/Kconfig                         |   19 +-
 drivers/android/Makefile                        |    2 +
 drivers/android/allocation.rs                   |  541 ++
 drivers/android/binder.c                        | 6630 -----------------------
 drivers/android/binder_alloc.c                  | 1284 -----
 drivers/android/context.rs                      |  225 +
 drivers/android/defs.rs                         |  171 +
 drivers/android/error.rs                        |   94 +
 drivers/android/node.rs                         |  761 +++
 drivers/android/process.rs                      | 1412 +++++
 drivers/android/range_alloc.rs                  |  442 ++
 drivers/android/rust_binder.rs                  |  389 ++
 drivers/android/{binderfs.c => rust_binderfs.c} |  135 +-
 drivers/android/thread.rs                       | 1552 ++++++
 drivers/android/transaction.rs                  |  428 ++
 include/linux/rust_binder.h                     |   16 +
 include/uapi/linux/android/binder.h             |   30 +-
 include/uapi/linux/magic.h                      |    1 +
 rust/bindings/bindings_helper.h                 |    6 +
 rust/helpers.c                                  |   48 +
 rust/kernel/file.rs                             |    2 +-
 rust/kernel/lib.rs                              |    9 +
 rust/kernel/page_range.rs                       |  715 +++
 rust/kernel/security.rs                         |   33 +
 rust/kernel/seq_file.rs                         |   47 +
 rust/kernel/sync/condvar.rs                     |   10 +
 rust/kernel/sync/lock.rs                        |   24 +
 rust/kernel/sync/lock/mutex.rs                  |   10 +
 rust/kernel/sync/lock/spinlock.rs               |   10 +
 rust/kernel/task.rs                             |    2 +-
 scripts/Makefile.build                          |    2 +-
 31 files changed, 7061 insertions(+), 7989 deletions(-)
---
base-commit: b4be1bd6c44225bf7276a4666fd30b8da9cba517
change-id: 20231101-rust-binder-464b89651887

Best regards,
-- 
Alice Ryhl <aliceryhl@google.com>



* [PATCH RFC 01/20] rust_binder: define a Rust binder driver
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:09   ` Greg Kroah-Hartman
  2023-11-01 18:25   ` Boqun Feng
  2023-11-01 18:01 ` [PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder Alice Ryhl
                   ` (19 subsequent siblings)
  20 siblings, 2 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

From: Wedson Almeida Filho <wedsonaf@gmail.com>

Define the Rust binder driver, and set up the helpers for making C types
accessible from Rust.

Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Co-developed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/Kconfig             | 11 +++++++++++
 drivers/android/Makefile            |  1 +
 drivers/android/rust_binder.rs      | 21 +++++++++++++++++++++
 include/uapi/linux/android/binder.h | 30 ++++++++++++++++--------------
 rust/bindings/bindings_helper.h     |  1 +
 5 files changed, 50 insertions(+), 14 deletions(-)

diff --git a/drivers/android/Kconfig b/drivers/android/Kconfig
index 07aa8ae0a058..fcfd25c9a016 100644
--- a/drivers/android/Kconfig
+++ b/drivers/android/Kconfig
@@ -13,6 +13,17 @@ config ANDROID_BINDER_IPC
 	  Android process, using Binder to identify, invoke and pass arguments
 	  between said processes.
 
+config ANDROID_BINDER_IPC_RUST
+	bool "Android Binder IPC Driver in Rust"
+	depends on MMU && RUST
+	help
+	  Binder is used in Android for both communication between processes,
+	  and remote method invocation.
+
+	  This means one Android process can call a method/routine in another
+	  Android process, using Binder to identify, invoke and pass arguments
+	  between said processes.
+
 config ANDROID_BINDERFS
 	bool "Android Binderfs filesystem"
 	depends on ANDROID_BINDER_IPC
diff --git a/drivers/android/Makefile b/drivers/android/Makefile
index c9d3d0c99c25..6348f75832ca 100644
--- a/drivers/android/Makefile
+++ b/drivers/android/Makefile
@@ -4,3 +4,4 @@ ccflags-y += -I$(src)			# needed for trace events
 obj-$(CONFIG_ANDROID_BINDERFS)		+= binderfs.o
 obj-$(CONFIG_ANDROID_BINDER_IPC)	+= binder.o binder_alloc.o
 obj-$(CONFIG_ANDROID_BINDER_IPC_SELFTEST) += binder_alloc_selftest.o
+obj-$(CONFIG_ANDROID_BINDER_IPC_RUST)	+= rust_binder.o
diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs
new file mode 100644
index 000000000000..4b3d6676a9cf
--- /dev/null
+++ b/drivers/android/rust_binder.rs
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Binder -- the Android IPC mechanism.
+
+use kernel::prelude::*;
+
+module! {
+    type: BinderModule,
+    name: "rust_binder",
+    author: "Wedson Almeida Filho, Alice Ryhl",
+    description: "Android Binder",
+    license: "GPL",
+}
+
+struct BinderModule {}
+
+impl kernel::Module for BinderModule {
+    fn init(_module: &'static kernel::ThisModule) -> Result<Self> {
+        Ok(Self {})
+    }
+}
diff --git a/include/uapi/linux/android/binder.h b/include/uapi/linux/android/binder.h
index 5f636b5afcd7..d44a8118b2ed 100644
--- a/include/uapi/linux/android/binder.h
+++ b/include/uapi/linux/android/binder.h
@@ -251,20 +251,22 @@ struct binder_extended_error {
 	__s32	param;
 };
 
-#define BINDER_WRITE_READ		_IOWR('b', 1, struct binder_write_read)
-#define BINDER_SET_IDLE_TIMEOUT		_IOW('b', 3, __s64)
-#define BINDER_SET_MAX_THREADS		_IOW('b', 5, __u32)
-#define BINDER_SET_IDLE_PRIORITY	_IOW('b', 6, __s32)
-#define BINDER_SET_CONTEXT_MGR		_IOW('b', 7, __s32)
-#define BINDER_THREAD_EXIT		_IOW('b', 8, __s32)
-#define BINDER_VERSION			_IOWR('b', 9, struct binder_version)
-#define BINDER_GET_NODE_DEBUG_INFO	_IOWR('b', 11, struct binder_node_debug_info)
-#define BINDER_GET_NODE_INFO_FOR_REF	_IOWR('b', 12, struct binder_node_info_for_ref)
-#define BINDER_SET_CONTEXT_MGR_EXT	_IOW('b', 13, struct flat_binder_object)
-#define BINDER_FREEZE			_IOW('b', 14, struct binder_freeze_info)
-#define BINDER_GET_FROZEN_INFO		_IOWR('b', 15, struct binder_frozen_status_info)
-#define BINDER_ENABLE_ONEWAY_SPAM_DETECTION	_IOW('b', 16, __u32)
-#define BINDER_GET_EXTENDED_ERROR	_IOWR('b', 17, struct binder_extended_error)
+enum {
+	BINDER_WRITE_READ		= _IOWR('b', 1, struct binder_write_read),
+	BINDER_SET_IDLE_TIMEOUT		= _IOW('b', 3, __s64),
+	BINDER_SET_MAX_THREADS		= _IOW('b', 5, __u32),
+	BINDER_SET_IDLE_PRIORITY	= _IOW('b', 6, __s32),
+	BINDER_SET_CONTEXT_MGR		= _IOW('b', 7, __s32),
+	BINDER_THREAD_EXIT		= _IOW('b', 8, __s32),
+	BINDER_VERSION			= _IOWR('b', 9, struct binder_version),
+	BINDER_GET_NODE_DEBUG_INFO	= _IOWR('b', 11, struct binder_node_debug_info),
+	BINDER_GET_NODE_INFO_FOR_REF	= _IOWR('b', 12, struct binder_node_info_for_ref),
+	BINDER_SET_CONTEXT_MGR_EXT	= _IOW('b', 13, struct flat_binder_object),
+	BINDER_FREEZE			= _IOW('b', 14, struct binder_freeze_info),
+	BINDER_GET_FROZEN_INFO		= _IOWR('b', 15, struct binder_frozen_status_info),
+	BINDER_ENABLE_ONEWAY_SPAM_DETECTION	= _IOW('b', 16, __u32),
+	BINDER_GET_EXTENDED_ERROR	= _IOWR('b', 17, struct binder_extended_error),
+};
 
 /*
  * NOTE: Two special error codes you should check for when calling
diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 14f84aeef62d..00a66666f00a 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -21,6 +21,7 @@
 #include <linux/sched.h>
 #include <linux/task_work.h>
 #include <linux/workqueue.h>
+#include <uapi/linux/android/binder.h>
 
 /* `bindgen` gets confused at certain things. */
 const size_t BINDINGS_ARCH_SLAB_MINALIGN = ARCH_SLAB_MINALIGN;

-- 
2.42.0.820.g83a721a137-goog



* [PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 01/20] rust_binder: define a Rust binder driver Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:10   ` Greg Kroah-Hartman
                     ` (2 more replies)
  2023-11-01 18:01 ` [PATCH RFC 03/20] rust_binder: add threading support Alice Ryhl
                   ` (18 subsequent siblings)
  20 siblings, 3 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

Add support for accessing the Rust binder driver via binderfs. The
actual binderfs implementation is done entirely in C, and the
`rust_binderfs.c` file is a modified version of `binderfs.c` that is
adjusted to call into the Rust binder driver rather than the C driver.

We have left the binderfs filesystem component in C. Rewriting it in
Rust would be a large amount of work and would require a lot of bindings
to the file system interfaces. Binderfs has not historically had the same
challenges with security and complexity, so rewriting Binderfs seems to
have lower value than the rest of Binder.

We also add code on the Rust side for binderfs to call into. Most of
this is left as stub implementations, with the exception of closing the
file descriptor and the BINDER_VERSION ioctl.

Co-developed-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/Kconfig         |  24 ++
 drivers/android/Makefile        |   1 +
 drivers/android/context.rs      | 144 +++++++
 drivers/android/defs.rs         |  39 ++
 drivers/android/process.rs      | 251 ++++++++++++
 drivers/android/rust_binder.rs  | 196 ++++++++-
 drivers/android/rust_binderfs.c | 866 ++++++++++++++++++++++++++++++++++++++++
 include/linux/rust_binder.h     |  16 +
 include/uapi/linux/magic.h      |   1 +
 rust/bindings/bindings_helper.h |   2 +
 rust/kernel/lib.rs              |   7 +
 scripts/Makefile.build          |   2 +-
 12 files changed, 1547 insertions(+), 2 deletions(-)

diff --git a/drivers/android/Kconfig b/drivers/android/Kconfig
index fcfd25c9a016..82ed6ddabe1a 100644
--- a/drivers/android/Kconfig
+++ b/drivers/android/Kconfig
@@ -36,6 +36,18 @@ config ANDROID_BINDERFS
 	  It can be used to dynamically allocate new binder IPC devices via
 	  ioctls.
 
+config ANDROID_BINDERFS_RUST
+	bool "Android Binderfs filesystem in Rust"
+	depends on ANDROID_BINDER_IPC_RUST
+	default n
+	help
+	  Binderfs is a pseudo-filesystem for the Android Binder IPC driver
+	  which can be mounted per-ipc namespace allowing to run multiple
+	  instances of Android.
+	  Each binderfs mount initially only contains a binder-control device.
+	  It can be used to dynamically allocate new binder IPC devices via
+	  ioctls.
+
 config ANDROID_BINDER_DEVICES
 	string "Android Binder devices"
 	depends on ANDROID_BINDER_IPC
@@ -48,6 +60,18 @@ config ANDROID_BINDER_DEVICES
 	  created. Each binder device has its own context manager, and is
 	  therefore logically separated from the other devices.
 
+config ANDROID_BINDER_DEVICES_RUST
+	string "Android Binder devices in Rust"
+	depends on ANDROID_BINDER_IPC_RUST
+	default "binder,hwbinder,vndbinder"
+	help
+	  Default value for the binder.devices parameter.
+
+	  The binder.devices parameter is a comma-separated list of strings
+	  that specifies the names of the binder device nodes that will be
+	  created. Each binder device has its own context manager, and is
+	  therefore logically separated from the other devices.
+
 config ANDROID_BINDER_IPC_SELFTEST
 	bool "Android Binder IPC Driver Selftest"
 	depends on ANDROID_BINDER_IPC
diff --git a/drivers/android/Makefile b/drivers/android/Makefile
index 6348f75832ca..5c819011aa77 100644
--- a/drivers/android/Makefile
+++ b/drivers/android/Makefile
@@ -5,3 +5,4 @@ obj-$(CONFIG_ANDROID_BINDERFS)		+= binderfs.o
 obj-$(CONFIG_ANDROID_BINDER_IPC)	+= binder.o binder_alloc.o
 obj-$(CONFIG_ANDROID_BINDER_IPC_SELFTEST) += binder_alloc_selftest.o
 obj-$(CONFIG_ANDROID_BINDER_IPC_RUST)	+= rust_binder.o
+obj-$(CONFIG_ANDROID_BINDERFS_RUST)	+= rust_binderfs.o
diff --git a/drivers/android/context.rs b/drivers/android/context.rs
new file mode 100644
index 000000000000..630cb575d3ac
--- /dev/null
+++ b/drivers/android/context.rs
@@ -0,0 +1,144 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::{
+    list::{HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks},
+    prelude::*,
+    str::{CStr, CString},
+    sync::{Arc, Mutex},
+};
+
+use crate::process::Process;
+
+// This module defines the global variable containing the list of contexts. Since the
+// `kernel::sync` bindings currently don't support mutexes in globals, we use a temporary
+// workaround.
+//
+// TODO: Once `kernel::sync` has support for mutexes in globals, remove this module.
+mod context_global {
+    use super::ContextList;
+    use core::cell::UnsafeCell;
+    use core::mem::MaybeUninit;
+    use kernel::init::PinInit;
+    use kernel::list::List;
+    use kernel::sync::lock::mutex::{Mutex, MutexBackend};
+    use kernel::sync::lock::Guard;
+
+    /// A temporary wrapper used to define a mutex in a global.
+    pub(crate) struct Contexts {
+        inner: UnsafeCell<MaybeUninit<Mutex<ContextList>>>,
+    }
+
+    impl Contexts {
+        /// Called when the module is initialized.
+        pub(crate) fn init(&self) {
+            // SAFETY: This is only called during initialization of the binder module, so we know
+            // that the global is currently uninitialized and that nobody else is using it yet.
+            unsafe {
+                let ptr = self.inner.get() as *mut Mutex<ContextList>;
+                let init = kernel::new_mutex!(ContextList { list: List::new() }, "ContextList");
+                match init.__pinned_init(ptr) {
+                    Ok(()) => {}
+                    Err(e) => match e {},
+                }
+            }
+        }
+
+        pub(crate) fn lock(&self) -> Guard<'_, ContextList, MutexBackend> {
+            // SAFETY: The `init` method is called during initialization of the binder module, so the
+            // mutex is always initialized when this method is called.
+            unsafe {
+                let ptr = self.inner.get() as *const Mutex<ContextList>;
+                (*ptr).lock()
+            }
+        }
+    }
+
+    unsafe impl Send for Contexts {}
+    unsafe impl Sync for Contexts {}
+
+    pub(crate) static CONTEXTS: Contexts = Contexts {
+        inner: UnsafeCell::new(MaybeUninit::uninit()),
+    };
+}
+
+pub(crate) use self::context_global::CONTEXTS;
+
+pub(crate) struct ContextList {
+    list: List<Context>,
+}
+
+/// This struct keeps track of the processes using this context, and which process is the context
+/// manager.
+struct Manager {
+    all_procs: List<Process>,
+}
+
+/// There is one context per binder file (/dev/binder, /dev/hwbinder, etc.).
+#[pin_data]
+pub(crate) struct Context {
+    #[pin]
+    manager: Mutex<Manager>,
+    pub(crate) name: CString,
+    #[pin]
+    links: ListLinks,
+}
+
+kernel::list::impl_has_list_links! {
+    impl HasListLinks<0> for Context { self.links }
+}
+kernel::list::impl_list_arc_safe! {
+    impl ListArcSafe<0> for Context { untracked; }
+}
+kernel::list::impl_list_item! {
+    impl ListItem<0> for Context {
+        using ListLinks;
+    }
+}
+
+impl Context {
+    pub(crate) fn new(name: &CStr) -> Result<Arc<Self>> {
+        let name = CString::try_from(name)?;
+        let list_ctx = ListArc::pin_init(pin_init!(Context {
+            name,
+            links <- ListLinks::new(),
+            manager <- kernel::new_mutex!(Manager {
+                all_procs: List::new(),
+            }, "Context::manager"),
+        }))?;
+
+        let ctx = list_ctx.clone_arc();
+        CONTEXTS.lock().list.push_back(list_ctx);
+
+        Ok(ctx)
+    }
+
+    /// Called when the file for this context is unlinked.
+    ///
+    /// No-op if called twice.
+    pub(crate) fn deregister(&self) {
+        // SAFETY: We never add the context to any other linked list than this one, so it is either
+        // in this list, or not in any list.
+        unsafe {
+            CONTEXTS.lock().list.remove(self);
+        }
+    }
+
+    pub(crate) fn register_process(self: &Arc<Self>, proc: ListArc<Process>) {
+        if !Arc::ptr_eq(self, &proc.ctx) {
+            pr_err!("Context::register_process called on the wrong context.\n");
+            return;
+        }
+        self.manager.lock().all_procs.push_back(proc);
+    }
+
+    pub(crate) fn deregister_process(self: &Arc<Self>, proc: &Process) {
+        if !Arc::ptr_eq(self, &proc.ctx) {
+            pr_err!("Context::deregister_process called on the wrong context.\n");
+            return;
+        }
+        // SAFETY: We just checked that this is the right list.
+        unsafe {
+            self.manager.lock().all_procs.remove(proc);
+        }
+    }
+}
diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs
new file mode 100644
index 000000000000..8fdcb856ccad
--- /dev/null
+++ b/drivers/android/defs.rs
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use core::ops::{Deref, DerefMut};
+use kernel::{
+    bindings,
+    io_buffer::{ReadableFromBytes, WritableToBytes},
+};
+
+macro_rules! decl_wrapper {
+    ($newname:ident, $wrapped:ty) => {
+        #[derive(Copy, Clone, Default)]
+        #[repr(transparent)]
+        pub(crate) struct $newname($wrapped);
+        // SAFETY: This macro is only used with types where this is ok.
+        unsafe impl ReadableFromBytes for $newname {}
+        unsafe impl WritableToBytes for $newname {}
+        impl Deref for $newname {
+            type Target = $wrapped;
+            fn deref(&self) -> &Self::Target {
+                &self.0
+            }
+        }
+        impl DerefMut for $newname {
+            fn deref_mut(&mut self) -> &mut Self::Target {
+                &mut self.0
+            }
+        }
+    };
+}
+
+decl_wrapper!(BinderVersion, bindings::binder_version);
+
+impl BinderVersion {
+    pub(crate) fn current() -> Self {
+        Self(bindings::binder_version {
+            protocol_version: bindings::BINDER_CURRENT_PROTOCOL_VERSION as _,
+        })
+    }
+}
diff --git a/drivers/android/process.rs b/drivers/android/process.rs
new file mode 100644
index 000000000000..2f16e4cedbf1
--- /dev/null
+++ b/drivers/android/process.rs
@@ -0,0 +1,251 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! This module defines the `Process` type, which represents a process using a particular binder
+//! context.
+//!
+//! The `Process` object keeps track of all of the resources that this process owns in the binder
+//! context.
+//!
+//! There is one `Process` object for each binder fd that a process has opened, so processes using
+//! several binder contexts have several `Process` objects. This ensures that the contexts are
+//! fully separated.
+
+use kernel::{
+    bindings,
+    cred::Credential,
+    file::{File, PollTable},
+    io_buffer::IoBufferWriter,
+    list::{HasListLinks, ListArc, ListArcSafe, ListItem, ListLinks},
+    mm,
+    prelude::*,
+    sync::{Arc, ArcBorrow, SpinLock},
+    task::Task,
+    types::ARef,
+    user_ptr::{UserSlicePtr, UserSlicePtrReader},
+    workqueue::{self, Work},
+};
+
+use crate::{context::Context, defs::*};
+
+const PROC_DEFER_FLUSH: u8 = 1;
+const PROC_DEFER_RELEASE: u8 = 2;
+
+/// The fields of `Process` protected by the spinlock.
+pub(crate) struct ProcessInner {
+    is_dead: bool,
+
+    /// Bitmap of deferred work to do.
+    defer_work: u8,
+}
+
+impl ProcessInner {
+    fn new() -> Self {
+        Self {
+            is_dead: false,
+            defer_work: 0,
+        }
+    }
+}
+
+/// A process using binder.
+///
+/// Strictly speaking, there can be multiple of these per process. There is one for each binder fd
+/// that a process has opened, so processes using several binder contexts have several `Process`
+/// objects. This ensures that the contexts are fully separated.
+#[pin_data]
+pub(crate) struct Process {
+    pub(crate) ctx: Arc<Context>,
+
+    // The task leader (process).
+    pub(crate) task: ARef<Task>,
+
+    // Credential associated with file when `Process` is created.
+    pub(crate) cred: ARef<Credential>,
+
+    #[pin]
+    pub(crate) inner: SpinLock<ProcessInner>,
+
+    // Work node for deferred work item.
+    #[pin]
+    defer_work: Work<Process>,
+
+    // Links for process list in Context.
+    #[pin]
+    links: ListLinks,
+}
+
+kernel::impl_has_work! {
+    impl HasWork<Process> for Process { self.defer_work }
+}
+
+kernel::list::impl_has_list_links! {
+    impl HasListLinks<0> for Process { self.links }
+}
+kernel::list::impl_list_arc_safe! {
+    impl ListArcSafe<0> for Process { untracked; }
+}
+kernel::list::impl_list_item! {
+    impl ListItem<0> for Process {
+        using ListLinks;
+    }
+}
+
+impl workqueue::WorkItem for Process {
+    type Pointer = Arc<Process>;
+
+    fn run(me: Arc<Self>) {
+        let defer;
+        {
+            let mut inner = me.inner.lock();
+            defer = inner.defer_work;
+            inner.defer_work = 0;
+        }
+
+        if defer & PROC_DEFER_FLUSH != 0 {
+            me.deferred_flush();
+        }
+        if defer & PROC_DEFER_RELEASE != 0 {
+            me.deferred_release();
+        }
+    }
+}
+
+impl Process {
+    fn new(ctx: Arc<Context>, cred: ARef<Credential>) -> Result<Arc<Self>> {
+        let list_process = ListArc::pin_init(pin_init!(Process {
+            ctx,
+            cred,
+            inner <- kernel::new_spinlock!(ProcessInner::new(), "Process::inner"),
+            task: kernel::current!().group_leader().into(),
+            defer_work <- kernel::new_work!("Process::defer_work"),
+            links <- ListLinks::new(),
+        }))?;
+
+        let process = list_process.clone_arc();
+        process.ctx.register_process(list_process);
+
+        Ok(process)
+    }
+
+    fn version(&self, data: UserSlicePtr) -> Result {
+        data.writer().write(&BinderVersion::current())
+    }
+
+    fn deferred_flush(&self) {
+        // NOOP for now.
+    }
+
+    fn deferred_release(self: Arc<Self>) {
+        self.inner.lock().is_dead = true;
+
+        self.ctx.deregister_process(&self);
+    }
+
+    pub(crate) fn flush(this: ArcBorrow<'_, Process>) -> Result {
+        let should_schedule;
+        {
+            let mut inner = this.inner.lock();
+            should_schedule = inner.defer_work == 0;
+            inner.defer_work |= PROC_DEFER_FLUSH;
+        }
+
+        if should_schedule {
+            // Ignore failures to schedule to the workqueue. Those just mean that we're already
+            // scheduled for execution.
+            let _ = workqueue::system().enqueue(Arc::from(this));
+        }
+        Ok(())
+    }
+}
+
+/// The ioctl handler.
+impl Process {
+    fn write(
+        _this: ArcBorrow<'_, Process>,
+        _file: &File,
+        _cmd: u32,
+        _reader: &mut UserSlicePtrReader,
+    ) -> Result<i32> {
+        Err(EINVAL)
+    }
+
+    fn read_write(
+        this: ArcBorrow<'_, Process>,
+        _file: &File,
+        cmd: u32,
+        data: UserSlicePtr,
+    ) -> Result<i32> {
+        match cmd {
+            bindings::BINDER_VERSION => this.version(data)?,
+            _ => return Err(EINVAL),
+        }
+        Ok(0)
+    }
+}
+
+/// The file operations supported by `Process`.
+impl Process {
+    pub(crate) fn open(ctx: ArcBorrow<'_, Context>, file: &File) -> Result<Arc<Process>> {
+        Self::new(ctx.into(), ARef::from(file.cred()))
+    }
+
+    pub(crate) fn release(this: Arc<Process>, _file: &File) {
+        let should_schedule;
+        {
+            let mut inner = this.inner.lock();
+            should_schedule = inner.defer_work == 0;
+            inner.defer_work |= PROC_DEFER_RELEASE;
+        }
+
+        if should_schedule {
+            // Ignore failures to schedule to the workqueue. Those just mean that we're already
+            // scheduled for execution.
+            let _ = workqueue::system().enqueue(this);
+        }
+    }
+
+    pub(crate) fn ioctl(
+        this: ArcBorrow<'_, Process>,
+        file: &File,
+        cmd: u32,
+        arg: *mut core::ffi::c_void,
+    ) -> Result<i32> {
+        use kernel::ioctl::{_IOC_DIR, _IOC_SIZE};
+        use kernel::uapi::{_IOC_READ, _IOC_WRITE};
+
+        let user_slice = UserSlicePtr::new(arg, _IOC_SIZE(cmd));
+
+        const _IOC_READ_WRITE: u32 = _IOC_READ | _IOC_WRITE;
+
+        match _IOC_DIR(cmd) {
+            _IOC_WRITE => Self::write(this, file, cmd, &mut user_slice.reader()),
+            _IOC_READ_WRITE => Self::read_write(this, file, cmd, user_slice),
+            _ => Err(EINVAL),
+        }
+    }
+
+    pub(crate) fn compat_ioctl(
+        this: ArcBorrow<'_, Process>,
+        file: &File,
+        cmd: u32,
+        arg: *mut core::ffi::c_void,
+    ) -> Result<i32> {
+        Self::ioctl(this, file, cmd, arg)
+    }
+
+    pub(crate) fn mmap(
+        _this: ArcBorrow<'_, Process>,
+        _file: &File,
+        _vma: &mut mm::virt::Area,
+    ) -> Result {
+        Err(EINVAL)
+    }
+
+    pub(crate) fn poll(
+        _this: ArcBorrow<'_, Process>,
+        _file: &File,
+        _table: &mut PollTable,
+    ) -> Result<u32> {
+        Err(EINVAL)
+    }
+}
diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs
index 4b3d6676a9cf..6de2f40846fb 100644
--- a/drivers/android/rust_binder.rs
+++ b/drivers/android/rust_binder.rs
@@ -2,7 +2,19 @@
 
 //! Binder -- the Android IPC mechanism.
 
-use kernel::prelude::*;
+use kernel::{
+    bindings::{self, seq_file},
+    file::{File, PollTable},
+    prelude::*,
+    sync::Arc,
+    types::ForeignOwnable,
+};
+
+use crate::{context::Context, process::Process};
+
+mod context;
+mod defs;
+mod process;
 
 module! {
     type: BinderModule,
@@ -16,6 +28,188 @@ struct BinderModule {}
 
 impl kernel::Module for BinderModule {
     fn init(_module: &'static kernel::ThisModule) -> Result<Self> {
+        crate::context::CONTEXTS.init();
+
+        // SAFETY: The module is being loaded, so we can initialize binderfs.
+        #[cfg(CONFIG_ANDROID_BINDERFS_RUST)]
+        unsafe {
+            kernel::error::to_result(bindings::init_rust_binderfs())?;
+        }
+
         Ok(Self {})
     }
 }
+
+/// Makes the inner type Sync.
+#[repr(transparent)]
+pub struct AssertSync<T>(T);
+// SAFETY: Used only to insert `file_operations` into a global, which is safe.
+unsafe impl<T> Sync for AssertSync<T> {}
+
+/// File operations that rust_binderfs.c can use.
+#[no_mangle]
+#[used]
+pub static rust_binder_fops: AssertSync<kernel::bindings::file_operations> = {
+    // SAFETY: All zeroes is safe for the `file_operations` type.
+    let zeroed_ops = unsafe { core::mem::MaybeUninit::zeroed().assume_init() };
+
+    let ops = kernel::bindings::file_operations {
+        owner: THIS_MODULE.as_ptr(),
+        poll: Some(rust_binder_poll),
+        unlocked_ioctl: Some(rust_binder_unlocked_ioctl),
+        compat_ioctl: Some(rust_binder_compat_ioctl),
+        mmap: Some(rust_binder_mmap),
+        open: Some(rust_binder_open),
+        release: Some(rust_binder_release),
+        mmap_supported_flags: 0,
+        flush: Some(rust_binder_flush),
+        ..zeroed_ops
+    };
+    AssertSync(ops)
+};
+
+#[no_mangle]
+unsafe extern "C" fn rust_binder_new_device(
+    name: *const core::ffi::c_char,
+) -> *mut core::ffi::c_void {
+    // SAFETY: The caller will always provide a valid C string here.
+    let name = unsafe { kernel::str::CStr::from_char_ptr(name) };
+    match Context::new(name) {
+        Ok(ctx) => Arc::into_foreign(ctx).cast_mut(),
+        Err(_err) => core::ptr::null_mut(),
+    }
+}
+
+#[no_mangle]
+unsafe extern "C" fn rust_binder_remove_device(device: *mut core::ffi::c_void) {
+    if !device.is_null() {
+        // SAFETY: The caller ensures that the `device` pointer came from a previous call to
+        // `rust_binder_new_device`.
+        let ctx = unsafe { Arc::<Context>::from_foreign(device) };
+        ctx.deregister();
+        drop(ctx);
+    }
+}
+
+unsafe extern "C" fn rust_binder_open(
+    inode: *mut bindings::inode,
+    file_ptr: *mut bindings::file,
+) -> core::ffi::c_int {
+    // SAFETY: The `rust_binderfs.c` file ensures that `i_private` is set to the return value of a
+    // successful call to `rust_binder_new_device`.
+    let ctx = unsafe { Arc::<Context>::borrow((*inode).i_private) };
+
+    // SAFETY: The caller provides a valid file pointer to a new `struct file`.
+    let file = unsafe { File::from_ptr(file_ptr) };
+    let process = match Process::open(ctx, file) {
+        Ok(process) => process,
+        Err(err) => return err.to_errno(),
+    };
+    // SAFETY: This file is associated with Rust binder, so we own the `private_data` field.
+    unsafe {
+        (*file_ptr).private_data = process.into_foreign().cast_mut();
+    }
+    0
+}
+
+unsafe extern "C" fn rust_binder_release(
+    _inode: *mut bindings::inode,
+    file: *mut bindings::file,
+) -> core::ffi::c_int {
+    // SAFETY: We previously set `private_data` in `rust_binder_open`.
+    let process = unsafe { Arc::<Process>::from_foreign((*file).private_data) };
+    // SAFETY: The caller ensures that the file is valid.
+    let file = unsafe { File::from_ptr(file) };
+    Process::release(process, file);
+    0
+}
+
+unsafe extern "C" fn rust_binder_compat_ioctl(
+    file: *mut bindings::file,
+    cmd: core::ffi::c_uint,
+    arg: core::ffi::c_ulong,
+) -> core::ffi::c_long {
+    // SAFETY: We previously set `private_data` in `rust_binder_open`.
+    let f = unsafe { Arc::<Process>::borrow((*file).private_data) };
+    // SAFETY: The caller ensures that the file is valid.
+    match Process::compat_ioctl(f, unsafe { File::from_ptr(file) }, cmd as _, arg as _) {
+        Ok(ret) => ret.into(),
+        Err(err) => err.to_errno().into(),
+    }
+}
+
+unsafe extern "C" fn rust_binder_unlocked_ioctl(
+    file: *mut bindings::file,
+    cmd: core::ffi::c_uint,
+    arg: core::ffi::c_ulong,
+) -> core::ffi::c_long {
+    // SAFETY: We previously set `private_data` in `rust_binder_open`.
+    let f = unsafe { Arc::<Process>::borrow((*file).private_data) };
+    // SAFETY: The caller ensures that the file is valid.
+    match Process::ioctl(f, unsafe { File::from_ptr(file) }, cmd as _, arg as _) {
+        Ok(ret) => ret.into(),
+        Err(err) => err.to_errno().into(),
+    }
+}
+
+unsafe extern "C" fn rust_binder_mmap(
+    file: *mut bindings::file,
+    vma: *mut bindings::vm_area_struct,
+) -> core::ffi::c_int {
+    // SAFETY: We previously set `private_data` in `rust_binder_open`.
+    let f = unsafe { Arc::<Process>::borrow((*file).private_data) };
+    // SAFETY: The caller ensures that the vma is valid.
+    let area = unsafe { kernel::mm::virt::Area::from_ptr_mut(vma) };
+    // SAFETY: The caller ensures that the file is valid.
+    match Process::mmap(f, unsafe { File::from_ptr(file) }, area) {
+        Ok(()) => 0,
+        Err(err) => err.to_errno(),
+    }
+}
+
+unsafe extern "C" fn rust_binder_poll(
+    file: *mut bindings::file,
+    wait: *mut bindings::poll_table_struct,
+) -> bindings::__poll_t {
+    // SAFETY: We previously set `private_data` in `rust_binder_open`.
+    let f = unsafe { Arc::<Process>::borrow((*file).private_data) };
+    // SAFETY: The caller ensures that the file is valid.
+    let fileref = unsafe { File::from_ptr(file) };
+    // SAFETY: The caller ensures that the `PollTable` is valid.
+    match Process::poll(f, fileref, unsafe { PollTable::from_ptr(wait) }) {
+        Ok(v) => v,
+        Err(_) => bindings::POLLERR,
+    }
+}
+
+unsafe extern "C" fn rust_binder_flush(
+    file: *mut bindings::file,
+    _id: bindings::fl_owner_t,
+) -> core::ffi::c_int {
+    // SAFETY: We previously set `private_data` in `rust_binder_open`.
+    let f = unsafe { Arc::<Process>::borrow((*file).private_data) };
+    match Process::flush(f) {
+        Ok(()) => 0,
+        Err(err) => err.to_errno(),
+    }
+}
+
+#[no_mangle]
+unsafe extern "C" fn rust_binder_stats_show(_: *mut seq_file) -> core::ffi::c_int {
+    0
+}
+
+#[no_mangle]
+unsafe extern "C" fn rust_binder_state_show(_: *mut seq_file) -> core::ffi::c_int {
+    0
+}
+
+#[no_mangle]
+unsafe extern "C" fn rust_binder_transactions_show(_: *mut seq_file) -> core::ffi::c_int {
+    0
+}
+
+#[no_mangle]
+unsafe extern "C" fn rust_binder_transaction_log_show(_: *mut seq_file) -> core::ffi::c_int {
+    0
+}
diff --git a/drivers/android/rust_binderfs.c b/drivers/android/rust_binderfs.c
new file mode 100644
index 000000000000..2c011e26752c
--- /dev/null
+++ b/drivers/android/rust_binderfs.c
@@ -0,0 +1,866 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/compiler_types.h>
+#include <linux/errno.h>
+#include <linux/fs.h>
+#include <linux/fsnotify.h>
+#include <linux/gfp.h>
+#include <linux/idr.h>
+#include <linux/init.h>
+#include <linux/ipc_namespace.h>
+#include <linux/kdev_t.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/namei.h>
+#include <linux/magic.h>
+#include <linux/major.h>
+#include <linux/miscdevice.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/mount.h>
+#include <linux/fs_parser.h>
+#include <linux/rust_binder.h>
+#include <linux/sched.h>
+#include <linux/seq_file.h>
+#include <linux/slab.h>
+#include <linux/spinlock_types.h>
+#include <linux/stddef.h>
+#include <linux/string.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+#include <linux/user_namespace.h>
+#include <linux/xarray.h>
+#include <uapi/asm-generic/errno-base.h>
+#include <uapi/linux/android/binder.h>
+#include <uapi/linux/android/binderfs.h>
+
+#include "binder_internal.h"
+
+#define FIRST_INODE 1
+#define SECOND_INODE 2
+#define INODE_OFFSET 3
+#define BINDERFS_MAX_MINOR (1U << MINORBITS)
+/* Ensure that the initial ipc namespace always has devices available. */
+#define BINDERFS_MAX_MINOR_CAPPED (BINDERFS_MAX_MINOR - 4)
+
+/* === DEFINED IN RUST === */
+extern int rust_binder_stats_show(struct seq_file *m, void *unused);
+DEFINE_SHOW_ATTRIBUTE(rust_binder_stats);
+
+extern int rust_binder_state_show(struct seq_file *m, void *unused);
+DEFINE_SHOW_ATTRIBUTE(rust_binder_state);
+
+extern int rust_binder_transactions_show(struct seq_file *m, void *unused);
+DEFINE_SHOW_ATTRIBUTE(rust_binder_transactions);
+
+extern int rust_binder_transaction_log_show(struct seq_file *m, void *unused);
+DEFINE_SHOW_ATTRIBUTE(rust_binder_transaction_log);
+
+extern const struct file_operations rust_binder_fops;
+extern rust_binder_device rust_binder_new_device(char *name);
+extern void rust_binder_remove_device(rust_binder_device device);
+/* === END DEFINED IN RUST === */
+
+char *rust_binder_devices_param = CONFIG_ANDROID_BINDER_DEVICES_RUST;
+module_param_named(rust_devices, rust_binder_devices_param, charp, 0444);
+
+static dev_t binderfs_dev;
+static DEFINE_MUTEX(binderfs_minors_mutex);
+static DEFINE_IDA(binderfs_minors);
+
+enum binderfs_param {
+	Opt_max,
+	Opt_stats_mode,
+};
+
+enum binderfs_stats_mode {
+	binderfs_stats_mode_unset,
+	binderfs_stats_mode_global,
+};
+
+struct binder_features {
+	bool oneway_spam_detection;
+	bool extended_error;
+};
+
+static const struct constant_table binderfs_param_stats[] = {
+	{ "global", binderfs_stats_mode_global },
+	{}
+};
+
+static const struct fs_parameter_spec binderfs_fs_parameters[] = {
+	fsparam_u32("max",	Opt_max),
+	fsparam_enum("stats",	Opt_stats_mode, binderfs_param_stats),
+	{}
+};
+
+static struct binder_features binder_features = {
+	.oneway_spam_detection = true,
+	.extended_error = true,
+};
+
+static inline struct binderfs_info *BINDERFS_SB(const struct super_block *sb)
+{
+	return sb->s_fs_info;
+}
+
+bool is_rust_binderfs_device(const struct inode *inode)
+{
+	if (inode->i_sb->s_magic == RUST_BINDERFS_SUPER_MAGIC)
+		return true;
+
+	return false;
+}
+
+/**
+ * binderfs_binder_device_create - allocate inode from super block of a
+ *                                 binderfs mount
+ * @ref_inode: inode from which the super block will be taken
+ * @userp:     buffer to copy information about new device for userspace to
+ * @req:       struct binderfs_device as copied from userspace
+ *
+ * This function reserves a new minor number and creates a new Rust binder
+ * device for it by calling rust_binder_new_device().
+ * Minor numbers are limited and tracked globally in binderfs_minors.
+ * The function then allocates a new inode from the super block of the
+ * filesystem mount, stashes the rust_binder_device handle in the inode's
+ * i_private field and attaches a dentry to that inode.
+ * On failure, the minor number is released and the device is removed
+ * again.
+ *
+ * Return: 0 on success, negative errno on failure
+ */
+static int binderfs_binder_device_create(struct inode *ref_inode,
+					 struct binderfs_device __user *userp,
+					 struct binderfs_device *req)
+{
+	int minor, ret;
+	struct dentry *dentry, *root;
+	rust_binder_device device = NULL;
+	char *name = NULL;
+	size_t name_len;
+	struct inode *inode = NULL;
+	struct super_block *sb = ref_inode->i_sb;
+	struct binderfs_info *info = sb->s_fs_info;
+#if defined(CONFIG_IPC_NS)
+	bool use_reserve = (info->ipc_ns == &init_ipc_ns);
+#else
+	bool use_reserve = true;
+#endif
+
+	/* Reserve new minor number for the new device. */
+	mutex_lock(&binderfs_minors_mutex);
+	if (++info->device_count <= info->mount_opts.max)
+		minor = ida_alloc_max(&binderfs_minors,
+				      use_reserve ? BINDERFS_MAX_MINOR :
+						    BINDERFS_MAX_MINOR_CAPPED,
+				      GFP_KERNEL);
+	else
+		minor = -ENOSPC;
+	if (minor < 0) {
+		--info->device_count;
+		mutex_unlock(&binderfs_minors_mutex);
+		return minor;
+	}
+	mutex_unlock(&binderfs_minors_mutex);
+
+	ret = -ENOMEM;
+	req->name[BINDERFS_MAX_NAME] = '\0'; /* NUL-terminate */
+	name_len = strlen(req->name);
+	/* Make sure to include terminating NUL byte */
+	name = kmemdup(req->name, name_len + 1, GFP_KERNEL);
+	if (!name)
+		goto err;
+
+	device = rust_binder_new_device(name);
+	if (!device)
+		goto err;
+
+	inode = new_inode(sb);
+	if (!inode)
+		goto err;
+
+	inode->i_ino = minor + INODE_OFFSET;
+	simple_inode_init_ts(inode);
+	init_special_inode(inode, S_IFCHR | 0600,
+			   MKDEV(MAJOR(binderfs_dev), minor));
+	inode->i_fop = &rust_binder_fops;
+	inode->i_uid = info->root_uid;
+	inode->i_gid = info->root_gid;
+
+	req->major = MAJOR(binderfs_dev);
+	req->minor = minor;
+
+	if (userp && copy_to_user(userp, req, sizeof(*req))) {
+		ret = -EFAULT;
+		goto err;
+	}
+
+	root = sb->s_root;
+	inode_lock(d_inode(root));
+
+	/* look it up */
+	dentry = lookup_one_len(name, root, name_len);
+	if (IS_ERR(dentry)) {
+		inode_unlock(d_inode(root));
+		ret = PTR_ERR(dentry);
+		goto err;
+	}
+
+	if (d_really_is_positive(dentry)) {
+		/* already exists */
+		dput(dentry);
+		inode_unlock(d_inode(root));
+		ret = -EEXIST;
+		goto err;
+	}
+
+	inode->i_private = device;
+	d_instantiate(dentry, inode);
+	fsnotify_create(root->d_inode, dentry);
+	inode_unlock(d_inode(root));
+
+	return 0;
+
+err:
+	kfree(name);
+	rust_binder_remove_device(device);
+	mutex_lock(&binderfs_minors_mutex);
+	--info->device_count;
+	ida_free(&binderfs_minors, minor);
+	mutex_unlock(&binderfs_minors_mutex);
+	iput(inode);
+
+	return ret;
+}
+
+/**
+ * binder_ctl_ioctl - handle binder device node allocation requests
+ *
+ * The request handler for the binder-control device. All requests operate on
+ * the binderfs mount the binder-control device resides in:
+ * - BINDER_CTL_ADD
+ *   Allocate a new binder device.
+ *
+ * Return: %0 on success, negative errno on failure.
+ */
+static long binder_ctl_ioctl(struct file *file, unsigned int cmd,
+			     unsigned long arg)
+{
+	int ret = -EINVAL;
+	struct inode *inode = file_inode(file);
+	struct binderfs_device __user *device = (struct binderfs_device __user *)arg;
+	struct binderfs_device device_req;
+
+	switch (cmd) {
+	case BINDER_CTL_ADD:
+		ret = copy_from_user(&device_req, device, sizeof(device_req));
+		if (ret) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret = binderfs_binder_device_create(inode, device, &device_req);
+		break;
+	default:
+		break;
+	}
+
+	return ret;
+}
+
+static void binderfs_evict_inode(struct inode *inode)
+{
+	rust_binder_device device = inode->i_private;
+	struct binderfs_info *info = BINDERFS_SB(inode->i_sb);
+	int minor = inode->i_ino - INODE_OFFSET;
+
+	clear_inode(inode);
+
+	if (!S_ISCHR(inode->i_mode) || !device)
+		return;
+
+	mutex_lock(&binderfs_minors_mutex);
+	--info->device_count;
+	ida_free(&binderfs_minors, minor);
+	mutex_unlock(&binderfs_minors_mutex);
+
+	rust_binder_remove_device(device);
+}
+
+static int binderfs_fs_context_parse_param(struct fs_context *fc,
+					   struct fs_parameter *param)
+{
+	int opt;
+	struct binderfs_mount_opts *ctx = fc->fs_private;
+	struct fs_parse_result result;
+
+	opt = fs_parse(fc, binderfs_fs_parameters, param, &result);
+	if (opt < 0)
+		return opt;
+
+	switch (opt) {
+	case Opt_max:
+		if (result.uint_32 > BINDERFS_MAX_MINOR)
+			return invalfc(fc, "Bad value for '%s'", param->key);
+
+		ctx->max = result.uint_32;
+		break;
+	case Opt_stats_mode:
+		if (!capable(CAP_SYS_ADMIN))
+			return -EPERM;
+
+		ctx->stats_mode = result.uint_32;
+		break;
+	default:
+		return invalfc(fc, "Unsupported parameter '%s'", param->key);
+	}
+
+	return 0;
+}
+
+static int binderfs_fs_context_reconfigure(struct fs_context *fc)
+{
+	struct binderfs_mount_opts *ctx = fc->fs_private;
+	struct binderfs_info *info = BINDERFS_SB(fc->root->d_sb);
+
+	if (info->mount_opts.stats_mode != ctx->stats_mode)
+		return invalfc(fc, "Binderfs stats mode cannot be changed during a remount");
+
+	info->mount_opts.stats_mode = ctx->stats_mode;
+	info->mount_opts.max = ctx->max;
+	return 0;
+}
+
+static int binderfs_show_options(struct seq_file *seq, struct dentry *root)
+{
+	struct binderfs_info *info = BINDERFS_SB(root->d_sb);
+
+	if (info->mount_opts.max <= BINDERFS_MAX_MINOR)
+		seq_printf(seq, ",max=%d", info->mount_opts.max);
+
+	switch (info->mount_opts.stats_mode) {
+	case binderfs_stats_mode_unset:
+		break;
+	case binderfs_stats_mode_global:
+		seq_printf(seq, ",stats=global");
+		break;
+	}
+
+	return 0;
+}
+
+static const struct super_operations binderfs_super_ops = {
+	.evict_inode    = binderfs_evict_inode,
+	.show_options	= binderfs_show_options,
+	.statfs         = simple_statfs,
+};
+
+static inline bool is_binderfs_control_device(const struct dentry *dentry)
+{
+	struct binderfs_info *info = dentry->d_sb->s_fs_info;
+
+	return info->control_dentry == dentry;
+}
+
+static int binderfs_rename(struct mnt_idmap *idmap,
+			   struct inode *old_dir, struct dentry *old_dentry,
+			   struct inode *new_dir, struct dentry *new_dentry,
+			   unsigned int flags)
+{
+	if (is_binderfs_control_device(old_dentry) ||
+	    is_binderfs_control_device(new_dentry))
+		return -EPERM;
+
+	return simple_rename(idmap, old_dir, old_dentry, new_dir,
+			     new_dentry, flags);
+}
+
+static int binderfs_unlink(struct inode *dir, struct dentry *dentry)
+{
+	if (is_binderfs_control_device(dentry))
+		return -EPERM;
+
+	return simple_unlink(dir, dentry);
+}
+
+static const struct file_operations binder_ctl_fops = {
+	.owner		= THIS_MODULE,
+	.open		= nonseekable_open,
+	.unlocked_ioctl	= binder_ctl_ioctl,
+	.compat_ioctl	= binder_ctl_ioctl,
+	.llseek		= noop_llseek,
+};
+
+/**
+ * binderfs_binder_ctl_create - create a new binder-control device
+ * @sb: super block of the binderfs mount
+ *
+ * This function creates a new binder-control device node in the binderfs mount
+ * referred to by @sb.
+ *
+ * Return: 0 on success, negative errno on failure
+ */
+static int binderfs_binder_ctl_create(struct super_block *sb)
+{
+	int minor, ret;
+	struct dentry *dentry;
+	struct binder_device *device;
+	struct inode *inode = NULL;
+	struct dentry *root = sb->s_root;
+	struct binderfs_info *info = sb->s_fs_info;
+#if defined(CONFIG_IPC_NS)
+	bool use_reserve = (info->ipc_ns == &init_ipc_ns);
+#else
+	bool use_reserve = true;
+#endif
+
+	device = kzalloc(sizeof(*device), GFP_KERNEL);
+	if (!device)
+		return -ENOMEM;
+
+	/* If we have already created a binder-control node, return. */
+	if (info->control_dentry) {
+		ret = 0;
+		goto out;
+	}
+
+	ret = -ENOMEM;
+	inode = new_inode(sb);
+	if (!inode)
+		goto out;
+
+	/* Reserve a new minor number for the new device. */
+	mutex_lock(&binderfs_minors_mutex);
+	minor = ida_alloc_max(&binderfs_minors,
+			      use_reserve ? BINDERFS_MAX_MINOR :
+					    BINDERFS_MAX_MINOR_CAPPED,
+			      GFP_KERNEL);
+	mutex_unlock(&binderfs_minors_mutex);
+	if (minor < 0) {
+		ret = minor;
+		goto out;
+	}
+
+	inode->i_ino = SECOND_INODE;
+	simple_inode_init_ts(inode);
+	init_special_inode(inode, S_IFCHR | 0600,
+			   MKDEV(MAJOR(binderfs_dev), minor));
+	inode->i_fop = &binder_ctl_fops;
+	inode->i_uid = info->root_uid;
+	inode->i_gid = info->root_gid;
+
+	refcount_set(&device->ref, 1);
+	device->binderfs_inode = inode;
+	device->miscdev.minor = minor;
+
+	dentry = d_alloc_name(root, "binder-control");
+	if (!dentry)
+		goto out;
+
+	inode->i_private = device;
+	info->control_dentry = dentry;
+	d_add(dentry, inode);
+
+	return 0;
+
+out:
+	kfree(device);
+	iput(inode);
+
+	return ret;
+}
+
+static const struct inode_operations binderfs_dir_inode_operations = {
+	.lookup = simple_lookup,
+	.rename = binderfs_rename,
+	.unlink = binderfs_unlink,
+};
+
+static struct inode *binderfs_make_inode(struct super_block *sb, int mode)
+{
+	struct inode *ret;
+
+	ret = new_inode(sb);
+	if (ret) {
+		ret->i_ino = iunique(sb, BINDERFS_MAX_MINOR + INODE_OFFSET);
+		ret->i_mode = mode;
+		simple_inode_init_ts(ret);
+	}
+	return ret;
+}
+
+static struct dentry *binderfs_create_dentry(struct dentry *parent,
+					     const char *name)
+{
+	struct dentry *dentry;
+
+	dentry = lookup_one_len(name, parent, strlen(name));
+	if (IS_ERR(dentry))
+		return dentry;
+
+	/* Return error if the file/dir already exists. */
+	if (d_really_is_positive(dentry)) {
+		dput(dentry);
+		return ERR_PTR(-EEXIST);
+	}
+
+	return dentry;
+}
+
+void rust_binderfs_remove_file(struct dentry *dentry)
+{
+	struct inode *parent_inode;
+
+	parent_inode = d_inode(dentry->d_parent);
+	inode_lock(parent_inode);
+	if (simple_positive(dentry)) {
+		dget(dentry);
+		simple_unlink(parent_inode, dentry);
+		d_delete(dentry);
+		dput(dentry);
+	}
+	inode_unlock(parent_inode);
+}
+
+struct dentry *rust_binderfs_create_file(struct dentry *parent, const char *name,
+					 const struct file_operations *fops,
+					 void *data)
+{
+	struct dentry *dentry;
+	struct inode *new_inode, *parent_inode;
+	struct super_block *sb;
+
+	parent_inode = d_inode(parent);
+	inode_lock(parent_inode);
+
+	dentry = binderfs_create_dentry(parent, name);
+	if (IS_ERR(dentry))
+		goto out;
+
+	sb = parent_inode->i_sb;
+	new_inode = binderfs_make_inode(sb, S_IFREG | 0444);
+	if (!new_inode) {
+		dput(dentry);
+		dentry = ERR_PTR(-ENOMEM);
+		goto out;
+	}
+
+	new_inode->i_fop = fops;
+	new_inode->i_private = data;
+	d_instantiate(dentry, new_inode);
+	fsnotify_create(parent_inode, dentry);
+
+out:
+	inode_unlock(parent_inode);
+	return dentry;
+}
+
+static struct dentry *binderfs_create_dir(struct dentry *parent,
+					  const char *name)
+{
+	struct dentry *dentry;
+	struct inode *new_inode, *parent_inode;
+	struct super_block *sb;
+
+	parent_inode = d_inode(parent);
+	inode_lock(parent_inode);
+
+	dentry = binderfs_create_dentry(parent, name);
+	if (IS_ERR(dentry))
+		goto out;
+
+	sb = parent_inode->i_sb;
+	new_inode = binderfs_make_inode(sb, S_IFDIR | 0755);
+	if (!new_inode) {
+		dput(dentry);
+		dentry = ERR_PTR(-ENOMEM);
+		goto out;
+	}
+
+	new_inode->i_fop = &simple_dir_operations;
+	new_inode->i_op = &simple_dir_inode_operations;
+
+	set_nlink(new_inode, 2);
+	d_instantiate(dentry, new_inode);
+	inc_nlink(parent_inode);
+	fsnotify_mkdir(parent_inode, dentry);
+
+out:
+	inode_unlock(parent_inode);
+	return dentry;
+}
+
+static int binder_features_show(struct seq_file *m, void *unused)
+{
+	bool *feature = m->private;
+
+	seq_printf(m, "%d\n", *feature);
+
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(binder_features);
+
+static int init_binder_features(struct super_block *sb)
+{
+	struct dentry *dentry, *dir;
+
+	dir = binderfs_create_dir(sb->s_root, "features");
+	if (IS_ERR(dir))
+		return PTR_ERR(dir);
+
+	dentry = rust_binderfs_create_file(dir, "oneway_spam_detection",
+				      &binder_features_fops,
+				      &binder_features.oneway_spam_detection);
+	if (IS_ERR(dentry))
+		return PTR_ERR(dentry);
+
+	dentry = rust_binderfs_create_file(dir, "extended_error",
+				      &binder_features_fops,
+				      &binder_features.extended_error);
+	if (IS_ERR(dentry))
+		return PTR_ERR(dentry);
+
+	return 0;
+}
+
+static int init_binder_logs(struct super_block *sb)
+{
+	struct dentry *binder_logs_root_dir, *dentry, *proc_log_dir;
+	struct binderfs_info *info;
+	int ret = 0;
+
+	binder_logs_root_dir = binderfs_create_dir(sb->s_root,
+						   "binder_logs");
+	if (IS_ERR(binder_logs_root_dir)) {
+		ret = PTR_ERR(binder_logs_root_dir);
+		goto out;
+	}
+
+	dentry = rust_binderfs_create_file(binder_logs_root_dir, "stats",
+				      &rust_binder_stats_fops, NULL);
+	if (IS_ERR(dentry)) {
+		ret = PTR_ERR(dentry);
+		goto out;
+	}
+
+	dentry = rust_binderfs_create_file(binder_logs_root_dir, "state",
+				      &rust_binder_state_fops, NULL);
+	if (IS_ERR(dentry)) {
+		ret = PTR_ERR(dentry);
+		goto out;
+	}
+
+	dentry = rust_binderfs_create_file(binder_logs_root_dir, "transactions",
+				      &rust_binder_transactions_fops, NULL);
+	if (IS_ERR(dentry)) {
+		ret = PTR_ERR(dentry);
+		goto out;
+	}
+
+	dentry = rust_binderfs_create_file(binder_logs_root_dir,
+				      "transaction_log",
+				      &rust_binder_transaction_log_fops,
+				      NULL);
+	if (IS_ERR(dentry)) {
+		ret = PTR_ERR(dentry);
+		goto out;
+	}
+
+	dentry = rust_binderfs_create_file(binder_logs_root_dir,
+				      "failed_transaction_log",
+				      &rust_binder_transaction_log_fops,
+				      NULL);
+	if (IS_ERR(dentry)) {
+		ret = PTR_ERR(dentry);
+		goto out;
+	}
+
+	proc_log_dir = binderfs_create_dir(binder_logs_root_dir, "proc");
+	if (IS_ERR(proc_log_dir)) {
+		ret = PTR_ERR(proc_log_dir);
+		goto out;
+	}
+	info = sb->s_fs_info;
+	info->proc_log_dir = proc_log_dir;
+
+out:
+	return ret;
+}
+
+static int binderfs_fill_super(struct super_block *sb, struct fs_context *fc)
+{
+	int ret;
+	struct binderfs_info *info;
+	struct binderfs_mount_opts *ctx = fc->fs_private;
+	struct inode *inode = NULL;
+	struct binderfs_device device_info = {};
+	const char *name;
+	size_t len;
+
+	sb->s_blocksize = PAGE_SIZE;
+	sb->s_blocksize_bits = PAGE_SHIFT;
+
+	/*
+	 * The binderfs filesystem can be mounted by userns root in a
+	 * non-initial userns. By default such mounts have the SB_I_NODEV flag
+	 * set in s_iflags to prevent security issues where userns root can
+	 * just create random device nodes via mknod() since it owns the
+	 * filesystem mount. But binderfs does not allow creating any files,
+	 * including device nodes. The only way to create binder device nodes
+	 * is through the binder-control device, which userns root is explicitly
+	 * allowed to use. So removing the SB_I_NODEV flag from s_iflags is both
+	 * necessary and safe.
+	 */
+	sb->s_iflags &= ~SB_I_NODEV;
+	sb->s_iflags |= SB_I_NOEXEC;
+	sb->s_magic = RUST_BINDERFS_SUPER_MAGIC;
+	sb->s_op = &binderfs_super_ops;
+	sb->s_time_gran = 1;
+
+	sb->s_fs_info = kzalloc(sizeof(struct binderfs_info), GFP_KERNEL);
+	if (!sb->s_fs_info)
+		return -ENOMEM;
+	info = sb->s_fs_info;
+
+	info->ipc_ns = get_ipc_ns(current->nsproxy->ipc_ns);
+
+	info->root_gid = make_kgid(sb->s_user_ns, 0);
+	if (!gid_valid(info->root_gid))
+		info->root_gid = GLOBAL_ROOT_GID;
+	info->root_uid = make_kuid(sb->s_user_ns, 0);
+	if (!uid_valid(info->root_uid))
+		info->root_uid = GLOBAL_ROOT_UID;
+	info->mount_opts.max = ctx->max;
+	info->mount_opts.stats_mode = ctx->stats_mode;
+
+	inode = new_inode(sb);
+	if (!inode)
+		return -ENOMEM;
+
+	inode->i_ino = FIRST_INODE;
+	inode->i_fop = &simple_dir_operations;
+	inode->i_mode = S_IFDIR | 0755;
+	simple_inode_init_ts(inode);
+	inode->i_op = &binderfs_dir_inode_operations;
+	set_nlink(inode, 2);
+
+	sb->s_root = d_make_root(inode);
+	if (!sb->s_root)
+		return -ENOMEM;
+
+	ret = binderfs_binder_ctl_create(sb);
+	if (ret)
+		return ret;
+
+	name = rust_binder_devices_param;
+	for (len = strcspn(name, ","); len > 0; len = strcspn(name, ",")) {
+		strscpy(device_info.name, name, len + 1);
+		ret = binderfs_binder_device_create(inode, NULL, &device_info);
+		if (ret)
+			return ret;
+		name += len;
+		if (*name == ',')
+			name++;
+	}
+
+	ret = init_binder_features(sb);
+	if (ret)
+		return ret;
+
+	if (info->mount_opts.stats_mode == binderfs_stats_mode_global)
+		return init_binder_logs(sb);
+
+	return 0;
+}
+
+static int binderfs_fs_context_get_tree(struct fs_context *fc)
+{
+	return get_tree_nodev(fc, binderfs_fill_super);
+}
+
+static void binderfs_fs_context_free(struct fs_context *fc)
+{
+	struct binderfs_mount_opts *ctx = fc->fs_private;
+
+	kfree(ctx);
+}
+
+static const struct fs_context_operations binderfs_fs_context_ops = {
+	.free		= binderfs_fs_context_free,
+	.get_tree	= binderfs_fs_context_get_tree,
+	.parse_param	= binderfs_fs_context_parse_param,
+	.reconfigure	= binderfs_fs_context_reconfigure,
+};
+
+static int binderfs_init_fs_context(struct fs_context *fc)
+{
+	struct binderfs_mount_opts *ctx;
+
+	ctx = kzalloc(sizeof(struct binderfs_mount_opts), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	ctx->max = BINDERFS_MAX_MINOR;
+	ctx->stats_mode = binderfs_stats_mode_unset;
+
+	fc->fs_private = ctx;
+	fc->ops = &binderfs_fs_context_ops;
+
+	return 0;
+}
+
+static void binderfs_kill_super(struct super_block *sb)
+{
+	struct binderfs_info *info = sb->s_fs_info;
+
+	/*
+	 * During inode eviction struct binderfs_info is needed.
+	 * So first wipe the super_block then free struct binderfs_info.
+	 */
+	kill_litter_super(sb);
+
+	if (info && info->ipc_ns)
+		put_ipc_ns(info->ipc_ns);
+
+	kfree(info);
+}
+
+static struct file_system_type binder_fs_type = {
+	.name			= "binder",
+	.init_fs_context	= binderfs_init_fs_context,
+	.parameters		= binderfs_fs_parameters,
+	.kill_sb		= binderfs_kill_super,
+	.fs_flags		= FS_USERNS_MOUNT,
+};
+
+int init_rust_binderfs(void)
+{
+	int ret;
+	const char *name;
+	size_t len;
+
+	/* Verify that the default binderfs device names are valid. */
+	name = rust_binder_devices_param;
+	for (len = strcspn(name, ","); len > 0; len = strcspn(name, ",")) {
+		if (len > BINDERFS_MAX_NAME)
+			return -E2BIG;
+		name += len;
+		if (*name == ',')
+			name++;
+	}
+
+	/* Allocate new major number for binderfs. */
+	ret = alloc_chrdev_region(&binderfs_dev, 0, BINDERFS_MAX_MINOR,
+				  "rust_binder");
+	if (ret)
+		return ret;
+
+	ret = register_filesystem(&binder_fs_type);
+	if (ret) {
+		unregister_chrdev_region(binderfs_dev, BINDERFS_MAX_MINOR);
+		return ret;
+	}
+
+	return ret;
+}
diff --git a/include/linux/rust_binder.h b/include/linux/rust_binder.h
new file mode 100644
index 000000000000..1e44a0a5f6a1
--- /dev/null
+++ b/include/linux/rust_binder.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_RUST_BINDER_H
+#define _LINUX_RUST_BINDER_H
+
+#include <uapi/linux/android/binderfs.h>
+
+/*
+ * This typedef is used for Rust binder driver instances. The driver object is
+ * completely opaque from C and can only be accessed via calls into Rust, so we
+ * use a typedef.
+ */
+typedef void *rust_binder_device;
+
+int init_rust_binderfs(void);
+
+#endif
diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index 6325d1d0e90f..e5a20c1498af 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -82,6 +82,7 @@
 #define BINFMTFS_MAGIC          0x42494e4d
 #define DEVPTS_SUPER_MAGIC	0x1cd1
 #define BINDERFS_SUPER_MAGIC	0x6c6f6f70
+#define RUST_BINDERFS_SUPER_MAGIC	0x6c6f6f71
 #define FUTEXFS_SUPER_MAGIC	0xBAD1DEA
 #define PIPEFS_MAGIC            0x50495045
 #define PROC_SUPER_MAGIC	0x9fa0
diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 00a66666f00a..ffeea312f2fd 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -17,11 +17,13 @@
 #include <linux/security.h>
 #include <linux/slab.h>
 #include <linux/refcount.h>
+#include <linux/rust_binder.h>
 #include <linux/wait.h>
 #include <linux/sched.h>
 #include <linux/task_work.h>
 #include <linux/workqueue.h>
 #include <uapi/linux/android/binder.h>
+#include <uapi/linux/android/binderfs.h>
 
 /* `bindgen` gets confused at certain things. */
 const size_t BINDINGS_ARCH_SLAB_MINALIGN = ARCH_SLAB_MINALIGN;
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index 435d4c2ac5fc..f4d58da9202e 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -99,6 +99,13 @@ impl ThisModule {
     pub const unsafe fn from_ptr(ptr: *mut bindings::module) -> ThisModule {
         ThisModule(ptr)
     }
+
+    /// Access the raw pointer for this module.
+    ///
+    /// It is up to the user to use it correctly.
+    pub const fn as_ptr(&self) -> *mut bindings::module {
+        self.0
+    }
 }
 
 #[cfg(not(any(testlib, test)))]
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index da37bfa97211..f78d2e75a795 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -262,7 +262,7 @@ $(obj)/%.lst: $(src)/%.c FORCE
 # Compile Rust sources (.rs)
 # ---------------------------------------------------------------------------
 
-rust_allowed_features := new_uninit,offset_of
+rust_allowed_features := const_maybe_uninit_zeroed,new_uninit,offset_of
 
 # `--out-dir` is required to avoid temporaries being created by `rustc` in the
 # current working directory, which may be not accessible in the out-of-tree

-- 
2.42.0.820.g83a721a137-goog



* [PATCH RFC 03/20] rust_binder: add threading support
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 01/20] rust_binder: define a Rust binder driver Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-03 10:51   ` Finn Behrens
  2023-11-01 18:01 ` [PATCH RFC 04/20] rust_binder: add work lists Alice Ryhl
                   ` (17 subsequent siblings)
  20 siblings, 1 reply; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

The binder driver needs to keep track of the threads that a process
uses with the driver for several reasons:

 1. When replying to a transaction, it is assumed that you are replying
    to the "currently active transaction" on the thread you made the
    syscall from. The syscall does not provide any way to specify which
    transaction you are replying to.

 2. When a thread is sleeping while waiting for incoming transactions,
    the driver needs to keep track of where it can deliver a transaction
    to.

 3. The BINDER_GET_EXTENDED_ERROR ioctl gives you the last error
    triggered by a syscall on the same thread, so it needs to keep track
    of this value for each thread.

 4. For binder servers, the driver keeps track of whether a process has
    enough threads in its transaction thread pool.

Note that not all of the above items are implemented yet. Some of them
will appear in later patches.

In this patch, we add the structures to keep track of the threads and
implement items 3 and 4 in the above list.
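
As a rough illustration of the thread pool accounting in item 4, the
following is a minimal userspace sketch, not the driver code itself. It
assumes three counters like the ones this patch adds to `ProcessInner`;
the `PoolCounters` type and its field names are hypothetical, and the
spinlock that protects these fields in the driver is left out.

```rust
/// Hypothetical userspace model of the per-process pool counters.
#[derive(Debug, Default)]
struct PoolCounters {
    /// Threads the driver asked userspace to spawn that have not registered yet.
    requested: u32,
    /// Upper bound chosen by userspace (BINDER_SET_MAX_THREADS in the driver).
    max: u32,
    /// Threads that were requested and have since registered.
    started: u32,
}

impl PoolCounters {
    /// Returns true if a spawn request should be sent, and records the request.
    fn needs_thread(&mut self) -> bool {
        let ret = self.requested == 0 && self.started < self.max;
        if ret {
            self.requested += 1;
        }
        ret
    }

    /// Returns false if the registration was not preceded by a spawn request.
    fn register_thread(&mut self) -> bool {
        if self.requested == 0 {
            return false;
        }
        self.requested -= 1;
        self.started += 1;
        true
    }
}

fn main() {
    let mut pool = PoolCounters { max: 2, ..Default::default() };
    assert!(pool.needs_thread()); // first spawn request goes out
    assert!(!pool.needs_thread()); // a request is already pending
    assert!(pool.register_thread()); // the spawned thread registers
    assert!(pool.needs_thread());
    assert!(pool.register_thread());
    assert!(!pool.needs_thread()); // the pool has reached its maximum
    assert!(!pool.register_thread()); // unsolicited registration is rejected
    println!("{:?}", pool);
}
```

At most one spawn request is outstanding at a time, so userspace is never
asked for a second thread before the first one has registered.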

Co-developed-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Co-developed-by: Matt Gilbride <mattgilbride@google.com>
Signed-off-by: Matt Gilbride <mattgilbride@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/defs.rs        |  36 ++++++-
 drivers/android/error.rs       |  52 +++++++++++
 drivers/android/process.rs     | 108 +++++++++++++++++++--
 drivers/android/rust_binder.rs |   2 +
 drivers/android/thread.rs      | 206 +++++++++++++++++++++++++++++++++++++++++
 scripts/Makefile.build         |   2 +-
 6 files changed, 396 insertions(+), 10 deletions(-)

diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs
index 8fdcb856ccad..86173add2616 100644
--- a/drivers/android/defs.rs
+++ b/drivers/android/defs.rs
@@ -2,24 +2,50 @@
 
 use core::ops::{Deref, DerefMut};
 use kernel::{
-    bindings,
+    bindings::{self, *},
     io_buffer::{ReadableFromBytes, WritableToBytes},
 };
 
+macro_rules! pub_no_prefix {
+    ($prefix:ident, $($newname:ident),+) => {
+        $(pub(crate) const $newname: u32 = kernel::macros::concat_idents!($prefix, $newname);)+
+    };
+}
+
+pub_no_prefix!(
+    binder_driver_return_protocol_,
+    BR_DEAD_REPLY,
+    BR_FAILED_REPLY,
+    BR_NOOP,
+    BR_SPAWN_LOOPER,
+    BR_TRANSACTION_COMPLETE,
+    BR_OK
+);
+
+pub_no_prefix!(
+    binder_driver_command_protocol_,
+    BC_ENTER_LOOPER,
+    BC_EXIT_LOOPER,
+    BC_REGISTER_LOOPER
+);
+
 macro_rules! decl_wrapper {
     ($newname:ident, $wrapped:ty) => {
         #[derive(Copy, Clone, Default)]
         #[repr(transparent)]
         pub(crate) struct $newname($wrapped);
+
         // SAFETY: This macro is only used with types where this is ok.
         unsafe impl ReadableFromBytes for $newname {}
         unsafe impl WritableToBytes for $newname {}
+
         impl Deref for $newname {
             type Target = $wrapped;
             fn deref(&self) -> &Self::Target {
                 &self.0
             }
         }
+
         impl DerefMut for $newname {
             fn deref_mut(&mut self) -> &mut Self::Target {
                 &mut self.0
@@ -28,7 +54,9 @@ fn deref_mut(&mut self) -> &mut Self::Target {
     };
 }
 
+decl_wrapper!(BinderWriteRead, bindings::binder_write_read);
 decl_wrapper!(BinderVersion, bindings::binder_version);
+decl_wrapper!(ExtendedError, bindings::binder_extended_error);
 
 impl BinderVersion {
     pub(crate) fn current() -> Self {
@@ -37,3 +65,9 @@ pub(crate) fn current() -> Self {
         })
     }
 }
+
+impl ExtendedError {
+    pub(crate) fn new(id: u32, command: u32, param: i32) -> Self {
+        Self(bindings::binder_extended_error { id, command, param })
+    }
+}
diff --git a/drivers/android/error.rs b/drivers/android/error.rs
new file mode 100644
index 000000000000..41fc4347ab55
--- /dev/null
+++ b/drivers/android/error.rs
@@ -0,0 +1,52 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::prelude::*;
+
+use crate::defs::*;
+
+/// An error that will be returned to userspace via the `BINDER_WRITE_READ` ioctl rather than via
+/// errno.
+pub(crate) struct BinderError {
+    pub(crate) reply: u32,
+    source: Option<Error>,
+}
+
+/// Convert an errno into a `BinderError` and store the errno used to construct it. The errno
+/// should be stored as the thread's extended error when given to userspace.
+impl From<Error> for BinderError {
+    fn from(source: Error) -> Self {
+        Self {
+            reply: BR_FAILED_REPLY,
+            source: Some(source),
+        }
+    }
+}
+
+impl From<core::alloc::AllocError> for BinderError {
+    fn from(_: core::alloc::AllocError) -> Self {
+        Self {
+            reply: BR_FAILED_REPLY,
+            source: Some(ENOMEM),
+        }
+    }
+}
+
+impl core::fmt::Debug for BinderError {
+    fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
+        match self.reply {
+            BR_FAILED_REPLY => match self.source.as_ref() {
+                Some(source) => f
+                    .debug_struct("BR_FAILED_REPLY")
+                    .field("source", source)
+                    .finish(),
+                None => f.pad("BR_FAILED_REPLY"),
+            },
+            BR_DEAD_REPLY => f.pad("BR_DEAD_REPLY"),
+            BR_TRANSACTION_COMPLETE => f.pad("BR_TRANSACTION_COMPLETE"),
+            _ => f
+                .debug_struct("BinderError")
+                .field("reply", &self.reply)
+                .finish(),
+        }
+    }
+}
diff --git a/drivers/android/process.rs b/drivers/android/process.rs
index 2f16e4cedbf1..47d074dd8465 100644
--- a/drivers/android/process.rs
+++ b/drivers/android/process.rs
@@ -13,11 +13,12 @@
 use kernel::{
     bindings,
     cred::Credential,
-    file::{File, PollTable},
-    io_buffer::IoBufferWriter,
+    file::{self, File, PollTable},
+    io_buffer::{IoBufferReader, IoBufferWriter},
     list::{HasListLinks, ListArc, ListArcSafe, ListItem, ListLinks},
     mm,
     prelude::*,
+    rbtree::RBTree,
     sync::{Arc, ArcBorrow, SpinLock},
     task::Task,
     types::ARef,
@@ -25,7 +26,9 @@
     workqueue::{self, Work},
 };
 
-use crate::{context::Context, defs::*};
+use crate::{context::Context, defs::*, thread::Thread};
+
+use core::mem::take;
 
 const PROC_DEFER_FLUSH: u8 = 1;
 const PROC_DEFER_RELEASE: u8 = 2;
@@ -33,6 +36,14 @@
 /// The fields of `Process` protected by the spinlock.
 pub(crate) struct ProcessInner {
     is_dead: bool,
+    threads: RBTree<i32, Arc<Thread>>,
+
+    /// The number of requested threads that haven't registered yet.
+    requested_thread_count: u32,
+    /// The maximum number of threads used by the process thread pool.
+    max_threads: u32,
+    /// The number of threads that have started and registered with the thread pool.
+    started_thread_count: u32,
 
     /// Bitmap of deferred work to do.
     defer_work: u8,
@@ -42,9 +53,23 @@ impl ProcessInner {
     fn new() -> Self {
         Self {
             is_dead: false,
+            threads: RBTree::new(),
+            requested_thread_count: 0,
+            max_threads: 0,
+            started_thread_count: 0,
             defer_work: 0,
         }
     }
+
+    fn register_thread(&mut self) -> bool {
+        if self.requested_thread_count == 0 {
+            return false;
+        }
+
+        self.requested_thread_count -= 1;
+        self.started_thread_count += 1;
+        true
+    }
 }
 
 /// A process using binder.
@@ -127,10 +152,56 @@ fn new(ctx: Arc<Context>, cred: ARef<Credential>) -> Result<Arc<Self>> {
         Ok(process)
     }
 
+    fn get_thread(self: ArcBorrow<'_, Self>, id: i32) -> Result<Arc<Thread>> {
+        {
+            let inner = self.inner.lock();
+            if let Some(thread) = inner.threads.get(&id) {
+                return Ok(thread.clone());
+            }
+        }
+
+        // Allocate a new `Thread` without holding any locks.
+        let ta = Thread::new(id, self.into())?;
+        let node = RBTree::try_allocate_node(id, ta.clone())?;
+
+        let mut inner = self.inner.lock();
+
+        // Recheck. It's possible the thread was created while we were not holding the lock.
+        if let Some(thread) = inner.threads.get(&id) {
+            return Ok(thread.clone());
+        }
+
+        inner.threads.insert(node);
+        Ok(ta)
+    }
+
     fn version(&self, data: UserSlicePtr) -> Result {
         data.writer().write(&BinderVersion::current())
     }
 
+    pub(crate) fn register_thread(&self) -> bool {
+        self.inner.lock().register_thread()
+    }
+
+    fn remove_thread(&self, thread: Arc<Thread>) {
+        self.inner.lock().threads.remove(&thread.id);
+        thread.release();
+    }
+
+    fn set_max_threads(&self, max: u32) {
+        self.inner.lock().max_threads = max;
+    }
+
+    pub(crate) fn needs_thread(&self) -> bool {
+        let mut inner = self.inner.lock();
+        let ret =
+            inner.requested_thread_count == 0 && inner.started_thread_count < inner.max_threads;
+        if ret {
+            inner.requested_thread_count += 1
+        }
+        ret
+    }
+
     fn deferred_flush(&self) {
         // NOOP for now.
     }
@@ -139,6 +210,17 @@ fn deferred_release(self: Arc<Self>) {
         self.inner.lock().is_dead = true;
 
         self.ctx.deregister_process(&self);
+
+        // Move the threads out of `inner` so that we can iterate over them without holding the
+        // lock.
+        let mut inner = self.inner.lock();
+        let threads = take(&mut inner.threads);
+        drop(inner);
+
+        // Release all threads.
+        for thread in threads.values() {
+            thread.release();
+        }
     }
 
     pub(crate) fn flush(this: ArcBorrow<'_, Process>) -> Result {
@@ -161,22 +243,32 @@ pub(crate) fn flush(this: ArcBorrow<'_, Process>) -> Result {
 /// The ioctl handler.
 impl Process {
     fn write(
-        _this: ArcBorrow<'_, Process>,
+        this: ArcBorrow<'_, Process>,
         _file: &File,
-        _cmd: u32,
-        _reader: &mut UserSlicePtrReader,
+        cmd: u32,
+        reader: &mut UserSlicePtrReader,
     ) -> Result<i32> {
-        Err(EINVAL)
+        let thread = this.get_thread(kernel::current!().pid())?;
+        match cmd {
+            bindings::BINDER_SET_MAX_THREADS => this.set_max_threads(reader.read()?),
+            bindings::BINDER_THREAD_EXIT => this.remove_thread(thread),
+            _ => return Err(EINVAL),
+        }
+        Ok(0)
     }
 
     fn read_write(
         this: ArcBorrow<'_, Process>,
-        _file: &File,
+        file: &File,
         cmd: u32,
         data: UserSlicePtr,
     ) -> Result<i32> {
+        let thread = this.get_thread(kernel::current!().pid())?;
+        let blocking = (file.flags() & file::flags::O_NONBLOCK) == 0;
         match cmd {
+            bindings::BINDER_WRITE_READ => thread.write_read(data, blocking)?,
             bindings::BINDER_VERSION => this.version(data)?,
+            bindings::BINDER_GET_EXTENDED_ERROR => thread.get_extended_error(data)?,
             _ => return Err(EINVAL),
         }
         Ok(0)
diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs
index 6de2f40846fb..64fd24ea8be1 100644
--- a/drivers/android/rust_binder.rs
+++ b/drivers/android/rust_binder.rs
@@ -14,7 +14,9 @@
 
 mod context;
 mod defs;
+mod error;
 mod process;
+mod thread;
 
 module! {
     type: BinderModule,
diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs
new file mode 100644
index 000000000000..593c8e4f184e
--- /dev/null
+++ b/drivers/android/thread.rs
@@ -0,0 +1,206 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! This module defines the `Thread` type, which represents a userspace thread that is using
+//! binder.
+//!
+//! The `Process` object stores all of the threads in an rb tree.
+
+use kernel::{
+    bindings,
+    io_buffer::{IoBufferReader, IoBufferWriter},
+    prelude::*,
+    sync::{Arc, SpinLock},
+    user_ptr::UserSlicePtr,
+};
+
+use crate::{defs::*, process::Process};
+
+use core::mem::size_of;
+
+/// The fields of `Thread` protected by the spinlock.
+struct InnerThread {
+    /// Determines the looper state of the thread. It is a bit-wise combination of the constants
+    /// prefixed with `LOOPER_`.
+    looper_flags: u32,
+
+    /// Determines if thread is dead.
+    is_dead: bool,
+
+    /// Extended error information for this thread.
+    extended_error: ExtendedError,
+}
+
+const LOOPER_REGISTERED: u32 = 0x01;
+const LOOPER_ENTERED: u32 = 0x02;
+const LOOPER_EXITED: u32 = 0x04;
+const LOOPER_INVALID: u32 = 0x08;
+
+impl InnerThread {
+    fn new() -> Self {
+        use core::sync::atomic::{AtomicU32, Ordering};
+
+        fn next_err_id() -> u32 {
+            static EE_ID: AtomicU32 = AtomicU32::new(0);
+            EE_ID.fetch_add(1, Ordering::Relaxed)
+        }
+
+        Self {
+            looper_flags: 0,
+            is_dead: false,
+            extended_error: ExtendedError::new(next_err_id(), BR_OK, 0),
+        }
+    }
+
+    fn looper_enter(&mut self) {
+        self.looper_flags |= LOOPER_ENTERED;
+        if self.looper_flags & LOOPER_REGISTERED != 0 {
+            self.looper_flags |= LOOPER_INVALID;
+        }
+    }
+
+    fn looper_register(&mut self, valid: bool) {
+        self.looper_flags |= LOOPER_REGISTERED;
+        if !valid || self.looper_flags & LOOPER_ENTERED != 0 {
+            self.looper_flags |= LOOPER_INVALID;
+        }
+    }
+
+    fn looper_exit(&mut self) {
+        self.looper_flags |= LOOPER_EXITED;
+    }
+
+    /// Determines whether the thread is part of a pool, i.e., if it is a looper.
+    fn is_looper(&self) -> bool {
+        self.looper_flags & (LOOPER_ENTERED | LOOPER_REGISTERED) != 0
+    }
+}
+
+/// This represents a thread that's used with binder.
+#[pin_data]
+pub(crate) struct Thread {
+    pub(crate) id: i32,
+    pub(crate) process: Arc<Process>,
+    #[pin]
+    inner: SpinLock<InnerThread>,
+}
+
+impl Thread {
+    pub(crate) fn new(id: i32, process: Arc<Process>) -> Result<Arc<Self>> {
+        Arc::pin_init(pin_init!(Thread {
+            id,
+            process,
+            inner <- kernel::new_spinlock!(InnerThread::new(), "Thread::inner"),
+        }))
+    }
+
+    pub(crate) fn get_extended_error(&self, data: UserSlicePtr) -> Result {
+        let mut writer = data.writer();
+        let ee = self.inner.lock().extended_error;
+        writer.write(&ee)?;
+        Ok(())
+    }
+
+    fn write(self: &Arc<Self>, req: &mut BinderWriteRead) -> Result {
+        let write_start = req.write_buffer.wrapping_add(req.write_consumed);
+        let write_len = req.write_size - req.write_consumed;
+        let mut reader = UserSlicePtr::new(write_start as _, write_len as _).reader();
+
+        while reader.len() >= size_of::<u32>() {
+            let before = reader.len();
+            let cmd = reader.read::<u32>()?;
+            match cmd {
+                BC_REGISTER_LOOPER => {
+                    let valid = self.process.register_thread();
+                    self.inner.lock().looper_register(valid);
+                }
+                BC_ENTER_LOOPER => self.inner.lock().looper_enter(),
+                BC_EXIT_LOOPER => self.inner.lock().looper_exit(),
+
+                // Fail if given an unknown command code.
+                // BC_ATTEMPT_ACQUIRE and BC_ACQUIRE_RESULT are no longer supported.
+                _ => return Err(EINVAL),
+            }
+            // Update the number of write bytes consumed.
+            req.write_consumed += (before - reader.len()) as u64;
+        }
+
+        Ok(())
+    }
+
+    fn read(self: &Arc<Self>, req: &mut BinderWriteRead, _wait: bool) -> Result {
+        let read_start = req.read_buffer.wrapping_add(req.read_consumed);
+        let read_len = req.read_size - req.read_consumed;
+        let mut writer = UserSlicePtr::new(read_start as _, read_len as _).writer();
+        let in_pool = self.inner.lock().is_looper();
+
+        // Reserve some room at the beginning of the read buffer so that we can send a
+        // BR_SPAWN_LOOPER if we need to.
+        let mut has_noop_placeholder = false;
+        if req.read_consumed == 0 {
+            if let Err(err) = writer.write(&BR_NOOP) {
+                pr_warn!("Failure when writing BR_NOOP at beginning of buffer.");
+                return Err(err);
+            }
+            has_noop_placeholder = true;
+        }
+
+        // Loop doing work while there is room in the buffer.
+        #[allow(clippy::never_loop)]
+        while writer.len() >= size_of::<bindings::binder_transaction_data_secctx>() + 4 {
+            // There is enough space in the output buffer to process another work item.
+            //
+            // However, we have not yet added work items to the driver, so we immediately break
+            // from the loop.
+            break;
+        }
+
+        req.read_consumed += read_len - writer.len() as u64;
+
+        // Write BR_SPAWN_LOOPER if the process needs more threads for its pool.
+        if has_noop_placeholder && in_pool && self.process.needs_thread() {
+            let mut writer = UserSlicePtr::new(req.read_buffer as _, req.read_size as _).writer();
+            writer.write(&BR_SPAWN_LOOPER)?;
+        }
+        Ok(())
+    }
+
+    pub(crate) fn write_read(self: &Arc<Self>, data: UserSlicePtr, wait: bool) -> Result {
+        let (mut reader, mut writer) = data.reader_writer();
+        let mut req = reader.read::<BinderWriteRead>()?;
+
+        // Go through the write buffer.
+        if req.write_size > 0 {
+            if let Err(err) = self.write(&mut req) {
+                pr_warn!(
+                    "Write failure {:?} in pid:{}",
+                    err,
+                    self.process.task.pid_in_current_ns()
+                );
+                req.read_consumed = 0;
+                writer.write(&req)?;
+                return Err(err);
+            }
+        }
+
+        // Go through the work queue.
+        let mut ret = Ok(());
+        if req.read_size > 0 {
+            ret = self.read(&mut req, wait);
+            if ret.is_err() && ret != Err(EINTR) {
+                pr_warn!(
+                    "Read failure {:?} in pid:{}",
+                    ret,
+                    self.process.task.pid_in_current_ns()
+                );
+            }
+        }
+
+        // Write the request back so that the consumed fields are visible to the caller.
+        writer.write(&req)?;
+        ret
+    }
+
+    pub(crate) fn release(self: &Arc<Self>) {
+        self.inner.lock().is_dead = true;
+    }
+}
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index f78d2e75a795..b388f3d75d49 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -262,7 +262,7 @@ $(obj)/%.lst: $(src)/%.c FORCE
 # Compile Rust sources (.rs)
 # ---------------------------------------------------------------------------
 
-rust_allowed_features := const_maybe_uninit_zeroed,new_uninit,offset_of
+rust_allowed_features := const_maybe_uninit_zeroed,new_uninit,offset_of,allocator_api
 
 # `--out-dir` is required to avoid temporaries being created by `rustc` in the
 # current working directory, which may be not accessible in the out-of-tree

-- 
2.42.0.820.g83a721a137-goog



* [PATCH RFC 04/20] rust_binder: add work lists
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (2 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 03/20] rust_binder: add threading support Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 05/20] rust_binder: add nodes and context managers Alice Ryhl
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

The binder driver uses linked lists of work items to store events that
need to be delivered to userspace. Each process has a work list, and so
does each thread.

Work items are expected to implement the `DeliverToRead` trait, whose
name signifies that this type is something that can be delivered to
userspace via the read part of the `BINDER_WRITE_READ` ioctl. The trait
defines what happens when a work item is executed, when it is cancelled,
how the thread should be notified (`wake_up_interruptible_sync` or
`wake_up_interruptible`?), and how it can be enqueued to a linked list.
For each type that implements the trait, Rust will generate a vtable
for the type. Pointers to the `dyn DeliverToRead` type will be fat
pointers where the metadata of the pointer is a pointer to the vtable.
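
As a rough userspace illustration of the paragraph above (not the driver
code), the sketch below uses a simplified stand-in for the trait. The
`Transaction` and `DeathNotification` types, the `drain` helper and the
`fat_pointer_is_two_words` check are hypothetical names chosen for this
example, and it uses `std::sync::Arc` rather than the kernel's `Arc`. The
two-word size of the pointer is what current compilers produce, not a
documented guarantee.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Simplified stand-in for the driver's `DeliverToRead` trait.
trait DeliverToRead {
    /// Describes the event that would be written to the read buffer.
    fn do_work(&self) -> String;
    /// Whether a synchronous wakeup would be requested for this item.
    fn should_sync_wakeup(&self) -> bool;
}

// Hypothetical work item types, for illustration only.
struct Transaction {
    code: u32,
}
struct DeathNotification;

impl DeliverToRead for Transaction {
    fn do_work(&self) -> String {
        format!("transaction {}", self.code)
    }
    fn should_sync_wakeup(&self) -> bool {
        true
    }
}

impl DeliverToRead for DeathNotification {
    fn do_work(&self) -> String {
        "death notification".to_string()
    }
    fn should_sync_wakeup(&self) -> bool {
        false
    }
}

/// Executes each item through its vtable, in queue order.
fn drain(queue: &[Arc<dyn DeliverToRead>]) -> Vec<(String, bool)> {
    queue
        .iter()
        .map(|w| (w.do_work(), w.should_sync_wakeup()))
        .collect()
}

/// Checks that the trait-object pointer carries one extra word of metadata
/// (the vtable pointer) compared to a pointer to a concrete type.
fn fat_pointer_is_two_words() -> bool {
    size_of::<Arc<dyn DeliverToRead>>() == 2 * size_of::<usize>()
        && size_of::<Arc<Transaction>>() == size_of::<usize>()
}
```

A single queue can hold both item types because each element carries its
own vtable pointer, which is what lets one work list store many kinds of
events.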

We introduce the concept of a "ready thread". This is a thread that is
currently waiting for work items inside the `get_work` method. The
process will keep track of them and deliver new work items to one of the
ready threads directly. When there are no ready threads, work items are
stored in the process work list.
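
The delivery rule described above can be modelled in a few lines of
ordinary Rust. This is only a sketch under simplifying assumptions: the
`ProcessModel` and `Delivered` names are hypothetical, threads are
identified by an index, work items are plain integers, and locking,
wakeups and dead processes are left out entirely.

```rust
use std::collections::VecDeque;

/// Where a pushed work item ended up.
#[derive(Debug, PartialEq)]
enum Delivered {
    ToThread(usize),
    ToProcessQueue,
}

/// Hypothetical, lock-free model of a process with its threads.
struct ProcessModel {
    /// One local work queue per thread, indexed by thread number.
    thread_queues: Vec<VecDeque<u32>>,
    /// Threads currently waiting for work.
    ready_threads: VecDeque<usize>,
    /// Process-wide work list, used when no thread is ready.
    work: VecDeque<u32>,
}

impl ProcessModel {
    fn new(thread_count: usize) -> Self {
        Self {
            thread_queues: vec![VecDeque::new(); thread_count],
            ready_threads: VecDeque::new(),
            work: VecDeque::new(),
        }
    }

    /// Takes an item from the process queue, or registers the thread as
    /// ready if the queue is empty.
    fn get_work_or_register(&mut self, thread: usize) -> Option<u32> {
        if let Some(item) = self.work.pop_front() {
            return Some(item);
        }
        if !self.ready_threads.contains(&thread) {
            self.ready_threads.push_front(thread);
        }
        None
    }

    /// Hands the item to a ready thread if there is one, otherwise stores
    /// it in the process work list.
    fn push_work(&mut self, item: u32) -> Delivered {
        if let Some(thread) = self.ready_threads.pop_front() {
            self.thread_queues[thread].push_back(item);
            Delivered::ToThread(thread)
        } else {
            self.work.push_back(item);
            Delivered::ToProcessQueue
        }
    }
}
```

In the model, a thread that finds the process queue empty becomes ready,
and the next pushed item goes straight to that thread's local queue
instead of the process queue.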

The work lists added in this patch are not used yet, so the `push_work`
methods are marked with `#[allow(dead_code)]` to silence the warnings
about unused methods. A user is added in the next patch of this patch
set.

Co-developed-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/error.rs       |   9 ++
 drivers/android/process.rs     | 126 ++++++++++++++++--
 drivers/android/rust_binder.rs |  87 ++++++++++++-
 drivers/android/thread.rs      | 284 +++++++++++++++++++++++++++++++++++++++--
 scripts/Makefile.build         |   2 +-
 5 files changed, 488 insertions(+), 20 deletions(-)

diff --git a/drivers/android/error.rs b/drivers/android/error.rs
index 41fc4347ab55..a31b696efafc 100644
--- a/drivers/android/error.rs
+++ b/drivers/android/error.rs
@@ -11,6 +11,15 @@ pub(crate) struct BinderError {
     source: Option<Error>,
 }
 
+impl BinderError {
+    pub(crate) fn new_dead() -> Self {
+        Self {
+            reply: BR_DEAD_REPLY,
+            source: None,
+        }
+    }
+}
+
 /// Convert an errno into a `BinderError` and store the errno used to construct it. The errno
 /// should be stored as the thread's extended error when given to userspace.
 impl From<Error> for BinderError {
diff --git a/drivers/android/process.rs b/drivers/android/process.rs
index 47d074dd8465..22662c7d388a 100644
--- a/drivers/android/process.rs
+++ b/drivers/android/process.rs
@@ -15,18 +15,24 @@
     cred::Credential,
     file::{self, File, PollTable},
     io_buffer::{IoBufferReader, IoBufferWriter},
-    list::{HasListLinks, ListArc, ListArcSafe, ListItem, ListLinks},
+    list::{HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks},
     mm,
     prelude::*,
     rbtree::RBTree,
-    sync::{Arc, ArcBorrow, SpinLock},
+    sync::{lock::Guard, Arc, ArcBorrow, SpinLock},
     task::Task,
-    types::ARef,
+    types::{ARef, Either},
     user_ptr::{UserSlicePtr, UserSlicePtrReader},
     workqueue::{self, Work},
 };
 
-use crate::{context::Context, defs::*, thread::Thread};
+use crate::{
+    context::Context,
+    defs::*,
+    error::BinderError,
+    thread::{PushWorkRes, Thread},
+    DLArc, DTRWrap, DeliverToRead,
+};
 
 use core::mem::take;
 
@@ -35,8 +41,10 @@
 
 /// The fields of `Process` protected by the spinlock.
 pub(crate) struct ProcessInner {
-    is_dead: bool,
+    pub(crate) is_dead: bool,
     threads: RBTree<i32, Arc<Thread>>,
+    ready_threads: List<Thread>,
+    work: List<DTRWrap<dyn DeliverToRead>>,
 
     /// The number of requested threads that haven't registered yet.
     requested_thread_count: u32,
@@ -54,6 +62,8 @@ fn new() -> Self {
         Self {
             is_dead: false,
             threads: RBTree::new(),
+            ready_threads: List::new(),
+            work: List::new(),
             requested_thread_count: 0,
             max_threads: 0,
             started_thread_count: 0,
@@ -61,6 +71,37 @@ fn new() -> Self {
         }
     }
 
+    /// Schedule the work item for execution on this process.
+    ///
+    /// If a thread is ready for work, then the work item is given directly to that thread and
+    /// it is woken up. Otherwise, it is pushed to the process work list.
+    ///
+    /// This call can fail only if the process is dead. In this case, the work item is returned to
+    /// the caller so that the caller can drop it after releasing the inner process lock. This is
+    /// necessary since the destructor of `Transaction` will take locks that can't necessarily be
+    /// taken while holding the inner process lock.
+    #[allow(dead_code)]
+    pub(crate) fn push_work(
+        &mut self,
+        work: DLArc<dyn DeliverToRead>,
+    ) -> Result<(), (BinderError, DLArc<dyn DeliverToRead>)> {
+        // Try to find a ready thread to which to push the work.
+        if let Some(thread) = self.ready_threads.pop_front() {
+            // Push to thread while holding state lock. This prevents the thread from giving up
+            // (for example, because of a signal) when we're about to deliver work.
+            match thread.push_work(work) {
+                PushWorkRes::Ok => Ok(()),
+                PushWorkRes::FailedDead(work) => Err((BinderError::new_dead(), work)),
+            }
+        } else if self.is_dead {
+            Err((BinderError::new_dead(), work))
+        } else {
+            // There are no ready threads. Push work to process queue.
+            self.work.push_back(work);
+            Ok(())
+        }
+    }
+
     fn register_thread(&mut self) -> bool {
         if self.requested_thread_count == 0 {
             return false;
@@ -152,6 +193,31 @@ fn new(ctx: Arc<Context>, cred: ARef<Credential>) -> Result<Arc<Self>> {
         Ok(process)
     }
 
+    /// Attempts to fetch a work item from the process queue.
+    pub(crate) fn get_work(&self) -> Option<DLArc<dyn DeliverToRead>> {
+        self.inner.lock().work.pop_front()
+    }
+
+    /// Attempts to fetch a work item from the process queue. If none is available, it registers the
+    /// given thread as ready to receive work directly.
+    ///
+    /// This must only be called when the thread is not participating in a transaction chain; when
+    /// it is, work will always be delivered directly to the thread (and not through the process
+    /// queue).
+    pub(crate) fn get_work_or_register<'a>(
+        &'a self,
+        thread: &'a Arc<Thread>,
+    ) -> Either<DLArc<dyn DeliverToRead>, Registration<'a>> {
+        let mut inner = self.inner.lock();
+        // Try to get work from the process queue.
+        if let Some(work) = inner.work.pop_front() {
+            return Either::Left(work);
+        }
+
+        // Register the thread as ready.
+        Either::Right(Registration::new(self, thread, &mut inner))
+    }
+
     fn get_thread(self: ArcBorrow<'_, Self>, id: i32) -> Result<Arc<Thread>> {
         {
             let inner = self.inner.lock();
@@ -194,8 +260,9 @@ fn set_max_threads(&self, max: u32) {
 
     pub(crate) fn needs_thread(&self) -> bool {
         let mut inner = self.inner.lock();
-        let ret =
-            inner.requested_thread_count == 0 && inner.started_thread_count < inner.max_threads;
+        let ret = inner.requested_thread_count == 0
+            && inner.ready_threads.is_empty()
+            && inner.started_thread_count < inner.max_threads;
         if ret {
             inner.requested_thread_count += 1
         }
@@ -203,7 +270,10 @@ pub(crate) fn needs_thread(&self) -> bool {
     }
 
     fn deferred_flush(&self) {
-        // NOOP for now.
+        let inner = self.inner.lock();
+        for thread in inner.threads.values() {
+            thread.exit_looper();
+        }
     }
 
     fn deferred_release(self: Arc<Self>) {
@@ -211,6 +281,11 @@ fn deferred_release(self: Arc<Self>) {
 
         self.ctx.deregister_process(&self);
 
+        // Cancel all pending work items.
+        while let Some(work) = self.get_work() {
+            work.into_arc().cancel();
+        }
+
         // Move the threads out of `inner` so that we can iterate over them without holding the
         // lock.
         let mut inner = self.inner.lock();
@@ -341,3 +416,38 @@ pub(crate) fn poll(
         Err(EINVAL)
     }
 }
+
+/// Represents that a thread has registered with the `ready_threads` list of its process.
+///
+/// The destructor of this type will unregister the thread from the list of ready threads.
+pub(crate) struct Registration<'a> {
+    process: &'a Process,
+    thread: &'a Arc<Thread>,
+}
+
+impl<'a> Registration<'a> {
+    fn new(
+        process: &'a Process,
+        thread: &'a Arc<Thread>,
+        guard: &mut Guard<'_, ProcessInner, kernel::sync::lock::spinlock::SpinLockBackend>,
+    ) -> Self {
+        assert!(core::ptr::eq(process, &*thread.process));
+        // INVARIANT: We are pushing this thread to the right `ready_threads` list.
+        if let Ok(list_arc) = ListArc::try_from_arc(thread.clone()) {
+            guard.ready_threads.push_front(list_arc);
+        } else {
+            pr_warn!("Same thread registered with `ready_threads` twice.");
+        }
+        Self { process, thread }
+    }
+}
+
+impl Drop for Registration<'_> {
+    fn drop(&mut self) {
+        let mut inner = self.process.inner.lock();
+        // SAFETY: The thread has the invariant that we never push it to any other linked list than
+        // the `ready_threads` list of its parent process. Therefore, the thread is either in that
+        // list, or in no list.
+        unsafe { inner.ready_threads.remove(self.thread) };
+    }
+}
diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs
index 64fd24ea8be1..55d475737cef 100644
--- a/drivers/android/rust_binder.rs
+++ b/drivers/android/rust_binder.rs
@@ -5,12 +5,16 @@
 use kernel::{
     bindings::{self, seq_file},
     file::{File, PollTable},
+    list::{
+        HasListLinks, ListArc, ListArcSafe, ListItem, ListLinks, ListLinksSelfPtr, TryNewListArc,
+    },
     prelude::*,
     sync::Arc,
     types::ForeignOwnable,
+    user_ptr::UserSlicePtrWriter,
 };
 
-use crate::{context::Context, process::Process};
+use crate::{context::Context, process::Process, thread::Thread};
 
 mod context;
 mod defs;
@@ -26,6 +30,87 @@
     license: "GPL",
 }
 
+/// Specifies how a type should be delivered to the read part of a BINDER_WRITE_READ ioctl.
+///
+/// When a value is pushed to the todo list for a process or thread, it is stored as a trait object
+/// with the type `Arc<dyn DeliverToRead>`. Trait objects are a Rust feature that lets you
+/// implement dynamic dispatch over many different types. This lets us store many different types
+/// in the todo list.
+trait DeliverToRead: ListArcSafe + Send + Sync {
+    /// Performs work. Returns true if remaining work items in the queue should be processed
+    /// immediately, or false if it should return to caller before processing additional work
+    /// items.
+    fn do_work(self: DArc<Self>, thread: &Thread, writer: &mut UserSlicePtrWriter) -> Result<bool>;
+
+    /// Cancels the given work item. This is called instead of [`DeliverToRead::do_work`] when work
+    /// won't be delivered.
+    fn cancel(self: DArc<Self>) {}
+
+    /// Should we use `wake_up_interruptible_sync` or `wake_up_interruptible` when scheduling this
+    /// work item?
+    ///
+    /// Generally only set to true for non-oneway transactions.
+    fn should_sync_wakeup(&self) -> bool;
+
+    /// Get the debug name of this type.
+    fn debug_name(&self) -> &'static str {
+        core::any::type_name::<Self>()
+    }
+}
+
+// Wrapper around a `DeliverToRead` with linked list links.
+#[pin_data]
+struct DTRWrap<T: ?Sized> {
+    #[pin]
+    links: ListLinksSelfPtr<DTRWrap<dyn DeliverToRead>>,
+    #[pin]
+    wrapped: T,
+}
+kernel::list::impl_has_list_links_self_ptr! {
+    impl HasSelfPtr<DTRWrap<dyn DeliverToRead>> for DTRWrap<dyn DeliverToRead> { self.links }
+}
+kernel::list::impl_list_arc_safe! {
+    impl{T: ListArcSafe + ?Sized} ListArcSafe<0> for DTRWrap<T> {
+        tracked_by wrapped: T;
+    }
+}
+kernel::list::impl_list_item! {
+    impl ListItem<0> for DTRWrap<dyn DeliverToRead> {
+        using ListLinksSelfPtr;
+    }
+}
+
+impl<T: ?Sized> core::ops::Deref for DTRWrap<T> {
+    type Target = T;
+    fn deref(&self) -> &T {
+        &self.wrapped
+    }
+}
+
+impl<T: ?Sized> core::ops::Receiver for DTRWrap<T> {}
+
+type DArc<T> = kernel::sync::Arc<DTRWrap<T>>;
+type DLArc<T> = kernel::list::ListArc<DTRWrap<T>>;
+
+impl<T: ListArcSafe> DTRWrap<T> {
+    #[allow(dead_code)]
+    fn arc_try_new(val: T) -> Result<DLArc<T>, alloc::alloc::AllocError> {
+        ListArc::pin_init(pin_init!(Self {
+            links <- ListLinksSelfPtr::new(),
+            wrapped: val,
+        }))
+        .map_err(|_| alloc::alloc::AllocError)
+    }
+
+    #[allow(dead_code)]
+    fn arc_pin_init(init: impl PinInit<T>) -> Result<DLArc<T>, kernel::error::Error> {
+        ListArc::pin_init(pin_init!(Self {
+            links <- ListLinksSelfPtr::new(),
+            wrapped <- init,
+        }))
+    }
+}
+
 struct BinderModule {}
 
 impl kernel::Module for BinderModule {
diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs
index 593c8e4f184e..a12c271a4e8f 100644
--- a/drivers/android/thread.rs
+++ b/drivers/android/thread.rs
@@ -8,24 +8,51 @@
 use kernel::{
     bindings,
     io_buffer::{IoBufferReader, IoBufferWriter},
+    list::{
+        AtomicListArcTracker, HasListLinks, List, ListArcSafe, ListItem, ListLinks, TryNewListArc,
+    },
     prelude::*,
-    sync::{Arc, SpinLock},
+    sync::{Arc, CondVar, SpinLock},
+    types::Either,
     user_ptr::UserSlicePtr,
 };
 
-use crate::{defs::*, process::Process};
+use crate::{defs::*, process::Process, DLArc, DTRWrap, DeliverToRead};
 
 use core::mem::size_of;
 
+pub(crate) enum PushWorkRes {
+    Ok,
+    FailedDead(DLArc<dyn DeliverToRead>),
+}
+
+impl PushWorkRes {
+    fn is_ok(&self) -> bool {
+        match self {
+            PushWorkRes::Ok => true,
+            PushWorkRes::FailedDead(_) => false,
+        }
+    }
+}
+
 /// The fields of `Thread` protected by the spinlock.
 struct InnerThread {
     /// Determines the looper state of the thread. It is a bit-wise combination of the constants
     /// prefixed with `LOOPER_`.
     looper_flags: u32,
 
+    /// Determines whether the looper should return.
+    looper_need_return: bool,
+
     /// Determines if thread is dead.
     is_dead: bool,
 
+    /// Determines whether the work list below should be processed. When set to false, `work_list`
+    /// is treated as if it were empty.
+    process_work_list: bool,
+    /// List of work items to deliver to userspace.
+    work_list: List<DTRWrap<dyn DeliverToRead>>,
+
     /// Extended error information for this thread.
     extended_error: ExtendedError,
 }
@@ -34,6 +61,8 @@ struct InnerThread {
 const LOOPER_ENTERED: u32 = 0x02;
 const LOOPER_EXITED: u32 = 0x04;
 const LOOPER_INVALID: u32 = 0x08;
+const LOOPER_WAITING: u32 = 0x10;
+const LOOPER_WAITING_PROC: u32 = 0x20;
 
 impl InnerThread {
     fn new() -> Self {
@@ -46,11 +75,42 @@ fn next_err_id() -> u32 {
 
         Self {
             looper_flags: 0,
+            looper_need_return: false,
             is_dead: false,
+            process_work_list: false,
+            work_list: List::new(),
             extended_error: ExtendedError::new(next_err_id(), BR_OK, 0),
         }
     }
 
+    fn pop_work(&mut self) -> Option<DLArc<dyn DeliverToRead>> {
+        if !self.process_work_list {
+            return None;
+        }
+
+        let ret = self.work_list.pop_front();
+        self.process_work_list = !self.work_list.is_empty();
+        ret
+    }
+
+    #[allow(dead_code)]
+    fn push_work(&mut self, work: DLArc<dyn DeliverToRead>) -> PushWorkRes {
+        if self.is_dead {
+            PushWorkRes::FailedDead(work)
+        } else {
+            self.work_list.push_back(work);
+            self.process_work_list = true;
+            PushWorkRes::Ok
+        }
+    }
+
+    /// Used to push work items that do not need to be processed immediately and can wait until the
+    /// thread gets another work item.
+    #[allow(dead_code)]
+    fn push_work_deferred(&mut self, work: DLArc<dyn DeliverToRead>) {
+        self.work_list.push_back(work);
+    }
+
     fn looper_enter(&mut self) {
         self.looper_flags |= LOOPER_ENTERED;
         if self.looper_flags & LOOPER_REGISTERED != 0 {
@@ -73,6 +133,14 @@ fn looper_exit(&mut self) {
     fn is_looper(&self) -> bool {
         self.looper_flags & (LOOPER_ENTERED | LOOPER_REGISTERED) != 0
     }
+
+    /// Determines whether the thread should attempt to fetch work items from the process queue.
+    /// This is the case when the thread is not part of a transaction stack and it is registered as
+    /// a looper. Also, if there is local work, we want to return to userspace before we deliver
+    /// any remote work.
+    fn should_use_process_work_queue(&self) -> bool {
+        !self.process_work_list && self.is_looper()
+    }
 }
 
 /// This represents a thread that's used with binder.
@@ -82,6 +150,29 @@ pub(crate) struct Thread {
     pub(crate) process: Arc<Process>,
     #[pin]
     inner: SpinLock<InnerThread>,
+    #[pin]
+    work_condvar: CondVar,
+    /// Used to insert this thread into the process' `ready_threads` list.
+    ///
+    /// INVARIANT: May never be used for any other list than the `self.process.ready_threads`.
+    #[pin]
+    links: ListLinks,
+    #[pin]
+    links_track: AtomicListArcTracker,
+}
+
+kernel::list::impl_has_list_links! {
+    impl HasListLinks<0> for Thread { self.links }
+}
+kernel::list::impl_list_arc_safe! {
+    impl ListArcSafe<0> for Thread {
+        tracked_by links_track: AtomicListArcTracker;
+    }
+}
+kernel::list::impl_list_item! {
+    impl ListItem<0> for Thread {
+        using ListLinks;
+    }
 }
 
 impl Thread {
@@ -90,6 +181,9 @@ pub(crate) fn new(id: i32, process: Arc<Process>) -> Result<Arc<Self>> {
             id,
             process,
             inner <- kernel::new_spinlock!(InnerThread::new(), "Thread::inner"),
+            work_condvar <- kernel::new_condvar!("Thread::work_condvar"),
+            links <- ListLinks::new(),
+            links_track <- AtomicListArcTracker::new(),
         }))
     }
 
@@ -100,6 +194,123 @@ pub(crate) fn get_extended_error(&self, data: UserSlicePtr) -> Result {
         Ok(())
     }
 
+    /// Attempts to fetch a work item from the thread-local queue. The behaviour if the queue is
+    /// empty depends on `wait`: if it is true, the function waits for some work to be queued (or a
+    /// signal); otherwise it returns indicating that none is available.
+    fn get_work_local(self: &Arc<Self>, wait: bool) -> Result<Option<DLArc<dyn DeliverToRead>>> {
+        {
+            let mut inner = self.inner.lock();
+            if inner.looper_need_return {
+                return Ok(inner.pop_work());
+            }
+        }
+
+        // Try once if the caller does not want to wait.
+        if !wait {
+            return self.inner.lock().pop_work().ok_or(EAGAIN).map(Some);
+        }
+
+        // Loop waiting only on the local queue (i.e., not registering with the process queue).
+        let mut inner = self.inner.lock();
+        loop {
+            if let Some(work) = inner.pop_work() {
+                return Ok(Some(work));
+            }
+
+            inner.looper_flags |= LOOPER_WAITING;
+            let signal_pending = self.work_condvar.wait(&mut inner);
+            inner.looper_flags &= !LOOPER_WAITING;
+
+            if signal_pending {
+                return Err(EINTR);
+            }
+            if inner.looper_need_return {
+                return Ok(None);
+            }
+        }
+    }
+
+    /// Attempts to fetch a work item from the thread-local queue, falling back to the process-wide
+    /// queue if none is available locally.
+    ///
+    /// This must only be called when the thread is not participating in a transaction chain. If it
+    /// is, the local version (`get_work_local`) should be used instead.
+    fn get_work(self: &Arc<Self>, wait: bool) -> Result<Option<DLArc<dyn DeliverToRead>>> {
+        // Try to get work from the thread's work queue, using only a local lock.
+        {
+            let mut inner = self.inner.lock();
+            if let Some(work) = inner.pop_work() {
+                return Ok(Some(work));
+            }
+            if inner.looper_need_return {
+                drop(inner);
+                return Ok(self.process.get_work());
+            }
+        }
+
+        // If the caller doesn't want to wait, try to grab work from the process queue.
+        //
+        // We know nothing will have been queued directly to the thread queue because it is not in
+        // a transaction and it is not in the process' ready list.
+        if !wait {
+            return self.process.get_work().ok_or(EAGAIN).map(Some);
+        }
+
+        // Get work from the process queue. If none is available, atomically register as ready.
+        let reg = match self.process.get_work_or_register(self) {
+            Either::Left(work) => return Ok(Some(work)),
+            Either::Right(reg) => reg,
+        };
+
+        let mut inner = self.inner.lock();
+        loop {
+            if let Some(work) = inner.pop_work() {
+                return Ok(Some(work));
+            }
+
+            inner.looper_flags |= LOOPER_WAITING | LOOPER_WAITING_PROC;
+            let signal_pending = self.work_condvar.wait(&mut inner);
+            inner.looper_flags &= !(LOOPER_WAITING | LOOPER_WAITING_PROC);
+
+            if signal_pending || inner.looper_need_return {
+                // We need to return now. We need to pull the thread off the list of ready threads
+                // (by dropping `reg`), then check the state again after it's off the list to
+                // ensure that something was not queued in the meantime. If something has been
+                // queued, we just return it (instead of the error).
+                drop(inner);
+                drop(reg);
+
+                let res = match self.inner.lock().pop_work() {
+                    Some(work) => Ok(Some(work)),
+                    None if signal_pending => Err(EINTR),
+                    None => Ok(None),
+                };
+                return res;
+            }
+        }
+    }
+
+    /// Push the provided work item to be delivered to user space via this thread.
+    ///
+    /// Returns whether the item was successfully pushed. This can only fail if the thread is dead,
+    /// in which case the work item is handed back to the caller.
+    #[allow(dead_code)]
+    pub(crate) fn push_work(&self, work: DLArc<dyn DeliverToRead>) -> PushWorkRes {
+        let sync = work.should_sync_wakeup();
+
+        let res = self.inner.lock().push_work(work);
+
+        if res.is_ok() {
+            if sync {
+                self.work_condvar.notify_sync();
+            } else {
+                self.work_condvar.notify_one();
+            }
+        }
+
+        res
+    }
+
     fn write(self: &Arc<Self>, req: &mut BinderWriteRead) -> Result {
         let write_start = req.write_buffer.wrapping_add(req.write_consumed);
         let write_len = req.write_size - req.write_consumed;
@@ -127,11 +338,19 @@ fn write(self: &Arc<Self>, req: &mut BinderWriteRead) -> Result {
         Ok(())
     }
 
-    fn read(self: &Arc<Self>, req: &mut BinderWriteRead, _wait: bool) -> Result {
+    fn read(self: &Arc<Self>, req: &mut BinderWriteRead, wait: bool) -> Result {
         let read_start = req.read_buffer.wrapping_add(req.read_consumed);
         let read_len = req.read_size - req.read_consumed;
         let mut writer = UserSlicePtr::new(read_start as _, read_len as _).writer();
-        let in_pool = self.inner.lock().is_looper();
+        let (in_pool, use_proc_queue) = {
+            let inner = self.inner.lock();
+            (inner.is_looper(), inner.should_use_process_work_queue())
+        };
+        let getter = if use_proc_queue {
+            Self::get_work
+        } else {
+            Self::get_work_local
+        };
 
         // Reserve some room at the beginning of the read buffer so that we can send a
         // BR_SPAWN_LOOPER if we need to.
@@ -145,13 +364,35 @@ fn read(self: &Arc<Self>, req: &mut BinderWriteRead, _wait: bool) -> Result {
         }
 
         // Loop doing work while there is room in the buffer.
-        #[allow(clippy::never_loop)]
+        let initial_len = writer.len();
         while writer.len() >= size_of::<bindings::binder_transaction_data_secctx>() + 4 {
-            // There is enough space in the output buffer to process another work item.
-            //
-            // However, we have not yet added work items to the driver, so we immediately break
-            // from the loop.
-            break;
+            match getter(self, wait && initial_len == writer.len()) {
+                Ok(Some(work)) => {
+                    let work_ty = work.debug_name();
+                    match work.into_arc().do_work(self, &mut writer) {
+                        Ok(true) => {}
+                        Ok(false) => break,
+                        Err(err) => {
+                            pr_warn!("Failure inside do_work of type {}.", work_ty);
+                            return Err(err);
+                        }
+                    }
+                }
+                Ok(None) => {
+                    break;
+                }
+                Err(err) => {
+                    // Propagate the error if we haven't written anything else.
+                    if err != EINTR && err != EAGAIN {
+                        pr_warn!("Failure in work getter: {:?}", err);
+                    }
+                    if initial_len == writer.len() {
+                        return Err(err);
+                    } else {
+                        break;
+                    }
+                }
+            }
         }
 
         req.read_consumed += read_len - writer.len() as u64;
@@ -178,6 +419,7 @@ pub(crate) fn write_read(self: &Arc<Self>, data: UserSlicePtr, wait: bool) -> Re
                 );
                 req.read_consumed = 0;
                 writer.write(&req)?;
+                self.inner.lock().looper_need_return = false;
                 return Err(err);
             }
         }
@@ -197,10 +439,32 @@ pub(crate) fn write_read(self: &Arc<Self>, data: UserSlicePtr, wait: bool) -> Re
 
         // Write the request back so that the consumed fields are visible to the caller.
         writer.write(&req)?;
+
+        self.inner.lock().looper_need_return = false;
+
         ret
     }
 
+    /// If the thread is waiting in `get_work` or `get_work_local`, make that call return.
+    pub(crate) fn exit_looper(&self) {
+        let mut inner = self.inner.lock();
+        let should_notify = inner.looper_flags & LOOPER_WAITING != 0;
+        if should_notify {
+            inner.looper_need_return = true;
+        }
+        drop(inner);
+
+        if should_notify {
+            self.work_condvar.notify_one();
+        }
+    }
+
     pub(crate) fn release(self: &Arc<Self>) {
         self.inner.lock().is_dead = true;
+
+        // Cancel all pending work items.
+        while let Ok(Some(work)) = self.get_work_local(false) {
+            work.into_arc().cancel();
+        }
     }
 }
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index b388f3d75d49..29108cd3377c 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -262,7 +262,7 @@ $(obj)/%.lst: $(src)/%.c FORCE
 # Compile Rust sources (.rs)
 # ---------------------------------------------------------------------------
 
-rust_allowed_features := const_maybe_uninit_zeroed,new_uninit,offset_of,allocator_api
+rust_allowed_features := const_maybe_uninit_zeroed,new_uninit,offset_of,allocator_api,receiver_trait
 
 # `--out-dir` is required to avoid temporaries being created by `rustc` in the
 # current working directory, which may be not accessible in the out-of-tree

-- 
2.42.0.820.g83a721a137-goog



* [PATCH RFC 05/20] rust_binder: add nodes and context managers
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (3 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 04/20] rust_binder: add work lists Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 06/20] rust_binder: add oneway transactions Alice Ryhl
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

An important concept for the binder driver is a "node", which is a type
of object defined by the binder driver that serves as a "binder server".
Whenever you send a transaction, the recipient will be a node.

Binder nodes can exist in many processes. The driver keeps track of this
using two fields in `Process`.

 * The `nodes` rbtree. This structure stores all nodes that this process
   is the primary owner of. The `process` field of the `Node` struct
   will point at the process that has it in its `nodes` rbtree.
 * The `node_refs` collection. This keeps track of the nodes from other
   processes that this process holds a reference to. A process can only
   send transactions to nodes in this collection.

From userspace, we also make a distinction between local nodes owned by
the process itself, and proxy nodes that are owned by a different
process. Generally, a process will refer to local nodes using the
address of the corresponding userspace object, and it will refer to
proxy nodes using a 32-bit id that the kernel assigns to the node. The
32-bit ids are local to each process and assigned consecutively (the
same node can have a different 32-bit id in each external process that
has a reference to it).

Additionally, a node can also be stored in the context as the "context
manager". There will only be one context manager for each context (that
is, for each file in `/dev/binderfs`). The context manager implicitly
has the id 0 in every other process, which means that all processes are
able to access it by default.
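
To make the per-process ids concrete, here is a small userspace model.
It is a sketch only: `NodeRefs`, `NodeId`, `insert` and `lookup` are
hypothetical names, nodes are represented by an arbitrary integer, and
two details are assumptions of this example rather than statements about
the driver: that ids are handed out starting at 1, and that inserting
the same node twice returns the existing id.

```rust
use std::collections::BTreeMap;

/// Hypothetical global identity of a node, for illustration only.
type NodeId = u64;

/// Model of one process's `node_refs` collection.
struct NodeRefs {
    by_handle: BTreeMap<u32, NodeId>,
    next_handle: u32,
}

impl NodeRefs {
    fn new() -> Self {
        Self {
            by_handle: BTreeMap::new(),
            // Assumption: id 0 is reserved for the context manager.
            next_handle: 1,
        }
    }

    /// Returns the process-local id for `node`, assigning the next
    /// consecutive id if this process has not seen the node before.
    fn insert(&mut self, node: NodeId) -> u32 {
        let existing = self
            .by_handle
            .iter()
            .find(|(_, n)| **n == node)
            .map(|(h, _)| *h);
        if let Some(handle) = existing {
            return handle;
        }
        let handle = self.next_handle;
        self.next_handle += 1;
        self.by_handle.insert(handle, node);
        handle
    }

    /// Resolves a process-local id. Id 0 always refers to the context
    /// manager of the context, if one has been set.
    fn lookup(&self, handle: u32, context_manager: Option<NodeId>) -> Option<NodeId> {
        if handle == 0 {
            context_manager
        } else {
            self.by_handle.get(&handle).copied()
        }
    }
}
```

Two `NodeRefs` values stand for two processes: the same node can end up
with id 2 in one and id 1 in the other, while id 0 resolves to the
context manager in both.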

In a later patch, we will add the ability to send nodes from one process
to another as part of a transaction. When this happens, the node is
added to the `node_refs` collection of the target process, and the
process will be able to start using it from then on. Except for the
context manager node, sending nodes in this way is the *only* way for a
process to obtain a reference to a node defined by another process.
Generally, userspace processes are expected to send their nodes to the
context manager process so that the context manager can pass them on to
clients that want to connect to them.

Binder nodes are reference counted through the kernel. This generally
happens in the following manner:

 1. Process A owns a binder node, which it stores in an allocation in
    userspace. This allocation is reference counted.
 2. The kernel owns a `Node` object that holds a reference count to the
    userspace object in process A. Changes to this reference count are
    communicated to process A using the commands BR_ACQUIRE, BR_RELEASE,
    BR_INCREFS, and BR_DECREFS.
 3. Other parts of the kernel own a `NodeRef` object that holds a
    reference count to the `Node` object. Destroying a `NodeRef` will
    decrement the refcount of the associated `Node` in the appropriate
    way.
 4. Process B owns a proxy node, which is a userspace object. Using a
    32-bit id, this proxy node refers to a `NodeRef` object in the
    kernel. Userspace uses the commands BC_ACQUIRE, BC_RELEASE,
    BC_INCREFS, and BC_DECREFS to tell the kernel to modify the
    refcount on the `NodeRef` object; when the proxy node is destroyed,
    it gives up its refcounts with BC_RELEASE and BC_DECREFS.

Via the above chain, process B can own a refcount that keeps a node in
process A alive.
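
The chain above can be sketched in plain Rust, outside the kernel. This
is a simplified model, not the code in this patch: `Rc<RefCell<..>>`
stands in for the kernel's locking, the `Node` here only carries its two
counts, and underflow is silently ignored. The point it tries to show is
that a `NodeRef` only touches the `Node` when its own count goes from 0
to 1 or from 1 to 0:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Simplified stand-in for the kernel-side `Node`: just the two counts.
#[derive(Debug, Default)]
struct Node {
    strong: usize,
    weak: usize,
}

/// Simplified stand-in for `NodeRef`: the refcounts that one process's
/// userspace has taken on a node.
struct NodeRef {
    node: Rc<RefCell<Node>>,
    strong_count: usize,
    weak_count: usize,
}

impl NodeRef {
    fn new(node: Rc<RefCell<Node>>) -> Self {
        Self { node, strong_count: 0, weak_count: 0 }
    }

    /// Applies one increment or decrement requested by userspace.
    ///
    /// Returns true when both counts have reached zero, meaning that the
    /// `NodeRef` can be removed.
    fn update(&mut self, inc: bool, strong: bool) -> bool {
        let (count, other_count) = if strong {
            (&mut self.strong_count, self.weak_count)
        } else {
            (&mut self.weak_count, self.strong_count)
        };
        let mut node = self.node.borrow_mut();
        let node_count = if strong { &mut node.strong } else { &mut node.weak };
        if inc {
            // Only the first increment is forwarded to the node.
            if *count == 0 {
                *node_count += 1;
            }
            *count += 1;
            false
        } else {
            if *count == 0 {
                // Assumption for this sketch: ignore unbalanced decrements.
                return false;
            }
            *count -= 1;
            if *count == 0 {
                // Only the last decrement is forwarded to the node.
                *node_count -= 1;
                other_count == 0
            } else {
                false
            }
        }
    }
}
```

In this model, two processes that each hold any number of strong
refcounts contribute exactly two strong counts to the `Node`, and the
node's count only reaches zero once both have released all of theirs.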

Things other than processes can also own a `NodeRef`. For
example, the context holds a `NodeRef` to the context manager node. This
keeps the node alive, even if there are no other processes with a
reference to it. In a later patch, we will see other instances of this -
for example, a transaction's allocation will also own a `NodeRef` to any
nodes embedded in it so that they don't go away while the process is
handling the transaction.

There is a potential race condition where the kernel sends BR_ACQUIRE
immediately followed by BR_RELEASE. If these are delivered to two
different userspace threads, then userspace might see them in reverse
order, which could make the refcount drop to zero when it shouldn't. To
prevent this from happening, userspace will respond to BR_ACQUIRE
commands with a BC_ACQUIRE_DONE after incrementing the refcount. The
kernel will postpone BR_RELEASE commands until after userspace has
responded with BC_ACQUIRE_DONE, which ensures that this race cannot
happen.
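
A minimal model of this postponement, covering only the strong count,
might look as follows. It is a sketch under simplifying assumptions
rather than the driver code: the `Cmd` enum and `StrongState` type are
invented for the example, and there is no locking or work queue. The
field names follow the ones used in node.rs in this patch.

```rust
/// Commands the kernel would deliver to the process that owns the node.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Cmd {
    BrAcquire,
    BrRelease,
}

/// Strong-count half of a node's state (hypothetical, for illustration).
#[derive(Default)]
struct StrongState {
    /// Number of strong refcounts held on the node inside the kernel.
    count: usize,
    /// Whether the owning process believes the kernel holds a refcount.
    has_count: bool,
    /// Number of BR_ACQUIRE commands not yet answered by BC_ACQUIRE_DONE.
    active_inc_refs: u8,
}

impl StrongState {
    /// Decides which command, if any, to deliver to the owner right now.
    fn do_work(&mut self) -> Option<Cmd> {
        let strong = self.count > 0;
        if strong && !self.has_count {
            self.has_count = true;
            self.active_inc_refs += 1;
            return Some(Cmd::BrAcquire);
        }
        // A release is only sent once every acquire has been acknowledged.
        if !strong && self.has_count && self.active_inc_refs == 0 {
            self.has_count = false;
            return Some(Cmd::BrRelease);
        }
        None
    }

    /// Handles BC_ACQUIRE_DONE from userspace. Returns true if a postponed
    /// release is now possible and `do_work` should be scheduled again.
    fn acquire_done(&mut self) -> bool {
        if self.active_inc_refs == 0 {
            return false;
        }
        self.active_inc_refs -= 1;
        self.active_inc_refs == 0 && self.count == 0 && self.has_count
    }
}
```

If the count drops to zero while a BR_ACQUIRE is still unacknowledged,
`do_work` returns nothing. The BR_RELEASE is only produced after
`acquire_done` has run, so userspace cannot observe the two commands in
the wrong order.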

Co-developed-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/context.rs     |  44 ++++-
 drivers/android/defs.rs        |  17 +-
 drivers/android/node.rs        | 377 +++++++++++++++++++++++++++++++++++++++++
 drivers/android/process.rs     | 343 ++++++++++++++++++++++++++++++++++++-
 drivers/android/rust_binder.rs |   2 +-
 drivers/android/thread.rs      |  13 +-
 rust/helpers.c                 |   6 +
 rust/kernel/security.rs        |   8 +
 8 files changed, 799 insertions(+), 11 deletions(-)

diff --git a/drivers/android/context.rs b/drivers/android/context.rs
index 630cb575d3ac..b5de9d98a6b0 100644
--- a/drivers/android/context.rs
+++ b/drivers/android/context.rs
@@ -3,11 +3,13 @@
 use kernel::{
     list::{HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks},
     prelude::*,
+    security,
     str::{CStr, CString},
     sync::{Arc, Mutex},
+    task::Kuid,
 };
 
-use crate::process::Process;
+use crate::{error::BinderError, node::NodeRef, process::Process};
 
 // This module defines the global variable containing the list of contexts. Since the
 // `kernel::sync` bindings currently don't support mutexes in globals, we use a temporary
@@ -70,6 +72,8 @@ pub(crate) struct ContextList {
 /// This struct keeps track of the processes using this context, and which process is the context
 /// manager.
 struct Manager {
+    node: Option<NodeRef>,
+    uid: Option<Kuid>,
     all_procs: List<Process>,
 }
 
@@ -103,6 +107,8 @@ pub(crate) fn new(name: &CStr) -> Result<Arc<Self>> {
             links <- ListLinks::new(),
             manager <- kernel::new_mutex!(Manager {
                 all_procs: List::new(),
+                node: None,
+                uid: None,
             }, "Context::manager"),
         }))?;
 
@@ -141,4 +147,40 @@ pub(crate) fn deregister_process(self: &Arc<Self>, proc: &Process) {
             self.manager.lock().all_procs.remove(proc);
         }
     }
+
+    pub(crate) fn set_manager_node(&self, node_ref: NodeRef) -> Result {
+        let mut manager = self.manager.lock();
+        if manager.node.is_some() {
+            pr_warn!("BINDER_SET_CONTEXT_MGR already set");
+            return Err(EBUSY);
+        }
+        security::binder_set_context_mgr(&node_ref.node.owner.cred)?;
+
+        // If the context manager has been set before, ensure that we use the same euid.
+        let caller_uid = Kuid::current_euid();
+        if let Some(ref uid) = manager.uid {
+            if *uid != caller_uid {
+                return Err(EPERM);
+            }
+        }
+
+        manager.node = Some(node_ref);
+        manager.uid = Some(caller_uid);
+        Ok(())
+    }
+
+    pub(crate) fn unset_manager_node(&self) {
+        let node_ref = self.manager.lock().node.take();
+        drop(node_ref);
+    }
+
+    pub(crate) fn get_manager_node(&self, strong: bool) -> Result<NodeRef, BinderError> {
+        self.manager
+            .lock()
+            .node
+            .as_ref()
+            .ok_or_else(BinderError::new_dead)?
+            .clone(strong)
+            .map_err(BinderError::from)
+    }
 }
diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs
index 86173add2616..8a83df975e61 100644
--- a/drivers/android/defs.rs
+++ b/drivers/android/defs.rs
@@ -19,14 +19,24 @@ macro_rules! pub_no_prefix {
     BR_NOOP,
     BR_SPAWN_LOOPER,
     BR_TRANSACTION_COMPLETE,
-    BR_OK
+    BR_OK,
+    BR_INCREFS,
+    BR_ACQUIRE,
+    BR_RELEASE,
+    BR_DECREFS
 );
 
 pub_no_prefix!(
     binder_driver_command_protocol_,
     BC_ENTER_LOOPER,
     BC_EXIT_LOOPER,
-    BC_REGISTER_LOOPER
+    BC_REGISTER_LOOPER,
+    BC_INCREFS,
+    BC_ACQUIRE,
+    BC_RELEASE,
+    BC_DECREFS,
+    BC_INCREFS_DONE,
+    BC_ACQUIRE_DONE
 );
 
 macro_rules! decl_wrapper {
@@ -54,6 +64,9 @@ fn deref_mut(&mut self) -> &mut Self::Target {
     };
 }
 
+decl_wrapper!(BinderNodeDebugInfo, bindings::binder_node_debug_info);
+decl_wrapper!(BinderNodeInfoForRef, bindings::binder_node_info_for_ref);
+decl_wrapper!(FlatBinderObject, bindings::flat_binder_object);
 decl_wrapper!(BinderWriteRead, bindings::binder_write_read);
 decl_wrapper!(BinderVersion, bindings::binder_version);
 decl_wrapper!(ExtendedError, bindings::binder_extended_error);
diff --git a/drivers/android/node.rs b/drivers/android/node.rs
new file mode 100644
index 000000000000..0ca4b72b8710
--- /dev/null
+++ b/drivers/android/node.rs
@@ -0,0 +1,377 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::{
+    io_buffer::IoBufferWriter,
+    list::{AtomicListArcTracker, ListArcSafe, TryNewListArc},
+    prelude::*,
+    sync::lock::{spinlock::SpinLockBackend, Guard},
+    sync::{Arc, LockedBy},
+    user_ptr::UserSlicePtrWriter,
+};
+
+use crate::{
+    defs::*,
+    process::{Process, ProcessInner},
+    thread::Thread,
+    DArc, DeliverToRead,
+};
+
+struct CountState {
+    /// The reference count.
+    count: usize,
+    /// Whether the process that owns this node thinks that we hold a refcount on it. (Note that
+    /// even if count is greater than one, we only increment it once in the owning process.)
+    has_count: bool,
+}
+
+impl CountState {
+    fn new() -> Self {
+        Self {
+            count: 0,
+            has_count: false,
+        }
+    }
+}
+
+struct NodeInner {
+    strong: CountState,
+    weak: CountState,
+    /// The number of active BR_INCREFS or BR_ACQUIRE operations. (should be maximum two)
+    ///
+    /// If this is non-zero, then we postpone any BR_RELEASE or BR_DECREFS notifications until the
+    /// active operations have ended. This avoids the situation where an increment and decrement
+    /// get reordered from userspace's perspective.
+    active_inc_refs: u8,
+}
+
+#[pin_data]
+pub(crate) struct Node {
+    pub(crate) global_id: u64,
+    ptr: usize,
+    cookie: usize,
+    #[allow(dead_code)]
+    pub(crate) flags: u32,
+    pub(crate) owner: Arc<Process>,
+    inner: LockedBy<NodeInner, ProcessInner>,
+    #[pin]
+    links_track: AtomicListArcTracker,
+}
+
+kernel::list::impl_list_arc_safe! {
+    impl ListArcSafe<0> for Node {
+        tracked_by links_track: AtomicListArcTracker;
+    }
+}
+
+impl Node {
+    pub(crate) fn new(
+        ptr: usize,
+        cookie: usize,
+        flags: u32,
+        owner: Arc<Process>,
+    ) -> impl PinInit<Self> {
+        use core::sync::atomic::{AtomicU64, Ordering};
+        static NEXT_ID: AtomicU64 = AtomicU64::new(1);
+
+        pin_init!(Self {
+            global_id: NEXT_ID.fetch_add(1, Ordering::Relaxed),
+            inner: LockedBy::new(
+                &owner.inner,
+                NodeInner {
+                    strong: CountState::new(),
+                    weak: CountState::new(),
+                    active_inc_refs: 0,
+                },
+            ),
+            ptr,
+            cookie,
+            flags,
+            owner,
+            links_track <- AtomicListArcTracker::new(),
+        })
+    }
+
+    pub(crate) fn get_id(&self) -> (usize, usize) {
+        (self.ptr, self.cookie)
+    }
+
+    pub(crate) fn inc_ref_done_locked(
+        &self,
+        _strong: bool,
+        owner_inner: &mut ProcessInner,
+    ) -> bool {
+        let inner = self.inner.access_mut(owner_inner);
+        if inner.active_inc_refs == 0 {
+            pr_err!("inc_ref_done called when no active inc_refs");
+            return false;
+        }
+
+        inner.active_inc_refs -= 1;
+        if inner.active_inc_refs == 0 {
+            // Having active inc_refs can inhibit dropping of ref-counts. Calculate whether we
+            // would send a refcount decrement, and if so, tell the caller to schedule us.
+            let strong = inner.strong.count > 0;
+            let has_strong = inner.strong.has_count;
+            let weak = strong || inner.weak.count > 0;
+            let has_weak = inner.weak.has_count;
+
+            let should_drop_weak = !weak && has_weak;
+            let should_drop_strong = !strong && has_strong;
+
+            // If we want to drop the ref-count again, tell the caller to schedule a work node for
+            // that.
+            should_drop_weak || should_drop_strong
+        } else {
+            false
+        }
+    }
+
+    pub(crate) fn update_refcount_locked(
+        &self,
+        inc: bool,
+        strong: bool,
+        count: usize,
+        owner_inner: &mut ProcessInner,
+    ) -> bool {
+        let is_dead = owner_inner.is_dead;
+        let inner = self.inner.access_mut(owner_inner);
+
+        // Get a reference to the state we'll update.
+        let state = if strong {
+            &mut inner.strong
+        } else {
+            &mut inner.weak
+        };
+
+        // Update the count and determine whether we need to push work.
+        if inc {
+            state.count += count;
+            !is_dead && !state.has_count
+        } else {
+            if state.count < count {
+                pr_err!("Failure: refcount underflow!");
+                return false;
+            }
+            state.count -= count;
+            !is_dead && state.count == 0 && state.has_count
+        }
+    }
+
+    pub(crate) fn update_refcount(self: &DArc<Self>, inc: bool, count: usize, strong: bool) {
+        self.owner
+            .inner
+            .lock()
+            .update_node_refcount(self, inc, strong, count, None);
+    }
+
+    pub(crate) fn populate_counts(
+        &self,
+        out: &mut BinderNodeInfoForRef,
+        guard: &Guard<'_, ProcessInner, SpinLockBackend>,
+    ) {
+        let inner = self.inner.access(guard);
+        out.strong_count = inner.strong.count as _;
+        out.weak_count = inner.weak.count as _;
+    }
+
+    pub(crate) fn populate_debug_info(
+        &self,
+        out: &mut BinderNodeDebugInfo,
+        guard: &Guard<'_, ProcessInner, SpinLockBackend>,
+    ) {
+        out.ptr = self.ptr as _;
+        out.cookie = self.cookie as _;
+        let inner = self.inner.access(guard);
+        if inner.strong.has_count {
+            out.has_strong_ref = 1;
+        }
+        if inner.weak.has_count {
+            out.has_weak_ref = 1;
+        }
+    }
+
+    pub(crate) fn force_has_count(&self, guard: &mut Guard<'_, ProcessInner, SpinLockBackend>) {
+        let inner = self.inner.access_mut(guard);
+        inner.strong.has_count = true;
+        inner.weak.has_count = true;
+    }
+
+    fn write(&self, writer: &mut UserSlicePtrWriter, code: u32) -> Result {
+        writer.write(&code)?;
+        writer.write(&self.ptr)?;
+        writer.write(&self.cookie)?;
+        Ok(())
+    }
+}
+
+impl DeliverToRead for Node {
+    fn do_work(
+        self: DArc<Self>,
+        _thread: &Thread,
+        writer: &mut UserSlicePtrWriter,
+    ) -> Result<bool> {
+        let mut owner_inner = self.owner.inner.lock();
+        let inner = self.inner.access_mut(&mut owner_inner);
+        let strong = inner.strong.count > 0;
+        let has_strong = inner.strong.has_count;
+        let weak = strong || inner.weak.count > 0;
+        let has_weak = inner.weak.has_count;
+
+        if weak && !has_weak {
+            inner.weak.has_count = true;
+            inner.active_inc_refs += 1;
+        }
+
+        if strong && !has_strong {
+            inner.strong.has_count = true;
+            inner.active_inc_refs += 1;
+        }
+
+        let no_active_inc_refs = inner.active_inc_refs == 0;
+        let should_drop_weak = no_active_inc_refs && (!weak && has_weak);
+        let should_drop_strong = no_active_inc_refs && (!strong && has_strong);
+        if should_drop_weak {
+            inner.weak.has_count = false;
+        }
+        if should_drop_strong {
+            inner.strong.has_count = false;
+        }
+        if no_active_inc_refs && !weak {
+            // Remove the node if there are no references to it.
+            owner_inner.remove_node(self.ptr);
+        }
+        drop(owner_inner);
+
+        if weak && !has_weak {
+            self.write(writer, BR_INCREFS)?;
+        }
+        if strong && !has_strong {
+            self.write(writer, BR_ACQUIRE)?;
+        }
+        if should_drop_strong {
+            self.write(writer, BR_RELEASE)?;
+        }
+        if should_drop_weak {
+            self.write(writer, BR_DECREFS)?;
+        }
+
+        Ok(true)
+    }
+
+    fn should_sync_wakeup(&self) -> bool {
+        false
+    }
+}
+
+/// Represents something that holds one or more ref-counts to a `Node`.
+///
+/// Whenever process A holds a refcount to a node owned by a different process B, then process A
+/// will store a `NodeRef` that refers to the `Node` in process B. When process A releases the
+/// refcount, we destroy the NodeRef, which decrements the ref-count on the `Node` in process B.
+///
+/// This type is also used for some other cases. For example, a transaction allocation holds a
+/// refcount on the target node, and this is implemented by storing a `NodeRef` in the allocation
+/// so that the destructor of the allocation will drop a refcount of the `Node`.
+pub(crate) struct NodeRef {
+    pub(crate) node: DArc<Node>,
+    /// How many times does this NodeRef hold a refcount on the Node?
+    strong_node_count: usize,
+    weak_node_count: usize,
+    /// How many times does userspace hold a refcount on this NodeRef?
+    strong_count: usize,
+    weak_count: usize,
+}
+
+impl NodeRef {
+    pub(crate) fn new(node: DArc<Node>, strong_count: usize, weak_count: usize) -> Self {
+        Self {
+            node,
+            strong_node_count: strong_count,
+            weak_node_count: weak_count,
+            strong_count,
+            weak_count,
+        }
+    }
+
+    pub(crate) fn absorb(&mut self, mut other: Self) {
+        assert!(
+            Arc::ptr_eq(&self.node, &other.node),
+            "absorb called with differing nodes"
+        );
+        self.strong_node_count += other.strong_node_count;
+        self.weak_node_count += other.weak_node_count;
+        self.strong_count += other.strong_count;
+        self.weak_count += other.weak_count;
+        other.strong_count = 0;
+        other.weak_count = 0;
+        other.strong_node_count = 0;
+        other.weak_node_count = 0;
+    }
+
+    pub(crate) fn clone(&self, strong: bool) -> Result<NodeRef> {
+        if strong && self.strong_count == 0 {
+            return Err(EINVAL);
+        }
+        Ok(self
+            .node
+            .owner
+            .inner
+            .lock()
+            .new_node_ref(self.node.clone(), strong, None))
+    }
+
+    /// Updates (increments or decrements) the number of references held against the node. If the
+    /// count being updated transitions from 0 to 1 or from 1 to 0, the node is notified by having
+    /// its `update_refcount` function called.
+    ///
+    /// Returns whether `self` should be removed (when both counts are zero).
+    pub(crate) fn update(&mut self, inc: bool, strong: bool) -> bool {
+        if strong && self.strong_count == 0 {
+            return false;
+        }
+        let (count, node_count, other_count) = if strong {
+            (
+                &mut self.strong_count,
+                &mut self.strong_node_count,
+                self.weak_count,
+            )
+        } else {
+            (
+                &mut self.weak_count,
+                &mut self.weak_node_count,
+                self.strong_count,
+            )
+        };
+        if inc {
+            if *count == 0 {
+                *node_count = 1;
+                self.node.update_refcount(true, 1, strong);
+            }
+            *count += 1;
+        } else {
+            *count -= 1;
+            if *count == 0 {
+                self.node.update_refcount(false, *node_count, strong);
+                *node_count = 0;
+                return other_count == 0;
+            }
+        }
+        false
+    }
+}
+
+impl Drop for NodeRef {
+    // This destructor is called conditionally from `Allocation::drop`. That branch is often
+    // mispredicted. Inlining this method call reduces the cost of those branch mispredictions.
+    #[inline(always)]
+    fn drop(&mut self) {
+        if self.strong_node_count > 0 {
+            self.node
+                .update_refcount(false, self.strong_node_count, true);
+        }
+        if self.weak_node_count > 0 {
+            self.node
+                .update_refcount(false, self.weak_node_count, false);
+        }
+    }
+}
diff --git a/drivers/android/process.rs b/drivers/android/process.rs
index 22662c7d388a..2d8aa29776a1 100644
--- a/drivers/android/process.rs
+++ b/drivers/android/process.rs
@@ -19,7 +19,7 @@
     mm,
     prelude::*,
     rbtree::RBTree,
-    sync::{lock::Guard, Arc, ArcBorrow, SpinLock},
+    sync::{lock::Guard, Arc, ArcBorrow, Mutex, SpinLock},
     task::Task,
     types::{ARef, Either},
     user_ptr::{UserSlicePtr, UserSlicePtrReader},
@@ -30,8 +30,9 @@
     context::Context,
     defs::*,
     error::BinderError,
+    node::{Node, NodeRef},
     thread::{PushWorkRes, Thread},
-    DLArc, DTRWrap, DeliverToRead,
+    DArc, DLArc, DTRWrap, DeliverToRead,
 };
 
 use core::mem::take;
@@ -41,9 +42,11 @@
 
 /// The fields of `Process` protected by the spinlock.
 pub(crate) struct ProcessInner {
+    is_manager: bool,
     pub(crate) is_dead: bool,
     threads: RBTree<i32, Arc<Thread>>,
     ready_threads: List<Thread>,
+    nodes: RBTree<usize, DArc<Node>>,
     work: List<DTRWrap<dyn DeliverToRead>>,
 
     /// The number of requested threads that haven't registered yet.
@@ -60,9 +63,11 @@ pub(crate) struct ProcessInner {
 impl ProcessInner {
     fn new() -> Self {
         Self {
+            is_manager: false,
             is_dead: false,
             threads: RBTree::new(),
             ready_threads: List::new(),
+            nodes: RBTree::new(),
             work: List::new(),
             requested_thread_count: 0,
             max_threads: 0,
@@ -80,7 +85,6 @@ fn new() -> Self {
     /// the caller so that the caller can drop it after releasing the inner process lock. This is
     /// necessary since the destructor of `Transaction` will take locks that can't necessarily be
     /// taken while holding the inner process lock.
-    #[allow(dead_code)]
     pub(crate) fn push_work(
         &mut self,
         work: DLArc<dyn DeliverToRead>,
@@ -102,6 +106,81 @@ pub(crate) fn push_work(
         }
     }
 
+    pub(crate) fn remove_node(&mut self, ptr: usize) {
+        self.nodes.remove(&ptr);
+    }
+
+    /// Updates the reference count on the given node.
+    pub(crate) fn update_node_refcount(
+        &mut self,
+        node: &DArc<Node>,
+        inc: bool,
+        strong: bool,
+        count: usize,
+        othread: Option<&Thread>,
+    ) {
+        let push = node.update_refcount_locked(inc, strong, count, self);
+
+        // If we decided that we need to push work, push either to the process or to a thread if
+        // one is specified.
+        if push {
+            // It's not a problem if creating the ListArc fails, because that just means that
+            // it is already queued to a worklist.
+            if let Some(node) = ListArc::try_from_arc_or_drop(node.clone()) {
+                if let Some(thread) = othread {
+                    thread.push_work_deferred(node);
+                } else {
+                    let _ = self.push_work(node);
+                    // Nothing to do: `push_work` may fail if the process is dead, but that's ok as in
+                    // that case, it doesn't care about the notification.
+                }
+            }
+        }
+    }
+
+    pub(crate) fn new_node_ref(
+        &mut self,
+        node: DArc<Node>,
+        strong: bool,
+        thread: Option<&Thread>,
+    ) -> NodeRef {
+        self.update_node_refcount(&node, true, strong, 1, thread);
+        let strong_count = if strong { 1 } else { 0 };
+        NodeRef::new(node, strong_count, 1 - strong_count)
+    }
+
+    /// Returns an existing node with the given pointer and cookie, if one exists.
+    ///
+    /// Returns an error if a node with the given pointer but a different cookie exists.
+    fn get_existing_node(&self, ptr: usize, cookie: usize) -> Result<Option<DArc<Node>>> {
+        match self.nodes.get(&ptr) {
+            None => Ok(None),
+            Some(node) => {
+                let (_, node_cookie) = node.get_id();
+                if node_cookie == cookie {
+                    Ok(Some(node.clone()))
+                } else {
+                    Err(EINVAL)
+                }
+            }
+        }
+    }
+
+    /// Returns a reference to an existing node with the given pointer and cookie. It requires a
+    /// mutable reference because it needs to increment the ref count on the node, which may
+    /// require pushing work to the work queue (to notify userspace of 0 to 1 transitions).
+    fn get_existing_node_ref(
+        &mut self,
+        ptr: usize,
+        cookie: usize,
+        strong: bool,
+        thread: Option<&Thread>,
+    ) -> Result<Option<NodeRef>> {
+        Ok(self
+            .get_existing_node(ptr, cookie)?
+            .map(|node| self.new_node_ref(node, strong, thread)))
+    }
+
     fn register_thread(&mut self) -> bool {
         if self.requested_thread_count == 0 {
             return false;
@@ -113,6 +192,30 @@ fn register_thread(&mut self) -> bool {
     }
 }
 
+struct NodeRefInfo {
+    node_ref: NodeRef,
+}
+
+impl NodeRefInfo {
+    fn new(node_ref: NodeRef) -> Self {
+        Self { node_ref }
+    }
+}
+
+struct ProcessNodeRefs {
+    by_handle: RBTree<u32, NodeRefInfo>,
+    by_global_id: RBTree<u64, u32>,
+}
+
+impl ProcessNodeRefs {
+    fn new() -> Self {
+        Self {
+            by_handle: RBTree::new(),
+            by_global_id: RBTree::new(),
+        }
+    }
+}
+
 /// A process using binder.
 ///
 /// Strictly speaking, there can be multiple of these per process. There is one for each binder fd
@@ -131,6 +234,11 @@ pub(crate) struct Process {
     #[pin]
     pub(crate) inner: SpinLock<ProcessInner>,
 
+    // Node references are in a different lock to avoid recursive acquisition when
+    // incrementing/decrementing a node in another process.
+    #[pin]
+    node_refs: Mutex<ProcessNodeRefs>,
+
     // Work node for deferred work item.
     #[pin]
     defer_work: Work<Process>,
@@ -182,6 +290,7 @@ fn new(ctx: Arc<Context>, cred: ARef<Credential>) -> Result<Arc<Self>> {
             ctx,
             cred,
             inner <- kernel::new_spinlock!(ProcessInner::new(), "Process::inner"),
+            node_refs <- kernel::new_mutex!(ProcessNodeRefs::new(), "Process::node_refs"),
             task: kernel::current!().group_leader().into(),
             defer_work <- kernel::new_work!("Process::defer_work"),
             links <- ListLinks::new(),
@@ -241,6 +350,167 @@ fn get_thread(self: ArcBorrow<'_, Self>, id: i32) -> Result<Arc<Thread>> {
         Ok(ta)
     }
 
+    fn set_as_manager(
+        self: ArcBorrow<'_, Self>,
+        info: Option<FlatBinderObject>,
+        thread: &Thread,
+    ) -> Result {
+        let (ptr, cookie, flags) = if let Some(obj) = info {
+            (
+                // SAFETY: The object type for this ioctl is implicitly `BINDER_TYPE_BINDER`, so it
+                // is safe to access the `binder` field.
+                unsafe { obj.__bindgen_anon_1.binder },
+                obj.cookie,
+                obj.flags,
+            )
+        } else {
+            (0, 0, 0)
+        };
+        let node_ref = self.get_node(ptr as _, cookie as _, flags as _, true, Some(thread))?;
+        let node = node_ref.node.clone();
+        self.ctx.set_manager_node(node_ref)?;
+        self.inner.lock().is_manager = true;
+
+        // Force the state of the node to prevent the delivery of acquire/increfs.
+        let mut owner_inner = node.owner.inner.lock();
+        node.force_has_count(&mut owner_inner);
+        Ok(())
+    }
+
+    pub(crate) fn get_node(
+        self: ArcBorrow<'_, Self>,
+        ptr: usize,
+        cookie: usize,
+        flags: u32,
+        strong: bool,
+        thread: Option<&Thread>,
+    ) -> Result<NodeRef> {
+        // Try to find an existing node.
+        {
+            let mut inner = self.inner.lock();
+            if let Some(node) = inner.get_existing_node_ref(ptr, cookie, strong, thread)? {
+                return Ok(node);
+            }
+        }
+
+        // Allocate the node before reacquiring the lock.
+        let node = DTRWrap::arc_pin_init(Node::new(ptr, cookie, flags, self.into()))?.into_arc();
+        let rbnode = RBTree::try_allocate_node(ptr, node.clone())?;
+        let mut inner = self.inner.lock();
+        if let Some(node) = inner.get_existing_node_ref(ptr, cookie, strong, thread)? {
+            return Ok(node);
+        }
+
+        inner.nodes.insert(rbnode);
+        Ok(inner.new_node_ref(node, strong, thread))
+    }
+
+    pub(crate) fn insert_or_update_handle(
+        &self,
+        node_ref: NodeRef,
+        is_mananger: bool,
+    ) -> Result<u32> {
+        {
+            let mut refs = self.node_refs.lock();
+
+            // Do a lookup before inserting.
+            if let Some(handle_ref) = refs.by_global_id.get(&node_ref.node.global_id) {
+                let handle = *handle_ref;
+                let info = refs.by_handle.get_mut(&handle).unwrap();
+                info.node_ref.absorb(node_ref);
+                return Ok(handle);
+            }
+        }
+
+        // Reserve memory for tree nodes.
+        let reserve1 = RBTree::try_reserve_node()?;
+        let reserve2 = RBTree::try_reserve_node()?;
+
+        let mut refs = self.node_refs.lock();
+
+        // Do a lookup again as node may have been inserted before the lock was reacquired.
+        if let Some(handle_ref) = refs.by_global_id.get(&node_ref.node.global_id) {
+            let handle = *handle_ref;
+            let info = refs.by_handle.get_mut(&handle).unwrap();
+            info.node_ref.absorb(node_ref);
+            return Ok(handle);
+        }
+
+        // Find id.
+        let mut target: u32 = if is_mananger { 0 } else { 1 };
+        for handle in refs.by_handle.keys() {
+            if *handle > target {
+                break;
+            }
+            if *handle == target {
+                target = target.checked_add(1).ok_or(ENOMEM)?;
+            }
+        }
+
+        // Ensure the process is still alive while we insert a new reference.
+        let inner = self.inner.lock();
+        if inner.is_dead {
+            return Err(ESRCH);
+        }
+        refs.by_global_id
+            .insert(reserve1.into_node(node_ref.node.global_id, target));
+        refs.by_handle
+            .insert(reserve2.into_node(target, NodeRefInfo::new(node_ref)));
+        Ok(target)
+    }
+
+    pub(crate) fn get_node_from_handle(&self, handle: u32, strong: bool) -> Result<NodeRef> {
+        self.node_refs
+            .lock()
+            .by_handle
+            .get(&handle)
+            .ok_or(ENOENT)?
+            .node_ref
+            .clone(strong)
+    }
+
+    pub(crate) fn update_ref(&self, handle: u32, inc: bool, strong: bool) -> Result {
+        if inc && handle == 0 {
+            if let Ok(node_ref) = self.ctx.get_manager_node(strong) {
+                if core::ptr::eq(self, &*node_ref.node.owner) {
+                    return Err(EINVAL);
+                }
+                let _ = self.insert_or_update_handle(node_ref, true);
+                return Ok(());
+            }
+        }
+
+        // To preserve original binder behaviour, we only fail requests where the manager tries to
+        // increment references on itself.
+        let mut refs = self.node_refs.lock();
+        if let Some(info) = refs.by_handle.get_mut(&handle) {
+            if info.node_ref.update(inc, strong) {
+                // Remove reference from process tables.
+                let id = info.node_ref.node.global_id;
+                refs.by_handle.remove(&handle);
+                refs.by_global_id.remove(&id);
+            }
+        }
+        Ok(())
+    }
+
+    pub(crate) fn inc_ref_done(&self, reader: &mut UserSlicePtrReader, strong: bool) -> Result {
+        let ptr = reader.read::<usize>()?;
+        let cookie = reader.read::<usize>()?;
+        let mut inner = self.inner.lock();
+        if let Ok(Some(node)) = inner.get_existing_node(ptr, cookie) {
+            if node.inc_ref_done_locked(strong, &mut inner) {
+                // It's not a problem if creating the ListArc fails, because that just means that
+                // it is already queued to a worklist.
+                if let Some(node) = ListArc::try_from_arc_or_drop(node) {
+                    // This only fails if the process is dead.
+                    let _ = inner.push_work(node);
+                }
+            }
+        }
+        Ok(())
+    }
+
     fn version(&self, data: UserSlicePtr) -> Result {
         data.writer().write(&BinderVersion::current())
     }
@@ -258,6 +528,57 @@ fn set_max_threads(&self, max: u32) {
         self.inner.lock().max_threads = max;
     }
 
+    fn get_node_debug_info(&self, data: UserSlicePtr) -> Result {
+        let (mut reader, mut writer) = data.reader_writer();
+
+        // Read the starting point.
+        let ptr = reader.read::<BinderNodeDebugInfo>()?.ptr as usize;
+        let mut out = BinderNodeDebugInfo::default();
+
+        {
+            let inner = self.inner.lock();
+            for (node_ptr, node) in &inner.nodes {
+                if *node_ptr > ptr {
+                    node.populate_debug_info(&mut out, &inner);
+                    break;
+                }
+            }
+        }
+
+        writer.write(&out)
+    }
+
+    fn get_node_info_from_ref(&self, data: UserSlicePtr) -> Result {
+        let (mut reader, mut writer) = data.reader_writer();
+        let mut out = reader.read::<BinderNodeInfoForRef>()?;
+
+        if out.strong_count != 0
+            || out.weak_count != 0
+            || out.reserved1 != 0
+            || out.reserved2 != 0
+            || out.reserved3 != 0
+        {
+            return Err(EINVAL);
+        }
+
+        // Only the context manager is allowed to use this ioctl.
+        if !self.inner.lock().is_manager {
+            return Err(EPERM);
+        }
+
+        let node_ref = self
+            .get_node_from_handle(out.handle, true)
+            .or(Err(EINVAL))?;
+        // Get the counts from the node.
+        {
+            let owner_inner = node_ref.node.owner.inner.lock();
+            node_ref.node.populate_counts(&mut out, &owner_inner);
+        }
+
+        // Write the result back.
+        writer.write(&out)
+    }
+
     pub(crate) fn needs_thread(&self) -> bool {
         let mut inner = self.inner.lock();
         let ret = inner.requested_thread_count == 0
@@ -277,7 +598,15 @@ fn deferred_flush(&self) {
     }
 
     fn deferred_release(self: Arc<Self>) {
-        self.inner.lock().is_dead = true;
+        let is_manager = {
+            let mut inner = self.inner.lock();
+            inner.is_dead = true;
+            inner.is_manager
+        };
+
+        if is_manager {
+            self.ctx.unset_manager_node();
+        }
 
         self.ctx.deregister_process(&self);
 
@@ -327,6 +656,10 @@ fn write(
         match cmd {
             bindings::BINDER_SET_MAX_THREADS => this.set_max_threads(reader.read()?),
             bindings::BINDER_THREAD_EXIT => this.remove_thread(thread),
+            bindings::BINDER_SET_CONTEXT_MGR => this.set_as_manager(None, &thread)?,
+            bindings::BINDER_SET_CONTEXT_MGR_EXT => {
+                this.set_as_manager(Some(reader.read()?), &thread)?
+            }
             _ => return Err(EINVAL),
         }
         Ok(0)
@@ -342,6 +675,8 @@ fn read_write(
         let blocking = (file.flags() & file::flags::O_NONBLOCK) == 0;
         match cmd {
             bindings::BINDER_WRITE_READ => thread.write_read(data, blocking)?,
+            bindings::BINDER_GET_NODE_DEBUG_INFO => this.get_node_debug_info(data)?,
+            bindings::BINDER_GET_NODE_INFO_FOR_REF => this.get_node_info_from_ref(data)?,
             bindings::BINDER_VERSION => this.version(data)?,
             bindings::BINDER_GET_EXTENDED_ERROR => thread.get_extended_error(data)?,
             _ => return Err(EINVAL),
diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs
index 55d475737cef..2ef37cc2c556 100644
--- a/drivers/android/rust_binder.rs
+++ b/drivers/android/rust_binder.rs
@@ -19,6 +19,7 @@
 mod context;
 mod defs;
 mod error;
+mod node;
 mod process;
 mod thread;
 
@@ -102,7 +103,6 @@ fn arc_try_new(val: T) -> Result<DLArc<T>, alloc::alloc::AllocError> {
         .map_err(|_| alloc::alloc::AllocError)
     }
 
-    #[allow(dead_code)]
     fn arc_pin_init(init: impl PinInit<T>) -> Result<DLArc<T>, kernel::error::Error> {
         ListArc::pin_init(pin_init!(Self {
             links <- ListLinksSelfPtr::new(),
diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs
index a12c271a4e8f..f7d62fc380e5 100644
--- a/drivers/android/thread.rs
+++ b/drivers/android/thread.rs
@@ -93,7 +93,6 @@ fn pop_work(&mut self) -> Option<DLArc<dyn DeliverToRead>> {
         ret
     }
 
-    #[allow(dead_code)]
     fn push_work(&mut self, work: DLArc<dyn DeliverToRead>) -> PushWorkRes {
         if self.is_dead {
             PushWorkRes::FailedDead(work)
@@ -106,7 +105,6 @@ fn push_work(&mut self, work: DLArc<dyn DeliverToRead>) -> PushWorkRes {
 
     /// Used to push work items that do not need to be processed immediately and can wait until the
     /// thread gets another work item.
-    #[allow(dead_code)]
     fn push_work_deferred(&mut self, work: DLArc<dyn DeliverToRead>) {
         self.work_list.push_back(work);
     }
@@ -294,7 +292,6 @@ fn get_work(self: &Arc<Self>, wait: bool) -> Result<Option<DLArc<dyn DeliverToRe
     ///
     /// Returns whether the item was successfully pushed. This can only fail if the work item is
     /// already in a work list.
-    #[allow(dead_code)]
     pub(crate) fn push_work(&self, work: DLArc<dyn DeliverToRead>) -> PushWorkRes {
         let sync = work.should_sync_wakeup();
 
@@ -311,6 +308,10 @@ pub(crate) fn push_work(&self, work: DLArc<dyn DeliverToRead>) -> PushWorkRes {
         res
     }
 
+    pub(crate) fn push_work_deferred(&self, work: DLArc<dyn DeliverToRead>) {
+        self.inner.lock().push_work_deferred(work);
+    }
+
     fn write(self: &Arc<Self>, req: &mut BinderWriteRead) -> Result {
         let write_start = req.write_buffer.wrapping_add(req.write_consumed);
         let write_len = req.write_size - req.write_consumed;
@@ -320,6 +321,12 @@ fn write(self: &Arc<Self>, req: &mut BinderWriteRead) -> Result {
             let before = reader.len();
             let cmd = reader.read::<u32>()?;
             match cmd {
+                BC_INCREFS => self.process.update_ref(reader.read()?, true, false)?,
+                BC_ACQUIRE => self.process.update_ref(reader.read()?, true, true)?,
+                BC_RELEASE => self.process.update_ref(reader.read()?, false, true)?,
+                BC_DECREFS => self.process.update_ref(reader.read()?, false, false)?,
+                BC_INCREFS_DONE => self.process.inc_ref_done(&mut reader, false)?,
+                BC_ACQUIRE_DONE => self.process.inc_ref_done(&mut reader, true)?,
                 BC_REGISTER_LOOPER => {
                     let valid = self.process.register_thread();
                     self.inner.lock().looper_register(valid);
diff --git a/rust/helpers.c b/rust/helpers.c
index 2b436a7199e9..adb94ace2334 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -329,6 +329,12 @@ void rust_helper_security_release_secctx(char *secdata, u32 seclen)
 	security_release_secctx(secdata, seclen);
 }
 EXPORT_SYMBOL_GPL(rust_helper_security_release_secctx);
+
+int rust_helper_security_binder_set_context_mgr(const struct cred *mgr)
+{
+	return security_binder_set_context_mgr(mgr);
+}
+EXPORT_SYMBOL_GPL(rust_helper_security_binder_set_context_mgr);
 #endif
 
 /*
diff --git a/rust/kernel/security.rs b/rust/kernel/security.rs
index 69c10ed89a57..f94c3c37560d 100644
--- a/rust/kernel/security.rs
+++ b/rust/kernel/security.rs
@@ -6,9 +6,17 @@
 
 use crate::{
     bindings,
+    cred::Credential,
     error::{to_result, Result},
 };
 
+/// Calls the security modules to determine if the given task can become the manager of a binder
+/// context.
+pub fn binder_set_context_mgr(mgr: &Credential) -> Result {
+    // SAFETY: `mgr.0` is valid because the shared reference guarantees a nonzero refcount.
+    to_result(unsafe { bindings::security_binder_set_context_mgr(mgr.0.get()) })
+}
+
 /// A security context string.
 ///
 /// The struct has the invariant that it always contains a valid security context.

-- 
2.42.0.820.g83a721a137-goog



* [PATCH RFC 06/20] rust_binder: add oneway transactions
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (4 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 05/20] rust_binder: add nodes and context managers Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 07/20] rust_binder: add epoll support Alice Ryhl
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

Add support for sending oneway transactions using the binder driver.

To receive transactions, the process must first use mmap to create a
memory region for holding the contents of incoming transactions. The
driver will manage the resulting memory using two files: `allocation.rs`
and `range_alloc.rs`.

The `allocation.rs` file is responsible for actually managing the
mmap'ed region of memory and has methods for writing to it.
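
As a rough illustration of how a write into the region is split across
pages, here is a userspace sketch modelled on `Allocation::iterate` in
the diff below. The name `page_chunks` and the fixed 4 KiB page size are
assumptions made for this example; the driver uses
`bindings::PAGE_SHIFT` and invokes a callback per page instead of
returning a list, and it also checks the range against the allocation
size first.

```rust
/// Assumed page size for this sketch (4 KiB).
const PAGE_SHIFT: usize = 12;
const PAGE_SIZE: usize = 1 << PAGE_SHIFT;

/// Splits the byte range `[offset, offset + size)` into per-page chunks.
/// Each chunk is `(page_index, offset_in_page, len)`.
fn page_chunks(mut offset: usize, mut size: usize) -> Vec<(usize, usize, usize)> {
    let mut chunks = Vec::new();
    let mut page_index = offset >> PAGE_SHIFT;
    // Keep only the offset within the first page.
    offset &= PAGE_SIZE - 1;
    while size > 0 {
        // Never cross a page boundary within one chunk.
        let len = size.min(PAGE_SIZE - offset);
        chunks.push((page_index, offset, len));
        size -= len;
        page_index += 1;
        // Every page after the first is written from its start.
        offset = 0;
    }
    chunks
}
```

A 10-byte write starting 6 bytes before a page boundary therefore
becomes two chunks: 6 bytes at the end of the first page and 4 bytes at
the start of the next.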

The `range_alloc.rs` file contains a data structure for tracking where
in the mmap we are storing different things. It doesn't actually touch
the mmap itself. Basically, it's a data structure that stores a set of
non-overlapping intervals (the allocations) and it is able to find the
smallest offset where the next X bytes are free and allocate that
region.
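
The description above can be sketched in userspace as follows. This is
not the driver code: `Intervals`, `reserve` and `release` are
hypothetical names, `std::collections::BTreeMap` stands in for the
kernel `RBTree`, and the lookup is a plain linear scan for the smallest
offset that fits. The `find_best_match` function in the diff below
instead consults a second tree keyed by `(size, offset)`.

```rust
use std::collections::BTreeMap;

/// Hypothetical model: maps offset -> (size, in_use) and covers the
/// whole region without gaps or overlaps.
struct Intervals {
    ranges: BTreeMap<usize, (usize, bool)>,
}

impl Intervals {
    fn new(size: usize) -> Self {
        let mut ranges = BTreeMap::new();
        ranges.insert(0, (size, false));
        Self { ranges }
    }

    /// Reserves `size` bytes at the smallest offset with enough free space.
    fn reserve(&mut self, size: usize) -> Option<usize> {
        if size == 0 {
            return None; // Simplification of this sketch only.
        }
        let mut hit = None;
        for (&off, &(len, in_use)) in &self.ranges {
            if !in_use && len >= size {
                hit = Some((off, len));
                break;
            }
        }
        let (offset, found) = hit?;
        self.ranges.insert(offset, (size, true));
        if found > size {
            // Split the range: the tail stays free.
            self.ranges.insert(offset + size, (found - size, false));
        }
        Some(offset)
    }

    /// Frees the range starting at `offset`, merging with free neighbours.
    fn release(&mut self, offset: usize) -> bool {
        let mut size = match self.ranges.get(&offset) {
            Some(&(len, true)) => len,
            _ => return false, // Unknown offset or already free.
        };
        let mut start = offset;
        // Merge the next range into this one if it is free.
        let next = self.ranges.get(&(offset + size)).copied();
        if let Some((next_len, false)) = next {
            self.ranges.remove(&(offset + size));
            size += next_len;
        }
        // Merge this range into the previous one if that is free.
        let prev = self.ranges.range(..offset).next_back().map(|(&o, &v)| (o, v));
        if let Some((prev_off, (prev_len, false))) = prev {
            self.ranges.remove(&offset);
            start = prev_off;
            size += prev_len;
        }
        self.ranges.insert(start, (size, false));
        true
    }
}
```

Merging on release keeps adjacent free ranges coalesced, which is the
same invariant `reservation_abort` maintains in the diff below.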

Other than that, this patch introduces a `Transaction` struct that
stores the information related to a transaction, and adds the necessary
infrastructure to send and receive them. This uses the work lists
introduced in a previous patch to deliver incoming transactions.

There are several different possible implementations of the range
allocator, and we have implemented several of them. The simplest
possible implementation is to use a linked list to store the allocations
and free regions sorted by address. Another possibility is to store the
same thing using a red-black tree. The red-black tree is preferable to
the linked list because its accesses are logarithmic rather than linear.

This RFC implements the range allocator using a red-black tree.

We have also looked into replacing the red-black tree with an XArray.
However, this is challenging because it doesn't have a good way to look
up the smallest free region whose size is at least some lower bound. You
can use `xa_find`, but there could be many free regions of the same
size, which makes it a challenge to maintain this information correctly.
We also run into issues with having to allocate while holding a lock.
Finally, the XArray is not optimized for this use-case: all of the
indices are going to have gaps between them.
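
For comparison, an ordered tree makes the "smallest free region of at
least N bytes" lookup straightforward when the key is the pair
`(size, offset)`, which is how `free_tree` is keyed in the diff below.
The following userspace sketch uses `std::collections::BTreeSet` in
place of the kernel `RBTree`; `best_fit` is a hypothetical helper name.

```rust
use std::collections::BTreeSet;

/// Returns the smallest free region holding at least `size` bytes,
/// as `(size, offset)`. Regions of equal size remain distinct keys
/// because the offset is part of the key.
fn best_fit(free: &BTreeSet<(usize, usize)>, size: usize) -> Option<(usize, usize)> {
    // Tuples order lexicographically, so the first key at or after
    // `(size, 0)` has the smallest sufficient size, and among regions
    // of that size, the lowest offset.
    free.range((size, 0)..).next().copied()
}
```

Because the offset breaks ties, several free regions of the same size
can coexist in the set without any extra bookkeeping.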

Co-developed-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Co-developed-by: Matt Gilbride <mattgilbride@google.com>
Signed-off-by: Matt Gilbride <mattgilbride@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/allocation.rs  | 140 +++++++++++++++++
 drivers/android/defs.rs        |  39 +++++
 drivers/android/error.rs       |  10 ++
 drivers/android/node.rs        |   1 -
 drivers/android/process.rs     | 171 ++++++++++++++++++--
 drivers/android/range_alloc.rs | 344 +++++++++++++++++++++++++++++++++++++++++
 drivers/android/rust_binder.rs |  54 +++++++
 drivers/android/thread.rs      | 208 +++++++++++++++++++++++--
 drivers/android/transaction.rs | 163 +++++++++++++++++++
 rust/helpers.c                 |   7 +
 rust/kernel/security.rs        |   7 +
 11 files changed, 1123 insertions(+), 21 deletions(-)

diff --git a/drivers/android/allocation.rs b/drivers/android/allocation.rs
new file mode 100644
index 000000000000..1ab0f254fded
--- /dev/null
+++ b/drivers/android/allocation.rs
@@ -0,0 +1,140 @@
+// SPDX-License-Identifier: GPL-2.0
+use core::mem::size_of_val;
+
+use kernel::{bindings, pages::Pages, prelude::*, sync::Arc, user_ptr::UserSlicePtrReader};
+
+use crate::{node::NodeRef, process::Process};
+
+#[derive(Default)]
+pub(crate) struct AllocationInfo {
+    /// The target node of the transaction this allocation is associated with.
+    /// Not set for replies.
+    pub(crate) target_node: Option<NodeRef>,
+    /// Zero the data in the buffer on free.
+    pub(crate) clear_on_free: bool,
+}
+
+/// Represents an allocation that the kernel is currently using.
+///
+/// When allocations are idle, the range allocator holds the data related to them.
+pub(crate) struct Allocation {
+    pub(crate) offset: usize,
+    size: usize,
+    pub(crate) ptr: usize,
+    pages: Arc<Vec<Pages<0>>>,
+    pub(crate) process: Arc<Process>,
+    allocation_info: Option<AllocationInfo>,
+    free_on_drop: bool,
+}
+
+impl Allocation {
+    pub(crate) fn new(
+        process: Arc<Process>,
+        offset: usize,
+        size: usize,
+        ptr: usize,
+        pages: Arc<Vec<Pages<0>>>,
+    ) -> Self {
+        Self {
+            process,
+            offset,
+            size,
+            ptr,
+            pages,
+            allocation_info: None,
+            free_on_drop: true,
+        }
+    }
+
+    fn iterate<T>(&self, mut offset: usize, mut size: usize, mut cb: T) -> Result
+    where
+        T: FnMut(&Pages<0>, usize, usize) -> Result,
+    {
+        // Check that the request is within the buffer.
+        if offset.checked_add(size).ok_or(EINVAL)? > self.size {
+            return Err(EINVAL);
+        }
+        offset += self.offset;
+        let mut page_index = offset >> bindings::PAGE_SHIFT;
+        offset &= (1 << bindings::PAGE_SHIFT) - 1;
+        while size > 0 {
+            let available = core::cmp::min(size, (1 << bindings::PAGE_SHIFT) - offset);
+            cb(&self.pages[page_index], offset, available)?;
+            size -= available;
+            page_index += 1;
+            offset = 0;
+        }
+        Ok(())
+    }
+
+    pub(crate) fn copy_into(
+        &self,
+        reader: &mut UserSlicePtrReader,
+        offset: usize,
+        size: usize,
+    ) -> Result {
+        self.iterate(offset, size, |page, offset, to_copy| {
+            page.copy_into_page(reader, offset, to_copy)
+        })
+    }
+
+    pub(crate) fn write<T: ?Sized>(&self, offset: usize, obj: &T) -> Result {
+        let mut obj_offset = 0;
+        self.iterate(offset, size_of_val(obj), |page, offset, to_copy| {
+            // SAFETY: The sum of `obj_offset` and `to_copy` is bounded by the size of `obj`.
+            let obj_ptr = unsafe { (obj as *const T as *const u8).add(obj_offset) };
+            // SAFETY: We have a reference to the object, so the pointer is valid.
+            unsafe { page.write(obj_ptr, offset, to_copy) }?;
+            obj_offset += to_copy;
+            Ok(())
+        })
+    }
+
+    pub(crate) fn fill_zero(&self) -> Result {
+        self.iterate(0, self.size, |page, offset, len| {
+            page.fill_zero(offset, len)
+        })
+    }
+
+    pub(crate) fn keep_alive(mut self) {
+        self.process
+            .buffer_make_freeable(self.offset, self.allocation_info.take());
+        self.free_on_drop = false;
+    }
+
+    pub(crate) fn set_info(&mut self, info: AllocationInfo) {
+        self.allocation_info = Some(info);
+    }
+
+    pub(crate) fn get_or_init_info(&mut self) -> &mut AllocationInfo {
+        self.allocation_info.get_or_insert_with(Default::default)
+    }
+
+    pub(crate) fn set_info_clear_on_drop(&mut self) {
+        self.get_or_init_info().clear_on_free = true;
+    }
+
+    pub(crate) fn set_info_target_node(&mut self, target_node: NodeRef) {
+        self.get_or_init_info().target_node = Some(target_node);
+    }
+}
+
+impl Drop for Allocation {
+    fn drop(&mut self) {
+        if !self.free_on_drop {
+            return;
+        }
+
+        if let Some(mut info) = self.allocation_info.take() {
+            info.target_node = None;
+
+            if info.clear_on_free {
+                if let Err(e) = self.fill_zero() {
+                    pr_warn!("Failed to clear data on free: {:?}", e);
+                }
+            }
+        }
+
+        self.process.buffer_raw_free(self.ptr);
+    }
+}
diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs
index 8a83df975e61..d0fc00fa5a57 100644
--- a/drivers/android/defs.rs
+++ b/drivers/android/defs.rs
@@ -14,6 +14,9 @@ macro_rules! pub_no_prefix {
 
 pub_no_prefix!(
     binder_driver_return_protocol_,
+    BR_TRANSACTION,
+    BR_TRANSACTION_SEC_CTX,
+    BR_REPLY,
     BR_DEAD_REPLY,
     BR_FAILED_REPLY,
     BR_NOOP,
@@ -28,6 +31,9 @@ macro_rules! pub_no_prefix {
 
 pub_no_prefix!(
     binder_driver_command_protocol_,
+    BC_TRANSACTION,
+    BC_TRANSACTION_SG,
+    BC_FREE_BUFFER,
     BC_ENTER_LOOPER,
     BC_EXIT_LOOPER,
     BC_REGISTER_LOOPER,
@@ -39,6 +45,10 @@ macro_rules! pub_no_prefix {
     BC_ACQUIRE_DONE
 );
 
+pub(crate) const FLAT_BINDER_FLAG_TXN_SECURITY_CTX: u32 =
+    kernel::bindings::FLAT_BINDER_FLAG_TXN_SECURITY_CTX;
+pub_no_prefix!(transaction_flags_, TF_ONE_WAY, TF_CLEAR_BUF);
+
 macro_rules! decl_wrapper {
     ($newname:ident, $wrapped:ty) => {
         #[derive(Copy, Clone, Default)]
@@ -67,6 +77,15 @@ fn deref_mut(&mut self) -> &mut Self::Target {
 decl_wrapper!(BinderNodeDebugInfo, bindings::binder_node_debug_info);
 decl_wrapper!(BinderNodeInfoForRef, bindings::binder_node_info_for_ref);
 decl_wrapper!(FlatBinderObject, bindings::flat_binder_object);
+decl_wrapper!(BinderTransactionData, bindings::binder_transaction_data);
+decl_wrapper!(
+    BinderTransactionDataSecctx,
+    bindings::binder_transaction_data_secctx
+);
+decl_wrapper!(
+    BinderTransactionDataSg,
+    bindings::binder_transaction_data_sg
+);
 decl_wrapper!(BinderWriteRead, bindings::binder_write_read);
 decl_wrapper!(BinderVersion, bindings::binder_version);
 decl_wrapper!(ExtendedError, bindings::binder_extended_error);
@@ -79,6 +98,26 @@ pub(crate) fn current() -> Self {
     }
 }
 
+impl BinderTransactionData {
+    pub(crate) fn with_buffers_size(self, buffers_size: u64) -> BinderTransactionDataSg {
+        BinderTransactionDataSg(bindings::binder_transaction_data_sg {
+            transaction_data: self.0,
+            buffers_size,
+        })
+    }
+}
+
+impl BinderTransactionDataSecctx {
+    /// View the inner data as wrapped in `BinderTransactionData`.
+    pub(crate) fn tr_data(&mut self) -> &mut BinderTransactionData {
+        // SAFETY: Transparent wrapper is safe to transmute.
+        unsafe {
+            &mut *(&mut self.transaction_data as *mut bindings::binder_transaction_data
+                as *mut BinderTransactionData)
+        }
+    }
+}
+
 impl ExtendedError {
     pub(crate) fn new(id: u32, command: u32, param: i32) -> Self {
         Self(bindings::binder_extended_error { id, command, param })
diff --git a/drivers/android/error.rs b/drivers/android/error.rs
index a31b696efafc..430b0994affa 100644
--- a/drivers/android/error.rs
+++ b/drivers/android/error.rs
@@ -4,6 +4,8 @@
 
 use crate::defs::*;
 
+pub(crate) type BinderResult<T = ()> = core::result::Result<T, BinderError>;
+
 /// An error that will be returned to userspace via the `BINDER_WRITE_READ` ioctl rather than via
 /// errno.
 pub(crate) struct BinderError {
@@ -18,6 +20,14 @@ pub(crate) fn new_dead() -> Self {
             source: None,
         }
     }
+
+    pub(crate) fn is_dead(&self) -> bool {
+        self.reply == BR_DEAD_REPLY
+    }
+
+    pub(crate) fn as_errno(&self) -> core::ffi::c_int {
+        self.source.unwrap_or(EINVAL).to_errno()
+    }
 }
 
 /// Convert an errno into a `BinderError` and store the errno used to construct it. The errno
diff --git a/drivers/android/node.rs b/drivers/android/node.rs
index 0ca4b72b8710..c6c3d81e705d 100644
--- a/drivers/android/node.rs
+++ b/drivers/android/node.rs
@@ -49,7 +49,6 @@ pub(crate) struct Node {
     pub(crate) global_id: u64,
     ptr: usize,
     cookie: usize,
-    #[allow(dead_code)]
     pub(crate) flags: u32,
     pub(crate) owner: Arc<Process>,
     inner: LockedBy<NodeInner, ProcessInner>,
diff --git a/drivers/android/process.rs b/drivers/android/process.rs
index 2d8aa29776a1..26dd9309fbee 100644
--- a/drivers/android/process.rs
+++ b/drivers/android/process.rs
@@ -17,6 +17,7 @@
     io_buffer::{IoBufferReader, IoBufferWriter},
     list::{HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks},
     mm,
+    pages::Pages,
     prelude::*,
     rbtree::RBTree,
     sync::{lock::Guard, Arc, ArcBorrow, Mutex, SpinLock},
@@ -27,16 +28,35 @@
 };
 
 use crate::{
+    allocation::{Allocation, AllocationInfo},
     context::Context,
     defs::*,
-    error::BinderError,
+    error::{BinderError, BinderResult},
     node::{Node, NodeRef},
+    range_alloc::{self, RangeAllocator},
     thread::{PushWorkRes, Thread},
     DArc, DLArc, DTRWrap, DeliverToRead,
 };
 
 use core::mem::take;
 
+struct Mapping {
+    address: usize,
+    alloc: RangeAllocator<AllocationInfo>,
+    pages: Arc<Vec<Pages<0>>>,
+}
+
+impl Mapping {
+    fn new(address: usize, size: usize, pages: Arc<Vec<Pages<0>>>) -> Result<Self> {
+        let alloc = RangeAllocator::new(size)?;
+        Ok(Self {
+            address,
+            alloc,
+            pages,
+        })
+    }
+}
+
 const PROC_DEFER_FLUSH: u8 = 1;
 const PROC_DEFER_RELEASE: u8 = 2;
 
@@ -47,6 +67,7 @@ pub(crate) struct ProcessInner {
     threads: RBTree<i32, Arc<Thread>>,
     ready_threads: List<Thread>,
     nodes: RBTree<usize, DArc<Node>>,
+    mapping: Option<Mapping>,
     work: List<DTRWrap<dyn DeliverToRead>>,
 
     /// The number of requested threads that haven't registered yet.
@@ -67,6 +88,7 @@ fn new() -> Self {
             is_dead: false,
             threads: RBTree::new(),
             ready_threads: List::new(),
+            mapping: None,
             nodes: RBTree::new(),
             work: List::new(),
             requested_thread_count: 0,
@@ -459,6 +481,15 @@ pub(crate) fn insert_or_update_handle(
         Ok(target)
     }
 
+    pub(crate) fn get_transaction_node(&self, handle: u32) -> BinderResult<NodeRef> {
+        // When handle is zero, try to get the context manager.
+        if handle == 0 {
+            Ok(self.ctx.get_manager_node(true)?)
+        } else {
+            Ok(self.get_node_from_handle(handle, true)?)
+        }
+    }
+
     pub(crate) fn get_node_from_handle(&self, handle: u32, strong: bool) -> Result<NodeRef> {
         self.node_refs
             .lock()
@@ -511,6 +542,97 @@ pub(crate) fn inc_ref_done(&self, reader: &mut UserSlicePtrReader, strong: bool)
         Ok(())
     }
 
+    pub(crate) fn buffer_alloc(
+        self: &Arc<Self>,
+        size: usize,
+        is_oneway: bool,
+    ) -> BinderResult<Allocation> {
+        let alloc = range_alloc::ReserveNewBox::try_new()?;
+        let mut inner = self.inner.lock();
+        let mapping = inner.mapping.as_mut().ok_or_else(BinderError::new_dead)?;
+        let offset = mapping.alloc.reserve_new(size, is_oneway, alloc)?;
+        Ok(Allocation::new(
+            self.clone(),
+            offset,
+            size,
+            mapping.address + offset,
+            mapping.pages.clone(),
+        ))
+    }
+
+    pub(crate) fn buffer_get(self: &Arc<Self>, ptr: usize) -> Option<Allocation> {
+        let mut inner = self.inner.lock();
+        let mapping = inner.mapping.as_mut()?;
+        let offset = ptr.checked_sub(mapping.address)?;
+        let (size, odata) = mapping.alloc.reserve_existing(offset).ok()?;
+        let mut alloc = Allocation::new(self.clone(), offset, size, ptr, mapping.pages.clone());
+        if let Some(data) = odata {
+            alloc.set_info(data);
+        }
+        Some(alloc)
+    }
+
+    pub(crate) fn buffer_raw_free(&self, ptr: usize) {
+        let mut inner = self.inner.lock();
+        if let Some(ref mut mapping) = &mut inner.mapping {
+            if ptr < mapping.address
+                || mapping
+                    .alloc
+                    .reservation_abort(ptr - mapping.address)
+                    .is_err()
+            {
+                pr_warn!(
+                    "Pointer {:x} failed to free, base = {:x}\n",
+                    ptr,
+                    mapping.address
+                );
+            }
+        }
+    }
+
+    pub(crate) fn buffer_make_freeable(&self, offset: usize, data: Option<AllocationInfo>) {
+        let mut inner = self.inner.lock();
+        if let Some(ref mut mapping) = &mut inner.mapping {
+            if mapping.alloc.reservation_commit(offset, data).is_err() {
+                pr_warn!("Offset {} failed to be marked freeable\n", offset);
+            }
+        }
+    }
+
+    fn create_mapping(&self, vma: &mut mm::virt::Area) -> Result {
+        use kernel::bindings::PAGE_SIZE;
+        let size = core::cmp::min(vma.end() - vma.start(), bindings::SZ_4M as usize);
+        let page_count = size / PAGE_SIZE;
+
+        // Allocate and map all pages.
+        //
+        // N.B. If we fail halfway through mapping these pages, the kernel will unmap them.
+        let mut pages = Vec::new();
+        pages.try_reserve_exact(page_count)?;
+        let mut address = vma.start();
+        for _ in 0..page_count {
+            let page = Pages::<0>::new()?;
+            vma.insert_page(address, &page)?;
+            pages.try_push(page)?;
+            address += PAGE_SIZE;
+        }
+
+        let ref_pages = Arc::try_new(pages)?;
+        let mapping = Mapping::new(vma.start(), size, ref_pages)?;
+
+        // Save pages for later.
+        let mut inner = self.inner.lock();
+        match &inner.mapping {
+            None => inner.mapping = Some(mapping),
+            Some(_) => {
+                drop(inner);
+                drop(mapping);
+                return Err(EBUSY);
+            }
+        }
+        Ok(())
+    }
+
     fn version(&self, data: UserSlicePtr) -> Result {
         data.writer().write(&BinderVersion::current())
     }
@@ -610,11 +732,6 @@ fn deferred_release(self: Arc<Self>) {
 
         self.ctx.deregister_process(&self);
 
-        // Cancel all pending work items.
-        while let Some(work) = self.get_work() {
-            work.into_arc().cancel();
-        }
-
         // Move the threads out of `inner` so that we can iterate over them without holding the
         // lock.
         let mut inner = self.inner.lock();
@@ -625,6 +742,26 @@ fn deferred_release(self: Arc<Self>) {
         for thread in threads.values() {
             thread.release();
         }
+
+        // Cancel all pending work items.
+        while let Some(work) = self.get_work() {
+            work.into_arc().cancel();
+        }
+
+        // Free any resources kept alive by allocated buffers.
+        let omapping = self.inner.lock().mapping.take();
+        if let Some(mut mapping) = omapping {
+            let address = mapping.address;
+            let pages = mapping.pages.clone();
+            mapping.alloc.take_for_each(|offset, size, odata| {
+                let ptr = offset + address;
+                let mut alloc = Allocation::new(self.clone(), offset, size, ptr, pages.clone());
+                if let Some(data) = odata {
+                    alloc.set_info(data);
+                }
+                drop(alloc)
+            });
+        }
     }
 
     pub(crate) fn flush(this: ArcBorrow<'_, Process>) -> Result {
@@ -736,11 +873,27 @@ pub(crate) fn compat_ioctl(
     }
 
     pub(crate) fn mmap(
-        _this: ArcBorrow<'_, Process>,
+        this: ArcBorrow<'_, Process>,
         _file: &File,
-        _vma: &mut mm::virt::Area,
+        vma: &mut mm::virt::Area,
     ) -> Result {
-        Err(EINVAL)
+        // We don't allow mmap to be used in a different process.
+        if !core::ptr::eq(kernel::current!().group_leader(), &*this.task) {
+            return Err(EINVAL);
+        }
+        if vma.start() == 0 {
+            return Err(EINVAL);
+        }
+        let mut flags = vma.flags();
+        use mm::virt::flags::*;
+        if flags & WRITE != 0 {
+            return Err(EPERM);
+        }
+        flags |= DONTCOPY | MIXEDMAP;
+        flags &= !MAYWRITE;
+        vma.set_flags(flags);
+        // TODO: Set ops. We need to learn when the user unmaps so that we can stop using it.
+        this.create_mapping(vma)
     }
 
     pub(crate) fn poll(
diff --git a/drivers/android/range_alloc.rs b/drivers/android/range_alloc.rs
new file mode 100644
index 000000000000..e757129613cf
--- /dev/null
+++ b/drivers/android/range_alloc.rs
@@ -0,0 +1,344 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::{
+    prelude::*,
+    rbtree::{RBTree, RBTreeNode, RBTreeNodeReservation},
+};
+
+/// Keeps track of allocations in a process' mmap.
+///
+/// Each process has an mmap where the data for incoming transactions will be placed. This struct
+/// keeps track of allocations made in the mmap. For each allocation, we store a descriptor that
+/// has metadata related to the allocation. We also keep track of available free space.
+pub(crate) struct RangeAllocator<T> {
+    tree: RBTree<usize, Descriptor<T>>,
+    free_tree: RBTree<FreeKey, ()>,
+    free_oneway_space: usize,
+}
+
+impl<T> RangeAllocator<T> {
+    pub(crate) fn new(size: usize) -> Result<Self> {
+        let mut tree = RBTree::new();
+        tree.try_create_and_insert(0, Descriptor::new(0, size))?;
+        let mut free_tree = RBTree::new();
+        free_tree.try_create_and_insert((size, 0), ())?;
+        Ok(Self {
+            free_oneway_space: size / 2,
+            tree,
+            free_tree,
+        })
+    }
+
+    fn find_best_match(&mut self, size: usize) -> Option<&mut Descriptor<T>> {
+        let free_cursor = self.free_tree.cursor_lower_bound(&(size, 0))?;
+        let ((_, offset), _) = free_cursor.current();
+        self.tree.get_mut(offset)
+    }
+
+    /// Try to reserve a new buffer, using the provided allocation if necessary.
+    pub(crate) fn reserve_new(
+        &mut self,
+        size: usize,
+        is_oneway: bool,
+        alloc: ReserveNewBox<T>,
+    ) -> Result<usize> {
+        // Compute new value of free_oneway_space, which is set only on success.
+        let new_oneway_space = if is_oneway {
+            match self.free_oneway_space.checked_sub(size) {
+                Some(new_oneway_space) => new_oneway_space,
+                None => return Err(ENOSPC),
+            }
+        } else {
+            self.free_oneway_space
+        };
+
+        let (found_size, found_off, tree_node, free_tree_node) = match self.find_best_match(size) {
+            None => {
+                pr_warn!("ENOSPC from range_alloc.reserve_new - size: {}", size);
+                return Err(ENOSPC);
+            }
+            Some(desc) => {
+                let found_size = desc.size;
+                let found_offset = desc.offset;
+
+                // In case we need to break up the descriptor
+                let new_desc = Descriptor::new(found_offset + size, found_size - size);
+                let (tree_node, free_tree_node, desc_node_res) = alloc.initialize(new_desc);
+
+                desc.state = Some(DescriptorState::new(is_oneway, desc_node_res));
+                desc.size = size;
+
+                (found_size, found_offset, tree_node, free_tree_node)
+            }
+        };
+        self.free_oneway_space = new_oneway_space;
+        self.free_tree.remove(&(found_size, found_off));
+
+        if found_size != size {
+            self.tree.insert(tree_node);
+            self.free_tree.insert(free_tree_node);
+        }
+
+        Ok(found_off)
+    }
+
+    pub(crate) fn reservation_abort(&mut self, offset: usize) -> Result {
+        let mut cursor = self.tree.cursor_lower_bound(&offset).ok_or_else(|| {
+            pr_warn!(
+                "EINVAL from range_alloc.reservation_abort - offset: {}",
+                offset
+            );
+            EINVAL
+        })?;
+
+        let (_, desc) = cursor.current_mut();
+
+        if desc.offset != offset {
+            pr_warn!(
+                "EINVAL from range_alloc.reservation_abort - offset: {}",
+                offset
+            );
+            return Err(EINVAL);
+        }
+
+        let reservation = desc.try_change_state(|state| match state {
+            Some(DescriptorState::Reserved(reservation)) => (None, Ok(reservation)),
+            None => {
+                pr_warn!(
+                    "EINVAL from range_alloc.reservation_abort - offset: {}",
+                    offset
+                );
+                (None, Err(EINVAL))
+            }
+            allocated => {
+                pr_warn!(
+                    "EPERM from range_alloc.reservation_abort - offset: {}",
+                    offset
+                );
+                (allocated, Err(EPERM))
+            }
+        })?;
+
+        let mut size = desc.size;
+        let mut offset = desc.offset;
+        let free_oneway_space_add = if reservation.is_oneway { size } else { 0 };
+
+        self.free_oneway_space += free_oneway_space_add;
+
+        // Merge next into current if next is free
+        let remove_next = match cursor.peek_next() {
+            Some((_, next)) if next.state.is_none() => {
+                self.free_tree.remove(&(next.size, next.offset));
+                size += next.size;
+                true
+            }
+            _ => false,
+        };
+
+        if remove_next {
+            let (_, desc) = cursor.current_mut();
+            desc.size = size;
+            cursor.remove_next();
+        }
+
+        // Merge current into prev if prev is free
+        match cursor.peek_prev_mut() {
+            Some((_, prev)) if prev.state.is_none() => {
+                // merge previous with current, remove current
+                self.free_tree.remove(&(prev.size, prev.offset));
+                offset = prev.offset;
+                size += prev.size;
+                prev.size = size;
+                cursor.remove_current();
+            }
+            _ => {}
+        };
+
+        self.free_tree
+            .insert(reservation.free_res.into_node((size, offset), ()));
+
+        Ok(())
+    }
+
+    pub(crate) fn reservation_commit(&mut self, offset: usize, data: Option<T>) -> Result {
+        let desc = self.tree.get_mut(&offset).ok_or_else(|| {
+            pr_warn!(
+                "ENOENT from range_alloc.reservation_commit - offset: {}",
+                offset
+            );
+            ENOENT
+        })?;
+
+        desc.try_change_state(|state| match state {
+            Some(DescriptorState::Reserved(reservation)) => (
+                Some(DescriptorState::Allocated(reservation.allocate(data))),
+                Ok(()),
+            ),
+            other => {
+                pr_warn!(
+                    "ENOENT from range_alloc.reservation_commit - offset: {}",
+                    offset
+                );
+                (other, Err(ENOENT))
+            }
+        })
+    }
+
+    /// Takes an entry at the given offset from [`DescriptorState::Allocated`] to
+    /// [`DescriptorState::Reserved`].
+    ///
+    /// Returns the size of the existing entry and the data associated with it.
+    pub(crate) fn reserve_existing(&mut self, offset: usize) -> Result<(usize, Option<T>)> {
+        let desc = self.tree.get_mut(&offset).ok_or_else(|| {
+            pr_warn!(
+                "ENOENT from range_alloc.reserve_existing - offset: {}",
+                offset
+            );
+            ENOENT
+        })?;
+
+        let data = desc.try_change_state(|state| match state {
+            Some(DescriptorState::Allocated(allocation)) => {
+                let (reservation, data) = allocation.deallocate();
+                (Some(DescriptorState::Reserved(reservation)), Ok(data))
+            }
+            other => {
+                pr_warn!(
+                    "ENOENT from range_alloc.reserve_existing - offset: {}",
+                    offset
+                );
+                (other, Err(ENOENT))
+            }
+        })?;
+
+        Ok((desc.size, data))
+    }
+
+    /// Call the provided callback at every allocated region.
+    ///
+    /// This destroys the range allocator. Used only during shutdown.
+    pub(crate) fn take_for_each<F: Fn(usize, usize, Option<T>)>(&mut self, callback: F) {
+        for (_, desc) in self.tree.iter_mut() {
+            if let Some(DescriptorState::Allocated(allocation)) = &mut desc.state {
+                callback(desc.offset, desc.size, allocation.take());
+            }
+        }
+    }
+}
+
+struct Descriptor<T> {
+    size: usize,
+    offset: usize,
+    state: Option<DescriptorState<T>>,
+}
+
+impl<T> Descriptor<T> {
+    fn new(offset: usize, size: usize) -> Self {
+        Self {
+            size,
+            offset,
+            state: None,
+        }
+    }
+
+    fn try_change_state<F, Data>(&mut self, f: F) -> Result<Data>
+    where
+        F: FnOnce(Option<DescriptorState<T>>) -> (Option<DescriptorState<T>>, Result<Data>),
+    {
+        let (new_state, result) = f(self.state.take());
+        self.state = new_state;
+        result
+    }
+}
+
+enum DescriptorState<T> {
+    Reserved(Reservation),
+    Allocated(Allocation<T>),
+}
+
+impl<T> DescriptorState<T> {
+    fn new(is_oneway: bool, free_res: FreeNodeRes) -> Self {
+        DescriptorState::Reserved(Reservation {
+            is_oneway,
+            free_res,
+        })
+    }
+}
+
+struct Reservation {
+    is_oneway: bool,
+    free_res: FreeNodeRes,
+}
+
+impl Reservation {
+    fn allocate<T>(self, data: Option<T>) -> Allocation<T> {
+        Allocation {
+            data,
+            is_oneway: self.is_oneway,
+            free_res: self.free_res,
+        }
+    }
+}
+
+struct Allocation<T> {
+    is_oneway: bool,
+    free_res: FreeNodeRes,
+    data: Option<T>,
+}
+
+impl<T> Allocation<T> {
+    fn deallocate(self) -> (Reservation, Option<T>) {
+        (
+            Reservation {
+                is_oneway: self.is_oneway,
+                free_res: self.free_res,
+            },
+            self.data,
+        )
+    }
+
+    fn take(&mut self) -> Option<T> {
+        self.data.take()
+    }
+}
+
+// (Descriptor.size, Descriptor.offset)
+type FreeKey = (usize, usize);
+type FreeNodeRes = RBTreeNodeReservation<FreeKey, ()>;
+
+/// An allocation for use by `reserve_new`.
+pub(crate) struct ReserveNewBox<T> {
+    tree_node_res: RBTreeNodeReservation<usize, Descriptor<T>>,
+    free_tree_node_res: FreeNodeRes,
+    desc_node_res: FreeNodeRes,
+}
+
+impl<T> ReserveNewBox<T> {
+    pub(crate) fn try_new() -> Result<Self> {
+        let tree_node_res = RBTree::try_reserve_node()?;
+        let free_tree_node_res = RBTree::try_reserve_node()?;
+        let desc_node_res = RBTree::try_reserve_node()?;
+        Ok(Self {
+            tree_node_res,
+            free_tree_node_res,
+            desc_node_res,
+        })
+    }
+
+    fn initialize(
+        self,
+        desc: Descriptor<T>,
+    ) -> (
+        RBTreeNode<usize, Descriptor<T>>,
+        RBTreeNode<FreeKey, ()>,
+        FreeNodeRes,
+    ) {
+        let size = desc.size;
+        let offset = desc.offset;
+        (
+            self.tree_node_res.into_node(offset, desc),
+            self.free_tree_node_res.into_node((size, offset), ()),
+            self.desc_node_res,
+        )
+    }
+}
diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs
index 2ef37cc2c556..218c2001e8cb 100644
--- a/drivers/android/rust_binder.rs
+++ b/drivers/android/rust_binder.rs
@@ -5,6 +5,7 @@
 use kernel::{
     bindings::{self, seq_file},
     file::{File, PollTable},
+    io_buffer::IoBufferWriter,
     list::{
         HasListLinks, ListArc, ListArcSafe, ListItem, ListLinks, ListLinksSelfPtr, TryNewListArc,
     },
@@ -16,12 +17,17 @@
 
 use crate::{context::Context, process::Process, thread::Thread};
 
+use core::sync::atomic::{AtomicBool, Ordering};
+
+mod allocation;
 mod context;
 mod defs;
 mod error;
 mod node;
 mod process;
+mod range_alloc;
 mod thread;
+mod transaction;
 
 module! {
     type: BinderModule,
@@ -111,6 +117,54 @@ fn arc_pin_init(init: impl PinInit<T>) -> Result<DLArc<T>, kernel::error::Error>
     }
 }
 
+struct DeliverCode {
+    code: u32,
+    skip: AtomicBool,
+}
+
+kernel::list::impl_list_arc_safe! {
+    impl ListArcSafe<0> for DeliverCode { untracked; }
+}
+
+impl DeliverCode {
+    fn new(code: u32) -> Self {
+        Self {
+            code,
+            skip: AtomicBool::new(false),
+        }
+    }
+
+    /// Disable this DeliverCode and make it do nothing.
+    ///
+    /// This is used instead of removing it from the work list, since `LinkedList::remove` is
+    /// unsafe, whereas this method is not.
+    fn skip(&self) {
+        self.skip.store(true, Ordering::Relaxed);
+    }
+}
+
+impl DeliverToRead for DeliverCode {
+    fn do_work(
+        self: DArc<Self>,
+        _thread: &Thread,
+        writer: &mut UserSlicePtrWriter,
+    ) -> Result<bool> {
+        if !self.skip.load(Ordering::Relaxed) {
+            writer.write(&self.code)?;
+        }
+        Ok(true)
+    }
+
+    fn should_sync_wakeup(&self) -> bool {
+        false
+    }
+}
+
+const fn ptr_align(value: usize) -> usize {
+    let size = core::mem::size_of::<usize>() - 1;
+    (value + size) & !size
+}
+
 struct BinderModule {}
 
 impl kernel::Module for BinderModule {
diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs
index f7d62fc380e5..f34de7ad6e6f 100644
--- a/drivers/android/thread.rs
+++ b/drivers/android/thread.rs
@@ -9,17 +9,25 @@
     bindings,
     io_buffer::{IoBufferReader, IoBufferWriter},
     list::{
-        AtomicListArcTracker, HasListLinks, List, ListArcSafe, ListItem, ListLinks, TryNewListArc,
+        AtomicListArcTracker, HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks,
+        TryNewListArc,
     },
     prelude::*,
+    security,
     sync::{Arc, CondVar, SpinLock},
     types::Either,
-    user_ptr::UserSlicePtr,
+    user_ptr::{UserSlicePtr, UserSlicePtrWriter},
 };
 
-use crate::{defs::*, process::Process, DLArc, DTRWrap, DeliverToRead};
+use crate::{
+    allocation::Allocation, defs::*, error::BinderResult, process::Process, ptr_align,
+    transaction::Transaction, DArc, DLArc, DTRWrap, DeliverCode, DeliverToRead,
+};
 
-use core::mem::size_of;
+use core::{
+    mem::size_of,
+    sync::atomic::{AtomicU32, Ordering},
+};
 
 pub(crate) enum PushWorkRes {
     Ok,
@@ -47,6 +55,10 @@ struct InnerThread {
     /// Determines if thread is dead.
     is_dead: bool,
 
+    /// Work item used to deliver error codes to the current thread. Stored here so that it can be
+    /// reused.
+    return_work: DArc<ThreadError>,
+
     /// Determines whether the work list below should be processed. When set to false, `work_list`
     /// is treated as if it were empty.
     process_work_list: bool,
@@ -65,22 +77,21 @@ struct InnerThread {
 const LOOPER_WAITING_PROC: u32 = 0x20;
 
 impl InnerThread {
-    fn new() -> Self {
-        use core::sync::atomic::{AtomicU32, Ordering};
-
+    fn new() -> Result<Self> {
         fn next_err_id() -> u32 {
             static EE_ID: AtomicU32 = AtomicU32::new(0);
             EE_ID.fetch_add(1, Ordering::Relaxed)
         }
 
-        Self {
+        Ok(Self {
             looper_flags: 0,
             looper_need_return: false,
             is_dead: false,
             process_work_list: false,
+            return_work: ThreadError::try_new()?,
             work_list: List::new(),
             extended_error: ExtendedError::new(next_err_id(), BR_OK, 0),
-        }
+        })
     }
 
     fn pop_work(&mut self) -> Option<DLArc<dyn DeliverToRead>> {
@@ -103,6 +114,15 @@ fn push_work(&mut self, work: DLArc<dyn DeliverToRead>) -> PushWorkRes {
         }
     }
 
+    fn push_return_work(&mut self, reply: u32) {
+        if let Ok(work) = ListArc::try_from_arc(self.return_work.clone()) {
+            work.set_error_code(reply);
+            self.push_work(work);
+        } else {
+            pr_warn!("Thread return work is already in use.");
+        }
+    }
+
     /// Used to push work items that do not need to be processed immediately and can wait until the
     /// thread gets another work item.
     fn push_work_deferred(&mut self, work: DLArc<dyn DeliverToRead>) {
@@ -175,10 +195,12 @@ impl ListItem<0> for Thread {
 
 impl Thread {
     pub(crate) fn new(id: i32, process: Arc<Process>) -> Result<Arc<Self>> {
+        let inner = InnerThread::new()?;
+
         Arc::pin_init(pin_init!(Thread {
             id,
             process,
-            inner <- kernel::new_spinlock!(InnerThread::new(), "Thread::inner"),
+            inner <- kernel::new_spinlock!(inner, "Thread::inner"),
             work_condvar <- kernel::new_condvar!("Thread::work_condvar"),
             links <- ListLinks::new(),
             links_track <- AtomicListArcTracker::new(),
@@ -312,15 +334,131 @@ pub(crate) fn push_work_deferred(&self, work: DLArc<dyn DeliverToRead>) {
         self.inner.lock().push_work_deferred(work);
     }
 
+    pub(crate) fn copy_transaction_data(
+        &self,
+        to_process: Arc<Process>,
+        tr: &BinderTransactionDataSg,
+        txn_security_ctx_offset: Option<&mut usize>,
+    ) -> BinderResult<Allocation> {
+        let trd = &tr.transaction_data;
+        let is_oneway = trd.flags & TF_ONE_WAY != 0;
+        let mut secctx = if let Some(offset) = txn_security_ctx_offset {
+            let secid = self.process.cred.get_secid();
+            let ctx = match security::SecurityCtx::from_secid(secid) {
+                Ok(ctx) => ctx,
+                Err(err) => {
+                    pr_warn!("Failed to get security ctx for id {}: {:?}", secid, err);
+                    return Err(err.into());
+                }
+            };
+            Some((offset, ctx))
+        } else {
+            None
+        };
+
+        let data_size = trd.data_size.try_into().map_err(|_| EINVAL)?;
+        let adata_size = ptr_align(data_size);
+        let asecctx_size = secctx
+            .as_ref()
+            .map(|(_, ctx)| ptr_align(ctx.len()))
+            .unwrap_or(0);
+
+        // This guarantees that at least `sizeof(usize)` bytes will be allocated.
+        let len = usize::max(
+            adata_size.checked_add(asecctx_size).ok_or(ENOMEM)?,
+            size_of::<usize>(),
+        );
+        let secctx_off = adata_size;
+        let alloc = match to_process.buffer_alloc(len, is_oneway) {
+            Ok(alloc) => alloc,
+            Err(err) => {
+                pr_warn!(
+                    "Failed to allocate buffer. len:{}, is_oneway:{}",
+                    len,
+                    is_oneway
+                );
+                return Err(err);
+            }
+        };
+
+        let mut buffer_reader =
+            unsafe { UserSlicePtr::new(trd.data.ptr.buffer as _, data_size) }.reader();
+
+        alloc.copy_into(&mut buffer_reader, 0, data_size)?;
+
+        if let Some((off_out, secctx)) = secctx.as_mut() {
+            if let Err(err) = alloc.write(secctx_off, secctx.as_bytes()) {
+                pr_warn!("Failed to write security context: {:?}", err);
+                return Err(err.into());
+            }
+            **off_out = secctx_off;
+        }
+        Ok(alloc)
+    }
+
+    fn transaction<T>(self: &Arc<Self>, tr: &BinderTransactionDataSg, inner: T)
+    where
+        T: FnOnce(&Arc<Self>, &BinderTransactionDataSg) -> BinderResult,
+    {
+        if let Err(err) = inner(self, tr) {
+            if err.reply != BR_TRANSACTION_COMPLETE {
+                let mut ee = self.inner.lock().extended_error;
+                ee.command = err.reply;
+                ee.param = err.as_errno();
+                pr_warn!(
+                    "Transaction failed: {:?} my_pid:{}",
+                    err,
+                    self.process.task.pid_in_current_ns()
+                );
+            }
+
+            self.inner.lock().push_return_work(err.reply);
+        }
+    }
+
+    fn oneway_transaction_inner(self: &Arc<Self>, tr: &BinderTransactionDataSg) -> BinderResult {
+        let handle = unsafe { tr.transaction_data.target.handle };
+        let node_ref = self.process.get_transaction_node(handle)?;
+        security::binder_transaction(&self.process.cred, &node_ref.node.owner.cred)?;
+        let list_completion = DTRWrap::arc_try_new(DeliverCode::new(BR_TRANSACTION_COMPLETE))?;
+        let transaction = Transaction::new(node_ref, self, tr)?;
+        let completion = list_completion.clone_arc();
+        self.inner.lock().push_work(list_completion);
+        match transaction.submit() {
+            Ok(()) => Ok(()),
+            Err(err) => {
+                completion.skip();
+                Err(err)
+            }
+        }
+    }
+
     fn write(self: &Arc<Self>, req: &mut BinderWriteRead) -> Result {
         let write_start = req.write_buffer.wrapping_add(req.write_consumed);
         let write_len = req.write_size - req.write_consumed;
         let mut reader = UserSlicePtr::new(write_start as _, write_len as _).reader();
 
-        while reader.len() >= size_of::<u32>() {
+        while reader.len() >= size_of::<u32>() && self.inner.lock().return_work.is_unused() {
             let before = reader.len();
             let cmd = reader.read::<u32>()?;
             match cmd {
+                BC_TRANSACTION => {
+                    let tr = reader.read::<BinderTransactionData>()?.with_buffers_size(0);
+                    if tr.transaction_data.flags & TF_ONE_WAY != 0 {
+                        self.transaction(&tr, Self::oneway_transaction_inner);
+                    } else {
+                        return Err(EINVAL);
+                    }
+                }
+                BC_TRANSACTION_SG => {
+                    let tr = reader.read::<BinderTransactionDataSg>()?;
+                    if tr.transaction_data.flags & TF_ONE_WAY != 0 {
+                        self.transaction(&tr, Self::oneway_transaction_inner);
+                    } else {
+                        return Err(EINVAL);
+                    }
+                }
+                BC_FREE_BUFFER => drop(self.process.buffer_get(reader.read()?)),
                 BC_INCREFS => self.process.update_ref(reader.read()?, true, false)?,
                 BC_ACQUIRE => self.process.update_ref(reader.read()?, true, true)?,
                 BC_RELEASE => self.process.update_ref(reader.read()?, false, true)?,
@@ -475,3 +613,51 @@ pub(crate) fn release(self: &Arc<Self>) {
         }
     }
 }
+
+#[pin_data]
+struct ThreadError {
+    error_code: AtomicU32,
+    #[pin]
+    links_track: AtomicListArcTracker,
+}
+
+impl ThreadError {
+    fn try_new() -> Result<DArc<Self>> {
+        DTRWrap::arc_pin_init(pin_init!(Self {
+            error_code: AtomicU32::new(BR_OK),
+            links_track <- AtomicListArcTracker::new(),
+        }))
+        .map(ListArc::into_arc)
+    }
+
+    fn set_error_code(&self, code: u32) {
+        self.error_code.store(code, Ordering::Relaxed);
+    }
+
+    fn is_unused(&self) -> bool {
+        self.error_code.load(Ordering::Relaxed) == BR_OK
+    }
+}
+
+impl DeliverToRead for ThreadError {
+    fn do_work(
+        self: DArc<Self>,
+        _thread: &Thread,
+        writer: &mut UserSlicePtrWriter,
+    ) -> Result<bool> {
+        let code = self.error_code.load(Ordering::Relaxed);
+        self.error_code.store(BR_OK, Ordering::Relaxed);
+        writer.write(&code)?;
+        Ok(true)
+    }
+
+    fn should_sync_wakeup(&self) -> bool {
+        false
+    }
+}
+
+kernel::list::impl_list_arc_safe! {
+    impl ListArcSafe<0> for ThreadError {
+        tracked_by links_track: AtomicListArcTracker;
+    }
+}
diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs
new file mode 100644
index 000000000000..8b4274ddc415
--- /dev/null
+++ b/drivers/android/transaction.rs
@@ -0,0 +1,163 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::{
+    io_buffer::IoBufferWriter,
+    list::ListArcSafe,
+    prelude::*,
+    sync::{Arc, SpinLock},
+    task::Kuid,
+    user_ptr::UserSlicePtrWriter,
+};
+
+use crate::{
+    allocation::Allocation,
+    defs::*,
+    error::BinderResult,
+    node::{Node, NodeRef},
+    process::Process,
+    ptr_align,
+    thread::Thread,
+    DArc, DLArc, DTRWrap, DeliverToRead,
+};
+
+#[pin_data]
+pub(crate) struct Transaction {
+    target_node: Option<DArc<Node>>,
+    pub(crate) from: Arc<Thread>,
+    to: Arc<Process>,
+    #[pin]
+    allocation: SpinLock<Option<Allocation>>,
+    code: u32,
+    pub(crate) flags: u32,
+    data_size: usize,
+    data_address: usize,
+    sender_euid: Kuid,
+    txn_security_ctx_off: Option<usize>,
+}
+
+kernel::list::impl_list_arc_safe! {
+    impl ListArcSafe<0> for Transaction { untracked; }
+}
+
+impl Transaction {
+    pub(crate) fn new(
+        node_ref: NodeRef,
+        from: &Arc<Thread>,
+        tr: &BinderTransactionDataSg,
+    ) -> BinderResult<DLArc<Self>> {
+        let trd = &tr.transaction_data;
+        let txn_security_ctx = node_ref.node.flags & FLAT_BINDER_FLAG_TXN_SECURITY_CTX != 0;
+        let mut txn_security_ctx_off = if txn_security_ctx { Some(0) } else { None };
+        let to = node_ref.node.owner.clone();
+        let mut alloc =
+            match from.copy_transaction_data(to.clone(), tr, txn_security_ctx_off.as_mut()) {
+                Ok(alloc) => alloc,
+                Err(err) => {
+                    if !err.is_dead() {
+                        pr_warn!("Failure in copy_transaction_data: {:?}", err);
+                    }
+                    return Err(err);
+                }
+            };
+        if trd.flags & TF_ONE_WAY == 0 {
+            pr_warn!("Non-oneway transactions are not yet supported.");
+            return Err(EINVAL.into());
+        }
+        if trd.flags & TF_CLEAR_BUF != 0 {
+            alloc.set_info_clear_on_drop();
+        }
+        let target_node = node_ref.node.clone();
+        alloc.set_info_target_node(node_ref);
+        let data_address = alloc.ptr;
+
+        Ok(DTRWrap::arc_pin_init(pin_init!(Transaction {
+            target_node: Some(target_node),
+            sender_euid: from.process.cred.euid(),
+            from: from.clone(),
+            to,
+            code: trd.code,
+            flags: trd.flags,
+            data_size: trd.data_size as _,
+            data_address,
+            allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"),
+            txn_security_ctx_off,
+        }))?)
+    }
+
+    /// Submits the transaction to a work queue.
+    pub(crate) fn submit(self: DLArc<Self>) -> BinderResult {
+        let process = self.to.clone();
+        let mut process_inner = process.inner.lock();
+        match process_inner.push_work(self) {
+            Ok(()) => Ok(()),
+            Err((err, work)) => {
+                // Drop work after releasing process lock.
+                drop(process_inner);
+                drop(work);
+                Err(err)
+            }
+        }
+    }
+}
+
+impl DeliverToRead for Transaction {
+    fn do_work(
+        self: DArc<Self>,
+        _thread: &Thread,
+        writer: &mut UserSlicePtrWriter,
+    ) -> Result<bool> {
+        let mut tr_sec = BinderTransactionDataSecctx::default();
+        let tr = tr_sec.tr_data();
+        if let Some(target_node) = &self.target_node {
+            let (ptr, cookie) = target_node.get_id();
+            tr.target.ptr = ptr as _;
+            tr.cookie = cookie as _;
+        };
+        tr.code = self.code;
+        tr.flags = self.flags;
+        tr.data_size = self.data_size as _;
+        tr.data.ptr.buffer = self.data_address as _;
+        tr.offsets_size = 0;
+        if tr.offsets_size > 0 {
+            tr.data.ptr.offsets = (self.data_address + ptr_align(self.data_size)) as _;
+        }
+        tr.sender_euid = self.sender_euid.into_uid_in_current_ns();
+        tr.sender_pid = 0;
+        if self.target_node.is_some() && self.flags & TF_ONE_WAY == 0 {
+            // Not a reply and not one-way.
+            tr.sender_pid = self.from.process.task.pid_in_current_ns();
+        }
+        let code = if self.target_node.is_none() {
+            BR_REPLY
+        } else if self.txn_security_ctx_off.is_some() {
+            BR_TRANSACTION_SEC_CTX
+        } else {
+            BR_TRANSACTION
+        };
+
+        // Write the transaction code and data to the user buffer.
+        writer.write(&code)?;
+        if let Some(off) = self.txn_security_ctx_off {
+            tr_sec.secctx = (self.data_address + off) as u64;
+            writer.write(&tr_sec)?;
+        } else {
+            writer.write(&*tr)?;
+        }
+
+        // It is now the user's responsibility to clear the allocation.
+        let alloc = self.allocation.lock().take();
+        if let Some(alloc) = alloc {
+            alloc.keep_alive();
+        }
+
+        Ok(false)
+    }
+
+    fn cancel(self: DArc<Self>) {
+        drop(self.allocation.lock().take());
+    }
+
+    fn should_sync_wakeup(&self) -> bool {
+        self.flags & TF_ONE_WAY == 0
+    }
+}
diff --git a/rust/helpers.c b/rust/helpers.c
index adb94ace2334..e70255f3774f 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -335,6 +335,13 @@ int rust_helper_security_binder_set_context_mgr(const struct cred *mgr)
 	return security_binder_set_context_mgr(mgr);
 }
 EXPORT_SYMBOL_GPL(rust_helper_security_binder_set_context_mgr);
+
+int rust_helper_security_binder_transaction(const struct cred *from,
+					    const struct cred *to)
+{
+	return security_binder_transaction(from, to);
+}
+EXPORT_SYMBOL_GPL(rust_helper_security_binder_transaction);
 #endif
 
 /*
diff --git a/rust/kernel/security.rs b/rust/kernel/security.rs
index f94c3c37560d..9e3e4cf08ecb 100644
--- a/rust/kernel/security.rs
+++ b/rust/kernel/security.rs
@@ -17,6 +17,13 @@ pub fn binder_set_context_mgr(mgr: &Credential) -> Result {
     to_result(unsafe { bindings::security_binder_set_context_mgr(mgr.0.get()) })
 }
 
+/// Calls the security modules to determine if binder transactions are allowed from task `from` to
+/// task `to`.
+pub fn binder_transaction(from: &Credential, to: &Credential) -> Result {
+    // SAFETY: `from` and `to` are valid because the shared references guarantee nonzero refcounts.
+    to_result(unsafe { bindings::security_binder_transaction(from.0.get(), to.0.get()) })
+}
+
 /// A security context string.
 ///
 /// The struct has the invariant that it always contains a valid security context.

-- 
2.42.0.820.g83a721a137-goog



* [PATCH RFC 07/20] rust_binder: add epoll support
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (5 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 06/20] rust_binder: add oneway transactions Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 08/20] rust_binder: add non-oneway transactions Alice Ryhl
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

From: Wedson Almeida Filho <wedsonaf@gmail.com>

This adds epoll integration, allowing you to get an epoll notification
when an incoming transaction arrives.
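
As a rough illustration of how a poll mask can be derived from
per-thread and per-process state, here is a minimal userspace sketch.
It is not the driver code: `ThreadState`, `process_poll` and the
`process_queue_len` parameter are hypothetical names, and POLLIN is
assumed to have its usual Linux value of 0x0001. The idea is that a
thread reports readable when it has local work or must return to
userspace, and otherwise falls back to the process queue if it is an
idle looper.

```rust
// Userspace model only; all names below are illustrative assumptions.
const POLLIN: u32 = 0x0001; // assumed Linux value of POLLIN

struct ThreadState {
    process_work_list: bool,  // the thread has local work to process
    looper_need_return: bool, // the thread must return to userspace
    is_looper: bool,          // the thread is registered as a looper
}

impl ThreadState {
    /// An idle looper thread may take work from the process queue.
    fn should_use_process_work_queue(&self) -> bool {
        !self.process_work_list && self.is_looper
    }

    /// Mask contributed by the thread's own state.
    fn poll_mask(&self) -> u32 {
        if self.process_work_list || self.looper_need_return {
            POLLIN
        } else {
            0
        }
    }
}

/// Combine the thread mask with the state of the process work queue.
fn process_poll(thread: &ThreadState, process_queue_len: usize) -> u32 {
    let from_proc = thread.should_use_process_work_queue();
    let mut mask = thread.poll_mask();
    if mask == 0 && from_proc && process_queue_len > 0 {
        mask |= POLLIN;
    }
    mask
}

fn main() {
    let idle_looper = ThreadState {
        process_work_list: false,
        looper_need_return: false,
        is_looper: true,
    };
    println!("empty queue: {:#x}", process_poll(&idle_looper, 0));
    println!("one item:    {:#x}", process_poll(&idle_looper, 1));
}
```

In this model a non-looper thread with no local work never reports
POLLIN, regardless of how much work is queued on the process.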

Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Co-developed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/process.rs | 21 +++++++++++++++++----
 drivers/android/thread.rs  | 39 ++++++++++++++++++++++++++++++++++++---
 2 files changed, 53 insertions(+), 7 deletions(-)

diff --git a/drivers/android/process.rs b/drivers/android/process.rs
index 26dd9309fbee..2e8b0fc07756 100644
--- a/drivers/android/process.rs
+++ b/drivers/android/process.rs
@@ -122,8 +122,16 @@ pub(crate) fn push_work(
         } else if self.is_dead {
             Err((BinderError::new_dead(), work))
         } else {
+            let sync = work.should_sync_wakeup();
+
             // There are no ready threads. Push work to process queue.
             self.work.push_back(work);
+
+            // Wake up polling threads, if any.
+            for thread in self.threads.values() {
+                thread.notify_if_poll_ready(sync);
+            }
+
             Ok(())
         }
     }
@@ -897,11 +905,16 @@ pub(crate) fn mmap(
     }
 
     pub(crate) fn poll(
-        _this: ArcBorrow<'_, Process>,
-        _file: &File,
-        _table: &mut PollTable,
+        this: ArcBorrow<'_, Process>,
+        file: &File,
+        table: &mut PollTable,
     ) -> Result<u32> {
-        Err(EINVAL)
+        let thread = this.get_thread(kernel::current!().pid())?;
+        let (from_proc, mut mask) = thread.poll(file, table);
+        if mask == 0 && from_proc && !this.inner.lock().work.is_empty() {
+            mask |= bindings::POLLIN;
+        }
+        Ok(mask)
     }
 }
 
diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs
index f34de7ad6e6f..159beebbd23e 100644
--- a/drivers/android/thread.rs
+++ b/drivers/android/thread.rs
@@ -7,6 +7,7 @@
 
 use kernel::{
     bindings,
+    file::{File, PollCondVar, PollTable},
     io_buffer::{IoBufferReader, IoBufferWriter},
     list::{
         AtomicListArcTracker, HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks,
@@ -14,7 +15,7 @@
     },
     prelude::*,
     security,
-    sync::{Arc, CondVar, SpinLock},
+    sync::{Arc, SpinLock},
     types::Either,
     user_ptr::{UserSlicePtr, UserSlicePtrWriter},
 };
@@ -75,6 +76,7 @@ struct InnerThread {
 const LOOPER_INVALID: u32 = 0x08;
 const LOOPER_WAITING: u32 = 0x10;
 const LOOPER_WAITING_PROC: u32 = 0x20;
+const LOOPER_POLL: u32 = 0x40;
 
 impl InnerThread {
     fn new() -> Result<Self> {
@@ -159,6 +161,15 @@ fn is_looper(&self) -> bool {
     fn should_use_process_work_queue(&self) -> bool {
         !self.process_work_list && self.is_looper()
     }
+
+    fn poll(&mut self) -> u32 {
+        self.looper_flags |= LOOPER_POLL;
+        if self.process_work_list || self.looper_need_return {
+            bindings::POLLIN
+        } else {
+            0
+        }
+    }
 }
 
 /// This represents a thread that's used with binder.
@@ -169,7 +180,7 @@ pub(crate) struct Thread {
     #[pin]
     inner: SpinLock<InnerThread>,
     #[pin]
-    work_condvar: CondVar,
+    work_condvar: PollCondVar,
     /// Used to insert this thread into the process' `ready_threads` list.
     ///
     /// INVARIANT: May never be used for any other list than the `self.process.ready_threads`.
@@ -201,7 +212,7 @@ pub(crate) fn new(id: i32, process: Arc<Process>) -> Result<Arc<Self>> {
             id,
             process,
             inner <- kernel::new_spinlock!(inner, "Thread::inner"),
-            work_condvar <- kernel::new_condvar!("Thread::work_condvar"),
+            work_condvar <- kernel::new_poll_condvar!("Thread::work_condvar"),
             links <- ListLinks::new(),
             links_track <- AtomicListArcTracker::new(),
         }))
@@ -590,6 +601,12 @@ pub(crate) fn write_read(self: &Arc<Self>, data: UserSlicePtr, wait: bool) -> Re
         ret
     }
 
+    pub(crate) fn poll(&self, file: &File, table: &mut PollTable) -> (bool, u32) {
+        table.register_wait(file, &self.work_condvar);
+        let mut inner = self.inner.lock();
+        (inner.should_use_process_work_queue(), inner.poll())
+    }
+
     /// Make the call to `get_work` or `get_work_local` return immediately, if any.
     pub(crate) fn exit_looper(&self) {
         let mut inner = self.inner.lock();
@@ -604,6 +621,22 @@ pub(crate) fn exit_looper(&self) {
         }
     }
 
+    pub(crate) fn notify_if_poll_ready(&self, sync: bool) {
+        // Determine if we need to notify. This requires the lock.
+        let inner = self.inner.lock();
+        let notify = inner.looper_flags & LOOPER_POLL != 0 && inner.should_use_process_work_queue();
+        drop(inner);
+
+        // Now that the lock is no longer held, notify the waiters if we have to.
+        if notify {
+            if sync {
+                self.work_condvar.notify_sync();
+            } else {
+                self.work_condvar.notify_one();
+            }
+        }
+    }
+
     pub(crate) fn release(self: &Arc<Self>) {
         self.inner.lock().is_dead = true;
 

-- 
2.42.0.820.g83a721a137-goog



* [PATCH RFC 08/20] rust_binder: add non-oneway transactions
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (6 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 07/20] rust_binder: add epoll support Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 09/20] rust_binder: serialize oneway transactions Alice Ryhl
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

From: Wedson Almeida Filho <wedsonaf@gmail.com>

Make it possible to send transactions that are not oneway transactions,
that is, transactions that you need to reply to.

Generally, binder will try to look like a normal function call, where
the call blocks until the function returns. This is implemented by
allowing you to reply to incoming transactions, and having the sender
sleep until a reply arrives.
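
As a rough userspace illustration of the "sender sleeps until a reply
arrives" behaviour, the sketch below blocks a caller on a condition
variable until a worker thread posts a reply. It uses only the Rust
standard library; `ReplySlot` and `call_and_wait` are made-up names for
this example and are not part of the driver.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical slot the sender sleeps on; `None` until the reply arrives.
struct ReplySlot {
    reply: Mutex<Option<u32>>,
    arrived: Condvar,
}

/// Hands `request` to a worker thread and blocks until it replies.
fn call_and_wait(request: u32) -> u32 {
    let slot = Arc::new(ReplySlot {
        reply: Mutex::new(None),
        arrived: Condvar::new(),
    });

    let worker_slot = Arc::clone(&slot);
    let worker = thread::spawn(move || {
        // The "callee" computes an answer and posts it as the reply.
        let answer = request * 2;
        *worker_slot.reply.lock().unwrap() = Some(answer);
        worker_slot.arrived.notify_one();
    });

    // The "caller" sleeps until the reply has been stored.
    let mut guard = slot.reply.lock().unwrap();
    while guard.is_none() {
        guard = slot.arrived.wait(guard).unwrap();
    }
    let reply = guard.take().unwrap();
    drop(guard);

    worker.join().unwrap();
    reply
}
```

From the caller's point of view `call_and_wait` looks like an ordinary
function call, which is the effect binder aims for.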

For each thread, binder will keep track of the current transaction.
Furthermore, if you send a transaction from a thread that already has a
current transaction, then binder will make that transaction into a
"sub-transaction". This mimics a call stack with normal functions. If
you use sub-transactions to send calls A->B->A with A and B being two
different processes, then binder will ensure that the incoming
sub-transaction is executed on the thread in A that sent the original
message to B (and that thread in A is not used for any other incoming
transactions). This feature is often referred to as "deadlock avoidance"
because it avoids the case where A's threadpool has run out of threads,
preventing the incoming sub-transaction from being processed.
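
The thread selection for the A->B->A case can be modelled outside the
kernel. The sketch below is a simplified stand-in that mirrors the idea
of `find_target_thread` in this patch: walk down the transaction stack
and reuse the first sender thread that belongs to the target process.
`ThreadId`, `Txn` and `route_nested_call` are illustrative names, and
the numeric process/thread ids are assumptions made for the example.

```rust
use std::rc::Rc;

/// Simplified stand-in for a binder thread: a process id and a thread id.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct ThreadId {
    process: u32,
    thread: u32,
}

/// Simplified stand-in for a transaction on a transaction stack.
struct Txn {
    from: ThreadId,
    to_process: u32,
    stack_next: Option<Rc<Txn>>,
}

impl Txn {
    fn new(from: ThreadId, to_process: u32, stack_next: Option<Rc<Txn>>) -> Rc<Txn> {
        Rc::new(Txn { from, to_process, stack_next })
    }

    /// Walks down the stack looking for a sender thread in the target process.
    fn find_target_thread(&self) -> Option<ThreadId> {
        let mut it = &self.stack_next;
        while let Some(txn) = it {
            if txn.from.process == self.to_process {
                return Some(txn.from);
            }
            it = &txn.stack_next;
        }
        None
    }
}

/// Builds the A->B->A chain and returns the routing of the outer and nested calls.
fn route_nested_call() -> (Option<ThreadId>, Option<ThreadId>) {
    let a1 = ThreadId { process: 1, thread: 10 };
    let b1 = ThreadId { process: 2, thread: 20 };
    // Thread 10 in process A calls into process B; nothing is below it on the stack.
    let outer = Txn::new(a1, 2, None);
    // Thread 20 in process B handles that call and calls back into process A.
    let nested = Txn::new(b1, 1, Some(outer.clone()));
    (outer.find_target_thread(), nested.find_target_thread())
}
```

In this model the outer call has no preferred thread, so it would go to
the process-wide queue, while the nested call is routed to thread 10 in
process A, the thread that is blocked waiting on B.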

Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Co-developed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/defs.rs        |   2 +
 drivers/android/thread.rs      | 218 ++++++++++++++++++++++++++++++++++++++++-
 drivers/android/transaction.rs | 132 ++++++++++++++++++++++---
 3 files changed, 336 insertions(+), 16 deletions(-)

diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs
index d0fc00fa5a57..32178e8c5596 100644
--- a/drivers/android/defs.rs
+++ b/drivers/android/defs.rs
@@ -33,6 +33,8 @@ macro_rules! pub_no_prefix {
     binder_driver_command_protocol_,
     BC_TRANSACTION,
     BC_TRANSACTION_SG,
+    BC_REPLY,
+    BC_REPLY_SG,
     BC_FREE_BUFFER,
     BC_ENTER_LOOPER,
     BC_EXIT_LOOPER,
diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs
index 159beebbd23e..b583297cea91 100644
--- a/drivers/android/thread.rs
+++ b/drivers/android/thread.rs
@@ -56,6 +56,10 @@ struct InnerThread {
     /// Determines if thread is dead.
     is_dead: bool,
 
+    /// Work item used to deliver error codes to the thread that started a transaction. Stored here
+    /// so that it can be reused.
+    reply_work: DArc<ThreadError>,
+
     /// Work item used to deliver error codes to the current thread. Stored here so that it can be
     /// reused.
     return_work: DArc<ThreadError>,
@@ -65,6 +69,7 @@ struct InnerThread {
     process_work_list: bool,
     /// List of work items to deliver to userspace.
     work_list: List<DTRWrap<dyn DeliverToRead>>,
+    current_transaction: Option<DArc<Transaction>>,
 
     /// Extended error information for this thread.
     extended_error: ExtendedError,
@@ -90,8 +95,10 @@ fn next_err_id() -> u32 {
             looper_need_return: false,
             is_dead: false,
             process_work_list: false,
+            reply_work: ThreadError::try_new()?,
             return_work: ThreadError::try_new()?,
             work_list: List::new(),
+            current_transaction: None,
             extended_error: ExtendedError::new(next_err_id(), BR_OK, 0),
         })
     }
@@ -116,6 +123,15 @@ fn push_work(&mut self, work: DLArc<dyn DeliverToRead>) -> PushWorkRes {
         }
     }
 
+    fn push_reply_work(&mut self, code: u32) {
+        if let Ok(work) = ListArc::try_from_arc(self.reply_work.clone()) {
+            work.set_error_code(code);
+            self.push_work(work);
+        } else {
+            pr_warn!("Thread reply work is already in use.");
+        }
+    }
+
     fn push_return_work(&mut self, reply: u32) {
         if let Ok(work) = ListArc::try_from_arc(self.return_work.clone()) {
             work.set_error_code(reply);
@@ -131,6 +147,36 @@ fn push_work_deferred(&mut self, work: DLArc<dyn DeliverToRead>) {
         self.work_list.push_back(work);
     }
 
+    /// Fetches the transaction this thread can reply to. If the thread has a pending transaction
+    /// (that it could respond to) but it has also issued a transaction, it must first wait for the
+    /// previously-issued transaction to complete.
+    ///
+    /// The `thread` parameter should be the thread containing this `InnerThread`.
+    fn pop_transaction_to_reply(&mut self, thread: &Thread) -> Result<DArc<Transaction>> {
+        let transaction = self.current_transaction.take().ok_or(EINVAL)?;
+        if core::ptr::eq(thread, transaction.from.as_ref()) {
+            self.current_transaction = Some(transaction);
+            return Err(EINVAL);
+        }
+        // Find a new current transaction for this thread.
+        self.current_transaction = transaction.find_from(thread);
+        Ok(transaction)
+    }
+
+    fn pop_transaction_replied(&mut self, transaction: &DArc<Transaction>) -> bool {
+        match self.current_transaction.take() {
+            None => false,
+            Some(old) => {
+                if !Arc::ptr_eq(transaction, &old) {
+                    self.current_transaction = Some(old);
+                    return false;
+                }
+                self.current_transaction = old.clone_next();
+                true
+            }
+        }
+    }
+
     fn looper_enter(&mut self) {
         self.looper_flags |= LOOPER_ENTERED;
         if self.looper_flags & LOOPER_REGISTERED != 0 {
@@ -159,7 +205,7 @@ fn is_looper(&self) -> bool {
     /// looper. Also, if there is local work, we want to return to userspace before we deliver any
     /// remote work.
     fn should_use_process_work_queue(&self) -> bool {
-        !self.process_work_list && self.is_looper()
+        self.current_transaction.is_none() && !self.process_work_list && self.is_looper()
     }
 
     fn poll(&mut self) -> u32 {
@@ -225,6 +271,10 @@ pub(crate) fn get_extended_error(&self, data: UserSlicePtr) -> Result {
         Ok(())
     }
 
+    pub(crate) fn set_current_transaction(&self, transaction: DArc<Transaction>) {
+        self.inner.lock().current_transaction = Some(transaction);
+    }
+
     /// Attempts to fetch a work item from the thread-local queue. The behaviour if the queue is
     /// empty depends on `wait`: if it is true, the function waits for some work to be queued (or a
     /// signal); otherwise it returns indicating that none is available.
@@ -407,6 +457,89 @@ pub(crate) fn copy_transaction_data(
         Ok(alloc)
     }
 
+    fn unwind_transaction_stack(self: &Arc<Self>) {
+        let mut thread = self.clone();
+        while let Ok(transaction) = {
+            let mut inner = thread.inner.lock();
+            inner.pop_transaction_to_reply(thread.as_ref())
+        } {
+            let reply = Either::Right(BR_DEAD_REPLY);
+            if !transaction.from.deliver_single_reply(reply, &transaction) {
+                break;
+            }
+
+            thread = transaction.from.clone();
+        }
+    }
+
+    pub(crate) fn deliver_reply(
+        &self,
+        reply: Either<DLArc<Transaction>, u32>,
+        transaction: &DArc<Transaction>,
+    ) {
+        if self.deliver_single_reply(reply, transaction) {
+            transaction.from.unwind_transaction_stack();
+        }
+    }
+
+    /// Delivers a reply to the thread that started a transaction. The reply can either be a
+    /// reply-transaction or an error code to be delivered instead.
+    ///
+    /// Returns whether the thread is dead. If it is, the caller is expected to unwind the
+    /// transaction stack by completing transactions for threads that are dead.
+    fn deliver_single_reply(
+        &self,
+        reply: Either<DLArc<Transaction>, u32>,
+        transaction: &DArc<Transaction>,
+    ) -> bool {
+        {
+            let mut inner = self.inner.lock();
+            if !inner.pop_transaction_replied(transaction) {
+                return false;
+            }
+
+            if inner.is_dead {
+                return true;
+            }
+
+            match reply {
+                Either::Left(work) => {
+                    inner.push_work(work);
+                }
+                Either::Right(code) => inner.push_reply_work(code),
+            }
+        }
+
+        // Notify the thread now that we've released the inner lock.
+        self.work_condvar.notify_sync();
+        false
+    }
+
+    /// Determines if the given transaction is the current transaction for this thread.
+    fn is_current_transaction(&self, transaction: &DArc<Transaction>) -> bool {
+        let inner = self.inner.lock();
+        match &inner.current_transaction {
+            None => false,
+            Some(current) => Arc::ptr_eq(current, transaction),
+        }
+    }
+
+    /// Determines the current top of the transaction stack. It fails if the top is in another
+    /// thread (i.e., this thread belongs to a stack but it has called another thread). The top is
+    /// [`None`] if the thread is not currently participating in a transaction stack.
+    fn top_of_transaction_stack(&self) -> Result<Option<DArc<Transaction>>> {
+        let inner = self.inner.lock();
+        if let Some(cur) = &inner.current_transaction {
+            if core::ptr::eq(self, cur.from.as_ref()) {
+                pr_warn!("got new transaction with bad transaction stack");
+                return Err(EINVAL);
+            }
+            Ok(Some(cur.clone()))
+        } else {
+            Ok(None)
+        }
+    }
+
     fn transaction<T>(self: &Arc<Self>, tr: &BinderTransactionDataSg, inner: T)
     where
         T: FnOnce(&Arc<Self>, &BinderTransactionDataSg) -> BinderResult,
@@ -427,12 +560,79 @@ fn transaction<T>(self: &Arc<Self>, tr: &BinderTransactionDataSg, inner: T)
         }
     }
 
+    fn transaction_inner(self: &Arc<Self>, tr: &BinderTransactionDataSg) -> BinderResult {
+        let handle = unsafe { tr.transaction_data.target.handle };
+        let node_ref = self.process.get_transaction_node(handle)?;
+        security::binder_transaction(&self.process.cred, &node_ref.node.owner.cred)?;
+        // TODO: We need to ensure that there isn't a pending transaction in the work queue. How
+        // could this happen?
+        let top = self.top_of_transaction_stack()?;
+        let list_completion = DTRWrap::arc_try_new(DeliverCode::new(BR_TRANSACTION_COMPLETE))?;
+        let completion = list_completion.clone_arc();
+        let transaction = Transaction::new(node_ref, top, self, tr)?;
+
+        // Check that the transaction stack hasn't changed while the lock was released, then update
+        // it with the new transaction.
+        {
+            let mut inner = self.inner.lock();
+            if !transaction.is_stacked_on(&inner.current_transaction) {
+                pr_warn!("Transaction stack changed during transaction!");
+                return Err(EINVAL.into());
+            }
+            inner.current_transaction = Some(transaction.clone_arc());
+            // We push the completion as a deferred work so that we wait for the reply before returning
+            // to userland.
+            inner.push_work_deferred(list_completion);
+        }
+
+        if let Err(e) = transaction.submit() {
+            completion.skip();
+            // Define `transaction` first to drop it after `inner`.
+            let transaction;
+            let mut inner = self.inner.lock();
+            transaction = inner.current_transaction.take().unwrap();
+            inner.current_transaction = transaction.clone_next();
+            Err(e)
+        } else {
+            Ok(())
+        }
+    }
+
+    fn reply_inner(self: &Arc<Self>, tr: &BinderTransactionDataSg) -> BinderResult {
+        let orig = self.inner.lock().pop_transaction_to_reply(self)?;
+        if !orig.from.is_current_transaction(&orig) {
+            return Err(EINVAL.into());
+        }
+
+        // We need to complete the transaction even if we cannot complete building the reply.
+        (|| -> BinderResult<_> {
+            let completion = DTRWrap::arc_try_new(DeliverCode::new(BR_TRANSACTION_COMPLETE))?;
+            let process = orig.from.process.clone();
+            let reply = Transaction::new_reply(self, process, tr)?;
+            self.inner.lock().push_work(completion);
+            orig.from.deliver_reply(Either::Left(reply), &orig);
+            Ok(())
+        })()
+        .map_err(|mut err| {
+            // At this point we only return `BR_TRANSACTION_COMPLETE` to the caller, and we must let
+            // the sender know that the transaction has completed (with an error in this case).
+            pr_warn!(
+                "Failure {:?} during reply - delivering BR_FAILED_REPLY to sender.",
+                err
+            );
+            let reply = Either::Right(BR_FAILED_REPLY);
+            orig.from.deliver_reply(reply, &orig);
+            err.reply = BR_TRANSACTION_COMPLETE;
+            err
+        })
+    }
+
     fn oneway_transaction_inner(self: &Arc<Self>, tr: &BinderTransactionDataSg) -> BinderResult {
         let handle = unsafe { tr.transaction_data.target.handle };
         let node_ref = self.process.get_transaction_node(handle)?;
         security::binder_transaction(&self.process.cred, &node_ref.node.owner.cred)?;
         let list_completion = DTRWrap::arc_try_new(DeliverCode::new(BR_TRANSACTION_COMPLETE))?;
-        let transaction = Transaction::new(node_ref, self, tr)?;
+        let transaction = Transaction::new(node_ref, None, self, tr)?;
         let completion = list_completion.clone_arc();
         self.inner.lock().push_work(list_completion);
         match transaction.submit() {
@@ -458,7 +658,7 @@ fn write(self: &Arc<Self>, req: &mut BinderWriteRead) -> Result {
                     if tr.transaction_data.flags & TF_ONE_WAY != 0 {
                         self.transaction(&tr, Self::oneway_transaction_inner);
                     } else {
-                        return Err(EINVAL);
+                        self.transaction(&tr, Self::transaction_inner);
                     }
                 }
                 BC_TRANSACTION_SG => {
@@ -466,9 +666,17 @@ fn write(self: &Arc<Self>, req: &mut BinderWriteRead) -> Result {
                     if tr.transaction_data.flags & TF_ONE_WAY != 0 {
                         self.transaction(&tr, Self::oneway_transaction_inner);
                     } else {
-                        return Err(EINVAL);
+                        self.transaction(&tr, Self::transaction_inner);
                     }
                 }
+                BC_REPLY => {
+                    let tr = reader.read::<BinderTransactionData>()?.with_buffers_size(0);
+                    self.transaction(&tr, Self::reply_inner)
+                }
+                BC_REPLY_SG => {
+                    let tr = reader.read::<BinderTransactionDataSg>()?;
+                    self.transaction(&tr, Self::reply_inner)
+                }
                 BC_FREE_BUFFER => drop(self.process.buffer_get(reader.read()?)),
                 BC_INCREFS => self.process.update_ref(reader.read()?, true, false)?,
                 BC_ACQUIRE => self.process.update_ref(reader.read()?, true, true)?,
@@ -644,6 +852,8 @@ pub(crate) fn release(self: &Arc<Self>) {
         while let Ok(Some(work)) = self.get_work_local(false) {
             work.into_arc().cancel();
         }
+
+        self.unwind_transaction_stack();
     }
 }
 
diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs
index 8b4274ddc415..a6525a4253ea 100644
--- a/drivers/android/transaction.rs
+++ b/drivers/android/transaction.rs
@@ -6,23 +6,25 @@
     prelude::*,
     sync::{Arc, SpinLock},
     task::Kuid,
+    types::{Either, ScopeGuard},
     user_ptr::UserSlicePtrWriter,
 };
 
 use crate::{
     allocation::Allocation,
     defs::*,
-    error::BinderResult,
+    error::{BinderError, BinderResult},
     node::{Node, NodeRef},
     process::Process,
     ptr_align,
-    thread::Thread,
+    thread::{PushWorkRes, Thread},
     DArc, DLArc, DTRWrap, DeliverToRead,
 };
 
 #[pin_data]
 pub(crate) struct Transaction {
     target_node: Option<DArc<Node>>,
+    stack_next: Option<DArc<Transaction>>,
     pub(crate) from: Arc<Thread>,
     to: Arc<Process>,
     #[pin]
@@ -42,6 +44,7 @@ pub(crate) struct Transaction {
 impl Transaction {
     pub(crate) fn new(
         node_ref: NodeRef,
+        stack_next: Option<DArc<Transaction>>,
         from: &Arc<Thread>,
         tr: &BinderTransactionDataSg,
     ) -> BinderResult<DLArc<Self>> {
@@ -59,8 +62,8 @@ pub(crate) fn new(
                     return Err(err);
                 }
             };
-        if trd.flags & TF_ONE_WAY == 0 {
-            pr_warn!("Non-oneway transactions are not yet supported.");
+        if trd.flags & TF_ONE_WAY != 0 && stack_next.is_some() {
+            pr_warn!("Oneway transaction should not be in a transaction stack.");
             return Err(EINVAL.into());
         }
         if trd.flags & TF_CLEAR_BUF != 0 {
@@ -72,6 +75,7 @@ pub(crate) fn new(
 
         Ok(DTRWrap::arc_pin_init(pin_init!(Transaction {
             target_node: Some(target_node),
+            stack_next,
             sender_euid: from.process.cred.euid(),
             from: from.clone(),
             to,
@@ -84,15 +88,100 @@ pub(crate) fn new(
         }))?)
     }
 
-    /// Submits the transaction to a work queue.
+    pub(crate) fn new_reply(
+        from: &Arc<Thread>,
+        to: Arc<Process>,
+        tr: &BinderTransactionDataSg,
+    ) -> BinderResult<DLArc<Self>> {
+        let trd = &tr.transaction_data;
+        let mut alloc = match from.copy_transaction_data(to.clone(), tr, None) {
+            Ok(alloc) => alloc,
+            Err(err) => {
+                pr_warn!("Failure in copy_transaction_data: {:?}", err);
+                return Err(err);
+            }
+        };
+        if trd.flags & TF_CLEAR_BUF != 0 {
+            alloc.set_info_clear_on_drop();
+        }
+        Ok(DTRWrap::arc_pin_init(pin_init!(Transaction {
+            target_node: None,
+            stack_next: None,
+            sender_euid: from.process.task.euid(),
+            from: from.clone(),
+            to,
+            code: trd.code,
+            flags: trd.flags,
+            data_size: trd.data_size as _,
+            data_address: alloc.ptr,
+            allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"),
+            txn_security_ctx_off: None,
+        }))?)
+    }
+
+    /// Determines if the transaction is stacked on top of the given transaction.
+    pub(crate) fn is_stacked_on(&self, onext: &Option<DArc<Self>>) -> bool {
+        match (&self.stack_next, onext) {
+            (None, None) => true,
+            (Some(stack_next), Some(next)) => Arc::ptr_eq(stack_next, next),
+            _ => false,
+        }
+    }
+
+    /// Returns a pointer to the next transaction on the transaction stack, if there is one.
+    pub(crate) fn clone_next(&self) -> Option<DArc<Self>> {
+        Some(self.stack_next.as_ref()?.clone())
+    }
+
+    /// Searches in the transaction stack for a thread that belongs to the target process. This is
+    /// useful when finding a target for a new transaction: if the node belongs to a process that
+    /// is already part of the transaction stack, we reuse the thread.
+    fn find_target_thread(&self) -> Option<Arc<Thread>> {
+        let mut it = &self.stack_next;
+        while let Some(transaction) = it {
+            if Arc::ptr_eq(&transaction.from.process, &self.to) {
+                return Some(transaction.from.clone());
+            }
+            it = &transaction.stack_next;
+        }
+        None
+    }
+
+    /// Searches in the transaction stack for a transaction originating at the given thread.
+    pub(crate) fn find_from(&self, thread: &Thread) -> Option<DArc<Transaction>> {
+        let mut it = &self.stack_next;
+        while let Some(transaction) = it {
+            if core::ptr::eq(thread, transaction.from.as_ref()) {
+                return Some(transaction.clone());
+            }
+
+            it = &transaction.stack_next;
+        }
+        None
+    }
+
+    /// Submits the transaction to a work queue. Uses a thread if there is one in the transaction
+    /// stack, otherwise uses the destination process.
+    ///
+    /// Not used for replies.
     pub(crate) fn submit(self: DLArc<Self>) -> BinderResult {
         let process = self.to.clone();
         let mut process_inner = process.inner.lock();
-        match process_inner.push_work(self) {
+
+        let res = if let Some(thread) = self.find_target_thread() {
+            match thread.push_work(self) {
+                PushWorkRes::Ok => Ok(()),
+                PushWorkRes::FailedDead(me) => Err((BinderError::new_dead(), me)),
+            }
+        } else {
+            process_inner.push_work(self)
+        };
+        drop(process_inner);
+
+        match res {
             Ok(()) => Ok(()),
             Err((err, work)) => {
                 // Drop work after releasing process lock.
-                drop(process_inner);
                 drop(work);
                 Err(err)
             }
@@ -101,11 +190,14 @@ pub(crate) fn submit(self: DLArc<Self>) -> BinderResult {
 }
 
 impl DeliverToRead for Transaction {
-    fn do_work(
-        self: DArc<Self>,
-        _thread: &Thread,
-        writer: &mut UserSlicePtrWriter,
-    ) -> Result<bool> {
+    fn do_work(self: DArc<Self>, thread: &Thread, writer: &mut UserSlicePtrWriter) -> Result<bool> {
+        let send_failed_reply = ScopeGuard::new(|| {
+            if self.target_node.is_some() && self.flags & TF_ONE_WAY == 0 {
+                let reply = Either::Right(BR_FAILED_REPLY);
+                self.from.deliver_reply(reply, &self);
+            }
+        });
+
         let mut tr_sec = BinderTransactionDataSecctx::default();
         let tr = tr_sec.tr_data();
         if let Some(target_node) = &self.target_node {
@@ -144,17 +236,33 @@ fn do_work(
             writer.write(&*tr)?;
         }
 
+        // Dismiss the completion of transaction with a failure. No failure paths are allowed from
+        // here on out.
+        send_failed_reply.dismiss();
+
         // It is now the user's responsibility to clear the allocation.
         let alloc = self.allocation.lock().take();
         if let Some(alloc) = alloc {
             alloc.keep_alive();
         }
 
+        // When this is not a reply and not a oneway transaction, update `current_transaction`. If
+        // it's a reply, `current_transaction` has already been updated appropriately.
+        if self.target_node.is_some() && tr_sec.transaction_data.flags & TF_ONE_WAY == 0 {
+            thread.set_current_transaction(self);
+        }
+
         Ok(false)
     }
 
     fn cancel(self: DArc<Self>) {
         drop(self.allocation.lock().take());
+
+        // If this is not a reply or oneway transaction, then send a dead reply.
+        if self.target_node.is_some() && self.flags & TF_ONE_WAY == 0 {
+            let reply = Either::Right(BR_DEAD_REPLY);
+            self.from.deliver_reply(reply, &self);
+        }
     }
 
     fn should_sync_wakeup(&self) -> bool {

-- 
2.42.0.820.g83a721a137-goog



* [PATCH RFC 09/20] rust_binder: serialize oneway transactions
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (7 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 08/20] rust_binder: add non-oneway transactions Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 10/20] rust_binder: add death notifications Alice Ryhl
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

The binder driver guarantees that oneway transactions sent to the same
node are serialized, that is, userspace will not be given the next one
until it has finished processing the previous oneway transaction. This
is done to avoid the case where two oneway transactions arrive in
opposite order from the order in which they were sent. (E.g., they could
be delivered to two different threads, which could appear as if they
were sent in opposite order.)

To fix that, we store pending oneway transactions in a separate list in
the node, and don't deliver the next oneway transaction until userspace
signals that it has finished processing the previous oneway transaction
by issuing the BC_FREE_BUFFER command.
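
The queueing rule can be sketched in plain Rust. This is a simplified
model, not the driver code: transactions are represented by `u32` ids,
locking is omitted, and `delivery_order` is a made-up helper for the
example. The field and method names follow the ones added to `NodeInner`
and `Node` in this patch.

```rust
use std::collections::VecDeque;

/// Simplified model of the per-node oneway state.
struct Node {
    oneway_todo: VecDeque<u32>,
    has_pending_oneway_todo: bool,
}

impl Node {
    fn new() -> Self {
        Node {
            oneway_todo: VecDeque::new(),
            has_pending_oneway_todo: false,
        }
    }

    /// Returns `Some(txn)` if the transaction may be delivered right away,
    /// or `None` if it was queued behind an in-flight oneway transaction.
    fn submit_oneway(&mut self, txn: u32) -> Option<u32> {
        if self.has_pending_oneway_todo {
            self.oneway_todo.push_back(txn);
            None
        } else {
            self.has_pending_oneway_todo = true;
            Some(txn)
        }
    }

    /// Called when userspace frees the buffer of the in-flight transaction.
    /// Returns the next transaction to deliver, if one is queued.
    fn pending_oneway_finished(&mut self) -> Option<u32> {
        let next = self.oneway_todo.pop_front();
        self.has_pending_oneway_todo = next.is_some();
        next
    }
}

/// Submits four transactions and records the order in which they are delivered.
fn delivery_order() -> Vec<u32> {
    let mut node = Node::new();
    let mut delivered = Vec::new();
    delivered.extend(node.submit_oneway(1)); // delivered immediately
    delivered.extend(node.submit_oneway(2)); // queued behind 1
    delivered.extend(node.submit_oneway(3)); // queued behind 2
    delivered.extend(node.pending_oneway_finished()); // 1 freed, 2 delivered
    delivered.extend(node.pending_oneway_finished()); // 2 freed, 3 delivered
    delivered.extend(node.pending_oneway_finished()); // 3 freed, queue empty
    delivered.extend(node.submit_oneway(4)); // nothing in flight, delivered
    delivered
}
```

At most one oneway transaction per node is in flight at any time, so the
delivery order always matches the submission order.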

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/allocation.rs  | 19 +++++++++-
 drivers/android/node.rs        | 79 ++++++++++++++++++++++++++++++++++++++++--
 drivers/android/process.rs     | 25 ++++++++++---
 drivers/android/transaction.rs | 26 ++++++++++++--
 4 files changed, 138 insertions(+), 11 deletions(-)

diff --git a/drivers/android/allocation.rs b/drivers/android/allocation.rs
index 1ab0f254fded..0fdef5425918 100644
--- a/drivers/android/allocation.rs
+++ b/drivers/android/allocation.rs
@@ -3,13 +3,22 @@
 
 use kernel::{bindings, pages::Pages, prelude::*, sync::Arc, user_ptr::UserSlicePtrReader};
 
-use crate::{node::NodeRef, process::Process};
+use crate::{
+    node::{Node, NodeRef},
+    process::Process,
+    DArc,
+};
 
 #[derive(Default)]
 pub(crate) struct AllocationInfo {
     /// The target node of the transaction this allocation is associated to.
     /// Not set for replies.
     pub(crate) target_node: Option<NodeRef>,
+    /// When this allocation is dropped, call `pending_oneway_finished` on the node.
+    ///
+    /// This is used to serialize oneway transactions on the same node. Binder guarantees that
+    /// oneway transactions to the same node are delivered sequentially in the order they are sent.
+    pub(crate) oneway_node: Option<DArc<Node>>,
     /// Zero the data in the buffer on free.
     pub(crate) clear_on_free: bool,
 }
@@ -110,6 +119,10 @@ pub(crate) fn get_or_init_info(&mut self) -> &mut AllocationInfo {
         self.allocation_info.get_or_insert_with(Default::default)
     }
 
+    pub(crate) fn set_info_oneway_node(&mut self, oneway_node: DArc<Node>) {
+        self.get_or_init_info().oneway_node = Some(oneway_node);
+    }
+
     pub(crate) fn set_info_clear_on_drop(&mut self) {
         self.get_or_init_info().clear_on_free = true;
     }
@@ -126,6 +139,10 @@ fn drop(&mut self) {
         }
 
         if let Some(mut info) = self.allocation_info.take() {
+            if let Some(oneway_node) = info.oneway_node.as_ref() {
+                oneway_node.pending_oneway_finished();
+            }
+
             info.target_node = None;
 
             if info.clear_on_free {
diff --git a/drivers/android/node.rs b/drivers/android/node.rs
index c6c3d81e705d..b8a08b16c06d 100644
--- a/drivers/android/node.rs
+++ b/drivers/android/node.rs
@@ -2,7 +2,9 @@
 
 use kernel::{
     io_buffer::IoBufferWriter,
-    list::{AtomicListArcTracker, ListArcSafe, TryNewListArc},
+    list::{
+        AtomicListArcTracker, HasListLinks, List, ListArcSafe, ListItem, ListLinks, TryNewListArc,
+    },
     prelude::*,
     sync::lock::{spinlock::SpinLockBackend, Guard},
     sync::{Arc, LockedBy},
@@ -11,9 +13,11 @@
 
 use crate::{
     defs::*,
+    error::BinderError,
     process::{Process, ProcessInner},
     thread::Thread,
-    DArc, DeliverToRead,
+    transaction::Transaction,
+    DArc, DLArc, DTRWrap, DeliverToRead,
 };
 
 struct CountState {
@@ -36,6 +40,8 @@ fn new() -> Self {
 struct NodeInner {
     strong: CountState,
     weak: CountState,
+    oneway_todo: List<DTRWrap<Transaction>>,
+    has_pending_oneway_todo: bool,
     /// The number of active BR_INCREFS or BR_ACQUIRE operations. (should be maximum two)
     ///
     /// If this is non-zero, then we postpone any BR_RELEASE or BR_DECREFS notifications until the
@@ -62,6 +68,16 @@ impl ListArcSafe<0> for Node {
     }
 }
 
+// These make `oneway_todo` work.
+kernel::list::impl_has_list_links! {
+    impl HasListLinks<0> for DTRWrap<Transaction> { self.links.inner }
+}
+kernel::list::impl_list_item! {
+    impl ListItem<0> for DTRWrap<Transaction> {
+        using ListLinks;
+    }
+}
+
 impl Node {
     pub(crate) fn new(
         ptr: usize,
@@ -79,6 +95,8 @@ pub(crate) fn new(
                 NodeInner {
                     strong: CountState::new(),
                     weak: CountState::new(),
+                    oneway_todo: List::new(),
+                    has_pending_oneway_todo: false,
                     active_inc_refs: 0,
                 },
             ),
@@ -201,6 +219,63 @@ fn write(&self, writer: &mut UserSlicePtrWriter, code: u32) -> Result {
         writer.write(&self.cookie)?;
         Ok(())
     }
+
+    pub(crate) fn submit_oneway(
+        &self,
+        transaction: DLArc<Transaction>,
+        guard: &mut Guard<'_, ProcessInner, SpinLockBackend>,
+    ) -> Result<(), (BinderError, DLArc<dyn DeliverToRead>)> {
+        if guard.is_dead {
+            return Err((BinderError::new_dead(), transaction));
+        }
+
+        let inner = self.inner.access_mut(guard);
+        if inner.has_pending_oneway_todo {
+            inner.oneway_todo.push_back(transaction);
+        } else {
+            inner.has_pending_oneway_todo = true;
+            guard.push_work(transaction)?;
+        }
+        Ok(())
+    }
+
+    pub(crate) fn release(&self, guard: &mut Guard<'_, ProcessInner, SpinLockBackend>) {
+        // Move every pending oneway transaction to the process todolist. The process
+        // will cancel it later.
+        //
+        // New items can't be pushed after this call, since `submit_oneway` fails when the process
+        // is dead, which is set before `Node::release` is called.
+        //
+        // TODO: Give our linked list implementation the ability to move everything in one go.
+        while let Some(work) = self.inner.access_mut(guard).oneway_todo.pop_front() {
+            guard.push_work_for_release(work);
+        }
+    }
+
+    pub(crate) fn pending_oneway_finished(&self) {
+        let mut guard = self.owner.inner.lock();
+        if guard.is_dead {
+            // Cleanup will happen in `Process::deferred_release`.
+            return;
+        }
+
+        let inner = self.inner.access_mut(&mut guard);
+
+        let transaction = inner.oneway_todo.pop_front();
+        inner.has_pending_oneway_todo = transaction.is_some();
+        if let Some(transaction) = transaction {
+            match guard.push_work(transaction) {
+                Ok(()) => {}
+                Err((_err, work)) => {
+                    // Process is dead.
+                    // This shouldn't happen due to the `is_dead` check, but if it does, just drop
+                    // the transaction and return.
+                    drop(guard);
+                    drop(work);
+                }
+            }
+        }
+    }
 }
 
 impl DeliverToRead for Node {
diff --git a/drivers/android/process.rs b/drivers/android/process.rs
index 2e8b0fc07756..d4e50c7f9a88 100644
--- a/drivers/android/process.rs
+++ b/drivers/android/process.rs
@@ -136,6 +136,11 @@ pub(crate) fn push_work(
         }
     }
 
+    /// Push work to be cancelled. Only used during process teardown.
+    pub(crate) fn push_work_for_release(&mut self, work: DLArc<dyn DeliverToRead>) {
+        self.work.push_back(work);
+    }
+
     pub(crate) fn remove_node(&mut self, ptr: usize) {
         self.nodes.remove(&ptr);
     }
@@ -740,6 +745,21 @@ fn deferred_release(self: Arc<Self>) {
 
         self.ctx.deregister_process(&self);
 
+        // Move oneway_todo into the process todolist.
+        {
+            let mut inner = self.inner.lock();
+            let nodes = take(&mut inner.nodes);
+            for node in nodes.values() {
+                node.release(&mut inner);
+            }
+            inner.nodes = nodes;
+        }
+
+        // Cancel all pending work items.
+        while let Some(work) = self.get_work() {
+            work.into_arc().cancel();
+        }
+
         // Move the threads out of `inner` so that we can iterate over them without holding the
         // lock.
         let mut inner = self.inner.lock();
@@ -751,11 +771,6 @@ fn deferred_release(self: Arc<Self>) {
             thread.release();
         }
 
-        // Cancel all pending work items.
-        while let Some(work) = self.get_work() {
-            work.into_arc().cancel();
-        }
-
         // Free any resources kept alive by allocated buffers.
         let omapping = self.inner.lock().mapping.take();
         if let Some(mut mapping) = omapping {
diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs
index a6525a4253ea..a4ffe0a3878c 100644
--- a/drivers/android/transaction.rs
+++ b/drivers/android/transaction.rs
@@ -62,9 +62,12 @@ pub(crate) fn new(
                     return Err(err);
                 }
             };
-        if trd.flags & TF_ONE_WAY != 0 && stack_next.is_some() {
-            pr_warn!("Oneway transaction should not be in a transaction stack.");
-            return Err(EINVAL.into());
+        if trd.flags & TF_ONE_WAY != 0 {
+            if stack_next.is_some() {
+                pr_warn!("Oneway transaction should not be in a transaction stack.");
+                return Err(EINVAL.into());
+            }
+            alloc.set_info_oneway_node(node_ref.node.clone());
         }
         if trd.flags & TF_CLEAR_BUF != 0 {
             alloc.set_info_clear_on_drop();
@@ -165,9 +168,26 @@ pub(crate) fn find_from(&self, thread: &Thread) -> Option<DArc<Transaction>> {
     ///
     /// Not used for replies.
     pub(crate) fn submit(self: DLArc<Self>) -> BinderResult {
+        let oneway = self.flags & TF_ONE_WAY != 0;
         let process = self.to.clone();
         let mut process_inner = process.inner.lock();
 
+        if oneway {
+            if let Some(target_node) = self.target_node.clone() {
+                match target_node.submit_oneway(self, &mut process_inner) {
+                    Ok(()) => return Ok(()),
+                    Err((err, work)) => {
+                        drop(process_inner);
+                        // Drop work after releasing process lock.
+                        drop(work);
+                        return Err(err);
+                    }
+                }
+            } else {
+                pr_err!("Failed to submit oneway transaction to node.");
+            }
+        }
+
         let res = if let Some(thread) = self.find_target_thread() {
             match thread.push_work(self) {
                 PushWorkRes::Ok => Ok(()),

-- 
2.42.0.820.g83a721a137-goog



* [PATCH RFC 10/20] rust_binder: add death notifications
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (8 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 09/20] rust_binder: serialize oneway transactions Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 11/20] rust_binder: send nodes in transactions Alice Ryhl
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

From: Wedson Almeida Filho <wedsonaf@gmail.com>

This adds death notifications that let one process be notified when
another process dies.

A process can request to be notified when another process dies using
`BC_REQUEST_DEATH_NOTIFICATION`. This will make the driver send a
`BR_DEAD_BINDER` to userspace when the process dies (or immediately if
it is already dead). Userspace is supposed to respond with
`BC_DEAD_BINDER_DONE` once it has processed the notification.
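
The registration path described above can be sketched in plain Rust as
follows. This is only an illustrative model, not the driver code: the
`Owner`, `Requester` and `Death` types and both functions are
hypothetical, and a single requester stands in for the per-registration
target process that the driver stores in each `NodeDeath`.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a registered death notification.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Death {
    pub cookie: usize,
}

/// Hypothetical model of the process that owns the node being watched.
#[derive(Default)]
pub struct Owner {
    pub is_dead: bool,
    /// Registrations that are waiting for the owner to die.
    pub death_list: VecDeque<Death>,
}

/// Hypothetical model of the process that asked to be notified.
#[derive(Default)]
pub struct Requester {
    /// Work items that would be reported as `BR_DEAD_BINDER` on the next read.
    pub todo: VecDeque<Death>,
}

/// Models `BC_REQUEST_DEATH_NOTIFICATION`.
///
/// If the owner is already dead the notification is queued immediately,
/// otherwise it is parked on the owner's death list.
pub fn request_death(owner: &mut Owner, requester: &mut Requester, cookie: usize) {
    let death = Death { cookie };
    if owner.is_dead {
        requester.todo.push_back(death);
    } else {
        owner.death_list.push_back(death);
    }
}

/// Models the owner dying: every parked registration is moved to the
/// requester's todo list.
pub fn owner_died(owner: &mut Owner, requester: &mut Requester) {
    owner.is_dead = true;
    while let Some(death) = owner.death_list.pop_front() {
        requester.todo.push_back(death);
    }
}
```

In this model a registration made before the owner dies is delivered
when `owner_died` runs, and one made afterwards is delivered right away.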

Userspace can unregister from death notifications using the
`BC_CLEAR_DEATH_NOTIFICATION` command. In this case, the kernel will
respond with `BR_CLEAR_DEATH_NOTIFICATION_DONE` once the notification
has been removed. Note that if the remote process dies before the kernel
has responded with `BR_CLEAR_DEATH_NOTIFICATION_DONE`, then the kernel
will still send a `BR_DEAD_BINDER`, which userspace must be able to
process. In this case, the kernel will wait for the
`BC_DEAD_BINDER_DONE` command before it sends
`BR_CLEAR_DEATH_NOTIFICATION_DONE`.
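
The ordering rule above can be modelled with three flags. The sketch
below mirrors the predicates used by `set_cleared`, `set_dead`,
`set_notification_done` and `do_work` in this patch, but it leaves out
locking, the `aborted` flag and the work queues; the `DeathState` and
`Reply` types are hypothetical names used only for illustration.

```rust
/// The two replies the kernel can send for a death registration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Reply {
    DeadBinder,
    ClearDeathNotificationDone,
}

/// Hypothetical, lock-free model of the per-registration flags.
#[derive(Debug, Default, Clone, Copy)]
pub struct DeathState {
    pub dead: bool,
    pub cleared: bool,
    pub notification_done: bool,
}

impl DeathState {
    /// Models `BC_CLEAR_DEATH_NOTIFICATION`.
    ///
    /// Returns whether the "clear done" reply can be queued right away. It
    /// cannot if a `BR_DEAD_BINDER` is still waiting for its
    /// `BC_DEAD_BINDER_DONE`.
    pub fn set_cleared(&mut self) -> bool {
        if self.cleared {
            return false;
        }
        self.cleared = true;
        !self.dead || self.notification_done
    }

    /// Models the remote process dying. Returns whether a `BR_DEAD_BINDER`
    /// should be queued.
    pub fn set_dead(&mut self) -> bool {
        if self.cleared {
            return false;
        }
        self.dead = true;
        true
    }

    /// Models `BC_DEAD_BINDER_DONE`. Returns whether a deferred "clear done"
    /// reply should now be queued.
    pub fn set_notification_done(&mut self) -> bool {
        self.notification_done = true;
        self.cleared
    }

    /// Picks the reply that is written when the work item is delivered.
    pub fn next_reply(&self) -> Reply {
        if self.cleared && (!self.dead || self.notification_done) {
            Reply::ClearDeathNotificationDone
        } else {
            Reply::DeadBinder
        }
    }
}
```

When the death arrives first, `set_cleared` returns `false` and the
"clear done" reply is only queued once `set_notification_done` reports
that the clear is pending.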

Note that even if the kernel sends a `BR_DEAD_BINDER`, this does not
remove the death notification. Userspace must still remove it manually
using `BC_CLEAR_DEATH_NOTIFICATION`.

If a process uses `BC_RELEASE` to destroy its last refcount on a node
that has an active death registration, then the death registration is
immediately deleted. However, userspace is not supposed to delete a
node reference without first deregistering death notifications, so this
codepath is not executed under normal circumstances.
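
As a rough illustration of this corner case, the sketch below models a
handle table in which dropping the last reference also drops the
attached registration. It is an assumption-laden simplification: the
`HandleTable` and `RefInfo` types are hypothetical, only a strong count
is tracked, and the driver's `aborted` flag is reduced to returning the
cookie of the registration that was discarded.

```rust
use std::collections::HashMap;

/// Hypothetical per-handle bookkeeping: a strong count plus the cookie of an
/// optional death registration.
pub struct RefInfo {
    pub strong: u32,
    pub death_cookie: Option<usize>,
}

/// Hypothetical model of a process's handle table.
pub struct HandleTable {
    pub by_handle: HashMap<u32, RefInfo>,
}

impl HandleTable {
    /// Models `BC_RELEASE`.
    ///
    /// Decrements the strong count. When it reaches zero, the handle is
    /// removed together with any death registration attached to it, and the
    /// cookie of that registration is returned so the caller can see what was
    /// discarded. Returns `None` if the handle is unknown, still referenced,
    /// or had no registration.
    pub fn release(&mut self, handle: u32) -> Option<usize> {
        let info = self.by_handle.get_mut(&handle)?;
        info.strong = info.strong.saturating_sub(1);
        if info.strong > 0 {
            return None;
        }
        let info = self.by_handle.remove(&handle)?;
        info.death_cookie
    }
}
```

Nothing is delivered to userspace for a registration discarded this way,
which matches the "behave as if it never existed" intent described above.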

Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Co-developed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/defs.rs        |  10 +-
 drivers/android/node.rs        | 258 ++++++++++++++++++++++++++++++++++++++++-
 drivers/android/process.rs     | 193 +++++++++++++++++++++++++++---
 drivers/android/rust_binder.rs |   7 ++
 drivers/android/thread.rs      |  22 +++-
 5 files changed, 471 insertions(+), 19 deletions(-)

diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs
index 32178e8c5596..753f7e86c92d 100644
--- a/drivers/android/defs.rs
+++ b/drivers/android/defs.rs
@@ -23,10 +23,13 @@ macro_rules! pub_no_prefix {
     BR_SPAWN_LOOPER,
     BR_TRANSACTION_COMPLETE,
     BR_OK,
+    BR_ERROR,
     BR_INCREFS,
     BR_ACQUIRE,
     BR_RELEASE,
-    BR_DECREFS
+    BR_DECREFS,
+    BR_DEAD_BINDER,
+    BR_CLEAR_DEATH_NOTIFICATION_DONE
 );
 
 pub_no_prefix!(
@@ -44,7 +47,10 @@ macro_rules! pub_no_prefix {
     BC_RELEASE,
     BC_DECREFS,
     BC_INCREFS_DONE,
-    BC_ACQUIRE_DONE
+    BC_ACQUIRE_DONE,
+    BC_REQUEST_DEATH_NOTIFICATION,
+    BC_CLEAR_DEATH_NOTIFICATION,
+    BC_DEAD_BINDER_DONE
 );
 
 pub(crate) const FLAT_BINDER_FLAG_TXN_SECURITY_CTX: u32 =
diff --git a/drivers/android/node.rs b/drivers/android/node.rs
index b8a08b16c06d..7ed494bf9f7c 100644
--- a/drivers/android/node.rs
+++ b/drivers/android/node.rs
@@ -3,11 +3,12 @@
 use kernel::{
     io_buffer::IoBufferWriter,
     list::{
-        AtomicListArcTracker, HasListLinks, List, ListArcSafe, ListItem, ListLinks, TryNewListArc,
+        AtomicListArcTracker, HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks,
+        TryNewListArc,
     },
     prelude::*,
     sync::lock::{spinlock::SpinLockBackend, Guard},
-    sync::{Arc, LockedBy},
+    sync::{Arc, LockedBy, SpinLock},
     user_ptr::UserSlicePtrWriter,
 };
 
@@ -40,6 +41,7 @@ fn new() -> Self {
 struct NodeInner {
     strong: CountState,
     weak: CountState,
+    death_list: List<DTRWrap<NodeDeath>, 1>,
     oneway_todo: List<DTRWrap<Transaction>>,
     has_pending_oneway_todo: bool,
     /// The number of active BR_INCREFS or BR_ACQUIRE operations. (should be maximum two)
@@ -95,6 +97,7 @@ pub(crate) fn new(
                 NodeInner {
                     strong: CountState::new(),
                     weak: CountState::new(),
+                    death_list: List::new(),
                     oneway_todo: List::new(),
                     has_pending_oneway_todo: false,
                     active_inc_refs: 0,
@@ -112,6 +115,25 @@ pub(crate) fn get_id(&self) -> (usize, usize) {
         (self.ptr, self.cookie)
     }
 
+    pub(crate) fn next_death(
+        &self,
+        guard: &mut Guard<'_, ProcessInner, SpinLockBackend>,
+    ) -> Option<DArc<NodeDeath>> {
+        self.inner
+            .access_mut(guard)
+            .death_list
+            .pop_front()
+            .map(|larc| larc.into_arc())
+    }
+
+    pub(crate) fn add_death(
+        &self,
+        death: ListArc<DTRWrap<NodeDeath>, 1>,
+        guard: &mut Guard<'_, ProcessInner, SpinLockBackend>,
+    ) {
+        self.inner.access_mut(guard).death_list.push_back(death);
+    }
+
     pub(crate) fn inc_ref_done_locked(
         &self,
         _strong: bool,
@@ -449,3 +471,235 @@ fn drop(&mut self) {
         }
     }
 }
+
+struct NodeDeathInner {
+    dead: bool,
+    cleared: bool,
+    notification_done: bool,
+    /// Indicates whether the normal flow was interrupted by removing the handle. In this case, we
+    /// need to behave as if the death notification didn't exist (i.e., we don't deliver anything
+    /// to the user).
+    aborted: bool,
+}
+
+/// Used to deliver notifications when a process dies.
+///
+/// A process can request to be notified when a process dies using `BC_REQUEST_DEATH_NOTIFICATION`.
+/// This will make the driver send a `BR_DEAD_BINDER` to userspace when the process dies (or
+/// immediately if it is already dead). Userspace is supposed to respond with `BC_DEAD_BINDER_DONE`
+/// once it has processed the notification.
+///
+/// Userspace can unregister from death notifications using the `BC_CLEAR_DEATH_NOTIFICATION`
+/// command. In this case, the kernel will respond with `BR_CLEAR_DEATH_NOTIFICATION_DONE` once the
+/// notification has been removed. Note that if the remote process dies before the kernel has
+/// responded with `BR_CLEAR_DEATH_NOTIFICATION_DONE`, then the kernel will still send a
+/// `BR_DEAD_BINDER`, which userspace must be able to process. In this case, the kernel will wait
+/// for the `BC_DEAD_BINDER_DONE` command before it sends `BR_CLEAR_DEATH_NOTIFICATION_DONE`.
+///
+/// Note that even if the kernel sends a `BR_DEAD_BINDER`, this does not remove the death
+/// notification. Userspace must still remove it manually using `BC_CLEAR_DEATH_NOTIFICATION`.
+///
+/// If a process uses `BC_RELEASE` to destroy its last refcount on a node that has an active death
+/// registration, then the death registration is immediately deleted (we implement this using the
+/// `aborted` field). However, userspace is not supposed to delete a `NodeRef` without first
+/// deregistering death notifications, so this codepath is not executed under normal circumstances.
+#[pin_data]
+pub(crate) struct NodeDeath {
+    node: DArc<Node>,
+    process: Arc<Process>,
+    pub(crate) cookie: usize,
+    #[pin]
+    links_track: AtomicListArcTracker<0>,
+    /// Used by the owner `Node` to store a list of registered death notifications.
+    ///
+    /// # Invariants
+    ///
+    /// Only ever used with the `death_list` list of `self.node`.
+    #[pin]
+    death_links: ListLinks<1>,
+    /// Used by the process to keep track of the death notifications for which we have sent a
+    /// `BR_DEAD_BINDER` but not yet received a `BC_DEAD_BINDER_DONE`.
+    ///
+    /// # Invariants
+    ///
+    /// Only ever used with the `delivered_deaths` list of `self.process`.
+    #[pin]
+    delivered_links: ListLinks<2>,
+    #[pin]
+    delivered_links_track: AtomicListArcTracker<2>,
+    #[pin]
+    inner: SpinLock<NodeDeathInner>,
+}
+
+impl NodeDeath {
+    /// Constructs a new node death notification object.
+    pub(crate) fn new(
+        node: DArc<Node>,
+        process: Arc<Process>,
+        cookie: usize,
+    ) -> impl PinInit<DTRWrap<Self>> {
+        DTRWrap::new(pin_init!(
+            Self {
+                node,
+                process,
+                cookie,
+                links_track <- AtomicListArcTracker::new(),
+                death_links <- ListLinks::new(),
+                delivered_links <- ListLinks::new(),
+                delivered_links_track <- AtomicListArcTracker::new(),
+                inner <- kernel::new_spinlock!(NodeDeathInner {
+                    dead: false,
+                    cleared: false,
+                    notification_done: false,
+                    aborted: false,
+                }, "NodeDeath::inner"),
+            }
+        ))
+    }
+
+    /// Sets the cleared flag to `true`.
+    ///
+    /// It removes `self` from the node's death notification list if needed.
+    ///
+    /// Returns whether it needs to be queued.
+    pub(crate) fn set_cleared(self: &DArc<Self>, abort: bool) -> bool {
+        let (needs_removal, needs_queueing) = {
+            // Update state and determine if we need to queue a work item. We only need to do it
+            // when the node is not dead or if the user already completed the death notification.
+            let mut inner = self.inner.lock();
+            if abort {
+                inner.aborted = true;
+            }
+            if inner.cleared {
+                // Already cleared.
+                return false;
+            }
+            inner.cleared = true;
+            (!inner.dead, !inner.dead || inner.notification_done)
+        };
+
+        // Remove death notification from node.
+        if needs_removal {
+            let mut owner_inner = self.node.owner.inner.lock();
+            let node_inner = self.node.inner.access_mut(&mut owner_inner);
+            // SAFETY: A `NodeDeath` is never inserted into the death list of any node other than
+            // its owner, so it is either in this death list or in no death list.
+            unsafe { node_inner.death_list.remove(self) };
+        }
+        needs_queueing
+    }
+
+    /// Sets the 'notification done' flag to `true`.
+    pub(crate) fn set_notification_done(self: DArc<Self>, thread: &Thread) {
+        let needs_queueing = {
+            let mut inner = self.inner.lock();
+            inner.notification_done = true;
+            inner.cleared
+        };
+        if needs_queueing {
+            if let Some(death) = ListArc::try_from_arc_or_drop(self) {
+                let _ = thread.push_work_if_looper(death);
+            }
+        }
+    }
+
+    /// Sets the 'dead' flag to `true` and queues work item if needed.
+    pub(crate) fn set_dead(self: DArc<Self>) {
+        let needs_queueing = {
+            let mut inner = self.inner.lock();
+            if inner.cleared {
+                false
+            } else {
+                inner.dead = true;
+                true
+            }
+        };
+        if needs_queueing {
+            // Push the death notification to the target process. There is nothing else to do if
+            // it's already dead.
+            if let Some(death) = ListArc::try_from_arc_or_drop(self) {
+                let process = death.process.clone();
+                let _ = process.push_work(death);
+            }
+        }
+    }
+}
+
+kernel::list::impl_list_arc_safe! {
+    impl ListArcSafe<0> for NodeDeath {
+        tracked_by links_track: AtomicListArcTracker;
+    }
+}
+
+kernel::list::impl_has_list_links! {
+    impl HasListLinks<1> for DTRWrap<NodeDeath> { self.wrapped.death_links }
+}
+kernel::list::impl_list_arc_safe! {
+    impl ListArcSafe<1> for DTRWrap<NodeDeath> { untracked; }
+}
+kernel::list::impl_list_item! {
+    impl ListItem<1> for DTRWrap<NodeDeath> {
+        using ListLinks;
+    }
+}
+
+kernel::list::impl_has_list_links! {
+    impl HasListLinks<2> for DTRWrap<NodeDeath> { self.wrapped.delivered_links }
+}
+kernel::list::impl_list_arc_safe! {
+    impl ListArcSafe<2> for DTRWrap<NodeDeath> {
+        tracked_by wrapped: NodeDeath;
+    }
+}
+kernel::list::impl_list_arc_safe! {
+    impl ListArcSafe<2> for NodeDeath {
+        tracked_by delivered_links_track: AtomicListArcTracker<2>;
+    }
+}
+kernel::list::impl_list_item! {
+    impl ListItem<2> for DTRWrap<NodeDeath> {
+        using ListLinks;
+    }
+}
+
+impl DeliverToRead for NodeDeath {
+    fn do_work(
+        self: DArc<Self>,
+        _thread: &Thread,
+        writer: &mut UserSlicePtrWriter,
+    ) -> Result<bool> {
+        let done = {
+            let inner = self.inner.lock();
+            if inner.aborted {
+                return Ok(true);
+            }
+            inner.cleared && (!inner.dead || inner.notification_done)
+        };
+
+        let cookie = self.cookie;
+        let cmd = if done {
+            BR_CLEAR_DEATH_NOTIFICATION_DONE
+        } else {
+            let process = self.process.clone();
+            let mut process_inner = process.inner.lock();
+            let inner = self.inner.lock();
+            if inner.aborted {
+                return Ok(true);
+            }
+            // We're still holding the inner lock, so it cannot be aborted while we insert it into
+            // the delivered list.
+            process_inner.death_delivered(self.clone());
+            BR_DEAD_BINDER
+        };
+
+        writer.write(&cmd)?;
+        writer.write(&cookie)?;
+        // Mimic the original code: we stop processing work items when we get to a death
+        // notification.
+        Ok(cmd != BR_DEAD_BINDER)
+    }
+
+    fn should_sync_wakeup(&self) -> bool {
+        false
+    }
+}
diff --git a/drivers/android/process.rs b/drivers/android/process.rs
index d4e50c7f9a88..0b79fa59ffa5 100644
--- a/drivers/android/process.rs
+++ b/drivers/android/process.rs
@@ -20,7 +20,7 @@
     pages::Pages,
     prelude::*,
     rbtree::RBTree,
-    sync::{lock::Guard, Arc, ArcBorrow, Mutex, SpinLock},
+    sync::{lock::Guard, Arc, ArcBorrow, Mutex, SpinLock, UniqueArc},
     task::Task,
     types::{ARef, Either},
     user_ptr::{UserSlicePtr, UserSlicePtrReader},
@@ -32,7 +32,7 @@
     context::Context,
     defs::*,
     error::{BinderError, BinderResult},
-    node::{Node, NodeRef},
+    node::{Node, NodeDeath, NodeRef},
     range_alloc::{self, RangeAllocator},
     thread::{PushWorkRes, Thread},
     DArc, DLArc, DTRWrap, DeliverToRead,
@@ -69,6 +69,7 @@ pub(crate) struct ProcessInner {
     nodes: RBTree<usize, DArc<Node>>,
     mapping: Option<Mapping>,
     work: List<DTRWrap<dyn DeliverToRead>>,
+    delivered_deaths: List<DTRWrap<NodeDeath>, 2>,
 
     /// The number of requested threads that haven't registered yet.
     requested_thread_count: u32,
@@ -91,6 +92,7 @@ fn new() -> Self {
             mapping: None,
             nodes: RBTree::new(),
             work: List::new(),
+            delivered_deaths: List::new(),
             requested_thread_count: 0,
             max_threads: 0,
             started_thread_count: 0,
@@ -225,15 +227,40 @@ fn register_thread(&mut self) -> bool {
         self.started_thread_count += 1;
         true
     }
+
+    /// Finds a delivered death notification with the given cookie, removes it from the process's
+    /// delivered list, and returns it.
+    fn pull_delivered_death(&mut self, cookie: usize) -> Option<DArc<NodeDeath>> {
+        let mut cursor_opt = self.delivered_deaths.cursor_front();
+        while let Some(cursor) = cursor_opt {
+            if cursor.current().cookie == cookie {
+                return Some(cursor.remove().into_arc());
+            }
+            cursor_opt = cursor.next();
+        }
+        None
+    }
+
+    pub(crate) fn death_delivered(&mut self, death: DArc<NodeDeath>) {
+        if let Some(death) = ListArc::try_from_arc_or_drop(death) {
+            self.delivered_deaths.push_back(death);
+        } else {
+            pr_warn!("Notification added to `delivered_deaths` twice.");
+        }
+    }
 }
 
 struct NodeRefInfo {
     node_ref: NodeRef,
+    death: Option<DArc<NodeDeath>>,
 }
 
 impl NodeRefInfo {
     fn new(node_ref: NodeRef) -> Self {
-        Self { node_ref }
+        Self {
+            node_ref,
+            death: None,
+        }
     }
 }
 
@@ -385,6 +412,18 @@ fn get_thread(self: ArcBorrow<'_, Self>, id: i32) -> Result<Arc<Thread>> {
         Ok(ta)
     }
 
+    pub(crate) fn push_work(&self, work: DLArc<dyn DeliverToRead>) -> BinderResult {
+        // If push_work fails, drop the work item outside the lock.
+        let res = self.inner.lock().push_work(work);
+        match res {
+            Ok(()) => Ok(()),
+            Err((err, work)) => {
+                drop(work);
+                Err(err)
+            }
+        }
+    }
+
     fn set_as_manager(
         self: ArcBorrow<'_, Self>,
         info: Option<FlatBinderObject>,
@@ -513,6 +552,14 @@ pub(crate) fn get_node_from_handle(&self, handle: u32, strong: bool) -> Result<N
             .clone(strong)
     }
 
+    pub(crate) fn remove_from_delivered_deaths(&self, death: &DArc<NodeDeath>) {
+        let mut inner = self.inner.lock();
+        // SAFETY: By the invariant on the `delivered_links` field, this is the right linked list.
+        let removed = unsafe { inner.delivered_deaths.remove(death) };
+        drop(inner);
+        drop(removed);
+    }
+
     pub(crate) fn update_ref(&self, handle: u32, inc: bool, strong: bool) -> Result {
         if inc && handle == 0 {
             if let Ok(node_ref) = self.ctx.get_manager_node(strong) {
@@ -529,6 +576,12 @@ pub(crate) fn update_ref(&self, handle: u32, inc: bool, strong: bool) -> Result
         let mut refs = self.node_refs.lock();
         if let Some(info) = refs.by_handle.get_mut(&handle) {
             if info.node_ref.update(inc, strong) {
+                // Clean up death if there is one attached to this node reference.
+                if let Some(death) = info.death.take() {
+                    death.set_cleared(true);
+                    self.remove_from_delivered_deaths(&death);
+                }
+
                 // Remove reference from process tables.
                 let id = info.node_ref.node.global_id;
                 refs.by_handle.remove(&handle);
@@ -725,6 +778,87 @@ pub(crate) fn needs_thread(&self) -> bool {
         ret
     }
 
+    pub(crate) fn request_death(
+        self: &Arc<Self>,
+        reader: &mut UserSlicePtrReader,
+        thread: &Thread,
+    ) -> Result {
+        let handle: u32 = reader.read()?;
+        let cookie: usize = reader.read()?;
+
+        // TODO: First two should result in error, but not the others.
+
+        // TODO: Do we care about the context manager dying?
+
+        // Queue BR_ERROR if we can't allocate memory for the death notification.
+        let death = UniqueArc::try_new_uninit().map_err(|err| {
+            thread.push_return_work(BR_ERROR);
+            err
+        })?;
+        let mut refs = self.node_refs.lock();
+        let info = refs.by_handle.get_mut(&handle).ok_or(EINVAL)?;
+
+        // Nothing to do if there is already a death notification request for this handle.
+        if info.death.is_some() {
+            return Ok(());
+        }
+
+        let death = {
+            let death_init = NodeDeath::new(info.node_ref.node.clone(), self.clone(), cookie);
+            match death.pin_init_with(death_init) {
+                Ok(death) => death,
+                // error is infallible
+                Err(err) => match err {},
+            }
+        };
+
+        // Register the death notification.
+        {
+            let mut owner_inner = info.node_ref.node.owner.inner.lock();
+            if owner_inner.is_dead {
+                let death = ListArc::from_pin_unique(death);
+                info.death = Some(death.clone_arc());
+                drop(owner_inner);
+                let _ = self.push_work(death);
+            } else {
+                let death = ListArc::from_pin_unique(death);
+                info.death = Some(death.clone_arc());
+                info.node_ref.node.add_death(death, &mut owner_inner);
+            }
+        }
+        Ok(())
+    }
+
+    pub(crate) fn clear_death(&self, reader: &mut UserSlicePtrReader, thread: &Thread) -> Result {
+        let handle: u32 = reader.read()?;
+        let cookie: usize = reader.read()?;
+
+        let mut refs = self.node_refs.lock();
+        let info = refs.by_handle.get_mut(&handle).ok_or(EINVAL)?;
+
+        let death = info.death.take().ok_or(EINVAL)?;
+        if death.cookie != cookie {
+            info.death = Some(death);
+            return Err(EINVAL);
+        }
+
+        // Update state and determine if we need to queue a work item. We only need to do it when
+        // the node is not dead or if the user already completed the death notification.
+        if death.set_cleared(false) {
+            if let Some(death) = ListArc::try_from_arc_or_drop(death) {
+                let _ = thread.push_work_if_looper(death);
+            }
+        }
+
+        Ok(())
+    }
+
+    pub(crate) fn dead_binder_done(&self, cookie: usize, thread: &Thread) {
+        if let Some(death) = self.inner.lock().pull_delivered_death(cookie) {
+            death.set_notification_done(thread);
+        }
+    }
+
     fn deferred_flush(&self) {
         let inner = self.inner.lock();
         for thread in inner.threads.values() {
@@ -760,17 +894,6 @@ fn deferred_release(self: Arc<Self>) {
             work.into_arc().cancel();
         }
 
-        // Move the threads out of `inner` so that we can iterate over them without holding the
-        // lock.
-        let mut inner = self.inner.lock();
-        let threads = take(&mut inner.threads);
-        drop(inner);
-
-        // Release all threads.
-        for thread in threads.values() {
-            thread.release();
-        }
-
         // Free any resources kept alive by allocated buffers.
         let omapping = self.inner.lock().mapping.take();
         if let Some(mut mapping) = omapping {
@@ -785,6 +908,48 @@ fn deferred_release(self: Arc<Self>) {
                 drop(alloc)
             });
         }
+
+        // Drop all references. We do this dance with `take` to avoid destroying the references
+        // while holding the lock.
+        let mut refs = self.node_refs.lock();
+        let mut node_refs = take(&mut refs.by_handle);
+        drop(refs);
+
+        // Remove all death notifications from the nodes (that belong to a different process).
+        for info in node_refs.values_mut() {
+            let death = if let Some(existing) = info.death.take() {
+                existing
+            } else {
+                continue;
+            };
+            death.set_cleared(false);
+        }
+
+        // Do similar dance for the state lock.
+        let mut inner = self.inner.lock();
+        let threads = take(&mut inner.threads);
+        let nodes = take(&mut inner.nodes);
+        drop(inner);
+
+        // Release all threads.
+        for thread in threads.values() {
+            thread.release();
+        }
+
+        // Deliver death notifications.
+        for node in nodes.values() {
+            loop {
+                let death = {
+                    let mut inner = self.inner.lock();
+                    if let Some(death) = node.next_death(&mut inner) {
+                        death
+                    } else {
+                        break;
+                    }
+                };
+                death.set_dead();
+            }
+        }
     }
 
     pub(crate) fn flush(this: ArcBorrow<'_, Process>) -> Result {
diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs
index 218c2001e8cb..04477ff7e5a0 100644
--- a/drivers/android/rust_binder.rs
+++ b/drivers/android/rust_binder.rs
@@ -100,6 +100,13 @@ impl<T: ?Sized> core::ops::Receiver for DTRWrap<T> {}
 type DLArc<T> = kernel::list::ListArc<DTRWrap<T>>;
 
 impl<T: ListArcSafe> DTRWrap<T> {
+    fn new(val: impl PinInit<T>) -> impl PinInit<Self> {
+        pin_init!(Self {
+            links <- ListLinksSelfPtr::new(),
+            wrapped <- val,
+        })
+    }
+
     #[allow(dead_code)]
     fn arc_try_new(val: T) -> Result<DLArc<T>, alloc::alloc::AllocError> {
         ListArc::pin_init(pin_init!(Self {
diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs
index b583297cea91..b70a5e3c064b 100644
--- a/drivers/android/thread.rs
+++ b/drivers/android/thread.rs
@@ -391,10 +391,27 @@ pub(crate) fn push_work(&self, work: DLArc<dyn DeliverToRead>) -> PushWorkRes {
         res
     }
 
+    /// Attempts to push the given work item to the thread if it's a looper thread (i.e., if it's
+    /// part of a thread pool) and is alive. Otherwise, push the work item to the process instead.
+    pub(crate) fn push_work_if_looper(&self, work: DLArc<dyn DeliverToRead>) -> BinderResult {
+        let mut inner = self.inner.lock();
+        if inner.is_looper() && !inner.is_dead {
+            inner.push_work(work);
+            Ok(())
+        } else {
+            drop(inner);
+            self.process.push_work(work)
+        }
+    }
+
     pub(crate) fn push_work_deferred(&self, work: DLArc<dyn DeliverToRead>) {
         self.inner.lock().push_work_deferred(work);
     }
 
+    pub(crate) fn push_return_work(&self, reply: u32) {
+        self.inner.lock().push_return_work(reply);
+    }
+
     pub(crate) fn copy_transaction_data(
         &self,
         to_process: Arc<Process>,
@@ -556,7 +573,7 @@ fn transaction<T>(self: &Arc<Self>, tr: &BinderTransactionDataSg, inner: T)
                 );
             }
 
-            self.inner.lock().push_return_work(err.reply);
+            self.push_return_work(err.reply);
         }
     }
 
@@ -684,6 +701,9 @@ fn write(self: &Arc<Self>, req: &mut BinderWriteRead) -> Result {
                 BC_DECREFS => self.process.update_ref(reader.read()?, false, false)?,
                 BC_INCREFS_DONE => self.process.inc_ref_done(&mut reader, false)?,
                 BC_ACQUIRE_DONE => self.process.inc_ref_done(&mut reader, true)?,
+                BC_REQUEST_DEATH_NOTIFICATION => self.process.request_death(&mut reader, self)?,
+                BC_CLEAR_DEATH_NOTIFICATION => self.process.clear_death(&mut reader, self)?,
+                BC_DEAD_BINDER_DONE => self.process.dead_binder_done(reader.read()?, self),
                 BC_REGISTER_LOOPER => {
                     let valid = self.process.register_thread();
                     self.inner.lock().looper_register(valid);

-- 
2.42.0.820.g83a721a137-goog


* [PATCH RFC 11/20] rust_binder: send nodes in transactions
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (9 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 10/20] rust_binder: add death notifications Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 12/20] rust_binder: add BINDER_TYPE_PTR support Alice Ryhl
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

To send a transaction to any process other than the context manager,
someone must first send you the binder node. Usually, you get it from
the context manager.

The transaction allocation now contains a list of offsets of objects in
the transaction that must be translated before they are passed to the
target process. In this patch, we only support translation of binder
nodes, but future patches will extend this to other object types.
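
As a rough illustration of the resulting layout (not the driver code
itself), the allocation can be thought of as the raw data, followed by
the pointer-aligned offset array, followed by the security context. The
sketch below is a user-space approximation: the `Layout` struct, the
`layout` function and this `Option`-returning `ptr_align` are
hypothetical names made up for the example, and the alignment rule
(round up to `size_of::<usize>()`) is an assumption.

```rust
use core::mem::size_of;
use core::ops::Range;

/// Rounds `value` up to the next multiple of the pointer size.
/// Hypothetical stand-in for the driver's `ptr_align` helper; returns `None` on overflow.
fn ptr_align(value: usize) -> Option<usize> {
    let mask = size_of::<usize>() - 1;
    Some(value.checked_add(mask)? & !mask)
}

/// Where each part of a transaction lives inside the target allocation (illustrative only).
#[derive(Debug, PartialEq)]
struct Layout {
    /// Raw transaction data, starting at offset zero.
    data: Range<usize>,
    /// The offset array, placed right after the aligned data.
    offsets: Range<usize>,
    /// Start of the security context, placed after the aligned offset array.
    secctx_off: usize,
    /// Total number of bytes to allocate.
    len: usize,
}

fn layout(data_size: usize, offsets_size: usize, secctx_size: usize) -> Option<Layout> {
    let adata = ptr_align(data_size)?;
    let aoffsets = ptr_align(offsets_size)?;
    let asecctx = ptr_align(secctx_size)?;
    let secctx_off = adata.checked_add(aoffsets)?;
    // Always allocate at least one pointer-sized word.
    let len = secctx_off.checked_add(asecctx)?.max(size_of::<usize>());
    Some(Layout {
        data: 0..data_size,
        // Cannot overflow: `offsets_size <= aoffsets` and `adata + aoffsets` was checked.
        offsets: adata..adata + offsets_size,
        secctx_off,
        len,
    })
}
```

Each entry in the `offsets` range is then read as a `usize` that names
the position of one object inside the `data` range.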

Co-developed-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/allocation.rs  | 266 ++++++++++++++++++++++++++++++++++++++++-
 drivers/android/defs.rs        |  44 +++++--
 drivers/android/process.rs     |   8 ++
 drivers/android/thread.rs      | 118 +++++++++++++++++-
 drivers/android/transaction.rs |   5 +-
 rust/helpers.c                 |   7 ++
 rust/kernel/security.rs        |   7 ++
 7 files changed, 436 insertions(+), 19 deletions(-)

diff --git a/drivers/android/allocation.rs b/drivers/android/allocation.rs
index 0fdef5425918..32bc268956f2 100644
--- a/drivers/android/allocation.rs
+++ b/drivers/android/allocation.rs
@@ -1,9 +1,18 @@
 // SPDX-License-Identifier: GPL-2.0
-use core::mem::size_of_val;
+use core::mem::{size_of, size_of_val, MaybeUninit};
+use core::ops::Range;
 
-use kernel::{bindings, pages::Pages, prelude::*, sync::Arc, user_ptr::UserSlicePtrReader};
+use kernel::{
+    bindings,
+    io_buffer::{IoBufferReader, ReadableFromBytes, WritableToBytes},
+    pages::Pages,
+    prelude::*,
+    sync::Arc,
+    user_ptr::UserSlicePtrReader,
+};
 
 use crate::{
+    defs::*,
     node::{Node, NodeRef},
     process::Process,
     DArc,
@@ -11,6 +20,8 @@
 
 #[derive(Default)]
 pub(crate) struct AllocationInfo {
+    /// Range within the allocation where we can find the offsets to the object descriptors.
+    pub(crate) offsets: Option<Range<usize>>,
     /// The target node of the transaction this allocation is associated to.
     /// Not set for replies.
     pub(crate) target_node: Option<NodeRef>,
@@ -87,6 +98,21 @@ pub(crate) fn copy_into(
         })
     }
 
+    pub(crate) fn read<T: ReadableFromBytes>(&self, offset: usize) -> Result<T> {
+        let mut out = MaybeUninit::<T>::uninit();
+        let mut out_offset = 0;
+        self.iterate(offset, size_of::<T>(), |page, offset, to_copy| {
+            // SAFETY: The sum of `out_offset` and `to_copy` is bounded by the size of `T`.
+            let obj_ptr = unsafe { (out.as_mut_ptr() as *mut u8).add(out_offset) };
+            // SAFETY: The pointer is in-bounds of the `out` variable, so it is valid.
+            unsafe { page.read(obj_ptr, offset, to_copy) }?;
+            out_offset += to_copy;
+            Ok(())
+        })?;
+        // SAFETY: We just initialised the data.
+        Ok(unsafe { out.assume_init() })
+    }
+
     pub(crate) fn write<T: ?Sized>(&self, offset: usize, obj: &T) -> Result {
         let mut obj_offset = 0;
         self.iterate(offset, size_of_val(obj), |page, offset, to_copy| {
@@ -119,6 +145,10 @@ pub(crate) fn get_or_init_info(&mut self) -> &mut AllocationInfo {
         self.allocation_info.get_or_insert_with(Default::default)
     }
 
+    pub(crate) fn set_info_offsets(&mut self, offsets: Range<usize>) {
+        self.get_or_init_info().offsets = Some(offsets);
+    }
+
     pub(crate) fn set_info_oneway_node(&mut self, oneway_node: DArc<Node>) {
         self.get_or_init_info().oneway_node = Some(oneway_node);
     }
@@ -145,6 +175,15 @@ fn drop(&mut self) {
 
             info.target_node = None;
 
+            if let Some(offsets) = info.offsets.clone() {
+                let view = AllocationView::new(self, offsets.start);
+                for i in offsets.step_by(size_of::<usize>()) {
+                    if view.cleanup_object(i).is_err() {
+                        pr_warn!("Error cleaning up object at offset {}\n", i)
+                    }
+                }
+            }
+
             if info.clear_on_free {
                 if let Err(e) = self.fill_zero() {
                     pr_warn!("Failed to clear data on free: {:?}", e);
@@ -155,3 +194,226 @@ fn drop(&mut self) {
         self.process.buffer_raw_free(self.ptr);
     }
 }
+
+/// A view into the beginning of an allocation.
+///
+/// All attempts to read or write outside of the view will fail. To intentionally access outside of
+/// this view, use the `alloc` field of this struct directly.
+pub(crate) struct AllocationView<'a> {
+    pub(crate) alloc: &'a mut Allocation,
+    limit: usize,
+}
+
+impl<'a> AllocationView<'a> {
+    pub(crate) fn new(alloc: &'a mut Allocation, limit: usize) -> Self {
+        AllocationView { alloc, limit }
+    }
+
+    pub(crate) fn read<T: ReadableFromBytes>(&self, offset: usize) -> Result<T> {
+        if offset.checked_add(size_of::<T>()).ok_or(EINVAL)? > self.limit {
+            return Err(EINVAL);
+        }
+        self.alloc.read(offset)
+    }
+
+    pub(crate) fn write<T: WritableToBytes>(&self, offset: usize, obj: &T) -> Result {
+        if offset.checked_add(size_of::<T>()).ok_or(EINVAL)? > self.limit {
+            return Err(EINVAL);
+        }
+        self.alloc.write(offset, obj)
+    }
+
+    pub(crate) fn transfer_binder_object(
+        &self,
+        offset: usize,
+        obj: &bindings::flat_binder_object,
+        strong: bool,
+        node_ref: NodeRef,
+    ) -> Result {
+        if Arc::ptr_eq(&node_ref.node.owner, &self.alloc.process) {
+            // The receiving process is the owner of the node, so send it a binder object (instead
+            // of a handle).
+            let (ptr, cookie) = node_ref.node.get_id();
+            let mut newobj = FlatBinderObject::default();
+            newobj.hdr.type_ = if strong {
+                BINDER_TYPE_BINDER
+            } else {
+                BINDER_TYPE_WEAK_BINDER
+            };
+            newobj.flags = obj.flags;
+            newobj.__bindgen_anon_1.binder = ptr as _;
+            newobj.cookie = cookie as _;
+            self.write(offset, &newobj)?;
+            // Increment the user ref count on the node. It will be decremented as part of the
+            // destruction of the buffer, when we see a binder or weak-binder object.
+            node_ref.node.update_refcount(true, 1, strong);
+        } else {
+            // The receiving process is different from the owner, so we need to insert a handle to
+            // the binder object.
+            let handle = self
+                .alloc
+                .process
+                .insert_or_update_handle(node_ref, false)?;
+            let mut newobj = FlatBinderObject::default();
+            newobj.hdr.type_ = if strong {
+                BINDER_TYPE_HANDLE
+            } else {
+                BINDER_TYPE_WEAK_HANDLE
+            };
+            newobj.flags = obj.flags;
+            newobj.__bindgen_anon_1.handle = handle;
+            if self.write(offset, &newobj).is_err() {
+                // Decrement ref count on the handle we just created.
+                let _ = self.alloc.process.update_ref(handle, false, strong);
+                return Err(EINVAL);
+            }
+        }
+        Ok(())
+    }
+
+    fn cleanup_object(&self, index_offset: usize) -> Result {
+        let offset = self.alloc.read(index_offset)?;
+        let header = self.read::<BinderObjectHeader>(offset)?;
+        match header.type_ {
+            BINDER_TYPE_WEAK_BINDER | BINDER_TYPE_BINDER => {
+                let obj = self.read::<FlatBinderObject>(offset)?;
+                let strong = header.type_ == BINDER_TYPE_BINDER;
+                // SAFETY: The type is `BINDER_TYPE_{WEAK_}BINDER`, so the `binder` field is
+                // populated.
+                let ptr = unsafe { obj.__bindgen_anon_1.binder } as usize;
+                let cookie = obj.cookie as usize;
+                self.alloc.process.update_node(ptr, cookie, strong);
+                Ok(())
+            }
+            BINDER_TYPE_WEAK_HANDLE | BINDER_TYPE_HANDLE => {
+                let obj = self.read::<FlatBinderObject>(offset)?;
+                let strong = header.type_ == BINDER_TYPE_HANDLE;
+                // SAFETY: The type is `BINDER_TYPE_{WEAK_}HANDLE`, so the `handle` field is
+                // populated.
+                let handle = unsafe { obj.__bindgen_anon_1.handle } as _;
+                self.alloc.process.update_ref(handle, false, strong)
+            }
+            _ => Ok(()),
+        }
+    }
+}
+
+/// A binder object as it is serialized.
+///
+/// # Invariants
+///
+/// All bytes must be initialized, and the value of `self.hdr.type_` must be one of the allowed
+/// types.
+#[repr(C)]
+pub(crate) union BinderObject {
+    hdr: bindings::binder_object_header,
+    fbo: bindings::flat_binder_object,
+    fdo: bindings::binder_fd_object,
+    bbo: bindings::binder_buffer_object,
+    fdao: bindings::binder_fd_array_object,
+}
+
+/// A view into a `BinderObject` that can be used in a match statement.
+pub(crate) enum BinderObjectRef<'a> {
+    Binder(&'a mut bindings::flat_binder_object),
+    Handle(&'a mut bindings::flat_binder_object),
+    Fd(&'a mut bindings::binder_fd_object),
+    Ptr(&'a mut bindings::binder_buffer_object),
+    Fda(&'a mut bindings::binder_fd_array_object),
+}
+
+impl BinderObject {
+    pub(crate) fn read_from(reader: &mut UserSlicePtrReader) -> Result<BinderObject> {
+        let object = Self::read_from_inner(|slice| {
+            let read_len = usize::min(slice.len(), reader.len());
+            // SAFETY: The length we pass to `read_raw` is at most the length of the slice.
+            unsafe {
+                reader
+                    .clone_reader()
+                    .read_raw(slice.as_mut_ptr(), read_len)?;
+            }
+            Ok(())
+        })?;
+
+        // If we used an object type smaller than the largest object size, then we've read more
+        // bytes than we needed to. However, we used `.clone_reader()` to avoid advancing the
+        // original reader. Now, we call `skip` so that the caller's reader is advanced by the
+        // right amount.
+        //
+        // The `skip` call fails if the reader doesn't have `size` bytes available. This could
+        // happen if the type header corresponds to an object type that is larger than the rest of
+        // the reader.
+        //
+        // Any extra bytes beyond the size of the object are inaccessible after this call, so
+        // reading them again from the `reader` later does not result in TOCTOU bugs.
+        reader.skip(object.size())?;
+
+        Ok(object)
+    }
+
+    /// Use the provided reader closure to construct a `BinderObject`.
+    ///
+    /// The closure should write the bytes for the object into the provided slice.
+    pub(crate) fn read_from_inner<R>(reader: R) -> Result<BinderObject>
+    where
+        R: FnOnce(&mut [u8; size_of::<BinderObject>()]) -> Result<()>,
+    {
+        let mut obj = MaybeUninit::<BinderObject>::zeroed();
+
+        // SAFETY: The lengths of `BinderObject` and `[u8; size_of::<BinderObject>()]` are equal,
+        // and the byte array has an alignment requirement of one, so the pointer cast is okay.
+        // Additionally, `obj` was initialized to zeros, so the byte array will not be
+        // uninitialized.
+        (reader)(unsafe { &mut *obj.as_mut_ptr().cast() })?;
+
+        // SAFETY: The entire object is initialized, so accessing this field is safe.
+        let type_ = unsafe { obj.assume_init_ref().hdr.type_ };
+        if Self::type_to_size(type_).is_none() {
+            // The value of `obj.hdr.type_` was invalid.
+            return Err(EINVAL);
+        }
+
+        // SAFETY: All bytes are initialized (since we zeroed them at the start) and we checked
+        // that `self.hdr.type_` is one of the allowed types, so the type invariants are satisfied.
+        unsafe { Ok(obj.assume_init()) }
+    }
+
+    pub(crate) fn as_ref(&mut self) -> BinderObjectRef<'_> {
+        use BinderObjectRef::*;
+        // SAFETY: The constructor ensures that all bytes of `self` are initialized, and all
+        // variants of this union accept all initialized bit patterns.
+        unsafe {
+            match self.hdr.type_ {
+                BINDER_TYPE_WEAK_BINDER | BINDER_TYPE_BINDER => Binder(&mut self.fbo),
+                BINDER_TYPE_WEAK_HANDLE | BINDER_TYPE_HANDLE => Handle(&mut self.fbo),
+                BINDER_TYPE_FD => Fd(&mut self.fdo),
+                BINDER_TYPE_PTR => Ptr(&mut self.bbo),
+                BINDER_TYPE_FDA => Fda(&mut self.fdao),
+                // SAFETY: By the type invariant, the value of `self.hdr.type_` cannot have any
+                // other value than the ones checked above.
+                _ => core::hint::unreachable_unchecked(),
+            }
+        }
+    }
+
+    pub(crate) fn size(&self) -> usize {
+        // SAFETY: The entire object is initialized, so accessing this field is safe.
+        let type_ = unsafe { self.hdr.type_ };
+
+        // SAFETY: The type invariants guarantee that the type field is correct.
+        unsafe { Self::type_to_size(type_).unwrap_unchecked() }
+    }
+
+    fn type_to_size(type_: u32) -> Option<usize> {
+        match type_ {
+            BINDER_TYPE_WEAK_BINDER => Some(size_of::<bindings::flat_binder_object>()),
+            BINDER_TYPE_BINDER => Some(size_of::<bindings::flat_binder_object>()),
+            BINDER_TYPE_WEAK_HANDLE => Some(size_of::<bindings::flat_binder_object>()),
+            BINDER_TYPE_HANDLE => Some(size_of::<bindings::flat_binder_object>()),
+            BINDER_TYPE_FD => Some(size_of::<bindings::binder_fd_object>()),
+            BINDER_TYPE_PTR => Some(size_of::<bindings::binder_buffer_object>()),
+            BINDER_TYPE_FDA => Some(size_of::<bindings::binder_fd_array_object>()),
+            _ => None,
+        }
+    }
+}
diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs
index 753f7e86c92d..68f32a779a3c 100644
--- a/drivers/android/defs.rs
+++ b/drivers/android/defs.rs
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 
+use core::mem::MaybeUninit;
 use core::ops::{Deref, DerefMut};
 use kernel::{
     bindings::{self, *},
@@ -57,11 +58,18 @@ macro_rules! pub_no_prefix {
     kernel::bindings::FLAT_BINDER_FLAG_TXN_SECURITY_CTX;
 pub_no_prefix!(transaction_flags_, TF_ONE_WAY, TF_CLEAR_BUF);
 
+pub(crate) use bindings::{
+    BINDER_TYPE_BINDER, BINDER_TYPE_FD, BINDER_TYPE_FDA, BINDER_TYPE_HANDLE, BINDER_TYPE_PTR,
+    BINDER_TYPE_WEAK_BINDER, BINDER_TYPE_WEAK_HANDLE,
+};
+
 macro_rules! decl_wrapper {
     ($newname:ident, $wrapped:ty) => {
-        #[derive(Copy, Clone, Default)]
+        // Define a wrapper around the C type. Use `MaybeUninit` to enforce that the value of
+        // padding bytes must be preserved.
+        #[derive(Copy, Clone)]
         #[repr(transparent)]
-        pub(crate) struct $newname($wrapped);
+        pub(crate) struct $newname(MaybeUninit<$wrapped>);
 
         // SAFETY: This macro is only used with types where this is ok.
         unsafe impl ReadableFromBytes for $newname {}
@@ -70,13 +78,24 @@ unsafe impl WritableToBytes for $newname {}
         impl Deref for $newname {
             type Target = $wrapped;
             fn deref(&self) -> &Self::Target {
-                &self.0
+                // SAFETY: We use `MaybeUninit` only to preserve padding. The value must still
+                // always be valid.
+                unsafe { self.0.assume_init_ref() }
             }
         }
 
         impl DerefMut for $newname {
             fn deref_mut(&mut self) -> &mut Self::Target {
-                &mut self.0
+                // SAFETY: We use `MaybeUninit` only to preserve padding. The value must still
+                // always be valid.
+                unsafe { self.0.assume_init_mut() }
+            }
+        }
+
+        impl Default for $newname {
+            fn default() -> Self {
+                // Create a new value of this type where all bytes (including padding) are zeroed.
+                Self(MaybeUninit::zeroed())
             }
         }
     };
@@ -85,6 +104,7 @@ fn deref_mut(&mut self) -> &mut Self::Target {
 decl_wrapper!(BinderNodeDebugInfo, bindings::binder_node_debug_info);
 decl_wrapper!(BinderNodeInfoForRef, bindings::binder_node_info_for_ref);
 decl_wrapper!(FlatBinderObject, bindings::flat_binder_object);
+decl_wrapper!(BinderObjectHeader, bindings::binder_object_header);
 decl_wrapper!(BinderTransactionData, bindings::binder_transaction_data);
 decl_wrapper!(
     BinderTransactionDataSecctx,
@@ -100,18 +120,18 @@ fn deref_mut(&mut self) -> &mut Self::Target {
 
 impl BinderVersion {
     pub(crate) fn current() -> Self {
-        Self(bindings::binder_version {
+        Self(MaybeUninit::new(bindings::binder_version {
             protocol_version: bindings::BINDER_CURRENT_PROTOCOL_VERSION as _,
-        })
+        }))
     }
 }
 
 impl BinderTransactionData {
     pub(crate) fn with_buffers_size(self, buffers_size: u64) -> BinderTransactionDataSg {
-        BinderTransactionDataSg(bindings::binder_transaction_data_sg {
-            transaction_data: self.0,
+        BinderTransactionDataSg(MaybeUninit::new(bindings::binder_transaction_data_sg {
+            transaction_data: *self,
             buffers_size,
-        })
+        }))
     }
 }
 
@@ -128,6 +148,10 @@ pub(crate) fn tr_data(&mut self) -> &mut BinderTransactionData {
 
 impl ExtendedError {
     pub(crate) fn new(id: u32, command: u32, param: i32) -> Self {
-        Self(bindings::binder_extended_error { id, command, param })
+        Self(MaybeUninit::new(bindings::binder_extended_error {
+            id,
+            command,
+            param,
+        }))
     }
 }
diff --git a/drivers/android/process.rs b/drivers/android/process.rs
index 0b79fa59ffa5..944297b7403c 100644
--- a/drivers/android/process.rs
+++ b/drivers/android/process.rs
@@ -591,6 +591,14 @@ pub(crate) fn update_ref(&self, handle: u32, inc: bool, strong: bool) -> Result
         Ok(())
     }
 
+    /// Decrements the refcount of the given node, if one exists.
+    pub(crate) fn update_node(&self, ptr: usize, cookie: usize, strong: bool) {
+        let mut inner = self.inner.lock();
+        if let Ok(Some(node)) = inner.get_existing_node(ptr, cookie) {
+            inner.update_node_refcount(&node, false, strong, 1, None);
+        }
+    }
+
     pub(crate) fn inc_ref_done(&self, reader: &mut UserSlicePtrReader, strong: bool) -> Result {
         let ptr = reader.read::<usize>()?;
         let cookie = reader.read::<usize>()?;
diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs
index b70a5e3c064b..a9afc7b706c6 100644
--- a/drivers/android/thread.rs
+++ b/drivers/android/thread.rs
@@ -21,8 +21,13 @@
 };
 
 use crate::{
-    allocation::Allocation, defs::*, error::BinderResult, process::Process, ptr_align,
-    transaction::Transaction, DArc, DLArc, DTRWrap, DeliverCode, DeliverToRead,
+    allocation::{Allocation, AllocationView, BinderObject, BinderObjectRef},
+    defs::*,
+    error::BinderResult,
+    process::Process,
+    ptr_align,
+    transaction::Transaction,
+    DArc, DLArc, DTRWrap, DeliverCode, DeliverToRead,
 };
 
 use core::{
@@ -412,6 +417,54 @@ pub(crate) fn push_return_work(&self, reply: u32) {
         self.inner.lock().push_return_work(reply);
     }
 
+    fn translate_object(
+        &self,
+        offset: usize,
+        object: BinderObjectRef<'_>,
+        view: &mut AllocationView<'_>,
+    ) -> BinderResult {
+        match object {
+            BinderObjectRef::Binder(obj) => {
+                let strong = obj.hdr.type_ == BINDER_TYPE_BINDER;
+                // SAFETY: `binder` is a `binder_uintptr_t`; any bit pattern is a valid
+                // representation.
+                let ptr = unsafe { obj.__bindgen_anon_1.binder } as _;
+                let cookie = obj.cookie as _;
+                let flags = obj.flags as _;
+                let node = self.process.as_arc_borrow().get_node(
+                    ptr,
+                    cookie,
+                    flags,
+                    strong,
+                    Some(self),
+                )?;
+                security::binder_transfer_binder(&self.process.cred, &view.alloc.process.cred)?;
+                view.transfer_binder_object(offset, obj, strong, node)?;
+            }
+            BinderObjectRef::Handle(obj) => {
+                let strong = obj.hdr.type_ == BINDER_TYPE_HANDLE;
+                // SAFETY: `handle` is a `u32`; any bit pattern is a valid representation.
+                let handle = unsafe { obj.__bindgen_anon_1.handle } as _;
+                let node = self.process.get_node_from_handle(handle, strong)?;
+                security::binder_transfer_binder(&self.process.cred, &view.alloc.process.cred)?;
+                view.transfer_binder_object(offset, obj, strong, node)?;
+            }
+            BinderObjectRef::Fd(_obj) => {
+                pr_warn!("Using unsupported binder object type fd.\n");
+                return Err(EINVAL.into());
+            }
+            BinderObjectRef::Ptr(_obj) => {
+                pr_warn!("Using unsupported binder object type ptr.\n");
+                return Err(EINVAL.into());
+            }
+            BinderObjectRef::Fda(_obj) => {
+                pr_warn!("Using unsupported binder object type fda.\n");
+                return Err(EINVAL.into());
+            }
+        }
+        Ok(())
+    }
+
     pub(crate) fn copy_transaction_data(
         &self,
         to_process: Arc<Process>,
@@ -436,6 +489,8 @@ pub(crate) fn copy_transaction_data(
 
         let data_size = trd.data_size.try_into().map_err(|_| EINVAL)?;
         let adata_size = ptr_align(data_size);
+        let offsets_size = trd.offsets_size.try_into().map_err(|_| EINVAL)?;
+        let aoffsets_size = ptr_align(offsets_size);
         let asecctx_size = secctx
             .as_ref()
             .map(|(_, ctx)| ptr_align(ctx.len()))
@@ -443,11 +498,14 @@ pub(crate) fn copy_transaction_data(
 
         // This guarantees that at least `sizeof(usize)` bytes will be allocated.
         let len = usize::max(
-            adata_size.checked_add(asecctx_size).ok_or(ENOMEM)?,
+            adata_size
+                .checked_add(aoffsets_size)
+                .and_then(|sum| sum.checked_add(asecctx_size))
+                .ok_or(ENOMEM)?,
             size_of::<usize>(),
         );
-        let secctx_off = adata_size;
-        let alloc = match to_process.buffer_alloc(len, is_oneway) {
+        let secctx_off = adata_size + aoffsets_size;
+        let mut alloc = match to_process.buffer_alloc(len, is_oneway) {
             Ok(alloc) => alloc,
             Err(err) => {
                 pr_warn!(
@@ -461,8 +519,56 @@ pub(crate) fn copy_transaction_data(
 
         let mut buffer_reader =
             unsafe { UserSlicePtr::new(trd.data.ptr.buffer as _, data_size) }.reader();
+        let mut end_of_previous_object = 0;
+
+        // Copy offsets if there are any.
+        if offsets_size > 0 {
+            {
+                let mut reader =
+                    unsafe { UserSlicePtr::new(trd.data.ptr.offsets as _, offsets_size) }.reader();
+                alloc.copy_into(&mut reader, adata_size, offsets_size)?;
+            }
+
+            let offsets_start = adata_size;
+            let offsets_end = adata_size + aoffsets_size;
+
+            // Traverse the objects specified.
+            let mut view = AllocationView::new(&mut alloc, data_size);
+            for index_offset in (offsets_start..offsets_end).step_by(size_of::<usize>()) {
+                let offset = view.alloc.read(index_offset)?;
+
+                // Copy data between two objects.
+                if end_of_previous_object < offset {
+                    view.alloc.copy_into(
+                        &mut buffer_reader,
+                        end_of_previous_object,
+                        offset - end_of_previous_object,
+                    )?;
+                }
+
+                let mut object = BinderObject::read_from(&mut buffer_reader)?;
+
+                match self.translate_object(offset, object.as_ref(), &mut view) {
+                    Ok(()) => end_of_previous_object = offset + object.size(),
+                    Err(err) => {
+                        pr_warn!("Error while translating object.\n");
+                        return Err(err);
+                    }
+                }
+
+                // Update the indexes containing objects to clean up.
+                let offset_after_object = index_offset + size_of::<usize>();
+                view.alloc
+                    .set_info_offsets(offsets_start..offset_after_object);
+            }
+        }
 
-        alloc.copy_into(&mut buffer_reader, 0, data_size)?;
+        // Copy remaining raw data.
+        alloc.copy_into(
+            &mut buffer_reader,
+            end_of_previous_object,
+            data_size - end_of_previous_object,
+        )?;
 
         if let Some((off_out, secctx)) = secctx.as_mut() {
             if let Err(err) = alloc.write(secctx_off, secctx.as_bytes()) {
diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs
index a4ffe0a3878c..2faba6e1f47f 100644
--- a/drivers/android/transaction.rs
+++ b/drivers/android/transaction.rs
@@ -32,6 +32,7 @@ pub(crate) struct Transaction {
     code: u32,
     pub(crate) flags: u32,
     data_size: usize,
+    offsets_size: usize,
     data_address: usize,
     sender_euid: Kuid,
     txn_security_ctx_off: Option<usize>,
@@ -85,6 +86,7 @@ pub(crate) fn new(
             code: trd.code,
             flags: trd.flags,
             data_size: trd.data_size as _,
+            offsets_size: trd.offsets_size as _,
             data_address,
             allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"),
             txn_security_ctx_off,
@@ -116,6 +118,7 @@ pub(crate) fn new_reply(
             code: trd.code,
             flags: trd.flags,
             data_size: trd.data_size as _,
+            offsets_size: trd.offsets_size as _,
             data_address: alloc.ptr,
             allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"),
             txn_security_ctx_off: None,
@@ -229,7 +232,7 @@ fn do_work(self: DArc<Self>, thread: &Thread, writer: &mut UserSlicePtrWriter) -
         tr.flags = self.flags;
         tr.data_size = self.data_size as _;
         tr.data.ptr.buffer = self.data_address as _;
-        tr.offsets_size = 0;
+        tr.offsets_size = self.offsets_size as _;
         if tr.offsets_size > 0 {
             tr.data.ptr.offsets = (self.data_address + ptr_align(self.data_size)) as _;
         }
diff --git a/rust/helpers.c b/rust/helpers.c
index e70255f3774f..924c7a00f433 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -342,6 +342,13 @@ int rust_helper_security_binder_transaction(const struct cred *from,
 	return security_binder_transaction(from, to);
 }
 EXPORT_SYMBOL_GPL(rust_helper_security_binder_transaction);
+
+int rust_helper_security_binder_transfer_binder(const struct cred *from,
+						const struct cred *to)
+{
+	return security_binder_transfer_binder(from, to);
+}
+EXPORT_SYMBOL_GPL(rust_helper_security_binder_transfer_binder);
 #endif
 
 /*
diff --git a/rust/kernel/security.rs b/rust/kernel/security.rs
index 9e3e4cf08ecb..9179fc225406 100644
--- a/rust/kernel/security.rs
+++ b/rust/kernel/security.rs
@@ -24,6 +24,13 @@ pub fn binder_transaction(from: &Credential, to: &Credential) -> Result {
     to_result(unsafe { bindings::security_binder_transaction(from.0.get(), to.0.get()) })
 }
 
+/// Calls the security modules to determine if task `from` is allowed to send binder objects
+/// (owned by itself or other processes) to task `to` through a binder transaction.
+pub fn binder_transfer_binder(from: &Credential, to: &Credential) -> Result {
+    // SAFETY: `from` and `to` are valid because the shared references guarantee nonzero refcounts.
+    to_result(unsafe { bindings::security_binder_transfer_binder(from.0.get(), to.0.get()) })
+}
+
 /// A security context string.
 ///
 /// The struct has the invariant that it always contains a valid security context.

-- 
2.42.0.820.g83a721a137-goog


* [PATCH RFC 12/20] rust_binder: add BINDER_TYPE_PTR support
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (10 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 11/20] rust_binder: send nodes in transactions Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 13/20] rust_binder: add BINDER_TYPE_FD support Alice Ryhl
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

Implement support for the scatter-gather feature of binder, which lets
you embed pointers in binder transactions and have them be translated
so that the recipient gets a pointer that also works for them.

This works by adding a second kind of object to the offset array, namely
the BINDER_TYPE_PTR object. This object has a pointer and length
embedded. The kernel will copy the data behind the pointer, and update
the address of the pointer so that the recipient will be able to follow
the pointer and see the same data.

These objects are supported recursively. Buffers need not be pointed at
from the main transaction buffer: a buffer may instead be pointed at by
a pointer in one of the other buffers. This can be used to build
arbitrary trees of buffers.
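
To make the pointer update concrete, here is a small user-space sketch
of a single fixup. It is not the driver's implementation: the function
name `copy_and_fixup` and the `recipient_base` parameter (the address
at which the target buffer is assumed to be mapped in the recipient)
are hypothetical, and a plain byte slice stands in for the allocation.

```rust
use core::mem::size_of;

/// Copies `child` into `target` at `child_off`, then overwrites the pointer-sized slot at
/// `fixup_off` with the address at which the recipient will see the copied bytes.
///
/// Returns `None` (leaving `target` untouched) if either range is out of bounds.
fn copy_and_fixup(
    target: &mut [u8],
    recipient_base: usize,
    child: &[u8],
    child_off: usize,
    fixup_off: usize,
) -> Option<()> {
    let child_end = child_off.checked_add(child.len())?;
    let fixup_end = fixup_off.checked_add(size_of::<usize>())?;
    let new_ptr = recipient_base.checked_add(child_off)?;
    // Validate both ranges up front so that a failure does not leave a partial copy behind.
    if child_end > target.len() || fixup_end > target.len() {
        return None;
    }
    // Copy the data behind the sender's pointer into the target buffer.
    target[child_off..child_end].copy_from_slice(child);
    // Replace the sender's pointer with one that is valid for the recipient.
    target[fixup_off..fixup_end].copy_from_slice(&new_ptr.to_ne_bytes());
    Some(())
}
```

A tree of buffers would repeat this step once per BINDER_TYPE_PTR
object, with `fixup_off` pointing into whichever buffer holds the
parent pointer.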

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/defs.rs   |   1 +
 drivers/android/error.rs  |   9 ++
 drivers/android/thread.rs | 340 +++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 344 insertions(+), 6 deletions(-)

diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs
index 68f32a779a3c..267266f3ad76 100644
--- a/drivers/android/defs.rs
+++ b/drivers/android/defs.rs
@@ -105,6 +105,7 @@ fn default() -> Self {
 decl_wrapper!(BinderNodeInfoForRef, bindings::binder_node_info_for_ref);
 decl_wrapper!(FlatBinderObject, bindings::flat_binder_object);
 decl_wrapper!(BinderObjectHeader, bindings::binder_object_header);
+decl_wrapper!(BinderBufferObject, bindings::binder_buffer_object);
 decl_wrapper!(BinderTransactionData, bindings::binder_transaction_data);
 decl_wrapper!(
     BinderTransactionDataSecctx,
diff --git a/drivers/android/error.rs b/drivers/android/error.rs
index 430b0994affa..c9b991d133d9 100644
--- a/drivers/android/error.rs
+++ b/drivers/android/error.rs
@@ -50,6 +50,15 @@ fn from(_: core::alloc::AllocError) -> Self {
     }
 }
 
+impl From<alloc::collections::TryReserveError> for BinderError {
+    fn from(_: alloc::collections::TryReserveError) -> Self {
+        Self {
+            reply: BR_FAILED_REPLY,
+            source: Some(ENOMEM),
+        }
+    }
+}
+
 impl core::fmt::Debug for BinderError {
     fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
         match self.reply {
diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs
index a9afc7b706c6..86bb32bbabd9 100644
--- a/drivers/android/thread.rs
+++ b/drivers/android/thread.rs
@@ -35,6 +35,184 @@
     sync::atomic::{AtomicU32, Ordering},
 };
 
+/// Stores the layout of the scatter-gather entries. This is used during the `translate_objects`
+/// call and is discarded when it returns.
+struct ScatterGatherState {
+    /// A struct that tracks the amount of unused buffer space.
+    unused_buffer_space: UnusedBufferSpace,
+    /// Scatter-gather entries to copy.
+    sg_entries: Vec<ScatterGatherEntry>,
+    /// Indexes into `sg_entries` corresponding to the last binder_buffer_object that
+    /// was processed and all of its ancestors. The array is in sorted order.
+    ancestors: Vec<usize>,
+}
+
+/// This entry specifies an additional buffer that should be copied using the scatter-gather
+/// mechanism.
+struct ScatterGatherEntry {
+    /// The index in the offset array of the BINDER_TYPE_PTR that this entry originates from.
+    obj_index: usize,
+    /// Offset in target buffer.
+    offset: usize,
+    /// User address in source buffer.
+    sender_uaddr: usize,
+    /// Number of bytes to copy.
+    length: usize,
+    /// The minimum offset of the next fixup in this buffer.
+    fixup_min_offset: usize,
+    /// The offsets within this buffer that contain pointers which should be translated.
+    pointer_fixups: Vec<PointerFixupEntry>,
+}
+
+/// This entry specifies that a fixup should happen at `target_offset` of the
+/// buffer. If `skip` is nonzero, then the fixup is a `binder_fd_array_object`
+/// and is applied later. Otherwise if `skip` is zero, then the size of the
+/// fixup is `sizeof::<u64>()` and `pointer_value` is written to the buffer.
+struct PointerFixupEntry {
+    /// The number of bytes to skip, or zero for a `binder_buffer_object` fixup.
+    skip: usize,
+    /// The translated pointer to write when `skip` is zero.
+    pointer_value: u64,
+    /// The offset at which the value should be written. The offset is relative
+    /// to the original buffer.
+    target_offset: usize,
+}
+
+/// Return type of `apply_and_validate_fixup_in_parent`.
+struct ParentFixupInfo {
+    /// The index of the parent buffer in `sg_entries`.
+    parent_sg_index: usize,
+    /// The number of ancestors of the buffer.
+    ///
+    /// The buffer is considered an ancestor of itself, so this is always at
+    /// least one.
+    num_ancestors: usize,
+    /// New value of `fixup_min_offset` if this fixup is applied.
+    new_min_offset: usize,
+    /// The offset of the fixup in the target buffer.
+    target_offset: usize,
+}
+
+impl ScatterGatherState {
+    /// Called when a `binder_buffer_object` or `binder_fd_array_object` tries
+    /// to access a region in its parent buffer. These accesses have various
+    /// restrictions, which this method verifies.
+    ///
+    /// The `parent_offset` and `length` arguments describe the offset and
+    /// length of the access in the parent buffer.
+    ///
+    /// # Detailed restrictions
+    ///
+    /// Obviously the fixup must be in-bounds for the parent buffer.
+    ///
+    /// For safety reasons, we only allow fixups inside a buffer to happen
+    /// at increasing offsets; additionally, we only allow fixup on the last
+    /// buffer object that was verified, or one of its parents.
+    ///
+    /// Example of what is allowed:
+    ///
+    /// A
+    ///   B (parent = A, offset = 0)
+    ///   C (parent = A, offset = 16)
+    ///     D (parent = C, offset = 0)
+    ///   E (parent = A, offset = 32) // min_offset is 16 (C.parent_offset)
+    ///
+    /// Examples of what is not allowed:
+    ///
+    /// Decreasing offsets within the same parent:
+    /// A
+    ///   C (parent = A, offset = 16)
+    ///   B (parent = A, offset = 0) // decreasing offset within A
+    ///
+    /// Referring to a parent that wasn't the last object or any of its parents:
+    /// A
+    ///   B (parent = A, offset = 0)
+    ///   C (parent = A, offset = 0)
+    ///   C (parent = A, offset = 16)
+    ///     D (parent = B, offset = 0) // B is not C or any of C's parents
+    fn validate_parent_fixup(
+        &self,
+        parent: usize,
+        parent_offset: usize,
+        length: usize,
+    ) -> Result<ParentFixupInfo> {
+        // Using `position` would also be correct, but `rposition` avoids
+        // quadratic running times.
+        let ancestors_i = self
+            .ancestors
+            .iter()
+            .copied()
+            .rposition(|sg_idx| self.sg_entries[sg_idx].obj_index == parent)
+            .ok_or(EINVAL)?;
+        let sg_idx = self.ancestors[ancestors_i];
+        let sg_entry = match self.sg_entries.get(sg_idx) {
+            Some(sg_entry) => sg_entry,
+            None => {
+                pr_err!(
+                    "self.ancestors[{}] is {}, but self.sg_entries.len() is {}",
+                    ancestors_i,
+                    sg_idx,
+                    self.sg_entries.len()
+                );
+                return Err(EINVAL);
+            }
+        };
+        if sg_entry.fixup_min_offset > parent_offset {
+            pr_warn!(
+                "validate_parent_fixup: fixup_min_offset={}, parent_offset={}",
+                sg_entry.fixup_min_offset,
+                parent_offset
+            );
+            return Err(EINVAL);
+        }
+        let new_min_offset = parent_offset.checked_add(length).ok_or(EINVAL)?;
+        if new_min_offset > sg_entry.length {
+            pr_warn!(
+                "validate_parent_fixup: new_min_offset={}, sg_entry.length={}",
+                new_min_offset,
+                sg_entry.length
+            );
+            return Err(EINVAL);
+        }
+        let target_offset = sg_entry.offset.checked_add(parent_offset).ok_or(EINVAL)?;
+        // The `ancestors_i + 1` operation can't overflow since the output of the addition is at
+        // most `self.ancestors.len()`, which also fits in a usize.
+        Ok(ParentFixupInfo {
+            parent_sg_index: sg_idx,
+            num_ancestors: ancestors_i + 1,
+            new_min_offset,
+            target_offset,
+        })
+    }
+}
+
+/// Keeps track of how much unused buffer space is left. The initial amount is the number of bytes
+/// requested by the user using the `buffers_size` field of `binder_transaction_data_sg`. Each time
+/// we translate an object of type `BINDER_TYPE_PTR`, some of the unused buffer space is consumed.
+struct UnusedBufferSpace {
+    /// The start of the remaining space.
+    offset: usize,
+    /// The end of the remaining space.
+    limit: usize,
+}
+impl UnusedBufferSpace {
+    /// Claim the next `size` bytes from the unused buffer space. The offset for the claimed chunk
+    /// into the buffer is returned.
+    fn claim_next(&mut self, size: usize) -> Result<usize> {
+        // We require every chunk to be aligned.
+        let size = ptr_align(size);
+        let new_offset = self.offset.checked_add(size).ok_or(EINVAL)?;
+
+        if new_offset <= self.limit {
+            let offset = self.offset;
+            self.offset = new_offset;
+            Ok(offset)
+        } else {
+            Err(EINVAL)
+        }
+    }
+}
+
 pub(crate) enum PushWorkRes {
     Ok,
     FailedDead(DLArc<dyn DeliverToRead>),
@@ -419,9 +597,11 @@ pub(crate) fn push_return_work(&self, reply: u32) {
 
     fn translate_object(
         &self,
+        obj_index: usize,
         offset: usize,
         object: BinderObjectRef<'_>,
         view: &mut AllocationView<'_>,
+        sg_state: &mut ScatterGatherState,
     ) -> BinderResult {
         match object {
             BinderObjectRef::Binder(obj) => {
@@ -453,9 +633,78 @@ fn translate_object(
                 pr_warn!("Using unsupported binder object type fd.");
                 return Err(EINVAL.into());
             }
-            BinderObjectRef::Ptr(_obj) => {
-                pr_warn!("Using unsupported binder object type ptr.");
-                return Err(EINVAL.into());
+            BinderObjectRef::Ptr(obj) => {
+                let obj_length = obj.length.try_into().map_err(|_| EINVAL)?;
+                let alloc_offset = match sg_state.unused_buffer_space.claim_next(obj_length) {
+                    Ok(alloc_offset) => alloc_offset,
+                    Err(err) => {
+                        pr_warn!(
+                            "Failed to claim space for a BINDER_TYPE_PTR. (offset: {}, limit: {}, size: {})",
+                            sg_state.unused_buffer_space.offset,
+                            sg_state.unused_buffer_space.limit,
+                            obj_length,
+                        );
+                        return Err(err.into());
+                    }
+                };
+
+                let sg_state_idx = sg_state.sg_entries.len();
+                sg_state.sg_entries.try_push(ScatterGatherEntry {
+                    obj_index,
+                    offset: alloc_offset,
+                    sender_uaddr: obj.buffer as _,
+                    length: obj_length,
+                    pointer_fixups: Vec::new(),
+                    fixup_min_offset: 0,
+                })?;
+
+                let buffer_ptr_in_user_space = (view.alloc.ptr + alloc_offset) as u64;
+
+                if obj.flags & bindings::BINDER_BUFFER_FLAG_HAS_PARENT == 0 {
+                    sg_state.ancestors.clear();
+                    sg_state.ancestors.try_push(sg_state_idx)?;
+                } else {
+                    // Another buffer also has a pointer to this buffer, and we need to fixup that
+                    // pointer too.
+
+                    let parent_index = usize::try_from(obj.parent).map_err(|_| EINVAL)?;
+                    let parent_offset = usize::try_from(obj.parent_offset).map_err(|_| EINVAL)?;
+
+                    let info = sg_state.validate_parent_fixup(
+                        parent_index,
+                        parent_offset,
+                        size_of::<u64>(),
+                    )?;
+
+                    sg_state.ancestors.truncate(info.num_ancestors);
+                    sg_state.ancestors.try_push(sg_state_idx)?;
+
+                    let parent_entry = match sg_state.sg_entries.get_mut(info.parent_sg_index) {
+                        Some(parent_entry) => parent_entry,
+                        None => {
+                            pr_err!(
+                                "validate_parent_fixup returned index out of bounds for sg.entries"
+                            );
+                            return Err(EINVAL.into());
+                        }
+                    };
+
+                    parent_entry.fixup_min_offset = info.new_min_offset;
+                    parent_entry.pointer_fixups.try_push(PointerFixupEntry {
+                        skip: 0,
+                        pointer_value: buffer_ptr_in_user_space,
+                        target_offset: info.target_offset,
+                    })?;
+                }
+
+                let mut obj_write = BinderBufferObject::default();
+                obj_write.hdr.type_ = BINDER_TYPE_PTR;
+                obj_write.flags = obj.flags;
+                obj_write.buffer = buffer_ptr_in_user_space;
+                obj_write.length = obj.length;
+                obj_write.parent = obj.parent;
+                obj_write.parent_offset = obj.parent_offset;
+                view.write::<BinderBufferObject>(offset, &obj_write)?;
             }
             BinderObjectRef::Fda(_obj) => {
                 pr_warn!("Using unsupported binder object type fda.");
@@ -465,6 +714,61 @@ fn translate_object(
         Ok(())
     }
 
+    fn apply_sg(&self, alloc: &mut Allocation, sg_state: &mut ScatterGatherState) -> BinderResult {
+        for sg_entry in &mut sg_state.sg_entries {
+            let mut end_of_previous_fixup = sg_entry.offset;
+            let offset_end = sg_entry.offset.checked_add(sg_entry.length).ok_or(EINVAL)?;
+
+            let mut reader =
+                UserSlicePtr::new(sg_entry.sender_uaddr as _, sg_entry.length).reader();
+            for fixup in &mut sg_entry.pointer_fixups {
+                let fixup_len = if fixup.skip == 0 {
+                    size_of::<u64>()
+                } else {
+                    fixup.skip
+                };
+
+                let target_offset_end = fixup.target_offset.checked_add(fixup_len).ok_or(EINVAL)?;
+                if fixup.target_offset < end_of_previous_fixup || offset_end < target_offset_end {
+                    pr_warn!(
+                        "Fixups oob {} {} {} {}",
+                        fixup.target_offset,
+                        end_of_previous_fixup,
+                        offset_end,
+                        target_offset_end
+                    );
+                    return Err(EINVAL.into());
+                }
+
+                let copy_off = end_of_previous_fixup;
+                let copy_len = fixup.target_offset - end_of_previous_fixup;
+                if let Err(err) = alloc.copy_into(&mut reader, copy_off, copy_len) {
+                    pr_warn!("Failed copying into alloc: {:?}", err);
+                    return Err(err.into());
+                }
+                if fixup.skip == 0 {
+                    let res = alloc.write::<u64>(fixup.target_offset, &fixup.pointer_value);
+                    if let Err(err) = res {
+                        pr_warn!("Failed copying ptr into alloc: {:?}", err);
+                        return Err(err.into());
+                    }
+                }
+                if let Err(err) = reader.skip(fixup_len) {
+                    pr_warn!("Failed skipping {} from reader: {:?}", fixup_len, err);
+                    return Err(err.into());
+                }
+                end_of_previous_fixup = target_offset_end;
+            }
+            let copy_off = end_of_previous_fixup;
+            let copy_len = offset_end - end_of_previous_fixup;
+            if let Err(err) = alloc.copy_into(&mut reader, copy_off, copy_len) {
+                pr_warn!("Failed copying remainder into alloc: {:?}", err);
+                return Err(err.into());
+            }
+        }
+        Ok(())
+    }
+
     pub(crate) fn copy_transaction_data(
         &self,
         to_process: Arc<Process>,
@@ -491,6 +795,8 @@ pub(crate) fn copy_transaction_data(
         let adata_size = ptr_align(data_size);
         let offsets_size = trd.offsets_size.try_into().map_err(|_| EINVAL)?;
         let aoffsets_size = ptr_align(offsets_size);
+        let buffers_size = tr.buffers_size.try_into().map_err(|_| EINVAL)?;
+        let abuffers_size = ptr_align(buffers_size);
         let asecctx_size = secctx
             .as_ref()
             .map(|(_, ctx)| ptr_align(ctx.len()))
@@ -500,11 +806,12 @@ pub(crate) fn copy_transaction_data(
         let len = usize::max(
             adata_size
                 .checked_add(aoffsets_size)
+                .and_then(|sum| sum.checked_add(abuffers_size))
                 .and_then(|sum| sum.checked_add(asecctx_size))
                 .ok_or(ENOMEM)?,
             size_of::<usize>(),
         );
-        let secctx_off = adata_size + aoffsets_size;
+        let secctx_off = adata_size + aoffsets_size + abuffers_size;
         let mut alloc = match to_process.buffer_alloc(len, is_oneway) {
             Ok(alloc) => alloc,
             Err(err) => {
@@ -520,6 +827,7 @@ pub(crate) fn copy_transaction_data(
         let mut buffer_reader =
             unsafe { UserSlicePtr::new(trd.data.ptr.buffer as _, data_size) }.reader();
         let mut end_of_previous_object = 0;
+        let mut sg_state = None;
 
         // Copy offsets if there are any.
         if offsets_size > 0 {
@@ -532,9 +840,22 @@ pub(crate) fn copy_transaction_data(
             let offsets_start = adata_size;
             let offsets_end = adata_size + aoffsets_size;
 
+            // This state is used for BINDER_TYPE_PTR objects.
+            let sg_state = sg_state.insert(ScatterGatherState {
+                unused_buffer_space: UnusedBufferSpace {
+                    offset: offsets_end,
+                    limit: len,
+                },
+                sg_entries: Vec::new(),
+                ancestors: Vec::new(),
+            });
+
             // Traverse the objects specified.
             let mut view = AllocationView::new(&mut alloc, data_size);
-            for index_offset in (offsets_start..offsets_end).step_by(size_of::<usize>()) {
+            for (index, index_offset) in (offsets_start..offsets_end)
+                .step_by(size_of::<usize>())
+                .enumerate()
+            {
                 let offset = view.alloc.read(index_offset)?;
 
                 // Copy data between two objects.
@@ -548,7 +869,7 @@ pub(crate) fn copy_transaction_data(
 
                 let mut object = BinderObject::read_from(&mut buffer_reader)?;
 
-                match self.translate_object(offset, object.as_ref(), &mut view) {
+                match self.translate_object(index, offset, object.as_ref(), &mut view, sg_state) {
                     Ok(()) => end_of_previous_object = offset + object.size(),
                     Err(err) => {
                         pr_warn!("Error while translating object.");
@@ -570,6 +891,13 @@ pub(crate) fn copy_transaction_data(
             data_size - end_of_previous_object,
         )?;
 
+        if let Some(sg_state) = sg_state.as_mut() {
+            if let Err(err) = self.apply_sg(&mut alloc, sg_state) {
+                pr_warn!("Failure in apply_sg: {:?}", err);
+                return Err(err);
+            }
+        }
+
         if let Some((off_out, secctx)) = secctx.as_mut() {
             if let Err(err) = alloc.write(secctx_off, secctx.as_bytes()) {
                 pr_warn!("Failed to write security context: {:?}", err);

-- 
2.42.0.820.g83a721a137-goog



* [PATCH RFC 13/20] rust_binder: add BINDER_TYPE_FD support
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (11 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 12/20] rust_binder: add BINDER_TYPE_PTR support Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 14/20] rust_binder: add BINDER_TYPE_FDA support Alice Ryhl
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

Add support for sending fds over binder.

Unlike the other object types, file descriptors are not translated until
the transaction is actually received by the recipient. Until that
happens, we store `u32::MAX` as the fd.

Translating fds is done in a two-phase process. First, the file
descriptors are allocated and written to the allocation. Then, once we
have allocated all of them, we commit them to the files in question.
Using this strategy, we are able to guarantee that we either send all of
the fds, or none of them.
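
The two-phase idea can be sketched outside the kernel as follows. This is
only a toy model under stated assumptions: `FdTable`, `reserve`, `release`,
`install` and `send_files` are hypothetical names, files are represented by
string labels, and the explicit rollback loop stands in for what dropping
an uncommitted `FileDescriptorReservation` is expected to do in the real
code.

```rust
/// Toy descriptor table with a fixed capacity.
struct FdTable {
    capacity: usize,
    reserved: Vec<u32>,
    installed: Vec<(u32, &'static str)>,
}

impl FdTable {
    fn new(capacity: usize) -> Self {
        FdTable { capacity, reserved: Vec::new(), installed: Vec::new() }
    }

    /// Reserve the lowest unused descriptor number; this step may fail.
    fn reserve(&mut self) -> Option<u32> {
        if self.reserved.len() + self.installed.len() >= self.capacity {
            return None;
        }
        let fd = (0u32..).find(|fd| {
            !self.reserved.contains(fd) && !self.installed.iter().any(|(i, _)| i == fd)
        })?;
        self.reserved.push(fd);
        Some(fd)
    }

    /// Give back a reservation that was never committed.
    fn release(&mut self, fd: u32) {
        self.reserved.retain(|&r| r != fd);
    }

    /// Bind a previously reserved descriptor to a file; this cannot fail.
    fn install(&mut self, fd: u32, file: &'static str) {
        self.release(fd);
        self.installed.push((fd, file));
    }
}

/// Deliver all `files` or none of them.
fn send_files(table: &mut FdTable, files: &[&'static str]) -> Result<Vec<u32>, ()> {
    // Phase 1: reserve a descriptor for every file. Any failure undoes
    // the reservations made so far and installs nothing.
    let mut pending = Vec::with_capacity(files.len());
    for &file in files {
        match table.reserve() {
            Some(fd) => pending.push((fd, file)),
            None => {
                for (fd, _) in pending {
                    table.release(fd);
                }
                return Err(());
            }
        }
    }
    // Phase 2: commit. Nothing here can fail, so the result is all-or-nothing.
    let fds = pending.iter().map(|&(fd, _)| fd).collect();
    for (fd, file) in pending {
        table.install(fd, file);
    }
    Ok(fds)
}
```

Separating the fallible step (reservation) from the infallible one
(commit) is what allows the all-or-nothing guarantee: no file becomes
visible to the recipient until every reservation has succeeded.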

Co-developed-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/allocation.rs  | 71 ++++++++++++++++++++++++++++++++++++++++++
 drivers/android/defs.rs        |  4 ++-
 drivers/android/error.rs       |  6 ++++
 drivers/android/thread.rs      | 42 ++++++++++++++++++++++---
 drivers/android/transaction.rs | 53 ++++++++++++++++++++++++-------
 rust/helpers.c                 |  8 +++++
 rust/kernel/file.rs            |  2 +-
 rust/kernel/security.rs        | 11 +++++++
 8 files changed, 179 insertions(+), 18 deletions(-)

diff --git a/drivers/android/allocation.rs b/drivers/android/allocation.rs
index 32bc268956f2..9d777ffb7176 100644
--- a/drivers/android/allocation.rs
+++ b/drivers/android/allocation.rs
@@ -4,10 +4,12 @@
 
 use kernel::{
     bindings,
+    file::{File, FileDescriptorReservation},
     io_buffer::{IoBufferReader, ReadableFromBytes, WritableToBytes},
     pages::Pages,
     prelude::*,
     sync::Arc,
+    types::ARef,
     user_ptr::UserSlicePtrReader,
 };
 
@@ -32,6 +34,8 @@ pub(crate) struct AllocationInfo {
     pub(crate) oneway_node: Option<DArc<Node>>,
     /// Zero the data in the buffer on free.
     pub(crate) clear_on_free: bool,
+    /// List of files embedded in this transaction.
+    file_list: FileList,
 }
 
 /// Represents an allocation that the kernel is currently using.
@@ -160,6 +164,38 @@ pub(crate) fn set_info_clear_on_drop(&mut self) {
     pub(crate) fn set_info_target_node(&mut self, target_node: NodeRef) {
         self.get_or_init_info().target_node = Some(target_node);
     }
+
+    pub(crate) fn info_add_fd(&mut self, file: ARef<File>, buffer_offset: usize) -> Result {
+        self.get_or_init_info()
+            .file_list
+            .files_to_translate
+            .try_push(FileEntry {
+                file,
+                buffer_offset,
+            })?;
+
+        Ok(())
+    }
+
+    pub(crate) fn translate_fds(&mut self) -> Result<TranslatedFds> {
+        let file_list = match self.allocation_info.as_mut() {
+            Some(info) => &mut info.file_list,
+            None => return Ok(TranslatedFds::new()),
+        };
+
+        let files = core::mem::take(&mut file_list.files_to_translate);
+        let mut reservations = Vec::try_with_capacity(files.len())?;
+        for file_info in files {
+            let res = FileDescriptorReservation::new(bindings::O_CLOEXEC)?;
+            self.write::<u32>(file_info.buffer_offset, &res.reserved_fd())?;
+            reservations.try_push(Reservation {
+                res,
+                file: file_info.file,
+            })?;
+        }
+
+        Ok(TranslatedFds { reservations })
+    }
 }
 
 impl Drop for Allocation {
@@ -417,3 +453,38 @@ fn type_to_size(type_: u32) -> Option<usize> {
         }
     }
 }
+
+#[derive(Default)]
+struct FileList {
+    files_to_translate: Vec<FileEntry>,
+}
+
+struct FileEntry {
+    /// The file for which a descriptor will be created in the recipient process.
+    file: ARef<File>,
+    /// The offset in the buffer where the file descriptor is stored.
+    buffer_offset: usize,
+}
+
+pub(crate) struct TranslatedFds {
+    reservations: Vec<Reservation>,
+}
+
+struct Reservation {
+    res: FileDescriptorReservation,
+    file: ARef<File>,
+}
+
+impl TranslatedFds {
+    pub(crate) fn new() -> Self {
+        Self {
+            reservations: Vec::new(),
+        }
+    }
+
+    pub(crate) fn commit(self) {
+        for entry in self.reservations {
+            entry.res.commit(entry.file);
+        }
+    }
+}
diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs
index 267266f3ad76..fa4ec3eff424 100644
--- a/drivers/android/defs.rs
+++ b/drivers/android/defs.rs
@@ -54,9 +54,10 @@ macro_rules! pub_no_prefix {
     BC_DEAD_BINDER_DONE
 );
 
+pub(crate) const FLAT_BINDER_FLAG_ACCEPTS_FDS: u32 = kernel::bindings::FLAT_BINDER_FLAG_ACCEPTS_FDS;
 pub(crate) const FLAT_BINDER_FLAG_TXN_SECURITY_CTX: u32 =
     kernel::bindings::FLAT_BINDER_FLAG_TXN_SECURITY_CTX;
-pub_no_prefix!(transaction_flags_, TF_ONE_WAY, TF_CLEAR_BUF);
+pub_no_prefix!(transaction_flags_, TF_ONE_WAY, TF_ACCEPT_FDS, TF_CLEAR_BUF);
 
 pub(crate) use bindings::{
     BINDER_TYPE_BINDER, BINDER_TYPE_FD, BINDER_TYPE_FDA, BINDER_TYPE_HANDLE, BINDER_TYPE_PTR,
@@ -104,6 +105,7 @@ fn default() -> Self {
 decl_wrapper!(BinderNodeDebugInfo, bindings::binder_node_debug_info);
 decl_wrapper!(BinderNodeInfoForRef, bindings::binder_node_info_for_ref);
 decl_wrapper!(FlatBinderObject, bindings::flat_binder_object);
+decl_wrapper!(BinderFdObject, bindings::binder_fd_object);
 decl_wrapper!(BinderObjectHeader, bindings::binder_object_header);
 decl_wrapper!(BinderBufferObject, bindings::binder_buffer_object);
 decl_wrapper!(BinderTransactionData, bindings::binder_transaction_data);
diff --git a/drivers/android/error.rs b/drivers/android/error.rs
index c9b991d133d9..6735636d2a1c 100644
--- a/drivers/android/error.rs
+++ b/drivers/android/error.rs
@@ -41,6 +41,12 @@ fn from(source: Error) -> Self {
     }
 }
 
+impl From<kernel::file::BadFdError> for BinderError {
+    fn from(source: kernel::file::BadFdError) -> Self {
+        BinderError::from(Error::from(source))
+    }
+}
+
 impl From<core::alloc::AllocError> for BinderError {
     fn from(_: core::alloc::AllocError) -> Self {
         Self {
diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs
index 86bb32bbabd9..56b36dc43bcc 100644
--- a/drivers/android/thread.rs
+++ b/drivers/android/thread.rs
@@ -601,6 +601,7 @@ fn translate_object(
         offset: usize,
         object: BinderObjectRef<'_>,
         view: &mut AllocationView<'_>,
+        allow_fds: bool,
         sg_state: &mut ScatterGatherState,
     ) -> BinderResult {
         match object {
@@ -629,9 +630,31 @@ fn translate_object(
                 security::binder_transfer_binder(&self.process.cred, &view.alloc.process.cred)?;
                 view.transfer_binder_object(offset, obj, strong, node)?;
             }
-            BinderObjectRef::Fd(_obj) => {
-                pr_warn!("Using unsupported binder object type fd.");
-                return Err(EINVAL.into());
+            BinderObjectRef::Fd(obj) => {
+                if !allow_fds {
+                    return Err(EPERM.into());
+                }
+
+                // SAFETY: `fd` is a `u32`; any bit pattern is a valid representation.
+                let fd = unsafe { obj.__bindgen_anon_1.fd };
+                let file = File::from_fd(fd)?;
+                security::binder_transfer_file(
+                    &self.process.cred,
+                    &view.alloc.process.cred,
+                    &file,
+                )?;
+
+                let mut obj_write = BinderFdObject::default();
+                obj_write.hdr.type_ = BINDER_TYPE_FD;
+                // This will be overwritten with the actual fd when the transaction is received.
+                obj_write.__bindgen_anon_1.fd = u32::MAX;
+                obj_write.cookie = obj.cookie;
+                view.write::<BinderFdObject>(offset, &obj_write)?;
+
+                const FD_FIELD_OFFSET: usize =
+                    ::core::mem::offset_of!(bindings::binder_fd_object, __bindgen_anon_1.fd)
+                        as usize;
+                view.alloc.info_add_fd(file, offset + FD_FIELD_OFFSET)?;
             }
             BinderObjectRef::Ptr(obj) => {
                 let obj_length = obj.length.try_into().map_err(|_| EINVAL)?;
@@ -773,6 +796,7 @@ pub(crate) fn copy_transaction_data(
         &self,
         to_process: Arc<Process>,
         tr: &BinderTransactionDataSg,
+        allow_fds: bool,
         txn_security_ctx_offset: Option<&mut usize>,
     ) -> BinderResult<Allocation> {
         let trd = &tr.transaction_data;
@@ -869,7 +893,14 @@ pub(crate) fn copy_transaction_data(
 
                 let mut object = BinderObject::read_from(&mut buffer_reader)?;
 
-                match self.translate_object(index, offset, object.as_ref(), &mut view, sg_state) {
+                match self.translate_object(
+                    index,
+                    offset,
+                    object.as_ref(),
+                    &mut view,
+                    allow_fds,
+                    sg_state,
+                ) {
                     Ok(()) => end_of_previous_object = offset + object.size(),
                     Err(err) => {
                         pr_warn!("Error while translating object.");
@@ -1059,7 +1090,8 @@ fn reply_inner(self: &Arc<Self>, tr: &BinderTransactionDataSg) -> BinderResult {
         (|| -> BinderResult<_> {
             let completion = DTRWrap::arc_try_new(DeliverCode::new(BR_TRANSACTION_COMPLETE))?;
             let process = orig.from.process.clone();
-            let reply = Transaction::new_reply(self, process, tr)?;
+            let allow_fds = orig.flags & TF_ACCEPT_FDS != 0;
+            let reply = Transaction::new_reply(self, process, tr, allow_fds)?;
             self.inner.lock().push_work(completion);
             orig.from.deliver_reply(Either::Left(reply), &orig);
             Ok(())
diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs
index 2faba6e1f47f..3230ea490a5b 100644
--- a/drivers/android/transaction.rs
+++ b/drivers/android/transaction.rs
@@ -11,7 +11,7 @@
 };
 
 use crate::{
-    allocation::Allocation,
+    allocation::{Allocation, TranslatedFds},
     defs::*,
     error::{BinderError, BinderResult},
     node::{Node, NodeRef},
@@ -50,19 +50,24 @@ pub(crate) fn new(
         tr: &BinderTransactionDataSg,
     ) -> BinderResult<DLArc<Self>> {
         let trd = &tr.transaction_data;
+        let allow_fds = node_ref.node.flags & FLAT_BINDER_FLAG_ACCEPTS_FDS != 0;
         let txn_security_ctx = node_ref.node.flags & FLAT_BINDER_FLAG_TXN_SECURITY_CTX != 0;
         let mut txn_security_ctx_off = if txn_security_ctx { Some(0) } else { None };
         let to = node_ref.node.owner.clone();
-        let mut alloc =
-            match from.copy_transaction_data(to.clone(), tr, txn_security_ctx_off.as_mut()) {
-                Ok(alloc) => alloc,
-                Err(err) => {
-                    if !err.is_dead() {
-                        pr_warn!("Failure in copy_transaction_data: {:?}", err);
-                    }
-                    return Err(err);
+        let mut alloc = match from.copy_transaction_data(
+            to.clone(),
+            tr,
+            allow_fds,
+            txn_security_ctx_off.as_mut(),
+        ) {
+            Ok(alloc) => alloc,
+            Err(err) => {
+                if !err.is_dead() {
+                    pr_warn!("Failure in copy_transaction_data: {:?}", err);
                 }
-            };
+                return Err(err);
+            }
+        };
         if trd.flags & TF_ONE_WAY != 0 {
             if stack_next.is_some() {
                 pr_warn!("Oneway transaction should not be in a transaction stack.");
@@ -97,9 +102,10 @@ pub(crate) fn new_reply(
         from: &Arc<Thread>,
         to: Arc<Process>,
         tr: &BinderTransactionDataSg,
+        allow_fds: bool,
     ) -> BinderResult<DLArc<Self>> {
         let trd = &tr.transaction_data;
-        let mut alloc = match from.copy_transaction_data(to.clone(), tr, None) {
+        let mut alloc = match from.copy_transaction_data(to.clone(), tr, allow_fds, None) {
             Ok(alloc) => alloc,
             Err(err) => {
                 pr_warn!("Failure in copy_transaction_data: {:?}", err);
@@ -210,6 +216,22 @@ pub(crate) fn submit(self: DLArc<Self>) -> BinderResult {
             }
         }
     }
+
+    fn prepare_file_list(&self) -> Result<TranslatedFds> {
+        let mut alloc = self.allocation.lock().take().ok_or(ESRCH)?;
+
+        match alloc.translate_fds() {
+            Ok(translated) => {
+                *self.allocation.lock() = Some(alloc);
+                Ok(translated)
+            }
+            Err(err) => {
+                // Free the allocation eagerly.
+                drop(alloc);
+                Err(err)
+            }
+        }
+    }
 }
 
 impl DeliverToRead for Transaction {
@@ -220,6 +242,13 @@ fn do_work(self: DArc<Self>, thread: &Thread, writer: &mut UserSlicePtrWriter) -
                 self.from.deliver_reply(reply, &self);
             }
         });
+        let files = if let Ok(list) = self.prepare_file_list() {
+            list
+        } else {
+            // On failure to process the list, we send a reply back to the sender and ignore the
+            // transaction on the recipient.
+            return Ok(true);
+        };
 
         let mut tr_sec = BinderTransactionDataSecctx::default();
         let tr = tr_sec.tr_data();
@@ -269,6 +298,8 @@ fn do_work(self: DArc<Self>, thread: &Thread, writer: &mut UserSlicePtrWriter) -
             alloc.keep_alive();
         }
 
+        files.commit();
+
         // When this is not a reply and not a oneway transaction, update `current_transaction`. If
         // it's a reply, `current_transaction` has already been updated appropriately.
         if self.target_node.is_some() && tr_sec.transaction_data.flags & TF_ONE_WAY == 0 {
diff --git a/rust/helpers.c b/rust/helpers.c
index 924c7a00f433..be295d8bdb46 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -349,6 +349,14 @@ int rust_helper_security_binder_transfer_binder(const struct cred *from,
 	return security_binder_transfer_binder(from, to);
 }
 EXPORT_SYMBOL_GPL(rust_helper_security_binder_transfer_binder);
+
+int rust_helper_security_binder_transfer_file(const struct cred *from,
+					      const struct cred *to,
+					      struct file *file)
+{
+	return security_binder_transfer_file(from, to, file);
+}
+EXPORT_SYMBOL_GPL(rust_helper_security_binder_transfer_file);
 #endif
 
 /*
diff --git a/rust/kernel/file.rs b/rust/kernel/file.rs
index 2e983285cc16..a0319c93f367 100644
--- a/rust/kernel/file.rs
+++ b/rust/kernel/file.rs
@@ -107,7 +107,7 @@ pub mod flags {
 /// Instances of this type are always ref-counted, that is, a call to `get_file` ensures that the
 /// allocation remains valid at least until the matching call to `fput`.
 #[repr(transparent)]
-pub struct File(Opaque<bindings::file>);
+pub struct File(pub(crate) Opaque<bindings::file>);
 
 // SAFETY: By design, the only way to access a `File` is via an immutable reference or an `ARef`.
 // This means that the only situation in which a `File` can be accessed mutably is when the
diff --git a/rust/kernel/security.rs b/rust/kernel/security.rs
index 9179fc225406..d308b8183c59 100644
--- a/rust/kernel/security.rs
+++ b/rust/kernel/security.rs
@@ -8,6 +8,7 @@
     bindings,
     cred::Credential,
     error::{to_result, Result},
+    file::File,
 };
 
 /// Calls the security modules to determine if the given task can become the manager of a binder
@@ -31,6 +32,16 @@ pub fn binder_transfer_binder(from: &Credential, to: &Credential) -> Result {
     to_result(unsafe { bindings::security_binder_transfer_binder(from.0.get(), to.0.get()) })
 }
 
+/// Calls the security modules to determine if task `from` is allowed to send the given file to
+/// task `to` (which would get its own file descriptor) through a binder transaction.
+pub fn binder_transfer_file(from: &Credential, to: &Credential, file: &File) -> Result {
+    // SAFETY: `from`, `to` and `file` are valid because the shared references guarantee nonzero
+    // refcounts.
+    to_result(unsafe {
+        bindings::security_binder_transfer_file(from.0.get(), to.0.get(), file.0.get())
+    })
+}
+
 /// A security context string.
 ///
 /// The struct has the invariant that it always contains a valid security context.

-- 
2.42.0.820.g83a721a137-goog



* [PATCH RFC 14/20] rust_binder: add BINDER_TYPE_FDA support
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (12 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 13/20] rust_binder: add BINDER_TYPE_FD support Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 15/20] rust_binder: add process freezing Alice Ryhl
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

In the previous patch, we introduced support for `BINDER_TYPE_FD`
objects, which let you send a single fd. In this patch, we add support
for FD arrays. One important difference between `BINDER_TYPE_FD` and
`BINDER_TYPE_FDA` is that the fds sent in an FD array are closed when
the transaction allocation is freed, whereas fds sent using
`BINDER_TYPE_FD` are not closed.

Note that `BINDER_TYPE_FDA` is used only with hwbinder.
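
To illustrate the difference in lifetime, here is a minimal userspace
sketch. It is not driver code: `ToyAllocation`, `Source` and
`closed_after_free` are hypothetical names, and "closing" an fd is
modelled by recording its number in a log when the allocation is
dropped.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// How an fd arrived in the transaction (hypothetical model type).
#[derive(Clone, Copy, PartialEq, Debug)]
enum Source {
    /// Sent as a single `BINDER_TYPE_FD` object.
    Fd,
    /// Sent as an element of a `BINDER_TYPE_FDA` array.
    Fda,
}

/// Toy stand-in for the transaction allocation in the recipient.
struct ToyAllocation {
    fds: Vec<(u32, Source)>,
    /// Log of fds that were "closed" when the allocation was freed.
    closed: Rc<RefCell<Vec<u32>>>,
}

impl Drop for ToyAllocation {
    fn drop(&mut self) {
        for &(fd, source) in &self.fds {
            // Only fds that came from an FD array are closed on free.
            if source == Source::Fda {
                self.closed.borrow_mut().push(fd);
            }
        }
    }
}

/// Frees a toy allocation and returns the fds that were closed.
fn closed_after_free(fds: Vec<(u32, Source)>) -> Vec<u32> {
    let closed = Rc::new(RefCell::new(Vec::new()));
    let alloc = ToyAllocation {
        fds,
        closed: Rc::clone(&closed),
    };
    drop(alloc);
    let result = closed.borrow().clone();
    result
}

fn main() {
    let closed = closed_after_free(vec![
        (3, Source::Fd),
        (4, Source::Fda),
        (5, Source::Fda),
    ]);
    println!("closed on free: {:?}", closed);
}
```

In the driver itself the close is deferred to task work rather than
performed inline, which this sketch does not attempt to model.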

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/allocation.rs  | 74 ++++++++++++++++++++++++++++++++---
 drivers/android/defs.rs        |  1 +
 drivers/android/thread.rs      | 87 +++++++++++++++++++++++++++++++++++++++---
 drivers/android/transaction.rs | 13 ++++---
 4 files changed, 159 insertions(+), 16 deletions(-)

diff --git a/drivers/android/allocation.rs b/drivers/android/allocation.rs
index 9d777ffb7176..c7f44a54b79b 100644
--- a/drivers/android/allocation.rs
+++ b/drivers/android/allocation.rs
@@ -4,7 +4,7 @@
 
 use kernel::{
     bindings,
-    file::{File, FileDescriptorReservation},
+    file::{DeferredFdCloser, File, FileDescriptorReservation},
     io_buffer::{IoBufferReader, ReadableFromBytes, WritableToBytes},
     pages::Pages,
     prelude::*,
@@ -165,18 +165,38 @@ pub(crate) fn set_info_target_node(&mut self, target_node: NodeRef) {
         self.get_or_init_info().target_node = Some(target_node);
     }
 
-    pub(crate) fn info_add_fd(&mut self, file: ARef<File>, buffer_offset: usize) -> Result {
+    /// Reserve enough space to push at least `num_fds` fds.
+    pub(crate) fn info_add_fd_reserve(&mut self, num_fds: usize) -> Result {
+        self.get_or_init_info()
+            .file_list
+            .files_to_translate
+            .try_reserve(num_fds)?;
+
+        Ok(())
+    }
+
+    pub(crate) fn info_add_fd(
+        &mut self,
+        file: ARef<File>,
+        buffer_offset: usize,
+        close_on_free: bool,
+    ) -> Result {
         self.get_or_init_info()
             .file_list
             .files_to_translate
             .try_push(FileEntry {
                 file,
                 buffer_offset,
+                close_on_free,
             })?;
 
         Ok(())
     }
 
+    pub(crate) fn set_info_close_on_free(&mut self, cof: FdsCloseOnFree) {
+        self.get_or_init_info().file_list.close_on_free = cof.0;
+    }
+
     pub(crate) fn translate_fds(&mut self) -> Result<TranslatedFds> {
         let file_list = match self.allocation_info.as_mut() {
             Some(info) => &mut info.file_list,
@@ -184,17 +204,38 @@ pub(crate) fn translate_fds(&mut self) -> Result<TranslatedFds> {
         };
 
         let files = core::mem::take(&mut file_list.files_to_translate);
+
+        let num_close_on_free = files.iter().filter(|entry| entry.close_on_free).count();
+        let mut close_on_free = Vec::try_with_capacity(num_close_on_free)?;
+
         let mut reservations = Vec::try_with_capacity(files.len())?;
         for file_info in files {
             let res = FileDescriptorReservation::new(bindings::O_CLOEXEC)?;
-            self.write::<u32>(file_info.buffer_offset, &res.reserved_fd())?;
+            let fd = res.reserved_fd();
+            self.write::<u32>(file_info.buffer_offset, &fd)?;
             reservations.try_push(Reservation {
                 res,
                 file: file_info.file,
             })?;
+            if file_info.close_on_free {
+                close_on_free.try_push(fd)?;
+            }
         }
 
-        Ok(TranslatedFds { reservations })
+        Ok(TranslatedFds {
+            reservations,
+            close_on_free: FdsCloseOnFree(close_on_free),
+        })
+    }
+
+    /// Should the looper return to userspace when freeing this allocation?
+    pub(crate) fn looper_need_return_on_free(&self) -> bool {
+        // Closing fds involves pushing task_work for execution when we return to userspace. Hence,
+        // we should return to userspace asap if we are closing fds.
+        match self.allocation_info {
+            Some(ref info) => !info.file_list.close_on_free.is_empty(),
+            None => false,
+        }
     }
 }
 
@@ -220,6 +261,18 @@ fn drop(&mut self) {
                 }
             }
 
+            for &fd in &info.file_list.close_on_free {
+                let closer = match DeferredFdCloser::new() {
+                    Ok(closer) => closer,
+                    Err(core::alloc::AllocError) => {
+                        // Ignore allocation failures.
+                        break;
+                    }
+                };
+
+                closer.close_fd(fd);
+            }
+
             if info.clear_on_free {
                 if let Err(e) = self.fill_zero() {
                     pr_warn!("Failed to clear data on free: {:?}", e);
@@ -457,6 +510,7 @@ fn type_to_size(type_: u32) -> Option<usize> {
 #[derive(Default)]
 struct FileList {
     files_to_translate: Vec<FileEntry>,
+    close_on_free: Vec<u32>,
 }
 
 struct FileEntry {
@@ -464,10 +518,15 @@ struct FileEntry {
     file: ARef<File>,
     /// The offset in the buffer where the file descriptor is stored.
     buffer_offset: usize,
+    /// Whether this fd should be closed when the allocation is freed.
+    close_on_free: bool,
 }
 
 pub(crate) struct TranslatedFds {
     reservations: Vec<Reservation>,
+    /// If commit is called, then these fds should be closed. (If commit is not called, then they
+    /// shouldn't be closed.)
+    close_on_free: FdsCloseOnFree,
 }
 
 struct Reservation {
@@ -479,12 +538,17 @@ impl TranslatedFds {
     pub(crate) fn new() -> Self {
         Self {
             reservations: Vec::new(),
+            close_on_free: FdsCloseOnFree(Vec::new()),
         }
     }
 
-    pub(crate) fn commit(self) {
+    pub(crate) fn commit(self) -> FdsCloseOnFree {
         for entry in self.reservations {
             entry.res.commit(entry.file);
         }
+
+        self.close_on_free
     }
 }
+
+pub(crate) struct FdsCloseOnFree(Vec<u32>);
diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs
index fa4ec3eff424..8f9419d474de 100644
--- a/drivers/android/defs.rs
+++ b/drivers/android/defs.rs
@@ -106,6 +106,7 @@ fn default() -> Self {
 decl_wrapper!(BinderNodeInfoForRef, bindings::binder_node_info_for_ref);
 decl_wrapper!(FlatBinderObject, bindings::flat_binder_object);
 decl_wrapper!(BinderFdObject, bindings::binder_fd_object);
+decl_wrapper!(BinderFdArrayObject, bindings::binder_fd_array_object);
 decl_wrapper!(BinderObjectHeader, bindings::binder_object_header);
 decl_wrapper!(BinderBufferObject, bindings::binder_buffer_object);
 decl_wrapper!(BinderTransactionData, bindings::binder_transaction_data);
diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs
index 56b36dc43bcc..2e86592fb61f 100644
--- a/drivers/android/thread.rs
+++ b/drivers/android/thread.rs
@@ -654,7 +654,8 @@ fn translate_object(
                 const FD_FIELD_OFFSET: usize =
                     ::core::mem::offset_of!(bindings::binder_fd_object, __bindgen_anon_1.fd)
                         as usize;
-                view.alloc.info_add_fd(file, offset + FD_FIELD_OFFSET)?;
+                view.alloc
+                    .info_add_fd(file, offset + FD_FIELD_OFFSET, false)?;
             }
             BinderObjectRef::Ptr(obj) => {
                 let obj_length = obj.length.try_into().map_err(|_| EINVAL)?;
@@ -729,9 +730,77 @@ fn translate_object(
                 obj_write.parent_offset = obj.parent_offset;
                 view.write::<BinderBufferObject>(offset, &obj_write)?;
             }
-            BinderObjectRef::Fda(_obj) => {
-                pr_warn!("Using unsupported binder object type fda.");
-                return Err(EINVAL.into());
+            BinderObjectRef::Fda(obj) => {
+                if !allow_fds {
+                    return Err(EPERM.into());
+                }
+                let parent_index = usize::try_from(obj.parent).map_err(|_| EINVAL)?;
+                let parent_offset = usize::try_from(obj.parent_offset).map_err(|_| EINVAL)?;
+                let num_fds = usize::try_from(obj.num_fds).map_err(|_| EINVAL)?;
+                let fds_len = num_fds.checked_mul(size_of::<u32>()).ok_or(EINVAL)?;
+
+                view.alloc.info_add_fd_reserve(num_fds)?;
+
+                let info = sg_state.validate_parent_fixup(parent_index, parent_offset, fds_len)?;
+
+                sg_state.ancestors.truncate(info.num_ancestors);
+                let parent_entry = match sg_state.sg_entries.get_mut(info.parent_sg_index) {
+                    Some(parent_entry) => parent_entry,
+                    None => {
+                        pr_err!(
+                            "validate_parent_fixup returned index out of bounds for sg_entries"
+                        );
+                        return Err(EINVAL.into());
+                    }
+                };
+
+                parent_entry.fixup_min_offset = info.new_min_offset;
+                parent_entry
+                    .pointer_fixups
+                    .try_push(PointerFixupEntry {
+                        skip: fds_len,
+                        pointer_value: 0,
+                        target_offset: info.target_offset,
+                    })
+                    .map_err(|_| ENOMEM)?;
+
+                let fda_uaddr = parent_entry
+                    .sender_uaddr
+                    .checked_add(parent_offset)
+                    .ok_or(EINVAL)?;
+                let fda_bytes = UserSlicePtr::new(fda_uaddr as _, fds_len).read_all()?;
+
+                if fds_len != fda_bytes.len() {
+                    pr_err!("UserSlicePtr::read_all returned wrong length in BINDER_TYPE_FDA");
+                    return Err(EINVAL.into());
+                }
+
+                for i in (0..fds_len).step_by(size_of::<u32>()) {
+                    let fd = {
+                        let mut fd_bytes = [0u8; size_of::<u32>()];
+                        fd_bytes.copy_from_slice(&fda_bytes[i..i + size_of::<u32>()]);
+                        u32::from_ne_bytes(fd_bytes)
+                    };
+
+                    let file = File::from_fd(fd)?;
+                    security::binder_transfer_file(
+                        &self.process.cred,
+                        &view.alloc.process.cred,
+                        &file,
+                    )?;
+
+                    // The `validate_parent_fixup` call ensures that this addition will not
+                    // overflow.
+                    view.alloc.info_add_fd(file, info.target_offset + i, true)?;
+                }
+                drop(fda_bytes);
+
+                let mut obj_write = BinderFdArrayObject::default();
+                obj_write.hdr.type_ = BINDER_TYPE_FDA;
+                obj_write.num_fds = obj.num_fds;
+                obj_write.parent = obj.parent;
+                obj_write.parent_offset = obj.parent_offset;
+                view.write::<BinderFdArrayObject>(offset, &obj_write)?;
             }
         }
         Ok(())
@@ -1160,7 +1229,15 @@ fn write(self: &Arc<Self>, req: &mut BinderWriteRead) -> Result {
                     let tr = reader.read::<BinderTransactionDataSg>()?;
                     self.transaction(&tr, Self::reply_inner)
                 }
-                BC_FREE_BUFFER => drop(self.process.buffer_get(reader.read()?)),
+                BC_FREE_BUFFER => {
+                    let buffer = self.process.buffer_get(reader.read()?);
+                    if let Some(buffer) = &buffer {
+                        if buffer.looper_need_return_on_free() {
+                            self.inner.lock().looper_need_return = true;
+                        }
+                    }
+                    drop(buffer);
+                }
                 BC_INCREFS => self.process.update_ref(reader.read()?, true, false)?,
                 BC_ACQUIRE => self.process.update_ref(reader.read()?, true, true)?,
                 BC_RELEASE => self.process.update_ref(reader.read()?, false, true)?,
diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs
index 3230ea490a5b..ec32a9fd0ff1 100644
--- a/drivers/android/transaction.rs
+++ b/drivers/android/transaction.rs
@@ -288,17 +288,18 @@ fn do_work(self: DArc<Self>, thread: &Thread, writer: &mut UserSlicePtrWriter) -
             writer.write(&*tr)?;
         }
 
+        let mut alloc = self.allocation.lock().take().ok_or(ESRCH)?;
+
         // Dismiss the completion of transaction with a failure. No failure paths are allowed from
         // here on out.
         send_failed_reply.dismiss();
 
-        // It is now the user's responsibility to clear the allocation.
-        let alloc = self.allocation.lock().take();
-        if let Some(alloc) = alloc {
-            alloc.keep_alive();
-        }
+        // Commit files, and set FDs in FDA to be closed on buffer free.
+        let close_on_free = files.commit();
+        alloc.set_info_close_on_free(close_on_free);
 
-        files.commit();
+        // It is now the user's responsibility to clear the allocation.
+        alloc.keep_alive();
 
         // When this is not a reply and not a oneway transaction, update `current_transaction`. If
         // it's a reply, `current_transaction` has already been updated appropriately.

-- 
2.42.0.820.g83a721a137-goog



* [PATCH RFC 15/20] rust_binder: add process freezing
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (13 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 14/20] rust_binder: add BINDER_TYPE_FDA support Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 16/20] rust_binder: add TF_UPDATE_TXN support Alice Ryhl
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

When you want to freeze a process, you should process all incoming
transactions before you freeze it. This patch helps with that. The idea
is that before you freeze the process, you mark it as frozen in the
binder driver. When this happens, all new incoming transactions are
rejected, which lets you empty the queue of incoming transactions that
were sent before you decided to freeze the process. Once you have
processed every transaction in that queue, you can perform the actual
freeze operation.
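
The protocol can be sketched in userspace as follows. This is a
simplified model under stated assumptions, not the driver code:
`ToyProcess`, `ToyError`, `submit`, `complete`, `freeze` and `unfreeze`
are hypothetical names, and the standard library's `Mutex` and `Condvar`
stand in for the kernel primitives.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum ToyError {
    /// The target is marked frozen (stands in for BR_FROZEN_REPLY).
    Frozen,
    /// Transactions were still pending at the timeout (stands in for EAGAIN).
    StillPending,
}

#[derive(Default)]
struct State {
    is_frozen: bool,
    outstanding_txns: u32,
}

#[derive(Default)]
struct ToyProcess {
    state: Mutex<State>,
    freeze_wait: Condvar,
}

impl ToyProcess {
    /// A new incoming transaction: rejected once the process is marked frozen.
    fn submit(&self) -> Result<(), ToyError> {
        let mut s = self.state.lock().unwrap();
        if s.is_frozen {
            return Err(ToyError::Frozen);
        }
        s.outstanding_txns += 1;
        Ok(())
    }

    /// A transaction has been fully processed; wake a waiting freezer if the
    /// queue is now empty.
    fn complete(&self) {
        let mut s = self.state.lock().unwrap();
        s.outstanding_txns = s.outstanding_txns.saturating_sub(1);
        if s.is_frozen && s.outstanding_txns == 0 {
            self.freeze_wait.notify_all();
        }
    }

    /// Mark the process frozen, then wait up to `timeout` for the queue to
    /// drain. If it does not drain, undo the mark and report failure.
    fn freeze(&self, timeout: Duration) -> Result<(), ToyError> {
        let mut s = self.state.lock().unwrap();
        s.is_frozen = true;
        let (mut s, _timed_out) = self
            .freeze_wait
            .wait_timeout_while(s, timeout, |st| st.outstanding_txns > 0)
            .unwrap();
        if s.outstanding_txns > 0 {
            s.is_frozen = false;
            return Err(ToyError::StillPending);
        }
        Ok(())
    }

    fn unfreeze(&self) {
        self.state.lock().unwrap().is_frozen = false;
    }
}

fn main() {
    let p = ToyProcess::default();
    p.submit().unwrap();
    // One transaction is still pending, so the freeze attempt fails.
    println!("{:?}", p.freeze(Duration::from_millis(0)));
    p.complete();
    // The queue is empty now, so the process can be marked frozen.
    println!("{:?}", p.freeze(Duration::from_millis(0)));
    // New transactions are rejected while frozen.
    println!("{:?}", p.submit());
    p.unfreeze();
}
```

The caller would perform the actual freeze only after `freeze` returns
`Ok(())`, and would retry later on `StillPending`.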

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/context.rs     |  39 +++++++++++
 drivers/android/defs.rs        |   3 +
 drivers/android/error.rs       |   8 +++
 drivers/android/process.rs     | 155 ++++++++++++++++++++++++++++++++++++++++-
 drivers/android/thread.rs      |  12 +++-
 drivers/android/transaction.rs |  50 ++++++++++++-
 rust/kernel/sync/condvar.rs    |  10 +++
 7 files changed, 272 insertions(+), 5 deletions(-)

diff --git a/drivers/android/context.rs b/drivers/android/context.rs
index b5de9d98a6b0..925c368238db 100644
--- a/drivers/android/context.rs
+++ b/drivers/android/context.rs
@@ -69,6 +69,18 @@ pub(crate) struct ContextList {
     list: List<Context>,
 }
 
+pub(crate) fn get_all_contexts() -> Result<Vec<Arc<Context>>> {
+    let lock = CONTEXTS.lock();
+
+    let count = lock.list.iter().count();
+
+    let mut ctxs = Vec::try_with_capacity(count)?;
+    for ctx in &lock.list {
+        ctxs.try_push(Arc::from(ctx))?;
+    }
+    Ok(ctxs)
+}
+
 /// This struct keeps track of the processes using this context, and which process is the context
 /// manager.
 struct Manager {
@@ -183,4 +195,31 @@ pub(crate) fn get_manager_node(&self, strong: bool) -> Result<NodeRef, BinderErr
             .clone(strong)
             .map_err(BinderError::from)
     }
+
+    pub(crate) fn for_each_proc<F>(&self, mut func: F)
+    where
+        F: FnMut(&Process),
+    {
+        let lock = self.manager.lock();
+        for proc in &lock.all_procs {
+            func(&proc);
+        }
+    }
+
+    pub(crate) fn get_all_procs(&self) -> Result<Vec<Arc<Process>>> {
+        let lock = self.manager.lock();
+        let count = lock.all_procs.iter().count();
+
+        let mut procs = Vec::try_with_capacity(count)?;
+        for proc in &lock.all_procs {
+            procs.try_push(Arc::from(proc))?;
+        }
+        Ok(procs)
+    }
+
+    pub(crate) fn get_procs_with_pid(&self, pid: i32) -> Result<Vec<Arc<Process>>> {
+        let mut procs = self.get_all_procs()?;
+        procs.retain(|proc| proc.task.pid() == pid);
+        Ok(procs)
+    }
 }
diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs
index 8f9419d474de..30659bd26bff 100644
--- a/drivers/android/defs.rs
+++ b/drivers/android/defs.rs
@@ -20,6 +20,7 @@ macro_rules! pub_no_prefix {
     BR_REPLY,
     BR_DEAD_REPLY,
     BR_FAILED_REPLY,
+    BR_FROZEN_REPLY,
     BR_NOOP,
     BR_SPAWN_LOOPER,
     BR_TRANSACTION_COMPLETE,
@@ -120,6 +121,8 @@ fn default() -> Self {
 );
 decl_wrapper!(BinderWriteRead, bindings::binder_write_read);
 decl_wrapper!(BinderVersion, bindings::binder_version);
+decl_wrapper!(BinderFrozenStatusInfo, bindings::binder_frozen_status_info);
+decl_wrapper!(BinderFreezeInfo, bindings::binder_freeze_info);
 decl_wrapper!(ExtendedError, bindings::binder_extended_error);
 
 impl BinderVersion {
diff --git a/drivers/android/error.rs b/drivers/android/error.rs
index 6735636d2a1c..5cc724931bd3 100644
--- a/drivers/android/error.rs
+++ b/drivers/android/error.rs
@@ -21,6 +21,13 @@ pub(crate) fn new_dead() -> Self {
         }
     }
 
+    pub(crate) fn new_frozen() -> Self {
+        Self {
+            reply: BR_FROZEN_REPLY,
+            source: None,
+        }
+    }
+
     pub(crate) fn is_dead(&self) -> bool {
         self.reply == BR_DEAD_REPLY
     }
@@ -76,6 +83,7 @@ fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
                 None => f.pad("BR_FAILED_REPLY"),
             },
             BR_DEAD_REPLY => f.pad("BR_DEAD_REPLY"),
+            BR_FROZEN_REPLY => f.pad("BR_FROZEN_REPLY"),
             BR_TRANSACTION_COMPLETE => f.pad("BR_TRANSACTION_COMPLETE"),
             _ => f
                 .debug_struct("BinderError")
diff --git a/drivers/android/process.rs b/drivers/android/process.rs
index 944297b7403c..44baf9e3f998 100644
--- a/drivers/android/process.rs
+++ b/drivers/android/process.rs
@@ -20,7 +20,9 @@
     pages::Pages,
     prelude::*,
     rbtree::RBTree,
-    sync::{lock::Guard, Arc, ArcBorrow, Mutex, SpinLock, UniqueArc},
+    sync::{
+        lock::Guard, Arc, ArcBorrow, CondVar, CondVarTimeoutResult, Mutex, SpinLock, UniqueArc,
+    },
     task::Task,
     types::{ARef, Either},
     user_ptr::{UserSlicePtr, UserSlicePtrReader},
@@ -80,6 +82,16 @@ pub(crate) struct ProcessInner {
 
     /// Bitmap of deferred work to do.
     defer_work: u8,
+
+    /// Number of transactions to be transmitted before processes in freeze_wait
+    /// are woken up.
+    outstanding_txns: u32,
+    /// Process is frozen and unable to service binder transactions.
+    pub(crate) is_frozen: bool,
+    /// Process received sync transactions since last frozen.
+    pub(crate) sync_recv: bool,
+    /// Process received async transactions since last frozen.
+    pub(crate) async_recv: bool,
 }
 
 impl ProcessInner {
@@ -97,6 +109,10 @@ fn new() -> Self {
             max_threads: 0,
             started_thread_count: 0,
             defer_work: 0,
+            outstanding_txns: 0,
+            is_frozen: false,
+            sync_recv: false,
+            async_recv: false,
         }
     }
 
@@ -248,6 +264,22 @@ pub(crate) fn death_delivered(&mut self, death: DArc<NodeDeath>) {
             pr_warn!("Notification added to `delivered_deaths` twice.");
         }
     }
+
+    pub(crate) fn add_outstanding_txn(&mut self) {
+        self.outstanding_txns += 1;
+    }
+
+    fn txns_pending_locked(&self) -> bool {
+        if self.outstanding_txns > 0 {
+            return true;
+        }
+        for thread in self.threads.values() {
+            if thread.has_current_transaction() {
+                return true;
+            }
+        }
+        false
+    }
 }
 
 struct NodeRefInfo {
@@ -296,6 +328,11 @@ pub(crate) struct Process {
     #[pin]
     pub(crate) inner: SpinLock<ProcessInner>,
 
+    // Waitqueue of processes waiting for all outstanding transactions to be
+    // processed.
+    #[pin]
+    freeze_wait: CondVar,
+
     // Node references are in a different lock to avoid recursive acquisition when
     // incrementing/decrementing a node in another process.
     #[pin]
@@ -353,6 +390,7 @@ fn new(ctx: Arc<Context>, cred: ARef<Credential>) -> Result<Arc<Self>> {
             cred,
             inner <- kernel::new_spinlock!(ProcessInner::new(), "Process::inner"),
             node_refs <- kernel::new_mutex!(ProcessNodeRefs::new(), "Process::node_refs"),
+            freeze_wait <- kernel::new_condvar!("Process::freeze_wait"),
             task: kernel::current!().group_leader().into(),
             defer_work <- kernel::new_work!("Process::defer_work"),
             links <- ListLinks::new(),
@@ -878,6 +916,9 @@ fn deferred_release(self: Arc<Self>) {
         let is_manager = {
             let mut inner = self.inner.lock();
             inner.is_dead = true;
+            inner.is_frozen = false;
+            inner.sync_recv = false;
+            inner.async_recv = false;
             inner.is_manager
         };
 
@@ -975,6 +1016,116 @@ pub(crate) fn flush(this: ArcBorrow<'_, Process>) -> Result {
         }
         Ok(())
     }
+
+    pub(crate) fn drop_outstanding_txn(&self) {
+        let wake = {
+            let mut inner = self.inner.lock();
+            if inner.outstanding_txns == 0 {
+                pr_err!("outstanding_txns underflow");
+                return;
+            }
+            inner.outstanding_txns -= 1;
+            inner.is_frozen && inner.outstanding_txns == 0
+        };
+
+        if wake {
+            self.freeze_wait.notify_all();
+        }
+    }
+
+    pub(crate) fn ioctl_freeze(&self, info: &BinderFreezeInfo) -> Result {
+        if info.enable == 0 {
+            let mut inner = self.inner.lock();
+            inner.sync_recv = false;
+            inner.async_recv = false;
+            inner.is_frozen = false;
+            return Ok(());
+        }
+
+        let mut inner = self.inner.lock();
+        inner.sync_recv = false;
+        inner.async_recv = false;
+        inner.is_frozen = true;
+
+        if info.timeout_ms > 0 {
+            // SAFETY: Just an FFI call.
+            let mut jiffies = unsafe { bindings::__msecs_to_jiffies(info.timeout_ms) };
+            while jiffies > 0 {
+                if inner.outstanding_txns == 0 {
+                    break;
+                }
+
+                match self.freeze_wait.wait_timeout(&mut inner, jiffies) {
+                    CondVarTimeoutResult::Signal { .. } => {
+                        inner.is_frozen = false;
+                        return Err(ERESTARTSYS);
+                    }
+                    CondVarTimeoutResult::Woken { jiffies: remaining } => {
+                        jiffies = remaining;
+                    }
+                    CondVarTimeoutResult::Timeout => {
+                        jiffies = 0;
+                    }
+                }
+            }
+        }
+
+        if inner.txns_pending_locked() {
+            inner.is_frozen = false;
+            Err(EAGAIN)
+        } else {
+            Ok(())
+        }
+    }
+}
+
+fn get_frozen_status(data: UserSlicePtr) -> Result {
+    let (mut reader, mut writer) = data.reader_writer();
+
+    let mut info = reader.read::<BinderFrozenStatusInfo>()?;
+    info.sync_recv = 0;
+    info.async_recv = 0;
+    let mut found = false;
+
+    for ctx in crate::context::get_all_contexts()? {
+        ctx.for_each_proc(|proc| {
+            if proc.task.pid() == info.pid as _ {
+                found = true;
+                let inner = proc.inner.lock();
+                let txns_pending = inner.txns_pending_locked();
+                info.async_recv |= inner.async_recv as u32;
+                info.sync_recv |= inner.sync_recv as u32;
+                info.sync_recv |= (txns_pending as u32) << 1;
+            }
+        });
+    }
+
+    if found {
+        writer.write(&info)?;
+        Ok(())
+    } else {
+        Err(EINVAL)
+    }
+}
+
+fn ioctl_freeze(reader: &mut UserSlicePtrReader) -> Result {
+    let info = reader.read::<BinderFreezeInfo>()?;
+
+    // Very unlikely for there to be more than 3, since a process normally uses at most binder and
+    // hwbinder.
+    let mut procs = Vec::try_with_capacity(3)?;
+
+    let ctxs = crate::context::get_all_contexts()?;
+    for ctx in ctxs {
+        for proc in ctx.get_procs_with_pid(info.pid as i32)? {
+            procs.try_push(proc)?;
+        }
+    }
+
+    for proc in procs {
+        proc.ioctl_freeze(&info)?;
+    }
+    Ok(())
 }
 
 /// The ioctl handler.
@@ -993,6 +1144,7 @@ fn write(
             bindings::BINDER_SET_CONTEXT_MGR_EXT => {
                 this.set_as_manager(Some(reader.read()?), &thread)?
             }
+            bindings::BINDER_FREEZE => ioctl_freeze(reader)?,
             _ => return Err(EINVAL),
         }
         Ok(0)
@@ -1011,6 +1163,7 @@ fn read_write(
             bindings::BINDER_GET_NODE_DEBUG_INFO => this.get_node_debug_info(data)?,
             bindings::BINDER_GET_NODE_INFO_FOR_REF => this.get_node_info_from_ref(data)?,
             bindings::BINDER_VERSION => this.version(data)?,
+            bindings::BINDER_GET_FROZEN_INFO => get_frozen_status(data)?,
             bindings::BINDER_GET_EXTENDED_ERROR => thread.get_extended_error(data)?,
             _ => return Err(EINVAL),
         }
diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs
index 2e86592fb61f..0238c15604f6 100644
--- a/drivers/android/thread.rs
+++ b/drivers/android/thread.rs
@@ -458,6 +458,10 @@ pub(crate) fn set_current_transaction(&self, transaction: DArc<Transaction>) {
         self.inner.lock().current_transaction = Some(transaction);
     }
 
+    pub(crate) fn has_current_transaction(&self) -> bool {
+        self.inner.lock().current_transaction.is_some()
+    }
+
     /// Attempts to fetch a work item from the thread-local queue. The behaviour if the queue is
     /// empty depends on `wait`: if it is true, the function waits for some work to be queued (or a
     /// signal); otherwise it returns indicating that none is available.
@@ -482,7 +486,7 @@ fn get_work_local(self: &Arc<Self>, wait: bool) -> Result<Option<DLArc<dyn Deliv
             }
 
             inner.looper_flags |= LOOPER_WAITING;
-            let signal_pending = self.work_condvar.wait(&mut inner);
+            let signal_pending = self.work_condvar.wait_freezable(&mut inner);
             inner.looper_flags &= !LOOPER_WAITING;
 
             if signal_pending {
@@ -533,7 +537,7 @@ fn get_work(self: &Arc<Self>, wait: bool) -> Result<Option<DLArc<dyn DeliverToRe
             }
 
             inner.looper_flags |= LOOPER_WAITING | LOOPER_WAITING_PROC;
-            let signal_pending = self.work_condvar.wait(&mut inner);
+            let signal_pending = self.work_condvar.wait_freezable(&mut inner);
             inner.looper_flags &= !(LOOPER_WAITING | LOOPER_WAITING_PROC);
 
             if signal_pending || inner.looper_need_return {
@@ -1043,6 +1047,10 @@ fn deliver_single_reply(
         reply: Either<DLArc<Transaction>, u32>,
         transaction: &DArc<Transaction>,
     ) -> bool {
+        if let Either::Left(transaction) = &reply {
+            transaction.set_outstanding(&mut self.process.inner.lock());
+        }
+
         {
             let mut inner = self.inner.lock();
             if !inner.pop_transaction_replied(transaction) {
diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs
index ec32a9fd0ff1..96f63684b1a3 100644
--- a/drivers/android/transaction.rs
+++ b/drivers/android/transaction.rs
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 
+use core::sync::atomic::{AtomicBool, Ordering};
 use kernel::{
     io_buffer::IoBufferWriter,
     list::ListArcSafe,
@@ -15,13 +16,13 @@
     defs::*,
     error::{BinderError, BinderResult},
     node::{Node, NodeRef},
-    process::Process,
+    process::{Process, ProcessInner},
     ptr_align,
     thread::{PushWorkRes, Thread},
     DArc, DLArc, DTRWrap, DeliverToRead,
 };
 
-#[pin_data]
+#[pin_data(PinnedDrop)]
 pub(crate) struct Transaction {
     target_node: Option<DArc<Node>>,
     stack_next: Option<DArc<Transaction>>,
@@ -29,6 +30,7 @@ pub(crate) struct Transaction {
     to: Arc<Process>,
     #[pin]
     allocation: SpinLock<Option<Allocation>>,
+    is_outstanding: AtomicBool,
     code: u32,
     pub(crate) flags: u32,
     data_size: usize,
@@ -94,6 +96,7 @@ pub(crate) fn new(
             offsets_size: trd.offsets_size as _,
             data_address,
             allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"),
+            is_outstanding: AtomicBool::new(false),
             txn_security_ctx_off,
         }))?)
     }
@@ -127,6 +130,7 @@ pub(crate) fn new_reply(
             offsets_size: trd.offsets_size as _,
             data_address: alloc.ptr,
             allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"),
+            is_outstanding: AtomicBool::new(false),
             txn_security_ctx_off: None,
         }))?)
     }
@@ -172,6 +176,26 @@ pub(crate) fn find_from(&self, thread: &Thread) -> Option<DArc<Transaction>> {
         None
     }
 
+    pub(crate) fn set_outstanding(&self, to_process: &mut ProcessInner) {
+        // No race because this method is only called once.
+        if !self.is_outstanding.load(Ordering::Relaxed) {
+            self.is_outstanding.store(true, Ordering::Relaxed);
+            to_process.add_outstanding_txn();
+        }
+    }
+
+    /// Decrement `outstanding_txns` in `to` if it hasn't already been decremented.
+    fn drop_outstanding_txn(&self) {
+        // No race because this is called at most twice, and one of the calls is in the
+        // destructor, which is guaranteed to not race with any other operations on the
+        // transaction. It also cannot race with `set_outstanding`, since submission happens
+        // before delivery.
+        if self.is_outstanding.load(Ordering::Relaxed) {
+            self.is_outstanding.store(false, Ordering::Relaxed);
+            self.to.drop_outstanding_txn();
+        }
+    }
+
     /// Submits the transaction to a work queue. Uses a thread if there is one in the transaction
     /// stack, otherwise uses the destination process.
     ///
@@ -181,8 +205,13 @@ pub(crate) fn submit(self: DLArc<Self>) -> BinderResult {
         let process = self.to.clone();
         let mut process_inner = process.inner.lock();
 
+        self.set_outstanding(&mut process_inner);
+
         if oneway {
             if let Some(target_node) = self.target_node.clone() {
+                if process_inner.is_frozen {
+                    process_inner.async_recv = true;
+                }
                 match target_node.submit_oneway(self, &mut process_inner) {
                     Ok(()) => return Ok(()),
                     Err((err, work)) => {
@@ -197,6 +226,11 @@ pub(crate) fn submit(self: DLArc<Self>) -> BinderResult {
             }
         }
 
+        if process_inner.is_frozen {
+            process_inner.sync_recv = true;
+            return Err(BinderError::new_frozen());
+        }
+
         let res = if let Some(thread) = self.find_target_thread() {
             match thread.push_work(self) {
                 PushWorkRes::Ok => Ok(()),
@@ -241,6 +275,7 @@ fn do_work(self: DArc<Self>, thread: &Thread, writer: &mut UserSlicePtrWriter) -
                 let reply = Either::Right(BR_FAILED_REPLY);
                 self.from.deliver_reply(reply, &self);
             }
+            self.drop_outstanding_txn();
         });
         let files = if let Ok(list) = self.prepare_file_list() {
             list
@@ -301,6 +336,8 @@ fn do_work(self: DArc<Self>, thread: &Thread, writer: &mut UserSlicePtrWriter) -
         // It is now the user's responsibility to clear the allocation.
         alloc.keep_alive();
 
+        self.drop_outstanding_txn();
+
         // When this is not a reply and not a oneway transaction, update `current_transaction`. If
         // it's a reply, `current_transaction` has already been updated appropriately.
         if self.target_node.is_some() && tr_sec.transaction_data.flags & TF_ONE_WAY == 0 {
@@ -318,9 +355,18 @@ fn cancel(self: DArc<Self>) {
             let reply = Either::Right(BR_DEAD_REPLY);
             self.from.deliver_reply(reply, &self);
         }
+
+        self.drop_outstanding_txn();
     }
 
     fn should_sync_wakeup(&self) -> bool {
         self.flags & TF_ONE_WAY == 0
     }
 }
+
+#[pinned_drop]
+impl PinnedDrop for Transaction {
+    fn drop(self: Pin<&mut Self>) {
+        self.drop_outstanding_txn();
+    }
+}
diff --git a/rust/kernel/sync/condvar.rs b/rust/kernel/sync/condvar.rs
index 07cf6ba2e757..490fdf378e42 100644
--- a/rust/kernel/sync/condvar.rs
+++ b/rust/kernel/sync/condvar.rs
@@ -191,6 +191,16 @@ pub fn wait<T: ?Sized, B: Backend>(&self, guard: &mut Guard<'_, T, B>) -> bool {
         crate::current!().signal_pending()
     }
 
+    /// Releases the lock and waits for a notification in interruptible and freezable mode.
+    #[must_use = "wait returns if a signal is pending, so the caller must check the return value"]
+    pub fn wait_freezable<T: ?Sized, B: Backend>(&self, guard: &mut Guard<'_, T, B>) -> bool {
+        self.wait_internal(
+            bindings::TASK_INTERRUPTIBLE | bindings::TASK_FREEZABLE,
+            guard,
+        );
+        crate::current!().signal_pending()
+    }
+
     /// Releases the lock and waits for a notification in uninterruptible mode.
     ///
     /// Similar to [`CondVar::wait`], except that the wait is not interruptible. That is, the

-- 
2.42.0.820.g83a721a137-goog



* [PATCH RFC 16/20] rust_binder: add TF_UPDATE_TXN support
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (14 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 15/20] rust_binder: add process freezing Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 17/20] rust_binder: add oneway spam detection Alice Ryhl
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

When a process is frozen, incoming oneway transactions are held in a
queue until the process is unfrozen. If many oneway transactions are
sent, then the process could run out of space for them. This patch adds
support for the TF_UPDATE_TXN flag: when a oneway transaction carrying
the flag is sent to a frozen process, it replaces an earlier matching
transaction in the queue, so transactions of the same type do not build
up. This can be useful when only the most recent transaction is
necessary.
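
As a rough illustration of the replacement rule described above, here is
a minimal userspace sketch. It is not the driver code: `Txn`,
`submit_frozen`, the `payload` field and the integer `target_node` are
hypothetical stand-ins, a `VecDeque` stands in for the node's
`oneway_todo` list, and the flag values are assumed to match the binder
UAPI header.

```rust
use std::collections::VecDeque;

// Assumed to match the binder UAPI header; treat as illustrative values.
const TF_ONE_WAY: u32 = 0x01;
const TF_UPDATE_TXN: u32 = 0x40;

/// Hypothetical, simplified stand-in for the driver's `Transaction`.
#[derive(Debug, Clone, PartialEq)]
struct Txn {
    from_pid: i32,
    /// Stands in for the `Arc::ptr_eq` comparison on target nodes.
    target_node: usize,
    code: u32,
    flags: u32,
    payload: &'static str,
}

impl Txn {
    /// Mirrors the checks in the patch's `can_replace`: same sender, both
    /// transactions are oneway with TF_UPDATE_TXN set, and code, flags and
    /// target node are identical.
    fn can_replace(&self, old: &Txn) -> bool {
        let both = TF_ONE_WAY | TF_UPDATE_TXN;
        self.from_pid == old.from_pid
            && self.flags & old.flags & both == both
            && self.code == old.code
            && self.flags == old.flags
            && self.target_node == old.target_node
    }
}

/// Queues `new` for a frozen process. If `new` carries TF_UPDATE_TXN, the
/// first queued transaction it can replace is removed and returned so the
/// caller can drop it (the driver drops it after releasing the lock).
fn submit_frozen(queue: &mut VecDeque<Txn>, new: Txn) -> Option<Txn> {
    let mut outdated = None;
    if new.flags & TF_UPDATE_TXN != 0 {
        let idx = queue.iter().position(|old| new.can_replace(old));
        outdated = idx.and_then(|i| queue.remove(i));
    }
    queue.push_back(new);
    outdated
}
```

In this sketch, submitting two transactions with the same sender, code,
flags and target leaves only the second one queued, while a transaction
with a different code, or one without TF_UPDATE_TXN, is simply appended.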

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/defs.rs        |  8 +++++++-
 drivers/android/node.rs        | 19 +++++++++++++++++++
 drivers/android/transaction.rs | 26 ++++++++++++++++++++++++++
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs
index 30659bd26bff..b1a54f85b365 100644
--- a/drivers/android/defs.rs
+++ b/drivers/android/defs.rs
@@ -58,7 +58,13 @@ macro_rules! pub_no_prefix {
 pub(crate) const FLAT_BINDER_FLAG_ACCEPTS_FDS: u32 = kernel::bindings::FLAT_BINDER_FLAG_ACCEPTS_FDS;
 pub(crate) const FLAT_BINDER_FLAG_TXN_SECURITY_CTX: u32 =
     kernel::bindings::FLAT_BINDER_FLAG_TXN_SECURITY_CTX;
-pub_no_prefix!(transaction_flags_, TF_ONE_WAY, TF_ACCEPT_FDS, TF_CLEAR_BUF);
+pub_no_prefix!(
+    transaction_flags_,
+    TF_ONE_WAY,
+    TF_ACCEPT_FDS,
+    TF_CLEAR_BUF,
+    TF_UPDATE_TXN
+);
 
 pub(crate) use bindings::{
     BINDER_TYPE_BINDER, BINDER_TYPE_FD, BINDER_TYPE_FDA, BINDER_TYPE_HANDLE, BINDER_TYPE_PTR,
diff --git a/drivers/android/node.rs b/drivers/android/node.rs
index 7ed494bf9f7c..2c056bd7582e 100644
--- a/drivers/android/node.rs
+++ b/drivers/android/node.rs
@@ -298,6 +298,25 @@ pub(crate) fn pending_oneway_finished(&self) {
             }
         }
     }
+
+    /// Finds an outdated transaction that the given transaction can replace.
+    ///
+    /// If one is found, it is removed from the list and returned.
+    pub(crate) fn take_outdated_transaction(
+        &self,
+        new: &Transaction,
+        guard: &mut Guard<'_, ProcessInner, SpinLockBackend>,
+    ) -> Option<DLArc<Transaction>> {
+        let inner = self.inner.access_mut(guard);
+        let mut cursor_opt = inner.oneway_todo.cursor_front();
+        while let Some(cursor) = cursor_opt {
+            if new.can_replace(&cursor.current()) {
+                return Some(cursor.remove());
+            }
+            cursor_opt = cursor.next();
+        }
+        None
+    }
 }
 
 impl DeliverToRead for Node {
diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs
index 96f63684b1a3..7028c504ef8c 100644
--- a/drivers/android/transaction.rs
+++ b/drivers/android/transaction.rs
@@ -201,6 +201,9 @@ fn drop_outstanding_txn(&self) {
     ///
     /// Not used for replies.
     pub(crate) fn submit(self: DLArc<Self>) -> BinderResult {
+        // Defined before `process_inner` so that the destructor runs after releasing the lock.
+        let mut _t_outdated = None;
+
         let oneway = self.flags & TF_ONE_WAY != 0;
         let process = self.to.clone();
         let mut process_inner = process.inner.lock();
@@ -211,6 +214,10 @@ pub(crate) fn submit(self: DLArc<Self>) -> BinderResult {
             if let Some(target_node) = self.target_node.clone() {
                 if process_inner.is_frozen {
                     process_inner.async_recv = true;
+                    if self.flags & TF_UPDATE_TXN != 0 {
+                        _t_outdated =
+                            target_node.take_outdated_transaction(&self, &mut process_inner);
+                    }
                 }
                 match target_node.submit_oneway(self, &mut process_inner) {
                     Ok(()) => return Ok(()),
@@ -251,6 +258,25 @@ pub(crate) fn submit(self: DLArc<Self>) -> BinderResult {
         }
     }
 
+    /// Check whether one oneway transaction can supersede another.
+    pub(crate) fn can_replace(&self, old: &Transaction) -> bool {
+        if self.from.process.task.pid() != old.from.process.task.pid() {
+            return false;
+        }
+
+        if self.flags & old.flags & (TF_ONE_WAY | TF_UPDATE_TXN) != (TF_ONE_WAY | TF_UPDATE_TXN) {
+            return false;
+        }
+
+        let target_node_match = match (self.target_node.as_ref(), old.target_node.as_ref()) {
+            (None, None) => true,
+            (Some(tn1), Some(tn2)) => Arc::ptr_eq(tn1, tn2),
+            _ => false,
+        };
+
+        self.code == old.code && self.flags == old.flags && target_node_match
+    }
+
     fn prepare_file_list(&self) -> Result<TranslatedFds> {
         let mut alloc = self.allocation.lock().take().ok_or(ESRCH)?;
 

-- 
2.42.0.820.g83a721a137-goog



* [PATCH RFC 17/20] rust_binder: add oneway spam detection
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (15 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 16/20] rust_binder: add TF_UPDATE_TXN support Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 18/20] rust_binder: add binder_logs/state Alice Ryhl
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

From: Matt Gilbride <mattgilbride@google.com>

The idea is that once free async space drops below a certain threshold,
whoever is responsible for the low async space is likely to try to send
another async transaction, and at that point we can catch them in the
act.

This change allows servers to turn on oneway spam detection and return a
different binder reply when it is detected.
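
The heuristic can be sketched outside the kernel as follows. This is
only an illustration under stated assumptions: `Buffer` and the slice of
live buffers are hypothetical stand-ins for the allocator's tree, and
`new_oneway_space` is assumed to be the free async space that would
remain after the pending allocation. The thresholds are the ones used in
the diff below (10% of the total buffer size, 50 buffers, 25% of the
total buffer size).

```rust
/// Hypothetical, simplified record of one live buffer in the target's mapping.
struct Buffer {
    pid: i32,
    size: usize,
    is_oneway: bool,
}

/// Returns true if `calling_pid` holds more than 50 oneway buffers, or more
/// than 25% of the total buffer size (50% of the async space).
fn low_oneway_space(buffers: &[Buffer], total_size: usize, calling_pid: i32) -> bool {
    let mut total_alloc_size = 0;
    let mut num_buffers = 0;
    for b in buffers {
        if b.is_oneway && b.pid == calling_pid {
            total_alloc_size += b.size;
            num_buffers += 1;
        }
    }
    num_buffers > 50 || total_alloc_size > total_size / 4
}

/// Flags the caller as a suspected spammer only for oneway allocations made
/// while less than 10% of the total buffer size remains as async space.
/// `&&` short-circuits, so the scan runs only when space is actually low.
fn oneway_spam_detected(
    buffers: &[Buffer],
    total_size: usize,
    new_oneway_space: usize,
    is_oneway: bool,
    calling_pid: i32,
) -> bool {
    is_oneway
        && new_oneway_space < total_size / 10
        && low_oneway_space(buffers, total_size, calling_pid)
}
```

With a total size of 1000 bytes and 50 bytes of async space left, a
sender that already holds 300 bytes of oneway buffers would be flagged,
while another sender allocating at the same moment would not.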

Signed-off-by: Matt Gilbride <mattgilbride@google.com>
Co-developed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/allocation.rs  |  3 ++
 drivers/android/defs.rs        |  1 +
 drivers/android/process.rs     | 39 ++++++++++++++++++++++++--
 drivers/android/range_alloc.rs | 62 ++++++++++++++++++++++++++++++++++++++++--
 drivers/android/rust_binder.rs |  1 -
 drivers/android/thread.rs      | 11 ++++++--
 drivers/android/transaction.rs |  5 ++++
 rust/kernel/task.rs            |  2 +-
 8 files changed, 115 insertions(+), 9 deletions(-)

diff --git a/drivers/android/allocation.rs b/drivers/android/allocation.rs
index c7f44a54b79b..7b64e7fcce4d 100644
--- a/drivers/android/allocation.rs
+++ b/drivers/android/allocation.rs
@@ -49,6 +49,7 @@ pub(crate) struct Allocation {
     pub(crate) process: Arc<Process>,
     allocation_info: Option<AllocationInfo>,
     free_on_drop: bool,
+    pub(crate) oneway_spam_detected: bool,
 }
 
 impl Allocation {
@@ -58,6 +59,7 @@ pub(crate) fn new(
         size: usize,
         ptr: usize,
         pages: Arc<Vec<Pages<0>>>,
+        oneway_spam_detected: bool,
     ) -> Self {
         Self {
             process,
@@ -65,6 +67,7 @@ pub(crate) fn new(
             size,
             ptr,
             pages,
+            oneway_spam_detected,
             allocation_info: None,
             free_on_drop: true,
         }
diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs
index b1a54f85b365..e345b6ea45cc 100644
--- a/drivers/android/defs.rs
+++ b/drivers/android/defs.rs
@@ -24,6 +24,7 @@ macro_rules! pub_no_prefix {
     BR_NOOP,
     BR_SPAWN_LOOPER,
     BR_TRANSACTION_COMPLETE,
+    BR_ONEWAY_SPAM_SUSPECT,
     BR_OK,
     BR_ERROR,
     BR_INCREFS,
diff --git a/drivers/android/process.rs b/drivers/android/process.rs
index 44baf9e3f998..4ac5d09041a4 100644
--- a/drivers/android/process.rs
+++ b/drivers/android/process.rs
@@ -92,6 +92,8 @@ pub(crate) struct ProcessInner {
     pub(crate) sync_recv: bool,
     /// Process received async transactions since last frozen.
     pub(crate) async_recv: bool,
+    /// Check for oneway spam
+    oneway_spam_detection_enabled: bool,
 }
 
 impl ProcessInner {
@@ -113,6 +115,7 @@ fn new() -> Self {
             is_frozen: false,
             sync_recv: false,
             async_recv: false,
+            oneway_spam_detection_enabled: false,
         }
     }
 
@@ -658,17 +661,21 @@ pub(crate) fn buffer_alloc(
         self: &Arc<Self>,
         size: usize,
         is_oneway: bool,
+        from_pid: i32,
     ) -> BinderResult<Allocation> {
         let alloc = range_alloc::ReserveNewBox::try_new()?;
         let mut inner = self.inner.lock();
         let mapping = inner.mapping.as_mut().ok_or_else(BinderError::new_dead)?;
-        let offset = mapping.alloc.reserve_new(size, is_oneway, alloc)?;
+        let offset = mapping
+            .alloc
+            .reserve_new(size, is_oneway, from_pid, alloc)?;
         Ok(Allocation::new(
             self.clone(),
             offset,
             size,
             mapping.address + offset,
             mapping.pages.clone(),
+            mapping.alloc.oneway_spam_detected,
         ))
     }
 
@@ -677,7 +684,14 @@ pub(crate) fn buffer_get(self: &Arc<Self>, ptr: usize) -> Option<Allocation> {
         let mapping = inner.mapping.as_mut()?;
         let offset = ptr.checked_sub(mapping.address)?;
         let (size, odata) = mapping.alloc.reserve_existing(offset).ok()?;
-        let mut alloc = Allocation::new(self.clone(), offset, size, ptr, mapping.pages.clone());
+        let mut alloc = Allocation::new(
+            self.clone(),
+            offset,
+            size,
+            ptr,
+            mapping.pages.clone(),
+            mapping.alloc.oneway_spam_detected,
+        );
         if let Some(data) = odata {
             alloc.set_info(data);
         }
@@ -762,6 +776,14 @@ fn set_max_threads(&self, max: u32) {
         self.inner.lock().max_threads = max;
     }
 
+    fn set_oneway_spam_detection_enabled(&self, enabled: u32) {
+        self.inner.lock().oneway_spam_detection_enabled = enabled != 0;
+    }
+
+    pub(crate) fn is_oneway_spam_detection_enabled(&self) -> bool {
+        self.inner.lock().oneway_spam_detection_enabled
+    }
+
     fn get_node_debug_info(&self, data: UserSlicePtr) -> Result {
         let (mut reader, mut writer) = data.reader_writer();
 
@@ -948,9 +970,17 @@ fn deferred_release(self: Arc<Self>) {
         if let Some(mut mapping) = omapping {
             let address = mapping.address;
             let pages = mapping.pages.clone();
+            let oneway_spam_detected = mapping.alloc.oneway_spam_detected;
             mapping.alloc.take_for_each(|offset, size, odata| {
                 let ptr = offset + address;
-                let mut alloc = Allocation::new(self.clone(), offset, size, ptr, pages.clone());
+                let mut alloc = Allocation::new(
+                    self.clone(),
+                    offset,
+                    size,
+                    ptr,
+                    pages.clone(),
+                    oneway_spam_detected,
+                );
                 if let Some(data) = odata {
                     alloc.set_info(data);
                 }
@@ -1144,6 +1174,9 @@ fn write(
             bindings::BINDER_SET_CONTEXT_MGR_EXT => {
                 this.set_as_manager(Some(reader.read()?), &thread)?
             }
+            bindings::BINDER_ENABLE_ONEWAY_SPAM_DETECTION => {
+                this.set_oneway_spam_detection_enabled(reader.read()?)
+            }
             bindings::BINDER_FREEZE => ioctl_freeze(reader)?,
             _ => return Err(EINVAL),
         }
diff --git a/drivers/android/range_alloc.rs b/drivers/android/range_alloc.rs
index e757129613cf..c1d47115e54d 100644
--- a/drivers/android/range_alloc.rs
+++ b/drivers/android/range_alloc.rs
@@ -3,6 +3,7 @@
 use kernel::{
     prelude::*,
     rbtree::{RBTree, RBTreeNode, RBTreeNodeReservation},
+    task::Pid,
 };
 
 /// Keeps track of allocations in a process' mmap.
@@ -13,7 +14,9 @@
 pub(crate) struct RangeAllocator<T> {
     tree: RBTree<usize, Descriptor<T>>,
     free_tree: RBTree<FreeKey, ()>,
+    size: usize,
     free_oneway_space: usize,
+    pub(crate) oneway_spam_detected: bool,
 }
 
 impl<T> RangeAllocator<T> {
@@ -26,6 +29,8 @@ pub(crate) fn new(size: usize) -> Result<Self> {
             free_oneway_space: size / 2,
             tree,
             free_tree,
+            oneway_spam_detected: false,
+            size,
         })
     }
 
@@ -40,6 +45,7 @@ pub(crate) fn reserve_new(
         &mut self,
         size: usize,
         is_oneway: bool,
+        pid: Pid,
         alloc: ReserveNewBox<T>,
     ) -> Result<usize> {
         // Compute new value of free_oneway_space, which is set only on success.
@@ -52,6 +58,15 @@ pub(crate) fn reserve_new(
             self.free_oneway_space
         };
 
+        // Start detecting spammers once we have less than 20%
+        // of async space left (which is less than 10% of total
+        // buffer size).
+        //
+        // (This will short-circuit, so `low_oneway_space` is
+        // only called when necessary.)
+        self.oneway_spam_detected =
+            is_oneway && new_oneway_space < self.size / 10 && self.low_oneway_space(pid);
+
         let (found_size, found_off, tree_node, free_tree_node) = match self.find_best_match(size) {
             None => {
                 pr_warn!("ENOSPC from range_alloc.reserve_new - size: {}", size);
@@ -65,7 +80,7 @@ pub(crate) fn reserve_new(
                 let new_desc = Descriptor::new(found_offset + size, found_size - size);
                 let (tree_node, free_tree_node, desc_node_res) = alloc.initialize(new_desc);
 
-                desc.state = Some(DescriptorState::new(is_oneway, desc_node_res));
+                desc.state = Some(DescriptorState::new(is_oneway, pid, desc_node_res));
                 desc.size = size;
 
                 (found_size, found_offset, tree_node, free_tree_node)
@@ -224,6 +239,30 @@ pub(crate) fn take_for_each<F: Fn(usize, usize, Option<T>)>(&mut self, callback:
             }
         }
     }
+
+    /// Find the amount and size of buffers allocated by the current caller.
+    ///
+    /// The idea is that once we cross the threshold, whoever is responsible
+    /// for the low async space is likely to try to send another async transaction,
+    /// and at some point we'll catch them in the act.  This is more efficient
+    /// than keeping a map per pid.
+    fn low_oneway_space(&self, calling_pid: Pid) -> bool {
+        let mut total_alloc_size = 0;
+        let mut num_buffers = 0;
+        for (_, desc) in self.tree.iter() {
+            if let Some(state) = &desc.state {
+                if state.is_oneway() && state.pid() == calling_pid {
+                    total_alloc_size += desc.size;
+                    num_buffers += 1;
+                }
+            }
+        }
+
+        // Warn if this pid has more than 50 transactions, or more than 50% of
+        // async space (which is 25% of total buffer size). Oneway spam is only
+        // detected when the threshold is exceeded.
+        num_buffers > 50 || total_alloc_size > self.size / 4
+    }
 }
 
 struct Descriptor<T> {
@@ -257,16 +296,32 @@ enum DescriptorState<T> {
 }
 
 impl<T> DescriptorState<T> {
-    fn new(is_oneway: bool, free_res: FreeNodeRes) -> Self {
+    fn new(is_oneway: bool, pid: Pid, free_res: FreeNodeRes) -> Self {
         DescriptorState::Reserved(Reservation {
             is_oneway,
+            pid,
             free_res,
         })
     }
+
+    fn pid(&self) -> Pid {
+        match self {
+            DescriptorState::Reserved(inner) => inner.pid,
+            DescriptorState::Allocated(inner) => inner.pid,
+        }
+    }
+
+    fn is_oneway(&self) -> bool {
+        match self {
+            DescriptorState::Reserved(inner) => inner.is_oneway,
+            DescriptorState::Allocated(inner) => inner.is_oneway,
+        }
+    }
 }
 
 struct Reservation {
     is_oneway: bool,
+    pid: Pid,
     free_res: FreeNodeRes,
 }
 
@@ -275,6 +330,7 @@ fn allocate<T>(self, data: Option<T>) -> Allocation<T> {
         Allocation {
             data,
             is_oneway: self.is_oneway,
+            pid: self.pid,
             free_res: self.free_res,
         }
     }
@@ -282,6 +338,7 @@ fn allocate<T>(self, data: Option<T>) -> Allocation<T> {
 
 struct Allocation<T> {
     is_oneway: bool,
+    pid: Pid,
     free_res: FreeNodeRes,
     data: Option<T>,
 }
@@ -291,6 +348,7 @@ fn deallocate(self) -> (Reservation, Option<T>) {
         (
             Reservation {
                 is_oneway: self.is_oneway,
+                pid: self.pid,
                 free_res: self.free_res,
             },
             self.data,
diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs
index 04477ff7e5a0..adf542872f36 100644
--- a/drivers/android/rust_binder.rs
+++ b/drivers/android/rust_binder.rs
@@ -107,7 +107,6 @@ fn new(val: impl PinInit<T>) -> impl PinInit<Self> {
         })
     }
 
-    #[allow(dead_code)]
     fn arc_try_new(val: T) -> Result<DLArc<T>, alloc::alloc::AllocError> {
         ListArc::pin_init(pin_init!(Self {
             links <- ListLinksSelfPtr::new(),
diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs
index 0238c15604f6..414ffb1387a0 100644
--- a/drivers/android/thread.rs
+++ b/drivers/android/thread.rs
@@ -909,7 +909,7 @@ pub(crate) fn copy_transaction_data(
             size_of::<usize>(),
         );
         let secctx_off = adata_size + aoffsets_size + abuffers_size;
-        let mut alloc = match to_process.buffer_alloc(len, is_oneway) {
+        let mut alloc = match to_process.buffer_alloc(len, is_oneway, self.process.task.pid()) {
             Ok(alloc) => alloc,
             Err(err) => {
                 pr_warn!(
@@ -1191,8 +1191,15 @@ fn oneway_transaction_inner(self: &Arc<Self>, tr: &BinderTransactionDataSg) -> B
         let handle = unsafe { tr.transaction_data.target.handle };
         let node_ref = self.process.get_transaction_node(handle)?;
         security::binder_transaction(&self.process.cred, &node_ref.node.owner.cred)?;
-        let list_completion = DTRWrap::arc_try_new(DeliverCode::new(BR_TRANSACTION_COMPLETE))?;
         let transaction = Transaction::new(node_ref, None, self, tr)?;
+        let code = if self.process.is_oneway_spam_detection_enabled()
+            && transaction.oneway_spam_detected
+        {
+            BR_ONEWAY_SPAM_SUSPECT
+        } else {
+            BR_TRANSACTION_COMPLETE
+        };
+        let list_completion = DTRWrap::arc_try_new(DeliverCode::new(code))?;
         let completion = list_completion.clone_arc();
         self.inner.lock().push_work(list_completion);
         match transaction.submit() {
diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs
index 7028c504ef8c..84b9fe58fe3e 100644
--- a/drivers/android/transaction.rs
+++ b/drivers/android/transaction.rs
@@ -38,6 +38,7 @@ pub(crate) struct Transaction {
     data_address: usize,
     sender_euid: Kuid,
     txn_security_ctx_off: Option<usize>,
+    pub(crate) oneway_spam_detected: bool,
 }
 
 kernel::list::impl_list_arc_safe! {
@@ -70,6 +71,7 @@ pub(crate) fn new(
                 return Err(err);
             }
         };
+        let oneway_spam_detected = alloc.oneway_spam_detected;
         if trd.flags & TF_ONE_WAY != 0 {
             if stack_next.is_some() {
                 pr_warn!("Oneway transaction should not be in a transaction stack.");
@@ -98,6 +100,7 @@ pub(crate) fn new(
             allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"),
             is_outstanding: AtomicBool::new(false),
             txn_security_ctx_off,
+            oneway_spam_detected,
         }))?)
     }
 
@@ -115,6 +118,7 @@ pub(crate) fn new_reply(
                 return Err(err);
             }
         };
+        let oneway_spam_detected = alloc.oneway_spam_detected;
         if trd.flags & TF_CLEAR_BUF != 0 {
             alloc.set_info_clear_on_drop();
         }
@@ -132,6 +136,7 @@ pub(crate) fn new_reply(
             allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"),
             is_outstanding: AtomicBool::new(false),
             txn_security_ctx_off: None,
+            oneway_spam_detected,
         }))?)
     }
 
diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs
index 1a27b968a907..81649f12758b 100644
--- a/rust/kernel/task.rs
+++ b/rust/kernel/task.rs
@@ -81,7 +81,7 @@ unsafe impl Send for Task {}
 unsafe impl Sync for Task {}
 
 /// The type of process identifiers (PIDs).
-type Pid = bindings::pid_t;
+pub type Pid = bindings::pid_t;
 
 /// The type of user identifiers (UIDs).
 #[derive(Copy, Clone)]

-- 
2.42.0.820.g83a721a137-goog



* [PATCH RFC 18/20] rust_binder: add binder_logs/state
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (16 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 17/20] rust_binder: add oneway spam detection Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 19/20] rust_binder: add vma shrinker Alice Ryhl
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

The binderfs directory has four files intended for debugging the driver.
This patch implements the state file so that you can use it to view the
current state of the driver.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/node.rs         | 37 ++++++++++++++++++++
 drivers/android/process.rs      | 75 +++++++++++++++++++++++++++++++++++++++++
 drivers/android/rust_binder.rs  | 23 ++++++++++++-
 drivers/android/thread.rs       | 68 +++++++++++++++++++++++++++++++++++++
 drivers/android/transaction.rs  | 25 ++++++++++++++
 rust/bindings/bindings_helper.h |  1 +
 rust/kernel/lib.rs              |  1 +
 rust/kernel/seq_file.rs         | 47 ++++++++++++++++++++++++++
 8 files changed, 276 insertions(+), 1 deletion(-)

diff --git a/drivers/android/node.rs b/drivers/android/node.rs
index 2c056bd7582e..3acad1c2b963 100644
--- a/drivers/android/node.rs
+++ b/drivers/android/node.rs
@@ -7,6 +7,8 @@
         TryNewListArc,
     },
     prelude::*,
+    seq_file::SeqFile,
+    seq_print,
     sync::lock::{spinlock::SpinLockBackend, Guard},
     sync::{Arc, LockedBy, SpinLock},
     user_ptr::UserSlicePtrWriter,
@@ -111,6 +113,41 @@ pub(crate) fn new(
         })
     }
 
+    #[inline(never)]
+    pub(crate) fn debug_print(&self, m: &mut SeqFile) -> Result<()> {
+        let weak;
+        let strong;
+        let has_weak;
+        let has_strong;
+        let active_inc_refs;
+        {
+            let mut guard = self.owner.inner.lock();
+            let inner = self.inner.access_mut(&mut guard);
+            weak = inner.weak.count;
+            has_weak = inner.weak.has_count;
+            strong = inner.strong.count;
+            has_strong = inner.strong.has_count;
+            active_inc_refs = inner.active_inc_refs;
+        }
+
+        let has_weak = if has_weak { "Y" } else { "N" };
+        let has_strong = if has_strong { "Y" } else { "N" };
+
+        seq_print!(
+            m,
+            "node gid:{},ptr:{:#x},cookie:{:#x}: strong{}{} weak{}{} active{}\n",
+            self.global_id,
+            self.ptr,
+            self.cookie,
+            strong,
+            has_strong,
+            weak,
+            has_weak,
+            active_inc_refs
+        );
+        Ok(())
+    }
+
     pub(crate) fn get_id(&self) -> (usize, usize) {
         (self.ptr, self.cookie)
     }
diff --git a/drivers/android/process.rs b/drivers/android/process.rs
index 4ac5d09041a4..b5e44f9f2a14 100644
--- a/drivers/android/process.rs
+++ b/drivers/android/process.rs
@@ -20,6 +20,8 @@
     pages::Pages,
     prelude::*,
     rbtree::RBTree,
+    seq_file::SeqFile,
+    seq_print,
     sync::{
         lock::Guard, Arc, ArcBorrow, CondVar, CondVarTimeoutResult, Mutex, SpinLock, UniqueArc,
     },
@@ -405,6 +407,79 @@ fn new(ctx: Arc<Context>, cred: ARef<Credential>) -> Result<Arc<Self>> {
         Ok(process)
     }
 
+    #[inline(never)]
+    pub(crate) fn debug_print(&self, m: &mut SeqFile) -> Result<()> {
+        seq_print!(m, "pid: {}\n", self.task.pid_in_current_ns());
+
+        let is_manager;
+        let started_threads;
+        let has_proc_work;
+        let mut ready_threads = Vec::new();
+        let mut all_threads = Vec::new();
+        let mut all_nodes = Vec::new();
+        loop {
+            let inner = self.inner.lock();
+            let ready_threads_len = inner.ready_threads.iter().count();
+            let all_threads_len = inner.threads.values().count();
+            let all_nodes_len = inner.nodes.values().count();
+
+            let resize_ready_threads = ready_threads_len > ready_threads.capacity();
+            let resize_all_threads = all_threads_len > all_threads.capacity();
+            let resize_all_nodes = all_nodes_len > all_nodes.capacity();
+            if resize_ready_threads || resize_all_threads || resize_all_nodes {
+                drop(inner);
+                ready_threads.try_reserve(ready_threads_len)?;
+                all_threads.try_reserve(all_threads_len)?;
+                all_nodes.try_reserve(all_nodes_len)?;
+                continue;
+            }
+
+            is_manager = inner.is_manager;
+            started_threads = inner.started_thread_count;
+            has_proc_work = !inner.work.is_empty();
+
+            for thread in &inner.ready_threads {
+                assert!(ready_threads.len() < ready_threads.capacity());
+                ready_threads.try_push(thread.id)?;
+            }
+
+            for thread in inner.threads.values() {
+                assert!(all_threads.len() < all_threads.capacity());
+                all_threads.try_push(thread.clone())?;
+            }
+
+            for node in inner.nodes.values() {
+                assert!(all_nodes.len() < all_nodes.capacity());
+                all_nodes.try_push(node.clone())?;
+            }
+
+            break;
+        }
+
+        seq_print!(m, "is_manager: {}\n", is_manager);
+        seq_print!(m, "started_threads: {}\n", started_threads);
+        seq_print!(m, "has_proc_work: {}\n", has_proc_work);
+        if ready_threads.is_empty() {
+            seq_print!(m, "ready_thread_ids: none\n");
+        } else {
+            seq_print!(m, "ready_thread_ids:");
+            for thread_id in ready_threads {
+                seq_print!(m, " {}", thread_id);
+            }
+            seq_print!(m, "\n");
+        }
+
+        for node in all_nodes {
+            node.debug_print(m)?;
+        }
+
+        seq_print!(m, "all threads:\n");
+        for thread in all_threads {
+            thread.debug_print(m);
+        }
+        Ok(())
+    }
+
     /// Attempts to fetch a work item from the process queue.
     pub(crate) fn get_work(&self) -> Option<DLArc<dyn DeliverToRead>> {
         self.inner.lock().work.pop_front()
diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs
index adf542872f36..a1c95a1609d5 100644
--- a/drivers/android/rust_binder.rs
+++ b/drivers/android/rust_binder.rs
@@ -10,6 +10,8 @@
         HasListLinks, ListArc, ListArcSafe, ListItem, ListLinks, ListLinksSelfPtr, TryNewListArc,
     },
     prelude::*,
+    seq_file::SeqFile,
+    seq_print,
     sync::Arc,
     types::ForeignOwnable,
     user_ptr::UserSlicePtrWriter,
@@ -347,7 +349,13 @@ unsafe impl<T> Sync for AssertSync<T> {}
 }
 
 #[no_mangle]
-unsafe extern "C" fn rust_binder_state_show(_: *mut seq_file) -> core::ffi::c_int {
+unsafe extern "C" fn rust_binder_state_show(ptr: *mut seq_file) -> core::ffi::c_int {
+    // SAFETY: The caller ensures that the pointer is valid and exclusive for the duration in which
+    // this method is called.
+    let m = unsafe { SeqFile::from_raw(ptr) };
+    if let Err(err) = rust_binder_state_show_impl(m) {
+        seq_print!(m, "failed to generate state: {:?}\n", err);
+    }
     0
 }
 
@@ -360,3 +368,16 @@ unsafe impl<T> Sync for AssertSync<T> {}
 unsafe extern "C" fn rust_binder_transaction_log_show(_: *mut seq_file) -> core::ffi::c_int {
     0
 }
+
+fn rust_binder_state_show_impl(m: &mut SeqFile) -> Result<()> {
+    let contexts = context::get_all_contexts()?;
+    for ctx in contexts {
+        let procs = ctx.get_all_procs()?;
+        seq_print!(m, "context {}: ({} processes)\n", &*ctx.name, procs.len());
+        for proc in procs {
+            proc.debug_print(m)?;
+            seq_print!(m, "\n");
+        }
+    }
+    Ok(())
+}
diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs
index 414ffb1387a0..d5a56119cc19 100644
--- a/drivers/android/thread.rs
+++ b/drivers/android/thread.rs
@@ -15,6 +15,8 @@
     },
     prelude::*,
     security,
+    seq_file::SeqFile,
+    seq_print,
     sync::{Arc, SpinLock},
     types::Either,
     user_ptr::{UserSlicePtr, UserSlicePtrWriter},
@@ -447,6 +449,72 @@ pub(crate) fn new(id: i32, process: Arc<Process>) -> Result<Arc<Self>> {
         }))
     }
 
+    #[inline(never)]
+    pub(crate) fn debug_print(&self, m: &mut SeqFile) {
+        let looper_flags;
+        let looper_need_return;
+        let is_dead;
+        let has_work;
+        let process_work_list;
+        let current_transaction;
+        {
+            let inner = self.inner.lock();
+            looper_flags = inner.looper_flags;
+            looper_need_return = inner.looper_need_return;
+            is_dead = inner.is_dead;
+            has_work = !inner.work_list.is_empty();
+            process_work_list = inner.process_work_list;
+            current_transaction = inner.current_transaction.clone();
+        }
+        seq_print!(m, "  tid: {}\n", self.id);
+        seq_print!(m, "  state:");
+        if is_dead {
+            seq_print!(m, " dead");
+        }
+        if looper_need_return {
+            seq_print!(m, " pending_flush_wakeup");
+        }
+        if has_work && process_work_list {
+            seq_print!(m, " has_work");
+        }
+        if has_work && !process_work_list {
+            seq_print!(m, " has_deferred_work");
+        }
+        if looper_flags & LOOPER_REGISTERED != 0 {
+            seq_print!(m, " registered");
+        }
+        if looper_flags & LOOPER_ENTERED != 0 {
+            seq_print!(m, " entered");
+        }
+        if looper_flags & LOOPER_EXITED != 0 {
+            seq_print!(m, " exited");
+        }
+        if looper_flags & LOOPER_INVALID != 0 {
+            seq_print!(m, " invalid");
+        }
+        if looper_flags & LOOPER_WAITING != 0 {
+            if looper_flags & LOOPER_WAITING_PROC != 0 {
+                seq_print!(m, " in_get_work");
+            } else {
+                seq_print!(m, " in_get_work_local");
+            }
+        }
+        if looper_flags & LOOPER_POLL != 0 {
+            seq_print!(m, " poll_is_initialized");
+        }
+        seq_print!(m, "\n");
+        if current_transaction.is_some() {
+            seq_print!(m, "  tstack:");
+            let mut t = current_transaction;
+            while let Some(tt) = t.as_ref() {
+                seq_print!(m, " ");
+                tt.debug_print(m);
+                t = tt.clone_next();
+            }
+            seq_print!(m, "\n");
+        }
+    }
+
     pub(crate) fn get_extended_error(&self, data: UserSlicePtr) -> Result {
         let mut writer = data.writer();
         let ee = self.inner.lock().extended_error;
diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs
index 84b9fe58fe3e..30c411ab0778 100644
--- a/drivers/android/transaction.rs
+++ b/drivers/android/transaction.rs
@@ -5,6 +5,8 @@
     io_buffer::IoBufferWriter,
     list::ListArcSafe,
     prelude::*,
+    seq_file::SeqFile,
+    seq_print,
     sync::{Arc, SpinLock},
     task::Kuid,
     types::{Either, ScopeGuard},
@@ -140,6 +142,29 @@ pub(crate) fn new_reply(
         }))?)
     }
 
+    #[inline(never)]
+    pub(crate) fn debug_print(&self, m: &mut SeqFile) {
+        let from_pid = self.from.process.task.pid_in_current_ns();
+        let to_pid = self.to.task.pid_in_current_ns();
+        let from_tid = self.from.id;
+        match self.target_node.as_ref() {
+            Some(target_node) => {
+                let node_id = target_node.global_id;
+                seq_print!(
+                    m,
+                    "{}(tid:{})->{}(nid:{})",
+                    from_pid,
+                    from_tid,
+                    to_pid,
+                    node_id
+                );
+            }
+            None => {
+                seq_print!(m, "{}(tid:{})->{}(nid:_)", from_pid, from_tid, to_pid);
+            }
+        }
+    }
+
     /// Determines if the transaction is stacked on top of the given transaction.
     pub(crate) fn is_stacked_on(&self, onext: &Option<DArc<Self>>) -> bool {
         match (&self.stack_next, onext) {
diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index ffeea312f2fd..b2d60b4a9df6 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -15,6 +15,7 @@
 #include <linux/pid_namespace.h>
 #include <linux/poll.h>
 #include <linux/security.h>
+#include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/refcount.h>
 #include <linux/rust_binder.h>
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index f4d58da9202e..d46187783464 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -48,6 +48,7 @@
 pub mod print;
 pub mod rbtree;
 pub mod security;
+pub mod seq_file;
 mod static_assert;
 #[doc(hidden)]
 pub mod std_vendor;
diff --git a/rust/kernel/seq_file.rs b/rust/kernel/seq_file.rs
new file mode 100644
index 000000000000..997d527b2e9e
--- /dev/null
+++ b/rust/kernel/seq_file.rs
@@ -0,0 +1,47 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Seq file bindings.
+//!
+//! C header: [`include/linux/seq_file.h`](../../../../include/linux/seq_file.h)
+
+use crate::{bindings, c_str, types::Opaque};
+
+/// A helper for implementing special files, where the complete contents can be generated on each
+/// access.
+pub struct SeqFile(Opaque<bindings::seq_file>);
+
+impl SeqFile {
+    /// Creates a new [`SeqFile`] from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// The caller must ensure that, for the duration of 'a, the pointer points at a valid
+    /// `seq_file` and that it will not be accessed via anything other than the returned reference.
+    pub unsafe fn from_raw<'a>(ptr: *mut bindings::seq_file) -> &'a mut SeqFile {
+        // SAFETY: The safety requirements guarantee the validity of the dereference, while the
+        // `SeqFile` type being transparent makes the cast ok.
+        unsafe { &mut *ptr.cast() }
+    }
+
+    /// Used by the [`seq_print`] macro.
+    ///
+    /// [`seq_print`]: crate::seq_print
+    pub fn call_printf(&mut self, args: core::fmt::Arguments<'_>) {
+        // SAFETY: Passing a void pointer to `Arguments` is valid for `%pA`.
+        unsafe {
+            bindings::seq_printf(
+                self.0.get(),
+                c_str!("%pA").as_char_ptr(),
+                &args as *const _ as *const core::ffi::c_void,
+            );
+        }
+    }
+}
+
+/// Use for writing to a [`SeqFile`] with the ordinary Rust formatting syntax.
+#[macro_export]
+macro_rules! seq_print {
+    ($m:expr, $($arg:tt)+) => (
+        $m.call_printf(format_args!($($arg)+))
+    );
+}

-- 
2.42.0.820.g83a721a137-goog


* [PATCH RFC 19/20] rust_binder: add vma shrinker
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (17 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 18/20] rust_binder: add binder_logs/state Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:01 ` [PATCH RFC 20/20] binder: delete the C implementation Alice Ryhl
  2023-11-01 18:34 ` [PATCH RFC 00/20] Setting up Binder for the future Carlos Llamas
  20 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

When the system is under memory pressure, we want the driver to release
unused pages. We do this by registering a memory shrinker with the
kernel.

The data for incoming transactions is stored in an mmap'ed region.
Previously in this patch series, we just allocated all of the pages in
that region immediately. With this patch, we do not allocate the pages
until we need them.

Furthermore, when we no longer need a page, we mark it as "available"
using an lru list. If the system is under memory pressure, this allows
the shrinker to free that page by removing it from the lru list. If we
need to use the page again before the shrinker frees it, then we just
remove it from the lru list, and we don't need to reallocate the page.
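
To make the page bookkeeping concrete, here is a small standalone
sketch of the index arithmetic this patch uses in `buffer_alloc` and
`FreedRange::interior_pages`. The function names `touched_pages` and
`interior_pages` and the 4096-byte `PAGE_SIZE` are assumptions made for
illustration; the kernel takes `PAGE_SIZE` from the bindings.

```rust
/// Assumed page size for this illustration only.
pub const PAGE_SIZE: usize = 4096;

/// Half-open range of page indices a buffer touches. These pages must be
/// marked as in use before the buffer is handed out.
pub fn touched_pages(offset: usize, size: usize) -> (usize, usize) {
    // Round the start down and the end up.
    (offset / PAGE_SIZE, (offset + size + (PAGE_SIZE - 1)) / PAGE_SIZE)
}

/// Half-open range of page indices that lie entirely inside the buffer.
/// Only these are guaranteed to become free when the buffer is released
/// without looking at its neighbours. A start that is not below the end
/// means no page is fully covered.
pub fn interior_pages(offset: usize, size: usize) -> (usize, usize) {
    // Round the start up and the end down.
    ((offset + (PAGE_SIZE - 1)) / PAGE_SIZE, (offset + size) / PAGE_SIZE)
}
```

For a buffer at offset 100 with size 8192, `touched_pages` gives pages
0..3 while `interior_pages` gives only page 1..2. Pages 0 and 2 are
shared with neighbouring buffers, which is why `reservation_abort` in
this patch widens the freed range only when the adjacent free region is
large enough to cover the rest of the page.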

The page range abstraction is split into a fast path and slow path. The
slow path is only used when a page is not allocated, which should only
happen on first use or after the shrinker has freed the page under
memory pressure.
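
The page lifecycle described above can be modelled in userspace as a
small state machine. This is only a sketch under simplifying
assumptions: the type and field names are invented here, there is no
locking, and a plain `Vec` stands in for both the page array and the
lru list. It is not the implementation in `page_range.rs`.

```rust
/// State of one page slot in the mmap'ed region (illustrative names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PageState {
    /// No backing page; the next use takes the slow path.
    Unallocated,
    /// A backing page exists and an allocation is using it.
    InUse,
    /// A backing page exists but is idle, i.e. it sits on the lru list.
    Available,
}

pub struct PageRange {
    pub pages: Vec<PageState>,
    /// Counts how many times the slow path had to allocate a page.
    pub slow_path_allocs: usize,
}

impl PageRange {
    pub fn new(page_count: usize) -> Self {
        Self {
            pages: vec![PageState::Unallocated; page_count],
            slow_path_allocs: 0,
        }
    }

    /// Marks pages `start..end` as in use, allocating only missing ones.
    pub fn use_range(&mut self, start: usize, end: usize) {
        for state in &mut self.pages[start..end] {
            if *state == PageState::Unallocated {
                // Slow path: the page has to be allocated first.
                self.slow_path_allocs += 1;
            }
            // Fast path: an existing page is simply taken off the lru list.
            *state = PageState::InUse;
        }
    }

    /// Marks pages `start..end` as idle so that the shrinker may free them.
    pub fn stop_using_range(&mut self, start: usize, end: usize) {
        for state in &mut self.pages[start..end] {
            if *state == PageState::InUse {
                *state = PageState::Available;
            }
        }
    }

    /// Frees up to `max` idle pages and returns how many were freed.
    pub fn shrink(&mut self, max: usize) -> usize {
        let mut freed = 0;
        for state in &mut self.pages {
            if freed == max {
                break;
            }
            if *state == PageState::Available {
                *state = PageState::Unallocated;
                freed += 1;
            }
        }
        freed
    }
}
```

In this model, reusing a page before `shrink` runs never bumps
`slow_path_allocs`, which mirrors the point that an idle page can be
reclaimed from the lru list without reallocating it.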

I'm not yet completely happy with this implementation. Specifically, I
would like to improve the robustness of the unsafe code found in
`allocation.rs`.

The slow-path/fast-path implementation in `page_range.rs` is different
from C Binder's current implementation, and was suggested to me by
Carlos.

Suggested-by: Carlos Llamas <cmllamas@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/allocation.rs     |  80 ++---
 drivers/android/process.rs        | 129 +++----
 drivers/android/range_alloc.rs    |  44 ++-
 drivers/android/rust_binder.rs    |   6 +
 rust/bindings/bindings_helper.h   |   2 +
 rust/helpers.c                    |  20 ++
 rust/kernel/lib.rs                |   1 +
 rust/kernel/page_range.rs         | 715 ++++++++++++++++++++++++++++++++++++++
 rust/kernel/sync/lock.rs          |  24 ++
 rust/kernel/sync/lock/mutex.rs    |  10 +
 rust/kernel/sync/lock/spinlock.rs |  10 +
 11 files changed, 931 insertions(+), 110 deletions(-)

diff --git a/drivers/android/allocation.rs b/drivers/android/allocation.rs
index 7b64e7fcce4d..4a9f8e7f2de3 100644
--- a/drivers/android/allocation.rs
+++ b/drivers/android/allocation.rs
@@ -6,7 +6,6 @@
     bindings,
     file::{DeferredFdCloser, File, FileDescriptorReservation},
     io_buffer::{IoBufferReader, ReadableFromBytes, WritableToBytes},
-    pages::Pages,
     prelude::*,
     sync::Arc,
     types::ARef,
@@ -41,11 +40,15 @@ pub(crate) struct AllocationInfo {
 /// Represents an allocation that the kernel is currently using.
 ///
 /// When allocations are idle, the range allocator holds the data related to them.
+///
+/// # Invariants
+///
+/// This allocation corresponds to an allocation in the range allocator, so the relevant pages are
+/// marked in use in the page range.
 pub(crate) struct Allocation {
     pub(crate) offset: usize,
     size: usize,
     pub(crate) ptr: usize,
-    pages: Arc<Vec<Pages<0>>>,
     pub(crate) process: Arc<Process>,
     allocation_info: Option<AllocationInfo>,
     free_on_drop: bool,
@@ -58,7 +61,6 @@ pub(crate) fn new(
         offset: usize,
         size: usize,
         ptr: usize,
-        pages: Arc<Vec<Pages<0>>>,
         oneway_spam_detected: bool,
     ) -> Self {
         Self {
@@ -66,30 +68,17 @@ pub(crate) fn new(
             offset,
             size,
             ptr,
-            pages,
             oneway_spam_detected,
             allocation_info: None,
             free_on_drop: true,
         }
     }
 
-    fn iterate<T>(&self, mut offset: usize, mut size: usize, mut cb: T) -> Result
-    where
-        T: FnMut(&Pages<0>, usize, usize) -> Result,
-    {
-        // Check that the request is within the buffer.
-        if offset.checked_add(size).ok_or(EINVAL)? > self.size {
-            return Err(EINVAL);
-        }
-        offset += self.offset;
-        let mut page_index = offset >> bindings::PAGE_SHIFT;
-        offset &= (1 << bindings::PAGE_SHIFT) - 1;
-        while size > 0 {
-            let available = core::cmp::min(size, (1 << bindings::PAGE_SHIFT) - offset);
-            cb(&self.pages[page_index], offset, available)?;
-            size -= available;
-            page_index += 1;
-            offset = 0;
+    fn size_check(&self, offset: usize, size: usize) -> Result {
+        let overflow_fail = offset.checked_add(size).is_none();
+        let cmp_size_fail = offset.wrapping_add(size) > self.size;
+        if overflow_fail || cmp_size_fail {
+            return Err(EFAULT);
         }
         Ok(())
     }
@@ -100,42 +89,37 @@ pub(crate) fn copy_into(
         offset: usize,
         size: usize,
     ) -> Result {
-        self.iterate(offset, size, |page, offset, to_copy| {
-            page.copy_into_page(reader, offset, to_copy)
-        })
+        self.size_check(offset, size)?;
+
+        // SAFETY: While this object exists, the range allocator will keep the range allocated, and
+        // in turn, the pages will be marked as in use.
+        unsafe {
+            self.process
+                .pages
+                .copy_into(reader, self.offset + offset, size)
+        }
     }
 
     pub(crate) fn read<T: ReadableFromBytes>(&self, offset: usize) -> Result<T> {
-        let mut out = MaybeUninit::<T>::uninit();
-        let mut out_offset = 0;
-        self.iterate(offset, size_of::<T>(), |page, offset, to_copy| {
-            // SAFETY: The sum of `offset` and `to_copy` is bounded by the size of T.
-            let obj_ptr = unsafe { (out.as_mut_ptr() as *mut u8).add(out_offset) };
-            // SAFETY: The pointer points is in-bounds of the `out` variable, so it is valid.
-            unsafe { page.read(obj_ptr, offset, to_copy) }?;
-            out_offset += to_copy;
-            Ok(())
-        })?;
-        // SAFETY: We just initialised the data.
-        Ok(unsafe { out.assume_init() })
+        self.size_check(offset, size_of::<T>())?;
+
+        // SAFETY: While this object exists, the range allocator will keep the range allocated, and
+        // in turn, the pages will be marked as in use.
+        unsafe { self.process.pages.read(self.offset + offset) }
     }
 
     pub(crate) fn write<T: ?Sized>(&self, offset: usize, obj: &T) -> Result {
-        let mut obj_offset = 0;
-        self.iterate(offset, size_of_val(obj), |page, offset, to_copy| {
-            // SAFETY: The sum of `offset` and `to_copy` is bounded by the size of T.
-            let obj_ptr = unsafe { (obj as *const T as *const u8).add(obj_offset) };
-            // SAFETY: We have a reference to the object, so the pointer is valid.
-            unsafe { page.write(obj_ptr, offset, to_copy) }?;
-            obj_offset += to_copy;
-            Ok(())
-        })
+        self.size_check(offset, size_of_val::<T>(obj))?;
+
+        // SAFETY: While this object exists, the range allocator will keep the range allocated, and
+        // in turn, the pages will be marked as in use.
+        unsafe { self.process.pages.write(self.offset + offset, obj) }
     }
 
     pub(crate) fn fill_zero(&self) -> Result {
-        self.iterate(0, self.size, |page, offset, len| {
-            page.fill_zero(offset, len)
-        })
+        // SAFETY: While this object exists, the range allocator will keep the range allocated, and
+        // in turn, the pages will be marked as in use.
+        unsafe { self.process.pages.fill_zero(self.offset, self.size) }
     }
 
     pub(crate) fn keep_alive(mut self) {
diff --git a/drivers/android/process.rs b/drivers/android/process.rs
index b5e44f9f2a14..61809e496a48 100644
--- a/drivers/android/process.rs
+++ b/drivers/android/process.rs
@@ -17,7 +17,7 @@
     io_buffer::{IoBufferReader, IoBufferWriter},
     list::{HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks},
     mm,
-    pages::Pages,
+    page_range::ShrinkablePageRange,
     prelude::*,
     rbtree::RBTree,
     seq_file::SeqFile,
@@ -47,17 +47,12 @@
 struct Mapping {
     address: usize,
     alloc: RangeAllocator<AllocationInfo>,
-    pages: Arc<Vec<Pages<0>>>,
 }
 
 impl Mapping {
-    fn new(address: usize, size: usize, pages: Arc<Vec<Pages<0>>>) -> Result<Self> {
+    fn new(address: usize, size: usize) -> Result<Self> {
         let alloc = RangeAllocator::new(size)?;
-        Ok(Self {
-            address,
-            alloc,
-            pages,
-        })
+        Ok(Self { address, alloc })
     }
 }
 
@@ -333,6 +328,9 @@ pub(crate) struct Process {
     #[pin]
     pub(crate) inner: SpinLock<ProcessInner>,
 
+    #[pin]
+    pub(crate) pages: ShrinkablePageRange,
+
     // Waitqueue of processes waiting for all outstanding transactions to be
     // processed.
     #[pin]
@@ -390,10 +388,11 @@ fn run(me: Arc<Self>) {
 
 impl Process {
     fn new(ctx: Arc<Context>, cred: ARef<Credential>) -> Result<Arc<Self>> {
-        let list_process = ListArc::pin_init(pin_init!(Process {
+        let list_process = ListArc::pin_init(try_pin_init!(Process {
             ctx,
             cred,
             inner <- kernel::new_spinlock!(ProcessInner::new(), "Process::inner"),
+            pages <- ShrinkablePageRange::new(&super::BINDER_SHRINKER),
             node_refs <- kernel::new_mutex!(ProcessNodeRefs::new(), "Process::node_refs"),
             freeze_wait <- kernel::new_condvar!("Process::freeze_wait"),
             task: kernel::current!().group_leader().into(),
@@ -738,20 +737,46 @@ pub(crate) fn buffer_alloc(
         is_oneway: bool,
         from_pid: i32,
     ) -> BinderResult<Allocation> {
+        use kernel::bindings::PAGE_SIZE;
+
         let alloc = range_alloc::ReserveNewBox::try_new()?;
         let mut inner = self.inner.lock();
         let mapping = inner.mapping.as_mut().ok_or_else(BinderError::new_dead)?;
         let offset = mapping
             .alloc
             .reserve_new(size, is_oneway, from_pid, alloc)?;
-        Ok(Allocation::new(
+
+        let res = Allocation::new(
             self.clone(),
             offset,
             size,
             mapping.address + offset,
-            mapping.pages.clone(),
             mapping.alloc.oneway_spam_detected,
-        ))
+        );
+        drop(inner);
+
+        // This allocation will be marked as in use until the `Allocation` is used to free it.
+        //
+        // This method can't be called while holding a lock, so we release the lock first. It's
+        // okay for several threads to use the method on the same index at the same time. In that
+        // case, one of the calls will allocate the given page (if missing), and the other call
+        // will wait for the other call to finish allocating the page.
+        //
+        // We will not call `stop_using_range` in parallel with this on the same page, because the
+        // allocation can only be removed via the destructor of the `Allocation` object that we
+        // currently own.
+        match self.pages.use_range(
+            offset / PAGE_SIZE,
+            (offset + size + (PAGE_SIZE - 1)) / PAGE_SIZE,
+        ) {
+            Ok(()) => {}
+            Err(err) => {
+                pr_warn!("use_range failure {:?}\n", err);
+                return Err(err.into());
+            }
+        }
+
+        Ok(res)
     }
 
     pub(crate) fn buffer_get(self: &Arc<Self>, ptr: usize) -> Option<Allocation> {
@@ -764,7 +789,6 @@ pub(crate) fn buffer_get(self: &Arc<Self>, ptr: usize) -> Option<Allocation> {
             offset,
             size,
             ptr,
-            mapping.pages.clone(),
             mapping.alloc.oneway_spam_detected,
         );
         if let Some(data) = odata {
@@ -776,18 +800,29 @@ pub(crate) fn buffer_get(self: &Arc<Self>, ptr: usize) -> Option<Allocation> {
     pub(crate) fn buffer_raw_free(&self, ptr: usize) {
         let mut inner = self.inner.lock();
         if let Some(ref mut mapping) = &mut inner.mapping {
-            if ptr < mapping.address
-                || mapping
-                    .alloc
-                    .reservation_abort(ptr - mapping.address)
-                    .is_err()
-            {
-                pr_warn!(
-                    "Pointer {:x} failed to free, base = {:x}\n",
-                    ptr,
-                    mapping.address
-                );
-            }
+            let offset = match ptr.checked_sub(mapping.address) {
+                Some(offset) => offset,
+                None => return,
+            };
+
+            let freed_range = match mapping.alloc.reservation_abort(offset) {
+                Ok(freed_range) => freed_range,
+                Err(_) => {
+                    pr_warn!(
+                        "Pointer {:x} failed to free, base = {:x}\n",
+                        ptr,
+                        mapping.address
+                    );
+                    return;
+                }
+            };
+
+            // No more allocations in this range. Mark the pages as not in use.
+            //
+            // Must be done before we release the lock so that `use_range` is not used on these
+            // indices until `stop_using_range` returns.
+            self.pages
+                .stop_using_range(freed_range.start_page_idx, freed_range.end_page_idx);
         }
     }
 
@@ -802,35 +837,16 @@ pub(crate) fn buffer_make_freeable(&self, offset: usize, data: Option<Allocation
 
     fn create_mapping(&self, vma: &mut mm::virt::Area) -> Result {
         use kernel::bindings::PAGE_SIZE;
-        let size = core::cmp::min(vma.end() - vma.start(), bindings::SZ_4M as usize);
-        let page_count = size / PAGE_SIZE;
-
-        // Allocate and map all pages.
-        //
-        // N.B. If we fail halfway through mapping these pages, the kernel will unmap them.
-        let mut pages = Vec::new();
-        pages.try_reserve_exact(page_count)?;
-        let mut address = vma.start();
-        for _ in 0..page_count {
-            let page = Pages::<0>::new()?;
-            vma.insert_page(address, &page)?;
-            pages.try_push(page)?;
-            address += PAGE_SIZE;
+        let size = usize::min(vma.end() - vma.start(), bindings::SZ_4M as usize);
+        let mapping = Mapping::new(vma.start(), size)?;
+        let page_count = self.pages.register_with_vma(vma)?;
+        if page_count * PAGE_SIZE != size {
+            return Err(EINVAL);
         }
 
-        let ref_pages = Arc::try_new(pages)?;
-        let mapping = Mapping::new(vma.start(), size, ref_pages)?;
+        // Save range allocator for later.
+        self.inner.lock().mapping = Some(mapping);
 
-        // Save pages for later.
-        let mut inner = self.inner.lock();
-        match &inner.mapping {
-            None => inner.mapping = Some(mapping),
-            Some(_) => {
-                drop(inner);
-                drop(mapping);
-                return Err(EBUSY);
-            }
-        }
         Ok(())
     }
 
@@ -1044,18 +1060,11 @@ fn deferred_release(self: Arc<Self>) {
         let omapping = self.inner.lock().mapping.take();
         if let Some(mut mapping) = omapping {
             let address = mapping.address;
-            let pages = mapping.pages.clone();
             let oneway_spam_detected = mapping.alloc.oneway_spam_detected;
             mapping.alloc.take_for_each(|offset, size, odata| {
                 let ptr = offset + address;
-                let mut alloc = Allocation::new(
-                    self.clone(),
-                    offset,
-                    size,
-                    ptr,
-                    pages.clone(),
-                    oneway_spam_detected,
-                );
+                let mut alloc =
+                    Allocation::new(self.clone(), offset, size, ptr, oneway_spam_detected);
                 if let Some(data) = odata {
                     alloc.set_info(data);
                 }
diff --git a/drivers/android/range_alloc.rs b/drivers/android/range_alloc.rs
index c1d47115e54d..4aa1b5236bf5 100644
--- a/drivers/android/range_alloc.rs
+++ b/drivers/android/range_alloc.rs
@@ -19,6 +19,26 @@ pub(crate) struct RangeAllocator<T> {
     pub(crate) oneway_spam_detected: bool,
 }
 
+const PAGE_SIZE: usize = kernel::bindings::PAGE_SIZE;
+
+/// Represents a range of pages that have just become completely free.
+#[derive(Copy, Clone)]
+pub(crate) struct FreedRange {
+    pub(crate) start_page_idx: usize,
+    pub(crate) end_page_idx: usize,
+}
+
+impl FreedRange {
+    fn interior_pages(offset: usize, size: usize) -> FreedRange {
+        FreedRange {
+            // Divide round up
+            start_page_idx: (offset + (PAGE_SIZE - 1)) / PAGE_SIZE,
+            // Divide round down
+            end_page_idx: (offset + size) / PAGE_SIZE,
+        }
+    }
+}
+
 impl<T> RangeAllocator<T> {
     pub(crate) fn new(size: usize) -> Result<Self> {
         let mut tree = RBTree::new();
@@ -97,7 +117,7 @@ pub(crate) fn reserve_new(
         Ok(found_off)
     }
 
-    pub(crate) fn reservation_abort(&mut self, offset: usize) -> Result {
+    pub(crate) fn reservation_abort(&mut self, offset: usize) -> Result<FreedRange> {
         let mut cursor = self.tree.cursor_lower_bound(&offset).ok_or_else(|| {
             pr_warn!(
                 "EINVAL from range_alloc.reservation_abort - offset: {}",
@@ -140,9 +160,26 @@ pub(crate) fn reservation_abort(&mut self, offset: usize) -> Result {
 
         self.free_oneway_space += free_oneway_space_add;
 
+        let mut freed_range = FreedRange::interior_pages(offset, size);
+        // Compute how large the next free region needs to be to include one more page in
+        // the newly freed range.
+        let add_next_page_needed = match (offset + size) % PAGE_SIZE {
+            0 => usize::MAX,
+            unalign => PAGE_SIZE - unalign,
+        };
+        // Compute how large the previous free region needs to be to include one more page
+        // in the newly freed range.
+        let add_prev_page_needed = match offset % PAGE_SIZE {
+            0 => usize::MAX,
+            unalign => unalign,
+        };
+
         // Merge next into current if next is free
         let remove_next = match cursor.peek_next() {
             Some((_, next)) if next.state.is_none() => {
+                if next.size >= add_next_page_needed {
+                    freed_range.end_page_idx += 1;
+                }
                 self.free_tree.remove(&(next.size, next.offset));
                 size += next.size;
                 true
@@ -159,6 +196,9 @@ pub(crate) fn reservation_abort(&mut self, offset: usize) -> Result {
         // Merge current into prev if prev is free
         match cursor.peek_prev_mut() {
             Some((_, prev)) if prev.state.is_none() => {
+                if prev.size >= add_prev_page_needed {
+                    freed_range.start_page_idx -= 1;
+                }
                 // merge previous with current, remove current
                 self.free_tree.remove(&(prev.size, prev.offset));
                 offset = prev.offset;
@@ -172,7 +212,7 @@ pub(crate) fn reservation_abort(&mut self, offset: usize) -> Result {
         self.free_tree
             .insert(reservation.free_res.into_node((size, offset), ()));
 
-        Ok(())
+        Ok(freed_range)
     }
 
     pub(crate) fn reservation_commit(&mut self, offset: usize, data: Option<T>) -> Result {
diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs
index a1c95a1609d5..0e4033dc8e71 100644
--- a/drivers/android/rust_binder.rs
+++ b/drivers/android/rust_binder.rs
@@ -9,6 +9,7 @@
     list::{
         HasListLinks, ListArc, ListArcSafe, ListItem, ListLinks, ListLinksSelfPtr, TryNewListArc,
     },
+    page_range::Shrinker,
     prelude::*,
     seq_file::SeqFile,
     seq_print,
@@ -173,12 +174,17 @@ const fn ptr_align(value: usize) -> usize {
     (value + size) & !size
 }
 
+// SAFETY: We call register in `init`.
+static BINDER_SHRINKER: Shrinker = unsafe { Shrinker::new() };
+
 struct BinderModule {}
 
 impl kernel::Module for BinderModule {
     fn init(_module: &'static kernel::ThisModule) -> Result<Self> {
         crate::context::CONTEXTS.init();
 
+        BINDER_SHRINKER.register(kernel::c_str!("android-binder"))?;
+
         // SAFETY: The module is being loaded, so we can initialize binderfs.
         #[cfg(CONFIG_ANDROID_BINDERFS_RUST)]
         unsafe {
diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index b2d60b4a9df6..2f37e5192ce4 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -12,6 +12,7 @@
 #include <linux/fdtable.h>
 #include <linux/file.h>
 #include <linux/fs.h>
+#include <linux/list_lru.h>
 #include <linux/pid_namespace.h>
 #include <linux/poll.h>
 #include <linux/security.h>
@@ -21,6 +22,7 @@
 #include <linux/rust_binder.h>
 #include <linux/wait.h>
 #include <linux/sched.h>
+#include <linux/shrinker.h>
 #include <linux/task_work.h>
 #include <linux/workqueue.h>
 #include <uapi/linux/android/binder.h>
diff --git a/rust/helpers.c b/rust/helpers.c
index be295d8bdb46..3392d2d4ee2c 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -93,6 +93,12 @@ void rust_helper_spin_unlock(spinlock_t *lock)
 }
 EXPORT_SYMBOL_GPL(rust_helper_spin_unlock);
 
+int rust_helper_spin_trylock(spinlock_t *lock)
+{
+	return spin_trylock(lock);
+}
+EXPORT_SYMBOL_GPL(rust_helper_spin_trylock);
+
 void rust_helper_init_wait(struct wait_queue_entry *wq_entry)
 {
 	init_wait(wq_entry);
@@ -310,6 +316,20 @@ struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
 }
 EXPORT_SYMBOL_GPL(rust_helper_vma_lookup);
 
+unsigned long rust_helper_list_lru_count(struct list_lru *lru)
+{
+	return list_lru_count(lru);
+}
+EXPORT_SYMBOL_GPL(rust_helper_list_lru_count);
+
+unsigned long rust_helper_list_lru_walk(struct list_lru *lru,
+					list_lru_walk_cb isolate, void *cb_arg,
+					unsigned long nr_to_walk)
+{
+	return list_lru_walk(lru, isolate, cb_arg, nr_to_walk);
+}
+EXPORT_SYMBOL_GPL(rust_helper_list_lru_walk);
+
 void rust_helper_rb_link_node(struct rb_node *node, struct rb_node *parent,
 			      struct rb_node **rb_link)
 {
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index d46187783464..02e670b92426 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -43,6 +43,7 @@
 pub mod kunit;
 pub mod list;
 pub mod mm;
+pub mod page_range;
 pub mod pages;
 pub mod prelude;
 pub mod print;
diff --git a/rust/kernel/page_range.rs b/rust/kernel/page_range.rs
new file mode 100644
index 000000000000..b13f8cd62b77
--- /dev/null
+++ b/rust/kernel/page_range.rs
@@ -0,0 +1,715 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! This module has utilities for managing a page range where unused pages may be reclaimed by a
+//! vma shrinker.
+
+// To avoid deadlocks, locks are taken in the order:
+//
+//  1. mmap lock
+//  2. spinlock
+//  3. lru spinlock
+//
+// The shrinker will use trylock methods because it locks them in a different order.
+
+use core::{
+    alloc::Layout,
+    ffi::{c_ulong, c_void},
+    marker::PhantomPinned,
+    mem::{size_of, size_of_val, MaybeUninit},
+    ptr,
+};
+
+use crate::{
+    bindings,
+    error::Result,
+    io_buffer::ReadableFromBytes,
+    mm::{virt, MmGrab},
+    new_spinlock,
+    pages::Pages,
+    prelude::*,
+    str::CStr,
+    sync::SpinLock,
+    types::Opaque,
+    user_ptr::UserSlicePtrReader,
+};
+
+const PAGE_SIZE: usize = bindings::PAGE_SIZE;
+const PAGE_SHIFT: usize = bindings::PAGE_SHIFT;
+const PAGE_MASK: usize = bindings::PAGE_MASK;
+
+/// Represents a shrinker that can be registered with the kernel.
+///
+/// Each shrinker can be used by many `ShrinkablePageRange` objects.
+#[repr(C)]
+pub struct Shrinker {
+    inner: Opaque<bindings::shrinker>,
+    list_lru: Opaque<bindings::list_lru>,
+}
+
+unsafe impl Send for Shrinker {}
+unsafe impl Sync for Shrinker {}
+
+impl Shrinker {
+    /// Create a new shrinker.
+    ///
+    /// # Safety
+    ///
+    /// Before using this shrinker with a `ShrinkablePageRange`, the `register` method must have
+    /// been called exactly once, and it must not have returned an error.
+    pub const unsafe fn new() -> Self {
+        Self {
+            inner: Opaque::uninit(),
+            list_lru: Opaque::uninit(),
+        }
+    }
+
+    /// Register this shrinker with the kernel.
+    pub fn register(&'static self, name: &CStr) -> Result<()> {
+        // SAFETY: These fields are not yet used, so it's okay to zero them.
+        unsafe {
+            self.inner.get().write_bytes(0, 1);
+            self.list_lru.get().write_bytes(0, 1);
+        }
+
+        // SAFETY: The field is not yet used, so we can initialize it.
+        let ret = unsafe {
+            bindings::__list_lru_init(self.list_lru.get(), false, ptr::null_mut(), ptr::null_mut())
+        };
+        if ret != 0 {
+            return Err(Error::from_errno(ret));
+        }
+
+        // SAFETY: We're about to register the shrinker, and these are the fields we need to
+        // initialize. (All other fields are already zeroed.)
+        unsafe {
+            let inner = self.inner.get();
+            ptr::addr_of_mut!((*inner).count_objects).write(Some(rust_shrink_count));
+            ptr::addr_of_mut!((*inner).scan_objects).write(Some(rust_shrink_scan));
+            ptr::addr_of_mut!((*inner).seeks).write(bindings::DEFAULT_SEEKS as _);
+        }
+
+        // SAFETY: We've initialized the shrinker fields we need to, so we can call this method.
+        let ret = unsafe { bindings::register_shrinker(self.inner.get(), name.as_char_ptr()) };
+        if ret != 0 {
+            // SAFETY: We initialized it, so it's okay to destroy it.
+            unsafe { bindings::list_lru_destroy(self.list_lru.get()) };
+            return Err(Error::from_errno(ret));
+        }
+
+        Ok(())
+    }
+}
+
+/// A container that manages a page range in a vma.
+///
+/// The pages can be thought of as an array of booleans of whether the pages are usable. The
+/// methods `use_range` and `stop_using_range` set all booleans in a range to true or false
+/// respectively. Initially, no pages are allocated. When a page is not used, it is not freed
+/// immediately. Instead, it is made available to the memory shrinker to free it if the device is
+/// under memory pressure.
+///
+/// It's okay for `use_range` and `stop_using_range` to race with each other, although there's no
+/// way to know whether an index ends up with true or false if a call to `use_range` races with
+/// another call to `stop_using_range` on a given index.
+///
+/// It's also okay for the two methods to race with themselves, e.g. if two threads call
+/// `use_range` on the same index, then that's fine and neither call will return until the page is
+/// allocated and mapped.
+///
+/// The methods that read or write to a range require that the page is marked as in use. So it is
+/// _not_ okay to call `stop_using_range` on a page that is in use by the methods that read or
+/// write to the page.
+#[pin_data(PinnedDrop)]
+pub struct ShrinkablePageRange {
+    /// Shrinker object registered with the kernel.
+    shrinker: &'static Shrinker,
+    /// The mm for the relevant process.
+    mm: MmGrab,
+    /// Spinlock protecting changes to pages.
+    #[pin]
+    lock: SpinLock<Inner>,
+
+    /// Must not move, since page info has pointers back.
+    #[pin]
+    _pin: PhantomPinned,
+}
+
+struct Inner {
+    /// Array of pages.
+    ///
+    /// Since this is also accessed by the shrinker, we can't use a `Box`, which asserts exclusive
+    /// ownership. To deal with that, we manage it using raw pointers.
+    pages: *mut PageInfo,
+    /// Length of the `pages` array.
+    size: usize,
+    /// The address of the vma to insert the pages into.
+    vma_addr: usize,
+}
+
+unsafe impl Send for ShrinkablePageRange {}
+unsafe impl Sync for ShrinkablePageRange {}
+
+/// An array element that describes the current state of a page.
+///
+/// There are three states:
+///
+///  * Free. The page is None. The `lru` element is not queued.
+///  * Available. The page is Some. The `lru` element is queued to the shrinker's lru.
+///  * Used. The page is Some. The `lru` element is not queued.
+///
+/// When an element is available, the shrinker is able to free the page.
+#[repr(C)]
+struct PageInfo {
+    lru: bindings::list_head,
+    page: Option<Pages<0>>,
+    range: *const ShrinkablePageRange,
+}
+
+impl PageInfo {
+    /// # Safety
+    ///
+    /// The caller ensures that reading from `me.page` is ok.
+    unsafe fn has_page(me: *const PageInfo) -> bool {
+        // SAFETY: This pointer offset is in bounds.
+        let page = unsafe { ptr::addr_of!((*me).page) };
+
+        unsafe { (*page).is_some() }
+    }
+
+    /// # Safety
+    ///
+    /// The caller ensures that writing to `me.page` is ok, and that the page is not currently set.
+    unsafe fn set_page(me: *mut PageInfo, page: Pages<0>) {
+        // SAFETY: This pointer offset is in bounds.
+        let ptr = unsafe { ptr::addr_of_mut!((*me).page) };
+
+        // SAFETY: The pointer is valid for writing, so also valid for reading.
+        if unsafe { (*ptr).is_some() } {
+            pr_err!("set_page called when there is already a page");
+            // SAFETY: We will initialize the page again below.
+            unsafe { ptr::drop_in_place(ptr) };
+        }
+
+        // SAFETY: The pointer is valid for writing.
+        unsafe { ptr::write(ptr, Some(page)) };
+    }
+
+    /// # Safety
+    ///
+    /// The caller ensures that reading from `me.page` is ok for the duration of 'a.
+    unsafe fn get_page<'a>(me: *const PageInfo) -> Option<&'a Pages<0>> {
+        // SAFETY: This pointer offset is in bounds.
+        let ptr = unsafe { ptr::addr_of!((*me).page) };
+
+        // SAFETY: The pointer is valid for reading.
+        unsafe { (*ptr).as_ref() }
+    }
+
+    /// # Safety
+    ///
+    /// The caller ensures that writing to `me.page` is ok.
+    unsafe fn take_page(me: *mut PageInfo) -> Option<Pages<0>> {
+        // SAFETY: This pointer offset is in bounds.
+        let ptr = unsafe { ptr::addr_of_mut!((*me).page) };
+
+        // SAFETY: The pointer is valid for reading and writing.
+        unsafe { (*ptr).take() }
+    }
+
+    /// Add this page to the lru list, if not already in the list.
+    ///
+    /// # Safety
+    ///
+    /// The pointer must be valid, and it must be the right shrinker.
+    unsafe fn list_lru_add(me: *mut PageInfo, shrinker: &'static Shrinker) {
+        // SAFETY: This pointer offset is in bounds.
+        let lru_ptr = unsafe { ptr::addr_of_mut!((*me).lru) };
+        // SAFETY: The lru pointer is valid, and we're not using it with any other lru list.
+        unsafe { bindings::list_lru_add(shrinker.list_lru.get(), lru_ptr) };
+    }
+
+    /// Remove this page from the lru list, if it is in the list.
+    ///
+    /// # Safety
+    ///
+    /// The pointer must be valid, and it must be the right shrinker.
+    unsafe fn list_lru_del(me: *mut PageInfo, shrinker: &'static Shrinker) {
+        // SAFETY: This pointer offset is in bounds.
+        let lru_ptr = unsafe { ptr::addr_of_mut!((*me).lru) };
+        // SAFETY: The lru pointer is valid, and we're not using it with any other lru list.
+        unsafe { bindings::list_lru_del(shrinker.list_lru.get(), lru_ptr) };
+    }
+}
+
+impl ShrinkablePageRange {
+    /// Create a new `ShrinkablePageRange` using the given shrinker.
+    pub fn new(shrinker: &'static Shrinker) -> impl PinInit<Self, Error> {
+        try_pin_init!(Self {
+            shrinker,
+            mm: MmGrab::mmgrab_current().ok_or(ESRCH)?,
+            lock <- new_spinlock!(Inner {
+                pages: ptr::null_mut(),
+                size: 0,
+                vma_addr: 0,
+            }, "ShrinkablePageRange"),
+            _pin: PhantomPinned,
+        })
+    }
+
+    /// Register a vma with this page range. Returns the size of the region in pages.
+    pub fn register_with_vma(&self, vma: &virt::Area) -> Result<usize> {
+        let num_bytes = usize::min(vma.end() - vma.start(), bindings::SZ_4M as usize);
+        let num_pages = num_bytes >> PAGE_SHIFT;
+
+        if !self.mm.is_same_mm(vma) {
+            pr_debug!("Failed to register with vma: invalid vma->vm_mm");
+            return Err(EINVAL);
+        }
+        if num_pages == 0 {
+            pr_debug!("Failed to register with vma: size zero");
+            return Err(EINVAL);
+        }
+
+        let layout = Layout::array::<PageInfo>(num_pages).map_err(|_| ENOMEM)?;
+        // SAFETY: The layout has non-zero size.
+        let pages = unsafe { alloc::alloc::alloc(layout) as *mut PageInfo };
+        if pages.is_null() {
+            return Err(ENOMEM);
+        }
+
+        // SAFETY: This just initializes the pages array.
+        unsafe {
+            let self_ptr = self as *const ShrinkablePageRange;
+            for i in 0..num_pages {
+                let info = pages.add(i);
+                ptr::addr_of_mut!((*info).range).write(self_ptr);
+                ptr::addr_of_mut!((*info).page).write(None);
+                let lru = ptr::addr_of_mut!((*info).lru);
+                ptr::addr_of_mut!((*lru).next).write(lru);
+                ptr::addr_of_mut!((*lru).prev).write(lru);
+            }
+        }
+
+        let mut inner = self.lock.lock();
+        if inner.size > 0 {
+            pr_debug!("Failed to register with vma: already registered");
+            drop(inner);
+            // SAFETY: The `pages` array was allocated with the same layout.
+            unsafe { alloc::alloc::dealloc(pages.cast(), layout) };
+            return Err(EBUSY);
+        }
+
+        inner.pages = pages;
+        inner.size = num_pages;
+        inner.vma_addr = vma.start();
+
+        Ok(num_pages)
+    }
+
+    /// Make sure that the given pages are allocated and mapped.
+    ///
+    /// Must not be called from an atomic context.
+    pub fn use_range(&self, start: usize, end: usize) -> Result<()> {
+        if start >= end {
+            return Ok(());
+        }
+        let mut inner = self.lock.lock();
+        assert!(end <= inner.size);
+
+        for i in start..end {
+            // SAFETY: This pointer offset is in bounds.
+            let page_info = unsafe { inner.pages.add(i) };
+
+            // SAFETY: The pointer is valid, and we hold the lock so reading from the page is okay.
+            if unsafe { PageInfo::has_page(page_info) } {
+                // Since we're going to use the page, we should remove it from the lru list so that
+                // the shrinker will not free it.
+                //
+                // SAFETY: The pointer is valid, and this is the right shrinker.
+                //
+                // The shrinker can't free the page between the check and this call to
+                // `list_lru_del` because we hold the lock.
+                unsafe { PageInfo::list_lru_del(page_info, self.shrinker) };
+            } else {
+                // We have to allocate a new page. Use the slow path.
+                drop(inner);
+                match self.use_page_slow(i) {
+                    Ok(()) => {}
+                    Err(err) => {
+                        pr_warn!("Error in use_page_slow: {:?}", err);
+                        return Err(err);
+                    }
+                }
+                inner = self.lock.lock();
+            }
+        }
+        Ok(())
+    }
+
+    /// Mark the given page as in use, slow path.
+    ///
+    /// Must not be called from an atomic context.
+    ///
+    /// # Safety
+    ///
+    /// Assumes that `i` is in bounds.
+    #[cold]
+    fn use_page_slow(&self, i: usize) -> Result<()> {
+        let new_page = Pages::new()?;
+        // We use `mmput_async` when dropping the `mm` because `use_page_slow` is usually used from
+        // a remote process. If the call to `mmput` races with the process shutting down, then the
+        // caller of `use_page_slow` becomes responsible for cleaning up the `mm`, which doesn't
+        // happen until it returns to userspace. However, the caller might instead go to sleep and
+        // wait for the owner of the `mm` to wake it up, which doesn't happen because it's in the
+        // middle of a shutdown process that won't complete until the `mm` is dropped. This can
+        // amount to a deadlock.
+        //
+        // Using `mmput_async` avoids this, because then the `mm` cleanup is instead queued to a
+        // workqueue.
+        let mm = self.mm.mmget_not_zero().ok_or(ESRCH)?.use_async_put();
+        let mut mmap_lock = mm.mmap_write_lock();
+        let inner = self.lock.lock();
+
+        // SAFETY: This pointer offset is in bounds.
+        let page_info = unsafe { inner.pages.add(i) };
+
+        // SAFETY: The pointer is valid, and we hold the lock so reading from the page is okay.
+        if unsafe { PageInfo::has_page(page_info) } {
+            // The page was already there, or someone else added the page while we didn't hold the
+            // spinlock.
+            //
+            // SAFETY: The pointer is valid, and this is the right shrinker.
+            //
+            // The shrinker can't free the page between the check and this call to
+            // `list_lru_del` because we hold the lock.
+            unsafe { PageInfo::list_lru_del(page_info, self.shrinker) };
+            return Ok(());
+        }
+
+        let vma_addr = inner.vma_addr;
+        // Release the spinlock while we insert the page into the vma.
+        drop(inner);
+
+        let vma = mmap_lock.vma_lookup(vma_addr).ok_or(ESRCH)?;
+
+        // No overflow since we stay in bounds of the vma.
+        let user_page_addr = vma_addr + (i << PAGE_SHIFT);
+        match vma.insert_page(user_page_addr, &new_page) {
+            Ok(()) => {}
+            Err(err) => {
+                pr_warn!(
+                    "Error in insert_page({}): vma_addr:{} i:{} err:{:?}",
+                    user_page_addr,
+                    vma_addr,
+                    i,
+                    err
+                );
+                return Err(err);
+            }
+        }
+
+        let inner = self.lock.lock();
+
+        // SAFETY: The `page_info` pointer is valid and currently does not have a page. The page
+        // can be written to since we hold the lock.
+        //
+        // We released and reacquired the spinlock since we checked that the page is null, but we
+        // always hold the mmap write lock when setting the page to a non-null value, so it's not
+        // possible for someone else to have changed it since our check.
+        unsafe { PageInfo::set_page(page_info, new_page) };
+
+        drop(inner);
+
+        Ok(())
+    }
+
+    /// If the given page is in use, then mark it as available so that the shrinker can free it.
+    ///
+    /// May be called from an atomic context.
+    pub fn stop_using_range(&self, start: usize, end: usize) {
+        if start >= end {
+            return;
+        }
+        let inner = self.lock.lock();
+        assert!(end <= inner.size);
+
+        for i in (start..end).rev() {
+            // SAFETY: The pointer is in bounds.
+            let page_info = unsafe { inner.pages.add(i) };
+
+            // SAFETY: Okay for reading since we have the lock.
+            if unsafe { PageInfo::has_page(page_info) } {
+                // SAFETY: The pointer is valid, and it's the right shrinker.
+                unsafe { PageInfo::list_lru_add(page_info, self.shrinker) };
+            }
+        }
+    }
+
+    /// Helper for reading or writing to a range of bytes that may overlap with several pages.
+    ///
+    /// # Safety
+    ///
+    /// All pages touched by this operation must be in use for the duration of this call.
+    unsafe fn iterate<T>(&self, mut offset: usize, mut size: usize, mut cb: T) -> Result
+    where
+        T: FnMut(&Pages<0>, usize, usize) -> Result,
+    {
+        if size == 0 {
+            return Ok(());
+        }
+
+        // SAFETY: The caller promises that the pages touched by this call are in use. It's only
+        // possible for a page to be in use if we have already been registered with a vma, and we
+        // only change the `pages` and `size` fields during registration with a vma, so there is no
+        // race when we read them here without taking the lock.
+        let (pages, num_pages) = unsafe {
+            let inner = self.lock.get_ptr();
+            (
+                ptr::addr_of!((*inner).pages).read(),
+                ptr::addr_of!((*inner).size).read(),
+            )
+        };
+        let num_bytes = num_pages << PAGE_SHIFT;
+
+        // Check that the request is within the buffer.
+        if offset.checked_add(size).ok_or(EFAULT)? > num_bytes {
+            return Err(EFAULT);
+        }
+
+        let mut page_index = offset >> PAGE_SHIFT;
+        offset &= PAGE_MASK;
+        while size > 0 {
+            let available = usize::min(size, PAGE_SIZE - offset);
+            // SAFETY: The pointer is in bounds.
+            let page_info = unsafe { pages.add(page_index) };
+            // SAFETY: The caller guarantees that this page is in the "in use" state for the
+            // duration of this call to `iterate`, so nobody will change the page.
+            let page = unsafe { PageInfo::get_page(page_info) };
+            if page.is_none() {
+                pr_warn!("Page is null!");
+            }
+            let page = page.ok_or(EFAULT)?;
+            cb(page, offset, available)?;
+            size -= available;
+            page_index += 1;
+            offset = 0;
+        }
+        Ok(())
+    }
+
+    /// Copy from userspace into this page range.
+    ///
+    /// # Safety
+    ///
+    /// All pages touched by this operation must be in use for the duration of this call.
+    pub unsafe fn copy_into(
+        &self,
+        reader: &mut UserSlicePtrReader,
+        offset: usize,
+        size: usize,
+    ) -> Result {
+        // SAFETY: `self.iterate` has the same safety requirements as `copy_into`.
+        unsafe {
+            self.iterate(offset, size, |page, offset, to_copy| {
+                page.copy_into_page(reader, offset, to_copy)
+            })
+        }
+    }
+
+    /// Copy from this page range into kernel space.
+    ///
+    /// # Safety
+    ///
+    /// All pages touched by this operation must be in use for the duration of this call.
+    pub unsafe fn read<T: ReadableFromBytes>(&self, offset: usize) -> Result<T> {
+        let mut out = MaybeUninit::<T>::uninit();
+        let mut out_offset = 0;
+        // SAFETY: `self.iterate` has the same safety requirements as `read`.
+        unsafe {
+            self.iterate(offset, size_of::<T>(), |page, offset, to_copy| {
+                // SAFETY: The sum of `out_offset` and `to_copy` is bounded by the size of T.
+                let obj_ptr = (out.as_mut_ptr() as *mut u8).add(out_offset);
+                // SAFETY: The pointer is in-bounds of the `out` variable, so it is valid.
+                page.read(obj_ptr, offset, to_copy)?;
+                out_offset += to_copy;
+                Ok(())
+            })?;
+        }
+        // SAFETY: We just initialised the data.
+        Ok(unsafe { out.assume_init() })
+    }
+
+    /// Copy from kernel space into this page range.
+    ///
+    /// # Safety
+    ///
+    /// All pages touched by this operation must be in use for the duration of this call.
+    pub unsafe fn write<T: ?Sized>(&self, offset: usize, obj: &T) -> Result {
+        let mut obj_offset = 0;
+        // SAFETY: `self.iterate` has the same safety requirements as `write`.
+        unsafe {
+            self.iterate(offset, size_of_val(obj), |page, offset, to_copy| {
+                // SAFETY: The sum of `obj_offset` and `to_copy` is bounded by the size of `obj`.
+                let obj_ptr = (obj as *const T as *const u8).add(obj_offset);
+                // SAFETY: We have a reference to the object, so the pointer is valid.
+                page.write(obj_ptr, offset, to_copy)?;
+                obj_offset += to_copy;
+                Ok(())
+            })
+        }
+    }
+
+    /// Write zeroes to the given range.
+    ///
+    /// # Safety
+    ///
+    /// All pages touched by this operation must be in use for the duration of this call.
+    pub unsafe fn fill_zero(&self, offset: usize, size: usize) -> Result {
+        // SAFETY: `self.iterate` has the same safety requirements as `fill_zero`.
+        unsafe {
+            self.iterate(offset, size, |page, offset, len| {
+                page.fill_zero(offset, len)
+            })
+        }
+    }
+}
+
+#[pinned_drop]
+impl PinnedDrop for ShrinkablePageRange {
+    fn drop(self: Pin<&mut Self>) {
+        let (pages, size) = {
+            let lock = self.lock.lock();
+            (lock.pages, lock.size)
+        };
+
+        if size == 0 {
+            return;
+        }
+
+        // This is the destructor, so unlike the other methods, we only need to worry about races
+        // with the shrinker here.
+        for i in 0..size {
+            // SAFETY: The pointer is valid and it's the right shrinker.
+            unsafe { PageInfo::list_lru_del(pages.add(i), self.shrinker) };
+            // SAFETY: If the shrinker was going to free this page, then it would have taken it
+            // from the PageInfo before releasing the lru lock. Thus, the call to `list_lru_del`
+            // will either remove it before the shrinker can access it, or the shrinker will
+            // already have taken the page at this point.
+            unsafe { drop(PageInfo::take_page(pages.add(i))) };
+        }
+
+        // SAFETY: This computation did not overflow when allocating the pages array, so it will
+        // not overflow this time.
+        let layout = unsafe { Layout::array::<PageInfo>(size).unwrap_unchecked() };
+
+        // SAFETY: The `pages` array was allocated with the same layout.
+        unsafe { alloc::alloc::dealloc(pages.cast(), layout) };
+    }
+}
+
+#[no_mangle]
+unsafe extern "C" fn rust_shrink_count(
+    shrink: *mut bindings::shrinker,
+    _sc: *mut bindings::shrink_control,
+) -> c_ulong {
+    // SAFETY: This method is only used with the `Shrinker` type, and the cast is valid since
+    // `bindings::shrinker` is the first field of the #[repr(C)] `Shrinker` struct.
+    let shrinker = unsafe { &*shrink.cast::<Shrinker>() };
+    // SAFETY: Accessing the lru list is okay. Just an FFI call.
+    unsafe { bindings::list_lru_count(shrinker.list_lru.get()) }
+}
+
+#[no_mangle]
+unsafe extern "C" fn rust_shrink_scan(
+    shrink: *mut bindings::shrinker,
+    sc: *mut bindings::shrink_control,
+) -> c_ulong {
+    // SAFETY: This method is only used with the `Shrinker` type, and the cast is valid since
+    // `bindings::shrinker` is the first field of the #[repr(C)] `Shrinker` struct.
+    let shrinker = unsafe { &*shrink.cast::<Shrinker>() };
+    // SAFETY: Caller guarantees that it is safe to read this field.
+    let nr_to_scan = unsafe { (*sc).nr_to_scan };
+    // SAFETY: Accessing the lru list is okay. Just an FFI call.
+    unsafe {
+        bindings::list_lru_walk(
+            shrinker.list_lru.get(),
+            Some(rust_shrink_free_page),
+            ptr::null_mut(),
+            nr_to_scan,
+        )
+    }
+}
+
+const LRU_SKIP: bindings::lru_status = bindings::lru_status_LRU_SKIP;
+const LRU_REMOVED_ENTRY: bindings::lru_status = bindings::lru_status_LRU_REMOVED_RETRY;
+
+#[no_mangle]
+unsafe extern "C" fn rust_shrink_free_page(
+    item: *mut bindings::list_head,
+    lru: *mut bindings::list_lru_one,
+    lru_lock: *mut bindings::spinlock_t,
+    _cb_arg: *mut c_void,
+) -> bindings::lru_status {
+    // Fields that should survive after unlocking the lru lock.
+    let page;
+    let page_index;
+    let mm;
+    let mmap_read;
+    let vma_addr;
+
+    {
+        // SAFETY: The `list_head` field is first in `PageInfo`.
+        let info = item as *mut PageInfo;
+        let range = unsafe { &*((*info).range) };
+
+        mm = match range.mm.mmget_not_zero() {
+            Some(mm) => mm.use_async_put(),
+            None => return LRU_SKIP,
+        };
+
+        mmap_read = match mm.mmap_read_trylock() {
+            Some(guard) => guard,
+            None => return LRU_SKIP,
+        };
+
+        // We can't lock it normally here, since we hold the lru lock.
+        let inner = match range.lock.trylock() {
+            Some(inner) => inner,
+            None => return LRU_SKIP,
+        };
+
+        // SAFETY: The item is in this lru list, so it's okay to remove it.
+        unsafe { bindings::list_lru_isolate(lru, item) };
+
+        // SAFETY: Both pointers are in bounds of the same allocation.
+        page_index = unsafe { info.offset_from(inner.pages) } as usize;
+
+        // SAFETY: We hold the spinlock, so we can take the page.
+        //
+        // This sets the page pointer to zero before we unmap it from the vma. However, we call
+        // `zap_page_range` before we release the mmap lock, so `use_page_slow` will not be able to
+        // insert a new page until after our call to `zap_page_range`.
+        page = unsafe { PageInfo::take_page(info) };
+        vma_addr = inner.vma_addr;
+
+        // From this point on, we don't access this PageInfo or ShrinkablePageRange again, because
+        // they can be freed at any point after we unlock `lru_lock`.
+    }
+
+    // SAFETY: The lru lock is locked when this method is called.
+    unsafe { bindings::spin_unlock(lru_lock) };
+
+    if let Some(vma) = mmap_read.vma_lookup(vma_addr) {
+        let user_page_addr = vma_addr + (page_index << PAGE_SHIFT);
+        vma.zap_page_range(user_page_addr, PAGE_SIZE);
+    }
+
+    drop(mmap_read);
+    drop(mm);
+    drop(page);
+
+    // SAFETY: We just unlocked the lru lock, but it should be locked when we return.
+    unsafe { bindings::spin_lock(lru_lock) };
+
+    LRU_REMOVED_ENTRY
+}
diff --git a/rust/kernel/sync/lock.rs b/rust/kernel/sync/lock.rs
index 149a5259d431..8cf02edb6f4a 100644
--- a/rust/kernel/sync/lock.rs
+++ b/rust/kernel/sync/lock.rs
@@ -51,6 +51,14 @@ unsafe fn init(
     #[must_use]
     unsafe fn lock(ptr: *mut Self::State) -> Self::GuardState;
 
+    /// Tries to acquire the lock, making the caller its owner.
+    ///
+    /// # Safety
+    ///
+    /// Callers must ensure that [`Backend::init`] has been previously called.
+    #[must_use]
+    unsafe fn trylock(ptr: *mut Self::State) -> Option<Self::GuardState>;
+
     /// Releases the lock, giving up its ownership.
     ///
     /// # Safety
@@ -121,6 +129,22 @@ pub fn lock(&self) -> Guard<'_, T, B> {
         // SAFETY: The lock was just acquired.
         unsafe { Guard::new(self, state) }
     }
+
+    /// Tries to acquire the lock, giving the caller access to the protected data on success.
+    pub fn trylock(&self) -> Option<Guard<'_, T, B>> {
+        // SAFETY: The constructor of the type calls `init`, so the existence of the object proves
+        // that `init` was called.
+        let state = unsafe { B::trylock(self.state.get())? };
+        // SAFETY: The lock was just acquired.
+        unsafe { Some(Guard::new(self, state)) }
+    }
+
+    /// Get a raw pointer to the data without touching the lock.
+    ///
+    /// It is up to the user to make sure that the pointer is used correctly.
+    pub fn get_ptr(&self) -> *mut T {
+        self.data.get()
+    }
 }
 
 /// A lock guard.
diff --git a/rust/kernel/sync/lock/mutex.rs b/rust/kernel/sync/lock/mutex.rs
index 09276fedc091..0871d0034174 100644
--- a/rust/kernel/sync/lock/mutex.rs
+++ b/rust/kernel/sync/lock/mutex.rs
@@ -111,6 +111,16 @@ unsafe fn lock(ptr: *mut Self::State) -> Self::GuardState {
         unsafe { bindings::mutex_lock(ptr) };
     }
 
+    unsafe fn trylock(ptr: *mut Self::State) -> Option<Self::GuardState> {
+        // SAFETY: The safety requirements of this function ensure that `ptr` points to valid
+        // memory, and that it has been initialised before.
+        if unsafe { bindings::mutex_trylock(ptr) } != 0 {
+            Some(())
+        } else {
+            None
+        }
+    }
+
     unsafe fn unlock(ptr: *mut Self::State, _guard_state: &Self::GuardState) {
         // SAFETY: The safety requirements of this function ensure that `ptr` is valid and that the
         // caller is the owner of the mutex.
diff --git a/rust/kernel/sync/lock/spinlock.rs b/rust/kernel/sync/lock/spinlock.rs
index 91eb2c9e9123..64ff1fcf36c4 100644
--- a/rust/kernel/sync/lock/spinlock.rs
+++ b/rust/kernel/sync/lock/spinlock.rs
@@ -110,6 +110,16 @@ unsafe fn lock(ptr: *mut Self::State) -> Self::GuardState {
         unsafe { bindings::spin_lock(ptr) }
     }
 
+    unsafe fn trylock(ptr: *mut Self::State) -> Option<Self::GuardState> {
+        // SAFETY: The safety requirements of this function ensure that `ptr` points to valid
+        // memory, and that it has been initialised before.
+        if unsafe { bindings::spin_trylock(ptr) } != 0 {
+            Some(())
+        } else {
+            None
+        }
+    }
+
     unsafe fn unlock(ptr: *mut Self::State, _guard_state: &Self::GuardState) {
         // SAFETY: The safety requirements of this function ensure that `ptr` is valid and that the
         // caller is the owner of the mutex.

-- 
2.42.0.820.g83a721a137-goog
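
For readers unfamiliar with the trylock pattern added above, here is a
minimal userspace sketch. It is not kernel code: it assumes the standard
library's std::sync::Mutex as a stand-in for the kernel Lock type, and
try_increment is a hypothetical helper invented for illustration. It only
mirrors the shape of the new API, where a failed attempt yields None
instead of blocking.

```rust
use std::sync::Mutex;

/// Hypothetical helper: bump the counter only if the lock is free right now.
///
/// Returns `Some(new_value)` when the lock was acquired and `None` when it
/// was contended, which is the same `Option` shape as the `trylock` wrapper
/// in the patch. Unlike the kernel locks, `std::sync::Mutex` can also be
/// poisoned; this sketch treats that case as a failed attempt as well.
fn try_increment(counter: &Mutex<u32>) -> Option<u32> {
    // `try_lock` never blocks: it returns an error if the lock is held.
    let mut guard = counter.try_lock().ok()?;
    *guard += 1;
    Some(*guard)
    // The guard is dropped here, which releases the lock.
}
```

A caller can fall back to other work when None is returned, which is
presumably why a shrinker callback would prefer this over a blocking lock.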



* [PATCH RFC 20/20] binder: delete the C implementation
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (18 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 19/20] rust_binder: add vma shrinker Alice Ryhl
@ 2023-11-01 18:01 ` Alice Ryhl
  2023-11-01 18:15   ` Greg Kroah-Hartman
  2023-11-01 18:39   ` Carlos Llamas
  2023-11-01 18:34 ` [PATCH RFC 00/20] Setting up Binder for the future Carlos Llamas
  20 siblings, 2 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-01 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho
  Cc: linux-kernel, rust-for-linux, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

The ultimate goal of this project is to replace the C implementation.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/Kconfig        |   36 -
 drivers/android/binder.c       | 6630 ----------------------------------------
 drivers/android/binder_alloc.c | 1284 --------
 drivers/android/binderfs.c     |  827 -----
 4 files changed, 8777 deletions(-)
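
As a reading aid for the "Locking overview" comment in the deleted
binder.c (quoted in the diff that follows), the sketch below shows the
documented acquisition order: proc->outer_lock, then node->lock, then
proc->inner_lock. It is a userspace illustration only. The Proc and Node
structs, their fields, and queue_ref_work are hypothetical names, and
std::sync::Mutex stands in for the kernel spinlocks.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the parts of binder_proc guarded by two locks.
struct Proc {
    outer: Mutex<Vec<u32>>, // level 1: protects the refs
    inner: Mutex<Vec<u32>>, // level 3: protects the todo list
}

/// Hypothetical stand-in for binder_node.
struct Node {
    lock: Mutex<u32>, // level 2: protects the node's fields
}

/// Takes all three locks in the documented order and queues one work item.
/// Returns the length of the todo list after the push.
fn queue_ref_work(proc: &Proc, node: &Node) -> usize {
    let refs = proc.outer.lock().unwrap(); // 1) outer lock first
    let mut strong = node.lock.lock().unwrap(); // 2) then the node lock
    let mut todo = proc.inner.lock().unwrap(); // 3) inner lock last
    *strong += 1;
    todo.push(refs.len() as u32);
    todo.len()
    // Guards are dropped in reverse declaration order: inner, node, outer.
}
```

Always taking the locks in one fixed order is what rules out the ABBA
deadlock between two threads that need the same pair of locks.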

diff --git a/drivers/android/Kconfig b/drivers/android/Kconfig
index 82ed6ddabe1a..8b8badd87dc5 100644
--- a/drivers/android/Kconfig
+++ b/drivers/android/Kconfig
@@ -1,18 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 menu "Android"
 
-config ANDROID_BINDER_IPC
-	bool "Android Binder IPC Driver"
-	depends on MMU
-	default n
-	help
-	  Binder is used in Android for both communication between processes,
-	  and remote method invocation.
-
-	  This means one Android process can call a method/routine in another
-	  Android process, using Binder to identify, invoke and pass arguments
-	  between said processes.
-
 config ANDROID_BINDER_IPC_RUST
 	bool "Android Binder IPC Driver in Rust"
 	depends on MMU && RUST
@@ -24,18 +12,6 @@ config ANDROID_BINDER_IPC_RUST
 	  Android process, using Binder to identify, invoke and pass arguments
 	  between said processes.
 
-config ANDROID_BINDERFS
-	bool "Android Binderfs filesystem"
-	depends on ANDROID_BINDER_IPC
-	default n
-	help
-	  Binderfs is a pseudo-filesystem for the Android Binder IPC driver
-	  which can be mounted per-ipc namespace allowing to run multiple
-	  instances of Android.
-	  Each binderfs mount initially only contains a binder-control device.
-	  It can be used to dynamically allocate new binder IPC devices via
-	  ioctls.
-
 config ANDROID_BINDERFS_RUST
 	bool "Android Binderfs filesystem in Rust"
 	depends on ANDROID_BINDER_IPC_RUST
@@ -48,18 +24,6 @@ config ANDROID_BINDERFS_RUST
 	  It can be used to dynamically allocate new binder IPC devices via
 	  ioctls.
 
-config ANDROID_BINDER_DEVICES
-	string "Android Binder devices"
-	depends on ANDROID_BINDER_IPC
-	default "binder,hwbinder,vndbinder"
-	help
-	  Default value for the binder.devices parameter.
-
-	  The binder.devices parameter is a comma-separated list of strings
-	  that specifies the names of the binder device nodes that will be
-	  created. Each binder device has its own context manager, and is
-	  therefore logically separated from the other devices.
-
 config ANDROID_BINDER_DEVICES_RUST
 	string "Android Binder devices in Rust"
 	depends on ANDROID_BINDER_IPC_RUST
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
deleted file mode 100644
index 92128aae2d06..000000000000
--- a/drivers/android/binder.c
+++ /dev/null
@@ -1,6630 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/* binder.c
- *
- * Android IPC Subsystem
- *
- * Copyright (C) 2007-2008 Google, Inc.
- */
-
-/*
- * Locking overview
- *
- * There are 3 main spinlocks which must be acquired in the
- * order shown:
- *
- * 1) proc->outer_lock : protects binder_ref
- *    binder_proc_lock() and binder_proc_unlock() are
- *    used to acq/rel.
- * 2) node->lock : protects most fields of binder_node.
- *    binder_node_lock() and binder_node_unlock() are
- *    used to acq/rel
- * 3) proc->inner_lock : protects the thread and node lists
- *    (proc->threads, proc->waiting_threads, proc->nodes)
- *    and all todo lists associated with the binder_proc
- *    (proc->todo, thread->todo, proc->delivered_death and
- *    node->async_todo), as well as thread->transaction_stack
- *    binder_inner_proc_lock() and binder_inner_proc_unlock()
- *    are used to acq/rel
- *
- * Any lock under procA must never be nested under any lock at the same
- * level or below on procB.
- *
- * Functions that require a lock held on entry indicate which lock
- * in the suffix of the function name:
- *
- * foo_olocked() : requires node->outer_lock
- * foo_nlocked() : requires node->lock
- * foo_ilocked() : requires proc->inner_lock
- * foo_oilocked(): requires proc->outer_lock and proc->inner_lock
- * foo_nilocked(): requires node->lock and proc->inner_lock
- * ...
- */
-
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
-#include <linux/fdtable.h>
-#include <linux/file.h>
-#include <linux/freezer.h>
-#include <linux/fs.h>
-#include <linux/list.h>
-#include <linux/miscdevice.h>
-#include <linux/module.h>
-#include <linux/mutex.h>
-#include <linux/nsproxy.h>
-#include <linux/poll.h>
-#include <linux/debugfs.h>
-#include <linux/rbtree.h>
-#include <linux/sched/signal.h>
-#include <linux/sched/mm.h>
-#include <linux/seq_file.h>
-#include <linux/string.h>
-#include <linux/uaccess.h>
-#include <linux/pid_namespace.h>
-#include <linux/security.h>
-#include <linux/spinlock.h>
-#include <linux/ratelimit.h>
-#include <linux/syscalls.h>
-#include <linux/task_work.h>
-#include <linux/sizes.h>
-#include <linux/ktime.h>
-
-#include <uapi/linux/android/binder.h>
-
-#include <linux/cacheflush.h>
-
-#include "binder_internal.h"
-#include "binder_trace.h"
-
-static HLIST_HEAD(binder_deferred_list);
-static DEFINE_MUTEX(binder_deferred_lock);
-
-static HLIST_HEAD(binder_devices);
-static HLIST_HEAD(binder_procs);
-static DEFINE_MUTEX(binder_procs_lock);
-
-static HLIST_HEAD(binder_dead_nodes);
-static DEFINE_SPINLOCK(binder_dead_nodes_lock);
-
-static struct dentry *binder_debugfs_dir_entry_root;
-static struct dentry *binder_debugfs_dir_entry_proc;
-static atomic_t binder_last_id;
-
-static int proc_show(struct seq_file *m, void *unused);
-DEFINE_SHOW_ATTRIBUTE(proc);
-
-#define FORBIDDEN_MMAP_FLAGS                (VM_WRITE)
-
-enum {
-	BINDER_DEBUG_USER_ERROR             = 1U << 0,
-	BINDER_DEBUG_FAILED_TRANSACTION     = 1U << 1,
-	BINDER_DEBUG_DEAD_TRANSACTION       = 1U << 2,
-	BINDER_DEBUG_OPEN_CLOSE             = 1U << 3,
-	BINDER_DEBUG_DEAD_BINDER            = 1U << 4,
-	BINDER_DEBUG_DEATH_NOTIFICATION     = 1U << 5,
-	BINDER_DEBUG_READ_WRITE             = 1U << 6,
-	BINDER_DEBUG_USER_REFS              = 1U << 7,
-	BINDER_DEBUG_THREADS                = 1U << 8,
-	BINDER_DEBUG_TRANSACTION            = 1U << 9,
-	BINDER_DEBUG_TRANSACTION_COMPLETE   = 1U << 10,
-	BINDER_DEBUG_FREE_BUFFER            = 1U << 11,
-	BINDER_DEBUG_INTERNAL_REFS          = 1U << 12,
-	BINDER_DEBUG_PRIORITY_CAP           = 1U << 13,
-	BINDER_DEBUG_SPINLOCKS              = 1U << 14,
-};
-static uint32_t binder_debug_mask = BINDER_DEBUG_USER_ERROR |
-	BINDER_DEBUG_FAILED_TRANSACTION | BINDER_DEBUG_DEAD_TRANSACTION;
-module_param_named(debug_mask, binder_debug_mask, uint, 0644);
-
-char *binder_devices_param = CONFIG_ANDROID_BINDER_DEVICES;
-module_param_named(devices, binder_devices_param, charp, 0444);
-
-static DECLARE_WAIT_QUEUE_HEAD(binder_user_error_wait);
-static int binder_stop_on_user_error;
-
-static int binder_set_stop_on_user_error(const char *val,
-					 const struct kernel_param *kp)
-{
-	int ret;
-
-	ret = param_set_int(val, kp);
-	if (binder_stop_on_user_error < 2)
-		wake_up(&binder_user_error_wait);
-	return ret;
-}
-module_param_call(stop_on_user_error, binder_set_stop_on_user_error,
-	param_get_int, &binder_stop_on_user_error, 0644);
-
-static __printf(2, 3) void binder_debug(int mask, const char *format, ...)
-{
-	struct va_format vaf;
-	va_list args;
-
-	if (binder_debug_mask & mask) {
-		va_start(args, format);
-		vaf.va = &args;
-		vaf.fmt = format;
-		pr_info_ratelimited("%pV", &vaf);
-		va_end(args);
-	}
-}
-
-#define binder_txn_error(x...) \
-	binder_debug(BINDER_DEBUG_FAILED_TRANSACTION, x)
-
-static __printf(1, 2) void binder_user_error(const char *format, ...)
-{
-	struct va_format vaf;
-	va_list args;
-
-	if (binder_debug_mask & BINDER_DEBUG_USER_ERROR) {
-		va_start(args, format);
-		vaf.va = &args;
-		vaf.fmt = format;
-		pr_info_ratelimited("%pV", &vaf);
-		va_end(args);
-	}
-
-	if (binder_stop_on_user_error)
-		binder_stop_on_user_error = 2;
-}
-
-#define binder_set_extended_error(ee, _id, _command, _param) \
-	do { \
-		(ee)->id = _id; \
-		(ee)->command = _command; \
-		(ee)->param = _param; \
-	} while (0)
-
-#define to_flat_binder_object(hdr) \
-	container_of(hdr, struct flat_binder_object, hdr)
-
-#define to_binder_fd_object(hdr) container_of(hdr, struct binder_fd_object, hdr)
-
-#define to_binder_buffer_object(hdr) \
-	container_of(hdr, struct binder_buffer_object, hdr)
-
-#define to_binder_fd_array_object(hdr) \
-	container_of(hdr, struct binder_fd_array_object, hdr)
-
-static struct binder_stats binder_stats;
-
-static inline void binder_stats_deleted(enum binder_stat_types type)
-{
-	atomic_inc(&binder_stats.obj_deleted[type]);
-}
-
-static inline void binder_stats_created(enum binder_stat_types type)
-{
-	atomic_inc(&binder_stats.obj_created[type]);
-}
-
-struct binder_transaction_log_entry {
-	int debug_id;
-	int debug_id_done;
-	int call_type;
-	int from_proc;
-	int from_thread;
-	int target_handle;
-	int to_proc;
-	int to_thread;
-	int to_node;
-	int data_size;
-	int offsets_size;
-	int return_error_line;
-	uint32_t return_error;
-	uint32_t return_error_param;
-	char context_name[BINDERFS_MAX_NAME + 1];
-};
-
-struct binder_transaction_log {
-	atomic_t cur;
-	bool full;
-	struct binder_transaction_log_entry entry[32];
-};
-
-static struct binder_transaction_log binder_transaction_log;
-static struct binder_transaction_log binder_transaction_log_failed;
-
-static struct binder_transaction_log_entry *binder_transaction_log_add(
-	struct binder_transaction_log *log)
-{
-	struct binder_transaction_log_entry *e;
-	unsigned int cur = atomic_inc_return(&log->cur);
-
-	if (cur >= ARRAY_SIZE(log->entry))
-		log->full = true;
-	e = &log->entry[cur % ARRAY_SIZE(log->entry)];
-	WRITE_ONCE(e->debug_id_done, 0);
-	/*
-	 * write-barrier to synchronize access to e->debug_id_done.
-	 * We make sure the initialized 0 value is seen before
-	 * memset() other fields are zeroed by memset.
-	 */
-	smp_wmb();
-	memset(e, 0, sizeof(*e));
-	return e;
-}
-
-enum binder_deferred_state {
-	BINDER_DEFERRED_FLUSH        = 0x01,
-	BINDER_DEFERRED_RELEASE      = 0x02,
-};
-
-enum {
-	BINDER_LOOPER_STATE_REGISTERED  = 0x01,
-	BINDER_LOOPER_STATE_ENTERED     = 0x02,
-	BINDER_LOOPER_STATE_EXITED      = 0x04,
-	BINDER_LOOPER_STATE_INVALID     = 0x08,
-	BINDER_LOOPER_STATE_WAITING     = 0x10,
-	BINDER_LOOPER_STATE_POLL        = 0x20,
-};
-
-/**
- * binder_proc_lock() - Acquire outer lock for given binder_proc
- * @proc:         struct binder_proc to acquire
- *
- * Acquires proc->outer_lock. Used to protect binder_ref
- * structures associated with the given proc.
- */
-#define binder_proc_lock(proc) _binder_proc_lock(proc, __LINE__)
-static void
-_binder_proc_lock(struct binder_proc *proc, int line)
-	__acquires(&proc->outer_lock)
-{
-	binder_debug(BINDER_DEBUG_SPINLOCKS,
-		     "%s: line=%d\n", __func__, line);
-	spin_lock(&proc->outer_lock);
-}
-
-/**
- * binder_proc_unlock() - Release spinlock for given binder_proc
- * @proc:                struct binder_proc to acquire
- *
- * Release lock acquired via binder_proc_lock()
- */
-#define binder_proc_unlock(proc) _binder_proc_unlock(proc, __LINE__)
-static void
-_binder_proc_unlock(struct binder_proc *proc, int line)
-	__releases(&proc->outer_lock)
-{
-	binder_debug(BINDER_DEBUG_SPINLOCKS,
-		     "%s: line=%d\n", __func__, line);
-	spin_unlock(&proc->outer_lock);
-}
-
-/**
- * binder_inner_proc_lock() - Acquire inner lock for given binder_proc
- * @proc:         struct binder_proc to acquire
- *
- * Acquires proc->inner_lock. Used to protect todo lists
- */
-#define binder_inner_proc_lock(proc) _binder_inner_proc_lock(proc, __LINE__)
-static void
-_binder_inner_proc_lock(struct binder_proc *proc, int line)
-	__acquires(&proc->inner_lock)
-{
-	binder_debug(BINDER_DEBUG_SPINLOCKS,
-		     "%s: line=%d\n", __func__, line);
-	spin_lock(&proc->inner_lock);
-}
-
-/**
- * binder_inner_proc_unlock() - Release inner lock for given binder_proc
- * @proc:         struct binder_proc to acquire
- *
- * Release lock acquired via binder_inner_proc_lock()
- */
-#define binder_inner_proc_unlock(proc) _binder_inner_proc_unlock(proc, __LINE__)
-static void
-_binder_inner_proc_unlock(struct binder_proc *proc, int line)
-	__releases(&proc->inner_lock)
-{
-	binder_debug(BINDER_DEBUG_SPINLOCKS,
-		     "%s: line=%d\n", __func__, line);
-	spin_unlock(&proc->inner_lock);
-}
-
-/**
- * binder_node_lock() - Acquire spinlock for given binder_node
- * @node:         struct binder_node to acquire
- *
- * Acquires node->lock. Used to protect binder_node fields
- */
-#define binder_node_lock(node) _binder_node_lock(node, __LINE__)
-static void
-_binder_node_lock(struct binder_node *node, int line)
-	__acquires(&node->lock)
-{
-	binder_debug(BINDER_DEBUG_SPINLOCKS,
-		     "%s: line=%d\n", __func__, line);
-	spin_lock(&node->lock);
-}
-
-/**
- * binder_node_unlock() - Release spinlock for given binder_proc
- * @node:         struct binder_node to acquire
- *
- * Release lock acquired via binder_node_lock()
- */
-#define binder_node_unlock(node) _binder_node_unlock(node, __LINE__)
-static void
-_binder_node_unlock(struct binder_node *node, int line)
-	__releases(&node->lock)
-{
-	binder_debug(BINDER_DEBUG_SPINLOCKS,
-		     "%s: line=%d\n", __func__, line);
-	spin_unlock(&node->lock);
-}
-
-/**
- * binder_node_inner_lock() - Acquire node and inner locks
- * @node:         struct binder_node to acquire
- *
- * Acquires node->lock. If node->proc also acquires
- * proc->inner_lock. Used to protect binder_node fields
- */
-#define binder_node_inner_lock(node) _binder_node_inner_lock(node, __LINE__)
-static void
-_binder_node_inner_lock(struct binder_node *node, int line)
-	__acquires(&node->lock) __acquires(&node->proc->inner_lock)
-{
-	binder_debug(BINDER_DEBUG_SPINLOCKS,
-		     "%s: line=%d\n", __func__, line);
-	spin_lock(&node->lock);
-	if (node->proc)
-		binder_inner_proc_lock(node->proc);
-	else
-		/* annotation for sparse */
-		__acquire(&node->proc->inner_lock);
-}
-
-/**
- * binder_node_inner_unlock() - Release node and inner locks
- * @node:         struct binder_node to acquire
- *
- * Release lock acquired via binder_node_lock()
- */
-#define binder_node_inner_unlock(node) _binder_node_inner_unlock(node, __LINE__)
-static void
-_binder_node_inner_unlock(struct binder_node *node, int line)
-	__releases(&node->lock) __releases(&node->proc->inner_lock)
-{
-	struct binder_proc *proc = node->proc;
-
-	binder_debug(BINDER_DEBUG_SPINLOCKS,
-		     "%s: line=%d\n", __func__, line);
-	if (proc)
-		binder_inner_proc_unlock(proc);
-	else
-		/* annotation for sparse */
-		__release(&node->proc->inner_lock);
-	spin_unlock(&node->lock);
-}
-
-static bool binder_worklist_empty_ilocked(struct list_head *list)
-{
-	return list_empty(list);
-}
-
-/**
- * binder_worklist_empty() - Check if no items on the work list
- * @proc:       binder_proc associated with list
- * @list:	list to check
- *
- * Return: true if there are no items on list, else false
- */
-static bool binder_worklist_empty(struct binder_proc *proc,
-				  struct list_head *list)
-{
-	bool ret;
-
-	binder_inner_proc_lock(proc);
-	ret = binder_worklist_empty_ilocked(list);
-	binder_inner_proc_unlock(proc);
-	return ret;
-}
-
-/**
- * binder_enqueue_work_ilocked() - Add an item to the work list
- * @work:         struct binder_work to add to list
- * @target_list:  list to add work to
- *
- * Adds the work to the specified list. Asserts that work
- * is not already on a list.
- *
- * Requires the proc->inner_lock to be held.
- */
-static void
-binder_enqueue_work_ilocked(struct binder_work *work,
-			   struct list_head *target_list)
-{
-	BUG_ON(target_list == NULL);
-	BUG_ON(work->entry.next && !list_empty(&work->entry));
-	list_add_tail(&work->entry, target_list);
-}
-
-/**
- * binder_enqueue_deferred_thread_work_ilocked() - Add deferred thread work
- * @thread:       thread to queue work to
- * @work:         struct binder_work to add to list
- *
- * Adds the work to the todo list of the thread. Doesn't set the process_todo
- * flag, which means that (if it wasn't already set) the thread will go to
- * sleep without handling this work when it calls read.
- *
- * Requires the proc->inner_lock to be held.
- */
-static void
-binder_enqueue_deferred_thread_work_ilocked(struct binder_thread *thread,
-					    struct binder_work *work)
-{
-	WARN_ON(!list_empty(&thread->waiting_thread_node));
-	binder_enqueue_work_ilocked(work, &thread->todo);
-}
-
-/**
- * binder_enqueue_thread_work_ilocked() - Add an item to the thread work list
- * @thread:       thread to queue work to
- * @work:         struct binder_work to add to list
- *
- * Adds the work to the todo list of the thread, and enables processing
- * of the todo queue.
- *
- * Requires the proc->inner_lock to be held.
- */
-static void
-binder_enqueue_thread_work_ilocked(struct binder_thread *thread,
-				   struct binder_work *work)
-{
-	WARN_ON(!list_empty(&thread->waiting_thread_node));
-	binder_enqueue_work_ilocked(work, &thread->todo);
-	thread->process_todo = true;
-}
-
-/**
- * binder_enqueue_thread_work() - Add an item to the thread work list
- * @thread:       thread to queue work to
- * @work:         struct binder_work to add to list
- *
- * Adds the work to the todo list of the thread, and enables processing
- * of the todo queue.
- */
-static void
-binder_enqueue_thread_work(struct binder_thread *thread,
-			   struct binder_work *work)
-{
-	binder_inner_proc_lock(thread->proc);
-	binder_enqueue_thread_work_ilocked(thread, work);
-	binder_inner_proc_unlock(thread->proc);
-}
-
-static void
-binder_dequeue_work_ilocked(struct binder_work *work)
-{
-	list_del_init(&work->entry);
-}
-
-/**
- * binder_dequeue_work() - Removes an item from the work list
- * @proc:         binder_proc associated with list
- * @work:         struct binder_work to remove from list
- *
- * Removes the specified work item from whatever list it is on.
- * Can safely be called if work is not on any list.
- */
-static void
-binder_dequeue_work(struct binder_proc *proc, struct binder_work *work)
-{
-	binder_inner_proc_lock(proc);
-	binder_dequeue_work_ilocked(work);
-	binder_inner_proc_unlock(proc);
-}
-
-static struct binder_work *binder_dequeue_work_head_ilocked(
-					struct list_head *list)
-{
-	struct binder_work *w;
-
-	w = list_first_entry_or_null(list, struct binder_work, entry);
-	if (w)
-		list_del_init(&w->entry);
-	return w;
-}
-
-static void
-binder_defer_work(struct binder_proc *proc, enum binder_deferred_state defer);
-static void binder_free_thread(struct binder_thread *thread);
-static void binder_free_proc(struct binder_proc *proc);
-static void binder_inc_node_tmpref_ilocked(struct binder_node *node);
-
-static bool binder_has_work_ilocked(struct binder_thread *thread,
-				    bool do_proc_work)
-{
-	return thread->process_todo ||
-		thread->looper_need_return ||
-		(do_proc_work &&
-		 !binder_worklist_empty_ilocked(&thread->proc->todo));
-}
-
-static bool binder_has_work(struct binder_thread *thread, bool do_proc_work)
-{
-	bool has_work;
-
-	binder_inner_proc_lock(thread->proc);
-	has_work = binder_has_work_ilocked(thread, do_proc_work);
-	binder_inner_proc_unlock(thread->proc);
-
-	return has_work;
-}
-
-static bool binder_available_for_proc_work_ilocked(struct binder_thread *thread)
-{
-	return !thread->transaction_stack &&
-		binder_worklist_empty_ilocked(&thread->todo) &&
-		(thread->looper & (BINDER_LOOPER_STATE_ENTERED |
-				   BINDER_LOOPER_STATE_REGISTERED));
-}
-
-static void binder_wakeup_poll_threads_ilocked(struct binder_proc *proc,
-					       bool sync)
-{
-	struct rb_node *n;
-	struct binder_thread *thread;
-
-	for (n = rb_first(&proc->threads); n != NULL; n = rb_next(n)) {
-		thread = rb_entry(n, struct binder_thread, rb_node);
-		if (thread->looper & BINDER_LOOPER_STATE_POLL &&
-		    binder_available_for_proc_work_ilocked(thread)) {
-			if (sync)
-				wake_up_interruptible_sync(&thread->wait);
-			else
-				wake_up_interruptible(&thread->wait);
-		}
-	}
-}
-
-/**
- * binder_select_thread_ilocked() - selects a thread for doing proc work.
- * @proc:	process to select a thread from
- *
- * Note that calling this function moves the thread off the waiting_threads
- * list, so it can only be woken up by the caller of this function, or a
- * signal. Therefore, callers *should* always wake up the thread this function
- * returns.
- *
- * Return:	If there's a thread currently waiting for process work,
- *		returns that thread. Otherwise returns NULL.
- */
-static struct binder_thread *
-binder_select_thread_ilocked(struct binder_proc *proc)
-{
-	struct binder_thread *thread;
-
-	assert_spin_locked(&proc->inner_lock);
-	thread = list_first_entry_or_null(&proc->waiting_threads,
-					  struct binder_thread,
-					  waiting_thread_node);
-
-	if (thread)
-		list_del_init(&thread->waiting_thread_node);
-
-	return thread;
-}
-
-/**
- * binder_wakeup_thread_ilocked() - wakes up a thread for doing proc work.
- * @proc:	process to wake up a thread in
- * @thread:	specific thread to wake-up (may be NULL)
- * @sync:	whether to do a synchronous wake-up
- *
- * This function wakes up a thread in the @proc process.
- * The caller may provide a specific thread to wake-up in
- * the @thread parameter. If @thread is NULL, this function
- * will wake up threads that have called poll().
- *
- * Note that for this function to work as expected, callers
- * should first call binder_select_thread() to find a thread
- * to handle the work (if they don't have a thread already),
- * and pass the result into the @thread parameter.
- */
-static void binder_wakeup_thread_ilocked(struct binder_proc *proc,
-					 struct binder_thread *thread,
-					 bool sync)
-{
-	assert_spin_locked(&proc->inner_lock);
-
-	if (thread) {
-		if (sync)
-			wake_up_interruptible_sync(&thread->wait);
-		else
-			wake_up_interruptible(&thread->wait);
-		return;
-	}
-
-	/* Didn't find a thread waiting for proc work; this can happen
-	 * in two scenarios:
-	 * 1. All threads are busy handling transactions
-	 *    In that case, one of those threads should call back into
-	 *    the kernel driver soon and pick up this work.
-	 * 2. Threads are using the (e)poll interface, in which case
-	 *    they may be blocked on the waitqueue without having been
-	 *    added to waiting_threads. For this case, we just iterate
-	 *    over all threads not handling transaction work, and
-	 *    wake them all up. We wake all because we don't know whether
-	 *    a thread that called into (e)poll is handling non-binder
-	 *    work currently.
-	 */
-	binder_wakeup_poll_threads_ilocked(proc, sync);
-}
-
-static void binder_wakeup_proc_ilocked(struct binder_proc *proc)
-{
-	struct binder_thread *thread = binder_select_thread_ilocked(proc);
-
-	binder_wakeup_thread_ilocked(proc, thread, /* sync = */false);
-}
-
-static void binder_set_nice(long nice)
-{
-	long min_nice;
-
-	if (can_nice(current, nice)) {
-		set_user_nice(current, nice);
-		return;
-	}
-	min_nice = rlimit_to_nice(rlimit(RLIMIT_NICE));
-	binder_debug(BINDER_DEBUG_PRIORITY_CAP,
-		     "%d: nice value %ld not allowed use %ld instead\n",
-		      current->pid, nice, min_nice);
-	set_user_nice(current, min_nice);
-	if (min_nice <= MAX_NICE)
-		return;
-	binder_user_error("%d RLIMIT_NICE not set\n", current->pid);
-}
-
-static struct binder_node *binder_get_node_ilocked(struct binder_proc *proc,
-						   binder_uintptr_t ptr)
-{
-	struct rb_node *n = proc->nodes.rb_node;
-	struct binder_node *node;
-
-	assert_spin_locked(&proc->inner_lock);
-
-	while (n) {
-		node = rb_entry(n, struct binder_node, rb_node);
-
-		if (ptr < node->ptr)
-			n = n->rb_left;
-		else if (ptr > node->ptr)
-			n = n->rb_right;
-		else {
-			/*
-			 * take an implicit weak reference
-			 * to ensure node stays alive until
-			 * call to binder_put_node()
-			 */
-			binder_inc_node_tmpref_ilocked(node);
-			return node;
-		}
-	}
-	return NULL;
-}
-
-static struct binder_node *binder_get_node(struct binder_proc *proc,
-					   binder_uintptr_t ptr)
-{
-	struct binder_node *node;
-
-	binder_inner_proc_lock(proc);
-	node = binder_get_node_ilocked(proc, ptr);
-	binder_inner_proc_unlock(proc);
-	return node;
-}
-
-static struct binder_node *binder_init_node_ilocked(
-						struct binder_proc *proc,
-						struct binder_node *new_node,
-						struct flat_binder_object *fp)
-{
-	struct rb_node **p = &proc->nodes.rb_node;
-	struct rb_node *parent = NULL;
-	struct binder_node *node;
-	binder_uintptr_t ptr = fp ? fp->binder : 0;
-	binder_uintptr_t cookie = fp ? fp->cookie : 0;
-	__u32 flags = fp ? fp->flags : 0;
-
-	assert_spin_locked(&proc->inner_lock);
-
-	while (*p) {
-
-		parent = *p;
-		node = rb_entry(parent, struct binder_node, rb_node);
-
-		if (ptr < node->ptr)
-			p = &(*p)->rb_left;
-		else if (ptr > node->ptr)
-			p = &(*p)->rb_right;
-		else {
-			/*
-			 * A matching node is already in
-			 * the rb tree. Abandon the init
-			 * and return it.
-			 */
-			binder_inc_node_tmpref_ilocked(node);
-			return node;
-		}
-	}
-	node = new_node;
-	binder_stats_created(BINDER_STAT_NODE);
-	node->tmp_refs++;
-	rb_link_node(&node->rb_node, parent, p);
-	rb_insert_color(&node->rb_node, &proc->nodes);
-	node->debug_id = atomic_inc_return(&binder_last_id);
-	node->proc = proc;
-	node->ptr = ptr;
-	node->cookie = cookie;
-	node->work.type = BINDER_WORK_NODE;
-	node->min_priority = flags & FLAT_BINDER_FLAG_PRIORITY_MASK;
-	node->accept_fds = !!(flags & FLAT_BINDER_FLAG_ACCEPTS_FDS);
-	node->txn_security_ctx = !!(flags & FLAT_BINDER_FLAG_TXN_SECURITY_CTX);
-	spin_lock_init(&node->lock);
-	INIT_LIST_HEAD(&node->work.entry);
-	INIT_LIST_HEAD(&node->async_todo);
-	binder_debug(BINDER_DEBUG_INTERNAL_REFS,
-		     "%d:%d node %d u%016llx c%016llx created\n",
-		     proc->pid, current->pid, node->debug_id,
-		     (u64)node->ptr, (u64)node->cookie);
-
-	return node;
-}
-
-static struct binder_node *binder_new_node(struct binder_proc *proc,
-					   struct flat_binder_object *fp)
-{
-	struct binder_node *node;
-	struct binder_node *new_node = kzalloc(sizeof(*node), GFP_KERNEL);
-
-	if (!new_node)
-		return NULL;
-	binder_inner_proc_lock(proc);
-	node = binder_init_node_ilocked(proc, new_node, fp);
-	binder_inner_proc_unlock(proc);
-	if (node != new_node)
-		/*
-		 * The node was already added by another thread
-		 */
-		kfree(new_node);
-
-	return node;
-}
-
-static void binder_free_node(struct binder_node *node)
-{
-	kfree(node);
-	binder_stats_deleted(BINDER_STAT_NODE);
-}
-
-static int binder_inc_node_nilocked(struct binder_node *node, int strong,
-				    int internal,
-				    struct list_head *target_list)
-{
-	struct binder_proc *proc = node->proc;
-
-	assert_spin_locked(&node->lock);
-	if (proc)
-		assert_spin_locked(&proc->inner_lock);
-	if (strong) {
-		if (internal) {
-			if (target_list == NULL &&
-			    node->internal_strong_refs == 0 &&
-			    !(node->proc &&
-			      node == node->proc->context->binder_context_mgr_node &&
-			      node->has_strong_ref)) {
-				pr_err("invalid inc strong node for %d\n",
-					node->debug_id);
-				return -EINVAL;
-			}
-			node->internal_strong_refs++;
-		} else
-			node->local_strong_refs++;
-		if (!node->has_strong_ref && target_list) {
-			struct binder_thread *thread = container_of(target_list,
-						    struct binder_thread, todo);
-			binder_dequeue_work_ilocked(&node->work);
-			BUG_ON(&thread->todo != target_list);
-			binder_enqueue_deferred_thread_work_ilocked(thread,
-								   &node->work);
-		}
-	} else {
-		if (!internal)
-			node->local_weak_refs++;
-		if (!node->has_weak_ref && list_empty(&node->work.entry)) {
-			if (target_list == NULL) {
-				pr_err("invalid inc weak node for %d\n",
-					node->debug_id);
-				return -EINVAL;
-			}
-			/*
-			 * See comment above
-			 */
-			binder_enqueue_work_ilocked(&node->work, target_list);
-		}
-	}
-	return 0;
-}
-
-static int binder_inc_node(struct binder_node *node, int strong, int internal,
-			   struct list_head *target_list)
-{
-	int ret;
-
-	binder_node_inner_lock(node);
-	ret = binder_inc_node_nilocked(node, strong, internal, target_list);
-	binder_node_inner_unlock(node);
-
-	return ret;
-}
-
-static bool binder_dec_node_nilocked(struct binder_node *node,
-				     int strong, int internal)
-{
-	struct binder_proc *proc = node->proc;
-
-	assert_spin_locked(&node->lock);
-	if (proc)
-		assert_spin_locked(&proc->inner_lock);
-	if (strong) {
-		if (internal)
-			node->internal_strong_refs--;
-		else
-			node->local_strong_refs--;
-		if (node->local_strong_refs || node->internal_strong_refs)
-			return false;
-	} else {
-		if (!internal)
-			node->local_weak_refs--;
-		if (node->local_weak_refs || node->tmp_refs ||
-				!hlist_empty(&node->refs))
-			return false;
-	}
-
-	if (proc && (node->has_strong_ref || node->has_weak_ref)) {
-		if (list_empty(&node->work.entry)) {
-			binder_enqueue_work_ilocked(&node->work, &proc->todo);
-			binder_wakeup_proc_ilocked(proc);
-		}
-	} else {
-		if (hlist_empty(&node->refs) && !node->local_strong_refs &&
-		    !node->local_weak_refs && !node->tmp_refs) {
-			if (proc) {
-				binder_dequeue_work_ilocked(&node->work);
-				rb_erase(&node->rb_node, &proc->nodes);
-				binder_debug(BINDER_DEBUG_INTERNAL_REFS,
-					     "refless node %d deleted\n",
-					     node->debug_id);
-			} else {
-				BUG_ON(!list_empty(&node->work.entry));
-				spin_lock(&binder_dead_nodes_lock);
-				/*
-				 * tmp_refs could have changed so
-				 * check it again
-				 */
-				if (node->tmp_refs) {
-					spin_unlock(&binder_dead_nodes_lock);
-					return false;
-				}
-				hlist_del(&node->dead_node);
-				spin_unlock(&binder_dead_nodes_lock);
-				binder_debug(BINDER_DEBUG_INTERNAL_REFS,
-					     "dead node %d deleted\n",
-					     node->debug_id);
-			}
-			return true;
-		}
-	}
-	return false;
-}
-
-static void binder_dec_node(struct binder_node *node, int strong, int internal)
-{
-	bool free_node;
-
-	binder_node_inner_lock(node);
-	free_node = binder_dec_node_nilocked(node, strong, internal);
-	binder_node_inner_unlock(node);
-	if (free_node)
-		binder_free_node(node);
-}
-
-static void binder_inc_node_tmpref_ilocked(struct binder_node *node)
-{
-	/*
-	 * No call to binder_inc_node() is needed since we
-	 * don't need to inform userspace of any changes to
-	 * tmp_refs
-	 */
-	node->tmp_refs++;
-}
-
-/**
- * binder_inc_node_tmpref() - take a temporary reference on node
- * @node:	node to reference
- *
- * Take reference on node to prevent the node from being freed
- * while referenced only by a local variable. The inner lock is
- * needed to serialize with the node work on the queue (which
- * isn't needed after the node is dead). If the node is dead
- * (node->proc is NULL), use binder_dead_nodes_lock to protect
- * node->tmp_refs against dead-node-only cases where the node
- * lock cannot be acquired (eg traversing the dead node list to
- * print nodes)
- */
-static void binder_inc_node_tmpref(struct binder_node *node)
-{
-	binder_node_lock(node);
-	if (node->proc)
-		binder_inner_proc_lock(node->proc);
-	else
-		spin_lock(&binder_dead_nodes_lock);
-	binder_inc_node_tmpref_ilocked(node);
-	if (node->proc)
-		binder_inner_proc_unlock(node->proc);
-	else
-		spin_unlock(&binder_dead_nodes_lock);
-	binder_node_unlock(node);
-}
-
-/**
- * binder_dec_node_tmpref() - remove a temporary reference on node
- * @node:	node to reference
- *
- * Release temporary reference on node taken via binder_inc_node_tmpref()
- */
-static void binder_dec_node_tmpref(struct binder_node *node)
-{
-	bool free_node;
-
-	binder_node_inner_lock(node);
-	if (!node->proc)
-		spin_lock(&binder_dead_nodes_lock);
-	else
-		__acquire(&binder_dead_nodes_lock);
-	node->tmp_refs--;
-	BUG_ON(node->tmp_refs < 0);
-	if (!node->proc)
-		spin_unlock(&binder_dead_nodes_lock);
-	else
-		__release(&binder_dead_nodes_lock);
-	/*
-	 * Call binder_dec_node() to check if all refcounts are 0
-	 * and cleanup is needed. Calling with strong=0 and internal=1
-	 * causes no actual reference to be released in binder_dec_node().
-	 * If that changes, a change is needed here too.
-	 */
-	free_node = binder_dec_node_nilocked(node, 0, 1);
-	binder_node_inner_unlock(node);
-	if (free_node)
-		binder_free_node(node);
-}
-
-static void binder_put_node(struct binder_node *node)
-{
-	binder_dec_node_tmpref(node);
-}
-
-static struct binder_ref *binder_get_ref_olocked(struct binder_proc *proc,
-						 u32 desc, bool need_strong_ref)
-{
-	struct rb_node *n = proc->refs_by_desc.rb_node;
-	struct binder_ref *ref;
-
-	while (n) {
-		ref = rb_entry(n, struct binder_ref, rb_node_desc);
-
-		if (desc < ref->data.desc) {
-			n = n->rb_left;
-		} else if (desc > ref->data.desc) {
-			n = n->rb_right;
-		} else if (need_strong_ref && !ref->data.strong) {
-			binder_user_error("tried to use weak ref as strong ref\n");
-			return NULL;
-		} else {
-			return ref;
-		}
-	}
-	return NULL;
-}
-
-/**
- * binder_get_ref_for_node_olocked() - get the ref associated with given node
- * @proc:	binder_proc that owns the ref
- * @node:	binder_node of target
- * @new_ref:	newly allocated binder_ref to be initialized or %NULL
- *
- * Look up the ref for the given node and return it if it exists
- *
- * If it doesn't exist and the caller provides a newly allocated
- * ref, initialize the fields of the newly allocated ref and insert
- * into the given proc rb_trees and node refs list.
- *
- * Return:	the ref for node. It is possible that another thread
- *		allocated/initialized the ref first in which case the
- *		returned ref would be different than the passed-in
- *		new_ref. new_ref must be kfree'd by the caller in
- *		this case.
- */
-static struct binder_ref *binder_get_ref_for_node_olocked(
-					struct binder_proc *proc,
-					struct binder_node *node,
-					struct binder_ref *new_ref)
-{
-	struct binder_context *context = proc->context;
-	struct rb_node **p = &proc->refs_by_node.rb_node;
-	struct rb_node *parent = NULL;
-	struct binder_ref *ref;
-	struct rb_node *n;
-
-	while (*p) {
-		parent = *p;
-		ref = rb_entry(parent, struct binder_ref, rb_node_node);
-
-		if (node < ref->node)
-			p = &(*p)->rb_left;
-		else if (node > ref->node)
-			p = &(*p)->rb_right;
-		else
-			return ref;
-	}
-	if (!new_ref)
-		return NULL;
-
-	binder_stats_created(BINDER_STAT_REF);
-	new_ref->data.debug_id = atomic_inc_return(&binder_last_id);
-	new_ref->proc = proc;
-	new_ref->node = node;
-	rb_link_node(&new_ref->rb_node_node, parent, p);
-	rb_insert_color(&new_ref->rb_node_node, &proc->refs_by_node);
-
-	new_ref->data.desc = (node == context->binder_context_mgr_node) ? 0 : 1;
-	for (n = rb_first(&proc->refs_by_desc); n != NULL; n = rb_next(n)) {
-		ref = rb_entry(n, struct binder_ref, rb_node_desc);
-		if (ref->data.desc > new_ref->data.desc)
-			break;
-		new_ref->data.desc = ref->data.desc + 1;
-	}
-
-	p = &proc->refs_by_desc.rb_node;
-	while (*p) {
-		parent = *p;
-		ref = rb_entry(parent, struct binder_ref, rb_node_desc);
-
-		if (new_ref->data.desc < ref->data.desc)
-			p = &(*p)->rb_left;
-		else if (new_ref->data.desc > ref->data.desc)
-			p = &(*p)->rb_right;
-		else
-			BUG();
-	}
-	rb_link_node(&new_ref->rb_node_desc, parent, p);
-	rb_insert_color(&new_ref->rb_node_desc, &proc->refs_by_desc);
-
-	binder_node_lock(node);
-	hlist_add_head(&new_ref->node_entry, &node->refs);
-
-	binder_debug(BINDER_DEBUG_INTERNAL_REFS,
-		     "%d new ref %d desc %d for node %d\n",
-		      proc->pid, new_ref->data.debug_id, new_ref->data.desc,
-		      node->debug_id);
-	binder_node_unlock(node);
-	return new_ref;
-}
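
A note on the descriptor assignment in binder_get_ref_for_node_olocked() above: the walk over refs_by_desc picks the lowest unused handle, starting at 0 for the context manager node and at 1 for every other node. Below is a minimal userspace sketch of that scan, assuming the descriptors already in use are available as an ascending array. The name lowest_free_desc() and its parameters are hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the descriptor scan.
 * @used:  descriptors already taken, sorted ascending (the order an
 *         in-order walk of an rb-tree keyed by descriptor would yield)
 * @n:     number of entries in @used
 * @start: 0 for the context manager node, 1 for any other node
 */
static uint32_t lowest_free_desc(const uint32_t *used, size_t n, uint32_t start)
{
	uint32_t desc = start;
	size_t i;

	assert(n == 0 || used != NULL);

	for (i = 0; i < n; i++) {
		if (used[i] > desc)
			break;		/* gap found below used[i] */
		desc = used[i] + 1;	/* slot taken, try the next one */
	}
	return desc;
}
```

With descriptors {0, 1, 3} in use, the scan yields 2 for either starting value. The scan is linear in the number of refs held by the process.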
-
-static void binder_cleanup_ref_olocked(struct binder_ref *ref)
-{
-	bool delete_node = false;
-
-	binder_debug(BINDER_DEBUG_INTERNAL_REFS,
-		     "%d delete ref %d desc %d for node %d\n",
-		      ref->proc->pid, ref->data.debug_id, ref->data.desc,
-		      ref->node->debug_id);
-
-	rb_erase(&ref->rb_node_desc, &ref->proc->refs_by_desc);
-	rb_erase(&ref->rb_node_node, &ref->proc->refs_by_node);
-
-	binder_node_inner_lock(ref->node);
-	if (ref->data.strong)
-		binder_dec_node_nilocked(ref->node, 1, 1);
-
-	hlist_del(&ref->node_entry);
-	delete_node = binder_dec_node_nilocked(ref->node, 0, 1);
-	binder_node_inner_unlock(ref->node);
-	/*
-	 * Clear ref->node unless we want the caller to free the node
-	 */
-	if (!delete_node) {
-		/*
-		 * The caller uses ref->node to determine
-		 * whether the node needs to be freed. Clear
-		 * it since the node is still alive.
-		 */
-		ref->node = NULL;
-	}
-
-	if (ref->death) {
-		binder_debug(BINDER_DEBUG_DEAD_BINDER,
-			     "%d delete ref %d desc %d has death notification\n",
-			      ref->proc->pid, ref->data.debug_id,
-			      ref->data.desc);
-		binder_dequeue_work(ref->proc, &ref->death->work);
-		binder_stats_deleted(BINDER_STAT_DEATH);
-	}
-	binder_stats_deleted(BINDER_STAT_REF);
-}
-
-/**
- * binder_inc_ref_olocked() - increment the ref for given handle
- * @ref:         ref to be incremented
- * @strong:      if true, strong increment, else weak
- * @target_list: list to queue node work on
- *
- * Increment the ref. @ref->proc->outer_lock must be held on entry
- *
- * Return: 0, if successful, else errno
- */
-static int binder_inc_ref_olocked(struct binder_ref *ref, int strong,
-				  struct list_head *target_list)
-{
-	int ret;
-
-	if (strong) {
-		if (ref->data.strong == 0) {
-			ret = binder_inc_node(ref->node, 1, 1, target_list);
-			if (ret)
-				return ret;
-		}
-		ref->data.strong++;
-	} else {
-		if (ref->data.weak == 0) {
-			ret = binder_inc_node(ref->node, 0, 1, target_list);
-			if (ret)
-				return ret;
-		}
-		ref->data.weak++;
-	}
-	return 0;
-}
-
-/**
- * binder_dec_ref_olocked() - dec the ref for given handle
- * @ref:	ref to be decremented
- * @strong:	if true, strong decrement, else weak
- *
- * Decrement the ref.
- *
- * Return: %true if ref is cleaned up and ready to be freed.
- */
-static bool binder_dec_ref_olocked(struct binder_ref *ref, int strong)
-{
-	if (strong) {
-		if (ref->data.strong == 0) {
-			binder_user_error("%d invalid dec strong, ref %d desc %d s %d w %d\n",
-					  ref->proc->pid, ref->data.debug_id,
-					  ref->data.desc, ref->data.strong,
-					  ref->data.weak);
-			return false;
-		}
-		ref->data.strong--;
-		if (ref->data.strong == 0)
-			binder_dec_node(ref->node, strong, 1);
-	} else {
-		if (ref->data.weak == 0) {
-			binder_user_error("%d invalid dec weak, ref %d desc %d s %d w %d\n",
-					  ref->proc->pid, ref->data.debug_id,
-					  ref->data.desc, ref->data.strong,
-					  ref->data.weak);
-			return false;
-		}
-		ref->data.weak--;
-	}
-	if (ref->data.strong == 0 && ref->data.weak == 0) {
-		binder_cleanup_ref_olocked(ref);
-		return true;
-	}
-	return false;
-}
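
The pairing of binder_inc_ref_olocked() and binder_dec_ref_olocked() above means a ref forwards only its 0 -> 1 and 1 -> 0 strong transitions to the node, so the node sees at most one internal strong count per ref however many times userspace increments the handle. Below is a simplified userspace sketch of that bookkeeping. It assumes no locking, no error reporting and no work queueing, and the model_* and ref_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, stripped-down stand-ins for binder_node and binder_ref. */
struct model_node {
	int internal_strong_refs;
};

struct model_ref {
	struct model_node *node;
	int strong;
	int weak;
};

static void ref_inc(struct model_ref *ref, bool strong)
{
	assert(ref && ref->node);
	if (strong) {
		/* Only the first strong count is forwarded to the node. */
		if (ref->strong == 0)
			ref->node->internal_strong_refs++;
		ref->strong++;
	} else {
		ref->weak++;
	}
}

/* Returns true once the ref holds no counts and can be freed. */
static bool ref_dec(struct model_ref *ref, bool strong)
{
	assert(ref && ref->node);
	if (strong) {
		if (ref->strong == 0)
			return false;	/* invalid decrement, ignored */
		ref->strong--;
		/* Only the last strong count is forwarded to the node. */
		if (ref->strong == 0)
			ref->node->internal_strong_refs--;
	} else {
		if (ref->weak == 0)
			return false;	/* invalid decrement, ignored */
		ref->weak--;
	}
	return ref->strong == 0 && ref->weak == 0;
}
```

In this model, two strong increments followed by two strong decrements leave the node's internal strong count at 0, and the ref becomes freeable only after its weak count has also dropped to 0.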
-
-/**
- * binder_get_node_from_ref() - get the node from the given proc/desc
- * @proc:	proc containing the ref
- * @desc:	the handle associated with the ref
- * @need_strong_ref: if true, only return node if ref is strong
- * @rdata:	the id/refcount data for the ref
- *
- * Given a proc and ref handle, return the associated binder_node
- *
- * Return: a binder_node or NULL if not found or not strong when strong required
- */
-static struct binder_node *binder_get_node_from_ref(
-		struct binder_proc *proc,
-		u32 desc, bool need_strong_ref,
-		struct binder_ref_data *rdata)
-{
-	struct binder_node *node;
-	struct binder_ref *ref;
-
-	binder_proc_lock(proc);
-	ref = binder_get_ref_olocked(proc, desc, need_strong_ref);
-	if (!ref)
-		goto err_no_ref;
-	node = ref->node;
-	/*
-	 * Take an implicit reference on the node to ensure
-	 * it stays alive until the call to binder_put_node()
-	 */
-	binder_inc_node_tmpref(node);
-	if (rdata)
-		*rdata = ref->data;
-	binder_proc_unlock(proc);
-
-	return node;
-
-err_no_ref:
-	binder_proc_unlock(proc);
-	return NULL;
-}
-
-/**
- * binder_free_ref() - free the binder_ref
- * @ref:	ref to free
- *
- * Free the binder_ref. Free the binder_node indicated by ref->node
- * (if non-NULL) and the binder_ref_death indicated by ref->death.
- */
-static void binder_free_ref(struct binder_ref *ref)
-{
-	if (ref->node)
-		binder_free_node(ref->node);
-	kfree(ref->death);
-	kfree(ref);
-}
-
-/**
- * binder_update_ref_for_handle() - inc/dec the ref for given handle
- * @proc:	proc containing the ref
- * @desc:	the handle associated with the ref
- * @increment:	true=inc reference, false=dec reference
- * @strong:	true=strong reference, false=weak reference
- * @rdata:	the id/refcount data for the ref
- *
- * Given a proc and ref handle, increment or decrement the ref
- * according to "increment" arg.
- *
- * Return: 0 if successful, else errno
- */
-static int binder_update_ref_for_handle(struct binder_proc *proc,
-		uint32_t desc, bool increment, bool strong,
-		struct binder_ref_data *rdata)
-{
-	int ret = 0;
-	struct binder_ref *ref;
-	bool delete_ref = false;
-
-	binder_proc_lock(proc);
-	ref = binder_get_ref_olocked(proc, desc, strong);
-	if (!ref) {
-		ret = -EINVAL;
-		goto err_no_ref;
-	}
-	if (increment)
-		ret = binder_inc_ref_olocked(ref, strong, NULL);
-	else
-		delete_ref = binder_dec_ref_olocked(ref, strong);
-
-	if (rdata)
-		*rdata = ref->data;
-	binder_proc_unlock(proc);
-
-	if (delete_ref)
-		binder_free_ref(ref);
-	return ret;
-
-err_no_ref:
-	binder_proc_unlock(proc);
-	return ret;
-}
-
-/**
- * binder_dec_ref_for_handle() - dec the ref for given handle
- * @proc:	proc containing the ref
- * @desc:	the handle associated with the ref
- * @strong:	true=strong reference, false=weak reference
- * @rdata:	the id/refcount data for the ref
- *
- * Just calls binder_update_ref_for_handle() to decrement the ref.
- *
- * Return: 0 if successful, else errno
- */
-static int binder_dec_ref_for_handle(struct binder_proc *proc,
-		uint32_t desc, bool strong, struct binder_ref_data *rdata)
-{
-	return binder_update_ref_for_handle(proc, desc, false, strong, rdata);
-}
-
-
-/**
- * binder_inc_ref_for_node() - increment the ref for given proc/node
- * @proc:	 proc containing the ref
- * @node:	 target node
- * @strong:	 true=strong reference, false=weak reference
- * @target_list: worklist to use if node is incremented
- * @rdata:	 the id/refcount data for the ref
- *
- * Given a proc and node, increment the ref. Create the ref if it
- * doesn't already exist
- *
- * Return: 0 if successful, else errno
- */
-static int binder_inc_ref_for_node(struct binder_proc *proc,
-			struct binder_node *node,
-			bool strong,
-			struct list_head *target_list,
-			struct binder_ref_data *rdata)
-{
-	struct binder_ref *ref;
-	struct binder_ref *new_ref = NULL;
-	int ret = 0;
-
-	binder_proc_lock(proc);
-	ref = binder_get_ref_for_node_olocked(proc, node, NULL);
-	if (!ref) {
-		binder_proc_unlock(proc);
-		new_ref = kzalloc(sizeof(*ref), GFP_KERNEL);
-		if (!new_ref)
-			return -ENOMEM;
-		binder_proc_lock(proc);
-		ref = binder_get_ref_for_node_olocked(proc, node, new_ref);
-	}
-	ret = binder_inc_ref_olocked(ref, strong, target_list);
-	*rdata = ref->data;
-	if (ret && ref == new_ref) {
-		/*
-		 * Cleanup the failed reference here as the target
-		 * could now be dead and have already released its
-		 * references by now. Calling on the new reference
-		 * with strong=0 and a tmp_refs will not decrement
-		 * the node. The new_ref gets kfree'd below.
-		 */
-		binder_cleanup_ref_olocked(new_ref);
-		ref = NULL;
-	}
-
-	binder_proc_unlock(proc);
-	if (new_ref && ref != new_ref)
-		/*
-		 * Another thread created the ref first so
-		 * free the one we allocated
-		 */
-		kfree(new_ref);
-	return ret;
-}
-
-static void binder_pop_transaction_ilocked(struct binder_thread *target_thread,
-					   struct binder_transaction *t)
-{
-	BUG_ON(!target_thread);
-	assert_spin_locked(&target_thread->proc->inner_lock);
-	BUG_ON(target_thread->transaction_stack != t);
-	BUG_ON(target_thread->transaction_stack->from != target_thread);
-	target_thread->transaction_stack =
-		target_thread->transaction_stack->from_parent;
-	t->from = NULL;
-}
-
-/**
- * binder_thread_dec_tmpref() - decrement thread->tmp_ref
- * @thread:	thread to decrement
- *
- * A thread needs to be kept alive while being used to create or
- * handle a transaction. binder_get_txn_from() is used to safely
- * extract t->from from a binder_transaction and keep the thread
- * indicated by t->from from being freed. When done with that
- * binder_thread, this function is called to decrement the
- * tmp_ref and free if appropriate (thread has been released
- * and no transaction being processed by the driver)
- */
-static void binder_thread_dec_tmpref(struct binder_thread *thread)
-{
-	/*
-	 * atomic is used to protect the counter value while
-	 * it cannot reach zero or thread->is_dead is false
-	 */
-	binder_inner_proc_lock(thread->proc);
-	atomic_dec(&thread->tmp_ref);
-	if (thread->is_dead && !atomic_read(&thread->tmp_ref)) {
-		binder_inner_proc_unlock(thread->proc);
-		binder_free_thread(thread);
-		return;
-	}
-	binder_inner_proc_unlock(thread->proc);
-}
-
-/**
- * binder_proc_dec_tmpref() - decrement proc->tmp_ref
- * @proc:	proc to decrement
- *
- * A binder_proc needs to be kept alive while being used to create or
- * handle a transaction. proc->tmp_ref is incremented when
- * creating a new transaction or the binder_proc is currently in-use
- * by threads that are being released. When done with the binder_proc,
- * this function is called to decrement the counter and free the
- * proc if appropriate (proc has been released, all threads have
- * been released and not currently in-use to process a transaction).
- */
-static void binder_proc_dec_tmpref(struct binder_proc *proc)
-{
-	binder_inner_proc_lock(proc);
-	proc->tmp_ref--;
-	if (proc->is_dead && RB_EMPTY_ROOT(&proc->threads) &&
-			!proc->tmp_ref) {
-		binder_inner_proc_unlock(proc);
-		binder_free_proc(proc);
-		return;
-	}
-	binder_inner_proc_unlock(proc);
-}
-
-/**
- * binder_get_txn_from() - safely extract the "from" thread in transaction
- * @t:	binder transaction for t->from
- *
- * Atomically return the "from" thread and increment the tmp_ref
- * count for the thread to ensure it stays alive until
- * binder_thread_dec_tmpref() is called.
- *
- * Return: the value of t->from
- */
-static struct binder_thread *binder_get_txn_from(
-		struct binder_transaction *t)
-{
-	struct binder_thread *from;
-
-	spin_lock(&t->lock);
-	from = t->from;
-	if (from)
-		atomic_inc(&from->tmp_ref);
-	spin_unlock(&t->lock);
-	return from;
-}
-
-/**
- * binder_get_txn_from_and_acq_inner() - get t->from and acquire inner lock
- * @t:	binder transaction for t->from
- *
- * Same as binder_get_txn_from() except it also acquires the proc->inner_lock
- * to guarantee that the thread cannot be released while operating on it.
- * The caller must call binder_inner_proc_unlock() to release the inner lock
- * as well as call binder_thread_dec_tmpref() to release the reference.
- *
- * Return: the value of t->from
- */
-static struct binder_thread *binder_get_txn_from_and_acq_inner(
-		struct binder_transaction *t)
-	__acquires(&t->from->proc->inner_lock)
-{
-	struct binder_thread *from;
-
-	from = binder_get_txn_from(t);
-	if (!from) {
-		__acquire(&from->proc->inner_lock);
-		return NULL;
-	}
-	binder_inner_proc_lock(from->proc);
-	if (t->from) {
-		BUG_ON(from != t->from);
-		return from;
-	}
-	binder_inner_proc_unlock(from->proc);
-	__acquire(&from->proc->inner_lock);
-	binder_thread_dec_tmpref(from);
-	return NULL;
-}
-
-/**
- * binder_free_txn_fixups() - free unprocessed fd fixups
- * @t:	binder transaction for t->from
- *
- * If the transaction is being torn down prior to being
- * processed by the target process, free all of the
- * fd fixups and fput the file structs. It is safe to
- * call this function after the fixups have been
- * processed -- in that case, the list will be empty.
- */
-static void binder_free_txn_fixups(struct binder_transaction *t)
-{
-	struct binder_txn_fd_fixup *fixup, *tmp;
-
-	list_for_each_entry_safe(fixup, tmp, &t->fd_fixups, fixup_entry) {
-		fput(fixup->file);
-		if (fixup->target_fd >= 0)
-			put_unused_fd(fixup->target_fd);
-		list_del(&fixup->fixup_entry);
-		kfree(fixup);
-	}
-}
-
-static void binder_txn_latency_free(struct binder_transaction *t)
-{
-	int from_proc, from_thread, to_proc, to_thread;
-
-	spin_lock(&t->lock);
-	from_proc = t->from ? t->from->proc->pid : 0;
-	from_thread = t->from ? t->from->pid : 0;
-	to_proc = t->to_proc ? t->to_proc->pid : 0;
-	to_thread = t->to_thread ? t->to_thread->pid : 0;
-	spin_unlock(&t->lock);
-
-	trace_binder_txn_latency_free(t, from_proc, from_thread, to_proc, to_thread);
-}
-
-static void binder_free_transaction(struct binder_transaction *t)
-{
-	struct binder_proc *target_proc = t->to_proc;
-
-	if (target_proc) {
-		binder_inner_proc_lock(target_proc);
-		target_proc->outstanding_txns--;
-		if (target_proc->outstanding_txns < 0)
-			pr_warn("%s: Unexpected outstanding_txns %d\n",
-				__func__, target_proc->outstanding_txns);
-		if (!target_proc->outstanding_txns && target_proc->is_frozen)
-			wake_up_interruptible_all(&target_proc->freeze_wait);
-		if (t->buffer)
-			t->buffer->transaction = NULL;
-		binder_inner_proc_unlock(target_proc);
-	}
-	if (trace_binder_txn_latency_free_enabled())
-		binder_txn_latency_free(t);
-	/*
-	 * If the transaction has no target_proc, then
-	 * t->buffer->transaction has already been cleared.
-	 */
-	binder_free_txn_fixups(t);
-	kfree(t);
-	binder_stats_deleted(BINDER_STAT_TRANSACTION);
-}
-
-static void binder_send_failed_reply(struct binder_transaction *t,
-				     uint32_t error_code)
-{
-	struct binder_thread *target_thread;
-	struct binder_transaction *next;
-
-	BUG_ON(t->flags & TF_ONE_WAY);
-	while (1) {
-		target_thread = binder_get_txn_from_and_acq_inner(t);
-		if (target_thread) {
-			binder_debug(BINDER_DEBUG_FAILED_TRANSACTION,
-				     "send failed reply for transaction %d to %d:%d\n",
-				      t->debug_id,
-				      target_thread->proc->pid,
-				      target_thread->pid);
-
-			binder_pop_transaction_ilocked(target_thread, t);
-			if (target_thread->reply_error.cmd == BR_OK) {
-				target_thread->reply_error.cmd = error_code;
-				binder_enqueue_thread_work_ilocked(
-					target_thread,
-					&target_thread->reply_error.work);
-				wake_up_interruptible(&target_thread->wait);
-			} else {
-				/*
-				 * Cannot get here for normal operation, but
-				 * we can if multiple synchronous transactions
-				 * are sent without blocking for responses.
-				 * Just ignore the 2nd error in this case.
-				 */
-				pr_warn("Unexpected reply error: %u\n",
-					target_thread->reply_error.cmd);
-			}
-			binder_inner_proc_unlock(target_thread->proc);
-			binder_thread_dec_tmpref(target_thread);
-			binder_free_transaction(t);
-			return;
-		}
-		__release(&target_thread->proc->inner_lock);
-		next = t->from_parent;
-
-		binder_debug(BINDER_DEBUG_FAILED_TRANSACTION,
-			     "send failed reply for transaction %d, target dead\n",
-			     t->debug_id);
-
-		binder_free_transaction(t);
-		if (next == NULL) {
-			binder_debug(BINDER_DEBUG_DEAD_BINDER,
-				     "reply failed, no target thread at root\n");
-			return;
-		}
-		t = next;
-		binder_debug(BINDER_DEBUG_DEAD_BINDER,
-			     "reply failed, no target thread -- retry %d\n",
-			      t->debug_id);
-	}
-}
-
-/**
- * binder_cleanup_transaction() - cleans up undelivered transaction
- * @t:		transaction that needs to be cleaned up
- * @reason:	reason the transaction wasn't delivered
- * @error_code:	error to return to caller (if synchronous call)
- */
-static void binder_cleanup_transaction(struct binder_transaction *t,
-				       const char *reason,
-				       uint32_t error_code)
-{
-	if (t->buffer->target_node && !(t->flags & TF_ONE_WAY)) {
-		binder_send_failed_reply(t, error_code);
-	} else {
-		binder_debug(BINDER_DEBUG_DEAD_TRANSACTION,
-			"undelivered transaction %d, %s\n",
-			t->debug_id, reason);
-		binder_free_transaction(t);
-	}
-}
-
-/**
- * binder_get_object() - gets object and checks for valid metadata
- * @proc:	binder_proc owning the buffer
- * @u:		sender's user pointer to base of buffer
- * @buffer:	binder_buffer that we're parsing.
- * @offset:	offset in the @buffer at which to validate an object.
- * @object:	struct binder_object to read into
- *
- * Copy the binder object at the given offset into @object. If @u is
- * provided then the copy is from the sender's buffer. If not, then
- * it is copied from the target's @buffer.
- *
- * Return:	If there's a valid metadata object at @offset, the
- *		size of that object. Otherwise, it returns zero. The object
- *		is read into the struct binder_object pointed to by @object.
- */
-static size_t binder_get_object(struct binder_proc *proc,
-				const void __user *u,
-				struct binder_buffer *buffer,
-				unsigned long offset,
-				struct binder_object *object)
-{
-	size_t read_size;
-	struct binder_object_header *hdr;
-	size_t object_size = 0;
-
-	read_size = min_t(size_t, sizeof(*object), buffer->data_size - offset);
-	if (offset > buffer->data_size || read_size < sizeof(*hdr))
-		return 0;
-	if (u) {
-		if (copy_from_user(object, u + offset, read_size))
-			return 0;
-	} else {
-		if (binder_alloc_copy_from_buffer(&proc->alloc, object, buffer,
-						  offset, read_size))
-			return 0;
-	}
-
-	/* Ok, now see if we read a complete object. */
-	hdr = &object->hdr;
-	switch (hdr->type) {
-	case BINDER_TYPE_BINDER:
-	case BINDER_TYPE_WEAK_BINDER:
-	case BINDER_TYPE_HANDLE:
-	case BINDER_TYPE_WEAK_HANDLE:
-		object_size = sizeof(struct flat_binder_object);
-		break;
-	case BINDER_TYPE_FD:
-		object_size = sizeof(struct binder_fd_object);
-		break;
-	case BINDER_TYPE_PTR:
-		object_size = sizeof(struct binder_buffer_object);
-		break;
-	case BINDER_TYPE_FDA:
-		object_size = sizeof(struct binder_fd_array_object);
-		break;
-	default:
-		return 0;
-	}
-	if (offset <= buffer->data_size - object_size &&
-	    buffer->data_size >= object_size)
-		return object_size;
-	else
-		return 0;
-}
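
The final range check in binder_get_object() above is written as a subtraction rather than as offset + object_size <= data_size. A likely reason is that the sum can wrap around for a very large offset, while the subtraction cannot once data_size >= object_size is known. Below is a small userspace sketch of the difference. Both helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if [offset, offset + object_size) lies inside
 * a buffer of data_size bytes. The size comparison comes first so that
 * the subtraction can never wrap.
 */
static bool object_fits(size_t data_size, size_t offset, size_t object_size)
{
	return data_size >= object_size &&
	       offset <= data_size - object_size;
}

/* Naive variant for contrast: offset + object_size may wrap around. */
static bool object_fits_naive(size_t data_size, size_t offset,
			      size_t object_size)
{
	return offset + object_size <= data_size;
}
```

For a 64-byte buffer and a 24-byte object, an offset of SIZE_MAX - 7 is rejected by object_fits() but accepted by the naive variant, because the sum wraps around to 16.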
-
-/**
- * binder_validate_ptr() - validates binder_buffer_object in a binder_buffer.
- * @proc:	binder_proc owning the buffer
- * @b:		binder_buffer containing the object
- * @object:	struct binder_object to read into
- * @index:	index in offset array at which the binder_buffer_object is
- *		located
- * @start_offset: points to the start of the offset array
- * @object_offsetp: offset of @object read from @b
- * @num_valid:	the number of valid offsets in the offset array
- *
- * Return:	If @index is within the valid range of the offset array
- *		described by @start_offset and @num_valid, and if there's a valid
- *		binder_buffer_object at the offset found in index @index
- *		of the offset array, that object is returned. Otherwise,
- *		%NULL is returned.
- *		Note that the offset found in index @index itself is not
- *		verified; this function assumes that @num_valid elements
- *		from @start_offset were previously verified to have valid offsets.
- *		If @object_offsetp is non-NULL, then the offset within
- *		@b is written to it.
- */
-static struct binder_buffer_object *binder_validate_ptr(
-						struct binder_proc *proc,
-						struct binder_buffer *b,
-						struct binder_object *object,
-						binder_size_t index,
-						binder_size_t start_offset,
-						binder_size_t *object_offsetp,
-						binder_size_t num_valid)
-{
-	size_t object_size;
-	binder_size_t object_offset;
-	unsigned long buffer_offset;
-
-	if (index >= num_valid)
-		return NULL;
-
-	buffer_offset = start_offset + sizeof(binder_size_t) * index;
-	if (binder_alloc_copy_from_buffer(&proc->alloc, &object_offset,
-					  b, buffer_offset,
-					  sizeof(object_offset)))
-		return NULL;
-	object_size = binder_get_object(proc, NULL, b, object_offset, object);
-	if (!object_size || object->hdr.type != BINDER_TYPE_PTR)
-		return NULL;
-	if (object_offsetp)
-		*object_offsetp = object_offset;
-
-	return &object->bbo;
-}
-
-/**
- * binder_validate_fixup() - validates pointer/fd fixups happen in order.
- * @proc:		binder_proc owning the buffer
- * @b:			transaction buffer
- * @objects_start_offset: offset to start of objects buffer
- * @buffer_obj_offset:	offset to binder_buffer_object in which to fix up
- * @fixup_offset:	start offset in @b to fix up
- * @last_obj_offset:	offset to last binder_buffer_object that we fixed
- * @last_min_offset:	minimum fixup offset in object at @last_obj_offset
- *
- * Return:		%true if a fixup in buffer @b at offset @fixup_offset
- *			is allowed.
- *
- * For safety reasons, we only allow fixups inside a buffer to happen
- * at increasing offsets; additionally, we only allow fixup on the last
- * buffer object that was verified, or one of its parents.
- *
- * Example of what is allowed:
- *
- * A
- *   B (parent = A, offset = 0)
- *   C (parent = A, offset = 16)
- *     D (parent = C, offset = 0)
- *   E (parent = A, offset = 32) // min_offset is 16 (C.parent_offset)
- *
- * Examples of what is not allowed:
- *
- * Decreasing offsets within the same parent:
- * A
- *   C (parent = A, offset = 16)
- *   B (parent = A, offset = 0) // decreasing offset within A
- *
- * Referring to a parent that wasn't the last object or any of its parents:
- * A
- *   B (parent = A, offset = 0)
- *   C (parent = A, offset = 0)
- *   C (parent = A, offset = 16)
- *     D (parent = B, offset = 0) // B is not C or any of C's parents
- */
-static bool binder_validate_fixup(struct binder_proc *proc,
-				  struct binder_buffer *b,
-				  binder_size_t objects_start_offset,
-				  binder_size_t buffer_obj_offset,
-				  binder_size_t fixup_offset,
-				  binder_size_t last_obj_offset,
-				  binder_size_t last_min_offset)
-{
-	if (!last_obj_offset) {
-		/* Nothing to fix up in */
-		return false;
-	}
-
-	while (last_obj_offset != buffer_obj_offset) {
-		unsigned long buffer_offset;
-		struct binder_object last_object;
-		struct binder_buffer_object *last_bbo;
-		size_t object_size = binder_get_object(proc, NULL, b,
-						       last_obj_offset,
-						       &last_object);
-		if (object_size != sizeof(*last_bbo))
-			return false;
-
-		last_bbo = &last_object.bbo;
-		/*
-		 * Safe to retrieve the parent of last_obj, since it
-		 * was already previously verified by the driver.
-		 */
-		if ((last_bbo->flags & BINDER_BUFFER_FLAG_HAS_PARENT) == 0)
-			return false;
-		last_min_offset = last_bbo->parent_offset + sizeof(uintptr_t);
-		buffer_offset = objects_start_offset +
-			sizeof(binder_size_t) * last_bbo->parent;
-		if (binder_alloc_copy_from_buffer(&proc->alloc,
-						  &last_obj_offset,
-						  b, buffer_offset,
-						  sizeof(last_obj_offset)))
-			return false;
-	}
-	return (fixup_offset >= last_min_offset);
-}
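
The ordering rule that binder_validate_fixup() enforces above can be modeled in userspace: starting from the last object that was fixed up, walk up the parent chain until the target object is reached, raising the minimum allowed offset at each step. This is a sketch under simplifying assumptions. Objects are addressed by array index instead of buffer offset, NO_LAST replaces the driver's "offset 0 means none" convention, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NO_LAST ((size_t)-1)	/* hypothetical "nothing fixed up yet" marker */

/* Hypothetical stand-in for a binder_buffer_object. */
struct buf_obj {
	bool has_parent;
	size_t parent;		/* index of the parent object */
	uint64_t parent_offset;	/* where this object is patched into it */
};

/*
 * @target:          index of the object we want to patch
 * @fixup_offset:    offset inside @target that would be patched
 * @last:            index of the last object fixed up, or NO_LAST
 * @last_min_offset: lowest offset still allowed inside @last
 */
static bool fixup_allowed(const struct buf_obj *objs, size_t target,
			  uint64_t fixup_offset, size_t last,
			  uint64_t last_min_offset)
{
	assert(objs != NULL);

	if (last == NO_LAST)
		return false;	/* nothing to fix up in */

	while (last != target) {
		const struct buf_obj *o = &objs[last];

		if (!o->has_parent)
			return false;	/* reached the root without a match */
		/* Later fixups in the parent must land past this pointer. */
		last_min_offset = o->parent_offset + sizeof(uintptr_t);
		last = o->parent;
	}
	return fixup_offset >= last_min_offset;
}
```

Fed with the A/B/C/D/E layout from the comment above, the model accepts the allowed sequence and rejects both disallowed ones: the decreasing offset within A, and D naming B as its parent after C was the last object fixed up.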
-
-/**
- * struct binder_task_work_cb - for deferred close
- *
- * @twork:                callback_head for task work
- * @file:                 file to close
- *
- * Structure to pass task work to be handled after
- * returning from binder_ioctl() via task_work_add().
- */
-struct binder_task_work_cb {
-	struct callback_head twork;
-	struct file *file;
-};
-
-/**
- * binder_do_fd_close() - close list of file descriptors
- * @twork:	callback head for task work
- *
- * It is not safe to call ksys_close() during the binder_ioctl()
- * function if there is a chance that binder's own file descriptor
- * might be closed. This is to meet the requirements for using
- * fdget() (see comments for __fget_light()). Therefore use
- * task_work_add() to schedule the close operation once we have
- * returned from binder_ioctl(). This function is a callback
- * for that mechanism and does the actual ksys_close() on the
- * given file descriptor.
- */
-static void binder_do_fd_close(struct callback_head *twork)
-{
-	struct binder_task_work_cb *twcb = container_of(twork,
-			struct binder_task_work_cb, twork);
-
-	fput(twcb->file);
-	kfree(twcb);
-}
-
-/**
- * binder_deferred_fd_close() - schedule a close for the given file-descriptor
- * @fd:		file-descriptor to close
- *
- * See comments in binder_do_fd_close(). This function is used to schedule
- * a file-descriptor to be closed after returning from binder_ioctl().
- */
-static void binder_deferred_fd_close(int fd)
-{
-	struct binder_task_work_cb *twcb;
-
-	twcb = kzalloc(sizeof(*twcb), GFP_KERNEL);
-	if (!twcb)
-		return;
-	init_task_work(&twcb->twork, binder_do_fd_close);
-	twcb->file = close_fd_get_file(fd);
-	if (twcb->file) {
-		// pin it until binder_do_fd_close(); see comments there
-		get_file(twcb->file);
-		filp_close(twcb->file, current->files);
-		task_work_add(current, &twcb->twork, TWA_RESUME);
-	} else {
-		kfree(twcb);
-	}
-}
-
-static void binder_transaction_buffer_release(struct binder_proc *proc,
-					      struct binder_thread *thread,
-					      struct binder_buffer *buffer,
-					      binder_size_t off_end_offset,
-					      bool is_failure)
-{
-	int debug_id = buffer->debug_id;
-	binder_size_t off_start_offset, buffer_offset;
-
-	binder_debug(BINDER_DEBUG_TRANSACTION,
-		     "%d buffer release %d, size %zd-%zd, failed at %llx\n",
-		     proc->pid, buffer->debug_id,
-		     buffer->data_size, buffer->offsets_size,
-		     (unsigned long long)off_end_offset);
-
-	if (buffer->target_node)
-		binder_dec_node(buffer->target_node, 1, 0);
-
-	off_start_offset = ALIGN(buffer->data_size, sizeof(void *));
-
-	for (buffer_offset = off_start_offset; buffer_offset < off_end_offset;
-	     buffer_offset += sizeof(binder_size_t)) {
-		struct binder_object_header *hdr;
-		size_t object_size = 0;
-		struct binder_object object;
-		binder_size_t object_offset;
-
-		if (!binder_alloc_copy_from_buffer(&proc->alloc, &object_offset,
-						   buffer, buffer_offset,
-						   sizeof(object_offset)))
-			object_size = binder_get_object(proc, NULL, buffer,
-							object_offset, &object);
-		if (object_size == 0) {
-			pr_err("transaction release %d bad object at offset %lld, size %zd\n",
-			       debug_id, (u64)object_offset, buffer->data_size);
-			continue;
-		}
-		hdr = &object.hdr;
-		switch (hdr->type) {
-		case BINDER_TYPE_BINDER:
-		case BINDER_TYPE_WEAK_BINDER: {
-			struct flat_binder_object *fp;
-			struct binder_node *node;
-
-			fp = to_flat_binder_object(hdr);
-			node = binder_get_node(proc, fp->binder);
-			if (node == NULL) {
-				pr_err("transaction release %d bad node %016llx\n",
-				       debug_id, (u64)fp->binder);
-				break;
-			}
-			binder_debug(BINDER_DEBUG_TRANSACTION,
-				     "        node %d u%016llx\n",
-				     node->debug_id, (u64)node->ptr);
-			binder_dec_node(node, hdr->type == BINDER_TYPE_BINDER,
-					0);
-			binder_put_node(node);
-		} break;
-		case BINDER_TYPE_HANDLE:
-		case BINDER_TYPE_WEAK_HANDLE: {
-			struct flat_binder_object *fp;
-			struct binder_ref_data rdata;
-			int ret;
-
-			fp = to_flat_binder_object(hdr);
-			ret = binder_dec_ref_for_handle(proc, fp->handle,
-				hdr->type == BINDER_TYPE_HANDLE, &rdata);
-
-			if (ret) {
-				pr_err("transaction release %d bad handle %d, ret = %d\n",
-				 debug_id, fp->handle, ret);
-				break;
-			}
-			binder_debug(BINDER_DEBUG_TRANSACTION,
-				     "        ref %d desc %d\n",
-				     rdata.debug_id, rdata.desc);
-		} break;
-
-		case BINDER_TYPE_FD: {
-			/*
-			 * No need to close the file here since user-space
-			 * closes it for successfully delivered
-			 * transactions. For transactions that weren't
-			 * delivered, the new fd was never allocated so
-			 * there is no need to close and the fput on the
-			 * file is done when the transaction is torn
-			 * down.
-			 */
-		} break;
-		case BINDER_TYPE_PTR:
-			/*
-			 * Nothing to do here, this will get cleaned up when the
-			 * transaction buffer gets freed
-			 */
-			break;
-		case BINDER_TYPE_FDA: {
-			struct binder_fd_array_object *fda;
-			struct binder_buffer_object *parent;
-			struct binder_object ptr_object;
-			binder_size_t fda_offset;
-			size_t fd_index;
-			binder_size_t fd_buf_size;
-			binder_size_t num_valid;
-
-			if (is_failure) {
-				/*
-				 * The fd fixups have not been applied so no
-				 * fds need to be closed.
-				 */
-				continue;
-			}
-
-			num_valid = (buffer_offset - off_start_offset) /
-						sizeof(binder_size_t);
-			fda = to_binder_fd_array_object(hdr);
-			parent = binder_validate_ptr(proc, buffer, &ptr_object,
-						     fda->parent,
-						     off_start_offset,
-						     NULL,
-						     num_valid);
-			if (!parent) {
-				pr_err("transaction release %d bad parent offset\n",
-				       debug_id);
-				continue;
-			}
-			fd_buf_size = sizeof(u32) * fda->num_fds;
-			if (fda->num_fds >= SIZE_MAX / sizeof(u32)) {
-				pr_err("transaction release %d invalid number of fds (%lld)\n",
-				       debug_id, (u64)fda->num_fds);
-				continue;
-			}
-			if (fd_buf_size > parent->length ||
-			    fda->parent_offset > parent->length - fd_buf_size) {
-				/* No space for all file descriptors here. */
-				pr_err("transaction release %d not enough space for %lld fds in buffer\n",
-				       debug_id, (u64)fda->num_fds);
-				continue;
-			}
-			/*
-			 * the source data for binder_buffer_object is visible
-			 * to user-space and the @buffer element is the user
-			 * pointer to the buffer_object containing the fd_array.
-			 * Convert the address to an offset relative to
-			 * the base of the transaction buffer.
-			 */
-			fda_offset =
-			    (parent->buffer - (uintptr_t)buffer->user_data) +
-			    fda->parent_offset;
-			for (fd_index = 0; fd_index < fda->num_fds;
-			     fd_index++) {
-				u32 fd;
-				int err;
-				binder_size_t offset = fda_offset +
-					fd_index * sizeof(fd);
-
-				err = binder_alloc_copy_from_buffer(
-						&proc->alloc, &fd, buffer,
-						offset, sizeof(fd));
-				WARN_ON(err);
-				if (!err) {
-					binder_deferred_fd_close(fd);
-					/*
-					 * Need to make sure the thread goes
-					 * back to userspace to complete the
-					 * deferred close
-					 */
-					if (thread)
-						thread->looper_need_return = true;
-				}
-			}
-		} break;
-		default:
-			pr_err("transaction release %d bad object type %x\n",
-				debug_id, hdr->type);
-			break;
-		}
-	}
-}
-
-/* Clean up all the objects in the buffer */
-static inline void binder_release_entire_buffer(struct binder_proc *proc,
-						struct binder_thread *thread,
-						struct binder_buffer *buffer,
-						bool is_failure)
-{
-	binder_size_t off_end_offset;
-
-	off_end_offset = ALIGN(buffer->data_size, sizeof(void *));
-	off_end_offset += buffer->offsets_size;
-
-	binder_transaction_buffer_release(proc, thread, buffer,
-					  off_end_offset, is_failure);
-}
-
-static int binder_translate_binder(struct flat_binder_object *fp,
-				   struct binder_transaction *t,
-				   struct binder_thread *thread)
-{
-	struct binder_node *node;
-	struct binder_proc *proc = thread->proc;
-	struct binder_proc *target_proc = t->to_proc;
-	struct binder_ref_data rdata;
-	int ret = 0;
-
-	node = binder_get_node(proc, fp->binder);
-	if (!node) {
-		node = binder_new_node(proc, fp);
-		if (!node)
-			return -ENOMEM;
-	}
-	if (fp->cookie != node->cookie) {
-		binder_user_error("%d:%d sending u%016llx node %d, cookie mismatch %016llx != %016llx\n",
-				  proc->pid, thread->pid, (u64)fp->binder,
-				  node->debug_id, (u64)fp->cookie,
-				  (u64)node->cookie);
-		ret = -EINVAL;
-		goto done;
-	}
-	if (security_binder_transfer_binder(proc->cred, target_proc->cred)) {
-		ret = -EPERM;
-		goto done;
-	}
-
-	ret = binder_inc_ref_for_node(target_proc, node,
-			fp->hdr.type == BINDER_TYPE_BINDER,
-			&thread->todo, &rdata);
-	if (ret)
-		goto done;
-
-	if (fp->hdr.type == BINDER_TYPE_BINDER)
-		fp->hdr.type = BINDER_TYPE_HANDLE;
-	else
-		fp->hdr.type = BINDER_TYPE_WEAK_HANDLE;
-	fp->binder = 0;
-	fp->handle = rdata.desc;
-	fp->cookie = 0;
-
-	trace_binder_transaction_node_to_ref(t, node, &rdata);
-	binder_debug(BINDER_DEBUG_TRANSACTION,
-		     "        node %d u%016llx -> ref %d desc %d\n",
-		     node->debug_id, (u64)node->ptr,
-		     rdata.debug_id, rdata.desc);
-done:
-	binder_put_node(node);
-	return ret;
-}
-
-static int binder_translate_handle(struct flat_binder_object *fp,
-				   struct binder_transaction *t,
-				   struct binder_thread *thread)
-{
-	struct binder_proc *proc = thread->proc;
-	struct binder_proc *target_proc = t->to_proc;
-	struct binder_node *node;
-	struct binder_ref_data src_rdata;
-	int ret = 0;
-
-	node = binder_get_node_from_ref(proc, fp->handle,
-			fp->hdr.type == BINDER_TYPE_HANDLE, &src_rdata);
-	if (!node) {
-		binder_user_error("%d:%d got transaction with invalid handle, %d\n",
-				  proc->pid, thread->pid, fp->handle);
-		return -EINVAL;
-	}
-	if (security_binder_transfer_binder(proc->cred, target_proc->cred)) {
-		ret = -EPERM;
-		goto done;
-	}
-
-	binder_node_lock(node);
-	if (node->proc == target_proc) {
-		if (fp->hdr.type == BINDER_TYPE_HANDLE)
-			fp->hdr.type = BINDER_TYPE_BINDER;
-		else
-			fp->hdr.type = BINDER_TYPE_WEAK_BINDER;
-		fp->binder = node->ptr;
-		fp->cookie = node->cookie;
-		if (node->proc)
-			binder_inner_proc_lock(node->proc);
-		else
-			__acquire(&node->proc->inner_lock);
-		binder_inc_node_nilocked(node,
-					 fp->hdr.type == BINDER_TYPE_BINDER,
-					 0, NULL);
-		if (node->proc)
-			binder_inner_proc_unlock(node->proc);
-		else
-			__release(&node->proc->inner_lock);
-		trace_binder_transaction_ref_to_node(t, node, &src_rdata);
-		binder_debug(BINDER_DEBUG_TRANSACTION,
-			     "        ref %d desc %d -> node %d u%016llx\n",
-			     src_rdata.debug_id, src_rdata.desc, node->debug_id,
-			     (u64)node->ptr);
-		binder_node_unlock(node);
-	} else {
-		struct binder_ref_data dest_rdata;
-
-		binder_node_unlock(node);
-		ret = binder_inc_ref_for_node(target_proc, node,
-				fp->hdr.type == BINDER_TYPE_HANDLE,
-				NULL, &dest_rdata);
-		if (ret)
-			goto done;
-
-		fp->binder = 0;
-		fp->handle = dest_rdata.desc;
-		fp->cookie = 0;
-		trace_binder_transaction_ref_to_ref(t, node, &src_rdata,
-						    &dest_rdata);
-		binder_debug(BINDER_DEBUG_TRANSACTION,
-			     "        ref %d desc %d -> ref %d desc %d (node %d)\n",
-			     src_rdata.debug_id, src_rdata.desc,
-			     dest_rdata.debug_id, dest_rdata.desc,
-			     node->debug_id);
-	}
-done:
-	binder_put_node(node);
-	return ret;
-}
-
-static int binder_translate_fd(u32 fd, binder_size_t fd_offset,
-			       struct binder_transaction *t,
-			       struct binder_thread *thread,
-			       struct binder_transaction *in_reply_to)
-{
-	struct binder_proc *proc = thread->proc;
-	struct binder_proc *target_proc = t->to_proc;
-	struct binder_txn_fd_fixup *fixup;
-	struct file *file;
-	int ret = 0;
-	bool target_allows_fd;
-
-	if (in_reply_to)
-		target_allows_fd = !!(in_reply_to->flags & TF_ACCEPT_FDS);
-	else
-		target_allows_fd = t->buffer->target_node->accept_fds;
-	if (!target_allows_fd) {
-		binder_user_error("%d:%d got %s with fd, %d, but target does not allow fds\n",
-				  proc->pid, thread->pid,
-				  in_reply_to ? "reply" : "transaction",
-				  fd);
-		ret = -EPERM;
-		goto err_fd_not_accepted;
-	}
-
-	file = fget(fd);
-	if (!file) {
-		binder_user_error("%d:%d got transaction with invalid fd, %d\n",
-				  proc->pid, thread->pid, fd);
-		ret = -EBADF;
-		goto err_fget;
-	}
-	ret = security_binder_transfer_file(proc->cred, target_proc->cred, file);
-	if (ret < 0) {
-		ret = -EPERM;
-		goto err_security;
-	}
-
-	/*
-	 * Add fixup record for this transaction. The allocation
-	 * of the fd in the target needs to be done from a
-	 * target thread.
-	 */
-	fixup = kzalloc(sizeof(*fixup), GFP_KERNEL);
-	if (!fixup) {
-		ret = -ENOMEM;
-		goto err_alloc;
-	}
-	fixup->file = file;
-	fixup->offset = fd_offset;
-	fixup->target_fd = -1;
-	trace_binder_transaction_fd_send(t, fd, fixup->offset);
-	list_add_tail(&fixup->fixup_entry, &t->fd_fixups);
-
-	return ret;
-
-err_alloc:
-err_security:
-	fput(file);
-err_fget:
-err_fd_not_accepted:
-	return ret;
-}
-
-/**
- * struct binder_ptr_fixup - data to be fixed-up in target buffer
- * @offset	offset in target buffer to fixup
- * @skip_size	bytes to skip in copy (fixup will be written later)
- * @fixup_data	data to write at fixup offset
- * @node	list node
- *
- * This is used for the pointer fixup list (pf) which is created and consumed
- * during binder_transaction() and is only accessed locally. No
- * locking is necessary.
- *
- * The list is ordered by @offset.
- */
-struct binder_ptr_fixup {
-	binder_size_t offset;
-	size_t skip_size;
-	binder_uintptr_t fixup_data;
-	struct list_head node;
-};
-
-/**
- * struct binder_sg_copy - scatter-gather data to be copied
- * @offset		offset in target buffer
- * @sender_uaddr	user address in source buffer
- * @length		bytes to copy
- * @node		list node
- *
- * This is used for the sg copy list (sgc) which is created and consumed
- * during binder_transaction() and is only accessed locally. No
- * locking is necessary.
- *
- * The list is ordered by @offset.
- */
-struct binder_sg_copy {
-	binder_size_t offset;
-	const void __user *sender_uaddr;
-	size_t length;
-	struct list_head node;
-};
-
-/**
- * binder_do_deferred_txn_copies() - copy and fixup scatter-gather data
- * @alloc:	binder_alloc associated with @buffer
- * @buffer:	binder buffer in target process
- * @sgc_head:	list_head of scatter-gather copy list
- * @pf_head:	list_head of pointer fixup list
- *
- * Processes all elements of @sgc_head, applying fixups from @pf_head
- * and copying the scatter-gather data from the source process' user
- * buffer to the target's buffer. It is expected that the list creation
- * and processing all occurs during binder_transaction() so these lists
- * are only accessed in local context.
- *
- * Return: 0=success, else -errno
- */
-static int binder_do_deferred_txn_copies(struct binder_alloc *alloc,
-					 struct binder_buffer *buffer,
-					 struct list_head *sgc_head,
-					 struct list_head *pf_head)
-{
-	int ret = 0;
-	struct binder_sg_copy *sgc, *tmpsgc;
-	struct binder_ptr_fixup *tmppf;
-	struct binder_ptr_fixup *pf =
-		list_first_entry_or_null(pf_head, struct binder_ptr_fixup,
-					 node);
-
-	list_for_each_entry_safe(sgc, tmpsgc, sgc_head, node) {
-		size_t bytes_copied = 0;
-
-		while (bytes_copied < sgc->length) {
-			size_t copy_size;
-			size_t bytes_left = sgc->length - bytes_copied;
-			size_t offset = sgc->offset + bytes_copied;
-
-			/*
-			 * We copy up to the fixup (pointed to by pf)
-			 */
-			copy_size = pf ? min(bytes_left, (size_t)pf->offset - offset)
-				       : bytes_left;
-			if (!ret && copy_size)
-				ret = binder_alloc_copy_user_to_buffer(
-						alloc, buffer,
-						offset,
-						sgc->sender_uaddr + bytes_copied,
-						copy_size);
-			bytes_copied += copy_size;
-			if (copy_size != bytes_left) {
-				BUG_ON(!pf);
-				/* we stopped at a fixup offset */
-				if (pf->skip_size) {
-					/*
-					 * we are just skipping. This is for
-					 * BINDER_TYPE_FDA where the translated
-					 * fds will be fixed up when we get
-					 * to target context.
-					 */
-					bytes_copied += pf->skip_size;
-				} else {
-					/* apply the fixup indicated by pf */
-					if (!ret)
-						ret = binder_alloc_copy_to_buffer(
-							alloc, buffer,
-							pf->offset,
-							&pf->fixup_data,
-							sizeof(pf->fixup_data));
-					bytes_copied += sizeof(pf->fixup_data);
-				}
-				list_del(&pf->node);
-				kfree(pf);
-				pf = list_first_entry_or_null(pf_head,
-						struct binder_ptr_fixup, node);
-			}
-		}
-		list_del(&sgc->node);
-		kfree(sgc);
-	}
-	list_for_each_entry_safe(pf, tmppf, pf_head, node) {
-		BUG_ON(pf->skip_size == 0);
-		list_del(&pf->node);
-		kfree(pf);
-	}
-	BUG_ON(!list_empty(sgc_head));
-
-	return ret > 0 ? -EINVAL : ret;
-}
-
-/**
- * binder_cleanup_deferred_txn_lists() - free specified lists
- * @sgc_head:	list_head of scatter-gather copy list
- * @pf_head:	list_head of pointer fixup list
- *
- * Called to clean up @sgc_head and @pf_head if there is an
- * error.
- */
-static void binder_cleanup_deferred_txn_lists(struct list_head *sgc_head,
-					      struct list_head *pf_head)
-{
-	struct binder_sg_copy *sgc, *tmpsgc;
-	struct binder_ptr_fixup *pf, *tmppf;
-
-	list_for_each_entry_safe(sgc, tmpsgc, sgc_head, node) {
-		list_del(&sgc->node);
-		kfree(sgc);
-	}
-	list_for_each_entry_safe(pf, tmppf, pf_head, node) {
-		list_del(&pf->node);
-		kfree(pf);
-	}
-}
-
-/**
- * binder_defer_copy() - queue a scatter-gather buffer for copy
- * @sgc_head:		list_head of scatter-gather copy list
- * @offset:		binder buffer offset in target process
- * @sender_uaddr:	user address in source process
- * @length:		bytes to copy
- *
- * Specify a scatter-gather block to be copied. The actual copy must
- * be deferred until all the needed fixups are identified and queued.
- * Then the copy and fixups are done together so un-translated values
- * from the source are never visible in the target buffer.
- *
- * We are guaranteed that repeated calls to this function will have
- * monotonically increasing @offset values so the list will naturally
- * be ordered.
- *
- * Return: 0=success, else -errno
- */
-static int binder_defer_copy(struct list_head *sgc_head, binder_size_t offset,
-			     const void __user *sender_uaddr, size_t length)
-{
-	struct binder_sg_copy *bc = kzalloc(sizeof(*bc), GFP_KERNEL);
-
-	if (!bc)
-		return -ENOMEM;
-
-	bc->offset = offset;
-	bc->sender_uaddr = sender_uaddr;
-	bc->length = length;
-	INIT_LIST_HEAD(&bc->node);
-
-	/*
-	 * We are guaranteed that the deferred copies are in-order
-	 * so just add to the tail.
-	 */
-	list_add_tail(&bc->node, sgc_head);
-
-	return 0;
-}
-
-/**
- * binder_add_fixup() - queue a fixup to be applied to sg copy
- * @pf_head:	list_head of binder ptr fixup list
- * @offset:	binder buffer offset in target process
- * @fixup:	bytes to be copied for fixup
- * @skip_size:	bytes to skip when copying (fixup will be applied later)
- *
- * Add the specified fixup to a list ordered by @offset. When copying
- * the scatter-gather buffers, the fixup will be copied instead of
- * data from the source buffer. For BINDER_TYPE_FDA fixups, the fixup
- * will be applied later (in target process context), so we just skip
- * the bytes specified by @skip_size. If @skip_size is 0, we copy the
- * value in @fixup.
- *
- * This function is called *mostly* in @offset order, but there are
- * exceptions. Since out-of-order inserts are relatively uncommon,
- * we insert the new element by searching backward from the tail of
- * the list.
- *
- * Return: 0=success, else -errno
- */
-static int binder_add_fixup(struct list_head *pf_head, binder_size_t offset,
-			    binder_uintptr_t fixup, size_t skip_size)
-{
-	struct binder_ptr_fixup *pf = kzalloc(sizeof(*pf), GFP_KERNEL);
-	struct binder_ptr_fixup *tmppf;
-
-	if (!pf)
-		return -ENOMEM;
-
-	pf->offset = offset;
-	pf->fixup_data = fixup;
-	pf->skip_size = skip_size;
-	INIT_LIST_HEAD(&pf->node);
-
-	/* Fixups are *mostly* added in-order, but there are some
-	 * exceptions. Look backwards through list for insertion point.
-	 */
-	list_for_each_entry_reverse(tmppf, pf_head, node) {
-		if (tmppf->offset < pf->offset) {
-			list_add(&pf->node, &tmppf->node);
-			return 0;
-		}
-	}
-	/*
-	 * if we get here, then the new offset is the lowest so
-	 * insert at the head
-	 */
-	list_add(&pf->node, pf_head);
-	return 0;
-}
-
-static int binder_translate_fd_array(struct list_head *pf_head,
-				     struct binder_fd_array_object *fda,
-				     const void __user *sender_ubuffer,
-				     struct binder_buffer_object *parent,
-				     struct binder_buffer_object *sender_uparent,
-				     struct binder_transaction *t,
-				     struct binder_thread *thread,
-				     struct binder_transaction *in_reply_to)
-{
-	binder_size_t fdi, fd_buf_size;
-	binder_size_t fda_offset;
-	const void __user *sender_ufda_base;
-	struct binder_proc *proc = thread->proc;
-	int ret;
-
-	if (fda->num_fds == 0)
-		return 0;
-
-	fd_buf_size = sizeof(u32) * fda->num_fds;
-	if (fda->num_fds >= SIZE_MAX / sizeof(u32)) {
-		binder_user_error("%d:%d got transaction with invalid number of fds (%lld)\n",
-				  proc->pid, thread->pid, (u64)fda->num_fds);
-		return -EINVAL;
-	}
-	if (fd_buf_size > parent->length ||
-	    fda->parent_offset > parent->length - fd_buf_size) {
-		/* No space for all file descriptors here. */
-		binder_user_error("%d:%d not enough space to store %lld fds in buffer\n",
-				  proc->pid, thread->pid, (u64)fda->num_fds);
-		return -EINVAL;
-	}
-	/*
-	 * the source data for binder_buffer_object is visible
-	 * to user-space and the @buffer element is the user
-	 * pointer to the buffer_object containing the fd_array.
-	 * Convert the address to an offset relative to
-	 * the base of the transaction buffer.
-	 */
-	fda_offset = (parent->buffer - (uintptr_t)t->buffer->user_data) +
-		fda->parent_offset;
-	sender_ufda_base = (void __user *)(uintptr_t)sender_uparent->buffer +
-				fda->parent_offset;
-
-	if (!IS_ALIGNED((unsigned long)fda_offset, sizeof(u32)) ||
-	    !IS_ALIGNED((unsigned long)sender_ufda_base, sizeof(u32))) {
-		binder_user_error("%d:%d parent offset not aligned correctly.\n",
-				  proc->pid, thread->pid);
-		return -EINVAL;
-	}
-	ret = binder_add_fixup(pf_head, fda_offset, 0, fda->num_fds * sizeof(u32));
-	if (ret)
-		return ret;
-
-	for (fdi = 0; fdi < fda->num_fds; fdi++) {
-		u32 fd;
-		binder_size_t offset = fda_offset + fdi * sizeof(fd);
-		binder_size_t sender_uoffset = fdi * sizeof(fd);
-
-		ret = copy_from_user(&fd, sender_ufda_base + sender_uoffset, sizeof(fd));
-		if (!ret)
-			ret = binder_translate_fd(fd, offset, t, thread,
-						  in_reply_to);
-		if (ret)
-			return ret > 0 ? -EINVAL : ret;
-	}
-	return 0;
-}
-
-static int binder_fixup_parent(struct list_head *pf_head,
-			       struct binder_transaction *t,
-			       struct binder_thread *thread,
-			       struct binder_buffer_object *bp,
-			       binder_size_t off_start_offset,
-			       binder_size_t num_valid,
-			       binder_size_t last_fixup_obj_off,
-			       binder_size_t last_fixup_min_off)
-{
-	struct binder_buffer_object *parent;
-	struct binder_buffer *b = t->buffer;
-	struct binder_proc *proc = thread->proc;
-	struct binder_proc *target_proc = t->to_proc;
-	struct binder_object object;
-	binder_size_t buffer_offset;
-	binder_size_t parent_offset;
-
-	if (!(bp->flags & BINDER_BUFFER_FLAG_HAS_PARENT))
-		return 0;
-
-	parent = binder_validate_ptr(target_proc, b, &object, bp->parent,
-				     off_start_offset, &parent_offset,
-				     num_valid);
-	if (!parent) {
-		binder_user_error("%d:%d got transaction with invalid parent offset or type\n",
-				  proc->pid, thread->pid);
-		return -EINVAL;
-	}
-
-	if (!binder_validate_fixup(target_proc, b, off_start_offset,
-				   parent_offset, bp->parent_offset,
-				   last_fixup_obj_off,
-				   last_fixup_min_off)) {
-		binder_user_error("%d:%d got transaction with out-of-order buffer fixup\n",
-				  proc->pid, thread->pid);
-		return -EINVAL;
-	}
-
-	if (parent->length < sizeof(binder_uintptr_t) ||
-	    bp->parent_offset > parent->length - sizeof(binder_uintptr_t)) {
-		/* No space for a pointer here! */
-		binder_user_error("%d:%d got transaction with invalid parent offset\n",
-				  proc->pid, thread->pid);
-		return -EINVAL;
-	}
-	buffer_offset = bp->parent_offset +
-			(uintptr_t)parent->buffer - (uintptr_t)b->user_data;
-	return binder_add_fixup(pf_head, buffer_offset, bp->buffer, 0);
-}
-
-/**
- * binder_can_update_transaction() - Can a txn be superseded by an updated one?
- * @t1: the pending async txn in the frozen process
- * @t2: the new async txn to supersede the outdated pending one
- *
- * Return:  true if t2 can supersede t1
- *          false if t2 can not supersede t1
- */
-static bool binder_can_update_transaction(struct binder_transaction *t1,
-					  struct binder_transaction *t2)
-{
-	if ((t1->flags & t2->flags & (TF_ONE_WAY | TF_UPDATE_TXN)) !=
-	    (TF_ONE_WAY | TF_UPDATE_TXN) || !t1->to_proc || !t2->to_proc)
-		return false;
-	if (t1->to_proc->tsk == t2->to_proc->tsk && t1->code == t2->code &&
-	    t1->flags == t2->flags && t1->buffer->pid == t2->buffer->pid &&
-	    t1->buffer->target_node->ptr == t2->buffer->target_node->ptr &&
-	    t1->buffer->target_node->cookie == t2->buffer->target_node->cookie)
-		return true;
-	return false;
-}
-
-/**
- * binder_find_outdated_transaction_ilocked() - Find the outdated transaction
- * @t:		 new async transaction
- * @target_list: list to find outdated transaction
- *
- * Return: the outdated transaction if found
- *         NULL if no outdated transacton can be found
- *
- * Requires the proc->inner_lock to be held.
- */
-static struct binder_transaction *
-binder_find_outdated_transaction_ilocked(struct binder_transaction *t,
-					 struct list_head *target_list)
-{
-	struct binder_work *w;
-
-	list_for_each_entry(w, target_list, entry) {
-		struct binder_transaction *t_queued;
-
-		if (w->type != BINDER_WORK_TRANSACTION)
-			continue;
-		t_queued = container_of(w, struct binder_transaction, work);
-		if (binder_can_update_transaction(t_queued, t))
-			return t_queued;
-	}
-	return NULL;
-}
-
-/**
- * binder_proc_transaction() - sends a transaction to a process and wakes it up
- * @t:		transaction to send
- * @proc:	process to send the transaction to
- * @thread:	thread in @proc to send the transaction to (may be NULL)
- *
- * This function queues a transaction to the specified process. It will try
- * to find a thread in the target process to handle the transaction and
- * wake it up. If no thread is found, the work is queued to the proc
- * waitqueue.
- *
- * If the @thread parameter is not NULL, the transaction is always queued
- * to the waitlist of that specific thread.
- *
- * Return:	0 if the transaction was successfully queued
- *		BR_DEAD_REPLY if the target process or thread is dead
- *		BR_FROZEN_REPLY if the target process or thread is frozen and
- *			the sync transaction was rejected
- *		BR_TRANSACTION_PENDING_FROZEN if the target process is frozen
- *		and the async transaction was successfully queued
- */
-static int binder_proc_transaction(struct binder_transaction *t,
-				    struct binder_proc *proc,
-				    struct binder_thread *thread)
-{
-	struct binder_node *node = t->buffer->target_node;
-	bool oneway = !!(t->flags & TF_ONE_WAY);
-	bool pending_async = false;
-	struct binder_transaction *t_outdated = NULL;
-	bool frozen = false;
-
-	BUG_ON(!node);
-	binder_node_lock(node);
-	if (oneway) {
-		BUG_ON(thread);
-		if (node->has_async_transaction)
-			pending_async = true;
-		else
-			node->has_async_transaction = true;
-	}
-
-	binder_inner_proc_lock(proc);
-	if (proc->is_frozen) {
-		frozen = true;
-		proc->sync_recv |= !oneway;
-		proc->async_recv |= oneway;
-	}
-
-	if ((frozen && !oneway) || proc->is_dead ||
-			(thread && thread->is_dead)) {
-		binder_inner_proc_unlock(proc);
-		binder_node_unlock(node);
-		return frozen ? BR_FROZEN_REPLY : BR_DEAD_REPLY;
-	}
-
-	if (!thread && !pending_async)
-		thread = binder_select_thread_ilocked(proc);
-
-	if (thread) {
-		binder_enqueue_thread_work_ilocked(thread, &t->work);
-	} else if (!pending_async) {
-		binder_enqueue_work_ilocked(&t->work, &proc->todo);
-	} else {
-		if ((t->flags & TF_UPDATE_TXN) && frozen) {
-			t_outdated = binder_find_outdated_transaction_ilocked(t,
-									      &node->async_todo);
-			if (t_outdated) {
-				binder_debug(BINDER_DEBUG_TRANSACTION,
-					     "txn %d supersedes %d\n",
-					     t->debug_id, t_outdated->debug_id);
-				list_del_init(&t_outdated->work.entry);
-				proc->outstanding_txns--;
-			}
-		}
-		binder_enqueue_work_ilocked(&t->work, &node->async_todo);
-	}
-
-	if (!pending_async)
-		binder_wakeup_thread_ilocked(proc, thread, !oneway /* sync */);
-
-	proc->outstanding_txns++;
-	binder_inner_proc_unlock(proc);
-	binder_node_unlock(node);
-
-	/*
-	 * To reduce potential contention, free the outdated transaction and
-	 * buffer after releasing the locks.
-	 */
-	if (t_outdated) {
-		struct binder_buffer *buffer = t_outdated->buffer;
-
-		t_outdated->buffer = NULL;
-		buffer->transaction = NULL;
-		trace_binder_transaction_update_buffer_release(buffer);
-		binder_release_entire_buffer(proc, NULL, buffer, false);
-		binder_alloc_free_buf(&proc->alloc, buffer);
-		kfree(t_outdated);
-		binder_stats_deleted(BINDER_STAT_TRANSACTION);
-	}
-
-	if (oneway && frozen)
-		return BR_TRANSACTION_PENDING_FROZEN;
-
-	return 0;
-}
-
-/**
- * binder_get_node_refs_for_txn() - Get required refs on node for txn
- * @node:         struct binder_node for which to get refs
- * @procp:        returns @node->proc if valid
- * @error:        if no @procp then returns BR_DEAD_REPLY
- *
- * User-space normally keeps the node alive when creating a transaction
- * since it has a reference to the target. The local strong ref keeps it
- * alive if the sending process dies before the target process processes
- * the transaction. If the source process is malicious or has a reference
- * counting bug, relying on the local strong ref can fail.
- *
- * Since user-space can cause the local strong ref to go away, we also take
- * a tmpref on the node to ensure it survives while we are constructing
- * the transaction. We also need a tmpref on the proc while we are
- * constructing the transaction, so we take that here as well.
- *
- * Return: The target_node with refs taken or NULL if no @node->proc is NULL.
- * Also sets @procp if valid. If the @node->proc is NULL indicating that the
- * target proc has died, @error is set to BR_DEAD_REPLY.
- */
-static struct binder_node *binder_get_node_refs_for_txn(
-		struct binder_node *node,
-		struct binder_proc **procp,
-		uint32_t *error)
-{
-	struct binder_node *target_node = NULL;
-
-	binder_node_inner_lock(node);
-	if (node->proc) {
-		target_node = node;
-		binder_inc_node_nilocked(node, 1, 0, NULL);
-		binder_inc_node_tmpref_ilocked(node);
-		node->proc->tmp_ref++;
-		*procp = node->proc;
-	} else
-		*error = BR_DEAD_REPLY;
-	binder_node_inner_unlock(node);
-
-	return target_node;
-}
-
-static void binder_set_txn_from_error(struct binder_transaction *t, int id,
-				      uint32_t command, int32_t param)
-{
-	struct binder_thread *from = binder_get_txn_from_and_acq_inner(t);
-
-	if (!from) {
-		/* annotation for sparse */
-		__release(&from->proc->inner_lock);
-		return;
-	}
-
-	/* don't override existing errors */
-	if (from->ee.command == BR_OK)
-		binder_set_extended_error(&from->ee, id, command, param);
-	binder_inner_proc_unlock(from->proc);
-	binder_thread_dec_tmpref(from);
-}
-
-static void binder_transaction(struct binder_proc *proc,
-			       struct binder_thread *thread,
-			       struct binder_transaction_data *tr, int reply,
-			       binder_size_t extra_buffers_size)
-{
-	int ret;
-	struct binder_transaction *t;
-	struct binder_work *w;
-	struct binder_work *tcomplete;
-	binder_size_t buffer_offset = 0;
-	binder_size_t off_start_offset, off_end_offset;
-	binder_size_t off_min;
-	binder_size_t sg_buf_offset, sg_buf_end_offset;
-	binder_size_t user_offset = 0;
-	struct binder_proc *target_proc = NULL;
-	struct binder_thread *target_thread = NULL;
-	struct binder_node *target_node = NULL;
-	struct binder_transaction *in_reply_to = NULL;
-	struct binder_transaction_log_entry *e;
-	uint32_t return_error = 0;
-	uint32_t return_error_param = 0;
-	uint32_t return_error_line = 0;
-	binder_size_t last_fixup_obj_off = 0;
-	binder_size_t last_fixup_min_off = 0;
-	struct binder_context *context = proc->context;
-	int t_debug_id = atomic_inc_return(&binder_last_id);
-	ktime_t t_start_time = ktime_get();
-	char *secctx = NULL;
-	u32 secctx_sz = 0;
-	struct list_head sgc_head;
-	struct list_head pf_head;
-	const void __user *user_buffer = (const void __user *)
-				(uintptr_t)tr->data.ptr.buffer;
-	INIT_LIST_HEAD(&sgc_head);
-	INIT_LIST_HEAD(&pf_head);
-
-	e = binder_transaction_log_add(&binder_transaction_log);
-	e->debug_id = t_debug_id;
-	e->call_type = reply ? 2 : !!(tr->flags & TF_ONE_WAY);
-	e->from_proc = proc->pid;
-	e->from_thread = thread->pid;
-	e->target_handle = tr->target.handle;
-	e->data_size = tr->data_size;
-	e->offsets_size = tr->offsets_size;
-	strscpy(e->context_name, proc->context->name, BINDERFS_MAX_NAME);
-
-	binder_inner_proc_lock(proc);
-	binder_set_extended_error(&thread->ee, t_debug_id, BR_OK, 0);
-	binder_inner_proc_unlock(proc);
-
-	if (reply) {
-		binder_inner_proc_lock(proc);
-		in_reply_to = thread->transaction_stack;
-		if (in_reply_to == NULL) {
-			binder_inner_proc_unlock(proc);
-			binder_user_error("%d:%d got reply transaction with no transaction stack\n",
-					  proc->pid, thread->pid);
-			return_error = BR_FAILED_REPLY;
-			return_error_param = -EPROTO;
-			return_error_line = __LINE__;
-			goto err_empty_call_stack;
-		}
-		if (in_reply_to->to_thread != thread) {
-			spin_lock(&in_reply_to->lock);
-			binder_user_error("%d:%d got reply transaction with bad transaction stack, transaction %d has target %d:%d\n",
-				proc->pid, thread->pid, in_reply_to->debug_id,
-				in_reply_to->to_proc ?
-				in_reply_to->to_proc->pid : 0,
-				in_reply_to->to_thread ?
-				in_reply_to->to_thread->pid : 0);
-			spin_unlock(&in_reply_to->lock);
-			binder_inner_proc_unlock(proc);
-			return_error = BR_FAILED_REPLY;
-			return_error_param = -EPROTO;
-			return_error_line = __LINE__;
-			in_reply_to = NULL;
-			goto err_bad_call_stack;
-		}
-		thread->transaction_stack = in_reply_to->to_parent;
-		binder_inner_proc_unlock(proc);
-		binder_set_nice(in_reply_to->saved_priority);
-		target_thread = binder_get_txn_from_and_acq_inner(in_reply_to);
-		if (target_thread == NULL) {
-			/* annotation for sparse */
-			__release(&target_thread->proc->inner_lock);
-			binder_txn_error("%d:%d reply target not found\n",
-				thread->pid, proc->pid);
-			return_error = BR_DEAD_REPLY;
-			return_error_line = __LINE__;
-			goto err_dead_binder;
-		}
-		if (target_thread->transaction_stack != in_reply_to) {
-			binder_user_error("%d:%d got reply transaction with bad target transaction stack %d, expected %d\n",
-				proc->pid, thread->pid,
-				target_thread->transaction_stack ?
-				target_thread->transaction_stack->debug_id : 0,
-				in_reply_to->debug_id);
-			binder_inner_proc_unlock(target_thread->proc);
-			return_error = BR_FAILED_REPLY;
-			return_error_param = -EPROTO;
-			return_error_line = __LINE__;
-			in_reply_to = NULL;
-			target_thread = NULL;
-			goto err_dead_binder;
-		}
-		target_proc = target_thread->proc;
-		target_proc->tmp_ref++;
-		binder_inner_proc_unlock(target_thread->proc);
-	} else {
-		if (tr->target.handle) {
-			struct binder_ref *ref;
-
-			/*
-			 * There must already be a strong ref
-			 * on this node. If so, do a strong
-			 * increment on the node to ensure it
-			 * stays alive until the transaction is
-			 * done.
-			 */
-			binder_proc_lock(proc);
-			ref = binder_get_ref_olocked(proc, tr->target.handle,
-						     true);
-			if (ref) {
-				target_node = binder_get_node_refs_for_txn(
-						ref->node, &target_proc,
-						&return_error);
-			} else {
-				binder_user_error("%d:%d got transaction to invalid handle, %u\n",
-						  proc->pid, thread->pid, tr->target.handle);
-				return_error = BR_FAILED_REPLY;
-			}
-			binder_proc_unlock(proc);
-		} else {
-			mutex_lock(&context->context_mgr_node_lock);
-			target_node = context->binder_context_mgr_node;
-			if (target_node)
-				target_node = binder_get_node_refs_for_txn(
-						target_node, &target_proc,
-						&return_error);
-			else
-				return_error = BR_DEAD_REPLY;
-			mutex_unlock(&context->context_mgr_node_lock);
-			if (target_node && target_proc->pid == proc->pid) {
-				binder_user_error("%d:%d got transaction to context manager from process owning it\n",
-						  proc->pid, thread->pid);
-				return_error = BR_FAILED_REPLY;
-				return_error_param = -EINVAL;
-				return_error_line = __LINE__;
-				goto err_invalid_target_handle;
-			}
-		}
-		if (!target_node) {
-			binder_txn_error("%d:%d cannot find target node\n",
-				thread->pid, proc->pid);
-			/*
-			 * return_error is set above
-			 */
-			return_error_param = -EINVAL;
-			return_error_line = __LINE__;
-			goto err_dead_binder;
-		}
-		e->to_node = target_node->debug_id;
-		if (WARN_ON(proc == target_proc)) {
-			binder_txn_error("%d:%d self transactions not allowed\n",
-				thread->pid, proc->pid);
-			return_error = BR_FAILED_REPLY;
-			return_error_param = -EINVAL;
-			return_error_line = __LINE__;
-			goto err_invalid_target_handle;
-		}
-		if (security_binder_transaction(proc->cred,
-						target_proc->cred) < 0) {
-			binder_txn_error("%d:%d transaction credentials failed\n",
-				thread->pid, proc->pid);
-			return_error = BR_FAILED_REPLY;
-			return_error_param = -EPERM;
-			return_error_line = __LINE__;
-			goto err_invalid_target_handle;
-		}
-		binder_inner_proc_lock(proc);
-
-		w = list_first_entry_or_null(&thread->todo,
-					     struct binder_work, entry);
-		if (!(tr->flags & TF_ONE_WAY) && w &&
-		    w->type == BINDER_WORK_TRANSACTION) {
-			/*
-			 * Do not allow new outgoing transaction from a
-			 * thread that has a transaction at the head of
-			 * its todo list. Only need to check the head
-			 * because binder_select_thread_ilocked picks a
-			 * thread from proc->waiting_threads to enqueue
-			 * the transaction, and nothing is queued to the
-			 * todo list while the thread is on waiting_threads.
-			 */
-			binder_user_error("%d:%d new transaction not allowed when there is a transaction on thread todo\n",
-					  proc->pid, thread->pid);
-			binder_inner_proc_unlock(proc);
-			return_error = BR_FAILED_REPLY;
-			return_error_param = -EPROTO;
-			return_error_line = __LINE__;
-			goto err_bad_todo_list;
-		}
-
-		if (!(tr->flags & TF_ONE_WAY) && thread->transaction_stack) {
-			struct binder_transaction *tmp;
-
-			tmp = thread->transaction_stack;
-			if (tmp->to_thread != thread) {
-				spin_lock(&tmp->lock);
-				binder_user_error("%d:%d got new transaction with bad transaction stack, transaction %d has target %d:%d\n",
-					proc->pid, thread->pid, tmp->debug_id,
-					tmp->to_proc ? tmp->to_proc->pid : 0,
-					tmp->to_thread ?
-					tmp->to_thread->pid : 0);
-				spin_unlock(&tmp->lock);
-				binder_inner_proc_unlock(proc);
-				return_error = BR_FAILED_REPLY;
-				return_error_param = -EPROTO;
-				return_error_line = __LINE__;
-				goto err_bad_call_stack;
-			}
-			while (tmp) {
-				struct binder_thread *from;
-
-				spin_lock(&tmp->lock);
-				from = tmp->from;
-				if (from && from->proc == target_proc) {
-					atomic_inc(&from->tmp_ref);
-					target_thread = from;
-					spin_unlock(&tmp->lock);
-					break;
-				}
-				spin_unlock(&tmp->lock);
-				tmp = tmp->from_parent;
-			}
-		}
-		binder_inner_proc_unlock(proc);
-	}
-	if (target_thread)
-		e->to_thread = target_thread->pid;
-	e->to_proc = target_proc->pid;
-
-	/* TODO: reuse incoming transaction for reply */
-	t = kzalloc(sizeof(*t), GFP_KERNEL);
-	if (t == NULL) {
-		binder_txn_error("%d:%d cannot allocate transaction\n",
-			thread->pid, proc->pid);
-		return_error = BR_FAILED_REPLY;
-		return_error_param = -ENOMEM;
-		return_error_line = __LINE__;
-		goto err_alloc_t_failed;
-	}
-	INIT_LIST_HEAD(&t->fd_fixups);
-	binder_stats_created(BINDER_STAT_TRANSACTION);
-	spin_lock_init(&t->lock);
-
-	tcomplete = kzalloc(sizeof(*tcomplete), GFP_KERNEL);
-	if (tcomplete == NULL) {
-		binder_txn_error("%d:%d cannot allocate work for transaction\n",
-			thread->pid, proc->pid);
-		return_error = BR_FAILED_REPLY;
-		return_error_param = -ENOMEM;
-		return_error_line = __LINE__;
-		goto err_alloc_tcomplete_failed;
-	}
-	binder_stats_created(BINDER_STAT_TRANSACTION_COMPLETE);
-
-	t->debug_id = t_debug_id;
-	t->start_time = t_start_time;
-
-	if (reply)
-		binder_debug(BINDER_DEBUG_TRANSACTION,
-			     "%d:%d BC_REPLY %d -> %d:%d, data %016llx-%016llx size %lld-%lld-%lld\n",
-			     proc->pid, thread->pid, t->debug_id,
-			     target_proc->pid, target_thread->pid,
-			     (u64)tr->data.ptr.buffer,
-			     (u64)tr->data.ptr.offsets,
-			     (u64)tr->data_size, (u64)tr->offsets_size,
-			     (u64)extra_buffers_size);
-	else
-		binder_debug(BINDER_DEBUG_TRANSACTION,
-			     "%d:%d BC_TRANSACTION %d -> %d - node %d, data %016llx-%016llx size %lld-%lld-%lld\n",
-			     proc->pid, thread->pid, t->debug_id,
-			     target_proc->pid, target_node->debug_id,
-			     (u64)tr->data.ptr.buffer,
-			     (u64)tr->data.ptr.offsets,
-			     (u64)tr->data_size, (u64)tr->offsets_size,
-			     (u64)extra_buffers_size);
-
-	if (!reply && !(tr->flags & TF_ONE_WAY))
-		t->from = thread;
-	else
-		t->from = NULL;
-	t->from_pid = proc->pid;
-	t->from_tid = thread->pid;
-	t->sender_euid = task_euid(proc->tsk);
-	t->to_proc = target_proc;
-	t->to_thread = target_thread;
-	t->code = tr->code;
-	t->flags = tr->flags;
-	t->priority = task_nice(current);
-
-	if (target_node && target_node->txn_security_ctx) {
-		u32 secid;
-		size_t added_size;
-
-		security_cred_getsecid(proc->cred, &secid);
-		ret = security_secid_to_secctx(secid, &secctx, &secctx_sz);
-		if (ret) {
-			binder_txn_error("%d:%d failed to get security context\n",
-				thread->pid, proc->pid);
-			return_error = BR_FAILED_REPLY;
-			return_error_param = ret;
-			return_error_line = __LINE__;
-			goto err_get_secctx_failed;
-		}
-		added_size = ALIGN(secctx_sz, sizeof(u64));
-		extra_buffers_size += added_size;
-		if (extra_buffers_size < added_size) {
-			binder_txn_error("%d:%d integer overflow of extra_buffers_size\n",
-				thread->pid, proc->pid);
-			return_error = BR_FAILED_REPLY;
-			return_error_param = -EINVAL;
-			return_error_line = __LINE__;
-			goto err_bad_extra_size;
-		}
-	}
-
-	trace_binder_transaction(reply, t, target_node);
-
-	t->buffer = binder_alloc_new_buf(&target_proc->alloc, tr->data_size,
-		tr->offsets_size, extra_buffers_size,
-		!reply && (t->flags & TF_ONE_WAY), current->tgid);
-	if (IS_ERR(t->buffer)) {
-		char *s;
-
-		ret = PTR_ERR(t->buffer);
-		s = (ret == -ESRCH) ? ": vma cleared, target dead or dying"
-			: (ret == -ENOSPC) ? ": no space left"
-			: (ret == -ENOMEM) ? ": memory allocation failed"
-			: "";
-		binder_txn_error("cannot allocate buffer%s", s);
-
-		return_error_param = PTR_ERR(t->buffer);
-		return_error = return_error_param == -ESRCH ?
-			BR_DEAD_REPLY : BR_FAILED_REPLY;
-		return_error_line = __LINE__;
-		t->buffer = NULL;
-		goto err_binder_alloc_buf_failed;
-	}
-	if (secctx) {
-		int err;
-		size_t buf_offset = ALIGN(tr->data_size, sizeof(void *)) +
-				    ALIGN(tr->offsets_size, sizeof(void *)) +
-				    ALIGN(extra_buffers_size, sizeof(void *)) -
-				    ALIGN(secctx_sz, sizeof(u64));
-
-		t->security_ctx = (uintptr_t)t->buffer->user_data + buf_offset;
-		err = binder_alloc_copy_to_buffer(&target_proc->alloc,
-						  t->buffer, buf_offset,
-						  secctx, secctx_sz);
-		if (err) {
-			t->security_ctx = 0;
-			WARN_ON(1);
-		}
-		security_release_secctx(secctx, secctx_sz);
-		secctx = NULL;
-	}
-	t->buffer->debug_id = t->debug_id;
-	t->buffer->transaction = t;
-	t->buffer->target_node = target_node;
-	t->buffer->clear_on_free = !!(t->flags & TF_CLEAR_BUF);
-	trace_binder_transaction_alloc_buf(t->buffer);
-
-	if (binder_alloc_copy_user_to_buffer(
-				&target_proc->alloc,
-				t->buffer,
-				ALIGN(tr->data_size, sizeof(void *)),
-				(const void __user *)
-					(uintptr_t)tr->data.ptr.offsets,
-				tr->offsets_size)) {
-		binder_user_error("%d:%d got transaction with invalid offsets ptr\n",
-				proc->pid, thread->pid);
-		return_error = BR_FAILED_REPLY;
-		return_error_param = -EFAULT;
-		return_error_line = __LINE__;
-		goto err_copy_data_failed;
-	}
-	if (!IS_ALIGNED(tr->offsets_size, sizeof(binder_size_t))) {
-		binder_user_error("%d:%d got transaction with invalid offsets size, %lld\n",
-				proc->pid, thread->pid, (u64)tr->offsets_size);
-		return_error = BR_FAILED_REPLY;
-		return_error_param = -EINVAL;
-		return_error_line = __LINE__;
-		goto err_bad_offset;
-	}
-	if (!IS_ALIGNED(extra_buffers_size, sizeof(u64))) {
-		binder_user_error("%d:%d got transaction with unaligned buffers size, %lld\n",
-				  proc->pid, thread->pid,
-				  (u64)extra_buffers_size);
-		return_error = BR_FAILED_REPLY;
-		return_error_param = -EINVAL;
-		return_error_line = __LINE__;
-		goto err_bad_offset;
-	}
-	off_start_offset = ALIGN(tr->data_size, sizeof(void *));
-	buffer_offset = off_start_offset;
-	off_end_offset = off_start_offset + tr->offsets_size;
-	sg_buf_offset = ALIGN(off_end_offset, sizeof(void *));
-	sg_buf_end_offset = sg_buf_offset + extra_buffers_size -
-		ALIGN(secctx_sz, sizeof(u64));
-	off_min = 0;
-	for (buffer_offset = off_start_offset; buffer_offset < off_end_offset;
-	     buffer_offset += sizeof(binder_size_t)) {
-		struct binder_object_header *hdr;
-		size_t object_size;
-		struct binder_object object;
-		binder_size_t object_offset;
-		binder_size_t copy_size;
-
-		if (binder_alloc_copy_from_buffer(&target_proc->alloc,
-						  &object_offset,
-						  t->buffer,
-						  buffer_offset,
-						  sizeof(object_offset))) {
-			binder_txn_error("%d:%d copy offset from buffer failed\n",
-				thread->pid, proc->pid);
-			return_error = BR_FAILED_REPLY;
-			return_error_param = -EINVAL;
-			return_error_line = __LINE__;
-			goto err_bad_offset;
-		}
-
-		/*
-		 * Copy the source user buffer up to the next object
-		 * that will be processed.
-		 */
-		copy_size = object_offset - user_offset;
-		if (copy_size && (user_offset > object_offset ||
-				binder_alloc_copy_user_to_buffer(
-					&target_proc->alloc,
-					t->buffer, user_offset,
-					user_buffer + user_offset,
-					copy_size))) {
-			binder_user_error("%d:%d got transaction with invalid data ptr\n",
-					proc->pid, thread->pid);
-			return_error = BR_FAILED_REPLY;
-			return_error_param = -EFAULT;
-			return_error_line = __LINE__;
-			goto err_copy_data_failed;
-		}
-		object_size = binder_get_object(target_proc, user_buffer,
-				t->buffer, object_offset, &object);
-		if (object_size == 0 || object_offset < off_min) {
-			binder_user_error("%d:%d got transaction with invalid offset (%lld, min %lld max %lld) or object.\n",
-					  proc->pid, thread->pid,
-					  (u64)object_offset,
-					  (u64)off_min,
-					  (u64)t->buffer->data_size);
-			return_error = BR_FAILED_REPLY;
-			return_error_param = -EINVAL;
-			return_error_line = __LINE__;
-			goto err_bad_offset;
-		}
-		/*
-		 * Set offset to the next buffer fragment to be
-		 * copied
-		 */
-		user_offset = object_offset + object_size;
-
-		hdr = &object.hdr;
-		off_min = object_offset + object_size;
-		switch (hdr->type) {
-		case BINDER_TYPE_BINDER:
-		case BINDER_TYPE_WEAK_BINDER: {
-			struct flat_binder_object *fp;
-
-			fp = to_flat_binder_object(hdr);
-			ret = binder_translate_binder(fp, t, thread);
-
-			if (ret < 0 ||
-			    binder_alloc_copy_to_buffer(&target_proc->alloc,
-							t->buffer,
-							object_offset,
-							fp, sizeof(*fp))) {
-				binder_txn_error("%d:%d translate binder failed\n",
-					thread->pid, proc->pid);
-				return_error = BR_FAILED_REPLY;
-				return_error_param = ret;
-				return_error_line = __LINE__;
-				goto err_translate_failed;
-			}
-		} break;
-		case BINDER_TYPE_HANDLE:
-		case BINDER_TYPE_WEAK_HANDLE: {
-			struct flat_binder_object *fp;
-
-			fp = to_flat_binder_object(hdr);
-			ret = binder_translate_handle(fp, t, thread);
-			if (ret < 0 ||
-			    binder_alloc_copy_to_buffer(&target_proc->alloc,
-							t->buffer,
-							object_offset,
-							fp, sizeof(*fp))) {
-				binder_txn_error("%d:%d translate handle failed\n",
-					thread->pid, proc->pid);
-				return_error = BR_FAILED_REPLY;
-				return_error_param = ret;
-				return_error_line = __LINE__;
-				goto err_translate_failed;
-			}
-		} break;
-
-		case BINDER_TYPE_FD: {
-			struct binder_fd_object *fp = to_binder_fd_object(hdr);
-			binder_size_t fd_offset = object_offset +
-				(uintptr_t)&fp->fd - (uintptr_t)fp;
-			int ret = binder_translate_fd(fp->fd, fd_offset, t,
-						      thread, in_reply_to);
-
-			fp->pad_binder = 0;
-			if (ret < 0 ||
-			    binder_alloc_copy_to_buffer(&target_proc->alloc,
-							t->buffer,
-							object_offset,
-							fp, sizeof(*fp))) {
-				binder_txn_error("%d:%d translate fd failed\n",
-					thread->pid, proc->pid);
-				return_error = BR_FAILED_REPLY;
-				return_error_param = ret;
-				return_error_line = __LINE__;
-				goto err_translate_failed;
-			}
-		} break;
-		case BINDER_TYPE_FDA: {
-			struct binder_object ptr_object;
-			binder_size_t parent_offset;
-			struct binder_object user_object;
-			size_t user_parent_size;
-			struct binder_fd_array_object *fda =
-				to_binder_fd_array_object(hdr);
-			size_t num_valid = (buffer_offset - off_start_offset) /
-						sizeof(binder_size_t);
-			struct binder_buffer_object *parent =
-				binder_validate_ptr(target_proc, t->buffer,
-						    &ptr_object, fda->parent,
-						    off_start_offset,
-						    &parent_offset,
-						    num_valid);
-			if (!parent) {
-				binder_user_error("%d:%d got transaction with invalid parent offset or type\n",
-						  proc->pid, thread->pid);
-				return_error = BR_FAILED_REPLY;
-				return_error_param = -EINVAL;
-				return_error_line = __LINE__;
-				goto err_bad_parent;
-			}
-			if (!binder_validate_fixup(target_proc, t->buffer,
-						   off_start_offset,
-						   parent_offset,
-						   fda->parent_offset,
-						   last_fixup_obj_off,
-						   last_fixup_min_off)) {
-				binder_user_error("%d:%d got transaction with out-of-order buffer fixup\n",
-						  proc->pid, thread->pid);
-				return_error = BR_FAILED_REPLY;
-				return_error_param = -EINVAL;
-				return_error_line = __LINE__;
-				goto err_bad_parent;
-			}
-			/*
-			 * We need to read the user version of the parent
-			 * object to get the original user offset
-			 */
-			user_parent_size =
-				binder_get_object(proc, user_buffer, t->buffer,
-						  parent_offset, &user_object);
-			if (user_parent_size != sizeof(user_object.bbo)) {
-				binder_user_error("%d:%d invalid ptr object size: %zd vs %zd\n",
-						  proc->pid, thread->pid,
-						  user_parent_size,
-						  sizeof(user_object.bbo));
-				return_error = BR_FAILED_REPLY;
-				return_error_param = -EINVAL;
-				return_error_line = __LINE__;
-				goto err_bad_parent;
-			}
-			ret = binder_translate_fd_array(&pf_head, fda,
-							user_buffer, parent,
-							&user_object.bbo, t,
-							thread, in_reply_to);
-			if (!ret)
-				ret = binder_alloc_copy_to_buffer(&target_proc->alloc,
-								  t->buffer,
-								  object_offset,
-								  fda, sizeof(*fda));
-			if (ret) {
-				binder_txn_error("%d:%d translate fd array failed\n",
-					thread->pid, proc->pid);
-				return_error = BR_FAILED_REPLY;
-				return_error_param = ret > 0 ? -EINVAL : ret;
-				return_error_line = __LINE__;
-				goto err_translate_failed;
-			}
-			last_fixup_obj_off = parent_offset;
-			last_fixup_min_off =
-				fda->parent_offset + sizeof(u32) * fda->num_fds;
-		} break;
-		case BINDER_TYPE_PTR: {
-			struct binder_buffer_object *bp =
-				to_binder_buffer_object(hdr);
-			size_t buf_left = sg_buf_end_offset - sg_buf_offset;
-			size_t num_valid;
-
-			if (bp->length > buf_left) {
-				binder_user_error("%d:%d got transaction with too large buffer\n",
-						  proc->pid, thread->pid);
-				return_error = BR_FAILED_REPLY;
-				return_error_param = -EINVAL;
-				return_error_line = __LINE__;
-				goto err_bad_offset;
-			}
-			ret = binder_defer_copy(&sgc_head, sg_buf_offset,
-				(const void __user *)(uintptr_t)bp->buffer,
-				bp->length);
-			if (ret) {
-				binder_txn_error("%d:%d deferred copy failed\n",
-					thread->pid, proc->pid);
-				return_error = BR_FAILED_REPLY;
-				return_error_param = ret;
-				return_error_line = __LINE__;
-				goto err_translate_failed;
-			}
-			/* Fixup buffer pointer to target proc address space */
-			bp->buffer = (uintptr_t)
-				t->buffer->user_data + sg_buf_offset;
-			sg_buf_offset += ALIGN(bp->length, sizeof(u64));
-
-			num_valid = (buffer_offset - off_start_offset) /
-					sizeof(binder_size_t);
-			ret = binder_fixup_parent(&pf_head, t,
-						  thread, bp,
-						  off_start_offset,
-						  num_valid,
-						  last_fixup_obj_off,
-						  last_fixup_min_off);
-			if (ret < 0 ||
-			    binder_alloc_copy_to_buffer(&target_proc->alloc,
-							t->buffer,
-							object_offset,
-							bp, sizeof(*bp))) {
-				binder_txn_error("%d:%d failed to fixup parent\n",
-					thread->pid, proc->pid);
-				return_error = BR_FAILED_REPLY;
-				return_error_param = ret;
-				return_error_line = __LINE__;
-				goto err_translate_failed;
-			}
-			last_fixup_obj_off = object_offset;
-			last_fixup_min_off = 0;
-		} break;
-		default:
-			binder_user_error("%d:%d got transaction with invalid object type, %x\n",
-				proc->pid, thread->pid, hdr->type);
-			return_error = BR_FAILED_REPLY;
-			return_error_param = -EINVAL;
-			return_error_line = __LINE__;
-			goto err_bad_object_type;
-		}
-	}
-	/* Done processing objects, copy the rest of the buffer */
-	if (binder_alloc_copy_user_to_buffer(
-				&target_proc->alloc,
-				t->buffer, user_offset,
-				user_buffer + user_offset,
-				tr->data_size - user_offset)) {
-		binder_user_error("%d:%d got transaction with invalid data ptr\n",
-				proc->pid, thread->pid);
-		return_error = BR_FAILED_REPLY;
-		return_error_param = -EFAULT;
-		return_error_line = __LINE__;
-		goto err_copy_data_failed;
-	}
-
-	ret = binder_do_deferred_txn_copies(&target_proc->alloc, t->buffer,
-					    &sgc_head, &pf_head);
-	if (ret) {
-		binder_user_error("%d:%d got transaction with invalid offsets ptr\n",
-				  proc->pid, thread->pid);
-		return_error = BR_FAILED_REPLY;
-		return_error_param = ret;
-		return_error_line = __LINE__;
-		goto err_copy_data_failed;
-	}
-	if (t->buffer->oneway_spam_suspect)
-		tcomplete->type = BINDER_WORK_TRANSACTION_ONEWAY_SPAM_SUSPECT;
-	else
-		tcomplete->type = BINDER_WORK_TRANSACTION_COMPLETE;
-	t->work.type = BINDER_WORK_TRANSACTION;
-
-	if (reply) {
-		binder_enqueue_thread_work(thread, tcomplete);
-		binder_inner_proc_lock(target_proc);
-		if (target_thread->is_dead) {
-			return_error = BR_DEAD_REPLY;
-			binder_inner_proc_unlock(target_proc);
-			goto err_dead_proc_or_thread;
-		}
-		BUG_ON(t->buffer->async_transaction != 0);
-		binder_pop_transaction_ilocked(target_thread, in_reply_to);
-		binder_enqueue_thread_work_ilocked(target_thread, &t->work);
-		target_proc->outstanding_txns++;
-		binder_inner_proc_unlock(target_proc);
-		wake_up_interruptible_sync(&target_thread->wait);
-		binder_free_transaction(in_reply_to);
-	} else if (!(t->flags & TF_ONE_WAY)) {
-		BUG_ON(t->buffer->async_transaction != 0);
-		binder_inner_proc_lock(proc);
-		/*
-		 * Defer the TRANSACTION_COMPLETE, so we don't return to
-		 * userspace immediately; this allows the target process to
-		 * immediately start processing this transaction, reducing
-		 * latency. We will then return the TRANSACTION_COMPLETE when
-		 * the target replies (or there is an error).
-		 */
-		binder_enqueue_deferred_thread_work_ilocked(thread, tcomplete);
-		t->need_reply = 1;
-		t->from_parent = thread->transaction_stack;
-		thread->transaction_stack = t;
-		binder_inner_proc_unlock(proc);
-		return_error = binder_proc_transaction(t,
-				target_proc, target_thread);
-		if (return_error) {
-			binder_inner_proc_lock(proc);
-			binder_pop_transaction_ilocked(thread, t);
-			binder_inner_proc_unlock(proc);
-			goto err_dead_proc_or_thread;
-		}
-	} else {
-		BUG_ON(target_node == NULL);
-		BUG_ON(t->buffer->async_transaction != 1);
-		return_error = binder_proc_transaction(t, target_proc, NULL);
-		/*
-		 * Let the caller know when async transaction reaches a frozen
-		 * process and is put in a pending queue, waiting for the target
-		 * process to be unfrozen.
-		 */
-		if (return_error == BR_TRANSACTION_PENDING_FROZEN)
-			tcomplete->type = BINDER_WORK_TRANSACTION_PENDING;
-		binder_enqueue_thread_work(thread, tcomplete);
-		if (return_error &&
-		    return_error != BR_TRANSACTION_PENDING_FROZEN)
-			goto err_dead_proc_or_thread;
-	}
-	if (target_thread)
-		binder_thread_dec_tmpref(target_thread);
-	binder_proc_dec_tmpref(target_proc);
-	if (target_node)
-		binder_dec_node_tmpref(target_node);
-	/*
-	 * write barrier to synchronize with initialization
-	 * of log entry
-	 */
-	smp_wmb();
-	WRITE_ONCE(e->debug_id_done, t_debug_id);
-	return;
-
-err_dead_proc_or_thread:
-	binder_txn_error("%d:%d dead process or thread\n",
-		thread->pid, proc->pid);
-	return_error_line = __LINE__;
-	binder_dequeue_work(proc, tcomplete);
-err_translate_failed:
-err_bad_object_type:
-err_bad_offset:
-err_bad_parent:
-err_copy_data_failed:
-	binder_cleanup_deferred_txn_lists(&sgc_head, &pf_head);
-	binder_free_txn_fixups(t);
-	trace_binder_transaction_failed_buffer_release(t->buffer);
-	binder_transaction_buffer_release(target_proc, NULL, t->buffer,
-					  buffer_offset, true);
-	if (target_node)
-		binder_dec_node_tmpref(target_node);
-	target_node = NULL;
-	t->buffer->transaction = NULL;
-	binder_alloc_free_buf(&target_proc->alloc, t->buffer);
-err_binder_alloc_buf_failed:
-err_bad_extra_size:
-	if (secctx)
-		security_release_secctx(secctx, secctx_sz);
-err_get_secctx_failed:
-	kfree(tcomplete);
-	binder_stats_deleted(BINDER_STAT_TRANSACTION_COMPLETE);
-err_alloc_tcomplete_failed:
-	if (trace_binder_txn_latency_free_enabled())
-		binder_txn_latency_free(t);
-	kfree(t);
-	binder_stats_deleted(BINDER_STAT_TRANSACTION);
-err_alloc_t_failed:
-err_bad_todo_list:
-err_bad_call_stack:
-err_empty_call_stack:
-err_dead_binder:
-err_invalid_target_handle:
-	if (target_node) {
-		binder_dec_node(target_node, 1, 0);
-		binder_dec_node_tmpref(target_node);
-	}
-
-	binder_debug(BINDER_DEBUG_FAILED_TRANSACTION,
-		     "%d:%d transaction %s to %d:%d failed %d/%d/%d, size %lld-%lld line %d\n",
-		     proc->pid, thread->pid, reply ? "reply" :
-		     (tr->flags & TF_ONE_WAY ? "async" : "call"),
-		     target_proc ? target_proc->pid : 0,
-		     target_thread ? target_thread->pid : 0,
-		     t_debug_id, return_error, return_error_param,
-		     (u64)tr->data_size, (u64)tr->offsets_size,
-		     return_error_line);
-
-	if (target_thread)
-		binder_thread_dec_tmpref(target_thread);
-	if (target_proc)
-		binder_proc_dec_tmpref(target_proc);
-
-	{
-		struct binder_transaction_log_entry *fe;
-
-		e->return_error = return_error;
-		e->return_error_param = return_error_param;
-		e->return_error_line = return_error_line;
-		fe = binder_transaction_log_add(&binder_transaction_log_failed);
-		*fe = *e;
-		/*
-		 * write barrier to synchronize with initialization
-		 * of log entry
-		 */
-		smp_wmb();
-		WRITE_ONCE(e->debug_id_done, t_debug_id);
-		WRITE_ONCE(fe->debug_id_done, t_debug_id);
-	}
-
-	BUG_ON(thread->return_error.cmd != BR_OK);
-	if (in_reply_to) {
-		binder_set_txn_from_error(in_reply_to, t_debug_id,
-				return_error, return_error_param);
-		thread->return_error.cmd = BR_TRANSACTION_COMPLETE;
-		binder_enqueue_thread_work(thread, &thread->return_error.work);
-		binder_send_failed_reply(in_reply_to, return_error);
-	} else {
-		binder_inner_proc_lock(proc);
-		binder_set_extended_error(&thread->ee, t_debug_id,
-				return_error, return_error_param);
-		binder_inner_proc_unlock(proc);
-		thread->return_error.cmd = return_error;
-		binder_enqueue_thread_work(thread, &thread->return_error.work);
-	}
-}
-
-/**
- * binder_free_buf() - free the specified buffer
- * @proc:	binder proc that owns buffer
- * @buffer:	buffer to be freed
- * @is_failure:	failed to send transaction
- *
- * If buffer for an async transaction, enqueue the next async
- * transaction from the node.
- *
- * Cleanup buffer and free it.
- */
-static void
-binder_free_buf(struct binder_proc *proc,
-		struct binder_thread *thread,
-		struct binder_buffer *buffer, bool is_failure)
-{
-	binder_inner_proc_lock(proc);
-	if (buffer->transaction) {
-		buffer->transaction->buffer = NULL;
-		buffer->transaction = NULL;
-	}
-	binder_inner_proc_unlock(proc);
-	if (buffer->async_transaction && buffer->target_node) {
-		struct binder_node *buf_node;
-		struct binder_work *w;
-
-		buf_node = buffer->target_node;
-		binder_node_inner_lock(buf_node);
-		BUG_ON(!buf_node->has_async_transaction);
-		BUG_ON(buf_node->proc != proc);
-		w = binder_dequeue_work_head_ilocked(
-				&buf_node->async_todo);
-		if (!w) {
-			buf_node->has_async_transaction = false;
-		} else {
-			binder_enqueue_work_ilocked(
-					w, &proc->todo);
-			binder_wakeup_proc_ilocked(proc);
-		}
-		binder_node_inner_unlock(buf_node);
-	}
-	trace_binder_transaction_buffer_release(buffer);
-	binder_release_entire_buffer(proc, thread, buffer, is_failure);
-	binder_alloc_free_buf(&proc->alloc, buffer);
-}
-
-static int binder_thread_write(struct binder_proc *proc,
-			struct binder_thread *thread,
-			binder_uintptr_t binder_buffer, size_t size,
-			binder_size_t *consumed)
-{
-	uint32_t cmd;
-	struct binder_context *context = proc->context;
-	void __user *buffer = (void __user *)(uintptr_t)binder_buffer;
-	void __user *ptr = buffer + *consumed;
-	void __user *end = buffer + size;
-
-	while (ptr < end && thread->return_error.cmd == BR_OK) {
-		int ret;
-
-		if (get_user(cmd, (uint32_t __user *)ptr))
-			return -EFAULT;
-		ptr += sizeof(uint32_t);
-		trace_binder_command(cmd);
-		if (_IOC_NR(cmd) < ARRAY_SIZE(binder_stats.bc)) {
-			atomic_inc(&binder_stats.bc[_IOC_NR(cmd)]);
-			atomic_inc(&proc->stats.bc[_IOC_NR(cmd)]);
-			atomic_inc(&thread->stats.bc[_IOC_NR(cmd)]);
-		}
-		switch (cmd) {
-		case BC_INCREFS:
-		case BC_ACQUIRE:
-		case BC_RELEASE:
-		case BC_DECREFS: {
-			uint32_t target;
-			const char *debug_string;
-			bool strong = cmd == BC_ACQUIRE || cmd == BC_RELEASE;
-			bool increment = cmd == BC_INCREFS || cmd == BC_ACQUIRE;
-			struct binder_ref_data rdata;
-
-			if (get_user(target, (uint32_t __user *)ptr))
-				return -EFAULT;
-
-			ptr += sizeof(uint32_t);
-			ret = -1;
-			if (increment && !target) {
-				struct binder_node *ctx_mgr_node;
-
-				mutex_lock(&context->context_mgr_node_lock);
-				ctx_mgr_node = context->binder_context_mgr_node;
-				if (ctx_mgr_node) {
-					if (ctx_mgr_node->proc == proc) {
-						binder_user_error("%d:%d context manager tried to acquire desc 0\n",
-								  proc->pid, thread->pid);
-						mutex_unlock(&context->context_mgr_node_lock);
-						return -EINVAL;
-					}
-					ret = binder_inc_ref_for_node(
-							proc, ctx_mgr_node,
-							strong, NULL, &rdata);
-				}
-				mutex_unlock(&context->context_mgr_node_lock);
-			}
-			if (ret)
-				ret = binder_update_ref_for_handle(
-						proc, target, increment, strong,
-						&rdata);
-			if (!ret && rdata.desc != target) {
-				binder_user_error("%d:%d tried to acquire reference to desc %d, got %d instead\n",
-					proc->pid, thread->pid,
-					target, rdata.desc);
-			}
-			switch (cmd) {
-			case BC_INCREFS:
-				debug_string = "IncRefs";
-				break;
-			case BC_ACQUIRE:
-				debug_string = "Acquire";
-				break;
-			case BC_RELEASE:
-				debug_string = "Release";
-				break;
-			case BC_DECREFS:
-			default:
-				debug_string = "DecRefs";
-				break;
-			}
-			if (ret) {
-				binder_user_error("%d:%d %s %d refcount change on invalid ref %d ret %d\n",
-					proc->pid, thread->pid, debug_string,
-					strong, target, ret);
-				break;
-			}
-			binder_debug(BINDER_DEBUG_USER_REFS,
-				     "%d:%d %s ref %d desc %d s %d w %d\n",
-				     proc->pid, thread->pid, debug_string,
-				     rdata.debug_id, rdata.desc, rdata.strong,
-				     rdata.weak);
-			break;
-		}
-		case BC_INCREFS_DONE:
-		case BC_ACQUIRE_DONE: {
-			binder_uintptr_t node_ptr;
-			binder_uintptr_t cookie;
-			struct binder_node *node;
-			bool free_node;
-
-			if (get_user(node_ptr, (binder_uintptr_t __user *)ptr))
-				return -EFAULT;
-			ptr += sizeof(binder_uintptr_t);
-			if (get_user(cookie, (binder_uintptr_t __user *)ptr))
-				return -EFAULT;
-			ptr += sizeof(binder_uintptr_t);
-			node = binder_get_node(proc, node_ptr);
-			if (node == NULL) {
-				binder_user_error("%d:%d %s u%016llx no match\n",
-					proc->pid, thread->pid,
-					cmd == BC_INCREFS_DONE ?
-					"BC_INCREFS_DONE" :
-					"BC_ACQUIRE_DONE",
-					(u64)node_ptr);
-				break;
-			}
-			if (cookie != node->cookie) {
-				binder_user_error("%d:%d %s u%016llx node %d cookie mismatch %016llx != %016llx\n",
-					proc->pid, thread->pid,
-					cmd == BC_INCREFS_DONE ?
-					"BC_INCREFS_DONE" : "BC_ACQUIRE_DONE",
-					(u64)node_ptr, node->debug_id,
-					(u64)cookie, (u64)node->cookie);
-				binder_put_node(node);
-				break;
-			}
-			binder_node_inner_lock(node);
-			if (cmd == BC_ACQUIRE_DONE) {
-				if (node->pending_strong_ref == 0) {
-					binder_user_error("%d:%d BC_ACQUIRE_DONE node %d has no pending acquire request\n",
-						proc->pid, thread->pid,
-						node->debug_id);
-					binder_node_inner_unlock(node);
-					binder_put_node(node);
-					break;
-				}
-				node->pending_strong_ref = 0;
-			} else {
-				if (node->pending_weak_ref == 0) {
-					binder_user_error("%d:%d BC_INCREFS_DONE node %d has no pending increfs request\n",
-						proc->pid, thread->pid,
-						node->debug_id);
-					binder_node_inner_unlock(node);
-					binder_put_node(node);
-					break;
-				}
-				node->pending_weak_ref = 0;
-			}
-			free_node = binder_dec_node_nilocked(node,
-					cmd == BC_ACQUIRE_DONE, 0);
-			WARN_ON(free_node);
-			binder_debug(BINDER_DEBUG_USER_REFS,
-				     "%d:%d %s node %d ls %d lw %d tr %d\n",
-				     proc->pid, thread->pid,
-				     cmd == BC_INCREFS_DONE ? "BC_INCREFS_DONE" : "BC_ACQUIRE_DONE",
-				     node->debug_id, node->local_strong_refs,
-				     node->local_weak_refs, node->tmp_refs);
-			binder_node_inner_unlock(node);
-			binder_put_node(node);
-			break;
-		}
-		case BC_ATTEMPT_ACQUIRE:
-			pr_err("BC_ATTEMPT_ACQUIRE not supported\n");
-			return -EINVAL;
-		case BC_ACQUIRE_RESULT:
-			pr_err("BC_ACQUIRE_RESULT not supported\n");
-			return -EINVAL;
-
-		case BC_FREE_BUFFER: {
-			binder_uintptr_t data_ptr;
-			struct binder_buffer *buffer;
-
-			if (get_user(data_ptr, (binder_uintptr_t __user *)ptr))
-				return -EFAULT;
-			ptr += sizeof(binder_uintptr_t);
-
-			buffer = binder_alloc_prepare_to_free(&proc->alloc,
-							      data_ptr);
-			if (IS_ERR_OR_NULL(buffer)) {
-				if (PTR_ERR(buffer) == -EPERM) {
-					binder_user_error(
-						"%d:%d BC_FREE_BUFFER u%016llx matched unreturned or currently freeing buffer\n",
-						proc->pid, thread->pid,
-						(u64)data_ptr);
-				} else {
-					binder_user_error(
-						"%d:%d BC_FREE_BUFFER u%016llx no match\n",
-						proc->pid, thread->pid,
-						(u64)data_ptr);
-				}
-				break;
-			}
-			binder_debug(BINDER_DEBUG_FREE_BUFFER,
-				     "%d:%d BC_FREE_BUFFER u%016llx found buffer %d for %s transaction\n",
-				     proc->pid, thread->pid, (u64)data_ptr,
-				     buffer->debug_id,
-				     buffer->transaction ? "active" : "finished");
-			binder_free_buf(proc, thread, buffer, false);
-			break;
-		}
-
-		case BC_TRANSACTION_SG:
-		case BC_REPLY_SG: {
-			struct binder_transaction_data_sg tr;
-
-			if (copy_from_user(&tr, ptr, sizeof(tr)))
-				return -EFAULT;
-			ptr += sizeof(tr);
-			binder_transaction(proc, thread, &tr.transaction_data,
-					   cmd == BC_REPLY_SG, tr.buffers_size);
-			break;
-		}
-		case BC_TRANSACTION:
-		case BC_REPLY: {
-			struct binder_transaction_data tr;
-
-			if (copy_from_user(&tr, ptr, sizeof(tr)))
-				return -EFAULT;
-			ptr += sizeof(tr);
-			binder_transaction(proc, thread, &tr,
-					   cmd == BC_REPLY, 0);
-			break;
-		}
-
-		case BC_REGISTER_LOOPER:
-			binder_debug(BINDER_DEBUG_THREADS,
-				     "%d:%d BC_REGISTER_LOOPER\n",
-				     proc->pid, thread->pid);
-			binder_inner_proc_lock(proc);
-			if (thread->looper & BINDER_LOOPER_STATE_ENTERED) {
-				thread->looper |= BINDER_LOOPER_STATE_INVALID;
-				binder_user_error("%d:%d ERROR: BC_REGISTER_LOOPER called after BC_ENTER_LOOPER\n",
-					proc->pid, thread->pid);
-			} else if (proc->requested_threads == 0) {
-				thread->looper |= BINDER_LOOPER_STATE_INVALID;
-				binder_user_error("%d:%d ERROR: BC_REGISTER_LOOPER called without request\n",
-					proc->pid, thread->pid);
-			} else {
-				proc->requested_threads--;
-				proc->requested_threads_started++;
-			}
-			thread->looper |= BINDER_LOOPER_STATE_REGISTERED;
-			binder_inner_proc_unlock(proc);
-			break;
-		case BC_ENTER_LOOPER:
-			binder_debug(BINDER_DEBUG_THREADS,
-				     "%d:%d BC_ENTER_LOOPER\n",
-				     proc->pid, thread->pid);
-			if (thread->looper & BINDER_LOOPER_STATE_REGISTERED) {
-				thread->looper |= BINDER_LOOPER_STATE_INVALID;
-				binder_user_error("%d:%d ERROR: BC_ENTER_LOOPER called after BC_REGISTER_LOOPER\n",
-					proc->pid, thread->pid);
-			}
-			thread->looper |= BINDER_LOOPER_STATE_ENTERED;
-			break;
-		case BC_EXIT_LOOPER:
-			binder_debug(BINDER_DEBUG_THREADS,
-				     "%d:%d BC_EXIT_LOOPER\n",
-				     proc->pid, thread->pid);
-			thread->looper |= BINDER_LOOPER_STATE_EXITED;
-			break;
-
-		case BC_REQUEST_DEATH_NOTIFICATION:
-		case BC_CLEAR_DEATH_NOTIFICATION: {
-			uint32_t target;
-			binder_uintptr_t cookie;
-			struct binder_ref *ref;
-			struct binder_ref_death *death = NULL;
-
-			if (get_user(target, (uint32_t __user *)ptr))
-				return -EFAULT;
-			ptr += sizeof(uint32_t);
-			if (get_user(cookie, (binder_uintptr_t __user *)ptr))
-				return -EFAULT;
-			ptr += sizeof(binder_uintptr_t);
-			if (cmd == BC_REQUEST_DEATH_NOTIFICATION) {
-				/*
-				 * Allocate memory for death notification
-				 * before taking lock
-				 */
-				death = kzalloc(sizeof(*death), GFP_KERNEL);
-				if (death == NULL) {
-					WARN_ON(thread->return_error.cmd !=
-						BR_OK);
-					thread->return_error.cmd = BR_ERROR;
-					binder_enqueue_thread_work(
-						thread,
-						&thread->return_error.work);
-					binder_debug(
-						BINDER_DEBUG_FAILED_TRANSACTION,
-						"%d:%d BC_REQUEST_DEATH_NOTIFICATION failed\n",
-						proc->pid, thread->pid);
-					break;
-				}
-			}
-			binder_proc_lock(proc);
-			ref = binder_get_ref_olocked(proc, target, false);
-			if (ref == NULL) {
-				binder_user_error("%d:%d %s invalid ref %d\n",
-					proc->pid, thread->pid,
-					cmd == BC_REQUEST_DEATH_NOTIFICATION ?
-					"BC_REQUEST_DEATH_NOTIFICATION" :
-					"BC_CLEAR_DEATH_NOTIFICATION",
-					target);
-				binder_proc_unlock(proc);
-				kfree(death);
-				break;
-			}
-
-			binder_debug(BINDER_DEBUG_DEATH_NOTIFICATION,
-				     "%d:%d %s %016llx ref %d desc %d s %d w %d for node %d\n",
-				     proc->pid, thread->pid,
-				     cmd == BC_REQUEST_DEATH_NOTIFICATION ?
-				     "BC_REQUEST_DEATH_NOTIFICATION" :
-				     "BC_CLEAR_DEATH_NOTIFICATION",
-				     (u64)cookie, ref->data.debug_id,
-				     ref->data.desc, ref->data.strong,
-				     ref->data.weak, ref->node->debug_id);
-
-			binder_node_lock(ref->node);
-			if (cmd == BC_REQUEST_DEATH_NOTIFICATION) {
-				if (ref->death) {
-					binder_user_error("%d:%d BC_REQUEST_DEATH_NOTIFICATION death notification already set\n",
-						proc->pid, thread->pid);
-					binder_node_unlock(ref->node);
-					binder_proc_unlock(proc);
-					kfree(death);
-					break;
-				}
-				binder_stats_created(BINDER_STAT_DEATH);
-				INIT_LIST_HEAD(&death->work.entry);
-				death->cookie = cookie;
-				ref->death = death;
-				if (ref->node->proc == NULL) {
-					ref->death->work.type = BINDER_WORK_DEAD_BINDER;
-
-					binder_inner_proc_lock(proc);
-					binder_enqueue_work_ilocked(
-						&ref->death->work, &proc->todo);
-					binder_wakeup_proc_ilocked(proc);
-					binder_inner_proc_unlock(proc);
-				}
-			} else {
-				if (ref->death == NULL) {
-					binder_user_error("%d:%d BC_CLEAR_DEATH_NOTIFICATION death notification not active\n",
-						proc->pid, thread->pid);
-					binder_node_unlock(ref->node);
-					binder_proc_unlock(proc);
-					break;
-				}
-				death = ref->death;
-				if (death->cookie != cookie) {
-					binder_user_error("%d:%d BC_CLEAR_DEATH_NOTIFICATION death notification cookie mismatch %016llx != %016llx\n",
-						proc->pid, thread->pid,
-						(u64)death->cookie,
-						(u64)cookie);
-					binder_node_unlock(ref->node);
-					binder_proc_unlock(proc);
-					break;
-				}
-				ref->death = NULL;
-				binder_inner_proc_lock(proc);
-				if (list_empty(&death->work.entry)) {
-					death->work.type = BINDER_WORK_CLEAR_DEATH_NOTIFICATION;
-					if (thread->looper &
-					    (BINDER_LOOPER_STATE_REGISTERED |
-					     BINDER_LOOPER_STATE_ENTERED))
-						binder_enqueue_thread_work_ilocked(
-								thread,
-								&death->work);
-					else {
-						binder_enqueue_work_ilocked(
-								&death->work,
-								&proc->todo);
-						binder_wakeup_proc_ilocked(
-								proc);
-					}
-				} else {
-					BUG_ON(death->work.type != BINDER_WORK_DEAD_BINDER);
-					death->work.type = BINDER_WORK_DEAD_BINDER_AND_CLEAR;
-				}
-				binder_inner_proc_unlock(proc);
-			}
-			binder_node_unlock(ref->node);
-			binder_proc_unlock(proc);
-		} break;
-		case BC_DEAD_BINDER_DONE: {
-			struct binder_work *w;
-			binder_uintptr_t cookie;
-			struct binder_ref_death *death = NULL;
-
-			if (get_user(cookie, (binder_uintptr_t __user *)ptr))
-				return -EFAULT;
-
-			ptr += sizeof(cookie);
-			binder_inner_proc_lock(proc);
-			list_for_each_entry(w, &proc->delivered_death,
-					    entry) {
-				struct binder_ref_death *tmp_death =
-					container_of(w,
-						     struct binder_ref_death,
-						     work);
-
-				if (tmp_death->cookie == cookie) {
-					death = tmp_death;
-					break;
-				}
-			}
-			binder_debug(BINDER_DEBUG_DEAD_BINDER,
-				     "%d:%d BC_DEAD_BINDER_DONE %016llx found %pK\n",
-				     proc->pid, thread->pid, (u64)cookie,
-				     death);
-			if (death == NULL) {
-				binder_user_error("%d:%d BC_DEAD_BINDER_DONE %016llx not found\n",
-					proc->pid, thread->pid, (u64)cookie);
-				binder_inner_proc_unlock(proc);
-				break;
-			}
-			binder_dequeue_work_ilocked(&death->work);
-			if (death->work.type == BINDER_WORK_DEAD_BINDER_AND_CLEAR) {
-				death->work.type = BINDER_WORK_CLEAR_DEATH_NOTIFICATION;
-				if (thread->looper &
-					(BINDER_LOOPER_STATE_REGISTERED |
-					 BINDER_LOOPER_STATE_ENTERED))
-					binder_enqueue_thread_work_ilocked(
-						thread, &death->work);
-				else {
-					binder_enqueue_work_ilocked(
-							&death->work,
-							&proc->todo);
-					binder_wakeup_proc_ilocked(proc);
-				}
-			}
-			binder_inner_proc_unlock(proc);
-		} break;
-
-		default:
-			pr_err("%d:%d unknown command %u\n",
-			       proc->pid, thread->pid, cmd);
-			return -EINVAL;
-		}
-		*consumed = ptr - buffer;
-	}
-	return 0;
-}
-
-static void binder_stat_br(struct binder_proc *proc,
-			   struct binder_thread *thread, uint32_t cmd)
-{
-	trace_binder_return(cmd);
-	if (_IOC_NR(cmd) < ARRAY_SIZE(binder_stats.br)) {
-		atomic_inc(&binder_stats.br[_IOC_NR(cmd)]);
-		atomic_inc(&proc->stats.br[_IOC_NR(cmd)]);
-		atomic_inc(&thread->stats.br[_IOC_NR(cmd)]);
-	}
-}
-
-static int binder_put_node_cmd(struct binder_proc *proc,
-			       struct binder_thread *thread,
-			       void __user **ptrp,
-			       binder_uintptr_t node_ptr,
-			       binder_uintptr_t node_cookie,
-			       int node_debug_id,
-			       uint32_t cmd, const char *cmd_name)
-{
-	void __user *ptr = *ptrp;
-
-	if (put_user(cmd, (uint32_t __user *)ptr))
-		return -EFAULT;
-	ptr += sizeof(uint32_t);
-
-	if (put_user(node_ptr, (binder_uintptr_t __user *)ptr))
-		return -EFAULT;
-	ptr += sizeof(binder_uintptr_t);
-
-	if (put_user(node_cookie, (binder_uintptr_t __user *)ptr))
-		return -EFAULT;
-	ptr += sizeof(binder_uintptr_t);
-
-	binder_stat_br(proc, thread, cmd);
-	binder_debug(BINDER_DEBUG_USER_REFS, "%d:%d %s %d u%016llx c%016llx\n",
-		     proc->pid, thread->pid, cmd_name, node_debug_id,
-		     (u64)node_ptr, (u64)node_cookie);
-
-	*ptrp = ptr;
-	return 0;
-}
-
-static int binder_wait_for_work(struct binder_thread *thread,
-				bool do_proc_work)
-{
-	DEFINE_WAIT(wait);
-	struct binder_proc *proc = thread->proc;
-	int ret = 0;
-
-	binder_inner_proc_lock(proc);
-	for (;;) {
-		prepare_to_wait(&thread->wait, &wait, TASK_INTERRUPTIBLE|TASK_FREEZABLE);
-		if (binder_has_work_ilocked(thread, do_proc_work))
-			break;
-		if (do_proc_work)
-			list_add(&thread->waiting_thread_node,
-				 &proc->waiting_threads);
-		binder_inner_proc_unlock(proc);
-		schedule();
-		binder_inner_proc_lock(proc);
-		list_del_init(&thread->waiting_thread_node);
-		if (signal_pending(current)) {
-			ret = -EINTR;
-			break;
-		}
-	}
-	finish_wait(&thread->wait, &wait);
-	binder_inner_proc_unlock(proc);
-
-	return ret;
-}
-
-/**
- * binder_apply_fd_fixups() - finish fd translation
- * @proc:         binder_proc associated @t->buffer
- * @t:	binder transaction with list of fd fixups
- *
- * Now that we are in the context of the transaction target
- * process, we can allocate and install fds. Process the
- * list of fds to translate and fixup the buffer with the
- * new fds first and only then install the files.
- *
- * If we fail to allocate an fd, skip the install and release
- * any fds that have already been allocated.
- */
-static int binder_apply_fd_fixups(struct binder_proc *proc,
-				  struct binder_transaction *t)
-{
-	struct binder_txn_fd_fixup *fixup, *tmp;
-	int ret = 0;
-
-	list_for_each_entry(fixup, &t->fd_fixups, fixup_entry) {
-		int fd = get_unused_fd_flags(O_CLOEXEC);
-
-		if (fd < 0) {
-			binder_debug(BINDER_DEBUG_TRANSACTION,
-				     "failed fd fixup txn %d fd %d\n",
-				     t->debug_id, fd);
-			ret = -ENOMEM;
-			goto err;
-		}
-		binder_debug(BINDER_DEBUG_TRANSACTION,
-			     "fd fixup txn %d fd %d\n",
-			     t->debug_id, fd);
-		trace_binder_transaction_fd_recv(t, fd, fixup->offset);
-		fixup->target_fd = fd;
-		if (binder_alloc_copy_to_buffer(&proc->alloc, t->buffer,
-						fixup->offset, &fd,
-						sizeof(u32))) {
-			ret = -EINVAL;
-			goto err;
-		}
-	}
-	list_for_each_entry_safe(fixup, tmp, &t->fd_fixups, fixup_entry) {
-		fd_install(fixup->target_fd, fixup->file);
-		list_del(&fixup->fixup_entry);
-		kfree(fixup);
-	}
-
-	return ret;
-
-err:
-	binder_free_txn_fixups(t);
-	return ret;
-}
-
-static int binder_thread_read(struct binder_proc *proc,
-			      struct binder_thread *thread,
-			      binder_uintptr_t binder_buffer, size_t size,
-			      binder_size_t *consumed, int non_block)
-{
-	void __user *buffer = (void __user *)(uintptr_t)binder_buffer;
-	void __user *ptr = buffer + *consumed;
-	void __user *end = buffer + size;
-
-	int ret = 0;
-	int wait_for_proc_work;
-
-	if (*consumed == 0) {
-		if (put_user(BR_NOOP, (uint32_t __user *)ptr))
-			return -EFAULT;
-		ptr += sizeof(uint32_t);
-	}
-
-retry:
-	binder_inner_proc_lock(proc);
-	wait_for_proc_work = binder_available_for_proc_work_ilocked(thread);
-	binder_inner_proc_unlock(proc);
-
-	thread->looper |= BINDER_LOOPER_STATE_WAITING;
-
-	trace_binder_wait_for_work(wait_for_proc_work,
-				   !!thread->transaction_stack,
-				   !binder_worklist_empty(proc, &thread->todo));
-	if (wait_for_proc_work) {
-		if (!(thread->looper & (BINDER_LOOPER_STATE_REGISTERED |
-					BINDER_LOOPER_STATE_ENTERED))) {
-			binder_user_error("%d:%d ERROR: Thread waiting for process work before calling BC_REGISTER_LOOPER or BC_ENTER_LOOPER (state %x)\n",
-				proc->pid, thread->pid, thread->looper);
-			wait_event_interruptible(binder_user_error_wait,
-						 binder_stop_on_user_error < 2);
-		}
-		binder_set_nice(proc->default_priority);
-	}
-
-	if (non_block) {
-		if (!binder_has_work(thread, wait_for_proc_work))
-			ret = -EAGAIN;
-	} else {
-		ret = binder_wait_for_work(thread, wait_for_proc_work);
-	}
-
-	thread->looper &= ~BINDER_LOOPER_STATE_WAITING;
-
-	if (ret)
-		return ret;
-
-	while (1) {
-		uint32_t cmd;
-		struct binder_transaction_data_secctx tr;
-		struct binder_transaction_data *trd = &tr.transaction_data;
-		struct binder_work *w = NULL;
-		struct list_head *list = NULL;
-		struct binder_transaction *t = NULL;
-		struct binder_thread *t_from;
-		size_t trsize = sizeof(*trd);
-
-		binder_inner_proc_lock(proc);
-		if (!binder_worklist_empty_ilocked(&thread->todo))
-			list = &thread->todo;
-		else if (!binder_worklist_empty_ilocked(&proc->todo) &&
-			   wait_for_proc_work)
-			list = &proc->todo;
-		else {
-			binder_inner_proc_unlock(proc);
-
-			/* no data added */
-			if (ptr - buffer == 4 && !thread->looper_need_return)
-				goto retry;
-			break;
-		}
-
-		if (end - ptr < sizeof(tr) + 4) {
-			binder_inner_proc_unlock(proc);
-			break;
-		}
-		w = binder_dequeue_work_head_ilocked(list);
-		if (binder_worklist_empty_ilocked(&thread->todo))
-			thread->process_todo = false;
-
-		switch (w->type) {
-		case BINDER_WORK_TRANSACTION: {
-			binder_inner_proc_unlock(proc);
-			t = container_of(w, struct binder_transaction, work);
-		} break;
-		case BINDER_WORK_RETURN_ERROR: {
-			struct binder_error *e = container_of(
-					w, struct binder_error, work);
-
-			WARN_ON(e->cmd == BR_OK);
-			binder_inner_proc_unlock(proc);
-			if (put_user(e->cmd, (uint32_t __user *)ptr))
-				return -EFAULT;
-			cmd = e->cmd;
-			e->cmd = BR_OK;
-			ptr += sizeof(uint32_t);
-
-			binder_stat_br(proc, thread, cmd);
-		} break;
-		case BINDER_WORK_TRANSACTION_COMPLETE:
-		case BINDER_WORK_TRANSACTION_PENDING:
-		case BINDER_WORK_TRANSACTION_ONEWAY_SPAM_SUSPECT: {
-			if (proc->oneway_spam_detection_enabled &&
-				   w->type == BINDER_WORK_TRANSACTION_ONEWAY_SPAM_SUSPECT)
-				cmd = BR_ONEWAY_SPAM_SUSPECT;
-			else if (w->type == BINDER_WORK_TRANSACTION_PENDING)
-				cmd = BR_TRANSACTION_PENDING_FROZEN;
-			else
-				cmd = BR_TRANSACTION_COMPLETE;
-			binder_inner_proc_unlock(proc);
-			kfree(w);
-			binder_stats_deleted(BINDER_STAT_TRANSACTION_COMPLETE);
-			if (put_user(cmd, (uint32_t __user *)ptr))
-				return -EFAULT;
-			ptr += sizeof(uint32_t);
-
-			binder_stat_br(proc, thread, cmd);
-			binder_debug(BINDER_DEBUG_TRANSACTION_COMPLETE,
-				     "%d:%d BR_TRANSACTION_COMPLETE\n",
-				     proc->pid, thread->pid);
-		} break;
-		case BINDER_WORK_NODE: {
-			struct binder_node *node = container_of(w, struct binder_node, work);
-			int strong, weak;
-			binder_uintptr_t node_ptr = node->ptr;
-			binder_uintptr_t node_cookie = node->cookie;
-			int node_debug_id = node->debug_id;
-			int has_weak_ref;
-			int has_strong_ref;
-			void __user *orig_ptr = ptr;
-
-			BUG_ON(proc != node->proc);
-			strong = node->internal_strong_refs ||
-					node->local_strong_refs;
-			weak = !hlist_empty(&node->refs) ||
-					node->local_weak_refs ||
-					node->tmp_refs || strong;
-			has_strong_ref = node->has_strong_ref;
-			has_weak_ref = node->has_weak_ref;
-
-			if (weak && !has_weak_ref) {
-				node->has_weak_ref = 1;
-				node->pending_weak_ref = 1;
-				node->local_weak_refs++;
-			}
-			if (strong && !has_strong_ref) {
-				node->has_strong_ref = 1;
-				node->pending_strong_ref = 1;
-				node->local_strong_refs++;
-			}
-			if (!strong && has_strong_ref)
-				node->has_strong_ref = 0;
-			if (!weak && has_weak_ref)
-				node->has_weak_ref = 0;
-			if (!weak && !strong) {
-				binder_debug(BINDER_DEBUG_INTERNAL_REFS,
-					     "%d:%d node %d u%016llx c%016llx deleted\n",
-					     proc->pid, thread->pid,
-					     node_debug_id,
-					     (u64)node_ptr,
-					     (u64)node_cookie);
-				rb_erase(&node->rb_node, &proc->nodes);
-				binder_inner_proc_unlock(proc);
-				binder_node_lock(node);
-				/*
-				 * Acquire the node lock before freeing the
-				 * node to serialize with other threads that
-				 * may have been holding the node lock while
-				 * decrementing this node (avoids race where
-				 * this thread frees while the other thread
-				 * is unlocking the node after the final
-				 * decrement)
-				 */
-				binder_node_unlock(node);
-				binder_free_node(node);
-			} else
-				binder_inner_proc_unlock(proc);
-
-			if (weak && !has_weak_ref)
-				ret = binder_put_node_cmd(
-						proc, thread, &ptr, node_ptr,
-						node_cookie, node_debug_id,
-						BR_INCREFS, "BR_INCREFS");
-			if (!ret && strong && !has_strong_ref)
-				ret = binder_put_node_cmd(
-						proc, thread, &ptr, node_ptr,
-						node_cookie, node_debug_id,
-						BR_ACQUIRE, "BR_ACQUIRE");
-			if (!ret && !strong && has_strong_ref)
-				ret = binder_put_node_cmd(
-						proc, thread, &ptr, node_ptr,
-						node_cookie, node_debug_id,
-						BR_RELEASE, "BR_RELEASE");
-			if (!ret && !weak && has_weak_ref)
-				ret = binder_put_node_cmd(
-						proc, thread, &ptr, node_ptr,
-						node_cookie, node_debug_id,
-						BR_DECREFS, "BR_DECREFS");
-			if (orig_ptr == ptr)
-				binder_debug(BINDER_DEBUG_INTERNAL_REFS,
-					     "%d:%d node %d u%016llx c%016llx state unchanged\n",
-					     proc->pid, thread->pid,
-					     node_debug_id,
-					     (u64)node_ptr,
-					     (u64)node_cookie);
-			if (ret)
-				return ret;
-		} break;
-		case BINDER_WORK_DEAD_BINDER:
-		case BINDER_WORK_DEAD_BINDER_AND_CLEAR:
-		case BINDER_WORK_CLEAR_DEATH_NOTIFICATION: {
-			struct binder_ref_death *death;
-			uint32_t cmd;
-			binder_uintptr_t cookie;
-
-			death = container_of(w, struct binder_ref_death, work);
-			if (w->type == BINDER_WORK_CLEAR_DEATH_NOTIFICATION)
-				cmd = BR_CLEAR_DEATH_NOTIFICATION_DONE;
-			else
-				cmd = BR_DEAD_BINDER;
-			cookie = death->cookie;
-
-			binder_debug(BINDER_DEBUG_DEATH_NOTIFICATION,
-				     "%d:%d %s %016llx\n",
-				      proc->pid, thread->pid,
-				      cmd == BR_DEAD_BINDER ?
-				      "BR_DEAD_BINDER" :
-				      "BR_CLEAR_DEATH_NOTIFICATION_DONE",
-				      (u64)cookie);
-			if (w->type == BINDER_WORK_CLEAR_DEATH_NOTIFICATION) {
-				binder_inner_proc_unlock(proc);
-				kfree(death);
-				binder_stats_deleted(BINDER_STAT_DEATH);
-			} else {
-				binder_enqueue_work_ilocked(
-						w, &proc->delivered_death);
-				binder_inner_proc_unlock(proc);
-			}
-			if (put_user(cmd, (uint32_t __user *)ptr))
-				return -EFAULT;
-			ptr += sizeof(uint32_t);
-			if (put_user(cookie,
-				     (binder_uintptr_t __user *)ptr))
-				return -EFAULT;
-			ptr += sizeof(binder_uintptr_t);
-			binder_stat_br(proc, thread, cmd);
-			if (cmd == BR_DEAD_BINDER)
-				goto done; /* DEAD_BINDER notifications can cause transactions */
-		} break;
-		default:
-			binder_inner_proc_unlock(proc);
-			pr_err("%d:%d: bad work type %d\n",
-			       proc->pid, thread->pid, w->type);
-			break;
-		}
-
-		if (!t)
-			continue;
-
-		BUG_ON(t->buffer == NULL);
-		if (t->buffer->target_node) {
-			struct binder_node *target_node = t->buffer->target_node;
-
-			trd->target.ptr = target_node->ptr;
-			trd->cookie =  target_node->cookie;
-			t->saved_priority = task_nice(current);
-			if (t->priority < target_node->min_priority &&
-			    !(t->flags & TF_ONE_WAY))
-				binder_set_nice(t->priority);
-			else if (!(t->flags & TF_ONE_WAY) ||
-				 t->saved_priority > target_node->min_priority)
-				binder_set_nice(target_node->min_priority);
-			cmd = BR_TRANSACTION;
-		} else {
-			trd->target.ptr = 0;
-			trd->cookie = 0;
-			cmd = BR_REPLY;
-		}
-		trd->code = t->code;
-		trd->flags = t->flags;
-		trd->sender_euid = from_kuid(current_user_ns(), t->sender_euid);
-
-		t_from = binder_get_txn_from(t);
-		if (t_from) {
-			struct task_struct *sender = t_from->proc->tsk;
-
-			trd->sender_pid =
-				task_tgid_nr_ns(sender,
-						task_active_pid_ns(current));
-		} else {
-			trd->sender_pid = 0;
-		}
-
-		ret = binder_apply_fd_fixups(proc, t);
-		if (ret) {
-			struct binder_buffer *buffer = t->buffer;
-			bool oneway = !!(t->flags & TF_ONE_WAY);
-			int tid = t->debug_id;
-
-			if (t_from)
-				binder_thread_dec_tmpref(t_from);
-			buffer->transaction = NULL;
-			binder_cleanup_transaction(t, "fd fixups failed",
-						   BR_FAILED_REPLY);
-			binder_free_buf(proc, thread, buffer, true);
-			binder_debug(BINDER_DEBUG_FAILED_TRANSACTION,
-				     "%d:%d %stransaction %d fd fixups failed %d/%d, line %d\n",
-				     proc->pid, thread->pid,
-				     oneway ? "async " :
-					(cmd == BR_REPLY ? "reply " : ""),
-				     tid, BR_FAILED_REPLY, ret, __LINE__);
-			if (cmd == BR_REPLY) {
-				cmd = BR_FAILED_REPLY;
-				if (put_user(cmd, (uint32_t __user *)ptr))
-					return -EFAULT;
-				ptr += sizeof(uint32_t);
-				binder_stat_br(proc, thread, cmd);
-				break;
-			}
-			continue;
-		}
-		trd->data_size = t->buffer->data_size;
-		trd->offsets_size = t->buffer->offsets_size;
-		trd->data.ptr.buffer = (uintptr_t)t->buffer->user_data;
-		trd->data.ptr.offsets = trd->data.ptr.buffer +
-					ALIGN(t->buffer->data_size,
-					    sizeof(void *));
-
-		tr.secctx = t->security_ctx;
-		if (t->security_ctx) {
-			cmd = BR_TRANSACTION_SEC_CTX;
-			trsize = sizeof(tr);
-		}
-		if (put_user(cmd, (uint32_t __user *)ptr)) {
-			if (t_from)
-				binder_thread_dec_tmpref(t_from);
-
-			binder_cleanup_transaction(t, "put_user failed",
-						   BR_FAILED_REPLY);
-
-			return -EFAULT;
-		}
-		ptr += sizeof(uint32_t);
-		if (copy_to_user(ptr, &tr, trsize)) {
-			if (t_from)
-				binder_thread_dec_tmpref(t_from);
-
-			binder_cleanup_transaction(t, "copy_to_user failed",
-						   BR_FAILED_REPLY);
-
-			return -EFAULT;
-		}
-		ptr += trsize;
-
-		trace_binder_transaction_received(t);
-		binder_stat_br(proc, thread, cmd);
-		binder_debug(BINDER_DEBUG_TRANSACTION,
-			     "%d:%d %s %d %d:%d, cmd %u size %zd-%zd ptr %016llx-%016llx\n",
-			     proc->pid, thread->pid,
-			     (cmd == BR_TRANSACTION) ? "BR_TRANSACTION" :
-				(cmd == BR_TRANSACTION_SEC_CTX) ?
-				     "BR_TRANSACTION_SEC_CTX" : "BR_REPLY",
-			     t->debug_id, t_from ? t_from->proc->pid : 0,
-			     t_from ? t_from->pid : 0, cmd,
-			     t->buffer->data_size, t->buffer->offsets_size,
-			     (u64)trd->data.ptr.buffer,
-			     (u64)trd->data.ptr.offsets);
-
-		if (t_from)
-			binder_thread_dec_tmpref(t_from);
-		t->buffer->allow_user_free = 1;
-		if (cmd != BR_REPLY && !(t->flags & TF_ONE_WAY)) {
-			binder_inner_proc_lock(thread->proc);
-			t->to_parent = thread->transaction_stack;
-			t->to_thread = thread;
-			thread->transaction_stack = t;
-			binder_inner_proc_unlock(thread->proc);
-		} else {
-			binder_free_transaction(t);
-		}
-		break;
-	}
-
-done:
-
-	*consumed = ptr - buffer;
-	binder_inner_proc_lock(proc);
-	if (proc->requested_threads == 0 &&
-	    list_empty(&thread->proc->waiting_threads) &&
-	    proc->requested_threads_started < proc->max_threads &&
-	    (thread->looper & (BINDER_LOOPER_STATE_REGISTERED |
-	     BINDER_LOOPER_STATE_ENTERED)) /* the user-space code fails to */
-	     /*spawn a new thread if we leave this out */) {
-		proc->requested_threads++;
-		binder_inner_proc_unlock(proc);
-		binder_debug(BINDER_DEBUG_THREADS,
-			     "%d:%d BR_SPAWN_LOOPER\n",
-			     proc->pid, thread->pid);
-		if (put_user(BR_SPAWN_LOOPER, (uint32_t __user *)buffer))
-			return -EFAULT;
-		binder_stat_br(proc, thread, BR_SPAWN_LOOPER);
-	} else
-		binder_inner_proc_unlock(proc);
-	return 0;
-}
-
-static void binder_release_work(struct binder_proc *proc,
-				struct list_head *list)
-{
-	struct binder_work *w;
-	enum binder_work_type wtype;
-
-	while (1) {
-		binder_inner_proc_lock(proc);
-		w = binder_dequeue_work_head_ilocked(list);
-		wtype = w ? w->type : 0;
-		binder_inner_proc_unlock(proc);
-		if (!w)
-			return;
-
-		switch (wtype) {
-		case BINDER_WORK_TRANSACTION: {
-			struct binder_transaction *t;
-
-			t = container_of(w, struct binder_transaction, work);
-
-			binder_cleanup_transaction(t, "process died.",
-						   BR_DEAD_REPLY);
-		} break;
-		case BINDER_WORK_RETURN_ERROR: {
-			struct binder_error *e = container_of(
-					w, struct binder_error, work);
-
-			binder_debug(BINDER_DEBUG_DEAD_TRANSACTION,
-				"undelivered TRANSACTION_ERROR: %u\n",
-				e->cmd);
-		} break;
-		case BINDER_WORK_TRANSACTION_PENDING:
-		case BINDER_WORK_TRANSACTION_ONEWAY_SPAM_SUSPECT:
-		case BINDER_WORK_TRANSACTION_COMPLETE: {
-			binder_debug(BINDER_DEBUG_DEAD_TRANSACTION,
-				"undelivered TRANSACTION_COMPLETE\n");
-			kfree(w);
-			binder_stats_deleted(BINDER_STAT_TRANSACTION_COMPLETE);
-		} break;
-		case BINDER_WORK_DEAD_BINDER_AND_CLEAR:
-		case BINDER_WORK_CLEAR_DEATH_NOTIFICATION: {
-			struct binder_ref_death *death;
-
-			death = container_of(w, struct binder_ref_death, work);
-			binder_debug(BINDER_DEBUG_DEAD_TRANSACTION,
-				"undelivered death notification, %016llx\n",
-				(u64)death->cookie);
-			kfree(death);
-			binder_stats_deleted(BINDER_STAT_DEATH);
-		} break;
-		case BINDER_WORK_NODE:
-			break;
-		default:
-			pr_err("unexpected work type, %d, not freed\n",
-			       wtype);
-			break;
-		}
-	}
-
-}
-
-static struct binder_thread *binder_get_thread_ilocked(
-		struct binder_proc *proc, struct binder_thread *new_thread)
-{
-	struct binder_thread *thread = NULL;
-	struct rb_node *parent = NULL;
-	struct rb_node **p = &proc->threads.rb_node;
-
-	while (*p) {
-		parent = *p;
-		thread = rb_entry(parent, struct binder_thread, rb_node);
-
-		if (current->pid < thread->pid)
-			p = &(*p)->rb_left;
-		else if (current->pid > thread->pid)
-			p = &(*p)->rb_right;
-		else
-			return thread;
-	}
-	if (!new_thread)
-		return NULL;
-	thread = new_thread;
-	binder_stats_created(BINDER_STAT_THREAD);
-	thread->proc = proc;
-	thread->pid = current->pid;
-	atomic_set(&thread->tmp_ref, 0);
-	init_waitqueue_head(&thread->wait);
-	INIT_LIST_HEAD(&thread->todo);
-	rb_link_node(&thread->rb_node, parent, p);
-	rb_insert_color(&thread->rb_node, &proc->threads);
-	thread->looper_need_return = true;
-	thread->return_error.work.type = BINDER_WORK_RETURN_ERROR;
-	thread->return_error.cmd = BR_OK;
-	thread->reply_error.work.type = BINDER_WORK_RETURN_ERROR;
-	thread->reply_error.cmd = BR_OK;
-	thread->ee.command = BR_OK;
-	INIT_LIST_HEAD(&new_thread->waiting_thread_node);
-	return thread;
-}
-
-static struct binder_thread *binder_get_thread(struct binder_proc *proc)
-{
-	struct binder_thread *thread;
-	struct binder_thread *new_thread;
-
-	binder_inner_proc_lock(proc);
-	thread = binder_get_thread_ilocked(proc, NULL);
-	binder_inner_proc_unlock(proc);
-	if (!thread) {
-		new_thread = kzalloc(sizeof(*thread), GFP_KERNEL);
-		if (new_thread == NULL)
-			return NULL;
-		binder_inner_proc_lock(proc);
-		thread = binder_get_thread_ilocked(proc, new_thread);
-		binder_inner_proc_unlock(proc);
-		if (thread != new_thread)
-			kfree(new_thread);
-	}
-	return thread;
-}
-
-static void binder_free_proc(struct binder_proc *proc)
-{
-	struct binder_device *device;
-
-	BUG_ON(!list_empty(&proc->todo));
-	BUG_ON(!list_empty(&proc->delivered_death));
-	if (proc->outstanding_txns)
-		pr_warn("%s: Unexpected outstanding_txns %d\n",
-			__func__, proc->outstanding_txns);
-	device = container_of(proc->context, struct binder_device, context);
-	if (refcount_dec_and_test(&device->ref)) {
-		kfree(proc->context->name);
-		kfree(device);
-	}
-	binder_alloc_deferred_release(&proc->alloc);
-	put_task_struct(proc->tsk);
-	put_cred(proc->cred);
-	binder_stats_deleted(BINDER_STAT_PROC);
-	kfree(proc);
-}
-
-static void binder_free_thread(struct binder_thread *thread)
-{
-	BUG_ON(!list_empty(&thread->todo));
-	binder_stats_deleted(BINDER_STAT_THREAD);
-	binder_proc_dec_tmpref(thread->proc);
-	kfree(thread);
-}
-
-static int binder_thread_release(struct binder_proc *proc,
-				 struct binder_thread *thread)
-{
-	struct binder_transaction *t;
-	struct binder_transaction *send_reply = NULL;
-	int active_transactions = 0;
-	struct binder_transaction *last_t = NULL;
-
-	binder_inner_proc_lock(thread->proc);
-	/*
-	 * take a ref on the proc so it survives
-	 * after we remove this thread from proc->threads.
-	 * The corresponding dec is when we actually
-	 * free the thread in binder_free_thread()
-	 */
-	proc->tmp_ref++;
-	/*
-	 * take a ref on this thread to ensure it
-	 * survives while we are releasing it
-	 */
-	atomic_inc(&thread->tmp_ref);
-	rb_erase(&thread->rb_node, &proc->threads);
-	t = thread->transaction_stack;
-	if (t) {
-		spin_lock(&t->lock);
-		if (t->to_thread == thread)
-			send_reply = t;
-	} else {
-		__acquire(&t->lock);
-	}
-	thread->is_dead = true;
-
-	while (t) {
-		last_t = t;
-		active_transactions++;
-		binder_debug(BINDER_DEBUG_DEAD_TRANSACTION,
-			     "release %d:%d transaction %d %s, still active\n",
-			      proc->pid, thread->pid,
-			     t->debug_id,
-			     (t->to_thread == thread) ? "in" : "out");
-
-		if (t->to_thread == thread) {
-			thread->proc->outstanding_txns--;
-			t->to_proc = NULL;
-			t->to_thread = NULL;
-			if (t->buffer) {
-				t->buffer->transaction = NULL;
-				t->buffer = NULL;
-			}
-			t = t->to_parent;
-		} else if (t->from == thread) {
-			t->from = NULL;
-			t = t->from_parent;
-		} else
-			BUG();
-		spin_unlock(&last_t->lock);
-		if (t)
-			spin_lock(&t->lock);
-		else
-			__acquire(&t->lock);
-	}
-	/* annotation for sparse, lock not acquired in last iteration above */
-	__release(&t->lock);
-
-	/*
-	 * If this thread used poll, make sure we remove the waitqueue from any
-	 * poll data structures holding it.
-	 */
-	if (thread->looper & BINDER_LOOPER_STATE_POLL)
-		wake_up_pollfree(&thread->wait);
-
-	binder_inner_proc_unlock(thread->proc);
-
-	/*
-	 * This is needed to avoid races between wake_up_pollfree() above and
-	 * someone else removing the last entry from the queue for other reasons
-	 * (e.g. ep_remove_wait_queue() being called due to an epoll file
-	 * descriptor being closed).  Such other users hold an RCU read lock, so
-	 * we can be sure they're done after we call synchronize_rcu().
-	 */
-	if (thread->looper & BINDER_LOOPER_STATE_POLL)
-		synchronize_rcu();
-
-	if (send_reply)
-		binder_send_failed_reply(send_reply, BR_DEAD_REPLY);
-	binder_release_work(proc, &thread->todo);
-	binder_thread_dec_tmpref(thread);
-	return active_transactions;
-}
-
-static __poll_t binder_poll(struct file *filp,
-				struct poll_table_struct *wait)
-{
-	struct binder_proc *proc = filp->private_data;
-	struct binder_thread *thread = NULL;
-	bool wait_for_proc_work;
-
-	thread = binder_get_thread(proc);
-	if (!thread)
-		return POLLERR;
-
-	binder_inner_proc_lock(thread->proc);
-	thread->looper |= BINDER_LOOPER_STATE_POLL;
-	wait_for_proc_work = binder_available_for_proc_work_ilocked(thread);
-
-	binder_inner_proc_unlock(thread->proc);
-
-	poll_wait(filp, &thread->wait, wait);
-
-	if (binder_has_work(thread, wait_for_proc_work))
-		return EPOLLIN;
-
-	return 0;
-}
-
-static int binder_ioctl_write_read(struct file *filp, unsigned long arg,
-				struct binder_thread *thread)
-{
-	int ret = 0;
-	struct binder_proc *proc = filp->private_data;
-	void __user *ubuf = (void __user *)arg;
-	struct binder_write_read bwr;
-
-	if (copy_from_user(&bwr, ubuf, sizeof(bwr))) {
-		ret = -EFAULT;
-		goto out;
-	}
-	binder_debug(BINDER_DEBUG_READ_WRITE,
-		     "%d:%d write %lld at %016llx, read %lld at %016llx\n",
-		     proc->pid, thread->pid,
-		     (u64)bwr.write_size, (u64)bwr.write_buffer,
-		     (u64)bwr.read_size, (u64)bwr.read_buffer);
-
-	if (bwr.write_size > 0) {
-		ret = binder_thread_write(proc, thread,
-					  bwr.write_buffer,
-					  bwr.write_size,
-					  &bwr.write_consumed);
-		trace_binder_write_done(ret);
-		if (ret < 0) {
-			bwr.read_consumed = 0;
-			if (copy_to_user(ubuf, &bwr, sizeof(bwr)))
-				ret = -EFAULT;
-			goto out;
-		}
-	}
-	if (bwr.read_size > 0) {
-		ret = binder_thread_read(proc, thread, bwr.read_buffer,
-					 bwr.read_size,
-					 &bwr.read_consumed,
-					 filp->f_flags & O_NONBLOCK);
-		trace_binder_read_done(ret);
-		binder_inner_proc_lock(proc);
-		if (!binder_worklist_empty_ilocked(&proc->todo))
-			binder_wakeup_proc_ilocked(proc);
-		binder_inner_proc_unlock(proc);
-		if (ret < 0) {
-			if (copy_to_user(ubuf, &bwr, sizeof(bwr)))
-				ret = -EFAULT;
-			goto out;
-		}
-	}
-	binder_debug(BINDER_DEBUG_READ_WRITE,
-		     "%d:%d wrote %lld of %lld, read return %lld of %lld\n",
-		     proc->pid, thread->pid,
-		     (u64)bwr.write_consumed, (u64)bwr.write_size,
-		     (u64)bwr.read_consumed, (u64)bwr.read_size);
-	if (copy_to_user(ubuf, &bwr, sizeof(bwr))) {
-		ret = -EFAULT;
-		goto out;
-	}
-out:
-	return ret;
-}
-
-static int binder_ioctl_set_ctx_mgr(struct file *filp,
-				    struct flat_binder_object *fbo)
-{
-	int ret = 0;
-	struct binder_proc *proc = filp->private_data;
-	struct binder_context *context = proc->context;
-	struct binder_node *new_node;
-	kuid_t curr_euid = current_euid();
-
-	mutex_lock(&context->context_mgr_node_lock);
-	if (context->binder_context_mgr_node) {
-		pr_err("BINDER_SET_CONTEXT_MGR already set\n");
-		ret = -EBUSY;
-		goto out;
-	}
-	ret = security_binder_set_context_mgr(proc->cred);
-	if (ret < 0)
-		goto out;
-	if (uid_valid(context->binder_context_mgr_uid)) {
-		if (!uid_eq(context->binder_context_mgr_uid, curr_euid)) {
-			pr_err("BINDER_SET_CONTEXT_MGR bad uid %d != %d\n",
-			       from_kuid(&init_user_ns, curr_euid),
-			       from_kuid(&init_user_ns,
-					 context->binder_context_mgr_uid));
-			ret = -EPERM;
-			goto out;
-		}
-	} else {
-		context->binder_context_mgr_uid = curr_euid;
-	}
-	new_node = binder_new_node(proc, fbo);
-	if (!new_node) {
-		ret = -ENOMEM;
-		goto out;
-	}
-	binder_node_lock(new_node);
-	new_node->local_weak_refs++;
-	new_node->local_strong_refs++;
-	new_node->has_strong_ref = 1;
-	new_node->has_weak_ref = 1;
-	context->binder_context_mgr_node = new_node;
-	binder_node_unlock(new_node);
-	binder_put_node(new_node);
-out:
-	mutex_unlock(&context->context_mgr_node_lock);
-	return ret;
-}
-
-static int binder_ioctl_get_node_info_for_ref(struct binder_proc *proc,
-		struct binder_node_info_for_ref *info)
-{
-	struct binder_node *node;
-	struct binder_context *context = proc->context;
-	__u32 handle = info->handle;
-
-	if (info->strong_count || info->weak_count || info->reserved1 ||
-	    info->reserved2 || info->reserved3) {
-		binder_user_error("%d BINDER_GET_NODE_INFO_FOR_REF: only handle may be non-zero.",
-				  proc->pid);
-		return -EINVAL;
-	}
-
-	/* This ioctl may only be used by the context manager */
-	mutex_lock(&context->context_mgr_node_lock);
-	if (!context->binder_context_mgr_node ||
-		context->binder_context_mgr_node->proc != proc) {
-		mutex_unlock(&context->context_mgr_node_lock);
-		return -EPERM;
-	}
-	mutex_unlock(&context->context_mgr_node_lock);
-
-	node = binder_get_node_from_ref(proc, handle, true, NULL);
-	if (!node)
-		return -EINVAL;
-
-	info->strong_count = node->local_strong_refs +
-		node->internal_strong_refs;
-	info->weak_count = node->local_weak_refs;
-
-	binder_put_node(node);
-
-	return 0;
-}
-
-static int binder_ioctl_get_node_debug_info(struct binder_proc *proc,
-				struct binder_node_debug_info *info)
-{
-	struct rb_node *n;
-	binder_uintptr_t ptr = info->ptr;
-
-	memset(info, 0, sizeof(*info));
-
-	binder_inner_proc_lock(proc);
-	for (n = rb_first(&proc->nodes); n != NULL; n = rb_next(n)) {
-		struct binder_node *node = rb_entry(n, struct binder_node,
-						    rb_node);
-		if (node->ptr > ptr) {
-			info->ptr = node->ptr;
-			info->cookie = node->cookie;
-			info->has_strong_ref = node->has_strong_ref;
-			info->has_weak_ref = node->has_weak_ref;
-			break;
-		}
-	}
-	binder_inner_proc_unlock(proc);
-
-	return 0;
-}
-
-static bool binder_txns_pending_ilocked(struct binder_proc *proc)
-{
-	struct rb_node *n;
-	struct binder_thread *thread;
-
-	if (proc->outstanding_txns > 0)
-		return true;
-
-	for (n = rb_first(&proc->threads); n; n = rb_next(n)) {
-		thread = rb_entry(n, struct binder_thread, rb_node);
-		if (thread->transaction_stack)
-			return true;
-	}
-	return false;
-}
-
-static int binder_ioctl_freeze(struct binder_freeze_info *info,
-			       struct binder_proc *target_proc)
-{
-	int ret = 0;
-
-	if (!info->enable) {
-		binder_inner_proc_lock(target_proc);
-		target_proc->sync_recv = false;
-		target_proc->async_recv = false;
-		target_proc->is_frozen = false;
-		binder_inner_proc_unlock(target_proc);
-		return 0;
-	}
-
-	/*
-	 * Freezing the target. Prevent new transactions by
-	 * setting frozen state. If timeout specified, wait
-	 * for transactions to drain.
-	 */
-	binder_inner_proc_lock(target_proc);
-	target_proc->sync_recv = false;
-	target_proc->async_recv = false;
-	target_proc->is_frozen = true;
-	binder_inner_proc_unlock(target_proc);
-
-	if (info->timeout_ms > 0)
-		ret = wait_event_interruptible_timeout(
-			target_proc->freeze_wait,
-			(!target_proc->outstanding_txns),
-			msecs_to_jiffies(info->timeout_ms));
-
-	/* Check pending transactions that wait for reply */
-	if (ret >= 0) {
-		binder_inner_proc_lock(target_proc);
-		if (binder_txns_pending_ilocked(target_proc))
-			ret = -EAGAIN;
-		binder_inner_proc_unlock(target_proc);
-	}
-
-	if (ret < 0) {
-		binder_inner_proc_lock(target_proc);
-		target_proc->is_frozen = false;
-		binder_inner_proc_unlock(target_proc);
-	}
-
-	return ret;
-}
-
-static int binder_ioctl_get_freezer_info(
-				struct binder_frozen_status_info *info)
-{
-	struct binder_proc *target_proc;
-	bool found = false;
-	__u32 txns_pending;
-
-	info->sync_recv = 0;
-	info->async_recv = 0;
-
-	mutex_lock(&binder_procs_lock);
-	hlist_for_each_entry(target_proc, &binder_procs, proc_node) {
-		if (target_proc->pid == info->pid) {
-			found = true;
-			binder_inner_proc_lock(target_proc);
-			txns_pending = binder_txns_pending_ilocked(target_proc);
-			info->sync_recv |= target_proc->sync_recv |
-					(txns_pending << 1);
-			info->async_recv |= target_proc->async_recv;
-			binder_inner_proc_unlock(target_proc);
-		}
-	}
-	mutex_unlock(&binder_procs_lock);
-
-	if (!found)
-		return -EINVAL;
-
-	return 0;
-}
-
-static int binder_ioctl_get_extended_error(struct binder_thread *thread,
-					   void __user *ubuf)
-{
-	struct binder_extended_error ee;
-
-	binder_inner_proc_lock(thread->proc);
-	ee = thread->ee;
-	binder_set_extended_error(&thread->ee, 0, BR_OK, 0);
-	binder_inner_proc_unlock(thread->proc);
-
-	if (copy_to_user(ubuf, &ee, sizeof(ee)))
-		return -EFAULT;
-
-	return 0;
-}
-
-static long binder_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
-{
-	int ret;
-	struct binder_proc *proc = filp->private_data;
-	struct binder_thread *thread;
-	void __user *ubuf = (void __user *)arg;
-
-	/*pr_info("binder_ioctl: %d:%d %x %lx\n",
-			proc->pid, current->pid, cmd, arg);*/
-
-	binder_selftest_alloc(&proc->alloc);
-
-	trace_binder_ioctl(cmd, arg);
-
-	ret = wait_event_interruptible(binder_user_error_wait, binder_stop_on_user_error < 2);
-	if (ret)
-		goto err_unlocked;
-
-	thread = binder_get_thread(proc);
-	if (thread == NULL) {
-		ret = -ENOMEM;
-		goto err;
-	}
-
-	switch (cmd) {
-	case BINDER_WRITE_READ:
-		ret = binder_ioctl_write_read(filp, arg, thread);
-		if (ret)
-			goto err;
-		break;
-	case BINDER_SET_MAX_THREADS: {
-		int max_threads;
-
-		if (copy_from_user(&max_threads, ubuf,
-				   sizeof(max_threads))) {
-			ret = -EINVAL;
-			goto err;
-		}
-		binder_inner_proc_lock(proc);
-		proc->max_threads = max_threads;
-		binder_inner_proc_unlock(proc);
-		break;
-	}
-	case BINDER_SET_CONTEXT_MGR_EXT: {
-		struct flat_binder_object fbo;
-
-		if (copy_from_user(&fbo, ubuf, sizeof(fbo))) {
-			ret = -EINVAL;
-			goto err;
-		}
-		ret = binder_ioctl_set_ctx_mgr(filp, &fbo);
-		if (ret)
-			goto err;
-		break;
-	}
-	case BINDER_SET_CONTEXT_MGR:
-		ret = binder_ioctl_set_ctx_mgr(filp, NULL);
-		if (ret)
-			goto err;
-		break;
-	case BINDER_THREAD_EXIT:
-		binder_debug(BINDER_DEBUG_THREADS, "%d:%d exit\n",
-			     proc->pid, thread->pid);
-		binder_thread_release(proc, thread);
-		thread = NULL;
-		break;
-	case BINDER_VERSION: {
-		struct binder_version __user *ver = ubuf;
-
-		if (put_user(BINDER_CURRENT_PROTOCOL_VERSION,
-			     &ver->protocol_version)) {
-			ret = -EINVAL;
-			goto err;
-		}
-		break;
-	}
-	case BINDER_GET_NODE_INFO_FOR_REF: {
-		struct binder_node_info_for_ref info;
-
-		if (copy_from_user(&info, ubuf, sizeof(info))) {
-			ret = -EFAULT;
-			goto err;
-		}
-
-		ret = binder_ioctl_get_node_info_for_ref(proc, &info);
-		if (ret < 0)
-			goto err;
-
-		if (copy_to_user(ubuf, &info, sizeof(info))) {
-			ret = -EFAULT;
-			goto err;
-		}
-
-		break;
-	}
-	case BINDER_GET_NODE_DEBUG_INFO: {
-		struct binder_node_debug_info info;
-
-		if (copy_from_user(&info, ubuf, sizeof(info))) {
-			ret = -EFAULT;
-			goto err;
-		}
-
-		ret = binder_ioctl_get_node_debug_info(proc, &info);
-		if (ret < 0)
-			goto err;
-
-		if (copy_to_user(ubuf, &info, sizeof(info))) {
-			ret = -EFAULT;
-			goto err;
-		}
-		break;
-	}
-	case BINDER_FREEZE: {
-		struct binder_freeze_info info;
-		struct binder_proc **target_procs = NULL, *target_proc;
-		int target_procs_count = 0, i = 0;
-
-		ret = 0;
-
-		if (copy_from_user(&info, ubuf, sizeof(info))) {
-			ret = -EFAULT;
-			goto err;
-		}
-
-		mutex_lock(&binder_procs_lock);
-		hlist_for_each_entry(target_proc, &binder_procs, proc_node) {
-			if (target_proc->pid == info.pid)
-				target_procs_count++;
-		}
-
-		if (target_procs_count == 0) {
-			mutex_unlock(&binder_procs_lock);
-			ret = -EINVAL;
-			goto err;
-		}
-
-		target_procs = kcalloc(target_procs_count,
-				       sizeof(struct binder_proc *),
-				       GFP_KERNEL);
-
-		if (!target_procs) {
-			mutex_unlock(&binder_procs_lock);
-			ret = -ENOMEM;
-			goto err;
-		}
-
-		hlist_for_each_entry(target_proc, &binder_procs, proc_node) {
-			if (target_proc->pid != info.pid)
-				continue;
-
-			binder_inner_proc_lock(target_proc);
-			target_proc->tmp_ref++;
-			binder_inner_proc_unlock(target_proc);
-
-			target_procs[i++] = target_proc;
-		}
-		mutex_unlock(&binder_procs_lock);
-
-		for (i = 0; i < target_procs_count; i++) {
-			if (ret >= 0)
-				ret = binder_ioctl_freeze(&info,
-							  target_procs[i]);
-
-			binder_proc_dec_tmpref(target_procs[i]);
-		}
-
-		kfree(target_procs);
-
-		if (ret < 0)
-			goto err;
-		break;
-	}
-	case BINDER_GET_FROZEN_INFO: {
-		struct binder_frozen_status_info info;
-
-		if (copy_from_user(&info, ubuf, sizeof(info))) {
-			ret = -EFAULT;
-			goto err;
-		}
-
-		ret = binder_ioctl_get_freezer_info(&info);
-		if (ret < 0)
-			goto err;
-
-		if (copy_to_user(ubuf, &info, sizeof(info))) {
-			ret = -EFAULT;
-			goto err;
-		}
-		break;
-	}
-	case BINDER_ENABLE_ONEWAY_SPAM_DETECTION: {
-		uint32_t enable;
-
-		if (copy_from_user(&enable, ubuf, sizeof(enable))) {
-			ret = -EFAULT;
-			goto err;
-		}
-		binder_inner_proc_lock(proc);
-		proc->oneway_spam_detection_enabled = (bool)enable;
-		binder_inner_proc_unlock(proc);
-		break;
-	}
-	case BINDER_GET_EXTENDED_ERROR:
-		ret = binder_ioctl_get_extended_error(thread, ubuf);
-		if (ret < 0)
-			goto err;
-		break;
-	default:
-		ret = -EINVAL;
-		goto err;
-	}
-	ret = 0;
-err:
-	if (thread)
-		thread->looper_need_return = false;
-	wait_event_interruptible(binder_user_error_wait, binder_stop_on_user_error < 2);
-	if (ret && ret != -EINTR)
-		pr_info("%d:%d ioctl %x %lx returned %d\n", proc->pid, current->pid, cmd, arg, ret);
-err_unlocked:
-	trace_binder_ioctl_done(ret);
-	return ret;
-}
-
-static void binder_vma_open(struct vm_area_struct *vma)
-{
-	struct binder_proc *proc = vma->vm_private_data;
-
-	binder_debug(BINDER_DEBUG_OPEN_CLOSE,
-		     "%d open vm area %lx-%lx (%ld K) vma %lx pagep %lx\n",
-		     proc->pid, vma->vm_start, vma->vm_end,
-		     (vma->vm_end - vma->vm_start) / SZ_1K, vma->vm_flags,
-		     (unsigned long)pgprot_val(vma->vm_page_prot));
-}
-
-static void binder_vma_close(struct vm_area_struct *vma)
-{
-	struct binder_proc *proc = vma->vm_private_data;
-
-	binder_debug(BINDER_DEBUG_OPEN_CLOSE,
-		     "%d close vm area %lx-%lx (%ld K) vma %lx pagep %lx\n",
-		     proc->pid, vma->vm_start, vma->vm_end,
-		     (vma->vm_end - vma->vm_start) / SZ_1K, vma->vm_flags,
-		     (unsigned long)pgprot_val(vma->vm_page_prot));
-	binder_alloc_vma_close(&proc->alloc);
-}
-
-static vm_fault_t binder_vm_fault(struct vm_fault *vmf)
-{
-	return VM_FAULT_SIGBUS;
-}
-
-static const struct vm_operations_struct binder_vm_ops = {
-	.open = binder_vma_open,
-	.close = binder_vma_close,
-	.fault = binder_vm_fault,
-};
-
-static int binder_mmap(struct file *filp, struct vm_area_struct *vma)
-{
-	struct binder_proc *proc = filp->private_data;
-
-	if (proc->tsk != current->group_leader)
-		return -EINVAL;
-
-	binder_debug(BINDER_DEBUG_OPEN_CLOSE,
-		     "%s: %d %lx-%lx (%ld K) vma %lx pagep %lx\n",
-		     __func__, proc->pid, vma->vm_start, vma->vm_end,
-		     (vma->vm_end - vma->vm_start) / SZ_1K, vma->vm_flags,
-		     (unsigned long)pgprot_val(vma->vm_page_prot));
-
-	if (vma->vm_flags & FORBIDDEN_MMAP_FLAGS) {
-		pr_err("%s: %d %lx-%lx %s failed %d\n", __func__,
-		       proc->pid, vma->vm_start, vma->vm_end, "bad vm_flags", -EPERM);
-		return -EPERM;
-	}
-	vm_flags_mod(vma, VM_DONTCOPY | VM_MIXEDMAP, VM_MAYWRITE);
-
-	vma->vm_ops = &binder_vm_ops;
-	vma->vm_private_data = proc;
-
-	return binder_alloc_mmap_handler(&proc->alloc, vma);
-}
-
-static int binder_open(struct inode *nodp, struct file *filp)
-{
-	struct binder_proc *proc, *itr;
-	struct binder_device *binder_dev;
-	struct binderfs_info *info;
-	struct dentry *binder_binderfs_dir_entry_proc = NULL;
-	bool existing_pid = false;
-
-	binder_debug(BINDER_DEBUG_OPEN_CLOSE, "%s: %d:%d\n", __func__,
-		     current->group_leader->pid, current->pid);
-
-	proc = kzalloc(sizeof(*proc), GFP_KERNEL);
-	if (proc == NULL)
-		return -ENOMEM;
-	spin_lock_init(&proc->inner_lock);
-	spin_lock_init(&proc->outer_lock);
-	get_task_struct(current->group_leader);
-	proc->tsk = current->group_leader;
-	proc->cred = get_cred(filp->f_cred);
-	INIT_LIST_HEAD(&proc->todo);
-	init_waitqueue_head(&proc->freeze_wait);
-	proc->default_priority = task_nice(current);
-	/* binderfs stashes devices in i_private */
-	if (is_binderfs_device(nodp)) {
-		binder_dev = nodp->i_private;
-		info = nodp->i_sb->s_fs_info;
-		binder_binderfs_dir_entry_proc = info->proc_log_dir;
-	} else {
-		binder_dev = container_of(filp->private_data,
-					  struct binder_device, miscdev);
-	}
-	refcount_inc(&binder_dev->ref);
-	proc->context = &binder_dev->context;
-	binder_alloc_init(&proc->alloc);
-
-	binder_stats_created(BINDER_STAT_PROC);
-	proc->pid = current->group_leader->pid;
-	INIT_LIST_HEAD(&proc->delivered_death);
-	INIT_LIST_HEAD(&proc->waiting_threads);
-	filp->private_data = proc;
-
-	mutex_lock(&binder_procs_lock);
-	hlist_for_each_entry(itr, &binder_procs, proc_node) {
-		if (itr->pid == proc->pid) {
-			existing_pid = true;
-			break;
-		}
-	}
-	hlist_add_head(&proc->proc_node, &binder_procs);
-	mutex_unlock(&binder_procs_lock);
-
-	if (binder_debugfs_dir_entry_proc && !existing_pid) {
-		char strbuf[11];
-
-		snprintf(strbuf, sizeof(strbuf), "%u", proc->pid);
-		/*
-		 * proc debug entries are shared between contexts.
-		 * Only create for the first PID to avoid debugfs log spamming
-		 * The printing code will anyway print all contexts for a given
-		 * PID so this is not a problem.
-		 */
-		proc->debugfs_entry = debugfs_create_file(strbuf, 0444,
-			binder_debugfs_dir_entry_proc,
-			(void *)(unsigned long)proc->pid,
-			&proc_fops);
-	}
-
-	if (binder_binderfs_dir_entry_proc && !existing_pid) {
-		char strbuf[11];
-		struct dentry *binderfs_entry;
-
-		snprintf(strbuf, sizeof(strbuf), "%u", proc->pid);
-		/*
-		 * Similar to debugfs, the process specific log file is shared
-		 * between contexts. Only create for the first PID.
-		 * This is ok since same as debugfs, the log file will contain
-		 * information on all contexts of a given PID.
-		 */
-		binderfs_entry = binderfs_create_file(binder_binderfs_dir_entry_proc,
-			strbuf, &proc_fops, (void *)(unsigned long)proc->pid);
-		if (!IS_ERR(binderfs_entry)) {
-			proc->binderfs_entry = binderfs_entry;
-		} else {
-			int error;
-
-			error = PTR_ERR(binderfs_entry);
-			pr_warn("Unable to create file %s in binderfs (error %d)\n",
-				strbuf, error);
-		}
-	}
-
-	return 0;
-}
-
-static int binder_flush(struct file *filp, fl_owner_t id)
-{
-	struct binder_proc *proc = filp->private_data;
-
-	binder_defer_work(proc, BINDER_DEFERRED_FLUSH);
-
-	return 0;
-}
-
-static void binder_deferred_flush(struct binder_proc *proc)
-{
-	struct rb_node *n;
-	int wake_count = 0;
-
-	binder_inner_proc_lock(proc);
-	for (n = rb_first(&proc->threads); n != NULL; n = rb_next(n)) {
-		struct binder_thread *thread = rb_entry(n, struct binder_thread, rb_node);
-
-		thread->looper_need_return = true;
-		if (thread->looper & BINDER_LOOPER_STATE_WAITING) {
-			wake_up_interruptible(&thread->wait);
-			wake_count++;
-		}
-	}
-	binder_inner_proc_unlock(proc);
-
-	binder_debug(BINDER_DEBUG_OPEN_CLOSE,
-		     "binder_flush: %d woke %d threads\n", proc->pid,
-		     wake_count);
-}
-
-static int binder_release(struct inode *nodp, struct file *filp)
-{
-	struct binder_proc *proc = filp->private_data;
-
-	debugfs_remove(proc->debugfs_entry);
-
-	if (proc->binderfs_entry) {
-		binderfs_remove_file(proc->binderfs_entry);
-		proc->binderfs_entry = NULL;
-	}
-
-	binder_defer_work(proc, BINDER_DEFERRED_RELEASE);
-
-	return 0;
-}
-
-static int binder_node_release(struct binder_node *node, int refs)
-{
-	struct binder_ref *ref;
-	int death = 0;
-	struct binder_proc *proc = node->proc;
-
-	binder_release_work(proc, &node->async_todo);
-
-	binder_node_lock(node);
-	binder_inner_proc_lock(proc);
-	binder_dequeue_work_ilocked(&node->work);
-	/*
-	 * The caller must have taken a temporary ref on the node,
-	 */
-	BUG_ON(!node->tmp_refs);
-	if (hlist_empty(&node->refs) && node->tmp_refs == 1) {
-		binder_inner_proc_unlock(proc);
-		binder_node_unlock(node);
-		binder_free_node(node);
-
-		return refs;
-	}
-
-	node->proc = NULL;
-	node->local_strong_refs = 0;
-	node->local_weak_refs = 0;
-	binder_inner_proc_unlock(proc);
-
-	spin_lock(&binder_dead_nodes_lock);
-	hlist_add_head(&node->dead_node, &binder_dead_nodes);
-	spin_unlock(&binder_dead_nodes_lock);
-
-	hlist_for_each_entry(ref, &node->refs, node_entry) {
-		refs++;
-		/*
-		 * Need the node lock to synchronize
-		 * with new notification requests and the
-		 * inner lock to synchronize with queued
-		 * death notifications.
-		 */
-		binder_inner_proc_lock(ref->proc);
-		if (!ref->death) {
-			binder_inner_proc_unlock(ref->proc);
-			continue;
-		}
-
-		death++;
-
-		BUG_ON(!list_empty(&ref->death->work.entry));
-		ref->death->work.type = BINDER_WORK_DEAD_BINDER;
-		binder_enqueue_work_ilocked(&ref->death->work,
-					    &ref->proc->todo);
-		binder_wakeup_proc_ilocked(ref->proc);
-		binder_inner_proc_unlock(ref->proc);
-	}
-
-	binder_debug(BINDER_DEBUG_DEAD_BINDER,
-		     "node %d now dead, refs %d, death %d\n",
-		     node->debug_id, refs, death);
-	binder_node_unlock(node);
-	binder_put_node(node);
-
-	return refs;
-}
-
-static void binder_deferred_release(struct binder_proc *proc)
-{
-	struct binder_context *context = proc->context;
-	struct rb_node *n;
-	int threads, nodes, incoming_refs, outgoing_refs, active_transactions;
-
-	mutex_lock(&binder_procs_lock);
-	hlist_del(&proc->proc_node);
-	mutex_unlock(&binder_procs_lock);
-
-	mutex_lock(&context->context_mgr_node_lock);
-	if (context->binder_context_mgr_node &&
-	    context->binder_context_mgr_node->proc == proc) {
-		binder_debug(BINDER_DEBUG_DEAD_BINDER,
-			     "%s: %d context_mgr_node gone\n",
-			     __func__, proc->pid);
-		context->binder_context_mgr_node = NULL;
-	}
-	mutex_unlock(&context->context_mgr_node_lock);
-	binder_inner_proc_lock(proc);
-	/*
-	 * Make sure proc stays alive after we
-	 * remove all the threads
-	 */
-	proc->tmp_ref++;
-
-	proc->is_dead = true;
-	proc->is_frozen = false;
-	proc->sync_recv = false;
-	proc->async_recv = false;
-	threads = 0;
-	active_transactions = 0;
-	while ((n = rb_first(&proc->threads))) {
-		struct binder_thread *thread;
-
-		thread = rb_entry(n, struct binder_thread, rb_node);
-		binder_inner_proc_unlock(proc);
-		threads++;
-		active_transactions += binder_thread_release(proc, thread);
-		binder_inner_proc_lock(proc);
-	}
-
-	nodes = 0;
-	incoming_refs = 0;
-	while ((n = rb_first(&proc->nodes))) {
-		struct binder_node *node;
-
-		node = rb_entry(n, struct binder_node, rb_node);
-		nodes++;
-		/*
-		 * take a temporary ref on the node before
-		 * calling binder_node_release() which will either
-		 * kfree() the node or call binder_put_node()
-		 */
-		binder_inc_node_tmpref_ilocked(node);
-		rb_erase(&node->rb_node, &proc->nodes);
-		binder_inner_proc_unlock(proc);
-		incoming_refs = binder_node_release(node, incoming_refs);
-		binder_inner_proc_lock(proc);
-	}
-	binder_inner_proc_unlock(proc);
-
-	outgoing_refs = 0;
-	binder_proc_lock(proc);
-	while ((n = rb_first(&proc->refs_by_desc))) {
-		struct binder_ref *ref;
-
-		ref = rb_entry(n, struct binder_ref, rb_node_desc);
-		outgoing_refs++;
-		binder_cleanup_ref_olocked(ref);
-		binder_proc_unlock(proc);
-		binder_free_ref(ref);
-		binder_proc_lock(proc);
-	}
-	binder_proc_unlock(proc);
-
-	binder_release_work(proc, &proc->todo);
-	binder_release_work(proc, &proc->delivered_death);
-
-	binder_debug(BINDER_DEBUG_OPEN_CLOSE,
-		     "%s: %d threads %d, nodes %d (ref %d), refs %d, active transactions %d\n",
-		     __func__, proc->pid, threads, nodes, incoming_refs,
-		     outgoing_refs, active_transactions);
-
-	binder_proc_dec_tmpref(proc);
-}
-
-static void binder_deferred_func(struct work_struct *work)
-{
-	struct binder_proc *proc;
-
-	int defer;
-
-	do {
-		mutex_lock(&binder_deferred_lock);
-		if (!hlist_empty(&binder_deferred_list)) {
-			proc = hlist_entry(binder_deferred_list.first,
-					struct binder_proc, deferred_work_node);
-			hlist_del_init(&proc->deferred_work_node);
-			defer = proc->deferred_work;
-			proc->deferred_work = 0;
-		} else {
-			proc = NULL;
-			defer = 0;
-		}
-		mutex_unlock(&binder_deferred_lock);
-
-		if (defer & BINDER_DEFERRED_FLUSH)
-			binder_deferred_flush(proc);
-
-		if (defer & BINDER_DEFERRED_RELEASE)
-			binder_deferred_release(proc); /* frees proc */
-	} while (proc);
-}
-static DECLARE_WORK(binder_deferred_work, binder_deferred_func);
-
-static void
-binder_defer_work(struct binder_proc *proc, enum binder_deferred_state defer)
-{
-	mutex_lock(&binder_deferred_lock);
-	proc->deferred_work |= defer;
-	if (hlist_unhashed(&proc->deferred_work_node)) {
-		hlist_add_head(&proc->deferred_work_node,
-				&binder_deferred_list);
-		schedule_work(&binder_deferred_work);
-	}
-	mutex_unlock(&binder_deferred_lock);
-}
-
-static void print_binder_transaction_ilocked(struct seq_file *m,
-					     struct binder_proc *proc,
-					     const char *prefix,
-					     struct binder_transaction *t)
-{
-	struct binder_proc *to_proc;
-	struct binder_buffer *buffer = t->buffer;
-	ktime_t current_time = ktime_get();
-
-	spin_lock(&t->lock);
-	to_proc = t->to_proc;
-	seq_printf(m,
-		   "%s %d: %pK from %d:%d to %d:%d code %x flags %x pri %ld r%d elapsed %lldms",
-		   prefix, t->debug_id, t,
-		   t->from_pid,
-		   t->from_tid,
-		   to_proc ? to_proc->pid : 0,
-		   t->to_thread ? t->to_thread->pid : 0,
-		   t->code, t->flags, t->priority, t->need_reply,
-		   ktime_ms_delta(current_time, t->start_time));
-	spin_unlock(&t->lock);
-
-	if (proc != to_proc) {
-		/*
-		 * Can only safely deref buffer if we are holding the
-		 * correct proc inner lock for this node
-		 */
-		seq_puts(m, "\n");
-		return;
-	}
-
-	if (buffer == NULL) {
-		seq_puts(m, " buffer free\n");
-		return;
-	}
-	if (buffer->target_node)
-		seq_printf(m, " node %d", buffer->target_node->debug_id);
-	seq_printf(m, " size %zd:%zd data %pK\n",
-		   buffer->data_size, buffer->offsets_size,
-		   buffer->user_data);
-}
-
-static void print_binder_work_ilocked(struct seq_file *m,
-				     struct binder_proc *proc,
-				     const char *prefix,
-				     const char *transaction_prefix,
-				     struct binder_work *w)
-{
-	struct binder_node *node;
-	struct binder_transaction *t;
-
-	switch (w->type) {
-	case BINDER_WORK_TRANSACTION:
-		t = container_of(w, struct binder_transaction, work);
-		print_binder_transaction_ilocked(
-				m, proc, transaction_prefix, t);
-		break;
-	case BINDER_WORK_RETURN_ERROR: {
-		struct binder_error *e = container_of(
-				w, struct binder_error, work);
-
-		seq_printf(m, "%stransaction error: %u\n",
-			   prefix, e->cmd);
-	} break;
-	case BINDER_WORK_TRANSACTION_COMPLETE:
-		seq_printf(m, "%stransaction complete\n", prefix);
-		break;
-	case BINDER_WORK_NODE:
-		node = container_of(w, struct binder_node, work);
-		seq_printf(m, "%snode work %d: u%016llx c%016llx\n",
-			   prefix, node->debug_id,
-			   (u64)node->ptr, (u64)node->cookie);
-		break;
-	case BINDER_WORK_DEAD_BINDER:
-		seq_printf(m, "%shas dead binder\n", prefix);
-		break;
-	case BINDER_WORK_DEAD_BINDER_AND_CLEAR:
-		seq_printf(m, "%shas cleared dead binder\n", prefix);
-		break;
-	case BINDER_WORK_CLEAR_DEATH_NOTIFICATION:
-		seq_printf(m, "%shas cleared death notification\n", prefix);
-		break;
-	default:
-		seq_printf(m, "%sunknown work: type %d\n", prefix, w->type);
-		break;
-	}
-}
-
-static void print_binder_thread_ilocked(struct seq_file *m,
-					struct binder_thread *thread,
-					int print_always)
-{
-	struct binder_transaction *t;
-	struct binder_work *w;
-	size_t start_pos = m->count;
-	size_t header_pos;
-
-	seq_printf(m, "  thread %d: l %02x need_return %d tr %d\n",
-			thread->pid, thread->looper,
-			thread->looper_need_return,
-			atomic_read(&thread->tmp_ref));
-	header_pos = m->count;
-	t = thread->transaction_stack;
-	while (t) {
-		if (t->from == thread) {
-			print_binder_transaction_ilocked(m, thread->proc,
-					"    outgoing transaction", t);
-			t = t->from_parent;
-		} else if (t->to_thread == thread) {
-			print_binder_transaction_ilocked(m, thread->proc,
-						 "    incoming transaction", t);
-			t = t->to_parent;
-		} else {
-			print_binder_transaction_ilocked(m, thread->proc,
-					"    bad transaction", t);
-			t = NULL;
-		}
-	}
-	list_for_each_entry(w, &thread->todo, entry) {
-		print_binder_work_ilocked(m, thread->proc, "    ",
-					  "    pending transaction", w);
-	}
-	if (!print_always && m->count == header_pos)
-		m->count = start_pos;
-}
-
-static void print_binder_node_nilocked(struct seq_file *m,
-				       struct binder_node *node)
-{
-	struct binder_ref *ref;
-	struct binder_work *w;
-	int count;
-
-	count = 0;
-	hlist_for_each_entry(ref, &node->refs, node_entry)
-		count++;
-
-	seq_printf(m, "  node %d: u%016llx c%016llx hs %d hw %d ls %d lw %d is %d iw %d tr %d",
-		   node->debug_id, (u64)node->ptr, (u64)node->cookie,
-		   node->has_strong_ref, node->has_weak_ref,
-		   node->local_strong_refs, node->local_weak_refs,
-		   node->internal_strong_refs, count, node->tmp_refs);
-	if (count) {
-		seq_puts(m, " proc");
-		hlist_for_each_entry(ref, &node->refs, node_entry)
-			seq_printf(m, " %d", ref->proc->pid);
-	}
-	seq_puts(m, "\n");
-	if (node->proc) {
-		list_for_each_entry(w, &node->async_todo, entry)
-			print_binder_work_ilocked(m, node->proc, "    ",
-					  "    pending async transaction", w);
-	}
-}
-
-static void print_binder_ref_olocked(struct seq_file *m,
-				     struct binder_ref *ref)
-{
-	binder_node_lock(ref->node);
-	seq_printf(m, "  ref %d: desc %d %snode %d s %d w %d d %pK\n",
-		   ref->data.debug_id, ref->data.desc,
-		   ref->node->proc ? "" : "dead ",
-		   ref->node->debug_id, ref->data.strong,
-		   ref->data.weak, ref->death);
-	binder_node_unlock(ref->node);
-}
-
-static void print_binder_proc(struct seq_file *m,
-			      struct binder_proc *proc, int print_all)
-{
-	struct binder_work *w;
-	struct rb_node *n;
-	size_t start_pos = m->count;
-	size_t header_pos;
-	struct binder_node *last_node = NULL;
-
-	seq_printf(m, "proc %d\n", proc->pid);
-	seq_printf(m, "context %s\n", proc->context->name);
-	header_pos = m->count;
-
-	binder_inner_proc_lock(proc);
-	for (n = rb_first(&proc->threads); n != NULL; n = rb_next(n))
-		print_binder_thread_ilocked(m, rb_entry(n, struct binder_thread,
-						rb_node), print_all);
-
-	for (n = rb_first(&proc->nodes); n != NULL; n = rb_next(n)) {
-		struct binder_node *node = rb_entry(n, struct binder_node,
-						    rb_node);
-		if (!print_all && !node->has_async_transaction)
-			continue;
-
-		/*
-		 * take a temporary reference on the node so it
-		 * survives and isn't removed from the tree
-		 * while we print it.
-		 */
-		binder_inc_node_tmpref_ilocked(node);
-		/* Need to drop inner lock to take node lock */
-		binder_inner_proc_unlock(proc);
-		if (last_node)
-			binder_put_node(last_node);
-		binder_node_inner_lock(node);
-		print_binder_node_nilocked(m, node);
-		binder_node_inner_unlock(node);
-		last_node = node;
-		binder_inner_proc_lock(proc);
-	}
-	binder_inner_proc_unlock(proc);
-	if (last_node)
-		binder_put_node(last_node);
-
-	if (print_all) {
-		binder_proc_lock(proc);
-		for (n = rb_first(&proc->refs_by_desc);
-		     n != NULL;
-		     n = rb_next(n))
-			print_binder_ref_olocked(m, rb_entry(n,
-							    struct binder_ref,
-							    rb_node_desc));
-		binder_proc_unlock(proc);
-	}
-	binder_alloc_print_allocated(m, &proc->alloc);
-	binder_inner_proc_lock(proc);
-	list_for_each_entry(w, &proc->todo, entry)
-		print_binder_work_ilocked(m, proc, "  ",
-					  "  pending transaction", w);
-	list_for_each_entry(w, &proc->delivered_death, entry) {
-		seq_puts(m, "  has delivered dead binder\n");
-		break;
-	}
-	binder_inner_proc_unlock(proc);
-	if (!print_all && m->count == header_pos)
-		m->count = start_pos;
-}
-
-static const char * const binder_return_strings[] = {
-	"BR_ERROR",
-	"BR_OK",
-	"BR_TRANSACTION",
-	"BR_REPLY",
-	"BR_ACQUIRE_RESULT",
-	"BR_DEAD_REPLY",
-	"BR_TRANSACTION_COMPLETE",
-	"BR_INCREFS",
-	"BR_ACQUIRE",
-	"BR_RELEASE",
-	"BR_DECREFS",
-	"BR_ATTEMPT_ACQUIRE",
-	"BR_NOOP",
-	"BR_SPAWN_LOOPER",
-	"BR_FINISHED",
-	"BR_DEAD_BINDER",
-	"BR_CLEAR_DEATH_NOTIFICATION_DONE",
-	"BR_FAILED_REPLY",
-	"BR_FROZEN_REPLY",
-	"BR_ONEWAY_SPAM_SUSPECT",
-	"BR_TRANSACTION_PENDING_FROZEN"
-};
-
-static const char * const binder_command_strings[] = {
-	"BC_TRANSACTION",
-	"BC_REPLY",
-	"BC_ACQUIRE_RESULT",
-	"BC_FREE_BUFFER",
-	"BC_INCREFS",
-	"BC_ACQUIRE",
-	"BC_RELEASE",
-	"BC_DECREFS",
-	"BC_INCREFS_DONE",
-	"BC_ACQUIRE_DONE",
-	"BC_ATTEMPT_ACQUIRE",
-	"BC_REGISTER_LOOPER",
-	"BC_ENTER_LOOPER",
-	"BC_EXIT_LOOPER",
-	"BC_REQUEST_DEATH_NOTIFICATION",
-	"BC_CLEAR_DEATH_NOTIFICATION",
-	"BC_DEAD_BINDER_DONE",
-	"BC_TRANSACTION_SG",
-	"BC_REPLY_SG",
-};
-
-static const char * const binder_objstat_strings[] = {
-	"proc",
-	"thread",
-	"node",
-	"ref",
-	"death",
-	"transaction",
-	"transaction_complete"
-};
-
-static void print_binder_stats(struct seq_file *m, const char *prefix,
-			       struct binder_stats *stats)
-{
-	int i;
-
-	BUILD_BUG_ON(ARRAY_SIZE(stats->bc) !=
-		     ARRAY_SIZE(binder_command_strings));
-	for (i = 0; i < ARRAY_SIZE(stats->bc); i++) {
-		int temp = atomic_read(&stats->bc[i]);
-
-		if (temp)
-			seq_printf(m, "%s%s: %d\n", prefix,
-				   binder_command_strings[i], temp);
-	}
-
-	BUILD_BUG_ON(ARRAY_SIZE(stats->br) !=
-		     ARRAY_SIZE(binder_return_strings));
-	for (i = 0; i < ARRAY_SIZE(stats->br); i++) {
-		int temp = atomic_read(&stats->br[i]);
-
-		if (temp)
-			seq_printf(m, "%s%s: %d\n", prefix,
-				   binder_return_strings[i], temp);
-	}
-
-	BUILD_BUG_ON(ARRAY_SIZE(stats->obj_created) !=
-		     ARRAY_SIZE(binder_objstat_strings));
-	BUILD_BUG_ON(ARRAY_SIZE(stats->obj_created) !=
-		     ARRAY_SIZE(stats->obj_deleted));
-	for (i = 0; i < ARRAY_SIZE(stats->obj_created); i++) {
-		int created = atomic_read(&stats->obj_created[i]);
-		int deleted = atomic_read(&stats->obj_deleted[i]);
-
-		if (created || deleted)
-			seq_printf(m, "%s%s: active %d total %d\n",
-				prefix,
-				binder_objstat_strings[i],
-				created - deleted,
-				created);
-	}
-}
-
-static void print_binder_proc_stats(struct seq_file *m,
-				    struct binder_proc *proc)
-{
-	struct binder_work *w;
-	struct binder_thread *thread;
-	struct rb_node *n;
-	int count, strong, weak, ready_threads;
-	size_t free_async_space =
-		binder_alloc_get_free_async_space(&proc->alloc);
-
-	seq_printf(m, "proc %d\n", proc->pid);
-	seq_printf(m, "context %s\n", proc->context->name);
-	count = 0;
-	ready_threads = 0;
-	binder_inner_proc_lock(proc);
-	for (n = rb_first(&proc->threads); n != NULL; n = rb_next(n))
-		count++;
-
-	list_for_each_entry(thread, &proc->waiting_threads, waiting_thread_node)
-		ready_threads++;
-
-	seq_printf(m, "  threads: %d\n", count);
-	seq_printf(m, "  requested threads: %d+%d/%d\n"
-			"  ready threads %d\n"
-			"  free async space %zd\n", proc->requested_threads,
-			proc->requested_threads_started, proc->max_threads,
-			ready_threads,
-			free_async_space);
-	count = 0;
-	for (n = rb_first(&proc->nodes); n != NULL; n = rb_next(n))
-		count++;
-	binder_inner_proc_unlock(proc);
-	seq_printf(m, "  nodes: %d\n", count);
-	count = 0;
-	strong = 0;
-	weak = 0;
-	binder_proc_lock(proc);
-	for (n = rb_first(&proc->refs_by_desc); n != NULL; n = rb_next(n)) {
-		struct binder_ref *ref = rb_entry(n, struct binder_ref,
-						  rb_node_desc);
-		count++;
-		strong += ref->data.strong;
-		weak += ref->data.weak;
-	}
-	binder_proc_unlock(proc);
-	seq_printf(m, "  refs: %d s %d w %d\n", count, strong, weak);
-
-	count = binder_alloc_get_allocated_count(&proc->alloc);
-	seq_printf(m, "  buffers: %d\n", count);
-
-	binder_alloc_print_pages(m, &proc->alloc);
-
-	count = 0;
-	binder_inner_proc_lock(proc);
-	list_for_each_entry(w, &proc->todo, entry) {
-		if (w->type == BINDER_WORK_TRANSACTION)
-			count++;
-	}
-	binder_inner_proc_unlock(proc);
-	seq_printf(m, "  pending transactions: %d\n", count);
-
-	print_binder_stats(m, "  ", &proc->stats);
-}
-
-static int state_show(struct seq_file *m, void *unused)
-{
-	struct binder_proc *proc;
-	struct binder_node *node;
-	struct binder_node *last_node = NULL;
-
-	seq_puts(m, "binder state:\n");
-
-	spin_lock(&binder_dead_nodes_lock);
-	if (!hlist_empty(&binder_dead_nodes))
-		seq_puts(m, "dead nodes:\n");
-	hlist_for_each_entry(node, &binder_dead_nodes, dead_node) {
-		/*
-		 * take a temporary reference on the node so it
-		 * survives and isn't removed from the list
-		 * while we print it.
-		 */
-		node->tmp_refs++;
-		spin_unlock(&binder_dead_nodes_lock);
-		if (last_node)
-			binder_put_node(last_node);
-		binder_node_lock(node);
-		print_binder_node_nilocked(m, node);
-		binder_node_unlock(node);
-		last_node = node;
-		spin_lock(&binder_dead_nodes_lock);
-	}
-	spin_unlock(&binder_dead_nodes_lock);
-	if (last_node)
-		binder_put_node(last_node);
-
-	mutex_lock(&binder_procs_lock);
-	hlist_for_each_entry(proc, &binder_procs, proc_node)
-		print_binder_proc(m, proc, 1);
-	mutex_unlock(&binder_procs_lock);
-
-	return 0;
-}
-
-static int stats_show(struct seq_file *m, void *unused)
-{
-	struct binder_proc *proc;
-
-	seq_puts(m, "binder stats:\n");
-
-	print_binder_stats(m, "", &binder_stats);
-
-	mutex_lock(&binder_procs_lock);
-	hlist_for_each_entry(proc, &binder_procs, proc_node)
-		print_binder_proc_stats(m, proc);
-	mutex_unlock(&binder_procs_lock);
-
-	return 0;
-}
-
-static int transactions_show(struct seq_file *m, void *unused)
-{
-	struct binder_proc *proc;
-
-	seq_puts(m, "binder transactions:\n");
-	mutex_lock(&binder_procs_lock);
-	hlist_for_each_entry(proc, &binder_procs, proc_node)
-		print_binder_proc(m, proc, 0);
-	mutex_unlock(&binder_procs_lock);
-
-	return 0;
-}
-
-static int proc_show(struct seq_file *m, void *unused)
-{
-	struct binder_proc *itr;
-	int pid = (unsigned long)m->private;
-
-	mutex_lock(&binder_procs_lock);
-	hlist_for_each_entry(itr, &binder_procs, proc_node) {
-		if (itr->pid == pid) {
-			seq_puts(m, "binder proc state:\n");
-			print_binder_proc(m, itr, 1);
-		}
-	}
-	mutex_unlock(&binder_procs_lock);
-
-	return 0;
-}
-
-static void print_binder_transaction_log_entry(struct seq_file *m,
-					struct binder_transaction_log_entry *e)
-{
-	int debug_id = READ_ONCE(e->debug_id_done);
-	/*
-	 * read barrier to guarantee debug_id_done read before
-	 * we print the log values
-	 */
-	smp_rmb();
-	seq_printf(m,
-		   "%d: %s from %d:%d to %d:%d context %s node %d handle %d size %d:%d ret %d/%d l=%d",
-		   e->debug_id, (e->call_type == 2) ? "reply" :
-		   ((e->call_type == 1) ? "async" : "call "), e->from_proc,
-		   e->from_thread, e->to_proc, e->to_thread, e->context_name,
-		   e->to_node, e->target_handle, e->data_size, e->offsets_size,
-		   e->return_error, e->return_error_param,
-		   e->return_error_line);
-	/*
-	 * read-barrier to guarantee read of debug_id_done after
-	 * done printing the fields of the entry
-	 */
-	smp_rmb();
-	seq_printf(m, debug_id && debug_id == READ_ONCE(e->debug_id_done) ?
-			"\n" : " (incomplete)\n");
-}
-
-static int transaction_log_show(struct seq_file *m, void *unused)
-{
-	struct binder_transaction_log *log = m->private;
-	unsigned int log_cur = atomic_read(&log->cur);
-	unsigned int count;
-	unsigned int cur;
-	int i;
-
-	count = log_cur + 1;
-	cur = count < ARRAY_SIZE(log->entry) && !log->full ?
-		0 : count % ARRAY_SIZE(log->entry);
-	if (count > ARRAY_SIZE(log->entry) || log->full)
-		count = ARRAY_SIZE(log->entry);
-	for (i = 0; i < count; i++) {
-		unsigned int index = cur++ % ARRAY_SIZE(log->entry);
-
-		print_binder_transaction_log_entry(m, &log->entry[index]);
-	}
-	return 0;
-}
-
-const struct file_operations binder_fops = {
-	.owner = THIS_MODULE,
-	.poll = binder_poll,
-	.unlocked_ioctl = binder_ioctl,
-	.compat_ioctl = compat_ptr_ioctl,
-	.mmap = binder_mmap,
-	.open = binder_open,
-	.flush = binder_flush,
-	.release = binder_release,
-};
-
-DEFINE_SHOW_ATTRIBUTE(state);
-DEFINE_SHOW_ATTRIBUTE(stats);
-DEFINE_SHOW_ATTRIBUTE(transactions);
-DEFINE_SHOW_ATTRIBUTE(transaction_log);
-
-const struct binder_debugfs_entry binder_debugfs_entries[] = {
-	{
-		.name = "state",
-		.mode = 0444,
-		.fops = &state_fops,
-		.data = NULL,
-	},
-	{
-		.name = "stats",
-		.mode = 0444,
-		.fops = &stats_fops,
-		.data = NULL,
-	},
-	{
-		.name = "transactions",
-		.mode = 0444,
-		.fops = &transactions_fops,
-		.data = NULL,
-	},
-	{
-		.name = "transaction_log",
-		.mode = 0444,
-		.fops = &transaction_log_fops,
-		.data = &binder_transaction_log,
-	},
-	{
-		.name = "failed_transaction_log",
-		.mode = 0444,
-		.fops = &transaction_log_fops,
-		.data = &binder_transaction_log_failed,
-	},
-	{} /* terminator */
-};
-
-static int __init init_binder_device(const char *name)
-{
-	int ret;
-	struct binder_device *binder_device;
-
-	binder_device = kzalloc(sizeof(*binder_device), GFP_KERNEL);
-	if (!binder_device)
-		return -ENOMEM;
-
-	binder_device->miscdev.fops = &binder_fops;
-	binder_device->miscdev.minor = MISC_DYNAMIC_MINOR;
-	binder_device->miscdev.name = name;
-
-	refcount_set(&binder_device->ref, 1);
-	binder_device->context.binder_context_mgr_uid = INVALID_UID;
-	binder_device->context.name = name;
-	mutex_init(&binder_device->context.context_mgr_node_lock);
-
-	ret = misc_register(&binder_device->miscdev);
-	if (ret < 0) {
-		kfree(binder_device);
-		return ret;
-	}
-
-	hlist_add_head(&binder_device->hlist, &binder_devices);
-
-	return ret;
-}
-
-static int __init binder_init(void)
-{
-	int ret;
-	char *device_name, *device_tmp;
-	struct binder_device *device;
-	struct hlist_node *tmp;
-	char *device_names = NULL;
-	const struct binder_debugfs_entry *db_entry;
-
-	ret = binder_alloc_shrinker_init();
-	if (ret)
-		return ret;
-
-	atomic_set(&binder_transaction_log.cur, ~0U);
-	atomic_set(&binder_transaction_log_failed.cur, ~0U);
-
-	binder_debugfs_dir_entry_root = debugfs_create_dir("binder", NULL);
-
-	binder_for_each_debugfs_entry(db_entry)
-		debugfs_create_file(db_entry->name,
-					db_entry->mode,
-					binder_debugfs_dir_entry_root,
-					db_entry->data,
-					db_entry->fops);
-
-	binder_debugfs_dir_entry_proc = debugfs_create_dir("proc",
-						binder_debugfs_dir_entry_root);
-
-	if (!IS_ENABLED(CONFIG_ANDROID_BINDERFS) &&
-	    strcmp(binder_devices_param, "") != 0) {
-		/*
-		* Copy the module_parameter string, because we don't want to
-		* tokenize it in-place.
-		 */
-		device_names = kstrdup(binder_devices_param, GFP_KERNEL);
-		if (!device_names) {
-			ret = -ENOMEM;
-			goto err_alloc_device_names_failed;
-		}
-
-		device_tmp = device_names;
-		while ((device_name = strsep(&device_tmp, ","))) {
-			ret = init_binder_device(device_name);
-			if (ret)
-				goto err_init_binder_device_failed;
-		}
-	}
-
-	ret = init_binderfs();
-	if (ret)
-		goto err_init_binder_device_failed;
-
-	return ret;
-
-err_init_binder_device_failed:
-	hlist_for_each_entry_safe(device, tmp, &binder_devices, hlist) {
-		misc_deregister(&device->miscdev);
-		hlist_del(&device->hlist);
-		kfree(device);
-	}
-
-	kfree(device_names);
-
-err_alloc_device_names_failed:
-	debugfs_remove_recursive(binder_debugfs_dir_entry_root);
-	binder_alloc_shrinker_exit();
-
-	return ret;
-}
-
-device_initcall(binder_init);
-
-#define CREATE_TRACE_POINTS
-#include "binder_trace.h"
-
-MODULE_LICENSE("GPL v2");
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
deleted file mode 100644
index e3db8297095a..000000000000
--- a/drivers/android/binder_alloc.c
+++ /dev/null
@@ -1,1284 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/* binder_alloc.c
- *
- * Android IPC Subsystem
- *
- * Copyright (C) 2007-2017 Google, Inc.
- */
-
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
-#include <linux/list.h>
-#include <linux/sched/mm.h>
-#include <linux/module.h>
-#include <linux/rtmutex.h>
-#include <linux/rbtree.h>
-#include <linux/seq_file.h>
-#include <linux/vmalloc.h>
-#include <linux/slab.h>
-#include <linux/sched.h>
-#include <linux/list_lru.h>
-#include <linux/ratelimit.h>
-#include <asm/cacheflush.h>
-#include <linux/uaccess.h>
-#include <linux/highmem.h>
-#include <linux/sizes.h>
-#include "binder_alloc.h"
-#include "binder_trace.h"
-
-struct list_lru binder_alloc_lru;
-
-static DEFINE_MUTEX(binder_alloc_mmap_lock);
-
-enum {
-	BINDER_DEBUG_USER_ERROR             = 1U << 0,
-	BINDER_DEBUG_OPEN_CLOSE             = 1U << 1,
-	BINDER_DEBUG_BUFFER_ALLOC           = 1U << 2,
-	BINDER_DEBUG_BUFFER_ALLOC_ASYNC     = 1U << 3,
-};
-static uint32_t binder_alloc_debug_mask = BINDER_DEBUG_USER_ERROR;
-
-module_param_named(debug_mask, binder_alloc_debug_mask,
-		   uint, 0644);
-
-#define binder_alloc_debug(mask, x...) \
-	do { \
-		if (binder_alloc_debug_mask & mask) \
-			pr_info_ratelimited(x); \
-	} while (0)
-
-static struct binder_buffer *binder_buffer_next(struct binder_buffer *buffer)
-{
-	return list_entry(buffer->entry.next, struct binder_buffer, entry);
-}
-
-static struct binder_buffer *binder_buffer_prev(struct binder_buffer *buffer)
-{
-	return list_entry(buffer->entry.prev, struct binder_buffer, entry);
-}
-
-static size_t binder_alloc_buffer_size(struct binder_alloc *alloc,
-				       struct binder_buffer *buffer)
-{
-	if (list_is_last(&buffer->entry, &alloc->buffers))
-		return alloc->buffer + alloc->buffer_size - buffer->user_data;
-	return binder_buffer_next(buffer)->user_data - buffer->user_data;
-}
-
-static void binder_insert_free_buffer(struct binder_alloc *alloc,
-				      struct binder_buffer *new_buffer)
-{
-	struct rb_node **p = &alloc->free_buffers.rb_node;
-	struct rb_node *parent = NULL;
-	struct binder_buffer *buffer;
-	size_t buffer_size;
-	size_t new_buffer_size;
-
-	BUG_ON(!new_buffer->free);
-
-	new_buffer_size = binder_alloc_buffer_size(alloc, new_buffer);
-
-	binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
-		     "%d: add free buffer, size %zd, at %pK\n",
-		      alloc->pid, new_buffer_size, new_buffer);
-
-	while (*p) {
-		parent = *p;
-		buffer = rb_entry(parent, struct binder_buffer, rb_node);
-		BUG_ON(!buffer->free);
-
-		buffer_size = binder_alloc_buffer_size(alloc, buffer);
-
-		if (new_buffer_size < buffer_size)
-			p = &parent->rb_left;
-		else
-			p = &parent->rb_right;
-	}
-	rb_link_node(&new_buffer->rb_node, parent, p);
-	rb_insert_color(&new_buffer->rb_node, &alloc->free_buffers);
-}
-
-static void binder_insert_allocated_buffer_locked(
-		struct binder_alloc *alloc, struct binder_buffer *new_buffer)
-{
-	struct rb_node **p = &alloc->allocated_buffers.rb_node;
-	struct rb_node *parent = NULL;
-	struct binder_buffer *buffer;
-
-	BUG_ON(new_buffer->free);
-
-	while (*p) {
-		parent = *p;
-		buffer = rb_entry(parent, struct binder_buffer, rb_node);
-		BUG_ON(buffer->free);
-
-		if (new_buffer->user_data < buffer->user_data)
-			p = &parent->rb_left;
-		else if (new_buffer->user_data > buffer->user_data)
-			p = &parent->rb_right;
-		else
-			BUG();
-	}
-	rb_link_node(&new_buffer->rb_node, parent, p);
-	rb_insert_color(&new_buffer->rb_node, &alloc->allocated_buffers);
-}
-
-static struct binder_buffer *binder_alloc_prepare_to_free_locked(
-		struct binder_alloc *alloc,
-		uintptr_t user_ptr)
-{
-	struct rb_node *n = alloc->allocated_buffers.rb_node;
-	struct binder_buffer *buffer;
-	void __user *uptr;
-
-	uptr = (void __user *)user_ptr;
-
-	while (n) {
-		buffer = rb_entry(n, struct binder_buffer, rb_node);
-		BUG_ON(buffer->free);
-
-		if (uptr < buffer->user_data)
-			n = n->rb_left;
-		else if (uptr > buffer->user_data)
-			n = n->rb_right;
-		else {
-			/*
-			 * Guard against user threads attempting to
-			 * free the buffer when in use by kernel or
-			 * after it's already been freed.
-			 */
-			if (!buffer->allow_user_free)
-				return ERR_PTR(-EPERM);
-			buffer->allow_user_free = 0;
-			return buffer;
-		}
-	}
-	return NULL;
-}
-
-/**
- * binder_alloc_prepare_to_free() - get buffer given user ptr
- * @alloc:	binder_alloc for this proc
- * @user_ptr:	User pointer to buffer data
- *
- * Validate userspace pointer to buffer data and return buffer corresponding to
- * that user pointer. Search the rb tree for buffer that matches user data
- * pointer.
- *
- * Return:	Pointer to buffer or NULL
- */
-struct binder_buffer *binder_alloc_prepare_to_free(struct binder_alloc *alloc,
-						   uintptr_t user_ptr)
-{
-	struct binder_buffer *buffer;
-
-	mutex_lock(&alloc->mutex);
-	buffer = binder_alloc_prepare_to_free_locked(alloc, user_ptr);
-	mutex_unlock(&alloc->mutex);
-	return buffer;
-}
-
-static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
-				    void __user *start, void __user *end)
-{
-	void __user *page_addr;
-	unsigned long user_page_addr;
-	struct binder_lru_page *page;
-	struct vm_area_struct *vma = NULL;
-	struct mm_struct *mm = NULL;
-	bool need_mm = false;
-
-	binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
-		     "%d: %s pages %pK-%pK\n", alloc->pid,
-		     allocate ? "allocate" : "free", start, end);
-
-	if (end <= start)
-		return 0;
-
-	trace_binder_update_page_range(alloc, allocate, start, end);
-
-	if (allocate == 0)
-		goto free_range;
-
-	for (page_addr = start; page_addr < end; page_addr += PAGE_SIZE) {
-		page = &alloc->pages[(page_addr - alloc->buffer) / PAGE_SIZE];
-		if (!page->page_ptr) {
-			need_mm = true;
-			break;
-		}
-	}
-
-	if (need_mm && mmget_not_zero(alloc->mm))
-		mm = alloc->mm;
-
-	if (mm) {
-		mmap_write_lock(mm);
-		vma = alloc->vma;
-	}
-
-	if (!vma && need_mm) {
-		binder_alloc_debug(BINDER_DEBUG_USER_ERROR,
-				   "%d: binder_alloc_buf failed to map pages in userspace, no vma\n",
-				   alloc->pid);
-		goto err_no_vma;
-	}
-
-	for (page_addr = start; page_addr < end; page_addr += PAGE_SIZE) {
-		int ret;
-		bool on_lru;
-		size_t index;
-
-		index = (page_addr - alloc->buffer) / PAGE_SIZE;
-		page = &alloc->pages[index];
-
-		if (page->page_ptr) {
-			trace_binder_alloc_lru_start(alloc, index);
-
-			on_lru = list_lru_del(&binder_alloc_lru, &page->lru);
-			WARN_ON(!on_lru);
-
-			trace_binder_alloc_lru_end(alloc, index);
-			continue;
-		}
-
-		if (WARN_ON(!vma))
-			goto err_page_ptr_cleared;
-
-		trace_binder_alloc_page_start(alloc, index);
-		page->page_ptr = alloc_page(GFP_KERNEL |
-					    __GFP_HIGHMEM |
-					    __GFP_ZERO);
-		if (!page->page_ptr) {
-			pr_err("%d: binder_alloc_buf failed for page at %pK\n",
-				alloc->pid, page_addr);
-			goto err_alloc_page_failed;
-		}
-		page->alloc = alloc;
-		INIT_LIST_HEAD(&page->lru);
-
-		user_page_addr = (uintptr_t)page_addr;
-		ret = vm_insert_page(vma, user_page_addr, page[0].page_ptr);
-		if (ret) {
-			pr_err("%d: binder_alloc_buf failed to map page at %lx in userspace\n",
-			       alloc->pid, user_page_addr);
-			goto err_vm_insert_page_failed;
-		}
-
-		if (index + 1 > alloc->pages_high)
-			alloc->pages_high = index + 1;
-
-		trace_binder_alloc_page_end(alloc, index);
-	}
-	if (mm) {
-		mmap_write_unlock(mm);
-		mmput(mm);
-	}
-	return 0;
-
-free_range:
-	for (page_addr = end - PAGE_SIZE; 1; page_addr -= PAGE_SIZE) {
-		bool ret;
-		size_t index;
-
-		index = (page_addr - alloc->buffer) / PAGE_SIZE;
-		page = &alloc->pages[index];
-
-		trace_binder_free_lru_start(alloc, index);
-
-		ret = list_lru_add(&binder_alloc_lru, &page->lru);
-		WARN_ON(!ret);
-
-		trace_binder_free_lru_end(alloc, index);
-		if (page_addr == start)
-			break;
-		continue;
-
-err_vm_insert_page_failed:
-		__free_page(page->page_ptr);
-		page->page_ptr = NULL;
-err_alloc_page_failed:
-err_page_ptr_cleared:
-		if (page_addr == start)
-			break;
-	}
-err_no_vma:
-	if (mm) {
-		mmap_write_unlock(mm);
-		mmput(mm);
-	}
-	return vma ? -ENOMEM : -ESRCH;
-}
-
-static inline void binder_alloc_set_vma(struct binder_alloc *alloc,
-		struct vm_area_struct *vma)
-{
-	/* pairs with smp_load_acquire in binder_alloc_get_vma() */
-	smp_store_release(&alloc->vma, vma);
-}
-
-static inline struct vm_area_struct *binder_alloc_get_vma(
-		struct binder_alloc *alloc)
-{
-	/* pairs with smp_store_release in binder_alloc_set_vma() */
-	return smp_load_acquire(&alloc->vma);
-}
-
-static bool debug_low_async_space_locked(struct binder_alloc *alloc, int pid)
-{
-	/*
-	 * Find the amount and size of buffers allocated by the current caller;
-	 * The idea is that once we cross the threshold, whoever is responsible
-	 * for the low async space is likely to try to send another async txn,
-	 * and at some point we'll catch them in the act. This is more efficient
-	 * than keeping a map per pid.
-	 */
-	struct rb_node *n;
-	struct binder_buffer *buffer;
-	size_t total_alloc_size = 0;
-	size_t num_buffers = 0;
-
-	for (n = rb_first(&alloc->allocated_buffers); n != NULL;
-		 n = rb_next(n)) {
-		buffer = rb_entry(n, struct binder_buffer, rb_node);
-		if (buffer->pid != pid)
-			continue;
-		if (!buffer->async_transaction)
-			continue;
-		total_alloc_size += binder_alloc_buffer_size(alloc, buffer)
-			+ sizeof(struct binder_buffer);
-		num_buffers++;
-	}
-
-	/*
-	 * Warn if this pid has more than 50 transactions, or more than 50% of
-	 * async space (which is 25% of total buffer size). Oneway spam is only
-	 * detected when the threshold is exceeded.
-	 */
-	if (num_buffers > 50 || total_alloc_size > alloc->buffer_size / 4) {
-		binder_alloc_debug(BINDER_DEBUG_USER_ERROR,
-			     "%d: pid %d spamming oneway? %zd buffers allocated for a total size of %zd\n",
-			      alloc->pid, pid, num_buffers, total_alloc_size);
-		if (!alloc->oneway_spam_detected) {
-			alloc->oneway_spam_detected = true;
-			return true;
-		}
-	}
-	return false;
-}
-
-static struct binder_buffer *binder_alloc_new_buf_locked(
-				struct binder_alloc *alloc,
-				size_t data_size,
-				size_t offsets_size,
-				size_t extra_buffers_size,
-				int is_async,
-				int pid)
-{
-	struct rb_node *n = alloc->free_buffers.rb_node;
-	struct binder_buffer *buffer;
-	size_t buffer_size;
-	struct rb_node *best_fit = NULL;
-	void __user *has_page_addr;
-	void __user *end_page_addr;
-	size_t size, data_offsets_size;
-	int ret;
-
-	/* Check binder_alloc is fully initialized */
-	if (!binder_alloc_get_vma(alloc)) {
-		binder_alloc_debug(BINDER_DEBUG_USER_ERROR,
-				   "%d: binder_alloc_buf, no vma\n",
-				   alloc->pid);
-		return ERR_PTR(-ESRCH);
-	}
-
-	data_offsets_size = ALIGN(data_size, sizeof(void *)) +
-		ALIGN(offsets_size, sizeof(void *));
-
-	if (data_offsets_size < data_size || data_offsets_size < offsets_size) {
-		binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
-				"%d: got transaction with invalid size %zd-%zd\n",
-				alloc->pid, data_size, offsets_size);
-		return ERR_PTR(-EINVAL);
-	}
-	size = data_offsets_size + ALIGN(extra_buffers_size, sizeof(void *));
-	if (size < data_offsets_size || size < extra_buffers_size) {
-		binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
-				"%d: got transaction with invalid extra_buffers_size %zd\n",
-				alloc->pid, extra_buffers_size);
-		return ERR_PTR(-EINVAL);
-	}
-	if (is_async &&
-	    alloc->free_async_space < size + sizeof(struct binder_buffer)) {
-		binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
-			     "%d: binder_alloc_buf size %zd failed, no async space left\n",
-			      alloc->pid, size);
-		return ERR_PTR(-ENOSPC);
-	}
-
-	/* Pad 0-size buffers so they get assigned unique addresses */
-	size = max(size, sizeof(void *));
-
-	while (n) {
-		buffer = rb_entry(n, struct binder_buffer, rb_node);
-		BUG_ON(!buffer->free);
-		buffer_size = binder_alloc_buffer_size(alloc, buffer);
-
-		if (size < buffer_size) {
-			best_fit = n;
-			n = n->rb_left;
-		} else if (size > buffer_size)
-			n = n->rb_right;
-		else {
-			best_fit = n;
-			break;
-		}
-	}
-	if (best_fit == NULL) {
-		size_t allocated_buffers = 0;
-		size_t largest_alloc_size = 0;
-		size_t total_alloc_size = 0;
-		size_t free_buffers = 0;
-		size_t largest_free_size = 0;
-		size_t total_free_size = 0;
-
-		for (n = rb_first(&alloc->allocated_buffers); n != NULL;
-		     n = rb_next(n)) {
-			buffer = rb_entry(n, struct binder_buffer, rb_node);
-			buffer_size = binder_alloc_buffer_size(alloc, buffer);
-			allocated_buffers++;
-			total_alloc_size += buffer_size;
-			if (buffer_size > largest_alloc_size)
-				largest_alloc_size = buffer_size;
-		}
-		for (n = rb_first(&alloc->free_buffers); n != NULL;
-		     n = rb_next(n)) {
-			buffer = rb_entry(n, struct binder_buffer, rb_node);
-			buffer_size = binder_alloc_buffer_size(alloc, buffer);
-			free_buffers++;
-			total_free_size += buffer_size;
-			if (buffer_size > largest_free_size)
-				largest_free_size = buffer_size;
-		}
-		binder_alloc_debug(BINDER_DEBUG_USER_ERROR,
-				   "%d: binder_alloc_buf size %zd failed, no address space\n",
-				   alloc->pid, size);
-		binder_alloc_debug(BINDER_DEBUG_USER_ERROR,
-				   "allocated: %zd (num: %zd largest: %zd), free: %zd (num: %zd largest: %zd)\n",
-				   total_alloc_size, allocated_buffers,
-				   largest_alloc_size, total_free_size,
-				   free_buffers, largest_free_size);
-		return ERR_PTR(-ENOSPC);
-	}
-	if (n == NULL) {
-		buffer = rb_entry(best_fit, struct binder_buffer, rb_node);
-		buffer_size = binder_alloc_buffer_size(alloc, buffer);
-	}
-
-	binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
-		     "%d: binder_alloc_buf size %zd got buffer %pK size %zd\n",
-		      alloc->pid, size, buffer, buffer_size);
-
-	has_page_addr = (void __user *)
-		(((uintptr_t)buffer->user_data + buffer_size) & PAGE_MASK);
-	WARN_ON(n && buffer_size != size);
-	end_page_addr =
-		(void __user *)PAGE_ALIGN((uintptr_t)buffer->user_data + size);
-	if (end_page_addr > has_page_addr)
-		end_page_addr = has_page_addr;
-	ret = binder_update_page_range(alloc, 1, (void __user *)
-		PAGE_ALIGN((uintptr_t)buffer->user_data), end_page_addr);
-	if (ret)
-		return ERR_PTR(ret);
-
-	if (buffer_size != size) {
-		struct binder_buffer *new_buffer;
-
-		new_buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
-		if (!new_buffer) {
-			pr_err("%s: %d failed to alloc new buffer struct\n",
-			       __func__, alloc->pid);
-			goto err_alloc_buf_struct_failed;
-		}
-		new_buffer->user_data = (u8 __user *)buffer->user_data + size;
-		list_add(&new_buffer->entry, &buffer->entry);
-		new_buffer->free = 1;
-		binder_insert_free_buffer(alloc, new_buffer);
-	}
-
-	rb_erase(best_fit, &alloc->free_buffers);
-	buffer->free = 0;
-	buffer->allow_user_free = 0;
-	binder_insert_allocated_buffer_locked(alloc, buffer);
-	binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
-		     "%d: binder_alloc_buf size %zd got %pK\n",
-		      alloc->pid, size, buffer);
-	buffer->data_size = data_size;
-	buffer->offsets_size = offsets_size;
-	buffer->async_transaction = is_async;
-	buffer->extra_buffers_size = extra_buffers_size;
-	buffer->pid = pid;
-	buffer->oneway_spam_suspect = false;
-	if (is_async) {
-		alloc->free_async_space -= size + sizeof(struct binder_buffer);
-		binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC_ASYNC,
-			     "%d: binder_alloc_buf size %zd async free %zd\n",
-			      alloc->pid, size, alloc->free_async_space);
-		if (alloc->free_async_space < alloc->buffer_size / 10) {
-			/*
-			 * Start detecting spammers once we have less than 20%
-			 * of async space left (which is less than 10% of total
-			 * buffer size).
-			 */
-			buffer->oneway_spam_suspect = debug_low_async_space_locked(alloc, pid);
-		} else {
-			alloc->oneway_spam_detected = false;
-		}
-	}
-	return buffer;
-
-err_alloc_buf_struct_failed:
-	binder_update_page_range(alloc, 0, (void __user *)
-				 PAGE_ALIGN((uintptr_t)buffer->user_data),
-				 end_page_addr);
-	return ERR_PTR(-ENOMEM);
-}
-
-/**
- * binder_alloc_new_buf() - Allocate a new binder buffer
- * @alloc:              binder_alloc for this proc
- * @data_size:          size of user data buffer
- * @offsets_size:       user specified buffer offset
- * @extra_buffers_size: size of extra space for meta-data (eg, security context)
- * @is_async:           buffer for async transaction
- * @pid:				pid to attribute allocation to (used for debugging)
- *
- * Allocate a new buffer given the requested sizes. Returns
- * the kernel version of the buffer pointer. The size allocated
- * is the sum of the three given sizes (each rounded up to
- * pointer-sized boundary)
- *
- * Return:	The allocated buffer or %NULL if error
- */
-struct binder_buffer *binder_alloc_new_buf(struct binder_alloc *alloc,
-					   size_t data_size,
-					   size_t offsets_size,
-					   size_t extra_buffers_size,
-					   int is_async,
-					   int pid)
-{
-	struct binder_buffer *buffer;
-
-	mutex_lock(&alloc->mutex);
-	buffer = binder_alloc_new_buf_locked(alloc, data_size, offsets_size,
-					     extra_buffers_size, is_async, pid);
-	mutex_unlock(&alloc->mutex);
-	return buffer;
-}
-
-static void __user *buffer_start_page(struct binder_buffer *buffer)
-{
-	return (void __user *)((uintptr_t)buffer->user_data & PAGE_MASK);
-}
-
-static void __user *prev_buffer_end_page(struct binder_buffer *buffer)
-{
-	return (void __user *)
-		(((uintptr_t)(buffer->user_data) - 1) & PAGE_MASK);
-}
-
-static void binder_delete_free_buffer(struct binder_alloc *alloc,
-				      struct binder_buffer *buffer)
-{
-	struct binder_buffer *prev, *next = NULL;
-	bool to_free = true;
-
-	BUG_ON(alloc->buffers.next == &buffer->entry);
-	prev = binder_buffer_prev(buffer);
-	BUG_ON(!prev->free);
-	if (prev_buffer_end_page(prev) == buffer_start_page(buffer)) {
-		to_free = false;
-		binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
-				   "%d: merge free, buffer %pK share page with %pK\n",
-				   alloc->pid, buffer->user_data,
-				   prev->user_data);
-	}
-
-	if (!list_is_last(&buffer->entry, &alloc->buffers)) {
-		next = binder_buffer_next(buffer);
-		if (buffer_start_page(next) == buffer_start_page(buffer)) {
-			to_free = false;
-			binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
-					   "%d: merge free, buffer %pK share page with %pK\n",
-					   alloc->pid,
-					   buffer->user_data,
-					   next->user_data);
-		}
-	}
-
-	if (PAGE_ALIGNED(buffer->user_data)) {
-		binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
-				   "%d: merge free, buffer start %pK is page aligned\n",
-				   alloc->pid, buffer->user_data);
-		to_free = false;
-	}
-
-	if (to_free) {
-		binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
-				   "%d: merge free, buffer %pK do not share page with %pK or %pK\n",
-				   alloc->pid, buffer->user_data,
-				   prev->user_data,
-				   next ? next->user_data : NULL);
-		binder_update_page_range(alloc, 0, buffer_start_page(buffer),
-					 buffer_start_page(buffer) + PAGE_SIZE);
-	}
-	list_del(&buffer->entry);
-	kfree(buffer);
-}
-
-static void binder_free_buf_locked(struct binder_alloc *alloc,
-				   struct binder_buffer *buffer)
-{
-	size_t size, buffer_size;
-
-	buffer_size = binder_alloc_buffer_size(alloc, buffer);
-
-	size = ALIGN(buffer->data_size, sizeof(void *)) +
-		ALIGN(buffer->offsets_size, sizeof(void *)) +
-		ALIGN(buffer->extra_buffers_size, sizeof(void *));
-
-	binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
-		     "%d: binder_free_buf %pK size %zd buffer_size %zd\n",
-		      alloc->pid, buffer, size, buffer_size);
-
-	BUG_ON(buffer->free);
-	BUG_ON(size > buffer_size);
-	BUG_ON(buffer->transaction != NULL);
-	BUG_ON(buffer->user_data < alloc->buffer);
-	BUG_ON(buffer->user_data > alloc->buffer + alloc->buffer_size);
-
-	if (buffer->async_transaction) {
-		alloc->free_async_space += buffer_size + sizeof(struct binder_buffer);
-
-		binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC_ASYNC,
-			     "%d: binder_free_buf size %zd async free %zd\n",
-			      alloc->pid, size, alloc->free_async_space);
-	}
-
-	binder_update_page_range(alloc, 0,
-		(void __user *)PAGE_ALIGN((uintptr_t)buffer->user_data),
-		(void __user *)(((uintptr_t)
-			  buffer->user_data + buffer_size) & PAGE_MASK));
-
-	rb_erase(&buffer->rb_node, &alloc->allocated_buffers);
-	buffer->free = 1;
-	if (!list_is_last(&buffer->entry, &alloc->buffers)) {
-		struct binder_buffer *next = binder_buffer_next(buffer);
-
-		if (next->free) {
-			rb_erase(&next->rb_node, &alloc->free_buffers);
-			binder_delete_free_buffer(alloc, next);
-		}
-	}
-	if (alloc->buffers.next != &buffer->entry) {
-		struct binder_buffer *prev = binder_buffer_prev(buffer);
-
-		if (prev->free) {
-			binder_delete_free_buffer(alloc, buffer);
-			rb_erase(&prev->rb_node, &alloc->free_buffers);
-			buffer = prev;
-		}
-	}
-	binder_insert_free_buffer(alloc, buffer);
-}
-
-static void binder_alloc_clear_buf(struct binder_alloc *alloc,
-				   struct binder_buffer *buffer);
-/**
- * binder_alloc_free_buf() - free a binder buffer
- * @alloc:	binder_alloc for this proc
- * @buffer:	kernel pointer to buffer
- *
- * Free the buffer allocated via binder_alloc_new_buf()
- */
-void binder_alloc_free_buf(struct binder_alloc *alloc,
-			    struct binder_buffer *buffer)
-{
-	/*
-	 * We could eliminate the call to binder_alloc_clear_buf()
-	 * from binder_alloc_deferred_release() by moving this to
-	 * binder_alloc_free_buf_locked(). However, that could
-	 * increase contention for the alloc mutex if clear_on_free
-	 * is used frequently for large buffers. The mutex is not
-	 * needed for correctness here.
-	 */
-	if (buffer->clear_on_free) {
-		binder_alloc_clear_buf(alloc, buffer);
-		buffer->clear_on_free = false;
-	}
-	mutex_lock(&alloc->mutex);
-	binder_free_buf_locked(alloc, buffer);
-	mutex_unlock(&alloc->mutex);
-}
-
-/**
- * binder_alloc_mmap_handler() - map virtual address space for proc
- * @alloc:	alloc structure for this proc
- * @vma:	vma passed to mmap()
- *
- * Called by binder_mmap() to initialize the space specified in
- * vma for allocating binder buffers
- *
- * Return:
- *      0 = success
- *      -EBUSY = address space already mapped
- *      -ENOMEM = failed to map memory to given address space
- */
-int binder_alloc_mmap_handler(struct binder_alloc *alloc,
-			      struct vm_area_struct *vma)
-{
-	int ret;
-	const char *failure_string;
-	struct binder_buffer *buffer;
-
-	if (unlikely(vma->vm_mm != alloc->mm)) {
-		ret = -EINVAL;
-		failure_string = "invalid vma->vm_mm";
-		goto err_invalid_mm;
-	}
-
-	mutex_lock(&binder_alloc_mmap_lock);
-	if (alloc->buffer_size) {
-		ret = -EBUSY;
-		failure_string = "already mapped";
-		goto err_already_mapped;
-	}
-	alloc->buffer_size = min_t(unsigned long, vma->vm_end - vma->vm_start,
-				   SZ_4M);
-	mutex_unlock(&binder_alloc_mmap_lock);
-
-	alloc->buffer = (void __user *)vma->vm_start;
-
-	alloc->pages = kcalloc(alloc->buffer_size / PAGE_SIZE,
-			       sizeof(alloc->pages[0]),
-			       GFP_KERNEL);
-	if (alloc->pages == NULL) {
-		ret = -ENOMEM;
-		failure_string = "alloc page array";
-		goto err_alloc_pages_failed;
-	}
-
-	buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
-	if (!buffer) {
-		ret = -ENOMEM;
-		failure_string = "alloc buffer struct";
-		goto err_alloc_buf_struct_failed;
-	}
-
-	buffer->user_data = alloc->buffer;
-	list_add(&buffer->entry, &alloc->buffers);
-	buffer->free = 1;
-	binder_insert_free_buffer(alloc, buffer);
-	alloc->free_async_space = alloc->buffer_size / 2;
-
-	/* Signal binder_alloc is fully initialized */
-	binder_alloc_set_vma(alloc, vma);
-
-	return 0;
-
-err_alloc_buf_struct_failed:
-	kfree(alloc->pages);
-	alloc->pages = NULL;
-err_alloc_pages_failed:
-	alloc->buffer = NULL;
-	mutex_lock(&binder_alloc_mmap_lock);
-	alloc->buffer_size = 0;
-err_already_mapped:
-	mutex_unlock(&binder_alloc_mmap_lock);
-err_invalid_mm:
-	binder_alloc_debug(BINDER_DEBUG_USER_ERROR,
-			   "%s: %d %lx-%lx %s failed %d\n", __func__,
-			   alloc->pid, vma->vm_start, vma->vm_end,
-			   failure_string, ret);
-	return ret;
-}
-
-
-void binder_alloc_deferred_release(struct binder_alloc *alloc)
-{
-	struct rb_node *n;
-	int buffers, page_count;
-	struct binder_buffer *buffer;
-
-	buffers = 0;
-	mutex_lock(&alloc->mutex);
-	BUG_ON(alloc->vma);
-
-	while ((n = rb_first(&alloc->allocated_buffers))) {
-		buffer = rb_entry(n, struct binder_buffer, rb_node);
-
-		/* Transaction should already have been freed */
-		BUG_ON(buffer->transaction);
-
-		if (buffer->clear_on_free) {
-			binder_alloc_clear_buf(alloc, buffer);
-			buffer->clear_on_free = false;
-		}
-		binder_free_buf_locked(alloc, buffer);
-		buffers++;
-	}
-
-	while (!list_empty(&alloc->buffers)) {
-		buffer = list_first_entry(&alloc->buffers,
-					  struct binder_buffer, entry);
-		WARN_ON(!buffer->free);
-
-		list_del(&buffer->entry);
-		WARN_ON_ONCE(!list_empty(&alloc->buffers));
-		kfree(buffer);
-	}
-
-	page_count = 0;
-	if (alloc->pages) {
-		int i;
-
-		for (i = 0; i < alloc->buffer_size / PAGE_SIZE; i++) {
-			void __user *page_addr;
-			bool on_lru;
-
-			if (!alloc->pages[i].page_ptr)
-				continue;
-
-			on_lru = list_lru_del(&binder_alloc_lru,
-					      &alloc->pages[i].lru);
-			page_addr = alloc->buffer + i * PAGE_SIZE;
-			binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
-				     "%s: %d: page %d at %pK %s\n",
-				     __func__, alloc->pid, i, page_addr,
-				     on_lru ? "on lru" : "active");
-			__free_page(alloc->pages[i].page_ptr);
-			page_count++;
-		}
-		kfree(alloc->pages);
-	}
-	mutex_unlock(&alloc->mutex);
-	if (alloc->mm)
-		mmdrop(alloc->mm);
-
-	binder_alloc_debug(BINDER_DEBUG_OPEN_CLOSE,
-		     "%s: %d buffers %d, pages %d\n",
-		     __func__, alloc->pid, buffers, page_count);
-}
-
-static void print_binder_buffer(struct seq_file *m, const char *prefix,
-				struct binder_buffer *buffer)
-{
-	seq_printf(m, "%s %d: %pK size %zd:%zd:%zd %s\n",
-		   prefix, buffer->debug_id, buffer->user_data,
-		   buffer->data_size, buffer->offsets_size,
-		   buffer->extra_buffers_size,
-		   buffer->transaction ? "active" : "delivered");
-}
-
-/**
- * binder_alloc_print_allocated() - print buffer info
- * @m:     seq_file for output via seq_printf()
- * @alloc: binder_alloc for this proc
- *
- * Prints information about every buffer associated with
- * the binder_alloc state to the given seq_file
- */
-void binder_alloc_print_allocated(struct seq_file *m,
-				  struct binder_alloc *alloc)
-{
-	struct rb_node *n;
-
-	mutex_lock(&alloc->mutex);
-	for (n = rb_first(&alloc->allocated_buffers); n != NULL; n = rb_next(n))
-		print_binder_buffer(m, "  buffer",
-				    rb_entry(n, struct binder_buffer, rb_node));
-	mutex_unlock(&alloc->mutex);
-}
-
-/**
- * binder_alloc_print_pages() - print page usage
- * @m:     seq_file for output via seq_printf()
- * @alloc: binder_alloc for this proc
- */
-void binder_alloc_print_pages(struct seq_file *m,
-			      struct binder_alloc *alloc)
-{
-	struct binder_lru_page *page;
-	int i;
-	int active = 0;
-	int lru = 0;
-	int free = 0;
-
-	mutex_lock(&alloc->mutex);
-	/*
-	 * Make sure the binder_alloc is fully initialized, otherwise we might
-	 * read inconsistent state.
-	 */
-	if (binder_alloc_get_vma(alloc) != NULL) {
-		for (i = 0; i < alloc->buffer_size / PAGE_SIZE; i++) {
-			page = &alloc->pages[i];
-			if (!page->page_ptr)
-				free++;
-			else if (list_empty(&page->lru))
-				active++;
-			else
-				lru++;
-		}
-	}
-	mutex_unlock(&alloc->mutex);
-	seq_printf(m, "  pages: %d:%d:%d\n", active, lru, free);
-	seq_printf(m, "  pages high watermark: %zu\n", alloc->pages_high);
-}
-
-/**
- * binder_alloc_get_allocated_count() - return count of buffers
- * @alloc: binder_alloc for this proc
- *
- * Return: count of allocated buffers
- */
-int binder_alloc_get_allocated_count(struct binder_alloc *alloc)
-{
-	struct rb_node *n;
-	int count = 0;
-
-	mutex_lock(&alloc->mutex);
-	for (n = rb_first(&alloc->allocated_buffers); n != NULL; n = rb_next(n))
-		count++;
-	mutex_unlock(&alloc->mutex);
-	return count;
-}
-
-
-/**
- * binder_alloc_vma_close() - invalidate address space
- * @alloc: binder_alloc for this proc
- *
- * Called from binder_vma_close() when releasing address space.
- * Clears alloc->vma to prevent new incoming transactions from
- * allocating more buffers.
- */
-void binder_alloc_vma_close(struct binder_alloc *alloc)
-{
-	binder_alloc_set_vma(alloc, NULL);
-}
-
-/**
- * binder_alloc_free_page() - shrinker callback to free pages
- * @item:   item to free
- * @lock:   lock protecting the item
- * @cb_arg: callback argument
- *
- * Called from list_lru_walk() in binder_shrink_scan() to free
- * up pages when the system is under memory pressure.
- */
-enum lru_status binder_alloc_free_page(struct list_head *item,
-				       struct list_lru_one *lru,
-				       spinlock_t *lock,
-				       void *cb_arg)
-	__must_hold(lock)
-{
-	struct mm_struct *mm = NULL;
-	struct binder_lru_page *page = container_of(item,
-						    struct binder_lru_page,
-						    lru);
-	struct binder_alloc *alloc;
-	uintptr_t page_addr;
-	size_t index;
-	struct vm_area_struct *vma;
-
-	alloc = page->alloc;
-	if (!mutex_trylock(&alloc->mutex))
-		goto err_get_alloc_mutex_failed;
-
-	if (!page->page_ptr)
-		goto err_page_already_freed;
-
-	index = page - alloc->pages;
-	page_addr = (uintptr_t)alloc->buffer + index * PAGE_SIZE;
-
-	mm = alloc->mm;
-	if (!mmget_not_zero(mm))
-		goto err_mmget;
-	if (!mmap_read_trylock(mm))
-		goto err_mmap_read_lock_failed;
-	vma = binder_alloc_get_vma(alloc);
-
-	list_lru_isolate(lru, item);
-	spin_unlock(lock);
-
-	if (vma) {
-		trace_binder_unmap_user_start(alloc, index);
-
-		zap_page_range_single(vma, page_addr, PAGE_SIZE, NULL);
-
-		trace_binder_unmap_user_end(alloc, index);
-	}
-	mmap_read_unlock(mm);
-	mmput_async(mm);
-
-	trace_binder_unmap_kernel_start(alloc, index);
-
-	__free_page(page->page_ptr);
-	page->page_ptr = NULL;
-
-	trace_binder_unmap_kernel_end(alloc, index);
-
-	spin_lock(lock);
-	mutex_unlock(&alloc->mutex);
-	return LRU_REMOVED_RETRY;
-
-err_mmap_read_lock_failed:
-	mmput_async(mm);
-err_mmget:
-err_page_already_freed:
-	mutex_unlock(&alloc->mutex);
-err_get_alloc_mutex_failed:
-	return LRU_SKIP;
-}
-
-static unsigned long
-binder_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
-{
-	return list_lru_count(&binder_alloc_lru);
-}
-
-static unsigned long
-binder_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
-{
-	return list_lru_walk(&binder_alloc_lru, binder_alloc_free_page,
-			    NULL, sc->nr_to_scan);
-}
-
-static struct shrinker binder_shrinker = {
-	.count_objects = binder_shrink_count,
-	.scan_objects = binder_shrink_scan,
-	.seeks = DEFAULT_SEEKS,
-};
-
-/**
- * binder_alloc_init() - called by binder_open() for per-proc initialization
- * @alloc: binder_alloc for this proc
- *
- * Called from binder_open() to initialize binder_alloc fields for
- * new binder proc
- */
-void binder_alloc_init(struct binder_alloc *alloc)
-{
-	alloc->pid = current->group_leader->pid;
-	alloc->mm = current->mm;
-	mmgrab(alloc->mm);
-	mutex_init(&alloc->mutex);
-	INIT_LIST_HEAD(&alloc->buffers);
-}
-
-int binder_alloc_shrinker_init(void)
-{
-	int ret = list_lru_init(&binder_alloc_lru);
-
-	if (ret == 0) {
-		ret = register_shrinker(&binder_shrinker, "android-binder");
-		if (ret)
-			list_lru_destroy(&binder_alloc_lru);
-	}
-	return ret;
-}
-
-void binder_alloc_shrinker_exit(void)
-{
-	unregister_shrinker(&binder_shrinker);
-	list_lru_destroy(&binder_alloc_lru);
-}
-
-/**
- * check_buffer() - verify that buffer/offset is safe to access
- * @alloc: binder_alloc for this proc
- * @buffer: binder buffer to be accessed
- * @offset: offset into @buffer data
- * @bytes: bytes to access from offset
- *
- * Check that the @offset/@bytes are within the size of the given
- * @buffer and that the buffer is currently active and not freeable.
- * Offsets must also be multiples of sizeof(u32). The kernel is
- * allowed to touch the buffer in two cases:
- *
- * 1) when the buffer is being created:
- *     (buffer->free == 0 && buffer->allow_user_free == 0)
- * 2) when the buffer is being torn down:
- *     (buffer->free == 0 && buffer->transaction == NULL).
- *
- * Return: true if the buffer is safe to access
- */
-static inline bool check_buffer(struct binder_alloc *alloc,
-				struct binder_buffer *buffer,
-				binder_size_t offset, size_t bytes)
-{
-	size_t buffer_size = binder_alloc_buffer_size(alloc, buffer);
-
-	return buffer_size >= bytes &&
-		offset <= buffer_size - bytes &&
-		IS_ALIGNED(offset, sizeof(u32)) &&
-		!buffer->free &&
-		(!buffer->allow_user_free || !buffer->transaction);
-}
-
-/**
- * binder_alloc_get_page() - get kernel pointer for given buffer offset
- * @alloc: binder_alloc for this proc
- * @buffer: binder buffer to be accessed
- * @buffer_offset: offset into @buffer data
- * @pgoffp: address to copy final page offset to
- *
- * Lookup the struct page corresponding to the address
- * at @buffer_offset into @buffer->user_data. If @pgoffp is not
- * NULL, the byte-offset into the page is written there.
- *
- * The caller is responsible to ensure that the offset points
- * to a valid address within the @buffer and that @buffer is
- * not freeable by the user. Since it can't be freed, we are
- * guaranteed that the corresponding elements of @alloc->pages[]
- * cannot change.
- *
- * Return: struct page
- */
-static struct page *binder_alloc_get_page(struct binder_alloc *alloc,
-					  struct binder_buffer *buffer,
-					  binder_size_t buffer_offset,
-					  pgoff_t *pgoffp)
-{
-	binder_size_t buffer_space_offset = buffer_offset +
-		(buffer->user_data - alloc->buffer);
-	pgoff_t pgoff = buffer_space_offset & ~PAGE_MASK;
-	size_t index = buffer_space_offset >> PAGE_SHIFT;
-	struct binder_lru_page *lru_page;
-
-	lru_page = &alloc->pages[index];
-	*pgoffp = pgoff;
-	return lru_page->page_ptr;
-}
-
-/**
- * binder_alloc_clear_buf() - zero out buffer
- * @alloc: binder_alloc for this proc
- * @buffer: binder buffer to be cleared
- *
- * memset the given buffer to 0
- */
-static void binder_alloc_clear_buf(struct binder_alloc *alloc,
-				   struct binder_buffer *buffer)
-{
-	size_t bytes = binder_alloc_buffer_size(alloc, buffer);
-	binder_size_t buffer_offset = 0;
-
-	while (bytes) {
-		unsigned long size;
-		struct page *page;
-		pgoff_t pgoff;
-
-		page = binder_alloc_get_page(alloc, buffer,
-					     buffer_offset, &pgoff);
-		size = min_t(size_t, bytes, PAGE_SIZE - pgoff);
-		memset_page(page, pgoff, 0, size);
-		bytes -= size;
-		buffer_offset += size;
-	}
-}
-
-/**
- * binder_alloc_copy_user_to_buffer() - copy src user to tgt user
- * @alloc: binder_alloc for this proc
- * @buffer: binder buffer to be accessed
- * @buffer_offset: offset into @buffer data
- * @from: userspace pointer to source buffer
- * @bytes: bytes to copy
- *
- * Copy bytes from source userspace to target buffer.
- *
- * Return: bytes remaining to be copied
- */
-unsigned long
-binder_alloc_copy_user_to_buffer(struct binder_alloc *alloc,
-				 struct binder_buffer *buffer,
-				 binder_size_t buffer_offset,
-				 const void __user *from,
-				 size_t bytes)
-{
-	if (!check_buffer(alloc, buffer, buffer_offset, bytes))
-		return bytes;
-
-	while (bytes) {
-		unsigned long size;
-		unsigned long ret;
-		struct page *page;
-		pgoff_t pgoff;
-		void *kptr;
-
-		page = binder_alloc_get_page(alloc, buffer,
-					     buffer_offset, &pgoff);
-		size = min_t(size_t, bytes, PAGE_SIZE - pgoff);
-		kptr = kmap_local_page(page) + pgoff;
-		ret = copy_from_user(kptr, from, size);
-		kunmap_local(kptr);
-		if (ret)
-			return bytes - size + ret;
-		bytes -= size;
-		from += size;
-		buffer_offset += size;
-	}
-	return 0;
-}
-
-static int binder_alloc_do_buffer_copy(struct binder_alloc *alloc,
-				       bool to_buffer,
-				       struct binder_buffer *buffer,
-				       binder_size_t buffer_offset,
-				       void *ptr,
-				       size_t bytes)
-{
-	/* All copies must be 32-bit aligned and 32-bit size */
-	if (!check_buffer(alloc, buffer, buffer_offset, bytes))
-		return -EINVAL;
-
-	while (bytes) {
-		unsigned long size;
-		struct page *page;
-		pgoff_t pgoff;
-
-		page = binder_alloc_get_page(alloc, buffer,
-					     buffer_offset, &pgoff);
-		size = min_t(size_t, bytes, PAGE_SIZE - pgoff);
-		if (to_buffer)
-			memcpy_to_page(page, pgoff, ptr, size);
-		else
-			memcpy_from_page(ptr, page, pgoff, size);
-		bytes -= size;
-		pgoff = 0;
-		ptr = ptr + size;
-		buffer_offset += size;
-	}
-	return 0;
-}
-
-int binder_alloc_copy_to_buffer(struct binder_alloc *alloc,
-				struct binder_buffer *buffer,
-				binder_size_t buffer_offset,
-				void *src,
-				size_t bytes)
-{
-	return binder_alloc_do_buffer_copy(alloc, true, buffer, buffer_offset,
-					   src, bytes);
-}
-
-int binder_alloc_copy_from_buffer(struct binder_alloc *alloc,
-				  void *dest,
-				  struct binder_buffer *buffer,
-				  binder_size_t buffer_offset,
-				  size_t bytes)
-{
-	return binder_alloc_do_buffer_copy(alloc, false, buffer, buffer_offset,
-					   dest, bytes);
-}
-
diff --git a/drivers/android/binderfs.c b/drivers/android/binderfs.c
deleted file mode 100644
index 420dc9cbf774..000000000000
--- a/drivers/android/binderfs.c
+++ /dev/null
@@ -1,827 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-
-#include <linux/compiler_types.h>
-#include <linux/errno.h>
-#include <linux/fs.h>
-#include <linux/fsnotify.h>
-#include <linux/gfp.h>
-#include <linux/idr.h>
-#include <linux/init.h>
-#include <linux/ipc_namespace.h>
-#include <linux/kdev_t.h>
-#include <linux/kernel.h>
-#include <linux/list.h>
-#include <linux/namei.h>
-#include <linux/magic.h>
-#include <linux/major.h>
-#include <linux/miscdevice.h>
-#include <linux/module.h>
-#include <linux/mutex.h>
-#include <linux/mount.h>
-#include <linux/fs_parser.h>
-#include <linux/sched.h>
-#include <linux/seq_file.h>
-#include <linux/slab.h>
-#include <linux/spinlock_types.h>
-#include <linux/stddef.h>
-#include <linux/string.h>
-#include <linux/types.h>
-#include <linux/uaccess.h>
-#include <linux/user_namespace.h>
-#include <linux/xarray.h>
-#include <uapi/asm-generic/errno-base.h>
-#include <uapi/linux/android/binder.h>
-#include <uapi/linux/android/binderfs.h>
-
-#include "binder_internal.h"
-
-#define FIRST_INODE 1
-#define SECOND_INODE 2
-#define INODE_OFFSET 3
-#define BINDERFS_MAX_MINOR (1U << MINORBITS)
-/* Ensure that the initial ipc namespace always has devices available. */
-#define BINDERFS_MAX_MINOR_CAPPED (BINDERFS_MAX_MINOR - 4)
-
-static dev_t binderfs_dev;
-static DEFINE_MUTEX(binderfs_minors_mutex);
-static DEFINE_IDA(binderfs_minors);
-
-enum binderfs_param {
-	Opt_max,
-	Opt_stats_mode,
-};
-
-enum binderfs_stats_mode {
-	binderfs_stats_mode_unset,
-	binderfs_stats_mode_global,
-};
-
-struct binder_features {
-	bool oneway_spam_detection;
-	bool extended_error;
-};
-
-static const struct constant_table binderfs_param_stats[] = {
-	{ "global", binderfs_stats_mode_global },
-	{}
-};
-
-static const struct fs_parameter_spec binderfs_fs_parameters[] = {
-	fsparam_u32("max",	Opt_max),
-	fsparam_enum("stats",	Opt_stats_mode, binderfs_param_stats),
-	{}
-};
-
-static struct binder_features binder_features = {
-	.oneway_spam_detection = true,
-	.extended_error = true,
-};
-
-static inline struct binderfs_info *BINDERFS_SB(const struct super_block *sb)
-{
-	return sb->s_fs_info;
-}
-
-bool is_binderfs_device(const struct inode *inode)
-{
-	if (inode->i_sb->s_magic == BINDERFS_SUPER_MAGIC)
-		return true;
-
-	return false;
-}
-
-/**
- * binderfs_binder_device_create - allocate inode from super block of a
- *                                 binderfs mount
- * @ref_inode: inode from wich the super block will be taken
- * @userp:     buffer to copy information about new device for userspace to
- * @req:       struct binderfs_device as copied from userspace
- *
- * This function allocates a new binder_device and reserves a new minor
- * number for it.
- * Minor numbers are limited and tracked globally in binderfs_minors. The
- * function will stash a struct binder_device for the specific binder
- * device in i_private of the inode.
- * It will go on to allocate a new inode from the super block of the
- * filesystem mount, stash a struct binder_device in its i_private field
- * and attach a dentry to that inode.
- *
- * Return: 0 on success, negative errno on failure
- */
-static int binderfs_binder_device_create(struct inode *ref_inode,
-					 struct binderfs_device __user *userp,
-					 struct binderfs_device *req)
-{
-	int minor, ret;
-	struct dentry *dentry, *root;
-	struct binder_device *device;
-	char *name = NULL;
-	size_t name_len;
-	struct inode *inode = NULL;
-	struct super_block *sb = ref_inode->i_sb;
-	struct binderfs_info *info = sb->s_fs_info;
-#if defined(CONFIG_IPC_NS)
-	bool use_reserve = (info->ipc_ns == &init_ipc_ns);
-#else
-	bool use_reserve = true;
-#endif
-
-	/* Reserve new minor number for the new device. */
-	mutex_lock(&binderfs_minors_mutex);
-	if (++info->device_count <= info->mount_opts.max)
-		minor = ida_alloc_max(&binderfs_minors,
-				      use_reserve ? BINDERFS_MAX_MINOR :
-						    BINDERFS_MAX_MINOR_CAPPED,
-				      GFP_KERNEL);
-	else
-		minor = -ENOSPC;
-	if (minor < 0) {
-		--info->device_count;
-		mutex_unlock(&binderfs_minors_mutex);
-		return minor;
-	}
-	mutex_unlock(&binderfs_minors_mutex);
-
-	ret = -ENOMEM;
-	device = kzalloc(sizeof(*device), GFP_KERNEL);
-	if (!device)
-		goto err;
-
-	inode = new_inode(sb);
-	if (!inode)
-		goto err;
-
-	inode->i_ino = minor + INODE_OFFSET;
-	simple_inode_init_ts(inode);
-	init_special_inode(inode, S_IFCHR | 0600,
-			   MKDEV(MAJOR(binderfs_dev), minor));
-	inode->i_fop = &binder_fops;
-	inode->i_uid = info->root_uid;
-	inode->i_gid = info->root_gid;
-
-	req->name[BINDERFS_MAX_NAME] = '\0'; /* NUL-terminate */
-	name_len = strlen(req->name);
-	/* Make sure to include terminating NUL byte */
-	name = kmemdup(req->name, name_len + 1, GFP_KERNEL);
-	if (!name)
-		goto err;
-
-	refcount_set(&device->ref, 1);
-	device->binderfs_inode = inode;
-	device->context.binder_context_mgr_uid = INVALID_UID;
-	device->context.name = name;
-	device->miscdev.name = name;
-	device->miscdev.minor = minor;
-	mutex_init(&device->context.context_mgr_node_lock);
-
-	req->major = MAJOR(binderfs_dev);
-	req->minor = minor;
-
-	if (userp && copy_to_user(userp, req, sizeof(*req))) {
-		ret = -EFAULT;
-		goto err;
-	}
-
-	root = sb->s_root;
-	inode_lock(d_inode(root));
-
-	/* look it up */
-	dentry = lookup_one_len(name, root, name_len);
-	if (IS_ERR(dentry)) {
-		inode_unlock(d_inode(root));
-		ret = PTR_ERR(dentry);
-		goto err;
-	}
-
-	if (d_really_is_positive(dentry)) {
-		/* already exists */
-		dput(dentry);
-		inode_unlock(d_inode(root));
-		ret = -EEXIST;
-		goto err;
-	}
-
-	inode->i_private = device;
-	d_instantiate(dentry, inode);
-	fsnotify_create(root->d_inode, dentry);
-	inode_unlock(d_inode(root));
-
-	return 0;
-
-err:
-	kfree(name);
-	kfree(device);
-	mutex_lock(&binderfs_minors_mutex);
-	--info->device_count;
-	ida_free(&binderfs_minors, minor);
-	mutex_unlock(&binderfs_minors_mutex);
-	iput(inode);
-
-	return ret;
-}
-
-/**
- * binder_ctl_ioctl - handle binder device node allocation requests
- *
- * The request handler for the binder-control device. All requests operate on
- * the binderfs mount the binder-control device resides in:
- * - BINDER_CTL_ADD
- *   Allocate a new binder device.
- *
- * Return: %0 on success, negative errno on failure.
- */
-static long binder_ctl_ioctl(struct file *file, unsigned int cmd,
-			     unsigned long arg)
-{
-	int ret = -EINVAL;
-	struct inode *inode = file_inode(file);
-	struct binderfs_device __user *device = (struct binderfs_device __user *)arg;
-	struct binderfs_device device_req;
-
-	switch (cmd) {
-	case BINDER_CTL_ADD:
-		ret = copy_from_user(&device_req, device, sizeof(device_req));
-		if (ret) {
-			ret = -EFAULT;
-			break;
-		}
-
-		ret = binderfs_binder_device_create(inode, device, &device_req);
-		break;
-	default:
-		break;
-	}
-
-	return ret;
-}
-
-static void binderfs_evict_inode(struct inode *inode)
-{
-	struct binder_device *device = inode->i_private;
-	struct binderfs_info *info = BINDERFS_SB(inode->i_sb);
-
-	clear_inode(inode);
-
-	if (!S_ISCHR(inode->i_mode) || !device)
-		return;
-
-	mutex_lock(&binderfs_minors_mutex);
-	--info->device_count;
-	ida_free(&binderfs_minors, device->miscdev.minor);
-	mutex_unlock(&binderfs_minors_mutex);
-
-	if (refcount_dec_and_test(&device->ref)) {
-		kfree(device->context.name);
-		kfree(device);
-	}
-}
-
-static int binderfs_fs_context_parse_param(struct fs_context *fc,
-					   struct fs_parameter *param)
-{
-	int opt;
-	struct binderfs_mount_opts *ctx = fc->fs_private;
-	struct fs_parse_result result;
-
-	opt = fs_parse(fc, binderfs_fs_parameters, param, &result);
-	if (opt < 0)
-		return opt;
-
-	switch (opt) {
-	case Opt_max:
-		if (result.uint_32 > BINDERFS_MAX_MINOR)
-			return invalfc(fc, "Bad value for '%s'", param->key);
-
-		ctx->max = result.uint_32;
-		break;
-	case Opt_stats_mode:
-		if (!capable(CAP_SYS_ADMIN))
-			return -EPERM;
-
-		ctx->stats_mode = result.uint_32;
-		break;
-	default:
-		return invalfc(fc, "Unsupported parameter '%s'", param->key);
-	}
-
-	return 0;
-}
-
-static int binderfs_fs_context_reconfigure(struct fs_context *fc)
-{
-	struct binderfs_mount_opts *ctx = fc->fs_private;
-	struct binderfs_info *info = BINDERFS_SB(fc->root->d_sb);
-
-	if (info->mount_opts.stats_mode != ctx->stats_mode)
-		return invalfc(fc, "Binderfs stats mode cannot be changed during a remount");
-
-	info->mount_opts.stats_mode = ctx->stats_mode;
-	info->mount_opts.max = ctx->max;
-	return 0;
-}
-
-static int binderfs_show_options(struct seq_file *seq, struct dentry *root)
-{
-	struct binderfs_info *info = BINDERFS_SB(root->d_sb);
-
-	if (info->mount_opts.max <= BINDERFS_MAX_MINOR)
-		seq_printf(seq, ",max=%d", info->mount_opts.max);
-
-	switch (info->mount_opts.stats_mode) {
-	case binderfs_stats_mode_unset:
-		break;
-	case binderfs_stats_mode_global:
-		seq_printf(seq, ",stats=global");
-		break;
-	}
-
-	return 0;
-}
-
-static const struct super_operations binderfs_super_ops = {
-	.evict_inode    = binderfs_evict_inode,
-	.show_options	= binderfs_show_options,
-	.statfs         = simple_statfs,
-};
-
-static inline bool is_binderfs_control_device(const struct dentry *dentry)
-{
-	struct binderfs_info *info = dentry->d_sb->s_fs_info;
-
-	return info->control_dentry == dentry;
-}
-
-static int binderfs_rename(struct mnt_idmap *idmap,
-			   struct inode *old_dir, struct dentry *old_dentry,
-			   struct inode *new_dir, struct dentry *new_dentry,
-			   unsigned int flags)
-{
-	if (is_binderfs_control_device(old_dentry) ||
-	    is_binderfs_control_device(new_dentry))
-		return -EPERM;
-
-	return simple_rename(idmap, old_dir, old_dentry, new_dir,
-			     new_dentry, flags);
-}
-
-static int binderfs_unlink(struct inode *dir, struct dentry *dentry)
-{
-	if (is_binderfs_control_device(dentry))
-		return -EPERM;
-
-	return simple_unlink(dir, dentry);
-}
-
-static const struct file_operations binder_ctl_fops = {
-	.owner		= THIS_MODULE,
-	.open		= nonseekable_open,
-	.unlocked_ioctl	= binder_ctl_ioctl,
-	.compat_ioctl	= binder_ctl_ioctl,
-	.llseek		= noop_llseek,
-};
-
-/**
- * binderfs_binder_ctl_create - create a new binder-control device
- * @sb: super block of the binderfs mount
- *
- * This function creates a new binder-control device node in the binderfs mount
- * referred to by @sb.
- *
- * Return: 0 on success, negative errno on failure
- */
-static int binderfs_binder_ctl_create(struct super_block *sb)
-{
-	int minor, ret;
-	struct dentry *dentry;
-	struct binder_device *device;
-	struct inode *inode = NULL;
-	struct dentry *root = sb->s_root;
-	struct binderfs_info *info = sb->s_fs_info;
-#if defined(CONFIG_IPC_NS)
-	bool use_reserve = (info->ipc_ns == &init_ipc_ns);
-#else
-	bool use_reserve = true;
-#endif
-
-	device = kzalloc(sizeof(*device), GFP_KERNEL);
-	if (!device)
-		return -ENOMEM;
-
-	/* If we have already created a binder-control node, return. */
-	if (info->control_dentry) {
-		ret = 0;
-		goto out;
-	}
-
-	ret = -ENOMEM;
-	inode = new_inode(sb);
-	if (!inode)
-		goto out;
-
-	/* Reserve a new minor number for the new device. */
-	mutex_lock(&binderfs_minors_mutex);
-	minor = ida_alloc_max(&binderfs_minors,
-			      use_reserve ? BINDERFS_MAX_MINOR :
-					    BINDERFS_MAX_MINOR_CAPPED,
-			      GFP_KERNEL);
-	mutex_unlock(&binderfs_minors_mutex);
-	if (minor < 0) {
-		ret = minor;
-		goto out;
-	}
-
-	inode->i_ino = SECOND_INODE;
-	simple_inode_init_ts(inode);
-	init_special_inode(inode, S_IFCHR | 0600,
-			   MKDEV(MAJOR(binderfs_dev), minor));
-	inode->i_fop = &binder_ctl_fops;
-	inode->i_uid = info->root_uid;
-	inode->i_gid = info->root_gid;
-
-	refcount_set(&device->ref, 1);
-	device->binderfs_inode = inode;
-	device->miscdev.minor = minor;
-
-	dentry = d_alloc_name(root, "binder-control");
-	if (!dentry)
-		goto out;
-
-	inode->i_private = device;
-	info->control_dentry = dentry;
-	d_add(dentry, inode);
-
-	return 0;
-
-out:
-	kfree(device);
-	iput(inode);
-
-	return ret;
-}
-
-static const struct inode_operations binderfs_dir_inode_operations = {
-	.lookup = simple_lookup,
-	.rename = binderfs_rename,
-	.unlink = binderfs_unlink,
-};
-
-static struct inode *binderfs_make_inode(struct super_block *sb, int mode)
-{
-	struct inode *ret;
-
-	ret = new_inode(sb);
-	if (ret) {
-		ret->i_ino = iunique(sb, BINDERFS_MAX_MINOR + INODE_OFFSET);
-		ret->i_mode = mode;
-		simple_inode_init_ts(ret);
-	}
-	return ret;
-}
-
-static struct dentry *binderfs_create_dentry(struct dentry *parent,
-					     const char *name)
-{
-	struct dentry *dentry;
-
-	dentry = lookup_one_len(name, parent, strlen(name));
-	if (IS_ERR(dentry))
-		return dentry;
-
-	/* Return error if the file/dir already exists. */
-	if (d_really_is_positive(dentry)) {
-		dput(dentry);
-		return ERR_PTR(-EEXIST);
-	}
-
-	return dentry;
-}
-
-void binderfs_remove_file(struct dentry *dentry)
-{
-	struct inode *parent_inode;
-
-	parent_inode = d_inode(dentry->d_parent);
-	inode_lock(parent_inode);
-	if (simple_positive(dentry)) {
-		dget(dentry);
-		simple_unlink(parent_inode, dentry);
-		d_delete(dentry);
-		dput(dentry);
-	}
-	inode_unlock(parent_inode);
-}
-
-struct dentry *binderfs_create_file(struct dentry *parent, const char *name,
-				    const struct file_operations *fops,
-				    void *data)
-{
-	struct dentry *dentry;
-	struct inode *new_inode, *parent_inode;
-	struct super_block *sb;
-
-	parent_inode = d_inode(parent);
-	inode_lock(parent_inode);
-
-	dentry = binderfs_create_dentry(parent, name);
-	if (IS_ERR(dentry))
-		goto out;
-
-	sb = parent_inode->i_sb;
-	new_inode = binderfs_make_inode(sb, S_IFREG | 0444);
-	if (!new_inode) {
-		dput(dentry);
-		dentry = ERR_PTR(-ENOMEM);
-		goto out;
-	}
-
-	new_inode->i_fop = fops;
-	new_inode->i_private = data;
-	d_instantiate(dentry, new_inode);
-	fsnotify_create(parent_inode, dentry);
-
-out:
-	inode_unlock(parent_inode);
-	return dentry;
-}
-
-static struct dentry *binderfs_create_dir(struct dentry *parent,
-					  const char *name)
-{
-	struct dentry *dentry;
-	struct inode *new_inode, *parent_inode;
-	struct super_block *sb;
-
-	parent_inode = d_inode(parent);
-	inode_lock(parent_inode);
-
-	dentry = binderfs_create_dentry(parent, name);
-	if (IS_ERR(dentry))
-		goto out;
-
-	sb = parent_inode->i_sb;
-	new_inode = binderfs_make_inode(sb, S_IFDIR | 0755);
-	if (!new_inode) {
-		dput(dentry);
-		dentry = ERR_PTR(-ENOMEM);
-		goto out;
-	}
-
-	new_inode->i_fop = &simple_dir_operations;
-	new_inode->i_op = &simple_dir_inode_operations;
-
-	set_nlink(new_inode, 2);
-	d_instantiate(dentry, new_inode);
-	inc_nlink(parent_inode);
-	fsnotify_mkdir(parent_inode, dentry);
-
-out:
-	inode_unlock(parent_inode);
-	return dentry;
-}
-
-static int binder_features_show(struct seq_file *m, void *unused)
-{
-	bool *feature = m->private;
-
-	seq_printf(m, "%d\n", *feature);
-
-	return 0;
-}
-DEFINE_SHOW_ATTRIBUTE(binder_features);
-
-static int init_binder_features(struct super_block *sb)
-{
-	struct dentry *dentry, *dir;
-
-	dir = binderfs_create_dir(sb->s_root, "features");
-	if (IS_ERR(dir))
-		return PTR_ERR(dir);
-
-	dentry = binderfs_create_file(dir, "oneway_spam_detection",
-				      &binder_features_fops,
-				      &binder_features.oneway_spam_detection);
-	if (IS_ERR(dentry))
-		return PTR_ERR(dentry);
-
-	dentry = binderfs_create_file(dir, "extended_error",
-				      &binder_features_fops,
-				      &binder_features.extended_error);
-	if (IS_ERR(dentry))
-		return PTR_ERR(dentry);
-
-	return 0;
-}
-
-static int init_binder_logs(struct super_block *sb)
-{
-	struct dentry *binder_logs_root_dir, *dentry, *proc_log_dir;
-	const struct binder_debugfs_entry *db_entry;
-	struct binderfs_info *info;
-	int ret = 0;
-
-	binder_logs_root_dir = binderfs_create_dir(sb->s_root,
-						   "binder_logs");
-	if (IS_ERR(binder_logs_root_dir)) {
-		ret = PTR_ERR(binder_logs_root_dir);
-		goto out;
-	}
-
-	binder_for_each_debugfs_entry(db_entry) {
-		dentry = binderfs_create_file(binder_logs_root_dir,
-					      db_entry->name,
-					      db_entry->fops,
-					      db_entry->data);
-		if (IS_ERR(dentry)) {
-			ret = PTR_ERR(dentry);
-			goto out;
-		}
-	}
-
-	proc_log_dir = binderfs_create_dir(binder_logs_root_dir, "proc");
-	if (IS_ERR(proc_log_dir)) {
-		ret = PTR_ERR(proc_log_dir);
-		goto out;
-	}
-	info = sb->s_fs_info;
-	info->proc_log_dir = proc_log_dir;
-
-out:
-	return ret;
-}
-
-static int binderfs_fill_super(struct super_block *sb, struct fs_context *fc)
-{
-	int ret;
-	struct binderfs_info *info;
-	struct binderfs_mount_opts *ctx = fc->fs_private;
-	struct inode *inode = NULL;
-	struct binderfs_device device_info = {};
-	const char *name;
-	size_t len;
-
-	sb->s_blocksize = PAGE_SIZE;
-	sb->s_blocksize_bits = PAGE_SHIFT;
-
-	/*
-	 * The binderfs filesystem can be mounted by userns root in a
-	 * non-initial userns. By default such mounts have the SB_I_NODEV flag
-	 * set in s_iflags to prevent security issues where userns root can
-	 * just create random device nodes via mknod() since it owns the
-	 * filesystem mount. But binderfs does not allow to create any files
-	 * including devices nodes. The only way to create binder devices nodes
-	 * is through the binder-control device which userns root is explicitly
-	 * allowed to do. So removing the SB_I_NODEV flag from s_iflags is both
-	 * necessary and safe.
-	 */
-	sb->s_iflags &= ~SB_I_NODEV;
-	sb->s_iflags |= SB_I_NOEXEC;
-	sb->s_magic = BINDERFS_SUPER_MAGIC;
-	sb->s_op = &binderfs_super_ops;
-	sb->s_time_gran = 1;
-
-	sb->s_fs_info = kzalloc(sizeof(struct binderfs_info), GFP_KERNEL);
-	if (!sb->s_fs_info)
-		return -ENOMEM;
-	info = sb->s_fs_info;
-
-	info->ipc_ns = get_ipc_ns(current->nsproxy->ipc_ns);
-
-	info->root_gid = make_kgid(sb->s_user_ns, 0);
-	if (!gid_valid(info->root_gid))
-		info->root_gid = GLOBAL_ROOT_GID;
-	info->root_uid = make_kuid(sb->s_user_ns, 0);
-	if (!uid_valid(info->root_uid))
-		info->root_uid = GLOBAL_ROOT_UID;
-	info->mount_opts.max = ctx->max;
-	info->mount_opts.stats_mode = ctx->stats_mode;
-
-	inode = new_inode(sb);
-	if (!inode)
-		return -ENOMEM;
-
-	inode->i_ino = FIRST_INODE;
-	inode->i_fop = &simple_dir_operations;
-	inode->i_mode = S_IFDIR | 0755;
-	simple_inode_init_ts(inode);
-	inode->i_op = &binderfs_dir_inode_operations;
-	set_nlink(inode, 2);
-
-	sb->s_root = d_make_root(inode);
-	if (!sb->s_root)
-		return -ENOMEM;
-
-	ret = binderfs_binder_ctl_create(sb);
-	if (ret)
-		return ret;
-
-	name = binder_devices_param;
-	for (len = strcspn(name, ","); len > 0; len = strcspn(name, ",")) {
-		strscpy(device_info.name, name, len + 1);
-		ret = binderfs_binder_device_create(inode, NULL, &device_info);
-		if (ret)
-			return ret;
-		name += len;
-		if (*name == ',')
-			name++;
-	}
-
-	ret = init_binder_features(sb);
-	if (ret)
-		return ret;
-
-	if (info->mount_opts.stats_mode == binderfs_stats_mode_global)
-		return init_binder_logs(sb);
-
-	return 0;
-}
-
-static int binderfs_fs_context_get_tree(struct fs_context *fc)
-{
-	return get_tree_nodev(fc, binderfs_fill_super);
-}
-
-static void binderfs_fs_context_free(struct fs_context *fc)
-{
-	struct binderfs_mount_opts *ctx = fc->fs_private;
-
-	kfree(ctx);
-}
-
-static const struct fs_context_operations binderfs_fs_context_ops = {
-	.free		= binderfs_fs_context_free,
-	.get_tree	= binderfs_fs_context_get_tree,
-	.parse_param	= binderfs_fs_context_parse_param,
-	.reconfigure	= binderfs_fs_context_reconfigure,
-};
-
-static int binderfs_init_fs_context(struct fs_context *fc)
-{
-	struct binderfs_mount_opts *ctx;
-
-	ctx = kzalloc(sizeof(struct binderfs_mount_opts), GFP_KERNEL);
-	if (!ctx)
-		return -ENOMEM;
-
-	ctx->max = BINDERFS_MAX_MINOR;
-	ctx->stats_mode = binderfs_stats_mode_unset;
-
-	fc->fs_private = ctx;
-	fc->ops = &binderfs_fs_context_ops;
-
-	return 0;
-}
-
-static void binderfs_kill_super(struct super_block *sb)
-{
-	struct binderfs_info *info = sb->s_fs_info;
-
-	/*
-	 * During inode eviction struct binderfs_info is needed.
-	 * So first wipe the super_block then free struct binderfs_info.
-	 */
-	kill_litter_super(sb);
-
-	if (info && info->ipc_ns)
-		put_ipc_ns(info->ipc_ns);
-
-	kfree(info);
-}
-
-static struct file_system_type binder_fs_type = {
-	.name			= "binder",
-	.init_fs_context	= binderfs_init_fs_context,
-	.parameters		= binderfs_fs_parameters,
-	.kill_sb		= binderfs_kill_super,
-	.fs_flags		= FS_USERNS_MOUNT,
-};
-
-int __init init_binderfs(void)
-{
-	int ret;
-	const char *name;
-	size_t len;
-
-	/* Verify that the default binderfs device names are valid. */
-	name = binder_devices_param;
-	for (len = strcspn(name, ","); len > 0; len = strcspn(name, ",")) {
-		if (len > BINDERFS_MAX_NAME)
-			return -E2BIG;
-		name += len;
-		if (*name == ',')
-			name++;
-	}
-
-	/* Allocate new major number for binderfs. */
-	ret = alloc_chrdev_region(&binderfs_dev, 0, BINDERFS_MAX_MINOR,
-				  "binder");
-	if (ret)
-		return ret;
-
-	ret = register_filesystem(&binder_fs_type);
-	if (ret) {
-		unregister_chrdev_region(binderfs_dev, BINDERFS_MAX_MINOR);
-		return ret;
-	}
-
-	return ret;
-}

-- 
2.42.0.820.g83a721a137-goog



* Re: [PATCH RFC 01/20] rust_binder: define a Rust binder driver
  2023-11-01 18:01 ` [PATCH RFC 01/20] rust_binder: define a Rust binder driver Alice Ryhl
@ 2023-11-01 18:09   ` Greg Kroah-Hartman
  2023-11-08 10:38     ` Alice Ryhl
  2023-11-01 18:25   ` Boqun Feng
  1 sibling, 1 reply; 38+ messages in thread
From: Greg Kroah-Hartman @ 2023-11-01 18:09 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, linux-kernel, rust-for-linux, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer

On Wed, Nov 01, 2023 at 06:01:31PM +0000, Alice Ryhl wrote:
> From: Wedson Almeida Filho <wedsonaf@gmail.com>
> 
> Define the Rust binder driver, and set up the helpers for making C types
> accessible from Rust.
> 
> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
> Co-developed-by: Alice Ryhl <aliceryhl@google.com>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  drivers/android/Kconfig             | 11 +++++++++++
>  drivers/android/Makefile            |  1 +
>  drivers/android/rust_binder.rs      | 21 +++++++++++++++++++++
>  include/uapi/linux/android/binder.h | 30 ++++++++++++++++--------------
>  rust/bindings/bindings_helper.h     |  1 +
>  5 files changed, 50 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/android/Kconfig b/drivers/android/Kconfig
> index 07aa8ae0a058..fcfd25c9a016 100644
> --- a/drivers/android/Kconfig
> +++ b/drivers/android/Kconfig
> @@ -13,6 +13,17 @@ config ANDROID_BINDER_IPC
>  	  Android process, using Binder to identify, invoke and pass arguments
>  	  between said processes.
>  
> +config ANDROID_BINDER_IPC_RUST
> +	bool "Android Binder IPC Driver in Rust"
> +	depends on MMU && RUST

Can RUST even build on non-mmu systems?



* Re: [PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder
  2023-11-01 18:01 ` [PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder Alice Ryhl
@ 2023-11-01 18:10   ` Greg Kroah-Hartman
  2023-11-08 10:42     ` Alice Ryhl
  2023-11-03 10:11   ` Finn Behrens
  2023-11-03 16:30   ` Benno Lossin
  2 siblings, 1 reply; 38+ messages in thread
From: Greg Kroah-Hartman @ 2023-11-01 18:10 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, linux-kernel, rust-for-linux, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer

On Wed, Nov 01, 2023 at 06:01:32PM +0000, Alice Ryhl wrote:
> Add support for accessing the Rust binder driver via binderfs. The
> actual binderfs implementation is done entirely in C, and the
> `rust_binderfs.c` file is a modified version of `binderfs.c` that is
> adjusted to call into the Rust binder driver rather than the C driver.
> 
> We have left the binderfs filesystem component in C. Rewriting it in
> Rust would be a large amount of work and requires a lot of bindings to
> the file system interfaces. Binderfs has not historically had the same
> challenges with security and complexity, so rewriting Binderfs seems to
> have lower value than the rest of Binder.
> 
> We also add code on the Rust side for binderfs to call into. Most of
> this is left as stub implementation, with the exception of closing the
> file descriptor and the BINDER_VERSION ioctl.
> 
> Co-developed-by: Wedson Almeida Filho <wedsonaf@gmail.com>
> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  drivers/android/Kconfig         |  24 ++
>  drivers/android/Makefile        |   1 +
>  drivers/android/context.rs      | 144 +++++++
>  drivers/android/defs.rs         |  39 ++
>  drivers/android/process.rs      | 251 ++++++++++++
>  drivers/android/rust_binder.rs  | 196 ++++++++-
>  drivers/android/rust_binderfs.c | 866 ++++++++++++++++++++++++++++++++++++++++
>  include/linux/rust_binder.h     |  16 +
>  include/uapi/linux/magic.h      |   1 +
>  rust/bindings/bindings_helper.h |   2 +
>  rust/kernel/lib.rs              |   7 +
>  scripts/Makefile.build          |   2 +-
>  12 files changed, 1547 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/android/Kconfig b/drivers/android/Kconfig
> index fcfd25c9a016..82ed6ddabe1a 100644
> --- a/drivers/android/Kconfig
> +++ b/drivers/android/Kconfig
> @@ -36,6 +36,18 @@ config ANDROID_BINDERFS
>  	  It can be used to dynamically allocate new binder IPC devices via
>  	  ioctls.
>  
> +config ANDROID_BINDERFS_RUST
> +	bool "Android Binderfs filesystem in Rust"
> +	depends on ANDROID_BINDER_IPC_RUST
> +	default n

Nit, the default is always 'n', so no need for this line.

Also, it's the middle of the merge window, many of us are busy with
other things and can't review new code until a few weeks from now,
sorry.

greg k-h


* Re: [PATCH RFC 20/20] binder: delete the C implementation
  2023-11-01 18:01 ` [PATCH RFC 20/20] binder: delete the C implementation Alice Ryhl
@ 2023-11-01 18:15   ` Greg Kroah-Hartman
  2023-11-01 18:39   ` Carlos Llamas
  1 sibling, 0 replies; 38+ messages in thread
From: Greg Kroah-Hartman @ 2023-11-01 18:15 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, linux-kernel, rust-for-linux, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer

On Wed, Nov 01, 2023 at 06:01:50PM +0000, Alice Ryhl wrote:
> The ultimate goal of this project is to replace the C implementation.
> 
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

:)

Seriously, this is all great stuff, thanks for posting it, very
impressive.  Let's see how testing goes!

greg k-h


* Re: [PATCH RFC 01/20] rust_binder: define a Rust binder driver
  2023-11-01 18:01 ` [PATCH RFC 01/20] rust_binder: define a Rust binder driver Alice Ryhl
  2023-11-01 18:09   ` Greg Kroah-Hartman
@ 2023-11-01 18:25   ` Boqun Feng
  2023-11-02 10:27     ` Alice Ryhl
  1 sibling, 1 reply; 38+ messages in thread
From: Boqun Feng @ 2023-11-01 18:25 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, linux-kernel, rust-for-linux, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer

On Wed, Nov 01, 2023 at 06:01:31PM +0000, Alice Ryhl wrote:
[...]
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -21,6 +21,7 @@
>  #include <linux/sched.h>
>  #include <linux/task_work.h>
>  #include <linux/workqueue.h>
> +#include <uapi/linux/android/binder.h>

I wonder whether we could (and should) move this into
rust/uapi/uapi_helper.h.

Regards,
Boqun

>  
>  /* `bindgen` gets confused at certain things. */
>  const size_t BINDINGS_ARCH_SLAB_MINALIGN = ARCH_SLAB_MINALIGN;
> 
> -- 
> 2.42.0.820.g83a721a137-goog
> 


* Re: [PATCH RFC 00/20] Setting up Binder for the future
  2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
                   ` (19 preceding siblings ...)
  2023-11-01 18:01 ` [PATCH RFC 20/20] binder: delete the C implementation Alice Ryhl
@ 2023-11-01 18:34 ` Carlos Llamas
  2023-11-02 13:33   ` Alice Ryhl
  20 siblings, 1 reply; 38+ messages in thread
From: Carlos Llamas @ 2023-11-01 18:34 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, linux-kernel, rust-for-linux, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer

On Wed, Nov 01, 2023 at 06:01:30PM +0000, Alice Ryhl wrote:
> We're generally not proponents of rewrites (nasty uncomfortable things
> that make you late for dinner!). So why rewrite Binder? 
> 
> Binder has been evolving over the past 15+ years to meet the evolving
> needs of Android. Its responsibilities, expectations, and complexity
> have grown considerably during that time. While we expect Binder to
> continue to evolve along with Android, there are a number of factors
> that currently constrain our ability to develop/maintain it. Briefly
> those are:
> 
> 1. Complexity: Binder is at the intersection of everything in Android and
>    fulfills many responsibilities beyond IPC. It has become many things
>    to many people, and due to its many features and their interactions
>    with each other, its complexity is quite high. In just 6kLOC it must
>    deliver transactions to the right threads. It must correctly parse
>    and translate the contents of transactions, which can contain several
>    objects of different types (e.g., pointers, fds) that can interact
>    with each other. It controls the size of thread pools in userspace,
>    and ensures that transactions are assigned to threads in ways that
>    avoid deadlocks where the threadpool has run out of threads. It must
>    track refcounts of objects that are shared by several processes by
>    forwarding refcount changes between the processes correctly.  It must
>    handle numerous error scenarios and it combines/nests 13 different
>    locks, 7 reference counters, and atomic variables. Finally, It must
>    do all of this as fast and efficiently as possible. Minor performance
>    regressions can cause a noticeably degraded user experience.
> 
> 2. Things to improve: Thousand-line functions [1], error-prone error
>    handling [2], and confusing structure can occur as a code base grows
>    organically. After more than a decade of development, this codebase
>    could use an overhaul.
> 
> 3. Security critical: Binder is a critical part of Android's sandboxing
>    strategy. Even Android's most de-privileged sandboxes (e.g. the
>    Chrome renderer, or SW Codec) have direct access to Binder. More than
>    just about any other component, it's important that Binder provide
>    robust security, and itself be robust against security
>    vulnerabilities.
> 
> It's #1 (high complexity) that has made continuing to evolve Binder and
> resolving #2 (tech debt) exceptionally difficult without causing #3
> (security issues). For Binder to continue to meet Android's needs, we
> need better ways to manage (and reduce!) complexity without increasing
> the risk.

I 100% agree with this as the vast majority of my time is spent chasing
after memory corruption issues. The fixes here may look simple on the
surface but the complexity makes them non-trivial and there are many
"hidden" things to watch out for.

> 
> The biggest change is obviously the choice of programming language. We
> decided to use Rust because it directly addresses a number of the
> challenges within Binder that we have faced during the last years. It
> prevents mistakes with ref counting, locking, bounds checking, and also
> does a lot to reduce the complexity of error handling. Additionally,
> we've been able to use the more expressive type system to encode the
> ownership semantics of the various structs and pointers, which takes the
> complexity of managing object lifetimes out of the hands of the
> programmer, reducing the risk of use-after-frees and similar problems.

Binder's history teaches us that it is quite hard to write a patch with
some confidence that it won't introduce one of these issues.

> 
> Rust has many different pointer types that it uses to encode ownership
> semantics into the type system, and this is probably one of the most
> important aspects of how it helps in Binder. The Binder driver has a lot
> of different objects that have complex ownership semantics; some
> pointers own a refcount, some pointers have exclusive ownership, and
> some pointers just reference the object and it is kept alive in some
> other manner. With Rust, we can use a different pointer type for each
> kind of pointer, which enables the compiler to enforce that the
> ownership semantics are implemented correctly.
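
To make the three kinds of pointer concrete, here is a minimal userspace
sketch. It uses the standard library's Box, Arc and plain references as
stand-ins; the kernel has its own Arc, and Node, consume, share and peek
are hypothetical names invented for this example, not code from the
driver.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a driver object; not a type from the Rust binder driver.
struct Node {
    id: u32,
}

/// Exclusive ownership: the callee receives the only handle, and the node is
/// freed when `node` goes out of scope at the end of this function.
fn consume(node: Box<Node>) -> u32 {
    node.id
}

/// Shared ownership: every `Arc<Node>` holds one refcount, released on drop.
fn share(node: &Arc<Node>) -> Arc<Node> {
    Arc::clone(node)
}

/// Plain borrow: no refcount is taken; the compiler checks that the node
/// outlives the reference.
fn peek(node: &Node) -> u32 {
    node.id
}

fn main() {
    let owned = Box::new(Node { id: 1 });
    println!("consumed node {}", consume(owned));
    // `owned` has been moved; using it here would be a compile error.

    let shared = Arc::new(Node { id: 2 });
    let second = share(&shared);
    println!("refcount = {}", Arc::strong_count(&shared));
    println!("peek = {}", peek(&second));
    drop(second);
    println!("refcount = {}", Arc::strong_count(&shared));
}
```

Passing the wrong kind of pointer (say, a borrow where ownership of a
refcount is expected) fails to type-check, which is the enforcement the
quoted paragraph describes.
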
> 
> Another useful feature is Rust's error handling. Rust allows for more
> simplified error handling with features such as destructors, and you get
> compilation failures if errors are not properly handled. This means that
> even though Rust requires you to spend more lines of code than C on
> things such as writing down invariants that are left implicit in C, the
> Rust driver is still slightly smaller than C binder: Rust is 5.5kLOC and
> C is 5.8kLOC. (These numbers are excluding blank lines, comments,
> binderfs, and any debugging facilities in C that are not yet implemented
> in the Rust driver. The numbers include abstractions in rust/kernel/
> that are unlikely to be used by other drivers than Binder.)
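
A rough userspace sketch of the destructor-based cleanup described
above, for comparison with the "goto out" pattern in the C code. Guard,
Error, step and create are hypothetical names for this example only;
the driver's real types differ.

```rust
use std::cell::Cell;

/// Hypothetical resource guard; its `Drop` impl plays the role of the
/// cleanup code that C would place under a `goto out` label.
struct Guard<'a> {
    released: &'a Cell<u32>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        // Runs on every exit path, including early returns via `?`.
        self.released.set(self.released.get() + 1);
    }
}

#[derive(Debug, PartialEq)]
enum Error {
    NoMem,
}

/// A fallible step; the caller cannot reach the `u32` without handling `Err`.
fn step(fail: bool) -> Result<u32, Error> {
    if fail {
        Err(Error::NoMem)
    } else {
        Ok(42)
    }
}

fn create(released: &Cell<u32>, fail: bool) -> Result<u32, Error> {
    let _guard = Guard { released };
    // On error, `?` returns early and `_guard` is still dropped.
    let value = step(fail)?;
    Ok(value)
}

fn main() {
    let released = Cell::new(0);
    println!("{:?}", create(&released, false));
    println!("{:?}", create(&released, true));
    println!("guards released: {}", released.get());
}
```

The guard is released once per call whether or not step() fails, so
there is no separate unwind path to keep in sync by hand.
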
> 
> Although this rewrite completely rethinks how the code is structured and
> how assumptions are enforced, we do not fundamentally change *how* the
> driver does the things it does. A lot of careful thought has gone into
> the existing design. The rewrite is aimed rather at improving code
> health, structure, readability, robustness, security, maintainability
> and extensibility. We also include more inline documentation, and

Can you expand a bit more on what the plan is here? Is it a two-step
process, e.g. replacing first and then revisiting *how* binder does
things later?

> improve how assumptions in the code are enforced. Furthermore, all
> unsafe code is annotated with a SAFETY comment that explains why it is
> correct.
> 
> We have left the binderfs filesystem component in C. Rewriting it in
> Rust would be a large amount of work and requires a lot of bindings to
> the file system interfaces. Binderfs has not historically had the same
> challenges with security and complexity, so rewriting binderfs seems to
> have lower value than the rest of Binder.
> 
> Correctness and feature parity
> ------------------------------
> 
> Rust binder passes all tests that validate the correctness of Binder in
> the Android Open Source Project. We can boot a device, and run a variety
> of apps and functionality without issues. We have performed this both on
> the Cuttlefish Android emulator device, and on a Pixel 6 Pro.
> 
> As for feature parity, Rust binder currently implements all features
> that C binder supports, with the exception of some debugging facilities.
> The missing debugging facilities will be added before we submit the Rust
> implementation upstream.
> 
> Performance numbers
> -------------------
> 
> We have tested the driver using two different benchmarks:
> binderThroughputTest [3] and binderRpcBenchmark [4]. These benchmarks
> show that the Rust implementation has very promising performance
> characteristics. That said, these are only microbenchmarks with very
> simple workloads, and there is still a lot of work to be done before we
> can truly understand how the drivers compare in the real world.
> 
> binderThroughputTest:
> Some visualizations of the benchmarking results are available at the
> following links:
> 
> Average latency with no payload: https://raw.githubusercontent.com/Darksonn/linux/rust-binder-rfc/img-for-rust-binder-rfc/Average%20latency%20with%20no%20payload.png
> Average latency with 4k payload: https://raw.githubusercontent.com/Darksonn/linux/rust-binder-rfc/img-for-rust-binder-rfc/Average%20latency%20with%204k%20payload.png
> 99 percentile latency with no payload: https://raw.githubusercontent.com/Darksonn/linux/rust-binder-rfc/img-for-rust-binder-rfc/99%20percentile%20latency%20with%20no%20payload.png
> 99 percentile latency with 4k payload: https://raw.githubusercontent.com/Darksonn/linux/rust-binder-rfc/img-for-rust-binder-rfc/99%20percentile%20latency%20with%204k%20payload.png
> 
> Raw data with empty payloads:
>     +-----------+----------+---------+----------+---------+----------+----------+
>     | c/s pairs | Rust avg |  C avg  | Rust 99p |  C 99p  | Avg frac | 99p frac |
>     +-----------+----------+---------+----------+---------+----------+----------+
>     |         1 |   17.517 |  17.278 |   31.169 |  34.464 |   +1.38% |   -9.56% |
>     |         2 |   17.405 |  17.425 |   36.051 |  36.825 |   -0.11% |   -2.10% |
>     |         4 |   27.623 |  27.524 |   46.305 |  45.776 |   +0.36% |   +1.16% |
>     |         8 |   25.152 |  25.461 |   61.442 |  61.279 |   -1.21% |   +0.27% |
>     |        16 |   50.251 |  49.987 |  120.158 | 121.297 |   +0.53% |   -0.94% |
>     |        32 |   99.439 | 100.537 |  238.891 | 238.404 |   -1.09% |   +0.20% |
>     +-----------+----------+---------+----------+---------+----------+----------+
> Raw data with 4k payloads:
>     +-----------+----------+---------+----------+---------+----------+----------+
>     | c/s pairs | Rust avg |  C avg  | Rust 99p |  C 99p  | Avg frac | 99p frac |
>     +-----------+----------+---------+----------+---------+----------+----------+
>     |         1 |   19.422 |  19.811 |   30.233 |  31.616 |   -1.96% |   -4.37% |
>     |         2 |   18.393 |  18.277 |   34.790 |  35.319 |   +0.63% |   -1.50% |
>     |         4 |   29.350 |  29.283 |   48.544 |  47.730 |   +0.23% |   +1.71% |
>     |         8 |   25.075 |  25.283 |   66.040 |  65.226 |   -0.82% |   +1.25% |
>     |        16 |   58.608 |  58.949 |  156.657 | 159.709 |   -0.58% |   -1.91% |
>     |        32 |  127.404 | 129.459 |  321.249 | 326.945 |   -1.59% |   -1.74% |
>     +-----------+----------+---------+----------+---------+----------+----------+
> These tables depict roundtrip latencies of transactions as measured by
> binderThroughputTest. Each measurement is given in microseconds. Each
> row has a sample size of 10 million iterations. Negative percentages are
> better for Rust.
> 
> We've found that Rust binder has similar performance to C binder on the
> binderThroughputTest benchmark. The average latencies fluctuate between
> -1.96% and +1.38%.
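
The "Avg frac" and "99p frac" columns appear to be the relative
difference of the Rust number against the C baseline. A small sketch
that recomputes the two extremes quoted above under that assumption
(frac_percent is a name made up here, not part of the benchmark):

```rust
/// Relative difference of the Rust measurement against the C baseline,
/// in percent. Negative means Rust was faster.
fn frac_percent(rust: f64, c: f64) -> f64 {
    (rust - c) / c * 100.0
}

fn main() {
    // (payload, Rust avg, C avg) in microseconds, for 1 c/s pair,
    // copied from the two tables above.
    let rows = [("empty", 17.517, 17.278), ("4k", 19.422, 19.811)];
    for &(payload, rust, c) in rows.iter() {
        println!("{} payload: {:+.2}%", payload, frac_percent(rust, c));
    }
}
```
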
> 
> binderRpcBenchmark:
>     +---------------------+-----------+---------+----------+---------+-----------+----------+
>     |      Benchmark      | Time Rust | Time C  | CPU Rust |   CPU C | Time frac | CPU frac |
>     +---------------------+-----------+---------+----------+---------+-----------+----------+
>     | pingTransaction     |    21.595 |  22.167 |    9.625 |   9.692 |    -2.58% |   -0.69% |
>     | repeatBinder        |    33.982 |  34.648 |   16.252 |  16.681 |    -1.92% |   -2.57% |
>     | throughput/64       |    26.774 |  26.587 |   11.995 |  11.823 |    +0.70% |   +1.45% |
>     | throughput/1024     |    33.679 |  33.867 |   15.140 |  15.137 |    -0.56% |   +0.02% |
>     | throughput/2048     |    39.744 |  40.092 |   17.898 |  17.926 |    -0.87% |   -0.16% |
>     | throughput/4096     |    52.585 |  53.457 |   23.788 |  24.067 |    -1.63% |   -1.16% |
>     | throughput/8182     |    76.352 |  77.148 |   35.135 |  35.228 |    -1.03% |   -0.26% |
>     | throughput/16364    |   121.875 | 122.877 |   57.342 |  57.614 |    -0.82% |   -0.47% |
>     | throughput/32728    |   212.380 | 212.765 |  101.838 | 101.589 |    -0.18% |   +0.25% |
>     | throughput/65535    |   442.983 | 421.935 |  222.642 | 212.494 |    +4.99% |   +4.78% |
>     | throughput/65536    |   431.250 | 416.916 |  216.634 | 210.160 |    +3.44% |   +3.08% |
>     | throughput/65537    |   512.902 | 492.272 |  242.472 | 232.786 |    +4.19% |   +4.16% |
>     | repeatTwoPageString |   456.546 | 445.398 |  222.921 | 219.821 |    +2.50% |   +1.41% |
>     +---------------------+-----------+---------+----------+---------+-----------+----------+
> This table depicts wall clock time and cpu time measurements over
> various test cases. Each measurement is given in microseconds. The
> throughput benchmarks correspond to the
> BM_throughputForTransportAndBytes test case, and the number is the size
> of the payload. Negative percentages are better for Rust.
> 
> From the above, we find that Rust binder is competitive for all test
> cases except for those with very large transaction sizes. However, this
> is a very rare case in practice [5] and we've been able to fix all other
> performance issues that we've run into, so there's no reason to think
> that we won't also be able to fix this issue. We did not fix it for this

This isn't terrible. Besides being rare, binder shouldn't really be used
to send large transactions. You are better off using something else in
those scenarios, e.g. shm. However, it would be nice to get to the
bottom of this.

> RFC because we prioritized getting the RFC out to provide context for
> the upcoming discussion at Linux Plumbers Conference [6].
> 
> We ran all of the benchmarks with cross-language LTO enabled, so that C
> code can be inlined into Rust code. We get similar results on the
> Cuttlefish Android emulator (which has an x86 architecture).
> 
> The Binder driver is very performance critical, and although our initial
> numbers are promising, we must gain a better understanding of how it
> performs in realistic workloads and not just in simple benchmarks. What
> we ultimately care about is the performance impact that it has on the
> whole system. Much work remains to be done on this front.

I can send a "stress" test that would tell us more about the
performance in critical scenarios. Perhaps this can land as a kselftest.

> 
> Dependencies
> ------------
> 
> When implementing kernel drivers in Rust, you must write bindings for
> each subsystem that we need to call into from Rust. Binder requires
> quite a few of them. We have not included them in this patch series, but
> you can view them at the following branch:
> 
> https://github.com/Darksonn/linux/commits/rust-binder-rfc
> 
> The branch is based on top of commit 639409a4ac8e ("Merge tag
> 'wq-for-6.7-rust-bindings' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq"),
> which is available in mainline. I did not base it on a tag, since there
> is not yet any tag that includes the Rust workqueue abstractions.
> 
> This RFC uses the kernel's red-black tree for key/value mappings, but we
> are aware that the red-black tree is deprecated. We did this to make the
> performance comparison more fair, since C binder also uses rbtree for
> this. We intend to replace these with XArrays instead. That said, we
> don't think that XArray is a good fit for the range allocator, and we
> propose to continue using the red-black tree for the range allocator.
> (see patch 6)
> 
> Thank you
> ---------
> 
>  * Wedson Almeida Filho who wrote the first version of the driver and
>    started the project.
> 
>  * Miguel Ojeda for his support, and leading the Rust-for-Linux effort,
>    and helping us navigate the upstream community.
> 
>  * Matt Gilbride for his work on the range allocator and oneway spam
>    detection.
> 
>  * Carlos Llamas for patiently answering all my questions to help me
>    understand the C driver, and co-presenting with me at LPC and
>    Kangrejos.
> 
>  * Greg KH for reviews and guidance on upstream development.
> 
>  * Todd Kjos for reviewing the cover letter, answering questions, and
>    pointers on benchmarking the driver.
> 
>  * Matthew Maurer for his mentorship and help with navigating the build
>    system, including getting LTO working.
> 
>  * John Stultz for his help with debugging a performance issue.
> 
>  * Andreas Hindborg for his help with getting LTO working.
> 
>  * Benno Lossin, Gary Guo, Andreas Hindborg, Miguel Ojeda, Wedson
>    Almeida Filho, Martin Rodriguez Reboredo, Björn Roy Baron, Boqun
>    Feng, Tejun Heo, Nathan Huckleberry for reviewing various bindings
>    needed by Binder.
> 
> Thank you,
> Alice
> 
> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/android/binder.c?h=v6.5#n2896
> [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/android/binder.c?h=v6.5#n3658
> [3]: https://android-review.googlesource.com/c/platform/frameworks/native/+/2680818
> [4]: https://cs.android.com/android/platform/superproject/main/+/main:frameworks/native/libs/binder/tests/binderRpcBenchmark.cpp
> [5]: https://cs.android.com/android/_/android/platform/frameworks/native/+/b85e7f7dbd0463d2ba78d53d50e64489fcb01ec4:libs/binder/tests/binderRpcBenchmark.cpp;l=206-217;bpv=1;bpt=0
> [6]: https://lpc.events/event/17/contributions/1427/
> 
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
> Alice Ryhl (15):
>       rust_binder: add binderfs support to Rust binder
>       rust_binder: add threading support
>       rust_binder: add work lists
>       rust_binder: add nodes and context managers
>       rust_binder: add oneway transactions
>       rust_binder: serialize oneway transactions
>       rust_binder: send nodes in transactions
>       rust_binder: add BINDER_TYPE_PTR support
>       rust_binder: add BINDER_TYPE_FD support
>       rust_binder: add BINDER_TYPE_FDA support
>       rust_binder: add process freezing
>       rust_binder: add TF_UPDATE_TXN support
>       rust_binder: add binder_logs/state
>       rust_binder: add vma shrinker
>       binder: delete the C implementation
> 
> Matt Gilbride (1):
>       rust_binder: add oneway spam detection
> 
> Wedson Almeida Filho (4):
>       rust_binder: define a Rust binder driver
>       rust_binder: add epoll support
>       rust_binder: add non-oneway transactions
>       rust_binder: add death notifications
> 
>  drivers/android/Kconfig                         |   19 +-
>  drivers/android/Makefile                        |    2 +
>  drivers/android/allocation.rs                   |  541 ++
>  drivers/android/binder.c                        | 6630 -----------------------
>  drivers/android/binder_alloc.c                  | 1284 -----
>  drivers/android/context.rs                      |  225 +
>  drivers/android/defs.rs                         |  171 +
>  drivers/android/error.rs                        |   94 +
>  drivers/android/node.rs                         |  761 +++
>  drivers/android/process.rs                      | 1412 +++++
>  drivers/android/range_alloc.rs                  |  442 ++
>  drivers/android/rust_binder.rs                  |  389 ++
>  drivers/android/{binderfs.c => rust_binderfs.c} |  135 +-
>  drivers/android/thread.rs                       | 1552 ++++++
>  drivers/android/transaction.rs                  |  428 ++
>  include/linux/rust_binder.h                     |   16 +
>  include/uapi/linux/android/binder.h             |   30 +-
>  include/uapi/linux/magic.h                      |    1 +
>  rust/bindings/bindings_helper.h                 |    6 +
>  rust/helpers.c                                  |   48 +
>  rust/kernel/file.rs                             |    2 +-
>  rust/kernel/lib.rs                              |    9 +
>  rust/kernel/page_range.rs                       |  715 +++
>  rust/kernel/security.rs                         |   33 +
>  rust/kernel/seq_file.rs                         |   47 +
>  rust/kernel/sync/condvar.rs                     |   10 +
>  rust/kernel/sync/lock.rs                        |   24 +
>  rust/kernel/sync/lock/mutex.rs                  |   10 +
>  rust/kernel/sync/lock/spinlock.rs               |   10 +
>  rust/kernel/task.rs                             |    2 +-
>  scripts/Makefile.build                          |    2 +-
>  31 files changed, 7061 insertions(+), 7989 deletions(-)
> ---
> base-commit: b4be1bd6c44225bf7276a4666fd30b8da9cba517
> change-id: 20231101-rust-binder-464b89651887
> 
> Best regards,
> -- 
> Alice Ryhl <aliceryhl@google.com>
> 

Thanks to Alice and the others who have worked hard on this. This is a
very cool and exciting move for Android. I'm very much looking forward to
playing with this Rust binder and getting further results from the field.

--
Carlos Llamas


* Re: [PATCH RFC 20/20] binder: delete the C implementation
  2023-11-01 18:01 ` [PATCH RFC 20/20] binder: delete the C implementation Alice Ryhl
  2023-11-01 18:15   ` Greg Kroah-Hartman
@ 2023-11-01 18:39   ` Carlos Llamas
  1 sibling, 0 replies; 38+ messages in thread
From: Carlos Llamas @ 2023-11-01 18:39 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, linux-kernel, rust-for-linux, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer

On Wed, Nov 01, 2023 at 06:01:50PM +0000, Alice Ryhl wrote:
> The ultimate goal of this project is to replace the C implementation.
> 
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>

Nice, great work Alice!

As far as logistics go, I think it would be nice to make the transition
right after an LTS branch-off. This should give the Rust binder more
time to mature before landing in the subsequent LTS release.

Acked-by: Carlos Llamas <cmllamas@google.com>

--
Carlos Llamas


* Re: [PATCH RFC 01/20] rust_binder: define a Rust binder driver
  2023-11-01 18:25   ` Boqun Feng
@ 2023-11-02 10:27     ` Alice Ryhl
  0 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-02 10:27 UTC (permalink / raw)
  To: boqun.feng
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, brauner, cmllamas, gary, gregkh, jeffv, joel,
	linux-kernel, maco, mattgilbride, mmaurer, ojeda, rust-for-linux,
	surenb, tkjos, wedsonaf

Boqun Feng <boqun.feng@gmail.com> writes:
>> +#include <uapi/linux/android/binder.h>
>
> I wonder whether we could (and should) move this into
> rust/uapi/uapi_helpers.h

Sure. That was introduced after I wrote this part, but I'm happy to move
it if you think it would make sense.

Alice



* Re: [PATCH RFC 00/20] Setting up Binder for the future
  2023-11-01 18:34 ` [PATCH RFC 00/20] Setting up Binder for the future Carlos Llamas
@ 2023-11-02 13:33   ` Alice Ryhl
  0 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-02 13:33 UTC (permalink / raw)
  To: cmllamas
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, boqun.feng, brauner, gary, gregkh, jeffv, joel,
	linux-kernel, maco, mattgilbride, mmaurer, ojeda, rust-for-linux,
	surenb, tkjos, wedsonaf

Carlos Llamas <cmllamas@google.com> writes:
> On Wed, Nov 01, 2023 at 06:01:30PM +0000, Alice Ryhl wrote:
>> Although this rewrite completely rethinks how the code is structured and
>> how assumptions are enforced, we do not fundamentally change *how* the
>> driver does the things it does. A lot of careful thought has gone into
>> the existing design. The rewrite is aimed rather at improving code
>> health, structure, readability, robustness, security, maintainability
>> and extensibility. We also include more inline documentation, and
> 
> Can you expand a bit more on what the plan is here? Is it a two step
> process? e.g. replacing first and then revisiting the *how* binder does
> things later?

Yes, a big part of the motivation behind this rewrite is to make it
easier to continue evolving Binder.

For example, we would like to make Binder have more thorough epoll
support and the ability for a single-threaded server to handle many
incoming transactions at the same time, similar to how you can use many
non-blocking TCP sockets on a single thread today. This would have a
number of performance benefits, like fewer threads, less context
switching, etc.

We would prefer to not attempt this in the C driver because of how
challenging it is to make significant changes.

Alice



* Re: [PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder
  2023-11-01 18:01 ` [PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder Alice Ryhl
  2023-11-01 18:10   ` Greg Kroah-Hartman
@ 2023-11-03 10:11   ` Finn Behrens
  2023-11-08 10:31     ` Alice Ryhl
  2023-11-03 16:30   ` Benno Lossin
  2 siblings, 1 reply; 38+ messages in thread
From: Finn Behrens @ 2023-11-03 10:11 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, linux-kernel, rust-for-linux, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer



On 1 Nov 2023, at 19:01, Alice Ryhl wrote:

> Add support for accessing the Rust binder driver via binderfs. The
> actual binderfs implementation is done entirely in C, and the
> `rust_binderfs.c` file is a modified version of `binderfs.c` that is
> adjusted to call into the Rust binder driver rather than the C driver.
>
> We have left the binderfs filesystem component in C. Rewriting it in
> Rust would be a large amount of work and requires a lot of bindings to
> the file system interfaces. Binderfs has not historically had the same
> challenges with security and complexity, so rewriting Binderfs seems to
> have lower value than the rest of Binder.
>
> We also add code on the Rust side for binderfs to call into. Most of
> this is left as stub implementation, with the exception of closing the
> file descriptor and the BINDER_VERSION ioctl.
>
> Co-developed-by: Wedson Almeida Filho <wedsonaf@gmail.com>
> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  drivers/android/Kconfig         |  24 ++
>  drivers/android/Makefile        |   1 +
>  drivers/android/context.rs      | 144 +++++++
>  drivers/android/defs.rs         |  39 ++
>  drivers/android/process.rs      | 251 ++++++++++++
>  drivers/android/rust_binder.rs  | 196 ++++++++-
>  drivers/android/rust_binderfs.c | 866 ++++++++++++++++++++++++++++++++++++++++
>  include/linux/rust_binder.h     |  16 +
>  include/uapi/linux/magic.h      |   1 +
>  rust/bindings/bindings_helper.h |   2 +
>  rust/kernel/lib.rs              |   7 +
>  scripts/Makefile.build          |   2 +-
>  12 files changed, 1547 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/android/Kconfig b/drivers/android/Kconfig
> index fcfd25c9a016..82ed6ddabe1a 100644
> --- a/drivers/android/Kconfig
> +++ b/drivers/android/Kconfig
> diff --git a/drivers/android/Makefile b/drivers/android/Makefile
> index 6348f75832ca..5c819011aa77 100644
> --- a/drivers/android/Makefile
> +++ b/drivers/android/Makefile
> diff --git a/drivers/android/context.rs b/drivers/android/context.rs
> new file mode 100644
> index 000000000000..630cb575d3ac
> --- /dev/null
> +++ b/drivers/android/context.rs
> diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs
> new file mode 100644
> index 000000000000..8fdcb856ccad
> --- /dev/null
> +++ b/drivers/android/defs.rs
> @@ -0,0 +1,39 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +use core::ops::{Deref, DerefMut};
> +use kernel::{
> +    bindings,
> +    io_buffer::{ReadableFromBytes, WritableToBytes},
> +};
> +
> +macro_rules! decl_wrapper {
> +    ($newname:ident, $wrapped:ty) => {
> +        #[derive(Copy, Clone, Default)]
> +        #[repr(transparent)]
> +        pub(crate) struct $newname($wrapped);
> +        // SAFETY: This macro is only used with types where this is ok.
Would it make sense to also annotate this safety requirement on the macro itself?
It is only file-private, but it could help avoid overlooking the requirement when using the macro for something new in the same file.
> +        unsafe impl ReadableFromBytes for $newname {}
> +        unsafe impl WritableToBytes for $newname {}
> +        impl Deref for $newname {
> +            type Target = $wrapped;
> +            fn deref(&self) -> &Self::Target {
> +                &self.0
> +            }
> +        }
> +        impl DerefMut for $newname {
> +            fn deref_mut(&mut self) -> &mut Self::Target {
> +                &mut self.0
> +            }
> +        }
> +    };
> +}
> +
> +decl_wrapper!(BinderVersion, bindings::binder_version);
> +
> +impl BinderVersion {
> +    pub(crate) fn current() -> Self {
> +        Self(bindings::binder_version {
> +            protocol_version: bindings::BINDER_CURRENT_PROTOCOL_VERSION as _,
> +        })
> +    }
> +}
> diff --git a/drivers/android/process.rs b/drivers/android/process.rs
> new file mode 100644
> index 000000000000..2f16e4cedbf1
> --- /dev/null
> +++ b/drivers/android/process.rs
> diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs
> index 4b3d6676a9cf..6de2f40846fb 100644
> --- a/drivers/android/rust_binder.rs
> +++ b/drivers/android/rust_binder.rs
> diff --git a/drivers/android/rust_binderfs.c b/drivers/android/rust_binderfs.c
> new file mode 100644
> index 000000000000..2c011e26752c
> --- /dev/null
> +++ b/drivers/android/rust_binderfs.c
> diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
> index 6325d1d0e90f..e5a20c1498af 100644
> --- a/include/uapi/linux/magic.h
> +++ b/include/uapi/linux/magic.h
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index 00a66666f00a..ffeea312f2fd 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index 435d4c2ac5fc..f4d58da9202e 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index da37bfa97211..f78d2e75a795 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> -- 
> 2.42.0.820.g83a721a137-goog


* Re: [PATCH RFC 03/20] rust_binder: add threading support
  2023-11-01 18:01 ` [PATCH RFC 03/20] rust_binder: add threading support Alice Ryhl
@ 2023-11-03 10:51   ` Finn Behrens
  2023-11-08 10:27     ` Alice Ryhl
  0 siblings, 1 reply; 38+ messages in thread
From: Finn Behrens @ 2023-11-03 10:51 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, linux-kernel, rust-for-linux, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer



On 1 Nov 2023, at 19:01, Alice Ryhl wrote:


> diff --git a/drivers/android/error.rs b/drivers/android/error.rs
> new file mode 100644
> index 000000000000..41fc4347ab55
> --- /dev/null
> +++ b/drivers/android/error.rs
> +
> +impl core::fmt::Debug for BinderError {
> +    fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
> +        match self.reply {
> +            BR_FAILED_REPLY => match self.source.as_ref() {
> +                Some(source) => f
> +                    .debug_struct("BR_FAILED_REPLY")
> +                    .field("source", source)
> +                    .finish(),
> +                None => f.pad("BR_FAILED_REPLY"),
> +            },
> +            BR_DEAD_REPLY => f.pad("BR_DEAD_REPLY"),
> +            BR_TRANSACTION_COMPLETE => f.pad("BR_TRANSACTION_COMPLETE"),
> +            _ => f
> +                .debug_struct("BinderError")
> +                .field("reply", &self.reply)
> +                .finish(),
> +        }
> +    }
> +}
Renaming the debug_struct itself feels like it will make it harder to find later, as I would expect a Debug implementation to name the struct it comes from.
Also, this has the fallback in CamelCase and all defined cases in SCREAMING_SNAKE_CASE. Maybe the defined cases could instead use something like f.debug_struct("BinderError").field("reply", "name")?


* Re: [PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder
  2023-11-01 18:01 ` [PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder Alice Ryhl
  2023-11-01 18:10   ` Greg Kroah-Hartman
  2023-11-03 10:11   ` Finn Behrens
@ 2023-11-03 16:30   ` Benno Lossin
  2023-11-03 17:34     ` Boqun Feng
  2023-11-08 10:25     ` Alice Ryhl
  2 siblings, 2 replies; 38+ messages in thread
From: Benno Lossin @ 2023-11-03 16:30 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, linux-kernel, rust-for-linux, Boqun Feng,
	Gary Guo, Björn Roy Baron, Andreas Hindborg, Matt Gilbride,
	Jeffrey Vander Stoep, Matthew Maurer

On 01.11.23 19:01, Alice Ryhl wrote:
> +/// There is one context per binder file (/dev/binder, /dev/hwbinder, etc)
> +#[pin_data]
> +pub(crate) struct Context {
> +    #[pin]
> +    manager: Mutex<Manager>,
> +    pub(crate) name: CString,
> +    #[pin]
> +    links: ListLinks,
> +}
> +
> +kernel::list::impl_has_list_links! {
> +    impl HasListLinks<0> for Context { self.links }
> +}
> +kernel::list::impl_list_arc_safe! {
> +    impl ListArcSafe<0> for Context { untracked; }
> +}
> +kernel::list::impl_list_item! {
> +    impl ListItem<0> for Context {
> +        using ListLinks;
> +    }
> +}

I think at some point it would be worth introducing a derive macro that
does this for us. So for example:

    #[pin_data]
    #[derive(HasListLinks)]
    pub(crate) struct Context {
        #[pin]
        manager: Mutex<Manager>,
        pub(crate) name: CString,
        #[pin]
        #[links]
        links: ListLinks,
    }

And if you need multiple links you could do:

    #[pin_data]
    #[derive(HasListLinks)]
    struct Foo {
        #[links = 0]
        a_list: ListLinks,
        #[links = 1]
        b_list: ListLinks,
    }

Same for `ListItem` and `HasWork`. I have not yet taken a look at your
linked list implementation, so I don't know if this is possible (since
`ListItem` seems to have multiple "backends").

I think this improvement can wait though, just wanted to mention it.

-- 
Cheers,
Benno
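
The numeric ID in `HasListLinks<0>` can be pictured as a const generic
parameter that selects one of several `ListLinks` fields. The sketch below
is only an illustration under stated assumptions: `ListLinks`,
`HasListLinks`, and `links_tag` are simplified stand-ins written for this
example, not the `kernel::list` API from the series, and the two `impl`
blocks are merely one guess at what a derive with `#[links = 0]` and
`#[links = 1]` could expand to.

```rust
// Stand-in for the real `ListLinks`; the `tag` field is hypothetical and
// exists only so the two fields can be told apart in this example.
struct ListLinks {
    tag: u32,
}

/// Hypothetical trait: the const parameter `ID` names which `ListLinks`
/// field of the implementing type is meant.
trait HasListLinks<const ID: u64> {
    fn links(&self) -> &ListLinks;
}

struct Foo {
    a_list: ListLinks,
    b_list: ListLinks,
}

// One possible expansion of `#[links = 0]` on `a_list`.
impl HasListLinks<0> for Foo {
    fn links(&self) -> &ListLinks {
        &self.a_list
    }
}

// One possible expansion of `#[links = 1]` on `b_list`.
impl HasListLinks<1> for Foo {
    fn links(&self) -> &ListLinks {
        &self.b_list
    }
}

/// Generic code picks a field purely through the `ID` parameter.
fn links_tag<const ID: u64, T: HasListLinks<ID>>(item: &T) -> u32 {
    <T as HasListLinks<ID>>::links(item).tag
}
```

With this shape, `links_tag::<0, Foo>(&foo)` reads `a_list` and
`links_tag::<1, Foo>(&foo)` reads `b_list`, so the number only has to be
unique per type; a derive macro could generate it from a field name.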



* Re: [PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder
  2023-11-03 16:30   ` Benno Lossin
@ 2023-11-03 17:34     ` Boqun Feng
  2023-11-08 10:25     ` Alice Ryhl
  1 sibling, 0 replies; 38+ messages in thread
From: Boqun Feng @ 2023-11-03 17:34 UTC (permalink / raw)
  To: Benno Lossin
  Cc: Alice Ryhl, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Christian Brauner,
	Carlos Llamas, Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, linux-kernel, rust-for-linux, Gary Guo,
	Björn Roy Baron, Andreas Hindborg, Matt Gilbride,
	Jeffrey Vander Stoep, Matthew Maurer

On Fri, Nov 03, 2023 at 04:30:54PM +0000, Benno Lossin wrote:
> On 01.11.23 19:01, Alice Ryhl wrote:
> > +/// There is one context per binder file (/dev/binder, /dev/hwbinder, etc)
> > +#[pin_data]
> > +pub(crate) struct Context {
> > +    #[pin]
> > +    manager: Mutex<Manager>,
> > +    pub(crate) name: CString,
> > +    #[pin]
> > +    links: ListLinks,
> > +}
> > +
> > +kernel::list::impl_has_list_links! {
> > +    impl HasListLinks<0> for Context { self.links }
> > +}
> > +kernel::list::impl_list_arc_safe! {
> > +    impl ListArcSafe<0> for Context { untracked; }
> > +}
> > +kernel::list::impl_list_item! {
> > +    impl ListItem<0> for Context {
> > +        using ListLinks;
> > +    }
> > +}
> 
> I think at some point it would be worth introducing a derive macro that
> does this for us. So for example:

Agreed.

> 
>     #[pin_data]
>     #[derive(HasListLinks)]
>     pub(crate) struct Context {
>         #[pin]
>         manager: Mutex<Manager>,
>         pub(crate) name: CString,
>         #[pin]
>         #[links]
>         links: ListLinks,
>     }
> 
> And if you need multiple links you could do:
> 
>     #[pin_data]
>     #[derive(HasListLinks)]
>     struct Foo {
>         #[links = 0]
>         a_list: ListLinks,

We will need more discussion on what the derive syntax would look like,
but I'd expect that we can reference the fields by name instead of by
number if we use derive macros. In other words, numbering the types to
distinguish different fields should be an implementation detail.

Regards,
Boqun

>         #[links = 1]
>         b_list: ListLinks,
>     }
> 
> Same for `ListItem` and `HasWork`. I have not yet taken a look at your
> linked list implementation, so I don't know if this is possible (since
> `ListItem` seems to have multiple "backends").
> 
> I think this improvement can wait though, just wanted to mention it.
> 
> -- 
> Cheers,
> Benno
> 


* Re: [PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder
  2023-11-03 16:30   ` Benno Lossin
  2023-11-03 17:34     ` Boqun Feng
@ 2023-11-08 10:25     ` Alice Ryhl
  1 sibling, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-08 10:25 UTC (permalink / raw)
  To: benno.lossin
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, bjorn3_gh, boqun.feng,
	brauner, cmllamas, gary, gregkh, jeffv, joel, linux-kernel, maco,
	mattgilbride, mmaurer, ojeda, rust-for-linux, surenb, tkjos,
	wedsonaf

Benno Lossin <benno.lossin@proton.me> writes:
> On 01.11.23 19:01, Alice Ryhl wrote:
>> +/// There is one context per binder file (/dev/binder, /dev/hwbinder, etc)
>> +#[pin_data]
>> +pub(crate) struct Context {
>> +    #[pin]
>> +    manager: Mutex<Manager>,
>> +    pub(crate) name: CString,
>> +    #[pin]
>> +    links: ListLinks,
>> +}
>> +
>> +kernel::list::impl_has_list_links! {
>> +    impl HasListLinks<0> for Context { self.links }
>> +}
>> +kernel::list::impl_list_arc_safe! {
>> +    impl ListArcSafe<0> for Context { untracked; }
>> +}
>> +kernel::list::impl_list_item! {
>> +    impl ListItem<0> for Context {
>> +        using ListLinks;
>> +    }
>> +}
> 
> I think at some point it would be worth introducing a derive macro that
> does this for us. So for example:
> 
>     #[pin_data]
>     #[derive(HasListLinks)]
>     pub(crate) struct Context {
>         #[pin]
>         manager: Mutex<Manager>,
>         pub(crate) name: CString,
>         #[pin]
>         #[links]
>         links: ListLinks,
>     }

Sure, it would be nice to improve the ergonomics of this. However, I
don't think it's that important either. The current solution is a bit
verbose, but good enough for me.

Alice



* Re: [PATCH RFC 03/20] rust_binder: add threading support
  2023-11-03 10:51   ` Finn Behrens
@ 2023-11-08 10:27     ` Alice Ryhl
  0 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-08 10:27 UTC (permalink / raw)
  To: finn
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, boqun.feng, brauner, cmllamas, gary, gregkh, jeffv,
	joel, linux-kernel, maco, mattgilbride, mmaurer, ojeda,
	rust-for-linux, surenb, tkjos, wedsonaf

Finn Behrens <finn@kloenk.de> writes:
> On 1 Nov 2023, at 19:01, Alice Ryhl wrote:
>> diff --git a/drivers/android/error.rs b/drivers/android/error.rs
>> new file mode 100644
>> index 000000000000..41fc4347ab55
>> --- /dev/null
>> +++ b/drivers/android/error.rs
>> +
>> +impl core::fmt::Debug for BinderError {
>> +    fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
>> +        match self.reply {
>> +            BR_FAILED_REPLY => match self.source.as_ref() {
>> +                Some(source) => f
>> +                    .debug_struct("BR_FAILED_REPLY")
>> +                    .field("source", source)
>> +                    .finish(),
>> +                None => f.pad("BR_FAILED_REPLY"),
>> +            },
>> +            BR_DEAD_REPLY => f.pad("BR_DEAD_REPLY"),
>> +            BR_TRANSACTION_COMPLETE => f.pad("BR_TRANSACTION_COMPLETE"),
>> +            _ => f
>> +                .debug_struct("BinderError")
>> +                .field("reply", &self.reply)
>> +                .finish(),
>> +        }
>> +    }
>> +}
> 
> Renaming the debug_struct itself feels like it will make it harder to
> find later, as I would expect a Debug implementation to name the
> struct it comes from.
> 
> Also, this has the fallback in CamelCase and all defined cases in
> SCREAMING_SNAKE_CASE. Maybe the defined cases could instead use
> something like f.debug_struct("BinderError").field("reply", "name")?

Yeah, you're right. I'll improve the debug formatting. Thanks for the
suggestion.

Alice
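
One way the suggestion above could look is sketched here. This is not the
revised patch: the reply constants have placeholder values (the real ones
come from the binder UAPI header), `source` is reduced to a string, and
`Name` and `reply_name` are hypothetical helpers introduced only for this
example. The point is that every case prints under the struct's own name,
with the reply shown as a field.

```rust
use core::fmt;

// Placeholder values; the real constants are defined by the binder UAPI.
const BR_DEAD_REPLY: u32 = 1;
const BR_FAILED_REPLY: u32 = 2;
const BR_TRANSACTION_COMPLETE: u32 = 3;

struct BinderError {
    reply: u32,
    // Simplified stand-in for the driver's real error source type.
    source: Option<&'static str>,
}

/// Hypothetical helper: prints a name without the quotes that the `Debug`
/// impl of `&str` would add.
struct Name(&'static str);

impl fmt::Debug for Name {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.0)
    }
}

/// Maps known reply codes to their symbolic names.
fn reply_name(reply: u32) -> Option<&'static str> {
    match reply {
        BR_DEAD_REPLY => Some("BR_DEAD_REPLY"),
        BR_FAILED_REPLY => Some("BR_FAILED_REPLY"),
        BR_TRANSACTION_COMPLETE => Some("BR_TRANSACTION_COMPLETE"),
        _ => None,
    }
}

impl fmt::Debug for BinderError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Always use the struct's own name so the output is easy to find.
        let mut s = f.debug_struct("BinderError");
        match reply_name(self.reply) {
            Some(name) => s.field("reply", &Name(name)),
            // Unknown codes fall back to the raw number.
            None => s.field("reply", &self.reply),
        };
        if let Some(source) = &self.source {
            s.field("source", source);
        }
        s.finish()
    }
}
```

Under these assumptions, a dead reply formats as
`BinderError { reply: BR_DEAD_REPLY }`, and an unknown code keeps the same
struct name with a numeric `reply` field.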



* Re: [PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder
  2023-11-03 10:11   ` Finn Behrens
@ 2023-11-08 10:31     ` Alice Ryhl
  0 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-08 10:31 UTC (permalink / raw)
  To: me
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, boqun.feng, brauner, cmllamas, gary, gregkh, jeffv,
	joel, linux-kernel, maco, mattgilbride, mmaurer, ojeda,
	rust-for-linux, surenb, tkjos, wedsonaf

Finn Behrens <finn@kloenk.de> writes:
> On 1 Nov 2023, at 19:01, Alice Ryhl wrote:
>> +macro_rules! decl_wrapper {
>> +    ($newname:ident, $wrapped:ty) => {
>> +        #[derive(Copy, Clone, Default)]
>> +        #[repr(transparent)]
>> +        pub(crate) struct $newname($wrapped);
>> +        // SAFETY: This macro is only used with types where this is ok.
> 
> Would it make sense to also annotate this safety requirement on the
> macro itself?
> 
> It is only file-private, but it could help avoid overlooking the
> requirement when using the macro for something new in the same file.

Sure, I can move the comment.

Alice
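
A rough picture of what moving the comment could mean: the requirement is
stated once in a `# Safety` section on the macro, and the `unsafe impl`
lines refer back to it. This is a sketch, not the updated patch. The two
marker traits are empty stand-ins for the ones in `kernel::io_buffer`,
`RawVersion` is a hypothetical replacement for `bindings::binder_version`,
and the wording of the requirement is an assumption about what those
traits demand.

```rust
use core::ops::{Deref, DerefMut};

// Empty stand-ins for the marker traits from `kernel::io_buffer`.
unsafe trait ReadableFromBytes {}
unsafe trait WritableToBytes {}

/// Declares a transparent newtype around `$wrapped`.
///
/// # Safety
///
/// (Assumed wording.) Only use this with a `$wrapped` type that is plain
/// data: every byte pattern must be a valid value and the type must have
/// no padding, since the wrapper is marked as readable from and writable
/// to raw bytes.
macro_rules! decl_wrapper {
    ($newname:ident, $wrapped:ty) => {
        #[derive(Copy, Clone, Default)]
        #[repr(transparent)]
        pub(crate) struct $newname($wrapped);
        // SAFETY: Upheld by the caller, see the macro's `# Safety` section.
        unsafe impl ReadableFromBytes for $newname {}
        // SAFETY: Upheld by the caller, see the macro's `# Safety` section.
        unsafe impl WritableToBytes for $newname {}
        impl Deref for $newname {
            type Target = $wrapped;
            fn deref(&self) -> &Self::Target {
                &self.0
            }
        }
        impl DerefMut for $newname {
            fn deref_mut(&mut self) -> &mut Self::Target {
                &mut self.0
            }
        }
    };
}

// Hypothetical stand-in for `bindings::binder_version`.
#[derive(Copy, Clone, Default)]
#[repr(C)]
pub(crate) struct RawVersion {
    pub(crate) protocol_version: i32,
}

decl_wrapper!(BinderVersion, RawVersion);
```

Anyone adding a new `decl_wrapper!` invocation then sees the requirement
at the macro definition rather than inside its expansion.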



* Re: [PATCH RFC 01/20] rust_binder: define a Rust binder driver
  2023-11-01 18:09   ` Greg Kroah-Hartman
@ 2023-11-08 10:38     ` Alice Ryhl
  0 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-08 10:38 UTC (permalink / raw)
  To: gregkh
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, boqun.feng, brauner, cmllamas, gary, jeffv, joel,
	linux-kernel, maco, mattgilbride, mmaurer, ojeda, rust-for-linux,
	surenb, tkjos, wedsonaf

Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes:
> On Wed, Nov 01, 2023 at 06:01:31PM +0000, Alice Ryhl wrote:
>> +config ANDROID_BINDER_IPC_RUST
>> +	bool "Android Binder IPC Driver in Rust"
>> +	depends on MMU && RUST
> 
> Can RUST even build on non-mmu systems?

I don't know, but if it could, then the dependencies of this RFC
probably broke that. I guess this `depends on` line is in the wrong place.

Alice



* Re: [PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder
  2023-11-01 18:10   ` Greg Kroah-Hartman
@ 2023-11-08 10:42     ` Alice Ryhl
  0 siblings, 0 replies; 38+ messages in thread
From: Alice Ryhl @ 2023-11-08 10:42 UTC (permalink / raw)
  To: gregkh
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, boqun.feng, brauner, cmllamas, gary, jeffv, joel,
	linux-kernel, maco, mattgilbride, mmaurer, ojeda, rust-for-linux,
	surenb, tkjos, wedsonaf

Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes:
> On Wed, Nov 01, 2023 at 06:01:32PM +0000, Alice Ryhl wrote:
>> +config ANDROID_BINDERFS_RUST
>> +	bool "Android Binderfs filesystem in Rust"
>> +	depends on ANDROID_BINDER_IPC_RUST
>> +	default n
> 
> Nit, the default is always 'n', so no need for this line.

Got it. I'll remove it.

> Also, it's the middle of the merge window, many of us are busy with
> other things and can't review new code until a few weeks from now,
> sorry.

That's fine. I had hoped to send it earlier to avoid this, but we wanted
performance numbers from real hardware instead of just from an emulator,
which delayed it.

Alice



end of thread, other threads:[~2023-11-08 10:42 UTC | newest]

Thread overview: 38+ messages
2023-11-01 18:01 [PATCH RFC 00/20] Setting up Binder for the future Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 01/20] rust_binder: define a Rust binder driver Alice Ryhl
2023-11-01 18:09   ` Greg Kroah-Hartman
2023-11-08 10:38     ` Alice Ryhl
2023-11-01 18:25   ` Boqun Feng
2023-11-02 10:27     ` Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder Alice Ryhl
2023-11-01 18:10   ` Greg Kroah-Hartman
2023-11-08 10:42     ` Alice Ryhl
2023-11-03 10:11   ` Finn Behrens
2023-11-08 10:31     ` Alice Ryhl
2023-11-03 16:30   ` Benno Lossin
2023-11-03 17:34     ` Boqun Feng
2023-11-08 10:25     ` Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 03/20] rust_binder: add threading support Alice Ryhl
2023-11-03 10:51   ` Finn Behrens
2023-11-08 10:27     ` Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 04/20] rust_binder: add work lists Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 05/20] rust_binder: add nodes and context managers Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 06/20] rust_binder: add oneway transactions Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 07/20] rust_binder: add epoll support Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 08/20] rust_binder: add non-oneway transactions Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 09/20] rust_binder: serialize oneway transactions Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 10/20] rust_binder: add death notifications Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 11/20] rust_binder: send nodes in transactions Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 12/20] rust_binder: add BINDER_TYPE_PTR support Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 13/20] rust_binder: add BINDER_TYPE_FD support Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 14/20] rust_binder: add BINDER_TYPE_FDA support Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 15/20] rust_binder: add process freezing Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 16/20] rust_binder: add TF_UPDATE_TXN support Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 17/20] rust_binder: add oneway spam detection Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 18/20] rust_binder: add binder_logs/state Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 19/20] rust_binder: add vma shrinker Alice Ryhl
2023-11-01 18:01 ` [PATCH RFC 20/20] binder: delete the C implementation Alice Ryhl
2023-11-01 18:15   ` Greg Kroah-Hartman
2023-11-01 18:39   ` Carlos Llamas
2023-11-01 18:34 ` [PATCH RFC 00/20] Setting up Binder for the future Carlos Llamas
2023-11-02 13:33   ` Alice Ryhl
